GitHub #10 reopened: operator mikeakasully cloned v2.0.50 fresh and ran the
canonical quickstart (docker compose -f deploy/docker-compose.yml up -d --build);
postgres reported unhealthy indefinitely, dependent containers never started.
Root cause: deploy/docker-compose.yml mounted a hand-curated subset of
migrations/*.up.sql + seed.sql into postgres /docker-entrypoint-initdb.d/.
Postgres applied them at initdb time. Once seed.sql referenced columns added
by migrations *after* the mounted cutoff (e.g., policy_rules.severity from
migration 000013), initdb crashed mid-seed and the container loop wedged.
Two sources of truth (compose mount list vs in-tree migration ladder)
diverged the moment a seed-touching migration shipped, and the only thing
that fixed it was hand-editing the compose file every release.
Fix: remove the dual source. Postgres boots empty; the server applies
migrations + seed at startup via RunMigrations + RunSeed. Helm has used
this pattern since day one (postgres-init emptyDir); compose now matches.
Bundled with four ride-along audit findings whose fixes share the same
schema/db code surface, so operators take the schema-change pain only once:
cat-u-seed_initdb_schema_drift [P1, primary] — initdb-mount fix
cat-o-retry_interval_unit_mismatch [P1] — column rename minutes→seconds
cat-o-notification_created_at_dead_field [P2] — add column + populate
cat-o-health_check_column_orphans [P1] — drop unwired columns
cat-u-no_version_endpoint [P2] — add /api/v1/version
Single migration (000017_db_coupling_cleanup) bundles the three schema
changes under a DO \$\$ guard so re-application is safe; reduces
operator-visible 'schema-change releases' from four to one.
Backend
- internal/repository/postgres/db.go: add RunSeed (baseline) + RunDemoSeed
(gated by CERTCTL_DEMO_SEED). Both idempotent (ON CONFLICT DO NOTHING in
every shipped INSERT) so repeated boots are safe; missing-file is no-op
so custom packaging that strips seeds still boots cleanly.
- cmd/server/main.go: invoke RunSeed (always) + RunDemoSeed (when flag set)
immediately after RunMigrations.
- internal/repository/postgres/notification.go: NotificationRepository.Create
now sets created_at (with time.Now() fallback when caller leaves it zero);
scanNotification reads it back; List + ListRetryEligible SELECT extended.
- internal/repository/postgres/renewal_policy.go: column references updated
to retry_interval_seconds across SELECT/INSERT/UPDATE sites.
- internal/api/handler/version.go: new VersionHandler exposes
{version, commit, modified, build_time, go_version} from
runtime/debug.ReadBuildInfo() with ldflags-supplied Version override.
- internal/api/router/router.go: register GET /api/v1/version through the
no-auth chain (CORS + ContentType) alongside /health, /ready,
/api/v1/auth/info.
- cmd/server/main.go: add /api/v1/version to no-auth dispatch + audit
ExcludePaths so rollout polling doesn't dominate the audit trail.
- internal/config/config.go: add DatabaseConfig.DemoSeed +
CERTCTL_DEMO_SEED env var.
Migration
- migrations/000017_db_coupling_cleanup.up.sql + .down.sql:
(1) renewal_policies.retry_interval_minutes → retry_interval_seconds
(DO \$\$ guard, idempotent re-application)
(2) notification_events ADD COLUMN created_at TIMESTAMPTZ
NOT NULL DEFAULT NOW()
(3) network_scan_targets DROP orphan health_check_enabled +
health_check_interval_seconds
- migrations/seed.sql: column reference updated to retry_interval_seconds.
- migrations/seed_demo.sql: same column rename + applied at runtime now via
RunDemoSeed (no longer initdb-mounted).
Compose
- deploy/docker-compose.yml: drop ALL initdb mounts (10 migration files +
seed.sql); add start_period: 30s to postgres + certctl-server healthchecks
to absorb the runtime migration + seed application window on first boot.
- deploy/docker-compose.test.yml: same drop (+ ghost seed_test.sql mount
removed; that file never existed); same healthcheck start_period.
- deploy/docker-compose.demo.yml: replace seed_demo.sql initdb mount with
CERTCTL_DEMO_SEED=true env var on certctl-server.
Tests
- internal/api/handler/version_handler_test.go: TestVersion_ReturnsBuildInfo,
TestVersion_RejectsNonGet, TestVersion_LdflagsOverride.
- internal/repository/postgres/seed_test.go: TestRunSeed_AppliesIdempotently,
TestRunSeed_MissingFileIsNoOp, TestRunDemoSeed_AppliesIdempotently,
TestMigration000017_RetryIntervalRename,
TestMigration000017_NotificationCreatedAt,
TestMigration000017_HealthCheckOrphansDropped (testcontainers, -short skips).
- internal/repository/postgres/notification_test.go:
TestNotificationRepository_CreatedAt_IsPersisted +
TestNotificationRepository_CreatedAt_DefaultsToNow.
CI guardrail
- .github/workflows/ci.yml: new 'Forbidden migration mount in compose initdb
(U-3)' step grep-fails the build if any migrations/*.sql or seed*.sql
re-appears in /docker-entrypoint-initdb.d in any compose file. Catches
future drift before a fresh-clone operator hits it.
Spec / Docs
- api/openapi.yaml: add /api/v1/version operation under Health tag.
- docs/architecture.md: replace the 'initdb may run the same SQL' paragraph
with a post-U-3 single-source-of-truth explanation.
- CHANGELOG.md: full unreleased-section entry covering all 5 closures,
breaking changes, and the new env var.
Audit doc
- coverage-gap-audit-2026-04-24-v5/unified-audit.md: add new P1 #14
cat-u-seed_initdb_schema_drift; flip the 4 ride-along findings to
✅ RESOLVED with closure prose pointing at this commit.
Verification: build/vet/test -short -race all clean across all touched
packages locally; govulncheck reports 0 vulnerabilities affecting our
code; OpenAPI YAML parses; CI U-3 grep guardrail clears against the
post-fix tree.
Rewrote docs/features.md from scratch as authoritative feature inventory
(1255 lines, every claim verified against source files).
Audited README.md and architecture.md against repo — fixed 19 stale
references: K8s Secrets status, issuer counts, dashboard page counts,
CI thresholds, missing connectors in Mermaid diagrams, OpenAPI operation
count, GetCACertPEM behavior, and V2/V4 roadmap accuracy.
Also includes related fixes discovered during audit:
- Scheduler skips expired/failed/revoked certs from auto-renewal
- Seed demo expiry dates moved outside 31-day scheduler query window
- Agent pages use correct last_heartbeat_at field name
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
4-step wizard (Connect CA → Deploy Agent → Add Certificate → Done) shown
on fresh installs when no user-configured issuers or certificates exist.
Auto-seeded env var issuers (source="env") are excluded from first-run
detection. Wizard state latches to prevent query refetches from dismissing
it mid-flow. Split docker-compose into clean default (wizard-compatible)
and demo override (seed_demo.sql). Added missing migrations 000009/000010
to test compose.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>