Files
certctl/internal/api/handler/version_handler_test.go
T
shankar0123 a3d8b9c607 fix(deploy,db,handler): close fresh-clone postgres init failure + 4 ride-along audit findings (U-3 master)
GitHub #10 reopened: operator mikeakasully cloned v2.0.50 fresh and ran the
canonical quickstart (docker compose -f deploy/docker-compose.yml up -d --build);
postgres reported unhealthy indefinitely, dependent containers never started.

Root cause: deploy/docker-compose.yml mounted a hand-curated subset of
migrations/*.up.sql + seed.sql into postgres /docker-entrypoint-initdb.d/.
Postgres applied them at initdb time. Once seed.sql referenced columns added
by migrations *after* the mounted cutoff (e.g., policy_rules.severity from
migration 000013), initdb crashed mid-seed and the container loop wedged.
Two sources of truth (compose mount list vs in-tree migration ladder)
diverged the moment a seed-touching migration shipped, and the only thing
that fixed it was hand-editing the compose file every release.

Fix: remove the dual source. Postgres boots empty; the server applies
migrations + seed at startup via RunMigrations + RunSeed. Helm has used
this pattern since day one (postgres-init emptyDir); compose now matches.

Bundled with four ride-along audit findings whose fixes share the same
schema/db code surface, so operators take the schema-change pain only once:

  cat-u-seed_initdb_schema_drift           [P1, primary] — initdb-mount fix
  cat-o-retry_interval_unit_mismatch       [P1] — column rename minutes→seconds
  cat-o-notification_created_at_dead_field [P2] — add column + populate
  cat-o-health_check_column_orphans        [P1] — drop unwired columns
  cat-u-no_version_endpoint                [P2] — add /api/v1/version

Single migration (000017_db_coupling_cleanup) bundles the three schema
changes under a DO \$\$ guard so re-application is safe; reduces
operator-visible 'schema-change releases' from four to one.

Backend
- internal/repository/postgres/db.go: add RunSeed (baseline) + RunDemoSeed
  (gated by CERTCTL_DEMO_SEED). Both idempotent (ON CONFLICT DO NOTHING in
  every shipped INSERT) so repeated boots are safe; missing-file is no-op
  so custom packaging that strips seeds still boots cleanly.
- cmd/server/main.go: invoke RunSeed (always) + RunDemoSeed (when flag set)
  immediately after RunMigrations.
- internal/repository/postgres/notification.go: NotificationRepository.Create
  now sets created_at (with time.Now() fallback when caller leaves it zero);
  scanNotification reads it back; List + ListRetryEligible SELECT extended.
- internal/repository/postgres/renewal_policy.go: column references updated
  to retry_interval_seconds across SELECT/INSERT/UPDATE sites.
- internal/api/handler/version.go: new VersionHandler exposes
  {version, commit, modified, build_time, go_version} from
  runtime/debug.ReadBuildInfo() with ldflags-supplied Version override.
- internal/api/router/router.go: register GET /api/v1/version through the
  no-auth chain (CORS + ContentType) alongside /health, /ready,
  /api/v1/auth/info.
- cmd/server/main.go: add /api/v1/version to no-auth dispatch + audit
  ExcludePaths so rollout polling doesn't dominate the audit trail.
- internal/config/config.go: add DatabaseConfig.DemoSeed +
  CERTCTL_DEMO_SEED env var.

Migration
- migrations/000017_db_coupling_cleanup.up.sql + .down.sql:
    (1) renewal_policies.retry_interval_minutes → retry_interval_seconds
        (DO \$\$ guard, idempotent re-application)
    (2) notification_events ADD COLUMN created_at TIMESTAMPTZ
        NOT NULL DEFAULT NOW()
    (3) network_scan_targets DROP orphan health_check_enabled +
        health_check_interval_seconds
- migrations/seed.sql: column reference updated to retry_interval_seconds.
- migrations/seed_demo.sql: same column rename + applied at runtime now via
  RunDemoSeed (no longer initdb-mounted).

Compose
- deploy/docker-compose.yml: drop ALL initdb mounts (10 migration files +
  seed.sql); add start_period: 30s to postgres + certctl-server healthchecks
  to absorb the runtime migration + seed application window on first boot.
- deploy/docker-compose.test.yml: same drop (+ ghost seed_test.sql mount
  removed; that file never existed); same healthcheck start_period.
- deploy/docker-compose.demo.yml: replace seed_demo.sql initdb mount with
  CERTCTL_DEMO_SEED=true env var on certctl-server.

Tests
- internal/api/handler/version_handler_test.go: TestVersion_ReturnsBuildInfo,
  TestVersion_RejectsNonGet, TestVersion_LdflagsOverride.
- internal/repository/postgres/seed_test.go: TestRunSeed_AppliesIdempotently,
  TestRunSeed_MissingFileIsNoOp, TestRunDemoSeed_AppliesIdempotently,
  TestMigration000017_RetryIntervalRename,
  TestMigration000017_NotificationCreatedAt,
  TestMigration000017_HealthCheckOrphansDropped (testcontainers, -short skips).
- internal/repository/postgres/notification_test.go:
  TestNotificationRepository_CreatedAt_IsPersisted +
  TestNotificationRepository_CreatedAt_DefaultsToNow.

CI guardrail
- .github/workflows/ci.yml: new 'Forbidden migration mount in compose initdb
  (U-3)' step grep-fails the build if any migrations/*.sql or seed*.sql
  re-appears in /docker-entrypoint-initdb.d in any compose file. Catches
  future drift before a fresh-clone operator hits it.

Spec / Docs
- api/openapi.yaml: add /api/v1/version operation under Health tag.
- docs/architecture.md: replace the 'initdb may run the same SQL' paragraph
  with a post-U-3 single-source-of-truth explanation.
- CHANGELOG.md: full unreleased-section entry covering all 5 closures,
  breaking changes, and the new env var.

Audit doc
- coverage-gap-audit-2026-04-24-v5/unified-audit.md: add new P1 #14
  cat-u-seed_initdb_schema_drift; flip the 4 ride-along findings to
   RESOLVED with closure prose pointing at this commit.

Verification: build/vet/test -short -race all clean across all touched
packages locally; govulncheck reports 0 vulnerabilities affecting our
code; OpenAPI YAML parses; CI U-3 grep guardrail clears against the
post-fix tree.
2026-04-25 13:29:23 +00:00

109 lines
4.2 KiB
Go

package handler
import (
"encoding/json"
"net/http"
"net/http/httptest"
"runtime"
"strings"
"testing"
)
// TestVersion_ReturnsBuildInfo is the regression for the U-3 ride-along
// cat-u-no_version_endpoint (P2). Three behaviors must hold for the
// endpoint to be useful in operator tooling:
//
// 1. GET /api/v1/version returns 200 with a JSON body that decodes into
// the documented VersionInfo shape — the wire contract that rollout
// systems and Prometheus blackbox probes parse.
// 2. The Go runtime version always populates (runtime.Version() can never
// return empty), so consumers can always answer "which Go did this
// binary compile with" even when ldflags / VCS info are missing.
// 3. The Version field is never empty — the fallback ladder
// (ldflags > VCS commit > "dev") guarantees a non-empty string so
// consumers don't have to special-case absent values.
//
// We don't pin the exact Version value because it depends on whether the
// test binary was built with -ldflags or under `go test`, both of which
// the handler must tolerate. The "no empty string" check is the
// behavioral contract.
func TestVersion_ReturnsBuildInfo(t *testing.T) {
h := NewVersionHandler()
req := httptest.NewRequest(http.MethodGet, "/api/v1/version", nil)
rec := httptest.NewRecorder()
h.ServeHTTP(rec, req)
if rec.Code != http.StatusOK {
t.Fatalf("status = %d, want 200", rec.Code)
}
contentType := rec.Header().Get("Content-Type")
if !strings.HasPrefix(contentType, "application/json") {
t.Errorf("Content-Type = %q, want application/json prefix (operator tooling parses JSON)", contentType)
}
var got VersionInfo
if err := json.NewDecoder(rec.Body).Decode(&got); err != nil {
t.Fatalf("response body did not decode into VersionInfo: %v\nbody: %s", err, rec.Body.String())
}
// Version must never be empty — the fallback ladder in readBuildInfo
// guarantees this. An empty Version would force every downstream
// consumer (k8s rollouts, Prometheus blackbox, the support tooling)
// to special-case the missing value, which defeats the point of
// /api/v1/version existing.
if got.Version == "" {
t.Error("Version is empty — the fallback ladder (ldflags > VCS commit > 'dev') must guarantee a non-empty value")
}
// GoVersion must equal runtime.Version() — the handler reads it
// directly and cannot be subverted by ldflags or BuildInfo. This is
// the one field that should always be ground-truth.
if got.GoVersion != runtime.Version() {
t.Errorf("GoVersion = %q, want %q (must come straight from runtime.Version())",
got.GoVersion, runtime.Version())
}
}
// TestVersion_RejectsNonGet pins the GET-only contract. /api/v1/version
// is read-only build identity; POST/PUT/DELETE etc. are nonsensical and
// should return 405 like the HealthHandler does. Operator tooling that
// fat-fingers the verb gets a clear error rather than a confusing 200
// from the wrong code path.
func TestVersion_RejectsNonGet(t *testing.T) {
h := NewVersionHandler()
for _, method := range []string{
http.MethodPost, http.MethodPut, http.MethodDelete, http.MethodPatch,
} {
req := httptest.NewRequest(method, "/api/v1/version", nil)
rec := httptest.NewRecorder()
h.ServeHTTP(rec, req)
if rec.Code != http.StatusMethodNotAllowed {
t.Errorf("%s /api/v1/version → status %d, want 405", method, rec.Code)
}
}
}
// TestVersion_LdflagsOverride locks in the priority order: when the
// build-time Version variable is non-empty (e.g. "v2.0.50" injected by
// release.yml), readBuildInfo MUST surface that value verbatim and not
// silently substitute the VCS commit. The release-pipeline contract
// depends on this — a release tagged v2.0.50 should report "v2.0.50",
// not the underlying SHA.
//
// We achieve test isolation by save/restore on the package-level Version
// variable; t.Cleanup ensures parallel/subsequent tests see the original.
func TestVersion_LdflagsOverride(t *testing.T) {
original := Version
t.Cleanup(func() { Version = original })
Version = "v2.0.50-test"
got := readBuildInfo()
if got.Version != "v2.0.50-test" {
t.Errorf("Version = %q, want %q (ldflags-supplied Version must take priority over VCS fallback)",
got.Version, "v2.0.50-test")
}
}