mirror of
https://github.com/shankar0123/certctl.git
synced 2026-06-07 12:41:30 +00:00
a3d8b9c607
GitHub #10 reopened: operator mikeakasully cloned v2.0.50 fresh and ran the canonical quickstart (docker compose -f deploy/docker-compose.yml up -d --build); postgres reported unhealthy indefinitely, dependent containers never started. Root cause: deploy/docker-compose.yml mounted a hand-curated subset of migrations/*.up.sql + seed.sql into postgres /docker-entrypoint-initdb.d/. Postgres applied them at initdb time. Once seed.sql referenced columns added by migrations *after* the mounted cutoff (e.g., policy_rules.severity from migration 000013), initdb crashed mid-seed and the container loop wedged. Two sources of truth (compose mount list vs in-tree migration ladder) diverged the moment a seed-touching migration shipped, and the only thing that fixed it was hand-editing the compose file every release. Fix: remove the dual source. Postgres boots empty; the server applies migrations + seed at startup via RunMigrations + RunSeed. Helm has used this pattern since day one (postgres-init emptyDir); compose now matches. Bundled with four ride-along audit findings whose fixes share the same schema/db code surface, so operators take the schema-change pain only once: cat-u-seed_initdb_schema_drift [P1, primary] — initdb-mount fix cat-o-retry_interval_unit_mismatch [P1] — column rename minutes→seconds cat-o-notification_created_at_dead_field [P2] — add column + populate cat-o-health_check_column_orphans [P1] — drop unwired columns cat-u-no_version_endpoint [P2] — add /api/v1/version Single migration (000017_db_coupling_cleanup) bundles the three schema changes under a DO \$\$ guard so re-application is safe; reduces operator-visible 'schema-change releases' from four to one. Backend - internal/repository/postgres/db.go: add RunSeed (baseline) + RunDemoSeed (gated by CERTCTL_DEMO_SEED). Both idempotent (ON CONFLICT DO NOTHING in every shipped INSERT) so repeated boots are safe; missing-file is no-op so custom packaging that strips seeds still boots cleanly. - cmd/server/main.go: invoke RunSeed (always) + RunDemoSeed (when flag set) immediately after RunMigrations. - internal/repository/postgres/notification.go: NotificationRepository.Create now sets created_at (with time.Now() fallback when caller leaves it zero); scanNotification reads it back; List + ListRetryEligible SELECT extended. - internal/repository/postgres/renewal_policy.go: column references updated to retry_interval_seconds across SELECT/INSERT/UPDATE sites. - internal/api/handler/version.go: new VersionHandler exposes {version, commit, modified, build_time, go_version} from runtime/debug.ReadBuildInfo() with ldflags-supplied Version override. - internal/api/router/router.go: register GET /api/v1/version through the no-auth chain (CORS + ContentType) alongside /health, /ready, /api/v1/auth/info. - cmd/server/main.go: add /api/v1/version to no-auth dispatch + audit ExcludePaths so rollout polling doesn't dominate the audit trail. - internal/config/config.go: add DatabaseConfig.DemoSeed + CERTCTL_DEMO_SEED env var. Migration - migrations/000017_db_coupling_cleanup.up.sql + .down.sql: (1) renewal_policies.retry_interval_minutes → retry_interval_seconds (DO \$\$ guard, idempotent re-application) (2) notification_events ADD COLUMN created_at TIMESTAMPTZ NOT NULL DEFAULT NOW() (3) network_scan_targets DROP orphan health_check_enabled + health_check_interval_seconds - migrations/seed.sql: column reference updated to retry_interval_seconds. - migrations/seed_demo.sql: same column rename + applied at runtime now via RunDemoSeed (no longer initdb-mounted). Compose - deploy/docker-compose.yml: drop ALL initdb mounts (10 migration files + seed.sql); add start_period: 30s to postgres + certctl-server healthchecks to absorb the runtime migration + seed application window on first boot. - deploy/docker-compose.test.yml: same drop (+ ghost seed_test.sql mount removed; that file never existed); same healthcheck start_period. - deploy/docker-compose.demo.yml: replace seed_demo.sql initdb mount with CERTCTL_DEMO_SEED=true env var on certctl-server. Tests - internal/api/handler/version_handler_test.go: TestVersion_ReturnsBuildInfo, TestVersion_RejectsNonGet, TestVersion_LdflagsOverride. - internal/repository/postgres/seed_test.go: TestRunSeed_AppliesIdempotently, TestRunSeed_MissingFileIsNoOp, TestRunDemoSeed_AppliesIdempotently, TestMigration000017_RetryIntervalRename, TestMigration000017_NotificationCreatedAt, TestMigration000017_HealthCheckOrphansDropped (testcontainers, -short skips). - internal/repository/postgres/notification_test.go: TestNotificationRepository_CreatedAt_IsPersisted + TestNotificationRepository_CreatedAt_DefaultsToNow. CI guardrail - .github/workflows/ci.yml: new 'Forbidden migration mount in compose initdb (U-3)' step grep-fails the build if any migrations/*.sql or seed*.sql re-appears in /docker-entrypoint-initdb.d in any compose file. Catches future drift before a fresh-clone operator hits it. Spec / Docs - api/openapi.yaml: add /api/v1/version operation under Health tag. - docs/architecture.md: replace the 'initdb may run the same SQL' paragraph with a post-U-3 single-source-of-truth explanation. - CHANGELOG.md: full unreleased-section entry covering all 5 closures, breaking changes, and the new env var. Audit doc - coverage-gap-audit-2026-04-24-v5/unified-audit.md: add new P1 #14 cat-u-seed_initdb_schema_drift; flip the 4 ride-along findings to ✅ RESOLVED with closure prose pointing at this commit. Verification: build/vet/test -short -race all clean across all touched packages locally; govulncheck reports 0 vulnerabilities affecting our code; OpenAPI YAML parses; CI U-3 grep guardrail clears against the post-fix tree.
318 lines
12 KiB
Go
318 lines
12 KiB
Go
package postgres_test
|
|
|
|
// Integration tests for RenewalPolicyRepository (post-G-1, 289 lines, 5
|
|
// methods). Closes the L-1 coverage gap flagged in coverage-gap-audit.md:
|
|
// the repository's auto-generated-ID collision retry loop and its two
|
|
// typed error sentinels (ErrRenewalPolicyDuplicateName on pg 23505,
|
|
// ErrRenewalPolicyInUse on pg 23503) shipped with zero live-DB regression
|
|
// coverage — a mock-only test surface cannot exercise the PostgreSQL
|
|
// constraint semantics these paths depend on.
|
|
//
|
|
// The audit listed the file as "92 lines, 2 methods"; that was stale
|
|
// pre-G-1. Current state is 5 methods (Get/List/Create/Update/Delete),
|
|
// all covered below.
|
|
|
|
import (
|
|
"context"
|
|
"errors"
|
|
"strings"
|
|
"testing"
|
|
"time"
|
|
|
|
"github.com/shankar0123/certctl/internal/domain"
|
|
"github.com/shankar0123/certctl/internal/repository"
|
|
"github.com/shankar0123/certctl/internal/repository/postgres"
|
|
)
|
|
|
|
// TestRenewalPolicyRepository_CRUD exercises the happy path for all five
|
|
// interface methods. In particular it drives the slug-based ID
|
|
// auto-generation branch (policy.ID left empty → Create emits
|
|
// rp-<slug(name)>) so any regression to slugifyPolicyName or the retry
|
|
// loop surfaces immediately. The AlertThresholdsDays JSONB round-trip is
|
|
// asserted end-to-end: marshal on Create → store as JSONB → scan back on
|
|
// Get preserves the slice ordering and values.
|
|
func TestRenewalPolicyRepository_CRUD(t *testing.T) {
|
|
tdb := getTestDB(t)
|
|
db := tdb.freshSchema(t)
|
|
repo := postgres.NewRenewalPolicyRepository(db)
|
|
ctx := context.Background()
|
|
|
|
// Create: leave ID empty so the repository generates rp-<slug(name)>.
|
|
// "Prod TLS 90d" → rp-prod-tls-90d per slugifyPolicyName's rules
|
|
// (lowercase, spaces→hyphens, non-alphanumeric stripped).
|
|
policy := &domain.RenewalPolicy{
|
|
Name: "Prod TLS 90d",
|
|
RenewalWindowDays: 30,
|
|
AutoRenew: true,
|
|
MaxRetries: 5,
|
|
RetryInterval: 3600, // stored as seconds in retry_interval_seconds column (renamed in 000017_db_coupling_cleanup, U-3)
|
|
AlertThresholdsDays: []int{30, 14, 7, 0},
|
|
}
|
|
|
|
if err := repo.Create(ctx, policy); err != nil {
|
|
t.Fatalf("Create failed: %v", err)
|
|
}
|
|
if policy.ID != "rp-prod-tls-90d" {
|
|
t.Errorf("auto-generated ID = %q, want %q", policy.ID, "rp-prod-tls-90d")
|
|
}
|
|
if policy.CreatedAt.IsZero() {
|
|
t.Error("Create did not populate CreatedAt (RETURNING clause regression?)")
|
|
}
|
|
if policy.UpdatedAt.IsZero() {
|
|
t.Error("Create did not populate UpdatedAt (RETURNING clause regression?)")
|
|
}
|
|
|
|
// Get: pull the just-created row back and confirm every stored field
|
|
// survives the scanRenewalPolicy path, including the JSONB unmarshal.
|
|
got, err := repo.Get(ctx, policy.ID)
|
|
if err != nil {
|
|
t.Fatalf("Get failed: %v", err)
|
|
}
|
|
if got.Name != "Prod TLS 90d" {
|
|
t.Errorf("Get: Name = %q, want %q", got.Name, "Prod TLS 90d")
|
|
}
|
|
if got.RenewalWindowDays != 30 {
|
|
t.Errorf("Get: RenewalWindowDays = %d, want 30", got.RenewalWindowDays)
|
|
}
|
|
if !got.AutoRenew {
|
|
t.Error("Get: AutoRenew = false, want true")
|
|
}
|
|
if got.MaxRetries != 5 {
|
|
t.Errorf("Get: MaxRetries = %d, want 5", got.MaxRetries)
|
|
}
|
|
if len(got.AlertThresholdsDays) != 4 {
|
|
t.Fatalf("Get: AlertThresholdsDays length = %d, want 4 (JSONB round-trip regression)", len(got.AlertThresholdsDays))
|
|
}
|
|
for i, want := range []int{30, 14, 7, 0} {
|
|
if got.AlertThresholdsDays[i] != want {
|
|
t.Errorf("Get: AlertThresholdsDays[%d] = %d, want %d", i, got.AlertThresholdsDays[i], want)
|
|
}
|
|
}
|
|
|
|
// Update: 3-arg signature is a house invariant — don't let it slip to
|
|
// 2-arg without the test catching the breakage. Tweak scalar + JSONB
|
|
// simultaneously so both SET branches exercise.
|
|
updated := *got
|
|
updated.Name = "Prod TLS 90d (tightened)"
|
|
updated.RenewalWindowDays = 45
|
|
updated.AlertThresholdsDays = []int{45, 30, 14, 7, 0}
|
|
|
|
// Sleep long enough that NOW() ticks past the Create timestamp so we
|
|
// can assert UpdatedAt monotonicity without a flaky equality check.
|
|
time.Sleep(2 * time.Millisecond)
|
|
|
|
if err := repo.Update(ctx, policy.ID, &updated); err != nil {
|
|
t.Fatalf("Update failed: %v", err)
|
|
}
|
|
if !updated.UpdatedAt.After(got.UpdatedAt) {
|
|
t.Errorf("Update: UpdatedAt %v not after Create's %v (RETURNING NOW() regression?)", updated.UpdatedAt, got.UpdatedAt)
|
|
}
|
|
|
|
refetched, err := repo.Get(ctx, policy.ID)
|
|
if err != nil {
|
|
t.Fatalf("Get after Update failed: %v", err)
|
|
}
|
|
if refetched.Name != "Prod TLS 90d (tightened)" {
|
|
t.Errorf("Get after Update: Name = %q, want %q", refetched.Name, "Prod TLS 90d (tightened)")
|
|
}
|
|
if refetched.RenewalWindowDays != 45 {
|
|
t.Errorf("Get after Update: RenewalWindowDays = %d, want 45", refetched.RenewalWindowDays)
|
|
}
|
|
if len(refetched.AlertThresholdsDays) != 5 {
|
|
t.Errorf("Get after Update: AlertThresholdsDays length = %d, want 5", len(refetched.AlertThresholdsDays))
|
|
}
|
|
|
|
// List: add a second policy so the ORDER BY name contract is non-vacuous.
|
|
second := &domain.RenewalPolicy{
|
|
Name: "Aa Earliest",
|
|
RenewalWindowDays: 14,
|
|
AutoRenew: false,
|
|
MaxRetries: 1,
|
|
RetryInterval: 60,
|
|
AlertThresholdsDays: []int{7, 0},
|
|
}
|
|
if err := repo.Create(ctx, second); err != nil {
|
|
t.Fatalf("Create second failed: %v", err)
|
|
}
|
|
|
|
all, err := repo.List(ctx)
|
|
if err != nil {
|
|
t.Fatalf("List failed: %v", err)
|
|
}
|
|
if len(all) != 2 {
|
|
t.Fatalf("List: len = %d, want 2", len(all))
|
|
}
|
|
// "Aa Earliest" sorts before "Prod TLS 90d (tightened)" under ORDER BY name ASC.
|
|
if all[0].Name != "Aa Earliest" {
|
|
t.Errorf("List[0].Name = %q, want %q (ORDER BY name regression?)", all[0].Name, "Aa Earliest")
|
|
}
|
|
|
|
// Delete: removes the policy and a follow-up Get surfaces "not found".
|
|
if err := repo.Delete(ctx, policy.ID); err != nil {
|
|
t.Fatalf("Delete failed: %v", err)
|
|
}
|
|
if _, err := repo.Get(ctx, policy.ID); err == nil {
|
|
t.Error("Get after Delete: err = nil, want not-found")
|
|
}
|
|
}
|
|
|
|
// TestRenewalPolicyRepository_DuplicateName verifies the pg 23505
|
|
// unique_violation translation. The name UNIQUE constraint is enforced
|
|
// on the renewal_policies.name column; Create's inner scanRenewalPolicy
|
|
// must see the pq.Error, call isUniqueViolation, check the constraint
|
|
// name, and return ErrRenewalPolicyDuplicateName. A non-sentinel error
|
|
// here would cause the handler to emit 500 instead of 409.
|
|
func TestRenewalPolicyRepository_DuplicateName(t *testing.T) {
|
|
tdb := getTestDB(t)
|
|
db := tdb.freshSchema(t)
|
|
repo := postgres.NewRenewalPolicyRepository(db)
|
|
ctx := context.Background()
|
|
|
|
first := &domain.RenewalPolicy{
|
|
ID: "rp-first",
|
|
Name: "Shared Name",
|
|
RenewalWindowDays: 30,
|
|
AutoRenew: true,
|
|
MaxRetries: 3,
|
|
RetryInterval: 300,
|
|
AlertThresholdsDays: domain.DefaultAlertThresholds(),
|
|
}
|
|
if err := repo.Create(ctx, first); err != nil {
|
|
t.Fatalf("Create first failed: %v", err)
|
|
}
|
|
|
|
// Second policy with a distinct ID but the same Name — the name UNIQUE
|
|
// constraint fires, Create's collision branch inspects pqErr.Constraint,
|
|
// and because it's NOT *_pkey, it returns ErrRenewalPolicyDuplicateName
|
|
// without retrying.
|
|
second := &domain.RenewalPolicy{
|
|
ID: "rp-second",
|
|
Name: "Shared Name",
|
|
RenewalWindowDays: 60,
|
|
AutoRenew: false,
|
|
MaxRetries: 1,
|
|
RetryInterval: 600,
|
|
AlertThresholdsDays: domain.DefaultAlertThresholds(),
|
|
}
|
|
err := repo.Create(ctx, second)
|
|
if err == nil {
|
|
t.Fatal("Create second: err = nil, want ErrRenewalPolicyDuplicateName")
|
|
}
|
|
if !errors.Is(err, repository.ErrRenewalPolicyDuplicateName) {
|
|
t.Errorf("Create second: err = %v, want ErrRenewalPolicyDuplicateName (via errors.Is)", err)
|
|
}
|
|
|
|
// Also verify Update surfaces the same sentinel when an existing row's
|
|
// name is changed to collide with another policy's name.
|
|
third := &domain.RenewalPolicy{
|
|
ID: "rp-third",
|
|
Name: "Third Name",
|
|
RenewalWindowDays: 90,
|
|
AutoRenew: true,
|
|
MaxRetries: 2,
|
|
RetryInterval: 1200,
|
|
AlertThresholdsDays: domain.DefaultAlertThresholds(),
|
|
}
|
|
if err := repo.Create(ctx, third); err != nil {
|
|
t.Fatalf("Create third failed: %v", err)
|
|
}
|
|
third.Name = "Shared Name" // collide with first
|
|
err = repo.Update(ctx, third.ID, third)
|
|
if err == nil {
|
|
t.Fatal("Update: err = nil, want ErrRenewalPolicyDuplicateName")
|
|
}
|
|
if !errors.Is(err, repository.ErrRenewalPolicyDuplicateName) {
|
|
t.Errorf("Update: err = %v, want ErrRenewalPolicyDuplicateName (via errors.Is)", err)
|
|
}
|
|
}
|
|
|
|
// TestRenewalPolicyRepository_DeleteInUse verifies the pg 23503
|
|
// foreign_key_violation translation. managed_certificates.renewal_policy_id
|
|
// REFERENCES renewal_policies(id) ON DELETE RESTRICT; attempting to Delete
|
|
// a policy while a certificate still references it must surface as
|
|
// ErrRenewalPolicyInUse so the handler can emit 409 Conflict. Any change
|
|
// to either the FK definition or the isForeignKeyViolation mapping breaks
|
|
// this.
|
|
func TestRenewalPolicyRepository_DeleteInUse(t *testing.T) {
|
|
tdb := getTestDB(t)
|
|
db := tdb.freshSchema(t)
|
|
repo := postgres.NewRenewalPolicyRepository(db)
|
|
ctx := context.Background()
|
|
|
|
// The policy under test — create via repo so ID auto-generation is
|
|
// also exercised end-to-end in this path.
|
|
policy := &domain.RenewalPolicy{
|
|
Name: "InUse Policy",
|
|
RenewalWindowDays: 30,
|
|
AutoRenew: true,
|
|
MaxRetries: 3,
|
|
RetryInterval: 300,
|
|
AlertThresholdsDays: domain.DefaultAlertThresholds(),
|
|
}
|
|
if err := repo.Create(ctx, policy); err != nil {
|
|
t.Fatalf("Create policy failed: %v", err)
|
|
}
|
|
|
|
// Create owner/team/issuer prerequisites, then raw-INSERT a
|
|
// managed_certificate row referencing the policy. Using raw SQL here
|
|
// (matching insertCertPrereqsRaw's idiom) keeps the test independent
|
|
// of the service layer.
|
|
ownerID, teamID, issuerID, _ := insertCertPrereqsRaw(t, db, ctx, "inuse")
|
|
|
|
now := time.Now().UTC().Truncate(time.Microsecond)
|
|
_, err := db.ExecContext(ctx, `
|
|
INSERT INTO managed_certificates (
|
|
id, name, common_name, sans, environment,
|
|
owner_id, team_id, issuer_id, renewal_policy_id,
|
|
status, expires_at, tags, created_at, updated_at
|
|
) VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14)
|
|
`,
|
|
"mc-inuse", "inuse-cert", "inuse.example.com", []string{}, "production",
|
|
ownerID, teamID, issuerID, policy.ID,
|
|
string(domain.CertificateStatusActive), now.Add(90*24*time.Hour), "{}", now, now,
|
|
)
|
|
if err != nil {
|
|
t.Fatalf("INSERT managed_certificates failed: %v", err)
|
|
}
|
|
|
|
// Delete: the ON DELETE RESTRICT FK fires, the pg driver returns a
|
|
// *pq.Error with Code 23503, isForeignKeyViolation detects it, and
|
|
// the repository returns ErrRenewalPolicyInUse.
|
|
err = repo.Delete(ctx, policy.ID)
|
|
if err == nil {
|
|
t.Fatal("Delete: err = nil, want ErrRenewalPolicyInUse (ON DELETE RESTRICT should have fired)")
|
|
}
|
|
if !errors.Is(err, repository.ErrRenewalPolicyInUse) {
|
|
t.Errorf("Delete: err = %v, want ErrRenewalPolicyInUse (via errors.Is)", err)
|
|
}
|
|
|
|
// And the policy is still there — RESTRICT aborted the delete.
|
|
if _, err := repo.Get(ctx, policy.ID); err != nil {
|
|
t.Errorf("Get after failed Delete: err = %v, want nil (policy should still exist)", err)
|
|
}
|
|
|
|
// After removing the referencing cert, Delete succeeds — proves the
|
|
// RESTRICT was the only thing blocking the earlier Delete and rules
|
|
// out any unrelated failure mode.
|
|
if _, err := db.ExecContext(ctx, `DELETE FROM managed_certificates WHERE id = $1`, "mc-inuse"); err != nil {
|
|
t.Fatalf("cleanup DELETE managed_certificates failed: %v", err)
|
|
}
|
|
if err := repo.Delete(ctx, policy.ID); err != nil {
|
|
t.Errorf("Delete after cleanup: err = %v, want nil", err)
|
|
}
|
|
|
|
// Also verify Delete on a non-existent ID returns a not-found error
|
|
// (not nil, not the InUse sentinel) — guards against a silent no-op
|
|
// regression in the RowsAffected check.
|
|
err = repo.Delete(ctx, "rp-does-not-exist")
|
|
if err == nil {
|
|
t.Fatal("Delete(non-existent): err = nil, want not-found")
|
|
}
|
|
if errors.Is(err, repository.ErrRenewalPolicyInUse) {
|
|
t.Errorf("Delete(non-existent): err = %v, should not be ErrRenewalPolicyInUse", err)
|
|
}
|
|
if !strings.Contains(err.Error(), "not found") {
|
|
t.Errorf("Delete(non-existent): err = %v, want substring %q", err, "not found")
|
|
}
|
|
}
|