mirror of
https://github.com/shankar0123/certctl.git
synced 2026-06-07 15:51:30 +00:00
34d52009048c92f241ccaef484acaabba0b80331
6 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
25996f86fa |
fix(deploy): wire CERTCTL_DEMO_MODE_ACK_TS into the demo overlay path
Phase 2 SEC-H3 (commit
|
||
|
|
072e2af198 |
fix(compose): pin CERTCTL_DATABASE_URL in demo overlay (cold-DB smoke fix #4)
Fourth latent bug surfaced by the Auditable Codebase Bundle's cold-DB compose smoke. CI run on master tip |
||
|
|
a849c8b8cf |
fix(security): close BUNDLE 2 — safe first run, demo mode, agent bootstrap
Bundle 2 closure (2026-05-12 acquisition diligence audit). Closes the
"docker compose up == accidental production" hazard: pre-Bundle-2 the
base deploy/docker-compose.yml WAS the demo path (AUTH_TYPE=none +
DEMO_MODE_ACK=true + KEYGEN_MODE=server + DEMO_SEED=true + literal
change-me-... placeholder creds), the README claimed "drop the demo
overlay for a clean install", and ENVIRONMENTS.md table documented
auth-type default as api-key — three contradictory stories layered on
the same compose file.
Source findings closed:
R2 R3 C1 D9 finding-2 S9 (repo audit)
SEC-H2 SEC-M1 SEC-M3 OPS-M3 LOW-5 HIGH-6 (cowork audit)
Compose split (deploy/docker-compose.yml + deploy/docker-compose.demo.yml):
The base now ships production-shaped — no AUTH_TYPE override, no
KEYGEN_MODE override, no DEMO_MODE_ACK, no DEMO_SEED, no literal
placeholder fallbacks. POSTGRES_PASSWORD / CERTCTL_AUTH_SECRET /
CERTCTL_CONFIG_ENCRYPTION_KEY / CERTCTL_API_KEY / CERTCTL_AGENT_ID
must come from deploy/.env (sample template in deploy/.env.example +
root .env.example). The demo overlay carries the full demo posture
(every env var + every placeholder credential) so the
`-f docker-compose.demo.yml` one-flag flip remains a zero-config
populated-dashboard path.
Fail-closed startup guards (internal/config/config.go::Validate):
Three new gates layered on the existing HIGH-12 demo-mode listen-bind
guard. All three exempt CERTCTL_DEMO_MODE_ACK=true so the demo overlay
keeps working:
• HIGH-6: AUTH_SECRET = "change-me-in-production" → refuse
• HIGH-6: CONFIG_ENCRYPTION_KEY = "change-me-32-char..." → refuse
• LOW-5: CORS_ORIGINS contains "*" (CWE-942 + CWE-352) → refuse
Visible DEMO MODE banner (cmd/server/main.go): every boot under
DEMO_MODE_ACK=true now emits a prominent WARN line with a 6-step
production-promotion checklist. The 2026-04-19 incident (a screenshot
run that kept running for three days) drove this; the per-startup
banner makes the posture unmissable in any log scraper.
Agent enrollment doc alignment:
• docs/reference/configuration.md L83: corrected the non-existent
URL `POST /api/v1/agents/register` to the real route
`POST /api/v1/agents`; added the bootstrap-token note and the
install-agent.sh handoff sequence.
• docs/reference/architecture.md L154: replaced "agents register
themselves at first heartbeat" (false — cmd/agent/main.go fail-
fasts when CERTCTL_AGENT_ID is unset) with the actual two-step
operator-driven flow (REST or GUI registration first, returned ID
fed to install-agent.sh second).
Tests + CI guard:
• 9 new TestValidate_Bundle2_* cases in internal/config/config_test.go
covering: placeholder-secret refused + demo-ack exempt; placeholder
encryption-key refused + demo-ack exempt; real key not mistaken for
placeholder; wildcard CORS refused + demo-ack exempt; wildcard mixed
into a concrete allowlist still refused; concrete allowlist accepted.
• scripts/ci-guards/B2-compose-base-no-demo-env.sh: greps the base
compose for any of the demo-mode env vars + placeholder credentials.
Comments stripped before checking so the narrative header in the
base file can still reference the overlay's posture in prose.
Cold-DB CI smoke (.github/workflows/ci.yml::cold-db-compose-smoke):
Switched to layering -f docker-compose.demo.yml on top of the base —
the new production base requires real env vars the smoke doesn't have,
and the smoke's purpose (catch migration-on-cold-DB regressions + the
bootstrap-token mint path) is orthogonal to which auth posture the
boot lands in.
Receipts:
• Current first-run truth table
compose flag → posture
-f docker-compose.yml (production)
→ requires .env;
fail-fasts on
missing AUTH_SECRET
/ CONFIG_ENCRYPTION
_KEY / POSTGRES
_PASSWORD; agent
fail-fasts on
missing AGENT_ID
-f docker-compose.yml -f docker-compose.demo.yml (demo)
→ zero-config;
AUTH_TYPE=none +
DEMO_MODE_ACK=true
+ KEYGEN=server +
DEMO_SEED=true;
boot banner WARN
-f docker-compose.yml -f docker-compose.dev.yml (dev)
→ base + PgAdmin
+ debug logging
-f docker-compose.test.yml (test, standalone)
→ production-shape
posture, real CA
backends
• Verification (PATH=/tmp/go/bin export GO* paths to /tmp):
gofmt -l # clean (no diffs)
go vet ./internal/config ./cmd/server # clean
go test -short -count=1 ./internal/config/... # PASS (cumulative +
all 9 new Bundle 2
cases green)
go test -short -count=1 # PASS (no regression
./internal/connector/target/configcheck in the Bundle 1 -
closure tests)
go build ./cmd/server ./cmd/agent # clean
./cmd/cli ./cmd/mcp-server
bash scripts/ci-guards/B2-compose-base-no-demo-env.sh # clean
bash scripts/ci-guards/H-1-encryption-key-min-length.sh # clean
bash scripts/ci-guards/G-3-env-docs-drift.sh # clean
Remaining operator warnings (not blocking; tracked in CLAUDE.md
"Open decisions"):
• The first `docker compose -f docker-compose.yml up -d` against a
pre-Bundle-2 .env (placeholder values still in place) will now
fail-fast. This is the intended posture but operators upgrading
from v2.0.x via .env-from-old-master need to rotate before
upgrading. The CHANGELOG note for the v2.1.0 release should
call this out alongside Auth Bundle 2's other breaking changes.
Audit-Closes: BUNDLE-2 R2 R3 C1 D9 S9 SEC-H2 SEC-M1 SEC-M3 OPS-M3 LOW-5 HIGH-6
|
||
|
|
a3d8b9c607 |
fix(deploy,db,handler): close fresh-clone postgres init failure + 4 ride-along audit findings (U-3 master)
GitHub #10 reopened: operator mikeakasully cloned v2.0.50 fresh and ran the canonical quickstart (docker compose -f deploy/docker-compose.yml up -d --build); postgres reported unhealthy indefinitely, dependent containers never started. Root cause: deploy/docker-compose.yml mounted a hand-curated subset of migrations/*.up.sql + seed.sql into postgres /docker-entrypoint-initdb.d/. Postgres applied them at initdb time. Once seed.sql referenced columns added by migrations *after* the mounted cutoff (e.g., policy_rules.severity from migration 000013), initdb crashed mid-seed and the container loop wedged. Two sources of truth (compose mount list vs in-tree migration ladder) diverged the moment a seed-touching migration shipped, and the only thing that fixed it was hand-editing the compose file every release. Fix: remove the dual source. Postgres boots empty; the server applies migrations + seed at startup via RunMigrations + RunSeed. Helm has used this pattern since day one (postgres-init emptyDir); compose now matches. Bundled with four ride-along audit findings whose fixes share the same schema/db code surface, so operators take the schema-change pain only once: cat-u-seed_initdb_schema_drift [P1, primary] — initdb-mount fix cat-o-retry_interval_unit_mismatch [P1] — column rename minutes→seconds cat-o-notification_created_at_dead_field [P2] — add column + populate cat-o-health_check_column_orphans [P1] — drop unwired columns cat-u-no_version_endpoint [P2] — add /api/v1/version Single migration (000017_db_coupling_cleanup) bundles the three schema changes under a DO \$\$ guard so re-application is safe; reduces operator-visible 'schema-change releases' from four to one. Backend - internal/repository/postgres/db.go: add RunSeed (baseline) + RunDemoSeed (gated by CERTCTL_DEMO_SEED). Both idempotent (ON CONFLICT DO NOTHING in every shipped INSERT) so repeated boots are safe; missing-file is no-op so custom packaging that strips seeds still boots cleanly. - cmd/server/main.go: invoke RunSeed (always) + RunDemoSeed (when flag set) immediately after RunMigrations. - internal/repository/postgres/notification.go: NotificationRepository.Create now sets created_at (with time.Now() fallback when caller leaves it zero); scanNotification reads it back; List + ListRetryEligible SELECT extended. - internal/repository/postgres/renewal_policy.go: column references updated to retry_interval_seconds across SELECT/INSERT/UPDATE sites. - internal/api/handler/version.go: new VersionHandler exposes {version, commit, modified, build_time, go_version} from runtime/debug.ReadBuildInfo() with ldflags-supplied Version override. - internal/api/router/router.go: register GET /api/v1/version through the no-auth chain (CORS + ContentType) alongside /health, /ready, /api/v1/auth/info. - cmd/server/main.go: add /api/v1/version to no-auth dispatch + audit ExcludePaths so rollout polling doesn't dominate the audit trail. - internal/config/config.go: add DatabaseConfig.DemoSeed + CERTCTL_DEMO_SEED env var. Migration - migrations/000017_db_coupling_cleanup.up.sql + .down.sql: (1) renewal_policies.retry_interval_minutes → retry_interval_seconds (DO \$\$ guard, idempotent re-application) (2) notification_events ADD COLUMN created_at TIMESTAMPTZ NOT NULL DEFAULT NOW() (3) network_scan_targets DROP orphan health_check_enabled + health_check_interval_seconds - migrations/seed.sql: column reference updated to retry_interval_seconds. - migrations/seed_demo.sql: same column rename + applied at runtime now via RunDemoSeed (no longer initdb-mounted). Compose - deploy/docker-compose.yml: drop ALL initdb mounts (10 migration files + seed.sql); add start_period: 30s to postgres + certctl-server healthchecks to absorb the runtime migration + seed application window on first boot. - deploy/docker-compose.test.yml: same drop (+ ghost seed_test.sql mount removed; that file never existed); same healthcheck start_period. - deploy/docker-compose.demo.yml: replace seed_demo.sql initdb mount with CERTCTL_DEMO_SEED=true env var on certctl-server. Tests - internal/api/handler/version_handler_test.go: TestVersion_ReturnsBuildInfo, TestVersion_RejectsNonGet, TestVersion_LdflagsOverride. - internal/repository/postgres/seed_test.go: TestRunSeed_AppliesIdempotently, TestRunSeed_MissingFileIsNoOp, TestRunDemoSeed_AppliesIdempotently, TestMigration000017_RetryIntervalRename, TestMigration000017_NotificationCreatedAt, TestMigration000017_HealthCheckOrphansDropped (testcontainers, -short skips). - internal/repository/postgres/notification_test.go: TestNotificationRepository_CreatedAt_IsPersisted + TestNotificationRepository_CreatedAt_DefaultsToNow. CI guardrail - .github/workflows/ci.yml: new 'Forbidden migration mount in compose initdb (U-3)' step grep-fails the build if any migrations/*.sql or seed*.sql re-appears in /docker-entrypoint-initdb.d in any compose file. Catches future drift before a fresh-clone operator hits it. Spec / Docs - api/openapi.yaml: add /api/v1/version operation under Health tag. - docs/architecture.md: replace the 'initdb may run the same SQL' paragraph with a post-U-3 single-source-of-truth explanation. - CHANGELOG.md: full unreleased-section entry covering all 5 closures, breaking changes, and the new env var. Audit doc - coverage-gap-audit-2026-04-24-v5/unified-audit.md: add new P1 #14 cat-u-seed_initdb_schema_drift; flip the 4 ride-along findings to ✅ RESOLVED with closure prose pointing at this commit. Verification: build/vet/test -short -race all clean across all touched packages locally; govulncheck reports 0 vulnerabilities affecting our code; OpenAPI YAML parses; CI U-3 grep guardrail clears against the post-fix tree. |
||
|
|
c015cab2f4 |
docs: rewrite features.md, audit README + architecture against repo
Rewrote docs/features.md from scratch as authoritative feature inventory (1255 lines, every claim verified against source files). Audited README.md and architecture.md against repo — fixed 19 stale references: K8s Secrets status, issuer counts, dashboard page counts, CI thresholds, missing connectors in Mermaid diagrams, OpenAPI operation count, GetCACertPEM behavior, and V2/V4 roadmap accuracy. Also includes related fixes discovered during audit: - Scheduler skips expired/failed/revoked certs from auto-renewal - Seed demo expiry dates moved outside 31-day scheduler query window - Agent pages use correct last_heartbeat_at field name Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |
||
|
|
8f146e08d6 |
feat(M36): onboarding wizard for first-run experience
4-step wizard (Connect CA → Deploy Agent → Add Certificate → Done) shown on fresh installs when no user-configured issuers or certificates exist. Auto-seeded env var issuers (source="env") are excluded from first-run detection. Wizard state latches to prevent query refetches from dismissing it mid-flow. Split docker-compose into clean default (wizard-compatible) and demo override (seed_demo.sql). Added missing migrations 000009/000010 to test compose. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |