From 151107c969db5b432529c190b7916bd964f73adc Mon Sep 17 00:00:00 2001
From: shankar0123
Date: Sat, 16 May 2026 23:15:22 +0000
Subject: [PATCH] fix(test-compose): set CERTCTL_AGENT_BOOTSTRAP_TOKEN placeholder (deploy-vendor-e2e job)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

deploy-vendor-e2e was hidden behind the go-build-and-test failure; once
that cleared (b1ca046), the vendor-e2e job actually booted
certctl-test-server for the first time in a while and hit the Sprint 5
ACQ RED-003 fallout:

    Failed to load configuration: phase-2 SEC-H1 fail-closed guard:
    CERTCTL_AGENT_BOOTSTRAP_TOKEN is empty and
    CERTCTL_AGENT_BOOTSTRAP_TOKEN_DENY_EMPTY=true — refuse to start.

The Sprint 5 RED-003 closure flipped DENY_EMPTY's default from
false→true in production code, but the test compose stack never set a
token. The fail-closed guard (internal/config/config.go:1054) refuses
to start unless one of the following holds:

  - CERTCTL_AGENT_BOOTSTRAP_TOKEN is non-empty, OR
  - CERTCTL_DEMO_MODE_ACK=true (demo-mode override), OR
  - CERTCTL_AGENT_BOOTSTRAP_TOKEN_DENY_EMPTY=false (warn-mode escape
    hatch for the v2.1.x→v2.2.x upgrade window)

This is the e2e TEST stack with production-like auth posture
(CERTCTL_AUTH_TYPE=api-key), not a demo stack. The right fix is the
first option: set a deterministic placeholder token. Picking the
warn-mode escape hatch would silently test the wrong posture; picking
DEMO_MODE_ACK would also flip CERTCTL_AUTH_TYPE expectations.

Also fixed deploy/ENVIRONMENTS.md: the entry still said 'default flip
to true scheduled for v2.2.0', which became stale on 2026-05-16 when
Sprint 5 ACQ RED-003 actually flipped it. Updated the default column
from `false` to `true` and rewrote the description to reflect the
current posture and the v2.1.x→v2.2.x warn-mode escape hatch.

Verified locally: all 53 locally-runnable ci-guards still green (4
skipped: H-001-bare-from + H-002-bare-compose-image + digest-validity +
no-precompiled-binary, all need docker-registry network). CI re-run on
this commit should clear deploy-vendor-e2e's certctl-test-server
dependency-failed-to-start step.
---
 deploy/ENVIRONMENTS.md         |  4 ++--
 deploy/docker-compose.test.yml | 12 ++++++++++++
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/deploy/ENVIRONMENTS.md b/deploy/ENVIRONMENTS.md
index 021bbe6..8a4a354 100644
--- a/deploy/ENVIRONMENTS.md
+++ b/deploy/ENVIRONMENTS.md
@@ -419,8 +419,8 @@ Every `CERTCTL_*` environment variable is read by the server's `internal/config/
 | `CERTCTL_RATE_LIMIT_BURST` | `20` | Burst allowance above RPS |
 | `CERTCTL_RATE_LIMIT_BUCKET_TTL` | `1h` | Sprint 2 SEC-006: lifetime of an unused token-bucket entry. A background sweeper running every `BucketTTL/4` reclaims buckets whose last `allow()` call is older than this. Values < 1m clamp up to 1m. Lower when facing high-cardinality unauthenticated traffic (CGNAT churn, scanners) where the bucket-map RSS becomes a concern. |
 | `CERTCTL_SCHEDULER_JOB_CLAIM_LIMIT` | `1000` | Sprint 2 SCALE-001: cap on the number of Pending rows a single scheduler tick may claim via `ClaimPendingJobs`. Pre-Sprint-2 the scheduler claimed every Pending row in one transaction, which page-thrashed on 100K-job bursts. Values ≤ 0 fail-safe to `1000` (legacy unlimited semantics are no longer reachable). Pair-tune with `CERTCTL_RENEWAL_CONCURRENCY` (default 25) — the default 40:1 ratio keeps the fan-out busy without exhausting upstream-CA rate limits. |
-| `CERTCTL_AGENT_BOOTSTRAP_TOKEN` | (empty) | Agent-registration bootstrap secret. Empty = v2.1.x warn-mode pass-through. Set to a real value (`openssl rand -base64 32`); the deny-empty flag's default flip in v2.2.0 will require it. |
-| `CERTCTL_AGENT_BOOTSTRAP_TOKEN_DENY_EMPTY` | `false` | Phase 2 SEC-H1 staged flag. When `true`, the server refuses to start unless `CERTCTL_AGENT_BOOTSTRAP_TOKEN` is non-empty. Default flip to `true` scheduled for v2.2.0. |
+| `CERTCTL_AGENT_BOOTSTRAP_TOKEN` | (empty — required) | Agent-registration bootstrap secret. Set to a real value (`openssl rand -base64 32`). Sprint 5 ACQ RED-003 (2026-05-16) flipped the paired `_DENY_EMPTY` flag's default to `true`, so leaving this empty now refuses server start (unless `CERTCTL_DEMO_MODE_ACK=true`). Operators on v2.1.x reopening the warn-mode escape hatch one upgrade-window can set `CERTCTL_AGENT_BOOTSTRAP_TOKEN_DENY_EMPTY=false` explicitly. |
+| `CERTCTL_AGENT_BOOTSTRAP_TOKEN_DENY_EMPTY` | `true` | Phase 2 SEC-H1 fail-closed guard. When `true` (default since Sprint 5 ACQ RED-003 closure, 2026-05-16), the server refuses to start unless `CERTCTL_AGENT_BOOTSTRAP_TOKEN` is non-empty. Set to `false` only for a v2.1.x→v2.2.x upgrade-window warn-mode escape hatch. |
 | `CERTCTL_DEMO_MODE_ACK` | `false` | Acknowledges demo-mode synthetic admin posture (required when `CERTCTL_AUTH_TYPE=none` binds to a non-loopback host). Must be paired with `CERTCTL_DEMO_MODE_ACK_TS` per Phase 2 SEC-H3. |
 | `CERTCTL_DEMO_MODE_ACK_TS` | (empty) | Phase 2 SEC-H3: unix-epoch timestamp at which DemoModeAck was last acknowledged. When `CERTCTL_DEMO_MODE_ACK=true`, this must parse as a unix epoch within the last 24h. Set via `CERTCTL_DEMO_MODE_ACK_TS=$(date +%s)` at every `docker compose up`. |
 | `CERTCTL_ACME_INSECURE_ACK` | `false` | Phase 2 SEC-M4: explicit ACK required to boot with `CERTCTL_ACME_INSECURE=true`. Production deploys MUST never set either flag. |
diff --git a/deploy/docker-compose.test.yml b/deploy/docker-compose.test.yml
index 15cacf3..4890bdd 100644
--- a/deploy/docker-compose.test.yml
+++ b/deploy/docker-compose.test.yml
@@ -264,6 +264,18 @@ services:
       CERTCTL_AUTH_TYPE: api-key
       CERTCTL_AUTH_SECRET: test-key-2026
 
+      # Phase 2 SEC-H1 + Sprint 5 RED-003 closure (2026-05-16): the
+      # AgentBootstrapTokenDenyEmpty fail-closed guard refuses to start
+      # the server when CERTCTL_AGENT_BOOTSTRAP_TOKEN is empty (the
+      # default DENY_EMPTY=true flipped on Sprint 5). Demo stacks
+      # bypass the guard via CERTCTL_DEMO_MODE_ACK=true, but this is
+      # the e2e TEST stack (production-like auth posture), not a demo
+      # stack — set a deterministic placeholder token so the server
+      # boots and the vendor-edge integration tests can run. Clearly
+      # test-only; do NOT copy to production. Operators set this from
+      # `openssl rand -base64 32` per docs/operator/security.md.
+      CERTCTL_AGENT_BOOTSTRAP_TOKEN: test-agent-bootstrap-token-deterministic-fixture
+
       # Key generation — agent-side (production-like)
       CERTCTL_KEYGEN_MODE: agent