fix(security): close BUNDLE 2 — safe first run, demo mode, agent bootstrap

Bundle 2 closure (2026-05-12 acquisition diligence audit). Closes the
"docker compose up == accidental production" hazard. Pre-Bundle-2, the
base deploy/docker-compose.yml WAS the demo path (AUTH_TYPE=none +
DEMO_MODE_ACK=true + KEYGEN_MODE=server + DEMO_SEED=true + literal
change-me-... placeholder creds), the README claimed "drop the demo
overlay for a clean install", and the ENVIRONMENTS.md table documented
the auth-type default as api-key: three contradictory stories layered
on the same compose file.

Source findings closed:
  R2 R3 C1 D9 finding-2 S9               (repo audit)
  SEC-H2 SEC-M1 SEC-M3 OPS-M3 LOW-5 HIGH-6 (cowork audit)

Compose split (deploy/docker-compose.yml + deploy/docker-compose.demo.yml):
The base now ships production-shaped — no AUTH_TYPE override, no
KEYGEN_MODE override, no DEMO_MODE_ACK, no DEMO_SEED, no literal
placeholder fallbacks. POSTGRES_PASSWORD / CERTCTL_AUTH_SECRET /
CERTCTL_CONFIG_ENCRYPTION_KEY / CERTCTL_API_KEY / CERTCTL_AGENT_ID
must come from deploy/.env (sample template in deploy/.env.example +
root .env.example). The demo overlay carries the full demo posture
(every env var + every placeholder credential) so the
`-f docker-compose.demo.yml` one-flag flip remains a zero-config
populated-dashboard path.

Fail-closed startup guards (internal/config/config.go::Validate):
Three new gates layered on the existing HIGH-12 demo-mode listen-bind
guard. All three exempt CERTCTL_DEMO_MODE_ACK=true so the demo overlay
keeps working:
  • HIGH-6:  AUTH_SECRET = "change-me-in-production"        → refuse
  • HIGH-6:  CONFIG_ENCRYPTION_KEY = "change-me-32-char..." → refuse
  • LOW-5:   CORS_ORIGINS contains "*"  (CWE-942 + CWE-352) → refuse

Visible DEMO MODE banner (cmd/server/main.go): every boot under
DEMO_MODE_ACK=true now emits a prominent WARN line with a 6-step
production-promotion checklist. The 2026-04-19 incident (a screenshot
run that kept running for three days) drove this; the per-startup
banner makes the posture unmissable in any log scraper.

Agent enrollment doc alignment:
  • docs/reference/configuration.md L83: corrected the non-existent
    URL `POST /api/v1/agents/register` to the real route
    `POST /api/v1/agents`; added the bootstrap-token note and the
    install-agent.sh handoff sequence.
  • docs/reference/architecture.md L154: replaced "agents register
    themselves at first heartbeat" (false — cmd/agent/main.go fail-
    fasts when CERTCTL_AGENT_ID is unset) with the actual two-step
    operator-driven flow (REST or GUI registration first, returned ID
    fed to install-agent.sh second).
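
The fail-fast half of that flow can be sketched as follows. This is a
minimal illustration, not the code in cmd/agent/main.go: the helper name
requireAgentID, the injected getenv parameter, and the sample ID
"agent-123" are hypothetical. Only the CERTCTL_AGENT_ID variable name
and the refuse-to-start behavior come from the notes above.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// requireAgentID is a hypothetical helper: it returns the agent ID from the
// environment, or an error telling the operator to register the agent first.
// getenv is injected so the behavior can be exercised without touching the
// real process environment.
func requireAgentID(getenv func(string) string) (string, error) {
	id := strings.TrimSpace(getenv("CERTCTL_AGENT_ID"))
	if id == "" {
		return "", errors.New("CERTCTL_AGENT_ID is unset: register the agent first " +
			"(POST /api/v1/agents or the GUI), then pass the returned ID to install-agent.sh")
	}
	return id, nil
}

func main() {
	// Case 1: nothing configured, so startup is refused.
	unset := func(string) string { return "" }
	if _, err := requireAgentID(unset); err != nil {
		fmt.Println("refused:", err)
	}

	// Case 2: the operator supplied the ID returned by registration.
	set := func(key string) string {
		if key == "CERTCTL_AGENT_ID" {
			return "agent-123" // hypothetical ID
		}
		return ""
	}
	if id, err := requireAgentID(set); err == nil {
		fmt.Println("starting with agent ID", id)
	}
}
```

A real agent would presumably pass os.Getenv and exit non-zero on the
error path; the stubbed lookups here only keep the sketch deterministic.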

Tests + CI guard:
  • 9 new TestValidate_Bundle2_* cases in internal/config/config_test.go
    covering: placeholder-secret refused + demo-ack exempt; placeholder
    encryption-key refused + demo-ack exempt; real key not mistaken for
    placeholder; wildcard CORS refused + demo-ack exempt; wildcard mixed
    into a concrete allowlist still refused; concrete allowlist accepted.
  • scripts/ci-guards/B2-compose-base-no-demo-env.sh: greps the base
    compose for any of the demo-mode env vars + placeholder credentials.
    Comments stripped before checking so the narrative header in the
    base file can still reference the overlay's posture in prose.
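
The strip-comments-then-match idea behind that guard can be sketched in
Go. This is not the shell script itself: the token list, the function
names, and the naive comment rule (a '#' at line start or after
whitespace, ignoring YAML quoting) are all assumptions made for
illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// forbidden is an illustrative token list; the patterns the real script
// greps for are not reproduced here.
var forbidden = []string{
	"DEMO_MODE_ACK",
	"DEMO_SEED",
	"change-me-",
}

// stripComment drops everything from the first '#' that starts the line or
// follows whitespace. It is deliberately naive and ignores quoted strings.
func stripComment(line string) string {
	for i := 0; i < len(line); i++ {
		if line[i] == '#' && (i == 0 || line[i-1] == ' ' || line[i-1] == '\t') {
			return line[:i]
		}
	}
	return line
}

// scanCompose reports "line N: token" for every forbidden token that
// appears outside a comment.
func scanCompose(src string) []string {
	var hits []string
	sc := bufio.NewScanner(strings.NewReader(src))
	for n := 1; sc.Scan(); n++ {
		body := stripComment(sc.Text())
		for _, tok := range forbidden {
			if strings.Contains(body, tok) {
				hits = append(hits, fmt.Sprintf("line %d: %s", n, tok))
			}
		}
	}
	return hits
}

func main() {
	// The header comment may mention the demo posture in prose; only
	// non-comment content counts as a violation.
	base := "# The demo overlay sets CERTCTL_DEMO_MODE_ACK=true\n" +
		"services:\n" +
		"  server:\n" +
		"    environment:\n" +
		"      CERTCTL_AUTH_SECRET: ${CERTCTL_AUTH_SECRET}\n"
	fmt.Println(len(scanCompose(base)), "violations")
}
```

Stripping comments before matching is what lets the base file's
narrative header describe the overlay without tripping the check.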

Cold-DB CI smoke (.github/workflows/ci.yml::cold-db-compose-smoke):
Switched to layering -f docker-compose.demo.yml on top of the base —
the new production base requires real env vars the smoke doesn't have,
and the smoke's purpose (catch migration-on-cold-DB regressions + the
bootstrap-token mint path) is orthogonal to which auth posture the
boot lands in.

Receipts:
  • Current first-run truth table
        compose flag                                  → posture
        -f docker-compose.yml                          (production)
                                                       → requires .env;
                                                       fail-fasts on
                                                       missing AUTH_SECRET
                                                       / CONFIG_ENCRYPTION
                                                       _KEY / POSTGRES
                                                       _PASSWORD; agent
                                                       fail-fasts on
                                                       missing AGENT_ID
        -f docker-compose.yml -f docker-compose.demo.yml  (demo)
                                                       → zero-config;
                                                       AUTH_TYPE=none +
                                                       DEMO_MODE_ACK=true
                                                       + KEYGEN=server +
                                                       DEMO_SEED=true;
                                                       boot banner WARN
        -f docker-compose.yml -f docker-compose.dev.yml   (dev)
                                                       → base + PgAdmin
                                                       + debug logging
        -f docker-compose.test.yml                     (test, standalone)
                                                       → production-shape
                                                       posture, real CA
                                                       backends
  • Verification (PATH=/tmp/go/bin; GO* paths exported to /tmp):
        gofmt -l                                      # clean (no diffs)
        go vet ./internal/config ./cmd/server         # clean
        go test -short -count=1 ./internal/config/... # PASS (cumulative +
                                                       all 9 new Bundle 2
                                                       cases green)
        go test -short -count=1                       # PASS (no regression
            ./internal/connector/target/configcheck    in the Bundle 1
                                                       closure tests)
        go build ./cmd/server ./cmd/agent             # clean
            ./cmd/cli ./cmd/mcp-server
        bash scripts/ci-guards/B2-compose-base-no-demo-env.sh  # clean
        bash scripts/ci-guards/H-1-encryption-key-min-length.sh # clean
        bash scripts/ci-guards/G-3-env-docs-drift.sh           # clean

Remaining operator warnings (not blocking; tracked in CLAUDE.md
"Open decisions"):
  • The first `docker compose -f docker-compose.yml up -d` against a
    pre-Bundle-2 .env (placeholder values still in place) will now
    fail-fast. This is the intended posture but operators upgrading
    from v2.0.x via .env-from-old-master need to rotate before
    upgrading. The CHANGELOG note for the v2.1.0 release should
    call this out alongside Auth Bundle 2's other breaking changes.
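
Operators who want to check an old .env before upgrading could run
something like the sketch below. It is not part of this commit: the
function name findPlaceholders and the simple KEY=VALUE parsing are
assumptions, and it is stricter than the server guard because it flags
the placeholder auth secret regardless of auth type. The two placeholder
strings are the ones the new Validate() guards refuse.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// placeholders holds the two sentinel values refused outside demo mode.
var placeholders = map[string]bool{
	"change-me-in-production":          true,
	"change-me-32-char-encryption-key": true,
}

// findPlaceholders returns the keys of a KEY=VALUE .env body whose value is
// still a known placeholder. Blank lines and '#' comment lines are skipped;
// surrounding quotes on the value are ignored.
func findPlaceholders(env string) []string {
	var keys []string
	sc := bufio.NewScanner(strings.NewReader(env))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		key, val, ok := strings.Cut(line, "=")
		if !ok {
			continue
		}
		val = strings.Trim(strings.TrimSpace(val), `"'`)
		if placeholders[val] {
			keys = append(keys, strings.TrimSpace(key))
		}
	}
	return keys
}

func main() {
	// Sample .env content; POSTGRES_PASSWORD here is an arbitrary value.
	env := "# copied from an old template\n" +
		"CERTCTL_AUTH_SECRET=change-me-in-production\n" +
		"CERTCTL_CONFIG_ENCRYPTION_KEY=\"change-me-32-char-encryption-key\"\n" +
		"POSTGRES_PASSWORD=s3cret\n"
	for _, k := range findPlaceholders(env) {
		fmt.Println("rotate before upgrading:", k)
	}
}
```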

Audit-Closes: BUNDLE-2 R2 R3 C1 D9 S9 SEC-H2 SEC-M1 SEC-M3 OPS-M3 LOW-5 HIGH-6
shankar0123
2026-05-13 00:14:59 +00:00
parent d60a0ac297
commit a849c8b8cf
13 changed files with 645 additions and 90 deletions
@@ -2633,6 +2633,70 @@ func (c *Config) Validate() error {
		}
	}

	// Bundle 2 (2026-05-12) — fail-closed startup guards for placeholder
	// credentials shipped by the demo overlay (docker-compose.demo.yml).
	//
	// Rationale: pre-Bundle-2 the base docker-compose.yml file interpolated
	// these strings as the default value when an operator didn't set
	// CERTCTL_AUTH_SECRET / CERTCTL_API_KEY / CERTCTL_CONFIG_ENCRYPTION_KEY
	// in deploy/.env. The result: `docker compose up` produced a working
	// stack with documented "weak" credentials that nobody actually
	// remembered to rotate before going to production. The Bundle 2 compose
	// split moved those defaults into the demo overlay; the guards below
	// catch any path that still surfaces them in a non-demo deploy (e.g.
	// the .env-example was committed unedited, or a custom compose copied
	// the placeholder verbatim).
	//
	// Both sentinels exactly match the literal strings shipped in
	// deploy/docker-compose.demo.yml. The demo overlay also sets
	// DemoModeAck=true, so the demo path itself is exempt and these
	// strings only fail in production.
	const (
		placeholderAPISecret     = "change-me-in-production"
		placeholderEncryptionKey = "change-me-32-char-encryption-key"
	)
	if !c.Auth.DemoModeAck {
		// HIGH-6 closure (Audit Bundle 2): placeholder API-key secret.
		if c.Auth.Type == string(AuthTypeAPIKey) && c.Auth.Secret == placeholderAPISecret {
			return fmt.Errorf(
				"CERTCTL_AUTH_SECRET is set to the demo placeholder %q — refuse to start. "+
					"Generate a real value with: openssl rand -base64 32. "+
					"This guard exempts demo mode (CERTCTL_DEMO_MODE_ACK=true); production "+
					"deploys MUST rotate.",
				placeholderAPISecret)
		}
		// HIGH-6 closure (Audit Bundle 2): placeholder encryption key.
		if c.Encryption.ConfigEncryptionKey == placeholderEncryptionKey {
			return fmt.Errorf(
				"CERTCTL_CONFIG_ENCRYPTION_KEY is set to the demo placeholder %q — refuse to start. "+
					"Generate a real value with: openssl rand -base64 32 (must be ≥ 32 bytes). "+
					"This guard exempts demo mode (CERTCTL_DEMO_MODE_ACK=true); production "+
					"deploys MUST rotate before any issuer/target credentials are encrypted at rest "+
					"with the placeholder passphrase.",
				placeholderEncryptionKey)
		}
		// LOW-5 closure (Audit Bundle 2): CORS wildcard in non-demo mode.
		// Wildcard CORS combined with credentialed cookies (the session
		// auth Bundle 2 ships) is a CSRF cross-origin escalation channel
		// (CWE-942 + CWE-352). The auth-exempt routes already route through
		// middleware.NewCORS with the operator's allowlist; "*" in the
		// allowlist short-circuits the entire defense. Demo mode is
		// exempt because the demo synthetic actor has no real credentials
		// worth stealing, and demo screencaps frequently want to exercise
		// the dashboard from a Mermaid-rendered URL or whatever.
		for _, origin := range c.CORS.AllowedOrigins {
			if origin == "*" {
				return fmt.Errorf(
					"CERTCTL_CORS_ORIGINS contains \"*\" wildcard — refuse to start. " +
						"Wildcard CORS combined with credentialed cookies is a cross-origin " +
						"CSRF / session-theft channel (CWE-942 + CWE-352). Set a concrete " +
						"allowlist (e.g. CERTCTL_CORS_ORIGINS=https://dashboard.example.com) " +
						"or set CERTCTL_DEMO_MODE_ACK=true if this is a demo deploy that " +
						"has no real session credentials worth defending.")
			}
		}
	}

	// Validate keygen mode
	validKeygenModes := map[string]bool{
		"agent": true,
@@ -1526,3 +1526,159 @@ func TestValidate_SCEPDisabled_EmptyRAPair_Accepts(t *testing.T) {
		t.Errorf("Validate() = %v, want nil for SCEP disabled with empty RA pair", err)
	}
}

// Bundle 2 closure (2026-05-12) — fail-closed startup guards against
// placeholder credentials shipped by the demo overlay
// (deploy/docker-compose.demo.yml). The literal strings below MUST stay
// in sync with the sentinels in internal/config/config.go::Validate; the
// demo overlay also writes these exact values into its env block, so any
// drift between the three locations would silently break the closure.

// TestValidate_Bundle2_PlaceholderAuthSecret_Refused pins the contract
// that the placeholder string "change-me-in-production" in
// CERTCTL_AUTH_SECRET hard-fails Validate() outside demo mode.
func TestValidate_Bundle2_PlaceholderAuthSecret_Refused(t *testing.T) {
	cfg := validBaseConfigForEncryption(t)
	cfg.Auth.Type = "api-key"
	cfg.Auth.Secret = "change-me-in-production"
	cfg.Auth.DemoModeAck = false

	err := cfg.Validate()
	if err == nil {
		t.Fatal("Validate() returned nil; expected refusal on placeholder CERTCTL_AUTH_SECRET")
	}
	for _, want := range []string{"CERTCTL_AUTH_SECRET", "change-me-in-production", "openssl rand"} {
		if !strings.Contains(err.Error(), want) {
			t.Errorf("Validate() error = %q; missing operator guidance substring %q", err, want)
		}
	}
}

// TestValidate_Bundle2_PlaceholderAuthSecret_DemoAckExempt pins that
// the demo overlay (which sets the placeholder + DemoModeAck=true) is
// exempt — without this exemption the demo path would fail to boot.
func TestValidate_Bundle2_PlaceholderAuthSecret_DemoAckExempt(t *testing.T) {
	cfg := validBaseConfigForEncryption(t)
	// Demo overlay sets AUTH_TYPE=none (so the placeholder doesn't even
	// hit the api-key branch), but cover the api-key + ack edge case too
	// in case an operator manually flips the demo overlay's AUTH_TYPE.
	cfg.Auth.Type = "api-key"
	cfg.Auth.Secret = "change-me-in-production"
	cfg.Auth.DemoModeAck = true

	if err := cfg.Validate(); err != nil {
		t.Errorf("Validate() returned %v with DemoModeAck=true; demo path must accept placeholder secret", err)
	}
}

// TestValidate_Bundle2_PlaceholderEncryptionKey_Refused pins the
// contract that "change-me-32-char-encryption-key" hard-fails Validate()
// outside demo mode. Note: this string is exactly 32 bytes, so it
// passes the H-1 length floor; the only thing catching it is the
// Bundle 2 value-equality guard.
func TestValidate_Bundle2_PlaceholderEncryptionKey_Refused(t *testing.T) {
	cfg := validBaseConfigForEncryption(t)
	cfg.Encryption.ConfigEncryptionKey = "change-me-32-char-encryption-key"
	cfg.Auth.DemoModeAck = false

	err := cfg.Validate()
	if err == nil {
		t.Fatal("Validate() returned nil; expected refusal on placeholder CERTCTL_CONFIG_ENCRYPTION_KEY")
	}
	for _, want := range []string{"CERTCTL_CONFIG_ENCRYPTION_KEY", "change-me-32-char-encryption-key", "openssl rand"} {
		if !strings.Contains(err.Error(), want) {
			t.Errorf("Validate() error = %q; missing operator guidance substring %q", err, want)
		}
	}
}

// TestValidate_Bundle2_PlaceholderEncryptionKey_DemoAckExempt covers
// the demo overlay's posture (placeholder + DemoModeAck=true).
func TestValidate_Bundle2_PlaceholderEncryptionKey_DemoAckExempt(t *testing.T) {
	cfg := validBaseConfigForEncryption(t)
	cfg.Encryption.ConfigEncryptionKey = "change-me-32-char-encryption-key"
	cfg.Auth.DemoModeAck = true

	if err := cfg.Validate(); err != nil {
		t.Errorf("Validate() returned %v with DemoModeAck=true; demo path must accept placeholder encryption key", err)
	}
}

// TestValidate_Bundle2_RealEncryptionKey_NotMistakenForPlaceholder
// pins that a real `openssl rand -base64 32` output sails through.
// Defense against an over-broad match (e.g. accidentally rejecting any
// key starting with "change-me-").
func TestValidate_Bundle2_RealEncryptionKey_NotMistakenForPlaceholder(t *testing.T) {
	cfg := validBaseConfigForEncryption(t)
	// 44-char base64 sample — same shape `openssl rand -base64 32` produces.
	cfg.Encryption.ConfigEncryptionKey = "Tc1hZ4n3Ph5gC8e2zR0qV6jX9mYwL1pK4wB7uE3nQ5o="
	cfg.Auth.DemoModeAck = false

	if err := cfg.Validate(); err != nil {
		t.Errorf("Validate() returned %v; want nil for realistic operator key", err)
	}
}

// TestValidate_Bundle2_CORSWildcard_Refused pins the LOW-5 closure:
// CERTCTL_CORS_ORIGINS containing "*" hard-fails Validate() outside
// demo mode. Wildcard CORS + session cookies = CWE-942 + CWE-352.
func TestValidate_Bundle2_CORSWildcard_Refused(t *testing.T) {
	cfg := validBaseConfigForEncryption(t)
	cfg.CORS.AllowedOrigins = []string{"*"}
	cfg.Auth.DemoModeAck = false

	err := cfg.Validate()
	if err == nil {
		t.Fatal("Validate() returned nil; expected refusal on wildcard CORS")
	}
	for _, want := range []string{"CERTCTL_CORS_ORIGINS", "wildcard", "CSRF"} {
		if !strings.Contains(err.Error(), want) {
			t.Errorf("Validate() error = %q; missing operator guidance substring %q", err, want)
		}
	}
}

// TestValidate_Bundle2_CORSWildcard_DemoAckExempt covers the demo
// posture (operators frequently want unrestricted CORS for dashboard
// screencaps + curl-from-any-origin diagnostics).
func TestValidate_Bundle2_CORSWildcard_DemoAckExempt(t *testing.T) {
	cfg := validBaseConfigForEncryption(t)
	cfg.CORS.AllowedOrigins = []string{"*"}
	cfg.Auth.DemoModeAck = true

	if err := cfg.Validate(); err != nil {
		t.Errorf("Validate() returned %v with DemoModeAck=true; demo path must accept wildcard CORS", err)
	}
}

// TestValidate_Bundle2_CORSWildcard_MixedAllowlistStillRefused pins
// that "*" mixed into an otherwise-concrete allowlist still trips the
// guard. The wildcard short-circuits the entire allowlist in
// middleware.NewCORS, so leaving "*" alongside legit origins is just
// as dangerous as "*" alone.
func TestValidate_Bundle2_CORSWildcard_MixedAllowlistStillRefused(t *testing.T) {
	cfg := validBaseConfigForEncryption(t)
	cfg.CORS.AllowedOrigins = []string{"https://dashboard.example.com", "*", "https://other.example.com"}
	cfg.Auth.DemoModeAck = false

	err := cfg.Validate()
	if err == nil {
		t.Fatal("Validate() returned nil; expected refusal on wildcard mixed into allowlist")
	}
	if !strings.Contains(err.Error(), "wildcard") {
		t.Errorf("Validate() error = %q; want wildcard mention", err)
	}
}

// TestValidate_Bundle2_CORSConcreteAllowlist_Accepted pins that a real
// operator allowlist sails through (no false-positive on substring match
// or similar over-broad matching).
func TestValidate_Bundle2_CORSConcreteAllowlist_Accepted(t *testing.T) {
	cfg := validBaseConfigForEncryption(t)
	cfg.CORS.AllowedOrigins = []string{"https://dashboard.example.com", "https://admin.example.com"}
	cfg.Auth.DemoModeAck = false

	if err := cfg.Validate(); err != nil {
		t.Errorf("Validate() returned %v; want nil for concrete CORS allowlist", err)
	}
}