certctl/docs/reference/configuration.md
shankar0123 d030c26914 fix(security): close BUNDLE 2 — safe first run, demo mode, agent bootstrap
Bundle 2 closure (2026-05-12 acquisition diligence audit). Closes the
"docker compose up == accidental production" hazard: pre-Bundle-2 the
base deploy/docker-compose.yml WAS the demo path (AUTH_TYPE=none +
DEMO_MODE_ACK=true + KEYGEN_MODE=server + DEMO_SEED=true + literal
change-me-... placeholder creds), the README claimed "drop the demo
overlay for a clean install", and ENVIRONMENTS.md table documented
auth-type default as api-key — three contradictory stories layered on
the same compose file.

Source findings closed:
  R2 R3 C1 D9 finding-2 S9               (repo audit)
  SEC-H2 SEC-M1 SEC-M3 OPS-M3 LOW-5 HIGH-6 (cowork audit)

Compose split (deploy/docker-compose.yml + deploy/docker-compose.demo.yml):
The base now ships production-shaped — no AUTH_TYPE override, no
KEYGEN_MODE override, no DEMO_MODE_ACK, no DEMO_SEED, no literal
placeholder fallbacks. POSTGRES_PASSWORD / CERTCTL_AUTH_SECRET /
CERTCTL_CONFIG_ENCRYPTION_KEY / CERTCTL_API_KEY / CERTCTL_AGENT_ID
must come from deploy/.env (sample template in deploy/.env.example +
root .env.example). The demo overlay carries the full demo posture
(every env var + every placeholder credential) so the
`-f docker-compose.demo.yml` one-flag flip remains a zero-config
populated-dashboard path.

Fail-closed startup guards (internal/config/config.go::Validate):
Three new gates layered on the existing HIGH-12 demo-mode listen-bind
guard. All three exempt CERTCTL_DEMO_MODE_ACK=true so the demo overlay
keeps working:
  • HIGH-6:  AUTH_SECRET = "change-me-in-production"        → refuse
  • HIGH-6:  CONFIG_ENCRYPTION_KEY = "change-me-32-char..." → refuse
  • LOW-5:   CORS_ORIGINS contains "*"  (CWE-942 + CWE-352) → refuse
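
The three gates above can be sketched roughly as follows. This is a minimal illustration, not the code in internal/config/config.go: the `Config` struct and its field names are hypothetical, the encryption-key placeholder is matched by its visible prefix only (the full literal is elided above), and treating any allowlist entry equal to `*` as a wildcard is an assumption about how the CORS check is applied.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Config carries only the fields this sketch needs. The names are
// hypothetical and do not mirror the real certctl config struct.
type Config struct {
	AuthSecret          string
	ConfigEncryptionKey string
	CORSOrigins         []string
	DemoModeAck         bool
}

const (
	placeholderAuthSecret = "change-me-in-production"
	// Only the prefix is known from the commit message, so the sketch
	// matches on it instead of guessing the full placeholder.
	placeholderEncKeyPrefix = "change-me-32-char"
)

// Validate refuses placeholder secrets and wildcard CORS unless the
// operator has acknowledged demo mode.
func Validate(c Config) error {
	if c.DemoModeAck {
		return nil // the demo overlay is exempt from all three gates
	}
	if c.AuthSecret == placeholderAuthSecret {
		return errors.New("auth secret is still the shipped placeholder")
	}
	if strings.HasPrefix(c.ConfigEncryptionKey, placeholderEncKeyPrefix) {
		return errors.New("config encryption key is still the shipped placeholder")
	}
	for _, origin := range c.CORSOrigins {
		if strings.TrimSpace(origin) == "*" {
			return errors.New("wildcard CORS origin is not allowed outside demo mode")
		}
	}
	return nil
}

func main() {
	bad := Config{AuthSecret: placeholderAuthSecret}
	fmt.Println("placeholder secret:", Validate(bad))

	bad.DemoModeAck = true
	fmt.Println("placeholder secret with demo ack:", Validate(bad))
}
```

Checking the demo acknowledgement first keeps the exemption in one place, so a future gate added below it inherits the same demo-overlay behaviour.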

Visible DEMO MODE banner (cmd/server/main.go): every boot under
DEMO_MODE_ACK=true now emits a prominent WARN line with a 6-step
production-promotion checklist. The 2026-04-19 incident (a screenshot
run that kept running for three days) drove this; the per-startup
banner makes the posture unmissable in any log scraper.

Agent enrollment doc alignment:
  • docs/reference/configuration.md L83: corrected the non-existent
    URL `POST /api/v1/agents/register` to the real route
    `POST /api/v1/agents`; added the bootstrap-token note and the
    install-agent.sh handoff sequence.
  • docs/reference/architecture.md L154: replaced "agents register
    themselves at first heartbeat" (false — cmd/agent/main.go fail-
    fasts when CERTCTL_AGENT_ID is unset) with the actual two-step
    operator-driven flow (REST or GUI registration first, returned ID
    fed to install-agent.sh second).

Tests + CI guard:
  • 9 new TestValidate_Bundle2_* cases in internal/config/config_test.go
    covering: placeholder-secret refused + demo-ack exempt; placeholder
    encryption-key refused + demo-ack exempt; real key not mistaken for
    placeholder; wildcard CORS refused + demo-ack exempt; wildcard mixed
    into a concrete allowlist still refused; concrete allowlist accepted.
  • scripts/ci-guards/B2-compose-base-no-demo-env.sh: greps the base
    compose for any of the demo-mode env vars + placeholder credentials.
    Comments stripped before checking so the narrative header in the
    base file can still reference the overlay's posture in prose.
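
For illustration, the comment-stripping check could look like the Go sketch below. The real guard is a shell script; this is only an approximation of its logic. The marker list is assembled from the names mentioned in this message, and the comment stripper is deliberately naive (it treats every `#` as the start of a comment, which would misread a `#` inside a quoted value).

```go
package main

import (
	"fmt"
	"strings"
)

// demoMarkers lists substrings that should never appear in the base
// compose file. The list is an assumption built from this commit message.
var demoMarkers = []string{"AUTH_TYPE", "DEMO_MODE_ACK", "KEYGEN_MODE", "DEMO_SEED", "change-me-"}

// stripComment drops everything from the first '#' onward.
// Simplification: a '#' inside a quoted string is also treated as a comment.
func stripComment(line string) string {
	if i := strings.Index(line, "#"); i >= 0 {
		return line[:i]
	}
	return line
}

// findDemoEnv reports every marker found outside comments, with its line number.
func findDemoEnv(compose string) []string {
	var hits []string
	for n, line := range strings.Split(compose, "\n") {
		code := stripComment(line)
		for _, marker := range demoMarkers {
			if strings.Contains(code, marker) {
				hits = append(hits, fmt.Sprintf("line %d: %s", n+1, marker))
			}
		}
	}
	return hits
}

func main() {
	base := "# the demo overlay sets DEMO_SEED=true\n" +
		"services:\n" +
		"  server:\n" +
		"    environment:\n" +
		"      CERTCTL_DEMO_SEED: \"true\"\n"
	for _, hit := range findDemoEnv(base) {
		fmt.Println("violation at", hit)
	}
}
```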

Cold-DB CI smoke (.github/workflows/ci.yml::cold-db-compose-smoke):
Switched to layering -f docker-compose.demo.yml on top of the base —
the new production base requires real env vars the smoke doesn't have,
and the smoke's purpose (catch migration-on-cold-DB regressions + the
bootstrap-token mint path) is orthogonal to which auth posture the
boot lands in.

Receipts:
  • Current first-run truth table
        compose flag                                  → posture
        -f docker-compose.yml                          (production)
                                                       → requires .env;
                                                       fail-fasts on
                                                       missing AUTH_SECRET
                                                       / CONFIG_ENCRYPTION
                                                       _KEY / POSTGRES
                                                       _PASSWORD; agent
                                                       fail-fasts on
                                                       missing AGENT_ID
        -f docker-compose.yml -f docker-compose.demo.yml  (demo)
                                                       → zero-config;
                                                       AUTH_TYPE=none +
                                                       DEMO_MODE_ACK=true
                                                       + KEYGEN=server +
                                                       DEMO_SEED=true;
                                                       boot banner WARN
        -f docker-compose.yml -f docker-compose.dev.yml   (dev)
                                                       → base + PgAdmin
                                                       + debug logging
        -f docker-compose.test.yml                     (test, standalone)
                                                       → production-shape
                                                       posture, real CA
                                                       backends
  • Verification (PATH=/tmp/go/bin export GO* paths to /tmp):
        gofmt -l                                      # clean (no diffs)
        go vet ./internal/config ./cmd/server         # clean
        go test -short -count=1 ./internal/config/... # PASS (cumulative +
                                                       all 9 new Bundle 2
                                                       cases green)
        go test -short -count=1                       # PASS (no regression
            ./internal/connector/target/configcheck    in the Bundle 1 -
                                                       closure tests)
        go build ./cmd/server ./cmd/agent             # clean
            ./cmd/cli ./cmd/mcp-server
        bash scripts/ci-guards/B2-compose-base-no-demo-env.sh  # clean
        bash scripts/ci-guards/H-1-encryption-key-min-length.sh # clean
        bash scripts/ci-guards/G-3-env-docs-drift.sh           # clean

Remaining operator warnings (not blocking; tracked in CLAUDE.md
"Open decisions"):
  • The first `docker compose -f docker-compose.yml up -d` against a
    pre-Bundle-2 .env (placeholder values still in place) will now
    fail-fast. This is the intended posture but operators upgrading
    from v2.0.x via .env-from-old-master need to rotate before
    upgrading. The CHANGELOG note for the v2.1.0 release should
    call this out alongside Auth Bundle 2's other breaking changes.

Audit-Closes: BUNDLE-2 R2 R3 C1 D9 S9 SEC-H2 SEC-M1 SEC-M3 OPS-M3 LOW-5 HIGH-6
2026-05-13 00:14:59 +00:00

Configuration Reference

Last reviewed: 2026-05-05

Compact reference for CERTCTL_* environment variables consumed by certctl-server and certctl-agent. Most operators don't need to touch these — defaults are tuned for the common case. Reach for them when the system's behaviour needs tuning beyond what's exposed in the GUI / API.

This page enumerates the operator-tunable knobs that don't have a dedicated home elsewhere. Connector-specific env vars are documented on the per-connector pages under docs/reference/connectors/. Protocol env vars (ACME server, EST, SCEP) are documented under docs/reference/protocols/. TLS env vars are documented in docs/operator/tls.md.

Scheduler intervals

The scheduler runs several background loops; their intervals are tunable to balance responsiveness against contention.

| Variable | Default | Description |
| --- | --- | --- |
| `CERTCTL_SCHEDULER_AGENT_HEALTH_CHECK_INTERVAL` | `2m` | How often the agent-health loop scans for stale heartbeats and transitions agents to Unhealthy / Offline. |
| `CERTCTL_SCHEDULER_JOB_PROCESSOR_INTERVAL` | `30s` | How often the job-processor loop dispatches Pending jobs to agents. |
| `CERTCTL_SCHEDULER_NOTIFICATION_PROCESS_INTERVAL` | `1m` | How often the notification-dispatcher loop fans out queued alerts to channels. |
| `CERTCTL_SHORT_LIVED_EXPIRY_CHECK_INTERVAL` | `5m` | How often the short-lived-expiry loop checks certs whose TTL is less than 1h for imminent expiry. |

For the full scheduler topology (14 loops, 9 always-on + 5 opt-in) see architecture.md "Scheduler topology".
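
As a rough illustration of how interval knobs like these are typically consumed, the sketch below reads each variable and falls back to the documented default. It assumes the values use Go duration syntax (`30s`, `2m`) and that an empty, malformed, or non-positive value silently falls back to the default; the real server may instead reject bad input at startup. The helper name `intervalOrDefault` is hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// intervalOrDefault parses raw as a Go duration and returns def when raw is
// empty, malformed, or not strictly positive. The silent fallback is an
// assumption of this sketch, not documented certctl behaviour.
func intervalOrDefault(raw string, def time.Duration) time.Duration {
	d, err := time.ParseDuration(raw)
	if err != nil || d <= 0 {
		return def
	}
	return d
}

func main() {
	// Names and defaults are copied from the table above.
	knobs := []struct {
		name string
		def  time.Duration
	}{
		{"CERTCTL_SCHEDULER_AGENT_HEALTH_CHECK_INTERVAL", 2 * time.Minute},
		{"CERTCTL_SCHEDULER_JOB_PROCESSOR_INTERVAL", 30 * time.Second},
		{"CERTCTL_SCHEDULER_NOTIFICATION_PROCESS_INTERVAL", time.Minute},
		{"CERTCTL_SHORT_LIVED_EXPIRY_CHECK_INTERVAL", 5 * time.Minute},
	}
	for _, k := range knobs {
		fmt.Printf("%s=%s\n", k.name, intervalOrDefault(os.Getenv(k.name), k.def))
	}
}
```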

Job lifecycle

| Variable | Default | Description |
| --- | --- | --- |
| `CERTCTL_JOB_AWAITING_CSR_TIMEOUT` | `24h` | How long a job stays in AwaitingCSR before the scheduler marks it Failed (the agent never picked it up). |

Rate limiting

The control plane API is rate-limited by default; tune for high-volume environments (mass-rotation events, bulk imports).

| Variable | Default | Description |
| --- | --- | --- |
| `CERTCTL_RATE_LIMIT_ENABLED` | `true` | Master toggle. Disable only for trusted-network single-tenant deploys where the API is firewall-protected. |
| `CERTCTL_RATE_LIMIT_PER_USER_RPS` | `0` (= use global default) | Per-user requests-per-second cap. Zero opts each user into the global default in internal/api/middleware. |
| `CERTCTL_RATE_LIMIT_PER_USER_BURST` | `0` (= use global default) | Per-user token-bucket burst size. Same opt-in semantics. |
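
To make the "zero means global default" semantics concrete, here is a small token-bucket sketch. It is not the limiter in internal/api/middleware: the global defaults used (2 requests per second, burst of 3) are invented for the example, and time is passed in explicitly as seconds so the behaviour is deterministic.

```go
package main

import "fmt"

// effective returns the per-user setting, or the global default when the
// per-user value is zero.
func effective(perUser, global float64) float64 {
	if perUser == 0 {
		return global
	}
	return perUser
}

// bucket is a minimal token bucket. Timestamps are plain seconds supplied
// by the caller, which keeps the sketch free of wall-clock dependencies.
type bucket struct {
	rps    float64 // refill rate in tokens per second
	burst  float64 // maximum number of stored tokens
	tokens float64
	last   float64
}

func newBucket(rps, burst, now float64) *bucket {
	return &bucket{rps: rps, burst: burst, tokens: burst, last: now}
}

// allow refills the bucket for the elapsed time, then spends one token
// if one is available.
func (b *bucket) allow(now float64) bool {
	if elapsed := now - b.last; elapsed > 0 {
		b.tokens += elapsed * b.rps
		if b.tokens > b.burst {
			b.tokens = b.burst
		}
		b.last = now
	}
	if b.tokens >= 1 {
		b.tokens--
		return true
	}
	return false
}

func main() {
	// Hypothetical global defaults; both per-user knobs are left at 0.
	const globalRPS, globalBurst = 2.0, 3.0
	b := newBucket(effective(0, globalRPS), effective(0, globalBurst), 0)
	for i := 1; i <= 4; i++ {
		fmt.Printf("request %d at t=0s allowed: %v\n", i, b.allow(0))
	}
	fmt.Println("request at t=0.5s allowed:", b.allow(0.5))
}
```

The burst size bounds how many requests can land at once, while the RPS value bounds the sustained rate, which is why the two knobs are tuned separately for mass-rotation events.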

Audit trail

| Variable | Default | Description |
| --- | --- | --- |
| `CERTCTL_AUDIT_FLUSH_TIMEOUT_SECONDS` | `30` | How long the audit-event flush worker waits for the buffered batch to drain before forcing a flush at shutdown. |

Deploy verification

The deploy-hardening primitive wraps every cert deploy in atomic-write + post-verify + rollback. These env vars tune the post-deploy TLS verification phase.

| Variable | Default | Description |
| --- | --- | --- |
| `CERTCTL_VERIFY_DEPLOYMENT` | `true` | Master toggle for post-deploy TLS verify. Disable only for connectors / environments where the verify endpoint is not reachable from the agent. |
| `CERTCTL_VERIFY_DELAY` | `2s` | How long to wait after the reload command completes before the first verify-handshake attempt (gives the daemon time to pick up new keys). |
| `CERTCTL_VERIFY_TIMEOUT` | `10s` | Per-attempt TLS-handshake timeout. |
| `CERTCTL_DEPLOY_BACKUP_RETENTION` | `3` | How many `.certctl-bak.<unix-nanos>.<ext>` rollback snapshots to keep per target after a successful deploy. `0` uses the default of 3; `-1` opts out of pruning entirely. |

For the full deploy contract see deployment-model.md.
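
The retention rule for `CERTCTL_DEPLOY_BACKUP_RETENTION` can be sketched as a pure function over snapshot file names. This is an illustration under assumptions: the exact file-name prefix before `.certctl-bak.` is hypothetical, and treating every negative value (not only `-1`) as "never prune" is a choice made for this sketch.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

const (
	marker           = ".certctl-bak."
	defaultRetention = 3
)

// snapshotNanos extracts <unix-nanos> from a name shaped like
// "<anything>.certctl-bak.<unix-nanos>.<ext>".
func snapshotNanos(name string) (int64, bool) {
	i := strings.LastIndex(name, marker)
	if i < 0 {
		return 0, false
	}
	digits, _, found := strings.Cut(name[i+len(marker):], ".")
	if !found {
		return 0, false
	}
	n, err := strconv.ParseInt(digits, 10, 64)
	if err != nil {
		return 0, false
	}
	return n, true
}

// toPrune returns the snapshots that fall outside the retention window,
// ordered newest first. Names that do not parse as snapshots are ignored.
func toPrune(names []string, retention int) []string {
	if retention < 0 {
		return nil // opt-out: keep everything (assumption: any negative value)
	}
	if retention == 0 {
		retention = defaultRetention
	}
	type snap struct {
		name  string
		nanos int64
	}
	var snaps []snap
	for _, name := range names {
		if ts, ok := snapshotNanos(name); ok {
			snaps = append(snaps, snap{name, ts})
		}
	}
	sort.Slice(snaps, func(i, j int) bool { return snaps[i].nanos > snaps[j].nanos })
	if len(snaps) <= retention {
		return nil
	}
	var out []string
	for _, s := range snaps[retention:] {
		out = append(out, s.name)
	}
	return out
}

func main() {
	names := []string{
		"tls.crt.certctl-bak.300.crt",
		"tls.crt.certctl-bak.100.crt",
		"tls.crt.certctl-bak.500.crt",
		"tls.crt.certctl-bak.200.crt",
		"tls.crt.certctl-bak.400.crt",
	}
	fmt.Println("retention 0 (default) prunes:", toPrune(names, 0))
	fmt.Println("retention -1 prunes:", toPrune(names, -1))
}
```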

Database

| Variable | Default | Description |
| --- | --- | --- |
| `CERTCTL_DATABASE_MIGRATIONS_PATH` | `./migrations` | Filesystem path to the *.up.sql / *.down.sql migration set. Override only when running certctl-server from a non-standard layout. |

Agent

| Variable | Default | Description |
| --- | --- | --- |
| `CERTCTL_AGENT_ID` | (none; required) | The agent's unique ID, issued by `POST /api/v1/agents` (which requires `CERTCTL_AGENT_BOOTSTRAP_TOKEN` when one is configured) and returned in the registration response body. Set this env var when the agent runs as a systemd unit or container without the `-agent-id` CLI flag. The bundled install-agent.sh does NOT auto-register: operators pre-register an agent via the REST endpoint (or the dashboard), then pass the returned ID to the script via `--agent-id`. |
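
A minimal sketch of how an agent might resolve its ID from these two sources is shown below. It is not the code in cmd/agent/main.go: the helper name is hypothetical, and giving the CLI flag precedence over the env var is an assumption. The only behaviour taken from the documentation is that a missing ID is a hard failure.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

var errNoAgentID = errors.New("agent ID is required: pass -agent-id or set CERTCTL_AGENT_ID")

// resolveAgentID prefers the flag value, then the env value, and fails when
// both are empty. The flag-over-env precedence is an assumption of this sketch.
func resolveAgentID(flagValue, envValue string) (string, error) {
	if flagValue != "" {
		return flagValue, nil
	}
	if envValue != "" {
		return envValue, nil
	}
	return "", errNoAgentID
}

func main() {
	// No flag parsing here; the sketch only shows the env-var path.
	id, err := resolveAgentID("", os.Getenv("CERTCTL_AGENT_ID"))
	if err != nil {
		fmt.Println("would refuse to start:", err)
		return
	}
	fmt.Println("agent ID:", id)
}
```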

Auth (RBAC + OIDC + sessions + break-glass)

Configuration knobs for the RBAC + OIDC + sessions + break-glass auth surface. Full operator guidance lives in operator/rbac.md, operator/oidc-runbooks/, and operator/auth-threat-model.md.

| Variable | Default | Description |
| --- | --- | --- |
| `CERTCTL_SESSION_BIND_USER_AGENT` | `false` | Bind every session cookie to the User-Agent header captured at login; mismatch -> 401. Defense in depth against stolen cookies on the same network. |
| `CERTCTL_SESSION_GC_INTERVAL` | `1h` | How often the scheduler's session-GC loop sweeps expired/revoked rows out of sessions. Trade-off: a shorter interval keeps the table smaller at the cost of more DB churn; a longer interval lets expired rows pile up. |
| `CERTCTL_OIDC_BCL_MAX_AGE_SECONDS` | `60` | Freshness window, in seconds, for the iat claim of back-channel logout tokens. Tokens whose iat lies more than this many seconds in the past or in the future are rejected. |
| `CERTCTL_OIDC_PRELOGIN_REQUIRE_UA` | `false` | Reject the OIDC callback if the User-Agent at callback differs from the UA captured at pre-login. RFC 9700 §4.7.1 defense-in-depth. |
| `CERTCTL_OIDC_PRELOGIN_REQUIRE_IP` | `false` | Same as _UA but for client IP. Set carefully: corporate networks with carrier-grade NAT can change apparent IP mid-flow. |
| `CERTCTL_DEMO_MODE_ACK` | `false` | Operator acknowledgement that demo mode is intentional in this deploy. Required when CERTCTL_AUTH_TYPE=none to allow server startup; safety net against demo-mode-in-production leakage. |
| `CERTCTL_TRUSTED_PROXIES` | (empty) | Comma-separated list of trusted-proxy CIDRs (e.g. 10.0.0.0/8,192.0.2.1). XFF is consulted for client-IP derivation only when the immediate peer sits in this allowlist. |
| `CERTCTL_TRUSTED_PROXIES_COUNT` | (synthesised) | Read-only counter exposed by /api/v1/auth/runtime-config; mirrors len(CERTCTL_TRUSTED_PROXIES). Not operator-settable; documented here so the G-3 env-docs-drift guard catches drift. |
| `CERTCTL_BOOTSTRAP_TOKEN` | (empty) | One-shot token used to mint the first admin role binding via POST /api/v1/auth/bootstrap. Once consumed, the token is cleared from memory and the bootstrap endpoint is disabled. |
| `CERTCTL_BOOTSTRAP_TOKEN_SET` | (synthesised) | Boolean exposed by /api/v1/auth/runtime-config; true when CERTCTL_BOOTSTRAP_TOKEN was set at server start. Not operator-settable; documented here so the G-3 guard catches drift. |
| `CERTCTL_BOOTSTRAP_OIDC_PROVIDER_ID` | (empty) | When OIDC is enabled, restricts the first-admin OIDC strategy to the named provider only; any other provider's tokens won't trigger the bootstrap hook. |
| `CERTCTL_BOOTSTRAP_ADMIN_GROUPS_COUNT` | (synthesised) | Read-only counter exposed by /api/v1/auth/runtime-config; mirrors len(CERTCTL_BOOTSTRAP_ADMIN_GROUPS). Documented here so the G-3 guard catches drift. |
| `CERTCTL_BREAKGLASS_LOCKOUT_THRESHOLD` | `5` | Number of consecutive failed /auth/breakglass/login attempts that lock the credential. |
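
The `CERTCTL_TRUSTED_PROXIES` rule (consult X-Forwarded-For only when the immediate peer is in the allowlist) can be sketched with the standard `net/netip` package. This is an illustration, not certctl's middleware: the function names are hypothetical, and the choice of which XFF entry to use (the rightmost address that is not itself a trusted proxy) is an assumption, since the table only states when the header is consulted.

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
)

// parseTrustedProxies accepts a comma-separated mix of CIDRs and bare IPs.
// A bare IP is treated as a single-host prefix.
func parseTrustedProxies(csv string) ([]netip.Prefix, error) {
	var out []netip.Prefix
	for _, field := range strings.Split(csv, ",") {
		field = strings.TrimSpace(field)
		if field == "" {
			continue
		}
		if strings.Contains(field, "/") {
			p, err := netip.ParsePrefix(field)
			if err != nil {
				return nil, fmt.Errorf("bad CIDR %q: %w", field, err)
			}
			out = append(out, p.Masked())
			continue
		}
		a, err := netip.ParseAddr(field)
		if err != nil {
			return nil, fmt.Errorf("bad IP %q: %w", field, err)
		}
		out = append(out, netip.PrefixFrom(a, a.BitLen()))
	}
	return out, nil
}

func isTrusted(a netip.Addr, trusted []netip.Prefix) bool {
	for _, p := range trusted {
		if p.Contains(a) {
			return true
		}
	}
	return false
}

// clientIP returns the peer address unless the peer is a trusted proxy, in
// which case it walks XFF right to left and returns the first untrusted hop.
func clientIP(peer netip.Addr, xff string, trusted []netip.Prefix) netip.Addr {
	if !isTrusted(peer, trusted) {
		return peer // header ignored: the sender is not an allowlisted proxy
	}
	hops := strings.Split(xff, ",")
	for i := len(hops) - 1; i >= 0; i-- {
		a, err := netip.ParseAddr(strings.TrimSpace(hops[i]))
		if err != nil {
			break // malformed or empty entry: stop trusting the header
		}
		if !isTrusted(a, trusted) {
			return a
		}
	}
	return peer
}

// derive is a string-only convenience wrapper; it returns "" on bad input.
func derive(trustedCSV, peer, xff string) string {
	trusted, err := parseTrustedProxies(trustedCSV)
	if err != nil {
		return ""
	}
	p, err := netip.ParseAddr(peer)
	if err != nil {
		return ""
	}
	return clientIP(p, xff, trusted).String()
}

func main() {
	const allow = "10.0.0.0/8,192.0.2.1"
	fmt.Println(derive(allow, "203.0.113.9", "198.51.100.7"))
	fmt.Println(derive(allow, "10.1.2.3", "198.51.100.7, 192.0.2.1"))
}
```

Ignoring the header for untrusted peers is what prevents a direct client from spoofing its address by sending its own X-Forwarded-For value.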

SCEP profile binding (single-profile back-compat)

| Variable | Default | Description |
| --- | --- | --- |
| `CERTCTL_SCEP_PROFILE_ID` | (empty) | Optional certificate profile ID for the legacy single-profile SCEP path. The multi-profile path uses `CERTCTL_SCEP_PROFILES=<list>` + `CERTCTL_SCEP_PROFILE_<NAME>_PROFILE_ID` instead; see scep-server.md. |