mirror of
https://github.com/shankar0123/certctl.git
synced 2026-06-07 22:21:30 +00:00
a849c8b8cf
Bundle 2 closure (2026-05-12 acquisition diligence audit). Closes the
"docker compose up == accidental production" hazard: pre-Bundle-2 the
base deploy/docker-compose.yml WAS the demo path (AUTH_TYPE=none +
DEMO_MODE_ACK=true + KEYGEN_MODE=server + DEMO_SEED=true + literal
change-me-... placeholder creds), the README claimed "drop the demo
overlay for a clean install", and ENVIRONMENTS.md table documented
auth-type default as api-key — three contradictory stories layered on
the same compose file.
Source findings closed:
R2 R3 C1 D9 finding-2 S9 (repo audit)
SEC-H2 SEC-M1 SEC-M3 OPS-M3 LOW-5 HIGH-6 (cowork audit)
Compose split (deploy/docker-compose.yml + deploy/docker-compose.demo.yml):
The base now ships production-shaped — no AUTH_TYPE override, no
KEYGEN_MODE override, no DEMO_MODE_ACK, no DEMO_SEED, no literal
placeholder fallbacks. POSTGRES_PASSWORD / CERTCTL_AUTH_SECRET /
CERTCTL_CONFIG_ENCRYPTION_KEY / CERTCTL_API_KEY / CERTCTL_AGENT_ID
must come from deploy/.env (sample template in deploy/.env.example +
root .env.example). The demo overlay carries the full demo posture
(every env var + every placeholder credential) so the
`-f docker-compose.demo.yml` one-flag flip remains a zero-config
populated-dashboard path.
Fail-closed startup guards (internal/config/config.go::Validate):
Three new gates layered on the existing HIGH-12 demo-mode listen-bind
guard. All three exempt CERTCTL_DEMO_MODE_ACK=true so the demo overlay
keeps working:
• HIGH-6: AUTH_SECRET = "change-me-in-production" → refuse
• HIGH-6: CONFIG_ENCRYPTION_KEY = "change-me-32-char..." → refuse
• LOW-5: CORS_ORIGINS contains "*" (CWE-942 + CWE-352) → refuse
Visible DEMO MODE banner (cmd/server/main.go): every boot under
DEMO_MODE_ACK=true now emits a prominent WARN line with a 6-step
production-promotion checklist. The 2026-04-19 incident (a screenshot
run that kept running for three days) drove this; the per-startup
banner makes the posture unmissable in any log scraper.
Agent enrollment doc alignment:
• docs/reference/configuration.md L83: corrected the non-existent
URL `POST /api/v1/agents/register` to the real route
`POST /api/v1/agents`; added the bootstrap-token note and the
install-agent.sh handoff sequence.
• docs/reference/architecture.md L154: replaced "agents register
themselves at first heartbeat" (false — cmd/agent/main.go fail-
fasts when CERTCTL_AGENT_ID is unset) with the actual two-step
operator-driven flow (REST or GUI registration first, returned ID
fed to install-agent.sh second).
Tests + CI guard:
• 9 new TestValidate_Bundle2_* cases in internal/config/config_test.go
covering: placeholder-secret refused + demo-ack exempt; placeholder
encryption-key refused + demo-ack exempt; real key not mistaken for
placeholder; wildcard CORS refused + demo-ack exempt; wildcard mixed
into a concrete allowlist still refused; concrete allowlist accepted.
• scripts/ci-guards/B2-compose-base-no-demo-env.sh: greps the base
compose for any of the demo-mode env vars + placeholder credentials.
Comments stripped before checking so the narrative header in the
base file can still reference the overlay's posture in prose.
Cold-DB CI smoke (.github/workflows/ci.yml::cold-db-compose-smoke):
Switched to layering -f docker-compose.demo.yml on top of the base —
the new production base requires real env vars the smoke doesn't have,
and the smoke's purpose (catch migration-on-cold-DB regressions + the
bootstrap-token mint path) is orthogonal to which auth posture the
boot lands in.
Receipts:
• Current first-run truth table
compose flag → posture
-f docker-compose.yml (production)
→ requires .env;
fail-fasts on
missing AUTH_SECRET
/ CONFIG_ENCRYPTION
_KEY / POSTGRES
_PASSWORD; agent
fail-fasts on
missing AGENT_ID
-f docker-compose.yml -f docker-compose.demo.yml (demo)
→ zero-config;
AUTH_TYPE=none +
DEMO_MODE_ACK=true
+ KEYGEN=server +
DEMO_SEED=true;
boot banner WARN
-f docker-compose.yml -f docker-compose.dev.yml (dev)
→ base + PgAdmin
+ debug logging
-f docker-compose.test.yml (test, standalone)
→ production-shape
posture, real CA
backends
• Verification (PATH=/tmp/go/bin export GO* paths to /tmp):
gofmt -l # clean (no diffs)
go vet ./internal/config ./cmd/server # clean
go test -short -count=1 ./internal/config/... # PASS (cumulative +
all 9 new Bundle 2
cases green)
go test -short -count=1 # PASS (no regression
./internal/connector/target/configcheck in the Bundle 1 -
closure tests)
go build ./cmd/server ./cmd/agent # clean
./cmd/cli ./cmd/mcp-server
bash scripts/ci-guards/B2-compose-base-no-demo-env.sh # clean
bash scripts/ci-guards/H-1-encryption-key-min-length.sh # clean
bash scripts/ci-guards/G-3-env-docs-drift.sh # clean
Remaining operator warnings (not blocking; tracked in CLAUDE.md
"Open decisions"):
• The first `docker compose -f docker-compose.yml up -d` against a
pre-Bundle-2 .env (placeholder values still in place) will now
fail-fast. This is the intended posture but operators upgrading
from v2.0.x via .env-from-old-master need to rotate before
upgrading. The CHANGELOG note for the v2.1.0 release should
call this out alongside Auth Bundle 2's other breaking changes.
Audit-Closes: BUNDLE-2 R2 R3 C1 D9 S9 SEC-H2 SEC-M1 SEC-M3 OPS-M3 LOW-5 HIGH-6
123 lines
8.4 KiB
Markdown
123 lines
8.4 KiB
Markdown
# Configuration Reference
|
|
|
|
> Last reviewed: 2026-05-05
|
|
|
|
Compact reference for `CERTCTL_*` environment variables consumed by
|
|
`certctl-server` and `certctl-agent`. Most operators don't need to
|
|
touch these — defaults are tuned for the common case. Reach for them
|
|
when the system's behaviour needs tuning beyond what's exposed in the
|
|
GUI / API.
|
|
|
|
This page enumerates the operator-tunable knobs that don't have a
|
|
dedicated home elsewhere. Connector-specific env vars are documented
|
|
on the per-connector pages under
|
|
[`docs/reference/connectors/`](connectors/index.md). Protocol env
|
|
vars (ACME server, EST, SCEP) are documented under
|
|
[`docs/reference/protocols/`](protocols/). TLS env vars are
|
|
documented in [`docs/operator/tls.md`](../operator/tls.md).
|
|
|
|
## Scheduler intervals
|
|
|
|
The scheduler runs N background loops; intervals are tunable for
|
|
performance / contention tuning.
|
|
|
|
| Variable | Default | Description |
|
|
|---|---|---|
|
|
| `CERTCTL_SCHEDULER_AGENT_HEALTH_CHECK_INTERVAL` | `2m` | How often the agent-health loop scans for stale heartbeats and transitions agents to `Unhealthy` / `Offline`. |
|
|
| `CERTCTL_SCHEDULER_JOB_PROCESSOR_INTERVAL` | `30s` | How often the job-processor loop dispatches `Pending` jobs to agents. |
|
|
| `CERTCTL_SCHEDULER_NOTIFICATION_PROCESS_INTERVAL` | `1m` | How often the notification-dispatcher loop fans out queued alerts to channels. |
|
|
| `CERTCTL_SHORT_LIVED_EXPIRY_CHECK_INTERVAL` | `5m` | How often the short-lived-expiry loop watches certs whose TTL is less than 1h for imminent expiry. |
|
|
|
|
For the full scheduler topology (14 loops, 9 always-on + 5 opt-in)
|
|
see [`architecture.md`](architecture.md) "Scheduler topology".
|
|
|
|
## Job lifecycle
|
|
|
|
| Variable | Default | Description |
|
|
|---|---|---|
|
|
| `CERTCTL_JOB_AWAITING_CSR_TIMEOUT` | `24h` | How long a job stays in `AwaitingCSR` before the scheduler marks it `Failed` (the agent never picked it up). |
|
|
|
|
## Rate limiting
|
|
|
|
The control plane API is rate-limited by default; tune for
|
|
high-volume environments (mass-rotation events, bulk imports).
|
|
|
|
| Variable | Default | Description |
|
|
|---|---|---|
|
|
| `CERTCTL_RATE_LIMIT_ENABLED` | `true` | Master toggle. Disable only for trusted-network single-tenant deploys where the API is firewall-protected. |
|
|
| `CERTCTL_RATE_LIMIT_PER_USER_RPS` | `0` (= use global default) | Per-user requests-per-second cap. Zero opts each user into the global default in `internal/api/middleware`. |
|
|
| `CERTCTL_RATE_LIMIT_PER_USER_BURST` | `0` (= use global default) | Per-user token-bucket burst size. Same opt-in semantics. |
|
|
|
|
## Audit trail
|
|
|
|
| Variable | Default | Description |
|
|
|---|---|---|
|
|
| `CERTCTL_AUDIT_FLUSH_TIMEOUT_SECONDS` | `30` | How long the audit-event flush worker waits for the buffered batch to drain before forcing a flush at shutdown. |
|
|
|
|
## Deploy verification
|
|
|
|
The deploy-hardening primitive wraps every cert deploy in
|
|
atomic-write + post-verify + rollback. These env vars tune the
|
|
post-deploy TLS verification phase.
|
|
|
|
| Variable | Default | Description |
|
|
|---|---|---|
|
|
| `CERTCTL_VERIFY_DEPLOYMENT` | `true` | Master toggle for post-deploy TLS verify. Disable only for connectors / environments where the verify endpoint is not reachable from the agent. |
|
|
| `CERTCTL_VERIFY_DELAY` | `2s` | How long to wait after the reload command completes before the first verify-handshake attempt (gives the daemon time to pick up new keys). |
|
|
| `CERTCTL_VERIFY_TIMEOUT` | `10s` | Per-attempt TLS-handshake timeout. |
|
|
| `CERTCTL_DEPLOY_BACKUP_RETENTION` | `3` | How many `.certctl-bak.<unix-nanos>.<ext>` rollback snapshots to keep per target after a successful deploy. `0` uses the default of 3; `-1` opts out of pruning entirely. |
|
|
|
|
For the full deploy contract see
|
|
[`deployment-model.md`](deployment-model.md).
|
|
|
|
## Database
|
|
|
|
| Variable | Default | Description |
|
|
|---|---|---|
|
|
| `CERTCTL_DATABASE_MIGRATIONS_PATH` | `./migrations` | Filesystem path to the `*.up.sql` / `*.down.sql` migration set. Override only when running `certctl-server` from a non-standard layout. |
|
|
|
|
## Agent
|
|
|
|
| Variable | Default | Description |
|
|
|---|---|---|
|
|
| `CERTCTL_AGENT_ID` | (none — required) | The agent's unique ID, issued by `POST /api/v1/agents` (requires `CERTCTL_AGENT_BOOTSTRAP_TOKEN` when configured) and returned in the registration response body. Pass via this env var when the agent runs as a systemd unit / container without the `-agent-id` CLI flag. The bundled `install-agent.sh` does NOT auto-register — operators pre-register an agent via the REST endpoint (or the dashboard), then pass the returned ID to the script via `--agent-id`. |
|
|
|
|
## Auth (RBAC + OIDC + sessions + break-glass)
|
|
|
|
Configuration knobs for the RBAC + OIDC + sessions + break-glass
|
|
auth surface. Full operator guidance lives in
|
|
[`operator/rbac.md`](../operator/rbac.md),
|
|
[`operator/oidc-runbooks/`](../operator/oidc-runbooks/index.md), and
|
|
[`operator/auth-threat-model.md`](../operator/auth-threat-model.md).
|
|
|
|
| Variable | Default | Description |
|
|
|---|---|---|
|
|
| `CERTCTL_SESSION_BIND_USER_AGENT` | `false` | Bind every session cookie to the User-Agent header captured at login; mismatch -> 401. Defense in depth against stolen cookies on the same network. |
|
|
| `CERTCTL_SESSION_GC_INTERVAL` | `1h` | How often the scheduler's session-GC loop sweeps expired/revoked rows out of `sessions`. Trade-off: shorter = smaller table, more DB churn; longer = pile-up. |
|
|
| `CERTCTL_OIDC_BCL_MAX_AGE_SECONDS` | `60` | Back-channel logout `iat` freshness window. Tokens older or newer than this skew (in either direction) are rejected. |
|
|
| `CERTCTL_OIDC_PRELOGIN_REQUIRE_UA` | `false` | Reject the OIDC callback if the User-Agent at callback differs from the UA captured at pre-login. RFC 9700 §4.7.1 defense-in-depth. |
|
|
| `CERTCTL_OIDC_PRELOGIN_REQUIRE_IP` | `false` | Same as `_UA` but for client IP. Set carefully — corporate networks with carrier-grade NAT can change apparent IP mid-flow. |
|
|
| `CERTCTL_DEMO_MODE_ACK` | `false` | Operator acknowledgement that demo mode is intentional in this deploy. Required when `CERTCTL_AUTH_TYPE=none` to allow server startup; safety net against demo-mode-in-production leakage. |
|
|
| `CERTCTL_TRUSTED_PROXIES` | (empty) | Comma-separated list of trusted-proxy CIDRs (e.g. `10.0.0.0/8,192.0.2.1`). XFF is consulted for client-IP derivation only when the immediate peer sits in this allowlist. |
|
|
| `CERTCTL_TRUSTED_PROXIES_COUNT` | (synthesised) | Read-only counter exposed by `/api/v1/auth/runtime-config`; mirrors `len(CERTCTL_TRUSTED_PROXIES)`. Not operator-settable; documented here so the G-3 env-docs-drift guard catches drift. |
|
|
| `CERTCTL_BOOTSTRAP_TOKEN` | (empty) | One-shot token used to mint the first admin role binding via `POST /api/v1/auth/bootstrap`. Once consumed, deletes itself from memory and unsets the bootstrap endpoint. |
|
|
| `CERTCTL_BOOTSTRAP_TOKEN_SET` | (synthesised) | Boolean exposed by `/api/v1/auth/runtime-config`; `true` when `CERTCTL_BOOTSTRAP_TOKEN` was set at server start. Not operator-settable; documented here so the G-3 guard catches drift. |
|
|
| `CERTCTL_BOOTSTRAP_OIDC_PROVIDER_ID` | (empty) | When OIDC is enabled, restricts the first-admin OIDC strategy to the named provider only — any other provider's tokens won't trigger the bootstrap hook. |
|
|
| `CERTCTL_BOOTSTRAP_ADMIN_GROUPS_COUNT` | (synthesised) | Read-only counter exposed by `/api/v1/auth/runtime-config`; mirrors `len(CERTCTL_BOOTSTRAP_ADMIN_GROUPS)`. Documented here so the G-3 guard catches drift. |
|
|
| `CERTCTL_BREAKGLASS_LOCKOUT_THRESHOLD` | `5` | Number of consecutive failed `/auth/breakglass/login` attempts that lock the credential. |
|
|
|
|
## SCEP profile binding (single-profile back-compat)
|
|
|
|
| Variable | Default | Description |
|
|
|---|---|---|
|
|
| `CERTCTL_SCEP_PROFILE_ID` | (empty) | Optional certificate profile ID for the legacy single-profile SCEP path. The multi-profile path uses `CERTCTL_SCEP_PROFILES=<list>` + `CERTCTL_SCEP_PROFILE_<NAME>_PROFILE_ID` instead — see [`scep-server.md`](protocols/scep-server.md). |
|
|
|
|
## Related references
|
|
|
|
- [`architecture.md`](architecture.md) — scheduler topology, system design, security model
|
|
- [`deployment-model.md`](deployment-model.md) — atomic write + verify + rollback contract
|
|
- [`operator/security.md`](../operator/security.md) — full security posture (auth, rate limits, encryption at rest)
|
|
- [`operator/tls.md`](../operator/tls.md) — control-plane TLS env vars
|
|
- Per-connector pages under [`reference/connectors/`](connectors/index.md) for connector-specific config
|
|
- Per-protocol pages under [`reference/protocols/`](protocols/) for ACME / SCEP / EST / CRL+OCSP / async-CA polling
|