# Configuration Reference

> Last reviewed: 2026-05-05

Compact reference for `CERTCTL_*` environment variables consumed by
`certctl-server` and `certctl-agent`. Most operators don't need to
touch these; the defaults are tuned for the common case. Reach for them
when the system's behaviour needs tuning beyond what's exposed in the
GUI / API.

This page enumerates the operator-tunable knobs that don't have a
dedicated home elsewhere. Connector-specific env vars are documented
on the per-connector pages under
[`docs/reference/connectors/`](connectors/index.md). Protocol env
vars (ACME server, EST, SCEP) are documented under
[`docs/reference/protocols/`](protocols/). TLS env vars are
documented in [`docs/operator/tls.md`](../operator/tls.md).

## Scheduler intervals

The scheduler runs a set of background loops; their intervals can be
adjusted to manage performance and contention.

| Variable | Default | Description |
|---|---|---|
| `CERTCTL_SCHEDULER_AGENT_HEALTH_CHECK_INTERVAL` | `2m` | How often the agent-health loop scans for stale heartbeats and transitions agents to `Unhealthy` / `Offline`. |
| `CERTCTL_SCHEDULER_JOB_PROCESSOR_INTERVAL` | `30s` | How often the job-processor loop dispatches `Pending` jobs to agents. |
| `CERTCTL_SCHEDULER_NOTIFICATION_PROCESS_INTERVAL` | `1m` | How often the notification-dispatcher loop fans out queued alerts to channels. |
| `CERTCTL_SHORT_LIVED_EXPIRY_CHECK_INTERVAL` | `5m` | How often the short-lived-expiry loop watches certs whose TTL is less than 1h for imminent expiry. |

For the full scheduler topology (14 loops, 9 always-on + 5 opt-in)
see [`architecture.md`](architecture.md) "Scheduler topology".

## Job lifecycle

| Variable | Default | Description |
|---|---|---|
| `CERTCTL_JOB_AWAITING_CSR_TIMEOUT` | `24h` | How long a job stays in `AwaitingCSR` before the scheduler marks it `Failed` (the agent never picked it up). |

## Rate limiting

The control plane API is rate-limited by default; tune for
high-volume environments (mass-rotation events, bulk imports).

| Variable | Default | Description |
|---|---|---|
| `CERTCTL_RATE_LIMIT_ENABLED` | `true` | Master toggle. Disable only for trusted-network single-tenant deploys where the API is firewall-protected. |
| `CERTCTL_RATE_LIMIT_PER_USER_RPS` | `0` (= use global default) | Per-user requests-per-second cap. Zero opts each user into the global default in `internal/api/middleware`. |
| `CERTCTL_RATE_LIMIT_PER_USER_BURST` | `0` (= use global default) | Per-user token-bucket burst size. Same opt-in semantics. |

## Audit trail

| Variable | Default | Description |
|---|---|---|
| `CERTCTL_AUDIT_FLUSH_TIMEOUT_SECONDS` | `30` | How long the audit-event flush worker waits for the buffered batch to drain before forcing a flush at shutdown. |

## Deploy verification

The deploy-hardening primitive wraps every cert deploy in
atomic-write + post-verify + rollback. These env vars tune the
post-deploy TLS verification phase.

| Variable | Default | Description |
|---|---|---|
| `CERTCTL_VERIFY_DEPLOYMENT` | `true` | Master toggle for post-deploy TLS verify. Disable only for connectors / environments where the verify endpoint is not reachable from the agent. |
| `CERTCTL_VERIFY_DELAY` | `2s` | How long to wait after the reload command completes before the first verify-handshake attempt (gives the daemon time to pick up new keys). |
| `CERTCTL_VERIFY_TIMEOUT` | `10s` | Per-attempt TLS-handshake timeout. |
| `CERTCTL_DEPLOY_BACKUP_RETENTION` | `3` | How many `.certctl-bak.<unix-nanos>.<ext>` rollback snapshots to keep per target after a successful deploy. `0` uses the default of 3; `-1` opts out of pruning entirely. |

For the full deploy contract see
[`deployment-model.md`](deployment-model.md).

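
The special values of `CERTCTL_DEPLOY_BACKUP_RETENTION` are easy to misread, so the sketch below works through them. It assumes snapshots are ordered by the `<unix-nanos>` part of their filename and that the newest ones are kept. Both function names are hypothetical, and treating every negative value like `-1` is an assumption of this sketch, not documented behaviour.

```go
package main

import (
	"fmt"
	"sort"
)

// effectiveRetention maps the raw setting to a keep-count.
// 0 means "use the default of 3"; a negative result means "never prune".
// Treating all negative values like -1 is an assumption.
func effectiveRetention(raw int) int {
	switch {
	case raw == 0:
		return 3
	case raw < 0:
		return -1
	default:
		return raw
	}
}

// snapshotsToPrune takes the unix-nano timestamps of a target's
// snapshots and returns the ones to delete, oldest first, so that the
// newest effectiveRetention(raw) snapshots survive.
func snapshotsToPrune(stamps []int64, raw int) []int64 {
	keep := effectiveRetention(raw)
	if keep < 0 || len(stamps) <= keep {
		return nil
	}
	sorted := append([]int64(nil), stamps...) // copy; leave caller's slice alone
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })
	return sorted[:len(sorted)-keep]
}

func main() {
	stamps := []int64{500, 100, 400, 200, 300} // illustrative timestamps

	fmt.Println(snapshotsToPrune(stamps, 0))  // [100 200]: default keeps newest 3
	fmt.Println(snapshotsToPrune(stamps, -1)) // []: pruning disabled
	fmt.Println(snapshotsToPrune(stamps, 4))  // [100]: keeps newest 4
}
```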
## Database

| Variable | Default | Description |
|---|---|---|
| `CERTCTL_DATABASE_MIGRATIONS_PATH` | `./migrations` | Filesystem path to the `*.up.sql` / `*.down.sql` migration set. Override only when running `certctl-server` from a non-standard layout. |

## Agent

| Variable | Default | Description |
|---|---|---|
| `CERTCTL_AGENT_ID` | (none; required) | The agent's unique ID, returned in the response to `POST /api/v1/agents/register`. Set this env var when the agent runs as a systemd unit or container and the `-agent-id` CLI flag is not passed. |

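
The agent ID can arrive either through the `-agent-id` flag or through `CERTCTL_AGENT_ID`. One plausible resolution order is sketched below using the standard `flag` package. `resolveAgentID` is a hypothetical function, the precedence of flag over env var is an assumption, and the ID strings are made up for illustration.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
)

// resolveAgentID prefers the -agent-id flag, then CERTCTL_AGENT_ID, and
// returns an error when neither is set. The flag-over-env precedence is
// an assumption of this sketch. getenv is injected so the function can
// be exercised without touching the process environment.
func resolveAgentID(args []string, getenv func(string) string) (string, error) {
	fs := flag.NewFlagSet("certctl-agent", flag.ContinueOnError)
	id := fs.String("agent-id", "", "agent ID from the registration response")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	if *id != "" {
		return *id, nil
	}
	if v := getenv("CERTCTL_AGENT_ID"); v != "" {
		return v, nil
	}
	return "", errors.New("agent ID required: pass -agent-id or set CERTCTL_AGENT_ID")
}

func main() {
	// Stand-in for os.Getenv, as in a systemd unit that sets the env var.
	env := func(string) string { return "agent-from-env" }

	id, err := resolveAgentID(nil, env)
	fmt.Println(id, err)

	id, err = resolveAgentID([]string{"-agent-id", "agent-from-flag"}, env)
	fmt.Println(id, err)
}
```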
## SCEP profile binding (single-profile back-compat)

| Variable | Default | Description |
|---|---|---|
| `CERTCTL_SCEP_PROFILE_ID` | (empty) | Optional certificate profile ID for the legacy single-profile SCEP path. The multi-profile path uses `CERTCTL_SCEP_PROFILES=<list>` + `CERTCTL_SCEP_PROFILE_<NAME>_PROFILE_ID` instead; see [`scep-server.md`](protocols/scep-server.md). |

## Related references

- [`architecture.md`](architecture.md): scheduler topology, system design, security model
- [`deployment-model.md`](deployment-model.md): atomic write + verify + rollback contract
- [`operator/security.md`](../operator/security.md): full security posture (auth, rate limits, encryption at rest)
- [`operator/tls.md`](../operator/tls.md): control-plane TLS env vars
- Per-connector pages under [`reference/connectors/`](connectors/index.md) for connector-specific config
- Per-protocol pages under [`reference/protocols/`](protocols/) for ACME / SCEP / EST / CRL+OCSP / async-CA polling