Closes the #9 acquisition-readiness blocker from the 2026-05-01 issuer coverage audit.

Pre-fix, JobService.ProcessPendingJobs ran every claimed job sequentially in a single goroutine: safe but slow, and operators with large fleets had no lever to dial throughput up. Switching to fire-and-forget per-job goroutines would have unbounded the upstream-CA call rate and tripped DigiCert / Entrust / Sectigo rate limits; certctl's response to 429 was to retry on the next tick, re-fanning out the same calls and digging deeper into the limit. Operators need a knob.

This commit:

- Adds CERTCTL_RENEWAL_CONCURRENCY env var (default 25) loaded via the existing getEnvInt pattern in internal/config/config.go. Documented inline as the cap for the per-tick renewal/issuance/deployment goroutine fan-out, with operator-tuning guidance: permissive upstream limits + large fleets (>10k certs) → 100; strict limits or async-CA-heavy fleets → 25 or lower.
- Wires golang.org/x/sync/semaphore.Weighted around the per-job goroutine launch in JobService.ProcessPendingJobs. Acquire(ctx, 1) is the load-bearing piece: it BLOCKS the loop when at the cap, providing real backpressure rather than fire-and-forget. The fan-out is split into processPendingJobsSequential (legacy, preserved for unit-test wiring that doesn't call SetRenewalConcurrency) and processPendingJobsConcurrent (production, delegates to a generic boundedFanOut helper).
- boundedFanOut takes the per-job work as a closure so the cap can be tested directly without standing up the renewal/deployment service graph. processed/failed counters use atomic.Int64 to avoid mutex overhead on every job completion; final log line reads both AFTER wg.Wait so the counts reflect every dispatched job. ctx-aware Acquire ensures a shutdown ctx cancel interrupts the dispatch loop promptly; in-flight goroutines drain via Wait before the function returns so no goroutine outlives the scheduler tick.
- shouldSkipJob extracted as a package-private helper so the agent-routed-deployment skip logic is shared between the sequential and concurrent paths byte-for-byte (the audit prompt's "channel-based semaphore without ctx-aware acquire" anti-pattern is explicitly avoided: semaphore.Weighted.Acquire returns on ctx done; channel <- struct{}{} would block forever).
- SetRenewalConcurrency setter on JobService normalises ≤0 to 1. semaphore.NewWeighted(0) constructs a semaphore that blocks every Acquire forever; the normalisation prevents a misconfigured env var from wedging the scheduler.
- cmd/server/main.go wires SetRenewalConcurrency(cfg.Scheduler.RenewalConcurrency) on the freshly-built jobService, immediately after SetAuditService. Production deployments always take the bounded path; tests that build JobService directly via NewJobService keep their strict-sequential behaviour because renewalConcurrency is the zero value.
- Tests in internal/service/job_concurrency_test.go:
  * TestBoundedFanOut_CapHolds: primary regression guard. 50 jobs × 50ms work × cap=5 → asserts peak in-flight never exceeds 5 AND reaches 5 at least once (catches both upper-bound regressions and gates that incorrectly cap below the configured value). Lock-free max via CompareAndSwap so the measurement instrument doesn't itself constrain concurrency.
  * TestBoundedFanOut_AllJobsRun: lower-bound, every non-skipped job is dispatched.
  * TestBoundedFanOut_SkipsAgentRoutedDeployments: pins the shouldSkipJob contract.
  * TestBoundedFanOut_CtxCancelInterrupts: ctx cancellation interrupts a stuck fan-out within the timeout budget.
  * TestBoundedFanOut_FailedJobsCounted: per-job errors don't abort the fan-out.
  * TestSetRenewalConcurrency_NormalizesNonPositive: ≤0 → 1 fail-safe pinned across negative/zero/positive inputs.
- docs/features.md: scheduler-loop table augmented with the concurrency-cap env-var pointer alongside the job-processor row.
- docs/architecture.md: Concurrency Safety section gains a paragraph explaining the cap, the operator-tuning guidance, the ctx-aware Acquire semantics, and the audit reference.

Operator-facing impact: the first big renewal sweep no longer takes down the upstream CA's rate-limit budget. Existing deployments get the bounded path automatically (default 25); operators can override via env var without code changes.

Verified locally:

- gofmt -l . clean
- go vet ./... clean
- staticcheck ./... clean
- go test -short -count=1 across service / scheduler / config / integration: green
- Six new tests under TestBoundedFanOut* + TestSetRenewalConcurrency*: green

Audit reference: cowork/issuer-coverage-audit-2026-05-01/RESULTS.md Top-10 fix #9.
certctl Feature Inventory
Complete reference of every feature shipped in certctl through v2.1.0 (April 2026). Every claim in this document is verified against source code. If a number, default, or behavior isn't here, check the source file listed in the margin.
At a Glance
| Surface | Count (rebuild command) |
|---|---|
| HTTP routes | rebuild via grep -cE 'r\.Register\("[A-Z]' internal/api/router/router.go |
| OpenAPI 3.1 operations | rebuild via grep -cE '^\s+operationId:' api/openapi.yaml |
| MCP tools | rebuild via grep -cE 'gomcp\.AddTool\(' internal/mcp/tools.go |
| CLI commands | rebuild via `grep -cE 'AddCommand |
| Issuer connectors | rebuild via ls -d internal/connector/issuer/*/ | wc -l (+ EST server) |
| Target connectors | rebuild via ls -d internal/connector/target/*/ | wc -l (includes shared certutil/) |
| Notifier connectors | rebuild via ls -d internal/connector/notifier/*/ | wc -l |
| Discovery connectors | rebuild via ls -d internal/connector/discovery/*/ | wc -l |
| Database tables | rebuild via grep -hE '^CREATE TABLE' migrations/*.up.sql | sed -E 's/CREATE TABLE (IF NOT EXISTS )?([a-zA-Z_]+).*/\2/' | sort -u | wc -l (across ls migrations/*.up.sql | wc -l migrations) |
| Background scheduler loops | rebuild via grep -cE '^func \(s \*Scheduler\) [a-zA-Z]+Loop' internal/scheduler/scheduler.go |
| Web dashboard pages | rebuild via ls web/src/pages/*.tsx | grep -v '\.test\.' | wc -l |
| Test functions (Go backend) | rebuild via the find + grep '^func Test' recipe in CLAUDE.md::Current-state commands |
| Supported platforms | linux/amd64, linux/arm64, darwin/amd64, darwin/arm64 |
API Surface
Authentication
Every API call requires authentication by default. Configurable via CERTCTL_AUTH_TYPE.
| Setting | Behavior |
|---|---|
| api-key (default) | SHA-256 hashed keys, constant-time comparison, Authorization: Bearer {key} |
| none | Disables auth with a log warning at startup |
Two endpoints are served without auth so the GUI can detect auth mode before login:
- GET /api/v1/auth/info: returns {"auth_type":"api-key"}
- GET /api/v1/auth/check: validates credentials
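As a rough illustration of the api-key mode, the sketch below hashes the presented key with SHA-256 and compares it to a stored hash in constant time. The function names and the idea of storing a raw 32-byte digest are assumptions for the example; the actual middleware may be organised differently.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"fmt"
	"strings"
)

// hashKey returns the SHA-256 digest of an API key (hypothetical helper).
func hashKey(key string) [32]byte {
	return sha256.Sum256([]byte(key))
}

// checkBearer extracts the key from an Authorization header value and
// compares its hash with the stored hash in constant time.
func checkBearer(header string, stored [32]byte) bool {
	const prefix = "Bearer "
	if !strings.HasPrefix(header, prefix) {
		return false
	}
	key := strings.TrimPrefix(header, prefix)
	if key == "" {
		return false
	}
	got := hashKey(key)
	return subtle.ConstantTimeCompare(got[:], stored[:]) == 1
}

func main() {
	stored := hashKey("example-key") // assumed to be loaded from configuration
	fmt.Println(checkBearer("Bearer example-key", stored))
	fmt.Println(checkBearer("Bearer wrong-key", stored))
}
```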
Rate Limiting
Token bucket algorithm protecting the control plane from misbehaving clients.
Bundle B (Audit M-025 / OWASP ASVS L2 §11.2.1): per-key keying. Each
authenticated caller gets a bucket keyed on their API-key name; each
unauthenticated source IP gets its own bucket. Bucket creation is
on-demand under a sync.RWMutex; no eviction (the leak is bounded by
realistic operator IP fan-out — appropriate for the OWASP ASVS L2 threat
model of abuse-by-known-clients, not infinite-cardinality scanners).
| Env Var | Default | Description |
|---|---|---|
| CERTCTL_RATE_LIMIT_ENABLED | true | Enable/disable |
| CERTCTL_RATE_LIMIT_RPS | 50 | Per-key requests per second (default applies to IP-keyed buckets; user-keyed buckets fall back to this when PER_USER_RPS is unset) |
| CERTCTL_RATE_LIMIT_BURST | 100 | Per-key burst capacity (default applies to IP-keyed buckets; user-keyed buckets fall back to this when PER_USER_BURST is unset) |
| CERTCTL_RATE_LIMIT_PER_USER_RPS | 0 | Override RPS for authenticated callers. 0 means "use RATE_LIMIT_RPS". Set higher than RATE_LIMIT_RPS to grant authenticated clients a more generous budget than anonymous probes. |
| CERTCTL_RATE_LIMIT_PER_USER_BURST | 0 | Override burst for authenticated callers. 0 means "use RATE_LIMIT_BURST". |
Exceeded requests receive 429 Too Many Requests with a Retry-After header.
CORS
Deny-by-default. Empty CERTCTL_CORS_ORIGINS blocks all cross-origin requests.
| Env Var | Default | Description |
|---|---|---|
| CERTCTL_CORS_ORIGINS | "" (deny all) | Comma-separated origins or * |
Preflight responses include Access-Control-Max-Age for caching.
Request Body Size Limits
http.MaxBytesReader middleware positioned before auth in the middleware chain.
| Env Var | Default | Description |
|---|---|---|
| CERTCTL_MAX_BODY_SIZE | 1048576 (1 MB) | Maximum request body in bytes |
Agent Bootstrap Token
Pre-shared secret enforced on POST /api/v1/agents. When set, the registration handler requires Authorization: Bearer <token> and verifies via crypto/subtle.ConstantTimeCompare BEFORE the JSON body parse — defeats both timing oracles and unauth payload allocation. Mismatch / missing / malformed → 401 invalid_or_missing_bootstrap_token.
| Env Var | Default | Description |
|---|---|---|
| CERTCTL_AGENT_BOOTSTRAP_TOKEN | "" (warn-mode pass-through) | Bearer token agents must present on first registration. v2.2.0 will require it; unset emits a one-shot startup deprecation WARN. Generate with openssl rand -hex 32. |
Graceful Shutdown Audit Flush
On SIGTERM / SIGINT, the server drains in-flight audit recordings before closing the DB pool. The drain budget is shared with the HTTP server graceful shutdown.
| Env Var | Default | Description |
|---|---|---|
| CERTCTL_AUDIT_FLUSH_TIMEOUT_SECONDS | 30 | Total budget (seconds) for HTTP shutdown + scheduler completion + audit-event drain. WARN-log on deadline exceeded; never exit hard. |
Liveness vs Readiness Probes
| Endpoint | Purpose | Probe |
|---|---|---|
| GET /health | Liveness: process alive only. Returns 200 unconditionally; never restart pods for DB hiccups. | k8s livenessProbe |
| GET /ready | Readiness: runs db.PingContext with 2 s ceiling. Returns 503 + {"status":"db_unavailable"} when DB unreachable so k8s drains the pod. | k8s readinessProbe |
Query Features
All list endpoints support:
- Pagination: page-based (?page=2&per_page=50) and cursor-based (?cursor=<token>&page_size=100)
- Sparse fields: ?fields=id,common_name,status returns only requested fields
- Sorting: ?sort=-notAfter (prefix - for descending). Whitelist: notAfter, expiresAt, createdAt, updatedAt, commonName, name, status, environment
- Time-range filters: ?expires_before=, ?expires_after=, ?created_after=, ?updated_after= (RFC 3339)
- Resource filters: ?agent_id=, ?profile_id=, ?owner_id=, ?team_id=, ?issuer_id=, ?status=
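The sort parameter handling can be illustrated with a small parser that strips the descending prefix and checks the field against the whitelist. The parseSort name and the error wording are assumptions; only the prefix convention and the field list come from the text above.

```go
package main

import (
	"fmt"
	"strings"
)

// sortable mirrors the whitelist of sort fields listed above.
var sortable = map[string]bool{
	"notAfter": true, "expiresAt": true, "createdAt": true, "updatedAt": true,
	"commonName": true, "name": true, "status": true, "environment": true,
}

// parseSort splits a ?sort= value into field and direction and rejects
// fields outside the whitelist (hypothetical helper).
func parseSort(raw string) (field string, desc bool, err error) {
	desc = strings.HasPrefix(raw, "-")
	field = strings.TrimPrefix(raw, "-")
	if !sortable[field] {
		return "", false, fmt.Errorf("unsupported sort field %q", field)
	}
	return field, desc, nil
}

func main() {
	field, desc, err := parseSort("-notAfter")
	fmt.Println(field, desc, err) // prints: notAfter true <nil>
	_, _, err = parseSort("password")
	fmt.Println(err != nil) // prints: true
}
```

Checking against a fixed whitelist keeps user input out of any ORDER BY clause built from the field name.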
API Audit Log
Every API call is recorded to the immutable audit trail. Best-effort (non-blocking) via goroutine. Fields: method, path, actor (from auth context, falls back to "anonymous"), SHA-256 request body hash (truncated 16 chars), response status, latency. Health/readiness endpoints excluded via ExcludePaths.
Certificate Lifecycle
Certificate Statuses
| Status | Description |
|---|---|
| Pending | Created, awaiting issuance |
| Active | Issued and valid |
| Expiring | Within configured alert threshold |
| Expired | Past notAfter |
| RenewalInProgress | Renewal job in flight |
| Failed | Issuance or renewal failed |
| Revoked | Explicitly revoked |
| Archived | Superseded by newer version |
Key Generation Modes
| Mode | Env Var Value | Behavior |
|---|---|---|
| Agent-side (default) | CERTCTL_KEYGEN_MODE=agent | Agent generates ECDSA P-256 key pair locally, submits CSR only. Private keys never leave agent infrastructure. Keys stored at CERTCTL_KEY_DIR (default /var/lib/certctl/keys) with 0600 permissions. |
| Server-side (demo only) | CERTCTL_KEYGEN_MODE=server | Server generates RSA key + CSR. Logs a warning at startup. Used in Docker Compose demo for convenience. |
Issuance Flow
- Certificate created (status: Pending)
- Renewal/issuance job created (status: Pending or AwaitingCSR in agent keygen mode)
- Agent polls GET /agents/{id}/work, receives job with common_name and sans
- Agent generates ECDSA P-256 key pair, creates CSR, submits via POST /agents/{id}/csr
- Server forwards CSR to issuer connector, stores signed certificate
- Deployment jobs created for each target (scoped to assigned agent via agent_id)
- Agent polls for deployment work, deploys to target connector
- Optional: post-deployment TLS verification
Renewal
The renewal scheduler runs every hour (configurable via CERTCTL_SCHEDULER_RENEWAL_CHECK_INTERVAL). For each certificate approaching expiration:
- Checks ACME ARI (RFC 9773) if available; CA-directed renewal timing takes priority
- Falls back to threshold-based logic using per-policy alert_thresholds_days (default [30, 14, 7, 0])
- Creates renewal job if thresholds are met and no duplicate job exists
Interactive Approval
Jobs can require manual approval before execution. The AwaitingApproval state pauses the job until an operator acts.
- POST /api/v1/jobs/{id}/approve: approve with optional reason
- POST /api/v1/jobs/{id}/reject: reject with reason tracking
Expiration Alerting
Configurable per-policy thresholds stored as alert_thresholds_days JSONB (default [30, 14, 7, 0]). The scheduler:
- Sends deduplicated notifications at each threshold crossing
- Transitions certificate status: Active → Expiring → Expired
- Short-lived certs (profile TTL < 1 hour) get a dedicated scheduler loop running every 30 seconds
Revocation Infrastructure
Revocation API
POST /api/v1/certificates/{id}/revoke with RFC 5280 reason codes:
| Reason | CRL Code |
|---|---|
| unspecified | 0 |
| keyCompromise | 1 |
| caCompromise | 2 |
| affiliationChanged | 3 |
| superseded | 4 |
| cessationOfOperation | 5 |
| certificateHold | 6 |
| privilegeWithdrawn | 9 |
Revocation is a 7-step process: validate eligibility → get serial → update status → record in certificate_revocations table → notify issuer (best-effort) → audit → send notification.
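The reason-to-code mapping in the table can be expressed as a simple lookup. The sketch below is illustrative only; the reasonCode helper and its error message are assumptions, while the names and numeric codes are taken from the table.

```go
package main

import "fmt"

// reasonCodes maps the accepted revocation reasons to their CRL reason codes.
var reasonCodes = map[string]int{
	"unspecified":          0,
	"keyCompromise":        1,
	"caCompromise":         2,
	"affiliationChanged":   3,
	"superseded":           4,
	"cessationOfOperation": 5,
	"certificateHold":      6,
	"privilegeWithdrawn":   9,
}

// reasonCode validates a reason string and returns its CRL code
// (hypothetical helper).
func reasonCode(reason string) (int, error) {
	code, ok := reasonCodes[reason]
	if !ok {
		return 0, fmt.Errorf("unknown revocation reason %q", reason)
	}
	return code, nil
}

func main() {
	code, err := reasonCode("keyCompromise")
	fmt.Println(code, err) // prints: 1 <nil>
	_, err = reasonCode("lostLaptop")
	fmt.Println(err != nil) // prints: true
}
```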
Bulk Revocation
POST /api/v1/certificates/bulk-revoke revokes multiple certificates matching filter criteria in a single operation.
Filter criteria (at least one required):
- profile_id: revoke all certs issued with this profile
- owner_id: revoke all certs owned by this owner
- agent_id: revoke all certs deployed to this agent
- issuer_id: revoke all certs from this issuer
- team_id: revoke all certs owned by members of this team
- certificate_ids: array of specific cert IDs to revoke
Request body example:
{
"reason": "keyCompromise",
"profile_id": "prof-staging",
"team_id": "team-platform"
}
Response:
{
"job_id": "job-bulk-rev-123",
"criteria": {
"reason": "keyCompromise",
"profile_id": "prof-staging",
"team_id": "team-platform"
},
"affected_count": 47,
"status": "Pending"
}
Behavior:
- Individual revocation jobs created for each matching cert (reuses existing revocation flow)
- Progress tracked via job system (job status: Pending → Running → Completed)
- Partial failures tolerated: if 47 certs match but 3 fail, the other 44 still revoke
- Audit trail: single bulk_revocation_initiated event logs the criteria and actor
- Optional --reason defaults to unspecified if omitted
CRL Endpoint
- GET /.well-known/pki/crl/{issuer_id}: DER-encoded X.509 CRL signed by the issuing CA, 24-hour validity (RFC 5280 §5 + RFC 8615). Served unauthenticated with Content-Type: application/pkix-crl so relying parties without certctl API credentials can fetch it.
The CRL is pre-generated by the scheduler's crlGenerationLoop (internal/scheduler/scheduler.go) on a configurable interval (CERTCTL_CRL_GENERATION_INTERVAL, default 1h) and persisted in the crl_cache table (migration 000019). HTTP fetches read from the cache rather than rebuilding per request — a busy CA does not DOS itself at scale. Concurrent regeneration requests for the same issuer are coalesced via an in-tree singleflight gate (internal/service/crl_cache.go, ~30 LoC; no golang.org/x/sync dependency). Per-issuer generation events are recorded in crl_generation_events for ops visibility.
Prior non-standard JSON CRL and authenticated /api/v1/crl* paths were removed in M-006 — RFC 5280 defines only the DER wire format and relying parties do not have API keys.
OCSP Responder
certctl serves both forms RFC 6960 §A.1.1 defines:
- GET /.well-known/pki/ocsp/{issuer_id}/{serial}: URL-path lookup (useful for ops curl-debugging).
- POST /.well-known/pki/ocsp/{issuer_id}: binary application/ocsp-request body (the form most production clients use: Firefox, OpenSSL s_client -status, cert-manager, Intune).
Both forms are unauthenticated and return signed OCSP responses (good/revoked/unknown) with Content-Type: application/ocsp-response.
OCSP responses are signed by a dedicated per-issuer OCSP responder cert (RFC 6960 §2.6 / §4.2.2.2, migration 000020) — NOT by the CA private key directly. The responder cert is generated on first OCSP request via OCSPResponderService.EnsureResponder (internal/connector/issuer/local/ocsp_responder.go), persisted in the ocsp_responders table, and carries the id-pkix-ocsp-nocheck extension (OID 1.3.6.1.5.5.7.48.1.5, RFC 6960 §4.2.2.2.1) so OCSP clients do not recursively check the responder's own revocation status. The responder cert auto-rotates within CERTCTL_OCSP_RESPONDER_ROTATION_GRACE (default 7d) of expiry; new certs default to CERTCTL_OCSP_RESPONDER_VALIDITY (30d). Self-healing: if the persisted responder key file is missing (operator pruned the keydir), the service treats this as "rotate now" rather than crashing. Local CA + step-CA connectors expose CRL+OCSP; upstream issuers (Vault, EJBCA, DigiCert) serve their own infrastructure.
Admin Cache Observability
GET /api/v1/admin/crl/cache — admin-gated (Bearer required, admin flag enforced server-side via middleware.IsAdmin; returns HTTP 403 for non-admin callers). Returns the per-issuer cache state: crl_number, this_update, next_update, generated_at, generation_duration_ms, revoked_count, is_stale, plus the most-recent N generation events. Used by ops dashboards and the GUI cert-detail page's cache-age badge. The handler is pinned to the M-008 admin-gated handler allowlist (internal/api/handler/m008_admin_gate_test.go) — adding a new admin endpoint without the regression triplet (_NonAdmin_Returns403 / _AdminExplicitFalse_Returns403 / _AdminPermitted_ForwardsActor) fails CI.
GUI Revocation Endpoints Panel
The certificate-detail page (web/src/pages/CertificateDetailPage.tsx) renders a Revocation Endpoints card that shows the CRL Distribution Point URL (https://<host>/.well-known/pki/crl/<issuer_id>) and OCSP Responder URL (https://<host>/.well-known/pki/ocsp/<issuer_id>), plus two action buttons: "Test CRL fetch" (calls fetchCRL(issuer_id), shows byte count + content-type) and "Check OCSP status" (calls getOCSPStatus(issuer_id, serial_hex), shows DER response size). For admin callers, a cache-age badge ("Cache fresh · 2m ago" / "Cache stale" / "Not yet generated") consumes the admin observability endpoint above; non-admin callers don't trigger the fetch (gated client-side on useAuth().admin) so the badge cannot leak generation cadence.
Short-Lived Certificate Exemption
Certificates with profile TTL < 1 hour skip CRL/OCSP. Expiry is sufficient revocation for short-lived credentials.
For the full operator + relying-party guide (curl/OpenSSL/Firefox/cert-manager/Intune integration recipes, troubleshooting), see crl-ocsp.md.
Certificate Export
Two export formats. Private keys are never included — they live on agents only.
| Endpoint | Format | Notes |
|---|---|---|
| GET /api/v1/certificates/{id}/export/pem | PEM JSON or file download (?download=true) | Splits leaf from chain |
| POST /api/v1/certificates/{id}/export/pkcs12 | Binary .p12 with Content-Disposition | Cert-only bundle via go-pkcs12 EncodeTrustStore |
All exports generate audit events (export_pem, export_pkcs12) with serial number tracking.
Certificate Profiles
Named enrollment profiles defining crypto constraints and certificate properties. Stored in PostgreSQL with full CRUD API and GUI page.
Profile Fields
- Allowed key types (RSA 2048/4096, ECDSA P-256/P-384)
- Maximum TTL
- Required SANs
- Permitted Extended Key Usages (EKUs)
Crypto Policy Enforcement (M11c)
CSR validation is enforced at all five issuance paths: server-side renewal, agent-CSR renewal, agent fallback CSR submission, EST enrollment, and SCEP enrollment. When a certificate profile defines AllowedKeyAlgorithms, every incoming CSR is checked against the profile's rules — if the key algorithm or minimum size doesn't match, the request is rejected before reaching the issuer connector.
MaxTTL enforcement caps certificate validity at the profile's configured maximum. Behavior varies by issuer: the Local CA, Vault PKI, and step-ca enforce the cap directly (capping NotAfter or overriding TTL). OpenSSL logs an advisory warning. ACME, DigiCert, Sectigo, Google CAS, AWS ACM PCA, Entrust, GlobalSign, and EJBCA pass through because the CA controls validity. MaxTTL is resolved from the certificate profile at each issuance call site via resolveMaxTTL().
Key metadata persistence — when a certificate version is created from a CSR, the key algorithm (RSA, ECDSA, Ed25519) and key size (in bits) are extracted from the CSR and stored in the certificate_versions table (key_algorithm, key_size columns) for post-hoc compliance auditing.
Supported EKUs
| EKU Name | x509 Constant | Typical Use |
|---|---|---|
| serverAuth | ExtKeyUsageServerAuth | TLS servers |
| clientAuth | ExtKeyUsageClientAuth | Mutual TLS |
| codeSigning | ExtKeyUsageCodeSigning | Code signing |
| emailProtection | ExtKeyUsageEmailProtection | S/MIME |
| timeStamping | ExtKeyUsageTimeStamping | Timestamping |
Adaptive KeyUsage
The Local CA adjusts KeyUsage flags based on EKU:
- TLS profiles: DigitalSignature | KeyEncipherment
- S/MIME profiles: DigitalSignature | ContentCommitment
S/MIME Support
EKU threading from profile through the entire issuance flow. Agent CSR generation splits SANs by type — strings.Contains(san, "@") routes to EmailAddresses instead of DNSNames. Demo seed includes prof-smime profile with emailProtection EKU.
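The SAN routing rule can be shown in a few lines. The splitSANs helper below is a hypothetical name; the "@" test is the one quoted in the paragraph above, and the two result slices correspond to the DNSNames and EmailAddresses fields of a CSR template.

```go
package main

import (
	"fmt"
	"strings"
)

// splitSANs routes each SAN to DNS names or email addresses using the
// "contains @" rule (hypothetical helper).
func splitSANs(sans []string) (dnsNames, emails []string) {
	for _, san := range sans {
		if strings.Contains(san, "@") {
			emails = append(emails, san)
		} else {
			dnsNames = append(dnsNames, san)
		}
	}
	return dnsNames, emails
}

func main() {
	dns, emails := splitSANs([]string{"app.example.com", "alice@example.com"})
	fmt.Println(dns, emails) // prints: [app.example.com] [alice@example.com]
}
```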
Policy Engine
5 rule types with violation tracking and severity levels:
- Key algorithm requirements
- Minimum key size
- Maximum certificate lifetime
- Required SAN patterns
- Issuer restrictions
Policies can be scoped to agent groups via agent_group_id foreign key. Violations are tracked and surfaced in the dashboard.
Issuer Connectors
The issuer connector catalog (rebuild count via ls -d internal/connector/issuer/*/ | wc -l) implements the issuer.Connector interface. All support ValidateConfig, IssueCertificate, RenewCertificate, RevokeCertificate, GetOrderStatus, GenerateCRL, SignOCSPResponse, GetCACertPEM, GetRenewalInfo.
Local CA
Self-signed or sub-CA mode using crypto/x509.
| Env Var | Default | Description |
|---|---|---|
| CERTCTL_CA_CERT_PATH | (none) | Path to CA certificate PEM. When set, enables sub-CA mode. |
| CERTCTL_CA_KEY_PATH | (none) | Path to CA private key PEM (RSA, ECDSA, PKCS#8). |
| CERTCTL_CRL_GENERATION_INTERVAL | 1h | How often the scheduler walks every CRL-supporting issuer and rebuilds the cached CRL. HTTP fetches read from the cache, not from a per-request rebuild. |
| CERTCTL_OCSP_RESPONDER_KEY_DIR | (none) | Operator MUST set in production. Directory where the FileDriver persists each issuer's OCSP responder key (ocsp-responder-<issuer_id>.key). When unset, the responder service uses a temporary directory that does NOT survive restarts; fine for dev, NEVER for prod. |
| CERTCTL_OCSP_RESPONDER_ROTATION_GRACE | 7d | When the responder cert's NotAfter falls within this window, EnsureResponder rotates to a fresh cert+key on the next OCSP request or scheduler tick. |
| CERTCTL_OCSP_RESPONDER_VALIDITY | 30d | How long each newly-issued responder cert is valid for. Short by design: relying parties cache OCSP responses, not the responder cert chain, and id-pkix-ocsp-nocheck blocks recursive revocation checking on the responder itself. |
| CERTCTL_OCSP_RATE_LIMIT_PER_IP_MIN | 1000 | Production hardening II Phase 3. Per-source-IP cap on OCSP requests per minute. Zero disables the limit. Trip returns the canonical OCSP "unauthorized" status (RFC 6960 §2.3) plus Retry-After: 60. The limiter does NOT honor X-Forwarded-For (OCSP is publicly reachable; spoofed headers would bypass the cap). |
| CERTCTL_CERT_EXPORT_RATE_LIMIT_PER_ACTOR_HR | 50 | Production hardening II Phase 3. Per-actor cap on cert-export requests (PEM + PKCS#12) per hour. Zero disables. Trip returns HTTP 429 + JSON {"error":"rate_limit_exceeded","retry_after_seconds":3600} plus Retry-After: 3600. Defends against bulk-export from a compromised admin token. |
| CERTCTL_DEPLOY_BACKUP_RETENTION | 3 | Deploy-hardening I. How many <path>.certctl-bak.<unix-nanos> backup files the connector janitor keeps per deployed file. Setting to -1 disables backups entirely; rollback becomes impossible (documented foot-gun). Per-target override via the connector config's backup_retention field. |
| CERTCTL_K8S_DEPLOY_KUBELET_SYNC_TIMEOUT | 60s | Deploy-hardening I Phase 9. How long the K8s connector waits for kubelet sync after Secret update before timing out the post-deploy verify. Tunes for slow clusters (high pod count, slow node DNS). |
Sub-CA mode validates IsCA=true and KeyUsageCertSign on the loaded certificate. Falls back to self-signed when paths are not set. Supports CRL generation (GenerateCRL) and OCSP response signing (SignOCSPResponse). All CA-key signing flows through the signer.Signer interface (internal/crypto/signer/); the OCSP responder cert is signed by the CA via the existing issuance pipeline and OCSP responses are signed by the responder key (NOT the CA key directly) per RFC 6960 §2.6.
ACME
Full ACME v2 protocol via golang.org/x/crypto/acme.
| Env Var | Default | Description |
|---|---|---|
| CERTCTL_ACME_DIRECTORY_URL | https://acme-v02.api.letsencrypt.org/directory | ACME directory |
| CERTCTL_ACME_EMAIL | (required) | Account email |
| CERTCTL_ACME_CHALLENGE_TYPE | http-01 | Challenge type: http-01, dns-01, dns-persist-01 |
| CERTCTL_ACME_DNS_PRESENT_SCRIPT | (none) | Script to create DNS-01 TXT record |
| CERTCTL_ACME_DNS_CLEANUP_SCRIPT | (none) | Script to remove DNS-01 TXT record |
| CERTCTL_ACME_DNS_PROPAGATION_WAIT | 10s | Wait after DNS record creation |
| CERTCTL_ACME_DNS_PERSIST_ISSUER_DOMAIN | (none) | Issuer domain for DNS-PERSIST-01 |
| CERTCTL_ACME_EAB_KID | (none) | External Account Binding key ID |
| CERTCTL_ACME_EAB_HMAC | (none) | EAB HMAC key (base64url) |
| CERTCTL_ACME_ARI_ENABLED | false | Enable ACME Renewal Information (RFC 9773) |
| CERTCTL_ACME_PROFILE | (none) | Certificate profile for newOrder (e.g., tlsserver, shortlived) |
Challenge types:
- HTTP-01: Standard HTTP challenge via /.well-known/acme-challenge/token
- DNS-01: Pluggable DNS solver with script-based hooks. User-provided scripts create/cleanup _acme-challenge TXT records. Compatible with any DNS provider.
- DNS-PERSIST-01: Standing _validation-persist TXT record per IETF draft. Record value: <issuer-domain>; accounturi=<account-uri>. Set once, reused on every renewal. Auto-fallback to DNS-01 if CA doesn't support it.
External Account Binding (EAB): Required by ZeroSSL, Google Trust Services, SSL.com. For ZeroSSL, credentials are auto-fetched from api.zerossl.com/acme/eab-credentials-email when no EAB credentials are provided — zero-friction onboarding.
Certificate Profile Selection: Custom JWS-signed newOrder POST when profile is set (the golang.org/x/crypto/acme library lacks profile support). ES256 JWS signing with kid mode, nonce management, directory discovery. Empty profile delegates to the standard library path.
step-ca
Smallstep private CA via native /sign API with JWK provisioner authentication. Synchronous issuance.
| Env Var | Default | Description |
|---|---|---|
| CERTCTL_STEPCA_URL | (required) | step-ca server URL |
| CERTCTL_STEPCA_ROOT_CA | (required) | Path to step-ca root CA PEM |
| CERTCTL_STEPCA_PROVISIONER_NAME | (required) | JWK provisioner name |
| CERTCTL_STEPCA_PROVISIONER_KEY | (required) | Path to provisioner private key |
| CERTCTL_STEPCA_PROVISIONER_PASSWORD | (none) | Provisioner key password |
OpenSSL / Custom CA
Script-based signing delegating to user-provided shell scripts. Configurable timeout.
| Env Var | Default | Description |
|---|---|---|
| CERTCTL_OPENSSL_SIGN_SCRIPT | (required) | Script that signs a CSR (receives CSR on stdin, outputs PEM on stdout) |
| CERTCTL_OPENSSL_REVOKE_SCRIPT | (none) | Script for revocation |
| CERTCTL_OPENSSL_CRL_SCRIPT | (none) | Script for CRL generation |
| CERTCTL_OPENSSL_TIMEOUT_SECONDS | 30 | Script execution timeout |
Vault PKI
HashiCorp Vault /v1/{mount}/sign/{role} API. Token auth, synchronous issuance.
| Env Var | Default | Description |
|---|---|---|
| CERTCTL_VAULT_ADDR | (required) | Vault server URL |
| CERTCTL_VAULT_TOKEN | (required) | Vault token |
| CERTCTL_VAULT_MOUNT | pki | PKI secrets engine mount path |
| CERTCTL_VAULT_ROLE | (required) | PKI role name |
| CERTCTL_VAULT_TTL | 8760h | Certificate TTL |
CRL/OCSP delegated to Vault. Revocation via POST /v1/{mount}/revoke with serial number normalization.
DigiCert CertCentral
Async order model: submit → poll → download. OV/EV support.
| Env Var | Default | Description |
|---|---|---|
| CERTCTL_DIGICERT_API_KEY | (required) | X-DC-DEVKEY auth header |
| CERTCTL_DIGICERT_ORG_ID | (required) | Organization ID |
| CERTCTL_DIGICERT_PRODUCT_TYPE | ssl_basic | Product type |
| CERTCTL_DIGICERT_BASE_URL | https://www.digicert.com/services/v2 | API base URL |
Issuance returns OrderID when pending. GetOrderStatus polls via GET /order/certificate/{order_id}, downloads PEM bundle when issued.
Sectigo SCM
Async order model: enroll → poll → collect PEM. 3-header auth.
| Env Var | Default | Description |
|---|---|---|
| CERTCTL_SECTIGO_CUSTOMER_URI | (required) | Customer URI header |
| CERTCTL_SECTIGO_LOGIN | (required) | Login header |
| CERTCTL_SECTIGO_PASSWORD | (required) | Password header |
| CERTCTL_SECTIGO_ORG_ID | (required) | Organization ID |
| CERTCTL_SECTIGO_CERT_TYPE | (required) | Certificate type ID |
| CERTCTL_SECTIGO_TERM | 365 | Certificate term in days |
| CERTCTL_SECTIGO_BASE_URL | https://cert-manager.com/api | API base URL |

Handles collect-not-ready (HTTP 400 / error code -183) gracefully: the cert is approved but not yet generated.
Google CAS
Google Cloud Certificate Authority Service. OAuth2 service account auth (JWT → access token), synchronous issuance. No Google SDK dependency — all stdlib.
| Env Var | Default | Description |
|---|---|---|
| CERTCTL_GOOGLE_CAS_PROJECT | (required) | GCP project ID |
| CERTCTL_GOOGLE_CAS_LOCATION | (required) | GCP region |
| CERTCTL_GOOGLE_CAS_CA_POOL | (required) | CA pool name |
| CERTCTL_GOOGLE_CAS_CREDENTIALS | (required) | Path to service account JSON |
| CERTCTL_GOOGLE_CAS_TTL | 8760h | Certificate TTL |
Token caching with sync.Mutex and 5-minute refresh buffer. RS256 JWT signing.
AWS ACM Private CA
Synchronous issuance via IssueCertificate + GetCertificate AWS APIs. Injectable ACMPCAClient interface.
| Env Var | Default | Description |
|---|---|---|
| CERTCTL_AWS_PCA_REGION | (required) | AWS region |
| CERTCTL_AWS_PCA_CA_ARN | (required) | CA ARN |
| CERTCTL_AWS_PCA_SIGNING_ALGORITHM | SHA256WITHRSA | Signing algorithm |
| CERTCTL_AWS_PCA_VALIDITY_DAYS | 365 | Certificate validity |
| CERTCTL_AWS_PCA_TEMPLATE_ARN | (none) | Optional template ARN |
Revocation with RFC 5280 reason mapping. CRL/OCSP delegated to AWS.
Entrust Certificate Services
Entrust CA Gateway REST API with mTLS client certificate auth. Synchronous or approval-pending issuance.
| Env Var | Default | Description |
|---|---|---|
| CERTCTL_ENTRUST_API_URL | (required) | Entrust CA Gateway base URL |
| CERTCTL_ENTRUST_CLIENT_CERT_PATH | (required) | Path to mTLS client certificate PEM |
| CERTCTL_ENTRUST_CLIENT_KEY_PATH | (required) | Path to mTLS client private key PEM |
| CERTCTL_ENTRUST_CA_ID | (required) | Certificate Authority ID |
| CERTCTL_ENTRUST_PROFILE_ID | (none) | Optional enrollment profile ID |
mTLS authentication via tls.LoadX509KeyPair(). Issuance returns PEM immediately (200) or tracking ID for approval-pending orders (201). CRL/OCSP delegated to Entrust.
GlobalSign Atlas HVCA
GlobalSign Atlas High Volume CA with dual auth: mTLS + API key/secret headers. Region-aware base URLs.
| Env Var | Default | Description |
|---|---|---|
| `CERTCTL_GLOBALSIGN_API_URL` | (required) | Atlas HVCA API URL (region-specific) |
| `CERTCTL_GLOBALSIGN_API_KEY` | (required) | API key |
| `CERTCTL_GLOBALSIGN_API_SECRET` | (required) | API secret |
| `CERTCTL_GLOBALSIGN_CLIENT_CERT_PATH` | (required) | Path to mTLS client certificate PEM |
| `CERTCTL_GLOBALSIGN_CLIENT_KEY_PATH` | (required) | Path to mTLS client private key PEM |

Serial-based certificate tracking. CRL/OCSP delegated to GlobalSign.
EJBCA (Keyfactor)
Keyfactor EJBCA REST API for self-hosted CAs. Dual auth: mTLS (default) or OAuth2 Bearer token.
| Env Var | Default | Description |
|---|---|---|
| `CERTCTL_EJBCA_API_URL` | (required) | EJBCA REST API base URL |
| `CERTCTL_EJBCA_AUTH_MODE` | `mtls` | Auth mode: `mtls` or `oauth2` |
| `CERTCTL_EJBCA_CLIENT_CERT_PATH` | (mTLS) | Client certificate path |
| `CERTCTL_EJBCA_CLIENT_KEY_PATH` | (mTLS) | Client key path |
| `CERTCTL_EJBCA_TOKEN` | (OAuth2) | Bearer token |
| `CERTCTL_EJBCA_CA_NAME` | (required) | EJBCA CA name |
| `CERTCTL_EJBCA_CERT_PROFILE` | (none) | Certificate profile |
| `CERTCTL_EJBCA_EE_PROFILE` | (none) | End-entity profile |

PKCS#10 enrollment via base64-encoded CSR. Revocation requires issuer DN + serial (stored as composite OrderID). CRL/OCSP delegated to EJBCA instance.
EST Server (RFC 7030)
Enrollment over Secure Transport for device/WiFi/IoT certificate enrollment. 4 endpoints under /.well-known/est/:
| Endpoint | Method | Description |
|---|---|---|
| `/cacerts` | GET | CA certificate chain (PKCS#7 certs-only, base64-encoded) |
| `/simpleenroll` | POST | New certificate enrollment |
| `/simplereenroll` | POST | Certificate re-enrollment |
| `/csrattrs` | GET | CSR attributes |

Accepts both base64-encoded DER (EST standard) and PEM-encoded PKCS#10 CSR input. PKCS#7 output built with hand-rolled ASN.1 (no external PKCS#7 dependency). Configurable issuer and profile binding.
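Accepting both CSR encodings can be sketched along these lines. The function names (`parseCSRBody`, `sampleCSR`, `roundTrip`) are illustrative assumptions, not certctl's handler code; the sketch tries PEM first and falls back to base64-encoded DER.

```go
package main

import (
	"bytes"
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/base64"
	"encoding/pem"
	"errors"
	"fmt"
)

// parseCSRBody accepts a PEM-encoded PKCS#10 CSR or base64-encoded DER.
func parseCSRBody(body []byte) (*x509.CertificateRequest, error) {
	body = bytes.TrimSpace(body)
	if block, _ := pem.Decode(body); block != nil {
		if block.Type != "CERTIFICATE REQUEST" && block.Type != "NEW CERTIFICATE REQUEST" {
			return nil, errors.New("unexpected PEM block type: " + block.Type)
		}
		return x509.ParseCertificateRequest(block.Bytes)
	}
	// Base64 bodies are often line-wrapped; strip all whitespace before decoding.
	compact := bytes.Join(bytes.Fields(body), nil)
	der, err := base64.StdEncoding.DecodeString(string(compact))
	if err != nil {
		return nil, fmt.Errorf("body is neither PEM nor base64 DER: %w", err)
	}
	return x509.ParseCertificateRequest(der)
}

// sampleCSR builds a throwaway CSR for the demo and returns its DER bytes.
func sampleCSR(cn string) ([]byte, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.CertificateRequest{Subject: pkix.Name{CommonName: cn}}
	return x509.CreateCertificateRequest(rand.Reader, tmpl, key)
}

// roundTrip parses the same CSR in both encodings and returns the CNs seen.
func roundTrip(cn string) (string, string, error) {
	der, err := sampleCSR(cn)
	if err != nil {
		return "", "", err
	}
	pemBody := pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE REQUEST", Bytes: der})
	fromPEM, err := parseCSRBody(pemBody)
	if err != nil {
		return "", "", err
	}
	fromB64, err := parseCSRBody([]byte(base64.StdEncoding.EncodeToString(der)))
	if err != nil {
		return "", "", err
	}
	return fromPEM.Subject.CommonName, fromB64.Subject.CommonName, nil
}

func main() {
	a, b, err := roundTrip("device-01")
	fmt.Println(a, b, err)
}
```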
| Env Var | Default | Description |
|---|---|---|
| `CERTCTL_EST_ENABLED` | `false` | Enable EST endpoints |
| `CERTCTL_EST_ISSUER_ID` | `iss-local` | Issuer for EST enrollments. Legacy single-issuer mode; merged into `Profiles[0]` (`PathID=""`) by the Phase 1 back-compat shim when `CERTCTL_EST_PROFILES` is unset. |
| `CERTCTL_EST_PROFILE_ID` | (none) | Optional profile constraint. Legacy single-issuer mode (same back-compat shim as above). |
| `CERTCTL_EST_PROFILES` | (none, single-issuer mode) | EST RFC 7030 hardening Phase 1. Comma-separated list of EST profile names enabling multi-endpoint dispatch. When set, certctl exposes one `/.well-known/est/<pathID>/` endpoint group per name (e.g. `CERTCTL_EST_PROFILES=corp,iot,wifi` produces `/.well-known/est/corp/{cacerts,simpleenroll,simplereenroll,csrattrs}` etc.). Each name also drives the env-var prefix for the per-profile config below. When unset, certctl runs in legacy single-issuer mode using the flat `CERTCTL_EST_ENABLED` / `CERTCTL_EST_ISSUER_ID` / `CERTCTL_EST_PROFILE_ID` env vars above (which synthesise a single-element profile bound to the legacy `/.well-known/est/` root path). PathID must be a path-safe slug (`[a-z0-9-]`, no leading/trailing hyphen); names get lowercased for the URL path and uppercased for the env-var prefix. Mirrors the SCEP `CERTCTL_SCEP_PROFILES` family from the SCEP RFC 8894 master bundle (commit 6d30493). |
| `CERTCTL_EST_PROFILE_<NAME>_ISSUER_ID` | (none) | Per-profile issuer binding when `CERTCTL_EST_PROFILES` is set. `<NAME>` is the upper-cased profile name from the list (so a `CERTCTL_EST_PROFILES` entry of `corp` resolves the issuer-id env var key with `<NAME>` replaced by `CORP`, the `_ISSUER_ID` suffix unchanged). The same per-profile env-var prefix `CERTCTL_EST_PROFILE_` is also used for `_PROFILE_ID`, `_ENROLLMENT_PASSWORD`, `_MTLS_ENABLED`, `_MTLS_CLIENT_CA_TRUST_BUNDLE_PATH`, `_CHANNEL_BINDING_REQUIRED`, `_ALLOWED_AUTH_MODES`, `_RATE_LIMIT_PER_PRINCIPAL_24H`, `_SERVERKEYGEN_ENABLED`; see the rows below. Required for every profile listed in `CERTCTL_EST_PROFILES`. Each profile is independently validated at startup; per-profile failures log the offending PathID. |
| `CERTCTL_EST_PROFILE_<NAME>_PROFILE_ID` | (none) | Per-profile optional CertificateProfile constraint, mirroring the legacy `CERTCTL_EST_PROFILE_ID`. Leave unset to allow the issuer's defaults. Required when `_SERVERKEYGEN_ENABLED=true` because the Phase 5 server-keygen path needs a profile to pin `AllowedKeyAlgorithms` (the server has to decide what key to generate). |
| `CERTCTL_EST_PROFILE_<NAME>_ENROLLMENT_PASSWORD` | (none) | EST RFC 7030 §3.2.3 alternative. Per-profile shared secret for HTTP Basic auth on the standard `/.well-known/est/<pathID>/` route. Empty value means HTTP Basic auth is NOT required for this profile (mTLS-only or anonymous, depending on `_ALLOWED_AUTH_MODES`). Stored only in process memory; never logged. Constant-time comparison via `crypto/subtle.ConstantTimeCompare` in the handler. Required when `_ALLOWED_AUTH_MODES` lists `basic` (Phase 1 cross-check refuses the boot otherwise). The Phase 3 handler dispatches HTTP Basic auth using this value. |
| `CERTCTL_EST_PROFILE_<NAME>_MTLS_ENABLED` | `false` | EST RFC 7030 hardening Phase 2 (opt-in). When true, certctl exposes a sibling `/.well-known/est-mtls/<pathID>/` route alongside the standard `/.well-known/est/<pathID>/` route. The sibling route requires the EST client to present an mTLS client cert that chains to `_MTLS_CLIENT_CA_TRUST_BUNDLE_PATH`. The standard route continues to honour `_ENROLLMENT_PASSWORD` (HTTP Basic), so operators can run BOTH routes simultaneously for migration / heterogeneous client fleets. mTLS is additive, not a replacement. Mirrors the SCEP `_MTLS_ENABLED` from commit e7a3075. |
| `CERTCTL_EST_PROFILE_<NAME>_MTLS_CLIENT_CA_TRUST_BUNDLE_PATH` | (none) | PEM bundle of CA certs that sign the client (device-bootstrap) certs the operator allows to enroll on this profile's `/.well-known/est-mtls/<pathID>/` route. Required when `_MTLS_ENABLED=true` (Phase 1 Validate refuses the boot otherwise). The Phase 2 startup preflight (`cmd/server/main.go::preflightESTMTLSClientCATrustBundle`, lands in Phase 2) will validate: file exists, parses as PEM, contains ≥1 cert, none expired. Reloaded on SIGHUP via the same TrustAnchorHolder primitive the SCEP/Intune trust bundle uses. |
| `CERTCTL_EST_PROFILE_<NAME>_CHANNEL_BINDING_REQUIRED` | `false` | EST RFC 7030 hardening Phase 2: RFC 9266 tls-exporter channel binding. When true, the Phase 2 EST mTLS handler requires the CSR to carry an `id-aa-channelBindings` attribute matching the server-side `r.TLS.ExportKeyingMaterial("EXPORTER-Channel-Binding", nil, 32)` output. Without this binding an attacker that bridges two TLS connections could submit a CSR over a TLS handshake authenticated by a different cert. Refused at boot when `_MTLS_ENABLED=false` (Phase 1 cross-check); channel binding is meaningful only when mTLS is in use. Operators running clients that don't support RFC 9266 (older libest, etc.) can opt out per-profile by leaving this false. |
| `CERTCTL_EST_PROFILE_<NAME>_ALLOWED_AUTH_MODES` | (empty, no auth required) | EST RFC 7030 hardening Phases 2 + 3. Comma-separated list of accepted auth modes for this profile. Valid entries: `mtls`, `basic`. Empty (default) preserves the pre-Phase-1 unauthenticated behavior for back-compat (Phase 12 docs nudge operators to set this explicitly; a future bundle may flip the default to require explicit opt-in). Cross-checks at boot: `mtls` in the list requires `_MTLS_ENABLED=true`; `basic` requires `_ENROLLMENT_PASSWORD` non-empty. Unknown modes refused at boot with the offending token in the error message. |
| `CERTCTL_EST_PROFILE_<NAME>_RATE_LIMIT_PER_PRINCIPAL_24H` | `0` (disabled) | EST RFC 7030 hardening Phase 4. Sliding-window rate-limit cap on enrollments per (CSR.Subject.CN, sourceIP) pair in any rolling 24-hour window. Default 0 preserves the pre-Phase-1 unlimited behavior for back-compat; operators on production deploys set 3 (mirrors the SCEP/Intune per-device limit). Negative values refused at boot as a config typo. The Phase 4 handler dispatches via the extracted `internal/ratelimit/SlidingWindowLimiter`. |
| `CERTCTL_EST_PROFILE_<NAME>_SERVERKEYGEN_ENABLED` | `false` | EST RFC 7030 hardening Phase 5 (opt-in). When true, certctl exposes the `/.well-known/est/<pathID>/serverkeygen` endpoint per RFC 7030 §4.4. The server generates the keypair on behalf of the client and returns both cert + private key (the latter wrapped in CMS EnvelopedData encrypted to the client's CSR pubkey per RFC 7030 §4.4.2). Used for resource-constrained IoT devices that lack a hardware RNG. Refused at boot when `_PROFILE_ID` is empty (Phase 1 cross-check); server-keygen needs a CertificateProfile to pin `AllowedKeyAlgorithms`. The Phase 5 handler implements the CMS EnvelopedData wire format + key zeroization discipline. |

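
The naming rules for `CERTCTL_EST_PROFILES` (lowercased for the URL path, uppercased for the env-var prefix, slug restricted to `[a-z0-9-]` with no leading or trailing hyphen) can be illustrated with a small sketch. The function name `estProfileKeys` is hypothetical, and the text does not say how a hyphen inside a profile name maps into an env-var key, so this sketch leaves hyphens untouched and only demonstrates a hyphen-free name.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// slugRE matches [a-z0-9-] with no leading or trailing hyphen.
var slugRE = regexp.MustCompile(`^[a-z0-9]([a-z0-9-]*[a-z0-9])?$`)

// estProfileKeys derives the endpoint group path and the issuer env-var key
// for one entry of CERTCTL_EST_PROFILES. Hypothetical helper for illustration.
func estProfileKeys(name string) (path, issuerKey string, err error) {
	pathID := strings.ToLower(strings.TrimSpace(name))
	if !slugRE.MatchString(pathID) {
		return "", "", fmt.Errorf("invalid EST profile name %q", name)
	}
	path = "/.well-known/est/" + pathID + "/"
	// Assumption: hyphens are passed through unchanged; the source text
	// does not specify how they are handled in env-var names.
	issuerKey = "CERTCTL_EST_PROFILE_" + strings.ToUpper(pathID) + "_ISSUER_ID"
	return path, issuerKey, nil
}

func main() {
	for _, name := range []string{"corp", "IoT", "-bad"} {
		path, key, err := estProfileKeys(name)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println(path, key)
	}
}
```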
SCEP Server (RFC 8894)
Simple Certificate Enrollment Protocol for MDM platforms and network devices. Single endpoint with operation-based dispatch:
| Operation | Method | Description |
|---|---|---|
| `GetCACaps` | GET | Server capabilities (plaintext, one per line) |
| `GetCACert` | GET | CA certificate (DER for single cert, PKCS#7 for chain) |
| `PKIOperation` | POST | Certificate enrollment (PKCS#7-wrapped or raw CSR) |

SCEP uses a single URL (/scep?operation=...). The handler extracts PKCS#10 CSRs from PKCS#7 SignedData envelopes, with fallback support for base64-encoded and raw CSR submissions. Challenge password authentication via CSR attributes (OID 1.2.840.113549.1.9.7). Responses are PKCS#7 certs-only (same shared internal/pkcs7 package as EST).
| Env Var | Default | Description |
|---|---|---|
| `CERTCTL_SCEP_ENABLED` | `false` | Enable SCEP endpoint |
| `CERTCTL_SCEP_ISSUER_ID` | `iss-local` | Issuer for SCEP enrollments |
| `CERTCTL_SCEP_PROFILE_ID` | (none) | Optional profile constraint |
| `CERTCTL_SCEP_CHALLENGE_PASSWORD` | (none) | Shared secret for enrollment authentication |
| `CERTCTL_SCEP_RA_CERT_PATH` | (none) | Path to PEM-encoded RA (Registration Authority) certificate. Required when `CERTCTL_SCEP_ENABLED=true` for the RFC 8894 PKIMessage path: SCEP clients encrypt their PKCS#10 CSR to this cert's public key (EnvelopedData wrapper, RFC 8894 §3.2.2) and the server signs the outbound CertRep PKIMessage signerInfo with the matching key (RFC 8894 §3.3.2). Generation: a self-signed cert with `CN=<your-ca-id>-RA` and the id-kp-emailProtection / id-kp-cmcRA EKU is sufficient; see legacy-est-scep.md for the openssl recipe. The preflight gate at startup also enforces a cert/key match, non-expired NotAfter, and an RSA-or-ECDSA public-key algorithm. |
| `CERTCTL_SCEP_RA_KEY_PATH` | (none) | Path to PEM-encoded private key matching `CERTCTL_SCEP_RA_CERT_PATH`. Required when `CERTCTL_SCEP_ENABLED=true`. File MUST be mode 0600 (owner read/write only); preflight refuses to load a world- or group-readable RA key as defense-in-depth against credential leak. The server reads this file once at startup; rotation requires a restart. |
| `CERTCTL_SCEP_PROFILES` | (none, single-profile mode) | Comma-separated list of SCEP profile names enabling multi-endpoint dispatch (Phase 1.5). When set, certctl exposes one `/scep/<pathID>` endpoint per name (e.g. `CERTCTL_SCEP_PROFILES=corp,iot,server` produces `/scep/corp`, `/scep/iot`, `/scep/server`). Each name also drives the env-var prefix for the per-profile config below. When unset, certctl runs in legacy single-profile mode using the flat `CERTCTL_SCEP_*` env vars above (which synthesise a single-element profile bound to the legacy `/scep` root path). PathID must be a path-safe slug (`[a-z0-9-]`, no leading/trailing hyphen); names get lowercased for the URL path and uppercased for the env-var prefix. |
| `CERTCTL_SCEP_PROFILE_<NAME>_ISSUER_ID` | (none) | Per-profile issuer binding when `CERTCTL_SCEP_PROFILES` is set. `<NAME>` is the upper-cased profile name from the list (so a `CERTCTL_SCEP_PROFILES` entry of `corp` resolves the issuer-id env var key with `<NAME>` replaced by `CORP`, the `_ISSUER_ID` suffix unchanged). The same per-profile env-var prefix `CERTCTL_SCEP_PROFILE_` is also used for `_PROFILE_ID`, `_CHALLENGE_PASSWORD`, `_RA_CERT_PATH`, `_RA_KEY_PATH`; see the four rows below. Required for every profile listed in `CERTCTL_SCEP_PROFILES`. Each profile is independently validated at startup; per-profile failures log the offending PathID. |
| `CERTCTL_SCEP_PROFILE_<NAME>_PROFILE_ID` | (none) | Per-profile optional CertificateProfile constraint, mirroring the legacy `CERTCTL_SCEP_PROFILE_ID`. Leave unset to allow the issuer's defaults. |
| `CERTCTL_SCEP_PROFILE_<NAME>_CHALLENGE_PASSWORD` | (none) | Per-profile shared secret. Required for every profile in `CERTCTL_SCEP_PROFILES` (CWE-306: per-profile auth boundary). Empty value at startup fails the boot with the offending PathID in the structured log. |
| `CERTCTL_SCEP_PROFILE_<NAME>_RA_CERT_PATH` | (none) | Per-profile RA certificate PEM path. Same semantics as `CERTCTL_SCEP_RA_CERT_PATH` but scoped to one profile. Required for every profile. |
| `CERTCTL_SCEP_PROFILE_<NAME>_RA_KEY_PATH` | (none) | Per-profile RA private key PEM path (mode 0600). Same semantics as `CERTCTL_SCEP_RA_KEY_PATH` but scoped to one profile. Required for every profile. |
| `CERTCTL_SCEP_PROFILE_<NAME>_MTLS_ENABLED` | `false` | Phase 6.5 (opt-in). When true, certctl exposes a sibling `/scep-mtls/<pathID>` route alongside the standard `/scep/<pathID>` route. The sibling route requires the SCEP client to present an mTLS client cert that chains to `_MTLS_CLIENT_CA_TRUST_BUNDLE_PATH`. The standard route continues to use challenge-password-only auth, so operators can run BOTH routes simultaneously for migration / heterogeneous client fleets. mTLS is additive (not a replacement for the challenge password). Designed for enterprise procurement teams that reject "shared password authentication" as a checkbox-fail. Same model Apple's MDM and Cisco's BRSKI use. |
| `CERTCTL_SCEP_PROFILE_<NAME>_MTLS_CLIENT_CA_TRUST_BUNDLE_PATH` | (none) | PEM bundle of CA certs that sign the client (device-bootstrap) certs the operator allows to enroll on this profile's `/scep-mtls/<pathID>` route. Required when `_MTLS_ENABLED=true`. Operators with multiple bootstrap CAs concatenate them. The startup preflight (`cmd/server/main.go::preflightSCEPMTLSTrustBundle`) validates: file exists, parses as PEM, contains ≥1 cert, none expired. |
| `CERTCTL_SCEP_PROFILE_<NAME>_INTUNE_ENABLED` | `false` | Phase 8 (opt-in). When true, this profile routes Intune-shaped challenge passwords (length > 200 + exactly two dots) to the Microsoft Intune Certificate Connector signed-challenge validator. Static challenge passwords still work as a fallback for non-Intune devices in mixed-fleet deployments. Per-profile flag so an operator running corp-laptops via Intune AND IoT devices via static challenge can opt-in on the corp profile only. |
| `CERTCTL_SCEP_PROFILE_<NAME>_INTUNE_CONNECTOR_CERT_PATH` | (none) | Filesystem path to a PEM bundle of one or more Microsoft Intune Certificate Connector signing certs. Required when `_INTUNE_ENABLED=true`. Reloaded on SIGHUP (mirrors the server TLS-cert reload pattern). Startup preflight + reload both refuse empty bundles + expired certs and surface the offending subject CN in the error message. Operators who rotate the Connector signing cert update the file on disk then `kill -HUP <certctl-pid>` to apply (no restart required). |
| `CERTCTL_SCEP_PROFILE_<NAME>_INTUNE_AUDIENCE` | (empty, audience check disabled) | Expected `aud` claim in the Intune challenge, typically the public SCEP endpoint URL the Connector is configured to call (e.g. `https://certctl.example.com/scep/corp`). Empty disables the check, useful for proxy / load-balancer scenarios where the URL the Connector saw differs from the URL we see. Operators who pin a public URL gain defense-in-depth against challenge re-use across endpoints. |
| `CERTCTL_SCEP_PROFILE_<NAME>_INTUNE_CHALLENGE_VALIDITY` | `60m` | Maximum age of an Intune challenge, on top of the challenge's own iat/exp claims. Defense-in-depth: even if the Connector mints a 24h-valid challenge, this caps the window during which a leaked challenge can be replayed. Default matches Microsoft's published Connector defaults. Zero disables the cap (relies entirely on the challenge's exp). |
| `CERTCTL_SCEP_PROFILE_<NAME>_INTUNE_PER_DEVICE_RATE_LIMIT_24H` | `3` | Maximum enrollments per (claim.Subject, claim.Issuer) pair in any rolling 24-hour window. Catches a compromised Connector signing key issuing many DIFFERENT valid challenges for the same device. Default 3 covers legitimate first-cert + recovery + post-wipe re-enrollment. Zero disables the limiter (not recommended for production). |

ACME Renewal Information (RFC 9773)
CA-directed renewal timing. Instead of hardcoded expiration thresholds, the CA tells certctl when to renew.
How It Works
- `GetRenewalInfo` computes an RFC 9773 cert ID (base64url-encoded SHA-256 of DER cert)
- Queries the CA's Renewal Information endpoint (discovered from ACME directory or constructed via fallback URL)
- Returns a `SuggestedWindow` (start/end), optional `RetryAfter`, and `ExplanationURL`
- `ShouldRenewNow()` returns true if the current time is past `SuggestedWindowStart`
- `OptimalRenewalTime()` picks a random time within the window for load distribution
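The two window helpers can be sketched as below. The `renewalInfo` struct layout and the method signatures (an explicit `now` and an injected random source) are assumptions made so the example is deterministic to test; only the behaviour, renew once past the window start and pick a random instant inside the window, comes from the list above.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// renewalInfo carries the suggested window; field names are illustrative.
type renewalInfo struct {
	WindowStart time.Time
	WindowEnd   time.Time
}

// ShouldRenewNow reports whether now is past the start of the suggested window.
func (r renewalInfo) ShouldRenewNow(now time.Time) bool {
	return now.After(r.WindowStart)
}

// OptimalRenewalTime picks a uniformly random instant in [WindowStart, WindowEnd)
// so that many certificates with the same window do not renew at once.
func (r renewalInfo) OptimalRenewalTime(rng *rand.Rand) time.Time {
	span := r.WindowEnd.Sub(r.WindowStart)
	if span <= 0 {
		return r.WindowStart
	}
	return r.WindowStart.Add(time.Duration(rng.Int63n(int64(span))))
}

// demo checks one instant before the window, one inside it, and one random pick.
func demo() (beforeWindow, insideWindow, pickInRange bool) {
	start := time.Date(2026, 1, 10, 0, 0, 0, 0, time.UTC)
	info := renewalInfo{WindowStart: start, WindowEnd: start.Add(48 * time.Hour)}
	pick := info.OptimalRenewalTime(rand.New(rand.NewSource(1)))
	return info.ShouldRenewNow(start.Add(-time.Hour)),
		info.ShouldRenewNow(start.Add(time.Hour)),
		!pick.Before(info.WindowStart) && pick.Before(info.WindowEnd)
}

func main() {
	fmt.Println(demo())
}
```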
Scheduler Integration
The renewal scheduler (CheckExpiringCertificates) queries ARI before creating renewal jobs:
- If ARI says "not yet" → skip renewal
- If ARI says "renew now" → create renewal job with `renewal_trigger: ari` audit event
- If ARI errors → log warning, fall back to threshold-based logic
- Non-ARI issuers return nil (Local CA, step-ca, OpenSSL, Vault, DigiCert, Sectigo, Google CAS, AWS ACM PCA)
| Env Var | Default | Description |
|---|---|---|
| `CERTCTL_ACME_ARI_ENABLED` | `false` | Enable ARI queries |

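
The scheduler's ARI branching can be summarised as a small decision function. This is a sketch, not `CheckExpiringCertificates` itself: the `ariWindow` type and `action` values are hypothetical, and treating a nil result from a non-ARI issuer as "use thresholds" is an assumption drawn from the fallback wording.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// ariWindow is a hypothetical stand-in for the ARI response.
type ariWindow struct{ Start, End time.Time }

type action string

const (
	actionSkip      action = "skip"
	actionRenewARI  action = "renew (renewal_trigger: ari)"
	actionThreshold action = "threshold-based logic"
)

// decide maps an ARI lookup result onto the scheduler's next step.
// ari == nil with a nil error models an issuer that does not support ARI.
func decide(ari *ariWindow, err error, now time.Time) action {
	switch {
	case err != nil:
		return actionThreshold // a real scheduler would also log a warning here
	case ari == nil:
		return actionThreshold // assumption: non-ARI issuers use thresholds
	case now.After(ari.Start):
		return actionRenewARI
	default:
		return actionSkip
	}
}

// demo exercises the four branches in order: not yet, renew now, error, non-ARI.
func demo() [4]action {
	now := time.Date(2026, 3, 1, 12, 0, 0, 0, time.UTC)
	future := &ariWindow{Start: now.Add(72 * time.Hour), End: now.Add(96 * time.Hour)}
	open := &ariWindow{Start: now.Add(-time.Hour), End: now.Add(24 * time.Hour)}
	return [4]action{
		decide(future, nil, now),
		decide(open, nil, now),
		decide(nil, errors.New("ari endpoint unavailable"), now),
		decide(nil, nil, now),
	}
}

func main() {
	for _, a := range demo() {
		fmt.Println(a)
	}
}
```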
Shorter Certificate Validity Readiness
certctl's default thresholds [30, 14, 7, 0] work correctly at all CA/Browser Forum SC-081v3 validity reduction phases:
- 200-day certs (Phase 1, March 2026)
- 100-day certs (Phase 2, March 2027)
- 47-day certs (Phase 3, March 2029)
For Let's Encrypt 6-day short-lived certificates, ARI is the expected renewal path; threshold-based logic alone is insufficient at that lifetime.
Target Connectors
Every connector in the target catalog implements the `target.Connector` interface and supports `ValidateConfig`, `DeployCertificate`, and `ValidateDeployment`. To count the connectors currently in the tree, run `ls -d internal/connector/target/*/ | wc -l`.
Deployment Model
Pull-only. The server never initiates outbound connections to agents or targets. Agents poll for work. For network appliances and agentless servers, a "proxy agent" in the same network zone executes deployment via the target's API.
NGINX
File write → nginx -t validation → nginx -s reload. Config: cert_path, key_path, chain_path, reload_command, validate_command.
Apache httpd
Separate cert/chain/key files → apachectl configtest → apachectl graceful. Config: cert_path, key_path, chain_path, reload_command, validate_command.
HAProxy
Combined PEM file (cert + chain + key) → optional validation → reload via socket/signal. Config: pem_path, reload_command, validate_command.
Traefik
File provider deployment: writes cert/key to Traefik's watched directory. Traefik auto-reloads via filesystem watch. Config: cert_dir, cert_filename, key_filename.
Caddy
Dual-mode: api (POST to Caddy admin endpoint for hot-reload) or file (file-based with configurable paths). Config: mode (api/file), admin_url, cert_path, key_path.
Envoy
File-based deployment with optional SDS JSON config. Envoy auto-reloads via filesystem watch. Path traversal prevention on all file paths. Optional SDS JSON bootstrap (type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.Secret). Config: cert_dir, cert_filename, key_filename, chain_filename, sds_config.
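One common way to enforce path traversal prevention on configured file names is to join the name onto the target directory and reject anything that resolves outside it. The sketch below shows that idea; `safeJoin` is a hypothetical helper, and the Envoy connector's actual checks may differ.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// safeJoin joins name onto dir and rejects results that escape dir.
func safeJoin(dir, name string) (string, error) {
	if filepath.IsAbs(name) {
		return "", fmt.Errorf("absolute path not allowed: %q", name)
	}
	base := filepath.Clean(dir)
	full := filepath.Join(base, name) // Join also cleans ".." segments
	rel, err := filepath.Rel(base, full)
	if err != nil || rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", fmt.Errorf("path %q escapes %q", name, dir)
	}
	return full, nil
}

func main() {
	for _, name := range []string{"cert.pem", "tls/key.pem", "../../etc/passwd"} {
		p, err := safeJoin("/etc/envoy/certs", name)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("ok:", p)
	}
}
```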
F5 BIG-IP
iControl REST API via proxy agent. Token auth (POST /mgmt/shared/authn/login, X-F5-Auth-Token), 401 auto-retry. Transaction-based atomic SSL profile updates with auto-rollback on failure. Injectable F5Client interface.
Config: host, port (443), username, password, partition (Common), ssl_profile, insecure (true), timeout (30). Requires BIG-IP v12.0 or later.
Deployment: file upload with Content-Range → crypto object install (cert/key/chain) → transaction create → SSL profile PATCH → transaction commit. cleanupCryptoObjects() removes installed objects on failure.
IIS
Dual-mode: agent-local PowerShell or WinRM proxy agent. PEM → PFX conversion via go-pkcs12, Import-PfxCertificate, IIS binding management (New-WebBinding + AddSslCertificate), SHA-1 thumbprint computation, SNI support.
Local mode config: site_name, cert_store (My), port (443), sni (false), ip_address (*).
WinRM mode config: adds mode (winrm), winrm_host, winrm_port (5985/5986), winrm_username, winrm_password, winrm_https, winrm_insecure, winrm_timeout (60s). Base64 PFX transfer via PowerShell with try/finally cleanup. Uses masterzen/winrm.
Injectable PowerShellExecutor interface for cross-platform testing. Regex-validated config fields prevent PowerShell injection.
SSH (Agentless)
Agentless deployment via SSH/SFTP to any Linux/Unix server. Uses golang.org/x/crypto/ssh + github.com/pkg/sftp.
Config: host, port (22), user, auth_method (key/password), private_key_path, password, cert_path, key_path, chain_path, reload_command, timeout (30s). Optional octal permission strings (e.g., "0644", "0600").
Shell injection prevention via validation.ValidateShellCommand() on reload commands. Injectable SSHClient interface.
Postfix / Dovecot
Dual-mode mail server TLS connector. File write → validation → reload.
- Postfix mode: `postfix check` → `postfix reload`
- Dovecot mode: `doveconf -n` → `doveadm reload`
Config: mode (postfix/dovecot), cert_path, key_path, chain_path, reload_command, validate_command. Shell injection prevention.
Windows Certificate Store
PowerShell-based cert import via Import-PfxCertificate. PEM → PFX → base64 → PowerShell script with try/finally cleanup.
Config: store (My/Root/CA/WebHosting), store_location (LocalMachine/CurrentUser), friendly_name, cleanup_expired (bool). Dual-mode: local or WinRM (same pattern as IIS). Reuses shared certutil package.
Java Keystore
PEM → PKCS#12 (via certutil.CreatePFX) → temp file → keytool -importkeystore pipeline. JKS and PKCS12 format support.
Config: keystore_path, keystore_password, keystore_type (JKS/PKCS12), alias (server), reload_command. Path traversal prevention, existing alias deletion before import. Reuses shared certutil package.
Kubernetes Secrets
Deploys certificates as kubernetes.io/tls Secrets. Injectable K8sClient interface (proxy agent pattern). In-cluster auth by default, out-of-cluster via kubeconfig.
Config: namespace, secret_name, labels (map), kubeconfig_path (optional). Fingerprint-based validation in ValidateDeployment.
Shared certutil Package
Extracted from IIS connector. Reused by IIS, WinCertStore, and JavaKeystore:
- `CreatePFX`: PEM → PKCS#12 via `go-pkcs12`
- `ParsePrivateKey`: PKCS#1, PKCS#8, EC key formats
- `ComputeThumbprint`: SHA-1 of DER cert (matches Windows `certutil`)
- `GenerateRandomPassword`: 32-char crypto/rand password
- `ParseCertificatePEM`: PEM → `*x509.Certificate`
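Two of these helpers are simple enough to sketch. The versions below are illustrative stand-ins, not the `certutil` package's code: the lowercase function names, the uppercase hex formatting of the thumbprint, and the password alphabet are assumptions.

```go
package main

import (
	"crypto/rand"
	"crypto/sha1"
	"encoding/hex"
	"fmt"
	"math/big"
	"strings"
)

// thumbprint returns the SHA-1 digest of DER-encoded certificate bytes as
// uppercase hex, the form Windows tooling usually displays.
func thumbprint(der []byte) string {
	sum := sha1.Sum(der)
	return strings.ToUpper(hex.EncodeToString(sum[:]))
}

// alphabet is an assumed character set for generated passwords.
const alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"

// randomPassword draws n characters from alphabet using crypto/rand.
func randomPassword(n int) (string, error) {
	out := make([]byte, n)
	limit := big.NewInt(int64(len(alphabet)))
	for i := range out {
		idx, err := rand.Int(rand.Reader, limit) // uniform in [0, len(alphabet))
		if err != nil {
			return "", err
		}
		out[i] = alphabet[idx.Int64()]
	}
	return string(out), nil
}

func main() {
	// Real callers would pass cert.Raw; a fixed byte string keeps the demo simple.
	fmt.Println(thumbprint([]byte("abc")))
	pw, err := randomPassword(32)
	fmt.Println(len(pw), err)
}
```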
Notifier Connectors
Notification Types
| Type | Description |
|---|---|
| `ExpirationWarning` | Certificate approaching threshold |
| `RenewalSuccess` | Renewal completed |
| `RenewalFailure` | Renewal failed |
| `DeploymentSuccess` | Deployment completed |
| `DeploymentFailure` | Deployment failed |
| `PolicyViolation` | Policy rule violated |
| `Revocation` | Certificate revoked |

Notification Channels
| Channel | Auth | Config Env Vars |
|---|---|---|
| SMTP | | CERTCTL_SMTP_HOST, CERTCTL_SMTP_PORT (587), CERTCTL_SMTP_USERNAME, CERTCTL_SMTP_PASSWORD, CERTCTL_SMTP_FROM_ADDRESS, CERTCTL_SMTP_USE_TLS (true) |
| Webhook | URL-based | CERTCTL_WEBHOOK_URL |
| Slack | Incoming webhook | CERTCTL_SLACK_WEBHOOK_URL, CERTCTL_SLACK_CHANNEL, CERTCTL_SLACK_USERNAME |
| Microsoft Teams | Incoming webhook (MessageCard) | CERTCTL_TEAMS_WEBHOOK_URL |
| PagerDuty | Events API v2 | CERTCTL_PAGERDUTY_ROUTING_KEY, CERTCTL_PAGERDUTY_SEVERITY (warning) |
| OpsGenie | Alert API v2, GenieKey | CERTCTL_OPSGENIE_API_KEY, CERTCTL_OPSGENIE_PRIORITY (P3) |
All notifier connectors have 10-second HTTP client timeouts.
Certificate Digest
Scheduled HTML email digest with aggregated certificate status.
Content
- Stats grid: total certs, expiring, expired, active agents
- Jobs summary
- Expiring certificates table with color-coded badges
- Responsive CSS for email clients
Configuration
| Env Var | Default | Description |
|---|---|---|
| `CERTCTL_DIGEST_ENABLED` | `false` | Enable digest |
| `CERTCTL_DIGEST_INTERVAL` | `24h` | Send interval |
| `CERTCTL_DIGEST_RECIPIENTS` | (none) | Comma-separated emails. Falls back to certificate owner emails when empty. |

API
- `GET /api/v1/digest/preview`: HTML preview of current digest
- `POST /api/v1/digest/send`: trigger immediate send
Both endpoints return 503 when digest is not configured (nil-safe handler).
Post-Deployment TLS Verification
After deploying a certificate, the agent probes the live TLS endpoint and compares SHA-256 fingerprints.
Verification Statuses
| Status | Description |
|---|---|
| `pending` | Verification not yet attempted |
| `success` | Deployed cert matches live endpoint |
| `failed` | Fingerprint mismatch or connection error |
| `skipped` | Verification disabled or not applicable |

Flow
- Agent completes deployment
- Agent waits `CERTCTL_VERIFY_DELAY` (configurable)
- Agent connects via `crypto/tls.DialWithDialer` with `InsecureSkipVerify=true`
- Compares SHA-256 fingerprint of served cert against deployed cert
- Submits result via `POST /api/v1/jobs/{id}/verify`
Best-effort — failures are recorded but don't block or rollback deployments.
| Env Var | Default | Description |
|---|---|---|
| `CERTCTL_VERIFY_DEPLOYMENT` | `false` | Enable verification |
| `CERTCTL_VERIFY_TIMEOUT` | `5s` | TLS connection timeout |
| `CERTCTL_VERIFY_DELAY` | `2s` | Wait after deployment before probing |

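
The probe-and-compare step of the verification flow might look roughly like this. The function names are hypothetical and the mapping onto the `success` / `failed` statuses is a simplification; the use of `tls.DialWithDialer` with `InsecureSkipVerify` and a SHA-256 fingerprint comparison follows the flow described in this section.

```go
package main

import (
	"crypto/sha256"
	"crypto/tls"
	"encoding/hex"
	"errors"
	"fmt"
	"net"
	"time"
)

// fingerprint returns the lowercase hex SHA-256 of DER-encoded bytes.
func fingerprint(der []byte) string {
	sum := sha256.Sum256(der)
	return hex.EncodeToString(sum[:])
}

// probeFingerprint dials addr and fingerprints the leaf certificate it serves.
// Chain validation is skipped on purpose: the comparison that matters is the
// fingerprint, and the served cert may be self-signed or from an internal CA.
func probeFingerprint(addr string, timeout time.Duration) (string, error) {
	dialer := &net.Dialer{Timeout: timeout}
	conn, err := tls.DialWithDialer(dialer, "tcp", addr, &tls.Config{InsecureSkipVerify: true})
	if err != nil {
		return "", err
	}
	defer conn.Close()
	certs := conn.ConnectionState().PeerCertificates
	if len(certs) == 0 {
		return "", errors.New("endpoint presented no certificate")
	}
	return fingerprint(certs[0].Raw), nil
}

// verifyStatus maps a probe outcome onto the statuses listed in the table above.
func verifyStatus(deployedDER []byte, servedFingerprint string, probeErr error) string {
	if probeErr != nil || servedFingerprint != fingerprint(deployedDER) {
		return "failed"
	}
	return "success"
}

func main() {
	deployed := []byte("abc") // stand-in for the deployed certificate's DER bytes
	fmt.Println(verifyStatus(deployed, fingerprint(deployed), nil))
	fmt.Println(verifyStatus(deployed, "", errors.New("connection refused")))
}
```

A caller would wait for the configured delay, call `probeFingerprint` with the configured timeout, and report the resulting status without blocking or rolling back the deployment.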
Discovery
Filesystem Discovery
Agents scan configured directories for existing certificates.
- Runs on agent startup and every 6 hours
- Walks directories recursively, parses PEM (`.pem`, `.crt`, `.cer`, `.cert`) and DER (`.der`) files
- Extracts: common name, SANs, serial, issuer DN, subject DN, validity, key algorithm, key size, is_ca, SHA-256 fingerprint
- Reports to server via `POST /api/v1/agents/{id}/discoveries`
- Server deduplicates by `(fingerprint_sha256, agent_id, source_path)` unique constraint
| Env Var | Default | Description |
|---|---|---|
| `CERTCTL_DISCOVERY_DIRS` | (none) | Comma-separated directories for agent to scan |

Discovery Statuses
| Status | Description |
|---|---|
| `Unmanaged` | Discovered, not yet triaged |
| `Managed` | Claimed and linked to a managed certificate |
| `Dismissed` | Explicitly dismissed from triage queue |

Discovery API
| Endpoint | Method | Description |
|---|---|---|
| `/api/v1/agents/{id}/discoveries` | POST | Agent submits scan results |
| `/api/v1/discovered-certificates` | GET | List with `?agent_id`, `?status` filters |
| `/api/v1/discovered-certificates/{id}` | GET | Detail |
| `/api/v1/discovered-certificates/{id}/claim` | POST | Link to managed certificate |
| `/api/v1/discovered-certificates/{id}/dismiss` | POST | Dismiss from triage |
| `/api/v1/discovery-scans` | GET | Scan history |
| `/api/v1/discovery-summary` | GET | Aggregate status counts |

Network Certificate Discovery
Server-side active TLS scanning of CIDR ranges. Concurrent probing with semaphore (50 goroutines). Feeds into the existing discovery pipeline via server-scanner sentinel agent.
- CIDR expansion with `/20` safety cap (4,096 IPs max per scan)
- `crypto/tls.DialWithDialer` with `InsecureSkipVerify=true` to discover all certs (including self-signed, expired, internal CA)
- SSRF protection: reserved IP ranges filtered (loopback, link-local, multicast, broadcast)
| Env Var | Default | Description |
|---|---|---|
| `CERTCTL_NETWORK_SCAN_ENABLED` | `false` | Enable network scanning |
| `CERTCTL_NETWORK_SCAN_INTERVAL` | `6h` | Scan interval |

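
CIDR expansion with a `/20` cap and reserved-range filtering can be sketched with `net/netip`. This is an illustration only: `expandCIDR` is a hypothetical name, it handles IPv4 only, and rejecting an oversized range outright (rather than truncating it) is an assumption about how the cap is applied.

```go
package main

import (
	"fmt"
	"net/netip"
)

// minPrefixBits caps a scan at a /20, i.e. at most 4,096 IPv4 addresses.
const minPrefixBits = 20

var broadcast = netip.AddrFrom4([4]byte{255, 255, 255, 255})

// expandCIDR lists the scannable addresses in cidr, skipping reserved ranges.
func expandCIDR(cidr string) ([]netip.Addr, error) {
	prefix, err := netip.ParsePrefix(cidr)
	if err != nil {
		return nil, err
	}
	if !prefix.Addr().Is4() {
		return nil, fmt.Errorf("%s: only IPv4 is handled in this sketch", cidr)
	}
	if prefix.Bits() < minPrefixBits {
		return nil, fmt.Errorf("%s is larger than /%d", cidr, minPrefixBits)
	}
	var out []netip.Addr
	// Next() returns the zero Addr after the last address, which Contains rejects.
	for a := prefix.Masked().Addr(); prefix.Contains(a); a = a.Next() {
		if a.IsLoopback() || a.IsLinkLocalUnicast() || a.IsMulticast() || a == broadcast {
			continue // reserved range: never probed
		}
		out = append(out, a)
	}
	return out, nil
}

func main() {
	for _, cidr := range []string{"192.0.2.0/30", "127.0.0.0/30", "10.0.0.0/16"} {
		addrs, err := expandCIDR(cidr)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println(cidr, "->", len(addrs), "addresses")
	}
}
```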
Network Scan Target API
| Endpoint | Method | Description |
|---|---|---|
| `/api/v1/network-scan-targets` | GET | List targets |
| `/api/v1/network-scan-targets/{id}` | GET | Detail |
| `/api/v1/network-scan-targets` | POST | Create target (name, CIDRs, ports, interval, timeout) |
| `/api/v1/network-scan-targets/{id}` | PUT | Update |
| `/api/v1/network-scan-targets/{id}` | DELETE | Delete |
| `/api/v1/network-scan-targets/{id}/scan` | POST | Trigger immediate scan |

Cloud Secret Manager Discovery
Discovers certificates stored in cloud secret managers and brings them into the certctl inventory. Extends the existing discovery pipeline with pluggable DiscoverySource implementations. Each source runs as part of the opt-in cloud discovery scheduler loop (6h default; see docs/architecture.md for the full 12-loop scheduler topology).
Supported sources:
- AWS Secrets Manager: filters by tag (`type=certificate`) and name prefix. Uses `aws-sdk-go-v2`. Sentinel agent: `cloud-aws-sm`
- Azure Key Vault: OAuth2 client credentials auth, no Azure SDK. Lists certificates from vault. Sentinel agent: `cloud-azure-kv`
- GCP Secret Manager: JWT-based OAuth2 service account auth, no Google SDK. Filters by label (`type=certificate`). Sentinel agent: `cloud-gcp-sm`
| Env Var | Default | Description |
|---|---|---|
| `CERTCTL_CLOUD_DISCOVERY_ENABLED` | `false` | Enable cloud discovery scheduler |
| `CERTCTL_CLOUD_DISCOVERY_INTERVAL` | `6h` | Scheduler loop interval |
| `CERTCTL_AWS_SM_DISCOVERY_ENABLED` | `false` | Enable AWS SM source |
| `CERTCTL_AWS_SM_REGION` | (none) | AWS region |
| `CERTCTL_AWS_SM_TAG_FILTER` | `type=certificate` | Tag filter for secrets |
| `CERTCTL_AZURE_KV_DISCOVERY_ENABLED` | `false` | Enable Azure KV source |
| `CERTCTL_AZURE_KV_VAULT_URL` | (none) | Key Vault URL |
| `CERTCTL_GCP_SM_DISCOVERY_ENABLED` | `false` | Enable GCP SM source |
| `CERTCTL_GCP_SM_PROJECT` | (none) | GCP project ID |
| `CERTCTL_GCP_SM_CREDENTIALS` | (none) | Service account JSON path |

Continuous TLS Health Monitoring
Beyond one-time discovery (M18b, M21), the health monitor continuously probes TLS endpoints and tracks certificate freshness. Uses the shared internal/tlsprobe/ package (same as network scanner) to compare deployed certificate fingerprints against live endpoints, catching silent rollbacks and unauthorized replacements.
Status Transitions:
- `Healthy`: endpoint responding, certificate matches expected
- `Degraded`: consecutive probe failures reach the degraded threshold (default 2)
- `Down`: consecutive probe failures continue past the degraded threshold and reach the down threshold (default 5)
- `Cert_Mismatch`: observed cert fingerprint differs from expected (unauthorized replacement)
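The status transitions can be expressed as a small pure function. This sketch is an interpretation, not the health monitor's code: the `thresholds` type and `nextStatus` name are hypothetical, thresholds are treated as "reached" (>=), a failure count below the degraded threshold is reported as `Healthy`, and a fingerprint mismatch is only evaluated when the latest probe succeeded.

```go
package main

import "fmt"

// thresholds holds the consecutive-failure counts that trigger each state.
type thresholds struct {
	Degraded int // default 2 in the text
	Down     int // default 5 in the text
}

// nextStatus derives the status from the current run of consecutive probe
// failures (0 after a successful probe) and the fingerprint comparison from
// the latest successful probe.
func nextStatus(failures int, fingerprintMatches bool, th thresholds) string {
	switch {
	case failures >= th.Down:
		return "Down"
	case failures >= th.Degraded:
		return "Degraded"
	case failures == 0 && !fingerprintMatches:
		return "Cert_Mismatch"
	default:
		return "Healthy"
	}
}

func main() {
	th := thresholds{Degraded: 2, Down: 5}
	for failures := 0; failures <= 5; failures++ {
		fmt.Println(failures, nextStatus(failures, true, th))
	}
	fmt.Println("mismatch:", nextStatus(0, false, th))
}
```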
Auto-Create: When a deployment completes successfully with TLS verification enabled (M25), certctl automatically creates a health check with the deployed certificate's fingerprint as the baseline.
Probe History: Each probe stores: TLS version, cipher suite, response time, cert metadata (subject, issuer, validity), status, and error details. Retained for 30 days (configurable), then purged by the scheduler.
Alerts on State Transitions:
- Cert_Mismatch: HIGH severity (catches unauthorized changes)
- Down: CRITICAL severity (service broken)
- Degraded: WARNING severity (intermittent issues)
- Recovery to Healthy: INFO severity (status update)
Configuration:
| Env Var | Default | Description |
|---|---|---|
| `CERTCTL_HEALTH_CHECK_ENABLED` | `false` | Enable health monitoring |
| `CERTCTL_HEALTH_CHECK_INTERVAL` | `60s` | Scheduler tick interval |
| `CERTCTL_HEALTH_CHECK_DEFAULT_INTERVAL` | `300s` | Default per-endpoint check frequency |
| `CERTCTL_HEALTH_CHECK_DEFAULT_TIMEOUT` | `5000ms` | TLS connection timeout per probe |
| `CERTCTL_HEALTH_CHECK_MAX_CONCURRENT` | `20` | Max concurrent TLS probes |
| `CERTCTL_HEALTH_CHECK_HISTORY_RETENTION` | 30 days | Purge probe history older than this |
| `CERTCTL_HEALTH_CHECK_AUTO_CREATE` | `true` | Auto-create checks from deployments |

Health Check API:
| Endpoint | Method | Description |
|---|---|---|
| /api/v1/health-checks | GET | List with ?status, ?certificate_id, ?network_scan_target_id, ?enabled filters + pagination |
| /api/v1/health-checks/{id} | GET | Detail |
| /api/v1/health-checks | POST | Create manual check (endpoint, expected_fingerprint, check_interval, timeout) |
| /api/v1/health-checks/{id} | PUT | Update thresholds, interval, or expected fingerprint |
| /api/v1/health-checks/{id} | DELETE | Delete |
| /api/v1/health-checks/{id}/history | GET | Probe history with ?limit param |
| /api/v1/health-checks/{id}/acknowledge | POST | Mark incident as acknowledged by operator |
| /api/v1/health-checks/summary | GET | Aggregate counts by status (Healthy, Degraded, Down, Cert_Mismatch) |
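As a rough illustration of the list endpoint's filters, the sketch below builds a request URL with `net/url`. The helper name `healthCheckListURL` is hypothetical, and the value format for the `enabled` filter (`true`) is an assumption; only the path and the filter names come from the table above.

```go
package main

import (
	"fmt"
	"net/url"
)

// healthCheckListURL builds the list URL with two of the documented filters.
// Empty or false arguments leave the corresponding filter off.
func healthCheckListURL(base, status string, enabledOnly bool) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = "/api/v1/health-checks"
	q := url.Values{}
	if status != "" {
		q.Set("status", status)
	}
	if enabledOnly {
		q.Set("enabled", "true") // assumed boolean encoding
	}
	u.RawQuery = q.Encode() // Encode sorts parameters by key
	return u.String(), nil
}

func main() {
	s, err := healthCheckListURL("http://localhost:8443", "Down", true)
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
}
```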
Ownership and Teams
Certificate Ownership
Certificates have an owner field linking to an owner record with email and team assignment. Notification routing uses owner email when no explicit recipients are configured.
Teams
Organizational grouping for owners. Full CRUD API and GUI page.
Agent Groups
Dynamic device grouping by matching criteria:
- OS (e.g., linux, darwin, windows)
- Architecture (e.g., amd64, arm64)
- IP CIDR range
- Agent version
Plus manual include/exclude membership lists. Agent groups can be referenced by renewal policies via agent_group_id FK.
MatchesAgent() method on the domain model evaluates all criteria against an agent's metadata.
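A minimal sketch of how such a matcher could work is shown below. The struct shapes are hypothetical, and two behaviours are assumptions rather than documented facts: an empty criterion is treated as "match anything", and versions are compared by exact string equality. Manual include/exclude lists are left out.

```go
package main

import (
	"fmt"
	"net/netip"
)

// Agent is a hypothetical, trimmed-down view of an agent's metadata.
type Agent struct {
	OS, Arch, Version, IP string
}

// AgentGroup holds the dynamic criteria; empty fields are not evaluated (assumption).
type AgentGroup struct {
	OS, Arch, Version, CIDR string
}

// MatchesAgent returns true only if every configured criterion matches.
func (g AgentGroup) MatchesAgent(a Agent) bool {
	if g.OS != "" && g.OS != a.OS {
		return false
	}
	if g.Arch != "" && g.Arch != a.Arch {
		return false
	}
	if g.Version != "" && g.Version != a.Version {
		return false
	}
	if g.CIDR != "" {
		prefix, err := netip.ParsePrefix(g.CIDR)
		if err != nil {
			return false // an unparsable criterion matches nothing
		}
		addr, err := netip.ParseAddr(a.IP)
		if err != nil || !prefix.Contains(addr) {
			return false
		}
	}
	return true
}

func main() {
	group := AgentGroup{OS: "linux", CIDR: "10.0.0.0/8"}
	fmt.Println(group.MatchesAgent(Agent{OS: "linux", Arch: "amd64", IP: "10.1.2.3"}))
	fmt.Println(group.MatchesAgent(Agent{OS: "linux", Arch: "amd64", IP: "192.168.1.5"}))
}
```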
Observability
Metrics
JSON metrics: GET /api/v1/metrics — gauges (cert totals by status, agent counts, pending jobs), counters (completed/failed jobs), uptime.
Prometheus metrics: GET /api/v1/metrics/prometheus — text/plain; version=0.0.4 exposition format. 11 metrics with certctl_ prefix:
| Metric | Type |
|---|---|
| certctl_certificate_total | gauge |
| certctl_certificate_active | gauge |
| certctl_certificate_expiring_soon | gauge |
| certctl_certificate_expired | gauge |
| certctl_certificate_revoked | gauge |
| certctl_agent_total | gauge |
| certctl_agent_online | gauge |
| certctl_job_pending | gauge |
| certctl_job_completed_total | counter |
| certctl_job_failed_total | counter |
| certctl_uptime_seconds | gauge |
Compatible with Prometheus, Grafana Agent, Datadog Agent, Victoria Metrics.
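For readers unfamiliar with the text exposition format, the sketch below renders two of the metrics above by hand. It is not certctl's handler: the `metric` type and `render` function are hypothetical, and the sample values are made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// metric is a hypothetical in-memory sample.
type metric struct {
	Name  string
	Type  string // "gauge" or "counter"
	Value float64
}

// render writes each sample as a "# TYPE" line followed by "name value".
func render(ms []metric) string {
	var b strings.Builder
	for _, m := range ms {
		fmt.Fprintf(&b, "# TYPE %s %s\n", m.Name, m.Type)
		fmt.Fprintf(&b, "%s %g\n", m.Name, m.Value)
	}
	return b.String()
}

func main() {
	fmt.Print(render([]metric{
		{Name: "certctl_certificate_total", Type: "gauge", Value: 42},
		{Name: "certctl_job_failed_total", Type: "counter", Value: 3},
	}))
}
```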
Stats API
| Endpoint | Description |
|---|---|
| GET /api/v1/stats/summary | Dashboard summary (total, active, expiring, expired) |
| GET /api/v1/stats/certificates-by-status | Status distribution |
| GET /api/v1/stats/expiration-timeline?days=N | Expiration buckets |
| GET /api/v1/stats/job-trends?days=N | Job completion trends |
| GET /api/v1/stats/issuance-rate?days=N | Issuance rate |
Structured Logging
slog-based middleware with request ID propagation. No fmt.Printf in production code paths.
Immutable Audit Trail
Append-only audit_events table. No UPDATE or DELETE permitted. Records:
- All API calls (via audit middleware)
- Certificate lifecycle events (issuance, renewal, deployment, revocation, export)
- Discovery events (scan completed, cert claimed, cert dismissed)
- Job lifecycle events (created, completed, failed, cancelled, verified)
- Approval events (approved, rejected with reason)
Job System
Job Types
| Type | Description |
|---|---|
| Issuance | New certificate issuance |
| Renewal | Certificate renewal |
| Deployment | Deploy cert to target |
| Validation | Validate deployment |
Job Statuses
| Status | Description |
|---|---|
| Pending | Queued for processing |
| AwaitingCSR | Waiting for agent to submit CSR (agent keygen mode) |
| AwaitingApproval | Paused for manual approval |
| Running | In progress |
| Completed | Successfully finished |
| Failed | Failed with error |
| Cancelled | Cancelled by operator |
Agent Work Routing
GetPendingWork() returns only jobs scoped to the requesting agent:
- Deployment jobs: matched by jobs.agent_id (set at creation from target → agent relationship)
- AwaitingCSR jobs: matched via certificate → target → agent chain
- Legacy fallback: target JOIN for jobs with NULL agent_id
Single SQL UNION query replaces the previous "fetch all, filter in Go" approach.
Background Scheduler
12 background loops (8 always-on + 4 opt-in), each with an atomic.Bool idempotency guard preventing concurrent tick execution. sync.WaitGroup + WaitForCompletion() for graceful shutdown. Authoritative topology table lives in docs/architecture.md.
| Loop | Default Interval | Always-on | Env Var | Description |
|---|---|---|---|---|
| Renewal check | 1 hour | Yes | CERTCTL_SCHEDULER_RENEWAL_CHECK_INTERVAL | Check expiring certs, query ARI, create renewal jobs |
| Job processor | 30 seconds | Yes | CERTCTL_SCHEDULER_JOB_PROCESSOR_INTERVAL | Process pending jobs (concurrency cap via CERTCTL_RENEWAL_CONCURRENCY, default 25) |
| Job retry | 5 minutes | Yes | CERTCTL_SCHEDULER_RETRY_INTERVAL | Retry Failed jobs (I-001) |
| Job timeout reaper | 10 minutes | Yes | CERTCTL_JOB_TIMEOUT_INTERVAL (per-state thresholds: CERTCTL_JOB_AWAITING_APPROVAL_TIMEOUT, CERTCTL_JOB_AWAITING_CSR_TIMEOUT) | Fail AwaitingCSR/AwaitingApproval jobs past timeout (I-003) |
| Agent health check | 2 minutes | Yes | CERTCTL_SCHEDULER_AGENT_HEALTH_CHECK_INTERVAL | Check agent heartbeat staleness |
| Notification processor | 1 minute | Yes | CERTCTL_SCHEDULER_NOTIFICATION_PROCESS_INTERVAL | Send queued notifications |
| Notification retry | 2 minutes | Yes | CERTCTL_NOTIFICATION_RETRY_INTERVAL | Exponential backoff retry for failed notifications; promote to dead-letter after 5 attempts (I-005) |
| Short-lived expiry check | 30 seconds | Yes | CERTCTL_SHORT_LIVED_EXPIRY_CHECK_INTERVAL | Mark short-lived certs expired (C-1: pre-C-1 the setter was unwired and this env var had no effect; post-C-1 it's read by cmd/server/main.go::sched.SetShortLivedExpiryCheckInterval) |
| Network scan | 6 hours | Opt-in | CERTCTL_NETWORK_SCAN_ENABLED | Run network discovery scans |
| Digest | 24 hours | Opt-in | CERTCTL_DIGEST_INTERVAL | Send certificate digest email (does not run on startup) |
| Endpoint health | 60 seconds | Opt-in | CERTCTL_HEALTH_CHECK_INTERVAL | Continuous TLS health probes (M48) |
| Cloud discovery | 6 hours | Opt-in | CERTCTL_CLOUD_DISCOVERY_INTERVAL | Cloud secret manager certificate discovery (M50) |
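The per-loop idempotency guard described for the scheduler can be sketched with `atomic.Bool.CompareAndSwap`. The `loop` type and its `tick` method are hypothetical names for illustration; only the atomic.Bool guard, the sync.WaitGroup, and the `WaitForCompletion` name come from the text above.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// loop is a hypothetical stand-in for one scheduler loop.
type loop struct {
	running atomic.Bool    // true while a tick is executing
	wg      sync.WaitGroup // tracks in-flight ticks for shutdown
	runs    atomic.Int64   // number of ticks that actually ran
}

// tick runs work unless the previous tick is still executing.
// It reports whether the work ran.
func (l *loop) tick(work func()) bool {
	if !l.running.CompareAndSwap(false, true) {
		return false // a tick is already in flight: skip this one
	}
	l.wg.Add(1)
	defer l.wg.Done()
	defer l.running.Store(false)
	work()
	l.runs.Add(1)
	return true
}

// WaitForCompletion blocks until every in-flight tick has returned.
func (l *loop) WaitForCompletion() { l.wg.Wait() }

func main() {
	l := &loop{}
	started, release := make(chan struct{}), make(chan struct{})
	done := make(chan bool)
	go func() { done <- l.tick(func() { close(started); <-release }) }()
	<-started
	fmt.Println("overlapping tick ran:", l.tick(func() {}))
	close(release)
	fmt.Println("first tick ran:", <-done)
	l.WaitForCompletion()
	fmt.Println("completed ticks:", l.runs.Load())
}
```

Skipping an overlapping tick, rather than queueing it, keeps a slow run from stacking up behind itself; the next timer fire simply tries again.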
Dynamic Configuration (GUI)
Issuer Configuration
GUI-driven issuer CRUD with AES-256-GCM encrypted config storage in PostgreSQL.
- Per-type config schema validation for all issuer types (rebuild count via ls -d internal/connector/issuer/*/ | wc -l)
- Test connection flow (instantiates throwaway connector, calls ValidateConfig)
- Dynamic sync.RWMutex-guarded IssuerRegistry: rebuilds without server restart
- Env var backward compatibility: seeds DB on first boot if no DB config exists
- Source tracking: env (seeded from env vars) or database (created via GUI)
| Env Var | Default | Description |
|---|---|---|
| CERTCTL_CONFIG_ENCRYPTION_KEY | (none) | AES-256-GCM encryption key for stored configs |
Encryption: AES-256-GCM with PBKDF2-SHA256 key derivation, 12-byte random nonce. Exported functions: EncryptAESGCM, DecryptAESGCM, DeriveKey, EncryptIfKeySet, DecryptIfEncrypted.
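The scheme can be illustrated with the standard library alone. The sketch below is not the code behind `EncryptAESGCM` or `DeriveKey`: the function names, the salt, the iteration count, and the nonce-prefixed blob layout are all assumptions, and PBKDF2 is hand-rolled here (single 32-byte block, per RFC 8018) only to keep the example dependency-free.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"errors"
	"fmt"
)

// deriveKey is PBKDF2-HMAC-SHA256 restricted to one 32-byte output block,
// which is exactly the size of an AES-256 key.
func deriveKey(passphrase, salt []byte, iterations int) []byte {
	mac := hmac.New(sha256.New, passphrase)
	mac.Write(salt)
	mac.Write([]byte{0, 0, 0, 1}) // block index 1, big-endian
	u := mac.Sum(nil)
	out := append([]byte(nil), u...)
	for i := 1; i < iterations; i++ {
		mac.Reset()
		mac.Write(u)
		u = mac.Sum(nil)
		for j := range out {
			out[j] ^= u[j]
		}
	}
	return out
}

func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// sealConfig returns nonce || ciphertext (assumed layout).
func sealConfig(key, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize()) // 12 bytes for standard GCM
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// openConfig splits the nonce off the blob and authenticates + decrypts the rest.
func openConfig(key, blob []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(blob) < gcm.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	return gcm.Open(nil, blob[:gcm.NonceSize()], blob[gcm.NonceSize():], nil)
}

func main() {
	key := deriveKey([]byte("example-passphrase"), []byte("example-salt"), 100000)
	blob, err := sealConfig(key, []byte(`{"api_key":"s3cret"}`))
	if err != nil {
		panic(err)
	}
	plain, err := openConfig(key, blob)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(plain))
}
```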
Target Configuration
Same pattern as issuer configuration:
- Per-type config validation for all 14 target types
- AES-256-GCM encrypted config storage
- Test connection via agent heartbeat status (online within 5 minutes)
- Source badge (database vs env), enabled/disabled toggle
Web Dashboard
The dashboard surface (rebuild count via ls web/src/pages/*.tsx | grep -v '\.test\.' | wc -l) wires every page to real API endpoints.
Pages
| Page | Route | Description |
|---|---|---|
| Dashboard | / | Summary stats, 4 charts (status donut, expiration heatmap, renewal trends, issuance rate) |
| Certificates | /certificates | List with bulk ops (renew, revoke by filter criteria, reassign owner), multi-select. Bulk revoke via server-side filter API, not client-side sequential calls. |
| Certificate Detail | /certificates/:id | Versions, deployment timeline, inline policy editor, export buttons |
| Agents | /agents | List with OS/arch metadata |
| Agent Detail | /agents/:id | System info, heartbeat status, capabilities, recent jobs |
| Fleet Overview | /fleet | OS/arch grouping, status/version distribution charts |
| Jobs | /jobs | List with status filter, approval buttons, verification badges |
| Job Detail | /jobs/:id | Full details, verification section (deployment jobs), timeline, audit events |
| Notifications | /notifications | Grouped by cert, read/unread state, mark-read |
| Policies | /policies | CRUD, severity summary bar, config preview |
| Profiles | /profiles | CRUD, EKU configuration |
| Issuers | /issuers | Catalog (10 cards), 3-step create wizard, config detail modal |
| Issuer Detail | /issuers/:id | Config (sensitive redacted), test connection, issued certs list |
| Targets | /targets | List with create wizard (3-step), per-type config fields for all 14 types |
| Target Detail | /targets/:id | Config, agent link, deployment history with verification badges |
| Owners | /owners | Team resolution, notification routing |
| Teams | /teams | CRUD |
| Agent Groups | /agent-groups | Dynamic criteria badges, manual membership |
| Audit | /audit | Time range/actor/resource/action filters, CSV/JSON export |
| Short-Lived | /short-lived | Filtered by profile TTL < 1 hour, live TTL countdown, auto-refresh 10s |
| Discovery | /discovery | Triage GUI with summary stats, claim/dismiss, scan history |
| Network Scans | /network-scans | CRUD for scan targets, Scan Now button |
| Digest | /digest | Preview iframe + send button |
| Observability | /observability | Health, metrics, Prometheus config, live output |
Onboarding Wizard
4-step first-run wizard shown when no user-configured issuers or certificates exist:
- Connect a CA — issuer catalog with 6+ types, config form, create + test connection
- Deploy Agent — OS tabs (Linux/macOS/Docker) with install commands, agent polling every 5s
- Add Certificate — CN, SANs, issuer/profile dropdowns, trigger issuance
- Done — summary, doc links
Latching state prevents refetch-driven dismissal. localStorage dismissal key: certctl:onboarding-dismissed.
CLI
certctl-cli — stdlib-only (flag + text/tabwriter), no Cobra dependency.
Scope (intentionally narrow)
The CLI focuses on read-heavy operator triage (list, get, status, version) and bulk-action surface (certs bulk-revoke, import). It deliberately omits admin CRUD for issuers, targets, owners, teams, agent groups, certificate profiles, renewal policies, policy rules, and notifications — those live in the GUI and the MCP server (rebuild count via grep -cE 'gomcp\.AddTool\(' internal/mcp/tools.go for the full operator surface). This split is intentional: CLI is the SSH-into-the-prod-host emergency console; GUI is the day-to-day operator console; MCP is the AI/automation surface. Closes audit finding cat-i-7c8b28936e3d — pre-this-doc the narrow scope was correct in code but confused readers who scanned docs/features.md's "CLI commands" count and assumed the CLI was incomplete.
Commands
| Command | Description |
|---|---|
| certs list | List certificates |
| certs get ID | Certificate details |
| certs renew ID | Trigger renewal |
| certs revoke ID | Revoke (with --reason) |
| certs bulk-revoke | Bulk revoke by filter criteria (see below) |
| agents list | List agents |
| agents get ID | Agent details |
| jobs list | List jobs |
| jobs get ID | Job details |
| jobs cancel ID | Cancel pending job |
| import FILE | Bulk import from PEM file(s) |
| status | Server health + summary |
| version | CLI version |
Global Flags
| Flag | Env Var | Default | Description |
|---|---|---|---|
| --server | CERTCTL_SERVER_URL | http://localhost:8443 | Server URL |
| --api-key | CERTCTL_API_KEY | (none) | API key |
| --format | (none) | table | Output: table or json |
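A stdlib-only CLI with these global flags could resolve them roughly as follows. This is a sketch, not the certctl-cli source: `parseGlobals` and `envOr` are hypothetical helpers, and the precedence (flag over env var over built-in default) is an assumption about how the table's columns interact.

```go
package main

import (
	"flag"
	"fmt"
	"os"
	"text/tabwriter"
)

// globals holds the three global settings from the table above.
type globals struct {
	Server, APIKey, Format string
}

// envOr returns the env value for key, or fallback when it is unset or empty.
func envOr(getenv func(string) string, key, fallback string) string {
	if v := getenv(key); v != "" {
		return v
	}
	return fallback
}

// parseGlobals parses global flags and returns the remaining command words.
func parseGlobals(args []string, getenv func(string) string) (globals, []string, error) {
	var g globals
	fs := flag.NewFlagSet("certctl-cli", flag.ContinueOnError)
	fs.StringVar(&g.Server, "server", envOr(getenv, "CERTCTL_SERVER_URL", "http://localhost:8443"), "Server URL")
	fs.StringVar(&g.APIKey, "api-key", envOr(getenv, "CERTCTL_API_KEY", ""), "API key")
	fs.StringVar(&g.Format, "format", "table", "Output: table or json")
	if err := fs.Parse(args); err != nil {
		return g, nil, err
	}
	return g, fs.Args(), nil
}

func main() {
	g, rest, err := parseGlobals(os.Args[1:], os.Getenv)
	if err != nil {
		os.Exit(2)
	}
	// tabwriter aligns the tab-separated columns for table output.
	w := tabwriter.NewWriter(os.Stdout, 0, 4, 2, ' ', 0)
	fmt.Fprintln(w, "SETTING\tVALUE")
	fmt.Fprintf(w, "server\t%s\n", g.Server)
	fmt.Fprintf(w, "format\t%s\n", g.Format)
	fmt.Fprintf(w, "command\t%v\n", rest)
	w.Flush()
}
```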
Bulk Revocation Command
certs bulk-revoke revokes multiple certificates matching filter criteria.
Usage: certs bulk-revoke [CERT_IDs...] [flags]
Flags:
| Flag | Description |
|---|---|
| --reason | RFC 5280 revocation reason (keyCompromise, caCompromise, affiliationChanged, superseded, cessationOfOperation, certificateHold, privilegeWithdrawn, unspecified; default: unspecified). |
| --profile-id | Revoke all certs with this profile ID |
| --owner-id | Revoke all certs owned by this owner |
| --agent-id | Revoke all certs deployed to this agent |
| --issuer-id | Revoke all certs issued by this issuer |
| --team-id | Revoke all certs owned by members of this team |
Examples:
# Revoke certs with specific IDs (positional args)
certctl-cli certs bulk-revoke mc-api-prod mc-web-prod --reason keyCompromise
# Revoke by profile
certctl-cli certs bulk-revoke --profile-id prof-staging --reason cessationOfOperation
# Revoke by team
certctl-cli certs bulk-revoke --team-id team-platform --reason superseded
# Revoke by issuer (all certs from one CA)
certctl-cli certs bulk-revoke --issuer-id iss-letsencrypt --reason caCompromise
MCP Server
Separate standalone binary (cmd/mcp-server/) using the official MCP Go SDK (modelcontextprotocol/go-sdk). Stdio transport for Claude, Cursor, and similar AI tool integrations.
- MCP tools covering all API endpoints (rebuild count via grep -cE 'gomcp\.AddTool\(' internal/mcp/tools.go)
- Stateless HTTP proxy: translates MCP tool calls to REST API calls
- Typed input structs with jsonschema struct tags for automatic schema generation
- Binary response support (DER CRL, OCSP)
| Env Var | Description |
|---|---|
| CERTCTL_SERVER_URL | certctl server URL |
| CERTCTL_API_KEY | API key for authentication |
Agent
Standalone binary that runs on managed infrastructure. Communicates with the control plane via HTTP polling.
Capabilities
- Heartbeat reporting (OS, architecture, IP address, version via runtime.GOOS / runtime.GOARCH / net stdlib)
- Work polling (GET /agents/{id}/work)
- ECDSA P-256 key generation + CSR submission
- Target connector deployment (instantiates local connector based on job config)
- Post-deployment TLS verification
- Filesystem certificate discovery
- Exponential backoff on errors
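The key generation and CSR steps in the list above can be sketched with the standard crypto packages. The helper names, the file name, and the "EC PRIVATE KEY" PEM encoding are assumptions for illustration; the agent's real on-disk format and submission call are not shown here.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// newKeyAndCSR generates an ECDSA P-256 key and a CSR signed by that key.
func newKeyAndCSR(commonName string, sans []string) (keyPEM, csrPEM []byte, err error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	keyDER, err := x509.MarshalECPrivateKey(key)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.CertificateRequest{
		Subject:  pkix.Name{CommonName: commonName},
		DNSNames: sans,
	}
	csrDER, err := x509.CreateCertificateRequest(rand.Reader, tmpl, key)
	if err != nil {
		return nil, nil, err
	}
	keyPEM = pem.EncodeToMemory(&pem.Block{Type: "EC PRIVATE KEY", Bytes: keyDER})
	csrPEM = pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE REQUEST", Bytes: csrDER})
	return keyPEM, csrPEM, nil
}

// csrCommonName parses a PEM CSR, verifies its self-signature, and returns the CN.
func csrCommonName(csrPEM []byte) (string, error) {
	block, _ := pem.Decode(csrPEM)
	if block == nil || block.Type != "CERTIFICATE REQUEST" {
		return "", errors.New("not a PEM certificate request")
	}
	csr, err := x509.ParseCertificateRequest(block.Bytes)
	if err != nil {
		return "", err
	}
	if err := csr.CheckSignature(); err != nil {
		return "", err
	}
	return csr.Subject.CommonName, nil
}

func main() {
	keyPEM, csrPEM, err := newKeyAndCSR("api.example.com", []string{"api.example.com"})
	if err != nil {
		panic(err)
	}
	dir, err := os.MkdirTemp("", "certctl-keys") // stand-in for the agent's key directory
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)
	keyPath := filepath.Join(dir, "api.example.com.key")
	// 0600: readable and writable by the owning user only.
	if err := os.WriteFile(keyPath, keyPEM, 0o600); err != nil {
		panic(err)
	}
	cn, err := csrCommonName(csrPEM)
	if err != nil {
		panic(err)
	}
	fmt.Println("key stored at", keyPath, "and CSR ready for CN", cn)
}
```

Only the CSR leaves the host in this flow; the private key stays in the local key directory.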
Agent Metadata
Reported via heartbeat, stored in agents table: OS, platform, architecture, IP address, hostname, version.
Configuration
| Flag / Env Var | Default | Description |
|---|---|---|
| --server-url / CERTCTL_SERVER_URL | http://localhost:8443 | Control plane URL |
| --agent-id / CERTCTL_AGENT_ID | (required) | Agent identifier |
| --api-key / CERTCTL_API_KEY | (none) | Auth key |
| --key-dir / CERTCTL_KEY_DIR | /var/lib/certctl/keys | Local key storage |
| --discovery-dirs / CERTCTL_DISCOVERY_DIRS | (none) | Comma-separated scan directories |
Deployment
Docker Compose
- deploy/docker-compose.yml: clean default (server + postgres + agent), wizard-compatible
- deploy/docker-compose.demo.yml: override adding seed_demo.sql for demo mode
- deploy/docker-compose.test.yml: 7-container test environment (PostgreSQL, certctl-server, certctl-agent, step-ca, Pebble ACME, pebble-challtestsrv, NGINX) on static IP subnet 10.30.50.0/24
Helm Chart
Production-ready Kubernetes deployment.
| Component | Kind | Notes |
|---|---|---|
| Server | Deployment | Configurable replicas (default 1), health probes, non-root, read-only rootfs |
| PostgreSQL | StatefulSet | Single replica, PVC (10Gi default, configurable storage class) |
| Agent | DaemonSet | One per node, key storage volume, server URL auto-discovery |
| Ingress | Ingress | Optional, configurable className, annotations, TLS |
| ServiceAccount | ServiceAccount | Optional with configurable annotations |
Config via values.yaml. Secrets for API key, database password, SMTP password.
Install Script
install-agent.sh — detects OS/arch via uname, downloads binary from GitHub Releases, installs to /usr/local/bin/certctl-agent, creates systemd unit (Linux) or launchd plist (macOS), prompts for server URL + API key.
Release Workflow
.github/workflows/release.yml — on tag push: cross-compiles server + agent for 4 targets, attaches as GitHub Release assets, pushes Docker images to ghcr.io.
Database Schema
PostgreSQL 16, database/sql + lib/pq (no ORM). TEXT primary keys with human-readable prefixed IDs. The catalog of tables and migrations rebuilds via the commands in the "At a Glance" table at the top of this doc — re-derive at release time rather than reading hardcoded numbers from prose.
The migration runner reads SQL files from ./migrations/ by default; the path is configurable via CERTCTL_DATABASE_MIGRATIONS_PATH for operators running certctl out of a non-standard layout (e.g. a Helm chart that bind-mounts migrations into /etc/certctl/migrations/).
Migrations
| Migration | Tables Added |
|---|---|
| 000001_initial_schema | managed_certificates, certificate_versions, agents, targets, issuers, renewal_policies, jobs, audit_events, notifications, owners, teams |
| 000002_agent_metadata | Columns on agents (os, platform, architecture, ip_address, hostname, version) |
| 000003_certificate_profiles | certificate_profiles |
| 000004_agent_groups | agent_groups, agent_group_members |
| 000005_revocation | certificate_revocations + columns on managed_certificates |
| 000006_discovery | discovered_certificates, discovery_scans |
| 000007_network_discovery | network_scan_targets |
| 000008_verification | Columns on jobs (verification fields) |
| 000009_issuer_config | Columns on issuers (encrypted_config, source, test_status) |
| 000010_target_config | Columns on targets (encrypted_config, source, test_status) |
| 000019_crl_cache | crl_cache (per-issuer pre-generated DER CRL with monotonic crl_number per RFC 5280 §5.2.3, this_update / next_update timestamps, revoked_count, generation duration metric) + crl_generation_events (per-tick ops audit row with succeeded flag and error text) |
| 000020_ocsp_responder | ocsp_responders (per-issuer dedicated OCSP responder cert PEM + on-disk key path + not_before / not_after for auto-rotation) |
The migration list above is illustrative; for the full sequence run ls migrations/*.up.sql. All migrations are idempotent (IF NOT EXISTS, ON CONFLICT).
Security
Input Validation
Centralized validation package with shell injection prevention. 80+ adversarial test cases. Used by all target connectors that execute shell commands (NGINX, Apache, HAProxy, Traefik, Caddy, Postfix/Dovecot, SSH, Java Keystore).
SSRF Protection
Network scanner filters reserved IP ranges before CIDR expansion: loopback, link-local, multicast, broadcast.
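A pre-expansion filter of this kind might look like the sketch below, using `net/netip`. The `checkScanTarget` name and the exact prefix list are assumptions: the IPv4 entries follow the four categories named above, and the IPv6 entries are added by analogy rather than taken from the source.

```go
package main

import (
	"fmt"
	"net/netip"
)

// reserved lists ranges a scan target must not touch (assumed list).
var reserved = []netip.Prefix{
	netip.MustParsePrefix("127.0.0.0/8"),        // IPv4 loopback
	netip.MustParsePrefix("169.254.0.0/16"),     // IPv4 link-local
	netip.MustParsePrefix("224.0.0.0/4"),        // IPv4 multicast
	netip.MustParsePrefix("255.255.255.255/32"), // IPv4 limited broadcast
	netip.MustParsePrefix("::1/128"),            // IPv6 loopback
	netip.MustParsePrefix("fe80::/10"),          // IPv6 link-local
	netip.MustParsePrefix("ff00::/8"),           // IPv6 multicast
}

// checkScanTarget rejects a CIDR that overlaps any reserved range.
// It runs on the prefix itself, so no addresses are expanded first.
func checkScanTarget(cidr string) error {
	p, err := netip.ParsePrefix(cidr)
	if err != nil {
		return fmt.Errorf("invalid CIDR %q: %w", cidr, err)
	}
	p = p.Masked() // normalise host bits, e.g. 10.0.0.7/24 becomes 10.0.0.0/24
	for _, r := range reserved {
		if p.Overlaps(r) {
			return fmt.Errorf("CIDR %s overlaps reserved range %s", p, r)
		}
	}
	return nil
}

func main() {
	for _, cidr := range []string{"10.0.0.0/24", "127.0.0.1/32", "169.254.169.254/32"} {
		if err := checkScanTarget(cidr); err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("allowed:", cidr)
	}
}
```

Checking overlap on the prefix, instead of testing each expanded address, also rejects a wide range such as 0.0.0.0/0 that merely contains a reserved block.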
Encryption at Rest
AES-256-GCM with PBKDF2-SHA256 key derivation for issuer and target configs stored in PostgreSQL.
Agent Key Security
- Agent-side key generation (ECDSA P-256) — private keys never leave agent infrastructure
- Keys stored with 0600 file permissions
- Docker volumes persist keys across container restarts
CI/CD
GitHub Actions with parallel Go and Frontend jobs.
Go Pipeline
- go build (server, agent, CLI, MCP server)
- go vet
- go test -race (race detection)
- golangci-lint (11 linters)
- govulncheck (vulnerability scanning)
- Test coverage with per-layer thresholds:
| Layer | Threshold |
|---|---|
| Service | 55% |
| Handler | 60% |
| Domain | 40% |
| Middleware | 30% |
Frontend Pipeline
- tsc (TypeScript compilation)
- vitest (213 tests)
- vite build
Test Suite
1850+ tests across multiple layers:
| Layer | Approximate Count | Description |
|---|---|---|
| Service | ~400 | Unit tests for all service methods |
| Handler | ~200 | HTTP handler tests with mocked services |
| Domain | ~80 | Domain model validation and logic |
| Connector (issuer) | ~130 | Per-connector tests with httptest mocks |
| Connector (target) | ~200 | Per-connector tests with injectable interfaces |
| Middleware | ~30 | Auth, CORS, audit, rate limiting, body limit |
| Integration | ~50 | Multi-layer integration tests |
| Go integration | 34 subtests | Live Docker Compose environment (12 phases) |
| Repository | ~50 | testcontainers-go PostgreSQL tests |
| CLI | ~14 | Command tests with httptest mock server |
| Fuzz | ~5 | Validation and domain parsing |
| Frontend | 213 | Vitest (API client, components, utilities) |
Go Integration Tests
deploy/test/integration_test.go — //go:build integration tag, runs against live docker-compose.test.yml. 12 phases, 34 subtests: health, agent heartbeat, Local CA issuance, ACME issuance, renewal, step-ca issuance, revocation + CRL + OCSP, EST enrollment, S/MIME (EKU/KeyUsage/email SAN), discovery, network scan, deployment verification. Uses crypto/x509 for cert parsing, crypto/tls for NGINX verification, database/sql + lib/pq for PostgreSQL direct access.
Examples
5 turnkey Docker Compose scenarios in examples/:
| Directory | Scenario |
|---|---|
| acme-nginx/ | Let's Encrypt + NGINX |
| acme-wildcard-dns01/ | Wildcard with DNS-01 via Cloudflare hooks |
| private-ca-traefik/ | Local CA sub-CA mode + Traefik file provider |
| step-ca-haproxy/ | step-ca + HAProxy |
| multi-issuer/ | ACME (public) + Local CA (internal) from one dashboard |
Compliance Mapping
Pre-mapped to three compliance frameworks in docs/:
- SOC 2 Type II — CC6 (logical access), CC7 (system operations), CC8 (change management), A1 (availability)
- PCI-DSS 4.0 — Req 3 (key management), Req 4 (TLS inventory), Req 7 (access control), Req 8 (authentication), Req 10 (audit logging)
- NIST SP 800-57 — Key generation, storage, cryptoperiods, key states, algorithms, revocation
Architecture Decisions
| Decision | Choice | Rationale |
|---|---|---|
| Language | Go 1.25 | stdlib routing, net/http, slog, crypto/x509 |
| Database | PostgreSQL 16 + database/sql + lib/pq | No ORM, raw SQL |
| Primary keys | TEXT | Human-readable prefixed IDs (mc-api-prod) |
| Layering | Handler → Service → Repository | Dependency inversion (handlers define interfaces) |
| Frontend | Vite + React 18 + TypeScript + TanStack Query | Served from web/dist/ with SPA fallback |
| Deployment model | Pull-only | Server never initiates outbound to agents/targets |
| Service decomposition | Facade/delegation | CertificateService delegates to RevocationSvc + CAOperationsSvc |
| Handler wiring | HandlerRegistry struct (20 fields) | Replaced 18-positional-parameter function |
| License | BSL 1.1 | Source-available; not for use in competing managed services |
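The Handler → Service → Repository layering, with the handler side owning the interface, can be sketched as below. Every type name here is hypothetical, as is the `/api/v1/teams` path; the point is only that the handler depends on a small interface it declares itself, so the service and repository can be swapped in tests.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// teamLister is declared next to the handler: the handler states what it needs.
type teamLister interface {
	ListTeams() ([]string, error)
}

type teamHandler struct{ svc teamLister }

func (h teamHandler) list(w http.ResponseWriter, r *http.Request) {
	teams, err := h.svc.ListTeams()
	if err != nil {
		http.Error(w, "internal error", http.StatusInternalServerError)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(teams)
}

// teamRepo is the service's view of storage.
type teamRepo interface {
	All() ([]string, error)
}

// teamService satisfies teamLister by delegating to a repository.
type teamService struct{ repo teamRepo }

func (s teamService) ListTeams() ([]string, error) { return s.repo.All() }

// memRepo is an in-memory repository standing in for PostgreSQL.
type memRepo []string

func (m memRepo) All() ([]string, error) { return m, nil }

// serve runs one request through the handler and returns status and body.
func serve(h teamHandler) (int, string) {
	rec := httptest.NewRecorder()
	h.list(rec, httptest.NewRequest(http.MethodGet, "/api/v1/teams", nil))
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := serve(teamHandler{svc: teamService{repo: memRepo{"platform"}}})
	fmt.Print(code, " ", body)
}
```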