Sprint 6 ACQ DEPL-004 closure follow-up. CI run on commit 58a15e0
caught two issues:
1. The fail-closed guard in templates/servicemonitor.yaml used
`{{ required "msg" nil }}`, which is wrong Helm syntax — the
bareword `nil` isn't valid in Go templates and Helm interprets
it as no value, hitting "wrong number of args for required:
want 2 got 0". The B3-helm-chart-coherence ci-guard's
production-hardening render
(`--set monitoring.serviceMonitor.enabled=true` without an
explicit tlsConfig) failed with this error. Because the entire
render aborted before producing the matrix, the guard also
reported the downstream "missing kind: ServiceMonitor /
PodDisruptionBudget / NetworkPolicy" cascades.
2. The original DEPL-004 framing — "operators MUST explicitly
choose tlsConfig or you get a chart-render error" — was the
right intent but the wrong default. The chart's existingSecret
integration mounts the CA bundle at a canonical path
(/etc/prometheus/secrets/certctl-ca/ca.crt); defaulting to that
path closes the implicit-skipVerify gap without forcing every
operator to repeat the same boilerplate.
Fixes
=====
deploy/helm/certctl/values.yaml — flips
monitoring.serviceMonitor.tlsConfig from commented-out (which fell
through to implicit insecureSkipVerify: true) to a real verify
default:
    tlsConfig:
      caFile: /etc/prometheus/secrets/certctl-ca/ca.crt
      serverName: certctl-server
Operators with a different CA mount path override caFile;
operators who genuinely want skipVerify back must set
`{ insecureSkipVerify: true }` explicitly. Operators who blank
tlsConfig entirely (`tlsConfig: null` or `tlsConfig: {}`) still
trip the fail-closed guard.
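
As a rough illustration of the override cases described above, the
following values fragments sketch what an operator might pass with
`-f`. The alternate CA path is a made-up example, and each YAML
document stands for a separate, hypothetical override file; none of
this is copied from the chart.

```yaml
# Case 1: default install. No override is needed; the chart default is
#   caFile: /etc/prometheus/secrets/certctl-ca/ca.crt
#   serverName: certctl-server
monitoring:
  serviceMonitor:
    enabled: true
---
# Case 2: CA bundle mounted elsewhere. Override caFile only.
monitoring:
  serviceMonitor:
    enabled: true
    tlsConfig:
      caFile: /etc/prometheus/secrets/my-ca/ca.crt  # assumed path, illustration only
---
# Case 3: explicit opt-in to skipVerify.
monitoring:
  serviceMonitor:
    enabled: true
    tlsConfig:
      insecureSkipVerify: true
---
# Case 4: blanking the default. Per the commit text this trips the
# fail-closed guard at render time.
monitoring:
  serviceMonitor:
    enabled: true
    tlsConfig: null
```

One caveat worth checking against the real chart: Helm deep-merges
map values, so in case 3 the default caFile and serverName keys stay
in the merged values unless they are also set to null. How the
template treats that combination is not shown here.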
deploy/helm/certctl/templates/servicemonitor.yaml — replaces
`required "msg" nil` with `fail "msg"`. The `fail` builtin is
the correct Helm pattern for an unconditional render-time error;
`required` is for "this value MUST be non-empty" which is the
wrong semantic here (we want to fail when the operator went OUT OF
THEIR WAY to blank the default). Failure message updated to
reflect the new default + the operator-action recipes.
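
A minimal sketch of what a `fail`-based guard could look like is
shown below. It is not the actual contents of
templates/servicemonitor.yaml: the message text, resource name, port
name, and selector labels are all assumptions made for illustration.
Only the values path, the use of `fail`, and the ServiceMonitor kind
come from the description above.

```yaml
{{- /* Sketch only; the real template's layout and wording may differ. */ -}}
{{- if .Values.monitoring.serviceMonitor.enabled }}
{{- $tls := .Values.monitoring.serviceMonitor.tlsConfig }}
{{- /* An absent, null, or empty tlsConfig evaluates as false here. */ -}}
{{- if not $tls }}
{{- fail "monitoring.serviceMonitor.tlsConfig is empty: keep the default caFile/serverName, override caFile, or set insecureSkipVerify: true explicitly" }}
{{- end }}
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: {{ .Release.Name }}-certctl  # assumed naming
spec:
  selector:
    matchLabels:
      app.kubernetes.io/instance: {{ .Release.Name }}  # assumed label
  endpoints:
    - port: https  # assumed port name
      scheme: https
      tlsConfig:
        {{- toYaml $tls | nindent 8 }}
{{- end }}
```

The design point is that `fail` takes a single message argument and
always aborts the render when reached, so the condition lives in the
surrounding `if` rather than in a value passed to `required`.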
docs/operator/helm-deployment.md — rewrites the
"2026-05-16 — ServiceMonitor TLS default flipped" subsection to
match the new default-on-real-verify semantics. The three operator
recipes (default install / different CA mount / explicit
skipVerify) are documented; the explicit "there is no way to
inherit pre-2026-05-16 implicit-skipVerify behavior silently"
guarantee is preserved.
Verified locally: values.yaml parses cleanly with a python3 YAML
load. The helm-templates-lint and B3-helm-chart-coherence ci-guards
need the helm binary, which isn't in the sandbox, so they were not
run locally; both are expected to pass on the CI re-run.
certctl Documentation
Last reviewed: 2026-05-12
The full docs index, organized by audience. Pick the section that matches what you need to do; each link below opens a focused doc rather than a wall of text.
For the elevator pitch and quickstart commands, see the repo README.md at the root. For the marketing site, see certctl.io.
Getting Started
You're new to certctl, just cloned the repo, or want to understand what it does before installing.
| Doc | What it covers |
|---|---|
| Concepts | TLS certificates explained for beginners — CAs, ACME, EST, private keys, the full glossary |
| Quickstart | Five-minute setup with Docker Compose, dashboard tour, API tour |
| Examples | Five turnkey scenarios — ACME+NGINX, wildcard DNS-01, private CA+Traefik, step-ca+HAProxy, multi-issuer |
| Advanced demo | End-to-end certificate lifecycle with technical depth at each step |
| Why certctl | Positioning vs ACME clients, agent-based SaaS, enterprise platforms; when to look elsewhere |
Reference
You're operating certctl in production or building integrations and need authoritative technical detail.
| Doc | What it covers |
|---|---|
| Architecture | System design, data flow, security model, deployment topologies |
| Profiles | CertificateProfile policy object — issuer wiring, EKUs, RequiresApproval gate (with profile-edit closure) |
| API | OpenAPI 3.1 spec, integration patterns, client SDK generation |
| CLI | certctl-cli command reference and CI/CD integration patterns |
| Configuration | CERTCTL_* environment variable reference (scheduler, rate limits, deploy verify, audit, agent) |
| MCP server | Model Context Protocol integration for AI assistants |
| Release verification | Cosign / SLSA / SBOM verification procedure |
| Intermediate CA hierarchy | Multi-level CA tree management — RFC 5280 §3.2/§4.2.1.9/§4.2.1.10 enforcement |
| Auth standards implemented | RFC + CWE evidence for the API-key + RBAC + OIDC + sessions + break-glass surface (NOT a compliance-mapping doc) |
| Deployment model | Atomic write, post-deploy verify, rollback semantics across all targets |
| Vendor matrix | Tested vendor versions per target connector |
Connectors
The connector index is the canonical catalog (interfaces, registry, scanners, plus an inline reference per built-in). Per-connector deep-dive siblings cover operator-grade material — vendor edges, troubleshooting, rotation playbooks, when-to-use vs alternatives.
Issuers (13 deep-dives): ACME · ADCS · AWS ACM Private CA · DigiCert · EJBCA / Keyfactor · Entrust · GlobalSign Atlas HVCA · Google CAS · Local CA · OpenSSL / Custom CA · Sectigo SCM · step-ca / Smallstep · Vault PKI
Targets (15 deep-dives): Apache · AWS Certificate Manager · Azure Key Vault · Caddy · Envoy · F5 BIG-IP · HAProxy · IIS · Java Keystore · Kubernetes Secrets · NGINX · Postfix / Dovecot · SSH (agentless) · Traefik · Windows Certificate Store
Protocols
| Doc | What it covers |
|---|---|
| ACME server | Run certctl as an RFC 8555 + RFC 9773 ARI ACME server |
| ACME server threat model | Security posture for the ACME server endpoint |
| SCEP server | RFC 8894 native SCEP server — RA cert config, multi-profile dispatch, must-staple, mTLS sibling route |
| SCEP for Microsoft Intune | Intune-specific deployment guide — NDES replacement playbook |
| EST server | RFC 7030 EST server — 802.1X / Wi-Fi enrollment, IoT bootstrap, channel binding |
| CRL & OCSP | RFC 5280 CRL + RFC 6960 OCSP responder for relying parties |
| Async CA polling | Bounded polling for async-CA issuer connectors |
Operator
You're running certctl in production and need operational guidance.
| Doc | What it covers |
|---|---|
| Security posture | Auth, rate limits, encryption at rest, key rotation, RBAC + OIDC + sessions + break-glass, bootstrap |
| Secret custody | Where private keys live; FileDriver vs HSM/KMS; encryption wire format; env-seeded vs DB-seeded plaintext policy |
| Observability | Metrics surface, Prometheus exposition vs client_golang, tracing scope, log structure, rate-limit semantics across restarts/replicas |
| RBAC operator reference | Roles, permissions, scopes, scope-down + day-0 bootstrap |
| Auth threat model | API-key + RBAC + OIDC + sessions + break-glass — token forgery, session hijacking, IdP compromise, role-grant abuse, bootstrap-token leak, audit-mutation |
| OIDC / SSO runbooks | Per-IdP setup guides — Keycloak, Authentik, Okta, Auth0, Entra ID, Google Workspace |
| Control plane TLS | Self-signed bootstrap, operator-supplied Secret, cert-manager Certificate CR |
| Database TLS | PostgreSQL transport encryption |
| Approval workflow | Two-person integrity gate for high-stakes issuance + profile-edit closure |
| Helm deployment | Kubernetes installation via the bundled chart |
| Performance baselines | Operator-runnable benchmarks for regression spot checks |
| Auth benchmarks | Session + OIDC validation p99 targets and measured baselines |
| Legacy clients (TLS 1.2) | Reverse-proxy runbook for embedded EST/SCEP clients on TLS 1.2 |
Runbooks
| Runbook | When |
|---|---|
| Cloud targets | AWS ACM + Azure Key Vault deployment, debugging, rollback |
| Expiry alerts | Per-policy multi-channel routing matrix, severity tiers |
| Disaster recovery | CRL cache, OCSP responder cert, CA private-key rotation, Postgres restore |
| Config-encryption upgrade | Force v1/v2 → v3 re-seal across the database; passphrase rotation procedure |
| PostgreSQL backup | Operator-run backup recipe (docker-compose + Kubernetes); recommended cadence; quarterly DR dry-run |
Migration
You're moving from another cert-management tool to certctl, or running both in parallel.
| From | Doc |
|---|---|
| Certbot | migration/from-certbot.md |
| acme.sh | migration/from-acmesh.md |
| cert-manager (coexistence, not replacement) | migration/cert-manager-coexistence.md |
| Caddy ACME (point Caddy at certctl) | migration/acme-from-caddy.md |
| cert-manager ACME (point cert-manager at certctl) | migration/acme-from-cert-manager.md |
| Traefik ACME (point Traefik at certctl) | migration/acme-from-traefik.md |
| API keys → RBAC (v2.0.x → v2.1.0) | migration/api-keys-to-rbac.md — AUDIT YOUR API KEYS post-upgrade |
| Enable OIDC SSO | migration/oidc-enable.md — step-by-step OIDC onboarding for an existing API-key + RBAC deployment |
Contributor
You're contributing to certctl, running tests locally, or trying to understand the CI pipeline.
| Doc | What it covers |
|---|---|
| Testing strategy | What we test and why; per-PR fast gates vs daily deep-scan |
| Test environment | Local environment with real CAs (Pebble, step-ca, etc.) |
| QA prerequisites | Before running QA: stack boot, demo data baseline, env vars |
| QA test suite | qa_test.go reference for release QA |
| GUI QA checklist | Manual GUI verification pass for release |
| Release sign-off | Release-day checklist — code state, automated gates, manual QA, artefact verification |
| CI pipeline | CI shape, regression guards, adding new checks |
| CI guards | Per-class CI guards (code-shape, contract-parity, build/dep, operational); how to add one |
Archive
Historical docs preserved for reference. Most operators don't need these.
| Doc | Why archived |
|---|---|
| Upgrade to TLS (v2.2) | Pre-v2.2 HTTPS-everywhere upgrade procedure |
| Upgrade past v2 JWT removal | G-1 milestone JWT auth removal procedure |
Reading order by role
First-time operator: Concepts → Quickstart → Examples. About 90 minutes end to end.
Production operator: Architecture → Security posture → Control plane TLS → Disaster recovery runbook. About 4 hours end to end.
PKI engineer: ACME server → SCEP server → EST server → Intermediate CA hierarchy. About 6 hours end to end.
Contributor: Architecture → Testing strategy → Test environment → CI pipeline. About 3 hours end to end.