Breaking change release. Plaintext HTTP listener removed. The certctl
control plane now terminates TLS 1.3 on :8443 via
http.Server.ListenAndServeTLS. No CERTCTL_TLS_ENABLED=false escape
hatch. No dual-listener mode. One-step cutover per docs/upgrade-to-tls.md.
Server
- cmd/server/tls.go: certHolder with SIGHUP hot-reload + atomic cert
swap, buildServerTLSConfig (TLS 1.3 min, GetCertificate callback),
preflightServerTLS validation
- cmd/server/main.go: ListenAndServeTLS in place of ListenAndServe,
watchSIGHUP wiring, cert/key path config threading
- tls_test.go: 418-line regression coverage of reload, preflight,
callback behavior, SAN validation
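The hot-reload pattern described above can be sketched roughly as follows. The names certHolder and buildServerTLSConfig come from this commit, but the bodies, the Reload method, and the file paths are assumptions for illustration, not the contents of cmd/server/tls.go.

```go
package main

import (
	"crypto/tls"
	"errors"
	"fmt"
	"sync/atomic"
)

// certHolder keeps the active certificate behind an atomic pointer so a
// reload never blocks or races with in-flight TLS handshakes.
// (Assumed layout; the real struct may differ.)
type certHolder struct {
	current atomic.Pointer[tls.Certificate]
}

// Reload parses the key pair from disk and swaps it in only on success,
// so a bad file on SIGHUP leaves the previous certificate serving.
func (h *certHolder) Reload(certPath, keyPath string) error {
	cert, err := tls.LoadX509KeyPair(certPath, keyPath)
	if err != nil {
		return fmt.Errorf("reload TLS key pair: %w", err)
	}
	h.current.Store(&cert)
	return nil
}

// GetCertificate matches the tls.Config.GetCertificate callback signature.
func (h *certHolder) GetCertificate(*tls.ClientHelloInfo) (*tls.Certificate, error) {
	cert := h.current.Load()
	if cert == nil {
		return nil, errors.New("no TLS certificate loaded")
	}
	return cert, nil
}

// buildServerTLSConfig pins TLS 1.3 as the floor and defers certificate
// selection to the holder on every handshake.
func buildServerTLSConfig(h *certHolder) *tls.Config {
	return &tls.Config{
		MinVersion:     tls.VersionTLS13,
		GetCertificate: h.GetCertificate,
	}
}

func main() {
	h := &certHolder{}
	cfg := buildServerTLSConfig(h)
	// Hypothetical paths; a failed reload leaves the holder untouched.
	if err := h.Reload("/nonexistent/tls.crt", "/nonexistent/tls.key"); err != nil {
		fmt.Println("reload failed:", err)
	}
	_, err := cfg.GetCertificate(nil)
	fmt.Println("handshake callback error:", err)
}
```

With GetCertificate populated, http.Server.ListenAndServeTLS accepts empty certFile and keyFile arguments, which is one way such a holder could be wired into the listener.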
Config
- CERTCTL_TLS_CERT_PATH / CERTCTL_TLS_KEY_PATH (required)
- Plaintext rejection: agents/CLI/MCP pre-flight-fail on http://
URLs with a pointer to docs/upgrade-to-tls.md
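A pre-flight check of this kind might look like the sketch below. The function name preflightServerURL and the error wording are hypothetical; only the behavior (reject http://, point at docs/upgrade-to-tls.md) is taken from this commit.

```go
package main

import (
	"fmt"
	"net/url"
)

// preflightServerURL rejects anything other than an https:// URL before
// any network call is attempted. (Hypothetical helper, not the real one.)
func preflightServerURL(raw string) error {
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Errorf("invalid server URL %q: %w", raw, err)
	}
	// url.Parse lowercases the scheme, so a plain comparison is enough.
	switch u.Scheme {
	case "https":
		if u.Host == "" {
			return fmt.Errorf("server URL %q has no host", raw)
		}
		return nil
	case "http":
		return fmt.Errorf("plaintext server URL %q rejected: the control plane is HTTPS-only, see docs/upgrade-to-tls.md", raw)
	default:
		return fmt.Errorf("unsupported scheme %q in server URL %q: want https", u.Scheme, raw)
	}
}

func main() {
	for _, raw := range []string{"http://localhost:8443", "https://localhost:8443"} {
		if err := preflightServerURL(raw); err != nil {
			fmt.Println("FAIL:", err)
			continue
		}
		fmt.Println("ok:", raw)
	}
}
```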
Agents, CLI, MCP
- All three pre-flight-reject http:// URLs with fail-loud diagnostic
- CERTCTL_SERVER_CA_BUNDLE_PATH for private-CA trust
- CERTCTL_SERVER_TLS_INSECURE_SKIP_VERIFY for dev-only bypass
(loud warning on startup)
- install-agent.sh emits both vars as commented template lines
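On the client side, the two variables could feed a tls.Config along these lines. This is a sketch under stated assumptions: the function name buildClientTLSConfig, the "true" parsing of the skip-verify variable, the TLS 1.3 client floor, and the choice to replace rather than extend the system trust store are all illustrative guesses.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
	"log"
	"os"
)

// buildClientTLSConfig turns the two client-side settings into a tls.Config.
// (Hypothetical helper; the real agents/CLI/MCP code may be structured differently.)
func buildClientTLSConfig(caBundlePath string, insecureSkipVerify bool) (*tls.Config, error) {
	// Assumption: clients use the same TLS 1.3 floor as the server.
	cfg := &tls.Config{MinVersion: tls.VersionTLS13}

	if insecureSkipVerify {
		// Dev-only bypass: warn loudly on startup.
		log.Println("WARNING: TLS certificate verification is disabled; development use only")
		cfg.InsecureSkipVerify = true
		return cfg, nil
	}
	if caBundlePath == "" {
		return cfg, nil // nil RootCAs means the system trust store
	}

	pem, err := os.ReadFile(caBundlePath)
	if err != nil {
		return nil, fmt.Errorf("read CA bundle: %w", err)
	}
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(pem) {
		return nil, errors.New("CA bundle contains no usable PEM certificates")
	}
	cfg.RootCAs = pool // assumption: private-CA bundle replaces system roots
	return cfg, nil
}

func main() {
	cfg, err := buildClientTLSConfig(
		os.Getenv("CERTCTL_SERVER_CA_BUNDLE_PATH"),
		os.Getenv("CERTCTL_SERVER_TLS_INSECURE_SKIP_VERIFY") == "true", // assumed parsing
	)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("custom roots:", cfg.RootCAs != nil, "skip verify:", cfg.InsecureSkipVerify)
}
```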
docker-compose
- certctl-tls-init sidecar generates SAN-valid self-signed cert into
deploy/test/certs/ on first boot
- All demo-stack curls pin against ca.crt with --cacert
Helm chart
- Three TLS provisioning modes, exactly one required:
- server.tls.existingSecret (operator-supplied)
- server.tls.certManager.enabled (cert-manager integration)
- server.tls.selfSigned.enabled (eval only — not for production)
- server-certificate.yaml template for cert-manager mode
- helm install without a TLS source fails at template render with
a pointer to docs/tls.md
CI
- .github/workflows/ci.yml Helm Chart Validation step renders the
chart in both existingSecret and cert-manager modes, plus an
inverse guard-regression test asserting that helm template
refuses to render when no TLS source is configured. Previously
the single `helm template` invocation hit the certctl.tls.required
fail-loud guard and failed CI with exit 1. Four invocations now:
lint (existingSecret), template (existingSecret), template
(cert-manager), template (no args, must fail).
Integration tests
- deploy/test/integration_test.go stands up the Compose stack over
HTTPS, extracts the CA bundle, and exercises every certctl API
over https://localhost:8443
- All 34 integration subtests green (per Phase 8 local CI-parity)
Documentation
- New: docs/tls.md (provisioning patterns, rotation, SIGHUP reload)
- New: docs/upgrade-to-tls.md (one-step cutover, no-downgrade
warnings, fleet-roll sequencing)
- CHANGELOG.md: v2.2.0 "HTTPS Everywhere — The Irony" entry
(file heading unchanged; release tag is v2.0.47)
- All curls in docs/, examples/, deploy/helm/ guides use
https://localhost:8443 --cacert
Verification
- grep -rn "ListenAndServe[^T]" cmd/ internal/ → 0 hits
- grep -rn "\"http://" cmd/ internal/ → 2 benign hits (Caddy admin
API default, SSRF doc comment) — zero certctl endpoints
- Tasks #197–#206 (Phases 0–8) all closed in the tracker
Files: 65 changed, 3489 insertions, 372 deletions (pre-CI-fix).
Problem:
TestValidate_ValidConfig and TestValidate_AuthTypeNone construct a
SchedulerConfig without RetryInterval, leaving it at its zero value,
so Validate() fails the RetryInterval check at config.go:1086 with
'retry interval must be at least 1 second'. Both tests expect
success, so they fail on every run.
Root cause (re-derived from source, not inherited from memory):
git log -S 'retry interval must be at least' --source --all shows
the validation was introduced in 0200c7f (I-001, RetryFailedJobs
scheduler wiring). git log -- internal/config/config_test.go shows
the test file was last touched in 7382e5f, which predates 0200c7f.
I-001 added a new Validate() rule without updating the two positive
test fixtures — a gap in I-001's verification pass.
This is NOT C-001 fallout. The config_test.go file was untouched by
the C-001 closure commits 91642e2 and 4696116. The failure surfaced
during the full test suite run after C-001 landed because no one
had run 'go test ./internal/config/...' since I-001.
Scope:
- internal/config/config_test.go (2 fixtures: TestValidate_ValidConfig,
TestValidate_AuthTypeNone).
Implementation:
Added 'RetryInterval: 5 * time.Minute' to both SchedulerConfig
literals. 5 minutes matches the I-001 default at config.go:818:
RetryInterval: getEnvDuration("CERTCTL_SCHEDULER_RETRY_INTERVAL", 5*time.Minute)
The other two TestValidate_* tests (InvalidAuthType,
APIKeyAuth_MissingSecret) are unaffected because they expect
Validate() to error at the auth-type check (line 1052) or the
auth-secret check (line 1057), both of which fire before the
RetryInterval check at line 1086.
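The check ordering can be illustrated with a reduced model. The struct layout, the auth-type strings "none" and "api-key", the auth error messages, and the validFixture helper are hypothetical stand-ins; only the retry-interval message and the order of the three checks are taken from this commit.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Reduced, hypothetical stand-ins for the real config types.
type AuthConfig struct {
	Type   string
	Secret string
}

type SchedulerConfig struct {
	RetryInterval time.Duration
}

type Config struct {
	Auth      AuthConfig
	Scheduler SchedulerConfig
}

// Validate runs the checks in the order described above:
// auth type, then auth secret, then retry interval.
func (c *Config) Validate() error {
	switch c.Auth.Type {
	case "none", "api-key": // assumed set of valid values
	default:
		return fmt.Errorf("invalid auth type %q", c.Auth.Type)
	}
	if c.Auth.Type == "api-key" && c.Auth.Secret == "" {
		return errors.New("auth secret is required")
	}
	if c.Scheduler.RetryInterval < time.Second {
		return errors.New("retry interval must be at least 1 second")
	}
	return nil
}

// validFixture mirrors the repaired positive fixtures: RetryInterval is
// set explicitly to the 5-minute default instead of being left at zero.
func validFixture() Config {
	return Config{
		Auth:      AuthConfig{Type: "none"},
		Scheduler: SchedulerConfig{RetryInterval: 5 * time.Minute},
	}
}

func main() {
	fixed := validFixture()
	fmt.Println("fixed fixture:", fixed.Validate())

	broken := Config{Auth: AuthConfig{Type: "none"}} // RetryInterval omitted
	fmt.Println("old fixture:", broken.Validate())

	negative := Config{Auth: AuthConfig{Type: "bogus"}} // fails before the retry check
	fmt.Println("negative fixture:", negative.Validate())
}
```

In this model the negative fixture never reaches the RetryInterval check, which is why tests that expect an auth error keep passing without the new field.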
Verification:
- go test -count=1 -run 'TestValidate_' ./internal/config/...: PASS
- go test -short -count=1 ./...: all packages PASS
- go vet ./...: exit 0
Residual:
None. This is a pure test-fixture fix — production code is unchanged.
Commit:
0200c7f (I-001) should have included this edit. Attributed here for
traceability.
Close coverage gaps identified by dual-audit (qualitative + quantitative).
New test files for config (0%→98%), router (0%→100%), handler validation,
health, audit, response helpers, webhook notifier (0%→88%), email notifier,
middleware (recovery, rate limiter), domain profile, service nil-safety,
config helpers, issuer bootstrap, and server bootstrap wiring. Expanded
existing tests for ACME (34%→42%), step-ca (42%→52%), F5, SSH, agent
(43%→63%), scheduler (88%→99%), renewal service, and issuerfactory.
All tests pass: go test -short, go vet, go test -race clean.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>