Closes Q-1 (cat-s3-58ce7e9840be) — 37 t.Skip / testing.Short() sites
across 9 test files audited. Per-site verdict matrix:
- cmd/agent/verify_test.go (1 site): defensive guard against unreachable
httptest.NewTLSServer code path. Document-skip with closure comment.
- deploy/test/qa_test.go (11 sites): file already gated by `//go:build qa`
tag. The 11 t.Skip("Requires X — manual test") markers are runtime
second-line guards for operators who run -tags qa against a stack
missing the required external service. File-level header comment
block added explaining the manual-test convention.
- deploy/test/healthcheck_test.go (5 sites): 3 docker-availability +
1 testing.Short + 1 hard-skip for not-yet-wired runtime probe
(image-spec contract above already covers the audit-flagged
regression). All correctly gated; file-level header comment block
added explaining each.
- deploy/test/integration_test.go (5 sites): in-flight-state guards
(poll-with-skip after 90s polling for agent-online, inter-test
Phase04→Phase07 ordering, scheduler-tick race for discovered certs,
inter-test issuer fallthrough, defensive PEM-empty assertion).
Each site now has a closure comment explaining why skip is the
right choice rather than fail (upstream phase already surfaces the
real failure; skipping prevents masking root cause behind cascading
noise).
- internal/repository/postgres/{testutil,seed,repo}_test.go (5 sites):
testing.Short() gates for testcontainers-backed live PostgreSQL
integration tests. All correctly gated; closure comments added
naming the run command.
- internal/connector/notifier/email/email_test.go (2 sites):
anti-fixture assertions (test asserts SMTP dial fails; if a captive
portal black-holes the call to success, skip rather than false-pass).
Closure comments added explaining the fixture assumption.
- internal/connector/target/iis/iis_test.go (2 sites): platform-gated
skip for powershell.exe absence on non-Windows hosts. Mirrors the
production iis_connector.go LookPath guard. Closure comments added.
Total: 17 closure comments anchor the 37 skip sites (some sites share a
single block-level comment). All skips remain in place; the change is
purely documentation. The audit recommendation was "audit each skip and
decide" — for these 37, the decision is uniformly **document-skip**:
the gating is correct, the t.Skip messages name the missing precondition,
and the closure comments now pin the rationale for future readers.
See coverage-gap-audit-2026-04-24-v5/unified-audit.md
cat-s3-58ce7e9840be for closure rationale.
Breaking change release. Plaintext HTTP listener removed. The certctl
control plane now terminates TLS 1.3 on :8443 via
http.Server.ListenAndServeTLS. No CERTCTL_TLS_ENABLED=false escape
hatch. No dual-listener mode. One-step cutover per docs/upgrade-to-tls.md.
Server
- cmd/server/tls.go: certHolder with SIGHUP hot-reload + atomic cert
swap, buildServerTLSConfig (TLS 1.3 min, GetCertificate callback),
preflightServerTLS validation
- cmd/server/main.go: ListenAndServeTLS in place of ListenAndServe,
watchSIGHUP wiring, cert/key path config threading
- tls_test.go: 418-line regression coverage of reload, preflight,
callback behavior, SAN validation
Config
- CERTCTL_TLS_CERT_PATH / CERTCTL_TLS_KEY_PATH (required)
- Plaintext rejection: agents/CLI/MCP pre-flight-fail on http://
URLs with a pointer to docs/upgrade-to-tls.md
Agents, CLI, MCP
- All three pre-flight-reject http:// URLs with fail-loud diagnostic
- CERTCTL_SERVER_CA_BUNDLE_PATH for private-CA trust
- CERTCTL_SERVER_TLS_INSECURE_SKIP_VERIFY for dev-only bypass
(loud warning on startup)
- install-agent.sh emits both vars as commented template lines
docker-compose
- certctl-tls-init sidecar generates SAN-valid self-signed cert into
deploy/test/certs/ on first boot
- All demo-stack curls pin against ca.crt with --cacert
Helm chart
- Three TLS provisioning modes, exactly one required:
- server.tls.existingSecret (operator-supplied)
- server.tls.certManager.enabled (cert-manager integration)
- server.tls.selfSigned.enabled (eval only — not for production)
- server-certificate.yaml template for cert-manager mode
- helm install without a TLS source fails at template render with
a pointer to docs/tls.md
CI
- .github/workflows/ci.yml Helm Chart Validation step renders the
chart in both existingSecret and cert-manager modes, plus an
inverse guard-regression test that asserts helm template MUST
refuse to render when no TLS source is configured. Previously
the single `helm template` invocation hit the certctl.tls.required
fail-loud guard and exit-1'd CI. Four invocations now: lint
(existingSecret), template (existingSecret), template
(cert-manager), template (no args — must fail).
Integration tests
- deploy/test/integration_test.go stands up the Compose stack over
HTTPS, extracts the CA bundle, and exercises every certctl API
over https://localhost:8443
- All 34 integration subtests green (per Phase 8 local CI-parity)
Documentation
- New: docs/tls.md (provisioning patterns, rotation, SIGHUP reload)
- New: docs/upgrade-to-tls.md (one-step cutover, no-downgrade
warnings, fleet-roll sequencing)
- CHANGELOG.md: v2.2.0 "HTTPS Everywhere — The Irony" entry
(file heading unchanged; release tag is v2.0.47)
- All curls in docs/, examples/, deploy/helm/ guides use
https://localhost:8443 --cacert
Verification
- grep -rn "ListenAndServe[^T]" cmd/ internal/ → 0 hits
- grep -rn "\"http://" cmd/ internal/ → 2 benign hits (Caddy admin
API default, SSRF doc comment) — zero certctl endpoints
- Tasks #197–#206 (Phases 0–8) all closed in the tracker
Files: 65 changed, 3489 insertions, 372 deletions (pre-CI-fix).
The generateTestCert() function previously returned &x509.Certificate{Raw: []byte("test")},
which is not a valid DER-encoded certificate. Replace with a proper self-signed certificate
generator using ECDSA P-256 that creates valid X.509 certificates for testing.
Added imports: crypto/ecdsa, crypto/elliptic, crypto/rand, crypto/x509/pkix, math/big
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix undefined tls.Listener in verify_test.go (type doesn't exist in
crypto/tls); use server.Listener.Addr() and server.TLS.Certificates
- Fix mockJobRepository missing Delete/ListByStatus/ListByCertificate/
UpdateStatus/GetPendingJobs methods required by JobRepository interface
- Fix mockAuditService type mismatch: NewVerificationService expects
*AuditService (concrete), not a mock; use real AuditService with mock
repo following existing testutil_test.go patterns
- Fix List() signature mismatch (had extra filter param)
- Add nil-safe logger checks in verify.go to prevent panics in tests
- Remove unused imports (crypto/tls, bytes, repository)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
M25: After deploying a certificate, the agent probes the live TLS
endpoint and compares SHA-256 fingerprints to verify the correct cert
is being served. Best-effort — failures don't block deployments.
New endpoints: POST /jobs/{id}/verify, GET /jobs/{id}/verification.
Migration 000008 adds verification columns to jobs table.
M26: Traefik target connector (file provider, auto-reload) and Caddy
target connector (dual-mode: admin API hot-reload or file-based).
Both wired into agent dispatch.
Also: restructured README to highlight supported integrations (issuers,
targets, notifiers) earlier, moved API/CLI/MCP sections lower. Updated
all docs (features, connectors, architecture, testing guide, why-certctl)
and fixed integration tests for 18-param RegisterHandlers signature.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>