Commit Graph

5 Commits

Author SHA1 Message Date
shankar0123 a172b6ed3b Bundle B CI follow-up: G-3 env-var docs + M-028 closure (final 5 SA1019 sites)
Two CI failures on master after Bundle B merge:

1. Frontend Build / G-3 env-var docs guardrail
   Bundle B introduced CERTCTL_RATE_LIMIT_PER_USER_RPS and
   CERTCTL_RATE_LIMIT_PER_USER_BURST without adding them to
   docs/features.md. The guardrail step that scans Go source for
   getEnv* calls and asserts each appears in a doc page failed.
   Fix: docs/features.md rate-limit section extended with both new
   env vars + a paragraph explaining the per-key keying contract
   from M-025.

2. Go Build & Test / staticcheck SA1019 hits (6 errors)
   The CI workflow runs staticcheck without continue-on-error. Bundle
   7 opened M-028 to track 6 deprecated-API sites; Bundle 9 closed 1
   of them (the elliptic.Marshal in local.go) but kept a deliberate
   regression-oracle reference in bundle9_coverage_test.go protected
   only by golangci-lint's //nolint comment — staticcheck-as-CLI does
   not honor that, only its native //lint:ignore directive.

   Closure of remaining 5 sites:
     cmd/server/main_test.go:47, 163, 192, 465 — 4 × middleware.NewAuth
       migrated to middleware.NewAuthWithNamedKeys with explicit
       NamedAPIKey entries. The auth=none case at line 465 maps to a
       nil NamedAPIKey slice (no-op pass-through, matches the
       NewAuthWithNamedKeys contract for empty input). Audit count was
       3; recon found a 4th at line 465 that was missed.
     internal/api/handler/scep.go:266 — csr.Attributes is a real RFC
       2985 §5.4.1 challengePassword carve-out. Go's stdlib deprecation
       note explicitly applies only to OID 1.2.840.113549.1.9.14
       (requestedExtensions), NOT to OID 1.2.840.113549.1.9.7
       (challengePassword), for which there is no non-deprecated
       stdlib API. Suppressed with native //lint:ignore SA1019 +
       comment block citing the RFC.
     internal/connector/issuer/local/bundle9_coverage_test.go:342 —
       deliberate regression-oracle that calls elliptic.Marshal to
       prove the new crypto/ecdh path is byte-identical. Comment
       converted from //nolint:staticcheck to native //lint:ignore
       SA1019 so staticcheck-as-CLI honors the suppression.

Audit deliverables:
  cowork/comprehensive-audit-2026-04-25/audit-report.md: M-028 box
    flipped [x]; score 30/55 -> 31/55 (Medium 12/27 -> 13/27).
  cowork/comprehensive-audit-2026-04-25/findings.yaml: M-028 status
    partial_closed -> closed with closure note.

Verification:
  go test -count=1 -short ./cmd/server ./internal/api/handler
    ./internal/connector/issuer/local ./internal/api/middleware
    ./internal/config — all green.
  staticcheck on each changed package — 0 SA1019 hits.

Bundle C had M-028 in scope; this CI-fix lift moves it forward so
master CI goes green immediately. Bundle C scope adjusts to remove
M-028 and focuses on M-006 / M-015 / M-016 / M-019 / M-020 plus the
M-007 / M-008 coverage gaps.
2026-04-26 23:35:13 +00:00
shankar0123 52248be717 v2.0.47: HTTPS Everywhere — TLS-only control plane, agents/CLI/MCP
Breaking change release. Plaintext HTTP listener removed. The certctl
control plane now terminates TLS 1.3 on :8443 via
http.Server.ListenAndServeTLS. No CERTCTL_TLS_ENABLED=false escape
hatch. No dual-listener mode. One-step cutover per docs/upgrade-to-tls.md.

Server
- cmd/server/tls.go: certHolder with SIGHUP hot-reload + atomic cert
  swap, buildServerTLSConfig (TLS 1.3 min, GetCertificate callback),
  preflightServerTLS validation
- cmd/server/main.go: ListenAndServeTLS in place of ListenAndServe,
  watchSIGHUP wiring, cert/key path config threading
- tls_test.go: 418-line regression coverage of reload, preflight,
  callback behavior, SAN validation

Config
- CERTCTL_TLS_CERT_PATH / CERTCTL_TLS_KEY_PATH (required)
- Plaintext rejection: agents/CLI/MCP pre-flight-fail on http://
  URLs with a pointer to docs/upgrade-to-tls.md

Agents, CLI, MCP
- All three pre-flight-reject http:// URLs with fail-loud diagnostic
- CERTCTL_SERVER_CA_BUNDLE_PATH for private-CA trust
- CERTCTL_SERVER_TLS_INSECURE_SKIP_VERIFY for dev-only bypass
  (loud warning on startup)
- install-agent.sh emits both vars as commented template lines

docker-compose
- certctl-tls-init sidecar generates SAN-valid self-signed cert into
  deploy/test/certs/ on first boot
- All demo-stack curls pin against ca.crt with --cacert

Helm chart
- Three TLS provisioning modes, exactly one required:
  - server.tls.existingSecret (operator-supplied)
  - server.tls.certManager.enabled (cert-manager integration)
  - server.tls.selfSigned.enabled (eval only — not for production)
- server-certificate.yaml template for cert-manager mode
- helm install without a TLS source fails at template render with
  a pointer to docs/tls.md

CI
- .github/workflows/ci.yml Helm Chart Validation step renders the
  chart in both existingSecret and cert-manager modes, plus an
  inverse guard-regression test that asserts helm template MUST
  refuse to render when no TLS source is configured. Previously
  the single `helm template` invocation hit the certctl.tls.required
  fail-loud guard and exit-1'd CI. Four invocations now: lint
  (existingSecret), template (existingSecret), template
  (cert-manager), template (no args — must fail).

Integration tests
- deploy/test/integration_test.go stands up the Compose stack over
  HTTPS, extracts the CA bundle, and exercises every certctl API
  over https://localhost:8443
- All 34 integration subtests green (per Phase 8 local CI-parity)

Documentation
- New: docs/tls.md (provisioning patterns, rotation, SIGHUP reload)
- New: docs/upgrade-to-tls.md (one-step cutover, no-downgrade
  warnings, fleet-roll sequencing)
- CHANGELOG.md: v2.2.0 "HTTPS Everywhere — The Irony" entry
  (file heading unchanged; release tag is v2.0.47)
- All curls in docs/, examples/, deploy/helm/ guides use
  https://localhost:8443 --cacert

Verification
- grep -rn "ListenAndServe[^T]" cmd/ internal/ → 0 hits
- grep -rn "\"http://" cmd/ internal/ → 2 benign hits (Caddy admin
  API default, SSRF doc comment) — zero certctl endpoints
- Tasks #197–#206 (Phases 0–8) all closed in the tracker

Files: 65 changed, 3489 insertions, 372 deletions (pre-CI-fix).
2026-04-20 03:43:10 +00:00
shankar0123 b813660c74 security: require SCEP challenge password when SCEP enabled (fixes H-2)
Problem (CWE-306 Missing Authentication for Critical Function):
internal/service/scep.go PKCSReq skipped the shared-secret check when
s.challengePassword was empty. An unconfigured-but-enabled SCEP server
accepted any unauthenticated client reaching /scep and issued a
certificate against the configured issuer for any CSR with a valid
signature. No audit trail distinguished authenticated from
unauthenticated enrollments. This matches the two-layer fail-closed
pattern already used for C-2 (f549a7a): reject at startup AND reject
at the service boundary.

Fix (two layers, defense-in-depth):

Layer 1 — startup pre-flight in cmd/server/main.go:
  preflightSCEPChallengePassword returns a non-nil error when SCEP is
  enabled and CERTCTL_SCEP_CHALLENGE_PASSWORD is empty. main logs and
  os.Exit(1)s before the SCEP service is constructed. Disabled SCEP is
  unaffected. The helper is unit-testable in isolation.

Layer 2 — service-layer rejection in internal/service/scep.go:
  PKCSReq refuses enrollment when s.challengePassword == "" even though
  main already blocks this state — protects future call sites (tests,
  library reuse, a REST-over-HTTPS wrapper). When a secret is
  configured, the comparison now uses crypto/subtle.ConstantTimeCompare
  so response time does not leak the configured secret through a
  short-circuiting byte compare.

Files:
- cmd/server/main.go: preflightSCEPChallengePassword helper; call site
  inside the `if cfg.SCEP.Enabled` block before issuer lookup; fatal
  slog error references CWE-306 and names the env var so operators can
  diagnose the startup failure without reading code.
- cmd/server/main_test.go: TestPreflightSCEPChallengePassword with five
  table-driven subtests (disabled empty, disabled set, enabled empty
  rejected, enabled set, single-char boundary). The enabled-empty case
  asserts the error string contains both CERTCTL_SCEP_CHALLENGE_PASSWORD
  and CWE-306 so the log message remains actionable.
- internal/config/config.go: SCEPConfig.ChallengePassword godoc now
  states the field is REQUIRED when SCEP.Enabled and cross-references
  preflightSCEPChallengePassword.
- internal/service/scep.go: imports crypto/subtle; PKCSReq rewritten
  with the two-layer check; comment block cites H-2 / CWE-306 and the
  constant-time rationale.
- internal/service/scep_test.go: existing tests that relied on the
  vulnerable empty-password path now configure a secret on both sides.
  TestSCEPService_PKCSReq_ChallengePassword_NotRequired is replaced by
  TestSCEPService_PKCSReq_ChallengePassword_EmptyServerConfigRejected
  which iterates ["", "any-value", "guess"] against an unconfigured
  server and asserts "not configured" in the error. A new
  TestSCEPService_PKCSReq_ChallengePassword_ConstantTimeLengthIndependence
  exercises same-prefix-longer and wrong-case inputs to guard against a
  regression from ConstantTimeCompare to a short-circuiting byte compare.
- internal/service/m11c_crypto_enforcement_test.go: four tests
  (RejectsWeakKey, AcceptsStrongKey, MaxTTL_ForwardedToIssuer,
  NoProfileRepo_PassesThrough) constructed NewSCEPService with an empty
  challenge password and exercised PKCSReq through the now-rejected
  vulnerable path. All four now configure "secret123" on both sides with
  an inline H-2 comment; the crypto/MaxTTL/profile behavior they assert
  is unchanged.

Wire-format / behavioral invariants preserved:
- RFC 8894 SCEP handler is untouched (internal/api/handler/scep.go and
  internal/pkcs7/*): GetCACaps/GetCACert responses, PKIOperation request
  parsing, and the PKCS#7 certs-only response format are byte-identical.
- RFC 7030 EST handler is untouched
  (internal/api/handler/est.go + internal/pkcs7/*).
- Revocation idempotency composite key (H-1, migration 000012) untouched.
- AES-256-GCM config encryption (C-2) untouched.
- CRL DER bytes and OCSP response bytes unchanged.

Verification:
- go build ./...              silent success
- go vet ./...                silent success
- go test -race -count=1 ./internal/service/ ./cmd/server/
  ./internal/api/handler/ ./internal/integration/    all OK
- Coverage with comfortable headroom over CI gates:
    service     67.8% (gate 55%)
    handler     79.0% (gate 60%)
    domain      92.7% (gate 40%)
    middleware  80.0% (gate 30%)
    cmd/server  1.6%  (preflightSCEPChallengePassword: 100%)
  internal/service/scep.go PKCSReq statement coverage: 100%.
- rg sweeps: no `s.challengePassword != ""` remains;
  no `challengePassword != s.challengePassword` remains.

Operational note: operators with SCEP enabled but no challenge password
set will see a fatal startup error and a log line citing
CERTCTL_SCEP_CHALLENGE_PASSWORD and CWE-306 after upgrading. This is the
intended fail-closed behavior. Fix by either setting the env var to a
non-empty shared secret or setting CERTCTL_SCEP_ENABLED=false.

Audit report: certctl-audit-report.md (revision 5) logs this under
H-2 Resolution Log.
2026-04-16 22:22:51 +00:00
shankar0123 370f856725 fix: resolve 8 staticcheck lint errors in test files
SA1029: use typed context key instead of string in main_test.go
S1039: remove unnecessary fmt.Sprintf in validation_test.go
SA4023: fix unreachable nil check on concrete error type
SA4006: fix unused variable assignments in stepca_test.go (4 occurrences)
SA4000: fix duplicate expression in ssh_test.go (BEGIN vs END CERTIFICATE)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-09 23:27:57 -04:00
shankar0123 7382e5f03b test: comprehensive test gap closure across 24 packages
Close coverage gaps identified by dual-audit (qualitative + quantitative).
New test files for config (0%→98%), router (0%→100%), handler validation,
health, audit, response helpers, webhook notifier (0%→88%), email notifier,
middleware (recovery, rate limiter), domain profile, service nil-safety,
config helpers, issuer bootstrap, and server bootstrap wiring. Expanded
existing tests for ACME (34%→42%), step-ca (42%→52%), F5, SSH, agent
(43%→63%), scheduler (88%→99%), renewal service, and issuerfactory.

All tests pass: go test -short, go vet, go test -race clean.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-09 23:09:40 -04:00