certctl/CHANGELOG.md
shankar0123 9c1d446e40 fix(security,config): remove unimplemented JWT auth-type, close silent downgrade (G-1)
The pre-G-1 config validator accepted CERTCTL_AUTH_TYPE=jwt and the
startup log faithfully echoed 'authentication enabled type=jwt'.
Reasonable people read that and concluded JWT auth was on. It wasn't.
The auth-middleware wiring at cmd/server/main.go unconditionally routed
every request through the api-key bearer middleware regardless of
cfg.Auth.Type. So CERTCTL_AUTH_TYPE=jwt quietly compared the incoming
'Authorization: Bearer <token>' against whatever string the operator put
in CERTCTL_AUTH_SECRET — real JWT clients got 401, and operators who
treated CERTCTL_AUTH_SECRET as a *signing* secret (because they thought
they were configuring JWT) had effectively handed an attacker an api-key.
A security finding masquerading as a config option.

We chose the audit-recommended structural fix: remove the option, fail
fast at startup, and add the gateway-fronting pattern as the documented
forward path. Implementing JWT middleware would have meant jwks vs
static-secret rotation, claim mapping, expiry enforcement, audience and
issuer validation, key rollover semantics, and regression coverage at the
same depth as the existing api-key path — a feature, not a fix. Operators
who genuinely need JWT/OIDC front certctl with an authenticating gateway
(oauth2-proxy / Envoy ext_authz / Traefik ForwardAuth / Pomerium /
Authelia) and run the upstream certctl with CERTCTL_AUTH_TYPE=none. Same
shape works on docker-compose and Helm.

The change is comprehensive across 6 phases: every surface that
mentioned 'jwt' as a certctl-auth-type is updated, plus structural
backstops (typed enum, runtime guard, helm template validation, CI grep
guard) so the lie can't reappear.

Files changed:

Phase 1 — production code (typed enum + jwt removal):
- internal/config/config.go: AuthType typed alias + AuthTypeAPIKey /
  AuthTypeNone constants + ValidAuthTypes() helper. Validate() routes
  literal 'jwt' through a dedicated multi-line diagnostic naming the
  authenticating-gateway pattern, then cross-checks against
  ValidAuthTypes(). Secret-required branch simplified to api-key-only.
  Field comment on AuthConfig.Type rewritten to drop jwt and point at
  the gateway pattern.
- internal/api/middleware/middleware.go: AuthConfig.Type field comment
  references the typed config.AuthType constants.
- internal/api/handler/health.go: same treatment for HealthHandler.AuthType.
- cmd/server/main.go: defense-in-depth runtime switch immediately after
  config.Load() — exits 1 on any unsupported auth-type that bypassed the
  validator. Auth-disabled startup log explicitly names the
  authenticating-gateway pattern.

Phase 2 — tests (Red→Green, contract pinning):
- internal/config/config_test.go: TestValidate_JWTAuth_RejectedDedicated
  (two table rows pinning the dedicated G-1 error fires regardless of
  whether Secret is set), TestValidAuthTypesDoesNotContainJWT (property
  guard against future re-introduction),
  TestValidAuthTypesIsExactly_APIKey_None (allowed-set contract),
  TestValidate_GenericInvalidAuthType (pins non-jwt invalid values still
  hit the generic invalid-auth-type error). Removed the prior
  TestValidate_JWTAuth_MissingSecret happy-path since its premise is
  inverted post-G-1.
- internal/api/handler/health_test.go: removed
  TestAuthInfo_ReturnsAuthType_JWT (which baked the silent-downgrade lie
  into the regression suite). Pre-existing _APIKey test continues to
  cover the api-key happy path.

Phase 3 — spec, docs, env templates:
- api/openapi.yaml: auth_type enum dropped to [api-key, none] with
  inline comment naming the G-1 closure.
- .env.example (root): CERTCTL_AUTH_TYPE comment block rewritten to drop
  jwt and point at the gateway pattern; secret-required conditional
  simplified to api-key-only.
- docs/architecture.md: middleware-stack bullet rewritten to drop the
  JWT mention; new H3 'Authenticating-gateway pattern (JWT, OIDC, mTLS)'
  section explaining the design rationale and listing oauth2-proxy /
  Envoy ext_authz / Traefik ForwardAuth / Pomerium / Authelia / Caddy
  forward_auth / Apache mod_auth_openidc / nginx auth_request as the
  standard fronting options.
- docs/upgrade-to-v2-jwt-removal.md (new ~125 lines): migration guide
  with preconditions, what-changes, both recovery paths, complete
  docker-compose oauth2-proxy walkthrough, Traefik ForwardAuth and Envoy
  ext_authz patterns, rollback posture.

Phase 4 — Helm chart (template validation + docs):
- deploy/helm/certctl/templates/_helpers.tpl: new certctl.validateAuthType
  helper mirroring the existing certctl.tls.required pattern. Fails
  template render on any server.auth.type outside {api-key, none} with
  a multi-line diagnostic.
- deploy/helm/certctl/templates/server-deployment.yaml,
  server-configmap.yaml, server-secret.yaml: invoke the helper at the
  top of each template that depends on .Values.server.auth.type.
- deploy/helm/certctl/values.yaml: auth: block comment expanded with the
  G-1 rationale and gateway-pattern cross-reference.
- deploy/helm/CHART_SUMMARY.md: server.auth.type table row now surfaces
  the allowed set and points at the upgrade doc.
- deploy/helm/certctl/README.md: new 'JWT / OIDC via authenticating
  gateway' section with a Kubernetes-flavored oauth2-proxy + certctl
  walkthrough.

Phase 5 — release surface:
- CHANGELOG.md: new [unreleased] top entry with Breaking / Removed /
  Added / Changed sections; explicit pointer at
  docs/upgrade-to-v2-jwt-removal.md from the Breaking subsection.

Phase 6 — CI guardrail:
- .github/workflows/ci.yml: new 'Forbidden auth-type literal regression
  guard (G-1)' step. Scoped patterns catch the actual regression shapes
  (map literal, slice literal, switch case, OpenAPI enum, env-file
  default, AuthType('jwt') cast). Comments and the dedicated rejection
  branch are intentionally exempt; connector-package JWT references
  (Google OAuth2 / step-ca) are exempt as out-of-scope external
  protocols. Verified locally: the guard passes on the actual tree and
  fires on all 4 synthetic regression patterns.

Out of scope (explicitly untouched):
- internal/connector/discovery/gcpsm/gcpsm.go — Google OAuth2 service-
  account JWT (external protocol).
- internal/connector/issuer/googlecas/googlecas.go — same.
- internal/connector/issuer/stepca/stepca.go — step-ca's provisioner
  one-time-token JWT for /sign API.
- docs/test-env.md, docs/connectors.md, docs/features.md — describe
  external CAs' use of JWT, not certctl's auth shape.
- Implementing actual JWT middleware. Feature, not a fix.

Verification (all gates pass):
- go build ./... — clean
- go vet ./... — clean
- go test -short ./... — every package green
- go test -short -race ./internal/config/... ./internal/api/... — clean
- govulncheck ./... — no vulnerabilities in our code
- helm lint deploy/helm/certctl/ — clean
- helm template with auth.type=api-key — renders OK
- helm template with auth.type=none — renders OK
- helm template with auth.type=jwt — fails with validateAuthType
  diagnostic (exit 1)
- python3 yaml.safe_load on api/openapi.yaml — parses
- CI guardrail mirror — clean on real tree, fires on all 4 synthetic
  regression patterns
- Smoke test: 'CERTCTL_AUTH_TYPE=jwt ./certctl-server' exits non-zero
  with: 'Failed to load configuration: CERTCTL_AUTH_TYPE=jwt is no
  longer accepted (G-1 silent auth downgrade): no JWT middleware ships
  with certctl. To use JWT/OIDC, run an authenticating gateway
  (oauth2-proxy / Envoy ext_authz / Traefik ForwardAuth / Pomerium) in
  front of certctl and set CERTCTL_AUTH_TYPE=none on the upstream.
  See docs/architecture.md "Authenticating-gateway pattern" and
  docs/upgrade-to-v2-jwt-removal.md for the migration walkthrough'

config pkg coverage: ValidAuthTypes 100%, Validate 94.7%, total 75.5%.

Refs: coverage-gap-audit-2026-04-24-v5/unified-audit.md
      §2 P1 cluster, cat-g-jwt_silent_auth_downgrade
      Audit recommendation followed verbatim: 'Remove jwt from
      validAuthTypes until middleware ships'.
2026-04-25 00:22:23 +00:00


Changelog

All notable changes to certctl are documented in this file. Dates use ISO 8601. Versions follow Semantic Versioning.

[unreleased] — 2026-04-24

G-1: JWT silent auth downgrade — closed end-to-end

Before G-1, the config validator accepted CERTCTL_AUTH_TYPE=jwt and the startup log faithfully echoed "authentication enabled" "type"="jwt". Reasonable people read that and concluded JWT was on. It wasn't. The auth-middleware wiring at cmd/server/main.go unconditionally routed every request through the api-key bearer middleware regardless of cfg.Auth.Type. So CERTCTL_AUTH_TYPE=jwt quietly compared the incoming Authorization: Bearer <token> against whatever string the operator put in CERTCTL_AUTH_SECRET. Real JWT clients got 401, and operators who treated CERTCTL_AUTH_SECRET as a signing secret (because they thought they were configuring JWT) had effectively handed an attacker an api-key: anyone holding that string could present it verbatim as a bearer token. A security finding masquerading as a config option.

We chose to remove the option rather than ship JWT middleware, which is the audit-recommended structural fix that closes the hazard. Operators who actually need JWT/OIDC front certctl with an authenticating gateway (oauth2-proxy / Envoy ext_authz / Traefik ForwardAuth / Pomerium / Authelia) and run the upstream certctl with CERTCTL_AUTH_TYPE=none. The same pattern works on docker-compose and Helm.
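
To make the downgrade concrete, here is a minimal Go sketch of the pre-G-1 behavior. It is not certctl's actual code: apiKeyBearer, buildHandler, and statusFor are hypothetical names, and the only thing taken from the description above is the shape of the bug (the auth type is accepted but never consulted, so every request goes through a plain bearer-equals-secret comparison).

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// apiKeyBearer is a hypothetical stand-in for an api-key middleware: it lets a
// request through only if the bearer token equals the configured secret.
func apiKeyBearer(secret string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		token, ok := strings.CutPrefix(r.Header.Get("Authorization"), "Bearer ")
		if !ok || subtle.ConstantTimeCompare([]byte(token), []byte(secret)) != 1 {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// buildHandler mimics the pre-G-1 wiring: authType is accepted but ignored,
// so "jwt" gets exactly the same middleware as "api-key".
func buildHandler(authType, secret string) http.Handler {
	_ = authType // never consulted: this is the silent downgrade
	ok := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	return apiKeyBearer(secret, ok)
}

// statusFor sends one request with the given bearer token and returns the
// HTTP status code the handler produced.
func statusFor(h http.Handler, bearer string) int {
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	req.Header.Set("Authorization", "Bearer "+bearer)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	h := buildHandler("jwt", "my-signing-secret")
	// A JWT-shaped token is just a string that does not equal the secret.
	fmt.Println("JWT-shaped token:", statusFor(h, "header.payload.signature"))
	// The "signing secret" itself works as a bearer credential.
	fmt.Println("raw secret as bearer:", statusFor(h, "my-signing-secret"))
}
```

In this sketch the JWT-shaped token is rejected while the raw secret is accepted, which is the mismatch between what operators believed they had configured and what was enforced.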

Breaking Changes

  • CERTCTL_AUTH_TYPE=jwt is no longer accepted. Pre-G-1 the value was silently downgraded to api-key middleware. Post-G-1 the server fails at startup with a dedicated diagnostic naming the authenticating-gateway pattern. Operators with this in their env block must either switch to api-key (if they were de facto using api-key auth all along — same Bearer token continues to work) or switch to none and front certctl with an oauth2-proxy / Envoy / Traefik / Pomerium gateway. See docs/upgrade-to-v2-jwt-removal.md.
  • Helm chart server.auth.type=jwt now fails at helm install / helm upgrade template time. New certctl.validateAuthType template helper runs on every template that depends on .Values.server.auth.type (server-deployment.yaml, server-configmap.yaml, server-secret.yaml) and fails the render with a pointer at the gateway-fronting pattern.
  • OpenAPI spec auth_type enum no longer includes jwt. API consumers checking /api/v1/auth/info against the spec will see a smaller enum.

Removed

  • Documented references to JWT in the certctl auth surface (config docblocks, middleware/health-handler comments, .env.example, docs/architecture.md middleware-stack bullet). Connector-level JWT references (Google OAuth2 service-account JWT in internal/connector/discovery/gcpsm/, internal/connector/issuer/googlecas/; step-ca's provisioner one-time-token JWT in internal/connector/issuer/stepca/) are unrelated and untouched — those are external-protocol uses, not certctl's own auth shape.

Added

  • config.AuthType typed alias with AuthTypeAPIKey / AuthTypeNone exported constants. Single source of truth for the allowed set across the validator, the runtime defense-in-depth switch in main.go, and the helm chart's validateAuthType helper.
  • config.ValidAuthTypes() helper returning the complete allowed set; pinned by a property test (TestValidAuthTypesDoesNotContainJWT) that fails the build if "jwt" is ever re-added to the slice.
  • Defense-in-depth runtime guard in cmd/server/main.go immediately after config.Load() — a switch config.AuthType(cfg.Auth.Type) that exits 1 if the validator was bypassed (test harness, alt config loader, env-var rebinding).
  • certctl.validateAuthType Helm template helper mirroring the existing certctl.tls.required pattern. Fails template render on any server.auth.type outside {api-key, none}.
  • docs/architecture.md "Authenticating-gateway pattern (JWT, OIDC, mTLS)" section explaining the design rationale for the narrow in-process auth surface and listing oauth2-proxy / Envoy ext_authz / Traefik ForwardAuth / Pomerium / Authelia / Caddy forward_auth / Apache mod_auth_openidc / nginx auth_request as the standard fronting options.
  • docs/upgrade-to-v2-jwt-removal.md migration guide. Same shape as docs/upgrade-to-tls.md. Walks through the dedicated startup error, both recovery paths (api-key vs gateway-fronting), a complete docker-compose oauth2-proxy walkthrough, Traefik ForwardAuth and Envoy ext_authz patterns, and rollback posture.
  • deploy/helm/certctl/README.md "JWT / OIDC via authenticating gateway" section with a Kubernetes-flavored oauth2-proxy + certctl walkthrough.
  • CI regression guardrail in .github/workflows/ci.yml (Forbidden auth-type literal regression guard (G-1)) — grep-fails the build if "jwt" appears as an auth-type literal in production code or spec. Connector packages exempt (legitimate external-protocol uses).
  • Negative test coverage in internal/config/config_test.go: TestValidate_JWTAuth_RejectedDedicated (two table rows pinning that the dedicated G-1 error fires regardless of whether Secret is set), TestValidAuthTypesDoesNotContainJWT (property-level guard), TestValidAuthTypesIsExactly_APIKey_None (allowed-set contract), TestValidate_GenericInvalidAuthType (pins that other invalid values still surface the generic invalid-auth-type error, so the dedicated G-1 path doesn't accidentally swallow non-jwt typos).

Changed

  • internal/api/middleware/middleware.go::AuthConfig.Type field comment now references the typed config.AuthType constants instead of an inline string enumeration.
  • internal/api/handler/health.go::HealthHandler.AuthType field comment gets the same treatment.
  • internal/api/handler/health_test.go — the prior TestAuthInfo_ReturnsAuthType_JWT (which asserted the handler echoed "jwt", baking the silent-downgrade lie into the regression suite) is removed; the pre-existing TestAuthInfo_ReturnsAuthType_APIKey continues to cover the api-key happy path.
  • Auth-disabled startup log in main.go now points operators at the authenticating-gateway pattern explicitly.

[2.2.0] — 2026-04-19

HTTPS Everywhere — The Irony

certctl manages other teams' certificates. Until v2.2, it didn't terminate TLS on its own control plane. We treated the server as an internal service sitting behind whatever TLS-terminating infrastructure the operator already owned — reverse proxies, Kubernetes Ingress controllers, service mesh sidecars. Working through an EST coverage-gap audit surfaced this as a credibility problem we wanted to fix head-on: a cert-lifecycle product should ship with HTTPS by default. This release flips that. Self-signed bootstrap for docker-compose demos, operator-supplied Secret for Helm (with optional cert-manager integration), and a one-step cutover with no backward-compat bridge. Out-of-date agents will fail at the TLS handshake layer on upgrade; the upgrade guide walks operators through the roll.

Breaking Changes

  • HTTPS-only control plane. The plaintext HTTP listener is gone. There is no CERTCTL_TLS_ENABLED=false escape hatch and no :8080 fallback. Operators who were running certctl behind their own TLS terminator must either (a) continue doing so and let the downstream TLS terminator talk to certctl's HTTPS listener, or (b) bring their own cert/key and terminate on certctl directly. Either path requires config changes — see docs/upgrade-to-tls.md for a one-step cutover.
  • Agents reject CERTCTL_SERVER_URL=http://... at startup. This is a pre-flight config validation failure with a fail-loud diagnostic pointing at docs/upgrade-to-tls.md. Not a TCP-refused, not a TLS-handshake-error — the agent will not even attempt the network call. Every agent deployment must be reconfigured before upgrading the server.
  • CLI and MCP clients require https:// URLs. Same pre-flight rejection of plaintext schemes.
  • TLS 1.2 is not supported. TLS 1.3 only. The server's tls.Config.MinVersion is pinned to tls.VersionTLS13. Any client that can only negotiate TLS 1.2 or lower will fail at the handshake. Modern curl, the Go stdlib, browsers, and Kubernetes tooling are all TLS 1.3-capable; legacy clients may need an upgrade.
  • Helm chart requires a TLS source. helm install without one of server.tls.existingSecret, server.tls.certManager.enabled, or (for eval only) server.tls.selfSigned.enabled fails at template time with a diagnostic pointing at docs/tls.md. There is no default-to-plaintext path.
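
The agent-side pre-flight check from the second bullet might look roughly like this. It is a sketch under assumptions: validateServerURL is a hypothetical name, the error wording is invented, and the example hostnames are placeholders. The point it illustrates is that the scheme is rejected while parsing configuration, before any network call is attempted.

```go
package main

import (
	"fmt"
	"net/url"
)

// validateServerURL rejects anything that is not an absolute https:// URL.
// It only inspects the string; it never opens a connection.
func validateServerURL(raw string) error {
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Errorf("CERTCTL_SERVER_URL is not a valid URL: %w", err)
	}
	if u.Scheme != "https" || u.Host == "" {
		return fmt.Errorf("CERTCTL_SERVER_URL must be an https:// URL, got %q; see docs/upgrade-to-tls.md", raw)
	}
	return nil
}

func main() {
	for _, raw := range []string{
		"https://certctl.example.internal:8443", // placeholder host
		"http://certctl.example.internal:8080",
	} {
		fmt.Printf("%s -> %v\n", raw, validateServerURL(raw))
	}
}
```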

Added

  • Self-signed bootstrap for Docker Compose demos. A certctl-tls-init init container runs before the server on first boot, generates a SAN-valid self-signed cert into deploy/test/certs/, and exits. The server mounts the resulting cert/key. Every curl in the demo stack pins against ./deploy/test/certs/ca.crt with --cacert.
  • Helm chart TLS provisioning — three modes. Operator-supplied Secret (server.tls.existingSecret), cert-manager integration (server.tls.certManager.enabled with issuer selection), or self-signed (server.tls.selfSigned.enabled — eval only, not supported for production). Chart templates enforce exactly one is active.
  • Hot-reload of TLS cert/key on SIGHUP. Overwrite the cert/key on disk, send SIGHUP to the server PID, watch the slog.Info("tls.reload", ...) log line, and new TLS connections use the new cert. Failure during reload is logged and does not crash the server; the previous cert remains in use.
  • Agent CA-bundle env vars. CERTCTL_SERVER_CA_BUNDLE_PATH points at a PEM file the agent's HTTP client will trust. CERTCTL_SERVER_TLS_INSECURE_SKIP_VERIFY disables verification (development only — the agent logs a loud warning at startup). install-agent.sh writes both as commented template lines into the generated agent.env.
  • Integration test suite runs over HTTPS. go test -tags=integration ./deploy/test/... stands up the full Compose stack, extracts the self-signed CA bundle, and exercises every certctl API over https://localhost:8443. All 34 subtests green.
  • docs/tls.md — cert provisioning patterns: bring-your-own Secret, cert-manager, self-signed bootstrap, SAN requirements, rotation workflows, SIGHUP reload semantics, troubleshooting.
  • docs/upgrade-to-tls.md — one-step cutover guide for existing v2.1 operators. Walks through the agent fleet roll, Helm upgrade sequencing, downgrade-is-not-supported warnings, and cert-provisioning decision tree.

Changed

  • cmd/server/main.go now calls http.Server.ListenAndServeTLS(certFile, keyFile). The plaintext ListenAndServe code path is deleted — grep -rn "ListenAndServe[^T]" cmd/ internal/ returns zero hits.
  • All documentation curls (docs/testing-guide.md, docs/quickstart.md, deploy/helm/INSTALLATION.md, deploy/helm/DEPLOYMENT_GUIDE.md, deploy/ENVIRONMENTS.md, docs/openapi.md, migration guides, example READMEs) use https://localhost:8443 and --cacert against the demo stack's bundle.
  • OpenAPI spec (api/openapi.yaml) servers blocks default to https://localhost:8443.

Security

  • TLS 1.3 pinned via tls.Config.MinVersion = tls.VersionTLS13.
  • Plaintext HTTP listener removed entirely — no port 8080, no Upgrade-Insecure-Requests, no HSTS-required redirect dance. There is only one port: 8443, TLS 1.3.
  • grep -rn "http://" cmd/ internal/ returns zero hits outside test fixtures and the agent-side URL-scheme rejection error message.

Upgrade Notes

Read docs/upgrade-to-tls.md before upgrading. The short version:

  1. Pick a TLS source — bring-your-own cert, cert-manager, or self-signed bootstrap.
  2. Upgrade the server with TLS configured. First boot over HTTPS.
  3. Roll the agent fleet: set CERTCTL_SERVER_URL=https://... and, if using a private CA, CERTCTL_SERVER_CA_BUNDLE_PATH. Old agents will fail loud at startup — expected.
  4. Roll CLI/MCP clients the same way.

There is no backward-compat bridge. There is no dual-listener mode. The cutover is one step.