From 9c1d446e40a1522edd8910978e821b6426276849 Mon Sep 17 00:00:00 2001 From: shankar0123 Date: Sat, 25 Apr 2026 00:22:23 +0000 Subject: [PATCH] fix(security,config): remove unimplemented JWT auth-type, close silent downgrade (G-1) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The pre-G-1 config validator accepted CERTCTL_AUTH_TYPE=jwt and the startup log faithfully echoed 'authentication enabled type=jwt'. Reasonable people read that and concluded JWT auth was on. It wasn't. The auth-middleware wiring at cmd/server/main.go unconditionally routed every request through the api-key bearer middleware regardless of cfg.Auth.Type. So CERTCTL_AUTH_TYPE=jwt quietly compared the incoming 'Authorization: Bearer ' against whatever string the operator put in CERTCTL_AUTH_SECRET — real JWT clients got 401, and operators who treated CERTCTL_AUTH_SECRET as a *signing* secret (because they thought they were configuring JWT) had effectively handed an attacker an api-key. A security finding masquerading as a config option. We chose the audit-recommended structural fix: remove the option, fail fast at startup, and add the gateway-fronting pattern as the documented forward path. Implementing JWT middleware would have meant jwks vs static-secret rotation, claim mapping, expiry enforcement, audience and issuer validation, key rollover semantics, and regression coverage at the same depth as the existing api-key path — a feature, not a fix. Operators who genuinely need JWT/OIDC front certctl with an authenticating gateway (oauth2-proxy / Envoy ext_authz / Traefik ForwardAuth / Pomerium / Authelia) and run the upstream certctl with CERTCTL_AUTH_TYPE=none. Same shape works on docker-compose and Helm. The change is comprehensive across 7 phases — every surface that mentioned 'jwt' as a certctl-auth-type is updated, plus structural backstops (typed enum, runtime guard, helm template validation, CI grep guard) so the lie can't reappear. Files changed: Phase 1 — production code (typed enum + jwt removal): - internal/config/config.go: AuthType typed alias + AuthTypeAPIKey / AuthTypeNone constants + ValidAuthTypes() helper. Validate() routes literal 'jwt' through a dedicated multi-line diagnostic naming the authenticating-gateway pattern, then cross-checks against ValidAuthTypes(). Secret-required branch simplified to api-key-only. Field comment on AuthConfig.Type rewritten to drop jwt and point at the gateway pattern. - internal/api/middleware/middleware.go: AuthConfig.Type field comment references the typed config.AuthType constants. - internal/api/handler/health.go: same treatment for HealthHandler.AuthType. - cmd/server/main.go: defense-in-depth runtime switch immediately after config.Load() — exits 1 on any unsupported auth-type that bypassed the validator. Auth-disabled startup log explicitly names the authenticating-gateway pattern. Phase 2 — tests (Red→Green, contract pinning): - internal/config/config_test.go: TestValidate_JWTAuth_RejectedDedicated (two table rows pinning the dedicated G-1 error fires regardless of whether Secret is set), TestValidAuthTypesDoesNotContainJWT (property guard against future re-introduction), TestValidAuthTypesIsExactly_APIKey_None (allowed-set contract), TestValidate_GenericInvalidAuthType (pins non-jwt invalid values still hit the generic invalid-auth-type error). Removed the prior TestValidate_JWTAuth_MissingSecret happy-path since its premise is inverted post-G-1. - internal/api/handler/health_test.go: removed TestAuthInfo_ReturnsAuthType_JWT (which baked the silent-downgrade lie into the regression suite). Pre-existing _APIKey test continues to cover the api-key happy path. Phase 3 — spec, docs, env templates: - api/openapi.yaml: auth_type enum dropped to [api-key, none] with inline comment naming the G-1 closure. - .env.example (root): CERTCTL_AUTH_TYPE comment block rewritten to drop jwt and point at the gateway pattern; secret-required conditional simplified to api-key-only. - docs/architecture.md: middleware-stack bullet rewritten to drop the JWT mention; new H3 'Authenticating-gateway pattern (JWT, OIDC, mTLS)' section explaining the design rationale and listing oauth2-proxy / Envoy ext_authz / Traefik ForwardAuth / Pomerium / Authelia / Caddy forward_auth / Apache mod_auth_openidc / nginx auth_request as the standard fronting options. - docs/upgrade-to-v2-jwt-removal.md (new ~125 lines): migration guide with preconditions, what-changes, both recovery paths, complete docker-compose oauth2-proxy walkthrough, Traefik ForwardAuth and Envoy ext_authz patterns, rollback posture. Phase 4 — Helm chart (template validation + docs): - deploy/helm/certctl/templates/_helpers.tpl: new certctl.validateAuthType helper mirroring the existing certctl.tls.required pattern. Fails template render on any server.auth.type outside {api-key, none} with a multi-line diagnostic. - deploy/helm/certctl/templates/server-deployment.yaml, server-configmap.yaml, server-secret.yaml: invoke the helper at the top of each template that depends on .Values.server.auth.type. - deploy/helm/certctl/values.yaml: auth: block comment expanded with the G-1 rationale and gateway-pattern cross-reference. - deploy/helm/CHART_SUMMARY.md: server.auth.type table row now surfaces the allowed set and points at the upgrade doc. - deploy/helm/certctl/README.md: new 'JWT / OIDC via authenticating gateway' section with a Kubernetes-flavored oauth2-proxy + certctl walkthrough. Phase 5 — release surface: - CHANGELOG.md: new [unreleased] top entry with Breaking / Removed / Added / Changed sections; explicit pointer at docs/upgrade-to-v2-jwt-removal.md from the Breaking subsection. Phase 6 — CI guardrail: - .github/workflows/ci.yml: new 'Forbidden auth-type literal regression guard (G-1)' step. Scoped patterns catch the actual regression shapes (map literal, slice literal, switch case, OpenAPI enum, env-file default, AuthType('jwt') cast). Comments and the dedicated rejection branch are intentionally exempt; connector-package JWT references (Google OAuth2 / step-ca) are exempt as out-of-scope external protocols. Verified locally: the guard passes on the actual tree and fires on all 4 synthetic regression patterns. Out of scope (explicitly untouched): - internal/connector/discovery/gcpsm/gcpsm.go — Google OAuth2 service- account JWT (external protocol). - internal/connector/issuer/googlecas/googlecas.go — same. - internal/connector/issuer/stepca/stepca.go — step-ca's provisioner one-time-token JWT for /sign API. - docs/test-env.md, docs/connectors.md, docs/features.md — describe external CAs' use of JWT, not certctl's auth shape. - Implementing actual JWT middleware. Feature, not a fix. Verification (all gates pass): - go build ./... — clean - go vet ./... — clean - go test -short ./... — every package green - go test -short -race ./internal/config/... ./internal/api/... — clean - govulncheck ./... — no vulnerabilities in our code - helm lint deploy/helm/certctl/ — clean - helm template with auth.type=api-key — renders OK - helm template with auth.type=none — renders OK - helm template with auth.type=jwt — fails with validateAuthType diagnostic (exit 1) - python3 yaml.safe_load on api/openapi.yaml — parses - CI guardrail mirror — clean on real tree, fires on all 4 synthetic regression patterns - Smoke test: 'CERTCTL_AUTH_TYPE=jwt ./certctl-server' exits non-zero with: 'Failed to load configuration: CERTCTL_AUTH_TYPE=jwt is no longer accepted (G-1 silent auth downgrade): no JWT middleware ships with certctl. To use JWT/OIDC, run an authenticating gateway (oauth2-proxy / Envoy ext_authz / Traefik ForwardAuth / Pomerium) in front of certctl and set CERTCTL_AUTH_TYPE=none on the upstream. See docs/architecture.md "Authenticating-gateway pattern" and docs/upgrade-to-v2-jwt-removal.md for the migration walkthrough' config pkg coverage: ValidAuthTypes 100%, Validate 94.7%, total 75.5%. Refs: coverage-gap-audit-2026-04-24-v5/unified-audit.md §2 P1 cluster, cat-g-jwt_silent_auth_downgrade Audit recommendation followed verbatim: 'Remove jwt from validAuthTypes until middleware ships'. --- .env.example | 11 +- .github/workflows/ci.yml | 63 +++++++ CHANGELOG.md | 35 ++++ api/openapi.yaml | 9 +- cmd/server/main.go | 26 ++- deploy/helm/CHART_SUMMARY.md | 4 +- deploy/helm/certctl/README.md | 34 ++++ deploy/helm/certctl/templates/_helpers.tpl | 23 +++ .../certctl/templates/server-configmap.yaml | 1 + .../certctl/templates/server-deployment.yaml | 1 + .../helm/certctl/templates/server-secret.yaml | 1 + deploy/helm/certctl/values.yaml | 19 +- docs/architecture.md | 8 +- docs/upgrade-to-v2-jwt-removal.md | 155 +++++++++++++++++ internal/api/handler/health.go | 8 +- internal/api/handler/health_test.go | 32 +--- internal/api/middleware/middleware.go | 8 +- internal/config/config.go | 92 ++++++++-- internal/config/config_test.go | 164 ++++++++++++++++-- 19 files changed, 629 insertions(+), 65 deletions(-) create mode 100644 docs/upgrade-to-v2-jwt-removal.md diff --git a/.env.example b/.env.example index 5b8739d..9c366e4 100644 --- a/.env.example +++ b/.env.example @@ -30,9 +30,16 @@ CERTCTL_SERVER_PORT=8443 CERTCTL_LOG_LEVEL=info CERTCTL_LOG_FORMAT=json -# Auth type: "api-key", "jwt", or "none" (for demo/development) +# Auth type: "api-key" (production) or "none" (demo/development). +# For JWT/OIDC, run an authenticating gateway in front of certctl +# (oauth2-proxy / Envoy ext_authz / Traefik ForwardAuth / Pomerium) and +# set CERTCTL_AUTH_TYPE=none on the upstream — see +# docs/architecture.md "Authenticating-gateway pattern". G-1 removed +# the in-process "jwt" option (no JWT middleware shipped — silent auth +# downgrade); see docs/upgrade-to-v2-jwt-removal.md if you previously +# set CERTCTL_AUTH_TYPE=jwt. CERTCTL_AUTH_TYPE=none -# Required when CERTCTL_AUTH_TYPE is "api-key" or "jwt" +# Required when CERTCTL_AUTH_TYPE is "api-key". # Generate with: openssl rand -base64 32 # CERTCTL_AUTH_SECRET=change-me-in-production diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index d98458e..fb87a5d 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -44,6 +44,69 @@ jobs: - name: Run govulncheck run: govulncheck ./... + - name: Forbidden auth-type literal regression guard (G-1) + # G-1 closed the JWT silent auth downgrade by removing "jwt" from the + # accepted CERTCTL_AUTH_TYPE values. This step grep-fails the build + # if "jwt" reappears in any of the *additive* auth-type surfaces: + # the validAuthTypes / ValidAuthTypes() set, the OpenAPI enum, the + # helm chart's allowed-types list, or the .env.example default. + # Comment lines and the dedicated rejection branch in config.go + # (`c.Auth.Type == "jwt"`) are intentionally exempt — those are the + # G-1 fix itself, not a regression. + # + # Connector packages (internal/connector/) are exempt because the + # Google OAuth2 service-account JWT and step-ca provisioner one- + # time-token JWT are external-protocol uses, unrelated to certctl's + # own auth shape. Test files (_test.go) are exempt so negative + # tests can pass the literal. + # + # See docs/upgrade-to-v2-jwt-removal.md for the closure rationale, + # or internal/config/config.go::ValidAuthTypes for the allowed set. + run: | + set -e + + # Scoped patterns that indicate "jwt" being added back to an + # allowed-set surface. Each catches a regression shape we've + # actually seen in pre-G-1 code: + # - Go map/slice literal: "jwt": true or "jwt", + # - Go switch case: case "jwt" + # - YAML enum: enum: [..., jwt, ...] or - jwt + # - .env conditional: AUTH_TYPE.*"jwt"|=jwt$ + BAD=$(grep -rnEH \ + -e '"jwt"\s*:\s*true' \ + -e '"jwt"\s*,' \ + -e 'case\s+"jwt"' \ + -e 'enum:.*\bjwt\b' \ + -e '^\s*-\s*jwt\s*$' \ + -e 'AUTH_TYPE\s*=\s*jwt\s*$' \ + -e 'AUTH_TYPE\s*=\s*jwt\s*#' \ + -e 'auth\.type\s*=\s*jwt\s*$' \ + -e 'AuthType\("jwt"\)' \ + internal/config/ \ + internal/api/ \ + cmd/ \ + api/openapi.yaml \ + .env.example \ + deploy/.env.example \ + deploy/helm/certctl/values.yaml \ + deploy/helm/certctl/templates/ \ + 2>/dev/null \ + | grep -v '_test.go' \ + | grep -vE '^\s*[^:]+:[0-9]+:\s*(//|#)' \ + | grep -v 'is no longer accepted' \ + || true) + if [ -n "$BAD" ]; then + echo "G-1 regression: \"jwt\" reappeared in an allowed-set surface:" + echo "$BAD" + echo "" + echo "Allowed surface for 'jwt' literals: comment lines, the" + echo "dedicated rejection branch in internal/config/config.go," + echo "and connector packages (Google OAuth2, step-ca)." + echo "See docs/upgrade-to-v2-jwt-removal.md and" + echo "internal/config/config.go::ValidAuthTypes()." + exit 1 + fi + - name: Race Detection run: go test -race ./internal/service/... ./internal/api/handler/... ./internal/api/middleware/... ./internal/scheduler/... ./internal/connector/... ./internal/crypto/... ./internal/domain/... ./internal/validation/... ./internal/tlsprobe/... -count=1 -timeout 300s diff --git a/CHANGELOG.md b/CHANGELOG.md index c751b5e..02a7f20 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -2,6 +2,41 @@ All notable changes to certctl are documented in this file. Dates use ISO 8601. Versions follow [Semantic Versioning](https://semver.org/). +## [unreleased] — 2026-04-24 + +### G-1: JWT silent auth downgrade — closed end-to-end + +> Pre-G-1 the config validator accepted `CERTCTL_AUTH_TYPE=jwt` and the startup log faithfully echoed `"authentication enabled" "type"="jwt"`. Reasonable people read that and concluded JWT was on. It wasn't. The auth-middleware wiring at `cmd/server/main.go` unconditionally routed every request through the api-key bearer middleware regardless of `cfg.Auth.Type`. So `CERTCTL_AUTH_TYPE=jwt` quietly compared incoming `Authorization: Bearer ` against whatever string the operator put in `CERTCTL_AUTH_SECRET` — real JWT clients got 401, and operators who treated `CERTCTL_AUTH_SECRET` as a *signing* secret (because they thought they were configuring JWT) had effectively handed an attacker an api-key. A security finding masquerading as a config option. We chose to remove the option rather than ship JWT middleware — the audit-recommended structural fix that closes the hazard. Operators who actually need JWT/OIDC front certctl with an authenticating gateway (oauth2-proxy / Envoy `ext_authz` / Traefik `ForwardAuth` / Pomerium / Authelia) and run the upstream certctl with `CERTCTL_AUTH_TYPE=none`. The same pattern works on docker-compose and Helm. + +### Breaking Changes + +- **`CERTCTL_AUTH_TYPE=jwt` is no longer accepted.** Pre-G-1 the value was silently downgraded to api-key middleware. Post-G-1 the server fails at startup with a dedicated diagnostic naming the authenticating-gateway pattern. Operators with this in their env block must either switch to `api-key` (if they were de facto using api-key auth all along — same Bearer token continues to work) or switch to `none` and front certctl with an oauth2-proxy / Envoy / Traefik / Pomerium gateway. See [`docs/upgrade-to-v2-jwt-removal.md`](docs/upgrade-to-v2-jwt-removal.md). +- **Helm chart `server.auth.type=jwt` now fails at `helm install` / `helm upgrade` template time.** New `certctl.validateAuthType` template helper runs on every template that depends on `.Values.server.auth.type` (`server-deployment.yaml`, `server-configmap.yaml`, `server-secret.yaml`) and fails the render with a pointer at the gateway-fronting pattern. +- **OpenAPI spec `auth_type` enum no longer includes `jwt`.** API consumers checking `/api/v1/auth/info` against the spec will see a smaller enum. + +### Removed + +- Documented references to JWT in the certctl auth surface (config docblocks, middleware/health-handler comments, `.env.example`, `docs/architecture.md` middleware-stack bullet). Connector-level JWT references (Google OAuth2 service-account JWT in `internal/connector/discovery/gcpsm/`, `internal/connector/issuer/googlecas/`; step-ca's provisioner one-time-token JWT in `internal/connector/issuer/stepca/`) are unrelated and untouched — those are external-protocol uses, not certctl's own auth shape. + +### Added + +- **`config.AuthType` typed alias** with `AuthTypeAPIKey` / `AuthTypeNone` exported constants. Single source of truth for the allowed set across the validator, the runtime defense-in-depth switch in `main.go`, and the helm chart's `validateAuthType` helper. +- **`config.ValidAuthTypes()`** helper returning the complete allowed set; pinned by a property test (`TestValidAuthTypesDoesNotContainJWT`) that fails the build if `"jwt"` is ever re-added to the slice. +- **Defense-in-depth runtime guard** in `cmd/server/main.go` immediately after `config.Load()` — a `switch config.AuthType(cfg.Auth.Type)` that exits 1 if the validator was bypassed (test harness, alt config loader, env-var rebinding). +- **`certctl.validateAuthType` Helm template helper** mirroring the existing `certctl.tls.required` pattern. Fails template render on any `server.auth.type` outside `{api-key, none}`. +- **`docs/architecture.md` "Authenticating-gateway pattern (JWT, OIDC, mTLS)"** section explaining the design rationale for the narrow in-process auth surface and listing oauth2-proxy / Envoy `ext_authz` / Traefik `ForwardAuth` / Pomerium / Authelia / Caddy `forward_auth` / Apache `mod_auth_openidc` / nginx `auth_request` as the standard fronting options. +- **`docs/upgrade-to-v2-jwt-removal.md`** migration guide. Same shape as `docs/upgrade-to-tls.md`. Walks through the dedicated startup error, both recovery paths (`api-key` vs gateway-fronting), a complete docker-compose oauth2-proxy walkthrough, Traefik ForwardAuth and Envoy `ext_authz` patterns, and rollback posture. +- **`deploy/helm/certctl/README.md`** "JWT / OIDC via authenticating gateway" section with a Kubernetes-flavored oauth2-proxy + certctl walkthrough. +- **CI regression guardrail** in `.github/workflows/ci.yml` (`Forbidden auth-type literal regression guard (G-1)`) — grep-fails the build if `"jwt"` appears as an auth-type literal in production code or spec. Connector packages exempt (legitimate external-protocol uses). +- **Negative test coverage** in `internal/config/config_test.go`: `TestValidate_JWTAuth_RejectedDedicated` (two table rows pinning that the dedicated G-1 error fires regardless of whether `Secret` is set), `TestValidAuthTypesDoesNotContainJWT` (property-level guard), `TestValidAuthTypesIsExactly_APIKey_None` (allowed-set contract), `TestValidate_GenericInvalidAuthType` (pins that other invalid values still surface the generic invalid-auth-type error, so the dedicated G-1 path doesn't accidentally swallow non-jwt typos). + +### Changed + +- `internal/api/middleware/middleware.go::AuthConfig.Type` field comment now references the typed `config.AuthType` constants instead of an inline string enumeration. +- `internal/api/handler/health.go::HealthHandler.AuthType` field comment same treatment. +- `internal/api/handler/health_test.go` — the prior `TestAuthInfo_ReturnsAuthType_JWT` (which asserted the handler echoed `"jwt"`, baking the silent-downgrade lie into the regression suite) is removed; the pre-existing `TestAuthInfo_ReturnsAuthType_APIKey` continues to cover the api-key happy path. +- Auth-disabled startup log in `main.go` now points operators at the authenticating-gateway pattern explicitly. + ## [2.2.0] — 2026-04-19 ### HTTPS Everywhere — The Irony diff --git a/api/openapi.yaml b/api/openapi.yaml index 0b6c5c9..e44af11 100644 --- a/api/openapi.yaml +++ b/api/openapi.yaml @@ -132,7 +132,14 @@ paths: properties: auth_type: type: string - enum: [api-key, jwt, none] + # G-1 (P1): "jwt" removed from this enum after the silent + # auth downgrade was identified — no JWT middleware ships + # with certctl. Operators who need JWT/OIDC front certctl + # with an authenticating gateway (oauth2-proxy / Envoy / + # Traefik / Pomerium) and set CERTCTL_AUTH_TYPE=none + # upstream. See docs/architecture.md "Authenticating- + # gateway pattern". + enum: [api-key, none] required: type: boolean diff --git a/cmd/server/main.go b/cmd/server/main.go index 84d872c..84dadb1 100644 --- a/cmd/server/main.go +++ b/cmd/server/main.go @@ -39,6 +39,26 @@ func main() { os.Exit(1) } + // Defense-in-depth runtime guard for the auth-type discriminator. + // + // G-1 (P1): config.Load() already runs Validate() which rejects "jwt" + // and any value outside config.ValidAuthTypes() with a dedicated + // diagnostic. This switch is belt-and-braces — if a future refactor + // bypasses the validator (test harness, alt config loader, env-var + // rebinding after Load) the server must not silently boot with an + // unsupported auth shape. The error path uses fmt.Fprintf because + // the slog logger is constructed from cfg below this point; we want + // the failure to be visible regardless of log-level configuration. + switch config.AuthType(cfg.Auth.Type) { + case config.AuthTypeAPIKey, config.AuthTypeNone: + // ok — fall through + default: + fmt.Fprintf(os.Stderr, + "unsupported auth type at runtime: %q (valid: %v) — config validation should have caught this; refusing to start\n", + cfg.Auth.Type, config.ValidAuthTypes()) + os.Exit(1) + } + // Set up structured logging logger := slog.New(slog.NewJSONHandler(os.Stdout, &slog.HandlerOptions{ Level: cfg.GetLogLevel(), @@ -615,7 +635,7 @@ func main() { // compatibility CERTCTL_AUTH_SECRET is synthesized into legacy-key-N // entries with Admin=false. var namedKeys []middleware.NamedAPIKey - if cfg.Auth.Type != "none" { + if config.AuthType(cfg.Auth.Type) != config.AuthTypeNone { // Translate typed config.NamedAPIKey -> middleware.NamedAPIKey. The // two structs are field-compatible but live in different packages to // preserve the config→middleware dependency direction. @@ -704,8 +724,8 @@ func main() { logger.Info("rate limiting enabled", "rps", cfg.RateLimit.RPS, "burst", cfg.RateLimit.BurstSize) } - if cfg.Auth.Type == "none" { - logger.Warn("authentication disabled (CERTCTL_AUTH_TYPE=none) — not suitable for production") + if config.AuthType(cfg.Auth.Type) == config.AuthTypeNone { + logger.Warn("authentication disabled (CERTCTL_AUTH_TYPE=none) — not suitable for production except behind an authenticating gateway (oauth2-proxy / Envoy ext_authz / Traefik ForwardAuth / Pomerium)") } else { logger.Info("authentication enabled", "type", cfg.Auth.Type) } diff --git a/deploy/helm/CHART_SUMMARY.md b/deploy/helm/CHART_SUMMARY.md index 1c3952e..b6983bb 100644 --- a/deploy/helm/CHART_SUMMARY.md +++ b/deploy/helm/CHART_SUMMARY.md @@ -246,8 +246,8 @@ helm install certctl certctl/ \ |--------|---------|-------------| | `server.replicas` | 1 | Number of server replicas | | `server.port` | 8443 | Server port | -| `server.auth.type` | api-key | Authentication type | -| `server.auth.apiKey` | "" | API key (REQUIRED) | +| `server.auth.type` | api-key | Authentication type — `api-key` or `none` (G-1: `jwt` removed; for JWT/OIDC use a fronting authenticating gateway, see `docs/architecture.md` and `docs/upgrade-to-v2-jwt-removal.md`) | +| `server.auth.apiKey` | "" | API key (REQUIRED when `auth.type=api-key`) | | `server.logging.level` | info | Log level | | `server.logging.format` | json | Log format | diff --git a/deploy/helm/certctl/README.md b/deploy/helm/certctl/README.md index d19aace..14cb1c9 100644 --- a/deploy/helm/certctl/README.md +++ b/deploy/helm/certctl/README.md @@ -86,6 +86,40 @@ helm upgrade deploy/helm/certctl/ \ --set server.auth.apiKey='new-key' ``` +### JWT / OIDC via authenticating gateway + +certctl's in-process auth surface is intentionally narrow: `server.auth.type=api-key` for production deployments and `server.auth.type=none` for development. There is no in-process JWT, OIDC, mTLS, or SAML middleware. (`server.auth.type=jwt` was accepted pre-G-1 but silently routed every request through the api-key bearer middleware — silent auth downgrade. The chart now fails at `helm install`/`helm upgrade` template time via the `certctl.validateAuthType` helper if you set it. See [`../../../docs/upgrade-to-v2-jwt-removal.md`](../../../docs/upgrade-to-v2-jwt-removal.md) if you previously had this in your values.) + +For deployments that need JWT/OIDC, the canonical Kubernetes-flavored shape is to put oauth2-proxy in front of the certctl Service, attach an authenticating Ingress middleware, and run certctl with `server.auth.type=none`: + +```bash +# 1. Install oauth2-proxy (or any OIDC-terminating sidecar) in the same namespace +helm install oauth2-proxy oauth2-proxy/oauth2-proxy \ + --namespace certctl \ + --set config.clientID="$OIDC_CLIENT_ID" \ + --set config.clientSecret="$OIDC_CLIENT_SECRET" \ + --set config.cookieSecret="$(openssl rand -base64 32)" \ + --set config.configFile='| + provider = "oidc" + oidc_issuer_url = "https://your-issuer/" + upstreams = ["http://-server.certctl.svc.cluster.local:8443"] + pass_authorization_header = true + set_authorization_header = true + email_domains = ["*"] + ' + +# 2. Install certctl with type=none (gateway terminates auth) +helm install certctl deploy/helm/certctl/ \ + --namespace certctl \ + --set server.auth.type=none \ + --set postgresql.auth.password="$(openssl rand -base64 24)" + +# 3. Attach an Ingress that routes through oauth2-proxy +# (Traefik ForwardAuth, nginx auth_request, Envoy ext_authz, etc.) +``` + +Same root pattern works with Pomerium, Authelia, Caddy `forward_auth`, Apache `mod_auth_openidc`, or any service-mesh `ext_authz`. See [`../../../docs/architecture.md`](../../../docs/architecture.md) "Authenticating-gateway pattern" for the full design rationale and [`../../../docs/upgrade-to-v2-jwt-removal.md`](../../../docs/upgrade-to-v2-jwt-removal.md) for the migration walkthrough. + ### TLS certificate sourcing By default the chart provisions a self-signed cert via the same init-container pattern as the docker-compose deploy. For production, supply an operator-managed Secret (cert-manager, internal CA, etc.) — see [`docs/tls.md`](../../../docs/tls.md) for the full provisioning matrix and [`docs/upgrade-to-tls.md`](../../../docs/upgrade-to-tls.md) for upgrade-from-HTTP procedures. diff --git a/deploy/helm/certctl/templates/_helpers.tpl b/deploy/helm/certctl/templates/_helpers.tpl index cb67982..5d21b50 100644 --- a/deploy/helm/certctl/templates/_helpers.tpl +++ b/deploy/helm/certctl/templates/_helpers.tpl @@ -169,3 +169,26 @@ per affected resource. No-op when configured correctly. {{- fail "\n\nserver.tls.certManager.enabled=true but server.tls.certManager.issuerRef.name is empty.\n\nSet:\n --set server.tls.certManager.issuerRef.name=\n\nSee docs/tls.md.\n" -}} {{- end -}} {{- end }} + +{{/* +Auth-type validation gate. + +G-1 (P1): pre-G-1 the chart accepted server.auth.type=jwt and the +certctl-server container silently routed every request through the +api-key bearer middleware (no JWT impl ships with certctl). Post-G-1 +the chart fails at template-time with a pointer at the authenticating- +gateway pattern. The valid set must stay in sync with +internal/config.ValidAuthTypes() in the Go binary; if you add a value +there you must add it here too (and update the property test in +internal/config/config_test.go that pins both surfaces). + +Any template that consumes .Values.server.auth.type should call +`{{ include "certctl.validateAuthType" . }}` at the top so this guard +runs once per affected resource. No-op when configured correctly. +*/}} +{{- define "certctl.validateAuthType" -}} +{{- $valid := list "api-key" "none" -}} +{{- if not (has .Values.server.auth.type $valid) -}} +{{- fail (printf "\n\nserver.auth.type=%q is not supported (valid: %v).\n\nFor JWT/OIDC, run an authenticating gateway in front of certctl\n(oauth2-proxy / Envoy ext_authz / Traefik ForwardAuth / Pomerium) and\nset server.auth.type=none here so the gateway terminates federated\nidentity. See docs/architecture.md \"Authenticating-gateway pattern\"\nand docs/upgrade-to-v2-jwt-removal.md for the migration walkthrough.\n\nG-1 audit closure: pre-G-1 the chart accepted type=jwt and the binary\nsilently downgraded to api-key middleware. The chart now fails at\ntemplate time so misconfigured deployments cannot ship.\n" .Values.server.auth.type $valid) -}} +{{- end -}} +{{- end }} diff --git a/deploy/helm/certctl/templates/server-configmap.yaml b/deploy/helm/certctl/templates/server-configmap.yaml index 7bab534..a04fdb9 100644 --- a/deploy/helm/certctl/templates/server-configmap.yaml +++ b/deploy/helm/certctl/templates/server-configmap.yaml @@ -1,3 +1,4 @@ +{{- include "certctl.validateAuthType" . }} apiVersion: v1 kind: ConfigMap metadata: diff --git a/deploy/helm/certctl/templates/server-deployment.yaml b/deploy/helm/certctl/templates/server-deployment.yaml index 2d69834..633a091 100644 --- a/deploy/helm/certctl/templates/server-deployment.yaml +++ b/deploy/helm/certctl/templates/server-deployment.yaml @@ -1,4 +1,5 @@ {{- include "certctl.tls.required" . }} +{{- include "certctl.validateAuthType" . }} apiVersion: apps/v1 kind: Deployment metadata: diff --git a/deploy/helm/certctl/templates/server-secret.yaml b/deploy/helm/certctl/templates/server-secret.yaml index 316427c..0786615 100644 --- a/deploy/helm/certctl/templates/server-secret.yaml +++ b/deploy/helm/certctl/templates/server-secret.yaml @@ -1,3 +1,4 @@ +{{- include "certctl.validateAuthType" . }} apiVersion: v1 kind: Secret metadata: diff --git a/deploy/helm/certctl/values.yaml b/deploy/helm/certctl/values.yaml index de5f5ff..5581b06 100644 --- a/deploy/helm/certctl/values.yaml +++ b/deploy/helm/certctl/values.yaml @@ -112,10 +112,23 @@ server: port: 8443 annotations: {} - # Authentication configuration + # Authentication configuration. + # Valid types: "api-key" (production) or "none" (demo only — disables + # authentication on the API and logs a loud Warn at server startup). + # For JWT/OIDC, run an authenticating gateway in front of certctl + # (oauth2-proxy / Envoy ext_authz / Traefik ForwardAuth / Pomerium) + # and set type=none here so the gateway terminates federated identity. + # See docs/architecture.md "Authenticating-gateway pattern". + # + # G-1 (P1): pre-G-1 the chart accepted server.auth.type=jwt and the + # certctl-server container silently routed every request through the + # api-key bearer middleware — silent auth downgrade. Post-G-1 the + # chart's `certctl.validateAuthType` template helper rejects any value + # outside {api-key, none} at template time. See + # docs/upgrade-to-v2-jwt-removal.md if you previously set type=jwt. auth: - type: api-key # Options: api-key, none (for demo only) - apiKey: "" # REQUIRED in production - set via --set or values override + type: api-key + apiKey: "" # REQUIRED when type=api-key (set via --set or values override). # Logging configuration logging: diff --git a/docs/architecture.md b/docs/architecture.md index 65cf1cc..91115c9 100644 --- a/docs/architecture.md +++ b/docs/architecture.md @@ -891,9 +891,15 @@ The HTTP middleware stack processes requests in the following order (see `cmd/se 4. **BodyLimit** - request body size cap via `http.MaxBytesReader` 5. **RateLimiter** - token bucket rate limiting (optional, when enabled) 6. **CORS** - cross-origin request handling (deny-by-default) -7. **Auth** - API key or JWT validation +7. **Auth** - API key validation (or none in development; JWT/OIDC via authenticating gateway, see below — not in-process) 8. **AuditLog** - records every API call to the audit trail (requires auth context for actor) +### Authenticating-gateway pattern (JWT, OIDC, mTLS) + +certctl's in-process authentication surface is intentionally narrow: `api-key` for production deployments and `none` for development. There is no in-process JWT, OIDC, mTLS, or SAML middleware. (`CERTCTL_AUTH_TYPE=jwt` was accepted pre-G-1 but silently routed through the api-key bearer middleware — a security finding masquerading as a config option, removed at the v2.x boundary; see [`upgrade-to-v2-jwt-removal.md`](upgrade-to-v2-jwt-removal.md) if you previously set it.) + +For deployments that need JWT/OIDC/mTLS, the standard pattern is to put an authenticating gateway in front of certctl and configure `CERTCTL_AUTH_TYPE=none` on the upstream certctl process. The gateway terminates the federated identity protocol, validates tokens / certificates / SAML assertions, and proxies the authenticated request to certctl as a same-origin call on a private network. This separation gives operators the full breadth of the modern identity ecosystem (oauth2-proxy, Envoy `ext_authz`, Traefik `ForwardAuth`, Pomerium, Authelia, Caddy `forward_auth`, Apache `mod_auth_openidc`, nginx `auth_request`) without certctl itself having to track signing-key rotation, claim mapping, audience validation, and the rest of the JWT/OIDC surface area. Operators wanting per-request actor attribution past the gateway boundary forward the gateway-resolved identity (e.g., `X-Auth-Request-User` from oauth2-proxy) and run a small authorization layer at the gateway that enforces the bearer-key contract certctl actually uses. + ### Concurrency Safety The background scheduler uses `sync/atomic.Bool` idempotency guards on every loop (8 always-on plus up to 4 optional) — if a tick fires while the previous iteration is still running, it skips. A `sync.WaitGroup` tracks all in-flight goroutines. `WaitForCompletion(timeout)` blocks during shutdown until all work finishes or the timeout expires, preventing state corruption from mid-flight database operations during process exit. diff --git a/docs/upgrade-to-v2-jwt-removal.md b/docs/upgrade-to-v2-jwt-removal.md new file mode 100644 index 0000000..e5308bf --- /dev/null +++ b/docs/upgrade-to-v2-jwt-removal.md @@ -0,0 +1,155 @@ +# Upgrading past G-1 — `CERTCTL_AUTH_TYPE=jwt` removal + +If your certctl deployment currently sets `CERTCTL_AUTH_TYPE=jwt` (or `server.auth.type=jwt` in Helm), the next certctl upgrade will fail-fast at startup with a dedicated diagnostic. This guide explains why, what to switch to, and how to keep JWT/OIDC at your edge. + +For everyone else — operators running `api-key` or `none` — this upgrade is a no-op. Skip to [`upgrade-to-tls.md`](upgrade-to-tls.md) for the v2.2 HTTPS-everywhere migration if you haven't done that one yet. + +## Why we removed it + +Pre-G-1, the config validator at `internal/config/config.go` accepted three values for `CERTCTL_AUTH_TYPE`: `api-key`, `jwt`, and `none`. The startup log line at `cmd/server/main.go` faithfully echoed `"authentication enabled" "type"="jwt"` when an operator picked `jwt`. Reasonable people read that and concluded JWT auth was on. + +It wasn't. Grep `internal/ cmd/` for `NewJWT`, `JWTMiddleware`, or `jwt.Parse` — pre-G-1, there were zero matches in production code. The auth-middleware wiring at `cmd/server/main.go:653` unconditionally called `middleware.NewAuthWithNamedKeys(namedKeys)` regardless of `cfg.Auth.Type`. So `CERTCTL_AUTH_TYPE=jwt` just routed every request through the api-key bearer middleware, comparing the incoming `Authorization: Bearer ` against whatever string the operator put in `CERTCTL_AUTH_SECRET`. Real JWT clients got 401 (the api-key middleware saw the JWT string as a literal token and compared bytes). Operators who treated `CERTCTL_AUTH_SECRET` as a JWT signing secret (and therefore handled it less carefully than an api-key) handed an attacker an api-key. Silent auth downgrade — a security finding masquerading as a config option. + +We chose to remove the option rather than implement JWT middleware. Implementing real JWT/OIDC requires jwks vs static-secret rotation, claim mapping (which claim is the actor / the admin flag?), expiry enforcement, audience and issuer validation, key rollover semantics, and regression coverage at the same depth as the existing api-key path. That's a feature, not a fix. The audit-recommended structural fix — and the one that actually closes the hazard — is to fail loudly instead of silently downgrading. + +## What changes at startup + +Post-G-1, a binary started with `CERTCTL_AUTH_TYPE=jwt` exits non-zero before opening the listener: + +``` +Failed to load configuration: CERTCTL_AUTH_TYPE=jwt is no longer accepted +(G-1 silent auth downgrade): no JWT middleware ships with certctl. To use +JWT/OIDC, run an authenticating gateway (oauth2-proxy / Envoy ext_authz / +Traefik ForwardAuth / Pomerium) in front of certctl and set +CERTCTL_AUTH_TYPE=none on the upstream. See docs/architecture.md +"Authenticating-gateway pattern" and docs/upgrade-to-v2-jwt-removal.md +for the migration walkthrough +``` + +Helm operators get the same shape at `helm install` / `helm upgrade` template time: `server.auth.type=jwt` is rejected by the chart's `certctl.validateAuthType` template helper before any Kubernetes object is rendered. + +The CI-side regression guard at `.github/workflows/ci.yml` blocks any future PR that re-introduces `"jwt"` as an auth-type literal in production code or spec. + +## Recovery — pick one + +### Option A — switch to `api-key` (you weren't actually using JWT) + +If your `CERTCTL_AUTH_SECRET` was a single high-entropy token and your clients sent it as `Authorization: Bearer `, you were already using api-key auth — you just had `CERTCTL_AUTH_TYPE` set to the wrong string. Flip it: + +``` +# .env (docker-compose) +CERTCTL_AUTH_TYPE=api-key +CERTCTL_AUTH_SECRET= +``` + +``` +# Helm +helm upgrade deploy/helm/certctl/ \ + --reuse-values \ + --set server.auth.type=api-key \ + --set server.auth.apiKey= +``` + +No client changes needed — the same Bearer token continues to work. The startup log will now read `"authentication enabled" "type"="api-key"`, which matches what was actually happening pre-G-1. + +### Option B — front certctl with an authenticating gateway + +If you genuinely need JWT, OIDC, mTLS, or SAML, run an authenticating gateway in front of certctl and let the gateway terminate the federated identity protocol. Configure certctl for `CERTCTL_AUTH_TYPE=none`: + +``` +CERTCTL_AUTH_TYPE=none +``` + +Then put an oauth2-proxy / Envoy `ext_authz` / Traefik `ForwardAuth` / Pomerium / Authelia (etc.) in the network path between operators and certctl. The gateway validates the identity and proxies the authenticated request to certctl as a same-origin call on a private network. + +### Concrete walkthrough — oauth2-proxy + certctl on docker-compose + +This is the simplest production-grade JWT/OIDC shape. It assumes you have an OIDC provider (Okta, Auth0, Google Workspace, Keycloak, Dex) and a registered client_id / client_secret. + +```yaml +# deploy/docker-compose.gateway.yml — overlay on the base compose file +services: + oauth2-proxy: + image: quay.io/oauth2-proxy/oauth2-proxy:latest + command: + - --provider=oidc + - --oidc-issuer-url=https:/// + - --client-id=${OIDC_CLIENT_ID} + - --client-secret=${OIDC_CLIENT_SECRET} + - --cookie-secret=${OAUTH2_PROXY_COOKIE_SECRET} # openssl rand -base64 32 + - --upstream=http://certctl-server:8443 # internal-network only; certctl listens on 8443 + - --http-address=0.0.0.0:4180 + - --email-domain=* + - --pass-access-token=true + - --pass-authorization-header=true + - --set-authorization-header=true # forwards a bearer token upstream + - --skip-provider-button=true + - --reverse-proxy=true + ports: + - "443:4180" + depends_on: + - certctl-server + networks: + - certctl-network + + certctl-server: + environment: + CERTCTL_AUTH_TYPE: none # gateway terminates auth — see docs/upgrade-to-v2-jwt-removal.md + # ... rest of the certctl env block unchanged +``` + +Operators hit `https:///`, get redirected through the OIDC provider, land back at oauth2-proxy with a session cookie, and oauth2-proxy proxies their request to certctl on the internal Docker network. certctl itself is HTTPS-only on `:8443` (TLS 1.3, see [`tls.md`](tls.md)) but operator browsers never see that hop directly. Bind certctl-server's `:8443` to the internal Docker network only — do NOT publish it to the host. The audit trail will record the actor as the gateway-forwarded identity if you also configure a small bearer-token-mapping shim at the gateway (most production deployments do this with a per-user api-key issued by the gateway after OIDC validation). + +### Traefik ForwardAuth pattern (Kubernetes) + +Same shape, kubernetes-flavored: + +```yaml +apiVersion: traefik.io/v1alpha1 +kind: Middleware +metadata: + name: oidc-forward-auth +spec: + forwardAuth: + address: http://oauth2-proxy.auth.svc.cluster.local:4180 + trustForwardHeader: true + authResponseHeaders: + - X-Auth-Request-User + - X-Auth-Request-Email + - Authorization +--- +apiVersion: traefik.io/v1alpha1 +kind: IngressRoute +metadata: + name: certctl +spec: + routes: + - match: Host(`certctl.example.com`) + kind: Rule + middlewares: + - name: oidc-forward-auth + services: + - name: certctl-server + port: 8443 +``` + +The certctl Helm release runs with `server.auth.type=none`. The Traefik IngressRoute attaches `oidc-forward-auth` as a middleware so every request is OIDC-validated by oauth2-proxy before reaching certctl. + +### Envoy `ext_authz` pattern + +For service-mesh deployments (Istio, Consul, plain Envoy), the `ext_authz` filter calls out to an external authorization service per-request. Same outcome: certctl runs `CERTCTL_AUTH_TYPE=none` and Envoy + your authz service handle JWT/OIDC/mTLS at the mesh edge. See the [Envoy ext_authz docs](https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_filters/ext_authz_filter) for the configuration surface. + +## Rollback + +Pre-G-1 binaries silently accepted `CERTCTL_AUTH_TYPE=jwt` and routed through the api-key middleware. Downgrading the binary is the only mechanical rollback path, and it puts you back into the silent-downgrade state — which is exactly what the G-1 audit finding is about. We don't recommend it. If something is forcing your hand, capture the operational issue you're hitting and open a GitHub issue against the certctl repo with the SHAs involved; the Authenticating-gateway pattern was specifically designed to cover the use cases that historically led operators to set `CERTCTL_AUTH_TYPE=jwt`. + +There is no on-disk state that changes with this upgrade — no migrations to roll back, no encrypted config to re-encode, no certificates to re-issue. The change is entirely in the config-validation surface and the helm-chart template guard. + +## Cross-references + +- [`architecture.md`](architecture.md) — "Authenticating-gateway pattern (JWT, OIDC, mTLS)" section. +- [`tls.md`](tls.md) — TLS provisioning patterns. The gateway proxying to certctl-server still needs to trust certctl's TLS cert; same patterns apply. +- [`../deploy/helm/certctl/README.md`](../deploy/helm/certctl/README.md) — Helm-chart-flavored guidance. +- `internal/config/config.go::ValidAuthTypes` — the single source of truth for what's accepted post-G-1. +- `internal/repository/postgres/db.go::wrapPingError` — unrelated; pattern for runtime diagnostic of operator misconfiguration. +- `coverage-gap-audit-2026-04-24-v5/unified-audit.md` — the audit finding (`cat-g-jwt_silent_auth_downgrade`). diff --git a/internal/api/handler/health.go b/internal/api/handler/health.go index b352b81..5ed6470 100644 --- a/internal/api/handler/health.go +++ b/internal/api/handler/health.go @@ -7,8 +7,14 @@ import ( ) // HealthHandler handles health and readiness check endpoints. +// +// G-1 (P1): AuthType is one of "api-key" or "none" — see +// internal/config.AuthType / config.ValidAuthTypes() for the typed +// constants and the rationale for dropping "jwt" (no JWT middleware +// ships with certctl; operators who need JWT/OIDC front certctl with +// an authenticating gateway and set AuthType="none" on the upstream). type HealthHandler struct { - AuthType string // "api-key", "jwt", "none" + AuthType string // "api-key" or "none" (see config.AuthType constants) } // NewHealthHandler creates a new HealthHandler. diff --git a/internal/api/handler/health_test.go b/internal/api/handler/health_test.go index 5966f95..b7bb965 100644 --- a/internal/api/handler/health_test.go +++ b/internal/api/handler/health_test.go @@ -162,30 +162,14 @@ func TestAuthInfo_ReturnsAuthType_None(t *testing.T) { } } -func TestAuthInfo_ReturnsAuthType_JWT(t *testing.T) { - handler := NewHealthHandler("jwt") - - req, err := http.NewRequest(http.MethodGet, "/api/v1/auth/info", nil) - if err != nil { - t.Fatalf("NewRequest failed: %v", err) - } - - w := httptest.NewRecorder() - handler.AuthInfo(w, req) - - var result map[string]interface{} - if err := json.NewDecoder(w.Body).Decode(&result); err != nil { - t.Fatalf("failed to decode response: %v", err) - } - - if result["auth_type"] != "jwt" { - t.Errorf("auth_type = %q, want jwt", result["auth_type"]) - } - - if required, ok := result["required"].(bool); !ok || !required { - t.Errorf("required = %v, want true", result["required"]) - } -} +// G-1 (P1): the prior `TestAuthInfo_ReturnsAuthType_JWT` asserted the +// handler echoed "jwt" — using the silent-auth-downgrade value as a +// test fixture, which baked the lie into the regression suite. The +// test is removed because "jwt" is now rejected at config-load time +// (see internal/config/config_test.go::TestValidate_JWTAuth_RejectedDedicated) +// and never reaches this handler. The pre-existing +// `TestAuthInfo_ReturnsAuthType_APIKey` above (line ~107) covers the +// api-key happy path; nothing else needs replacing here. func TestAuthCheck_ReturnsOK(t *testing.T) { handler := NewHealthHandler("api-key") diff --git a/internal/api/middleware/middleware.go b/internal/api/middleware/middleware.go index 080f6be..82e17f2 100644 --- a/internal/api/middleware/middleware.go +++ b/internal/api/middleware/middleware.go @@ -117,8 +117,14 @@ func HashAPIKey(key string) string { } // AuthConfig holds configuration for the Auth middleware. +// +// G-1 (P1): valid Type values are "api-key" or "none" only. "jwt" was +// removed because no JWT middleware ships with certctl (silent auth +// downgrade pre-G-1). The single source of truth for the allowed set +// lives at internal/config.AuthType / config.ValidAuthTypes() — prefer +// those constants over string literals when comparing. type AuthConfig struct { - Type string // "api-key", "jwt", "none" + Type string // "api-key" or "none" (see config.AuthType constants) Secret string // The raw API key or comma-separated list of valid API keys } diff --git a/internal/config/config.go b/internal/config/config.go index 60f74c5..0d9fe3b 100644 --- a/internal/config/config.go +++ b/internal/config/config.go @@ -802,13 +802,59 @@ type NamedAPIKey struct { Admin bool } +// AuthType is the discriminator for the API auth middleware shape. The +// string alias preserves env-var roundtrip (the value flows through getEnv +// as a plain string) while giving us a typed surface for switches and +// validation. Use the named constants below rather than string literals +// so future enum additions/removals are caught at compile time. +// +// G-1 (P1): the pre-G-1 validAuthTypes map literal accepted "jwt" with no +// JWT middleware behind it (silent auth downgrade — the configured type +// was logged as "jwt" but every request routed through the api-key bearer +// middleware regardless). Operators who set CERTCTL_AUTH_TYPE=jwt thought +// they had JWT auth; they didn't. The typed alias + ValidAuthTypes() +// helper make the allowed set the single source of truth across config +// validation, the runtime defense-in-depth switch in main.go, and the +// helm-chart template guard (`certctl.validateAuthType`). +type AuthType string + +const ( + // AuthTypeAPIKey routes requests through the api-key bearer middleware. + // CERTCTL_AUTH_SECRET (or CERTCTL_API_KEYS_NAMED) is required. + AuthTypeAPIKey AuthType = "api-key" + + // AuthTypeNone disables authentication entirely. Development only — + // the server logs a loud Warn at startup. Operators who need + // JWT/OIDC/mTLS run an authenticating gateway (oauth2-proxy / Envoy + // ext_authz / Traefik ForwardAuth / Pomerium) in front of certctl + // and set this value on the upstream certctl process. See + // docs/architecture.md "Authenticating-gateway pattern". + AuthTypeNone AuthType = "none" +) + +// ValidAuthTypes returns the allowed CERTCTL_AUTH_TYPE values. The set is +// intentionally narrow — JWT was accepted pre-G-1 with no middleware +// implementation behind it. Single source of truth referenced by the +// validator below, the runtime guard in cmd/server/main.go, the helm +// chart template (`certctl.validateAuthType`), and the property test in +// config_test.go that pins "jwt" out of the slice forever. +func ValidAuthTypes() []AuthType { + return []AuthType{AuthTypeAPIKey, AuthTypeNone} +} + // AuthConfig contains authentication configuration. type AuthConfig struct { // Type sets the authentication mechanism for the REST API. - // Valid values: "api-key" (default, production), "jwt", "none" (development only). - // When "api-key", clients must provide Authorization: Bearer header. - // "none" requires explicit opt-in via CERTCTL_AUTH_TYPE env var with warning logged. + // Valid values: "api-key" (default, production) and "none" (development + // only — disables authentication on the API and logs a loud Warn at + // startup). For JWT/OIDC, run an authenticating gateway (oauth2-proxy / + // Envoy / Traefik ForwardAuth / Pomerium) in front of certctl and set + // CERTCTL_AUTH_TYPE=none on the upstream — see docs/architecture.md + // "Authenticating-gateway pattern" and docs/upgrade-to-v2-jwt-removal.md. // Setting: CERTCTL_AUTH_TYPE environment variable. Default: "api-key". + // Use the AuthType constants (AuthTypeAPIKey / AuthTypeNone) for typed + // comparisons; the field stays `string` to preserve env-var roundtrip + // shape used by getEnv() and downstream Helm/compose interpolation. Type string // Secret is the legacy authentication secret (comma-separated API keys). @@ -1148,18 +1194,40 @@ func (c *Config) Validate() error { return fmt.Errorf("invalid log format: %s", c.Log.Format) } - // Validate auth type - validAuthTypes := map[string]bool{ - "api-key": true, - "jwt": true, - "none": true, + // Validate auth type. + // + // G-1 (P1): the pre-G-1 set was {"api-key", "jwt", "none"} with "jwt" + // accepted but no JWT middleware shipped — silent auth downgrade. + // Post-G-1 we route a literal "jwt" value through a dedicated + // rejection that gives operators actionable guidance (the + // authenticating-gateway pattern) instead of the generic + // "invalid auth type". Then we cross-check against ValidAuthTypes() + // so any value outside {api-key, none} surfaces uniformly. + if c.Auth.Type == "jwt" { + return fmt.Errorf( + "CERTCTL_AUTH_TYPE=jwt is no longer accepted (G-1 silent auth " + + "downgrade): no JWT middleware ships with certctl. To use " + + "JWT/OIDC, run an authenticating gateway (oauth2-proxy / " + + "Envoy ext_authz / Traefik ForwardAuth / Pomerium) in " + + "front of certctl and set CERTCTL_AUTH_TYPE=none on the " + + "upstream. See docs/architecture.md \"Authenticating-" + + "gateway pattern\" and docs/upgrade-to-v2-jwt-removal.md " + + "for the migration walkthrough") } - if !validAuthTypes[c.Auth.Type] { - return fmt.Errorf("invalid auth type: %s", c.Auth.Type) + authTypeValid := false + for _, t := range ValidAuthTypes() { + if AuthType(c.Auth.Type) == t { + authTypeValid = true + break + } + } + if !authTypeValid { + return fmt.Errorf("invalid auth type: %s (valid: %v)", c.Auth.Type, ValidAuthTypes()) } - // If using JWT or API-key, secret is required - if (c.Auth.Type == "jwt" || c.Auth.Type == "api-key") && c.Auth.Secret == "" { + // If using API-key, secret is required. (Secret was previously also + // required for "jwt"; removed with the jwt rejection above.) + if c.Auth.Type == string(AuthTypeAPIKey) && c.Auth.Secret == "" { return fmt.Errorf("auth secret is required for auth type %s", c.Auth.Type) } diff --git a/internal/config/config_test.go b/internal/config/config_test.go index 959c040..57527e8 100644 --- a/internal/config/config_test.go +++ b/internal/config/config_test.go @@ -458,6 +458,8 @@ func TestValidate_InvalidAuthType(t *testing.T) { JobProcessorInterval: 30 * time.Second, AgentHealthCheckInterval: 2 * time.Minute, NotificationProcessInterval: 1 * time.Minute, + NotificationRetryInterval: 2 * time.Minute, + RetryInterval: 5 * time.Minute, }, } if err := cfg.Validate(); err == nil { @@ -477,6 +479,8 @@ func TestValidate_APIKeyAuth_MissingSecret(t *testing.T) { JobProcessorInterval: 30 * time.Second, AgentHealthCheckInterval: 2 * time.Minute, NotificationProcessInterval: 1 * time.Minute, + NotificationRetryInterval: 2 * time.Minute, + RetryInterval: 5 * time.Minute, }, } if err := cfg.Validate(); err == nil { @@ -484,25 +488,133 @@ func TestValidate_APIKeyAuth_MissingSecret(t *testing.T) { } } -func TestValidate_JWTAuth_MissingSecret(t *testing.T) { - cfg := &Config{ - Server: validServerConfig(t), - Database: DatabaseConfig{URL: "postgres://localhost/certctl", MaxConnections: 25}, - Log: LogConfig{Level: "info", Format: "json"}, - Auth: AuthConfig{Type: "jwt", Secret: ""}, - Keygen: KeygenConfig{Mode: "agent"}, - Scheduler: SchedulerConfig{ - RenewalCheckInterval: 1 * time.Hour, - JobProcessorInterval: 30 * time.Second, - AgentHealthCheckInterval: 2 * time.Minute, - NotificationProcessInterval: 1 * time.Minute, - }, +// TestValidate_JWTAuth_RejectedDedicated locks down the G-1 fix: pre-G-1 +// `CERTCTL_AUTH_TYPE=jwt` was accepted by the validator (the bare error +// path was the empty-secret one previously). Post-G-1 the literal "jwt" +// value is rejected with a dedicated diagnostic regardless of whether +// Secret is set, because there is no JWT middleware in the binary — +// operators who need JWT/OIDC must front certctl with an authenticating +// gateway. +// +// Two table rows pin the contract: missing-secret cannot paper over the +// rejection (the dedicated error fires first, before the secret check), +// and a populated secret also cannot paper over it. Both paths must +// hit the dedicated G-1 diagnostic, not the generic "invalid auth +// type" or "auth secret is required". +func TestValidate_JWTAuth_RejectedDedicated(t *testing.T) { + t.Parallel() + cases := []struct { + name string + secret string + }{ + {"jwt rejected (no secret)", ""}, + {"jwt rejected (with secret — operator can't paper over)", "anything"}, } - if err := cfg.Validate(); err == nil { - t.Error("Validate() should return error when jwt auth has empty secret") + for _, tc := range cases { + t.Run(tc.name, func(t *testing.T) { + cfg := &Config{ + Server: validServerConfig(t), + Database: DatabaseConfig{URL: "postgres://localhost/certctl", MaxConnections: 25}, + Log: LogConfig{Level: "info", Format: "json"}, + Auth: AuthConfig{Type: "jwt", Secret: tc.secret}, + Keygen: KeygenConfig{Mode: "agent"}, + Scheduler: SchedulerConfig{ + RenewalCheckInterval: 1 * time.Hour, + JobProcessorInterval: 30 * time.Second, + AgentHealthCheckInterval: 2 * time.Minute, + NotificationProcessInterval: 1 * time.Minute, + NotificationRetryInterval: 2 * time.Minute, + RetryInterval: 5 * time.Minute, + }, + } + err := cfg.Validate() + if err == nil { + t.Fatal("Validate() returned nil; expected dedicated G-1 rejection") + } + const wantSubstr = "CERTCTL_AUTH_TYPE=jwt is no longer accepted" + if !strings.Contains(err.Error(), wantSubstr) { + t.Errorf("Validate() = %v\nwant substring %q (the dedicated G-1 diagnostic)", err, wantSubstr) + } + }) } } +// TestValidAuthTypesDoesNotContainJWT is a property-level guard against +// a future PR silently re-introducing "jwt" into the allowed set. If +// someone adds JWT back to ValidAuthTypes(), this test fails immediately +// with a pointer at the audit finding. The matching CI grep guardrail +// in .github/workflows/ci.yml provides a secondary check at build time. +func TestValidAuthTypesDoesNotContainJWT(t *testing.T) { + t.Parallel() + for _, at := range ValidAuthTypes() { + if at == "jwt" { + t.Fatalf("jwt is in ValidAuthTypes — silent auth downgrade regressed (G-1)") + } + } +} + +// TestValidAuthTypesIsExactly_APIKey_None pins the current allowed set. +// If a future change adds a new auth type, this test must be updated +// alongside the validator and the helm-chart `validateAuthType` helper — +// keeping all three surfaces in sync. +func TestValidAuthTypesIsExactly_APIKey_None(t *testing.T) { + t.Parallel() + got := ValidAuthTypes() + if len(got) != 2 { + t.Fatalf("ValidAuthTypes() returned %d entries, want 2: %v", len(got), got) + } + want := map[AuthType]bool{AuthTypeAPIKey: true, AuthTypeNone: true} + for _, at := range got { + if !want[at] { + t.Errorf("unexpected auth type in ValidAuthTypes: %q", at) + } + } +} + +// TestValidate_GenericInvalidAuthType ensures that values outside the +// allowed set (other than the special-cased "jwt") still surface the +// generic "invalid auth type" error. Pins that the dedicated G-1 +// rejection didn't accidentally swallow non-jwt typos. +func TestValidate_GenericInvalidAuthType(t *testing.T) { + t.Parallel() + for _, badType := range []string{"", "garbage", "oidc", "mtls", "API-KEY"} { + t.Run("type="+badType, func(t *testing.T) { + cfg := &Config{ + Server: validServerConfig(t), + Database: DatabaseConfig{URL: "postgres://localhost/certctl", MaxConnections: 25}, + Log: LogConfig{Level: "info", Format: "json"}, + Auth: AuthConfig{Type: badType, Secret: "x"}, + Keygen: KeygenConfig{Mode: "agent"}, + Scheduler: SchedulerConfig{ + RenewalCheckInterval: 1 * time.Hour, + JobProcessorInterval: 30 * time.Second, + AgentHealthCheckInterval: 2 * time.Minute, + NotificationProcessInterval: 1 * time.Minute, + NotificationRetryInterval: 2 * time.Minute, + RetryInterval: 5 * time.Minute, + }, + } + err := cfg.Validate() + if err == nil { + t.Fatalf("Validate(type=%q) returned nil; expected invalid-auth-type rejection", badType) + } + if !strings.Contains(err.Error(), "invalid auth type") { + t.Errorf("Validate(type=%q) = %v; want \"invalid auth type\" error", badType, err) + } + if strings.Contains(err.Error(), "G-1 silent auth") { + t.Errorf("Validate(type=%q) = %v; should not hit the dedicated G-1 path for non-jwt values", badType, err) + } + }) + } +} + +// G-1 (P1): no need to add `TestValidate_NoneAuth_AcceptsEmptySecret` or +// `TestValidate_APIKeyAuth_RequiresSecret` here — the pre-existing tests +// `TestValidate_AuthTypeNone` (above) and `TestValidate_APIKeyAuth_MissingSecret` +// (above) already cover those paths. Documented for the next reader: the +// G-1 fix flipped jwt off but did not disturb either the +// none-bypasses-secret or the api-key-requires-secret behavior. + func TestValidate_InvalidKeygenMode(t *testing.T) { cfg := &Config{ Server: validServerConfig(t), @@ -515,6 +627,8 @@ func TestValidate_InvalidKeygenMode(t *testing.T) { JobProcessorInterval: 30 * time.Second, AgentHealthCheckInterval: 2 * time.Minute, NotificationProcessInterval: 1 * time.Minute, + NotificationRetryInterval: 2 * time.Minute, + RetryInterval: 5 * time.Minute, }, } if err := cfg.Validate(); err == nil { @@ -544,6 +658,8 @@ func TestValidate_InvalidPort(t *testing.T) { JobProcessorInterval: 30 * time.Second, AgentHealthCheckInterval: 2 * time.Minute, NotificationProcessInterval: 1 * time.Minute, + NotificationRetryInterval: 2 * time.Minute, + RetryInterval: 5 * time.Minute, }, } if err := cfg.Validate(); err == nil { @@ -574,6 +690,8 @@ func TestValidate_TLSCertPathEmpty(t *testing.T) { JobProcessorInterval: 30 * time.Second, AgentHealthCheckInterval: 2 * time.Minute, NotificationProcessInterval: 1 * time.Minute, + NotificationRetryInterval: 2 * time.Minute, + RetryInterval: 5 * time.Minute, }, } err := cfg.Validate() @@ -605,6 +723,8 @@ func TestValidate_TLSKeyPathEmpty(t *testing.T) { JobProcessorInterval: 30 * time.Second, AgentHealthCheckInterval: 2 * time.Minute, NotificationProcessInterval: 1 * time.Minute, + NotificationRetryInterval: 2 * time.Minute, + RetryInterval: 5 * time.Minute, }, } err := cfg.Validate() @@ -637,6 +757,8 @@ func TestValidate_TLSCertFileMissing(t *testing.T) { JobProcessorInterval: 30 * time.Second, AgentHealthCheckInterval: 2 * time.Minute, NotificationProcessInterval: 1 * time.Minute, + NotificationRetryInterval: 2 * time.Minute, + RetryInterval: 5 * time.Minute, }, } err := cfg.Validate() @@ -668,6 +790,8 @@ func TestValidate_TLSKeyFileMissing(t *testing.T) { JobProcessorInterval: 30 * time.Second, AgentHealthCheckInterval: 2 * time.Minute, NotificationProcessInterval: 1 * time.Minute, + NotificationRetryInterval: 2 * time.Minute, + RetryInterval: 5 * time.Minute, }, } err := cfg.Validate() @@ -701,6 +825,8 @@ func TestValidate_TLSMismatchedPair(t *testing.T) { JobProcessorInterval: 30 * time.Second, AgentHealthCheckInterval: 2 * time.Minute, NotificationProcessInterval: 1 * time.Minute, + NotificationRetryInterval: 2 * time.Minute, + RetryInterval: 5 * time.Minute, }, } err := cfg.Validate() @@ -724,6 +850,8 @@ func TestValidate_EmptyDatabaseURL(t *testing.T) { JobProcessorInterval: 30 * time.Second, AgentHealthCheckInterval: 2 * time.Minute, NotificationProcessInterval: 1 * time.Minute, + NotificationRetryInterval: 2 * time.Minute, + RetryInterval: 5 * time.Minute, }, } if err := cfg.Validate(); err == nil { @@ -743,6 +871,8 @@ func TestValidate_InvalidLogLevel(t *testing.T) { JobProcessorInterval: 30 * time.Second, AgentHealthCheckInterval: 2 * time.Minute, NotificationProcessInterval: 1 * time.Minute, + NotificationRetryInterval: 2 * time.Minute, + RetryInterval: 5 * time.Minute, }, } if err := cfg.Validate(); err == nil { @@ -762,6 +892,8 @@ func TestValidate_InvalidLogFormat(t *testing.T) { JobProcessorInterval: 30 * time.Second, AgentHealthCheckInterval: 2 * time.Minute, NotificationProcessInterval: 1 * time.Minute, + NotificationRetryInterval: 2 * time.Minute, + RetryInterval: 5 * time.Minute, }, } if err := cfg.Validate(); err == nil { @@ -840,6 +972,8 @@ func TestValidate_DatabaseMaxConnectionsZero(t *testing.T) { JobProcessorInterval: 30 * time.Second, AgentHealthCheckInterval: 2 * time.Minute, NotificationProcessInterval: 1 * time.Minute, + NotificationRetryInterval: 2 * time.Minute, + RetryInterval: 5 * time.Minute, }, } if err := cfg.Validate(); err == nil {