mirror of
https://github.com/shankar0123/certctl.git
synced 2026-06-10 16:18:51 +00:00
9c1d446e40
The pre-G-1 config validator accepted CERTCTL_AUTH_TYPE=jwt and the
startup log faithfully echoed 'authentication enabled type=jwt'.
Reasonable people read that and concluded JWT auth was on. It wasn't.
The auth-middleware wiring at cmd/server/main.go unconditionally routed
every request through the api-key bearer middleware regardless of
cfg.Auth.Type. So CERTCTL_AUTH_TYPE=jwt quietly compared the incoming
'Authorization: Bearer <token>' against whatever string the operator put
in CERTCTL_AUTH_SECRET — real JWT clients got 401, and operators who
treated CERTCTL_AUTH_SECRET as a *signing* secret (because they thought
they were configuring JWT) had effectively handed an attacker an api-key.
A security finding masquerading as a config option.
We chose the audit-recommended structural fix: remove the option, fail
fast at startup, and add the gateway-fronting pattern as the documented
forward path. Implementing JWT middleware would have meant jwks vs
static-secret rotation, claim mapping, expiry enforcement, audience and
issuer validation, key rollover semantics, and regression coverage at the
same depth as the existing api-key path — a feature, not a fix. Operators
who genuinely need JWT/OIDC front certctl with an authenticating gateway
(oauth2-proxy / Envoy ext_authz / Traefik ForwardAuth / Pomerium /
Authelia) and run the upstream certctl with CERTCTL_AUTH_TYPE=none. Same
shape works on docker-compose and Helm.
The change is comprehensive across 7 phases — every surface that
mentioned 'jwt' as a certctl-auth-type is updated, plus structural
backstops (typed enum, runtime guard, helm template validation, CI grep
guard) so the lie can't reappear.
Files changed:
Phase 1 — production code (typed enum + jwt removal):
- internal/config/config.go: AuthType typed alias + AuthTypeAPIKey /
AuthTypeNone constants + ValidAuthTypes() helper. Validate() routes
literal 'jwt' through a dedicated multi-line diagnostic naming the
authenticating-gateway pattern, then cross-checks against
ValidAuthTypes(). Secret-required branch simplified to api-key-only.
Field comment on AuthConfig.Type rewritten to drop jwt and point at
the gateway pattern.
- internal/api/middleware/middleware.go: AuthConfig.Type field comment
references the typed config.AuthType constants.
- internal/api/handler/health.go: same treatment for HealthHandler.AuthType.
- cmd/server/main.go: defense-in-depth runtime switch immediately after
config.Load() — exits 1 on any unsupported auth-type that bypassed the
validator. Auth-disabled startup log explicitly names the
authenticating-gateway pattern.
Phase 2 — tests (Red→Green, contract pinning):
- internal/config/config_test.go: TestValidate_JWTAuth_RejectedDedicated
(two table rows pinning the dedicated G-1 error fires regardless of
whether Secret is set), TestValidAuthTypesDoesNotContainJWT (property
guard against future re-introduction),
TestValidAuthTypesIsExactly_APIKey_None (allowed-set contract),
TestValidate_GenericInvalidAuthType (pins non-jwt invalid values still
hit the generic invalid-auth-type error). Removed the prior
TestValidate_JWTAuth_MissingSecret happy-path since its premise is
inverted post-G-1.
- internal/api/handler/health_test.go: removed
TestAuthInfo_ReturnsAuthType_JWT (which baked the silent-downgrade lie
into the regression suite). Pre-existing _APIKey test continues to
cover the api-key happy path.
Phase 3 — spec, docs, env templates:
- api/openapi.yaml: auth_type enum dropped to [api-key, none] with
inline comment naming the G-1 closure.
- .env.example (root): CERTCTL_AUTH_TYPE comment block rewritten to drop
jwt and point at the gateway pattern; secret-required conditional
simplified to api-key-only.
- docs/architecture.md: middleware-stack bullet rewritten to drop the
JWT mention; new H3 'Authenticating-gateway pattern (JWT, OIDC, mTLS)'
section explaining the design rationale and listing oauth2-proxy /
Envoy ext_authz / Traefik ForwardAuth / Pomerium / Authelia / Caddy
forward_auth / Apache mod_auth_openidc / nginx auth_request as the
standard fronting options.
- docs/upgrade-to-v2-jwt-removal.md (new ~125 lines): migration guide
with preconditions, what-changes, both recovery paths, complete
docker-compose oauth2-proxy walkthrough, Traefik ForwardAuth and Envoy
ext_authz patterns, rollback posture.
Phase 4 — Helm chart (template validation + docs):
- deploy/helm/certctl/templates/_helpers.tpl: new certctl.validateAuthType
helper mirroring the existing certctl.tls.required pattern. Fails
template render on any server.auth.type outside {api-key, none} with
a multi-line diagnostic.
- deploy/helm/certctl/templates/server-deployment.yaml,
server-configmap.yaml, server-secret.yaml: invoke the helper at the
top of each template that depends on .Values.server.auth.type.
- deploy/helm/certctl/values.yaml: auth: block comment expanded with the
G-1 rationale and gateway-pattern cross-reference.
- deploy/helm/CHART_SUMMARY.md: server.auth.type table row now surfaces
the allowed set and points at the upgrade doc.
- deploy/helm/certctl/README.md: new 'JWT / OIDC via authenticating
gateway' section with a Kubernetes-flavored oauth2-proxy + certctl
walkthrough.
Phase 5 — release surface:
- CHANGELOG.md: new [unreleased] top entry with Breaking / Removed /
Added / Changed sections; explicit pointer at
docs/upgrade-to-v2-jwt-removal.md from the Breaking subsection.
Phase 6 — CI guardrail:
- .github/workflows/ci.yml: new 'Forbidden auth-type literal regression
guard (G-1)' step. Scoped patterns catch the actual regression shapes
(map literal, slice literal, switch case, OpenAPI enum, env-file
default, AuthType('jwt') cast). Comments and the dedicated rejection
branch are intentionally exempt; connector-package JWT references
(Google OAuth2 / step-ca) are exempt as out-of-scope external
protocols. Verified locally: the guard passes on the actual tree and
fires on all 4 synthetic regression patterns.
Out of scope (explicitly untouched):
- internal/connector/discovery/gcpsm/gcpsm.go — Google OAuth2 service-
account JWT (external protocol).
- internal/connector/issuer/googlecas/googlecas.go — same.
- internal/connector/issuer/stepca/stepca.go — step-ca's provisioner
one-time-token JWT for /sign API.
- docs/test-env.md, docs/connectors.md, docs/features.md — describe
external CAs' use of JWT, not certctl's auth shape.
- Implementing actual JWT middleware. Feature, not a fix.
Verification (all gates pass):
- go build ./... — clean
- go vet ./... — clean
- go test -short ./... — every package green
- go test -short -race ./internal/config/... ./internal/api/... — clean
- govulncheck ./... — no vulnerabilities in our code
- helm lint deploy/helm/certctl/ — clean
- helm template with auth.type=api-key — renders OK
- helm template with auth.type=none — renders OK
- helm template with auth.type=jwt — fails with validateAuthType
diagnostic (exit 1)
- python3 yaml.safe_load on api/openapi.yaml — parses
- CI guardrail mirror — clean on real tree, fires on all 4 synthetic
regression patterns
- Smoke test: 'CERTCTL_AUTH_TYPE=jwt ./certctl-server' exits non-zero
with: 'Failed to load configuration: CERTCTL_AUTH_TYPE=jwt is no
longer accepted (G-1 silent auth downgrade): no JWT middleware ships
with certctl. To use JWT/OIDC, run an authenticating gateway
(oauth2-proxy / Envoy ext_authz / Traefik ForwardAuth / Pomerium) in
front of certctl and set CERTCTL_AUTH_TYPE=none on the upstream.
See docs/architecture.md "Authenticating-gateway pattern" and
docs/upgrade-to-v2-jwt-removal.md for the migration walkthrough'
config pkg coverage: ValidAuthTypes 100%, Validate 94.7%, total 75.5%.
Refs: coverage-gap-audit-2026-04-24-v5/unified-audit.md
§2 P1 cluster, cat-g-jwt_silent_auth_downgrade
Audit recommendation followed verbatim: 'Remove jwt from
validAuthTypes until middleware ships'.
156 lines
9.3 KiB
Markdown
156 lines
9.3 KiB
Markdown
# Upgrading past G-1 — `CERTCTL_AUTH_TYPE=jwt` removal
|
|
|
|
If your certctl deployment currently sets `CERTCTL_AUTH_TYPE=jwt` (or `server.auth.type=jwt` in Helm), the next certctl upgrade will fail-fast at startup with a dedicated diagnostic. This guide explains why, what to switch to, and how to keep JWT/OIDC at your edge.
|
|
|
|
For everyone else — operators running `api-key` or `none` — this upgrade is a no-op. Skip to [`upgrade-to-tls.md`](upgrade-to-tls.md) for the v2.2 HTTPS-everywhere migration if you haven't done that one yet.
|
|
|
|
## Why we removed it
|
|
|
|
Pre-G-1, the config validator at `internal/config/config.go` accepted three values for `CERTCTL_AUTH_TYPE`: `api-key`, `jwt`, and `none`. The startup log line at `cmd/server/main.go` faithfully echoed `"authentication enabled" "type"="jwt"` when an operator picked `jwt`. Reasonable people read that and concluded JWT auth was on.
|
|
|
|
It wasn't. Grep `internal/ cmd/` for `NewJWT`, `JWTMiddleware`, or `jwt.Parse` — pre-G-1, there were zero matches in production code. The auth-middleware wiring at `cmd/server/main.go:653` unconditionally called `middleware.NewAuthWithNamedKeys(namedKeys)` regardless of `cfg.Auth.Type`. So `CERTCTL_AUTH_TYPE=jwt` just routed every request through the api-key bearer middleware, comparing the incoming `Authorization: Bearer <something>` against whatever string the operator put in `CERTCTL_AUTH_SECRET`. Real JWT clients got 401 (the api-key middleware saw the JWT string as a literal token and compared bytes). Operators who treated `CERTCTL_AUTH_SECRET` as a JWT signing secret (and therefore handled it less carefully than an api-key) handed an attacker an api-key. Silent auth downgrade — a security finding masquerading as a config option.
|
|
|
|
We chose to remove the option rather than implement JWT middleware. Implementing real JWT/OIDC requires jwks vs static-secret rotation, claim mapping (which claim is the actor / the admin flag?), expiry enforcement, audience and issuer validation, key rollover semantics, and regression coverage at the same depth as the existing api-key path. That's a feature, not a fix. The audit-recommended structural fix — and the one that actually closes the hazard — is to fail loudly instead of silently downgrading.
|
|
|
|
## What changes at startup
|
|
|
|
Post-G-1, a binary started with `CERTCTL_AUTH_TYPE=jwt` exits non-zero before opening the listener:
|
|
|
|
```
|
|
Failed to load configuration: CERTCTL_AUTH_TYPE=jwt is no longer accepted
|
|
(G-1 silent auth downgrade): no JWT middleware ships with certctl. To use
|
|
JWT/OIDC, run an authenticating gateway (oauth2-proxy / Envoy ext_authz /
|
|
Traefik ForwardAuth / Pomerium) in front of certctl and set
|
|
CERTCTL_AUTH_TYPE=none on the upstream. See docs/architecture.md
|
|
"Authenticating-gateway pattern" and docs/upgrade-to-v2-jwt-removal.md
|
|
for the migration walkthrough
|
|
```
|
|
|
|
Helm operators get the same shape at `helm install` / `helm upgrade` template time: `server.auth.type=jwt` is rejected by the chart's `certctl.validateAuthType` template helper before any Kubernetes object is rendered.
|
|
|
|
The CI-side regression guard at `.github/workflows/ci.yml` blocks any future PR that re-introduces `"jwt"` as an auth-type literal in production code or spec.
|
|
|
|
## Recovery — pick one
|
|
|
|
### Option A — switch to `api-key` (you weren't actually using JWT)
|
|
|
|
If your `CERTCTL_AUTH_SECRET` was a single high-entropy token and your clients sent it as `Authorization: Bearer <token>`, you were already using api-key auth — you just had `CERTCTL_AUTH_TYPE` set to the wrong string. Flip it:
|
|
|
|
```
|
|
# .env (docker-compose)
|
|
CERTCTL_AUTH_TYPE=api-key
|
|
CERTCTL_AUTH_SECRET=<your-existing-token>
|
|
```
|
|
|
|
```
|
|
# Helm
|
|
helm upgrade <release> deploy/helm/certctl/ \
|
|
--reuse-values \
|
|
--set server.auth.type=api-key \
|
|
--set server.auth.apiKey=<your-existing-token>
|
|
```
|
|
|
|
No client changes needed — the same Bearer token continues to work. The startup log will now read `"authentication enabled" "type"="api-key"`, which matches what was actually happening pre-G-1.
|
|
|
|
### Option B — front certctl with an authenticating gateway
|
|
|
|
If you genuinely need JWT, OIDC, mTLS, or SAML, run an authenticating gateway in front of certctl and let the gateway terminate the federated identity protocol. Configure certctl for `CERTCTL_AUTH_TYPE=none`:
|
|
|
|
```
|
|
CERTCTL_AUTH_TYPE=none
|
|
```
|
|
|
|
Then put an oauth2-proxy / Envoy `ext_authz` / Traefik `ForwardAuth` / Pomerium / Authelia (etc.) in the network path between operators and certctl. The gateway validates the identity and proxies the authenticated request to certctl as a same-origin call on a private network.
|
|
|
|
### Concrete walkthrough — oauth2-proxy + certctl on docker-compose
|
|
|
|
This is the simplest production-grade JWT/OIDC shape. It assumes you have an OIDC provider (Okta, Auth0, Google Workspace, Keycloak, Dex) and a registered client_id / client_secret.
|
|
|
|
```yaml
|
|
# deploy/docker-compose.gateway.yml — overlay on the base compose file
|
|
services:
|
|
oauth2-proxy:
|
|
image: quay.io/oauth2-proxy/oauth2-proxy:latest
|
|
command:
|
|
- --provider=oidc
|
|
- --oidc-issuer-url=https://<your-issuer>/
|
|
- --client-id=${OIDC_CLIENT_ID}
|
|
- --client-secret=${OIDC_CLIENT_SECRET}
|
|
- --cookie-secret=${OAUTH2_PROXY_COOKIE_SECRET} # openssl rand -base64 32
|
|
- --upstream=http://certctl-server:8443 # internal-network only; certctl listens on 8443
|
|
- --http-address=0.0.0.0:4180
|
|
- --email-domain=*
|
|
- --pass-access-token=true
|
|
- --pass-authorization-header=true
|
|
- --set-authorization-header=true # forwards a bearer token upstream
|
|
- --skip-provider-button=true
|
|
- --reverse-proxy=true
|
|
ports:
|
|
- "443:4180"
|
|
depends_on:
|
|
- certctl-server
|
|
networks:
|
|
- certctl-network
|
|
|
|
certctl-server:
|
|
environment:
|
|
CERTCTL_AUTH_TYPE: none # gateway terminates auth — see docs/upgrade-to-v2-jwt-removal.md
|
|
# ... rest of the certctl env block unchanged
|
|
```
|
|
|
|
Operators hit `https://<your-host>/`, get redirected through the OIDC provider, land back at oauth2-proxy with a session cookie, and oauth2-proxy proxies their request to certctl on the internal Docker network. certctl itself is HTTPS-only on `:8443` (TLS 1.3, see [`tls.md`](tls.md)) but operator browsers never see that hop directly. Bind certctl-server's `:8443` to the internal Docker network only — do NOT publish it to the host. The audit trail will record the actor as the gateway-forwarded identity if you also configure a small bearer-token-mapping shim at the gateway (most production deployments do this with a per-user api-key issued by the gateway after OIDC validation).
|
|
|
|
### Traefik ForwardAuth pattern (Kubernetes)
|
|
|
|
Same shape, kubernetes-flavored:
|
|
|
|
```yaml
|
|
apiVersion: traefik.io/v1alpha1
|
|
kind: Middleware
|
|
metadata:
|
|
name: oidc-forward-auth
|
|
spec:
|
|
forwardAuth:
|
|
address: http://oauth2-proxy.auth.svc.cluster.local:4180
|
|
trustForwardHeader: true
|
|
authResponseHeaders:
|
|
- X-Auth-Request-User
|
|
- X-Auth-Request-Email
|
|
- Authorization
|
|
---
|
|
apiVersion: traefik.io/v1alpha1
|
|
kind: IngressRoute
|
|
metadata:
|
|
name: certctl
|
|
spec:
|
|
routes:
|
|
- match: Host(`certctl.example.com`)
|
|
kind: Rule
|
|
middlewares:
|
|
- name: oidc-forward-auth
|
|
services:
|
|
- name: certctl-server
|
|
port: 8443
|
|
```
|
|
|
|
The certctl Helm release runs with `server.auth.type=none`. The Traefik IngressRoute attaches `oidc-forward-auth` as a middleware so every request is OIDC-validated by oauth2-proxy before reaching certctl.
|
|
|
|
### Envoy `ext_authz` pattern
|
|
|
|
For service-mesh deployments (Istio, Consul, plain Envoy), the `ext_authz` filter calls out to an external authorization service per-request. Same outcome: certctl runs `CERTCTL_AUTH_TYPE=none` and Envoy + your authz service handle JWT/OIDC/mTLS at the mesh edge. See the [Envoy ext_authz docs](https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_filters/ext_authz_filter) for the configuration surface.
|
|
|
|
## Rollback
|
|
|
|
Pre-G-1 binaries silently accepted `CERTCTL_AUTH_TYPE=jwt` and routed through the api-key middleware. Downgrading the binary is the only mechanical rollback path, and it puts you back into the silent-downgrade state — which is exactly what the G-1 audit finding is about. We don't recommend it. If something is forcing your hand, capture the operational issue you're hitting and open a GitHub issue against the certctl repo with the SHAs involved; the Authenticating-gateway pattern was specifically designed to cover the use cases that historically led operators to set `CERTCTL_AUTH_TYPE=jwt`.
|
|
|
|
There is no on-disk state that changes with this upgrade — no migrations to roll back, no encrypted config to re-encode, no certificates to re-issue. The change is entirely in the config-validation surface and the helm-chart template guard.
|
|
|
|
## Cross-references
|
|
|
|
- [`architecture.md`](architecture.md) — "Authenticating-gateway pattern (JWT, OIDC, mTLS)" section.
|
|
- [`tls.md`](tls.md) — TLS provisioning patterns. The gateway proxying to certctl-server still needs to trust certctl's TLS cert; same patterns apply.
|
|
- [`../deploy/helm/certctl/README.md`](../deploy/helm/certctl/README.md) — Helm-chart-flavored guidance.
|
|
- `internal/config/config.go::ValidAuthTypes` — the single source of truth for what's accepted post-G-1.
|
|
- `internal/repository/postgres/db.go::wrapPingError` — unrelated; pattern for runtime diagnostic of operator misconfiguration.
|
|
- `coverage-gap-audit-2026-04-24-v5/unified-audit.md` — the audit finding (`cat-g-jwt_silent_auth_downgrade`).
|