shankar0123 56e2ea1ad7 docs: v2.1.0 release polish — strip internal bundle/phase tags, update status for OIDC ship
README:
- Rewrite Status block: drop the stale 'federated identity not yet
  shipped' line; flag v2.1.0 OIDC + sessions + back-channel logout
  + break-glass as early-access; encourage GitHub issues for IdP
  rough edges. (A1 framing — keep early-access umbrella, no
  SAML/WebAuthn/JIT roadmap teaser.)
- Add OIDC SSO bullet to 'What it does' covering per-IdP runbooks,
  group-claim → role mapping, AES-256-GCM client_secret encryption,
  JWKS auto-refresh, PKCE-S256, RFC 9700 §4.7.1 pre-login binding,
  RFC 9207 iss check, __Host- cookies, CSRF rotation, idle+absolute
  expiry, BCL, break-glass admin.
- Update Security paragraph: three auth paths (API keys / OIDC /
  break-glass), HMAC-signed sessions, CSRF rotation, RFC OIDC BCL.
- Correct CI coverage thresholds against
  .github/coverage-thresholds.yml (service 70%, handler 75%,
  crypto 88%, auth packages 85-95%); 'static analysis' replaces
  the inflated '11 linters' claim (actual count is 4 active).

Docs B3 sweep — strip operator-facing 'Bundle N' / 'Phase N' tags:
- docs/operator/auth-threat-model.md — rewrite intro; rename 5 H2
  sections (API-key + RBAC defenses / OIDC + sessions + break-glass
  defenses / OIDC + sessions threat catalogue / Closed federated-
  identity threats / Future-work threats); clean ~12 H3/prose hits.
- docs/operator/rbac.md — strip Bundle 1 framing from intro,
  scope_id deferral note, MCP tools section, day-0 bootstrap, and
  'Where to look next'.
- docs/operator/auth-benchmarks.md — drop 'Phase 14' framing from
  title intro, hardware floor caption, result table caption,
  methodology, and pre-merge audit section.
- docs/operator/security.md — already cleaned earlier this session
  (RBAC / day-0 / approval-bypass / OIDC federation / sessions /
  OIDC first-admin / break-glass H3s).
- docs/operator/oidc-runbooks/{index,keycloak,authentik,okta,
  azure-ad}.md — strip Auth Bundle 2 framing + Phase 10/3/4
  references; replace with feature-name prose.
- docs/operator/legacy-clients-tls-1.2.md — drop Bundle F / M-023
  audit-reference framing; keep CWE-326.
- docs/operator/database-tls.md — drop Bundle B / M-018 framing
  from intro + Helm section.
- docs/operator/runbooks/disaster-recovery.md — drop 'Production
  hardening II Phase 10' status callout.
- docs/migration/oidc-enable.md — retitle 'Enable OIDC SSO';
  strip Bundle 1/2 framing from prereqs, troubleshooting, related
  docs; update __Host- cookie callout from 'audit MED-14' to
  v2.1.0-BREAKING.
- docs/migration/api-keys-to-rbac.md — strip Bundle 1 framing from
  intro, migration table, IsAdmin section, and cross-references.
- docs/migration/acme-from-cert-manager.md — strip residual
  'Phase 5' tags from cert-manager integration test references.
- docs/reference/configuration.md — retitle Auth section.
- docs/reference/profiles.md — strip Bundle 1 Phase 9 framing
  from RequiresApproval section + Related list.
- docs/reference/auth-standards-implemented.md — rewrite intro
  (API-key + RBAC + OIDC + sessions + back-channel logout +
  break-glass); rename 'Bundle 1 (RBAC) standards covered
  separately' H2; clean per-row Phase references.
- docs/README.md — rewrite nav-table entries to drop Bundle 1/2
  parentheticals; retitle 'Enable OIDC SSO' migration entry.

No code or test changes; pure operator-facing prose polish for
the v2.1.0 tag.
2026-05-11 16:54:07 +00:00

Enable OIDC SSO

Last reviewed: 2026-05-10

This guide walks an operator already running certctl with API-key auth + RBAC through enabling OIDC SSO. The path is additive: API-key auth keeps working unchanged; OIDC sits alongside as a second authentication surface for human users.

If you are upgrading from a pre-RBAC (v2.0.x) deployment, finish api-keys-to-rbac.md first. If you have not deployed certctl at all, start with getting-started/quickstart.md. For the canonical mental model + per-flow threat coverage, see security.md and auth-threat-model.md.

What "enable OIDC" gives you

After this migration:

  • Human operators can log in via the OIDC button on the certctl login page (one button per configured IdP).
  • The IdP authenticates the user; certctl validates the returned ID token, mints a session cookie, and redirects to the dashboard.
  • IdP groups → certctl roles are operator-configured (e.g. engineering@example.com → r-operator).
  • Every login emits an audit row (auth.oidc_login_succeeded) attributing the action to the federated user, NOT to a shared API key.
  • The first user from a configured admin group (when CERTCTL_BOOTSTRAP_ADMIN_GROUPS is set) becomes admin per tenant; one-shot per the admin-existence probe.

What does NOT change:

  • API keys keep working. Existing automation continues to authenticate via the Authorization: Bearer header exactly as before.
  • The break-glass admin path stays default-OFF.
  • The auditor split + approval workflow + RBAC primitive are unchanged.

Pre-requisites

On certctl side:

  • Server build ≥ v2.1.0. Confirm via curl https://<your-host>:8443/api/v1/version.
  • CERTCTL_CONFIG_ENCRYPTION_KEY set in the server environment. This is the passphrase that encrypts the OIDC client_secret at rest. Use a stable, secrets-manager-stored value at least 32 random bytes long. The server refuses to start if the key is missing AND any source='database' rows already exist (CWE-311 fail-closed gate). Set this before doing anything else.
  • An admin actor available to drive the configuration. The actor needs the auth.oidc.create + auth.oidc.edit permissions; r-admin carries both by default. Get one via the day-0 bootstrap path if you don't have one yet.
  • HTTPS-only control plane (post-v2.2 milestone — this is the default). The OIDC redirect URI MUST be https://.

On IdP side:

  • A Keycloak / Authentik / Okta / Auth0 / Entra ID / Google Workspace tenant where you can register an OIDC application. Free dev tiers work for evaluation. See the per-IdP runbook at oidc-runbooks/index.md.
  • Network reachability from certctl-server to the IdP's /.well-known/openid-configuration discovery endpoint. The certctl service fetches discovery + JWKS at provider creation and at every RefreshKeys call.

Step-by-step

1. Pin CERTCTL_CONFIG_ENCRYPTION_KEY

If your deployment already has it set (the CWE-311 fail-closed gate enforces this for any source='database' issuer/target row), skip this step. If you don't:

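# Generate a 32-byte random key + base64-encode it.
# The subshell's umask 077 creates the file owner-only from the start,
# so the key is never briefly readable by other users before chmod runs.
(umask 077 && openssl rand -base64 32 > /etc/certctl/config-encryption-key)
chmod 600 /etc/certctl/config-encryption-key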

Then make the server consume it at boot:

# In your environment, systemd unit, k8s Secret, etc.
export CERTCTL_CONFIG_ENCRYPTION_KEY="$(cat /etc/certctl/config-encryption-key)"

Restart the server. Confirm the boot log does NOT show the ErrEncryptionKeyRequired warning. If it does, the server refuses to start because there's pre-existing source='database' material that needs to be re-sealed; see docs/operator/security.md for the re-encryption flow.
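
Before restarting, it can help to confirm that the key file really decodes to 32 bytes. The following is a minimal sketch, assuming the file holds the base64 output of the openssl command above; the helper name key_bytes is hypothetical and not part of certctl.

```shell
#!/bin/sh
# Hypothetical helper: print how many raw bytes a base64-encoded key file decodes to.
key_bytes() {
    # Strip newlines so the decoder sees a single base64 line (-A mode).
    tr -d '\n' < "$1" | openssl base64 -d -A | wc -c | tr -d ' '
}

# Demo against a throwaway file; in practice pass /etc/certctl/config-encryption-key.
demo_key="$(mktemp)"
openssl rand -base64 32 > "$demo_key"
echo "decoded key length: $(key_bytes "$demo_key") bytes"
rm -f "$demo_key"
```

A result below 32 suggests the file was truncated or generated with a shorter length.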

2. Pick an IdP runbook + complete the IdP-side configuration

Pick the runbook for your IdP and do EVERYTHING in its IdP-side section. The runbooks are at docs/operator/oidc-runbooks/. What you need from the runbook before continuing here:

  • The IdP's discovery URL (the iss value certctl will validate against).
  • An OIDC client ID + client secret. Save the secret; you'll paste it into certctl in step 3.
  • At least one IdP group with the users who should be allowed to log in. The runbook walks the group-claim mapper config.
  • The IdP-side group claim shape — most IdPs emit string-array under a groups key, but Auth0 uses namespaced URL keys (https://your-namespace/groups) and Entra ID emits group OBJECT IDs (GUIDs) instead of names. The runbook calls out the per-IdP shape.

3. Configure the certctl-side OIDC provider

Via the GUI (recommended for first-time setup):

  1. Sign in as an admin actor.
  2. Navigate to Auth → OIDC Providers in the sidebar.
  3. Click Configure provider.
  4. Fill in the form using the values from step 2's runbook.
  5. Click Save.

If the discovery doc fetch fails, the modal surfaces the error inline. Most-common cause: a typo in the issuer URL.

Or via the CLI / MCP:

curl -X POST https://<your-certctl-host>:8443/api/v1/auth/oidc/providers \
  -H "Authorization: Bearer ${CERTCTL_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Keycloak",
    "issuer_url": "https://keycloak.example.com/realms/certctl",
    "client_id": "certctl",
    "client_secret": "<paste-the-secret>",
    "redirect_uri": "https://certctl.example.com:8443/auth/oidc/callback",
    "groups_claim_path": "groups",
    "groups_claim_format": "string-array",
    "scopes": ["openid", "profile", "email"],
    "iat_window_seconds": 300,
    "jwks_cache_ttl_seconds": 3600
  }'

The MCP equivalent (certctl_auth_create_oidc_provider) accepts the same JSON shape.

4. Add the group → role mappings

Empty mapping list = nobody can log in via this provider (the fail-closed contract; pinned by ErrGroupsUnmapped). Add at least one mapping BEFORE announcing the SSO endpoint to users.

Via the GUI: Auth → OIDC Providers → <provider> → Group → role mappings → Add.

Via the API:

curl -X POST https://<your-certctl-host>:8443/api/v1/auth/oidc/group-mappings \
  -H "Authorization: Bearer ${CERTCTL_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "provider_id": "<provider-id-from-step-3>",
    "group_name": "engineering@example.com",
    "role_id": "r-operator"
  }'

A typical setup adds two or three mappings: engineers → r-operator, viewers → r-viewer, optionally admins → r-admin. For Entra ID, use group object IDs (GUIDs) NOT names; for Auth0, use the bare group name from inside the namespaced claim array.
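
The fail-closed behavior can be pictured with a small lookup sketch. This is an illustration of the contract described above, not certctl's implementation; the mapping table, the second group name, and the function name roles_for_groups are all assumptions made for the example.

```shell
#!/bin/sh
# Illustrative mapping table (made-up values): "<group>=<role>", one per line.
MAPPINGS='engineering@example.com=r-operator
viewers@example.com=r-viewer'

# Print the roles matched by the given groups; fail when nothing matches,
# mirroring the "empty mapping = nobody can log in" contract.
roles_for_groups() {
    matched=""
    for group in "$@"; do
        # Exact, case-sensitive comparison against each mapping line.
        while IFS='=' read -r mapped_group role; do
            if [ "$group" = "$mapped_group" ]; then
                matched="$matched $role"
            fi
        done <<EOF
$MAPPINGS
EOF
    done
    [ -n "$matched" ] || return 1
    echo "${matched# }"
}

roles_for_groups "engineering@example.com" "unrelated-group"
roles_for_groups "Engineering@example.com" || echo "no roles assigned"
```

The second call fails because the comparison is case-sensitive, which is the same class of mismatch that surfaces as "no roles assigned" at login.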

5. (Optional) Configure first-admin bootstrap

If your deployment has no admin actor yet AND you want the first OIDC-authenticated user from a specific group to become admin (instead of using the env-var-token bootstrap path), set:

export CERTCTL_BOOTSTRAP_ADMIN_GROUPS=admins
export CERTCTL_BOOTSTRAP_OIDC_PROVIDER_ID=<provider-id-from-step-3>

Restart the server. The first user with the admins group claim from that provider becomes admin on login, once per tenant. Subsequent logins go through normal group-role mapping. Every grant emits an audit row (bootstrap.oidc_first_admin).

If you already have an admin actor (likely — you needed one to run step 3), the bootstrap hook silently falls through to normal mapping; no harm done. The probe is one-shot per tenant and can't double-grant.

6. Verify with a single test user

Before announcing the SSO endpoint to your users, verify the full login flow with a test user from your IdP:

  1. Open https://<your-certctl-host>:8443/login in a fresh incognito window.

  2. The page should render Sign in with <provider> button(s) above the API-key form. If not, check that getAuthInfo is returning the oidc_providers field — curl https://<your-host>:8443/api/v1/auth/info should show the configured provider(s).

  3. Click the provider button. The browser redirects to the IdP, you authenticate, and the IdP redirects back. You should land on the certctl dashboard.

  4. Navigate to Auth → Sessions. You should see a row with your own actor ID and the current timestamp.

  5. Confirm the audit row:

    # Quote the URL so the shell does not try to glob-expand the "?".
    curl -s "https://<your-host>:8443/api/v1/audit?category=auth" \
      -H "Authorization: Bearer ${CERTCTL_API_KEY}" \
      | jq '.events[] | select(.action == "auth.oidc_login_succeeded")'
    

    You should see a row attributed to the federated user with details.provider_id matching your configuration.

If any step fails, see the Troubleshooting section below.

7. Announce the SSO endpoint

Once step 6 passes, the SSO endpoint is operational. Tell your users to log in via https://<your-host>:8443/login and click the provider button. API-key auth continues to work for automation; the two paths coexist.

Optional GUI hardening:

  • If you want the API-key form hidden once OIDC is configured, the operator can add a frontend feature flag in a follow-on commit. Default behavior keeps both paths visible (the API-key form stays for break-glass + Bearer-mode deploys).
  • If you want to revoke a user's session immediately (e.g. an employee left), go to Auth → Sessions → All actors (admin) and click Revoke on that user's row. The next request from that user's browser fails with 401.

Rollback

If you need to disable OIDC:

  1. Delete every group-role mapping for the provider:
    # GUI: Auth → OIDC Providers → <provider> → Group → role mappings → Remove (each)
    
  2. Delete the OIDC provider:
    # GUI: Auth → OIDC Providers → <provider> → Delete (type-confirm-name dialog)
    
    The server returns HTTP 409 if any user has an authenticated session minted via this provider; revoke those sessions first.
  3. The Sign in with <provider> button disappears from the login page on the next getAuthInfo round-trip (typically the next page load).
  4. Existing sessions continue to work until idle/absolute expiry. To force-revoke them, Auth → Sessions → All actors (admin) → revoke each row.

API-key auth continues to work throughout this rollback; you do not need to re-bootstrap or change any other configuration.

Troubleshooting

"Discovery doc fetch failed" at provider creation. The most common cause is a typo in the issuer URL. Curl the URL manually:

curl -v https://<idp-host>/<path>/.well-known/openid-configuration

If that returns 404, fix the issuer URL.

"IdP downgrade-attack defense" rejected provider creation. Your IdP advertises HS256/HS384/HS512 or none in id_token_signing_alg_values_supported. Configure the IdP to advertise only RS256 / RS512 / ES256 / ES384 / EdDSA before re-creating the provider in certctl. The relevant runbook section walks this.

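To see ahead of time whether an IdP would trip this check, the advertised algorithm list can be screened locally. This is a sketch of the rule as stated above (reject HS256/HS384/HS512 and none); the function name algs_are_safe is hypothetical and the logic is not taken from certctl's source.

```shell
#!/bin/sh
# Hypothetical pre-flight check: succeed only if none of the advertised
# ID-token signing algorithms is symmetric (HS*) or "none".
algs_are_safe() {
    for alg in "$@"; do
        case "$alg" in
            HS256|HS384|HS512|none) return 1 ;;
        esac
    done
    return 0
}

if algs_are_safe RS256 ES256; then echo "ok: asymmetric algorithms only"; fi
if ! algs_are_safe RS256 HS256; then echo "rejected: HS256 advertised"; fi
```

One way to obtain the arguments is to pipe the discovery document through jq -r '.id_token_signing_alg_values_supported[]' and pass the resulting words to the function.
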
Login redirects to IdP, user authenticates, but the callback redirects back to /login with "no roles assigned". The user authenticated successfully but their groups didn't match any configured mapping (ErrGroupsUnmapped). Check:

  • The user is a member of the IdP group you mapped.
  • The group-claim mapper is configured correctly at the IdP (the runbook walks per-IdP).
  • The group name in your certctl mapping exactly matches what the IdP emits — case-sensitive, no leading slash for Keycloak full-path-OFF.

Decode the ID token at jwt.io against the IdP's JWKS to see exactly what's in the groups claim.
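
If pasting a token into a website is not acceptable in your environment, the payload can also be inspected locally. The sketch below only decodes the middle segment of the token and does NOT verify the signature; the helper name jwt_payload and the sample claims are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the JSON payload of a JWT WITHOUT verifying its signature.
jwt_payload() {
    # The payload is the second dot-separated segment, base64url-encoded.
    seg="$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')"
    # base64url omits padding; restore it so the decoder accepts the input.
    case $(( ${#seg} % 4 )) in
        2) seg="${seg}==" ;;
        3) seg="${seg}=" ;;
    esac
    printf '%s' "$seg" | openssl base64 -d -A
}

# Build a throwaway token with made-up claims to exercise the helper.
sample_claims='{"sub":"alice","groups":["engineering@example.com"]}'
sample_b64="$(printf '%s' "$sample_claims" | openssl base64 -A | tr '/+' '_-' | tr -d '=')"
jwt_payload "header.${sample_b64}.signature"
echo
```

Compare the printed groups value character for character with the group_name in your certctl mapping.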

ErrIssuerMismatch even though the discovery doc looks correct. The iss claim in the ID token must match OIDCProvider.IssuerURL byte-for-byte. Some IdPs include / omit a trailing slash; check the per-IdP runbook section on iss formatting.
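
A byte-for-byte match means plain string equality with no normalization. The following sketch shows how a trailing slash alone breaks the comparison; issuer_matches is a hypothetical name, and the URLs reuse the example issuer from step 3.

```shell
#!/bin/sh
# Hypothetical check: exact string equality, no trailing-slash normalization.
issuer_matches() {
    [ "$1" = "$2" ]
}

configured="https://keycloak.example.com/realms/certctl"
issuer_matches "$configured" "https://keycloak.example.com/realms/certctl" && echo "match"
issuer_matches "$configured" "https://keycloak.example.com/realms/certctl/" || echo "mismatch: trailing slash"
```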

oidc: pre-login session not found or already consumed. The user clicked the OIDC login button, then the browser tab idled past the 10-minute pre-login TTL OR the user opened the IdP login in a new tab and consumed the row from the first one. Have them retry from the login page.

oidc: state parameter mismatch (replay or forgery). Either the user double-submitted a callback URL (clicked it twice from email or browser history), or the request is a CSRF attempt. The pre-login row is single-use; a second consumption returns ErrPreLoginNotFound. Have them retry from the login page.

Sessions revoked but the user can still hit the API. Check the session contract: the cookie is HMAC-validated on every request, and Revoke deletes the session's database row. A __Host-certctl_session cookie that was never cleared on the client still reaches the session middleware, which returns 401 when the row lookup finds nothing. The middleware never serves stale data, so if a revoked user still receives successful responses, the cause is upstream of certctl (for example, a reverse proxy caching responses).

JWKS rotation: an IdP rotated its signing key and existing users start failing login. Click Refresh discovery cache on the OIDC provider detail page (or POST /api/v1/auth/oidc/providers/<id>/refresh). The certctl service re-fetches discovery + JWKS. New tokens validate immediately. The Keycloak integration test exercises this drill end to end.

Database row count drift. After OIDC is live, expect to see new rows under:

  • oidc_providers (one per configured provider)
  • group_role_mappings (one per configured mapping)
  • users (one per OIDC-authenticated user; certctl auto-upserts the row on login)
  • sessions (one per logged-in browser session; idle 1h / absolute 8h GC)
  • session_signing_keys (one active + retained-history rows post rotation)
  • oidc_pre_login_sessions (transient; 10-minute TTL, scheduler-GC'd)

All of these tables are tenant-scoped (tenant_id column); single-tenant deployments use the seeded t-default tenant.
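
The two session limits listed for the sessions table (idle 1h, absolute 8h) interact as sketched below. This is an illustration only: the function name session_expired is hypothetical, and treating a session as expired strictly after the limit is an assumption, not something the source states.

```shell
#!/bin/sh
IDLE_LIMIT=3600        # 1 hour idle limit, in seconds
ABSOLUTE_LIMIT=28800   # 8 hour absolute limit, in seconds

# Hypothetical check; arguments are epoch seconds: created, last_seen, now.
# Returns 0 (true) when either limit has been exceeded.
session_expired() {
    created=$1; last_seen=$2; now=$3
    [ $(( now - last_seen )) -gt "$IDLE_LIMIT" ] && return 0
    [ $(( now - created )) -gt "$ABSOLUTE_LIMIT" ] && return 0
    return 1
}

session_expired 0 3000 6000   && echo "expired" || echo "active"
session_expired 0 1000 5000   && echo "expired (idle)" || echo "active"
session_expired 0 28000 29000 && echo "expired (absolute)" || echo "active"
```

The third call shows that an actively used session still ends once its total age passes the absolute limit.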

What you can do next

  • Run docs/operator/oidc-runbooks/<your-idp>.md end to end to fill in the validation checklist + sign-off line.
  • Read docs/operator/auth-benchmarks.md for the steady-state + cold-cache performance baselines.
  • Review the auth-threat-model.md OIDC + sessions + break-glass sections to understand the failure modes the federated-identity surface defends against.
  • Schedule a rotation reminder for the OIDC client_secret (typically 6-12 months; the IdP doesn't auto-rotate it). Edit the provider via the GUI when the time comes; leaving client_secret blank in the edit form preserves the existing ciphertext, while providing a value rotates it.

v2.1.0 carries a wire-format change to the three auth cookies: they now carry the __Host- prefix. The cookie names are:

  • __Host-certctl_session (was certctl_session)
  • __Host-certctl_csrf (was certctl_csrf)
  • __Host-certctl_oidc_pending (was certctl_oidc_pending)

The rename gains browser-enforced subdomain-takeover defense: a __Host-* cookie can only be set with Path=/ + Secure + no Domain attribute, and the browser rejects any subdomain attempt to overwrite it. The protection is free (the existing cookies already met the prerequisites) but the wire-format change means:

  • Every active session is invalidated by the deploy that lands this change. Operators see one re-authentication prompt; subsequent logins issue the new __Host-*-prefixed cookie.
  • The pre-login cookie's Path widens from /auth/oidc/ to / — required by the __Host- prefix. The cookie lifetime is unchanged (10 minutes) and is only ever consumed by the callback handler; the wider path scope is harmless.
  • No operator action required beyond accepting the one-time re-login window. The GUI's CSRF cookie reader was updated in lockstep; existing bookmarked deep links work without modification.

If you have GUI customizations that read document.cookie directly, update them to look for __Host-certctl_csrf (the lookup in web/src/api/client.ts is the in-tree reference).
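
The __Host- prerequisites listed above (Secure, Path=/, no Domain attribute) can be checked against a Set-Cookie value with a small lint. This is a sketch under those stated rules; the function name host_prefix_ok and the sample header values are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical lint: does a Set-Cookie value satisfy the __Host- prefix rules?
host_prefix_ok() {
    # Lower-case, strip spaces, and append ";" so every attribute is ";attr;"-delimited.
    c="$(printf '%s' "$1" | tr 'A-Z' 'a-z' | tr -d ' ');"
    case "$c" in __host-*) ;; *) return 1 ;; esac          # name must carry the prefix
    case "$c" in *";secure;"*) ;; *) return 1 ;; esac      # Secure is required
    case "$c" in *";path=/;"*) ;; *) return 1 ;; esac      # Path must be exactly /
    case "$c" in *";domain="*) return 1 ;; esac            # Domain must be absent
    return 0
}

host_prefix_ok "__Host-certctl_session=abc; Path=/; Secure; HttpOnly" && echo "valid"
host_prefix_ok "__Host-certctl_oidc_pending=xyz; Path=/auth/oidc/; Secure" || echo "invalid: Path is not /"
```

The second call illustrates why the pre-login cookie's Path had to widen to / when it gained the prefix.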

Cross-references