README:
- Rewrite Status block: drop the stale 'federated identity not yet
shipped' line; flag v2.1.0 OIDC + sessions + back-channel logout
+ break-glass as early-access; encourage GitHub issues for IdP
rough edges. (A1 framing — keep early-access umbrella, no
SAML/WebAuthn/JIT roadmap teaser.)
- Add OIDC SSO bullet to 'What it does' covering per-IdP runbooks,
group-claim → role mapping, AES-256-GCM client_secret encryption,
JWKS auto-refresh, PKCE-S256, RFC 9700 §4.7.1 pre-login binding,
RFC 9207 iss check, __Host- cookies, CSRF rotation, idle+absolute
expiry, BCL, break-glass admin.
- Update Security paragraph: three auth paths (API keys / OIDC /
break-glass), HMAC-signed sessions, CSRF rotation, RFC OIDC BCL.
- Correct CI coverage thresholds against
.github/coverage-thresholds.yml (service 70%, handler 75%,
crypto 88%, auth packages 85-95%); 'static analysis' replaces
the inflated '11 linters' claim (actual count is 4 active).
Docs B3 sweep — strip operator-facing 'Bundle N' / 'Phase N' tags:
- docs/operator/auth-threat-model.md — rewrite intro; rename 5 H2
sections (API-key + RBAC defenses / OIDC + sessions + break-glass
defenses / OIDC + sessions threat catalogue / Closed federated-
identity threats / Future-work threats); clean ~12 H3/prose hits.
- docs/operator/rbac.md — strip Bundle 1 framing from intro,
scope_id deferral note, MCP tools section, day-0 bootstrap, and
'Where to look next'.
- docs/operator/auth-benchmarks.md — drop 'Phase 14' framing from
title intro, hardware floor caption, result table caption,
methodology, and pre-merge audit section.
- docs/operator/security.md — already cleaned earlier this session
(RBAC / day-0 / approval-bypass / OIDC federation / sessions /
OIDC first-admin / break-glass H3s).
- docs/operator/oidc-runbooks/{index,keycloak,authentik,okta,
azure-ad}.md — strip Auth Bundle 2 framing + Phase 10/3/4
references; replace with feature-name prose.
- docs/operator/legacy-clients-tls-1.2.md — drop Bundle F / M-023
audit-reference framing; keep CWE-326.
- docs/operator/database-tls.md — drop Bundle B / M-018 framing
from intro + Helm section.
- docs/operator/runbooks/disaster-recovery.md — drop 'Production
hardening II Phase 10' status callout.
- docs/migration/oidc-enable.md — retitle 'Enable OIDC SSO';
strip Bundle 1/2 framing from prereqs, troubleshooting, related
docs; update __Host- cookie callout from 'audit MED-14' to
v2.1.0-BREAKING.
- docs/migration/api-keys-to-rbac.md — strip Bundle 1 framing from
intro, migration table, IsAdmin section, and cross-references.
- docs/migration/acme-from-cert-manager.md — strip residual
'Phase 5' tags from cert-manager integration test references.
- docs/reference/configuration.md — retitle Auth section.
- docs/reference/profiles.md — strip Bundle 1 Phase 9 framing
from RequiresApproval section + Related list.
- docs/reference/auth-standards-implemented.md — rewrite intro
(API-key + RBAC + OIDC + sessions + back-channel logout +
break-glass); rename 'Bundle 1 (RBAC) standards covered
separately' H2; clean per-row Phase references.
- docs/README.md — rewrite nav-table entries to drop Bundle 1/2
parentheticals; retitle 'Enable OIDC SSO' migration entry.
No code or test changes; pure operator-facing prose polish for
the v2.1.0 tag.
Keycloak OIDC runbook
Last reviewed: 2026-05-10
This is the canonical reference runbook for wiring certctl's OIDC SSO surface against Keycloak. Keycloak is a free / open-source identity provider that runs on-prem or self-hosted; it is also the load-bearing test fixture for certctl's OIDC integration tests (internal/auth/oidc/testfixtures/keycloak.go), so the certctl-side validation pipeline is exhaustively exercised against it.
If your IdP is something else (Okta, Auth0, Azure AD, Authentik, Google Workspace), see the per-IdP siblings in this directory. The mental model + certctl-side wiring are identical; only the IdP-side console differs.
Prerequisites
On the Keycloak side:
- Keycloak ≥ 25.0 (older versions work but the screen flows differ slightly — the integration test fixture pins 25.0).
- Admin access to a realm: either an existing tenant realm or a fresh one created for certctl. Don't share Keycloak's master realm; create a dedicated realm.
- Network reachability from certctl-server to the Keycloak https://<keycloak-host>/realms/<realm-name> discovery endpoint. The certctl service fetches /.well-known/openid-configuration at provider creation and at every RefreshKeys call.
- Keycloak's signing alg set to RS256 (default) or any of: RS512, ES256, ES384, EdDSA. HS256/HS384/HS512 and none are rejected by certctl's IdP-downgrade-attack defense at provider creation time.
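As a pre-flight check before creating the provider, the discovery URL and the signing-algorithm rule above can be sketched in a few lines of Python. This is an illustration only, not certctl's implementation: the function names are hypothetical, the deny-list is copied from the prerequisite above, and the exact rejection logic inside certctl is assumed rather than confirmed.

```python
# Algorithms the runbook says certctl rejects at provider creation.
DENIED_ALGS = {"HS256", "HS384", "HS512", "none"}


def discovery_url(issuer: str) -> str:
    """Build the discovery URL for an issuer such as https://host/realms/name."""
    return issuer.rstrip("/") + "/.well-known/openid-configuration"


def denied_algs(discovery_doc: dict) -> list:
    """Return advertised ID-token signing algs that fall in the deny-list."""
    advertised = discovery_doc.get("id_token_signing_alg_values_supported", [])
    return sorted(alg for alg in advertised if alg in DENIED_ALGS)


# Hypothetical discovery document fragment for a realm with an HMAC key provider.
doc = {"id_token_signing_alg_values_supported": ["RS256", "HS256"]}
print(discovery_url("https://keycloak.example.com/realms/certctl"))
print(denied_algs(doc))
```

An empty list from denied_algs suggests the realm should pass the algorithm check; any entry points at a key provider to disable in Keycloak first.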
On the certctl side:
- CERTCTL_CONFIG_ENCRYPTION_KEY set to a stable secret (production deployments only; the encryption-at-rest layer for the OIDC client_secret depends on it).
- An admin actor holding auth.oidc.create + auth.oidc.edit (held by r-admin by default; granted via the certctl_auth_assign_role_to_key MCP tool or the GUI's Auth → Keys page).
- Server build ≥ v2.1.0.
IdP-side configuration
The same configuration you'll do by hand here is what the testcontainers fixture imports from internal/auth/oidc/testfixtures/keycloak-realm.json — read that file alongside this runbook to see the exact JSON shape Keycloak persists.
1. Create or pick a realm
In the Keycloak admin console (https://<keycloak-host>/admin/), drop into the realm you'll use. If creating a new one, the realm name will become part of the issuer URL: https://<keycloak-host>/realms/<realm-name>.
2. Create the OIDC client
Clients → Create client:
- Client type: OpenID Connect
- Client ID: certctl (or whatever you prefer; it goes into OIDCProvider.client_id on the certctl side).
- Always display in console: off.
- Click Next.
On the capability config page:
- Client authentication: On (this makes the client confidential, which is what certctl requires).
- Authorization: off.
- Standard flow: on (auth-code with PKCE — this is the path certctl uses).
- Direct access grants: off (this is the ROPC grant; the test fixture turns it on for convenience, but production should NOT).
- Implicit flow: off.
- Service accounts roles: off.
- Click Next.
Login settings:
- Root URL: leave blank.
- Home URL: blank.
- Valid redirect URIs: https://<your-certctl-host>:8443/auth/oidc/callback (ONE entry, exact match). Wildcards (*) work for local dev (http://localhost:*) but production should pin the exact host.
- Valid post logout redirect URIs: blank or + (matches the redirect URI list).
- Web origins: + (matches the redirect URI origin) or empty.
- Click Save.
On the saved client's Credentials tab, copy the Client secret — you'll need it for the certctl-side payload.
3. Create the groups
Groups → Create group:
- Repeat for every certctl role you want to map to a group. A typical setup creates two:
  - certctl-engineers (intended target: r-operator)
  - certctl-viewers (intended target: r-viewer)
- Optionally a certctl-admins group → r-admin for break-glass-free first-admin bootstrap; see the auth-threat-model.md section on bootstrap admins.
4. Configure the group-membership claim mapper
This is the load-bearing step — without it, the ID token won't carry a groups claim and every login fails closed with ErrGroupsUnmapped.
Clients → certctl → Client scopes → certctl-dedicated → Add mapper → By configuration → Group Membership:
- Name: groups
- Token Claim Name: groups
- Full group path: off (so the claim emits certctl-engineers, not /certctl-engineers; matches the certctl string-array group-claim format).
- Add to ID token: on.
- Add to access token: on (optional but recommended; the userinfo-fallback path uses it).
- Add to userinfo: on.
- Click Save.
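To make the effect of the Full group path toggle concrete, here is a small Python sketch that reads a string-array groups claim from already-decoded ID-token claims. The claim payloads are hypothetical examples, the function name is illustrative, and treating the claim path as a single top-level key is an assumption.

```python
def group_names(claims: dict, claim_path: str = "groups") -> list:
    """Read a string-array groups claim; return [] if it is missing or malformed."""
    value = claims.get(claim_path)
    if not isinstance(value, list) or not all(isinstance(g, str) for g in value):
        return []
    return value


# Hypothetical claims for the same user under each mapper setting.
full_path_off = {"sub": "example-sub", "groups": ["certctl-engineers"]}
full_path_on = {"sub": "example-sub", "groups": ["/certctl-engineers"]}

mapped_group = "certctl-engineers"
# With the toggle off the emitted name matches the mapping exactly.
print(mapped_group in group_names(full_path_off))
# With the toggle on the leading slash prevents an exact match.
print(mapped_group in group_names(full_path_on))
```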
5. Create the user(s)
Users → Add user:
- Username: alice (or however you identify operators).
- Email: required (used as the certctl-side User.Email).
- First name + last name: optional but populates User.DisplayName.
- Email verified: on if you trust the user.
- Click Create.
On the saved user's Credentials tab:
- Set a password. Mark Temporary if you want the user to reset on first login.
On the Groups tab:
- Join the user to the group(s) you created in step 3.
certctl-side configuration
Via the GUI
- Sign in as an admin actor.
- Navigate to Auth → OIDC Providers in the sidebar.
- Click Configure provider.
- Fill in:
  - Display name: Keycloak (free-text; what end-users see on the login page button).
  - Issuer URL: https://<keycloak-host>/realms/<realm-name>.
  - Client ID: certctl (matches step 2 above).
  - Client secret: paste the secret from step 2's Credentials tab.
  - Redirect URI: https://<your-certctl-host>:8443/auth/oidc/callback.
  - Groups claim path: groups (the default; matches step 4's Token Claim Name).
  - Groups claim format: string-array (the default).
  - Fetch userinfo: off (Keycloak emits groups in the ID token; userinfo fallback is for IdPs that don't).
  - Scopes: openid profile email (the certctl service prepends openid if missing).
  - IAT window seconds: 300 (default).
  - JWKS cache TTL seconds: 3600 (default).
- Click Save.
If the discovery doc fetch fails, the modal surfaces the error inline. The most common cause is a typo in the issuer URL — Keycloak emits 404 for any path under /realms/ that doesn't match an actual realm.
Via the API
curl -X POST "https://<your-certctl-host>:8443/api/v1/auth/oidc/providers" \
  -H "Authorization: Bearer ${CERTCTL_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Keycloak",
    "issuer_url": "https://keycloak.example.com/realms/certctl",
    "client_id": "certctl",
    "client_secret": "<paste-the-secret>",
    "redirect_uri": "https://certctl.example.com:8443/auth/oidc/callback",
    "groups_claim_path": "groups",
    "groups_claim_format": "string-array",
    "fetch_userinfo": false,
    "scopes": ["openid", "profile", "email"],
    "iat_window_seconds": 300,
    "jwks_cache_ttl_seconds": 3600
  }'
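For operators who script provider creation rather than using curl, the same request can be assembled with the Python standard library. This sketch only builds the request object and does not send it. The endpoint path, header names, and payload fields come from the curl example above; the helper name and the example API key are placeholders.

```python
import json
import urllib.request


def build_create_provider_request(base_url: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST that registers an OIDC provider."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/auth/oidc/providers",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


payload = {
    "name": "Keycloak",
    "issuer_url": "https://keycloak.example.com/realms/certctl",
    "client_id": "certctl",
    "client_secret": "<paste-the-secret>",
    "redirect_uri": "https://certctl.example.com:8443/auth/oidc/callback",
    "groups_claim_path": "groups",
    "groups_claim_format": "string-array",
    "fetch_userinfo": False,
    "scopes": ["openid", "profile", "email"],
}
req = build_create_provider_request("https://certctl.example.com:8443", "example-api-key", payload)
# Passing req to urllib.request.urlopen would submit it to the server.
```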
Via MCP
certctl_auth_create_oidc_provider {
  "name": "Keycloak",
  "issuer_url": "https://keycloak.example.com/realms/certctl",
  "client_id": "certctl",
  "client_secret": "<paste-the-secret>",
  "redirect_uri": "https://certctl.example.com:8443/auth/oidc/callback",
  "groups_claim_path": "groups",
  "groups_claim_format": "string-array",
  "scopes": ["openid", "profile", "email"]
}
Add the group→role mappings
GUI: Auth → OIDC Providers → Keycloak → Group → role mappings → Add.
- IdP group: certctl-engineers → certctl role: r-operator.
- IdP group: certctl-viewers → certctl role: r-viewer.
API equivalent: POST /api/v1/auth/oidc/group-mappings with {"provider_id": "<id>", "group_name": "certctl-engineers", "role_id": "r-operator"}. MCP equivalent: certctl_auth_add_group_mapping.
Empty mapping list = nobody can log in via Keycloak (the fail-closed contract). Add at least one before announcing the SSO endpoint to users.
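The fail-closed contract can be pictured as a lookup that refuses to return an empty role set. The sketch below is a simplified model under stated assumptions, not certctl's code: GroupsUnmapped stands in for the ErrGroupsUnmapped error named in this runbook, and the mapping table is a plain dictionary.

```python
class GroupsUnmapped(Exception):
    """Raised when none of the user's IdP groups map to a certctl role."""


def resolve_roles(groups: list, mappings: dict) -> set:
    """Map IdP group names to certctl roles; fail closed if nothing matches."""
    roles = {mappings[g] for g in groups if g in mappings}
    if not roles:
        raise GroupsUnmapped("no roles assigned")
    return roles


MAPPINGS = {"certctl-engineers": "r-operator", "certctl-viewers": "r-viewer"}
# Unmapped groups such as "staff" are ignored; mapped ones yield roles.
print(resolve_roles(["certctl-engineers", "staff"], MAPPINGS))
```

With an empty mapping table every call raises, which mirrors the "nobody can log in" behavior described above.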
Verification
End-to-end login
- Open https://<your-certctl-host>:8443/login in a fresh incognito window.
- The page renders an OIDC button block with Sign in with Keycloak (the display name from the create-provider step).
- Click it. The browser redirects to Keycloak, you authenticate as alice, Keycloak redirects back to certctl, and you land on the dashboard.
- Navigate to Auth → Sessions. You should see a row with your own actor ID, the IP you logged in from, and the current timestamp under "last seen".
Audit trail
curl "https://<your-certctl-host>:8443/api/v1/audit?category=auth" \
  -H "Authorization: Bearer ${CERTCTL_API_KEY}" | jq '.events[] | select(.action == "auth.oidc_login_succeeded")'
You should see a row for the login above, with details.provider_id matching the Keycloak provider's id and details.subject set to the Keycloak user's sub claim (typically a UUID).
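If jq is not available, the same filter is easy to express in Python over the parsed response body. The field names (events, action, details.provider_id, details.subject) are taken from the text above; the sample response and the provider id "p-1" are made up for illustration.

```python
def successful_oidc_logins(audit_response: dict, provider_id: str) -> list:
    """Keep only auth.oidc_login_succeeded events for one provider."""
    return [
        event
        for event in audit_response.get("events", [])
        if event.get("action") == "auth.oidc_login_succeeded"
        and event.get("details", {}).get("provider_id") == provider_id
    ]


# Hypothetical response body with one success and one failure.
sample = {
    "events": [
        {"action": "auth.oidc_login_succeeded",
         "details": {"provider_id": "p-1", "subject": "example-sub"}},
        {"action": "auth.oidc_login_failed",
         "details": {"provider_id": "p-1"}},
    ]
}
for event in successful_oidc_logins(sample, "p-1"):
    print(event["details"]["subject"])
```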
JWKS-rotation drill
Operator action when Keycloak rotates its realm signing key:
- In Keycloak: Realm settings → Keys → Providers → Add provider → rsa-generated, set priority higher than the current key (e.g. 200), enabled = on, active = on.
- In certctl: GUI → Auth → OIDC Providers → Keycloak → Refresh discovery cache button. Or the API equivalent: POST /api/v1/auth/oidc/providers/<id>/refresh.
- Run another login. The new ID token is signed under the new key; the certctl service validates it against the freshly-fetched JWKS doc.
The Keycloak integration test TestKeycloakIntegration_JWKSRotation_RefreshKeysPicksUpNewKey exercises this exact flow end to end.
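Why the manual refresh matters is easier to see with a toy model of a TTL-based JWKS cache. The class below is a sketch under assumptions, not certctl's implementation: the fetch callable stands in for an HTTP GET of the realm's JWKS endpoint, and the 3600-second default mirrors the JWKS cache TTL shown earlier.

```python
import time
from typing import Callable, Optional


class JWKSCache:
    """Caches a JWKS document and refetches it on expiry or on demand."""

    def __init__(self, fetch: Callable[[], dict], ttl_seconds: int = 3600,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._keys = {}
        self._fetched_at = None

    def refresh(self) -> None:
        """Refetch the JWKS and index its keys by kid."""
        doc = self._fetch()
        self._keys = {key["kid"]: key for key in doc.get("keys", [])}
        self._fetched_at = self._clock()

    def key_for(self, kid: str) -> Optional[dict]:
        """Return the cached key for kid, refetching first if the TTL has passed."""
        if self._fetched_at is None or self._clock() - self._fetched_at >= self._ttl:
            self.refresh()
        return self._keys.get(kid)


idp = {"keys": [{"kid": "old-key", "kty": "RSA"}]}  # stand-in for Keycloak's JWKS
cache = JWKSCache(fetch=lambda: idp)
cache.key_for("old-key")                            # first lookup populates the cache
idp["keys"] = [{"kid": "new-key", "kty": "RSA"}]    # Keycloak rotates its signing key
stale = cache.key_for("new-key")                    # cache is still fresh, kid unknown
cache.refresh()                                     # the manual refresh from the drill
fresh = cache.key_for("new-key")                    # the new key now resolves
```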
Troubleshooting
"Discovery doc fetch failed" at provider creation.
The most common cause is a wrong issuer URL — typo in realm name, missing /realms/ segment, or HTTP→HTTPS redirect that the Go client doesn't follow without explicit headers. Curl the URL manually:
curl -v https://<keycloak-host>/realms/<realm-name>/.well-known/openid-configuration
If that returns 404, fix the realm name. If it returns 200 but certctl still fails, check cmd/server logs for the wrapped error.
"IdP downgrade-attack defense" rejected provider creation.
Keycloak's realm has a signing key advertised in id_token_signing_alg_values_supported that's in certctl's deny-list (HS256/HS384/HS512/none). Check Realm settings → Keys → Providers — disable any HMAC key providers and re-create the provider in certctl.
Login redirects to Keycloak, the user authenticates, but the callback redirects back to /login with "no roles assigned".
The user authenticated successfully but their groups didn't match any configured mapping (ErrGroupsUnmapped). Check:
- The user is actually a member of the group you mapped (Users → user → Groups tab in Keycloak).
- The group-membership mapper is configured correctly (Clients → certctl → Client scopes → certctl-dedicated → mappers → groups → "Full group path: off" matters).
- The group name in your certctl mapping exactly matches what Keycloak emits — case-sensitive, no leading slash if "Full group path: off".
You can confirm what Keycloak is actually emitting by decoding the ID token at jwt.io against the Keycloak public key, or by enabling certctl's debug logging on the OIDC service for one login (logs are scrubbed of token contents per the OIDC service's token-leak hygiene contract; debug logs surface only the resolved group list and the mapping decision).
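For a local alternative to a web decoder, the payload segment of an ID token can be inspected with the Python standard library. This sketch decodes without verifying the signature, so it is suitable only for checking which groups the token carries; the function name is illustrative.

```python
import base64
import json


def unverified_claims(id_token: str) -> dict:
    """Decode the payload segment of a compact JWT without verifying it."""
    parts = id_token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a compact JWT with three segments")
    payload = parts[1]
    # JWT segments drop base64 padding; restore it before decoding.
    padded = payload + "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))
```

Calling unverified_claims(token).get("groups") on a captured token shows the exact strings to compare against the certctl group mappings.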
"id_token verify failed: token used before issued"
Clock skew between Keycloak and certctl-server. Either align both to NTP, or bump iat_window_seconds on the OIDC provider config (default 300 = 5 minutes). The certctl service caps iat_window_seconds at 600.
"oidc: pre-login session not found or already consumed"
The user clicked the OIDC login button, then the browser tab idled past the 10-minute pre-login TTL or the user opened the IdP login in a new tab and consumed the row from the first one. Have them retry.
"oidc: state parameter mismatch (replay or forgery)"
Either the user double-submitted a callback URL (clicked it twice from email or browser history), or the request is a CSRF attempt. The pre-login row is single-use; a second consumption returns ErrPreLoginNotFound. Have them retry from the login page.
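Both pre-login failures follow from two properties of the pre-login row: it is single-use and it expires after 10 minutes. A minimal in-memory model is sketched below; the class, its methods, and the dictionary storage are hypothetical and say nothing about how certctl persists these rows.

```python
import time
from typing import Callable

PRE_LOGIN_TTL_SECONDS = 600  # the 10-minute TTL mentioned in this runbook


class PreLoginStore:
    """Single-use, time-limited store keyed by the OIDC state parameter."""

    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._clock = clock
        self._rows = {}  # state -> creation time

    def create(self, state: str) -> None:
        self._rows[state] = self._clock()

    def consume(self, state: str) -> bool:
        """Remove the row; succeed only if it existed and has not expired."""
        created = self._rows.pop(state, None)
        if created is None:
            return False  # never created, or already consumed
        return self._clock() - created <= PRE_LOGIN_TTL_SECONDS


store = PreLoginStore()
store.create("state-abc")
first = store.consume("state-abc")   # the legitimate callback
second = store.consume("state-abc")  # a replayed or double-submitted callback
```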
Sessions revoked but the user can still hit the API.
Check the session contract: the cookie is HMAC-validated on every request, but Revoke deletes the database row, not the cookie. A __Host-certctl_session cookie that was never cleared on the client still reaches the server's session middleware, which returns 401 on the missing-row lookup. If the user still sees successful responses, the likely cause is a reverse proxy serving cached responses. The middleware never serves stale data, so the issue is upstream of certctl in this case.
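The two-step check described here (signature first, then row lookup) can be sketched as follows. The cookie layout "<session_id>.<hex HMAC>", the SHA-256 choice, and the in-memory set of live sessions are all assumptions made for illustration; certctl's actual cookie format is not specified in this runbook.

```python
import hashlib
import hmac


def sign(session_id: str, key: bytes) -> str:
    """Produce a cookie value in the assumed '<session_id>.<hex mac>' layout."""
    mac = hmac.new(key, session_id.encode(), hashlib.sha256).hexdigest()
    return f"{session_id}.{mac}"


def authenticate(cookie: str, key: bytes, live_sessions: set) -> int:
    """Return an HTTP status: 200 for a live session, 401 otherwise."""
    session_id, _, mac = cookie.rpartition(".")
    expected = hmac.new(key, session_id.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(mac.encode(), expected.encode()):
        return 401  # forged or corrupted cookie
    if session_id not in live_sessions:
        return 401  # signature is valid but the row was revoked
    return 200


key = b"example-signing-key"
live = {"sess-1"}
cookie = sign("sess-1", key)
before = authenticate(cookie, key, live)  # session row exists
live.discard("sess-1")                    # Revoke deletes the row
after = authenticate(cookie, key, live)   # same cookie, row is gone
```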
Validation checklist
Before signing off this runbook for production rollout, validate these end-to-end:
- auth.oidc_provider_created audit row appears after the create-provider POST.
- Sign in with Keycloak button renders on the login page after getAuthInfo returns the configured provider.
- A user with mapped groups completes the auth-code flow and lands on the dashboard.
- A user WITHOUT mapped groups gets the "no roles assigned" landing (not the dashboard).
- The auth.oidc_login_succeeded and auth.oidc_login_failed audit rows correctly distinguish the two cases.
- The Sessions page shows the new session, with self-pill on the caller's row.
- Revoking the session via the GUI causes the next API request from that browser to 401 + redirect to login.
- Running the JWKS-rotation drill (steps above) does not break in-flight logins; rotated tokens validate against the refreshed JWKS.
- Editing the provider with client_secret blank preserves the existing ciphertext (operator confirms by reading the oidc_providers.client_secret_encrypted column before + after the PUT; bytes unchanged).
Sign-off: _______________ (operator) on _______________ (date).