Closes Phase 11 of cowork/auth-bundle-2-prompt.md. Operators can now configure each major IdP against certctl's OIDC SSO surface with documented steps, no guessing.

Files
=====

docs/operator/oidc-runbooks/index.md (NEW):
* Index page linking all six per-IdP runbooks.
* Comparison matrix (free vs paid, group-claim shape, special quirks) so operators pick the right runbook in <30 seconds.
* "Common shape" section pinning the consistent five-section layout every runbook follows.
* "Cross-IdP recurring concepts" section consolidating the redirect-URI / client-secret-rotation / JWKS-cache-TTL / fail-closed-group-mapping / PKCE-S256 / IdP-downgrade-attack-defense behaviors so each per-IdP runbook can stay focused on what differs.

docs/operator/oidc-runbooks/keycloak.md (NEW):
* Canonical reference. Mirrors the testfixtures/keycloak-realm.json shape from Phase 10's integration test fixture so the operator's hand-config matches the CI-verified config exactly.
* Step-by-step IdP-side: realm → client → groups → group-mapper → user. Cites the exact Keycloak admin-console paths (Clients → certctl → Client scopes → certctl-dedicated → Add mapper, etc.).
* GUI + API + MCP equivalents for the certctl-side configuration.
* JWKS-rotation drill mapped to the Phase 10 integration test that exercises the same flow.
* 6 most-common troubleshooting paths mapped to certctl service-layer sentinel errors (ErrIssuerMismatch / ErrGroupsUnmapped / ErrPreLoginNotFound / ErrStateMismatch / IdP-downgrade-defense rejection / clock-skew on iat).

docs/operator/oidc-runbooks/authentik.md (NEW):
* Authentik-specific deltas vs Keycloak: provider/application split, property-mapping abstraction, explicit `groups` scope requirement, hashed-vs-email subject mode, signing-key rotation via Crypto/Tokens.

docs/operator/oidc-runbooks/okta.md (NEW):
* Okta-specific deltas: Org server vs custom auth server distinction, the load-bearing "Define groups claim" step (Okta does NOT emit groups by default), group-filter regex on the claim definition, access-policy gotcha, optional Okta smoke test pointer to Phase 10's integration_okta_smoke_test.go.

docs/operator/oidc-runbooks/auth0.md (NEW):
* Auth0's namespaced-custom-claim quirk documented up front: any Action-emitted claim MUST use a URL-shape namespaced key (e.g. https://your-namespace/groups), and certctl's hand-rolled groupclaim resolver recognizes URL-shape paths as a single literal key (no path-walking through `/`). Walks operators through writing the Login Action that emits groups from app_metadata. Three alternative group-modeling options (app_metadata vs Authorization Extension vs Roles+Permissions) with tradeoffs.

docs/operator/oidc-runbooks/azure-ad.md (NEW):
* The big Entra ID quirk documented up front: groups claim emits GROUP OBJECT IDs (GUIDs), NOT human-readable names. Certctl group→role mappings MUST be configured against the GUIDs. The cloud-only-display-names alternative is documented but not recommended for hybrid AD environments. Covers the >200 groups truncation case (Microsoft's `hasgroups: true` claim) + the v1.0 vs v2.0 endpoint distinction (certctl supports v2.0 only).

docs/operator/oidc-runbooks/google-workspace.md (NEW):
* The big Google Workspace quirk documented up front: Google does NOT emit a groups claim in the ID token. Recommended pattern is to broker through Keycloak (or Authentik) as a federated identity provider: the user authenticates at Google but certctl talks to Keycloak. Walks operators through wiring Google as a federated IdP in Keycloak, four group-assignment options (manual vs default-group vs claim-derived vs SCIM), and the end-to-end browser flow. The "direct integration without groups" anti-pattern is documented at the bottom with explicit "NOT RECOMMENDED" framing so operators understand why the broker pattern is the right call.

docs/README.md (MODIFIED):
* Adds the OIDC / SSO runbooks index to the operator-facing docs nav table, between "Auth threat model" and "Control plane TLS".

Conventions held
================

* Every runbook carries `> Last reviewed: 2026-05-10` per the docs convention.
* Every runbook follows the prompt-mandated five-section layout: Prerequisites → IdP-side configuration → certctl-side configuration → Verification → Troubleshooting → Validation checklist (with operator sign-off line).
* Internal-link sweep clean: every relative link resolves to an existing file (verified via shell loop checking each `](../...)` and `](*.md)` reference). External links to IdP vendor sites are the canonical https URLs.
* No leakage of cowork/ workspace paths as Markdown links: the azure-ad.md initially had a `[auth-bundles-index.md](../../../../cowork/...)` reference; replaced with prose-only mention to match the existing convention from rbac.md + migration/api-keys-to-rbac.md.
* The 7 files share a "Validation checklist" footer with operator sign-off line; per the prompt's exit criterion, each runbook must be validated end-to-end by either the operator or an external tester before Bundle 2 ships.

Verification
============

* Last-reviewed dates: 7/7 runbooks dated 2026-05-10.
* Internal-link sweep: 0 broken (every `]( ...)` reference resolves).
* docs/README.md → operator/oidc-runbooks/index.md link resolves.
* No backend / frontend / Go-test impact; pure docs commit. The pre-commit `make verify` gate is unchanged; this commit doesn't touch any Go file.

Phase 11 deviation note
=======================

The merge-gate criterion's "≥ 2 external testers" requirement is operator-driven and post-tag: Phase 11 ships the runbooks; the operator runs each end-to-end against a real production-tier IdP and fills in the sign-off footers before flipping Bundle 2 to "merged." Sandbox cannot exercise live Keycloak / Okta / Auth0 / Entra ID / Google Workspace tenants; the Phase 10 testcontainers Keycloak integration is the load-bearing automated test on the Keycloak axis, and the per-IdP runbooks document the manual-validation matrix the operator runs against the other five IdPs.
Keycloak OIDC runbook
Last reviewed: 2026-05-10
This is the canonical reference runbook for wiring certctl's OIDC SSO surface against Keycloak. Keycloak is a free / open-source identity provider that runs on-prem or self-hosted; it is also the load-bearing test fixture for Phase 10 of Auth Bundle 2 (internal/auth/oidc/testfixtures/keycloak.go), so the certctl-side validation pipeline is exhaustively exercised against it.
If your IdP is something else (Okta, Auth0, Azure AD, Authentik, Google Workspace), see the per-IdP siblings in this directory. The mental model + certctl-side wiring are identical; only the IdP-side console differs.
Prerequisites
On the Keycloak side:
- Keycloak ≥ 25.0 (older versions work but the screen flows differ slightly — the Phase 10 fixture pins 25.0).
- Admin access to a realm: either an existing tenant realm or a fresh one created for certctl. Don't share Keycloak's `master` realm; create a dedicated realm.
- Network reachability from certctl-server to the Keycloak `https://<keycloak-host>/realms/<realm-name>` discovery endpoint. The certctl service fetches `/.well-known/openid-configuration` at provider creation and at every `RefreshKeys` call.
- Keycloak's signing alg set to RS256 (default) or any of: RS512, ES256, ES384, EdDSA. HS256/HS384/HS512 + `none` are rejected by certctl's IdP-downgrade-attack defense at provider creation time.
On the certctl side:
- `CERTCTL_CONFIG_ENCRYPTION_KEY` set to a stable secret (production deployments only; the encryption-at-rest layer for the OIDC client_secret depends on it).
- An admin actor holding `auth.oidc.create` + `auth.oidc.edit` (held by `r-admin` by default; granted via the `certctl_auth_assign_role_to_key` MCP tool or the GUI's Auth → Keys page).
- Bundle 2 server build ≥ v2.1.0 (or post-`5204f1b` master).
IdP-side configuration
The same configuration you'll do by hand here is what the Phase 10 testcontainers fixture imports from internal/auth/oidc/testfixtures/keycloak-realm.json — read that file alongside this runbook to see the exact JSON shape Keycloak persists.
1. Create or pick a realm
In the Keycloak admin console (https://<keycloak-host>/admin/), drop into the realm you'll use. If creating a new one, the realm name will become part of the issuer URL: https://<keycloak-host>/realms/<realm-name>.
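The relationship between host, realm name, issuer URL, and discovery endpoint can be sketched in shell. This is a minimal sketch, assuming the standard `/.well-known/openid-configuration` suffix named in the prerequisites; the helper names `issuer_url` and `discovery_url` and the example host are illustrative and not part of certctl.

```shell
# Build the issuer URL that Keycloak derives from host + realm name.
issuer_url() {
  printf 'https://%s/realms/%s\n' "$1" "$2"
}

# Append the OIDC discovery suffix; strip one trailing slash first
# so the join never produces "//".
discovery_url() {
  printf '%s/.well-known/openid-configuration\n' "${1%/}"
}

# Example values only; substitute your own host and realm.
discovery_url "$(issuer_url keycloak.example.com certctl)"
```

The string printed by `issuer_url` is the value that goes into the certctl Issuer URL field. OIDC discovery expects the `issuer` value in the returned document to match it exactly, so a stray trailing slash is worth ruling out early.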
2. Create the OIDC client
Clients → Create client:
- Client type: OpenID Connect
- Client ID: `certctl` (or whatever you prefer; it goes into `OIDCProvider.client_id` on the certctl side).
- Always display in console: off.
- Click Next.
On the capability config page:
- Client authentication: On (this makes the client confidential, which is what certctl requires).
- Authorization: off.
- Standard flow: on (auth-code with PKCE — this is the path certctl uses).
- Direct access grants: off (ROPC; the test fixture turns this on for ROPC convenience but production should NOT).
- Implicit flow: off.
- Service accounts roles: off.
- Click Next.
Login settings:
- Root URL: leave blank.
- Home URL: blank.
- Valid redirect URIs: `https://<your-certctl-host>:8443/auth/oidc/callback` (ONE entry, exact match). Wildcards (`*`) work for local dev (`http://localhost:*`) but production should pin the exact host.
- Valid post logout redirect URIs: blank or `+` (matches the redirect URI list).
- Web origins: `+` (matches the redirect URI origin) or empty.
- Click Save.
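The "one entry, exact match, no wildcards in production" rule for the redirect URI can be checked before pasting the value into Keycloak. The helper below is a hypothetical preflight check, not something certctl or Keycloak ships; it assumes the production callback is served over HTTPS at the `/auth/oidc/callback` path used throughout this runbook.

```shell
# Hypothetical preflight check for a production redirect URI.
# Returns 0 if the URI looks pinned, 1 (with a message on stderr) otherwise.
check_redirect_uri() {
  case "$1" in
    *'*'*)
      # Any literal asterisk means a wildcard entry.
      echo "wildcards are for local dev only: $1" >&2
      return 1 ;;
    https://*/auth/oidc/callback)
      return 0 ;;
    *)
      echo "expected https://<host>[:port]/auth/oidc/callback, got: $1" >&2
      return 1 ;;
  esac
}

# Example value only; substitute your own certctl host.
check_redirect_uri "https://certctl.example.com:8443/auth/oidc/callback" && echo "redirect URI looks pinned"
```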
On the saved client's Credentials tab, copy the Client secret — you'll need it for the certctl-side payload.
3. Create the groups
Groups → Create group:
- Repeat for every certctl role you want to map to a group. A typical setup creates two:
  - `certctl-engineers` (intended target: `r-operator`)
  - `certctl-viewers` (intended target: `r-viewer`)
- Optionally a `certctl-admins` group → `r-admin` for break-glass-free first-admin bootstrap; see the `auth-threat-model.md` section on bootstrap admins.
4. Configure the group-membership claim mapper
This is the load-bearing step — without it, the ID token won't carry a groups claim and every login fails closed with ErrGroupsUnmapped.
Clients → certctl → Client scopes → certctl-dedicated → Add mapper → By configuration → Group Membership:
- Name: `groups`
- Token Claim Name: `groups`
- Full group path: off (so the claim emits `engineers`, not `/engineers`; matches the certctl `string-array` group-claim format).
- Add to ID token: on.
- Add to access token: on (optional but recommended; the userinfo-fallback path uses it).
- Add to userinfo: on.
- Click Save.
5. Create the user(s)
Users → Add user:
- Username: `alice` (or however you identify operators).
- Email: required (used as the certctl-side `User.Email`).
- First name + last name: optional but populates `User.DisplayName`.
- Email verified: on if you trust the user.
- Click Create.
On the saved user's Credentials tab:
- Set a password. Mark Temporary if you want the user to reset on first login.
On the Groups tab:
- Join the user to the group(s) you created in step 3.
certctl-side configuration
Via the GUI
- Sign in as an admin actor.
- Navigate to Auth → OIDC Providers in the sidebar.
- Click Configure provider.
- Fill in:
  - Display name: `Keycloak` (free-text; what end-users see on the login page button).
  - Issuer URL: `https://<keycloak-host>/realms/<realm-name>`.
  - Client ID: `certctl` (matches step 2 above).
  - Client secret: paste the secret from step 2's Credentials tab.
  - Redirect URI: `https://<your-certctl-host>:8443/auth/oidc/callback`.
  - Groups claim path: `groups` (the default; matches step 4's Token Claim Name).
  - Groups claim format: `string-array` (the default).
  - Fetch userinfo: off (Keycloak emits groups in the ID token; userinfo fallback is for IdPs that don't).
  - Scopes: `openid profile email` (the certctl service prepends `openid` if missing).
  - IAT window seconds: 300 (default).
  - JWKS cache TTL seconds: 3600 (default).
- Click Save.
If the discovery doc fetch fails, the modal surfaces the error inline. The most common cause is a typo in the issuer URL — Keycloak emits 404 for any path under /realms/ that doesn't match an actual realm.
Via the API
curl -X POST https://<your-certctl-host>:8443/api/v1/auth/oidc/providers \
-H "Authorization: Bearer ${CERTCTL_API_KEY}" \
-H "Content-Type: application/json" \
-d '{
"name": "Keycloak",
"issuer_url": "https://keycloak.example.com/realms/certctl",
"client_id": "certctl",
"client_secret": "<paste-the-secret>",
"redirect_uri": "https://certctl.example.com:8443/auth/oidc/callback",
"groups_claim_path": "groups",
"groups_claim_format": "string-array",
"fetch_userinfo": false,
"scopes": ["openid", "profile", "email"],
"iat_window_seconds": 300,
"jwks_cache_ttl_seconds": 3600
}'
Via MCP
certctl_auth_create_oidc_provider {
"name": "Keycloak",
"issuer_url": "https://keycloak.example.com/realms/certctl",
"client_id": "certctl",
"client_secret": "<paste-the-secret>",
"redirect_uri": "https://certctl.example.com:8443/auth/oidc/callback",
"groups_claim_path": "groups",
"groups_claim_format": "string-array",
"scopes": ["openid", "profile", "email"]
}
Add the group→role mappings
GUI: Auth → OIDC Providers → Keycloak → Group → role mappings → Add.
- IdP group: `certctl-engineers` → certctl role: `r-operator`.
- IdP group: `certctl-viewers` → certctl role: `r-viewer`.
API equivalent: POST /api/v1/auth/oidc/group-mappings with {"provider_id": "<id>", "group_name": "certctl-engineers", "role_id": "r-operator"}. MCP equivalent: certctl_auth_add_group_mapping.
Empty mapping list = nobody can log in via Keycloak (the fail-closed contract). Add at least one before announcing the SSO endpoint to users.
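The fail-closed contract can be illustrated with a small shell model. This is a sketch of the behavior described above, not certctl's implementation: the `MAPPINGS` table and `resolve_roles` helper are hypothetical, and returning the union of all matched roles is an assumption.

```shell
# Hypothetical mapping table: one "group=role" pair per line.
MAPPINGS='certctl-engineers=r-operator
certctl-viewers=r-viewer'

# Print the roles matched by the given IdP groups.
# Fails (exit 1) when no group matches: the fail-closed case.
resolve_roles() {
  roles=""
  for group in "$@"; do
    # Exact, case-sensitive comparison on the group name.
    role=$(printf '%s\n' "$MAPPINGS" | awk -F= -v g="$group" '$1 == g { print $2 }')
    if [ -n "$role" ]; then
      roles="${roles:+$roles }$role"
    fi
  done
  if [ -z "$roles" ]; then
    echo "ErrGroupsUnmapped: no group matched a mapping" >&2
    return 1
  fi
  printf '%s\n' "$roles"
}

# A user in one mapped and one unmapped group still resolves a role.
resolve_roles certctl-engineers some-other-group
```

With an empty `MAPPINGS` table every call fails, which mirrors why at least one mapping must exist before users are pointed at the SSO endpoint.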
Verification
End-to-end login
- Open `https://<your-certctl-host>:8443/login` in a fresh incognito window.
- The page renders an OIDC button block with `Sign in with Keycloak` (the display name from the create-provider step).
- Click it. The browser redirects to Keycloak, you authenticate as `alice`, Keycloak redirects back to certctl, and you land on the dashboard.
- Navigate to Auth → Sessions. You should see a row with your own actor ID, the IP you logged in from, and the current timestamp under "last seen".
Audit trail
curl "https://<your-certctl-host>:8443/api/v1/audit?category=auth" \
-H "Authorization: Bearer ${CERTCTL_API_KEY}" | jq '.events[] | select(.action == "auth.oidc_login_succeeded")'
You should see a row for the login above, with details.provider_id matching the Keycloak provider's id and details.subject set to the Keycloak user's sub claim (typically a UUID).
JWKS-rotation drill
Operator action when Keycloak rotates its realm signing key:
- In Keycloak: Realm settings → Keys → Providers → Add provider → rsa-generated, set priority higher than the current key (e.g. 200), enabled = on, active = on.
- In certctl: GUI → Auth → OIDC Providers → Keycloak → Refresh discovery cache button. Or the CLI / MCP equivalent: `POST /api/v1/auth/oidc/providers/<id>/refresh`.
- Run another login. The new ID token is signed under the new key; the certctl service validates it against the freshly-fetched JWKS doc.
The Phase 10 integration test TestKeycloakIntegration_JWKSRotation_RefreshKeysPicksUpNewKey exercises this exact flow end to end.
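To confirm during a manual drill that Keycloak is actually publishing the new key, one option is to compare the key IDs (`kid` values) in the realm's JWKS document before and after the rotation. The helper below is a hypothetical sketch that only does the comparison; how the two lists are captured is left to the operator.

```shell
# Print kids present in the "after" list but missing from the "before" list.
# Both arguments are newline-separated lists of kid values.
new_kids() {
  printf '%s\n' "$2" | while IFS= read -r kid; do
    [ -n "$kid" ] || continue
    # -x: whole-line match, -F: fixed string, so kids are compared literally.
    if ! printf '%s\n' "$1" | grep -qxF -- "$kid"; then
      printf '%s\n' "$kid"
    fi
  done
}

# Example values only; real kids come from the realm's JWKS document.
before='kid-old'
after='kid-old
kid-new'
new_kids "$before" "$after"
```

One way to capture each list, assuming `jq` is installed, is to read `jwks_uri` from the discovery document, fetch it with `curl -s`, and pipe the result through `jq -r '.keys[].kid'`.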
Troubleshooting
"Discovery doc fetch failed" at provider creation.
The most common cause is a wrong issuer URL — typo in realm name, missing /realms/ segment, or HTTP→HTTPS redirect that the Go client doesn't follow without explicit headers. Curl the URL manually:
curl -v https://<keycloak-host>/realms/<realm-name>/.well-known/openid-configuration
If that returns 404, fix the realm name. If it returns 200 but certctl still fails, check cmd/server logs for the wrapped error.
"IdP downgrade-attack defense" rejected provider creation.
Keycloak's realm has a signing key advertised in id_token_signing_alg_values_supported that's in certctl's deny-list (HS256/HS384/HS512/none). Check Realm settings → Keys → Providers — disable any HMAC key providers and re-create the provider in certctl.
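Before re-creating the provider, the advertised algorithms can be compared against the deny-list by hand. The sketch below uses the deny-list stated in this runbook (HS256/HS384/HS512/none); the helper name `find_denied_algs` is hypothetical, and the exact rule certctl applies internally is not reproduced here.

```shell
# Print every algorithm from the arguments that is on the deny-list
# named in this runbook. Exit status is 1 if any were found.
find_denied_algs() {
  rc=0
  for alg in "$@"; do
    case "$alg" in
      HS256|HS384|HS512|none)
        echo "denied: $alg"
        rc=1 ;;
    esac
  done
  return $rc
}

# Example list only; see the note after the block for reading the real one.
find_denied_algs RS256 ES256 EdDSA && echo "no denied algs advertised"
```

To feed it real data, assuming `jq` is installed, extract the list with `jq -r '.id_token_signing_alg_values_supported[]'` from the discovery document fetched by the curl command in the previous troubleshooting entry, and pass the resulting values as arguments.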
Login redirects to Keycloak, the user authenticates, but the callback redirects back to /login with "no roles assigned".
The user authenticated successfully but their groups didn't match any configured mapping (ErrGroupsUnmapped). Check:
- The user is actually a member of the group you mapped (Users → user → Groups tab in Keycloak).
- The group-membership mapper is configured correctly (Clients → certctl → Client scopes → certctl-dedicated → mappers → groups → "Full group path: off" matters).
- The group name in your certctl mapping exactly matches what Keycloak emits — case-sensitive, no leading slash if "Full group path: off".
You can confirm what Keycloak is actually emitting by decoding the ID token at jwt.io against the Keycloak public key, or by enabling certctl's debug logging on the OIDC service for one login (logs are scrubbed of token contents per the Phase 3 token-leak hygiene contract; debug logs surface only the resolved group list and the mapping decision).
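If pasting a token into a third-party site is not acceptable, the payload can also be decoded locally. The helper below is a minimal sketch that assumes a `base64` binary accepting `-d`; the function name `jwt_payload` is illustrative. It only decodes the payload segment and does not verify the signature, so treat the output as diagnostic.

```shell
# Decode the payload (middle segment) of a JWT without verifying it.
jwt_payload() {
  # Take header.PAYLOAD.signature and map base64url to standard base64.
  seg=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # Restore the '=' padding that base64url strips.
  case $(( ${#seg} % 4 )) in
    2) seg="$seg==" ;;
    3) seg="$seg=" ;;
  esac
  printf '%s' "$seg" | base64 -d
}
```

With the raw ID token in a shell variable (for example, a hypothetical `ID_TOKEN` captured during a test login), `jwt_payload "$ID_TOKEN" | jq .groups` shows the group names exactly as Keycloak emits them, which is the string the certctl mapping has to match.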
"id_token verify failed: token used before issued"
Clock skew between Keycloak and certctl-server. Either align both to NTP, or bump iat_window_seconds on the OIDC provider config (default 300 = 5 minutes). The certctl service caps iat_window_seconds at 600.
"oidc: pre-login session not found or already consumed"
The user clicked the OIDC login button, then either the browser tab idled past the 10-minute pre-login TTL, or the user opened the IdP login in a new tab and consumed the row from the first one. Have them retry.
"oidc: state parameter mismatch (replay or forgery)"
Either the user double-submitted a callback URL (clicked it twice from email or browser history), or a CSRF attempt. The pre-login row is single-use; second consumption returns ErrPreLoginNotFound. Have them retry from the login page.
Sessions revoked but the user can still hit the API.
Check the Phase 4 session contract: the cookie is HMAC-validated on every request, but the database row is what Revoke deletes. Once the row is gone, any request that still carries the old certctl_session cookie (for example, because it was never cleared on the client) reaches the session middleware, fails the row lookup, and gets a 401. The middleware never serves stale data, so if the user still sees successful responses, the cause is upstream of certctl, typically a reverse proxy serving a cached response.
Validation checklist
Before signing off this runbook for production rollout, validate these end-to-end:
- `auth.oidc_provider_created` audit row appears after the create-provider POST.
- `Sign in with Keycloak` button renders on the login page after `getAuthInfo` returns the configured provider.
- A user with mapped groups completes the auth-code flow and lands on the dashboard.
- A user WITHOUT mapped groups gets the "no roles assigned" landing (not the dashboard).
- The `auth.oidc_login_succeeded` and `auth.oidc_login_failed` audit rows correctly distinguish the two cases.
- The Sessions page shows the new session, with self-pill on the caller's row.
- Revoking the session via the GUI causes the next API request from that browser to 401 + redirect to login.
- Running the JWKS-rotation drill (steps above) does not break in-flight logins; rotated tokens validate against the refreshed JWKS.
- Editing the provider with `client_secret` blank preserves the existing ciphertext (operator confirms by reading the `oidc_providers.client_secret_encrypted` column before + after the PUT; bytes unchanged).
Sign-off: _______________ (operator) on _______________ (date).