mirror of
https://github.com/shankar0123/certctl.git
synced 2026-06-07 14:21:37 +00:00
auth-bundle-2 Phase 11: 6 per-IdP OIDC runbooks + index + docs/README wiring
Closes Phase 11 of cowork/auth-bundle-2-prompt.md. Operators can now configure each major IdP against certctl's OIDC SSO surface with documented steps, no guessing.

Files
=====

docs/operator/oidc-runbooks/index.md (NEW):
* Index page linking all six per-IdP runbooks.
* Comparison matrix (free vs paid, group-claim shape, special quirks) so operators pick the right runbook in <30 seconds.
* "Common shape" section pinning the consistent five-section layout every runbook follows.
* "Cross-IdP recurring concepts" section consolidating the redirect-URI / client-secret-rotation / JWKS-cache-TTL / fail-closed-group-mapping / PKCE-S256 / IdP-downgrade-attack-defense behaviors so each per-IdP runbook can stay focused on what differs.

docs/operator/oidc-runbooks/keycloak.md (NEW):
* Canonical reference. Mirrors the testfixtures/keycloak-realm.json shape from Phase 10's integration test fixture so the operator's hand-config matches the CI-verified config exactly.
* Step-by-step IdP-side: realm → client → groups → group-mapper → user. Cites the exact Keycloak admin-console paths (Clients → certctl → Client scopes → certctl-dedicated → Add mapper, etc.).
* GUI + API + MCP equivalents for the certctl-side configuration.
* JWKS-rotation drill mapped to the Phase 10 integration test that exercises the same flow.
* 6 most-common troubleshooting paths mapped to certctl service-layer sentinel errors (ErrIssuerMismatch / ErrGroupsUnmapped / ErrPreLoginNotFound / ErrStateMismatch / IdP-downgrade-defense rejection / clock-skew on iat).

docs/operator/oidc-runbooks/authentik.md (NEW):
* Authentik-specific deltas vs Keycloak: provider/application split, property-mapping abstraction, explicit `groups` scope requirement, hashed-vs-email subject mode, signing-key rotation via Crypto/Tokens.

docs/operator/oidc-runbooks/okta.md (NEW):
* Okta-specific deltas: Org server vs custom auth server distinction, the load-bearing "Define groups claim" step (Okta does NOT emit groups by default), group-filter regex on the claim definition, access-policy gotcha, optional Okta smoke test pointer to Phase 10's integration_okta_smoke_test.go.

docs/operator/oidc-runbooks/auth0.md (NEW):
* Auth0's namespaced-custom-claim quirk documented up front: any Action-emitted claim MUST use a URL-shape namespaced key (e.g. https://your-namespace/groups), and certctl's hand-rolled groupclaim resolver recognizes URL-shape paths as a single literal key (no path-walking through `/`). Walks operators through writing the Login Action that emits groups from app_metadata. Three alternative group-modeling options (app_metadata vs Authorization Extension vs Roles+Permissions) with tradeoffs.

docs/operator/oidc-runbooks/azure-ad.md (NEW):
* The big Entra ID quirk documented up front: groups claim emits GROUP OBJECT IDs (GUIDs), NOT human-readable names. Certctl group→role mappings MUST be configured against the GUIDs. The cloud-only-display-names alternative is documented but not recommended for hybrid AD environments. Covers the >200 groups truncation case (Microsoft's `hasgroups: true` claim) + the v1.0 vs v2.0 endpoint distinction (certctl supports v2.0 only).

docs/operator/oidc-runbooks/google-workspace.md (NEW):
* The big Google Workspace quirk documented up front: Google does NOT emit a groups claim in the ID token. Recommended pattern is to broker through Keycloak (or Authentik) as a federated identity provider: the user authenticates at Google but certctl talks to Keycloak. Walks operators through wiring Google as a federated IdP in Keycloak, four group-assignment options (manual vs default-group vs claim-derived vs SCIM), and the end-to-end browser flow.
* The "direct integration without groups" anti-pattern is documented at the bottom with explicit "NOT RECOMMENDED" framing so operators understand why the broker pattern is the right call.

docs/README.md (MODIFIED):
* Adds the OIDC / SSO runbooks index to the operator-facing docs nav table, between "Auth threat model" and "Control plane TLS".

Conventions held
================

* Every runbook carries `> Last reviewed: 2026-05-10` per the docs convention.
* Every runbook follows the prompt-mandated five-section layout (Prerequisites → IdP-side configuration → certctl-side configuration → Verification → Troubleshooting), followed by a Validation checklist with operator sign-off line.
* Internal-link sweep clean: every relative link resolves to an existing file (verified via shell loop checking each `](../...)` and `](*.md)` reference). External links to IdP vendor sites are the canonical https URLs.
* No leakage of cowork/ workspace paths as Markdown links. The azure-ad.md initially had a `[auth-bundles-index.md](../../../../cowork/...)` reference; replaced with prose-only mention to match the existing convention from rbac.md + migration/api-keys-to-rbac.md.
* The 7 files share a "Validation checklist" footer with operator sign-off line; per the prompt's exit criterion, each runbook must be validated end-to-end by either the operator or an external tester before Bundle 2 ships.

Verification
============

* Last-reviewed dates: 7/7 files dated 2026-05-10.
* Internal-link sweep: 0 broken (every `]( ...)` reference resolves).
* docs/README.md → operator/oidc-runbooks/index.md link resolves.
* No backend / frontend / Go-test impact; this is a pure docs commit. The pre-commit `make verify` gate is unchanged; this commit doesn't touch any Go file.

Phase 11 deviation note
=======================

The merge-gate criterion's "≥ 2 external testers" requirement is operator-driven and post-tag. Phase 11 ships the runbooks; the operator runs each end-to-end against a real production-tier IdP and fills in the sign-off footers before flipping Bundle 2 to "merged." Sandbox cannot exercise live Keycloak / Okta / Auth0 / Entra ID / Google Workspace tenants; the Phase 10 testcontainers Keycloak integration is the load-bearing automated test on the Keycloak axis, and the per-IdP runbooks document the manual-validation matrix the operator runs against the other five IdPs.
@@ -0,0 +1,207 @@
# Microsoft Entra ID (Azure AD) OIDC runbook

> Last reviewed: 2026-05-10

This runbook wires certctl's OIDC SSO surface against [Microsoft Entra ID](https://learn.microsoft.com/entra/), formerly Azure AD. Entra ID is Microsoft's commercial cloud IdP; it's the default IdP for any organization on Microsoft 365 / Azure.

For the canonical reference + mental model, read [keycloak.md](keycloak.md) first; this runbook only documents the Entra-ID-specific deltas.

## The big Entra ID quirk: groups claim emits OBJECT IDs, not names

Entra ID's `groups` claim emits a JSON array of **group object IDs (GUIDs)**, not human-readable names. A user in `Engineering Group` and `Cert Operators` will see something like:

```json
{
  "groups": [
    "8b9b1faa-4e83-471e-8b00-7d99c3e2a5f1",
    "f00cf1e2-2db1-4cdf-a1ba-1234567890ab"
  ]
}
```

**You must configure your certctl group→role mappings against these GUIDs**, not against `Engineering Group` or `Cert Operators`. There are workarounds (cloud-only group display names + the optional claims path; see the alternative in step 3 below) but the GUID-based approach is the only one that works reliably across all Entra ID configurations.

This is by design at Microsoft: group names are mutable and not globally unique within a tenant; object IDs are immutable and globally unique. Operators on Microsoft 365 / Azure deployments are accustomed to managing access by GUID.
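
To confirm locally that a token really carries GUIDs rather than names, something like the following can help. It is a minimal sketch for debugging only: it does not verify the token signature, and the function names `decode_jwt_payload` and `non_guid_groups` are illustrative, not part of certctl.

```python
import base64
import json
import uuid


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url without padding; restore it before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def non_guid_groups(claims: dict) -> list:
    """Return the entries of the `groups` claim that do not parse as GUIDs."""
    bad = []
    for value in claims.get("groups", []):
        try:
            uuid.UUID(value)
        except (ValueError, TypeError, AttributeError):
            bad.append(value)
    return bad
```

An empty list from `non_guid_groups(decode_jwt_payload(id_token))` suggests the tenant is on the default Group ID emit format; any display names in the result point at the cloud-only display-name format being active.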

## Prerequisites

**On the Entra ID side:**

- A Microsoft 365 tenant or standalone Azure AD tenant. The free Azure AD tier is sufficient; paid tiers (P1/P2) unlock conditional access + SCIM provisioning + risk-based auth, none of which are required for the basic OIDC integration.
- Application Administrator or Global Administrator role.
- Network reachability from certctl-server to `https://login.microsoftonline.com/<tenant-id>/v2.0/.well-known/openid-configuration`.

**On the certctl side:** same as Keycloak (see [keycloak.md](keycloak.md)).

## IdP-side configuration

### 1. Register the application

In the [Entra ID admin center](https://entra.microsoft.com/):

**Applications → App registrations → New registration**:

- Name: `certctl`.
- Supported account types: **Accounts in this organizational directory only** (single-tenant; matches the typical operator use case).
- Redirect URI: **Web** + `https://<your-certctl-host>:8443/auth/oidc/callback`.
- Click **Register**.

On the saved app's **Overview** page, copy:

- **Application (client) ID** → certctl's `client_id`.
- **Directory (tenant) ID** → goes into the issuer URL.

### 2. Create a client secret

**App → Certificates & secrets → Client secrets → New client secret**:

- Description: `certctl-server`.
- Expires: 6 months / 12 months / 24 months, your choice. Set a calendar reminder; Entra ID does NOT auto-rotate secrets.
- Click **Add**.

Copy the **Value** column (not the **Secret ID**) immediately; it's shown ONCE on creation. The certctl provider's `client_secret` field gets this value.

(Production hardening: prefer **Certificates** over secrets for client authentication; certctl currently supports `client_secret_post` only, but a follow-on bundle can add `private_key_jwt` for cert-based client auth. Track this if you have a hard requirement against shared secrets.)

### 3. Add the `groups` claim to the token

**App → Token configuration → Add groups claim**:

- Pick **Security groups** (covers most operators) OR **Groups assigned to the application** (more granular but requires Premium).
- Token type: **ID token** + **Access token** (both, so userinfo fallback works).
- Customize emit format for ID/access: leave as **Group ID** (default; this is the GUID-based path the runbook is structured around).
- Click **Save**.

If you instead want display names in the claim (only works for cloud-only groups; on-prem-synced groups continue to emit GUIDs regardless):

- Customize emit format → **Cloud-only group display names**.
- Note that this works only for groups created in Entra ID itself, not groups synced from on-prem AD. Hybrid environments will see a mix of display names and GUIDs in the same claim.

### 4. Add the optional `email` and `profile` claims

By default Entra ID's ID token does NOT include `email`. Microsoft considers email part of the "OIDC profile" but only emits it under specific conditions. To request it explicitly:

**App → Token configuration → Add optional claim → ID token → email**.

You may also want `family_name`, `given_name`, `preferred_username` for richer User records on the certctl side.

### 5. Grant the API permissions

**App → API permissions**:

- Microsoft Graph → Delegated permissions → ensure these are granted (most are default):
  - `openid`
  - `profile`
  - `email`
  - `offline_access` (optional; for refresh tokens, which certctl doesn't use currently).
- Click **Grant admin consent** if your tenant requires it.

### 6. (Optional) Restrict who can sign in

By default any user in your tenant can attempt to sign in to the app. To restrict to specific users / groups:

**Enterprise applications → certctl → Properties → Assignment required: Yes**.
Then **Users and groups → Add user/group** and pick the `cert-engineers` / `cert-viewers` Entra ID groups.

## certctl-side configuration

Create the provider through the certctl API (the GUI and MCP equivalents are covered in [keycloak.md](keycloak.md)):

```bash
curl -X POST "https://<your-certctl-host>:8443/api/v1/auth/oidc/providers" \
  -H "Authorization: Bearer ${CERTCTL_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Entra ID",
    "issuer_url": "https://login.microsoftonline.com/<tenant-id>/v2.0",
    "client_id": "<application-id>",
    "client_secret": "<client-secret-value>",
    "redirect_uri": "https://<your-certctl-host>:8443/auth/oidc/callback",
    "groups_claim_path": "groups",
    "groups_claim_format": "string-array",
    "fetch_userinfo": false,
    "scopes": ["openid", "profile", "email"],
    "iat_window_seconds": 300,
    "jwks_cache_ttl_seconds": 3600
  }'
```

Notes:

- `issuer_url` MUST end with `/v2.0` (no trailing slash) for the v2.0 endpoint. The v1.0 endpoint emits tokens with a different `iss` shape and is NOT supported by certctl. The `issuer` field in the discovery doc at `https://login.microsoftonline.com/<tenant-id>/v2.0/.well-known/openid-configuration` shows the exact value to use.
- `<tenant-id>` is the Directory (tenant) ID GUID from step 1; `<application-id>` is the Application (client) ID from step 1; `<client-secret-value>` is the secret Value from step 2.
- `redirect_uri` must match the Redirect URI registered in step 1 exactly.
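
Before saving the provider, it can be worth comparing the configured `issuer_url` with what the discovery document advertises; this also exercises the network-reachability prerequisite. The following is a rough sketch using only the Python standard library; `discovery_url`, `issuer_matches`, and `fetch_discovery` are illustrative helper names, not certctl functions.

```python
import json
import urllib.request


def discovery_url(issuer_url: str) -> str:
    """Build the OIDC discovery URL for an issuer."""
    return issuer_url.rstrip("/") + "/.well-known/openid-configuration"


def issuer_matches(issuer_url: str, discovery_doc: dict) -> bool:
    """True when the configured issuer equals the advertised `issuer` exactly."""
    return discovery_doc.get("issuer") == issuer_url


def fetch_discovery(issuer_url: str, timeout: float = 10.0) -> dict:
    """Fetch and parse the discovery document (requires network access)."""
    with urllib.request.urlopen(discovery_url(issuer_url), timeout=timeout) as resp:
        return json.load(resp)
```

Running `issuer_matches(url, fetch_discovery(url))` from the certctl-server host should return `True`; a `False` usually means a trailing slash or a missing `/v2.0` in the configured value.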

### Add the group→role mappings (GUID-keyed)

Get the GUIDs of your engineering / viewer groups:

**Entra ID → Groups → All groups → \<group\> → Overview → Object ID**.

Then in certctl, with `<provider-id>` set to the certctl ID of the Entra ID provider created above:

```bash
# Engineering group → r-operator
curl -X POST "https://<your-certctl-host>:8443/api/v1/auth/oidc/group-mappings" \
  -H "Authorization: Bearer ${CERTCTL_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "provider_id": "<provider-id>",
    "group_name": "8b9b1faa-4e83-471e-8b00-7d99c3e2a5f1",
    "role_id": "r-operator"
  }'
```

Repeat for every group you want to map. **Document the GUID-to-name mapping in your operator runbook.** Without it, the next operator looking at certctl's mappings page sees a wall of GUIDs with no way to know which is which. Consider naming the mapping descriptively if your group-mapping schema supports it (Bundle 2 doesn't yet; group-mapping descriptions are a parking-lot item for a follow-on bundle).
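
One way to keep the GUID-to-name record and the API payloads from drifting apart is to generate both from a single inventory. This is a sketch under assumptions: `GROUP_INVENTORY`, `mapping_payloads`, and `guid_to_name` are hypothetical names, and `r-viewer` is a placeholder role ID (only `r-operator` appears in this runbook). The payload keys mirror the `group-mappings` request shown above.

```python
# Human-readable group name -> (Entra ID object ID, certctl role ID).
# "r-viewer" is a placeholder; substitute the role IDs that exist in your install.
GROUP_INVENTORY = {
    "Engineering Group": ("8b9b1faa-4e83-471e-8b00-7d99c3e2a5f1", "r-operator"),
    "Cert Operators": ("f00cf1e2-2db1-4cdf-a1ba-1234567890ab", "r-viewer"),
}


def mapping_payloads(provider_id: str, inventory: dict) -> list:
    """Build one group-mapping request body per inventory entry, sorted by name."""
    return [
        {"provider_id": provider_id, "group_name": guid, "role_id": role_id}
        for _name, (guid, role_id) in sorted(inventory.items())
    ]


def guid_to_name(inventory: dict) -> dict:
    """Reverse lookup for reading a wall of GUIDs on the mappings page."""
    return {guid: name for name, (guid, _role) in inventory.items()}
```

Each dictionary from `mapping_payloads` can be serialized with `json.dumps` and sent as the `-d` body of the curl call; the inventory itself doubles as the documentation the paragraph above asks for.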

## Verification

End-to-end login + audit + Sessions checks are identical to Keycloak.

**Entra-ID-specific:** the audit row's `details.subject` will be Microsoft's `oid` claim (a GUID, the user's object ID), stable across UPN / email changes. The certctl `users` table's `oidc_subject` column holds this GUID.

**JWKS-rotation:** Microsoft rotates signing keys periodically and can roll them immediately in an emergency; there is no guaranteed interval between rolls. The discovery doc + JWKS endpoint always serve the union of active + recently-active keys, so in-flight logins continue to validate. No manual operator action needed in steady state. If you suspect a stuck cache after a Microsoft-side rotation, click "Refresh discovery cache" in the certctl GUI to evict.
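
To see why a rotation is usually invisible to operators, consider this toy model of a key cache governed by a TTL such as `jwks_cache_ttl_seconds`. It is an illustration only, not certctl's implementation: the class `JwksCache`, the `fetch_keys` callable, and the refetch-on-unknown-`kid` behavior are all assumptions made for the sketch.

```python
import time


class JwksCache:
    """Toy TTL cache of signing keys, keyed by `kid`."""

    def __init__(self, fetch_keys, ttl_seconds=3600, clock=time.monotonic):
        self._fetch_keys = fetch_keys  # callable returning {kid: key}
        self._ttl = ttl_seconds
        self._clock = clock
        self._keys = {}
        self._fetched_at = None

    def _refresh(self):
        self._keys = dict(self._fetch_keys())
        self._fetched_at = self._clock()

    def evict(self):
        """Drop everything, like a manual cache refresh."""
        self._keys = {}
        self._fetched_at = None

    def get(self, kid):
        """Return the key for `kid`, refetching on expiry or an unknown kid."""
        expired = (
            self._fetched_at is None
            or self._clock() - self._fetched_at >= self._ttl
        )
        if expired or kid not in self._keys:
            self._refresh()
        return self._keys.get(kid)
```

Because the endpoint serves old and new keys together, a refetch picks up the new `kid` while tokens signed with the previous key keep validating.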

## Troubleshooting

**Login completes; ID token contains a `hasgroups: true` claim instead of `groups`.**

Entra ID replaces the `groups` claim with an overage indicator when a user is in too many groups. Microsoft documents the limit as 200 groups for JWTs (the 150 limit applies to SAML assertions). Implicit-flow tokens signal overage with `hasgroups: true`; other flows emit `_claim_names` / `_claim_sources` entries instead. In both cases the consumer is expected to look up the full list via Microsoft Graph. certctl does NOT currently support the Graph fallback path (it's a follow-on bundle item).

Workarounds:

- Reduce the user's group membership to below the limit (rarely practical in large tenants).
- Restrict the `groups` claim to "Groups assigned to the application" (Token configuration, step 3 above) instead of "Security groups". The "assigned" set is bounded by the app's user assignments and stays under the limit.
- Use Entra ID's optional `wids` (well-known IDs) claim if you only care about admin/non-admin distinction; certctl can be configured against `wids` by setting `groups_claim_path` accordingly.
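
When triaging a login, it helps to tell a truncated claim apart from one that was never configured. The sketch below classifies a decoded set of ID token claims; it assumes Microsoft's documented overage markers (`hasgroups` for implicit-flow tokens, a `groups` entry under `_claim_names` for other flows), and `groups_claim_status` is an illustrative name.

```python
def groups_claim_status(claims: dict) -> str:
    """Classify the groups situation in decoded ID token claims.

    Returns "present", "overage", or "missing".
    """
    if isinstance(claims.get("groups"), list):
        return "present"
    if claims.get("hasgroups") is True:
        return "overage"
    claim_names = claims.get("_claim_names")
    if isinstance(claim_names, dict) and "groups" in claim_names:
        return "overage"
    return "missing"
```

An `"overage"` result points at the workarounds above, while `"missing"` usually means the groups claim was never added in Token configuration.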

**`groups` claim missing entirely.**

Step 3 wasn't completed; Entra ID does NOT emit `groups` by default. Add the claim via Token configuration; tokens issued after that will carry it.

**`ErrIssuerMismatch` even though the `tid` in the token matches.**

The v2.0 endpoint emits `iss = https://login.microsoftonline.com/<tenant-id>/v2.0` (no trailing slash). The v1.0 endpoint emits `iss = https://sts.windows.net/<tenant-id>/`. Confirm certctl's `issuer_url` matches v2.0 exactly: no trailing slash, includes `/v2.0`.
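
A quick way to tell which endpoint minted a token is to classify its `iss` value against the two shapes above. This is a small sketch for single-tenant tokens in the public cloud (other clouds use different hosts); `classify_issuer` is an illustrative name.

```python
import re

_GUID = r"[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}"
_V2_ISSUER = re.compile(r"https://login\.microsoftonline\.com/" + _GUID + r"/v2\.0")
_V1_ISSUER = re.compile(r"https://sts\.windows\.net/" + _GUID + r"/")


def classify_issuer(iss: str) -> str:
    """Return "v2.0", "v1.0", or "unrecognized" for a token's `iss` claim."""
    if _V2_ISSUER.fullmatch(iss):
        return "v2.0"
    if _V1_ISSUER.fullmatch(iss):
        return "v1.0"
    return "unrecognized"
```

A `"v1.0"` result means the token came from the v1.0 endpoint and will never match a v2.0 `issuer_url`; `"unrecognized"` on a near-miss usually comes down to a stray trailing slash.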

**On-prem-synced groups emit GUIDs even when "Cloud-only display names" is selected.**

Expected behavior: Microsoft only emits display names for groups created in Entra ID itself (cloud-only). On-prem-synced groups always emit object IDs. The hybrid case is unfixable from the IdP side; either map against GUIDs (recommended) or migrate the relevant groups to cloud-only.

**The `email` claim is empty even though the user has a primary email.**

Entra ID's `email` claim only populates when both of the following hold:

1. The user has a "Primary email" set on their Entra ID profile (often blank for B2B guest users).
2. The optional claim was added in step 4.

For B2B guests, the `preferred_username` claim usually carries the email-shape login. You can configure certctl to use `preferred_username` as the user's display name fallback, but the `User.Email` column will remain blank; that's expected for guests.

**Conditional Access policies blocking the login.**

If your tenant has Conditional Access requiring MFA for new applications, the user is redirected through the MFA challenge during login. This works transparently: certctl doesn't care that MFA was performed; it only validates the resulting ID token. If MFA is failing for the user, debug on the Entra ID side (Sign-in logs).

## Validation checklist

Same as [keycloak.md](keycloak.md#validation-checklist), with these additions:

- [ ] The ID token's `groups` claim is a string-array of GUIDs (decode at jwt.io).
- [ ] Each certctl group-mapping uses the GUID, not a human-readable name.
- [ ] A user with >200 groups successfully logs in (or the operator has documented the limitation + workaround in their internal runbook).
- [ ] The Entra ID **Sign-in logs** view shows the certctl login event with status "Success".

Sign-off: _______________ (operator) on _______________ (date).