auth-bundle-2 Phase 16: docs updates (security.md OIDC + sessions + break-glass + auditor split sections; new migration/oidc-enable.md; CHANGELOG.md v2.1.0 Bundle 2 release notes)

Closes Phase 16 of cowork/auth-bundle-2-prompt.md. Three operator-
facing docs updated, one new migration guide ships, README nav row
added.

Files
=====

docs/operator/security.md (MODIFIED, Last reviewed bumped to 2026-05-10):
* Added 5 new Bundle 2 subsections under '## Authentication
  surface' after the Bundle 1 approval-bypass-closure entry:
  - 'OIDC federation (Bundle 2 Phases 1-7)' — alg allow-list,
    IdP-downgrade defense, iss/aud/azp/at_hash, single-use
    state+nonce, PKCE-S256 mandatory, JWKS rotation handling,
    encrypted client_secret at rest with the v3 blob format
    pinned by an integration test, pointer to oidc-runbooks/
    for per-IdP setup.
  - 'Sessions + back-channel logout (Bundle 2 Phases 4-6)' —
    length-prefixed HMAC cookie wire format, HttpOnly + Secure
    + SameSite cookie hardening, idle/absolute timeouts, CSRF
    defense, signing-key rotation primitive, fail-fatal
    EnsureInitialSigningKey at server boot, OpenID Connect
    Back-Channel Logout 1.0 (NOT RFC 8414).
  - 'OIDC first-admin bootstrap (Bundle 2 Phase 7)' — coexists
    with Bundle 1's env-var-token bootstrap, group-scoped via
    CERTCTL_BOOTSTRAP_ADMIN_GROUPS + CERTCTL_BOOTSTRAP_OIDC_PROVIDER_ID,
    one-shot per tenant.
  - 'Break-glass admin (Bundle 2 Phase 7.5)' — default-OFF,
    surface invisibility via 404-not-403, Argon2id with OWASP
    2024 params, lockout state machine, constant-time-via-
    verifyDummy, WARN log at boot, runbook pointer for
    operator drill.
  - 'Migrating an existing deployment to OIDC' — pointer to
    the new migration/oidc-enable.md walkthrough.
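
To make the "alg allow-list" and "IdP-downgrade defense" items above concrete, here is a minimal Go sketch that checks a provider's advertised id_token_signing_alg_values_supported against the allow-list this release names (RS256, RS512, ES256, ES384, EdDSA). The function name, the error values, and the exact rejection policy (fail on any HS-family or none entry, and fail on an empty intersection) are assumptions for illustration, not certctl's actual code.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// allowedAlgs is the allow-list named in the v2.1.0 release notes.
var allowedAlgs = map[string]bool{
	"RS256": true, "RS512": true, "ES256": true, "ES384": true, "EdDSA": true,
}

// Hypothetical error values; the real sentinel names are not shown in this commit.
var (
	errWeakAlgAdvertised = errors.New("provider advertises an HS-family or none alg")
	errNoUsableAlg       = errors.New("no advertised alg is on the allow-list")
)

// usableAlgs intersects the advertised algs with the allow-list.
// Assumed policy: any HS* or "none" entry rejects the provider outright.
func usableAlgs(advertised []string) ([]string, error) {
	var usable []string
	for _, alg := range advertised {
		if strings.HasPrefix(alg, "HS") || strings.EqualFold(alg, "none") {
			return nil, errWeakAlgAdvertised
		}
		if allowedAlgs[alg] {
			usable = append(usable, alg)
		}
	}
	if len(usable) == 0 {
		return nil, errNoUsableAlg
	}
	return usable, nil
}

func main() {
	algs, err := usableAlgs([]string{"RS256", "PS256", "ES256"})
	fmt.Println(algs, err) // [RS256 ES256] <nil>

	_, err = usableAlgs([]string{"RS256", "HS256"})
	fmt.Println(err) // rejected: HS256 is advertised
}
```

Running the same check at provider creation and again on every JWKS refresh, as the notes describe, would catch an IdP that starts advertising a weak algorithm after onboarding.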

docs/migration/oidc-enable.md (NEW, Last reviewed 2026-05-10):
* Step-by-step migration guide for an operator on a Bundle-1-merged
  deployment to enable OIDC SSO. Pre-reqs (CERTCTL_CONFIG_ENCRYPTION_KEY,
  admin actor with auth.oidc.create + auth.oidc.edit, IdP tenant)
  + 7 numbered steps (pin encryption key, complete IdP-side per
  runbook, configure certctl-side OIDCProvider, add group→role
  mappings with fail-closed warning, optional first-admin bootstrap,
  verify with single test user, announce SSO endpoint).
* Rollback section covering the 4-step disable flow + the 409
  Conflict on provider-delete-while-sessions-exist + the
  existing-sessions-keep-working-until-expiry semantics.
* Troubleshooting section covering the 8 most common failure modes
  (discovery doc fetch fails / IdP downgrade defense rejects /
  no roles assigned / iss mismatch / pre-login expired / state
  mismatch / sessions revoked but user can hit API / JWKS
  rotation breaks login).
* Database row count drift documented so operators know what to
  expect after OIDC is live (10 Bundle 2 tables enumerated).
* Cross-references to oidc-runbooks/ + security.md +
  auth-threat-model.md + auth-benchmarks.md + auth-standards-implemented.md.

CHANGELOG.md (MODIFIED):
* v2.1.0 section title bumped from 'Auth Bundle 1: RBAC primitive'
  to 'Auth Bundles 1 + 2: RBAC primitive + OIDC SSO + sessions'.
* Replaced the Bundle 1 closing-bullet ('Bundle 2 starts after
  Bundle 1 lands on master') with 18 new Bundle 2 entries:
  - OIDC + sessions + back-channel logout + break-glass overview.
  - OIDC token validation pinned at three layers (alg allow-list,
    IdP-downgrade defense, OIDC Core §3.1.3.7 re-verification).
  - Length-prefixed HMAC session cookies.
  - CSRF double-submit + hashed-token-on-row.
  - OIDC client_secret AES-256-GCM v3 blob at rest +
    integration-test invariant.
  - OIDC first-admin bootstrap.
  - Default-OFF break-glass admin (Argon2id + lockout +
    constant-time + surface invisibility).
  - GUI: 4 new pages + login-page IdP buttons + sidebar logout.
  - 11 new MCP tools for OIDC + session management.
  - 6 per-IdP runbooks (Keycloak / Authentik / Okta / Auth0 /
    Entra ID / Google Workspace).
  - Threat model extended with 5 new defense subsections + 8 new
    threat-catalogue subsections.
  - Performance baselines documented (4 benchmarks; 3 measured
    + 1 operator-runs).
  - Standards-and-RFC implementation table (13 RFCs + 14 CWEs;
    NOT a compliance-mapping doc).
  - Coverage gates held at floor 90 across all 4 Bundle 2
    packages (anti-Bundle-1-mistake invariant).
  - Multi-tenant query CI guard (ratchet baseline 32).
  - Phase 10 Keycloak testcontainers integration test + optional
    Okta smoke test.
  - OpenAPI cookieAuth security scheme + 13 new endpoints + 4
    break-glass endpoints.
  - Bundle-1-only compat regression CI guard +
    Bundle-1-to-2-upgrade regression CI guard.
* Final paragraph updated to point at oidc-enable.md alongside
  api-keys-to-rbac.md as the two migration walkthroughs.
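
As an illustration of the "CSRF double-submit + hashed-token-on-row" entry above, the following Go sketch shows the comparison step only: the plaintext token echoed in the X-CSRF-Token header is hashed with SHA-256 and compared in constant time against the hash kept on the session row. The function names are hypothetical, and storing the hash as raw bytes (rather than hex or base64) is an assumption.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"fmt"
)

// hashCSRFToken returns the SHA-256 digest of the plaintext token.
// Assumption: this raw digest is what the session row stores.
func hashCSRFToken(token string) []byte {
	sum := sha256.Sum256([]byte(token))
	return sum[:]
}

// csrfTokenMatches reports whether the header token hashes to the stored
// value. subtle.ConstantTimeCompare avoids leaking how many bytes matched.
func csrfTokenMatches(headerToken string, storedHash []byte) bool {
	if headerToken == "" {
		return false // a missing header never matches
	}
	return subtle.ConstantTimeCompare(hashCSRFToken(headerToken), storedHash) == 1
}

func main() {
	stored := hashCSRFToken("example-token") // written when the session is created
	fmt.Println(csrfTokenMatches("example-token", stored)) // true
	fmt.Println(csrfTokenMatches("forged-token", stored))  // false
}
```

Keeping only the hash on the row means a database read does not reveal a usable token, while the JS-readable cookie still lets the GUI echo the plaintext value.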

docs/README.md (MODIFIED):
* Added the new oidc-enable.md migration row under '## Migration'
  alongside the existing api-keys-to-rbac.md entry, with a
  one-line description flagging it as the Bundle 2 OIDC
  onboarding walkthrough.

Verification
============

* Last-reviewed on security.md + oidc-enable.md: 2026-05-10.
* Internal-link sweep on oidc-enable.md: 0 broken (every relative
  link resolves via shell-loop verification).
* Internal-link sweep on docs/README.md: 0 broken (all .md
  references resolve).
* No Go-side impact; the make verify gate is unchanged.

Bundle 2 documentation deliverables now complete: security.md +
auth-threat-model.md + oidc-runbooks/ + auth-benchmarks.md +
auth-standards-implemented.md + api-keys-to-rbac.md + oidc-enable.md
+ CHANGELOG.md v2.1.0. The full Bundle 2 surface is operator-
discoverable from docs/README.md root nav.
shankar0123
2026-05-10 17:07:27 +00:00
parent 3f335af45e
commit c03d18bb1c
4 changed files with 550 additions and 8 deletions
+160 -7
@@ -1,6 +1,6 @@
# Changelog
-## v2.1.0 - Auth Bundle 1: RBAC primitive ⚠️
## v2.1.0 - Auth Bundles 1 + 2: RBAC primitive + OIDC SSO + sessions ⚠️
> **SECURITY: AUDIT YOUR API KEYS.**
>
@@ -87,15 +87,168 @@ What else changed in v2.1.0:
`phase12_protocol_allowlist_test.go` AST scan all guard against
accidentally wrapping ACME / SCEP / EST / OCSP / CRL routes in
`rbacGate`.
-- **Bundle 2 (OIDC + sessions) starts after Bundle 1 lands on
-master.** Roadmap entry remains in `cowork/auth-bundle-2-prompt.md`.
- **Bundle 2: OIDC + sessions + back-channel logout + break-glass.**
Auth Bundle 2 ships in the same v2.1.0 release. Operators get OIDC
SSO support for Keycloak / Authentik / Okta / Auth0 / Microsoft
Entra ID / Google Workspace (via Keycloak broker), HMAC-signed
session cookies with idle/absolute timeouts + CSRF defense,
back-channel logout per OpenID Connect Back-Channel Logout 1.0,
and a default-OFF break-glass admin path with Argon2id passwords
for SSO-broken incidents. API-key auth keeps working unchanged
alongside; existing automation needs no changes. Migration walkthrough
at [`docs/migration/oidc-enable.md`](docs/migration/oidc-enable.md);
per-IdP setup guides at
[`docs/operator/oidc-runbooks/index.md`](docs/operator/oidc-runbooks/index.md).
- **OIDC token validation pinned at three layers.** Algorithm
allow-list (RS256/RS512/ES256/ES384/EdDSA only) with HS-family + `none`
rejected at the service-layer sentinel; IdP-downgrade-attack defense
at provider creation AND every JWKS RefreshKeys (intersects the IdP's
advertised `id_token_signing_alg_values_supported` against the allow-
list, rejects providers that advertise weak algs even before any
token is signed); OIDC Core §3.1.3.7 re-verification of `iss` /
`aud` / `azp` / `at_hash` (REQUIRED-when-access_token-present per
Phase 3 tightening of the spec MAY → MUST) / `exp` / `iat` window
/ `nonce` constant-time-compare. PKCE-S256 mandatory; `plain`
rejected. Single-use state + nonce via atomic `DELETE...RETURNING`
on consume.
- **Session cookies use length-prefixed HMAC.** The cookie wire format
is `v1.<session_id>.<signing_key_id>.<base64url-no-pad(HMAC-SHA256)>`
with HMAC input `len:sid:len:kid` (NOT bare-concat) to defeat
concatenation collisions. `HttpOnly` + `Secure` + `SameSite=Lax`
default; `SameSite=Strict` configurable via `CERTCTL_SESSION_SAMESITE`.
Idle timeout 1h / absolute 8h defaults; scheduler GC sweeps expired
rows hourly. Signing keys rotate via the new `RotateSigningKey`
primitive; the old key stays valid for `CERTCTL_SESSION_SIGNING_KEY_RETENTION`
(default 24h) so existing cookies validate during rollover.
- **CSRF defense via double-submit-cookie + hashed-token-on-row.**
Plaintext CSRF token in the JS-readable `certctl_csrf` cookie
(intentionally `HttpOnly=false` for the GUI to echo into the
`X-CSRF-Token` header); SHA-256 hash on the session row;
`subtle.ConstantTimeCompare` in the new `CSRFMiddleware`. API-key
actors are CSRF-exempt (no session row in context).
- **OIDC `client_secret` encrypted at rest.** AES-256-GCM v3 blob
format (magic 0x03 + salt(16) + nonce(12) + ciphertext+tag) using
the existing `CERTCTL_CONFIG_ENCRYPTION_KEY`. Encryption invariant
pinned by an integration test asserting ciphertext != plaintext +
v3 blob shape + round-trip recovery + wrong-passphrase fails.
- **OIDC first-admin bootstrap.** New `CERTCTL_BOOTSTRAP_ADMIN_GROUPS`
+ `CERTCTL_BOOTSTRAP_OIDC_PROVIDER_ID` env vars: the first
OIDC-authenticated user with a matching group claim becomes admin
per tenant. Coexists with the Bundle 1 env-var-token bootstrap;
the admin-existence probe ensures only one wins. Audit row
(`bootstrap.oidc_first_admin`) on every grant.
- **Break-glass admin (default-OFF).** New `CERTCTL_BREAKGLASS_ENABLED`
env var (default `false`). When enabled, the local Argon2id-password
admin path bypasses OIDC + group-claim layers — intended ONLY for
SSO-broken incidents. Argon2id with OWASP 2024 params (m=64 MiB,
t=3, p=4); lockout after 5 failures (configurable); constant-time
across all failure paths via `verifyDummy`; surface invisibility
(HTTP 404 on every endpoint when disabled, NOT 403). WARN log at
server boot when enabled. WebAuthn/FIDO2 second factor pairing on
the v3 roadmap (Decision 12).
- **GUI: OIDC Providers + Group → Role Mappings + Sessions + login
buttons.** Four new pages under `/auth/*` consume the Bundle 2 API
surface. Login page renders one "Sign in with X" button per
configured OIDC provider (in addition to the API-key form, which
remains as a fallback for Bearer-mode + break-glass paths). Sessions
page exposes own-sessions + admin all-actors view. Every actionable
element is permission-gated server-side via `auth.oidc.*` and
`auth.session.*` perms; client-side hide is UX layer. Logout button
in the sidebar fires `POST /auth/logout` to clear the session
server-side before redirecting to login.
- **MCP server gains 11 OIDC + session tools.** `certctl_auth_list_oidc_providers`,
`_get_oidc_provider`, `_create_oidc_provider`, `_update_oidc_provider`,
`_delete_oidc_provider`, `_refresh_oidc_provider`,
`_list_group_mappings`, `_add_group_mapping`, `_remove_group_mapping`,
`_list_sessions`, `_revoke_session`. Operator-facing MCP tool count
goes 12 (Bundle 1 RBAC) → 23 across the auth surface. Total MCP
tool count: `grep -cE 'mcp\.AddTool\(' internal/mcp/tools*.go` ≈ 150.
- **Per-IdP runbooks: 6 production-tier setup guides** at
`docs/operator/oidc-runbooks/`. Each runbook follows a consistent
five-section layout (Prerequisites / IdP-side config / certctl-side
config / Verification / Troubleshooting + Validation checklist with
operator sign-off line). Keycloak is the canonical reference;
Authentik / Okta / Auth0 / Entra ID / Google Workspace document the
IdP-specific deltas (Auth0's namespaced custom claims; Entra ID's
group OBJECT IDs; Google Workspace's missing-groups-claim limitation
+ the recommended Keycloak broker pattern).
- **Threat model extended.** [`docs/operator/auth-threat-model.md`](docs/operator/auth-threat-model.md)
ships 5 new "Defenses Bundle 2 ships" subsections + 8 new threat-
catalogue subsections (OIDC token forgery / session hijacking / IdP
compromise / back-channel logout failure modes / group-claim
manipulation / bootstrap risks / break-glass risks / token-leak
hygiene). 6 new SQL-shaped operator-facing checks. New "Threats
Bundle 2 does NOT close" section enumerating the 8 v3-backlog items
(WebAuthn / JIT elevation / SAML / multi-tenant activation /
HSM-FIPS / OIDC RP-initiated logout / Playwright / per-IdP
external-tester sign-off).
- **Performance baselines documented.** [`docs/operator/auth-benchmarks.md`](docs/operator/auth-benchmarks.md)
ships four benchmarks with measured baselines on a 4 vCPU /
8 GiB / Postgres 16 / Go 1.25 floor: `BenchmarkSession_SteadyState`
p99 5 µs (target < 1 ms; 200× under), `BenchmarkSession_ColdProcess`
p99 7.1 ms (target < 10 ms), `BenchmarkOIDC_SteadyState` p99 1.5 ms
(target < 5 ms), `BenchmarkOIDC_ColdCache` operator-runs against
live Keycloak via `make benchmark-auth-coldcache`.
- **Standards + RFC implementation table.** [`docs/reference/auth-standards-implemented.md`](docs/reference/auth-standards-implemented.md)
ships 13 RFC / standard rows + 14 CWE rows with concrete file paths
+ negative-test anchors per row. NOT a compliance-mapping doc per
the operator's 2026-05-05 retired-compliance-docs decision; the
doc explicitly says "build the framework mapping yourself against
the rows here using the framework-mapping methodology your audit
firm prescribes; this project does not own that mapping."
- **Coverage gates held at floor 90 across all four Bundle 2
packages.** `internal/auth/oidc/` 93.7%, `internal/auth/session/`
94.9%, `internal/auth/breakglass/` 91.5%, `internal/auth/user/domain/`
96.4%. NO held-low-with-rationale entry — the Phase 13 prompt's
anti-Bundle-1-mistake rule held. Bundle 1's existing 85% floors
for `internal/auth/` + `internal/service/auth/` stay 85
(already-shipped-and-accepted) per the prompt's explicit
inheritance rule.
- **Multi-tenant query CI guard.** New `scripts/ci-guards/multi-tenant-query-coverage.sh`
(ratchet-style, baseline 32 at v2.1.0 close): greps every
SELECT/UPDATE/DELETE in `internal/repository/postgres/` against
10 tenant-aware tables, fails on regression OR improvement (forces
the operator to lift / lower the baseline visibly). Forward-compat
protection so a future Bundle 3 / managed-service multi-tenant
activation can flip the switch without finding silent
tenant-data-leak bugs in shipped queries.
- **Phase 10 Keycloak testcontainers integration test.** New build-tag-
gated suite at `internal/auth/oidc/testfixtures/` + `integration_keycloak_test.go`
drives the full OIDC flow against a live Keycloak container booted
by testcontainers-go. 5-test matrix: discovery + JWKS load, full
PKCE auth-code happy path with HTTP form scraping, logout-revokes-
session, JWKS rotation, unmapped-groups-fails-closed. Reuses one
container across the matrix to amortize the 60-90s boot. Optional
Okta smoke test (build-tagged `integration && okta_smoke`) for live
tenant validation. New Makefile targets: `make keycloak-integration-test`
+ `make okta-smoke-test` + `make benchmark-auth-coldcache`.
- **OpenAPI surface extended.** New `cookieAuth` security scheme
(apiKey/cookie/`certctl_session`) alongside the existing
`bearerAuth`. 13 new Bundle 2 endpoints across the OIDC + session
+ group-mapping CRUD surface; 4 break-glass endpoints with
surface-invisibility framing. The N-bundle-2-security-empty-preserved
CI guard locks the `security: []` opt-out count at ≥ 14 so existing
public endpoints stay public.
- **Bundle-1-only compat regression CI guard.** New
`scripts/ci-guards/bundle-1-compat-regression.sh` asserts the
load-bearing invariants that protect the Bundle-1-only-deploy
case (session middleware defers-to-next, CSRF passthrough on
missing session row, ChainAuthSessionThenBearer wired, public
OIDC routes in AuthExempt allowlist, AuthInfo guards on
OIDCProvidersResolver != nil). Sibling
`bundle-1-to-2-upgrade-regression.sh` asserts the upgrade-path
invariants (migrations 000034..000038 are CREATE TABLE IF NOT EXISTS
+ BEGIN/COMMIT-wrapped + no DROP TABLE / ALTER...DROP COLUMN
against 19 protected Bundle-1 tables + ON CONFLICT DO NOTHING on
permission seed).
Migration ordering, idempotency, and downgrade are documented in
-[`docs/migration/api-keys-to-rbac.md`](docs/migration/api-keys-to-rbac.md).
-The threat model + compliance mapping live at
[`docs/migration/api-keys-to-rbac.md`](docs/migration/api-keys-to-rbac.md)
(API-key → RBAC, Bundle 1) and [`docs/migration/oidc-enable.md`](docs/migration/oidc-enable.md)
(API-key → OIDC, Bundle 2). The threat model lives at
[`docs/operator/auth-threat-model.md`](docs/operator/auth-threat-model.md).
-Day-2 RBAC operations live at
-[`docs/operator/rbac.md`](docs/operator/rbac.md).
Day-2 RBAC operations live at [`docs/operator/rbac.md`](docs/operator/rbac.md).
RFC + CWE evidence at [`docs/reference/auth-standards-implemented.md`](docs/reference/auth-standards-implemented.md).
## v2.0.68 - Image registry path changed ⚠️