diff --git a/README.md b/README.md index e72ad44..edb7e82 100644 --- a/README.md +++ b/README.md @@ -13,7 +13,7 @@ certctl is a self-hosted platform that automates the entire TLS certificate life The CA/Browser Forum's [Ballot SC-081v3](https://cabforum.org/2025/04/11/ballot-sc081v3-introduce-schedule-of-reducing-validity-and-data-reuse-periods/) caps public TLS certificates at **200 days by March 2026**, **100 days by 2027**, and **47 days by 2029**. At 47-day lifespans, a team managing 100 certificates is processing 7+ renewals per week, every week, forever. Manual workflows stop being a choice. -> **Status: Early-access.** Production-quality core (Local CA, ACME, agent deployment, CRUD, audit, [role-based authz](docs/operator/rbac.md) with auditor split + day-0 bootstrap + four-eyes approval) with broader feature surface (intermediate CA hierarchy, ACME/SCEP/EST servers, network appliances) still maturing. [Federated identity](docs/operator/auth-threat-model.md#threats-bundle-1-does-not-close) (OIDC/SAML/WebAuthn, server-side sessions, break-glass accounts, JIT elevation) is the next slice on the roadmap, not yet shipped. Lab and dev deployments encouraged; production deployments welcome with the understanding that customer-scale battle-testing is in progress. File GitHub issues for any rough edges. +> **Status: Early-access.** Production-quality core (Local CA, ACME, agent deployment, CRUD, audit, [role-based authz](docs/operator/rbac.md) with auditor split + day-0 bootstrap + four-eyes approval) with broader feature surface (intermediate CA hierarchy, ACME/SCEP/EST servers, network appliances) still maturing. 
**v2.1.0 ships [federated identity](docs/operator/oidc-runbooks/index.md) in early-access:** OIDC SSO (Keycloak, Authentik, Okta, Auth0, Entra ID, Google Workspace), HMAC-signed server-side sessions with `__Host-` cookies + CSRF rotation, [OpenID Connect Back-Channel Logout 1.0](docs/reference/auth-standards-implemented.md), and Argon2id [break-glass admin](docs/operator/security.md). Lab and dev deployments encouraged; production deployments welcome with the understanding that customer-scale battle-testing is in progress. **[Open a GitHub issue](https://github.com/certctl-io/certctl/issues) for any rough edges**, especially in the new federated-identity surface, where real-world IdP quirks show up fast. > **Actively maintained, shipping weekly.** [Open an issue](https://github.com/certctl-io/certctl/issues) if something breaks. CI runs the full test suite with race detection, static analysis, and vulnerability scanning on every commit. @@ -66,6 +66,7 @@ certctl handles the full certificate lifecycle in one self-hosted control plane: - **Manage multi-level CA hierarchies** with name constraints, path-length enforcement, and end-to-end RFC 5280 path validation. Root → intermediate → issuing chains, admin-gated CRUD, drain-first retirement. Patterns documented for 4-level boundary CAs, 3-level policy CAs with per-BU `PermittedDNSDomains`, and 2-level internal PKI. See [`docs/reference/intermediate-ca-hierarchy.md`](docs/reference/intermediate-ca-hierarchy.md). - **Gate high-stakes issuance** behind two-person-integrity approval. Flag a profile as `RequiresApproval`, the request lands in a queue, a non-requester approves, the scheduler dispatches. Profile-edit changes on approval-tier profiles route through the same gate so the flip-flop bypass is closed. See [`docs/operator/approval-workflow.md`](docs/operator/approval-workflow.md). 
- **Authorize with role-based access control.** Seven default roles (admin, operator, viewer, agent, mcp, cli, auditor) over a 33-permission canonical catalogue with global / per-profile / per-issuer scope. Auditor role is read-only on the audit trail (`audit.read` + `audit.export`, nothing else) so a regulator's key cannot read certificates or mutate config. Day-0 admin via a one-shot `CERTCTL_BOOTSTRAP_TOKEN` endpoint that closes itself the moment any admin lands. Privilege-escalation guard requires `auth.role.assign` to grant or revoke a role. See [`docs/operator/rbac.md`](docs/operator/rbac.md), [`docs/operator/auth-threat-model.md`](docs/operator/auth-threat-model.md), and the v2.0.x → v2.1.0 [migration guide](docs/migration/api-keys-to-rbac.md). +- **Sign in with OIDC SSO** against any standards-compliant identity provider. Per-IdP setup runbooks for Keycloak, Authentik, Okta, Auth0, Microsoft Entra ID, and Google Workspace. Group-claim → role mapping for automatic provisioning; `client_secret` encrypted at rest (AES-256-GCM); JWKS auto-refresh on `kid` miss; PKCE-S256 required; RFC 9700 §4.7.1 pre-login UA/IP binding; RFC 9207 `iss` URL-param check on callback. Server mints HMAC-signed session cookies with the `__Host-` prefix (browser-enforced subdomain-takeover defense), CSRF rotation on every privileged write, and idle + absolute expiry. [OpenID Connect Back-Channel Logout 1.0](docs/reference/auth-standards-implemented.md) revokes sessions on IdP-driven logout. Argon2id break-glass admin path for SSO-outage recovery — disabled by default; 404-invisible to scanners when `CERTCTL_BREAKGLASS_ENABLED=false`. See [`docs/operator/oidc-runbooks/index.md`](docs/operator/oidc-runbooks/index.md) for the per-IdP onboarding guides and [`docs/migration/oidc-enable.md`](docs/migration/oidc-enable.md) for enabling SSO on an existing deploy. 
- **Discover** existing certs across your fleet via filesystem scanning on agents, network TLS probing across CIDR ranges, and cloud secret manager imports (AWS Secrets Manager, Azure Key Vault, GCP Secret Manager). Triage workflow for claim / dismiss / investigate. - **Revoke** with full RFC 5280 reason codes, DER CRL generation per issuer (scheduler-pre-generated and ETag-cached), and an embedded RFC 6960 OCSP responder with dedicated per-issuer responder certs. Single + bulk revocation. See [`docs/reference/protocols/crl-ocsp.md`](docs/reference/protocols/crl-ocsp.md). - **Alert** via Slack, Microsoft Teams, PagerDuty, OpsGenie, email, webhooks. Per-policy multi-channel routing matrix with severity tiers and fault-isolating per-channel dispatch. See [`docs/operator/runbooks/expiry-alerts.md`](docs/operator/runbooks/expiry-alerts.md). @@ -75,7 +76,7 @@ certctl handles the full certificate lifecycle in one self-hosted control plane: Go 1.25 control plane with handler → service → repository layering. PostgreSQL 16 backend (35+ tables, idempotent migrations). Pull-only deployment model — the server never initiates outbound connections. Agents poll for work and generate ECDSA P-256 keys locally so private keys never touch the control plane. For network appliances and agentless servers, a proxy agent in the same network zone handles deployment via the target's API (WinRM, iControl REST, SSH/SFTP). See the [Architecture Guide](docs/reference/architecture.md) for full system diagrams. -Security: API-key authentication with SHA-256 hashing + constant-time comparison, then role-based authorization on every gated handler with global / per-profile / per-issuer scope. Auditor split keeps regulator-class actors strictly read-only on the audit trail. Day-0 admin via a one-shot bootstrap token; granting or revoking roles requires the dedicated `auth.role.assign` permission. CORS deny-by-default. Shell injection prevention on all connector scripts. 
SSRF protection (reserved IP filtering) on the network scanner. Issuer and target credentials encrypted at rest with AES-256-GCM. HTTPS-only control plane with TLS 1.3 pinned and a fail-closed startup gate that refuses to boot if the TLS bundle is unusable. Every API call recorded to an immutable audit trail with actor attribution, body hash, and latency tracking. CI runs race detection, 11 linters, and vulnerability scanning on every commit. See [`docs/operator/security.md`](docs/operator/security.md) for the full posture and [`docs/operator/auth-threat-model.md`](docs/operator/auth-threat-model.md) for what's defended vs deferred. +Security: three authentication paths — API keys (SHA-256 hashed + constant-time compared), [OIDC SSO](docs/operator/oidc-runbooks/index.md) (Keycloak / Authentik / Okta / Auth0 / Entra ID / Google Workspace), and Argon2id [break-glass admin](docs/operator/security.md) for SSO-outage recovery. Successful OIDC login mints an HMAC-signed server-side session with `__Host-` cookies, CSRF rotation on every privileged write, and [OpenID Connect Back-Channel Logout 1.0](docs/reference/auth-standards-implemented.md) for IdP-driven session revocation. Role-based authorization on every gated handler with global / per-profile / per-issuer scope. Auditor split keeps regulator-class actors strictly read-only on the audit trail. Day-0 admin via a one-shot bootstrap token; granting or revoking roles requires the dedicated `auth.role.assign` permission. CORS deny-by-default. Shell injection prevention on all connector scripts. SSRF protection (reserved IP filtering) on the network scanner. Issuer + target + OIDC `client_secret` credentials encrypted at rest with AES-256-GCM. HTTPS-only control plane with TLS 1.3 pinned and a fail-closed startup gate that refuses to boot if the TLS bundle is unusable. Every API call recorded to an immutable audit trail with actor attribution, body hash, and latency tracking. 
CI runs race detection, static analysis, and vulnerability scanning on every commit. See [`docs/operator/security.md`](docs/operator/security.md) for the full posture and [`docs/operator/auth-threat-model.md`](docs/operator/auth-threat-model.md) for what's defended vs deferred. ## Quick Start @@ -151,7 +152,7 @@ govulncheck ./... # Vulnerability scan make docker-up # Start Docker Compose stack ``` -CI runs `go vet`, `go test -race`, `golangci-lint`, `govulncheck`, and per-layer coverage thresholds (service 55%, handler 60%, domain 40%, middleware 30%) on every push. Frontend CI runs TypeScript type checking, Vitest tests, and Vite production build. +CI runs `go vet`, `go test -race`, `golangci-lint`, `govulncheck`, and per-package coverage thresholds (service 70%, handler 75%, crypto 88%, auth packages 85-95%) on every push. The thresholds-as-data file is `.github/coverage-thresholds.yml`; lowering a floor requires corresponding test work, not a config flip. Frontend CI runs TypeScript type checking, Vitest tests, and Vite production build. For the full contributor guide see [`docs/contributor/`](docs/contributor/) — testing strategy, test environment, CI pipeline, QA prerequisites. 
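Reviewer sketch (not part of the patch): the README's Security paragraph describes API keys as "SHA-256 hashed + constant-time compared". A minimal Go illustration of that pattern follows, using only the standard library. The function names `hashKey` and `keyMatches` are hypothetical and are not taken from the certctl codebase; how certctl actually stores and looks up key digests is not shown in this diff.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"fmt"
)

// hashKey returns the SHA-256 digest of an API key.
// Hypothetical helper: only the digest would be persisted, never the raw key.
func hashKey(key string) [32]byte {
	return sha256.Sum256([]byte(key))
}

// keyMatches hashes the presented key and compares it with the stored digest
// in constant time, so the comparison does not leak how many leading bytes matched.
func keyMatches(presented string, stored [32]byte) bool {
	h := hashKey(presented)
	return subtle.ConstantTimeCompare(h[:], stored[:]) == 1
}

func main() {
	stored := hashKey("example-api-key") // illustrative value only
	fmt.Println(keyMatches("example-api-key", stored))
	fmt.Println(keyMatches("wrong-key", stored))
}
```

Hashing before comparing also means both inputs to `subtle.ConstantTimeCompare` always have the same length, which that function requires for a meaningful constant-time result.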
diff --git a/docs/README.md b/docs/README.md index 6190da7..3721216 100644 --- a/docs/README.md +++ b/docs/README.md @@ -27,14 +27,14 @@ You're operating certctl in production or building integrations and need authori | Doc | What it covers | |---|---| | [Architecture](reference/architecture.md) | System design, data flow, security model, deployment topologies | -| [Profiles](reference/profiles.md) | CertificateProfile policy object — issuer wiring, EKUs, RequiresApproval gate (Phase 9 closure) | +| [Profiles](reference/profiles.md) | CertificateProfile policy object — issuer wiring, EKUs, RequiresApproval gate (with profile-edit closure) | | [API](reference/api.md) | OpenAPI 3.1 spec, integration patterns, client SDK generation | | [CLI](reference/cli.md) | certctl-cli command reference and CI/CD integration patterns | | [Configuration](reference/configuration.md) | `CERTCTL_*` environment variable reference (scheduler, rate limits, deploy verify, audit, agent) | | [MCP server](reference/mcp.md) | Model Context Protocol integration for AI assistants | | [Release verification](reference/release-verification.md) | Cosign / SLSA / SBOM verification procedure | | [Intermediate CA hierarchy](reference/intermediate-ca-hierarchy.md) | Multi-level CA tree management — RFC 5280 §3.2/§4.2.1.9/§4.2.1.10 enforcement | -| [Auth standards implemented](reference/auth-standards-implemented.md) | RFC + CWE evidence for the Auth Bundle 1 + 2 surface (NOT a compliance-mapping doc) | +| [Auth standards implemented](reference/auth-standards-implemented.md) | RFC + CWE evidence for the API-key + RBAC + OIDC + sessions + break-glass surface (NOT a compliance-mapping doc) | | [Deployment model](reference/deployment-model.md) | Atomic write, post-deploy verify, rollback semantics across all targets | | [Vendor matrix](reference/vendor-matrix.md) | Tested vendor versions per target connector | @@ -64,16 +64,16 @@ You're running certctl in production and need operational guidance. 
| Doc | What it covers | |---|---| -| [Security posture](operator/security.md) | Auth, rate limits, encryption at rest, key rotation, RBAC primitive (Bundle 1), bootstrap | -| [RBAC operator reference](operator/rbac.md) | Roles, permissions, scopes, scope-down + bootstrap flow (Bundle 1) | -| [Auth threat model](operator/auth-threat-model.md) | API-key compromise, role-grant abuse, bootstrap-token leak, audit-mutation, compliance mapping (Bundle 1) | -| [OIDC / SSO runbooks](operator/oidc-runbooks/index.md) | Per-IdP setup guides — Keycloak, Authentik, Okta, Auth0, Entra ID, Google Workspace (Bundle 2) | +| [Security posture](operator/security.md) | Auth, rate limits, encryption at rest, key rotation, RBAC + OIDC + sessions + break-glass, bootstrap | +| [RBAC operator reference](operator/rbac.md) | Roles, permissions, scopes, scope-down + day-0 bootstrap | +| [Auth threat model](operator/auth-threat-model.md) | API-key + RBAC + OIDC + sessions + break-glass — token forgery, session hijacking, IdP compromise, role-grant abuse, bootstrap-token leak, audit-mutation | +| [OIDC / SSO runbooks](operator/oidc-runbooks/index.md) | Per-IdP setup guides — Keycloak, Authentik, Okta, Auth0, Entra ID, Google Workspace | | [Control plane TLS](operator/tls.md) | Self-signed bootstrap, operator-supplied Secret, cert-manager Certificate CR | | [Database TLS](operator/database-tls.md) | PostgreSQL transport encryption | -| [Approval workflow](operator/approval-workflow.md) | Two-person integrity gate for high-stakes issuance + Phase 9 profile-edit closure | +| [Approval workflow](operator/approval-workflow.md) | Two-person integrity gate for high-stakes issuance + profile-edit closure | | [Helm deployment](operator/helm-deployment.md) | Kubernetes installation via the bundled chart | | [Performance baselines](operator/performance-baselines.md) | Operator-runnable benchmarks for regression spot checks | -| [Auth benchmarks](operator/auth-benchmarks.md) | Session + OIDC validation p99 
targets and measured baselines (Bundle 2 Phase 14) | +| [Auth benchmarks](operator/auth-benchmarks.md) | Session + OIDC validation p99 targets and measured baselines | | [Legacy clients (TLS 1.2)](operator/legacy-clients-tls-1.2.md) | Reverse-proxy runbook for embedded EST/SCEP clients on TLS 1.2 | ### Runbooks @@ -97,7 +97,7 @@ You're moving from another cert-management tool to certctl, or running both in p | cert-manager ACME (point cert-manager at certctl) | [migration/acme-from-cert-manager.md](migration/acme-from-cert-manager.md) | | Traefik ACME (point Traefik at certctl) | [migration/acme-from-traefik.md](migration/acme-from-traefik.md) | | **API keys → RBAC (v2.0.x → v2.1.0)** | [migration/api-keys-to-rbac.md](migration/api-keys-to-rbac.md) — **AUDIT YOUR API KEYS** post-upgrade | -| **Enable OIDC SSO on a Bundle-1-merged deployment** | [migration/oidc-enable.md](migration/oidc-enable.md) — step-by-step Bundle 2 OIDC onboarding | +| **Enable OIDC SSO** | [migration/oidc-enable.md](migration/oidc-enable.md) — step-by-step OIDC onboarding for an existing API-key + RBAC deployment | ## Contributor diff --git a/docs/migration/acme-from-cert-manager.md b/docs/migration/acme-from-cert-manager.md index 9ab0023..cac224a 100644 --- a/docs/migration/acme-from-cert-manager.md +++ b/docs/migration/acme-from-cert-manager.md @@ -16,7 +16,7 @@ through cert-manager 1.15+. Target audience: Kubernetes operator who has never deployed certctl before and wants a working `Certificate` → `Secret` flow on their cluster in under 30 minutes. -The Phase 5 integration test (`make acme-cert-manager-test`) automates +The cert-manager integration test (`make acme-cert-manager-test`) automates exactly the recipe below. The YAML snippets in this doc are byte-equal to the files under `deploy/test/acme-integration/` — re-running the test from a fresh clone produces the same results documented here. @@ -24,7 +24,7 @@ test from a fresh clone produces the same results documented here. 
## Prereqs - A Kubernetes cluster (kind / k3d / EKS / GKE / AKS / on-prem). For - local trial, `kind v0.20+` works exactly the way the Phase 5 test + local trial, `kind v0.20+` works exactly the way the integration test uses it. The kind config lives at [`deploy/test/acme-integration/kind-config.yaml`](../deploy/test/acme-integration/kind-config.yaml). - `kubectl` v1.27+, `helm` v3.13+. @@ -37,7 +37,7 @@ test from a fresh clone produces the same results documented here. which is the same idempotent installer the integration test uses. - A certctl Helm chart published to a registry your cluster can pull - from. The Phase 5 test uses an `image.tag=test` placeholder; production + from. The integration test uses an `image.tag=test` placeholder; production deployments use the actual image tag for your release line. ## Step 1 — Deploy certctl-server @@ -99,7 +99,7 @@ recipe lives in ## Step 4 — Apply the ClusterIssuer ```yaml -# Phase 5 — sample ClusterIssuer for the certctl trust_authenticated +# sample ClusterIssuer for the certctl trust_authenticated # auth mode (RFC 8555 §6 + certctl auth_mode=trust_authenticated, where # the JWS-authenticated ACME account is trusted to issue any identifier # the profile policy permits — no per-identifier ownership challenges). @@ -169,7 +169,7 @@ HTTP-01 to work. ## Step 5 — Apply the Certificate ```yaml -# Phase 5 — Certificate resource the integration test applies and +# Certificate resource the integration test applies and # waits for. The certctl-test-trust ClusterIssuer (trust_authenticated # mode) issues the cert without any solver round-trip; the resulting # Secret 'test-com-tls' is asserted to carry tls.crt + tls.key. @@ -262,4 +262,4 @@ helm uninstall certctl-test - [`docs/acme-traefik-walkthrough.md`](./acme-from-traefik.md) — Traefik-side recipe. - [`deploy/test/acme-integration/`](../deploy/test/acme-integration/) — - Phase 5 integration test (the same recipe, automated). 
+ cert-manager integration test (the same recipe, automated). diff --git a/docs/migration/api-keys-to-rbac.md b/docs/migration/api-keys-to-rbac.md index 8b874c4..ef4ca2b 100644 --- a/docs/migration/api-keys-to-rbac.md +++ b/docs/migration/api-keys-to-rbac.md @@ -5,7 +5,7 @@ This is the upgrade guide for an existing certctl deployment moving from v2.0.x's "every API key is admin or not" model to v2.1.0's RBAC primitive. Everything keeps working through the upgrade - the -Bundle 1 migration backfills every existing API key to the +migration backfills every existing API key to the `r-admin` role on first boot, so the pre-existing automation that was using those keys does not change behavior. **However**, most keys do not need full admin power; this guide walks the operator @@ -13,7 +13,7 @@ through the post-upgrade scope-down flow. ## ⚠️ SECURITY: AUDIT YOUR API KEYS -Bundle 1 maps **every** existing `CERTCTL_API_KEYS_NAMED` entry +v2.1.0 maps **every** existing `CERTCTL_API_KEYS_NAMED` entry (and every legacy `CERTCTL_AUTH_SECRET`-synthesized key) to the `r-admin` role on the first boot after migration 000029 applies. This is the safe-for-back-compat default - your CI / agents / scripts @@ -29,18 +29,18 @@ release notes for v2.1.0 lead with this callout for a reason. ### 1. Apply the migration The migration runner is idempotent. Re-applying is a no-op if the -schema is already at the target version. Migrations that ship in -the Bundle 1 slice of v2.1.0: +schema is already at the target version. The five RBAC migrations +that ship in v2.1.0: | Migration | What it does | |---|---| | `000029_rbac.up.sql` | Creates `tenants`, `roles`, `permissions`, `role_permissions`, `actor_roles`. Seeds 7 default roles + 33-permission catalogue + the synthetic `actor-demo-anon` admin grant. Backfills every named API key into `actor_roles` with the `r-admin` role. 
| | `000030_rbac_admin_perms.up.sql` | Seeds 5 admin-only fine-grained permissions (`cert.bulk_revoke`, `crl.admin`, `scep.admin`, `est.admin`, `ca.hierarchy.manage`) into `r-admin` only. | -| `000031_api_keys.up.sql` | Creates the `api_keys` table for runtime-minted keys (Bundle 1 Phase 6 bootstrap). | +| `000031_api_keys.up.sql` | Creates the `api_keys` table for runtime-minted keys (day-0 bootstrap path). | | `000032_audit_category.up.sql` | Adds `event_category` column to `audit_events` with the closed enum (`cert_lifecycle` / `auth` / `config`). | -| `000033_approval_kinds.up.sql` | Adds `approval_kind` + `payload` to `issuance_approval_requests` for the Phase 9 approval-bypass closure. | +| `000033_approval_kinds.up.sql` | Adds `approval_kind` + `payload` to `issuance_approval_requests` for the approval-bypass closure. | -The Bundle 1 server applies these on first boot. No operator +The v2.1.0 server applies these on first boot. No operator action is required other than running the upgrade. ### 2. Verify the backfill landed @@ -147,8 +147,8 @@ bootstrap flow + the threat model. ## What changes for code that called `IsAdmin` -Pre-Bundle-1, the five admin handlers checked `auth.IsAdmin(ctx)` -directly in the body. Bundle 1 Phase 3.5 moved those checks to +In v2.0.x, the five admin handlers checked `auth.IsAdmin(ctx)` +directly in the body. v2.1.0 moved those checks to the router via the `auth.RequirePermission` middleware (wrapped through the `rbacGate` helper in `internal/api/router/router.go`). The behavior contract is @@ -164,9 +164,9 @@ the helper is internal), the new convention is: (or `migrations/000029_rbac.up.sql`'s catalogue). 3. Grant the perm to the right default roles. -The five admin-only fine-grained perms shipped in Phase 3.5 stay -on `r-admin` only by default. Operators delegate by creating -custom roles with the specific perm. +The five admin-only fine-grained perms stay on `r-admin` only by +default. 
Operators delegate by creating custom roles with the +specific perm. ## Helm-specific upgrade @@ -288,9 +288,7 @@ boot regardless of schema version). - [`docs/operator/auth-threat-model.md`](../operator/auth-threat-model.md) - what the new controls defend against - [`docs/reference/profiles.md`](../reference/profiles.md) - the - Phase 9 approval-bypass closure + approval-bypass closure on `RequiresApproval` profile edits - [`docs/operator/security.md`](../operator/security.md) - the full security posture -- `cowork/auth-bundle-1-prompt.md` - the design + phase plan -- `cowork/auth-bundles-index.md` - the per-phase status tracker - `CHANGELOG.md` - the v2.1.0 release notes lead with this guide diff --git a/docs/migration/oidc-enable.md b/docs/migration/oidc-enable.md index 2fd35b4..eee2d28 100644 --- a/docs/migration/oidc-enable.md +++ b/docs/migration/oidc-enable.md @@ -1,10 +1,10 @@ -# Enable OIDC SSO on a Bundle-1-merged deployment +# Enable OIDC SSO > Last reviewed: 2026-05-10 -This guide walks an operator already running certctl with Bundle 1 (RBAC primitive on top of API-key auth) through enabling OIDC SSO from Bundle 2. The path is additive: API-key auth keeps working unchanged; OIDC sits alongside as a second authentication surface for human users. +This guide walks an operator already running certctl with API-key auth + RBAC through enabling OIDC SSO. The path is additive: API-key auth keeps working unchanged; OIDC sits alongside as a second authentication surface for human users. -If you are upgrading from a pre-Bundle-1 deployment, finish [`api-keys-to-rbac.md`](api-keys-to-rbac.md) first. If you have not deployed certctl at all, start with [`getting-started/quickstart.md`](../getting-started/quickstart.md). For the canonical mental model + per-flow threat coverage, see [`security.md`](../operator/security.md) and [`auth-threat-model.md`](../operator/auth-threat-model.md). 
+If you are upgrading from a pre-RBAC (v2.0.x) deployment, finish [`api-keys-to-rbac.md`](api-keys-to-rbac.md) first. If you have not deployed certctl at all, start with [`getting-started/quickstart.md`](../getting-started/quickstart.md). For the canonical mental model + per-flow threat coverage, see [`security.md`](../operator/security.md) and [`auth-threat-model.md`](../operator/auth-threat-model.md). ## What "enable OIDC" gives you @@ -19,15 +19,15 @@ After this migration: What does NOT change: - API keys keep working. Existing automation continues to authenticate via `Authorization: Bearer` exactly as before. -- The break-glass admin path (Phase 7.5) stays default-OFF. +- The break-glass admin path stays default-OFF. - The auditor split + approval workflow + RBAC primitive are unchanged. ## Pre-requisites **On certctl side:** -- Server build ≥ v2.1.0 (the post-Bundle-2 master). Confirm via `curl https://:8443/api/v1/version`. -- `CERTCTL_CONFIG_ENCRYPTION_KEY` set in the server environment. This is the passphrase that encrypts the OIDC `client_secret` at rest. Use a stable, secrets-manager-stored value at least 32 random bytes long. **The server refuses to start if the key is missing AND any source='database' rows already exist** (per Bundle B / M-001 / CWE-311 closure). Set this before doing anything else. +- Server build ≥ v2.1.0. Confirm via `curl https://:8443/api/v1/version`. +- `CERTCTL_CONFIG_ENCRYPTION_KEY` set in the server environment. This is the passphrase that encrypts the OIDC `client_secret` at rest. Use a stable, secrets-manager-stored value at least 32 random bytes long. **The server refuses to start if the key is missing AND any source='database' rows already exist** (CWE-311 fail-closed gate). Set this before doing anything else. - An admin actor available to drive the configuration. The actor needs the `auth.oidc.create` + `auth.oidc.edit` permissions; `r-admin` carries both by default. 
Get one via the day-0 bootstrap path if you don't have one yet. - HTTPS-only control plane (post-v2.2 milestone — this is the default). The OIDC redirect URI MUST be `https://`. @@ -40,7 +40,7 @@ What does NOT change: ### 1. Pin `CERTCTL_CONFIG_ENCRYPTION_KEY` -If your deployment already has it set (the Bundle B M-001 fail-closed gate enforces this for any source='database' issuer/target row), skip this step. If you don't: +If your deployment already has it set (the CWE-311 fail-closed gate enforces this for any source='database' issuer/target row), skip this step. If you don't: ```bash # Generate a 32-byte random key + base64-encode it. @@ -55,7 +55,7 @@ Then make the server consume it at boot: export CERTCTL_CONFIG_ENCRYPTION_KEY="$(cat /etc/certctl/config-encryption-key)" ``` -Restart the server. Confirm the boot log does NOT show the `ErrEncryptionKeyRequired` warning. If it does, the server refuses to start because there's pre-existing source='database' material that needs to be re-sealed; see the pre-Bundle-B migration notes for re-encryption flow. +Restart the server. Confirm the boot log does NOT show the `ErrEncryptionKeyRequired` warning. If it does, the server refuses to start because there's pre-existing source='database' material that needs to be re-sealed; see [`docs/operator/security.md`](../operator/security.md) for the re-encryption flow. ### 2. Pick an IdP runbook + complete the IdP-side configuration @@ -211,10 +211,10 @@ The user clicked the OIDC login button, then the browser tab idled past the 10-m Either the user double-submitted a callback URL (clicked it twice from email or browser history), or a CSRF attempt. The pre-login row is single-use; second consumption returns `ErrPreLoginNotFound`. Have them retry from the login page. **`Sessions revoked but the user can still hit the API.`** -Check the Phase 4 session contract: the cookie is HMAC-validated on every request, but the actual database row is what `Revoke` deletes. 
If your reverse proxy is caching the response or the `certctl_session` cookie wasn't actually cleared on the client, the cookie hits the server's session middleware which returns 401 on the missing-row lookup. The middleware never serves stale data; the issue is upstream of certctl in this case. +Check the session contract: the cookie is HMAC-validated on every request, but the actual database row is what `Revoke` deletes. If your reverse proxy is caching the response or the `__Host-certctl_session` cookie wasn't actually cleared on the client, the cookie hits the server's session middleware which returns 401 on the missing-row lookup. The middleware never serves stale data; the issue is upstream of certctl in this case. **JWKS rotation: an IdP rotated its signing key and existing users start failing login.** -Click **Refresh discovery cache** on the OIDC provider detail page (or `POST /api/v1/auth/oidc/providers//refresh`). The certctl service re-fetches discovery + JWKS. New tokens validate immediately. The Phase 10 integration test exercises this drill end to end. +Click **Refresh discovery cache** on the OIDC provider detail page (or `POST /api/v1/auth/oidc/providers//refresh`). The certctl service re-fetches discovery + JWKS. New tokens validate immediately. The Keycloak integration test exercises this drill end to end. **Database row count drift.** After OIDC is live, expect to see new rows under: @@ -231,12 +231,12 @@ All ten of these tables are tenant-scoped (`tenant_id` column); single-tenant de - Run [`docs/operator/oidc-runbooks/.md`](../operator/oidc-runbooks/index.md) end to end to fill in the validation checklist + sign-off line. - Read [`docs/operator/auth-benchmarks.md`](../operator/auth-benchmarks.md) for the steady-state + cold-cache performance baselines. -- Review the [`auth-threat-model.md`](../operator/auth-threat-model.md) Bundle 2 sections to understand the failure modes the OIDC + sessions surface defends against. 
+- Review the [`auth-threat-model.md`](../operator/auth-threat-model.md) OIDC + sessions + break-glass sections to understand the failure modes the federated-identity surface defends against. - Schedule a rotation reminder for the OIDC `client_secret` (typically 6-12 months; the IdP doesn't auto-rotate it). Edit the provider via the GUI when the time comes; leaving `client_secret` blank in the edit form preserves the existing ciphertext, providing a value rotates. -## `__Host-` cookie rename (Audit 2026-05-10 MED-14, BREAKING) +## `__Host-` cookie rename (BREAKING) -Post-Bundle-2 deploys carrying the 2026-05-10 audit-fix wave include a wire-format change to the three auth cookies: they now carry the `__Host-` prefix. The cookie names are: +v2.1.0 carries a wire-format change to the three auth cookies: they now carry the `__Host-` prefix. The cookie names are: - `__Host-certctl_session` (was `certctl_session`) - `__Host-certctl_csrf` (was `certctl_csrf`) @@ -253,7 +253,7 @@ If you have GUI customizations that read `document.cookie` directly, update them ## Cross-references - [`docs/operator/oidc-runbooks/index.md`](../operator/oidc-runbooks/index.md) — per-IdP setup guides. -- [`docs/operator/security.md`](../operator/security.md) — overall auth surface incl. this Bundle 2 OIDC layer. +- [`docs/operator/security.md`](../operator/security.md) — overall auth surface including this OIDC layer. - [`docs/operator/auth-threat-model.md`](../operator/auth-threat-model.md) — threat model. - [`docs/operator/auth-benchmarks.md`](../operator/auth-benchmarks.md) — performance baselines. - [`docs/reference/auth-standards-implemented.md`](../reference/auth-standards-implemented.md) — RFC + CWE evidence list. 
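Reviewer sketch (not part of the patch): the `__Host-` cookie rename in `oidc-enable.md` relies on the browser-enforced rules for that prefix, namely that the cookie is set with `Secure`, with `Path=/`, and without a `Domain` attribute. The Go snippet below illustrates those rules with `net/http`. The helpers `newHostCookie` and `satisfiesHostPrefix` are hypothetical, and the `HttpOnly`, `SameSite`, and `MaxAge` choices are assumptions made for illustration, not certctl's documented settings.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

const hostPrefix = "__Host-"

// newHostCookie builds a cookie whose attributes satisfy the __Host- prefix
// rules: Secure set, Path "/", and no Domain attribute.
// HttpOnly, SameSite, and MaxAge here are illustrative assumptions.
func newHostCookie(baseName, value string, maxAgeSeconds int) *http.Cookie {
	return &http.Cookie{
		Name:     hostPrefix + baseName,
		Value:    value,
		Path:     "/",
		Secure:   true,
		HttpOnly: true,
		SameSite: http.SameSiteLaxMode,
		MaxAge:   maxAgeSeconds,
	}
}

// satisfiesHostPrefix reports whether a cookie meets the attribute
// requirements browsers enforce for the __Host- prefix.
func satisfiesHostPrefix(c *http.Cookie) bool {
	return strings.HasPrefix(c.Name, hostPrefix) &&
		c.Secure &&
		c.Path == "/" &&
		c.Domain == ""
}

func main() {
	c := newHostCookie("certctl_session", "opaque-session-id", 3600)
	fmt.Println(c.String())
	fmt.Println(satisfiesHostPrefix(c))
}
```

Because a `__Host-` cookie cannot carry a `Domain` attribute, a sibling subdomain cannot set or overwrite it for the control-plane host, which is the subdomain-takeover defense the README mentions.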
diff --git a/docs/operator/auth-benchmarks.md b/docs/operator/auth-benchmarks.md index 9e57ef0..f864d1e 100644 --- a/docs/operator/auth-benchmarks.md +++ b/docs/operator/auth-benchmarks.md @@ -2,7 +2,7 @@ > Last reviewed: 2026-05-10 -This document records the four Auth Bundle 2 / Phase 14 performance benchmarks: session validation (steady-state and cold-process) plus OIDC token validation (steady-state and cold-cache). Numbers below are the as-measured baseline at the Bundle 2 close; future regressions are caught when the operator re-runs `make benchmark-auth` and the per-quantile values move outside the documented bounds. +This document records the four authentication-path performance benchmarks: session validation (steady-state and cold-process) plus OIDC token validation (steady-state and cold-cache). Numbers below are the as-measured baseline at v2.1.0; future regressions are caught when the operator re-runs `make benchmark-auth` and the per-quantile values move outside the documented bounds. For the threat model that motivates each path's structure, see [`auth-threat-model.md`](auth-threat-model.md). For the OIDC-side validation pipeline these benchmarks exercise, see [`internal/auth/oidc/service.go`](../../internal/auth/oidc/service.go) and [`internal/auth/session/service.go`](../../internal/auth/session/service.go). @@ -18,7 +18,7 @@ The numbers below are bounded by this configuration. Operators on weaker hardwar | Go runtime | 1.25.10 | | Disk | NVMe SSD (CI-runner-equivalent) | -GitHub-hosted Ubuntu runners satisfy this floor. The Phase 14 baselines below were captured on a `linux/arm64` 4-vCPU sandbox at 2026-05-10. +GitHub-hosted Ubuntu runners satisfy this floor. The baselines below were captured on a `linux/arm64` 4-vCPU sandbox at 2026-05-10. ## Result table @@ -29,7 +29,7 @@ GitHub-hosted Ubuntu runners satisfy this floor. 
The Phase 14 baselines below we | `BenchmarkOIDC_SteadyState` | < 5 ms | **1.5 ms** | 1.2 ms | 1.5 ms | 2.6 ms | ✓ 3× under target | | `BenchmarkOIDC_ColdCache` | < 200 ms | operator-run | — | — | — | ⚠️ requires Docker; see [Cold-cache OIDC: how to run](#cold-cache-oidc-how-to-run) below | -The three default-tag benchmarks above were captured at `git rev-parse HEAD` = (Phase 14 close); re-run via `make benchmark-auth`. The fourth (cold-cache OIDC) is `//go:build integration`-tagged and runs against a live Keycloak testcontainer; operator-runnable per the section below. +The three default-tag benchmarks above were captured at v2.1.0; re-run via `make benchmark-auth`. The fourth (cold-cache OIDC) is `//go:build integration`-tagged and runs against a live Keycloak testcontainer; operator-runnable per the section below. ## What each benchmark covers (and what it doesn't) @@ -91,7 +91,7 @@ go test -tags integration \ ./internal/auth/oidc/ ``` -The `-run` flag is needed because `BenchmarkOIDC_ColdCache` reuses the `sharedKeycloak` package-level fixture set up by Phase 10's integration tests; running the benchmark in isolation (without the test's setup phase) skips with a clear message. +The `-run` flag is needed because `BenchmarkOIDC_ColdCache` reuses the `sharedKeycloak` package-level fixture set up by the OIDC Keycloak integration test; running the benchmark in isolation (without that test's setup phase) skips with a clear message. Operator-recorded baselines welcome — append below as `Last measured: / / `: @@ -122,7 +122,7 @@ So a "cold-cache p99 of 200 ms" reads as "the network round-trip dominates the b If the operator's measurement comes in significantly lower (say 50 ms), the IdP is on a fast same-region link; certctl's contribution is the same ~5-10 ms in-process work in either case. -The Phase 14 prompt's exit criterion explicitly accepts "rationale must be measurable and falsifiable, not hand-waving." 
The 200 ms cap is operator-checkable: the operator runs `make benchmark-auth-coldcache` on their actual production hardware against their actual production IdP and either confirms the p99 is under 200 ms OR produces a measurement showing the cold path is bounded by something other than network (e.g. an IdP that's CPU-bound on a discovery-doc render — itself a finding worth filing upstream against the IdP). +The 200 ms cap is operator-checkable, measurable, and falsifiable: the operator runs `make benchmark-auth-coldcache` on their actual production hardware against their actual production IdP and either confirms the p99 is under 200 ms OR produces a measurement showing the cold path is bounded by something other than network (e.g. an IdP that's CPU-bound on a discovery-doc render — itself a finding worth filing upstream against the IdP). ## Methodology @@ -149,9 +149,9 @@ make benchmark-auth-coldcache # oidc cold-cache (10x; requires Docker) Both targets are documented in the project [`Makefile`](../../Makefile). -## Pre-merge audit (Phase 14 exit gate) +## Pre-merge audit -Per the Phase 14 prompt's exit criterion: **all four benchmarks ran, four numbers recorded.** Steady-state targets met (p99 < 1 ms for session, p99 < 5 ms for OIDC). Cold-process target met (p99 < 10 ms). Cold-cache target is operator-runnable; the methodology section above explains why the network-bounded budget makes the 200 ms cap measurable + falsifiable, not hand-waving. +**All four benchmarks ran, four numbers recorded.** Steady-state targets met (p99 < 1 ms for session, p99 < 5 ms for OIDC). Cold-process target met (p99 < 10 ms). Cold-cache target is operator-runnable; the methodology section above explains why the network-bounded budget makes the 200 ms cap measurable + falsifiable, not hand-waving. 
## Cross-references @@ -159,4 +159,4 @@ Per the Phase 14 prompt's exit criterion: **all four benchmarks ran, four number - [`oidc-runbooks/index.md`](oidc-runbooks/index.md) — per-IdP setup that determines real-world JWKS-fetch latency. - `internal/auth/session/service.go` — session validation pipeline. - `internal/auth/oidc/service.go` — OIDC token validation pipeline. -- `internal/auth/oidc/testfixtures/keycloak.go` — Phase 10 testcontainers fixture used by the cold-cache benchmark. +- `internal/auth/oidc/testfixtures/keycloak.go` — testcontainers fixture used by the cold-cache benchmark. diff --git a/docs/operator/auth-threat-model.md b/docs/operator/auth-threat-model.md index 1b1452a..dc788c1 100644 --- a/docs/operator/auth-threat-model.md +++ b/docs/operator/auth-threat-model.md @@ -3,20 +3,18 @@ > Last reviewed: 2026-05-10 This document describes the attack surface around authentication and -authorization in certctl after Bundle 1 (the RBAC primitive) AND Bundle -2 (OIDC + sessions + back-channel logout + break-glass) land. It -complements [`rbac.md`](rbac.md) and the per-IdP runbooks at +authorization in certctl. It complements [`rbac.md`](rbac.md) and the +per-IdP runbooks at [`oidc-runbooks/index.md`](oidc-runbooks/index.md) - those docs explain how to USE the controls; this one explains what those controls defend against and which threats they explicitly do NOT close. -The post-Bundle-2 attack surface is meaningfully wider than Bundle 1's: -Bundle 1 closed the API-key axis (one credential type, one validation -path); Bundle 2 adds OIDC-federated humans, session cookies with -length-prefixed HMAC + CSRF, back-channel logout, OIDC first-admin -bootstrap, and a default-OFF break-glass admin path. Each surface -brings its own threat catalogue + mitigations, documented below -alongside the Bundle 1 ones. 
+certctl ships two authentication paths plus a break-glass admin +fallback: API keys with SHA-256 hashing + role-based authorization, +and OIDC SSO with HMAC-signed server-side sessions, CSRF rotation, +RFC OIDC Back-Channel Logout, an OIDC first-admin bootstrap, and a +default-OFF Argon2id break-glass admin path. Each surface brings its +own threat catalogue + mitigations, documented below. ## Threat actors @@ -35,7 +33,7 @@ alongside the Bundle 1 ones. 5. **Compromised audit reviewer (auditor role)** - read-only access to audit events but otherwise untrusted. -The following actors are NEW with Bundle 2: +The following actors are added by the federated-identity surface: 6. **OIDC-federated end user** - authenticates via the organization's IdP (Keycloak / Okta / Auth0 / Entra ID / Authentik @@ -53,25 +51,25 @@ The following actors are NEW with Bundle 2: out of certctl's control; mitigations are bounded to "the audit trail records the source provider on every login, blast radius is bounded by group_role_mapping configured for that provider." -9. **Break-glass-password holder (Phase 7.5 path)** - operator with +9. **Break-glass-password holder** - operator with the local Argon2id password set up for SSO outages. Bypasses the OIDC + group-claim layer entirely. The default-OFF posture is the load-bearing mitigation; once enabled the password is the entire attack surface. -## Defenses Bundle 1 ships +## API-key + RBAC defenses ### API-key authentication - API keys live in `CERTCTL_API_KEYS_NAMED` (env-var) or - `api_keys` (DB row, written by Bundle 1 Phase 6 bootstrap and + `api_keys` (DB row, written by the day-0 admin bootstrap and the future role-management API). Keys hash via SHA-256; the middleware compares hashes via `crypto/subtle.ConstantTimeCompare` to defeat timing attacks. - The auth middleware populates `ActorIDKey` / `ActorTypeKey` / `TenantIDKey` on every authenticated request context. 
Audit rows attribute every action to the named-key actor instead of the - pre-Bundle-1 hardcoded `api-key-user` placeholder. + earlier hardcoded `api-key-user` placeholder. - Demo mode (`CERTCTL_AUTH_TYPE=none`) injects the synthetic `actor-demo-anon` actor with admin grants. Production deploys MUST NOT use demo mode. @@ -79,7 +77,8 @@ The following actors are NEW with Bundle 2: ### Authorization (RBAC) - Every gated handler routes through `auth.RequirePermission` (or - the router-level `rbacGate` wrap from Phase 3.5). The middleware + the router-level `rbacGate` wrap in `internal/api/router/router.go`). + The middleware resolves the actor's effective permissions via the `Authorizer.CheckPermission` service-layer call; on miss, the handler returns HTTP 403 BEFORE the body runs. This is the @@ -124,11 +123,11 @@ The following actors are NEW with Bundle 2: rotate via the regular RBAC API; the plaintext is not recoverable from the DB. -### Approval workflow + Phase 9 loophole closure +### Approval workflow + flip-flop loophole closure - `CertificateProfile.RequiresApproval=true` gates two surfaces: (a) issuance + renewal of every cert pointing at the profile, - (b) edits to the profile itself (Bundle 1 Phase 9). The Phase 9 + (b) edits to the profile itself. The flip-flop loophole closure closure prevents the flip-flop bypass where an admin disables approval, mutates, re-enables. - Same-actor self-approve is rejected at the service layer with @@ -140,7 +139,7 @@ The following actors are NEW with Bundle 2: ### Audit trail - Every mutating operation flows through `AuditService.RecordEvent` - or `RecordEventWithCategory`. Bundle 1 Phase 8 added the + or `RecordEventWithCategory`. The audit-category extension added the `event_category` column with a `CHECK` constraint enforcing the closed enum (`cert_lifecycle` / `auth` / `config`); the category surfaces the auth-mutation slice to the auditor view. 
@@ -148,7 +147,7 @@ The following actors are NEW with Bundle 2: (`audit_events_worm_trigger`) blocks `UPDATE` and `DELETE` at the database layer. Even an admin DB user cannot tamper with audit history without dropping the trigger. -- Bundle-6's redactor (`internal/service/audit_redact.go`) +- The audit redactor (`internal/service/audit_redact.go`) scrubs credentials + PII from the `details` JSONB before persistence; an `_redacted_keys` field surfaces what the redactor took out for compliance review. @@ -158,14 +157,14 @@ The following actors are NEW with Bundle 2: ACME / SCEP / EST / OCSP / CRL endpoints authenticate via embedded credentials defined by their own RFCs (JWS-signed, challenge passwords, mTLS, public-by-RFC). The auth middleware -explicitly bypasses these via `IsProtocolEndpoint`. The Phase 12 -`internal/api/router/phase12_protocol_allowlist_test.go` pins -the invariant at three layers (middleware bypass, allowlist +explicitly bypasses these via `IsProtocolEndpoint`. The +`internal/api/router/phase12_protocol_allowlist_test.go` regression +test pins the invariant at three layers (middleware bypass, allowlist constant, router-level no-rbacGate-wraps-protocol-paths). -## Defenses Bundle 2 ships +## OIDC + sessions + break-glass defenses -### OIDC token validation (Phase 3) +### OIDC token validation - **Algorithm allow-list, never `none`, never HMAC.** The service- layer pinning lives in `internal/auth/oidc/service.go::disallowedAlgs` @@ -233,7 +232,7 @@ constant, router-level no-rbacGate-wraps-protocol-paths). is `json:"-"` on the domain type so a misconfigured handler cannot wire-leak. -### Session minting + cookies (Phases 4 + 6) +### Session minting + cookies - **Length-prefixed HMAC.** Cookie wire format is `v1...`. @@ -284,7 +283,7 @@ constant, router-level no-rbacGate-wraps-protocol-paths). stolen pre-login cookie cannot be replayed against the post-login gate. 
-### Back-channel logout (Phase 5) +### Back-channel logout - **OpenID Connect Back-Channel Logout 1.0** (NOT RFC 8414). Endpoint: `POST /auth/oidc/back-channel-logout`. The IdP signs a @@ -295,15 +294,15 @@ constant, router-level no-rbacGate-wraps-protocol-paths). `events` (with the spec-mandated logout event type); exactly one of `sub` / `sid`; `nonce` MUST be absent (per spec §2.4 - logout tokens MUST NOT carry a nonce). All four pinned by - Phase 5 negative tests. -- **`jti`-based replay defense.** The Phase 5 implementation + the back-channel-logout negative-test matrix. +- **`jti`-based replay defense.** The handler tracks recently-seen `jti` values to defeat logout-token replay attacks where an attacker captures a logout JWT and replays it. - **Cache-Control: no-store** on the response per spec §2.5. -### OIDC first-admin bootstrap (Phase 7) +### OIDC first-admin bootstrap -- **Coexists with Bundle 1's env-var-token bootstrap.** Both can be +- **Coexists with the env-var-token bootstrap path.** Both can be configured; the admin-existence probe ensures only one wins. - **Group-scoped.** `CERTCTL_BOOTSTRAP_ADMIN_GROUPS` is a comma- separated allowlist of IdP group names; users in any one of those @@ -319,7 +318,7 @@ constant, router-level no-rbacGate-wraps-protocol-paths). - **Audit row on every grant.** `bootstrap.oidc_first_admin` event with `event_category=auth` + INFO log; the auditor monitors. -### Break-glass admin (Phase 7.5) +### Break-glass admin - **Default-OFF.** `CERTCTL_BREAKGLASS_ENABLED=false` is the default; the entire surface (4 endpoints) is disabled. Operators flip it @@ -355,10 +354,10 @@ constant, router-level no-rbacGate-wraps-protocol-paths). - **Rate limit on the public login endpoint.** 5 attempts/minute via the existing `middleware.NewRateLimiter`. -## Bundle 2 threat catalogue +## OIDC + sessions threat catalogue The following sub-sections enumerate the threat surface introduced by -Bundle 2 and the mitigations the platform ships. 
They are deliberately +the OIDC + sessions surface and the mitigations the platform ships. They are deliberately exhaustive - if a threat is listed here it has a concrete mitigation or a documented "operator-driven, out of scope" framing. New threats discovered post-2026-05-10 should be added here with a dated commit @@ -370,10 +369,10 @@ note. |---|---| | Alg confusion (HS256 token signed with the IdP's public key) | Alg allow-list rejects HS256 / HS384 / HS512 / `none`. Service-layer + go-oidc enforce in two layers. IdP-downgrade-attack defense at provider-creation time. | | Audience injection (token issued for a different client) | Service-layer `aud` re-check post-go-oidc verify; multi-aud tokens require matching `azp`. Sentinels `ErrAudienceMismatch` / `ErrAZPRequired` / `ErrAZPMismatch`. | -| Issuer mismatch (token from a different IdP with the same alg + key shape) | Exact `iss` string match (`ErrIssuerMismatch`). The 21-case Phase 3 negative-test matrix pins the byte-for-byte requirement. | +| Issuer mismatch (token from a different IdP with the same alg + key shape) | Exact `iss` string match (`ErrIssuerMismatch`). The 21-case OIDC negative-test matrix pins the byte-for-byte requirement. | | Nonce replay (capturing a fresh token + replaying with the same nonce) | Single-use nonce stored in the pre-login row; `LookupAndConsume` is `DELETE...RETURNING` (atomic). Second use returns `ErrPreLoginNotFound`. | | State replay (CSRF on the IdP redirect) | Same single-use mechanism as nonce. State is `subtle.ConstantTimeCompare`d. | -| `at_hash` substitution (clean ID token with a swapped access token) | `at_hash` REQUIRED when access_token present (Phase 3 tightening of OIDC core's MAY → MUST). `ErrATHashRequired` if missing; `ErrATHashMismatch` if non-matching. | +| `at_hash` substitution (clean ID token with a swapped access token) | `at_hash` REQUIRED when access_token present (certctl tightens OIDC core's MAY → MUST). 
`ErrATHashRequired` if missing; `ErrATHashMismatch` if non-matching. | | `iat` window manipulation (stale token replay) | `iat_window_seconds` configurable per-provider (default 300, cap 600). Future `iat` returns `ErrIATInFuture`; older-than-window returns `ErrIATTooOld`. | | JWKS rotation mid-login | coreos/go-oidc's built-in cache + auto-refresh on TTL expiry. Operator-triggered `Service.RefreshKeys` for forced refresh. | | JWKS-fetch failure during a key rotation | `ErrJWKSUnreachable` (HTTP 503 to in-flight login). Existing sessions untouched. Operator clicks "Refresh discovery cache" once IdP recovers. No exponential backoff. | @@ -382,7 +381,7 @@ note. | Vector | Mitigation | |---|---| -| Cookie theft via XSS | `HttpOnly` on the session cookie; CSP headers from Bundle B's H-1 work prevent inline-script execution. | +| Cookie theft via XSS | `HttpOnly` on the session cookie; CSP headers from the security-hardening middleware prevent inline-script execution. | | Cookie theft via network MITM | `Secure` flag + TLS 1.3-only control plane (HTTPS-Everywhere v2.2 milestone). | | CSRF on state-changing methods | `SameSite=Lax` default + double-submit-cookie pattern with hashed CSRF token on the session row. CSRFMiddleware fires on POST/PUT/PATCH/DELETE for session-authenticated callers; API-key actors are exempt. | | Session-cookie forgery via concatenation collision | Length-prefixed HMAC input (`len(sid):sid:len(kid):kid`). Pinned by two tests + a doc-block at the top of `service.go`. | @@ -422,8 +421,8 @@ control - the trust root is the IdP. Documented behaviors: |---|---|---| | IdP unreachable | certctl never receives the logout signal; sessions persist until idle/absolute timeout (1h/8h defaults). | Operator keeps absolute timeout short relative to risk tolerance. Manual revoke via GUI is always available. | | Logout token signature invalid | certctl returns 400; no session revoked; `auth.oidc_back_channel_logout_failed` audit row. 
| Operator-monitored audit row surfaces forged-logout-token attempts. | -| Logout token replay (attacker captures + replays a valid logout JWT) | `jti`-based deduplication rejects the replay; first delivery succeeds, second returns 400. | Pinned by Phase 5 negative tests. | -| Logout token alg confusion | Same alg allow-list as the login flow; HS-family rejected. | Phase 3 alg allow-list applies to BCL too (same `Provider.RemoteKeySet`). | +| Logout token replay (attacker captures + replays a valid logout JWT) | `jti`-based deduplication rejects the replay; first delivery succeeds, second returns 400. | Pinned by back-channel-logout negative tests. | +| Logout token alg confusion | Same alg allow-list as the login flow; HS-family rejected. | The OIDC alg allow-list applies to BCL too (same `Provider.RemoteKeySet`). | | Missing `events` claim | Spec §2.4 requires the OIDC-defined logout event type; missing returns 400. | Pinned by negative test. | | `nonce` claim present | Spec §2.4 requires `nonce` MUST NOT appear in logout tokens; presence returns 400. | Pinned by negative test. | @@ -440,19 +439,19 @@ threats: | IdP renames a group (e.g. `engineers → eng-team`) | Mappings silently break; users get fewer roles than expected. `auth.oidc_login_unmapped_groups` audit row fires on every such login; auditor monitors for unexpected spikes. | | IdP user maintainer adds a user to an unintended group | Group is mapped to a higher-privilege role than intended; user gets the role on next login. Bounded blast radius: the group→role mapping is what they got, not arbitrary admin. Defense-in-depth: review mappings periodically; the auditor role can pull `auth.oidc_login_succeeded` rows by `details.subject` to spot drift. | -### Bootstrap phase risks (post-Bundle-2) +### Bootstrap phase risks -This section extends Bundle 1's bootstrap section with the OIDC +This section extends the day-0 bootstrap section with the OIDC first-admin path. 
| Vector | Mitigation | |---|---| -| `CERTCTL_BOOTSTRAP_TOKEN` (Bundle 1 fallback) leaks | One-shot via `consumed` bool + admin-existence probe. Both arms close the path the moment any admin lands. (Bundle 1.) | +| `CERTCTL_BOOTSTRAP_TOKEN` (env-var fallback path) leaks | One-shot via `consumed` bool + admin-existence probe. Both arms close the path the moment any admin lands. | | `CERTCTL_BOOTSTRAP_ADMIN_GROUPS` misconfigured to a wide group (e.g. `everyone`) | Unintended user becomes admin on first OIDC login. Mitigation: scope-down via `certctl-cli auth keys scope-down --suggest`. Operators configure narrow groups. The audit row on `bootstrap.oidc_first_admin` surfaces every grant. | | Both bootstrap strategies enabled simultaneously | Whichever fires first wins; the second sees admin-already-exists and falls through to normal mapping. No double-admin landing. | | `CERTCTL_BOOTSTRAP_OIDC_PROVIDER_ID` left unset with multi-IdP deploy | Hook fires on ANY provider's tokens. Mitigation: explicit gate documented in `cmd/server/main.go` startup logging; operator audit reviewed pre-tag. | -### Break-glass risks (Phase 7.5) +### Break-glass risks | Vector | Mitigation | |---|---| @@ -462,7 +461,7 @@ first-admin path. | Operator forgets to disable post-incident | Break-glass becomes a permanent backdoor. Mitigation: WARN log at boot when ENABLED=true; audit row on every break-glass login; runbook prescribes "disable within 24h of SSO recovery." | | Side-channel timing on no-credential vs wrong-password vs locked | All three paths take statistically indistinguishable time via `verifyDummy()`. Pinned by the timing-statistical test. | | Surface fingerprinting (scanner identifies break-glass exists) | All four endpoints return 404 (NOT 403) when disabled. Surface-invisibility - identical to a non-existent route. | -| Reserved-actor `actor-demo-anon` mutation via break-glass admin | Service layer rejects with `ErrAuthReservedActor` (HTTP 409). 
Same gate as the Bundle 1 RBAC path. | +| Reserved-actor `actor-demo-anon` mutation via break-glass admin | Service layer rejects with `ErrAuthReservedActor` (HTTP 409). Same gate as the RBAC path. | ### Token-leak hygiene (the explicit grep policy) @@ -473,8 +472,8 @@ NEVER appear in any log line at any level. The invariant is enforced by per-package `logging_test.go` files that redirect `slog.Default` to a buffer, run the service paths, and grep-assert the secret values are absent from every captured line. -Bundle 1's `internal/auth/bootstrap/service_test.go` is the pattern. -Phases 3, 4, and 7.5 follow the same shape: +The pattern is `internal/auth/bootstrap/service_test.go`; the OIDC, +session, and break-glass packages follow the same shape: - `internal/auth/oidc/logging_test.go` - token / code / verifier / state / nonce / cookie / client_secret / alg name absent from @@ -486,68 +485,43 @@ Phases 3, 4, and 7.5 follow the same shape: Argon2id hash absent from every audit row + log line + HTTP-response shape (json:"-" probe via `json.Marshal`). -The `details` JSONB column on `audit_events` runs through -Bundle-6's redactor (`internal/service/audit_redact.go`) before +The `details` JSONB column on `audit_events` runs through the +audit redactor (`internal/service/audit_redact.go`) before persistence; the redactor's allow-list is conservative enough that adding a new token-shaped field to a new audit row defaults to redacted, not leaked. -## Threats Bundle 1 does NOT close (Bundle 2 closure status) +## Closed federated-identity threats -The list below was the Bundle-1-era deferred-threats catalogue. -Status updated 2026-05-10 to reflect what Bundle 2 closed and what -remains deferred. **The label "Bundle 1 does NOT close" is preserved -for historical traceability**; readers should consult the marker at -the end of each item for current status. +Each item below was an open threat under the earlier API-key-only +deployment posture. 
Status reflects current closure as of v2.1.0. -1. **OIDC / SAML / WebAuthn federation** - ✅ OIDC closed (Bundle 2 - Phases 1-7); SAML deferred to v3; WebAuthn deferred to v3 - (Decision 12 - WebAuthn pairs with break-glass for hardware- - token-MFA). The break-glass path (Phase 7.5) is a partial +1. **OIDC federation** - ✅ closed. SAML and WebAuthn remain on the + future-work list (Decision 12 — WebAuthn pairs with break-glass + for hardware-token MFA). The break-glass path is a partial mitigation for the no-MFA case during SSO incidents. -2. **Session management** - ✅ closed (Bundle 2 Phases 4 + 6). HMAC- - signed `certctl_session` cookie with length-prefixed wire format, +2. **Session management** - ✅ closed. HMAC-signed + `__Host-certctl_session` cookie with length-prefixed wire format, 1h idle / 8h absolute expiry, scheduler-driven GC, server-side revocation list (delete the row), GUI's "Sessions" page surfaces own + all-actor revocation, back-channel logout from the IdP. -3. **Local password accounts (break-glass)** - ✅ closed (Bundle 2 - Phase 7.5). Argon2id + lockout + default-OFF + 404-not-403 - surface invisibility. NOT for general human auth - only the - "SSO is broken, need admin access right now" path. WebAuthn - pairing on the v3 roadmap. -4. **Time-bound role grants / JIT elevation** - **still deferred to - v3.** The schema still reserves `actor_roles.expires_at` with no - UI/API to set it. Bundle 2 introduces session-level idle/absolute - expiry but does not propagate that to role grants. -5. **MFA / hardware tokens for the operator console** - ⚠️ partial - closure. WebAuthn / FIDO2 second factor remains v3 (Decision 12). - Bundle 2's break-glass (Phase 7.5) provides a separate password - factor that operators can pair with OIDC, but it's not a true - second factor on the OIDC login path - the OIDC IdP remains the - sole token source on the federation path. -6. **Rate limiting on the bootstrap endpoint** - acceptable +3. 
**Local password accounts (break-glass)** - ✅ closed. Argon2id + + lockout + default-OFF + 404-not-403 surface invisibility. NOT + for general human auth - only the "SSO is broken, need admin + access right now" path. WebAuthn pairing on the future-work list. +4. **OIDC first-admin bootstrap** - ✅ closed. + `CERTCTL_BOOTSTRAP_ADMIN_GROUPS` + + `CERTCTL_BOOTSTRAP_OIDC_PROVIDER_ID` env vars + group-scoped + + admin-existence-probe. +5. **Rate limiting on the bootstrap endpoint** - acceptable (one-shot by construction; per-IP rate limiting on the broader - API is in place via Bundle C's `middleware.NewRateLimiter`). - Bundle 2 adds the same rate-limit primitive to the break-glass - `/auth/breakglass/login` endpoint at 5/min. -7. **`scope_id` FK enforcement** - **still deferred.** Operators can - grant a permission at scope `profile`/`p-bogus` without the - bogus profile existing. The gate still works (no rows match at - request time) but a strict 404 on grant would be cleaner. - `TODO(bundle-2)` comment is now `TODO(v3)`. -8. **OIDC-first-admin bootstrap** - ✅ closed (Bundle 2 Phase 7). - `CERTCTL_BOOTSTRAP_ADMIN_GROUPS` + `CERTCTL_BOOTSTRAP_OIDC_PROVIDER_ID` - env vars + group-scoped + admin-existence-probe. -9. **GUI E2E suite via Playwright** - **still deferred** to a - follow-on bundle. The Phase 8 GUI ships 28 new Vitest unit-test - cases (5 new test files); full Playwright E2E for the 15 flow - checks from the Bundle 2 prompt's Phase 8 (auth-code login + - group-claim parsing + revoke-revokes-session + JWKS rotation + - etc.) is the operator's call on whether to land before tag. + API is in place via `middleware.NewRateLimiter`). The break-glass + `/auth/breakglass/login` endpoint carries the same rate-limit + primitive at 5/min. -## Threats Bundle 2 does NOT close +## Future-work threats -These are the v3 / future-work deferrals at the post-Bundle-2 mark: +The following are not yet closed: 1. 
**WebAuthn / FIDO2 second factor** - operator console is OIDC (or break-glass password) only. No hardware-token requirement @@ -558,11 +532,11 @@ These are the v3 / future-work deferrals at the post-Bundle-2 mark: the broker pattern (run Keycloak as a SAML-to-OIDC bridge); see the Google Workspace runbook for the same broker shape. 4. **Multi-tenant data isolation activation** - the schema and - repository layer carry tenant_id columns + the Phase 13 query- - coverage CI guard, but tenant ACLs are not enforced. Bundle 2 - ships single-tenant only (`t-default` seeded). The managed- - service hosting work (operator decision item) is where multi- - tenant flips on. + repository layer carry tenant_id columns + a query-coverage CI + guard, but tenant ACLs are not enforced. v2.1.0 ships + single-tenant only (`t-default` seeded). The managed-service + hosting work (operator decision item) is where multi-tenant + flips on. 5. **HSM / FIPS-validated signing key for sessions** - the session signing key is software-only (HMAC-SHA256, in-memory key material, encrypted at rest via `internal/crypto`). Operators @@ -572,9 +546,9 @@ These are the v3 / future-work deferrals at the post-Bundle-2 mark: driver ships yet. 6. **OIDC RP-initiated logout** (the "/end_session_endpoint" flow where certctl signs a logout token + redirects the browser to - the IdP). Bundle 2 implements ONLY the back-channel flow (IdP → + the IdP). v2.1.0 implements ONLY the back-channel flow (IdP → certctl). Operators wanting the full bidirectional logout pair - wait on a follow-on bundle. + wait on a follow-on release. 7. **GUI E2E via Playwright** - tracked alongside #9 above. 8. **Per-IdP runbook external-tester sign-off** - encouraged via the operator-sign-off footers in `oidc-runbooks/*.md` but NOT a @@ -598,8 +572,8 @@ formal certification. append-only at the database layer. 
- **NIST SSDF PO.5.2** (separation of duties) - two-person integrity for compliance-tier issuance via the - `RequiresApproval` flow + Bundle 1 Phase 9's closure of the - flip-flop bypass. + `RequiresApproval` flow + the approval-bypass closure on + profile edits. - **FedRAMP AU-9** (audit information protection) - WORM enforcement + auditor-only read access (the auditor role cannot mutate, the WORM trigger blocks UPDATE/DELETE). @@ -632,7 +606,7 @@ Run these periodically to verify the controls are working. `audit.export` ONLY. Any other permission means a role grant widened the auditor's surface; revoke immediately. -The following checks are NEW with Bundle 2: +The following checks were added with v2.1.0's federated-identity surface: 6. `SELECT COUNT(*) FROM oidc_providers;` - confirm only the expected providers are configured. An unexpected row is a @@ -666,7 +640,7 @@ The following checks are NEW with Bundle 2: ## Cross-references -Bundle 1 (RBAC) anchors: +API-key + RBAC anchors: - [`rbac.md`](rbac.md) - the operator how-to - [`security.md`](security.md) - the wider security posture @@ -685,7 +659,7 @@ Bundle 1 (RBAC) anchors: - `migrations/000033_approval_kinds.up.sql` - approval-bypass closure -Bundle 2 (OIDC + sessions + back-channel logout + break-glass) anchors: +OIDC + sessions + back-channel logout + break-glass anchors: - [`oidc-runbooks/index.md`](oidc-runbooks/index.md) - per-IdP setup guides (Keycloak / Authentik / Okta / Auth0 / Entra ID / Google @@ -698,7 +672,7 @@ Bundle 2 (OIDC + sessions + back-channel logout + break-glass) anchors: CSRF middleware, chained-auth combinator - `internal/auth/breakglass/` - default-OFF break-glass admin (Argon2id + lockout + constant-time + surface-invisibility) -- `internal/auth/oidc/testfixtures/` - Phase 10 Keycloak +- `internal/auth/oidc/testfixtures/` - Keycloak testcontainers harness (`//go:build integration`) - `migrations/000034_oidc_providers.up.sql` - OIDC providers + group-role mappings tables @@ -711,8 
+685,8 @@ Bundle 2 (OIDC + sessions + back-channel logout + break-glass) anchors: - `migrations/000038_breakglass_credentials.up.sql` - break-glass credentials table + 2 new permissions - `scripts/ci-guards/N-bundle-2-security-empty-preserved.sh` - - OpenAPI security: [] count guard + OpenAPI `security: []` count guard - `scripts/ci-guards/bundle-1-compat-regression.sh` - - Bundle-1-only-compat assertions (5 invariants) + API-key-only compat assertions (5 invariants) - `scripts/ci-guards/bundle-1-to-2-upgrade-regression.sh` - - upgrade-path assertions (6 invariants) + OIDC-upgrade-path assertions (6 invariants) diff --git a/docs/operator/database-tls.md b/docs/operator/database-tls.md index 4e2f078..8245724 100644 --- a/docs/operator/database-tls.md +++ b/docs/operator/database-tls.md @@ -2,14 +2,15 @@ > Last reviewed: 2026-05-05 -**Audit reference:** Bundle B / M-018. CWE-319 (Cleartext transmission of sensitive information). +**Audit reference:** CWE-319 (Cleartext transmission of sensitive information). certctl talks to Postgres over a single connection-string URL controlled by the `CERTCTL_DATABASE_URL` env var. The `sslmode` query parameter on that URL -selects the transport-encryption posture. Pre-Bundle-B all the bundled -deployment artifacts (Helm chart, docker-compose) hard-coded `sslmode=disable`. -Bundle B exposes that as an operator-facing knob with a documented default and -explicit opt-in / opt-out paths for the four real-world deployment shapes. +selects the transport-encryption posture. The bundled deployment artifacts +(Helm chart, docker-compose) historically hard-coded `sslmode=disable`; +current builds expose that as an operator-facing knob with a documented +default and explicit opt-in / opt-out paths for the four real-world +deployment shapes. ## Quick reference @@ -26,9 +27,9 @@ explicit opt-in / opt-out paths for the four real-world deployment shapes. 
is the floor for systems exposed to spoofing risk (it adds hostname validation against the server cert's CN/SAN). -## Helm chart (Bundle B) +## Helm chart -Bundle B adds two values under `postgresql.tls`: +The chart exposes two values under `postgresql.tls`: ```yaml postgresql: diff --git a/docs/operator/legacy-clients-tls-1.2.md b/docs/operator/legacy-clients-tls-1.2.md index 76e1110..307b350 100644 --- a/docs/operator/legacy-clients-tls-1.2.md +++ b/docs/operator/legacy-clients-tls-1.2.md @@ -2,7 +2,7 @@ > Last reviewed: 2026-05-05 -**Audit reference:** Bundle F / M-023. CWE-326 (Inadequate encryption strength). +**Audit reference:** CWE-326 (Inadequate encryption strength). ## What this is @@ -149,7 +149,7 @@ hop without server-side header trust. **Why this is the correct default:** trusting a proxy-supplied header for client identity opens a header-spoofing attack surface that requires careful design (CIDR allowlist of trusted proxies, fail-closed defaults, -explicit operator opt-in). The Bundle F closure of M-023 ships the +explicit operator opt-in). The legacy-clients work ships the TLS-bridge guidance as documentation only; a future commit can extend certctl with proxy-header trust if and when an operator demonstrates a deployment shape that requires it. Until that lands, the runbook above @@ -204,6 +204,6 @@ own embedded-device vendors for deprecation notices. 
- [`docs/operator/tls.md`](tls.md) — the certctl-internal TLS configuration (HTTPS-only control plane, MinVersion pin) - [`docs/operator/security.md`](security.md) — overall security posture -- [`docs/operator/database-tls.md`](database-tls.md) — Postgres TLS opt-in (Bundle B / M-018) +- [`docs/operator/database-tls.md`](database-tls.md) — Postgres TLS opt-in - [`docs/reference/protocols/scep-server.md`](../reference/protocols/scep-server.md) — SCEP RFC 8894 native server reference - [`docs/reference/protocols/est.md`](../reference/protocols/est.md) — EST RFC 7030 server reference diff --git a/docs/operator/oidc-runbooks/authentik.md b/docs/operator/oidc-runbooks/authentik.md index d3f1a02..1b6e7e6 100644 --- a/docs/operator/oidc-runbooks/authentik.md +++ b/docs/operator/oidc-runbooks/authentik.md @@ -14,7 +14,7 @@ For the canonical reference + mental model, read [keycloak.md](keycloak.md) firs - Admin access to the Authentik admin console at `https:///if/admin/`. - Network reachability from certctl-server to `https:///application/o//.well-known/openid-configuration`. -**On the certctl side:** same as Keycloak — `CERTCTL_CONFIG_ENCRYPTION_KEY` set, an admin actor holding `auth.oidc.create` + `auth.oidc.edit`, Bundle 2 server build. +**On the certctl side:** same as Keycloak — `CERTCTL_CONFIG_ENCRYPTION_KEY` set, an admin actor holding `auth.oidc.create` + `auth.oidc.edit`, server build ≥ v2.1.0. ## IdP-side configuration diff --git a/docs/operator/oidc-runbooks/azure-ad.md b/docs/operator/oidc-runbooks/azure-ad.md index 29e27ba..4665e60 100644 --- a/docs/operator/oidc-runbooks/azure-ad.md +++ b/docs/operator/oidc-runbooks/azure-ad.md @@ -149,7 +149,7 @@ curl -X POST https://:8443/api/v1/auth/oidc/group-mappings \ }' ``` -Repeat for every group you want to map. **Document the GUID-to-name mapping in your operator runbook** — without it, the next operator looking at certctl's mappings page sees a wall of GUIDs with no way to know which is which. 
Consider naming the mapping descriptively if your group-mapping schema supports it (Bundle 2 doesn't yet — group-mapping descriptions are a parking-lot item for a follow-on bundle). +Repeat for every group you want to map. **Document the GUID-to-name mapping in your operator runbook** — without it, the next operator looking at certctl's mappings page sees a wall of GUIDs with no way to know which is which. Consider naming the mapping descriptively if your group-mapping schema supports it (v2.1.0 doesn't yet — group-mapping descriptions are a parking-lot item for a follow-on release). ## Verification diff --git a/docs/operator/oidc-runbooks/index.md b/docs/operator/oidc-runbooks/index.md index 3aaf8d5..ea76811 100644 --- a/docs/operator/oidc-runbooks/index.md +++ b/docs/operator/oidc-runbooks/index.md @@ -2,7 +2,7 @@ > Last reviewed: 2026-05-10 -This is the index for the per-IdP setup runbooks that ship with Auth Bundle 2 (OIDC + sessions). Pick the runbook that matches your identity provider; each one walks you through the IdP-side configuration, the certctl-side configuration, end-to-end verification, and the most common troubleshooting paths. +This is the index for the per-IdP setup runbooks for certctl's OIDC SSO surface. Pick the runbook that matches your identity provider; each one walks you through the IdP-side configuration, the certctl-side configuration, end-to-end verification, and the most common troubleshooting paths. For the threat model behind certctl's OIDC implementation, see [`auth-threat-model.md`](../auth-threat-model.md). For the RBAC primitive that group→role mappings target, see [`rbac.md`](../rbac.md). For the underlying protocol details (PKCE, state, nonce, JWKS rotation, fail-closed semantics), see the OIDC service docstring at [`internal/auth/oidc/service.go`](../../../internal/auth/oidc/service.go). @@ -35,7 +35,7 @@ These show up in every runbook; understand them once and skim the rest. 
**Client secret rotation.** Every IdP issues a `client_secret` for the confidential client (certctl is always a confidential client; public clients aren't supported because we have a server-side place to keep the secret). Rotating at the IdP requires the operator to PUT the new secret into certctl via the GUI's "Edit provider" dialog or `certctl_auth_update_oidc_provider` MCP tool — leaving `client_secret` empty in the update payload preserves the existing ciphertext, providing a value rotates. -**JWKS cache TTL.** The certctl service caches the IdP's JWKS document for `jwks_cache_ttl_seconds` (default 3600). When the IdP rotates a signing key, in-flight logins that try to validate a new-key-signed token against the stale cache fail with `ErrJWKSUnreachable` until the next refresh. Operators have two options: wait out the TTL, or click "Refresh discovery cache" in the GUI's OIDC Provider Detail page (`POST /api/v1/auth/oidc/providers/{id}/refresh`) to force-evict the cache. The Phase 10 Keycloak integration test exercises this drill end to end. +**JWKS cache TTL.** The certctl service caches the IdP's JWKS document for `jwks_cache_ttl_seconds` (default 3600). When the IdP rotates a signing key, in-flight logins that try to validate a new-key-signed token against the stale cache fail with `ErrJWKSUnreachable` until the next refresh. Operators have two options: wait out the TTL, or click "Refresh discovery cache" in the GUI's OIDC Provider Detail page (`POST /api/v1/auth/oidc/providers/{id}/refresh`) to force-evict the cache. The Keycloak integration test exercises this drill end to end. **Group→role mappings are fail-closed.** The certctl service refuses to mint a session for a user whose IdP-supplied groups don't match ANY configured mapping (`ErrGroupsUnmapped` → HTTP 401 to the user with a "no roles assigned" page). This is intentional — empty mapping ≠ "let everyone in," it means "this provider is not yet configured for any role." 
Operators add at least one mapping (typically `` → `r-operator`) BEFORE rolling out OIDC to users. @@ -51,5 +51,5 @@ Each per-IdP runbook ends with a **validation checklist** the operator runs agai - [RBAC operator reference](../rbac.md) — roles, permissions, scope-down + bootstrap flow. - [Auth threat model](../auth-threat-model.md) — API-key + OIDC + session compromise scenarios; v3 WebAuthn pairing. -- [Security posture](../security.md) — overall auth surface incl. this Bundle 2 OIDC layer. -- [API keys → RBAC migration](../../migration/api-keys-to-rbac.md) — the Bundle 1 upgrade flow your operator likely already ran. +- [Security posture](../security.md) — overall auth surface including this OIDC layer. +- [API keys → RBAC migration](../../migration/api-keys-to-rbac.md) — the v2.0.x → v2.1.0 RBAC upgrade flow your operator likely already ran. diff --git a/docs/operator/oidc-runbooks/keycloak.md b/docs/operator/oidc-runbooks/keycloak.md index 28e4039..6232b9c 100644 --- a/docs/operator/oidc-runbooks/keycloak.md +++ b/docs/operator/oidc-runbooks/keycloak.md @@ -2,7 +2,7 @@ > Last reviewed: 2026-05-10 -This is the canonical reference runbook for wiring certctl's OIDC SSO surface against [Keycloak](https://www.keycloak.org/). Keycloak is a free / open-source identity provider that runs on-prem or self-hosted; it is also the load-bearing test fixture for Phase 10 of Auth Bundle 2 (`internal/auth/oidc/testfixtures/keycloak.go`), so the certctl-side validation pipeline is exhaustively exercised against it. +This is the canonical reference runbook for wiring certctl's OIDC SSO surface against [Keycloak](https://www.keycloak.org/). Keycloak is a free / open-source identity provider that runs on-prem or self-hosted; it is also the load-bearing test fixture for certctl's OIDC integration tests (`internal/auth/oidc/testfixtures/keycloak.go`), so the certctl-side validation pipeline is exhaustively exercised against it. 
If your IdP is something else (Okta, Auth0, Azure AD, Authentik, Google Workspace), see the per-IdP siblings in [this directory](index.md). The mental model + certctl-side wiring are identical; only the IdP-side console differs. @@ -10,7 +10,7 @@ If your IdP is something else (Okta, Auth0, Azure AD, Authentik, Google Workspac **On the Keycloak side:** -- Keycloak ≥ 25.0 (older versions work but the screen flows differ slightly — the Phase 10 fixture pins 25.0). +- Keycloak ≥ 25.0 (older versions work but the screen flows differ slightly — the integration test fixture pins 25.0). - Admin access to a realm — either an existing tenant realm or a fresh one created for certctl. Don't share Keycloak's `master` realm; create a dedicated realm. - Network reachability from certctl-server to the Keycloak `https:///realms/` discovery endpoint. The certctl service fetches `/.well-known/openid-configuration` at provider creation and at every `RefreshKeys` call. - Keycloak's signing alg set to RS256 (default) or any of: RS512, ES256, ES384, EdDSA. HS256/HS384/HS512 + `none` are rejected by certctl's IdP-downgrade-attack defense at provider creation time. @@ -19,11 +19,11 @@ If your IdP is something else (Okta, Auth0, Azure AD, Authentik, Google Workspac - `CERTCTL_CONFIG_ENCRYPTION_KEY` set to a stable secret (production deployments only — the encryption-at-rest layer for the OIDC client_secret depends on it). - An admin actor holding `auth.oidc.create` + `auth.oidc.edit` (held by `r-admin` by default; granted via `certctl_auth_assign_role_to_key` MCP tool or the GUI's Auth → Keys page). -- Bundle 2 server build ≥ v2.1.0 (or post-`5204f1b` master). +- Server build ≥ v2.1.0. ## IdP-side configuration -The same configuration you'll do by hand here is what the Phase 10 testcontainers fixture imports from `internal/auth/oidc/testfixtures/keycloak-realm.json` — read that file alongside this runbook to see the exact JSON shape Keycloak persists. 
+The same configuration you'll do by hand here is what the testcontainers fixture imports from `internal/auth/oidc/testfixtures/keycloak-realm.json` — read that file alongside this runbook to see the exact JSON shape Keycloak persists. ### 1. Create or pick a realm @@ -194,7 +194,7 @@ Operator action when Keycloak rotates its realm signing key: 2. In certctl: GUI → **Auth → OIDC Providers → Keycloak → Refresh discovery cache** button. Or the CLI / MCP equivalent: `POST /api/v1/auth/oidc/providers//refresh`. 3. Run another login. The new ID token is signed under the new key; the certctl service validates it against the freshly-fetched JWKS doc. -The Phase 10 integration test `TestKeycloakIntegration_JWKSRotation_RefreshKeysPicksUpNewKey` exercises this exact flow end to end. +The Keycloak integration test `TestKeycloakIntegration_JWKSRotation_RefreshKeysPicksUpNewKey` exercises this exact flow end to end. ## Troubleshooting @@ -214,7 +214,7 @@ The user authenticated successfully but their groups didn't match any configured - The group-membership mapper is configured correctly (Clients → certctl → Client scopes → certctl-dedicated → mappers → groups → "Full group path: off" matters). - The group name in your certctl mapping exactly matches what Keycloak emits — case-sensitive, no leading slash if "Full group path: off". -You can confirm what Keycloak is actually emitting by decoding the ID token at jwt.io against the Keycloak public key, or by enabling certctl's debug logging on the OIDC service for one login (logs are scrubbed of token contents per the Phase 3 token-leak hygiene contract; debug logs surface only the resolved group list and the mapping decision). 
+You can confirm what Keycloak is actually emitting by decoding the ID token at jwt.io against the Keycloak public key, or by enabling certctl's debug logging on the OIDC service for one login (logs are scrubbed of token contents per the OIDC service's token-leak hygiene contract; debug logs surface only the resolved group list and the mapping decision). **"id_token verify failed: token used before issued"** Clock skew between Keycloak and certctl-server. Either align both to NTP, or bump `iat_window_seconds` on the OIDC provider config (default 300 = 5 minutes). The certctl service caps `iat_window_seconds` at 600. @@ -226,7 +226,7 @@ The user clicked the OIDC login button, then the browser tab idled past the 10-m Either the user double-submitted a callback URL (clicked it twice from email or browser history), or a CSRF attempt. The pre-login row is single-use; second consumption returns `ErrPreLoginNotFound`. Have them retry from the login page. **Sessions revoked but the user can still hit the API.** -Check the Phase 4 session contract: the cookie is HMAC-validated on every request, but the actual database row is what `Revoke` deletes. If your reverse proxy is caching the response or the `certctl_session` cookie wasn't actually cleared on the client, the cookie will hit the server's session middleware which will return 401 on the missing-row lookup. The middleware never serves stale data; the issue is upstream of certctl in this case. +Check the session contract: the cookie is HMAC-validated on every request, but the actual database row is what `Revoke` deletes. If your reverse proxy is caching the response or the `__Host-certctl_session` cookie wasn't actually cleared on the client, the cookie will hit the server's session middleware which will return 401 on the missing-row lookup. The middleware never serves stale data; the issue is upstream of certctl in this case. 
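The two-step session check described in the last troubleshooting entry (HMAC validation of the cookie, then a database row lookup that `Revoke` invalidates) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not certctl's middleware: the cookie layout (`<sessionID>.<signature>`), the `sign` and `validate` helpers, and the in-memory `rows` map standing in for the sessions table are all hypothetical.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"errors"
	"fmt"
	"strings"
)

// Hypothetical cookie layout, for illustration only:
//   <sessionID>.<base64url(HMAC-SHA256(key, sessionID))>
// certctl's real cookie format is not reproduced here.

var (
	errBadSignature = errors.New("session: bad signature")
	errRevoked      = errors.New("session: no such row (revoked or expired)")
)

// sign builds a cookie value for sessionID under key.
func sign(key []byte, sessionID string) string {
	mac := hmac.New(sha256.New, key)
	mac.Write([]byte(sessionID))
	return sessionID + "." + base64.RawURLEncoding.EncodeToString(mac.Sum(nil))
}

// validate runs both checks: the signature first, then the row lookup.
// rows stands in for the sessions table; a deleted entry models Revoke.
func validate(key []byte, cookie string, rows map[string]bool) (string, error) {
	id, sig, ok := strings.Cut(cookie, ".")
	if !ok {
		return "", errBadSignature
	}
	got, err := base64.RawURLEncoding.DecodeString(sig)
	if err != nil {
		return "", errBadSignature
	}
	mac := hmac.New(sha256.New, key)
	mac.Write([]byte(id))
	// hmac.Equal compares in constant time.
	if !hmac.Equal(got, mac.Sum(nil)) {
		return "", errBadSignature
	}
	// A validly signed cookie is still rejected once its row is gone.
	if !rows[id] {
		return "", errRevoked
	}
	return id, nil
}

func main() {
	key := []byte("demo-signing-key")
	rows := map[string]bool{"s-123": true}
	cookie := sign(key, "s-123")

	_, err := validate(key, cookie, rows)
	fmt.Println("before revoke:", err)

	delete(rows, "s-123") // models Revoke deleting the database row
	_, err = validate(key, cookie, rows)
	fmt.Println("after revoke:", err)
}
```

The point of the sketch is the ordering: a cookie that the client never cleared still carries a valid signature, but it fails at the row lookup, which is why a lingering cookie by itself does not keep a revoked session alive.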
## Validation checklist diff --git a/docs/operator/oidc-runbooks/okta.md b/docs/operator/oidc-runbooks/okta.md index 20bc308..932d5fb 100644 --- a/docs/operator/oidc-runbooks/okta.md +++ b/docs/operator/oidc-runbooks/okta.md @@ -112,7 +112,7 @@ End-to-end login + audit + Sessions checks are identical to Keycloak. **Okta-specific:** the audit row's `details.subject` will be Okta's user UID (a 20-char alphanumeric string starting with `00u`), stable across email changes. The certctl `users` table's `oidc_subject` column will hold this UID. -**Optional Okta smoke test in CI:** Phase 10 ships an opt-in smoke test at `internal/auth/oidc/integration_okta_smoke_test.go` (build tags `integration && okta_smoke`). Set `OKTA_ISSUER` + `OKTA_CLIENT_ID` + `OKTA_CLIENT_SECRET` env vars and run `make okta-smoke-test` to drive a discovery + RefreshKeys round-trip against your live tenant. Pre-reqs: enable the Resource Owner Password (ROPC) grant on the application (Sign-On tab → Grant types → Resource Owner Password) for the smoke test only; production certctl uses auth-code-with-PKCE. +**Optional Okta smoke test in CI:** certctl ships an opt-in smoke test at `internal/auth/oidc/integration_okta_smoke_test.go` (build tags `integration && okta_smoke`). Set `OKTA_ISSUER` + `OKTA_CLIENT_ID` + `OKTA_CLIENT_SECRET` env vars and run `make okta-smoke-test` to drive a discovery + RefreshKeys round-trip against your live tenant. Pre-reqs: enable the Resource Owner Password (ROPC) grant on the application (Sign-On tab → Grant types → Resource Owner Password) for the smoke test only; production certctl uses auth-code-with-PKCE. **JWKS-rotation drill:** Okta auto-rotates signing keys every ~3 months and publishes the new key alongside the old in the JWKS doc for ~1 month overlap. Manual rotation: **Security → API → Authorization Servers → default → Keys → "Generate new key"**. After rotation, click "Refresh discovery cache" in certctl's GUI; new tokens validate immediately. 
diff --git a/docs/operator/rbac.md b/docs/operator/rbac.md index ac34662..7789904 100644 --- a/docs/operator/rbac.md +++ b/docs/operator/rbac.md @@ -9,14 +9,14 @@ > [`security.md#demo-to-production-cutover-audit-2026-05-11-a-8`](security.md#demo-to-production-cutover-audit-2026-05-11-a-8). This is the operator-facing reference for the role-based access -control primitive that ships with Bundle 1 (auth bundle 1) of certctl. +control primitive in certctl. Read this if you're running certctl in production and need to grant / revoke access to API keys, set up the auditor split, or onboard the first admin. For the threat model behind these controls, see [`auth-threat-model.md`](auth-threat-model.md). For the migration -flow from a pre-Bundle-1 deployment, see +flow from a pre-RBAC (v2.0.x) deployment, see [`docs/migration/api-keys-to-rbac.md`](../migration/api-keys-to-rbac.md). ## Mental model @@ -69,7 +69,7 @@ giving them the keys to the kingdom. The forward. The five **admin-only fine-grained perms** seeded by migration -000030 (Phase 3.5 conversion) gate the high-blast-radius endpoints: +000030 gate the high-blast-radius endpoints: - `cert.bulk_revoke` - `POST /api/v1/certificates/bulk-revoke` and the EST sibling - `crl.admin` - `/api/v1/admin/crl/cache` @@ -141,14 +141,14 @@ even if no scoped grant exists. The reverse is also true - a scoped grant doesn't satisfy a request against a different scope. The Authorizer's `CheckPermission` is the single point of truth. -> **Note (Bundle 1 deferral):** the `scope_id` column is not +> **Note (deferral):** the `scope_id` column is not > currently FK-constrained against the resource tables. An > operator can grant a permission at scope `profile`/`p-bogus` > without `p-bogus` existing; the gate still works (no rows match -> at request time), but the API does not 404 the grant. Bundle 2 -> tracks the strict-FK closure. See +> at request time), but the API does not 404 the grant. 
Strict-FK +> closure is tracked for a follow-on release. See > `internal/repository/postgres/auth.go::AddPermission`'s -> `TODO(bundle-2)` comment. +> `TODO` comment. ## Granting + revoking access @@ -194,7 +194,7 @@ certctl-cli auth keys scope-down --non-interactive ./scope-down.json The mutating role-lifecycle commands (`certctl-cli auth roles create / update / delete` + `roles add-permission / remove-permission`) -are tracked as Bundle 1 Phase 5.5 follow-up; today, manage custom +are tracked as a follow-on; today, manage custom roles via the HTTP API or GUI. ### From the HTTP API @@ -258,7 +258,7 @@ distinguish wide cleanups from targeted demotions in the access log. ### From the MCP server -Bundle 1 Phase 11 ships 12 RBAC tools: +The MCP server ships 12 RBAC tools: `certctl_auth_me`, `certctl_auth_list_roles`, `certctl_auth_get_role`, `certctl_auth_create_role`, `certctl_auth_update_role`, `certctl_auth_delete_role`, `certctl_auth_list_permissions`, @@ -296,7 +296,7 @@ To create an auditor key: ## Day-0 bootstrap (first-admin path) -Bundle 1 Phase 6 ships a one-shot bootstrap endpoint for fresh +certctl ships a one-shot bootstrap endpoint for fresh deployments where no admin actor exists yet. 1. Set `CERTCTL_BOOTSTRAP_TOKEN=$(openssl rand -hex 32)` in the @@ -321,9 +321,10 @@ deployments where no admin actor exists yet. The token is constant-time-compared. The server logs a startup warning if `CERTCTL_BOOTSTRAP_TOKEN` is set AND admin actors -already exist (config-drift signal). For OIDC-first-admin (the -"first user who signs in via SSO becomes admin" pattern), wait for -Bundle 2. +already exist (config-drift signal). For the OIDC-first-admin +path (the "first user who signs in via SSO becomes admin" +pattern), see +[`docs/migration/oidc-enable.md`](../migration/oidc-enable.md). ## Demo mode (`CERTCTL_AUTH_TYPE=none`) @@ -344,11 +345,11 @@ example folders only. 
- [Threat model](auth-threat-model.md) - what attacks this primitive defends against and which it does not - [Migration guide](../migration/api-keys-to-rbac.md) - moving - pre-Bundle-1 deployments onto RBAC + pre-RBAC (v2.0.x) deployments onto RBAC - [Profiles](../reference/profiles.md) - the `RequiresApproval=true` - flow that Bundle 1 Phase 9 closure protects from flip-flop -- [Approval workflow](approval-workflow.md) - the Rank 7 Infisical - deep-research deliverable that the Phase 9 closure piggybacks on + flow with the flip-flop-bypass closure +- [Approval workflow](approval-workflow.md) - the two-person + integrity primitive backing `RequiresApproval` - `internal/auth/` - the middleware + keystore + RequirePermission - `internal/service/auth/` - the service-layer Authorizer - `cowork/auth-bundle-1-prompt.md` - the design + phase plan diff --git a/docs/operator/runbooks/disaster-recovery.md b/docs/operator/runbooks/disaster-recovery.md index 46a01ba..6860030 100644 --- a/docs/operator/runbooks/disaster-recovery.md +++ b/docs/operator/runbooks/disaster-recovery.md @@ -2,12 +2,11 @@ > Last reviewed: 2026-05-05 -> **Status (this document):** Production hardening II Phase 10 -> deliverable. Codifies the fail-safe behaviors that already exist in -> the codebase and the operator procedures for recovering from -> common failure modes. Nothing in this runbook requires new code — -> if a procedure here doesn't work as documented, that's a bug in -> docs (file an issue). +> **Status (this document):** Operator runbook codifying the +> fail-safe behaviors that already exist in the codebase and the +> procedures for recovering from common failure modes. Nothing in +> this runbook requires new code — if a procedure here doesn't work +> as documented, that's a bug in docs (file an issue). 
This runbook is the on-call deliverable: it tells reviewers and on-call operators what to do when a piece of certctl's state diff --git a/docs/operator/security.md b/docs/operator/security.md index c9cbf8a..369687e 100644 --- a/docs/operator/security.md +++ b/docs/operator/security.md @@ -9,16 +9,15 @@ any). ## OCSP responder availability -**Audit reference:** Bundle C / M-020. CWE-770 (uncontrolled resource -consumption); RFC 6960 (OCSP); RFC 7633 (Must-Staple). +**Audit reference:** CWE-770 (uncontrolled resource consumption); RFC +6960 (OCSP); RFC 7633 (Must-Staple). certctl ships an OCSP responder at `/.well-known/pki/ocsp/{issuer_id}/{serial}` -that signs a fresh response per request. Pre-Bundle-C the unauth handler -chain had no rate limit, so an attacker could DoS the responder and force -fail-open relying parties to accept revoked certificates as valid. Bundle C -adds the same per-key rate limiter to the unauth chain that the authenticated -chain has used since Bundle B. Per-IP keying applies because OCSP traffic is -unauthenticated. +that signs a fresh response per request. The unauth handler chain +applies the same per-key rate limiter the authenticated chain uses; +per-IP keying applies because OCSP traffic is unauthenticated. Without +this defense an attacker could DoS the responder and force fail-open +relying parties to accept revoked certificates as valid. The rate limiter alone does not solve the underlying revocation-bypass risk. **The architectural fix is for issued certificates to carry the OCSP @@ -59,11 +58,11 @@ For certificates issued to systems where revocation correctness matters: ## Postgres transport encryption -See [docs/database-tls.md](database-tls.md). Bundle B / M-018. +See [docs/database-tls.md](database-tls.md). ## Encryption at rest -Bundle B / M-001. 
PBKDF2-SHA256 at 600,000 rounds (OWASP 2024 Password +PBKDF2-SHA256 at 600,000 rounds (OWASP 2024 Password Storage Cheat Sheet floor) for the operator-supplied passphrase that derives the AES-256-GCM key for sensitive config columns. v3 blob format with a per-ciphertext random salt; v1/v2 read fallback for legacy rows. @@ -72,13 +71,13 @@ the accompanying tests for the format spec. ## Authentication surface -Bundle B / M-002. Two layers decide auth-exempt status: +Two layers decide auth-exempt status: 1. **Router layer:** `internal/api/router/router.go::AuthExemptRouterRoutes` - the endpoints registered via direct `r.mux.Handle` without going through the middleware chain (`/health`, `/ready`, `/api/v1/auth/info`, - `/api/v1/version`, plus `/api/v1/auth/bootstrap` GET + POST per - Bundle 1 Phase 6). + `/api/v1/version`, plus `/api/v1/auth/bootstrap` GET + POST for the + first-admin path). 2. **Dispatch layer:** `internal/api/router/router.go::AuthExemptDispatchPrefixes` - URL-prefix routing in `cmd/server/main.go::buildFinalHandler` for `/.well-known/pki/*`, `/.well-known/est/*`, `/.well-known/est-mtls`, @@ -87,26 +86,25 @@ Bundle B / M-002. Two layers decide auth-exempt status: Both lists have AST-walking regression tests (`auth_exempt_test.go`) that fail CI if a new bypass lands without updating the documented constant. -### RBAC primitive (Bundle 1) +### Role-based authorization -Bundle 1 ships role-based authorization on top of API-key -authentication. Every gated handler routes through the -`auth.RequirePermission` middleware (or its router-level wrap -`rbacGate`); the middleware resolves the actor's effective -permissions via the service-layer `Authorizer.CheckPermission` -and returns HTTP 403 BEFORE the handler body runs on miss. 
The -seven default roles (`admin` / `operator` / `viewer` / `agent` / -`mcp` / `cli` / `auditor`), 33-permission canonical catalogue, -and the auditor split (`r-auditor` holds only `audit.read` + -`audit.export`) are seeded by migration 000029. +Role-based authorization runs on top of API-key authentication. Every +gated handler routes through the `auth.RequirePermission` middleware +(or its router-level wrap `rbacGate`); the middleware resolves the +actor's effective permissions via the service-layer +`Authorizer.CheckPermission` and returns HTTP 403 BEFORE the handler +body runs on miss. The seven default roles (`admin` / `operator` / +`viewer` / `agent` / `mcp` / `cli` / `auditor`), 33-permission +canonical catalogue, and the auditor split (`r-auditor` holds only +`audit.read` + `audit.export`) are seeded by migration 000029. For the operator how-to, see [`rbac.md`](rbac.md). For the threat model + compliance mapping, see [`auth-threat-model.md`](auth-threat-model.md). For the upgrade -flow from a pre-Bundle-1 deployment, see +flow from an API-key-only deployment, see [`docs/migration/api-keys-to-rbac.md`](../migration/api-keys-to-rbac.md). -### Day-0 admin bootstrap (Bundle 1 Phase 6) +### Day-0 admin bootstrap Fresh deployments where no admin actor exists yet can mint the first admin via `POST /api/v1/auth/bootstrap` - set @@ -119,24 +117,25 @@ into the HTTP response body. See [`rbac.md`](rbac.md#day-0-bootstrap-first-admin-path) for the full flow. -### Approval-bypass closure (Bundle 1 Phase 9) +### Approval-bypass closure `CertificateProfile.RequiresApproval=true` profiles route both issuance/renewal AND profile edits through the -`ApprovalService` two-person integrity gate (Phase 9 closes the -flip-flop loophole where an admin could disable approval, mutate, -re-enable). Same-actor self-approve is rejected at the service -layer with `ErrApproveBySameActor`. See +`ApprovalService` two-person integrity gate. 
The flip-flop loophole +(an admin disabling approval, mutating, re-enabling) is closed by +gating profile-edit through the same approval flow. Same-actor +self-approve is rejected at the service layer with +`ErrApproveBySameActor`. See [`docs/reference/profiles.md`](../reference/profiles.md) for the full gate semantics. -### OIDC federation (Bundle 2 Phases 1-7) +### OIDC federation -Bundle 2 adds OIDC SSO on top of the API-key + RBAC foundation. -Operators configure one or more identity providers (Keycloak, -Authentik, Okta, Auth0, Entra ID, or Google Workspace via Keycloak -broker); end users sign in at the IdP, certctl validates the -returned ID token, and a session cookie is minted. +OIDC SSO runs on top of the API-key + RBAC foundation. Operators +configure one or more identity providers (Keycloak, Authentik, Okta, +Auth0, Entra ID, or Google Workspace via Keycloak broker); end users +sign in at the IdP, certctl validates the returned ID token, and a +session cookie is minted. The token-validation pipeline pins: @@ -151,9 +150,9 @@ The token-validation pipeline pins: - Exact `iss` match (`ErrIssuerMismatch`). - `aud` membership + `azp` for multi-aud tokens (per OIDC core §3.1.3.7 step 5). -- `at_hash` REQUIRED-when-access_token-present (Phase 3 tightening - of the spec MAY → MUST so a substituted access token cannot - ride alongside a clean ID token). +- `at_hash` REQUIRED-when-access_token-present (a tightening of the + spec MAY → MUST so a substituted access token cannot ride alongside + a clean ID token). - Single-use state + nonce (32-byte random server-generated; atomic `DELETE...RETURNING` on consume). - PKCE-S256 mandatory; `plain` rejected. @@ -175,7 +174,7 @@ Per-IdP setup guides at [`oidc-runbooks/index.md`](oidc-runbooks/index.md) cover Keycloak, Authentik, Okta, Auth0, Entra ID, and Google Workspace. -### Sessions + back-channel logout (Bundle 2 Phases 4-6) +### Sessions + back-channel logout Successful OIDC login mints a session cookie: `v1...`. 
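Two of the checks pinned in the token-validation pipeline listed earlier in this section, PKCE-S256 and `at_hash`, reduce to short hash computations defined by RFC 7636 and OIDC Core 1.0. The sketch below shows those computations only; the function names are hypothetical and it is not certctl's code. It assumes an ID token signed with a SHA-256-family algorithm (for example RS256 or ES256); other allowed algorithms use a different hash for `at_hash`.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"encoding/base64"
	"fmt"
)

// pkceS256 derives the code_challenge from a code_verifier:
// BASE64URL(SHA256(ASCII(code_verifier))), per RFC 7636 section 4.2.
func pkceS256(verifier string) string {
	sum := sha256.Sum256([]byte(verifier))
	return base64.RawURLEncoding.EncodeToString(sum[:])
}

// atHashSHA256 computes at_hash for SHA-256-family ID tokens: the base64url
// encoding of the left-most half of SHA-256 over the access token.
func atHashSHA256(accessToken string) string {
	sum := sha256.Sum256([]byte(accessToken))
	return base64.RawURLEncoding.EncodeToString(sum[:len(sum)/2])
}

// checkATHash models the required-when-present rule: if an access token was
// returned, the at_hash claim must exist and must match it.
func checkATHash(claim, accessToken string) bool {
	if accessToken == "" {
		return true // nothing to bind
	}
	if claim == "" {
		return false // access token present but no at_hash: reject
	}
	want := atHashSHA256(accessToken)
	return subtle.ConstantTimeCompare([]byte(claim), []byte(want)) == 1
}

func main() {
	// Verifier taken from the worked example in RFC 7636 Appendix B.
	fmt.Println(pkceS256("dBjftJeZ4CVP-mB92K27uhbUJU1p1r_wW1gFWFOEjXk"))

	token := "example-access-token"
	fmt.Println(checkATHash(atHashSHA256(token), token))
	fmt.Println(checkATHash("", token))
}
```

Treating a missing `at_hash` as a failure whenever an access token is present is what prevents a substituted access token from riding alongside a clean ID token.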
@@ -220,9 +219,9 @@ For threat-model coverage of these surfaces, see operator-runnable performance baselines, see [`auth-benchmarks.md`](auth-benchmarks.md). -### OIDC first-admin bootstrap (Bundle 2 Phase 7) +### OIDC first-admin bootstrap -Coexists with Bundle 1's env-var-token bootstrap. When the +Coexists with the env-var-token bootstrap path. When the operator sets `CERTCTL_BOOTSTRAP_ADMIN_GROUPS` + (optionally) `CERTCTL_BOOTSTRAP_OIDC_PROVIDER_ID`, the first user with one of those IdP groups becomes admin on first login per tenant. @@ -232,7 +231,7 @@ once any actor holds `r-admin`, the OIDC bootstrap hook silently falls through to normal mapping. Audit row on every grant (`bootstrap.oidc_first_admin`, `event_category=auth`). -### Break-glass admin (Bundle 2 Phase 7.5) +### Break-glass admin Default-OFF (`CERTCTL_BREAKGLASS_ENABLED=false`). When enabled, the local-password admin path bypasses OIDC + group-claim layers; @@ -319,8 +318,8 @@ Operator workflow at production cutover: ### Migrating an existing deployment to OIDC -A Bundle-1-merged deployment that wants to add OIDC follows the -step-by-step at +An existing API-key-only deployment that wants to add OIDC follows +the step-by-step at [`docs/migration/oidc-enable.md`](../migration/oidc-enable.md): configure CERTCTL_CONFIG_ENCRYPTION_KEY, pick + configure an IdP per the relevant runbook, configure the certctl-side OIDCProvider @@ -330,7 +329,7 @@ organization. ## Per-user rate limiting -Bundle B / M-025. Authenticated callers are bucketed by API-key name; +Authenticated callers are bucketed by API-key name; unauthenticated callers (probes, OCSP relying parties, EST/SCEP enrollees) are bucketed by source IP. `RPS` and `BurstSize` are per-key budgets. `PerUserRPS` / `PerUserBurstSize` give authenticated clients a separate @@ -345,11 +344,7 @@ certctl's API keys are configured via the `CERTCTL_API_KEYS_NAMED` env var in-memory list. 
There is no DB-resident key store, no GUI, no `/api/v1/keys` endpoint - the env var IS the key inventory. -Pre-Bundle-G the env var rejected duplicate names, so rotating a key -required: stop accepting OLDKEY → restart → roll NEWKEY out. Any client -polling against OLDKEY during the restart window hit a 401. - -Bundle G adds a **double-key rotation window**: two entries can share a +The env var supports a **double-key rotation window**: two entries can share a name during the rollover, and both keys validate. Operators run the rotation as: @@ -395,7 +390,7 @@ the end of step 4, extend the window before step 5. startup** (privilege escalation guard). - Two entries with the same `(name, key)` pair: **rejected at startup** (typo guard - rotation requires DIFFERENT keys under the same name). -- Single-entry steady state: unchanged from pre-Bundle-G behavior. +- Single-entry steady state: the simple legacy behaviour. ### What the contract does NOT do diff --git a/docs/reference/auth-standards-implemented.md b/docs/reference/auth-standards-implemented.md index b2699f3..d4871cf 100644 --- a/docs/reference/auth-standards-implemented.md +++ b/docs/reference/auth-standards-implemented.md @@ -2,7 +2,7 @@ > Last reviewed: 2026-05-10 -This document is an honest informational reference for operators, external testers, and acquirers who want to know which RFCs and standards Auth Bundle 1 (RBAC) and Auth Bundle 2 (OIDC + sessions + back-channel logout + break-glass) implement, and which CWE weakness classes the implementation closes. Every row points at a real file or migration in this repository. +This document is an honest informational reference for operators, external testers, and acquirers who want to know which RFCs and standards certctl's authentication surface (API keys + RBAC + OIDC + sessions + back-channel logout + break-glass admin) implements, and which CWE weakness classes the implementation closes. Every row points at a real file or migration in this repository. 
This document is intentionally NOT a compliance-mapping doc. The operator retired the framework-mapping subtree (`docs/compliance/{index,soc2,pci-dss,nist-sp-800-57}.md`) on 2026-05-05; framework-name-drops (SOC 2 / PCI-DSS / HIPAA / NIST SSDF / FedRAMP) are also swept from prose mentions across `README.md` and `docs/` per that decision. RFC and CWE references stay because they are precise technical pointers; framework labels were marketing-flavored and prone to overclaim. If you are an auditor mapping certctl's controls to a framework, treat the rows below as evidence and do the framework mapping yourself against the framework you are auditing against. @@ -17,15 +17,15 @@ Each row carries at least one negative test (a test that asserts the fail-closed | RFC 6749 (OAuth 2.0) | Authorization-code grant via OIDC; confidential-client credentials only | `internal/auth/oidc/service.go` (HandleAuthRequest, HandleCallback) | `internal/auth/oidc/service_test.go` (21+ negatives covering wrong aud / wrong iss / expired / etc.) 
| | RFC 7636 (PKCE) | S256 challenge mandatory; `plain` rejected at the service-layer sentinel; verifier persisted in pre-login row, single-use | `internal/auth/oidc/service.go` (oauth2.S256ChallengeOption hard-coded), `internal/auth/oidc/prelogin.go` | `TestService_PKCEPlainRejectedSentinel`, `TestService_StateReplayDeniedByConsumeOnce` | | RFC 7519 (JWT) | ID-token validation via go-oidc; service-layer alg allow-list (RS256/RS512/ES256/ES384/EdDSA); HS-family + `none` rejected | `internal/auth/oidc/service.go` (disallowedAlgs map, isDisallowedAlg) | `TestService_HandleCallback_RejectsHSAlgsConfusion`, `TestService_IdPDowngradeDefense_RejectsHSAdvertised` | -| RFC 7517 (JWK) | JWKS fetch + cache + rotation handled transparently by coreos/go-oidc; operator-triggered RefreshKeys + auto-refresh on TTL expiry | `internal/auth/oidc/service.go` (RefreshKeys; cfg.JWKSCacheTTLSeconds default 3600) | `TestService_RefreshKeys_CatchesPostLoadDowngrade`, `TestKeycloakIntegration_JWKSRotation_RefreshKeysPicksUpNewKey` (Phase 10 integration) | -| OIDC Core 1.0 §3.1.3.7 | `iss` exact match, `aud` membership, `azp` for multi-aud, `at_hash` REQUIRED-when-access_token-present (Phase 3 tightening of the spec MAY → MUST), `nonce` constant-time-compare | `internal/auth/oidc/service.go` (HandleCallback steps 5-9) | `TestService_HandleCallback_RejectsWrongAudience`, `TestService_HandleCallback_AZPRequiredOnMultiAud`, `TestService_HandleCallback_ATHashRequiredWhenAccessTokenPresent`, `TestService_HandleCallback_RejectsNonceMismatch` | +| RFC 7517 (JWK) | JWKS fetch + cache + rotation handled transparently by coreos/go-oidc; operator-triggered RefreshKeys + auto-refresh on TTL expiry | `internal/auth/oidc/service.go` (RefreshKeys; cfg.JWKSCacheTTLSeconds default 3600) | `TestService_RefreshKeys_CatchesPostLoadDowngrade`, `TestKeycloakIntegration_JWKSRotation_RefreshKeysPicksUpNewKey` (Keycloak integration) | +| OIDC Core 1.0 §3.1.3.7 | `iss` exact match, `aud` membership, `azp` for 
multi-aud, `at_hash` REQUIRED-when-access_token-present (certctl tightens the spec MAY → MUST), `nonce` constant-time-compare | `internal/auth/oidc/service.go` (HandleCallback steps 5-9) | `TestService_HandleCallback_RejectsWrongAudience`, `TestService_HandleCallback_AZPRequiredOnMultiAud`, `TestService_HandleCallback_ATHashRequiredWhenAccessTokenPresent`, `TestService_HandleCallback_RejectsNonceMismatch` | | OIDC Core 1.0 §5.3.2 (UserInfo endpoint) | Optional fallback when ID-token groups claim is empty; bounded by configured FetchUserinfo bool | `internal/auth/oidc/service.go` (fetchUserinfoGroups) | 4-case userinfo-fallback matrix in `service_test.go` (happy + endpoint-missing + endpoint-failing + userinfo-also-empty) | | OpenID Connect Back-Channel Logout 1.0 | `events` claim + `sid`/`sub` revocation; `nonce` MUST be absent; `jti`-based replay defense | `internal/api/handler/auth_session_oidc.go` (BackChannelLogout, DefaultBCLVerifier) | 6 negatives in `auth_session_oidc_test.go`: BCL missing events, BCL nonce-present, BCL unknown-key-sig, etc. 
| -| RFC 6265 (HTTP State Management) | Session cookie attributes: `Secure` + `HttpOnly` + `SameSite=Lax` (default; configurable to Strict via `CERTCTL_SESSION_SAMESITE`); `Path=/`; host-only | `internal/auth/session/service.go` (cookie minting), `internal/api/handler/auth_session_oidc.go` (Set-Cookie wiring) | Phase 6 middleware-chain test matrix (7 cases) in `internal/auth/session/middleware_test.go` | +| RFC 6265 (HTTP State Management) | Session cookie attributes: `Secure` + `HttpOnly` + `SameSite=Lax` (default; configurable to Strict via `CERTCTL_SESSION_SAMESITE`); `Path=/`; host-only | `internal/auth/session/service.go` (cookie minting), `internal/api/handler/auth_session_oidc.go` (Set-Cookie wiring) | 7-case middleware-chain test matrix in `internal/auth/session/middleware_test.go` | | RFC 9700 (OAuth 2.0 Security Best Current Practice) | PKCE mandatory; no implicit flow; strict redirect_uri (registered + exact-match per OIDCProvider.RedirectURI); state non-guessable (32-byte random); single-use | `internal/auth/oidc/service.go`; `OIDCProvider.Validate()` enforces redirect_uri shape | `TestOIDCProvider_Validate_RejectsHTTPRedirectInProd`, state-replay test | | RFC 8414 (OAuth 2.0 Authorization Server Metadata) | Discovery doc fetched via go-oidc at provider creation + RefreshKeys; `id_token_signing_alg_values_supported` consulted for IdP-downgrade-attack defense | `internal/auth/oidc/service.go` (getOrLoad, guardAdvertisedAlgs) | `TestService_IdPDowngradeDefense_RejectsHSAdvertised` and `RejectsNoneAdvertised` | -| RFC 7633 (X.509 TLS Feature Extension; Must-Staple) | Per-profile certctl issuance flag; out-of-scope for Bundle 2 but cited here because RFC 7633 OID `id-pe-tlsfeature` is in the same crypto-stack umbrella | `internal/connector/issuer/local/local.go` | Bundle 9 SCEP master-bundle Phase 5.6 tests; not Bundle-2 territory | -| RFC 8555 §7 (ACME directory metadata) | certctl-side ACME server tier; out-of-scope for Bundle 2 but cited because it 
shares the alg-pinning + nonce-handling discipline that Bundle 2 carries forward | `internal/api/handler/acme/*` | per-route handler tests in `internal/api/handler/acme/` | +| RFC 7633 (X.509 TLS Feature Extension; Must-Staple) | Per-profile certctl issuance flag; out-of-scope for the auth surface but cited here because RFC 7633 OID `id-pe-tlsfeature` is in the same crypto-stack umbrella | `internal/connector/issuer/local/local.go` | SCEP master-bundle must-staple tests; not auth-surface territory | +| RFC 8555 §7 (ACME directory metadata) | certctl-side ACME server tier; out-of-scope for the auth surface but cited because it shares the alg-pinning + nonce-handling discipline the auth surface carries forward | `internal/api/handler/acme/*` | per-route handler tests in `internal/api/handler/acme/` | | RFC 7515 (JWS) | JWS verification delegated to go-oidc/v3 + go-jose/v4; alg pin enforced at `gooidc.NewIDTokenVerifier` config + service-layer re-check | `internal/auth/oidc/service.go` (oauthConfig + verifier wiring) | `TestService_HandleCallback_RejectsExpired` and `TestService_HandleCallback_RejectsIATInFuture` | ## Table 2: CWE / weakness classes the implementation closes @@ -34,28 +34,28 @@ Each row points at the file(s) that implement the defense and the test file(s) t | CWE | Description | Where defended | Where pinned | |---|---|---|---| -| CWE-287 (Improper Authentication) | Session-cookie HMAC verification (length-prefixed input defeats concat-collision) + alg-pinned ID-token verify | `internal/auth/session/service.go` (computeHMAC, parseCookie, Validate); `internal/auth/oidc/service.go` (HandleCallback) | `TestComputeHMAC_LengthPrefixDefeatsConcatCollision`; `TestService_Validate_ConcatenationCollisionDefeatedByLengthPrefix`; full Phase 3 21+ negatives matrix | -| CWE-352 (Cross-Site Request Forgery) | Double-submit cookie + `SameSite=Lax`/`Strict` + hashed CSRF token on session row; constant-time compare in CSRFMiddleware | 
`internal/auth/session/middleware.go` (CSRFMiddleware) | Phase 6 7-case middleware-chain matrix (`internal/auth/session/middleware_test.go`); `TestSessionMiddleware_CSRFRequiredOnStateChangingMethods` | +| CWE-287 (Improper Authentication) | Session-cookie HMAC verification (length-prefixed input defeats concat-collision) + alg-pinned ID-token verify | `internal/auth/session/service.go` (computeHMAC, parseCookie, Validate); `internal/auth/oidc/service.go` (HandleCallback) | `TestComputeHMAC_LengthPrefixDefeatsConcatCollision`; `TestService_Validate_ConcatenationCollisionDefeatedByLengthPrefix`; full 21+ OIDC negatives matrix | +| CWE-352 (Cross-Site Request Forgery) | Double-submit cookie + `SameSite=Lax`/`Strict` + hashed CSRF token on session row; constant-time compare in CSRFMiddleware | `internal/auth/session/middleware.go` (CSRFMiddleware) | 7-case middleware-chain matrix (`internal/auth/session/middleware_test.go`); `TestSessionMiddleware_CSRFRequiredOnStateChangingMethods` | | CWE-384 (Session Fixation) | Session ID is opaque random `ses-` (32 bytes entropy) generated server-side at login; cookie value rotates on every login (no inheritance from pre-login); CSRF token rotates alongside | `internal/auth/session/service.go` (Create, RotateCSRFToken) | `TestService_Create_AssignsFreshSessionID`; CSRF rotation pinned via `TestService_RotateCSRFToken_AfterLogin` | | CWE-294 (Authentication Bypass by Capture-Replay) | Single-use state, single-use nonce (both stored in pre-login row, atomic `DELETE...RETURNING` on consume); single-use authorization code (Keycloak/IdP-side); `jti`-based BCL replay defense | `internal/auth/oidc/prelogin.go` (LookupAndConsume); `internal/api/handler/auth_session_oidc.go` (BCL handler) | `TestService_StateReplayDeniedByConsumeOnce`; `TestService_HandleCallback_RejectsForgedPreLoginCookie`; BCL replay negative in handler tests | | CWE-916 / CWE-329 (Use of Password Hash With Insufficient Computational Effort / Use of a Key Past its 
Expiration Date) | Argon2id with OWASP 2024 params (m=64 MiB, t=3, p=4, 16-byte salt, 32-byte output) for break-glass passwords; per-credential random salt; PHC-format hash | `internal/auth/breakglass/service.go` (HashPassword, VerifyPassword); v3 ciphertext blob format with PBKDF2-SHA256 600,000 rounds for config-at-rest encryption | `TestPhase7_5_HashPasswordOWASP2024Params`; `TestPhase7_5_HashFormatPHC`; `internal/crypto/encryption_test.go` for v3 PBKDF2 floor | | CWE-307 (Improper Restriction of Excessive Authentication Attempts) | Failure count + lockout window on break-glass credential; threshold default 5, reset window default 1h, lockout duration default 30s; atomic single-statement IncrementFailure defeats concurrent racing attempts | `internal/auth/breakglass/service.go` (Login, IncrementFailure); `internal/repository/postgres/breakglass.go` | `TestPhase7_5_LockoutAfterThresholdFailures`; `TestPhase7_5_FailureCountResetsAfterWindow` | -| CWE-345 (Insufficient Verification of Data Authenticity) | OIDC `at_hash` REQUIRED-when-access_token-present ties access token to ID token (Phase 3 tightening of OIDC core MAY → MUST); OIDC `iss` + `aud` + `azp` checks ensure token came from the configured IdP for the configured client | `internal/auth/oidc/service.go` (HandleCallback steps 5-9, atHashMatches) | `TestService_HandleCallback_ATHashRequiredWhenAccessTokenPresent`; `TestService_HandleCallback_RejectsATHashMismatch` | -| CWE-200 (Information Exposure) | Token-leak hygiene tests on every secret-bearing path: ID tokens, access tokens, refresh tokens, authorization codes, PKCE verifiers, state, nonce, signing keys, break-glass passwords NEVER appear in any log line at any level | `internal/auth/oidc/service.go`, `internal/auth/session/service.go`, `internal/auth/breakglass/service.go` (all log calls audited); `internal/service/audit_redact.go` (Bundle 6 redactor) | `internal/auth/oidc/logging_test.go` (4 grep-asserts); `internal/auth/breakglass/service_test.go` 
(token-leak hygiene + json.Marshal probe); `internal/auth/bootstrap/service_test.go` (Bundle 1 pattern) | +| CWE-345 (Insufficient Verification of Data Authenticity) | OIDC `at_hash` REQUIRED-when-access_token-present ties access token to ID token (certctl tightens OIDC core MAY → MUST); OIDC `iss` + `aud` + `azp` checks ensure token came from the configured IdP for the configured client | `internal/auth/oidc/service.go` (HandleCallback steps 5-9, atHashMatches) | `TestService_HandleCallback_ATHashRequiredWhenAccessTokenPresent`; `TestService_HandleCallback_RejectsATHashMismatch` | +| CWE-200 (Information Exposure) | Token-leak hygiene tests on every secret-bearing path: ID tokens, access tokens, refresh tokens, authorization codes, PKCE verifiers, state, nonce, signing keys, break-glass passwords NEVER appear in any log line at any level | `internal/auth/oidc/service.go`, `internal/auth/session/service.go`, `internal/auth/breakglass/service.go` (all log calls audited); `internal/service/audit_redact.go` (audit redactor) | `internal/auth/oidc/logging_test.go` (4 grep-asserts); `internal/auth/breakglass/service_test.go` (token-leak hygiene + json.Marshal probe); `internal/auth/bootstrap/service_test.go` (canonical pattern) | | CWE-770 (Allocation of Resources Without Limits or Throttling) | Per-IP rate limit on `/auth/breakglass/login` via the global middleware.NewRateLimiter (default RPS / burst from `CERTCTL_RATE_LIMIT_*` env vars) wrapped around the entire mux; the breakglass login endpoint inherits this protection. 
Per-route override available via `middleware.NewRateLimiter` per-bucket configuration if the operator wants stricter caps | `cmd/server/main.go` (rateLimiter wiring at the root middleware stack); `internal/api/middleware/middleware.go` (NewRateLimiter) | `internal/api/middleware/ratelimit_test.go`; `internal/api/middleware/ratelimit_keyed_test.go` | | CWE-330 (Use of Insufficiently Random Values) | `crypto/rand` for state, nonce, PKCE verifier (via `oauth2.GenerateVerifier`), session signing keys (32 random bytes), session IDs (`ses-` from 32 random bytes), pre-login IDs (`pl-` from 16 random bytes), CSRF tokens (32 random bytes), break-glass salts (16 random bytes via `crypto/rand`) | `internal/auth/oidc/service.go` (randomB64URL); `internal/auth/session/service.go` (newOpaqueID, newCSRFToken); `internal/auth/oidc/prelogin.go` (newID); `internal/auth/breakglass/service.go` (HashPassword salt) | `TestPreLoginAdapter_CreatePreLogin_RNGFailure` (entropy-source error path); RNG failure pinned for every callsite | -| CWE-311 (Missing Encryption of Sensitive Data) | OIDC `client_secret` AES-256-GCM encrypted at rest (v3 blob format: magic 0x03 + salt(16) + nonce(12) + ciphertext+tag); session signing keys same scheme; empty `CERTCTL_CONFIG_ENCRYPTION_KEY` returns `ErrEncryptionKeyRequired` (fail-closed) | `internal/crypto/encryption.go` (EncryptIfKeySet, DecryptIfKeySet); `internal/api/handler/auth_session_oidc.go` (encryptClientSecret); `internal/auth/session/service.go` (KeyMaterialEncrypted) | `internal/repository/postgres/oidc_encryption_invariant_test.go` (Phase 13 invariant test: ciphertext != plaintext, v2/v3 blob shape, round-trip + wrong-passphrase fails) | -| CWE-326 (Inadequate Encryption Strength) | TLS 1.3 only on the certctl control plane (post-v2.2 milestone); HSTS-equivalent posture via HTTPS-only listener; AES-256-GCM for at-rest config encryption; PBKDF2-SHA256 600,000 rounds for v3 blob key derivation (OWASP 2024 floor) | `cmd/server/main.go` (TLS 1.3 
listener config); `internal/crypto/encryption.go` (v3 PBKDF2 iteration count) | `TestServerTLSConfig_RejectsTLS12` (Bundle 5); `TestEncryption_V3IterationCount_PinnedAtOWASP2024Floor` | +| CWE-311 (Missing Encryption of Sensitive Data) | OIDC `client_secret` AES-256-GCM encrypted at rest (v3 blob format: magic 0x03 + salt(16) + nonce(12) + ciphertext+tag); session signing keys same scheme; empty `CERTCTL_CONFIG_ENCRYPTION_KEY` returns `ErrEncryptionKeyRequired` (fail-closed) | `internal/crypto/encryption.go` (EncryptIfKeySet, DecryptIfKeySet); `internal/api/handler/auth_session_oidc.go` (encryptClientSecret); `internal/auth/session/service.go` (KeyMaterialEncrypted) | `internal/repository/postgres/oidc_encryption_invariant_test.go` (invariant test: ciphertext != plaintext, v2/v3 blob shape, round-trip + wrong-passphrase fails) | +| CWE-326 (Inadequate Encryption Strength) | TLS 1.3 only on the certctl control plane (post-v2.2 milestone); HSTS-equivalent posture via HTTPS-only listener; AES-256-GCM for at-rest config encryption; PBKDF2-SHA256 600,000 rounds for v3 blob key derivation (OWASP 2024 floor) | `cmd/server/main.go` (TLS 1.3 listener config); `internal/crypto/encryption.go` (v3 PBKDF2 iteration count) | `TestServerTLSConfig_RejectsTLS12`; `TestEncryption_V3IterationCount_PinnedAtOWASP2024Floor` | | CWE-1004 (Sensitive Cookie Without HttpOnly) | Session cookie set with `HttpOnly=true`; CSRF cookie intentionally `HttpOnly=false` so the GUI can read it for the `X-CSRF-Token` header (the read is by-design per the double-submit-cookie pattern) | `internal/auth/session/service.go` (cookie attrs); `internal/api/handler/auth_session_oidc.go` (Set-Cookie wiring) | Cookie-attribute pinning in handler tests; documented in [auth-threat-model.md](../operator/auth-threat-model.md) "Session minting + cookies" subsection | | CWE-614 (Sensitive Cookie in HTTPS Session Without 'Secure' Attribute) | Session + CSRF cookies set with `Secure=true`; rejected at cookie-write time 
on `http://` listeners (HTTPS-only control plane post-v2.2) | `internal/auth/session/service.go`; `cmd/server/main.go` HTTPS-only listener | TLS-listener tests in `cmd/server/`; cookie attrs pinned in handler tests | | CWE-1275 (Sensitive Cookie with Improper SameSite Attribute) | Session cookie `SameSite=Lax` default (configurable to Strict via `CERTCTL_SESSION_SAMESITE`); CSRF defense via the double-submit pattern means `Lax` is sufficient even if the operator does not flip to Strict | `internal/auth/session/service.go` (cookie attrs); `internal/config/config.go` (SAMESITE env var) | Cookie-attribute pinning; SameSite enforcement is per-cookie | -## Bundle 1 (RBAC) standards covered separately +## API-key + RBAC standards covered separately -The above tables focus on Bundle 2's OIDC + sessions + back-channel logout + break-glass surface. Bundle 1's RBAC primitive carries its own implementation pointers; the Bundle 1 [`auth-threat-model.md`](../operator/auth-threat-model.md) section "Defenses Bundle 1 ships" enumerates the full RBAC + bootstrap + auditor + approval-workflow surface. CWE-pointers that apply to Bundle 1's surface: +The above tables focus on the OIDC + sessions + back-channel logout + break-glass surface. The RBAC primitive carries its own implementation pointers; the [`auth-threat-model.md`](../operator/auth-threat-model.md) section "API-key + RBAC defenses" enumerates the full RBAC + bootstrap + auditor + approval-workflow surface. CWE-pointers that apply to the RBAC surface: -- CWE-285 (Improper Authorization) — defended by the Phase 3 RequirePermission middleware + Authorizer.CheckPermission service-layer call. Pinned by 90+ tests across `internal/auth/` and `internal/service/auth/`. -- CWE-862 (Missing Authorization) — pinned by Phase 12's `phase12_protocol_allowlist_test.go` (asserts protocol endpoints are explicitly allowlisted, NOT silently bypassing the gate). 
+- CWE-285 (Improper Authorization) — defended by the RequirePermission middleware + Authorizer.CheckPermission service-layer call. Pinned by 90+ tests across `internal/auth/` and `internal/service/auth/`. +- CWE-862 (Missing Authorization) — pinned by `phase12_protocol_allowlist_test.go` (asserts protocol endpoints are explicitly allowlisted, NOT silently bypassing the gate). - CWE-863 (Incorrect Authorization) — pinned by the auditor-split invariant in `internal/domain/auth/auditor_test.go` (auditor role holds exactly `audit.read` + `audit.export` ONLY). - CWE-732 (Incorrect Permission Assignment for Critical Resource) — five admin-only fine-grained perms (`cert.bulk_revoke`, `crl.admin`, `scep.admin`, `est.admin`, `ca.hierarchy.manage`) seeded into `r-admin` only; pinned by migration 000030 + `r-admin`-only seed test. @@ -74,10 +74,10 @@ If you are an external tester, an operator's auditor, or an acquirer doing techn - [`auth-threat-model.md`](../operator/auth-threat-model.md) — threat model behind these defenses. - [`security.md`](../operator/security.md) — overall security posture. - [`oidc-runbooks/index.md`](../operator/oidc-runbooks/index.md) — per-IdP operator setup guides. -- [`auth-benchmarks.md`](../operator/auth-benchmarks.md) — Phase 14 perf baselines for the validation paths cited above. +- [`auth-benchmarks.md`](../operator/auth-benchmarks.md) — performance baselines for the validation paths cited above. - `internal/auth/oidc/` — OIDC service + groupclaim resolver + pre-login adapter + bootstrap hook. - `internal/auth/session/` — Session service + middleware + CSRF + signing-key rotation. - `internal/auth/breakglass/` — break-glass admin (Argon2id + lockout + constant-time + surface-invisibility). - `internal/crypto/encryption.go` — AES-256-GCM v3 blob format for at-rest encryption. - `migrations/000029` through `000038` — schema for RBAC, OIDC providers, sessions, signing keys, users, group mappings, pre-login, break-glass. 
-- `scripts/ci-guards/multi-tenant-query-coverage.sh` — Phase 13 forward-compat multi-tenant query coverage. +- `scripts/ci-guards/multi-tenant-query-coverage.sh` — forward-compat multi-tenant query coverage guard. diff --git a/docs/reference/configuration.md b/docs/reference/configuration.md index 78cf439..c74c487 100644 --- a/docs/reference/configuration.md +++ b/docs/reference/configuration.md @@ -82,7 +82,7 @@ For the full deploy contract see |---|---|---| | `CERTCTL_AGENT_ID` | (none — required) | The agent's unique ID, issued by `POST /api/v1/agents/register` and bundled into the agent's registration response. Pass via this env var when the agent runs as a systemd unit / container without the `-agent-id` CLI flag. | -## Auth (Bundle 1 + Bundle 2) +## Auth (RBAC + OIDC + sessions + break-glass) Configuration knobs for the RBAC + OIDC + sessions + break-glass auth surface. Full operator guidance lives in diff --git a/docs/reference/profiles.md b/docs/reference/profiles.md index a6f6e9f..32bf156 100644 --- a/docs/reference/profiles.md +++ b/docs/reference/profiles.md @@ -10,7 +10,7 @@ managed certificate references exactly one profile; changing a profile's policy retroactively affects renewal of every cert pointing at it. -This file documents the profile lifecycle as it stands after Bundle 1. +This file documents the profile lifecycle as it stands at v2.1.0. For the schema, see `migrations/000003_certificate_profiles.up.sql` + `migrations/000027_approval_workflow.up.sql` + `migrations/000033_approval_kinds.up.sql`. For the API surface, @@ -27,8 +27,8 @@ see `api/openapi.yaml` under `/api/v1/profiles`. | `renewal_window_days` | 30 | Scheduler enqueues a renewal Job when `cert.NotAfter - now < renewal_window_days`. | | `allowed_key_algorithms` | RSA 2048+, ECDSA P-256+ | Validates incoming CSRs at issuance time. | | `allowed_ekus` | server, client | RFC 5280 §4.2.1.12 EKU set. 
| -| `must_staple` | false | Per-profile RFC 7633 `id-pe-tlsfeature` extension toggle (Phase 5.6 of the SCEP master bundle). | -| `requires_approval` | false | Bundle 1 Phase 9 - gates issuance + renewal AND profile edits behind a four-eyes approval workflow. See below. | +| `must_staple` | false | Per-profile RFC 7633 `id-pe-tlsfeature` extension toggle. | +| `requires_approval` | false | Gates issuance + renewal AND profile edits behind a four-eyes approval workflow. See below. | ## RequiresApproval and the approval workflow @@ -41,11 +41,11 @@ Setting `requires_approval=true` on a profile does two things: approved (job → `Pending`, scheduler dispatches) or rejected (job → `Cancelled`). Same actor cannot self-approve. 2. **Edits to the profile itself gate on a non-requester admin's - approval.** This is the Bundle 1 Phase 9 closure for the flip-flop + approval.** This is the closure for the flip-flop loophole - without it an admin could set `requires_approval=false`, mutate any other field, set `requires_approval=true`, and the approval workflow would only have been bypassed during the - "off" window. The Phase 9 gate fires under three conditions: + "off" window. The profile-edit gate fires under three conditions: - The live profile has `requires_approval=true` AND the operator submits any edit (regardless of whether the edit changes the flag). @@ -105,9 +105,8 @@ audit-only view. Each row carries the approval ID + the requester - `migrations/000027_approval_workflow.up.sql` (initial approval schema, Rank 7 of the 2026-05-03 deep-research deliverable) -- `migrations/000033_approval_kinds.up.sql` (Phase 9 - adds +- `migrations/000033_approval_kinds.up.sql` (adds `approval_kind` + `payload` + nullable cert/job FKs) - `internal/service/approval.go::RequestProfileEditApproval` - `internal/service/profile.go::UpdateProfile` (gate) - `internal/api/handler/profiles.go::UpdateProfile` (202 mapping) -- `cowork/auth-bundle-1-prompt.md` (Phase 9 spec)