# Changelog

## v2.1.0 - Auth Bundles 1 + 2: RBAC primitive + OIDC SSO + sessions ⚠️

> **SECURITY: AUDIT YOUR API KEYS.**
>
> Bundle 1 ships role-based authorization. Every existing API key
> configured via `CERTCTL_API_KEYS_NAMED` (or the legacy
> `CERTCTL_AUTH_SECRET`) is mapped to the **r-admin role on the first
> upgrade boot** so existing automation keeps working unchanged. Most
> keys do NOT need full admin power; downgrade them before tagging
> the next release.
>
> Recommended post-upgrade flow:
>
> ```bash
> # 1. List every key with its current role:
> certctl-cli auth keys list
>
> # 2. Walk an interactive prompt that downgrades each key:
> certctl-cli auth keys scope-down
>
> # 3. Or get a heuristic suggestion based on 30 days of audit history:
> certctl-cli auth keys scope-down --suggest
> certctl-cli auth keys scope-down --suggest --apply   # applies the suggestion
>
> # 4. Or drive scope-down from a JSON config (Helm post-upgrade hook):
> certctl-cli auth keys scope-down --non-interactive ./scope-down.json
> ```
>
> The synthetic `actor-demo-anon` actor (used when
> `CERTCTL_AUTH_TYPE=none` is configured) is system-managed and
> excluded from the prompt loop.

What else changed in v2.1.0:

- **Audit 2026-05-10 CRIT-1 closure — wire-layer RBAC enforcement.**
  The Bundle 1 + Bundle 2 audit surfaced that the permission catalogue
  was enforced on only ~24 admin-only routes; the bulk of state-changing
  routes (`POST /api/v1/certificates`, `PUT /api/v1/profiles/{id}`,
  `DELETE /api/v1/issuers/{id}`, `POST /api/v1/agents/{id}/csr`, even
  `POST /api/v1/auth/roles` + `POST /api/v1/auth/keys/{id}/roles`) had
  no `rbacGate` wrap. An `r-viewer` Bearer token was effectively `r-admin`
  minus five fine-grained verbs at the wire layer (CWE-862). This
  release wraps every state-changing + read endpoint with
  `rbacGate` (global scope) or `rbacGateScoped` (per-profile / per-
  issuer scope-bound grants), and adds an AST-level CI guard
  (`TestRouterRBACGateCoverage`) that fails when a new route is
  registered without enforcement. Catalogue extended via migration
  000039 with 30 permissions covering `cert.edit`, `job.*`,
  `approval.*`, `policy.*`, `team.*`, `owner.*`, `notification.*`,
  `discovery.*`, `network_scan.*`, `healthcheck.*`, `digest.*`,
  `verification.*`, `stats.read`, `metrics.read`. **AUDIT YOUR
  KEYS** (the scope-down call-out above) now translates to a real
  reduction in blast radius. The auditor role stays pinned at exactly
  `{audit.read, audit.export}`.

- **RBAC primitive shipped.** `tenants`, `roles`, `permissions`,
  `role_permissions`, `actor_roles` tables (migration 000029); 33-permission
  canonical catalogue; 7 default roles (`admin`, `operator`, `viewer`,
  `agent`, `mcp`, `cli`, `auditor`); per-handler permission gates via
  `auth.RequirePermission` middleware (replaces the legacy
  `IsAdmin` boolean check on the 5 admin-only handlers).
- **Day-0 admin bootstrap.** Set `CERTCTL_BOOTSTRAP_TOKEN` on a fresh
  deploy and send a single `POST` request (for example, with curl) to
  `/api/v1/auth/bootstrap` to mint the first admin API key. The endpoint
  is one-shot, never logged, and locks closed once any admin actor
  exists. Migration 000031 ships the `api_keys` table that stores the
  SHA-256 hash; the plaintext is shown in the response body once and
  never persisted.
- **Auditor role split.** New `auditor` role holds only `audit.read`
  + `audit.export`. Compliance reviewers can read the audit trail
  without holding mutation power. Migration 000032 adds
  `audit_events.event_category` so auditors can filter to
  authentication-related events specifically.
- **`/v1/auth/check` enrichment.** Response now includes the actor's
  standing roles and effective permissions, so the GUI gates
  affordances from a single fetch on app boot.
- **Approval-bypass closure.** Edits to a profile that has (or
  would have) `RequiresApproval=true` now route through the
  `ApprovalService` two-person integrity gate (Phase 9). Migration
  000033 adds `approval_kind` + `payload` to
  `issuance_approval_requests` so cert-issuance and profile-edit
  approvals share the same workflow. Same-actor self-approve is
  rejected with `ErrApproveBySameActor` for both kinds. Closes the
  flip-flop loophole where an admin could disable approval, mutate,
  re-enable. Documented at
  [`docs/reference/profiles.md`](docs/reference/profiles.md).
- **GUI: Roles / API Keys / Auth Settings / Approvals queue.**
  Four new pages under `/auth/*` consume `/v1/auth/me` for
  permission-aware rendering. The Approvals queue blocks
  self-approve at the client layer (Approve/Reject buttons are hidden
  when `requested_by` equals the current `actor_id`) on top of the
  server-side enforcement. AuditPage gains a category filter
  (`cert_lifecycle` / `auth` / `config`) for the auditor view.
- **MCP server gains 12 RBAC tools.** Operators driving certctl
  from Claude / VS Code / any MCP client get parity with the GUI
  + CLI. Each tool routes through the same HTTP handler; permission
  gates fire server-side.
- **OpenAPI catalogues every new route.** Every Bundle 1 endpoint
  ships with an `operationId`; the parity test guards against drift.
- **Coverage gates.** `internal/auth/` and `internal/service/auth/`
  now have ≥85% coverage floors in `.github/coverage-thresholds.yml`.
  The 12-path negative-test list from the Bundle 1 prompt is
  covered except for path #12, which is deferred with an in-tree TODO.
- **Protocol-endpoint allowlist pinned at three layers.** The
  middleware bypass (`auth.IsProtocolEndpoint`), the router-level
  `AuthExemptRouterRoutes` constant, and a new
  `phase12_protocol_allowlist_test.go` AST scan all guard against
  accidentally wrapping ACME / SCEP / EST / OCSP / CRL routes in
  `rbacGate`.
- **Bundle 2: OIDC + sessions + back-channel logout + break-glass.**
  Auth Bundle 2 ships in the same v2.1.0 release. Operators get OIDC
  SSO support for Keycloak / Authentik / Okta / Auth0 / Microsoft
  Entra ID / Google Workspace (via Keycloak broker), HMAC-signed
  session cookies with idle/absolute timeouts + CSRF defense,
  back-channel logout per OpenID Connect Back-Channel Logout 1.0,
  and a default-OFF break-glass admin path with Argon2id passwords
  for SSO-broken incidents. API-key auth keeps working unchanged
  alongside; existing automation needs no changes. Migration walkthrough
  at [`docs/migration/oidc-enable.md`](docs/migration/oidc-enable.md);
  per-IdP setup guides at
  [`docs/operator/oidc-runbooks/index.md`](docs/operator/oidc-runbooks/index.md).
- **OIDC token validation pinned at three layers.** (1) An algorithm
  allow-list (RS256/RS512/ES256/ES384/EdDSA only), with the HS family
  and `none` rejected at the service-layer sentinel. (2) IdP-downgrade-
  attack defense at provider creation AND on every JWKS RefreshKeys:
  the IdP's advertised `id_token_signing_alg_values_supported` is
  intersected with the allow-list, and providers that advertise weak
  algs are rejected before any token is signed. (3) OIDC Core §3.1.3.7
  re-verification of `iss` / `aud` / `azp` / `at_hash` / `exp` / `iat`
  window / `nonce` (constant-time compare); `at_hash` is REQUIRED when
  an access token is present, a Phase 3 tightening of the spec's MAY to
  MUST. PKCE-S256 is mandatory; `plain` is rejected. State and nonce
  are single-use via atomic `DELETE...RETURNING` on consume.
- **Session cookies use length-prefixed HMAC.** The cookie wire format
  is `v1.<session_id>.<signing_key_id>.<base64url-no-pad(HMAC-SHA256)>`
  with HMAC input `len:sid:len:kid` (NOT bare-concat) to defeat
  concatenation collisions. `HttpOnly` + `Secure` + `SameSite=Lax`
  default; `SameSite=Strict` configurable via `CERTCTL_SESSION_SAMESITE`.
  Idle timeout 1h / absolute 8h defaults; scheduler GC sweeps expired
  rows hourly. Signing keys rotate via the new `RotateSigningKey`
  primitive; the old key stays valid for `CERTCTL_SESSION_SIGNING_KEY_RETENTION`
  (default 24h) so existing cookies validate during rollover.
- **CSRF defense via double-submit-cookie + hashed-token-on-row.**
  Plaintext CSRF token in the JS-readable `certctl_csrf` cookie
  (intentionally `HttpOnly=false` for the GUI to echo into the
  `X-CSRF-Token` header); SHA-256 hash on the session row;
  `subtle.ConstantTimeCompare` in the new `CSRFMiddleware`. API-key
  actors are CSRF-exempt (no session row in context).
- **OIDC `client_secret` encrypted at rest.** AES-256-GCM v3 blob
  format (magic 0x03 + salt(16) + nonce(12) + ciphertext+tag) using
  the existing `CERTCTL_CONFIG_ENCRYPTION_KEY`. Encryption invariant
  pinned by an integration test asserting ciphertext != plaintext +
  v3 blob shape + round-trip recovery + wrong-passphrase fails.
- **OIDC first-admin bootstrap.** New `CERTCTL_BOOTSTRAP_ADMIN_GROUPS`
  + `CERTCTL_BOOTSTRAP_OIDC_PROVIDER_ID` env vars: the first
  OIDC-authenticated user with a matching group claim becomes admin
  per tenant. Coexists with the Bundle 1 env-var-token bootstrap;
  the admin-existence probe ensures only one wins. Audit row
  (`bootstrap.oidc_first_admin`) on every grant.
- **Break-glass admin (default-OFF).** New `CERTCTL_BREAKGLASS_ENABLED`
  env var (default `false`). When enabled, the local Argon2id-password
  admin path bypasses OIDC + group-claim layers — intended ONLY for
  SSO-broken incidents. Argon2id with OWASP 2024 params (m=64 MiB,
  t=3, p=4); lockout after 5 failures (configurable); constant-time
  across all failure paths via `verifyDummy`; surface invisibility
  (HTTP 404 on every endpoint when disabled, NOT 403). WARN log at
  server boot when enabled. WebAuthn/FIDO2 second-factor pairing is on
  the v3 roadmap (Decision 12).
- **GUI: OIDC Providers + Group → Role Mappings + Sessions + login
  buttons.** Four new pages under `/auth/*` consume the Bundle 2 API
  surface. Login page renders one "Sign in with X" button per
  configured OIDC provider (in addition to the API-key form, which
  remains as a fallback for Bearer-mode + break-glass paths). Sessions
  page exposes own-sessions + admin all-actors view. Every actionable
  element is permission-gated server-side via `auth.oidc.*` and
  `auth.session.*` perms; client-side hide is UX layer. Logout button
  in the sidebar fires `POST /auth/logout` to clear the session
  server-side before redirecting to login.
- **MCP server gains 11 OIDC + session tools.** `certctl_auth_list_oidc_providers`,
  `_get_oidc_provider`, `_create_oidc_provider`, `_update_oidc_provider`,
  `_delete_oidc_provider`, `_refresh_oidc_provider`,
  `_list_group_mappings`, `_add_group_mapping`, `_remove_group_mapping`,
  `_list_sessions`, `_revoke_session`. Operator-facing MCP tool count
  goes 12 (Bundle 1 RBAC) → 23 across the auth surface. Total MCP
  tool count: `grep -cE 'mcp\.AddTool\(' internal/mcp/tools*.go` ≈ 150.
- **Per-IdP runbooks: 6 production-tier setup guides** at
  `docs/operator/oidc-runbooks/`. Each runbook follows a consistent
  five-section layout (Prerequisites / IdP-side config / certctl-side
  config / Verification / Troubleshooting + Validation checklist with
  operator sign-off line). Keycloak is the canonical reference;
  Authentik / Okta / Auth0 / Entra ID / Google Workspace document the
  IdP-specific deltas (Auth0's namespaced custom claims; Entra ID's
  group OBJECT IDs; Google Workspace's missing-groups-claim limitation
  + the recommended Keycloak broker pattern).
- **Threat model extended.** [`docs/operator/auth-threat-model.md`](docs/operator/auth-threat-model.md)
  ships 5 new "Defenses Bundle 2 ships" subsections + 8 new threat-
  catalogue subsections (OIDC token forgery / session hijacking / IdP
  compromise / back-channel logout failure modes / group-claim
  manipulation / bootstrap risks / break-glass risks / token-leak
  hygiene). 6 new SQL-shaped operator-facing checks. New "Threats
  Bundle 2 does NOT close" section enumerating the 8 v3-backlog items
  (WebAuthn / JIT elevation / SAML / multi-tenant activation /
  HSM-FIPS / OIDC RP-initiated logout / Playwright / per-IdP
  external-tester sign-off).
- **Performance baselines documented.** [`docs/operator/auth-benchmarks.md`](docs/operator/auth-benchmarks.md)
  ships four benchmarks with measured baselines on a 4 vCPU /
  8 GiB / Postgres 16 / Go 1.25 floor: `BenchmarkSession_SteadyState`
  p99 5 µs (target < 1 ms; 200× under), `BenchmarkSession_ColdProcess`
  p99 7.1 ms (target < 10 ms), `BenchmarkOIDC_SteadyState` p99 1.5 ms
  (target < 5 ms), and `BenchmarkOIDC_ColdCache`, which operators run
  against a live Keycloak via `make benchmark-auth-coldcache`.
- **Standards + RFC implementation table.** [`docs/reference/auth-standards-implemented.md`](docs/reference/auth-standards-implemented.md)
  ships 13 RFC / standard rows + 14 CWE rows with concrete file paths
  + negative-test anchors per row. NOT a compliance-mapping doc per
  the operator's 2026-05-05 retired-compliance-docs decision; the
  doc explicitly says "build the framework mapping yourself against
  the rows here using the framework-mapping methodology your audit
  firm prescribes; this project does not own that mapping."
- **Coverage gates held at floor 90 across all four Bundle 2
  packages.** `internal/auth/oidc/` 93.7%, `internal/auth/session/`
  94.9%, `internal/auth/breakglass/` 91.5%, `internal/auth/user/domain/`
  96.4%. NO held-low-with-rationale entry — the Phase 13 prompt's
  anti-Bundle-1-mistake rule held. Bundle 1's existing 85% floors
  for `internal/auth/` + `internal/service/auth/` stay 85
  (already-shipped-and-accepted) per the prompt's explicit
  inheritance rule.
- **Multi-tenant query CI guard.** New `scripts/ci-guards/multi-tenant-query-coverage.sh`
  (ratchet-style, baseline 32 at v2.1.0 close): greps every
  SELECT/UPDATE/DELETE in `internal/repository/postgres/` against
  10 tenant-aware tables, fails on regression OR improvement (forces
  the operator to lift / lower the baseline visibly). Forward-compat
  protection so a future Bundle 3 / managed-service multi-tenant
  activation can flip the switch without finding silent
  tenant-data-leak bugs in shipped queries.
- **Phase 10 Keycloak testcontainers integration test.** New build-tag-
  gated suite at `internal/auth/oidc/testfixtures/` + `integration_keycloak_test.go`
  drives the full OIDC flow against a live Keycloak container booted
  by testcontainers-go. 5-test matrix: discovery + JWKS load, full
  PKCE auth-code happy path with HTTP form scraping, logout-revokes-
  session, JWKS rotation, unmapped-groups-fails-closed. Reuses one
  container across the matrix to amortize the 60-90s boot. Optional
  Okta smoke test (build-tagged `integration && okta_smoke`) for live
  tenant validation. New Makefile targets: `make keycloak-integration-test`
  + `make okta-smoke-test` + `make benchmark-auth-coldcache`.
- **OpenAPI surface extended.** New `cookieAuth` security scheme
  (apiKey/cookie/`certctl_session`) alongside the existing
  `bearerAuth`. 13 new Bundle 2 endpoints across the OIDC + session
  + group-mapping CRUD surface; 4 break-glass endpoints with
  surface-invisibility framing. The N-bundle-2-security-empty-preserved
  CI guard locks the `security: []` opt-out count at ≥ 14 so existing
  public endpoints stay public.
- **Bundle-1-only compat regression CI guard.** New
  `scripts/ci-guards/bundle-1-compat-regression.sh` asserts the
  load-bearing invariants that protect the Bundle-1-only-deploy
  case (session middleware defers-to-next, CSRF passthrough on
  missing session row, ChainAuthSessionThenBearer wired, public
  OIDC routes in AuthExempt allowlist, AuthInfo guards on
  OIDCProvidersResolver != nil). Sibling
  `bundle-1-to-2-upgrade-regression.sh` asserts the upgrade-path
  invariants (migrations 000034..000038 are CREATE TABLE IF NOT EXISTS
  + BEGIN/COMMIT-wrapped + no DROP TABLE / ALTER...DROP COLUMN
  against 19 protected Bundle-1 tables + ON CONFLICT DO NOTHING on
  permission seed).
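
To make the "Session cookies use length-prefixed HMAC" entry above concrete, here is a minimal Go sketch of signing and verifying a cookie in that wire format. It is an illustration under stated assumptions, not certctl's implementation: the function names (`macInput`, `signCookie`, `verifyCookie`) are hypothetical, the exact byte layout of `len:sid:len:kid` (decimal length, colon, value) is assumed, a single key stands in for the lookup by signing key ID, and the IDs are assumed to contain no `.`.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"fmt"
	"strconv"
	"strings"
)

// macInput builds the length-prefixed HMAC input "len:sid:len:kid".
// ASSUMPTION: decimal byte length, ':', then the value; the changelog
// does not spell out the exact encoding.
func macInput(sid, kid string) []byte {
	return []byte(strconv.Itoa(len(sid)) + ":" + sid + ":" +
		strconv.Itoa(len(kid)) + ":" + kid)
}

// signCookie returns v1.<sid>.<kid>.<base64url-no-pad(HMAC-SHA256)>.
func signCookie(key []byte, sid, kid string) string {
	m := hmac.New(sha256.New, key)
	m.Write(macInput(sid, kid))
	return "v1." + sid + "." + kid + "." +
		base64.RawURLEncoding.EncodeToString(m.Sum(nil))
}

// verifyCookie parses the cookie, recomputes the tag, and compares it in
// constant time. A real verifier would select the key by the kid field.
func verifyCookie(key []byte, cookie string) (sid string, ok bool) {
	parts := strings.Split(cookie, ".")
	if len(parts) != 4 || parts[0] != "v1" {
		return "", false
	}
	got, err := base64.RawURLEncoding.DecodeString(parts[3])
	if err != nil {
		return "", false
	}
	m := hmac.New(sha256.New, key)
	m.Write(macInput(parts[1], parts[2]))
	if !hmac.Equal(got, m.Sum(nil)) {
		return "", false
	}
	return parts[1], true
}

func main() {
	key := []byte("demo-signing-key") // hypothetical key material
	cookie := signCookie(key, "sess123", "key1")
	sid, ok := verifyCookie(key, cookie)
	fmt.Println(sid, ok) // sess123 true

	// Bare concatenation would feed "abc" to the HMAC for both pairs;
	// the length prefixes keep the two inputs distinct.
	fmt.Println(string(macInput("ab", "c")), string(macInput("a", "bc")))
	// 2:ab:1:c 1:a:2:bc
}
```

The length prefixes are what defeat the concatenation collision: the pairs (`ab`, `c`) and (`a`, `bc`) produce different HMAC inputs, so a tag issued for one cannot be replayed for the other.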

Migration ordering, idempotency, and downgrade are documented in
[`docs/migration/api-keys-to-rbac.md`](docs/migration/api-keys-to-rbac.md)
(API-key → RBAC, Bundle 1) and [`docs/migration/oidc-enable.md`](docs/migration/oidc-enable.md)
(API-key → OIDC, Bundle 2). The threat model lives at
[`docs/operator/auth-threat-model.md`](docs/operator/auth-threat-model.md).
Day-2 RBAC operations live at [`docs/operator/rbac.md`](docs/operator/rbac.md).
RFC + CWE evidence at [`docs/reference/auth-standards-implemented.md`](docs/reference/auth-standards-implemented.md).

## v2.0.68 - Image registry path changed ⚠️

> **Image registry path changed.** Starting this release, container images publish to `ghcr.io/certctl-io/certctl-server` and `ghcr.io/certctl-io/certctl-agent`. Existing pulls from `ghcr.io/shankar0123/certctl-{server,agent}:<tag>` continue to work for previously-published tags (the registry never deletes images), but the `:latest` tag at the old path stops moving forward at this release. Update your `docker pull` paths, `docker-compose.yml` `image:` keys, or Helm `image.repository` values to receive future updates. Old `git clone` / `git push` / install-script / API URLs continue to redirect forever - only the container-registry path changed.

This is the only operator-action-required change in v2.0.68. Other changes in this release are cosmetic URL refreshes after the GitHub-org transfer from `shankar0123/certctl` to `certctl-io/certctl` (HTTP redirects mean no other operator action is required) plus an internal contextcheck lint fix in the agent. Full commit list is on the [GitHub release page](https://github.com/certctl-io/certctl/releases/tag/v2.0.68).

---

certctl no longer maintains a hand-edited per-version changelog. Per-release
notes are auto-generated from commit messages between consecutive tags.

**Where to find what changed in a given release:**

- **[GitHub Releases](https://github.com/certctl-io/certctl/releases)** - every
  tag has an auto-generated "What's Changed" section pulled from the commits
  between that tag and the previous one, plus per-release supply-chain
  verification instructions (Cosign / SLSA / SBOM).
- **`git log <prev-tag>..<this-tag> --oneline`** - same content, locally.

**Why no hand-edited CHANGELOG.md:**

certctl is solo-developed and pushes directly to master. Maintaining a
hand-edited CHANGELOG meant the file drifted (entries piled into
`[unreleased]` and never got promoted to per-version sections when tags were
cut). A stale CHANGELOG is worse than no CHANGELOG - it signals abandoned
maintenance to security-conscious operators doing diligence.

The auto-generated release notes work here because commit messages follow a
descriptive convention: `<area>: <summary>` with a longer body for non-trivial
changes (see `git log v2.0.50..HEAD` for the established pattern). Anyone
reading the GitHub Releases page can see exactly what landed in each version
without depending on the author to manually update a separate file.

**For the historical record:** earlier versions (pre-v2.2.0 and the [2.2.0]
tag itself) had a hand-edited CHANGELOG. That content is preserved in
[git history](https://github.com/certctl-io/certctl/blob/v2.2.0/CHANGELOG.md)
at the v2.2.0 tag.