Files
certctl/docs/operator/auth-threat-model.md
T
shankar0123 e7a94b6080 auth-bundle-1 Phase 13: docs (rbac.md + threat model + migration guide + security.md update)
Closes the last Phase before the Bundle 1 Exit gate. Operators
now have authoritative reference + threat model + migration guide
covering every behavior change Bundles 0-12 introduced.

# New docs

* docs/operator/rbac.md (340 lines) — operator how-to:
  - Mental model (actors / roles / permissions / scopes)
  - 7 default roles seeded by migration 000029 + the 5
    admin-only fine-grained perms seeded by 000030
  - Permission catalogue table by namespace
  - Scope semantics (global beats specific) + the Bundle-2
    deferral on scope_id FK enforcement
  - Granting / revoking access from GUI + CLI + HTTP API + MCP
  - The auditor pattern (audit-only, no resource read)
  - Day-0 bootstrap flow (CERTCTL_BOOTSTRAP_TOKEN → curl →
    HTTP 410 thereafter)
  - Demo-mode (CERTCTL_AUTH_TYPE=none) caveat for production

* docs/operator/auth-threat-model.md (180 lines) — what the
  controls defend against:
  - 5 threat actors (external, wrong-role, compromised key,
    insider operator, compromised auditor)
  - Per-defense walk-through (API-key auth, RBAC, bootstrap,
    approval workflow + Phase 9 closure, audit trail,
    protocol-endpoint allowlist)
  - 9 explicit deferrals (OIDC, sessions, local accounts,
    JIT elevation, MFA, etc.) — Bundle 2 / future scope
  - Compliance mapping (SOC 2 CC6.1/CC6.3, HIPAA §164.312(b),
    NIST SSDF PO.5.2, FedRAMP AU-9, PCI-DSS §10)
  - 5 operator-runnable sanity checks (e.g.,
    'SELECT FROM audit_events WHERE actor=system-bypass' MUST
    return 0 in production)

* docs/migration/api-keys-to-rbac.md (200 lines) — v2.0.x →
  v2.1.0 upgrade flow:
  - The SECURITY: AUDIT YOUR API KEYS callout
  - Migration list (000029-000033) + what each does
  - 4-mode scope-down flow (interactive / non-interactive
    JSON / --suggest / --suggest --apply)
  - What changes for code that called auth.IsAdmin
  - Helm-specific upgrade flow with example post-upgrade Job
  - Docker Compose upgrade flow + the 5 examples folders
    that ride demo mode unchanged
  - Verification queries + rollback flow

# Updated docs

* docs/operator/security.md — Last-reviewed bumped to
  2026-05-09; existing Authentication-surface section
  extended to call out the Bundle 1 RBAC primitive,
  day-0 bootstrap path, and approval-bypass closure with
  cross-references to the new docs.

* docs/reference/profiles.md — Last-reviewed header
  formatting fixed (added the > blockquote prefix used
  consistently across the docs tree).

# docs/README.md navigation

* Operator section gains 2 new rows (RBAC + auth-threat-model)
  and Approval-workflow row updated to mention Phase 9
  closure.
* Reference section gains the Profiles row.
* Migration section gains the api-keys-to-rbac row with the
  AUDIT YOUR API KEYS callout in the link description.

# CHANGELOG.md v2.1.0 section refreshed

The Phase 7 commit landed the SECURITY: AUDIT YOUR API KEYS
callout. This commit appends the missing Phase 9-12 highlights:

  - Approval-bypass closure (profile-edit gate + flip-flop
    loophole + ErrApproveBySameActor invariant)
  - GUI: Roles / API Keys / Auth Settings / Approvals queue
  - 12 new MCP RBAC tools
  - Coverage gates on internal/auth + internal/service/auth
  - Protocol-endpoint allowlist pinned at 3 layers

Trailing cross-reference block now points at all 4 new docs.

# Verifications

* Every internal link in the 4 new/modified docs validated by
  shell sweep (find broken links → 0 hits).
* Every new doc carries 'Last reviewed: 2026-05-09' header
  with the > blockquote prefix matching the docs-tree
  convention.
* go vet ./... clean.
* staticcheck across every Bundle-1-touched Go package clean.
* gofmt -l clean repo-wide.
* go test -short -count=1 green across internal/auth (incl.
  bootstrap), internal/api/handler, internal/api/router,
  internal/cli, internal/service (incl. auth),
  internal/domain/auth, internal/mcp, cmd/cli (cmd/server
  has 1 environmental failure on the sandbox virtiofs-tmp:
  TestPreflightSCEPRACertKey_KeyWorldReadable_Refuses depends
  on tmpfs file-mode semantics that virtiofs propagates
  differently — pre-existing, unrelated to Bundle 1).
* Frontend: 19 Vitest tests across src/pages/auth/ +
  AuditPage all pass; tsc --noEmit clean.
2026-05-10 00:10:15 +00:00

11 KiB

Authentication & authorization threat model

Last reviewed: 2026-05-09

This document describes the attack surface around authentication and authorization in certctl after Bundle 1 (the RBAC primitive) lands. It complements rbac.md — that doc explains how to use the controls; this one explains what those controls defend against and which threats they explicitly do NOT close.

For Bundle 2's OIDC + sessions extensions, this document will be updated. The Bundle 1 boundary is "API-key auth + RBAC primitive + day-0 bootstrap"; OIDC-federated humans, session cookies, revocation lists, WebAuthn, and break-glass local accounts are Bundle 2 scope.

Threat actors

  1. External attacker with no credential — probing the public HTTP surface. The default trust boundary for everything except the protocol-level endpoints (ACME / SCEP / EST / OCSP / CRL, which authenticate via embedded credentials per their own RFCs).
  2. Authenticated caller with the wrong role — has a valid API key but the role doesn't grant the requested operation. The primary RBAC threat model.
  3. Compromised API key — attacker holds a valid Bearer token that an honest operator originally provisioned. The key may carry any role.
  4. Insider operator — legitimate access; potentially trying to escalate privilege or bypass the approval workflow.
  5. Compromised audit reviewer (auditor role) — read-only access to audit events but otherwise untrusted.

Defenses Bundle 1 ships

API-key authentication

  • API keys live in CERTCTL_API_KEYS_NAMED (env-var) or api_keys (DB row, written by Bundle 1 Phase 6 bootstrap and the future role-management API). Keys hash via SHA-256; the middleware compares hashes via crypto/subtle.ConstantTimeCompare to defeat timing attacks.
  • The auth middleware populates ActorIDKey / ActorTypeKey / TenantIDKey on every authenticated request context. Audit rows attribute every action to the named-key actor instead of the pre-Bundle-1 hardcoded api-key-user placeholder.
  • Demo mode (CERTCTL_AUTH_TYPE=none) injects the synthetic actor-demo-anon actor with admin grants. Production deploys MUST NOT use demo mode.

Authorization (RBAC)

  • Every gated handler routes through auth.RequirePermission (or the router-level rbacGate wrap from Phase 3.5). The middleware resolves the actor's effective permissions via the Authorizer.CheckPermission service-layer call; on miss, the handler returns HTTP 403 BEFORE the body runs. This is the load-bearing gate.
  • The five admin-only fine-grained perms (cert.bulk_revoke / crl.admin / scep.admin / est.admin / ca.hierarchy.manage) are seeded into r-admin only. To delegate one, an operator creates a custom role with the specific perm and grants it to the right actor.
  • The auditor split: r-auditor holds only audit.read + audit.export. Pinned by the internal/domain/auth/auditor_test.go invariants. A regulator with the auditor key cannot read certificates, profiles, issuers, or any mutating surface.
  • The privilege-escalation guard: granting or revoking a role requires the caller to hold auth.role.assign (enforced in internal/service/auth/actor_role_service.go). A non-admin cannot self-grant admin.
  • The reserved-actor guard: mutations against actor-demo-anon return HTTP 409 from the service layer (ErrAuthReservedActor). The synthetic actor is operator- inaccessible.

Day-0 bootstrap

  • CERTCTL_BOOTSTRAP_TOKEN is constant-time-compared by EnvTokenStrategy.Validate. The strategy is one-shot via sync.Mutex-guarded consumed bool; the second call returns ErrDisabled (HTTP 410), not ErrInvalidToken (HTTP 401), so a probing attacker cannot distinguish "wrong token, retry" from "already consumed".
  • The strategy also re-probes admin existence on every Validate. If an admin actor lands during the gap between Available and Validate, the second caller still gets HTTP 410.
  • The minted plaintext key is written to the response body once. It is NEVER logged. The token-leak hygiene test in internal/api/handler/auth_bootstrap_test.go redirects slog.Default to a buffer and grep-asserts that neither the bootstrap token nor the minted key appears in any log line, audit row, or HTTP header.
  • The minted key is hashed before persistence. Lost key → rotate via the regular RBAC API; the plaintext is not recoverable from the DB.

Approval workflow + Phase 9 loophole closure

  • CertificateProfile.RequiresApproval=true gates two surfaces: (a) issuance + renewal of every cert pointing at the profile, (b) edits to the profile itself (Bundle 1 Phase 9). The Phase 9 closure prevents the flip-flop bypass where an admin disables approval, mutates, re-enables.
  • Same-actor self-approve is rejected at the service layer with ErrApproveBySameActor for both cert_issuance and profile_edit kinds. Two-person integrity is the load-bearing invariant; pinned by tests in internal/service/approval_test.go.

Audit trail

  • Every mutating operation flows through AuditService.RecordEvent or RecordEventWithCategory. Bundle 1 Phase 8 added the event_category column with a CHECK constraint enforcing the closed enum (cert_lifecycle / auth / config); the category surfaces the auth-mutation slice to the auditor view.
  • The WORM trigger from migration 000018 (audit_events_worm_trigger) blocks UPDATE and DELETE at the database layer. Even an admin DB user cannot tamper with audit history without dropping the trigger.
  • Bundle-6's redactor (internal/service/audit_redact.go) scrubs credentials + PII from the details JSONB before persistence; an _redacted_keys field surfaces what the redactor took out for compliance review.

Protocol-endpoint allowlist

ACME / SCEP / EST / OCSP / CRL endpoints authenticate via embedded credentials defined by their own RFCs (JWS-signed, challenge passwords, mTLS, public-by-RFC). The auth middleware explicitly bypasses these via IsProtocolEndpoint. The Phase 12 internal/api/router/phase12_protocol_allowlist_test.go pins the invariant at three layers (middleware bypass, allowlist constant, router-level no-rbacGate-wraps-protocol-paths).

Threats Bundle 1 does NOT close

These are NOT defended; some are deferred to Bundle 2, others are out-of-scope for the project entirely.

  1. OIDC / SAML / WebAuthn federation — Bundle 2.
  2. Session management — there is no session cookie, no server-side revocation list. Each Bearer token is the bearer credential. To revoke a key, delete the actor_roles rows or remove the env-var entry; there is no "log out everywhere" button. Bundle 2.
  3. Local password accounts (break-glass) — Bundle 2.
  4. Time-bound role grants / JIT elevation — the schema reserves actor_roles.expires_at but no UI/API to set it. Bundle 2 or v3.
  5. MFA / hardware tokens for the operator console — Bundle 2.
  6. Rate limiting on the bootstrap endpoint — the endpoint is one-shot by construction (consumed flag + admin-existence probe), so a brute-force attack on the token has at most the single attempt before the path closes. Per-IP rate limiting on the broader API is still in place via Bundle C's middleware.NewRateLimiter.
  7. scope_id FK enforcement — operators can grant a permission at scope profile/p-bogus without the bogus profile existing. The gate still works (no rows match at request time) but a strict 404 on grant would be cleaner. See RoleRepository.AddPermission TODO(bundle-2) comment in internal/repository/postgres/auth.go.
  8. OIDC-first-admin bootstrap — Bundle 1 ships only the env-var-token strategy. Bundle 2 adds the OIDC-group-claim strategy alongside (the Strategy interface in internal/auth/bootstrap/ is already in place).
  9. GUI E2E suite via Playwright — the prompt asked for nine end-to-end flow tests. Bundle 1 ships 19 React Testing Library + Vitest tests covering the same surface; full Playwright land in Phase 12-extended work.

Compliance mapping

The control set in this document supports the following framework requirements. This is a mapping; it is not a claim of formal certification.

  • SOC 2 CC6.1 (logical access controls) — RBAC primitive with role-based gating on every mutating endpoint.
  • SOC 2 CC6.3 (privileged access management) — r-admin role separation + role-grant audit trail with two-person integrity on approval-tier profile edits.
  • HIPAA §164.312(b) (audit controls) — event_category column lets the auditor role review authentication / authorization changes specifically. WORM trigger keeps the audit table append-only at the database layer.
  • NIST SSDF PO.5.2 (separation of duties) — two-person integrity for compliance-tier issuance via the RequiresApproval flow + Bundle 1 Phase 9's closure of the flip-flop bypass.
  • FedRAMP AU-9 (audit information protection) — WORM enforcement + auditor-only read access (the auditor role cannot mutate, the WORM trigger blocks UPDATE/DELETE).
  • PCI-DSS §10 (audit logging) — every mutating operation emits an audit row with actor + action + resource + timestamp + category. The audit table is append-only.

Operator-facing checks

Run these periodically to verify the controls are working.

  1. certctl-cli auth keys list — confirm no unexpected actor holds r-admin. Audit any new admin grants against the audit log.
  2. SELECT actor, action, COUNT(*) FROM audit_events WHERE action LIKE 'approval_%' AND timestamp > NOW() - INTERVAL '7 days' GROUP BY actor, action; — confirm approvals are happening and not concentrated in a single approver.
  3. SELECT COUNT(*) FROM audit_events WHERE actor = 'system-bypass'; — MUST return 0 in production. A non-zero count means CERTCTL_APPROVAL_BYPASS=true was set; production deploys MUST leave it unset.
  4. SELECT actor, COUNT(*) FROM audit_events WHERE action = 'bootstrap.consume'; — MUST return at most one row per tenant. Multiple rows means the bootstrap endpoint was called more than once, which the strategy's one-shot guard should have prevented; investigate.
  5. certctl-cli auth me while authenticated as the auditor key — effective_permissions must contain audit.read + audit.export ONLY. Any other permission means a role grant widened the auditor's surface; revoke immediately.

Cross-references

  • rbac.md — the operator how-to
  • security.md — the wider security posture
  • approval-workflow.md — the two-person integrity gate
  • docs/migration/api-keys-to-rbac.md — upgrade flow
  • internal/auth/ — middleware + keystore + RequirePermission + bootstrap
  • internal/service/auth/ — Authorizer + privilege-escalation guard + reserved-actor guard
  • migrations/000029_rbac.up.sql — schema + seed
  • migrations/000030_rbac_admin_perms.up.sql — five admin-only fine-grained perms
  • migrations/000032_audit_category.up.sql — auditor surface
  • migrations/000033_approval_kinds.up.sql — approval-bypass closure