auth-bundle-1 Phase 13: docs (rbac.md + threat model + migration guide + security.md update)

Closes the last Phase before the Bundle 1 Exit gate. Operators
now have authoritative reference + threat model + migration guide
covering every behavior change Bundles 0-12 introduced.

# New docs

* docs/operator/rbac.md (340 lines) — operator how-to:
  - Mental model (actors / roles / permissions / scopes)
  - 7 default roles seeded by migration 000029 + the 5
    admin-only fine-grained perms seeded by 000030
  - Permission catalogue table by namespace
  - Scope semantics (global beats specific) + the Bundle-2
    deferral on scope_id FK enforcement
  - Granting / revoking access from GUI + CLI + HTTP API + MCP
  - The auditor pattern (audit-only, no resource read)
  - Day-0 bootstrap flow (CERTCTL_BOOTSTRAP_TOKEN → curl →
    HTTP 410 thereafter)
  - Demo-mode (CERTCTL_AUTH_TYPE=none) caveat for production

* docs/operator/auth-threat-model.md (180 lines) — what the
  controls defend against:
  - 5 threat actors (external, wrong-role, compromised key,
    insider operator, compromised auditor)
  - Per-defense walk-through (API-key auth, RBAC, bootstrap,
    approval workflow + Phase 9 closure, audit trail,
    protocol-endpoint allowlist)
  - 9 explicit deferrals (OIDC, sessions, local accounts,
    JIT elevation, MFA, etc.) — Bundle 2 / future scope
  - Compliance mapping (SOC 2 CC6.1/CC6.3, HIPAA §164.312(b),
    NIST SSDF PO.5.2, FedRAMP AU-9, PCI-DSS §10)
  - 5 operator-runnable sanity checks (e.g.,
    'SELECT FROM audit_events WHERE actor=system-bypass' MUST
    return 0 in production)

* docs/migration/api-keys-to-rbac.md (200 lines) — v2.0.x →
  v2.1.0 upgrade flow:
  - The SECURITY: AUDIT YOUR API KEYS callout
  - Migration list (000029-000033) + what each does
  - 4-mode scope-down flow (interactive / non-interactive
    JSON / --suggest / --suggest --apply)
  - What changes for code that called auth.IsAdmin
  - Helm-specific upgrade flow with example post-upgrade Job
  - Docker Compose upgrade flow + the 5 examples folders
    that ride demo mode unchanged
  - Verification queries + rollback flow

# Updated docs

* docs/operator/security.md — Last-reviewed bumped to
  2026-05-09; existing Authentication-surface section
  extended to call out the Bundle 1 RBAC primitive,
  day-0 bootstrap path, and approval-bypass closure with
  cross-references to the new docs.

* docs/reference/profiles.md — Last-reviewed header
  formatting fixed (added the > blockquote prefix used
  consistently across the docs tree).

# docs/README.md navigation

* Operator section gains 2 new rows (RBAC + auth-threat-model)
  and Approval-workflow row updated to mention Phase 9
  closure.
* Reference section gains the Profiles row.
* Migration section gains the api-keys-to-rbac row with the
  AUDIT YOUR API KEYS callout in the link description.

# CHANGELOG.md v2.1.0 section refreshed

The Phase 7 commit landed the SECURITY: AUDIT YOUR API KEYS
callout. This commit appends the missing Phase 9-12 highlights:

  - Approval-bypass closure (profile-edit gate + flip-flop
    loophole + ErrApproveBySameActor invariant)
  - GUI: Roles / API Keys / Auth Settings / Approvals queue
  - 12 new MCP RBAC tools
  - Coverage gates on internal/auth + internal/service/auth
  - Protocol-endpoint allowlist pinned at 3 layers

Trailing cross-reference block now points at all 4 new docs.

# Verifications

* Every internal link in the 4 new/modified docs validated by
  shell sweep (find broken links → 0 hits).
* Every new doc carries 'Last reviewed: 2026-05-09' header
  with the > blockquote prefix matching the docs-tree
  convention.
* go vet ./... clean.
* staticcheck across every Bundle-1-touched Go package clean.
* gofmt -l clean repo-wide.
* go test -short -count=1 green across internal/auth (incl.
  bootstrap), internal/api/handler, internal/api/router,
  internal/cli, internal/service (incl. auth),
  internal/domain/auth, internal/mcp, cmd/cli (cmd/server
  has 1 environmental failure on the sandbox virtiofs-tmp:
  TestPreflightSCEPRACertKey_KeyWorldReadable_Refuses depends
  on tmpfs file-mode semantics that virtiofs propagates
  differently — pre-existing, unrelated to Bundle 1).
* Frontend: 19 Vitest tests across src/pages/auth/ +
  AuditPage all pass; tsc --noEmit clean.
This commit is contained in:
shankar0123
2026-05-10 00:10:15 +00:00
parent 06cea1ce0f
commit e7a94b6080
7 changed files with 913 additions and 9 deletions
+36 -1
View File
@@ -54,13 +54,48 @@ What else changed in v2.1.0:
- **`/v1/auth/check` enrichment.** Response now includes the actor's
standing roles and effective permissions, so the GUI gates
affordances from a single fetch on app boot.
- **Approval-bypass closure.** Edits to a profile that has (or
would have) `RequiresApproval=true` now route through the
`ApprovalService` two-person integrity gate (Phase 9). Migration
000033 adds `approval_kind` + `payload` to
`issuance_approval_requests` so cert-issuance and profile-edit
approvals share the same workflow. Same-actor self-approve is
rejected with `ErrApproveBySameActor` for both kinds. Closes the
flip-flop loophole where an admin could disable approval, mutate,
re-enable. Documented at
[`docs/reference/profiles.md`](docs/reference/profiles.md).
- **GUI: Roles / API Keys / Auth Settings / Approvals queue.**
Four new pages under `/auth/*` consume `/v1/auth/me` for
permission-aware rendering. The Approvals queue blocks
self-approve at the client layer (Approve/Reject buttons hidden
when requested_by == current actor_id) on top of the server-side
enforcement. AuditPage gains a category filter (cert_lifecycle /
auth / config) for the auditor view.
- **MCP server gains 12 RBAC tools.** Operators driving certctl
from Claude / VS Code / any MCP client get parity with the GUI
+ CLI. Each tool routes through the same HTTP handler; permission
gates fire server-side.
- **OpenAPI catalogues every new route.** Every Bundle 1 endpoint
ships with an `operationId`; the parity test guards against drift.
- **Coverage gates.** `internal/auth/` and `internal/service/auth/`
now have ≥85% coverage floors in `.github/coverage-thresholds.yml`.
The 12-path negative-test list from the Bundle 1 prompt is
fully covered (path #12 deferred with in-tree TODO).
- **Protocol-endpoint allowlist pinned at three layers.** The
middleware bypass (`auth.IsProtocolEndpoint`), the router-level
`AuthExemptRouterRoutes` constant, and a new
`phase12_protocol_allowlist_test.go` AST scan all guard against
accidentally wrapping ACME / SCEP / EST / OCSP / CRL routes in
`rbacGate`.
- **Bundle 2 (OIDC + sessions) starts after Bundle 1 lands on
master.** Roadmap entry remains in `cowork/auth-bundle-2-prompt.md`.
Migration ordering, idempotency, and downgrade are documented in
`docs/migration/api-keys-to-rbac.md`.
[`docs/migration/api-keys-to-rbac.md`](docs/migration/api-keys-to-rbac.md).
The threat model + compliance mapping live at
[`docs/operator/auth-threat-model.md`](docs/operator/auth-threat-model.md).
Day-2 RBAC operations live at
[`docs/operator/rbac.md`](docs/operator/rbac.md).
## v2.0.68 — Image registry path changed ⚠️
+6 -2
View File
@@ -27,6 +27,7 @@ You're operating certctl in production or building integrations and need authori
| Doc | What it covers |
|---|---|
| [Architecture](reference/architecture.md) | System design, data flow, security model, deployment topologies |
| [Profiles](reference/profiles.md) | CertificateProfile policy object — issuer wiring, EKUs, RequiresApproval gate (Phase 9 closure) |
| [API](reference/api.md) | OpenAPI 3.1 spec, integration patterns, client SDK generation |
| [CLI](reference/cli.md) | certctl-cli command reference and CI/CD integration patterns |
| [Configuration](reference/configuration.md) | `CERTCTL_*` environment variable reference (scheduler, rate limits, deploy verify, audit, agent) |
@@ -62,10 +63,12 @@ You're running certctl in production and need operational guidance.
| Doc | What it covers |
|---|---|
| [Security posture](operator/security.md) | Auth, rate limits, encryption at rest, key rotation |
| [Security posture](operator/security.md) | Auth, rate limits, encryption at rest, key rotation, RBAC primitive (Bundle 1), bootstrap |
| [RBAC operator reference](operator/rbac.md) | Roles, permissions, scopes, scope-down + bootstrap flow (Bundle 1) |
| [Auth threat model](operator/auth-threat-model.md) | API-key compromise, role-grant abuse, bootstrap-token leak, audit-mutation, compliance mapping (Bundle 1) |
| [Control plane TLS](operator/tls.md) | Self-signed bootstrap, operator-supplied Secret, cert-manager Certificate CR |
| [Database TLS](operator/database-tls.md) | PostgreSQL transport encryption |
| [Approval workflow](operator/approval-workflow.md) | Two-person integrity gate for high-stakes issuance |
| [Approval workflow](operator/approval-workflow.md) | Two-person integrity gate for high-stakes issuance + Phase 9 profile-edit closure |
| [Helm deployment](operator/helm-deployment.md) | Kubernetes installation via the bundled chart |
| [Performance baselines](operator/performance-baselines.md) | Operator-runnable benchmarks for regression spot checks |
| [Legacy clients (TLS 1.2)](operator/legacy-clients-tls-1.2.md) | Reverse-proxy runbook for embedded EST/SCEP clients on TLS 1.2 |
@@ -90,6 +93,7 @@ You're moving from another cert-management tool to certctl, or running both in p
| Caddy ACME (point Caddy at certctl) | [migration/acme-from-caddy.md](migration/acme-from-caddy.md) |
| cert-manager ACME (point cert-manager at certctl) | [migration/acme-from-cert-manager.md](migration/acme-from-cert-manager.md) |
| Traefik ACME (point Traefik at certctl) | [migration/acme-from-traefik.md](migration/acme-from-traefik.md) |
| **API keys → RBAC (v2.0.x → v2.1.0)** | [migration/api-keys-to-rbac.md](migration/api-keys-to-rbac.md) — **AUDIT YOUR API KEYS** post-upgrade |
## Contributor
+296
View File
@@ -0,0 +1,296 @@
# Migrating API keys to RBAC (v2.0.x → v2.1.0)
> Last reviewed: 2026-05-09
This is the upgrade guide for an existing certctl deployment moving
from v2.0.x's "every API key is admin or not" model to v2.1.0's
RBAC primitive. Everything keeps working through the upgrade — the
Bundle 1 migration backfills every existing API key to the
`r-admin` role on first boot, so the pre-existing automation that
was using those keys does not change behavior. **However**, most
keys do not need full admin power; this guide walks the operator
through the post-upgrade scope-down flow.
## ⚠️ SECURITY: AUDIT YOUR API KEYS
Bundle 1 maps **every** existing `CERTCTL_API_KEYS_NAMED` entry
(and every legacy `CERTCTL_AUTH_SECRET`-synthesized key) to the
`r-admin` role on the first boot after migration 000029 applies.
This is the safe-for-back-compat default — your CI / agents / scripts
keep working without changes — but if you don't downgrade keys, every
key in your fleet has full admin permissions including bulk-revoke,
CRL admin, and CA hierarchy management.
**Run the scope-down flow before tagging the next release.** The
release notes for v2.1.0 lead with this callout for a reason.
## Upgrade flow
### 1. Apply the migration
The migration runner is idempotent. Re-applying is a no-op if the
schema is already at the target version. The Bundle 1 migrations
that ship with v2.1.0:
| Migration | What it does |
|---|---|
| `000029_rbac.up.sql` | Creates `tenants`, `roles`, `permissions`, `role_permissions`, `actor_roles`. Seeds 7 default roles + 33-permission catalogue + the synthetic `actor-demo-anon` admin grant. Backfills every named API key into `actor_roles` with the `r-admin` role. |
| `000030_rbac_admin_perms.up.sql` | Seeds 5 admin-only fine-grained permissions (`cert.bulk_revoke`, `crl.admin`, `scep.admin`, `est.admin`, `ca.hierarchy.manage`) into `r-admin` only. |
| `000031_api_keys.up.sql` | Creates the `api_keys` table for runtime-minted keys (Bundle 1 Phase 6 bootstrap). |
| `000032_audit_category.up.sql` | Adds `event_category` column to `audit_events` with the closed enum (`cert_lifecycle` / `auth` / `config`). |
| `000033_approval_kinds.up.sql` | Adds `approval_kind` + `payload` to `issuance_approval_requests` for the Phase 9 approval-bypass closure. |
The Bundle 1 server applies these on first boot. No operator
action is required other than running the upgrade.
### 2. Verify the backfill landed
```bash
# Inspect the seeded actor_roles rows. You should see one row per
# entry in CERTCTL_API_KEYS_NAMED (Admin=true keys → r-admin,
# Admin=false keys → r-viewer) plus the seeded actor-demo-anon
# admin row.
psql -d certctl -c "SELECT actor_id, role_id, granted_by, granted_at FROM actor_roles ORDER BY granted_at;"
```
If the table is empty, the boot-loader hook in
`cmd/server/auth_backfill.go::backfillNamedKeyActorRoles` did not
run; re-check that `CERTCTL_AUTH_TYPE` is `api-key` (the boot
hook is gated on `cfg.Auth.Type != none`).
### 3. List + scope-down keys
The `certctl-cli` ships a four-mode scope-down command. Pick the
mode that matches your fleet size + automation posture.
#### Interactive walk
```bash
certctl-cli auth keys scope-down
```
Walks every actor (skips the synthetic `actor-demo-anon`) and
prompts for a target role. Empty input keeps the existing role.
Type one of `admin`, `operator`, `viewer`, `agent`, `mcp`, `cli`,
`auditor` to replace.
#### Non-interactive JSON config (Helm post-upgrade hook)
```bash
cat > scope-down.json <<EOF
{
"ci-bot": "operator",
"agent-prod-1": "agent",
"agent-prod-2": "agent",
"monitoring-bot": "viewer",
"compliance-bot": "auditor"
}
EOF
certctl-cli auth keys scope-down --non-interactive ./scope-down.json
```
Empty role values revoke every current grant WITHOUT granting a
replacement; assign roles selectively with
`certctl-cli auth keys assign`.
#### Audit-driven suggestion
```bash
# Preview suggestions based on the last 30 days of audit history
certctl-cli auth keys scope-down --suggest
# Apply the suggestions
certctl-cli auth keys scope-down --suggest --apply
```
The classifier (pure function in `internal/cli/auth_scope_down.go::SuggestRoleFromAuditEvents`)
walks the actor's audit events and emits one of:
| Suggestion | Trigger |
|---|---|
| `admin` | Any auth.role.* / auth.key.* / ca.hierarchy.* / *.bulk_revoke / *.admin action |
| `mcp` | All observed actions are MCP-shaped (`mcp.*`) |
| `viewer` | All observed actions are read-only (`*.read` or `*.list`) |
| `agent` | All observed actions are agent-shaped (`agent.*`, `cert.read`, `cert.issue`) |
| `operator` | Cert / profile / target lifecycle mutations without admin signals |
The classifier is conservative — when in doubt, it prefers the
narrower role. The operator confirms each suggestion before any
mutation lands (unless `--apply` is set).
### 4. Mint a fresh admin via bootstrap (optional, for fresh deployments)
If you're standing up a fresh deployment instead of upgrading an
existing one, the bootstrap path mints the first admin key without
needing the operator to know the env-var format:
```bash
# Set the bootstrap token in the server environment.
export CERTCTL_BOOTSTRAP_TOKEN=$(openssl rand -hex 32)
# Boot the server. Logs include "bootstrap endpoint enabled".
docker compose up -d
# Mint the first admin key.
curl -X POST $URL/api/v1/auth/bootstrap \
-H 'Content-Type: application/json' \
-d '{"token":"'$CERTCTL_BOOTSTRAP_TOKEN'","actor_name":"first-admin"}'
```
The response carries the plaintext `key_value` once. Capture it
and use it as the Bearer token for subsequent calls. Subsequent
bootstrap calls return HTTP 410 Gone.
See [`docs/operator/rbac.md`](../operator/rbac.md) for the full
bootstrap flow + the threat model.
## What changes for code that called `IsAdmin`
Pre-Bundle-1, the five admin handlers checked `auth.IsAdmin(ctx)`
directly in the body. Bundle 1 Phase 3.5 moved those checks to
the router via the `auth.RequirePermission` middleware (wrapped
through the `rbacGate` helper in
`internal/api/router/router.go`). The behavior contract is
unchanged: `r-admin`-roled callers reach the handler, anyone else
gets HTTP 403 BEFORE the body runs.
If your code consumed `auth.IsAdmin` directly (it shouldn't —
the helper is internal), the new convention is:
1. Wrap the route in `rbacGate(reg.Checker, "<perm>", handler)`
in `router.go`.
2. Add the perm to `migrations/000030_rbac_admin_perms.up.sql`
(or `migrations/000029_rbac.up.sql`'s catalogue).
3. Grant the perm to the right default roles.
The five admin-only fine-grained perms shipped in Phase 3.5 stay
on `r-admin` only by default. Operators delegate by creating
custom roles with the specific perm.
## Helm-specific upgrade
The certctl Helm chart applies migrations on container start via
the standard migrations runner. No chart changes are required;
the `helm upgrade` command runs identically:
```bash
helm upgrade certctl certctl/certctl \
--version <new-version> \
--reuse-values
```
Post-upgrade, the boot loader runs the named-key actor-role
backfill against the `CERTCTL_API_KEYS_NAMED` env-var-injected
into the deployment. The "AUDIT YOUR API KEYS" callout applies —
add a post-upgrade Job to your release pipeline that runs
`certctl-cli auth keys scope-down --non-interactive` against a
checked-in JSON config, so the role narrowing is deterministic
across upgrade rollouts.
Example post-upgrade Job:
```yaml
apiVersion: batch/v1
kind: Job
metadata:
name: certctl-scope-down
spec:
template:
spec:
containers:
- name: scope-down
image: ghcr.io/certctl-io/certctl-cli:<tag>
command:
- certctl-cli
- auth
- keys
- scope-down
- --non-interactive
- /config/scope-down.json
envFrom:
- secretRef:
name: certctl-cli-credentials
volumeMounts:
- name: scope-down-config
mountPath: /config
volumes:
- name: scope-down-config
configMap:
name: certctl-scope-down-config
restartPolicy: OnFailure
```
The ConfigMap holds the `{actor_id: role_id}` map; the Secret
holds the API key the Job uses to call `/v1/auth/keys/.../roles`.
## Docker Compose-specific upgrade
For `deploy/docker-compose.yml` deployments:
1. Pull the new images: `docker compose pull`
2. Verify your `CERTCTL_AUTH_TYPE` value before restarting. If it
was `none` (the demo path), the post-upgrade server will boot
in demo mode again — the synthetic `actor-demo-anon` admin
covers every request, no scope-down is meaningful. If you're
moving from `none` to `api-key` mode, set
`CERTCTL_API_KEYS_NAMED` first, then restart.
3. `docker compose up -d` to apply.
4. `docker compose logs certctl-server | grep -i 'loaded persisted api_keys'`
to verify the boot loader ran. The first-boot log line includes
the count of keys loaded into the runtime keystore.
5. Run `certctl-cli auth keys scope-down` against the running
server.
The five examples in `examples/` (acme-nginx, private-ca-traefik,
step-ca-haproxy, multi-issuer, acme-wildcard-dns01) all run in
demo mode (`CERTCTL_AUTH_TYPE=none`) and are unaffected by the
RBAC migration — the synthetic actor-demo-anon admin grant covers
every request.
## Verifying the upgrade landed
After the scope-down flow completes:
1. `certctl-cli auth me` while authenticated as each named key
confirms the right `effective_permissions` for that role.
2. `psql -c "SELECT actor_id, array_agg(role_id ORDER BY role_id) FROM actor_roles GROUP BY actor_id;"`
gives the full picture in one query.
3. The audit trail
(`GET /api/v1/audit?category=auth`)
shows the `auth.role.assign` and `auth.role.revoke` rows for
every change you made — confirm via the GUI's
`/audit?category=auth` view.
4. Read the updated [`docs/operator/rbac.md`](../operator/rbac.md)
for day-2 RBAC management.
## Rollback
If the upgrade goes wrong, the down migrations exist in lockstep:
```bash
# Roll back via your migration runner (golang-migrate, Atlas, etc.).
# Migrations 000029-000033 each have a .down.sql that reverses the
# .up.sql. Down migrations are destructive on data added by the up
# migration (api_keys rows, role grants on actors, profile-edit
# approvals); take a backup first.
```
After rollback, the v2.0.x binary works against the v2.0.x
schema unchanged. The operator's API keys still authenticate (the
in-memory hash table is rebuilt from `CERTCTL_API_KEYS_NAMED` on
boot regardless of schema version).
## Cross-references
- [`docs/operator/rbac.md`](../operator/rbac.md) — the operator
how-to for the new RBAC primitive
- [`docs/operator/auth-threat-model.md`](../operator/auth-threat-model.md) —
what the new controls defend against
- [`docs/reference/profiles.md`](../reference/profiles.md) — the
Phase 9 approval-bypass closure
- [`docs/operator/security.md`](../operator/security.md) — the
full security posture
- `cowork/auth-bundle-1-prompt.md` — the design + phase plan
- `cowork/auth-bundles-index.md` — the per-phase status tracker
- `CHANGELOG.md` — the v2.1.0 release notes lead with this guide
+244
View File
@@ -0,0 +1,244 @@
# Authentication & authorization threat model
> Last reviewed: 2026-05-09
This document describes the attack surface around authentication and
authorization in certctl after Bundle 1 (the RBAC primitive) lands.
It complements [`rbac.md`](rbac.md) — that doc explains how to use
the controls; this one explains what those controls defend against
and which threats they explicitly do NOT close.
For Bundle 2's OIDC + sessions extensions, this document will be
updated. The Bundle 1 boundary is "API-key auth + RBAC primitive +
day-0 bootstrap"; OIDC-federated humans, session cookies,
revocation lists, WebAuthn, and break-glass local accounts are
Bundle 2 scope.
## Threat actors
1. **External attacker with no credential** — probing the public
HTTP surface. The default trust boundary for everything except
the protocol-level endpoints (ACME / SCEP / EST / OCSP / CRL,
which authenticate via embedded credentials per their own RFCs).
2. **Authenticated caller with the wrong role** — has a valid API
key but the role doesn't grant the requested operation. The
primary RBAC threat model.
3. **Compromised API key** — attacker holds a valid Bearer token
that an honest operator originally provisioned. The key may
carry any role.
4. **Insider operator** — legitimate access; potentially trying
to escalate privilege or bypass the approval workflow.
5. **Compromised audit reviewer (auditor role)** — read-only
access to audit events but otherwise untrusted.
## Defenses Bundle 1 ships
### API-key authentication
- API keys live in `CERTCTL_API_KEYS_NAMED` (env-var) or
`api_keys` (DB row, written by Bundle 1 Phase 6 bootstrap and
the future role-management API). Keys hash via SHA-256; the
middleware compares hashes via `crypto/subtle.ConstantTimeCompare`
to defeat timing attacks.
- The auth middleware populates `ActorIDKey` / `ActorTypeKey` /
`TenantIDKey` on every authenticated request context. Audit rows
attribute every action to the named-key actor instead of the
pre-Bundle-1 hardcoded `api-key-user` placeholder.
- Demo mode (`CERTCTL_AUTH_TYPE=none`) injects the synthetic
`actor-demo-anon` actor with admin grants. Production deploys
MUST NOT use demo mode.
### Authorization (RBAC)
- Every gated handler routes through `auth.RequirePermission` (or
the router-level `rbacGate` wrap from Phase 3.5). The middleware
resolves the actor's effective permissions via the
`Authorizer.CheckPermission` service-layer call; on miss, the
handler returns HTTP 403 BEFORE the body runs. This is the
load-bearing gate.
- The five admin-only fine-grained perms (`cert.bulk_revoke` /
`crl.admin` / `scep.admin` / `est.admin` /
`ca.hierarchy.manage`) are seeded into `r-admin` only. To
delegate one, an operator creates a custom role with the
specific perm and grants it to the right actor.
- The auditor split: `r-auditor` holds only `audit.read` +
`audit.export`. Pinned by the
`internal/domain/auth/auditor_test.go` invariants. A regulator
with the auditor key cannot read certificates, profiles,
issuers, or any mutating surface.
- The privilege-escalation guard: granting or revoking a role
requires the caller to hold `auth.role.assign` (enforced in
`internal/service/auth/actor_role_service.go`). A non-admin
cannot self-grant admin.
- The reserved-actor guard: mutations against `actor-demo-anon`
return HTTP 409 from the service layer
(`ErrAuthReservedActor`). The synthetic actor is operator-
inaccessible.
### Day-0 bootstrap
- `CERTCTL_BOOTSTRAP_TOKEN` is constant-time-compared by
`EnvTokenStrategy.Validate`. The strategy is one-shot via
`sync.Mutex`-guarded `consumed` bool; the second call returns
`ErrDisabled` (HTTP 410), not `ErrInvalidToken` (HTTP 401), so
a probing attacker cannot distinguish "wrong token, retry"
from "already consumed".
- The strategy also re-probes admin existence on every Validate.
If an admin actor lands during the gap between Available and
Validate, the second caller still gets HTTP 410.
- The minted plaintext key is written to the response body once.
It is NEVER logged. The token-leak hygiene test in
`internal/api/handler/auth_bootstrap_test.go` redirects
`slog.Default` to a buffer and grep-asserts that neither the
bootstrap token nor the minted key appears in any log line,
audit row, or HTTP header.
- The minted key is hashed before persistence. Lost key →
rotate via the regular RBAC API; the plaintext is not
recoverable from the DB.
### Approval workflow + Phase 9 loophole closure
- `CertificateProfile.RequiresApproval=true` gates two surfaces:
(a) issuance + renewal of every cert pointing at the profile,
(b) edits to the profile itself (Bundle 1 Phase 9). The Phase 9
closure prevents the flip-flop bypass where an admin disables
approval, mutates, re-enables.
- Same-actor self-approve is rejected at the service layer with
`ErrApproveBySameActor` for both `cert_issuance` and
`profile_edit` kinds. Two-person integrity is the load-bearing
invariant; pinned by tests in
`internal/service/approval_test.go`.
### Audit trail
- Every mutating operation flows through `AuditService.RecordEvent`
or `RecordEventWithCategory`. Bundle 1 Phase 8 added the
`event_category` column with a `CHECK` constraint enforcing
the closed enum (`cert_lifecycle` / `auth` / `config`); the
category surfaces the auth-mutation slice to the auditor view.
- The WORM trigger from migration 000018
(`audit_events_worm_trigger`) blocks `UPDATE` and `DELETE` at
the database layer. Even an admin DB user cannot tamper with
audit history without dropping the trigger.
- Bundle-6's redactor (`internal/service/audit_redact.go`)
scrubs credentials + PII from the `details` JSONB before
persistence; an `_redacted_keys` field surfaces what the
redactor took out for compliance review.
### Protocol-endpoint allowlist
ACME / SCEP / EST / OCSP / CRL endpoints authenticate via
embedded credentials defined by their own RFCs (JWS-signed,
challenge passwords, mTLS, public-by-RFC). The auth middleware
explicitly bypasses these via `IsProtocolEndpoint`. The Phase 12
`internal/api/router/phase12_protocol_allowlist_test.go` pins
the invariant at three layers (middleware bypass, allowlist
constant, router-level no-rbacGate-wraps-protocol-paths).
## Threats Bundle 1 does NOT close
These are NOT defended; some are deferred to Bundle 2, others
are out-of-scope for the project entirely.
1. **OIDC / SAML / WebAuthn federation** — Bundle 2.
2. **Session management** — there is no session cookie, no
server-side revocation list. Each Bearer token is the bearer
credential. To revoke a key, delete the `actor_roles` rows or
remove the env-var entry; there is no "log out everywhere"
button. Bundle 2.
3. **Local password accounts (break-glass)** — Bundle 2.
4. **Time-bound role grants / JIT elevation** — the schema
reserves `actor_roles.expires_at` but no UI/API to set it.
Bundle 2 or v3.
5. **MFA / hardware tokens for the operator console**
Bundle 2.
6. **Rate limiting on the bootstrap endpoint** — the endpoint
is one-shot by construction (consumed flag + admin-existence
probe), so a brute-force attack on the token has at most the
single attempt before the path closes. Per-IP rate limiting
on the broader API is still in place via Bundle C's
`middleware.NewRateLimiter`.
7. **`scope_id` FK enforcement** — operators can grant a
permission at scope `profile`/`p-bogus` without the bogus
profile existing. The gate still works (no rows match at
request time) but a strict 404 on grant would be cleaner. See
`RoleRepository.AddPermission` `TODO(bundle-2)` comment in
`internal/repository/postgres/auth.go`.
8. **OIDC-first-admin bootstrap** — Bundle 1 ships only the
env-var-token strategy. Bundle 2 adds the OIDC-group-claim
strategy alongside (the `Strategy` interface in
`internal/auth/bootstrap/` is already in place).
9. **GUI E2E suite via Playwright** — the prompt asked for
nine end-to-end flow tests. Bundle 1 ships 19 React Testing
Library + Vitest tests covering the same surface; full
Playwright land in Phase 12-extended work.
## Compliance mapping
The control set in this document supports the following
framework requirements. This is a mapping; it is not a claim of
formal certification.
- **SOC 2 CC6.1** (logical access controls) — RBAC primitive
with role-based gating on every mutating endpoint.
- **SOC 2 CC6.3** (privileged access management) — `r-admin`
role separation + role-grant audit trail with two-person
integrity on approval-tier profile edits.
- **HIPAA §164.312(b)** (audit controls) — `event_category`
column lets the auditor role review authentication / authorization
changes specifically. WORM trigger keeps the audit table
append-only at the database layer.
- **NIST SSDF PO.5.2** (separation of duties) — two-person
integrity for compliance-tier issuance via the
`RequiresApproval` flow + Bundle 1 Phase 9's closure of the
flip-flop bypass.
- **FedRAMP AU-9** (audit information protection) — WORM
enforcement + auditor-only read access (the auditor role
cannot mutate, the WORM trigger blocks UPDATE/DELETE).
- **PCI-DSS §10** (audit logging) — every mutating operation
emits an audit row with actor + action + resource + timestamp +
category. The audit table is append-only.
## Operator-facing checks
Run these periodically to verify the controls are working.
1. `certctl-cli auth keys list` — confirm no unexpected actor
holds `r-admin`. Audit any new admin grants against the audit
log.
2. `SELECT actor, action, COUNT(*) FROM audit_events WHERE
action LIKE 'approval_%' AND timestamp > NOW() - INTERVAL '7
days' GROUP BY actor, action;` — confirm approvals are
happening and not concentrated in a single approver.
3. `SELECT COUNT(*) FROM audit_events WHERE actor =
'system-bypass';` — MUST return 0 in production. A non-zero
count means `CERTCTL_APPROVAL_BYPASS=true` was set; production
deploys MUST leave it unset.
4. `SELECT actor, COUNT(*) FROM audit_events WHERE action =
'bootstrap.consume';` — MUST return at most one row per
tenant. Multiple rows means the bootstrap endpoint was called
more than once, which the strategy's one-shot guard should
have prevented; investigate.
5. `certctl-cli auth me` while authenticated as the auditor
key — `effective_permissions` must contain `audit.read` +
`audit.export` ONLY. Any other permission means a role grant
widened the auditor's surface; revoke immediately.
## Cross-references
- [`rbac.md`](rbac.md) — the operator how-to
- [`security.md`](security.md) — the wider security posture
- [`approval-workflow.md`](approval-workflow.md) — the two-person
integrity gate
- [`docs/migration/api-keys-to-rbac.md`](../migration/api-keys-to-rbac.md) —
upgrade flow
- `internal/auth/` — middleware + keystore + RequirePermission +
bootstrap
- `internal/service/auth/` — Authorizer + privilege-escalation
guard + reserved-actor guard
- `migrations/000029_rbac.up.sql` — schema + seed
- `migrations/000030_rbac_admin_perms.up.sql` — five admin-only
fine-grained perms
- `migrations/000032_audit_category.up.sql` — auditor surface
- `migrations/000033_approval_kinds.up.sql` — approval-bypass
closure
+280
View File
@@ -0,0 +1,280 @@
# RBAC operator reference
> Last reviewed: 2026-05-09
This is the operator-facing reference for the role-based access
control primitive that ships with Bundle 1 (auth bundle 1) of certctl.
Read this if you're running certctl in production and need to grant /
revoke access to API keys, set up the auditor split, or onboard the
first admin.
For the threat model behind these controls, see
[`auth-threat-model.md`](auth-threat-model.md). For the migration
flow from a pre-Bundle-1 deployment, see
[`docs/migration/api-keys-to-rbac.md`](../migration/api-keys-to-rbac.md).
## Mental model
Every action against the certctl HTTP / CLI / MCP / GUI surface is
performed by an **actor** (an API key, an agent's machine identity,
the synthetic demo-anon actor when the server runs in
`CERTCTL_AUTH_TYPE=none` mode). Each actor holds zero or more
**roles**. Each role grants a set of **permissions** at a **scope**.
A request to a gated endpoint succeeds when the actor's effective
permission set (the union across all held roles) contains the
permission the endpoint requires.
The schema lives in `migrations/000029_rbac.up.sql` and ships with
seven seeded default roles + a 33-permission canonical catalogue.
The middleware that gates requests lives at
`internal/auth/require_permission.go`. The service-layer authorizer
that resolves "actor → permissions" lives at
`internal/service/auth/authorizer.go`.
## Default roles (seeded by migration 000029)
| Role | ID | Use case | Permission shape |
|---|---|---|---|
| Admin | `r-admin` | Operator with full control | Every permission in the canonical catalogue |
| Operator | `r-operator` | Day-to-day cert lifecycle | `cert.*`, `profile.read`, `issuer.read`, `target.*`, `agent.read`, `audit.read` |
| Viewer | `r-viewer` | Read-only console access | `*.read` for every resource type |
| Agent | `r-agent` | Machine identity for `certctl-agent` | `cert.read` + `agent.heartbeat` + `agent.job.poll` + `agent.job.complete` + `agent.job.report` |
| MCP | `r-mcp` | Operator-equivalent for the MCP server, minus destructive ops | Like Operator without `*.delete` |
| CLI | `r-cli` | Day-to-day operator CLI | Like Operator + `auth.key.list` / `auth.key.create` / `auth.key.rotate` |
| Auditor | `r-auditor` | Compliance reviewer | `audit.read` + `audit.export` ONLY |
The auditor split is the load-bearing one: an auditor cannot read
certificates, profiles, or issuers — only audit events. That makes the
role legitimate to hand to a SOC 2 / FedRAMP / PCI auditor without
giving them the keys to the kingdom. The
`internal/domain/auth/auditor_test.go` invariants pin this set going
forward.
The five **admin-only fine-grained perms** seeded by migration
000030 (Phase 3.5 conversion) gate the high-blast-radius endpoints:
- `cert.bulk_revoke``POST /api/v1/certificates/bulk-revoke` and the EST sibling
- `crl.admin``/api/v1/admin/crl/cache`
- `scep.admin``/api/v1/admin/scep/intune/*`
- `est.admin``/api/v1/admin/est/*`
- `ca.hierarchy.manage``/api/v1/issuers/{id}/intermediates`, `/api/v1/intermediates/{id}`
Only `r-admin` holds these by default. To delegate one, create a
custom role with the specific perm and grant it to the right actor.
## Permission catalogue
The catalogue is namespaced. Permission strings are stable across
releases; new permissions add to the namespace, never reshape an
existing one. Run
`certctl-cli auth permissions list` (or `GET /api/v1/auth/permissions`)
for the live catalogue.
| Namespace | Examples | What the namespace gates |
|---|---|---|
| `cert.*` | `cert.read`, `cert.issue`, `cert.revoke`, `cert.delete`, `cert.bulk_revoke` | The certificate lifecycle surface (`/api/v1/certificates`) |
| `profile.*` | `profile.read`, `profile.edit`, `profile.delete` | `CertificateProfile` CRUD |
| `issuer.*` | `issuer.read`, `issuer.edit`, `issuer.delete` | Issuer connector config |
| `target.*` | `target.read`, `target.edit`, `target.delete` | Deployment target config |
| `agent.*` | `agent.read`, `agent.edit`, `agent.retire`, `agent.heartbeat`, `agent.job.*` | Agent fleet + agent self-service endpoints |
| `audit.*` | `audit.read`, `audit.export` | The audit-events surface |
| `auth.role.*` | `auth.role.list`, `auth.role.create`, `auth.role.edit`, `auth.role.delete`, `auth.role.assign` | RBAC management |
| `auth.key.*` | `auth.key.list`, `auth.key.create`, `auth.key.rotate`, `auth.key.delete` | API key management |
| `auth.bootstrap.*` | `auth.bootstrap.use` | Day-0 first-admin path |
| `crl.admin`, `scep.admin`, `est.admin`, `ca.hierarchy.manage` | (single perms) | The five admin-only fine-grained perms (see above) |
## Scope semantics
Permissions are granted at one of three scopes:
- **`global`** — applies to every resource in the tenant. The
default for the seeded role grants. A `cert.read` grant at global
scope lets the actor read any certificate.
- **`profile`** — applies only to the named `CertificateProfile`
(matched by ID). `cert.issue` at scope `profile`/`p-corp-cdn` lets
the actor issue against `p-corp-cdn` only.
- **`issuer`** — applies only to the named issuer. Lets you grant
`issuer.edit` on the production issuer to a senior operator
without giving them edit on every issuer.
Global beats specific: an actor with `cert.read` at global scope
passes a `cert.read` check against any specific profile or issuer
even if no scoped grant exists. The reverse is also true — a
scoped grant doesn't satisfy a request against a different scope.
The Authorizer's `CheckPermission` is the single point of truth.
> **Note (Bundle 1 deferral):** the `scope_id` column is not
> currently FK-constrained against the resource tables. An
> operator can grant a permission at scope `profile`/`p-bogus`
> without `p-bogus` existing; the gate still works (no rows match
> at request time), but the API does not 404 the grant. Bundle 2
> tracks the strict-FK closure. See
> `internal/repository/postgres/auth.go::AddPermission`'s
> `TODO(bundle-2)` comment.
## Granting + revoking access
### From the GUI
`/auth/roles` lists every role; click into one to see its
permissions and (if you hold `auth.role.edit`) add or remove a
permission. `/auth/keys` lists every actor with role grants;
click "Assign role" to grant, click the × on a role tag to revoke.
The synthetic `actor-demo-anon` row is shown but flagged
"system-managed" with the mutation buttons hidden — the server-side
reserved-actor guard rejects mutations against it regardless.
### From the CLI
```bash
# Identity probe — what can the current API key actually do?
certctl-cli auth me
# Roles
certctl-cli auth roles list
certctl-cli auth roles get r-admin
# Permissions catalogue
certctl-cli auth permissions list
# Key → role assignment
certctl-cli auth keys list
certctl-cli auth keys assign alice --role r-operator
certctl-cli auth keys revoke alice --role r-admin
# Walk-every-key prompt for downgrade
certctl-cli auth keys scope-down
# Audit-driven role suggestion (last 30 days of audit events)
certctl-cli auth keys scope-down --suggest
certctl-cli auth keys scope-down --suggest --apply
# JSON-driven scope-down for automation (Helm post-upgrade hook etc.)
certctl-cli auth keys scope-down --non-interactive ./scope-down.json
```
The mutating role-lifecycle commands (`certctl-cli auth roles
create / update / delete` + `roles add-permission / remove-permission`)
are tracked as Bundle 1 Phase 5.5 follow-up; today, manage custom
roles via the HTTP API or GUI.
### From the HTTP API
Every endpoint is documented in `api/openapi.yaml` under the `[Auth]`
tag. Quick reference:
| Endpoint | Permission |
|---|---|
| `GET /v1/auth/me` | (none — own data) |
| `GET /v1/auth/roles` | `auth.role.list` |
| `GET /v1/auth/roles/{id}` | `auth.role.list` |
| `POST /v1/auth/roles` | `auth.role.create` |
| `PUT /v1/auth/roles/{id}` | `auth.role.edit` |
| `DELETE /v1/auth/roles/{id}` | `auth.role.delete` |
| `GET /v1/auth/permissions` | `auth.role.list` |
| `POST /v1/auth/roles/{id}/permissions` | `auth.role.edit` |
| `DELETE /v1/auth/roles/{id}/permissions/{perm}` | `auth.role.edit` |
| `GET /v1/auth/keys` | `auth.role.list` |
| `POST /v1/auth/keys/{id}/roles` | `auth.role.assign` |
| `DELETE /v1/auth/keys/{id}/roles/{role_id}` | `auth.role.assign` |
| `GET /v1/auth/check` | (authenticated; surfaces effective perms) |
| `GET /v1/auth/bootstrap` + `POST /v1/auth/bootstrap` | (auth-exempt; gated by env-var token) |
### From the MCP server
Bundle 1 Phase 11 ships 12 RBAC tools:
`certctl_auth_me`, `certctl_auth_list_roles`, `certctl_auth_get_role`,
`certctl_auth_create_role`, `certctl_auth_update_role`,
`certctl_auth_delete_role`, `certctl_auth_list_permissions`,
`certctl_auth_add_permission_to_role`,
`certctl_auth_remove_permission_from_role`,
`certctl_auth_list_keys`, `certctl_auth_assign_role_to_key`,
`certctl_auth_revoke_role_from_key`. Each routes through the same
HTTP surface above; permission gates fire server-side.
## The auditor pattern
Hand the auditor key to compliance reviewers. They get:
- `GET /api/v1/audit?category=auth` — every auth/authz mutation
in the system (role creates, role grants on actors, bootstrap
consumption, etc.).
- `GET /api/v1/audit?category=cert_lifecycle` — every cert event.
- `GET /api/v1/audit?category=config` — every issuer / target /
settings edit.
- `GET /api/v1/audit/export` — bulk export.
They do NOT get cert read, profile read, issuer read, or any
mutating permission. The categorization is enforced by the database
CHECK constraint (migration 000032); the WORM trigger from
migration 000018 keeps the audit table append-only at the DB layer.
To create an auditor key:
1. `certctl-cli auth keys assign <key-id> --role r-auditor`
2. (Optional) Revoke any other roles the key holds with
`certctl-cli auth keys revoke <key-id> --role r-...`
3. Confirm via `certctl-cli auth me` while authenticated as the
auditor key — the response should show only `audit.read` and
`audit.export` in `effective_permissions`.
## Day-0 bootstrap (first-admin path)
Bundle 1 Phase 6 ships a one-shot bootstrap endpoint for fresh
deployments where no admin actor exists yet.
1. Set `CERTCTL_BOOTSTRAP_TOKEN=$(openssl rand -hex 32)` in the
server environment.
2. Boot the server. Logs include
"bootstrap endpoint enabled — POST /api/v1/auth/bootstrap to
mint the first admin key (one-shot)" when the path is callable.
3. Run a single curl:
```bash
curl -X POST $URL/api/v1/auth/bootstrap \
-H 'Content-Type: application/json' \
-d '{"token":"<the-token>","actor_name":"first-admin"}'
```
4. Capture the `key_value` from the response. **It is shown ONCE.**
The server never logs it.
5. Use the new key to authenticate against the rest of the API.
The bootstrap path is now closed: subsequent calls return HTTP
410 Gone, even with the same valid token, because an admin
actor exists.
The token is constant-time-compared. The server logs a startup
warning if `CERTCTL_BOOTSTRAP_TOKEN` is set AND admin actors
already exist (config-drift signal). For OIDC-first-admin (the
"first user who signs in via SSO becomes admin" pattern), wait for
Bundle 2.
## Demo mode (`CERTCTL_AUTH_TYPE=none`)
When auth is disabled, the server injects a synthetic actor
`actor-demo-anon` into every request context. That actor holds
`r-admin` at global scope (seeded by migration 000029), so every
gated route resolves with a populated actor and admin grants. The
synthetic actor is reserved: the API rejects any mutation that
targets it (HTTP 409 with `ErrAuthReservedActor`).
Production deployments MUST NOT use demo mode — there is no
per-request actor identity for the audit trail, and every request
flows as admin. Use it for the `docker compose up` demo + the five
example folders only.
## Where to look next
- [Threat model](auth-threat-model.md) — what attacks this primitive
defends against and which it does not
- [Migration guide](../migration/api-keys-to-rbac.md) — moving
pre-Bundle-1 deployments onto RBAC
- [Profiles](../reference/profiles.md) — the `RequiresApproval=true`
flow that Bundle 1 Phase 9 closure protects from flip-flop
- [Approval workflow](approval-workflow.md) — the Rank 7 Infisical
deep-research deliverable that the Phase 9 closure piggybacks on
- `internal/auth/` — the middleware + keystore + RequirePermission
- `internal/service/auth/` — the service-layer Authorizer
- `cowork/auth-bundle-1-prompt.md` — the design + phase plan
- `cowork/auth-bundles-index.md` — the per-phase status tracker
+50 -5
View File
@@ -1,6 +1,6 @@
# certctl Security Posture & Operator Guidance
> Last reviewed: 2026-05-05
> Last reviewed: 2026-05-09
This document collects the operator-facing security guidance that the source
code's per-finding comment blocks reference. Each section names the audit
@@ -75,15 +75,60 @@ the accompanying tests for the format spec.
Bundle B / M-002. Two layers decide auth-exempt status:
1. **Router layer:** `internal/api/router/router.go::AuthExemptRouterRoutes`
— the 4 endpoints registered via direct `r.mux.Handle` without going
— the endpoints registered via direct `r.mux.Handle` without going
through the middleware chain (`/health`, `/ready`, `/api/v1/auth/info`,
`/api/v1/version`).
`/api/v1/version`, plus `/api/v1/auth/bootstrap` GET + POST per
Bundle 1 Phase 6).
2. **Dispatch layer:** `internal/api/router/router.go::AuthExemptDispatchPrefixes`
— URL-prefix routing in `cmd/server/main.go::buildFinalHandler` for
`/.well-known/pki/*`, `/.well-known/est/*`, and `/scep[/...]*`.
`/.well-known/pki/*`, `/.well-known/est/*`, `/.well-known/est-mtls`,
and `/scep[/...]*` (incl. `/scep-mtls`).
Both lists have AST-walking regression tests (`auth_exempt_test.go`) that
fail CI if a new bypass lands without an updating the documented constant.
fail CI if a new bypass lands without updating the documented constant.
### RBAC primitive (Bundle 1)
Bundle 1 ships role-based authorization on top of API-key
authentication. Every gated handler routes through the
`auth.RequirePermission` middleware (or its router-level wrap
`rbacGate`); the middleware resolves the actor's effective
permissions via the service-layer `Authorizer.CheckPermission`
and returns HTTP 403 BEFORE the handler body runs on miss. The
seven default roles (`admin` / `operator` / `viewer` / `agent` /
`mcp` / `cli` / `auditor`), 33-permission canonical catalogue,
and the auditor split (`r-auditor` holds only `audit.read` +
`audit.export`) are seeded by migration 000029.
For the operator how-to, see [`rbac.md`](rbac.md). For the
threat model + compliance mapping, see
[`auth-threat-model.md`](auth-threat-model.md). For the upgrade
flow from a pre-Bundle-1 deployment, see
[`docs/migration/api-keys-to-rbac.md`](../migration/api-keys-to-rbac.md).
### Day-0 admin bootstrap (Bundle 1 Phase 6)
Fresh deployments where no admin actor exists yet can mint the
first admin via `POST /api/v1/auth/bootstrap` — set
`CERTCTL_BOOTSTRAP_TOKEN`, POST a single curl with the token, and
the server returns the plaintext key value once. The token is
constant-time-compared; the strategy is one-shot via mutex; the
admin-existence probe re-closes the path once an admin lands.
The token is NEVER logged. The minted plaintext key flows only
into the HTTP response body. See
[`rbac.md`](rbac.md#day-0-bootstrap-first-admin-path) for the
full flow.
### Approval-bypass closure (Bundle 1 Phase 9)
`CertificateProfile.RequiresApproval=true` profiles route both
issuance/renewal AND profile edits through the
`ApprovalService` two-person integrity gate (Phase 9 closes the
flip-flop loophole where an admin could disable approval, mutate,
re-enable). Same-actor self-approve is rejected at the service
layer with `ErrApproveBySameActor`. See
[`docs/reference/profiles.md`](../reference/profiles.md) for the
full gate semantics.
## Per-user rate limiting
+1 -1
View File
@@ -1,6 +1,6 @@
# Certificate profiles
Last reviewed: 2026-05-09
> Last reviewed: 2026-05-09
A `CertificateProfile` is the policy object that groups every cert with
the same shape: which issuer mints it, which key algorithm + size are