README:
- Rewrite Status block: drop the stale 'federated identity not yet
shipped' line; flag v2.1.0 OIDC + sessions + back-channel logout
+ break-glass as early-access; encourage GitHub issues for IdP
rough edges. (A1 framing — keep early-access umbrella, no
SAML/WebAuthn/JIT roadmap teaser.)
- Add OIDC SSO bullet to 'What it does' covering per-IdP runbooks,
group-claim → role mapping, AES-256-GCM client_secret encryption,
JWKS auto-refresh, PKCE-S256, RFC 9700 §4.7.1 pre-login binding,
RFC 9207 iss check, __Host- cookies, CSRF rotation, idle+absolute
expiry, BCL, break-glass admin.
- Update Security paragraph: three auth paths (API keys / OIDC /
break-glass), HMAC-signed sessions, CSRF rotation, RFC OIDC BCL.
- Correct CI coverage thresholds against
.github/coverage-thresholds.yml (service 70%, handler 75%,
crypto 88%, auth packages 85-95%); 'static analysis' replaces
the inflated '11 linters' claim (actual count is 4 active).
Docs B3 sweep — strip operator-facing 'Bundle N' / 'Phase N' tags:
- docs/operator/auth-threat-model.md — rewrite intro; rename 5 H2
sections (API-key + RBAC defenses / OIDC + sessions + break-glass
defenses / OIDC + sessions threat catalogue / Closed federated-
identity threats / Future-work threats); clean ~12 H3/prose hits.
- docs/operator/rbac.md — strip Bundle 1 framing from intro,
scope_id deferral note, MCP tools section, day-0 bootstrap, and
'Where to look next'.
- docs/operator/auth-benchmarks.md — drop 'Phase 14' framing from
title intro, hardware floor caption, result table caption,
methodology, and pre-merge audit section.
- docs/operator/security.md — already cleaned earlier this session
(RBAC / day-0 / approval-bypass / OIDC federation / sessions /
OIDC first-admin / break-glass H3s).
- docs/operator/oidc-runbooks/{index,keycloak,authentik,okta,
azure-ad}.md — strip Auth Bundle 2 framing + Phase 10/3/4
references; replace with feature-name prose.
- docs/operator/legacy-clients-tls-1.2.md — drop Bundle F / M-023
audit-reference framing; keep CWE-326.
- docs/operator/database-tls.md — drop Bundle B / M-018 framing
from intro + Helm section.
- docs/operator/runbooks/disaster-recovery.md — drop 'Production
hardening II Phase 10' status callout.
- docs/migration/oidc-enable.md — retitle 'Enable OIDC SSO';
strip Bundle 1/2 framing from prereqs, troubleshooting, related
docs; update __Host- cookie callout from 'audit MED-14' to
v2.1.0-BREAKING.
- docs/migration/api-keys-to-rbac.md — strip Bundle 1 framing from
intro, migration table, IsAdmin section, and cross-references.
- docs/migration/acme-from-cert-manager.md — strip residual
'Phase 5' tags from cert-manager integration test references.
- docs/reference/configuration.md — retitle Auth section.
- docs/reference/profiles.md — strip Bundle 1 Phase 9 framing
from RequiresApproval section + Related list.
- docs/reference/auth-standards-implemented.md — rewrite intro
(API-key + RBAC + OIDC + sessions + back-channel logout +
break-glass); rename 'Bundle 1 (RBAC) standards covered
separately' H2; clean per-row Phase references.
- docs/README.md — rewrite nav-table entries to drop Bundle 1/2
parentheticals; retitle 'Enable OIDC SSO' migration entry.
No code or test changes; pure operator-facing prose polish for
the v2.1.0 tag.
11 KiB
Migrating API keys to RBAC (v2.0.x → v2.1.0)
Last reviewed: 2026-05-09
This is the upgrade guide for an existing certctl deployment moving
from v2.0.x's "every API key is admin or not" model to v2.1.0's
RBAC primitive. Everything keeps working through the upgrade - the
migration backfills every existing API key to the
r-admin role on first boot, so the pre-existing automation that
was using those keys does not change behavior. However, most
keys do not need full admin power; this guide walks the operator
through the post-upgrade scope-down flow.
⚠️ SECURITY: AUDIT YOUR API KEYS
v2.1.0 maps every existing CERTCTL_API_KEYS_NAMED entry
(and every legacy CERTCTL_AUTH_SECRET-synthesized key) to the
r-admin role on the first boot after migration 000029 applies.
This is the safe-for-back-compat default - your CI / agents / scripts
keep working without changes - but if you don't downgrade keys, every
key in your fleet has full admin permissions including bulk-revoke,
CRL admin, and CA hierarchy management.
Run the scope-down flow before tagging the next release. The release notes for v2.1.0 lead with this callout for a reason.
Upgrade flow
1. Apply the migration
The migration runner is idempotent. Re-applying is a no-op if the schema is already at the target version. The five RBAC migrations that ship in v2.1.0:
| Migration | What it does |
|---|---|
000029_rbac.up.sql |
Creates tenants, roles, permissions, role_permissions, actor_roles. Seeds 7 default roles + 33-permission catalogue + the synthetic actor-demo-anon admin grant. Backfills every named API key into actor_roles with the r-admin role. |
000030_rbac_admin_perms.up.sql |
Seeds 5 admin-only fine-grained permissions (cert.bulk_revoke, crl.admin, scep.admin, est.admin, ca.hierarchy.manage) into r-admin only. |
000031_api_keys.up.sql |
Creates the api_keys table for runtime-minted keys (day-0 bootstrap path). |
000032_audit_category.up.sql |
Adds event_category column to audit_events with the closed enum (cert_lifecycle / auth / config). |
000033_approval_kinds.up.sql |
Adds approval_kind + payload to issuance_approval_requests for the approval-bypass closure. |
The v2.1.0 server applies these on first boot. No operator action is required other than running the upgrade.
2. Verify the backfill landed
# Inspect the seeded actor_roles rows. You should see one row per
# entry in CERTCTL_API_KEYS_NAMED (Admin=true keys → r-admin,
# Admin=false keys → r-viewer) plus the seeded actor-demo-anon
# admin row.
psql -d certctl -c "SELECT actor_id, role_id, granted_by, granted_at FROM actor_roles ORDER BY granted_at;"
If the table is empty, the boot-loader hook in
cmd/server/auth_backfill.go::backfillNamedKeyActorRoles did not
run; re-check that CERTCTL_AUTH_TYPE is api-key (the boot
hook is gated on cfg.Auth.Type != none).
3. List + scope-down keys
The certctl-cli ships a four-mode scope-down command. Pick the
mode that matches your fleet size + automation posture.
Interactive walk
certctl-cli auth keys scope-down
Walks every actor (skips the synthetic actor-demo-anon) and
prompts for a target role. Empty input keeps the existing role.
Type one of admin, operator, viewer, agent, mcp, cli,
auditor to replace.
Non-interactive JSON config (Helm post-upgrade hook)
cat > scope-down.json <<EOF
{
"ci-bot": "operator",
"agent-prod-1": "agent",
"agent-prod-2": "agent",
"monitoring-bot": "viewer",
"compliance-bot": "auditor"
}
EOF
certctl-cli auth keys scope-down --non-interactive ./scope-down.json
Empty role values revoke every current grant WITHOUT granting a
replacement; assign roles selectively with
certctl-cli auth keys assign.
Audit-driven suggestion
# Preview suggestions based on the last 30 days of audit history
certctl-cli auth keys scope-down --suggest
# Apply the suggestions
certctl-cli auth keys scope-down --suggest --apply
The classifier (pure function in internal/cli/auth_scope_down.go::SuggestRoleFromAuditEvents)
walks the actor's audit events and emits one of:
| Suggestion | Trigger |
|---|---|
admin |
Any auth.role.* / auth.key.* / ca.hierarchy.* / *.bulk_revoke / *.admin action |
mcp |
All observed actions are MCP-shaped (mcp.*) |
viewer |
All observed actions are read-only (*.read or *.list) |
agent |
All observed actions are agent-shaped (agent.*, cert.read, cert.issue) |
operator |
Cert / profile / target lifecycle mutations without admin signals |
The classifier is conservative - when in doubt, it prefers the
narrower role. The operator confirms each suggestion before any
mutation lands (unless --apply is set).
4. Mint a fresh admin via bootstrap (optional, for fresh deployments)
If you're standing up a fresh deployment instead of upgrading an existing one, the bootstrap path mints the first admin key without needing the operator to know the env-var format:
# Set the bootstrap token in the server environment.
export CERTCTL_BOOTSTRAP_TOKEN=$(openssl rand -hex 32)
# Boot the server. Logs include "bootstrap endpoint enabled".
docker compose up -d
# Mint the first admin key.
curl -X POST $URL/api/v1/auth/bootstrap \
-H 'Content-Type: application/json' \
-d '{"token":"'$CERTCTL_BOOTSTRAP_TOKEN'","actor_name":"first-admin"}'
The response carries the plaintext key_value once. Capture it
and use it as the Bearer token for subsequent calls. Subsequent
bootstrap calls return HTTP 410 Gone.
See docs/operator/rbac.md for the full
bootstrap flow + the threat model.
What changes for code that called IsAdmin
In v2.0.x, the five admin handlers checked auth.IsAdmin(ctx)
directly in the body. v2.1.0 moved those checks to
the router via the auth.RequirePermission middleware (wrapped
through the rbacGate helper in
internal/api/router/router.go). The behavior contract is
unchanged: r-admin-roled callers reach the handler, anyone else
gets HTTP 403 BEFORE the body runs.
If your code consumed auth.IsAdmin directly (it shouldn't -
the helper is internal), the new convention is:
- Wrap the route in
rbacGate(reg.Checker, "<perm>", handler)inrouter.go. - Add the perm to
migrations/000030_rbac_admin_perms.up.sql(ormigrations/000029_rbac.up.sql's catalogue). - Grant the perm to the right default roles.
The five admin-only fine-grained perms stay on r-admin only by
default. Operators delegate by creating custom roles with the
specific perm.
Helm-specific upgrade
The certctl Helm chart applies migrations on container start via
the standard migrations runner. No chart changes are required;
the helm upgrade command runs identically:
helm upgrade certctl certctl/certctl \
--version <new-version> \
--reuse-values
Post-upgrade, the boot loader runs the named-key actor-role
backfill against the CERTCTL_API_KEYS_NAMED env-var-injected
into the deployment. The "AUDIT YOUR API KEYS" callout applies -
add a post-upgrade Job to your release pipeline that runs
certctl-cli auth keys scope-down --non-interactive against a
checked-in JSON config, so the role narrowing is deterministic
across upgrade rollouts.
Example post-upgrade Job:
apiVersion: batch/v1
kind: Job
metadata:
name: certctl-scope-down
spec:
template:
spec:
containers:
- name: scope-down
image: ghcr.io/certctl-io/certctl-cli:<tag>
command:
- certctl-cli
- auth
- keys
- scope-down
- --non-interactive
- /config/scope-down.json
envFrom:
- secretRef:
name: certctl-cli-credentials
volumeMounts:
- name: scope-down-config
mountPath: /config
volumes:
- name: scope-down-config
configMap:
name: certctl-scope-down-config
restartPolicy: OnFailure
The ConfigMap holds the {actor_id: role_id} map; the Secret
holds the API key the Job uses to call /v1/auth/keys/.../roles.
Docker Compose-specific upgrade
For deploy/docker-compose.yml deployments:
- Pull the new images:
docker compose pull - Verify your
CERTCTL_AUTH_TYPEvalue before restarting. If it wasnone(the demo path), the post-upgrade server will boot in demo mode again - the syntheticactor-demo-anonadmin covers every request, no scope-down is meaningful. If you're moving fromnonetoapi-keymode, setCERTCTL_API_KEYS_NAMEDfirst, then restart. docker compose up -dto apply.docker compose logs certctl-server | grep -i 'loaded persisted api_keys'to verify the boot loader ran. The first-boot log line includes the count of keys loaded into the runtime keystore.- Run
certctl-cli auth keys scope-downagainst the running server.
The five examples in examples/ (acme-nginx, private-ca-traefik,
step-ca-haproxy, multi-issuer, acme-wildcard-dns01) all run in
demo mode (CERTCTL_AUTH_TYPE=none) and are unaffected by the
RBAC migration - the synthetic actor-demo-anon admin grant covers
every request.
Verifying the upgrade landed
After the scope-down flow completes:
certctl-cli auth mewhile authenticated as each named key confirms the righteffective_permissionsfor that role.psql -c "SELECT actor_id, array_agg(role_id ORDER BY role_id) FROM actor_roles GROUP BY actor_id;"gives the full picture in one query.- The audit trail
(
GET /api/v1/audit?category=auth) shows theauth.role.assignandauth.role.revokerows for every change you made - confirm via the GUI's/audit?category=authview. - Read the updated
docs/operator/rbac.mdfor day-2 RBAC management.
Rollback
If the upgrade goes wrong, the down migrations exist in lockstep:
# Roll back via your migration runner (golang-migrate, Atlas, etc.).
# Migrations 000029-000033 each have a .down.sql that reverses the
# .up.sql. Down migrations are destructive on data added by the up
# migration (api_keys rows, role grants on actors, profile-edit
# approvals); take a backup first.
After rollback, the v2.0.x binary works against the v2.0.x
schema unchanged. The operator's API keys still authenticate (the
in-memory hash table is rebuilt from CERTCTL_API_KEYS_NAMED on
boot regardless of schema version).
Cross-references
docs/operator/rbac.md- the operator how-to for the new RBAC primitivedocs/operator/auth-threat-model.md- what the new controls defend againstdocs/reference/profiles.md- the approval-bypass closure onRequiresApprovalprofile editsdocs/operator/security.md- the full security postureCHANGELOG.md- the v2.1.0 release notes lead with this guide