mirror of
https://github.com/shankar0123/certctl.git
synced 2026-06-07 14:11:31 +00:00
56e2ea1ad7
README:
- Rewrite Status block: drop the stale 'federated identity not yet
shipped' line; flag v2.1.0 OIDC + sessions + back-channel logout
+ break-glass as early-access; encourage GitHub issues for IdP
rough edges. (A1 framing — keep early-access umbrella, no
SAML/WebAuthn/JIT roadmap teaser.)
- Add OIDC SSO bullet to 'What it does' covering per-IdP runbooks,
group-claim → role mapping, AES-256-GCM client_secret encryption,
JWKS auto-refresh, PKCE-S256, RFC 9700 §4.7.1 pre-login binding,
RFC 9207 iss check, __Host- cookies, CSRF rotation, idle+absolute
expiry, BCL, break-glass admin.
- Update Security paragraph: three auth paths (API keys / OIDC /
break-glass), HMAC-signed sessions, CSRF rotation, RFC OIDC BCL.
- Correct CI coverage thresholds against
.github/coverage-thresholds.yml (service 70%, handler 75%,
crypto 88%, auth packages 85-95%); 'static analysis' replaces
the inflated '11 linters' claim (actual count is 4 active).
Docs B3 sweep — strip operator-facing 'Bundle N' / 'Phase N' tags:
- docs/operator/auth-threat-model.md — rewrite intro; rename 5 H2
sections (API-key + RBAC defenses / OIDC + sessions + break-glass
defenses / OIDC + sessions threat catalogue / Closed federated-
identity threats / Future-work threats); clean ~12 H3/prose hits.
- docs/operator/rbac.md — strip Bundle 1 framing from intro,
scope_id deferral note, MCP tools section, day-0 bootstrap, and
'Where to look next'.
- docs/operator/auth-benchmarks.md — drop 'Phase 14' framing from
title intro, hardware floor caption, result table caption,
methodology, and pre-merge audit section.
- docs/operator/security.md — already cleaned earlier this session
(RBAC / day-0 / approval-bypass / OIDC federation / sessions /
OIDC first-admin / break-glass H3s).
- docs/operator/oidc-runbooks/{index,keycloak,authentik,okta,
azure-ad}.md — strip Auth Bundle 2 framing + Phase 10/3/4
references; replace with feature-name prose.
- docs/operator/legacy-clients-tls-1.2.md — drop Bundle F / M-023
audit-reference framing; keep CWE-326.
- docs/operator/database-tls.md — drop Bundle B / M-018 framing
from intro + Helm section.
- docs/operator/runbooks/disaster-recovery.md — drop 'Production
hardening II Phase 10' status callout.
- docs/migration/oidc-enable.md — retitle 'Enable OIDC SSO';
strip Bundle 1/2 framing from prereqs, troubleshooting, related
docs; update __Host- cookie callout from 'audit MED-14' to
v2.1.0-BREAKING.
- docs/migration/api-keys-to-rbac.md — strip Bundle 1 framing from
intro, migration table, IsAdmin section, and cross-references.
- docs/migration/acme-from-cert-manager.md — strip residual
'Phase 5' tags from cert-manager integration test references.
- docs/reference/configuration.md — retitle Auth section.
- docs/reference/profiles.md — strip Bundle 1 Phase 9 framing
from RequiresApproval section + Related list.
- docs/reference/auth-standards-implemented.md — rewrite intro
(API-key + RBAC + OIDC + sessions + back-channel logout +
break-glass); rename 'Bundle 1 (RBAC) standards covered
separately' H2; clean per-row Phase references.
- docs/README.md — rewrite nav-table entries to drop Bundle 1/2
parentheticals; retitle 'Enable OIDC SSO' migration entry.
No code or test changes; pure operator-facing prose polish for
the v2.1.0 tag.
295 lines
11 KiB
Markdown
# Migrating API keys to RBAC (v2.0.x → v2.1.0)

> Last reviewed: 2026-05-09

This is the upgrade guide for an existing certctl deployment moving
from v2.0.x's "every API key is admin or not" model to v2.1.0's
RBAC primitive. Everything keeps working through the upgrade - the
migration backfills every existing API key to the
`r-admin` role on first boot, so the pre-existing automation that
was using those keys does not change behavior. **However**, most
keys do not need full admin power; this guide walks the operator
through the post-upgrade scope-down flow.

## ⚠️ SECURITY: AUDIT YOUR API KEYS

v2.1.0 maps **every** existing `CERTCTL_API_KEYS_NAMED` entry
(and every legacy `CERTCTL_AUTH_SECRET`-synthesized key) to the
`r-admin` role on the first boot after migration 000029 applies.
This is the safe-for-back-compat default - your CI / agents / scripts
keep working without changes - but if you don't downgrade keys, every
key in your fleet has full admin permissions including bulk-revoke,
CRL admin, and CA hierarchy management.

**Run the scope-down flow before tagging the next release.** The
release notes for v2.1.0 lead with this callout for a reason.

## Upgrade flow

### 1. Apply the migration

The migration runner is idempotent. Re-applying is a no-op if the
schema is already at the target version. The five RBAC migrations
that ship in v2.1.0:

| Migration | What it does |
|---|---|
| `000029_rbac.up.sql` | Creates `tenants`, `roles`, `permissions`, `role_permissions`, `actor_roles`. Seeds 7 default roles + 33-permission catalogue + the synthetic `actor-demo-anon` admin grant. Backfills every named API key into `actor_roles` with the `r-admin` role. |
| `000030_rbac_admin_perms.up.sql` | Seeds 5 admin-only fine-grained permissions (`cert.bulk_revoke`, `crl.admin`, `scep.admin`, `est.admin`, `ca.hierarchy.manage`) into `r-admin` only. |
| `000031_api_keys.up.sql` | Creates the `api_keys` table for runtime-minted keys (day-0 bootstrap path). |
| `000032_audit_category.up.sql` | Adds `event_category` column to `audit_events` with the closed enum (`cert_lifecycle` / `auth` / `config`). |
| `000033_approval_kinds.up.sql` | Adds `approval_kind` + `payload` to `issuance_approval_requests` for the approval-bypass closure. |

The v2.1.0 server applies these on first boot. No operator
action is required other than running the upgrade.

### 2. Verify the backfill landed

```bash
# Inspect the seeded actor_roles rows. You should see one row per
# entry in CERTCTL_API_KEYS_NAMED (Admin=true keys → r-admin,
# Admin=false keys → r-viewer) plus the seeded actor-demo-anon
# admin row.
psql -d certctl -c "SELECT actor_id, role_id, granted_by, granted_at FROM actor_roles ORDER BY granted_at;"
```

If the table is empty, the boot-loader hook in
`cmd/server/auth_backfill.go::backfillNamedKeyActorRoles` did not
run; re-check that `CERTCTL_AUTH_TYPE` is `api-key` (the boot
hook is gated on `cfg.Auth.Type != none`).

### 3. List + scope-down keys

The `certctl-cli` ships a four-mode scope-down command. Pick the
mode that matches your fleet size + automation posture.

#### Interactive walk

```bash
certctl-cli auth keys scope-down
```

Walks every actor (skips the synthetic `actor-demo-anon`) and
prompts for a target role. Empty input keeps the existing role.
Type one of `admin`, `operator`, `viewer`, `agent`, `mcp`, `cli`,
`auditor` to replace.

#### Non-interactive JSON config (Helm post-upgrade hook)

```bash
cat > scope-down.json <<EOF
{
  "ci-bot": "operator",
  "agent-prod-1": "agent",
  "agent-prod-2": "agent",
  "monitoring-bot": "viewer",
  "compliance-bot": "auditor"
}
EOF

certctl-cli auth keys scope-down --non-interactive ./scope-down.json
```

Empty role values revoke every current grant WITHOUT granting a
replacement; assign roles selectively with
`certctl-cli auth keys assign`.
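
Because a typo in a role name only surfaces when the CLI runs, it can help to generate the map from a plain list and validate it first. The helper below is a minimal sketch, not part of `certctl-cli`: the function name `build_scope_down_json` and the one-pair-per-line input format are assumptions made for illustration, and it assumes actor IDs contain no whitespace or double quotes. It deliberately rejects empty roles so an accidental blank cannot revoke every grant.

```bash
# Hypothetical helper (not shipped with certctl): reads "actor role"
# pairs from stdin, one per line, and prints the JSON map used by
# --non-interactive. Fails on any role that is not one of the seven
# names listed in this guide.
build_scope_down_json() (
  body=''
  while read -r actor role; do
    [ -n "$actor" ] || continue            # skip blank lines
    case "$role" in
      admin|operator|viewer|agent|mcp|cli|auditor) ;;
      *) echo "unknown role '$role' for actor '$actor'" >&2; exit 1 ;;
    esac
    # Separate entries with a comma and a newline.
    [ -z "$body" ] || body="$body,
"
    body="$body  \"$actor\": \"$role\""
  done
  printf '{\n%s\n}\n' "$body"
)
```

A possible invocation is `build_scope_down_json < actors.txt > scope-down.json`, followed by the `--non-interactive` command above; `actors.txt` is a placeholder file name.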

#### Audit-driven suggestion

```bash
# Preview suggestions based on the last 30 days of audit history
certctl-cli auth keys scope-down --suggest

# Apply the suggestions
certctl-cli auth keys scope-down --suggest --apply
```

The classifier (pure function in `internal/cli/auth_scope_down.go::SuggestRoleFromAuditEvents`)
walks the actor's audit events and emits one of:

| Suggestion | Trigger |
|---|---|
| `admin` | Any `auth.role.*` / `auth.key.*` / `ca.hierarchy.*` / `*.bulk_revoke` / `*.admin` action |
| `mcp` | All observed actions are MCP-shaped (`mcp.*`) |
| `viewer` | All observed actions are read-only (`*.read` or `*.list`) |
| `agent` | All observed actions are agent-shaped (`agent.*`, `cert.read`, `cert.issue`) |
| `operator` | Cert / profile / target lifecycle mutations without admin signals |

The classifier is conservative - when in doubt, it prefers the
narrower role. The operator confirms each suggestion before any
mutation lands (unless `--apply` is set).
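
To make the trigger table concrete, here is a rough shell rendering of the same decision order. It is a sketch of the table only, not the Go implementation in `SuggestRoleFromAuditEvents`: the function name `suggest_role`, the top-to-bottom precedence between rows, and the choice to emit no suggestion when there is no audit history are all assumptions.

```bash
# Hypothetical sketch of the trigger table. Takes audit action names
# as arguments and prints one suggested role.
suggest_role() (
  [ "$#" -gt 0 ] || exit 1     # assumption: no history, no suggestion
  all_mcp=1 all_read=1 all_agent=1
  for action in "$@"; do
    # Any single admin signal decides the outcome immediately.
    case "$action" in
      auth.role.*|auth.key.*|ca.hierarchy.*|*.bulk_revoke|*.admin)
        echo admin; exit 0 ;;
    esac
    # Otherwise track which "all observed actions are ..." rows still hold.
    case "$action" in mcp.*) ;; *) all_mcp=0 ;; esac
    case "$action" in *.read|*.list) ;; *) all_read=0 ;; esac
    case "$action" in agent.*|cert.read|cert.issue) ;; *) all_agent=0 ;; esac
  done
  if [ "$all_mcp" -eq 1 ]; then echo mcp
  elif [ "$all_read" -eq 1 ]; then echo viewer
  elif [ "$all_agent" -eq 1 ]; then echo agent
  else echo operator
  fi
)
```

For example, `suggest_role cert.read cert.list` prints `viewer`, while adding `cert.bulk_revoke` to the same list prints `admin`. Action names other than those quoted in the table are illustrative.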

### 4. Mint a fresh admin via bootstrap (optional, for fresh deployments)

If you're standing up a fresh deployment instead of upgrading an
existing one, the bootstrap path mints the first admin key without
needing the operator to know the env-var format:

```bash
# Set the bootstrap token in the server environment.
export CERTCTL_BOOTSTRAP_TOKEN=$(openssl rand -hex 32)

# Boot the server. Logs include "bootstrap endpoint enabled".
docker compose up -d

# Mint the first admin key. $URL is the base URL of your certctl server.
curl -X POST "$URL/api/v1/auth/bootstrap" \
  -H 'Content-Type: application/json' \
  -d '{"token":"'"$CERTCTL_BOOTSTRAP_TOKEN"'","actor_name":"first-admin"}'
```

The response carries the plaintext `key_value` once. Capture it
and use it as the Bearer token for subsequent calls. Subsequent
bootstrap calls return HTTP 410 Gone.
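
Since the plaintext key is shown only once, it is worth capturing it into a variable rather than copying it from the terminal. The snippet below is a minimal sketch: it assumes the response is a JSON object with a string `key_value` field, as described above, and uses `sed` only so it runs without extra tools. `extract_key_value` is a hypothetical helper name; prefer a real JSON parser such as `jq` where one is installed.

```bash
# Hypothetical helper: reads a JSON document on stdin and prints the
# value of its "key_value" string field. Works only when the key and
# its value sit on the same line.
extract_key_value() {
  sed -n 's/.*"key_value"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Usage against a live server ($URL is a placeholder for its base URL):
#   ADMIN_KEY=$(curl -fsS -X POST "$URL/api/v1/auth/bootstrap" \
#     -H 'Content-Type: application/json' \
#     -d '{"token":"'"$CERTCTL_BOOTSTRAP_TOKEN"'","actor_name":"first-admin"}' \
#     | extract_key_value)
#   curl -fsS -H "Authorization: Bearer $ADMIN_KEY" "$URL/api/v1/audit?category=auth"
```

Store the captured value in your secret manager straight away; the server will not return it again.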

See [`docs/operator/rbac.md`](../operator/rbac.md) for the full
bootstrap flow + the threat model.

## What changes for code that called `IsAdmin`

In v2.0.x, the five admin handlers checked `auth.IsAdmin(ctx)`
directly in the body. v2.1.0 moved those checks to
the router via the `auth.RequirePermission` middleware (wrapped
through the `rbacGate` helper in
`internal/api/router/router.go`). The behavior contract is
unchanged: `r-admin`-roled callers reach the handler, anyone else
gets HTTP 403 BEFORE the body runs.

If your code consumed `auth.IsAdmin` directly (it shouldn't -
the helper is internal), the new convention is:

1. Wrap the route in `rbacGate(reg.Checker, "<perm>", handler)`
   in `router.go`.
2. Add the perm to `migrations/000030_rbac_admin_perms.up.sql`
   (or `migrations/000029_rbac.up.sql`'s catalogue).
3. Grant the perm to the right default roles.

The five admin-only fine-grained perms stay on `r-admin` only by
default. Operators delegate by creating custom roles with the
specific perm.

## Helm-specific upgrade

The certctl Helm chart applies migrations on container start via
the standard migrations runner. No chart changes are required;
the `helm upgrade` command runs identically:

```bash
helm upgrade certctl certctl/certctl \
  --version <new-version> \
  --reuse-values
```

Post-upgrade, the boot loader runs the named-key actor-role
backfill against the `CERTCTL_API_KEYS_NAMED` env var injected
into the deployment. The "AUDIT YOUR API KEYS" callout applies -
add a post-upgrade Job to your release pipeline that runs
`certctl-cli auth keys scope-down --non-interactive` against a
checked-in JSON config, so the role narrowing is deterministic
across upgrade rollouts.

Example post-upgrade Job:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: certctl-scope-down
spec:
  template:
    spec:
      containers:
        - name: scope-down
          image: ghcr.io/certctl-io/certctl-cli:<tag>
          command:
            - certctl-cli
            - auth
            - keys
            - scope-down
            - --non-interactive
            - /config/scope-down.json
          envFrom:
            - secretRef:
                name: certctl-cli-credentials
          volumeMounts:
            - name: scope-down-config
              mountPath: /config
      volumes:
        - name: scope-down-config
          configMap:
            name: certctl-scope-down-config
      restartPolicy: OnFailure
```

The ConfigMap holds the `{actor_id: role_id}` map; the Secret
holds the API key the Job uses to call `/v1/auth/keys/.../roles`.

## Docker Compose-specific upgrade

For `deploy/docker-compose.yml` deployments:

1. Pull the new images: `docker compose pull`
2. Verify your `CERTCTL_AUTH_TYPE` value before restarting. If it
   was `none` (the demo path), the post-upgrade server will boot
   in demo mode again - the synthetic `actor-demo-anon` admin
   covers every request, so no scope-down is meaningful. If you're
   moving from `none` to `api-key` mode, set
   `CERTCTL_API_KEYS_NAMED` first, then restart.
3. `docker compose up -d` to apply.
4. `docker compose logs certctl-server | grep -i 'loaded persisted api_keys'`
   to verify the boot loader ran. The first-boot log line includes
   the count of keys loaded into the runtime keystore.
5. Run `certctl-cli auth keys scope-down` against the running
   server.

The five examples in `examples/` (acme-nginx, private-ca-traefik,
step-ca-haproxy, multi-issuer, acme-wildcard-dns01) all run in
demo mode (`CERTCTL_AUTH_TYPE=none`) and are unaffected by the
RBAC migration - the synthetic `actor-demo-anon` admin grant covers
every request.

## Verifying the upgrade landed

After the scope-down flow completes:

1. `certctl-cli auth me` while authenticated as each named key
   confirms the right `effective_permissions` for that role.
2. `psql -c "SELECT actor_id, array_agg(role_id ORDER BY role_id) FROM actor_roles GROUP BY actor_id;"`
   gives the full picture in one query.
3. The audit trail
   (`GET /api/v1/audit?category=auth`)
   shows the `auth.role.assign` and `auth.role.revoke` rows for
   every change you made - confirm via the GUI's
   `/audit?category=auth` view.
4. Read the updated [`docs/operator/rbac.md`](../operator/rbac.md)
   for day-2 RBAC management.
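
The SQL check above can be turned into a pass/fail drift test for a pipeline. The function below is a sketch under stated assumptions: `grants_drift` is a hypothetical name, both inputs are files of `actor_id role_id` lines, and role IDs other than `r-admin` and `r-viewer` (for example `r-operator`) are guesses at the naming pattern rather than values confirmed by this guide.

```bash
# Hypothetical helper: compares expected grants ($1) with actual grants
# ($2). Prints "- line" for expected-but-missing grants and "+ line"
# for present-but-unexpected ones; exits non-zero on any difference.
grants_drift() (
  exp=$(mktemp) act=$(mktemp)
  trap 'rm -f "$exp" "$act"' EXIT
  sort -u "$1" >"$exp"
  sort -u "$2" >"$act"
  missing=$(comm -23 "$exp" "$act" | sed 's/^/- /')
  extra=$(comm -13 "$exp" "$act" | sed 's/^/+ /')
  [ -z "$missing" ] || printf '%s\n' "$missing"
  [ -z "$extra" ] || printf '%s\n' "$extra"
  [ -z "$missing$extra" ]
)

# Producing the "actual" file from the database:
#   psql -d certctl -At -F ' ' \
#     -c "SELECT actor_id, role_id FROM actor_roles ORDER BY 1, 2;" > actual.txt
#   grants_drift expected.txt actual.txt
```

Remember that the seeded `actor-demo-anon` admin row will appear in the actual output, so include it in the expected file or filter it out.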

## Rollback

If the upgrade goes wrong, the down migrations exist in lockstep:

```bash
# Roll back via your migration runner (golang-migrate, Atlas, etc.).
# Migrations 000029-000033 each have a .down.sql that reverses the
# .up.sql. Down migrations are destructive on data added by the up
# migration (api_keys rows, role grants on actors, profile-edit
# approvals); take a backup first.
```

After rollback, the v2.0.x binary works against the v2.0.x
schema unchanged. The operator's API keys still authenticate (the
in-memory hash table is rebuilt from `CERTCTL_API_KEYS_NAMED` on
boot regardless of schema version).
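
If your runner is the golang-migrate CLI, the rollback could look roughly like the wrapper below. This is a sketch, not a documented certctl procedure: `rollback_rbac`, `MIGRATIONS_DIR`, `DATABASE_URL`, and `CONFIRM` are names invented here, and `down 5` is only correct when the schema sits exactly at 000033, so check the current version first. The wrapper prints the command by default and runs it only when explicitly confirmed.

```bash
# Hypothetical wrapper around the golang-migrate CLI. Steps back the
# five RBAC migrations (000033 down to 000029). Dry run unless
# CONFIRM=yes is set; take a database backup before confirming.
rollback_rbac() (
  set -- migrate -path "${MIGRATIONS_DIR:-./migrations}" \
    -database "${DATABASE_URL:?set DATABASE_URL first}" down 5
  if [ "${CONFIRM:-no}" = yes ]; then
    "$@"
  else
    printf 'dry run:'; printf ' %s' "$@"; printf '\n'
  fi
)
```

Running `rollback_rbac` with `DATABASE_URL` set prints the exact command for review; `CONFIRM=yes rollback_rbac` executes it.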

## Cross-references

- [`docs/operator/rbac.md`](../operator/rbac.md) - the operator
  how-to for the new RBAC primitive
- [`docs/operator/auth-threat-model.md`](../operator/auth-threat-model.md) -
  what the new controls defend against
- [`docs/reference/profiles.md`](../reference/profiles.md) - the
  approval-bypass closure on `RequiresApproval` profile edits
- [`docs/operator/security.md`](../operator/security.md) - the
  full security posture
- `CHANGELOG.md` - the v2.1.0 release notes lead with this guide