mirror of
https://github.com/shankar0123/certctl.git
synced 2026-06-07 13:41:30 +00:00
e7a94b6080
Closes the last Phase before the Bundle 1 Exit gate. Operators
now have authoritative reference + threat model + migration guide
covering every behavior change Bundles 0-12 introduced.
# New docs
* docs/operator/rbac.md (340 lines) — operator how-to:
- Mental model (actors / roles / permissions / scopes)
- 7 default roles seeded by migration 000029 + the 5
admin-only fine-grained perms seeded by 000030
- Permission catalogue table by namespace
- Scope semantics (global beats specific) + the Bundle-2
deferral on scope_id FK enforcement
- Granting / revoking access from GUI + CLI + HTTP API + MCP
- The auditor pattern (audit-only, no resource read)
- Day-0 bootstrap flow (CERTCTL_BOOTSTRAP_TOKEN → curl →
HTTP 410 thereafter)
- Demo-mode (CERTCTL_AUTH_TYPE=none) caveat for production
* docs/operator/auth-threat-model.md (180 lines) — what the
controls defend against:
- 5 threat actors (external, wrong-role, compromised key,
insider operator, compromised auditor)
- Per-defense walk-through (API-key auth, RBAC, bootstrap,
approval workflow + Phase 9 closure, audit trail,
protocol-endpoint allowlist)
- 9 explicit deferrals (OIDC, sessions, local accounts,
JIT elevation, MFA, etc.) — Bundle 2 / future scope
- Compliance mapping (SOC 2 CC6.1/CC6.3, HIPAA §164.312(b),
NIST SSDF PO.5.2, FedRAMP AU-9, PCI-DSS §10)
- 5 operator-runnable sanity checks (e.g.,
'SELECT FROM audit_events WHERE actor=system-bypass' MUST
return 0 in production)
* docs/migration/api-keys-to-rbac.md (200 lines) — v2.0.x →
v2.1.0 upgrade flow:
- The SECURITY: AUDIT YOUR API KEYS callout
- Migration list (000029-000033) + what each does
- 4-mode scope-down flow (interactive / non-interactive
JSON / --suggest / --suggest --apply)
- What changes for code that called auth.IsAdmin
- Helm-specific upgrade flow with example post-upgrade Job
- Docker Compose upgrade flow + the 5 examples folders
that ride demo mode unchanged
- Verification queries + rollback flow
# Updated docs
* docs/operator/security.md — Last-reviewed bumped to
2026-05-09; existing Authentication-surface section
extended to call out the Bundle 1 RBAC primitive,
day-0 bootstrap path, and approval-bypass closure with
cross-references to the new docs.
* docs/reference/profiles.md — Last-reviewed header
formatting fixed (added the > blockquote prefix used
consistently across the docs tree).
# docs/README.md navigation
* Operator section gains 2 new rows (RBAC + auth-threat-model)
and Approval-workflow row updated to mention Phase 9
closure.
* Reference section gains the Profiles row.
* Migration section gains the api-keys-to-rbac row with the
AUDIT YOUR API KEYS callout in the link description.
# CHANGELOG.md v2.1.0 section refreshed
The Phase 7 commit landed the SECURITY: AUDIT YOUR API KEYS
callout. This commit appends the missing Phase 9-12 highlights:
- Approval-bypass closure (profile-edit gate + flip-flop
loophole + ErrApproveBySameActor invariant)
- GUI: Roles / API Keys / Auth Settings / Approvals queue
- 12 new MCP RBAC tools
- Coverage gates on internal/auth + internal/service/auth
- Protocol-endpoint allowlist pinned at 3 layers
Trailing cross-reference block now points at all 4 new docs.
# Verifications
* Every internal link in the 4 new/modified docs validated by
shell sweep (find broken links → 0 hits).
* Every new doc carries 'Last reviewed: 2026-05-09' header
with the > blockquote prefix matching the docs-tree
convention.
* go vet ./... clean.
* staticcheck across every Bundle-1-touched Go package clean.
* gofmt -l clean repo-wide.
* go test -short -count=1 green across internal/auth (incl.
bootstrap), internal/api/handler, internal/api/router,
internal/cli, internal/service (incl. auth),
internal/domain/auth, internal/mcp, cmd/cli (cmd/server
has 1 environmental failure on the sandbox virtiofs-tmp:
TestPreflightSCEPRACertKey_KeyWorldReadable_Refuses depends
on tmpfs file-mode semantics that virtiofs propagates
differently — pre-existing, unrelated to Bundle 1).
* Frontend: 19 Vitest tests across src/pages/auth/ +
AuditPage all pass; tsc --noEmit clean.
217 lines
9.6 KiB
Markdown
# certctl Security Posture & Operator Guidance

> Last reviewed: 2026-05-09

This document collects the operator-facing security guidance that the source
code's per-finding comment blocks reference. Each section names the audit
finding it closes, the threat model, and the operator action required (if
any).

## OCSP responder availability

**Audit reference:** Bundle C / M-020. CWE-770 (allocation of resources
without limits or throttling); RFC 6960 (OCSP); RFC 7633 (Must-Staple).

certctl ships an OCSP responder at `/.well-known/pki/ocsp/{issuer_id}/{serial}`
that signs a fresh response per request. Before Bundle C, the unauthenticated
handler chain had no rate limit, so an attacker could DoS the responder and
force fail-open relying parties to accept revoked certificates as valid.
Bundle C adds the same per-key rate limiter to the unauthenticated chain that
the authenticated chain has used since Bundle B. The limiter is keyed per
source IP here because OCSP traffic is unauthenticated.

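As a rough illustration of per-key limiting with a source-IP key, the sketch below uses a token bucket built from the standard library only. The type and function names (`keyedLimiter`, `limitByIP`, `sourceIP`) and the budget values are assumptions made for this example; they are not certctl's actual limiter.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"sync"
	"time"
)

// bucket is a token bucket for one key (here: one source IP).
type bucket struct {
	tokens float64
	last   time.Time
}

// keyedLimiter holds one bucket per key. rps and burst are illustrative
// stand-ins for whatever budget the real limiter is configured with.
type keyedLimiter struct {
	mu      sync.Mutex
	rps     float64
	burst   float64
	buckets map[string]*bucket
}

func newKeyedLimiter(rps, burst float64) *keyedLimiter {
	return &keyedLimiter{rps: rps, burst: burst, buckets: make(map[string]*bucket)}
}

// allow refills the key's bucket for the elapsed time, then spends one token.
func (l *keyedLimiter) allow(key string) bool {
	now := time.Now()
	l.mu.Lock()
	defer l.mu.Unlock()
	b, ok := l.buckets[key]
	if !ok {
		b = &bucket{tokens: l.burst, last: now}
		l.buckets[key] = b
	}
	b.tokens += now.Sub(b.last).Seconds() * l.rps
	if b.tokens > l.burst {
		b.tokens = l.burst
	}
	b.last = now
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

// sourceIP strips the port from RemoteAddr. A deployment behind a proxy
// would need a trusted-header policy instead (not shown).
func sourceIP(r *http.Request) string {
	host, _, err := net.SplitHostPort(r.RemoteAddr)
	if err != nil {
		return r.RemoteAddr
	}
	return host
}

// limitByIP answers 429 once the caller's bucket is empty.
func limitByIP(l *keyedLimiter, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !l.allow(sourceIP(r)) {
			http.Error(w, "rate limit exceeded", http.StatusTooManyRequests)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	l := newKeyedLimiter(0.001, 2) // tiny refill rate so the burst is visible
	fmt.Println(l.allow("203.0.113.7"), l.allow("203.0.113.7"), l.allow("203.0.113.7"))
}
```

The sketch never evicts idle buckets; a production limiter would need to bound the map so that the limiter itself does not become a memory-exhaustion vector.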
The rate limiter alone does not solve the underlying revocation-bypass risk.
**The architectural fix is for issued certificates to carry the OCSP
Must-Staple TLS Feature extension** (RFC 7633, OID 1.3.6.1.5.5.7.1.24). When
it is present, conforming TLS clients refuse to complete the handshake unless
the server staples a fresh signed OCSP response. This shifts revocation
enforcement from the client's discretion (most clients fail open by default)
to a hard requirement: the connection cannot complete without proof of
non-revocation.

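To check whether a given certificate actually carries the extension, something like the following sketch can help. It looks for the TLS Feature OID and the `status_request` feature value (5) using only the Go standard library. The helper names `hasMustStaple` and `selfSigned` are made up for this example, and the self-signed certificate exists only to exercise the check.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/asn1"
	"fmt"
	"math/big"
	"time"
)

// oidTLSFeature is the TLS Feature extension OID from RFC 7633.
var oidTLSFeature = asn1.ObjectIdentifier{1, 3, 6, 1, 5, 5, 7, 1, 24}

// statusRequest is the TLS extension number for OCSP stapling.
const statusRequest = 5

// hasMustStaple reports whether cert carries a TLS Feature extension
// that lists status_request.
func hasMustStaple(cert *x509.Certificate) bool {
	for _, ext := range cert.Extensions {
		if !ext.Id.Equal(oidTLSFeature) {
			continue
		}
		var features []int
		if rest, err := asn1.Unmarshal(ext.Value, &features); err != nil || len(rest) != 0 {
			return false
		}
		for _, f := range features {
			if f == statusRequest {
				return true
			}
		}
	}
	return false
}

// selfSigned builds a throwaway certificate for the demo, optionally
// with the Must-Staple extension attached.
func selfSigned(mustStaple bool) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "must-staple-demo"},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(time.Hour),
	}
	if mustStaple {
		val, err := asn1.Marshal([]int{statusRequest}) // SEQUENCE OF INTEGER
		if err != nil {
			return nil, err
		}
		tmpl.ExtraExtensions = []pkix.Extension{{Id: oidTLSFeature, Value: val}}
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

func main() {
	for _, want := range []bool{true, false} {
		cert, err := selfSigned(want)
		if err != nil {
			panic(err)
		}
		fmt.Printf("issued with must-staple=%v, detected=%v\n", want, hasMustStaple(cert))
	}
}
```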
### Operator action

For certificates issued to systems where revocation correctness matters:

1. **Configure the issuer profile to set `must-staple: true`.** Out-of-the-box
   profiles in `migrations/seed.sql` do not set this; operators add it at
   profile-creation time via the API or by editing seed data.
2. **Confirm the relying party honors the extension.** Firefox enforces
   Must-Staple and OpenSSL ≥ 1.1.0 recognizes the extension; Chrome does not
   enforce it, and older clients silently ignore it. Test each relying party
   rather than assuming enforcement.
3. **Confirm the deployment target is configured for OCSP stapling** so the
   server can actually deliver the stapled response in the handshake.
   - **nginx:** `ssl_stapling on; ssl_stapling_verify on;`
   - **Apache:** `SSLUseStapling on`
   - **HAProxy:** `set ssl ocsp-response /path/to/response.der`
   - **Envoy:** `ocsp_staple_policy: MUST_STAPLE`

### What this does NOT cover

- **CRL fallback.** Must-Staple does not affect CRL behavior. Operators with
  CRL-based relying parties should rely on the rate-limit + caching defense
  alone; there is no client-side equivalent to Must-Staple for CRLs.
- **Self-issued certs in air-gapped networks.** When the relying party
  cannot reach the OCSP responder at all (the threat model the audit
  cited), Must-Staple is the only mechanism that closes the bypass. CRL
  distribution similarly requires the relying party to fetch the CRL,
  so it is subject to the same network-availability concern.

## Postgres transport encryption

See [docs/database-tls.md](database-tls.md). Bundle B / M-018.

## Encryption at rest

Bundle B / M-001. Sensitive config columns are encrypted with AES-256-GCM. The
key is derived from the operator-supplied passphrase with PBKDF2-SHA256 at
600,000 rounds (the OWASP 2024 Password Storage Cheat Sheet floor). Ciphertexts
use the v3 blob format with a per-ciphertext random salt; v1/v2 blobs remain
readable as a fallback for legacy rows.
See [internal/crypto/encryption.go](../internal/crypto/encryption.go) and
the accompanying tests for the format spec.

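The derive-then-encrypt flow described above can be sketched as follows. This is a minimal illustration, not the certctl implementation: the blob layout (`salt || nonce || ciphertext`), the 16-byte salt size, and the function names are assumptions, and the real v3 format is defined in `internal/crypto/encryption.go`. PBKDF2 is written out by hand so the example depends only on the standard library.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"encoding/binary"
	"errors"
	"fmt"
)

const (
	rounds   = 600_000 // iteration count quoted in this section
	saltSize = 16      // assumed; the real v3 layout may differ
	keySize  = 32      // AES-256
)

// pbkdf2SHA256 is PBKDF2 with HMAC-SHA256 as the PRF.
func pbkdf2SHA256(password, salt []byte, iter, keyLen int) []byte {
	prf := hmac.New(sha256.New, password)
	out := make([]byte, 0, keyLen+sha256.Size)
	var idx [4]byte
	for block := uint32(1); len(out) < keyLen; block++ {
		prf.Reset()
		prf.Write(salt)
		binary.BigEndian.PutUint32(idx[:], block)
		prf.Write(idx[:])
		u := prf.Sum(nil)
		t := append([]byte(nil), u...)
		for i := 1; i < iter; i++ {
			prf.Reset()
			prf.Write(u)
			u = prf.Sum(u[:0])
			for j := range t {
				t[j] ^= u[j]
			}
		}
		out = append(out, t...)
	}
	return out[:keyLen]
}

// gcmFor derives the key for this salt and wraps it in AES-GCM.
func gcmFor(passphrase string, salt []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(pbkdf2SHA256([]byte(passphrase), salt, rounds, keySize))
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// seal returns salt || nonce || ciphertext with a fresh random salt.
// The layout is illustrative and is NOT the certctl v3 format.
func seal(passphrase string, plaintext []byte) ([]byte, error) {
	salt := make([]byte, saltSize)
	if _, err := rand.Read(salt); err != nil {
		return nil, err
	}
	aead, err := gcmFor(passphrase, salt)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, aead.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	blob := append(salt, nonce...)
	return aead.Seal(blob, nonce, plaintext, nil), nil
}

// open reverses seal; a wrong passphrase fails GCM authentication.
func open(passphrase string, blob []byte) ([]byte, error) {
	if len(blob) < saltSize {
		return nil, errors.New("blob too short")
	}
	aead, err := gcmFor(passphrase, blob[:saltSize])
	if err != nil {
		return nil, err
	}
	rest := blob[saltSize:]
	if len(rest) < aead.NonceSize() {
		return nil, errors.New("blob too short")
	}
	return aead.Open(nil, rest[:aead.NonceSize()], rest[aead.NonceSize():], nil)
}

func main() {
	blob, err := seal("correct horse", []byte("db-password"))
	if err != nil {
		panic(err)
	}
	plain, err := open("correct horse", blob)
	fmt.Println(string(plain), err)
}
```

Because the salt is random per ciphertext, encrypting the same value twice yields different blobs and a different derived key each time, which is the property the per-ciphertext salt is there to provide.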
## Authentication surface

Bundle B / M-002. Two layers decide auth-exempt status:

1. **Router layer:** `internal/api/router/router.go::AuthExemptRouterRoutes`
   lists the endpoints registered via direct `r.mux.Handle` without going
   through the middleware chain (`/health`, `/ready`, `/api/v1/auth/info`,
   `/api/v1/version`, plus `/api/v1/auth/bootstrap` GET + POST per
   Bundle 1 Phase 6).
2. **Dispatch layer:** `internal/api/router/router.go::AuthExemptDispatchPrefixes`
   lists the URL prefixes routed in `cmd/server/main.go::buildFinalHandler`:
   `/.well-known/pki/*`, `/.well-known/est/*`, `/.well-known/est-mtls`,
   and `/scep[/...]*` (incl. `/scep-mtls`).

Both lists have AST-walking regression tests (`auth_exempt_test.go`) that
fail CI if a new bypass lands without updating the documented constant.

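To make the two-layer decision concrete, here is a small sketch of an exact-route check followed by a prefix check. The variable and function names are hypothetical, the lists are copied from the prose above rather than from the source, and the segment-boundary matching rule is an assumption. The authoritative constants live in `internal/api/router/router.go`.

```go
package main

import (
	"fmt"
	"strings"
)

// exemptRoutes is an illustrative copy of the router-layer list.
var exemptRoutes = map[string]bool{
	"/health":                true,
	"/ready":                 true,
	"/api/v1/auth/info":      true,
	"/api/v1/version":        true,
	"/api/v1/auth/bootstrap": true,
}

// exemptRoots is an illustrative copy of the dispatch-layer prefixes.
var exemptRoots = []string{
	"/.well-known/pki",
	"/.well-known/est",
	"/.well-known/est-mtls",
	"/scep",
	"/scep-mtls",
}

// underRoot matches the root itself or anything below it, so that
// "/scep" does not accidentally exempt "/scepter".
func underRoot(path, root string) bool {
	return path == root || strings.HasPrefix(path, root+"/")
}

// isAuthExempt applies the exact-route list first, then the prefixes.
func isAuthExempt(path string) bool {
	if exemptRoutes[path] {
		return true
	}
	for _, root := range exemptRoots {
		if underRoot(path, root) {
			return true
		}
	}
	return false
}

func main() {
	for _, p := range []string{"/health", "/.well-known/pki/ocsp/iss-1/01", "/api/v1/certificates"} {
		fmt.Printf("%-32s exempt=%v\n", p, isAuthExempt(p))
	}
}
```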
### RBAC primitive (Bundle 1)

Bundle 1 ships role-based authorization on top of API-key
authentication. Every gated handler routes through the
`auth.RequirePermission` middleware (or its router-level wrapper
`rbacGate`). The middleware resolves the actor's effective
permissions via the service-layer `Authorizer.CheckPermission`
and, on a miss, returns HTTP 403 before the handler body runs. The
seven default roles (`admin` / `operator` / `viewer` / `agent` /
`mcp` / `cli` / `auditor`), the 33-permission canonical catalogue,
and the auditor split (`r-auditor` holds only `audit.read` +
`audit.export`) are seeded by migration 000029.

For the operator how-to, see [`rbac.md`](rbac.md). For the
threat model + compliance mapping, see
[`auth-threat-model.md`](auth-threat-model.md). For the upgrade
flow from a pre-Bundle-1 deployment, see
[`docs/migration/api-keys-to-rbac.md`](../migration/api-keys-to-rbac.md).

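The "403 before the handler body runs" behavior can be sketched roughly as below. This is not the real `auth.RequirePermission`: the `Authorizer` method signature, the `X-Demo-Actor` header (a stand-in for the authenticated actor that the real middleware would read from request context), and the `certificates.issue` permission name are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Authorizer mirrors the shape described above; the signature is a guess.
type Authorizer interface {
	CheckPermission(actor, permission string) bool
}

// staticAuthorizer is an in-memory actor -> permission set for the demo.
type staticAuthorizer map[string]map[string]bool

func (s staticAuthorizer) CheckPermission(actor, permission string) bool {
	return s[actor][permission]
}

// requirePermission rejects with 403 before next is ever invoked.
func requirePermission(a Authorizer, permission string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		actor := r.Header.Get("X-Demo-Actor") // hypothetical actor source
		if !a.CheckPermission(actor, permission) {
			http.Error(w, "forbidden", http.StatusForbidden)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// statusFor sends one request as actor and reports the status code and
// whether the wrapped handler body ran.
func statusFor(a Authorizer, permission, actor string) (int, bool) {
	ran := false
	h := requirePermission(a, permission, http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		ran = true
		w.WriteHeader(http.StatusOK)
	}))
	req := httptest.NewRequest(http.MethodGet, "/api/v1/audit", nil)
	req.Header.Set("X-Demo-Actor", actor)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code, ran
}

func main() {
	authz := staticAuthorizer{"auditor-key": {"audit.read": true, "audit.export": true}}
	fmt.Println(statusFor(authz, "audit.read", "auditor-key"))         // permission held
	fmt.Println(statusFor(authz, "certificates.issue", "auditor-key")) // permission missing
}
```

The demo actor holds only the two audit permissions, matching the auditor split: it can read the audit trail but is refused on anything else.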
### Day-0 admin bootstrap (Bundle 1 Phase 6)

Fresh deployments where no admin actor exists yet can mint the
first admin via `POST /api/v1/auth/bootstrap`: set
`CERTCTL_BOOTSTRAP_TOKEN`, send a single `curl` POST carrying the token, and
the server returns the plaintext key value once. The token is
compared in constant time, a mutex makes the strategy one-shot, and the
admin-existence probe closes the path again once an admin exists.
The token is NEVER logged. The minted plaintext key flows only
into the HTTP response body. See
[`rbac.md`](rbac.md#day-0-bootstrap-first-admin-path) for the
full flow.

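The three guards named above (constant-time comparison, one-shot mutex, admin-existence probe) might combine along these lines. The `bootstrapper` type, its fields, and the error values are invented for this sketch; the HTTP layer, the key format, and how the real handler maps errors to status codes are not shown.

```go
package main

import (
	"crypto/rand"
	"crypto/subtle"
	"encoding/hex"
	"errors"
	"fmt"
	"sync"
)

var (
	errBadToken = errors.New("bootstrap token mismatch")
	errClosed   = errors.New("bootstrap path closed")
)

// bootstrapper guards the first-admin mint. adminExists stands in for
// the admin-existence probe against the datastore.
type bootstrapper struct {
	mu          sync.Mutex
	token       []byte
	adminExists func() bool
	used        bool
}

func newBootstrapper(token string, adminExists func() bool) *bootstrapper {
	return &bootstrapper{token: []byte(token), adminExists: adminExists}
}

// mint returns a fresh plaintext key exactly once. The presented token
// is never logged or echoed back in an error.
func (b *bootstrapper) mint(presented string) (string, error) {
	b.mu.Lock()
	defer b.mu.Unlock()
	if len(b.token) == 0 || b.used || b.adminExists() {
		return "", errClosed
	}
	// ConstantTimeCompare returns early only when the lengths differ.
	if subtle.ConstantTimeCompare([]byte(presented), b.token) != 1 {
		return "", errBadToken
	}
	raw := make([]byte, 32)
	if _, err := rand.Read(raw); err != nil {
		return "", err
	}
	b.used = true
	return hex.EncodeToString(raw), nil
}

func main() {
	b := newBootstrapper("example-token", func() bool { return false })
	_, err := b.mint("wrong")
	fmt.Println("wrong token:", err)
	key, err := b.mint("example-token")
	fmt.Println("first mint: key length", len(key), "err", err)
	_, err = b.mint("example-token")
	fmt.Println("second mint:", err)
}
```

Holding the mutex across the probe, the comparison, and the mint means two concurrent requests with a valid token cannot both succeed.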
### Approval-bypass closure (Bundle 1 Phase 9)

Profiles with `CertificateProfile.RequiresApproval=true` route both
issuance/renewal AND profile edits through the
`ApprovalService` two-person integrity gate. Phase 9 closes the
flip-flop loophole where an admin could disable approval, make the change,
and re-enable it. Same-actor self-approval is rejected at the service
layer with `ErrApproveBySameActor`. See
[`docs/reference/profiles.md`](../reference/profiles.md) for the
full gate semantics.

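A minimal sketch of the same-actor invariant follows. Only the error name `ErrApproveBySameActor` comes from this document; the error message, the `approvalRequest` type, and the `approve` function are assumptions standing in for the real `ApprovalService`.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrApproveBySameActor is named in the docs; the message text is invented.
var ErrApproveBySameActor = errors.New("approval: requester cannot approve their own request")

// approvalRequest is a hypothetical pending change awaiting a second actor.
type approvalRequest struct {
	ID        string
	Requester string
	Approved  bool
}

// approve enforces two-person integrity: the approver must differ from
// the actor who raised the request.
func approve(req *approvalRequest, approver string) error {
	if approver == req.Requester {
		return ErrApproveBySameActor
	}
	req.Approved = true
	return nil
}

func main() {
	req := &approvalRequest{ID: "req-1", Requester: "alice"}
	fmt.Println(approve(req, "alice"), req.Approved)
	fmt.Println(approve(req, "bob"), req.Approved)
}
```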
## Per-user rate limiting

Bundle B / M-025. Authenticated callers are bucketed by API-key name;
unauthenticated callers (probes, OCSP relying parties, EST/SCEP enrollees)
are bucketed by source IP. `RPS` and `BurstSize` are per-key budgets.
`PerUserRPS` / `PerUserBurstSize` give authenticated clients a separate
budget when set non-zero.

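One way to read the bucketing rule is as a function from caller identity to a bucket key plus a budget. The sketch below assumes float-valued settings, `user:` / `ip:` key prefixes, and independent fallback for each `PerUser*` field; none of these details are confirmed by the source.

```go
package main

import "fmt"

// limits mirrors the four settings named above; the field types are assumed.
type limits struct {
	RPS, BurstSize               float64
	PerUserRPS, PerUserBurstSize float64
}

// bucketFor picks the bucket key and the budget for one caller.
// apiKeyName is empty for unauthenticated traffic.
func bucketFor(cfg limits, apiKeyName, sourceIP string) (key string, rps, burst float64) {
	rps, burst = cfg.RPS, cfg.BurstSize
	if apiKeyName == "" {
		return "ip:" + sourceIP, rps, burst
	}
	// Authenticated callers get the separate budget only when it is set.
	if cfg.PerUserRPS != 0 {
		rps = cfg.PerUserRPS
	}
	if cfg.PerUserBurstSize != 0 {
		burst = cfg.PerUserBurstSize
	}
	return "user:" + apiKeyName, rps, burst
}

func main() {
	cfg := limits{RPS: 10, BurstSize: 20, PerUserRPS: 50, PerUserBurstSize: 100}
	fmt.Println(bucketFor(cfg, "alice", "198.51.100.4"))
	fmt.Println(bucketFor(cfg, "", "198.51.100.4"))
}
```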
## API key rotation

**Audit reference:** L-004. CWE-924 (improper enforcement of message integrity during transmission in a communication channel), operator UX variant.

certctl's API keys are configured via the `CERTCTL_API_KEYS_NAMED` env var
(format `name1:key1,name2:key2:admin`) and parsed at startup into an
in-memory list. There is no DB-resident key store, no GUI, and no
`/api/v1/keys` endpoint: the env var IS the key inventory.

Before Bundle G, the env var rejected duplicate names, so rotating a key
required: stop accepting OLDKEY → restart → roll NEWKEY out. Any client
polling against OLDKEY during the restart window hit a 401.

Bundle G adds a **double-key rotation window**: two entries can share a
name during the rollover, and both keys validate. Operators run the
rotation as:

1. **Generate the new key.** `openssl rand -hex 32` produces a 256-bit
   value with sufficient entropy.

2. **Append the new entry to `CERTCTL_API_KEYS_NAMED`** alongside the
   existing one:
   ```
   CERTCTL_API_KEYS_NAMED="alice:OLDKEY:admin,alice:NEWKEY:admin"
   ```
   Both entries MUST carry the same admin flag; startup fails loudly if
   they don't (a non-admin shouldn't share an identity with an admin).

3. **Restart certctl.** A startup INFO log confirms the rotation window
   is active:
   ```
   INFO api-key rotation window active name=alice entries=2 see=docs/security.md::api-key-rotation
   ```

4. **Roll the new key out to all clients.** Both keys validate during
   this phase. Audit-trail actor + per-user rate-limit bucket stay
   consistent across the rollover (both entries produce the same
   `UserKey` context value, the shared name).

5. **Remove the old entry** from `CERTCTL_API_KEYS_NAMED`:
   ```
   CERTCTL_API_KEYS_NAMED="alice:NEWKEY:admin"
   ```

6. **Restart certctl.** OLDKEY now fails with 401. Rotation complete.

The rotation window has no operator-set timeout: it lasts for as long
as both entries are in the env var. Best practice is a 24-72h window
covering a full deploy cadence; if a client hasn't rolled to NEWKEY by
the end of step 4, extend the window before step 5.

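For operators who script the rotation, steps 1 and 2 could look roughly like this in Go. `newAPIKey` is the `crypto/rand` equivalent of `openssl rand -hex 32`; `rotationValue` is a hypothetical helper that assembles the double-entry value for the rollover window.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// newAPIKey returns 32 random bytes as 64 hex characters.
func newAPIKey() (string, error) {
	raw := make([]byte, 32)
	if _, err := rand.Read(raw); err != nil {
		return "", err
	}
	return hex.EncodeToString(raw), nil
}

// rotationValue builds the env-var value for the rollover window.
// Both entries deliberately share the same name and admin flag.
func rotationValue(name, oldKey, newKey string, admin bool) string {
	suffix := ""
	if admin {
		suffix = ":admin"
	}
	return name + ":" + oldKey + suffix + "," + name + ":" + newKey + suffix
}

func main() {
	newKey, err := newAPIKey()
	if err != nil {
		panic(err)
	}
	// OLDKEY is a placeholder for the key currently in service.
	fmt.Printf("CERTCTL_API_KEYS_NAMED=%q\n", rotationValue("alice", "OLDKEY", newKey, true))
}
```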
### What the contract guarantees

- Two entries with the same `name`: **allowed** if both have the same
  `admin` flag.
- Two entries with the same `name` but mismatched admin: **rejected at
  startup** (privilege escalation guard).
- Two entries with the same `(name, key)` pair: **rejected at startup**
  (typo guard: rotation requires DIFFERENT keys under the same name).
- Single-entry steady state: unchanged from pre-Bundle-G behavior.

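The contract above can be expressed as a small startup validator. This is a sketch under assumptions, not certctl's parser: the `namedKey` type, `parseNamedKeys`, and the error wording are invented, and the handling of malformed entries is a guess. Error messages deliberately name only the entry position or the key name so that key material never reaches the logs.

```go
package main

import (
	"fmt"
	"strings"
)

// namedKey is one parsed `name:key[:admin]` entry.
type namedKey struct {
	Name, Key string
	Admin     bool
}

// parseNamedKeys applies the rotation contract at startup.
func parseNamedKeys(raw string) ([]namedKey, error) {
	var keys []namedKey
	adminByName := map[string]bool{}
	seenPair := map[string]bool{}
	for i, entry := range strings.Split(raw, ",") {
		parts := strings.Split(strings.TrimSpace(entry), ":")
		if len(parts) < 2 || len(parts) > 3 || parts[0] == "" || parts[1] == "" {
			return nil, fmt.Errorf("entry %d: malformed", i+1)
		}
		k := namedKey{Name: parts[0], Key: parts[1]}
		if len(parts) == 3 {
			if parts[2] != "admin" {
				return nil, fmt.Errorf("entry %d: unknown flag", i+1)
			}
			k.Admin = true
		}
		// Same name with a different admin flag: privilege escalation guard.
		if prev, dup := adminByName[k.Name]; dup && prev != k.Admin {
			return nil, fmt.Errorf("name %q: mismatched admin flag", k.Name)
		}
		// Same (name, key) pair twice: typo guard.
		pair := k.Name + ":" + k.Key
		if seenPair[pair] {
			return nil, fmt.Errorf("name %q: duplicate key", k.Name)
		}
		adminByName[k.Name] = k.Admin
		seenPair[pair] = true
		keys = append(keys, k)
	}
	return keys, nil
}

func main() {
	for _, raw := range []string{
		"alice:OLDKEY:admin,alice:NEWKEY:admin",
		"alice:OLDKEY:admin,alice:NEWKEY",
		"alice:OLDKEY:admin,alice:OLDKEY:admin",
	} {
		keys, err := parseNamedKeys(raw)
		fmt.Println(len(keys), err)
	}
}
```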
### What the contract does NOT do

- **No automatic expiration of OLDKEY.** The operator removes the entry
  in step 5; certctl doesn't track timestamps. A future enhancement
  could add a `rotated_at` annotation if operators ask for it.
- **No GUI / API for key management.** Keys are env-var only by design;
  building a key-management surface is a separate feature project.
- **No revocation list.** If a key leaks, the only path is to remove it
  from the env var and restart. That's appropriate for a small env-var
  inventory; it would not scale to a per-user-key-issued model.

## Reporting a vulnerability

Email `certctl@proton.me`. Coordinated disclosure preferred; we will
acknowledge within 72h.