mirror of https://github.com/shankar0123/certctl.git synced 2026-06-07 19:41:30 +00:00

shankar0123 69a2b5c55a config: default hardening + operator docs (Phase 2 closure — SEC-H1, SEC-H3, SEC-M4, DEPL-H1, DEPL-M2 + doc-only carve-outs)

Eleven findings from the architecture diligence audit's Phase 2 bundle
closed in one PR. All touch the same backend config + Helm chart +
operator docs surface, so reviewing in one diff is the natural fit.

config.go: three new fail-closed Validate() branches behind sentinels
=====================================================================

Three new error sentinels exported from internal/config/config.go for
tests to pin via errors.Is + message-text:
  - ErrAgentBootstrapTokenRequired (SEC-H1)
  - ErrACMEInsecureWithoutAck      (SEC-M4)
  - ErrDemoModeAckExpired          (SEC-H3)

SEC-H1 (staged): introduces CERTCTL_AGENT_BOOTSTRAP_TOKEN_DENY_EMPTY
as an opt-in feature flag. When true AND the bootstrap token is empty,
Validate() returns ErrAgentBootstrapTokenRequired and the server
refuses to start. Default in THIS release: false (warn-mode
pass-through preserved). WORKSPACE-ROADMAP.md schedules the default
flip to true for v2.2.0 — operators get one upgrade window.

SEC-M4: upgrades the existing boot-time WARN log for
CERTCTL_ACME_INSECURE=true into a hard refuse-to-start gate behind
CERTCTL_ACME_INSECURE_ACK=true. The ACK env var must be paired with
the existing INSECURE flag; either alone fails closed. The boot-time
WARN log at cmd/server/main.go:611 continues to fire for the ACK'd
case so every restart logs the reminder.

SEC-H3: tightens the sticky DemoModeAck bit so it expires after 24h.
When DemoModeAck=true, Validate() now requires CERTCTL_DEMO_MODE_ACK_TS
to be set as a unix-epoch timestamp within the last 24h (24h-tolerance
on the past side, 1-minute clock-skew on the future side). Catches the
"forgotten demo deployment promoted to production" failure mode —
next container restart past 24h refuses unless re-ack'd.

Tests in internal/config/config_test.go cover every new branch:
positive (passes when properly set), negative (each fail-closed path
fires with the matching sentinel + message-text). 11 new tests added.

Helm chart + HA runbook (DEPL-H1)
=================================

Created docs/operator/runbooks/ha.md documenting the three values
flips required for production HA: server.replicas, podDisruptionBudget,
service.sessionAffinity. Cross-link comments added to
deploy/helm/certctl/values.yaml next to the server.replicas (line 19)
and podDisruptionBudget (line 566) defaults. DEFAULTS DO NOT CHANGE
— that's the point per the prompt's 'do not flip networkPolicy default'
guidance: a default-enabled PDB blocks fresh helm install on
single-node clusters.

CI guard (DEPL-M2)
==================

scripts/ci-guards/no-change-me-in-prod-compose.sh grep-fails any
'change-me-' literal in compose files OTHER than docker-compose.demo.yml.
Catches the placeholder-credential-leak regression one layer earlier
than the runtime Validate() fail-closed guards from Bundle 2 (2026-05-12).
Excludes comment lines so docs explaining the pattern don't trip the
guard. Verified to fire on a synthetic leak; clean on the current tree.

Consolidated 'Security carve-outs' doc section
==============================================

docs/operator/security.md grows by one new section documenting the
eight existing carve-outs in one canonical place:
  - SEC-M3: 3 InsecureSkipVerify=true sites (Agent dev, verify probe, tlsprobe)
  - SEC-M5: F5 connector InsecureSkipVerify per-config field
  - SEC-M4: ACME insecure + new ACK gate
  - SEC-L1: CSP 'unsafe-inline' on style-src (Tailwind carve-out)
  - SEC-L2: break-glass Argon2id rest-defense reminder
  - SEC-L3: 1 MB body-size cap + CERTCTL_MAX_BODY_SIZE override
  - DEPL-M2: change-me-* placeholder credentials in demo overlay
  - DEPL-M3: K8s NetworkPolicy operator-opt-in default

Each entry cites the file:line, the rationale for the carve-out, and
the operator action.

CHANGELOG + ENVIRONMENTS coverage
==================================

CHANGELOG.md grows by one new '### Breaking changes (scheduled for
v2.2.0)' section under Unreleased, documenting SEC-H1 / SEC-M4 / SEC-H3
with explicit upgrade-window guidance for each.

deploy/ENVIRONMENTS.md adds five rows: AGENT_BOOTSTRAP_TOKEN +
AGENT_BOOTSTRAP_TOKEN_DENY_EMPTY + DEMO_MODE_ACK + DEMO_MODE_ACK_TS +
ACME_INSECURE_ACK. G-3 env-docs-drift CI guard stays clean.

WORKSPACE-ROADMAP.md (cowork-side) schedules the SEC-H1 default-flip
for v2.2.0.

Sandbox limitation
==================

The certctl repo's working tree is 6.1 GB which fills the sandbox
volume; the go1.25.10 toolchain download (go.mod requires it,
sandbox has 1.25.9) keeps failing on disk-full. Local 'go build' /
'go test' were NOT run in this commit's verification path.
make verify MUST be run on the operator's workstation before push
per CLAUDE.md operating rules.

CI guards (no-change-me, G-3 env-docs-drift, doc-rot-detector, +
all existing) verified clean by running each individually.

Closes: cowork/certctl-architecture-diligence-audit.html#fix-SEC-H1,
        cowork/certctl-architecture-diligence-audit.html#fix-SEC-H3,
        cowork/certctl-architecture-diligence-audit.html#fix-SEC-M4,
        cowork/certctl-architecture-diligence-audit.html#fix-DEPL-H1,
        cowork/certctl-architecture-diligence-audit.html#fix-DEPL-M2,
        cowork/certctl-architecture-diligence-audit.html#fix-DEPL-M3,
        cowork/certctl-architecture-diligence-audit.html#fix-SEC-M3,
        cowork/certctl-architecture-diligence-audit.html#fix-SEC-M5,
        cowork/certctl-architecture-diligence-audit.html#fix-SEC-L1,
        cowork/certctl-architecture-diligence-audit.html#fix-SEC-L2,
        cowork/certctl-architecture-diligence-audit.html#fix-SEC-L3

2026-05-13 19:50:00 +00:00


certctl Security Posture & Operator Guidance

Last reviewed: 2026-05-11

This document collects the operator-facing security guidance that the source code's per-finding comment blocks reference. Each section names the audit finding it closes, the threat model, and the operator action required (if any).

OCSP responder availability

Audit reference: CWE-770 (allocation of resources without limits or throttling); RFC 6960 (OCSP); RFC 7633 (Must-Staple).

certctl ships an OCSP responder at /.well-known/pki/ocsp/{issuer_id}/{serial} that signs a fresh response per request. The unauth handler chain applies the same per-key rate limiter the authenticated chain uses; per-IP keying applies because OCSP traffic is unauthenticated. Without this defense an attacker could DoS the responder and force fail-open relying parties to accept revoked certificates as valid.

The rate limiter alone does not solve the underlying revocation-bypass risk. The architectural fix is for issued certificates to carry the OCSP Must-Staple TLS Feature extension (RFC 7633, OID 1.3.6.1.5.5.7.1.24). When it is present, conforming TLS clients refuse to negotiate a session unless the server staples a fresh signed OCSP response in the TLS handshake. This shifts revocation enforcement from the client's discretion (most clients fail open by default) to a hard requirement: the connection cannot complete without proof of non-revocation.
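A minimal Go sketch of how an operator might check that an issued certificate actually carries the extension. The helper names `hasMustStaple` and `demoExtensions` are hypothetical and not part of certctl; the sketch assumes the extension value is a DER SEQUENCE OF INTEGER that lists `status_request` (5), as RFC 7633 describes.

```go
package main

import (
	"crypto/x509/pkix"
	"encoding/asn1"
	"fmt"
)

// oidTLSFeature is the TLS Feature extension OID from RFC 7633.
var oidTLSFeature = asn1.ObjectIdentifier{1, 3, 6, 1, 5, 5, 7, 1, 24}

// statusRequest is the TLS extension number for OCSP stapling (RFC 6066).
const statusRequest = 5

// hasMustStaple reports whether exts contains a TLS Feature extension that
// lists status_request. Hypothetical helper, for illustration only.
func hasMustStaple(exts []pkix.Extension) bool {
	for _, ext := range exts {
		if !ext.Id.Equal(oidTLSFeature) {
			continue
		}
		var features []int
		rest, err := asn1.Unmarshal(ext.Value, &features)
		if err != nil || len(rest) != 0 {
			return false // malformed extension value
		}
		for _, f := range features {
			if f == statusRequest {
				return true
			}
		}
	}
	return false
}

// demoExtensions builds a synthetic extension list for the demo below.
func demoExtensions() []pkix.Extension {
	value, err := asn1.Marshal([]int{statusRequest})
	if err != nil {
		panic(err)
	}
	return []pkix.Extension{{Id: oidTLSFeature, Value: value}}
}

func main() {
	fmt.Println("with extension:", hasMustStaple(demoExtensions()))
	fmt.Println("without extension:", hasMustStaple(nil))
}
```

Against a real certificate, the same check would take the `Extensions` field of a parsed `*x509.Certificate`.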

Operator action

For certificates issued to systems where revocation correctness matters:

Configure the issuer profile to set must-staple: true. Out-of-the-box profiles in migrations/seed.sql do not set this; operators add it at profile-creation time via the API or by editing seed data.
Confirm the relying party honors the extension. Support is uneven: Firefox enforces Must-Staple, Chrome does not, and with OpenSSL ≥ 1.1.0 enforcement depends on the calling application. Clients without support silently ignore the extension.
Confirm the deployment target is configured for OCSP stapling so the server can actually deliver the stapled response in the handshake.

nginx: ssl_stapling on; ssl_stapling_verify on;
Apache: SSLUseStapling on
HAProxy: set ssl ocsp-response /path/to/response.der
Envoy: ocsp_staple_policy: MUST_STAPLE

What this does NOT cover

CRL fallback. Must-Staple does not affect CRL behavior. Operators with CRL-based relying parties should use the rate-limit + caching defense alone; there is no client-side equivalent to Must-Staple for CRLs.
Self-issued certs in air-gapped networks. When the relying party cannot reach the OCSP responder at all (the threat model the audit cited), Must-Staple is the only mechanism that closes the bypass. CRL distribution similarly requires the relying party to fetch the CRL, which is also subject to the same network-availability concern.

Postgres transport encryption

See docs/database-tls.md.

Encryption at rest

Sensitive config columns are encrypted with AES-256-GCM. The key is derived from the operator-supplied passphrase with PBKDF2-SHA256 at 600,000 rounds (the OWASP 2024 Password Storage Cheat Sheet floor). Ciphertexts use the v3 blob format with a per-ciphertext random salt; v1/v2 blobs remain readable as a fallback for legacy rows. See internal/crypto/encryption.go and the accompanying tests for the format spec.

Authentication surface

Two layers decide auth-exempt status:

Router layer: internal/api/router/router.go::AuthExemptRouterRoutes covers the endpoints registered via direct r.mux.Handle without going through the middleware chain (/health, /ready, /api/v1/auth/info, /api/v1/version, plus /api/v1/auth/bootstrap GET + POST for the first-admin path).

Dispatch layer: internal/api/router/router.go::AuthExemptDispatchPrefixes covers the URL-prefix routing in cmd/server/main.go::buildFinalHandler for /.well-known/pki/*, /.well-known/est/*, /.well-known/est-mtls, and /scep[/...]* (incl. /scep-mtls).

Both lists have AST-walking regression tests (auth_exempt_test.go) that fail CI if a new bypass lands without updating the documented constant.

Role-based authorization

Role-based authorization runs on top of API-key authentication. Every gated handler routes through the auth.RequirePermission middleware (or its router-level wrap rbacGate); the middleware resolves the actor's effective permissions via the service-layer Authorizer.CheckPermission and returns HTTP 403 BEFORE the handler body runs on miss. The seven default roles (admin / operator / viewer / agent / mcp / cli / auditor), 33-permission canonical catalogue, and the auditor split (r-auditor holds only audit.read + audit.export) are seeded by migration 000029.

For the operator how-to, see rbac.md. For the threat model + compliance mapping, see auth-threat-model.md. For the upgrade flow from an API-key-only deployment, see docs/migration/api-keys-to-rbac.md.

Day-0 admin bootstrap

Fresh deployments where no admin actor exists yet can mint the first admin via POST /api/v1/auth/bootstrap - set CERTCTL_BOOTSTRAP_TOKEN, POST a single curl with the token, and the server returns the plaintext key value once. The token is constant-time-compared; the strategy is one-shot via mutex; the admin-existence probe re-closes the path once an admin lands. The token is NEVER logged. The minted plaintext key flows only into the HTTP response body. See rbac.md for the full flow.
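The constant-time comparison and the one-shot mutex described above can be sketched in Go as follows. This is an illustration under assumptions, not certctl's implementation: the type `bootstrapGate`, its methods, and the error values are hypothetical, and the admin-existence probe against the database is omitted.

```go
package main

import (
	"crypto/subtle"
	"errors"
	"fmt"
	"sync"
)

var (
	errBootstrapClosed = errors.New("bootstrap: path closed")
	errBootstrapDenied = errors.New("bootstrap: invalid token")
)

// bootstrapGate guards a one-shot bootstrap token (hypothetical type).
type bootstrapGate struct {
	mu       sync.Mutex
	expected []byte
	used     bool
}

func newBootstrapGate(token string) *bootstrapGate {
	return &bootstrapGate{expected: []byte(token)}
}

// redeem succeeds at most once. The mutex serializes concurrent attempts so
// two racing requests cannot both mint an admin. The token is never logged.
func (g *bootstrapGate) redeem(presented string) error {
	g.mu.Lock()
	defer g.mu.Unlock()
	if g.used || len(g.expected) == 0 {
		return errBootstrapClosed
	}
	// ConstantTimeCompare does not leak where the inputs differ; it does
	// return early when the lengths differ.
	if subtle.ConstantTimeCompare([]byte(presented), g.expected) != 1 {
		return errBootstrapDenied
	}
	g.used = true
	return nil
}

func main() {
	gate := newBootstrapGate("example-token") // placeholder value
	fmt.Println(gate.redeem("wrong-token"))
	fmt.Println(gate.redeem("example-token"))
	fmt.Println(gate.redeem("example-token"))
}
```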

Approval-bypass closure

CertificateProfile.RequiresApproval=true profiles route both issuance/renewal AND profile edits through the ApprovalService two-person integrity gate. The flip-flop loophole (an admin disabling approval, mutating, re-enabling) is closed by gating profile-edit through the same approval flow. Same-actor self-approve is rejected at the service layer with ErrApproveBySameActor. See docs/reference/profiles.md for the full gate semantics.

OIDC federation

OIDC SSO runs on top of the API-key + RBAC foundation. Operators configure one or more identity providers (Keycloak, Authentik, Okta, Auth0, Entra ID, or Google Workspace via Keycloak broker); end users sign in at the IdP, certctl validates the returned ID token, and a session cookie is minted.

The token-validation pipeline pins:

Algorithm allow-list: RS256 / RS512 / ES256 / ES384 / EdDSA only. HS256 / HS384 / HS512 / none are rejected at the service-layer sentinel level.
IdP-downgrade-attack defense at provider creation AND every RefreshKeys: the IdP's advertised id_token_signing_alg_values_supported is intersected with the allow-list; a provider that advertises HS-family is rejected before any token signed under the weak alg can be accepted.
Exact iss match (ErrIssuerMismatch).
aud membership + azp for multi-aud tokens (per OIDC core §3.1.3.7 step 5).
at_hash REQUIRED-when-access_token-present (a tightening of the spec MAY → MUST so a substituted access token cannot ride alongside a clean ID token).
Single-use state + nonce (32-byte random server-generated; atomic DELETE...RETURNING on consume).
PKCE-S256 mandatory; plain rejected.
Configurable iat window (default 300s, capped 600s).
JWKS cache with operator-triggered RefreshKeys + auto-refresh on TTL expiry (default 3600s); JWKS-fetch failure during a key rotation returns 503 to the in-flight login (existing sessions untouched).

OIDC client_secret is encrypted at rest via AES-256-GCM (v3 blob format: magic 0x03 + salt(16) + nonce(12) + ciphertext+tag) using the CERTCTL_CONFIG_ENCRYPTION_KEY passphrase. The encryption invariant is pinned by an integration test (internal/repository/postgres/oidc_encryption_invariant_test.go) that asserts ciphertext != plaintext + correct blob shape + round-trip recovery + wrong-passphrase fails.
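To make the v3 layout concrete, here is a Go sketch that splits a blob into its fields. The names `parseV3Blob` and `v3Blob` are hypothetical; the authoritative format lives in internal/crypto/encryption.go. The 16-byte minimum for the trailing section assumes the standard AES-GCM tag length.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	magicV3  = 0x03
	saltLen  = 16
	nonceLen = 12
	tagLen   = 16 // standard AES-GCM authentication tag (assumed)
)

var (
	errTooShort = errors.New("blob: too short for v3 layout")
	errBadMagic = errors.New("blob: not a v3 blob")
)

// v3Blob holds the fields of magic(1) + salt(16) + nonce(12) + ciphertext+tag.
type v3Blob struct {
	Salt       []byte
	Nonce      []byte
	Ciphertext []byte // includes the trailing GCM tag
}

// parseV3Blob validates the framing only; it performs no decryption.
func parseV3Blob(b []byte) (v3Blob, error) {
	if len(b) < 1+saltLen+nonceLen+tagLen {
		return v3Blob{}, errTooShort
	}
	if b[0] != magicV3 {
		return v3Blob{}, errBadMagic
	}
	saltEnd := 1 + saltLen
	nonceEnd := saltEnd + nonceLen
	return v3Blob{
		Salt:       b[1:saltEnd],
		Nonce:      b[saltEnd:nonceEnd],
		Ciphertext: b[nonceEnd:],
	}, nil
}

func main() {
	// Synthetic blob: magic byte followed by zeroed salt, nonce, and tag.
	blob := append([]byte{magicV3}, make([]byte, saltLen+nonceLen+tagLen)...)
	parsed, err := parseV3Blob(blob)
	fmt.Println(len(parsed.Salt), len(parsed.Nonce), len(parsed.Ciphertext), err)
}
```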

Per-IdP setup guides at oidc-runbooks/index.md cover Keycloak, Authentik, Okta, Auth0, Entra ID, and Google Workspace.

Sessions + back-channel logout

Successful OIDC login mints a session cookie: v1.<session_id>.<signing_key_id>.<base64url-no-pad(HMAC-SHA256)>. The HMAC input is length-prefixed as len:sid:len:kid to defeat concatenation-collision attacks on bare-concat designs. Cookie attributes:

HttpOnly=true (no JS access; defends XSS cookie theft).
Secure=true (HTTPS-only; defends network MITM).
SameSite=Lax default (configurable to Strict via CERTCTL_SESSION_SAMESITE).
Path=/, host-only.
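The cookie value format described above can be sketched in Go as follows. This is an illustration, not certctl's code: the exact byte encoding of the length-prefixed HMAC input is an assumption (`<len>:<sid>:<len>:<kid>` with decimal lengths), the function names are hypothetical, and the lookup of the signing key by key ID is omitted.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"fmt"
	"strings"
)

// macInput length-prefixes both fields so ("ab","c") and ("a","bc") produce
// different inputs, unlike a bare concatenation.
func macInput(sid, kid string) []byte {
	return []byte(fmt.Sprintf("%d:%s:%d:%s", len(sid), sid, len(kid), kid))
}

func computeMAC(key []byte, sid, kid string) []byte {
	m := hmac.New(sha256.New, key)
	m.Write(macInput(sid, kid))
	return m.Sum(nil)
}

// signCookie builds v1.<sid>.<kid>.<base64url-no-pad(HMAC-SHA256)>.
// It assumes sid and kid contain no "." characters.
func signCookie(key []byte, sid, kid string) string {
	sig := base64.RawURLEncoding.EncodeToString(computeMAC(key, sid, kid))
	return "v1." + sid + "." + kid + "." + sig
}

// verifyCookie checks the framing and the MAC in constant time.
func verifyCookie(key []byte, cookie string) (sid, kid string, ok bool) {
	parts := strings.Split(cookie, ".")
	if len(parts) != 4 || parts[0] != "v1" {
		return "", "", false
	}
	got, err := base64.RawURLEncoding.DecodeString(parts[3])
	if err != nil || !hmac.Equal(got, computeMAC(key, parts[1], parts[2])) {
		return "", "", false
	}
	return parts[1], parts[2], true
}

func main() {
	key := []byte("demo-signing-key") // placeholder, not a real key
	cookie := signCookie(key, "sess123", "key1")
	fmt.Println(verifyCookie(key, cookie))
	fmt.Println(verifyCookie([]byte("other-key"), cookie))
}
```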

Idle timeout default 1h; absolute timeout default 8h; both configurable via CERTCTL_SESSION_IDLE_TIMEOUT and CERTCTL_SESSION_ABSOLUTE_TIMEOUT. The scheduler's sessionGCLoop (default 1h interval) sweeps expired rows.

CSRF defense: plaintext CSRF token in the JS-readable certctl_csrf cookie (intentionally HttpOnly=false for the GUI to echo into the X-CSRF-Token header); SHA-256 hash on the session row; subtle.ConstantTimeCompare in CSRFMiddleware. API-key actors are CSRF-exempt (no session row in context).
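A minimal Go sketch of the CSRF check, assuming the session row stores the SHA-256 hash of the token and the request echoes the plaintext in the X-CSRF-Token header. The function names are hypothetical, and the 32-byte token length is an assumption.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"crypto/subtle"
	"encoding/base64"
	"fmt"
)

// newCSRFToken returns the plaintext token for the JS-readable cookie and
// the hash that would be stored on the session row.
func newCSRFToken() (token string, hash [32]byte, err error) {
	raw := make([]byte, 32)
	if _, err = rand.Read(raw); err != nil {
		return "", hash, err
	}
	token = base64.RawURLEncoding.EncodeToString(raw)
	return token, sha256.Sum256([]byte(token)), nil
}

// csrfMatches hashes the header value and compares it with the stored hash
// in constant time.
func csrfMatches(header string, stored [32]byte) bool {
	got := sha256.Sum256([]byte(header))
	return subtle.ConstantTimeCompare(got[:], stored[:]) == 1
}

func main() {
	token, stored, err := newCSRFToken()
	if err != nil {
		panic(err)
	}
	fmt.Println(csrfMatches(token, stored))
	fmt.Println(csrfMatches("forged", stored))
}
```

Storing only the hash means a read-only leak of the session table does not hand an attacker a usable CSRF token.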

Session signing keys rotate via RotateSigningKey; the old key stays valid for CERTCTL_SESSION_SIGNING_KEY_RETENTION (default 24h) so existing cookies validate during rollover. Past retention, the old key's row is dropped and any cookie still signed under it returns ErrSigningKeyNotFound. EnsureInitialSigningKey is fail-fatal at server boot.

Back-channel logout per OpenID Connect Back-Channel Logout 1.0 (NOT RFC 8414): POST /auth/oidc/back-channel-logout accepts a JWT-signed logout token from the IdP, validates the JWT against the IdP's JWKS (same alg allow-list as login), pins required claims (iss / aud / iat / jti / events; exactly one of sub / sid; nonce MUST be absent), defeats replay via jti-based deduplication, and revokes matching sessions.
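The claim pinning and jti deduplication can be sketched in Go as below. This covers only the checks that run after signature validation; the types `logoutClaims` and `replayGuard` are hypothetical, and the in-memory map stands in for whatever store certctl actually uses. The events member name is the one defined by the Back-Channel Logout 1.0 specification.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// backchannelEvent is the required member of the "events" claim.
const backchannelEvent = "http://schemas.openid.net/event/backchannel-logout"

var (
	errMissingClaim = errors.New("logout token: required claim missing")
	errSubSid       = errors.New("logout token: need exactly one of sub / sid")
	errNoncePresent = errors.New("logout token: nonce must be absent")
	errReplay       = errors.New("logout token: jti already seen")
)

// logoutClaims holds already-decoded claims (signature checked elsewhere).
type logoutClaims struct {
	Iss, Aud, Jti, Sub, Sid, Nonce string
	Iat                            int64
	Events                         map[string]interface{}
}

// replayGuard remembers every jti it has accepted.
type replayGuard struct {
	mu   sync.Mutex
	seen map[string]bool
}

func newReplayGuard() *replayGuard { return &replayGuard{seen: map[string]bool{}} }

func (g *replayGuard) check(c logoutClaims) error {
	if c.Iss == "" || c.Aud == "" || c.Iat == 0 || c.Jti == "" {
		return errMissingClaim
	}
	if _, ok := c.Events[backchannelEvent]; !ok {
		return errMissingClaim
	}
	if (c.Sub == "") == (c.Sid == "") {
		return errSubSid // both set or both empty
	}
	if c.Nonce != "" {
		return errNoncePresent
	}
	g.mu.Lock()
	defer g.mu.Unlock()
	if g.seen[c.Jti] {
		return errReplay
	}
	g.seen[c.Jti] = true
	return nil
}

func main() {
	guard := newReplayGuard()
	claims := logoutClaims{
		Iss: "https://idp.example.com", Aud: "certctl", Jti: "abc", Sid: "s1", Iat: 1,
		Events: map[string]interface{}{backchannelEvent: map[string]interface{}{}},
	}
	fmt.Println(guard.check(claims))
	fmt.Println(guard.check(claims))
}
```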

For threat-model coverage of these surfaces, see auth-threat-model.md. For the operator-runnable performance baselines, see auth-benchmarks.md.

OIDC first-admin bootstrap

Coexists with the env-var-token bootstrap path. When the operator sets CERTCTL_BOOTSTRAP_ADMIN_GROUPS + (optionally) CERTCTL_BOOTSTRAP_OIDC_PROVIDER_ID, the first user with one of those IdP groups becomes admin on first login per tenant. Subsequent users go through normal mapping. The admin-existence probe ensures only one wins between the two bootstrap paths; once any actor holds r-admin, the OIDC bootstrap hook silently falls through to normal mapping. Audit row on every grant (bootstrap.oidc_first_admin, event_category=auth).

Break-glass admin

Default-OFF (CERTCTL_BREAKGLASS_ENABLED=false). When enabled, the local-password admin path bypasses OIDC + group-claim layers; intended ONLY for SSO-broken incidents.

Argon2id with m=64 MiB, t=3, p=4, a 16-byte per-password random salt, 32-byte output, and a PHC-format hash. These values match the second recommended option in RFC 9106 and exceed the OWASP 2024 minimums. Hash column is json:"-" so handlers cannot wire-leak.
Lockout state machine: 5 failures (default; configurable via CERTCTL_BREAKGLASS_LOCKOUT_THRESHOLD) within 1h reset window (_LOCKOUT_RESET_INTERVAL) trips a 30s lockout (_LOCKOUT_DURATION). Atomic single-statement IncrementFailure defeats concurrent racing attempts.
Constant-time across all failure paths via verifyDummy() — wrong-password / locked-account / no-actor all take statistically indistinguishable time.
Surface invisibility: when disabled, ALL four endpoints return HTTP 404 (NOT 403). Scanners cannot distinguish "endpoint disabled" from "endpoint doesn't exist".
WARN log at server boot when ENABLED=true; audit row on every break-glass login (auth.breakglass_login_*, event_category=auth); WebAuthn/FIDO2 second factor pairing on the v3 roadmap (Decision 12).
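The lockout state machine from the list above can be sketched in memory as follows. This is a simplified single-goroutine model under assumptions: the source describes an atomic single-statement database update, which this sketch does not reproduce, and the exact reset-window semantics (counting from the first failure in the window) are a guess. All names are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// lockout tracks failures for one account.
type lockout struct {
	threshold    int
	resetWindow  time.Duration
	duration     time.Duration
	failures     int
	firstFailure time.Time
	lockedUntil  time.Time
}

// newDefaultLockout uses the defaults stated in this document.
func newDefaultLockout() *lockout {
	return &lockout{threshold: 5, resetWindow: time.Hour, duration: 30 * time.Second}
}

func (l *lockout) locked(now time.Time) bool { return now.Before(l.lockedUntil) }

// recordFailure counts a failed login and trips the lockout at the threshold.
func (l *lockout) recordFailure(now time.Time) {
	if l.failures == 0 || now.Sub(l.firstFailure) > l.resetWindow {
		// Start a fresh counting window.
		l.failures = 0
		l.firstFailure = now
	}
	l.failures++
	if l.failures >= l.threshold {
		l.lockedUntil = now.Add(l.duration)
		l.failures = 0
	}
}

// at converts seconds since the Unix epoch to a time (demo helper).
func at(sec int64) time.Time { return time.Unix(sec, 0) }

func main() {
	l := newDefaultLockout()
	for i := 0; i < 5; i++ {
		l.recordFailure(at(0))
	}
	fmt.Println(l.locked(at(29)), l.locked(at(30)))
}
```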

Operator should DISABLE break-glass within 24h of SSO recovery to avoid a permanent backdoor; the runbook at auth-threat-model.md#break-glass-risks-phase-75 documents the full state machine.

Demo-to-production cutover (Audit 2026-05-11 A-8)

Migration 000029_rbac.up.sql unconditionally seeds an actor-demo-anon → r-admin row into actor_roles. This row is the runtime principal injected by the demo-mode middleware when CERTCTL_AUTH_TYPE=none. Under any non-none auth type the row is DORMANT — the middleware chain never resolves to it. But its existence is a footgun: a future regression that resolves an unauthenticated request to actor-demo-anon (a misrouted CORS preflight, a fallback in a new auth-exempt route) would silently re-elevate to admin.

certctl-server detects this residue at startup and emits a WARN log + an auth.demo_residual_grants_detected audit row listing every grant present on actor-demo-anon. Every production deploy will see this WARN on first boot — the migration baseline is part of the install, not a side effect of running demo mode.

Operator workflow at production cutover:

Drain the WARN by calling the cleanup endpoint with an admin API key:
```
curl -X POST --cacert deploy/test/certs/ca.crt \
     -H "Authorization: Bearer $ADMIN_KEY" \
     https://certctl.example.com:8443/api/v1/auth/demo-residual/cleanup
# → {"removed": 1}
```
The endpoint is gated auth.role.assign (admin-class) and refuses to run when CERTCTL_AUTH_TYPE=none (HTTP 503 — the residue IS the active runtime state at that auth type). The cleanup is idempotent; a second call returns {"removed": 0} and still leaves an audit row.

Equivalent SQL for operators preferring direct DB access:
```
DELETE FROM actor_roles WHERE actor_id = 'actor-demo-anon';
```
To make subsequent boots refuse startup if the row reappears (the most paranoid stance), set:
```
CERTCTL_DEMO_MODE_RESIDUAL_STRICT=true
```
With the flag set, any actor-demo-anon row under a non-none auth type causes certctl-server to log the WARN AND exit non-zero before binding the HTTPS listener. Default is false (WARN only).
The CI guard scripts/ci-guards/no-new-synthetic-admin.sh pins the set of source files that may reference the actor-demo-anon literal. New runtime code paths that resolve to the synthetic actor are rejected at PR time so the credibility gap stays closed.

Migrating an existing deployment to OIDC

An existing API-key-only deployment that wants to add OIDC follows the step-by-step at docs/migration/oidc-enable.md: configure CERTCTL_CONFIG_ENCRYPTION_KEY, pick + configure an IdP per the relevant runbook, configure the certctl-side OIDCProvider and group→role mappings, verify the login flow against a single test user, then announce the SSO endpoint to the rest of the organization.

Per-user rate limiting

Authenticated callers are bucketed by API-key name; unauthenticated callers (probes, OCSP relying parties, EST/SCEP enrollees) are bucketed by source IP. RPS and BurstSize are per-key budgets. PerUserRPS / PerUserBurstSize give authenticated clients a separate budget when set non-zero.
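The bucketing rule can be illustrated with a short Go sketch. The function `bucketKey` and the `user:` / `ip:` prefixes are hypothetical; the sketch assumes the caller already knows the authenticated API-key name (empty when unauthenticated) and the connection's remote address.

```go
package main

import (
	"fmt"
	"net"
)

// bucketKey picks the rate-limit bucket: the API-key name for authenticated
// callers, otherwise the source IP without its port.
func bucketKey(apiKeyName, remoteAddr string) string {
	if apiKeyName != "" {
		return "user:" + apiKeyName
	}
	host, _, err := net.SplitHostPort(remoteAddr)
	if err != nil {
		host = remoteAddr // address had no port; use it as-is
	}
	return "ip:" + host
}

func main() {
	fmt.Println(bucketKey("alice", "10.0.0.1:51234"))
	fmt.Println(bucketKey("", "10.0.0.1:51234"))
	fmt.Println(bucketKey("", "[2001:db8::1]:443"))
}
```

Keying on the name rather than the key value is what keeps the bucket stable across a key rotation, since both keys share one name.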

API key rotation

Audit reference: L-004. CWE-924 (improper enforcement of message integrity during transmission in a communication channel) - operator UX variant.

certctl's API keys are configured via the CERTCTL_API_KEYS_NAMED env var (format name1:key1,name2:key2:admin) and parsed at startup into an in-memory list. There is no DB-resident key store, no GUI, no /api/v1/keys endpoint - the env var IS the key inventory.

The env var supports a double-key rotation window: two entries can share a name during the rollover, and both keys validate. Operators run the rotation as:

1. Generate the new key. openssl rand -hex 32 produces a 256-bit value with sufficient entropy.
2. Append the new entry to CERTCTL_API_KEYS_NAMED alongside the existing one:
```
CERTCTL_API_KEYS_NAMED="alice:OLDKEY:admin,alice:NEWKEY:admin"
```
Both entries MUST carry the same admin flag - startup fails loud if they don't (a non-admin shouldn't share an identity with an admin).

3. Restart certctl. A startup INFO log confirms the rotation window is active:

INFO api-key rotation window active name=alice entries=2 see=docs/security.md::api-key-rotation

4. Roll the new key out to all clients. Both keys validate during this phase. Audit-trail actor + per-user rate-limit bucket stay consistent across the rollover (both entries produce the same UserKey context value, the shared name).
5. Remove the old entry from CERTCTL_API_KEYS_NAMED:
```
CERTCTL_API_KEYS_NAMED="alice:NEWKEY:admin"
```
6. Restart certctl. OLDKEY now fails with 401. Rotation complete.

The rotation window has no operator-set timeout - it lasts for as long as both entries are in the env var. Best practice is a 24-72h window covering a full deploy cadence; if a client hasn't rolled to NEWKEY by the end of step 4, extend the window before step 5.

What the contract guarantees

Two entries with the same name: allowed if both have the same admin flag.
Two entries with the same name but mismatched admin: rejected at startup (privilege escalation guard).
Two entries with the same (name, key) pair: rejected at startup (typo guard - rotation requires DIFFERENT keys under the same name).
Single-entry steady state: the simple legacy behaviour.

What the contract does NOT do

No automatic expiration of OLDKEY. The operator removes the entry in step 5; certctl doesn't track timestamps. A future enhancement could add a rotated_at annotation if operators ask for it.
No GUI / API for key management. Keys are env-var only by design; building a key-management surface is a separate feature project.
No revocation list. If a key leaks, the only path is to remove it from the env var and restart. That's appropriate for a small env-var inventory; it would not scale to a per-user-key-issued model.

Security carve-outs & operator-tunable defaults

Phase 2 of the architecture diligence remediation (2026-05-13) consolidated the following carve-outs into one canonical section so operators reviewing security posture have a single search target. Each entry cites the exact file:line of the carve-out, why it exists, and what the operator should do.

TLS verification — dev escape hatches

certctl has three InsecureSkipVerify=true sites that are dev/probe escape hatches, never enabled by default in production:

Agent dev escape — cmd/agent/main.go:179 (wired from cmd/agent/main.go:61 config field + cmd/agent/main.go:1371 CLI flag). Operators flip this only when debugging an agent against a self-signed control plane that hasn't been added to the agent's trust store. Document as --insecure-skip-verify in the agent's install runbook; the agent logs a startup WARN any time the flag is set. SEC-M3 pins that the carve-out is intentional.
Agent verification probe — cmd/agent/verify.go:78. The probe intentionally opens a TLS connection with verification disabled so it can inspect any certificate the endpoint serves (including self-signed or expired ones — that's the whole point of a probe). The probe never returns trust state to a security-relevant code path; it only reads cert metadata. SEC-M3 pins this.
tlsprobe (network scanner) — internal/tlsprobe/probe.go:54. Same rationale as the agent verify probe — network discovery must introspect any certificate it finds, including the ones with the problems we're scanning for. SEC-M3 pins this.

F5 target connector — `InsecureSkipVerify` per-config

The F5 target connector exposes an Insecure: bool field on its per-target config blob (default false). When set, internal/connector/target/f5/f5.go:134 builds the HTTP client with InsecureSkipVerify: config.Insecure. SEC-M5 closure: operator opt-in for self-signed F5 BIG-IP device certs; mitigation is to run the F5 + the proxy-agent on a network-segmented internal subnet. Document in the F5 connector's per-target setup guide.

ACME issuer — `CERTCTL_ACME_INSECURE` (now gated on ACK)

internal/connector/issuer/acme/acme.go:201 builds the ACME HTTP client with InsecureSkipVerify: true for the Pebble integration test path. The per-issuer runtime setting comes from CERTCTL_ACME_INSECURE (internal/config/config.go:2116); Phase 2 SEC-M4 closure (2026-05-13) added the fail-closed gate so the operator must ALSO set CERTCTL_ACME_INSECURE_ACK=true for the server to boot. Production deploys must never set either flag. The boot-time WARN log at cmd/server/main.go:611 continues to fire for the ACK'd case so every restart logs the reminder.
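A minimal Go sketch of the paired-flag gate, assuming the two flags must be set together so that either one alone is rejected. The sentinel name ErrACMEInsecureWithoutAck comes from the Phase 2 change notes; its message text, the helper `envBool`, and the function `validateACMEInsecure` are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strconv"
)

// ErrACMEInsecureWithoutAck is returned when the flags are not paired.
// The message text is an assumption.
var ErrACMEInsecureWithoutAck = errors.New(
	"CERTCTL_ACME_INSECURE and CERTCTL_ACME_INSECURE_ACK must be set together")

// envBool treats unset or unparsable values as false (assumed parsing rule).
func envBool(name string) bool {
	v, err := strconv.ParseBool(os.Getenv(name))
	return err == nil && v
}

// validateACMEInsecure fails closed when exactly one of the flags is set.
func validateACMEInsecure(insecure, ack bool) error {
	if insecure != ack {
		return ErrACMEInsecureWithoutAck
	}
	return nil
}

func main() {
	err := validateACMEInsecure(
		envBool("CERTCTL_ACME_INSECURE"),
		envBool("CERTCTL_ACME_INSECURE_ACK"),
	)
	if err != nil {
		fmt.Fprintln(os.Stderr, "refusing to start:", err)
		os.Exit(1)
	}
	fmt.Println("ACME TLS settings accepted")
}
```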

CSP `'unsafe-inline'` on `style-src`

internal/api/middleware/securityheaders.go:58 ships the dashboard CSP with style-src 'self' 'unsafe-inline'. Tailwind is not the cause: it compiles utility classes into a single stylesheet at build time. The carve-out is required because inline style attributes still reach the dashboard through inline <svg> elements and through Recharts' <ResponsiveContainer>, which injects inline width/height. SEC-L1 closure: the carve-out is necessary today; the planned tightening flow is the frontend audit's FE-H2 (icon library) and decorative-SVG sweep, which then unlocks the CSP hardening (drops 'unsafe-inline').

Break-glass admin — Argon2id rest-defense reminder

The break-glass admin path (docs/operator/runbooks/disaster-recovery.md) hashes the operator-supplied password with Argon2id and stores the hash in the breakglass_credentials table. SEC-L2 reminder: the strength of the rest-defense depends on the password the operator supplies. Pick a password with at least 64 bits of entropy (openssl rand -base64 12 yields 96 random bits) and rotate it after every use. Argon2id slows offline cracking, but it cannot protect a guessable password such as "Password123".

Body-size limit (1 MB default) — operator-tunable

The http.MaxBytesReader wrap caps inbound request bodies at 1 MB by default. The cap is a necessary defense against unbounded-body DoS, but it also blocks some legitimate operator workflows:

Bulk truststore PEM bundle uploads (CA bundles for federated trust stores can be > 1 MB).
Multi-MB CRL pushes via the CRL-cache endpoint.
Bulk-import of certificates with embedded chains.

SEC-L3 closure: operators raise the cap via CERTCTL_MAX_BODY_SIZE (units: bytes; e.g. CERTCTL_MAX_BODY_SIZE=10485760 for 10 MB). Document in deploy/ENVIRONMENTS.md.
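A Go sketch of how such a cap and its override might be wired. Only http.MaxBytesReader and the CERTCTL_MAX_BODY_SIZE variable name come from this document; `maxBodySize`, `limitBody`, and `statusFor` are hypothetical, the default is assumed to be 1 MiB, and falling back to the default on an unparsable value is an assumption.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"os"
	"strconv"
	"strings"
)

const defaultMaxBody = 1 << 20 // assumed 1 MiB default

// maxBodySize parses the override in bytes, falling back to the default.
func maxBodySize(raw string) int64 {
	n, err := strconv.ParseInt(raw, 10, 64)
	if err != nil || n <= 0 {
		return defaultMaxBody
	}
	return n
}

// limitBody wraps next so reads past limit bytes return an error.
func limitBody(limit int64, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		r.Body = http.MaxBytesReader(w, r.Body, limit)
		next.ServeHTTP(w, r)
	})
}

// statusFor sends a body of bodyLen bytes through the limiter and returns
// the HTTP status the demo handler produced.
func statusFor(limit int64, bodyLen int) int {
	h := limitBody(limit, http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if _, err := io.ReadAll(r.Body); err != nil {
			http.Error(w, "request body too large", http.StatusRequestEntityTooLarge)
			return
		}
		w.WriteHeader(http.StatusNoContent)
	}))
	body := strings.NewReader(strings.Repeat("x", bodyLen))
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodPost, "/", body))
	return rec.Code
}

func main() {
	limit := maxBodySize(os.Getenv("CERTCTL_MAX_BODY_SIZE"))
	fmt.Println("limit bytes:", limit)
	fmt.Println("at limit:", statusFor(limit, int(limit)))
	fmt.Println("over limit:", statusFor(limit, int(limit)+1))
}
```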

Demo Compose placeholder credentials

deploy/docker-compose.demo.yml ships CERTCTL_AUTH_SECRET=change-me-in-production, CERTCTL_CONFIG_ENCRYPTION_KEY=change-me-32-char-encryption-key, and CERTCTL_API_KEY=change-me-in-production as documented demo defaults. The runtime Validate() fail-closed guards (internal/config/config.go::Validate, Bundle 2 2026-05-12) refuse to start if those literal strings reach a non-demo config. Phase 2 DEPL-M2 closure adds a CI guard (scripts/ci-guards/no-change-me-in-prod-compose.sh) that fails the build at PR time if a change-me-* literal leaks into a non-demo compose file — catching the regression one layer before the runtime guard fires.

Kubernetes NetworkPolicy — operator-opt-in

deploy/helm/certctl/templates/networkpolicy.yaml ships the template but deploy/helm/certctl/values.yaml defaults networkPolicy.enabled: false. DEPL-M3 rationale: many development clusters (kind, minikube, and similar) run a default CNI that does not enforce NetworkPolicy; on those clusters a default-enabled NetworkPolicy renders fine but produces no enforcement, and bare-metal kube-router-style controllers may interpret a permissive default differently. Production deploys with a real NetworkPolicy controller (Calico, Cilium, Antrea) flip the values key to true and tune the policy in their values overlay. Document the production-enable in docs/operator/runbooks/ha.md (added Phase 2 DEPL-H1).

Reporting a vulnerability

Email certctl@proton.me. Coordinated disclosure preferred; we will acknowledge within 72h.
