From a849c8b8cf1199a3a521dcce158cded8c34c6738 Mon Sep 17 00:00:00 2001
From: shankar0123
Date: Wed, 13 May 2026 00:14:59 +0000
Subject: [PATCH] =?UTF-8?q?fix(security):=20close=20BUNDLE=202=20=E2=80=94?=
 =?UTF-8?q?=20safe=20first=20run,=20demo=20mode,=20agent=20bootstrap?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Bundle 2 closure (2026-05-12 acquisition diligence audit). Closes the
"docker compose up == accidental production" hazard: pre-Bundle-2 the
base deploy/docker-compose.yml WAS the demo path (AUTH_TYPE=none +
DEMO_MODE_ACK=true + KEYGEN_MODE=server + DEMO_SEED=true + literal
change-me-... placeholder creds), the README claimed "drop the demo
overlay for a clean install", and ENVIRONMENTS.md table documented
auth-type default as api-key: three contradictory stories layered on
the same compose file.

Source findings closed:
  R2 R3 C1 D9 finding-2 S9 (repo audit)
  SEC-H2 SEC-M1 SEC-M3 OPS-M3 LOW-5 HIGH-6 (cowork audit)

Compose split (deploy/docker-compose.yml + deploy/docker-compose.demo.yml):
  The base now ships production-shaped: no AUTH_TYPE override, no
  KEYGEN_MODE override, no DEMO_MODE_ACK, no DEMO_SEED, no literal
  placeholder fallbacks. POSTGRES_PASSWORD / CERTCTL_AUTH_SECRET /
  CERTCTL_CONFIG_ENCRYPTION_KEY / CERTCTL_API_KEY / CERTCTL_AGENT_ID
  must come from deploy/.env (sample template in deploy/.env.example +
  root .env.example). The demo overlay carries the full demo posture
  (every env var + every placeholder credential) so the
  `-f docker-compose.demo.yml` one-flag flip remains a zero-config
  populated-dashboard path.

Fail-closed startup guards (internal/config/config.go::Validate):
  Three new gates layered on the existing HIGH-12 demo-mode listen-bind
  guard. All three exempt CERTCTL_DEMO_MODE_ACK=true so the demo overlay
  keeps working:
    • HIGH-6: AUTH_SECRET = "change-me-in-production"        → refuse
    • HIGH-6: CONFIG_ENCRYPTION_KEY = "change-me-32-char..." → refuse
    • LOW-5:  CORS_ORIGINS contains "*" (CWE-942 + CWE-352)  → refuse

Visible DEMO MODE banner (cmd/server/main.go):
  every boot under DEMO_MODE_ACK=true now emits a prominent WARN line
  with a 6-step production-promotion checklist. The 2026-04-19 incident
  (a screenshot run that kept running for three days) drove this; the
  per-startup banner makes the posture unmissable in any log scraper.

Agent enrollment doc alignment:
  • docs/reference/configuration.md L83: corrected the non-existent URL
    `POST /api/v1/agents/register` to the real route
    `POST /api/v1/agents`; added the bootstrap-token note and the
    install-agent.sh handoff sequence.
  • docs/reference/architecture.md L154: replaced "agents register
    themselves at first heartbeat" (false: cmd/agent/main.go
    fail-fasts when CERTCTL_AGENT_ID is unset) with the actual two-step
    operator-driven flow (REST or GUI registration first, returned ID
    fed to install-agent.sh second).

Tests + CI guard:
  • 9 new TestValidate_Bundle2_* cases in internal/config/config_test.go
    covering: placeholder-secret refused + demo-ack exempt; placeholder
    encryption-key refused + demo-ack exempt; real key not mistaken for
    placeholder; wildcard CORS refused + demo-ack exempt; wildcard mixed
    into a concrete allowlist still refused; concrete allowlist accepted.
  • scripts/ci-guards/B2-compose-base-no-demo-env.sh: greps the base
    compose for any of the demo-mode env vars + placeholder credentials.
    Comments stripped before checking so the narrative header in the
    base file can still reference the overlay's posture in prose.

Cold-DB CI smoke (.github/workflows/ci.yml::cold-db-compose-smoke):
  Switched to layering -f docker-compose.demo.yml on top of the base:
  the new production base requires real env vars the smoke doesn't
  have, and the smoke's purpose (catch migration-on-cold-DB regressions
  + the bootstrap-token mint path) is orthogonal to which auth posture
  the boot lands in.
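
Illustration of the fail-closed gates (reviewer sketch, not the code in
this patch): the three gates listed under "Fail-closed startup guards"
above can be pictured roughly as follows. The `Config` struct, its field
names, and `validatePlaceholders` are hypothetical stand-ins; the real
logic lives in internal/config/config.go::Validate and may differ, for
example in how it treats comma-separated AUTH_SECRET rotation values.
The two placeholder strings are the ones quoted in the docs touched by
this patch.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Config is a hypothetical, simplified stand-in for the server config.
type Config struct {
	DemoModeAck         bool
	AuthSecret          string
	ConfigEncryptionKey string
	CORSOrigins         []string
}

// Placeholder sentinels as quoted in the docs changed by this patch.
const (
	placeholderAuthSecret = "change-me-in-production"
	placeholderEncKey     = "change-me-32-char-encryption-key"
)

// validatePlaceholders refuses placeholder credentials and wildcard CORS
// unless demo mode has been explicitly acknowledged.
func validatePlaceholders(c Config) error {
	if c.DemoModeAck {
		return nil // demo overlay: all three gates are skipped
	}
	if c.AuthSecret == placeholderAuthSecret {
		return errors.New("AUTH_SECRET is still the shipped placeholder")
	}
	if c.ConfigEncryptionKey == placeholderEncKey {
		return errors.New("CONFIG_ENCRYPTION_KEY is still the shipped placeholder")
	}
	for _, origin := range c.CORSOrigins {
		// A wildcard mixed into a concrete allowlist is still refused.
		if strings.TrimSpace(origin) == "*" {
			return errors.New("CORS_ORIGINS contains a wildcard")
		}
	}
	return nil
}

func main() {
	prod := Config{
		AuthSecret:          placeholderAuthSecret,
		ConfigEncryptionKey: "a-real-key-generated-by-the-operator",
	}
	fmt.Println("production:", validatePlaceholders(prod))

	demo := prod
	demo.DemoModeAck = true
	fmt.Println("demo:", validatePlaceholders(demo))
}
```

Checking the demo acknowledgement first keeps the exemption in one
place, which matches the "all three exempt" wording above; whether the
real Validate is ordered this way is an assumption.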
Receipts:

  • Current first-run truth table

      compose flag → posture
      -f docker-compose.yml (production)
          → requires .env; fail-fasts on missing AUTH_SECRET /
            CONFIG_ENCRYPTION_KEY / POSTGRES_PASSWORD; agent
            fail-fasts on missing AGENT_ID
      -f docker-compose.yml -f docker-compose.demo.yml (demo)
          → zero-config; AUTH_TYPE=none + DEMO_MODE_ACK=true +
            KEYGEN=server + DEMO_SEED=true; boot banner WARN
      -f docker-compose.yml -f docker-compose.dev.yml (dev)
          → base + PgAdmin + debug logging
      -f docker-compose.test.yml (test, standalone)
          → production-shape posture, real CA backends

  • Verification (PATH=/tmp/go/bin export GO* paths to /tmp):

      gofmt -l                                                 # clean (no diffs)
      go vet ./internal/config ./cmd/server                    # clean
      go test -short -count=1 ./internal/config/...            # PASS (cumulative + all 9 new
                                                               #   Bundle 2 cases green)
      go test -short -count=1 \
        ./internal/connector/target/configcheck                # PASS (no regression in the
                                                               #   Bundle 1 closure tests)
      go build ./cmd/server ./cmd/agent \
        ./cmd/cli ./cmd/mcp-server                             # clean
      bash scripts/ci-guards/B2-compose-base-no-demo-env.sh    # clean
      bash scripts/ci-guards/H-1-encryption-key-min-length.sh  # clean
      bash scripts/ci-guards/G-3-env-docs-drift.sh             # clean

Remaining operator warnings (not blocking; tracked in CLAUDE.md
"Open decisions"):
  • The first `docker compose -f docker-compose.yml up -d` against a
    pre-Bundle-2 .env (placeholder values still in place) will now
    fail-fast. This is the intended posture but operators upgrading
    from v2.0.x via .env-from-old-master need to rotate before
    upgrading. The CHANGELOG note for the v2.1.0 release should call
    this out alongside Auth Bundle 2's other breaking changes.
Audit-Closes: BUNDLE-2 R2 R3 C1 D9 S9 SEC-H2 SEC-M1 SEC-M3 OPS-M3 LOW-5 HIGH-6 --- .env.example | 37 ++++- .github/workflows/ci.yml | 27 ++- README.md | 16 +- cmd/server/main.go | 13 ++ deploy/.env.example | 43 ++++- deploy/ENVIRONMENTS.md | 55 ++++-- deploy/docker-compose.demo.yml | 94 +++++++++-- deploy/docker-compose.yml | 120 ++++++++++---- docs/reference/architecture.md | 7 +- docs/reference/configuration.md | 2 +- internal/config/config.go | 64 +++++++ internal/config/config_test.go | 156 ++++++++++++++++++ .../ci-guards/B2-compose-base-no-demo-env.sh | 101 ++++++++++++ 13 files changed, 645 insertions(+), 90 deletions(-) create mode 100755 scripts/ci-guards/B2-compose-base-no-demo-env.sh diff --git a/.env.example b/.env.example index 31cfe05..606b5da 100644 --- a/.env.example +++ b/.env.example @@ -7,7 +7,7 @@ # ============================================================================== POSTGRES_DB=certctl POSTGRES_USER=certctl -POSTGRES_PASSWORD=change-me-in-production +POSTGRES_PASSWORD=replace-with-openssl-rand-hex-32 # ============================================================================== # Certctl Server @@ -24,7 +24,7 @@ POSTGRES_PASSWORD=change-me-in-production # seeds pg_authid on first boot of an empty volume. See docs/quickstart.md # "Warning" callout and `internal/repository/postgres/db.go::wrapPingError` # for the SQLSTATE 28P01 diagnostic that fires when the two drift. -CERTCTL_DATABASE_URL=postgres://certctl:change-me-in-production@postgres:5432/certctl?sslmode=disable +CERTCTL_DATABASE_URL=postgres://certctl:replace-with-openssl-rand-hex-32@postgres:5432/certctl?sslmode=disable CERTCTL_SERVER_HOST=0.0.0.0 CERTCTL_SERVER_PORT=8443 CERTCTL_LOG_LEVEL=info @@ -42,10 +42,27 @@ CERTCTL_LOG_FORMAT=json # option (no JWT middleware shipped - silent auth downgrade); see # docs/upgrade-to-v2-jwt-removal.md if you previously set # CERTCTL_AUTH_TYPE=jwt. -CERTCTL_AUTH_TYPE=none -# Required when CERTCTL_AUTH_TYPE is "api-key". 
-# Generate with: openssl rand -base64 32 -# CERTCTL_AUTH_SECRET=change-me-in-production +# +# Bundle 2 closure (2026-05-12): the docker-compose base file no longer +# defaults to AUTH_TYPE=none. The base ships production-shaped; the demo +# overlay (deploy/docker-compose.demo.yml) flips this baseline into the +# populated-dashboard demo path. +CERTCTL_AUTH_TYPE=api-key +# Required when CERTCTL_AUTH_TYPE is "api-key". Generate with: +# openssl rand -base64 32 +# The Bundle 2 fail-closed Validate() REFUSES TO START if this value +# equals the placeholder string "change-me-in-production" outside of +# demo mode (CERTCTL_DEMO_MODE_ACK=true). +CERTCTL_AUTH_SECRET=replace-with-openssl-rand-base64-32 + +# Bundle 2 closure: AES-256-GCM key for encrypting issuer/target config +# secrets at rest. Required for any deployment that uses the dynamic +# config GUI to store issuer credentials. Generate with: +# openssl rand -base64 32 +# Minimum 32 bytes. The Bundle 2 fail-closed Validate() REFUSES TO +# START if this value equals the placeholder string +# "change-me-32-char-encryption-key" outside of demo mode. +CERTCTL_CONFIG_ENCRYPTION_KEY=replace-with-openssl-rand-base64-32 # ============================================================================== # Certctl Agent @@ -54,8 +71,14 @@ CERTCTL_AUTH_TYPE=none # startup. Use the docker-compose self-signed bootstrap CA bundle from # `deploy/test/certs/ca.crt` or supply your own via CERTCTL_SERVER_CA_BUNDLE_PATH. CERTCTL_SERVER_URL=https://localhost:8443 -CERTCTL_API_KEY=change-me-in-production +# Matches one of the server's CERTCTL_AUTH_SECRET rotation values. The +# placeholder is rejected outside demo mode (Bundle 2 fail-closed guard). +CERTCTL_API_KEY=replace-with-openssl-rand-base64-32 CERTCTL_AGENT_NAME=local-agent +# Returned from `POST /api/v1/agents` during agent enrollment. The agent +# fail-fasts at startup with "agent-id flag or CERTCTL_AGENT_ID env var +# is required" if this is unset. 
+# CERTCTL_AGENT_ID=agent-from-registration-response # ============================================================================== # Optional: Scheduler Tuning (defaults are usually fine) diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index 833f07f..38b6039 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -322,21 +322,36 @@ jobs: curl "${args[@]}" } + # Bundle 2 closure (2026-05-12): the base compose is now + # production-shaped — auth=api-key + agent-keygen + fail-closed + # placeholder guards. The cold-DB smoke layers in the demo + # overlay so the boot path remains zero-config: the overlay + # supplies AUTH_TYPE=none + DEMO_MODE_ACK=true + the matching + # placeholder creds the fail-closed guards accept under + # DEMO_MODE_ACK. The agent service in the overlay also + # pre-seeds CERTCTL_AGENT_ID=agent-demo-1 so the bundled + # agent doesn't restart-loop. The smoke's purpose (catch + # migration-on-cold-DB regressions + verify bootstrap-token + # endpoint mints a day-0 admin against a freshly migrated + # schema) is orthogonal to whether the auth posture is + # demo-mode or api-key, so the overlay is acceptable here. 
+ COMPOSE_FILES=(-f docker-compose.yml -f docker-compose.demo.yml) + log "1/4 down -v --remove-orphans" - docker compose down -v --remove-orphans 2>&1 | tail -3 || true + docker compose "${COMPOSE_FILES[@]}" down -v --remove-orphans 2>&1 | tail -3 || true log "2/4 up -d (cold boot)" - docker compose up -d 2>&1 | tail -3 + docker compose "${COMPOSE_FILES[@]}" up -d 2>&1 | tail -3 log "3/4 wait for healthchecks" wait_for_service_healthy postgres wait_for_service_healthy certctl-server - wait_for_service_healthy certctl-agent || log " (agent skipped — non-demo compose)" + wait_for_service_healthy certctl-agent || log " (agent skipped)" log "4/4 minting day-0 admin (proves migration ladder + bootstrap path)" TOKEN="$(openssl rand -base64 32 | tr -d '\n')" echo "CERTCTL_BOOTSTRAP_TOKEN=$TOKEN" > /tmp/_smoke.env - docker compose --env-file /tmp/_smoke.env up -d --force-recreate certctl-server 2>&1 | tail -2 + docker compose "${COMPOSE_FILES[@]}" --env-file /tmp/_smoke.env up -d --force-recreate certctl-server 2>&1 | tail -2 sleep 5 wait_for_service_healthy certctl-server BODY="$(http_call POST /api/v1/auth/bootstrap "{\"token\":\"$TOKEN\",\"actor_name\":\"smoke-admin\"}")" @@ -345,7 +360,7 @@ jobs: log "PASS — cold boot + force-recreate + admin bootstrap all green" log "tearing down" - docker compose down -v 2>&1 | tail -2 + docker compose "${COMPOSE_FILES[@]}" down -v 2>&1 | tail -2 - name: Dump compose logs on failure if: failure() @@ -353,7 +368,7 @@ jobs: run: | for svc in postgres certctl-server certctl-agent certctl-tls-init; do echo "==== $svc ====" - docker compose logs --no-color --tail 200 "$svc" || true + docker compose -f docker-compose.yml -f docker-compose.demo.yml logs --no-color --tail 200 "$svc" || true done frontend-build: diff --git a/README.md b/README.md index e64ed95..d52fd9a 100644 --- a/README.md +++ b/README.md @@ -88,15 +88,27 @@ Security: three authentication paths — API keys (SHA-256 hashed + constant-tim ### Docker Compose (recommended) 
+**Demo path — zero config, populated dashboard:** + ```bash git clone https://github.com/certctl-io/certctl.git cd certctl docker compose -f deploy/docker-compose.yml -f deploy/docker-compose.demo.yml up -d --build ``` -Wait ~30 seconds, then open **https://localhost:8443** in your browser. The shipped demo overlay seeds 180 days of realistic history across 13 issuers, 8 agents, managed + discovered certs, jobs, deploys, audit, and notification events. The `certctl-tls-init` init container self-signs an ECDSA-P256 cert on first boot — accept the browser warning for the demo, or feed the generated `ca.crt` to your client. +Wait ~30 seconds, then open **https://localhost:8443** in your browser. The demo overlay flips the base into demo-mode auth (every request served as the synthetic admin actor `actor-demo-anon` — the server emits a prominent ⚠ DEMO MODE banner at boot reminding you this posture is for evaluation only) and seeds 180 days of realistic history across 13 issuers, 8 agents, managed + discovered certs, jobs, deploys, audit, and notification events. The `certctl-tls-init` init container self-signs an ECDSA-P256 cert on first boot — accept the browser warning for the demo, or feed the generated `ca.crt` to your client. -For a clean install without demo data, drop the `-f deploy/docker-compose.demo.yml` flag and run `docker compose -f deploy/docker-compose.yml up -d --build`. The four compose files (`docker-compose.yml` base, `docker-compose.demo.yml` overlay, `docker-compose.dev.yml` for PgAdmin + debug logging, `docker-compose.test.yml` for integration tests) are documented at [`deploy/ENVIRONMENTS.md`](deploy/ENVIRONMENTS.md). 
+**Production path — `.env` required, fail-closed on placeholders:** + +```bash +cp .env.example deploy/.env # or root .env if running outside compose +$EDITOR deploy/.env # set POSTGRES_PASSWORD, CERTCTL_AUTH_SECRET, + # CERTCTL_API_KEY, CERTCTL_CONFIG_ENCRYPTION_KEY, + # CERTCTL_AGENT_ID — all via openssl rand +docker compose -f deploy/docker-compose.yml up -d --build +``` + +The base compose alone (no demo overlay) ships production-shaped: default `auth-type=api-key`, default `keygen-mode=agent`, no demo seed, no demo-mode synthetic admin. The fail-closed startup guards in `internal/config/config.go::Validate` refuse to boot when any of the change-me-... placeholder credentials reach config outside of demo mode (Bundle 2 closure, 2026-05-12). The four compose files (`docker-compose.yml` base, `docker-compose.demo.yml` overlay, `docker-compose.dev.yml` for PgAdmin + debug logging, `docker-compose.test.yml` for integration tests) are documented at [`deploy/ENVIRONMENTS.md`](deploy/ENVIRONMENTS.md). ```bash curl --cacert $(docker compose -f deploy/docker-compose.yml exec -T certctl-server cat /etc/certctl/tls/ca.crt) https://localhost:8443/health diff --git a/cmd/server/main.go b/cmd/server/main.go index ce0585a..d82f485 100644 --- a/cmd/server/main.go +++ b/cmd/server/main.go @@ -102,6 +102,19 @@ func main() { "server_host", cfg.Server.Host, "server_port", cfg.Server.Port) + // Bundle 2 (2026-05-12) — visible demo-mode banner at boot. + // + // When CERTCTL_DEMO_MODE_ACK=true the HIGH-12 startup guard already + // passed and the server is about to serve every request as the + // synthetic admin actor `actor-demo-anon`. Operators have lost + // production deploys to this posture more than once (last incident: + // 2026-04-19, a screenshot run that kept running for three days); + // the per-startup banner makes the posture unmissable in any log + // scraper, dashboard, or `journalctl --since boot` review. 
+ if cfg.Auth.DemoModeAck { + logger.Warn("⚠ DEMO MODE ACTIVE — CERTCTL_DEMO_MODE_ACK=true is set; every request is served as the synthetic admin actor `actor-demo-anon` (no authentication enforced). This deployment MUST NOT hold production keys, certificates, or audit history. To promote to production: (1) unset CERTCTL_DEMO_MODE_ACK; (2) set CERTCTL_AUTH_TYPE=api-key or oidc; (3) set CERTCTL_AUTH_SECRET to a fresh `openssl rand -base64 32`; (4) set CERTCTL_KEYGEN_MODE=agent; (5) rotate CERTCTL_CONFIG_ENCRYPTION_KEY to a fresh `openssl rand -base64 32` (≥ 32 bytes, not the change-me placeholder); (6) restart the server. See docs/operator/security.md for the full posture.") + } + // Bundle-5 / Audit H-007: deprecation WARN when the agent bootstrap // token is unset. Pre-Bundle-5 there was no token at all; the v2.0.x // default keeps the warn-mode pass-through so existing demo deploys diff --git a/deploy/.env.example b/deploy/.env.example index e2fd597..10ada4b 100644 --- a/deploy/.env.example +++ b/deploy/.env.example @@ -1,8 +1,39 @@ -# certctl Docker Compose environment variables -# Copy this file to .env and customize for your deployment +# certctl Docker Compose environment variables (Bundle 2 — 2026-05-12) +# +# Copy this file to deploy/.env and customize. The production-shaped base +# compose (docker-compose.yml) requires every variable below to be set; +# the Bundle 2 fail-closed startup guards REFUSE TO BOOT if any value +# remains at a "change-me-..." or "replace-with-..." placeholder outside +# demo mode (CERTCTL_DEMO_MODE_ACK=true). +# +# DEMO PATH (zero-config, populated dashboard, demo-mode auth): +# docker compose -f deploy/docker-compose.yml \ +# -f deploy/docker-compose.demo.yml up -d --build +# The demo overlay supplies its own placeholder values plus DEMO_MODE_ACK +# so this .env is NOT needed. +# +# PRODUCTION PATH (this .env is required): +# docker compose -f deploy/docker-compose.yml up -d -# PostgreSQL password (change in production!) 
-POSTGRES_PASSWORD=certctl +# PostgreSQL password — openssl rand -hex 32 +POSTGRES_PASSWORD=replace-with-openssl-rand-hex-32 -# Agent API key (change in production! Generate with: openssl rand -hex 32) -CERTCTL_API_KEY=change-me-in-production +# Server API-key secret — openssl rand -base64 32 +CERTCTL_AUTH_SECRET=replace-with-openssl-rand-base64-32 + +# Bundled-agent API key (matches one of the server's AUTH_SECRET rotation +# values). Generate with: openssl rand -base64 32 +CERTCTL_API_KEY=replace-with-openssl-rand-base64-32 + +# AES-256-GCM key for encrypting issuer/target config secrets at rest. +# Minimum 32 bytes. Generate with: openssl rand -base64 32 +CERTCTL_CONFIG_ENCRYPTION_KEY=replace-with-openssl-rand-base64-32 + +# Agent ID returned from `POST /api/v1/agents` during agent enrollment. +# Without this the bundled certctl-agent service fail-fasts at startup. +# CERTCTL_AGENT_ID=agent-from-registration-response + +# Day-0 admin bootstrap token (optional — generate with: openssl rand -hex 32). +# When set, POST /api/v1/auth/bootstrap mints the first admin actor + API +# key. When unset (default), that endpoint returns 410 Gone. +# CERTCTL_BOOTSTRAP_TOKEN= diff --git a/deploy/ENVIRONMENTS.md b/deploy/ENVIRONMENTS.md index d53fc76..b703be6 100644 --- a/deploy/ENVIRONMENTS.md +++ b/deploy/ENVIRONMENTS.md @@ -62,7 +62,9 @@ A compose file defines **services** (containers), **networks** (how they talk to ## Base Environment **File:** `docker-compose.yml` -**When to use:** Production deployments, first-time setup, or any time you want a clean dashboard with the onboarding wizard. +**When to use:** Production deployments and any time you want a clean, production-shaped stack with real authentication enforced. + +**Bundle 2 closure (2026-05-12):** the base compose was split from the demo overlay. Pre-Bundle-2 this file IS the demo path (auth=none, keygen=server, demo-seed=true, change-me placeholder credentials baked in). 
Operators reading "drop the demo overlay for a clean install" were not getting a clean install — they were getting a demo stack with the overlay's data layer stripped off. Post-Bundle-2 the base ships production-shaped: `CERTCTL_AUTH_TYPE` defaults to `api-key`, `CERTCTL_KEYGEN_MODE` defaults to `agent`, demo-mode + demo-seed default to false, and every credential placeholder is rejected at startup. The demo path is now a single overlay flag away (`-f deploy/docker-compose.demo.yml`). ### What it runs @@ -79,9 +81,20 @@ Three services on a private bridge network: ```bash git clone https://github.com/certctl-io/certctl.git cd certctl + +# Required: provide real credentials. Without this step the server fail-fasts +# at startup on the Bundle 2 placeholder-credential guards. +cp .env.example deploy/.env +$EDITOR deploy/.env +# Set: POSTGRES_PASSWORD, CERTCTL_AUTH_SECRET, CERTCTL_API_KEY, +# CERTCTL_CONFIG_ENCRYPTION_KEY (all via `openssl rand -base64 32`), +# CERTCTL_AGENT_ID (returned from `POST /api/v1/agents`). + docker compose -f deploy/docker-compose.yml up -d --build ``` +If you just want to kick the tires without writing a `.env`, use the demo overlay instead — see [Demo Overlay](#demo-overlay) below. + `--build` compiles the Go server and agent from source, including the React frontend. Without it, Docker may reuse a stale image from a previous build. `-d` runs in detached mode (background). Omit it to see logs in your terminal. @@ -132,14 +145,16 @@ certctl-server: postgres: condition: service_healthy environment: - CERTCTL_DATABASE_URL: postgres://certctl:${POSTGRES_PASSWORD:-certctl}@postgres:5432/certctl?sslmode=disable + CERTCTL_DATABASE_URL: postgres://certctl:${POSTGRES_PASSWORD}@postgres:5432/certctl?sslmode=disable CERTCTL_SERVER_HOST: 0.0.0.0 CERTCTL_SERVER_PORT: 8443 CERTCTL_LOG_LEVEL: info - CERTCTL_AUTH_TYPE: none - CERTCTL_KEYGEN_MODE: server + # Bundle 2 (2026-05-12): no auth-type / keygen-mode override here. 
+ # Code defaults (api-key + agent) take effect; the demo overlay flips + # both to demo-mode (none + server). + CERTCTL_AUTH_SECRET: ${CERTCTL_AUTH_SECRET} CERTCTL_NETWORK_SCAN_ENABLED: "true" - CERTCTL_CONFIG_ENCRYPTION_KEY: ${CERTCTL_CONFIG_ENCRYPTION_KEY:-change-me-32-char-encryption-key} + CERTCTL_CONFIG_ENCRYPTION_KEY: ${CERTCTL_CONFIG_ENCRYPTION_KEY} ``` The server is the control plane. It serves the REST API, the React dashboard, runs 7 background scheduler loops (renewal, job processing, health checks, notifications, short-lived cert expiry, network scanning, digest emails), and manages the issuer/target registry. @@ -147,9 +162,10 @@ The server is the control plane. It serves the REST API, the React dashboard, ru Key environment variables explained: - `CERTCTL_DATABASE_URL` references the `postgres` service by hostname. Docker's internal DNS resolves `postgres` to the container's IP on the bridge network. `sslmode=disable` is appropriate because traffic stays on the private Docker network. -- `CERTCTL_AUTH_TYPE: none` disables API key authentication so you can explore immediately. For production, set `api-key` and configure `CERTCTL_AUTH_SECRET`. -- `CERTCTL_KEYGEN_MODE: server` means the server generates private keys. This is convenient for demos but insecure for production. In production, set `agent` so keys are generated on agent machines and never transmitted. -- `CERTCTL_CONFIG_ENCRYPTION_KEY` enables AES-256-GCM encryption for issuer and target configurations stored in the database (credentials, API keys). Without this, the dynamic configuration GUI (adding issuers/targets from the dashboard) won't encrypt sensitive fields. For production, generate a strong random key. +- `CERTCTL_AUTH_TYPE` defaults to `api-key` in the code (`internal/config/config.go`); the base compose does NOT override it. To run demo-mode auth (every request served as the synthetic admin actor), layer the demo overlay on top. 
+- `CERTCTL_AUTH_SECRET` is the API-key value the server accepts. The Bundle 2 fail-closed guard rejects the literal placeholder `change-me-in-production` outside demo mode. Generate with `openssl rand -base64 32`. +- `CERTCTL_KEYGEN_MODE` defaults to `agent` in the code (the base compose does NOT override it). Production deploys leave it there so private keys stay on agent infrastructure; the demo overlay flips it to `server` so the demo can issue + hold the key on the server box without an agent dance. +- `CERTCTL_CONFIG_ENCRYPTION_KEY` enables AES-256-GCM encryption for issuer and target configurations stored in the database (credentials, API keys). Required for any deploy that adds issuers via the GUI. The Bundle 2 fail-closed guard rejects the literal placeholder `change-me-32-char-encryption-key` outside demo mode. Generate with `openssl rand -base64 32` (≥ 32 bytes). - `CERTCTL_NETWORK_SCAN_ENABLED` activates the scheduler loop that probes TLS endpoints on your network to discover certificates you might not be managing. **Expert note:** The healthcheck hits `GET /health` every 10 seconds with 5 retries. The `depends_on: condition: service_healthy` on the agent means Docker holds agent startup until this check passes. Resource limits (`cpus: '1.0'`, `memory: 512M`) prevent the server from consuming unbounded resources in shared environments. @@ -162,8 +178,12 @@ certctl-agent: certctl-server: condition: service_healthy environment: - CERTCTL_SERVER_URL: http://certctl-server:8443 - CERTCTL_API_KEY: ${CERTCTL_API_KEY:-change-me-in-production} + CERTCTL_SERVER_URL: https://certctl-server:8443 + # Bundle 2 (2026-05-12): no placeholder fallbacks. Operators MUST + # set CERTCTL_API_KEY + CERTCTL_AGENT_ID in deploy/.env. The agent + # binary fail-fasts at startup when CERTCTL_AGENT_ID is unset. 
+ CERTCTL_API_KEY: ${CERTCTL_API_KEY} + CERTCTL_AGENT_ID: ${CERTCTL_AGENT_ID} CERTCTL_AGENT_NAME: docker-agent CERTCTL_LOG_LEVEL: info CERTCTL_DISCOVERY_DIRS: /var/lib/certctl/keys @@ -194,13 +214,18 @@ docker compose -f deploy/docker-compose.yml down -v ## Demo Overlay **File:** `docker-compose.demo.yml` -**When to use:** Demos, screenshots, stakeholder presentations, or any time you want a populated dashboard on first boot. +**When to use:** Demos, screenshots, stakeholder presentations, or any time you want a one-command zero-config evaluation stack with a populated dashboard. ### What it adds -One env var: `CERTCTL_DEMO_SEED=true` on the `certctl-server` service. The server applies `migrations/seed_demo.sql` at boot via `postgres.RunDemoSeed` AFTER the baseline migrations + `seed.sql` are in place. The demo seed file inserts 180 days of simulated operational history: teams, owners, certificates across multiple issuers, agents on different platforms, jobs with realistic timestamps, discovery scan results, audit events, policies, and profiles. +Bundle 2 closure (2026-05-12) moved every demo-mode env var out of the base compose into this overlay. The overlay now carries: -Pre-U-3 the overlay used to mount `seed_demo.sql` into PostgreSQL's `/docker-entrypoint-initdb.d/` and rely on initdb-time application. That worked only because the production stack also mounted the migrations there, so the schema existed when initdb ran. Once U-3 dropped the production initdb mounts (single source of truth: server runs `RunMigrations` + `RunSeed` at boot), the demo seed could no longer be applied at initdb time — the tables it references wouldn't exist yet. Post-U-3 the overlay is a 27-line override file with no `image:` / `build:` of its own; it MUST be passed alongside the base, or compose errors with `service "certctl-server" has neither an image nor a build context specified`. 
+- `CERTCTL_AUTH_TYPE=none` + `CERTCTL_DEMO_MODE_ACK=true` — demo-mode synthetic admin actor (`actor-demo-anon`). The server emits a prominent ⚠ DEMO MODE WARN banner at boot with a production-promotion checklist (`cmd/server/main.go`). +- `CERTCTL_KEYGEN_MODE=server` — demo-only server-side keygen. +- `CERTCTL_DEMO_SEED=true` — the server applies `migrations/seed_demo.sql` at boot via `postgres.RunDemoSeed`, inserting 180 days of simulated operational history (teams, owners, certificates, agents, jobs, discovery results, audit events, policies, profiles). +- Fixed weak `POSTGRES_PASSWORD=certctl`, `CERTCTL_AUTH_SECRET=change-me-in-production`, `CERTCTL_CONFIG_ENCRYPTION_KEY=change-me-32-char-encryption-key`, `CERTCTL_API_KEY=change-me-in-production`, `CERTCTL_AGENT_ID=agent-demo-1` — placeholder credentials the Bundle 2 fail-closed `Validate()` rejects outside demo mode, but the demo overlay's `DEMO_MODE_ACK=true` unlocks them. + +Pre-U-3 the overlay used to mount `seed_demo.sql` into PostgreSQL's `/docker-entrypoint-initdb.d/` and rely on initdb-time application. That worked only because the production stack also mounted the migrations there, so the schema existed when initdb ran. Once U-3 dropped the production initdb mounts (single source of truth: server runs `RunMigrations` + `RunSeed` at boot), the demo seed could no longer be applied at initdb time — the tables it references wouldn't exist yet. Post-U-3 the overlay is an override file with no `image:` / `build:` of its own; it MUST be passed alongside the base, or compose errors with `service "certctl-server" has neither an image nor a build context specified`. 
### Starting it @@ -382,7 +407,7 @@ Every `CERTCTL_*` environment variable is read by the server's `internal/config/ | `CERTCTL_SERVER_HOST` | `0.0.0.0` | Listen address | | `CERTCTL_SERVER_PORT` | `8443` | Listen port | | `CERTCTL_LOG_LEVEL` | `info` | Log verbosity: `debug`, `info`, `warn`, `error` | -| `CERTCTL_AUTH_TYPE` | `api-key` | Auth mode: `api-key` or `none` | +| `CERTCTL_AUTH_TYPE` | `api-key` | Auth mode: `api-key`, `none`, or `oidc` (Auth Bundle 2). | | `CERTCTL_AUTH_SECRET` | (none) | API key(s), comma-separated for rotation | | `CERTCTL_KEYGEN_MODE` | `agent` | Key generation: `agent` (production) or `server` (demo) | | `CERTCTL_CONFIG_ENCRYPTION_KEY` | (none) | AES-256-GCM key for encrypting issuer/target configs in DB | @@ -400,7 +425,7 @@ Every `CERTCTL_*` environment variable is read by the server's `internal/config/ | `CERTCTL_SERVER_URL` | (required) | Server API URL | | `CERTCTL_API_KEY` | (none) | API key for authenticating with server | | `CERTCTL_AGENT_NAME` | (hostname) | Display name in dashboard | -| `CERTCTL_AGENT_ID` | (auto-generated) | Stable agent identifier | +| `CERTCTL_AGENT_ID` | (none — required) | Stable agent identifier returned from `POST /api/v1/agents`. The agent binary fail-fasts at startup if unset. | | `CERTCTL_KEYGEN_MODE` | `agent` | Must match server setting | | `CERTCTL_LOG_LEVEL` | `info` | Log verbosity | | `CERTCTL_KEY_DIR` | `/var/lib/certctl/keys` | Directory for private key storage (0600 perms) | diff --git a/deploy/docker-compose.demo.yml b/deploy/docker-compose.demo.yml index c1b3dd7..9931564 100644 --- a/deploy/docker-compose.demo.yml +++ b/deploy/docker-compose.demo.yml @@ -1,26 +1,88 @@ -# Demo mode: pre-populated dashboard with 32 certificates, 8 agents, 10 issuers, etc. -# Use this to showcase certctl's dashboard with realistic data. 
+# ============================================================================= +# certctl DEMO overlay — Bundle 2 (2026-05-12) +# ============================================================================= # -# Usage: -# docker compose -f docker-compose.yml -f docker-compose.demo.yml up --build +# Layered on top of the production-shaped base (docker-compose.yml) to give +# operators a one-command, zero-config demo path: # -# To start fresh (wipe previous data): -# docker compose -f docker-compose.yml -f docker-compose.demo.yml down -v -# docker compose -f docker-compose.yml -f docker-compose.demo.yml up --build +# docker compose -f deploy/docker-compose.yml \ +# -f deploy/docker-compose.demo.yml up -d --build # -# U-3 (P1, cat-u-seed_initdb_schema_drift): pre-U-3 this overlay mounted -# `seed_demo.sql` into postgres `/docker-entrypoint-initdb.d/`. That worked -# only because the production stack also mounted the migrations there, so -# the schema existed at initdb time. Once U-3 dropped the production +# What this overlay does: +# +# 1. Flips CERTCTL_AUTH_TYPE=none + CERTCTL_DEMO_MODE_ACK=true. Every +# request is served as the synthetic admin actor `actor-demo-anon`; +# the server emits a prominent ⚠ DEMO MODE WARN banner at boot with +# a production-promotion checklist (cmd/server/main.go::emitDemoBanner). +# +# 2. Flips CERTCTL_KEYGEN_MODE=server (the demo issues + holds the key on +# the server to keep the dashboard populated; production deploys must +# use the default `agent` mode where keys never leave the agent box). +# +# 3. Flips CERTCTL_DEMO_SEED=true. The server applies migrations/seed_demo.sql +# at boot via postgres.RunDemoSeed AFTER baseline migrations + seed.sql, +# pre-seeding 180 days of simulated history across 13 issuers + 8 agents. +# +# 4. Supplies the change-me-... placeholder values for POSTGRES_PASSWORD, +# CERTCTL_API_KEY, CERTCTL_CONFIG_ENCRYPTION_KEY, and CERTCTL_AGENT_ID +# so the demo runs without a deploy/.env file. 
The Bundle 2 fail-closed +# Validate() rejects these placeholders outside demo mode, so this only +# works alongside DEMO_MODE_ACK=true. +# +# U-3 history: pre-U-3 this overlay mounted seed_demo.sql into postgres +# `/docker-entrypoint-initdb.d/`. That worked only because the production +# stack also mounted the migrations there. Once U-3 dropped the production # initdb mounts (single source of truth: server runs RunMigrations + RunSeed # at boot), the demo seed could no longer be applied at initdb time — the -# tables it references wouldn't exist yet. +# tables it references wouldn't exist yet. Post-U-3 the overlay just sets +# CERTCTL_DEMO_SEED=true; the server applies seed_demo.sql at boot via +# postgres.RunDemoSeed AFTER baseline migrations + seed.sql. # -# Post-U-3 the demo overlay just sets CERTCTL_DEMO_SEED=true; the server -# applies seed_demo.sql at boot via postgres.RunDemoSeed AFTER baseline -# migrations + seed.sql are in place. Same single source of truth, no -# initdb mounts, no schema-vs-seed drift. +# Bundle 2 history: pre-Bundle-2 the base compose WAS this demo path; this +# overlay was a single-flag thin shim. Bundle 2 split the demo env vars +# out of the base so `docker compose -f deploy/docker-compose.yml up` +# (no overlay) boots production-shaped — which is what every operator +# reading the README quickstart line "drop the demo overlay for a clean +# install" expected. The overlay carries the full demo posture now. +# +# To start fresh (wipe previous data): +# docker compose -f deploy/docker-compose.yml \ +# -f deploy/docker-compose.demo.yml down -v +# docker compose -f deploy/docker-compose.yml \ +# -f deploy/docker-compose.demo.yml up -d --build + services: + postgres: + # Fixed weak password is intentional for the no-setup demo path. + # See docker-compose.yml for the production override pattern.
+ environment: + POSTGRES_PASSWORD: certctl + certctl-server: environment: + # Demo-mode auth: every request served as the synthetic + # `actor-demo-anon` admin. The server's HIGH-12 startup guard + # requires DEMO_MODE_ACK=true to allow this combination on a + # non-loopback bind; the boot-time WARN banner (cmd/server/main.go) + # reminds the operator on every start. + CERTCTL_AUTH_TYPE: none + CERTCTL_DEMO_MODE_ACK: "true" + # Server-side keygen so the demo can populate the dashboard with + # full lifecycle history. Production deploys leave this at the + # code default `agent` (CertctlAgent generates ECDSA P-256 keys + # locally and submits CSRs only). + CERTCTL_KEYGEN_MODE: server + # Demo creds — the Bundle 2 fail-closed Validate() rejects these + # sentinels outside demo mode, but DEMO_MODE_ACK=true unlocks them. + CERTCTL_CONFIG_ENCRYPTION_KEY: change-me-32-char-encryption-key + CERTCTL_AUTH_SECRET: change-me-in-production + # 180-day simulated history seed applied at boot. CERTCTL_DEMO_SEED: "true" + + certctl-agent: + environment: + # Pre-seeded by migrations/seed_demo.sql; the bundled agent + # connects with these creds and the demo-mode synthetic admin + # accepts every request regardless of API key. + CERTCTL_API_KEY: change-me-in-production + CERTCTL_AGENT_ID: agent-demo-1 diff --git a/deploy/docker-compose.yml b/deploy/docker-compose.yml index 2b1925f..4ff4193 100644 --- a/deploy/docker-compose.yml +++ b/deploy/docker-compose.yml @@ -1,3 +1,49 @@ +# ============================================================================= +# certctl base compose — PRODUCTION-SHAPED (Bundle 2, 2026-05-12) +# ============================================================================= +# +# This base file ships a SAFE-BY-DEFAULT control plane: +# +# - CERTCTL_AUTH_TYPE defaults to api-key (the code default; not overridden +# here). 
The server REFUSES to start with auth=none on a non-loopback +# bind unless CERTCTL_DEMO_MODE_ACK=true (Audit 2026-05-10 HIGH-12 + +# Bundle 2 closure: see internal/config/config.go::Validate). +# - CERTCTL_KEYGEN_MODE defaults to agent (the code default). +# - CERTCTL_DEMO_SEED defaults to false (the code default; the 180-day +# simulated history seed only runs under the demo overlay). +# - Default placeholder credentials (`change-me-...` sentinels) are NOT +# interpolated by this compose. The server REFUSES to start when those +# placeholder strings reach config (Bundle 2 fail-closed guards) unless +# DEMO_MODE_ACK=true. Operators MUST set: +# POSTGRES_PASSWORD (openssl rand -hex 32) +# CERTCTL_AUTH_SECRET (openssl rand -hex 32) +# CERTCTL_CONFIG_ENCRYPTION_KEY (openssl rand -base64 32) +# CERTCTL_API_KEY (matches CERTCTL_AUTH_SECRET or one +# of its rotation siblings) +# CERTCTL_AGENT_ID (returned from POST /api/v1/agents) +# in deploy/.env or the shell environment. See deploy/.env.example. +# +# USAGE +# ----- +# +# Production-shaped (this base alone): +# docker compose -f deploy/docker-compose.yml up -d +# +# Bundled demo (zero-config, populated dashboard, demo-mode auth): +# docker compose -f deploy/docker-compose.yml \ +# -f deploy/docker-compose.demo.yml up -d +# +# The demo overlay (docker-compose.demo.yml) layers in the demo-mode env +# vars (AUTH_TYPE=none + DEMO_MODE_ACK=true + KEYGEN_MODE=server + +# DEMO_SEED=true + the change-me placeholder creds). It exists so the +# `docker compose up` smoke + screenshot path stays one command — but it +# ALSO carries the operator-visible warning banner the server emits at +# boot when DEMO_MODE_ACK=true. +# +# Pre-Bundle-2 this base file WAS the demo path. The split happened on +# 2026-05-12; the README quickstart, deploy/ENVIRONMENTS.md, and the +# cold-DB compose smoke in .github/workflows/ci.yml were updated in the +# same commit to point at the new layout.
services: # HTTPS-Everywhere Phase 3 — self-signed TLS bootstrap (init container). # Generates a CN=certctl-server ECDSA-P256 (SHA-256 signature) cert with @@ -82,7 +128,12 @@ services: environment: POSTGRES_DB: certctl POSTGRES_USER: certctl - POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-certctl} + # Bundle 2 closure: no `:-certctl` fallback. Operators MUST set + # POSTGRES_PASSWORD in deploy/.env or the shell environment. The + # demo overlay (docker-compose.demo.yml) supplies a fixed weak + # default for screenshot/demo use; production deploys never + # depend on that fallback. + POSTGRES_PASSWORD: ${POSTGRES_PASSWORD} ports: - "5432:5432" volumes: @@ -123,34 +174,30 @@ services: # on the docker bridge network keeps sslmode=disable acceptable; for # external/managed Postgres operators MUST override CERTCTL_DATABASE_URL # with sslmode=verify-full and provide the CA bundle. See docs/database-tls.md. - CERTCTL_DATABASE_URL: ${CERTCTL_DATABASE_URL:-postgres://certctl:${POSTGRES_PASSWORD:-certctl}@postgres:5432/certctl?sslmode=disable} + CERTCTL_DATABASE_URL: ${CERTCTL_DATABASE_URL:-postgres://certctl:${POSTGRES_PASSWORD}@postgres:5432/certctl?sslmode=disable} CERTCTL_SERVER_HOST: 0.0.0.0 CERTCTL_SERVER_PORT: 8443 CERTCTL_SERVER_TLS_CERT_PATH: /etc/certctl/tls/server.crt CERTCTL_SERVER_TLS_KEY_PATH: /etc/certctl/tls/server.key CERTCTL_LOG_LEVEL: info - CERTCTL_AUTH_TYPE: none - # Audit 2026-05-10 HIGH-12 closure: when AUTH_TYPE=none AND the - # server binds to a non-loopback address (SERVER_HOST=0.0.0.0 - # above), every request is served as the synthetic actor - # `actor-demo-anon`. The server fail-fasts at startup unless - # DEMO_MODE_ACK=true acknowledges that posture. This compose IS - # the bundled demo path (see DEMO_SEED comment below), so the - # ACK is correct here. Production deploys override AUTH_TYPE + - # KEYGEN_MODE + DEMO_SEED + DEMO_MODE_ACK via their own compose. 
- CERTCTL_DEMO_MODE_ACK: "true" - CERTCTL_KEYGEN_MODE: server # Demo uses server-side keygen; production should use "agent" - CERTCTL_NETWORK_SCAN_ENABLED: "true" # Enable network scan GUI with seeded demo targets - CERTCTL_CONFIG_ENCRYPTION_KEY: ${CERTCTL_CONFIG_ENCRYPTION_KEY:-change-me-32-char-encryption-key} # AES-256-GCM for dynamic issuer/target config - # Bundle 1 follow-on: this compose IS the bundled demo path - # (CERTCTL_AUTH_TYPE=none + KEYGEN_MODE=server above), so the - # demo seed runs by default. seed_demo.sql pre-seeds the - # agent-demo-1 row that the bundled certctl-agent below needs - # to authenticate. The docker-compose.demo.yml overlay still - # works (it sets the same flag) and remains for backward - # compat. Production deploys override CERTCTL_AUTH_TYPE + - # KEYGEN_MODE + DEMO_SEED via their own compose. - CERTCTL_DEMO_SEED: "true" + # Bundle 2 closure (compose split). The base compose no longer + # sets CERTCTL_AUTH_TYPE / CERTCTL_KEYGEN_MODE / DEMO_MODE_ACK / + # DEMO_SEED — the code defaults take over (auth-type api-key, + # keygen agent, demo-mode false, demo-seed false). The demo + # overlay (docker-compose.demo.yml) is what flips this baseline + # into the populated-dashboard demo path; without that overlay + # the server boots production-shaped and refuses to start unless + # the operator has supplied CERTCTL_AUTH_SECRET + + # CERTCTL_CONFIG_ENCRYPTION_KEY. + # + # Audit 2026-05-10 HIGH-12: when AUTH_TYPE=none + DEMO_MODE_ACK=true (both set by the + # demo overlay) AND the listener binds to a non-loopback address, + # every request is served as the synthetic admin actor + # `actor-demo-anon`. The server emits a prominent boot-time WARN + # banner with a production-promotion checklist in that case.
+ CERTCTL_AUTH_SECRET: ${CERTCTL_AUTH_SECRET} + CERTCTL_NETWORK_SCAN_ENABLED: "true" # Enable network scan GUI + CERTCTL_CONFIG_ENCRYPTION_KEY: ${CERTCTL_CONFIG_ENCRYPTION_KEY} # AES-256-GCM for dynamic issuer/target config # Bootstrap token interpolation surface (Auditable Codebase Bundle # cold-DB smoke closure, 2026-05-12). Pre-fix, the `env-file + # --force-recreate certctl-server` pattern documented in @@ -214,18 +261,19 @@ services: environment: CERTCTL_SERVER_URL: https://certctl-server:8443 CERTCTL_SERVER_CA_BUNDLE_PATH: /etc/certctl/tls/ca.crt - CERTCTL_API_KEY: ${CERTCTL_API_KEY:-change-me-in-production} - # Bundle 1 follow-on: pre-Bundle-1 the bundled agent had no - # CERTCTL_AGENT_ID set, hit cmd/agent/main.go's fail-fast guard - # ("agent-id flag or CERTCTL_AGENT_ID env var is required"), and - # restart-looped silently on every fresh `docker compose up`. - # Latent since 2026-03-14 (commit d395776). seed_demo.sql now - # pre-seeds the matching agents row; the demo runs with - # CERTCTL_AUTH_TYPE=none on the server so the api_key Bearer - # token is irrelevant here. Production deploys override - # CERTCTL_AGENT_ID with the value returned from - # POST /api/v1/agents during registration. - CERTCTL_AGENT_ID: ${CERTCTL_AGENT_ID:-agent-demo-1} + # Bundle 2 closure (compose split). No placeholder fallbacks. + # Operators MUST set CERTCTL_API_KEY (matching one of the server's + # CERTCTL_AUTH_SECRET rotation values) and CERTCTL_AGENT_ID + # (returned from `POST /api/v1/agents` during agent enrollment). + # Without an agent ID, cmd/agent/main.go fails fast at startup + # with "agent-id flag or CERTCTL_AGENT_ID env var is required" — + # the cold-DB compose smoke in .github/workflows/ci.yml tolerates + # the agent restart loop because the smoke targets server boot + # only. The demo overlay (docker-compose.demo.yml) supplies a + # pre-seeded agent-demo-1 row + matching env vars so the demo + # path stays one-command. 
+ CERTCTL_API_KEY: ${CERTCTL_API_KEY} + CERTCTL_AGENT_ID: ${CERTCTL_AGENT_ID} CERTCTL_AGENT_NAME: docker-agent CERTCTL_LOG_LEVEL: info CERTCTL_DISCOVERY_DIRS: /var/lib/certctl/keys # Agent scans this directory for existing certificates diff --git a/docs/reference/architecture.md b/docs/reference/architecture.md index 572a40e..7bce55f 100644 --- a/docs/reference/architecture.md +++ b/docs/reference/architecture.md @@ -151,7 +151,12 @@ The agent runs two background loops: a heartbeat (every 60 seconds) to signal it Retired agents receive `410 Gone` on subsequent heartbeats (`service.ErrAgentRetired`). `cmd/agent` treats 410 as a terminal signal and exits cleanly so retired agents stop phoning home. Migration `000015` flipped `deployment_targets.agent_id` from `ON DELETE CASCADE` to `ON DELETE RESTRICT`, making the old hard-delete path a schema error and forcing all retirement through this contract. -**Registration is by-design pull-only (C-1 closure, cat-b-6177f36636fb).** Agents register themselves at first heartbeat via `install-agent.sh` + `cmd/agent/main.go` — never via the GUI. The `web/src/api/client.ts::registerAgent` client function is intentionally orphan in the dashboard for this reason. It's preserved in `client.ts` (rather than deleted) so future features that want to drive registration from the GUI — for example, a one-click "register proxy agent" panel for network-appliance topologies where the agent runs in a different network zone from the device it manages — can reach the endpoint without a `client.ts` edit. Operators looking to scale agent enrollment use `install-agent.sh` against a config-management system (Ansible, Salt, Puppet) or a baked-in cloud-init script, not the dashboard. 
+**Registration is a two-step operator-driven flow (C-1 closure, cat-b-6177f36636fb).** Agent enrollment is intentionally NOT auto-driven by the agent binary — the agent fail-fasts at startup if `CERTCTL_AGENT_ID` is unset (`cmd/agent/main.go`: "agent-id flag or CERTCTL_AGENT_ID env var is required"). Operators register an agent in one of two ways before starting it: + +1. **Programmatic** — `POST /api/v1/agents` with the agent's metadata payload and (when configured) an `Authorization: Bearer ` header. The response carries the `id` field; that string goes into `CERTCTL_AGENT_ID` for the agent process. Suitable for config-management (Ansible, Salt, Puppet) or cloud-init flows. +2. **GUI** — the dashboard's Agents page exposes the same endpoint via `web/src/api/client.ts::registerAgent`. The function is kept reachable rather than deleted so the eventual "register proxy agent" panel for network-appliance topologies can land without a `client.ts` edit; today the panel is not yet wired into the page. + +Once registered, the operator passes the returned ID to `install-agent.sh` via `--agent-id` (or sets the env var directly) and starts the agent. The pull-only deployment model (the server never initiates outbound connections to agents) means this asymmetric flow is by-design: only the agent's network reach matters, and registration always crosses that boundary outbound from the agent's side once the agent boots with a valid ID. ### Web Dashboard diff --git a/docs/reference/configuration.md b/docs/reference/configuration.md index c74c487..6061a53 100644 --- a/docs/reference/configuration.md +++ b/docs/reference/configuration.md @@ -80,7 +80,7 @@ For the full deploy contract see | Variable | Default | Description | |---|---|---| -| `CERTCTL_AGENT_ID` | (none — required) | The agent's unique ID, issued by `POST /api/v1/agents/register` and bundled into the agent's registration response. 
Pass via this env var when the agent runs as a systemd unit / container without the `-agent-id` CLI flag. | +| `CERTCTL_AGENT_ID` | (none — required) | The agent's unique ID, issued by `POST /api/v1/agents` (requires `CERTCTL_AGENT_BOOTSTRAP_TOKEN` when configured) and returned in the registration response body. Pass via this env var when the agent runs as a systemd unit / container without the `-agent-id` CLI flag. The bundled `install-agent.sh` does NOT auto-register — operators pre-register an agent via the REST endpoint (or the dashboard), then pass the returned ID to the script via `--agent-id`. | ## Auth (RBAC + OIDC + sessions + break-glass) diff --git a/internal/config/config.go b/internal/config/config.go index a7f0007..e6da123 100644 --- a/internal/config/config.go +++ b/internal/config/config.go @@ -2633,6 +2633,70 @@ func (c *Config) Validate() error { } } + // Bundle 2 (2026-05-12) — fail-closed startup guards for placeholder + // credentials shipped by the demo overlay (docker-compose.demo.yml). + // + // Rationale: pre-Bundle-2 the base docker-compose.yml file interpolated + // these strings as the default value when an operator didn't set + // CERTCTL_AUTH_SECRET / CERTCTL_API_KEY / CERTCTL_CONFIG_ENCRYPTION_KEY + // in deploy/.env. The result: `docker compose up` produced a working + // stack with documented "weak" credentials that nobody actually + // remembered to rotate before going to production. The Bundle 2 compose + // split moved those defaults into the demo overlay; the guards below + // catch any path that still surfaces them in a non-demo deploy (e.g. + // the .env-example was committed unedited, or a custom compose copied + // the placeholder verbatim). + // + // All three sentinels exactly match the literal strings shipped in + // deploy/docker-compose.demo.yml. The demo overlay also sets + // DemoModeAck=true, so the demo path itself is exempt and these + // strings only fail in production. 
+ const ( + placeholderAPISecret = "change-me-in-production" + placeholderEncryptionKey = "change-me-32-char-encryption-key" + ) + if !c.Auth.DemoModeAck { + // HIGH-6 closure (Audit Bundle 2): placeholder API-key secret. + if c.Auth.Type == string(AuthTypeAPIKey) && c.Auth.Secret == placeholderAPISecret { + return fmt.Errorf( + "CERTCTL_AUTH_SECRET is set to the demo placeholder %q — refuse to start. "+ + "Generate a real value with: openssl rand -base64 32. "+ + "This guard exempts demo mode (CERTCTL_DEMO_MODE_ACK=true); production "+ + "deploys MUST rotate.", + placeholderAPISecret) + } + // HIGH-6 closure (Audit Bundle 2): placeholder encryption key. + if c.Encryption.ConfigEncryptionKey == placeholderEncryptionKey { + return fmt.Errorf( + "CERTCTL_CONFIG_ENCRYPTION_KEY is set to the demo placeholder %q — refuse to start. "+ + "Generate a real value with: openssl rand -base64 32 (must be ≥ 32 bytes). "+ + "This guard exempts demo mode (CERTCTL_DEMO_MODE_ACK=true); production "+ + "deploys MUST rotate before any issuer/target credentials are encrypted at rest "+ + "with the placeholder passphrase.", + placeholderEncryptionKey) + } + // LOW-5 closure (Audit Bundle 2): CORS wildcard in non-demo mode. + // Wildcard CORS combined with credentialed cookies (the session + // auth Bundle 2 ships) is a CSRF cross-origin escalation channel + // (CWE-942 + CWE-352). The auth-exempt routes already route through + // middleware.NewCORS with the operator's allowlist; "*" in the + // allowlist short-circuits the entire defense. Demo mode is + // exempt because the demo synthetic actor has no real credentials + // worth stealing, and demo screencaps frequently want to exercise + // the dashboard from a Mermaid-rendered URL or whatever. + for _, origin := range c.CORS.AllowedOrigins { + if origin == "*" { + return fmt.Errorf( + "CERTCTL_CORS_ORIGINS contains \"*\" wildcard — refuse to start. 
" + + "Wildcard CORS combined with credentialed cookies is a cross-origin " + + "CSRF / session-theft channel (CWE-942 + CWE-352). Set a concrete " + + "allowlist (e.g. CERTCTL_CORS_ORIGINS=https://dashboard.example.com) " + + "or set CERTCTL_DEMO_MODE_ACK=true if this is a demo deploy that " + + "has no real session credentials worth defending.") + } + } + } + // Validate keygen mode validKeygenModes := map[string]bool{ "agent": true, diff --git a/internal/config/config_test.go b/internal/config/config_test.go index 34ee2a6..d4817e1 100644 --- a/internal/config/config_test.go +++ b/internal/config/config_test.go @@ -1526,3 +1526,159 @@ func TestValidate_SCEPDisabled_EmptyRAPair_Accepts(t *testing.T) { t.Errorf("Validate() = %v, want nil for SCEP disabled with empty RA pair", err) } } + +// Bundle 2 closure (2026-05-12) — fail-closed startup guards against +// placeholder credentials shipped by the demo overlay +// (deploy/docker-compose.demo.yml). The literal strings below MUST stay +// in sync with the sentinels in internal/config/config.go::Validate; the +// demo overlay also writes these exact values into its env block, so any +// drift between the three locations would silently break the closure. + +// TestValidate_Bundle2_PlaceholderAuthSecret_Refused pins the contract +// that the placeholder string "change-me-in-production" in +// CERTCTL_AUTH_SECRET hard-fails Validate() outside demo mode. 
+func TestValidate_Bundle2_PlaceholderAuthSecret_Refused(t *testing.T) { + cfg := validBaseConfigForEncryption(t) + cfg.Auth.Type = "api-key" + cfg.Auth.Secret = "change-me-in-production" + cfg.Auth.DemoModeAck = false + + err := cfg.Validate() + if err == nil { + t.Fatal("Validate() returned nil; expected refusal on placeholder CERTCTL_AUTH_SECRET") + } + for _, want := range []string{"CERTCTL_AUTH_SECRET", "change-me-in-production", "openssl rand"} { + if !strings.Contains(err.Error(), want) { + t.Errorf("Validate() error = %q; missing operator guidance substring %q", err, want) + } + } +} + +// TestValidate_Bundle2_PlaceholderAuthSecret_DemoAckExempt pins that +// the demo overlay (which sets the placeholder + DemoModeAck=true) is +// exempt — without this exemption the demo path would fail to boot. +func TestValidate_Bundle2_PlaceholderAuthSecret_DemoAckExempt(t *testing.T) { + cfg := validBaseConfigForEncryption(t) + // Demo overlay sets AUTH_TYPE=none (so the placeholder doesn't even + // hit the api-key branch), but cover the api-key + ack edge case too + // in case an operator manually flips the demo overlay's AUTH_TYPE. + cfg.Auth.Type = "api-key" + cfg.Auth.Secret = "change-me-in-production" + cfg.Auth.DemoModeAck = true + + if err := cfg.Validate(); err != nil { + t.Errorf("Validate() returned %v with DemoModeAck=true; demo path must accept placeholder secret", err) + } +} + +// TestValidate_Bundle2_PlaceholderEncryptionKey_Refused pins the +// contract that "change-me-32-char-encryption-key" hard-fails Validate() +// outside demo mode. Note: this string is exactly 32 bytes, so it +// passes the H-1 length floor; the only thing catching it is the +// Bundle 2 value-equality guard. 
+func TestValidate_Bundle2_PlaceholderEncryptionKey_Refused(t *testing.T) { + cfg := validBaseConfigForEncryption(t) + cfg.Encryption.ConfigEncryptionKey = "change-me-32-char-encryption-key" + cfg.Auth.DemoModeAck = false + + err := cfg.Validate() + if err == nil { + t.Fatal("Validate() returned nil; expected refusal on placeholder CERTCTL_CONFIG_ENCRYPTION_KEY") + } + for _, want := range []string{"CERTCTL_CONFIG_ENCRYPTION_KEY", "change-me-32-char-encryption-key", "openssl rand"} { + if !strings.Contains(err.Error(), want) { + t.Errorf("Validate() error = %q; missing operator guidance substring %q", err, want) + } + } +} + +// TestValidate_Bundle2_PlaceholderEncryptionKey_DemoAckExempt covers +// the demo overlay's posture (placeholder + DemoModeAck=true). +func TestValidate_Bundle2_PlaceholderEncryptionKey_DemoAckExempt(t *testing.T) { + cfg := validBaseConfigForEncryption(t) + cfg.Encryption.ConfigEncryptionKey = "change-me-32-char-encryption-key" + cfg.Auth.DemoModeAck = true + + if err := cfg.Validate(); err != nil { + t.Errorf("Validate() returned %v with DemoModeAck=true; demo path must accept placeholder encryption key", err) + } +} + +// TestValidate_Bundle2_RealEncryptionKey_NotMistakenForPlaceholder +// pins that a real `openssl rand -base64 32` output sails through. +// Defense against an over-broad match (e.g. accidentally rejecting any +// key starting with "change-me-"). +func TestValidate_Bundle2_RealEncryptionKey_NotMistakenForPlaceholder(t *testing.T) { + cfg := validBaseConfigForEncryption(t) + // 44-char base64 sample — same shape `openssl rand -base64 32` produces. 
+ cfg.Encryption.ConfigEncryptionKey = "Tc1hZ4n3Ph5gC8e2zR0qV6jX9mYwL1pK4wB7uE3nQ5o=" + cfg.Auth.DemoModeAck = false + + if err := cfg.Validate(); err != nil { + t.Errorf("Validate() returned %v; want nil for realistic operator key", err) + } +} + +// TestValidate_Bundle2_CORSWildcard_Refused pins the LOW-5 closure: +// CERTCTL_CORS_ORIGINS containing "*" hard-fails Validate() outside +// demo mode. Wildcard CORS + session cookies = CWE-942 + CWE-352. +func TestValidate_Bundle2_CORSWildcard_Refused(t *testing.T) { + cfg := validBaseConfigForEncryption(t) + cfg.CORS.AllowedOrigins = []string{"*"} + cfg.Auth.DemoModeAck = false + + err := cfg.Validate() + if err == nil { + t.Fatal("Validate() returned nil; expected refusal on wildcard CORS") + } + for _, want := range []string{"CERTCTL_CORS_ORIGINS", "wildcard", "CSRF"} { + if !strings.Contains(err.Error(), want) { + t.Errorf("Validate() error = %q; missing operator guidance substring %q", err, want) + } + } +} + +// TestValidate_Bundle2_CORSWildcard_DemoAckExempt covers the demo +// posture (operators frequently want unrestricted CORS for dashboard +// screencaps + curl-from-any-origin diagnostics). +func TestValidate_Bundle2_CORSWildcard_DemoAckExempt(t *testing.T) { + cfg := validBaseConfigForEncryption(t) + cfg.CORS.AllowedOrigins = []string{"*"} + cfg.Auth.DemoModeAck = true + + if err := cfg.Validate(); err != nil { + t.Errorf("Validate() returned %v with DemoModeAck=true; demo path must accept wildcard CORS", err) + } +} + +// TestValidate_Bundle2_CORSWildcard_MixedAllowlistStillRefused pins +// that "*" mixed into an otherwise-concrete allowlist still trips the +// guard. The wildcard short-circuits the entire allowlist in +// middleware.NewCORS, so leaving "*" alongside legit origins is just +// as dangerous as "*" alone. 
+func TestValidate_Bundle2_CORSWildcard_MixedAllowlistStillRefused(t *testing.T) { + cfg := validBaseConfigForEncryption(t) + cfg.CORS.AllowedOrigins = []string{"https://dashboard.example.com", "*", "https://other.example.com"} + cfg.Auth.DemoModeAck = false + + err := cfg.Validate() + if err == nil { + t.Fatal("Validate() returned nil; expected refusal on wildcard mixed into allowlist") + } + if !strings.Contains(err.Error(), "wildcard") { + t.Errorf("Validate() error = %q; want wildcard mention", err) + } +} + +// TestValidate_Bundle2_CORSConcreteAllowlist_Accepted pins that a real +// operator allowlist sails through (no false-positive on substring match +// or similar over-broad matching). +func TestValidate_Bundle2_CORSConcreteAllowlist_Accepted(t *testing.T) { + cfg := validBaseConfigForEncryption(t) + cfg.CORS.AllowedOrigins = []string{"https://dashboard.example.com", "https://admin.example.com"} + cfg.Auth.DemoModeAck = false + + if err := cfg.Validate(); err != nil { + t.Errorf("Validate() returned %v; want nil for concrete CORS allowlist", err) + } +} diff --git a/scripts/ci-guards/B2-compose-base-no-demo-env.sh b/scripts/ci-guards/B2-compose-base-no-demo-env.sh new file mode 100755 index 0000000..862bc52 --- /dev/null +++ b/scripts/ci-guards/B2-compose-base-no-demo-env.sh @@ -0,0 +1,101 @@ +#!/usr/bin/env bash +# scripts/ci-guards/B2-compose-base-no-demo-env.sh +# +# Bundle 2 closure (2026-05-12) — base compose must stay production-shaped. +# +# Pre-Bundle-2 the base file `deploy/docker-compose.yml` shipped with the +# demo-mode env vars baked in (CERTCTL_AUTH_TYPE=none + DEMO_MODE_ACK=true + +# KEYGEN_MODE=server + DEMO_SEED=true + literal change-me placeholder +# credentials). The README, ENVIRONMENTS.md, and operator intuition all +# said "drop the demo overlay for a clean install" — but dropping the +# overlay still produced a demo-shape stack because the demo posture +# lived in the base. 
The Bundle 2 closure (cowork/bundle-2-prompt.md)
+# moved every demo-mode env var out of the base into the demo overlay.
+#
+# This guard catches any future regression that would re-introduce a
+# demo-mode env var into the base file. The signals checked map 1:1 to
+# the env vars the overlay now owns:
+#
+#   CERTCTL_AUTH_TYPE: none            — demo-mode synthetic admin
+#   CERTCTL_DEMO_MODE_ACK: "true"      — the HIGH-12 bypass acknowledgment
+#   CERTCTL_KEYGEN_MODE: server        — demo-only server-side keygen
+#   CERTCTL_DEMO_SEED: "true"          — 180-day simulated history seeder
+#   change-me-in-production            — literal placeholder API secret
+#   change-me-32-char-encryption-key   — literal placeholder encryption key
+#
+# Per the contract documented in scripts/ci-guards/README.md:
+# bare callable, no args, no env, exit 0 on clean.
+
+set -e
+
+GUARD_NAME="B2-compose-base-no-demo-env"
+BASE="deploy/docker-compose.yml"
+
+if [ ! -f "$BASE" ]; then
+  echo "${GUARD_NAME}: ${BASE} not found — refuse to skip silently."
+  exit 1
+fi
+
+# The patterns below match the literal Bundle-2-overlay-owned env values
+# anywhere in the base compose file. Comments are excluded so the same
+# strings can still appear in documentation-style # comments inside the
+# file (which is exactly what we want — the base file still references
+# the overlay's name and posture in its header).
+
+# Helpers: sed -E drops full-line and trailing YAML comments; grep -E
+# matches the "key: value" shapes; grep -F matches the literal sentinels
+# (no regex meta-chars). The base compose still has narrative-comment
+# references to the overlay's posture (CERTCTL_AUTH_TYPE=none etc.), so
+# we can't grep the file raw; strip comments first.
+
+stripped=$(sed -E 's/^\s*#.*$//' "$BASE" \
+  | sed -E 's/^([^#]*)#.*$/\1/')
+
+failed=0
+
+# Pattern 1: CERTCTL_AUTH_TYPE: none (with the YAML "key: value" shape).
+if echo "$stripped" | grep -qE '^\s*CERTCTL_AUTH_TYPE\s*:\s*none\s*$'; then
+  echo "::error file=${BASE}::CERTCTL_AUTH_TYPE: none belongs in deploy/docker-compose.demo.yml (the demo overlay), not the base compose. Bundle 2 closure: the base must boot production-shaped. See cowork/bundle-2-prompt.md."
+  failed=1
+fi
+
+# Pattern 2: CERTCTL_DEMO_MODE_ACK: "true" (the HIGH-12 bypass ACK).
+if echo "$stripped" | grep -qE '^\s*CERTCTL_DEMO_MODE_ACK\s*:\s*"?true"?\s*$'; then
+  echo "::error file=${BASE}::CERTCTL_DEMO_MODE_ACK: \"true\" belongs in deploy/docker-compose.demo.yml. Setting it in the base disables the HIGH-12 fail-closed guard on every deploy that uses the base alone."
+  failed=1
+fi
+
+# Pattern 3: CERTCTL_KEYGEN_MODE: server (demo-only setting).
+if echo "$stripped" | grep -qE '^\s*CERTCTL_KEYGEN_MODE\s*:\s*server\s*$'; then
+  echo "::error file=${BASE}::CERTCTL_KEYGEN_MODE: server belongs in deploy/docker-compose.demo.yml. Production deploys must use the code default 'agent' so private keys never leave agent infrastructure."
+  failed=1
+fi
+
+# Pattern 4: CERTCTL_DEMO_SEED: "true" (180-day simulated history seeder).
+if echo "$stripped" | grep -qE '^\s*CERTCTL_DEMO_SEED\s*:\s*"?true"?\s*$'; then
+  echo "::error file=${BASE}::CERTCTL_DEMO_SEED: \"true\" belongs in deploy/docker-compose.demo.yml. The 180-day demo-history seeder must not run on the production-shaped base."
+  failed=1
+fi
+
+# Pattern 5+6: literal change-me placeholder credentials.
+# Use grep -F against the stripped (no-comment) content so the narrative
+# header in deploy/docker-compose.yml can still mention the sentinels.
+if echo "$stripped" | grep -qF 'change-me-in-production'; then
+  echo "::error file=${BASE}::literal \"change-me-in-production\" placeholder belongs in deploy/docker-compose.demo.yml. The server's Validate() now refuses to start with this placeholder outside demo mode."
+  failed=1
+fi
+
+if echo "$stripped" | grep -qF 'change-me-32-char-encryption-key'; then
+  echo "::error file=${BASE}::literal \"change-me-32-char-encryption-key\" placeholder belongs in deploy/docker-compose.demo.yml. The server's Validate() now refuses to start with this placeholder outside demo mode."
+  failed=1
+fi
+
+if [ "$failed" -ne 0 ]; then
+  echo ""
+  echo "${GUARD_NAME}: FAILED — base compose has regressed into demo-mode territory."
+  echo " Move the offending env vars into deploy/docker-compose.demo.yml and"
+  echo " re-run: bash scripts/ci-guards/${GUARD_NAME}.sh"
+  exit 1
+fi
+
+echo "${GUARD_NAME}: clean."
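
Reviewer note (not part of the patch): the guard depends on comment stripping, so a sentinel mentioned only in a `#` comment passes while the same sentinel as a live value fails. The sketch below reproduces that behaviour on two throwaway files. The names `strip_comments` and `check_sample` and the sample files are hypothetical and exist only for this illustration. It uses `[[:space:]]` rather than `\s` so it does not assume GNU sed.

```shell
#!/usr/bin/env bash
# Sketch only: strip_comments, check_sample, and the sample files are
# hypothetical illustration names, not part of the repository.
set -eu

# Drop full-line comments, then trailing "# ..." comments, in two sed
# passes (the same two-pass shape the guard uses).
strip_comments() {
  sed -E 's/^[[:space:]]*#.*$//' "$1" | sed -E 's/^([^#]*)#.*$/\1/'
}

# Print "flagged" if the placeholder survives comment stripping, else "clean".
check_sample() {
  if strip_comments "$1" | grep -qF 'change-me-in-production'; then
    echo "flagged"
  else
    echo "clean"
  fi
}

tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# Case 1: the sentinel appears only inside a comment.
printf '%s\n' \
  '# demo overlay uses change-me-in-production' \
  'CERTCTL_AUTH_SECRET: ${CERTCTL_AUTH_SECRET}' > "$tmp/commented.yml"

# Case 2: the sentinel is a live value.
printf '%s\n' \
  'CERTCTL_AUTH_SECRET: change-me-in-production' > "$tmp/live.yml"

commented_result=$(check_sample "$tmp/commented.yml")
live_result=$(check_sample "$tmp/live.yml")

echo "commented: $commented_result"   # prints "commented: clean"
echo "live: $live_result"             # prints "live: flagged"
```

One consequence of the trailing-comment pass: any value that legitimately contains `#` is truncated at that character before matching. That is harmless for the six signals checked here, but worth remembering if the guard is extended to values such as URLs with fragments.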