mirror of
https://github.com/shankar0123/certctl.git
synced 2026-06-07 18:41:30 +00:00
d030c26914
Bundle 2 closure (2026-05-12 acquisition diligence audit). Closes the
"docker compose up == accidental production" hazard: pre-Bundle-2 the
base deploy/docker-compose.yml WAS the demo path (AUTH_TYPE=none +
DEMO_MODE_ACK=true + KEYGEN_MODE=server + DEMO_SEED=true + literal
change-me-... placeholder creds), the README claimed "drop the demo
overlay for a clean install", and ENVIRONMENTS.md table documented
auth-type default as api-key — three contradictory stories layered on
the same compose file.
Source findings closed:
R2 R3 C1 D9 finding-2 S9 (repo audit)
SEC-H2 SEC-M1 SEC-M3 OPS-M3 LOW-5 HIGH-6 (cowork audit)
Compose split (deploy/docker-compose.yml + deploy/docker-compose.demo.yml):
The base now ships production-shaped — no AUTH_TYPE override, no
KEYGEN_MODE override, no DEMO_MODE_ACK, no DEMO_SEED, no literal
placeholder fallbacks. POSTGRES_PASSWORD / CERTCTL_AUTH_SECRET /
CERTCTL_CONFIG_ENCRYPTION_KEY / CERTCTL_API_KEY / CERTCTL_AGENT_ID
must come from deploy/.env (sample template in deploy/.env.example +
root .env.example). The demo overlay carries the full demo posture
(every env var + every placeholder credential) so the
`-f docker-compose.demo.yml` one-flag flip remains a zero-config
populated-dashboard path.
Fail-closed startup guards (internal/config/config.go::Validate):
Three new gates layered on the existing HIGH-12 demo-mode listen-bind
guard. All three exempt CERTCTL_DEMO_MODE_ACK=true so the demo overlay
keeps working:
• HIGH-6: AUTH_SECRET = "change-me-in-production" → refuse
• HIGH-6: CONFIG_ENCRYPTION_KEY = "change-me-32-char..." → refuse
• LOW-5: CORS_ORIGINS contains "*" (CWE-942 + CWE-352) → refuse
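
The three gates above can be pictured with a minimal Go sketch. It is an illustration only, not the code in internal/config/config.go: the Config struct, its field names, and the error strings are hypothetical, and because the encryption-key sentinel is elided in this message ("change-me-32-char..."), the sketch matches on the visible prefix rather than guessing the full string.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Config is a hypothetical, trimmed-down stand-in for the server config.
type Config struct {
	AuthSecret          string
	ConfigEncryptionKey string
	CORSOrigins         []string
	DemoModeAck         bool
}

const (
	placeholderAuthSecret = "change-me-in-production"
	// Assumption: only the visible prefix of the sentinel is matched here.
	placeholderKeyPrefix = "change-me-32-char"
)

// Validate refuses placeholder secrets and wildcard CORS unless the
// operator has acknowledged demo mode.
func Validate(c Config) error {
	if c.DemoModeAck {
		return nil // demo overlay is exempt from all three gates
	}
	if c.AuthSecret == placeholderAuthSecret {
		return errors.New("auth secret is the shipped placeholder")
	}
	if strings.HasPrefix(c.ConfigEncryptionKey, placeholderKeyPrefix) {
		return errors.New("config encryption key is the shipped placeholder")
	}
	for _, origin := range c.CORSOrigins {
		// A wildcard mixed into a concrete allowlist is still refused.
		if strings.TrimSpace(origin) == "*" {
			return errors.New("CORS origins must not contain a wildcard")
		}
	}
	return nil
}

func main() {
	bad := Config{AuthSecret: placeholderAuthSecret, ConfigEncryptionKey: "real-key"}
	fmt.Println("production posture:", Validate(bad))

	bad.DemoModeAck = true
	fmt.Println("demo posture:", Validate(bad))
}
```

Checking the demo acknowledgement first keeps the exemption in one place, so a new gate added later cannot accidentally break the demo overlay.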
Visible DEMO MODE banner (cmd/server/main.go): every boot under
DEMO_MODE_ACK=true now emits a prominent WARN line with a 6-step
production-promotion checklist. The 2026-04-19 incident (a screenshot
run that kept running for three days) drove this; the per-startup
banner makes the posture unmissable in any log scraper.
Agent enrollment doc alignment:
• docs/reference/configuration.md L83: corrected the non-existent
URL `POST /api/v1/agents/register` to the real route
`POST /api/v1/agents`; added the bootstrap-token note and the
install-agent.sh handoff sequence.
• docs/reference/architecture.md L154: replaced "agents register
themselves at first heartbeat" (false — cmd/agent/main.go fail-
fasts when CERTCTL_AGENT_ID is unset) with the actual two-step
operator-driven flow (REST or GUI registration first, returned ID
fed to install-agent.sh second).
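
The agent-side fail-fast mentioned above can be sketched as follows. This is a hedged illustration, not the code in cmd/agent/main.go: the function name resolveAgentID and the flag-before-env precedence are assumptions; only the error text is taken from the comment in the base compose file.

```go
package main

import (
	"errors"
	"fmt"
)

// Error text as quoted in the base compose comments.
var errAgentIDRequired = errors.New("agent-id flag or CERTCTL_AGENT_ID env var is required")

// resolveAgentID is a hypothetical helper: it prefers the flag value,
// falls back to the environment value, and refuses to continue when
// neither is set (no self-registration at first heartbeat).
func resolveAgentID(flagValue, envValue string) (string, error) {
	if flagValue != "" {
		return flagValue, nil
	}
	if envValue != "" {
		return envValue, nil
	}
	return "", errAgentIDRequired
}

func main() {
	if _, err := resolveAgentID("", ""); err != nil {
		fmt.Println("startup refused:", err)
	}
	// agent-demo-1 is the pre-seeded ID the demo overlay supplies.
	id, _ := resolveAgentID("", "agent-demo-1")
	fmt.Println("using agent ID:", id)
}
```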
Tests + CI guard:
• 9 new TestValidate_Bundle2_* cases in internal/config/config_test.go
covering: placeholder-secret refused + demo-ack exempt; placeholder
encryption-key refused + demo-ack exempt; real key not mistaken for
placeholder; wildcard CORS refused + demo-ack exempt; wildcard mixed
into a concrete allowlist still refused; concrete allowlist accepted.
• scripts/ci-guards/B2-compose-base-no-demo-env.sh: greps the base
compose for any of the demo-mode env vars + placeholder credentials.
Comments stripped before checking so the narrative header in the
base file can still reference the overlay's posture in prose.
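
The strip-comments-then-scan idea behind the guard can be sketched in Go. The real guard is a shell script built on grep; this version only mirrors the logic. The forbidden-token list is an assumption based on the demo posture described in this message, and the comment stripping is deliberately naive (a '#' inside a quoted value would be misread).

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// Assumed token list: demo-mode env vars plus the placeholder prefix.
var forbidden = []string{
	"CERTCTL_AUTH_TYPE",
	"CERTCTL_KEYGEN_MODE",
	"CERTCTL_DEMO_MODE_ACK",
	"CERTCTL_DEMO_SEED",
	"change-me-",
}

// stripComment drops whole-line comments and trailing " #..." comments.
func stripComment(line string) string {
	if strings.HasPrefix(strings.TrimSpace(line), "#") {
		return ""
	}
	if i := strings.Index(line, " #"); i >= 0 {
		return line[:i]
	}
	return line
}

// findViolations reports every forbidden token found outside comments.
func findViolations(compose string) []string {
	var hits []string
	sc := bufio.NewScanner(strings.NewReader(compose))
	lineNo := 0
	for sc.Scan() {
		lineNo++
		code := stripComment(sc.Text())
		for _, tok := range forbidden {
			if strings.Contains(code, tok) {
				hits = append(hits, fmt.Sprintf("line %d: %s", lineNo, tok))
			}
		}
	}
	return hits
}

func main() {
	base := "# CERTCTL_DEMO_MODE_ACK is set by the overlay only\n" +
		"environment:\n" +
		"  CERTCTL_AUTH_SECRET: ${CERTCTL_AUTH_SECRET}\n"
	fmt.Println("violations in base:", len(findViolations(base)))
}
```

Stripping comments first is what lets the narrative header of the base file mention the overlay's variables without tripping the guard.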
Cold-DB CI smoke (.github/workflows/ci.yml::cold-db-compose-smoke):
Switched to layering -f docker-compose.demo.yml on top of the base —
the new production base requires real env vars the smoke doesn't have,
and the smoke's purpose (catch migration-on-cold-DB regressions + the
bootstrap-token mint path) is orthogonal to which auth posture the
boot lands in.
Receipts:
• Current first-run truth table
    compose flag → posture
    -f docker-compose.yml (production)
        → requires .env; fail-fasts on missing AUTH_SECRET /
          CONFIG_ENCRYPTION_KEY / POSTGRES_PASSWORD; agent fail-fasts
          on missing AGENT_ID
    -f docker-compose.yml -f docker-compose.demo.yml (demo)
        → zero-config; AUTH_TYPE=none + DEMO_MODE_ACK=true +
          KEYGEN=server + DEMO_SEED=true; boot banner WARN
    -f docker-compose.yml -f docker-compose.dev.yml (dev)
        → base + PgAdmin + debug logging
    -f docker-compose.test.yml (test, standalone)
        → production-shape posture, real CA backends
• Verification (PATH=/tmp/go/bin export GO* paths to /tmp):
    gofmt -l                                                 # clean (no diffs)
    go vet ./internal/config ./cmd/server                    # clean
    go test -short -count=1 ./internal/config/...            # PASS (cumulative + all 9 new Bundle 2 cases green)
    go test -short -count=1 ./internal/connector/target/configcheck
                                                             # PASS (no regression in the Bundle 1 closure tests)
    go build ./cmd/server ./cmd/agent ./cmd/cli ./cmd/mcp-server
                                                             # clean
    bash scripts/ci-guards/B2-compose-base-no-demo-env.sh    # clean
    bash scripts/ci-guards/H-1-encryption-key-min-length.sh  # clean
    bash scripts/ci-guards/G-3-env-docs-drift.sh             # clean
Remaining operator warnings (not blocking; tracked in CLAUDE.md
"Open decisions"):
• The first `docker compose -f docker-compose.yml up -d` against a
pre-Bundle-2 .env (placeholder values still in place) will now
fail fast. This is the intended posture, but operators upgrading
from v2.0.x via .env-from-old-master need to rotate those
placeholder values before upgrading. The CHANGELOG note for the
v2.1.0 release should call this out alongside Auth Bundle 2's other
breaking changes.
Audit-Closes: BUNDLE-2 R2 R3 C1 D9 S9 SEC-H2 SEC-M1 SEC-M3 OPS-M3 LOW-5 HIGH-6
313 lines
14 KiB
YAML
# =============================================================================
# certctl base compose — PRODUCTION-SHAPED (Bundle 2, 2026-05-12)
# =============================================================================
#
# This base file ships a SAFE-BY-DEFAULT control plane:
#
#   - CERTCTL_AUTH_TYPE defaults to api-key (the code default; not overridden
#     here). The server REFUSES to start with auth=none on a non-loopback
#     bind unless CERTCTL_DEMO_MODE_ACK=true (Audit 2026-05-10 HIGH-12 +
#     Bundle 2 closure: see internal/config/config.go::Validate).
#   - CERTCTL_KEYGEN_MODE defaults to agent (the code default).
#   - CERTCTL_DEMO_SEED defaults to false (the code default; the 180-day
#     simulated history seed only runs under the demo overlay).
#   - Default placeholder credentials (`change-me-...` sentinels) are NOT
#     interpolated by this compose. The server REFUSES to start when those
#     placeholder strings reach config (Bundle 2 fail-closed guards) unless
#     DEMO_MODE_ACK=true. Operators MUST set:
#       POSTGRES_PASSWORD              (openssl rand -hex 32)
#       CERTCTL_AUTH_SECRET            (openssl rand -hex 32)
#       CERTCTL_CONFIG_ENCRYPTION_KEY  (openssl rand -base64 32)
#       CERTCTL_API_KEY                (matches CERTCTL_AUTH_SECRET or one
#                                       of its rotation siblings)
#       CERTCTL_AGENT_ID               (returned from POST /api/v1/agents)
#     in deploy/.env or the shell environment. See deploy/.env.example.
#
# USAGE
# -----
#
# Production-shaped (this base alone):
#   docker compose -f deploy/docker-compose.yml up -d
#
# Bundled demo (zero-config, populated dashboard, demo-mode auth):
#   docker compose -f deploy/docker-compose.yml \
#                  -f deploy/docker-compose.demo.yml up -d
#
# The demo overlay (docker-compose.demo.yml) layers in the demo-mode env
# vars (AUTH_TYPE=none + DEMO_MODE_ACK=true + KEYGEN_MODE=server +
# DEMO_SEED=true + the change-me placeholder creds). It exists so the
# `docker compose up` smoke + screenshot path stays one command — but it
# ALSO carries the operator-visible warning banner the server emits at
# boot when DEMO_MODE_ACK=true.
#
# Pre-Bundle-2 this base file WAS the demo path. The split happened on
# 2026-05-12; the README quickstart, deploy/ENVIRONMENTS.md, and the
# cold-DB compose smoke in .github/workflows/ci.yml were updated in the
# same commit to point at the new layout.
services:
  # HTTPS-Everywhere Phase 3 — self-signed TLS bootstrap (init container).
  # Generates a CN=certctl-server ECDSA-P256 (SHA-256 signature) cert with
  # the SAN list locked by milestone §3.6 on first boot; subsequent boots
  # see the cert already present in the `certs` named volume and no-op out.
  # Server + agent mount the volume read-only. Destroy via `docker compose
  # down -v` to force regeneration. This bootstrap is for docker-compose
  # demos and local dev only; Helm operators supply a Secret / cert-manager
  # Certificate per docs/tls.md.
  #
  # Rationale for ECDSA-P256 (was ed25519 pre-v2.0.48): Apple's TLS stack
  # — Safari Network Framework and the macOS-bundled LibreSSL 3.3.6
  # /usr/bin/curl — does not advertise ed25519 in the ClientHello
  # signature_algorithms extension for server certs, yielding "tls: peer
  # doesn't support any of the certificate's signature algorithms" at
  # handshake. ECDSA-P256 with SHA-256 is universally supported. See
  # docs/tls.md Pattern 1.
  certctl-tls-init:
    image: alpine/openssl:latest
    container_name: certctl-tls-init
    restart: "no"
    entrypoint: /bin/sh
    command:
      - -c
      - |
        set -eu
        CERT=/etc/certctl/tls/server.crt
        KEY=/etc/certctl/tls/server.key
        CA=/etc/certctl/tls/ca.crt
        if [ -f "$$CERT" ] && [ -f "$$KEY" ] && [ -f "$$CA" ]; then
          echo "TLS cert already present at $$CERT — skipping generation"
        else
          mkdir -p /etc/certctl/tls
          openssl req -x509 -newkey ec \
            -pkeyopt ec_paramgen_curve:P-256 \
            -nodes \
            -keyout "$$KEY" \
            -out "$$CERT" \
            -days 3650 \
            -subj "/CN=certctl-server" \
            -addext "subjectAltName=DNS:certctl-server,DNS:localhost,IP:127.0.0.1,IP:::1"
          cp "$$CERT" "$$CA"
          echo "Generated self-signed TLS cert for certctl-server (ECDSA-P256/SHA-256, 3650d, CN=certctl-server)"
        fi
        # certctl binary runs as UID 1000 inside the server container per
        # Dockerfile:64-65; the cert + key must be readable by that UID.
        chown 1000:1000 "$$CERT" "$$KEY" "$$CA"
        chmod 0644 "$$CERT" "$$CA"
        chmod 0600 "$$KEY"
    volumes:
      - certs:/etc/certctl/tls
    networks:
      - certctl-network

  # PostgreSQL database
  #
  # U-3 (P1, cat-u-seed_initdb_schema_drift, GitHub #10):
  # Pre-U-3 this stack mounted a hand-curated subset of `migrations/*.up.sql`
  # plus `seed.sql` into `/docker-entrypoint-initdb.d/`, and postgres
  # initdb-applied them on first boot. The mount list rotted every time a
  # new migration shipped that the seed depended on (000013 added
  # policy_rules.severity, 000017 renames retry_interval_minutes, etc.) —
  # initdb crashed, the container reported `unhealthy` indefinitely, and
  # `docker compose -f deploy/docker-compose.yml up -d --build` from a
  # fresh clone of v2.0.50 hit it on the first try.
  #
  # Post-U-3 the schema is built EXCLUSIVELY by the server at startup via
  # internal/repository/postgres.RunMigrations + RunSeed. Single source of
  # truth, no list to keep in sync. Postgres comes up empty; the server
  # waits for it healthy, then applies the full migration ladder + seed in
  # one shot. Helm + the dev examples were already runtime-only (Path B)
  # and worked through the same window.
  #
  # `start_period: 30s` gives postgres room to bootstrap on slow runners
  # (CI macOS, low-spec laptops) before the healthcheck failure counter
  # starts ticking. Pre-U-3 a slow first-init combined with the
  # `unhealthy` flap to cascade into certctl-server's `service_healthy`
  # depends_on, blocking the whole stack.
  postgres:
    image: postgres:16-alpine
    container_name: certctl-postgres
    environment:
      POSTGRES_DB: certctl
      POSTGRES_USER: certctl
      # Bundle 2 closure: no `:-certctl` fallback. Operators MUST set
      # POSTGRES_PASSWORD in deploy/.env or the shell environment. The
      # demo overlay (docker-compose.demo.yml) supplies a fixed weak
      # default for screenshot/demo use; production deploys never
      # depend on that fallback.
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    ports:
      - "5432:5432"
    volumes:
      - postgres_data:/var/lib/postgresql/data
    networks:
      - certctl-network
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U certctl -d certctl"]
      interval: 5s
      timeout: 5s
      retries: 5
      start_period: 30s
    restart: unless-stopped

  # Certctl Server (API + scheduler)
  certctl-server:
    build:
      context: ..
      dockerfile: Dockerfile
      # Proxy propagation (M-4, Issue #9) — forwards host shell's proxy env
      # vars into the Docker build so the Node frontend stage and Go module
      # download can reach the public registries behind corporate proxies.
      # Defaults to empty; omit the variables from the host environment for
      # un-proxied builds and the behaviour is byte-identical to the pre-fix
      # tree.
      args:
        HTTP_PROXY: ${HTTP_PROXY:-}
        HTTPS_PROXY: ${HTTPS_PROXY:-}
        NO_PROXY: ${NO_PROXY:-}
    container_name: certctl-server
    depends_on:
      postgres:
        condition: service_healthy
      certctl-tls-init:
        condition: service_completed_successfully
    environment:
      # Bundle B / Audit M-018 (PCI-DSS Req 4 / CWE-319): in-cluster Postgres
      # on the docker bridge network keeps sslmode=disable acceptable; for
      # external/managed Postgres operators MUST override CERTCTL_DATABASE_URL
      # with sslmode=verify-full and provide the CA bundle. See docs/database-tls.md.
      CERTCTL_DATABASE_URL: ${CERTCTL_DATABASE_URL:-postgres://certctl:${POSTGRES_PASSWORD}@postgres:5432/certctl?sslmode=disable}
      CERTCTL_SERVER_HOST: 0.0.0.0
      CERTCTL_SERVER_PORT: 8443
      CERTCTL_SERVER_TLS_CERT_PATH: /etc/certctl/tls/server.crt
      CERTCTL_SERVER_TLS_KEY_PATH: /etc/certctl/tls/server.key
      CERTCTL_LOG_LEVEL: info
      # Bundle 2 closure (compose split). The base compose no longer
      # sets CERTCTL_AUTH_TYPE / CERTCTL_KEYGEN_MODE / DEMO_MODE_ACK /
      # DEMO_SEED — the code defaults take over (auth-type api-key,
      # keygen agent, demo-mode false, demo-seed false). The demo
      # overlay (docker-compose.demo.yml) is what flips this baseline
      # into the populated-dashboard demo path; without that overlay
      # the server boots production-shaped and refuses to start unless
      # the operator has supplied CERTCTL_AUTH_SECRET +
      # CERTCTL_CONFIG_ENCRYPTION_KEY.
      #
      # Audit 2026-05-10 HIGH-12: when DEMO_MODE_ACK=true (set by the
      # demo overlay) AND the listener binds to a non-loopback address,
      # every request is served as the synthetic admin actor
      # `actor-demo-anon`. The server emits a prominent boot-time WARN
      # banner with a production-promotion checklist in that case.
      CERTCTL_AUTH_SECRET: ${CERTCTL_AUTH_SECRET}
      CERTCTL_NETWORK_SCAN_ENABLED: "true"  # Enable network scan GUI
      CERTCTL_CONFIG_ENCRYPTION_KEY: ${CERTCTL_CONFIG_ENCRYPTION_KEY}  # AES-256-GCM for dynamic issuer/target config
      # Bootstrap token interpolation surface (Auditable Codebase Bundle
      # cold-DB smoke closure, 2026-05-12). Pre-fix, the `env-file +
      # --force-recreate certctl-server` pattern documented in
      # cowork/manual-testing-bundle-2.html (and used by the cold-DB
      # smoke job in .github/workflows/ci.yml::cold-db-compose-smoke)
      # set CERTCTL_BOOTSTRAP_TOKEN in compose's own interpolation
      # environment but the container never received it because this
      # block didn't reference the variable. Wiring it as an explicit
      # interpolation (default empty) makes the documented manual flow
      # actually work end-to-end. Empty value = bootstrap strategy
      # disabled (server returns 410 Gone on POST /api/v1/auth/bootstrap),
      # which is the safe default — only set the var when you intend to
      # mint a day-0 admin via the bootstrap path.
      CERTCTL_BOOTSTRAP_TOKEN: ${CERTCTL_BOOTSTRAP_TOKEN:-}
    ports:
      - "8443:8443"
    volumes:
      - certs:/etc/certctl/tls:ro
    networks:
      - certctl-network
    healthcheck:
      test: ["CMD", "curl", "--cacert", "/etc/certctl/tls/ca.crt", "-f", "https://localhost:8443/health"]
      interval: 10s
      timeout: 5s
      retries: 5
      # U-3: server boot now does RunMigrations + RunSeed before listening on
      # 8443. On a fresh clone the full migration ladder + seed application
      # can take ~10s on a small VM; start_period prevents the first few
      # healthcheck attempts from counting as failures while that work runs.
      start_period: 30s
    restart: unless-stopped
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"
    deploy:
      resources:
        limits:
          cpus: '1.0'
          memory: 512M

  # Certctl Agent
  certctl-agent:
    build:
      context: ..
      dockerfile: Dockerfile.agent
      # Proxy propagation (M-4, Issue #9) — forwards host shell's proxy env
      # vars into the Docker build so the Go module download stage can reach
      # the public Go module proxy behind corporate proxies. Defaults to
      # empty; omit the variables from the host environment for un-proxied
      # builds and the behaviour is byte-identical to the pre-fix tree.
      args:
        HTTP_PROXY: ${HTTP_PROXY:-}
        HTTPS_PROXY: ${HTTPS_PROXY:-}
        NO_PROXY: ${NO_PROXY:-}
    container_name: certctl-agent
    depends_on:
      certctl-server:
        condition: service_healthy
    environment:
      CERTCTL_SERVER_URL: https://certctl-server:8443
      CERTCTL_SERVER_CA_BUNDLE_PATH: /etc/certctl/tls/ca.crt
      # Bundle 2 closure (compose split). No placeholder fallbacks.
      # Operators MUST set CERTCTL_API_KEY (matching one of the server's
      # CERTCTL_AUTH_SECRET rotation values) and CERTCTL_AGENT_ID
      # (returned from `POST /api/v1/agents` during agent enrollment).
      # Without an agent ID, cmd/agent/main.go fails fast at startup
      # with "agent-id flag or CERTCTL_AGENT_ID env var is required" —
      # the cold-DB compose smoke in .github/workflows/ci.yml tolerates
      # the agent restart loop because the smoke targets server boot
      # only. The demo overlay (docker-compose.demo.yml) supplies a
      # pre-seeded agent-demo-1 row + matching env vars so the demo
      # path stays one-command.
      CERTCTL_API_KEY: ${CERTCTL_API_KEY}
      CERTCTL_AGENT_ID: ${CERTCTL_AGENT_ID}
      CERTCTL_AGENT_NAME: docker-agent
      CERTCTL_LOG_LEVEL: info
      CERTCTL_DISCOVERY_DIRS: /var/lib/certctl/keys  # Agent scans this directory for existing certificates
    volumes:
      - agent_keys:/var/lib/certctl/keys
      - certs:/etc/certctl/tls:ro
    networks:
      - certctl-network
    healthcheck:
      test: ["CMD-SHELL", "pgrep -f certctl-agent || exit 1"]
      interval: 30s
      timeout: 5s
      retries: 3
    restart: unless-stopped
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"
    deploy:
      resources:
        limits:
          cpus: '0.5'
          memory: 256M

networks:
  certctl-network:
    driver: bridge

volumes:
  postgres_data:
    driver: local
  agent_keys:
    driver: local
  certs:
    driver: local