shankar0123 af47d19ae2 fix(deploy,examples,env): close U-1 trap end-to-end across Helm, examples, and root env
Follow-up to cfc234e (U-1 docker-compose fix) — closes the remaining adjacent
code paths that share the postgres-first-boot-password-binding root cause but
were scoped out of the original commit.

The runtime diagnostic in internal/repository/postgres/db.go::wrapPingError
(landed in a911970) already covers every NewDB call site, so Helm operators
and example users hit the SQLSTATE 28P01 guidance for free at startup. What
was missing: deployment-shape-specific remediation guidance (kubectl vs
docker-compose), the hardcoded password in the *root* .env.example, and
shared ops notes for the 5 examples/ compose files. This commit closes all
three.

Files changed:

- .env.example (root) — line 16 had `postgres://certctl:certctl@...` with
  the password hardcoded literally instead of interpolating POSTGRES_PASSWORD.
  If a user copied this file as their .env (binary-direct deployment,
  not docker-compose) and rotated POSTGRES_PASSWORD on line 10, the URL on
  line 16 still carried 'certctl' — silent two-line drift. Replaced 'certctl'
  with the same default that line 10 carries ('change-me-in-production') and
  added an explanatory comment block describing the docker-compose
  override semantics, when this URL matters (binary-direct), and the
  cross-reference to the U-1 wrapPingError diagnostic. Also fixed an
  adjacent bug: line 31 CERTCTL_SERVER_URL was `http://localhost:8443`,
  which agents reject at startup since v2.2 (HTTPS-everywhere milestone made
  the control plane HTTPS-only with TLS 1.3 pinned). Updated to https://
  with a comment pointing operators at the bootstrap CA bundle.

- deploy/helm/certctl/values.yaml — postgresql.auth.password field had a
  one-line 'REQUIRED' comment. Expanded into a full WARNING block (~25
  lines) explaining the PVC retention semantics, the failure symptom,
  and both kubectl-flavored remediation paths: non-destructive
  (`kubectl exec ... ALTER ROLE`) preferred for environments with data,
  and destructive (`helm uninstall + kubectl delete pvc`) for dev/demo.
  Cross-references the wrapPingError runtime diagnostic.

- deploy/helm/certctl/README.md (new, ~115 lines) — chart-level operational
  guide. Covers quick install, both remediation paths with concrete
  kubectl commands, why-we-don't-fix-this-in-the-chart explanation,
  cross-references to the docker-compose docs, server API key rotation
  (the easy case — comma-separated key list), TLS provisioning shapes,
  embedded-vs-external postgres, and uninstall semantics with the PVC
  retention gotcha called out.

- examples/README.md (new, ~55 lines) — shared operational notes for the
  5 example deployments. Covers the postgres password rotation trap with
  example-flavored remediation paths (`docker compose -f examples/<x>/...`),
  the TLS warning, and teardown semantics. Replaces what would otherwise
  be 5x duplication across per-example READMEs.

- examples/{acme-nginx,acme-wildcard-dns01,multi-issuer,private-ca-traefik,
  step-ca-haproxy}/*.md — one-line cross-reference at the top of each
  example's primary doc, pointing at examples/README.md for the shared
  ops notes. Avoids 5x duplication of the same warning text while still
  surfacing the link in every operator's first-touch surface.

Verification:

- go build ./... — clean
- go vet ./... — clean
- go test -short ./internal/repository/postgres/ — 4/4 wrapPingError tests
  still passing (no production-code touch in this commit)
- helm lint deploy/helm/certctl/ — clean (1 INFO about chart icon, pre-existing)
- helm template smoke test — renders without error
- python3 yaml.safe_load on values.yaml — parses

Refs: coverage-gap-audit-2026-04-24-v5/unified-audit.md
      §2 P1 cluster, cat-u-quickstart_postgres_password_volume_trap
      Closes the three deliberate scope-outs from cfc234e (Helm,
      root .env.example, examples/) end-to-end.

      Adjacent bugs caught while in scope:
      - root .env.example:16 hardcoded password not matching line 10
      - root .env.example:31 http:// URL incompatible with HTTPS-only v2.2
2026-04-24 23:51:13 +00:00


Deployment Examples

Five turnkey docker-compose scenarios that show certctl deployed against real CA backends and target shapes. Each subdirectory is self-contained — pick the one closest to your stack and have it running in minutes.

| Example | Stack | What it shows |
|---|---|---|
| acme-nginx/ | Let's Encrypt + NGINX (HTTP-01) | The default public-CA path: ACME-issued certs deployed to NGINX. |
| acme-wildcard-dns01/ | Let's Encrypt wildcard (DNS-01) | Wildcard certificates via DNS-01 with pluggable DNS hooks. |
| private-ca-traefik/ | Local CA + Traefik | Internal-only certs from a private CA, deployed to Traefik. |
| step-ca-haproxy/ | Smallstep step-ca + HAProxy | Self-hosted CA with HAProxy as the deployment target. |
| multi-issuer/ | Let's Encrypt + Local CA | Public + private certs side-by-side from a single dashboard. |

Common operational notes

These notes apply to every example. They're called out here so the per-example walkthroughs stay focused on the issuer/target wiring instead of repeating ops boilerplate.

Postgres password rotation — first-boot binding trap (U-1)

Every example compose file sets the postgres password from ${DB_PASSWORD:-certctl-dev-password} and persists the data directory in a named volume. The postgres:16-alpine image runs initdb exactly once, when /var/lib/postgresql/data is empty, and that is the only time POSTGRES_PASSWORD is written into pg_authid. If you boot once with the default and then change DB_PASSWORD (in your shell, in a .env file, or in a wrapper script), the certctl-server container picks up the new value but the postgres container keeps authenticating against the old one. The server then fails its startup db.Ping() with pq: password authentication failed for user "certctl" (SQLSTATE 28P01).
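The drift described above can be reproduced without Docker, because Compose's ${VAR:-default} interpolation follows the same rule as POSIX shell parameter expansion (the default applies when the variable is unset or empty). The sketch below is illustrative only: resolve_db_password is a hypothetical helper, and the two "boots" are simulated by changing the variable between calls.

```shell
# Hypothetical helper that mirrors the ${DB_PASSWORD:-certctl-dev-password}
# interpolation used by the example compose files.
resolve_db_password() {
  printf '%s\n' "${DB_PASSWORD:-certctl-dev-password}"
}

# Simulated first boot: DB_PASSWORD is unset, so the default is what
# initdb would store in pg_authid.
unset DB_PASSWORD
first_boot=$(resolve_db_password)

# Simulated later boot: the operator has changed DB_PASSWORD, so the
# server would now send a different value than postgres has stored.
DB_PASSWORD='rotated-secret'
second_boot=$(resolve_db_password)

if [ "$first_boot" != "$second_boot" ]; then
  echo "mismatch: postgres still expects the first-boot password"
fi
```

The volume is what makes the first value sticky: as long as the data directory is non-empty, the value resolved on later boots only reaches the server side.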

The certctl-server emits guidance pointing at the fix when this fires (see internal/repository/postgres/db.go::wrapPingError). The two remediation paths:

  • Destructive — wipes all certctl data, only acceptable on demo/test setups:
    docker compose -f examples/<example>/docker-compose.yml down -v
    docker compose -f examples/<example>/docker-compose.yml up -d --build
    
  • Non-destructive — preserves data, rotates pg_authid in place:
    docker compose -f examples/<example>/docker-compose.yml exec postgres \
      psql -U certctl -c "ALTER ROLE certctl PASSWORD '<new>';"
    # Then redeploy with DB_PASSWORD set to <new> in your shell or .env
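If you script the non-destructive path, one detail worth handling is a new password that contains a single quote, which would otherwise break the quoted SQL literal. The following is a minimal sketch, not part of certctl: alter_role_sql and rotate_db_password are hypothetical helper names, and the quote-doubling assumes PostgreSQL's default standard_conforming_strings setting.

```shell
# Hypothetical helper: build the ALTER ROLE statement, doubling any single
# quotes so the password stays inside one SQL string literal.
alter_role_sql() {
  escaped=$(printf '%s' "$1" | sed "s/'/''/g")
  printf "ALTER ROLE certctl PASSWORD '%s';" "$escaped"
}

# Hypothetical wrapper around the command shown above.
# $1 is the example directory name, $2 is the new password.
rotate_db_password() {
  example=$1
  new_password=$2
  docker compose -f "examples/$example/docker-compose.yml" exec postgres \
    psql -U certctl -c "$(alter_role_sql "$new_password")"
}

# Usage (then redeploy with DB_PASSWORD set to the same value):
#   rotate_db_password acme-nginx "$NEW_DB_PASSWORD"
```

Passing the password as an argument keeps it out of the script itself, although it may still be visible in your shell history or process list.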
    

The cleanest practice for a fresh demo: set DB_PASSWORD once in your shell before the very first docker compose up, and don't change it during the demo's lifetime. If you must rotate, use the non-destructive path.
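One way to follow the "set it once" advice is to pin the value in an env file the first time and never regenerate it. The sketch below assumes a hypothetical helper named ensure_db_password and an env file path of your choosing; where Compose looks for a .env file depends on how you invoke it, so passing the file explicitly with --env-file is the least ambiguous option.

```shell
# Hypothetical helper: write DB_PASSWORD to an env file only if it is not
# already pinned there, so repeated runs never change the value.
ensure_db_password() {
  env_file=${1:-.env}
  if [ -f "$env_file" ] && grep -q '^DB_PASSWORD=' "$env_file"; then
    return 0  # already pinned; leave it alone
  fi
  # 24 random bytes rendered as 48 hex characters.
  secret=$(head -c 24 /dev/urandom | od -An -tx1 | tr -d ' \n')
  printf 'DB_PASSWORD=%s\n' "$secret" >> "$env_file"
}

# Usage before the very first boot of an example:
#   ensure_db_password examples/<example>/.env
#   docker compose --env-file examples/<example>/.env \
#     -f examples/<example>/docker-compose.yml up -d --build
```

Because the helper is idempotent, it is safe to call from a wrapper script on every start without re-triggering the trap.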

The same root cause and remediation pattern are documented for the canonical quickstart in ../docs/quickstart.md, the production compose surface in ../deploy/ENVIRONMENTS.md, and the Helm chart in ../deploy/helm/certctl/README.md.

TLS for the certctl control plane

Every example boots certctl with HTTPS-only on port 8443 (TLS 1.3 pinned, no plaintext listener as of v2.2). The shipped certctl-tls-init init container generates a self-signed ECDSA-P256 cert on first boot — fine for the example demos, never acceptable for a public deployment. For production, swap the init container for cert-manager, an operator-supplied Secret, or your internal CA — see ../docs/tls.md for the full pattern matrix.
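To check the TLS 1.3 pinning from the host once an example is running, a handshake probe with openssl s_client is usually enough. This is a rough sketch under assumptions: probe_tls is a hypothetical helper, it requires an OpenSSL build that supports -tls1_3 (1.1.1 or later), and it only tests protocol negotiation, not whether the self-signed certificate is trusted.

```shell
# Hypothetical helper: attempts a handshake at the requested protocol
# version; s_client should exit non-zero if the connection or handshake
# fails. Certificate trust is deliberately not checked here.
probe_tls() {
  version_flag=$1                 # for example -tls1_3 or -tls1_2
  host_port=${2:-localhost:8443}  # port taken from the text above
  openssl s_client -connect "$host_port" "$version_flag" \
    </dev/null >/dev/null 2>&1
}

# Usage once an example is up:
#   probe_tls -tls1_3 && echo "TLS 1.3 accepted"
#   probe_tls -tls1_2 || echo "TLS 1.2 refused"
```

If the pinning works as described, the first probe should succeed and the second should fail.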

Tearing down

To stop services but keep the postgres volume (so you can pick up where you left off):

docker compose -f examples/<example>/docker-compose.yml down

To stop services and wipe all data (clean slate for the next run):

docker compose -f examples/<example>/docker-compose.yml down -v

Note that down -v is the only canonical way to recover from the postgres-password trap when the non-destructive ALTER ROLE route is unavailable (e.g., you've forgotten the original password).