shankar0123 af47d19ae2 fix(deploy,examples,env): close U-1 trap end-to-end across Helm, examples, and root env
Follow-up to cfc234e (U-1 docker-compose fix) — closes the remaining adjacent
code paths that share the postgres-first-boot-password-binding root cause but
were scoped out of the original commit.

The runtime diagnostic in internal/repository/postgres/db.go::wrapPingError
(landed in a911970) already covers every NewDB call site, so Helm operators
and example users hit the SQLSTATE 28P01 guidance for free at startup. What
was missing: deployment-shape-specific remediation guidance (kubectl vs
docker-compose), the hardcoded password in the *root* .env.example, and
shared ops notes for the 5 examples/ compose files. This commit closes all
three.

Files changed:

- .env.example (root) — line 16 had `postgres://certctl:certctl@...` with
  the password hardcoded literally instead of interpolating POSTGRES_PASSWORD.
  If a user copied this file as their .env (binary-direct deployment,
  not docker-compose) and rotated POSTGRES_PASSWORD on line 10, the URL on
  line 16 still carried 'certctl' — silent two-line drift. Replaced 'certctl'
  with the same default that line 10 carries ('change-me-in-production') and
  added an explanatory comment block describing the docker-compose
  override semantics, when this URL matters (binary-direct), and the
  cross-reference to the U-1 wrapPingError diagnostic. Also fixed an
  adjacent bug: line 31 CERTCTL_SERVER_URL was `http://localhost:8443`,
  which agents reject at startup since v2.2 (HTTPS-everywhere milestone made
  the control plane HTTPS-only with TLS 1.3 pinned). Updated to https://
  with a comment pointing operators at the bootstrap CA bundle.

- deploy/helm/certctl/values.yaml — postgresql.auth.password field had a
  one-line 'REQUIRED' comment. Expanded into a full WARNING block (~25
  lines) explaining the PVC retention semantics, the failure symptom,
  and both kubectl-flavored remediation paths: non-destructive
  (`kubectl exec ... ALTER ROLE`) preferred for environments with data,
  and destructive (`helm uninstall + kubectl delete pvc`) for dev/demo.
  Cross-references the wrapPingError runtime diagnostic.

- deploy/helm/certctl/README.md (new, ~115 lines) — chart-level operational
  guide. Covers quick install, both remediation paths with concrete
  kubectl commands, why-we-don't-fix-this-in-the-chart explanation,
  cross-references to the docker-compose docs, server API key rotation
  (the easy case — comma-separated key list), TLS provisioning shapes,
  embedded-vs-external postgres, and uninstall semantics with the PVC
  retention gotcha called out.

- examples/README.md (new, ~55 lines) — shared operational notes for the
  5 example deployments. Covers the postgres password rotation trap with
  example-flavored remediation paths (`docker compose -f examples/<x>/...`),
  the TLS warning, and teardown semantics. Replaces what would otherwise
  be 5x duplication across per-example READMEs.

- examples/{acme-nginx,acme-wildcard-dns01,multi-issuer,private-ca-traefik,
  step-ca-haproxy}/*.md — one-line cross-reference at the top of each
  example's primary doc, pointing at examples/README.md for the shared
  ops notes. Avoids 5x duplication of the same warning text while still
  surfacing the link in every operator's first-touch surface.

Verification:

- go build ./... — clean
- go vet ./... — clean
- go test -short ./internal/repository/postgres/ — 4/4 wrapPingError tests
  still passing (no production-code touch in this commit)
- helm lint deploy/helm/certctl/ — clean (1 INFO about chart icon, pre-existing)
- helm template smoke test — renders without error
- python3 yaml.safe_load on values.yaml — parses

Refs: coverage-gap-audit-2026-04-24-v5/unified-audit.md
      §2 P1 cluster, cat-u-quickstart_postgres_password_volume_trap
      Closes the three deliberate scope-outs from cfc234e (Helm,
      root .env.example, examples/) end-to-end.

      Adjacent bugs caught while in scope:
      - root .env.example:16 hardcoded password not matching line 10
      - root .env.example:31 http:// URL incompatible with HTTPS-only v2.2
2026-04-24 23:51:13 +00:00


Private CA + Traefik Example

Operational notes shared by every example (postgres password rotation trap, TLS provisioning, teardown semantics) live in ../README.md. Read it first if you plan to change DB_PASSWORD after the initial docker compose up — the postgres volume binds the password on first boot only.

This example demonstrates certctl managing certificates for internal services without depending on a public CA. It is ideal for enterprise environments where:

  • All services are internal (VPN, private networks)
  • You need unified certificate lifecycle management across multiple internal apps
  • You want automatic cert deployment to your reverse proxy
  • You may have an existing enterprise root CA (ADCS, OpenCA, etc.)

What's Included

  • certctl server with Local CA issuer (self-signed or sub-CA mode)
  • certctl agent that deploys certificates to Traefik
  • Traefik reverse proxy with file provider for dynamic cert discovery
  • PostgreSQL database for certificate storage and audit trail
  • Automatic certificate discovery for existing certs in Traefik

Architecture

flowchart TD
    A["certctl-server<br/>(control plane)<br/>(Local CA issuer)"]
    B["certctl-agent<br/>(certificate deployer)"]
    C["Traefik<br/>(watches cert directory)"]
    D["[Internal Services]"]

    A -->|REST API<br/>job polling| B
    B -->|Write cert/key files| C
    C -->|TLS handshakes| D

TLS Security

certctl is HTTPS-only as of v2.2. The demo compose stack provisions a self-signed certificate, so clients do not trust it by default. When accessing https://localhost:8443, choose one of the following:

  • Use curl --cacert ./deploy/test/certs/ca.crt ... to pin the CA certificate
  • Use curl -k ... for quick smoke tests (never in production)
  • Import the CA at ./deploy/test/certs/ca.crt into your OS trust store for browser visits
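
To avoid repeating the CA flag on every call, the first option can be wrapped in a small helper. This is a minimal sketch, not part of certctl: the function names `certctl_ca_path` and `certctl_curl` and the `CERTCTL_CA_BUNDLE` variable are hypothetical, and the default path is simply the one named in the list above.

```shell
# Hypothetical helper: print the CA bundle path if it is readable.
# CERTCTL_CA_BUNDLE is an assumed variable name, not a certctl setting.
certctl_ca_path() {
  local ca="${CERTCTL_CA_BUNDLE:-./deploy/test/certs/ca.crt}"
  if [ -r "$ca" ]; then
    printf '%s\n' "$ca"
  else
    # Fail loudly instead of silently falling back to `curl -k`.
    echo "CA bundle not readable at $ca" >&2
    return 1
  fi
}

# Hypothetical wrapper: run curl with the demo CA pinned.
certctl_curl() {
  local ca
  ca="$(certctl_ca_path)" || return 1
  curl --cacert "$ca" "$@"
}
```

With the helper sourced, `certctl_curl https://localhost:8443/api/v1/certificates` behaves like the `--cacert` form above and refuses to run when the CA file is missing.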

Quick Start (Self-Signed CA)

The simplest way to get running in 2 minutes:

# 1. Create directory structure
mkdir -p traefik-config ca-certs

# 2. Create a minimal Traefik dynamic config
cat > traefik-config/default.yaml << 'EOF'
# Traefik will auto-load certificates from /etc/traefik/certs
# Certctl deploys {cert-id}.crt and {cert-id}.key files here
http:
  routers:
    api:
      rule: "Host(`api.internal.local`)"
      service: api-service
      tls: {}
  services:
    api-service:
      loadBalancer:
        servers:
          - url: "http://localhost:3000"
EOF

# 3. Start the stack
docker compose up -d

# 4. Access the dashboards
# - certctl: https://localhost:8443 (API only, use the CLI or direct HTTPS calls)
# - Traefik dashboard: http://localhost:8080

The self-signed CA will be automatically generated on first startup.

Using Sub-CA Mode (Enterprise Root CA)

If you have an existing enterprise CA (ADCS, OpenCA, etc.) and want issued certs to chain to your root:

# 1. Create directory structure
mkdir -p traefik-config ca-certs

# 2. Copy your enterprise CA cert and key
cp /path/to/your/enterprise-ca.crt ca-certs/ca-cert.pem
cp /path/to/your/enterprise-ca-key.pem ca-certs/ca-key.pem

# 3. Edit docker-compose.yml and uncomment the sub-CA env vars:
#    CERTCTL_CA_CERT_PATH: /etc/certctl/ca-cert.pem
#    CERTCTL_CA_KEY_PATH: /etc/certctl/ca-key.pem

# 4. Create the dynamic config (same as above; traefik-config/ already exists from step 1)
cat > traefik-config/default.yaml << 'EOF'
http:
  routers:
    api:
      rule: "Host(`api.internal.local`)"
      service: api-service
      tls: {}
  services:
    api-service:
      loadBalancer:
        servers:
          - url: "http://localhost:3000"
EOF

# 5. Start the stack
docker compose up -d

Requirements for sub-CA mode:

  • CA certificate must have X509v3 Basic Constraints: CA:TRUE
  • CA certificate must have X509v3 Key Usage: Certificate Sign
  • Key format: RSA, ECDSA, or PKCS#8
  • Paths: must be absolute paths to the mounted files inside the container (for example /etc/certctl/ca-cert.pem)
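
The first two requirements can be checked before starting the stack. The sketch below is one way to do it, assuming `openssl` is installed on the host; the function names `ca_text_meets_requirements` and `check_ca_cert` are hypothetical helpers, not certctl commands. It only inspects the text that `openssl x509 -text` prints, so it is a quick sanity check rather than full chain validation.

```shell
# Reads `openssl x509 -text` output on stdin and checks for the two
# extension values listed above. Hypothetical helper, not part of certctl.
ca_text_meets_requirements() {
  local text
  text="$(cat)"
  if ! printf '%s\n' "$text" | grep -q 'CA:TRUE'; then
    echo "missing Basic Constraints CA:TRUE" >&2
    return 1
  fi
  if ! printf '%s\n' "$text" | grep -q 'Certificate Sign'; then
    echo "missing Key Usage Certificate Sign" >&2
    return 1
  fi
}

# Convenience wrapper: run the check against a PEM file.
# If openssl cannot read the file, the text is empty and the check fails.
check_ca_cert() {
  openssl x509 -in "$1" -noout -text | ca_text_meets_requirements
}
```

For example, `check_ca_cert ca-certs/ca-cert.pem && echo "CA certificate looks usable"` prints the message only when both extensions are present.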

Creating a Certificate

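Once the stack is running, create the resources below. The demo certificate is self-signed, so add --cacert ./deploy/test/certs/ca.crt (or -k for a quick smoke test) to each curl call, as described under TLS Security: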

# 1. Create a certificate profile in certctl (defines allowed key types, TTL, etc.)
curl -X POST https://localhost:8443/api/v1/profiles \
  -H "Content-Type: application/json" \
  -d '{
    "id": "prof-internal",
    "name": "Internal Services",
    "description": "For internal APIs and web apps",
    "max_ttl_hours": 8760,
    "key_types": ["rsa-2048", "ecdsa-p256"]
  }'

# 2. Create a renewal policy (defines issuer, renewal thresholds, etc.)
curl -X POST https://localhost:8443/api/v1/policies \
  -H "Content-Type: application/json" \
  -d '{
    "id": "pol-internal",
    "name": "Internal Renewal Policy",
    "issuer_id": "iss-local",
    "profile_id": "prof-internal",
    "renewal_threshold_days": 30,
    "alert_thresholds_days": [30, 14, 7, 0]
  }'

# 3. Create a certificate (triggers issuance immediately)
curl -X POST https://localhost:8443/api/v1/certificates \
  -H "Content-Type: application/json" \
  -d '{
    "common_name": "api.internal.local",
    "sans": ["app.internal.local", "www.internal.local"],
    "policy_id": "pol-internal"
  }'

# 4. Create a Traefik target (agent will deploy to this)
curl -X POST https://localhost:8443/api/v1/targets \
  -H "Content-Type: application/json" \
  -d '{
    "id": "target-traefik-01",
    "name": "Traefik Primary",
    "type": "traefik",
    "config": {
      "cert_dir": "/etc/traefik/certs"
    }
  }'

# 5. Create a deployment job (agent picks this up and deploys).
#    Replace {cert-id} with the ID of the certificate created in step 3.
curl -X POST https://localhost:8443/api/v1/certificates/{cert-id}/deploy \
  -H "Content-Type: application/json" \
  -d '{
    "target_ids": ["target-traefik-01"]
  }'

Once deployed, Traefik automatically loads the new certificate from the certs directory.
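
Step 5 needs the ID of the certificate created in step 3. One way to avoid copying it by hand is to capture it from the step 3 response with `jq`. This is a sketch under a stated assumption: it presumes the create-certificate response is a JSON object with a top-level `id` field, which this README does not confirm, so check the actual response first. The helper names `extract_id` and `deploy_url` are hypothetical.

```shell
# Assumption: the response body is a JSON object with a top-level "id".
# `jq -e` exits non-zero when the field is missing or null.
extract_id() {
  jq -er '.id'
}

# Build the deploy URL used in step 5 for a given certificate ID.
deploy_url() {
  printf 'https://localhost:8443/api/v1/certificates/%s/deploy\n' "$1"
}

# Usage sketch (commented out because it needs the running stack):
# CERT_ID="$(curl -sS --cacert ./deploy/test/certs/ca.crt \
#   -X POST https://localhost:8443/api/v1/certificates \
#   -H "Content-Type: application/json" \
#   -d '{"common_name": "api.internal.local", "policy_id": "pol-internal"}' \
#   | extract_id)" || echo "no id in response" >&2
# curl -sS --cacert ./deploy/test/certs/ca.crt \
#   -X POST "$(deploy_url "$CERT_ID")" \
#   -H "Content-Type: application/json" \
#   -d '{"target_ids": ["target-traefik-01"]}'
```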

How It Works

Certificate Lifecycle

  1. Issue — certctl-server generates certificate from Local CA (self-signed or sub-CA)
  2. Store — certificate stored in PostgreSQL with full audit trail
  3. Deploy — certctl-agent writes {cert-id}.crt + {cert-id}.key to /etc/traefik/certs
  4. Reload — Traefik file provider detects new files and hot-loads them (zero downtime)
  5. Monitor — certctl tracks deployment status and renewal timelines
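
Because the Deploy step writes a .crt and a .key file per certificate, a quick consistency check is to look for certificates whose key file is missing. The sketch below assumes the naming scheme described above; `unpaired_certs` is a hypothetical helper, and the directory argument should point at whatever host path or volume backs /etc/traefik/certs in your setup.

```shell
# Print the ID of every <id>.crt that has no matching <id>.key.
# Returns non-zero if at least one certificate is unpaired.
unpaired_certs() {
  local dir="${1:-/etc/traefik/certs}"
  local crt id status=0
  for crt in "$dir"/*.crt; do
    [ -e "$crt" ] || continue   # glob matched nothing
    id="$(basename "$crt" .crt)"
    if [ ! -e "$dir/$id.key" ]; then
      echo "$id"
      status=1
    fi
  done
  return "$status"
}
```

Running `unpaired_certs ./certs` prints nothing and succeeds when every certificate has its key; any printed ID points at a deployment that was interrupted or only partially written.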

Self-Signed CA

  • Generated automatically on first startup if CERTCTL_CA_CERT_PATH and CERTCTL_CA_KEY_PATH are not set
  • CA certificate and key are held in the server's in-memory state (not persisted)
  • All issued certs chain to this self-signed root
  • Use this for: demos, development, internal labs

Sub-CA Mode

  • Requires you to provide an existing CA certificate and key
  • Issued certificates chain to your enterprise root CA
  • Issued certs are trusted by any system that has your root CA in its trust store
  • Use this for: production internal services, compliance requirements, enterprise PKI

File Organization

private-ca-traefik/
├── docker-compose.yml          # Stack definition
├── traefik-config/             # Traefik dynamic config (you create)
│   └── default.yaml            # Routing rules and TLS settings
├── ca-certs/                   # CA certificate and key (for sub-CA mode)
│   ├── ca-cert.pem            # Your enterprise CA certificate
│   └── ca-key.pem             # Your enterprise CA private key
└── README.md                   # This file

Monitoring

certctl Dashboard

The server provides a REST API on port 8443. Example queries:

# List all certificates
curl https://localhost:8443/api/v1/certificates

# Check certificate status
curl https://localhost:8443/api/v1/certificates/{cert-id}

# View audit trail
curl https://localhost:8443/api/v1/audit

# Check renewal policy compliance
curl https://localhost:8443/api/v1/policies/{policy-id}

Traefik Dashboard

http://localhost:8080 shows:

  • HTTP routers and services
  • TLS certificates currently loaded
  • Request/response metrics

Logs

# certctl server logs
docker compose logs certctl-server

# certctl agent logs
docker compose logs certctl-agent

# Traefik logs
docker compose logs traefik

Customizing Traefik Config

Edit traefik-config/default.yaml to add routers for your services:

http:
  routers:
    # Internal API
    api:
      rule: "Host(`api.internal.local`)"
      service: api-service
      tls: {}

    # Web application
    webapp:
      rule: "Host(`app.internal.local`)"
      service: webapp-service
      tls: {}

  services:
    api-service:
      loadBalancer:
        servers:
          - url: "http://api-backend:3000"

    webapp-service:
      loadBalancer:
        servers:
          - url: "http://webapp-backend:3001"

Changes are picked up automatically (file watcher enabled).

Production Considerations

  1. Use sub-CA mode — chain to your enterprise root for full trust
  2. Enable API key authentication — set CERTCTL_AUTH_TYPE: api-key and CERTCTL_API_KEY
  3. Use agent-side key generation — set CERTCTL_KEYGEN_MODE: agent (keys never leave agents)
  4. Back up PostgreSQL — certificate data is authoritative; database loss means certificate loss
  5. Monitor renewal windows — set up alerts on policy thresholds
  6. Rotate CA keys regularly — plan for future CA refresh (sub-CA mode)
  7. Audit certificate usage — review certctl_audit_events for compliance

Troubleshooting

Certificates not deploying

# Check agent is healthy
docker compose logs certctl-agent | grep heartbeat

# Check deployment job status
curl https://localhost:8443/api/v1/jobs | jq '.[] | select(.type == "Deployment")'

# Check Traefik is watching the directory
docker compose exec traefik ls -la /etc/traefik/certs/

Traefik not reloading certs

# Verify file provider is enabled (check docker-compose.yml command)
# Verify certs volume is mounted at /etc/traefik/certs
# Check Traefik logs
docker compose logs traefik | grep "file"

CA cert not loading in sub-CA mode

# Verify file permissions
docker compose exec certctl-server ls -la /etc/certctl/

# Check server logs for CA loading errors
docker compose logs certctl-server | grep -iE 'ca|cert'

# Verify CA certificate format
openssl x509 -in ca-certs/ca-cert.pem -text -noout | grep -A 3 "Basic Constraints"

Cleanup

# Stop all services
docker compose down

# Remove all data (certificates, database, etc.)
docker compose down -v

# Remove CA cert files (if using custom CA)
rm -rf ca-certs/

Next Steps

  1. Add more services — create additional routers and backends in traefik-config/default.yaml
  2. Set up renewal automation — configure renewal policies with thresholds
  3. Integrate with monitoring — expose certctl metrics to Prometheus
  4. Enable notifications — configure email/Slack alerts on certificate events
  5. Scale to multiple environments — deploy separate certctl stacks per environment (dev/staging/prod)