ACME Wildcard DNS-01 Example

What this does: Issues wildcard certificates (e.g., *.example.com) from Let's Encrypt using DNS-01 challenge validation.

This example is ideal for:

  • Issuing wildcard certificates (*.example.com)
  • Services behind NAT, firewalls, or non-public networks
  • Batch issuance of multiple domains in parallel
  • Internal PKI with public DNS names
  • Scenarios where you have programmatic access to your DNS provider's API

TLS Security

certctl is HTTPS-only as of v2.2. The demo compose stack provisions a self-signed certificate. When accessing https://localhost:8443, use one of the following:

  • Use curl --cacert ./deploy/test/certs/ca.crt ... to pin the CA certificate
  • Use curl -k ... for quick smoke tests (never in production)
  • Import the CA at ./deploy/test/certs/ca.crt into your OS trust store for browser visits
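
For repeated requests against the demo stack, the first option can be wrapped in a small helper. This is a minimal sketch: `certctl_curl` and `CERTCTL_CA_CERT` are hypothetical names introduced here and are not part of certctl. Only the CA path and the `--cacert` flag come from the list above.

```shell
# Hypothetical helper: pin every request to the demo stack's CA bundle.
# CERTCTL_CA_CERT is an illustrative variable name, not a certctl setting.
certctl_curl() {
  ca_bundle="${CERTCTL_CA_CERT:-./deploy/test/certs/ca.crt}"
  if [ ! -r "$ca_bundle" ]; then
    echo "CA bundle not readable: $ca_bundle" >&2
    return 1
  fi
  # --fail turns HTTP error responses into a non-zero exit status.
  curl --fail --silent --show-error --cacert "$ca_bundle" "$@"
}
```

Usage: `certctl_curl https://localhost:8443/`. Failing early on a missing bundle avoids the temptation to fall back to `-k`.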

Prerequisites

Before running this example, you need:

  1. A domain name (e.g., example.com) that you control and can manage DNS records for
  2. DNS provider credentials:
    • Cloudflare (example included): API token with DNS:write permission + Zone ID
    • Route53 (AWS): AWS access key + secret key
    • Azure DNS: Azure subscription ID + credentials
    • Other providers: See "Adapting for Other DNS Providers" below
  3. Docker and Docker Compose installed

Quick Start (Cloudflare)

Step 1: Get Cloudflare Credentials

  1. Log in to Cloudflare Dashboard
  2. Select your domain (e.g., example.com)
  3. In the sidebar, find Zone ID (copy this)
  4. Go to Account Settings > API Tokens
  5. Create a new token with these scopes:
    • Zone > Zone:Read (to list DNS records)
    • Zone > DNS:Write (to create/delete challenge records)
  6. Copy the API token

Step 2: Set Environment Variables

Create a .env file in this directory:

# .env
CLOUDFLARE_API_TOKEN=your-api-token-here
CLOUDFLARE_ZONE_ID=your-zone-id-here
ACME_EMAIL=admin@example.com
DB_PASSWORD=your-secure-db-password

Or export them in your shell:

export CLOUDFLARE_API_TOKEN="your-api-token-here"
export CLOUDFLARE_ZONE_ID="your-zone-id-here"
export ACME_EMAIL="admin@example.com"
export DB_PASSWORD="your-secure-db-password"
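
Before starting the stack it can help to confirm that none of these variables is empty. The function below is a sketch of such a check; `require_env` is a hypothetical name, not something certctl ships.

```shell
# Print the name of every unset or empty variable; fail if any is missing.
# Arguments are variable names, e.g. require_env ACME_EMAIL DB_PASSWORD
require_env() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable whose name is in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Usage: `require_env CLOUDFLARE_API_TOKEN CLOUDFLARE_ZONE_ID ACME_EMAIL DB_PASSWORD`. If you use the .env file instead of exports, load it into the shell first, for example with `set -a; . ./.env; set +a`.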

Step 3: Make DNS Scripts Executable

chmod +x dns-hooks/*.sh

Step 4: Start the Stack

docker compose up -d

This starts:

  • certctl-server (port 8443): Control plane and ACME orchestrator
  • postgres: Certificate metadata database
  • certctl-agent: Certificate deployment agent

Step 5: Access the Dashboard

Open your browser to https://localhost:8443

Step 6: Create a Wildcard Certificate

  1. Go to the Issuers page
  2. Verify the ACME issuer is registered
  3. Go to Certificates > New Certificate
  4. Fill in:
    • Issuer: ACME (Let's Encrypt)
    • Common Name: *.example.com
    • Subject Alt Names: example.com (to also cover the root domain)
  5. Click Request

The renewal job will:

  1. Send a request to Let's Encrypt
  2. Run dns-hooks/cloudflare-present.sh to create the _acme-challenge.example.com TXT record
  3. Wait for Let's Encrypt to verify the TXT record
  4. Issue the certificate
  5. Run dns-hooks/cloudflare-cleanup.sh to delete the temporary TXT record

Step 7: Monitor the Job

Go to the Jobs page to see the renewal progress:

  • AwaitingCSR: Agent is generating the CSR
  • Running: ACME challenge in progress (DNS record being validated)
  • Completed: Certificate issued and stored
  • Failed: Check logs for errors (e.g., DNS provider API issues)

How DNS-01 Works

The DNS-01 challenge proves you own a domain by creating a DNS TXT record:

_acme-challenge.example.com TXT "acme-validation-token-xxxxx"

Let's Encrypt then queries this TXT record. Once verified, it issues the certificate and certctl cleans up the TXT record.
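
For reference, the record name and value follow fixed rules from RFC 8555: a wildcard is validated at its base domain, and the TXT value is the unpadded base64url encoding of the SHA-256 digest of the key authorization. The sketch below illustrates both. The function names are hypothetical, `openssl` is assumed to be installed, and in this example the hook scripts are assumed to receive the finished value as an argument rather than compute it.

```shell
# Record name: a wildcard is validated at the base domain, so "*." is dropped.
acme_challenge_name() {
  case "$1" in
    '*.'*) echo "_acme-challenge.${1#??}" ;;
    *)     echo "_acme-challenge.$1" ;;
  esac
}

# Record value per RFC 8555 section 8.4:
# base64url(SHA-256(key authorization)) without "=" padding.
dns01_txt_value() {
  printf '%s' "$1" \
    | openssl dgst -sha256 -binary \
    | openssl base64 -A \
    | tr '+/' '-_' \
    | tr -d '='
}
```

Because both `*.example.com` and `example.com` map to the same record name, a certificate covering both needs two TXT values under `_acme-challenge.example.com`.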

Why DNS-01 is better than HTTP-01 for wildcards:

  • HTTP-01 requires a public web server; DNS-01 works anywhere
  • Wildcard certificates require DNS proof (not HTTP)
  • DNS challenges can be solved for multiple domains in parallel
  • No need for public IP or inbound port 80/443

Adapting for Other DNS Providers

The example uses Cloudflare, but certctl supports any DNS provider via pluggable shell scripts.

AWS Route53

Replace the CERTCTL_ACME_DNS_PRESENT_SCRIPT and CERTCTL_ACME_DNS_CLEANUP_SCRIPT in docker-compose.yml with:

  • ./dns-hooks/route53-present.sh
  • ./dns-hooks/route53-cleanup.sh

Example script outline (using AWS CLI):

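#!/bin/bash
# Creates the _acme-challenge TXT record in Route53.
set -euo pipefail

DOMAIN="$1"
VALIDATION_TOKEN="$2"

# Get the Route53 hosted zone ID for the domain (zone names end with a dot)
ZONE_ID=$(aws route53 list-hosted-zones --query \
  "HostedZones[?Name=='$DOMAIN.'].Id" --output text | cut -d'/' -f3)

if [ -z "$ZONE_ID" ]; then
  echo "No hosted zone found for $DOMAIN" >&2
  exit 1
fi

# Create the TXT record. UPSERT is used instead of CREATE so that a stale
# record left over from a failed run does not make the call fail.
aws route53 change-resource-record-sets \
  --hosted-zone-id "$ZONE_ID" \
  --change-batch "{
    \"Changes\": [{
      \"Action\": \"UPSERT\",
      \"ResourceRecordSet\": {
        \"Name\": \"_acme-challenge.$DOMAIN\",
        \"Type\": \"TXT\",
        \"TTL\": 120,
        \"ResourceRecords\": [{\"Value\": \"\\\"$VALIDATION_TOKEN\\\"\"}]
      }
    }]
  }"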
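
The matching route53-cleanup.sh is not shown here. One way to write it is to generate the change batch from a small function so that the present and cleanup scripts share the same record definition. This is a sketch only; `route53_change_batch` is a hypothetical helper name.

```shell
# Print a Route53 change batch for the challenge record.
# $1 = CREATE, UPSERT, or DELETE; $2 = domain; $3 = TXT value.
route53_change_batch() {
  cat <<EOF
{
  "Changes": [{
    "Action": "$1",
    "ResourceRecordSet": {
      "Name": "_acme-challenge.$2",
      "Type": "TXT",
      "TTL": 120,
      "ResourceRecords": [{"Value": "\"$3\""}]
    }
  }]
}
EOF
}
```

A cleanup script could then pass `"$(route53_change_batch DELETE "$DOMAIN" "$VALIDATION_TOKEN")"` as the `--change-batch` argument. Route53 only deletes a record set when the name, type, TTL, and value match the existing record, which is why sharing one definition helps.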

Azure DNS

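#!/bin/bash
set -euo pipefail

DOMAIN="$1"
VALIDATION_TOKEN="$2"

# Set Azure credentials via environment variables
# AZURE_SUBSCRIPTION_ID, AZURE_RESOURCE_GROUP, AZURE_TENANT_ID, etc.

# Create the record set with a short TTL
az network dns record-set txt create \
  --resource-group "$AZURE_RESOURCE_GROUP" \
  --zone-name "$DOMAIN" \
  --name "_acme-challenge" \
  --ttl 120

# Add the challenge value (the create command itself takes no value)
az network dns record-set txt add-record \
  --resource-group "$AZURE_RESOURCE_GROUP" \
  --zone-name "$DOMAIN" \
  --record-set-name "_acme-challenge" \
  --value "$VALIDATION_TOKEN"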
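
A cleanup counterpart for Azure DNS might look like the following sketch. `azure_dns_cleanup` is a hypothetical function name, and `AZURE_RESOURCE_GROUP` is the same variable the script above relies on.

```shell
# Remove the challenge value from the _acme-challenge TXT record set.
# $1 = domain (DNS zone name), $2 = TXT value to remove.
azure_dns_cleanup() {
  az network dns record-set txt remove-record \
    --resource-group "$AZURE_RESOURCE_GROUP" \
    --zone-name "$1" \
    --record-set-name "_acme-challenge" \
    --value "$2"
}
```

By default `remove-record` also deletes the record set once its last value is gone, so no separate delete call should be needed.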

Generic DNS Provider (using nsupdate + TSIG)

If your DNS provider supports NSUPDATE (RFC 2136):

#!/bin/bash
DOMAIN="$1"
VALIDATION_TOKEN="$2"

# TSIG_KEY_FILE is an assumed variable name: point it at your TSIG key file
nsupdate -k "$TSIG_KEY_FILE" <<EOF
zone $DOMAIN
update add _acme-challenge.$DOMAIN 120 TXT "$VALIDATION_TOKEN"
send
EOF
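
The cleanup side uses `update delete` with the same record name and value. The sketch below only prints the nsupdate commands so they can be inspected before being sent; `nsupdate_cleanup_commands` is a hypothetical name.

```shell
# Print the nsupdate commands that delete the challenge record.
# $1 = domain (zone), $2 = TXT value.
nsupdate_cleanup_commands() {
  cat <<EOF
zone $1
update delete _acme-challenge.$1 TXT "$2"
send
EOF
}
```

Usage: `nsupdate_cleanup_commands "$DOMAIN" "$VALIDATION_TOKEN" | nsupdate`, adding whatever authentication options your server requires. Naming the value in the delete removes only that record and leaves any other TXT values under the same name in place.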

Manual DNS (for testing)

Replace the scripts with a manual prompt during testing:

#!/bin/bash
echo "Please create: _acme-challenge.$1 TXT $2"
sleep 60  # Manual wait for you to create the record

Alternative: DNS-PERSIST-01 (Standing Records)

If your DNS provider supports it, use DNS-PERSIST-01 for zero-maintenance renewals.

Instead of creating a new TXT record for each renewal, you create one standing record once:

_validation-persist.example.com TXT "letsencrypt.org; accounturi=https://acme-v02.api.letsencrypt.org/acme/acct/12345678"

Then every renewal uses the same record — no cleanup scripts needed!

To enable in docker-compose.yml:

CERTCTL_ACME_CHALLENGE_TYPE: dns-persist-01
CERTCTL_ACME_DNS_PERSIST_ISSUER_DOMAIN: letsencrypt.org

certctl will:

  1. Fetch your ACME account URI
  2. Create the standing _validation-persist record once
  3. Reuse it for all future renewals (no per-renewal DNS updates)
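
If you want to preview or pre-create the standing record yourself, its text can be assembled from the issuer domain and your account URI. The sketch below follows the record format shown above; `dns_persist_record` is a hypothetical helper, and the account URI must be the one for your own ACME account.

```shell
# Build the standing TXT record line.
# $1 = domain, $2 = issuer domain, $3 = ACME account URI.
dns_persist_record() {
  printf '_validation-persist.%s TXT "%s; accounturi=%s"\n' "$1" "$2" "$3"
}
```

Usage: `dns_persist_record example.com letsencrypt.org "$ACCOUNT_URI"`, where `ACCOUNT_URI` is a placeholder for your account URL.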

Security Notes

  1. API Token Scope: Restrict Cloudflare/AWS credentials to the DNS permissions the hooks need (for Cloudflare, Zone:Read and DNS:Write as in Step 1), not full account access
  2. Key Generation: This example uses agent-side key generation (CERTCTL_KEYGEN_MODE=agent), which is production-standard. Private keys never leave the agent.
  3. Script Safety: The DNS scripts run in the certctl-server container. For production:
    • Validate script inputs (already done in certctl code)
    • Log all API calls
    • Monitor for failed DNS operations
    • Use a separate proxy agent for DNS operations if needed
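
Although certctl validates inputs before calling the hooks, a custom script can repeat the check as defense in depth. The functions below are a sketch with hypothetical names. They assume the hook receives a plain base domain and a base64url TXT value, as in the script outlines in this guide.

```shell
# Accept only plain DNS names: letters, digits, hyphens, and dots,
# with no empty labels and no label starting or ending with a hyphen.
valid_domain() {
  case "$1" in
    ''|.*|*.|*..*|-*|*-|*.-*|*-.*|*[!A-Za-z0-9.-]*) return 1 ;;
    *.*) return 0 ;;
    *) return 1 ;;
  esac
}

# Accept only base64url characters, which is all a DNS-01 TXT value contains.
valid_txt_value() {
  case "$1" in
    ''|*[!A-Za-z0-9_-]*) return 1 ;;
  esac
}
```

Usage at the top of a hook: `valid_domain "$1" && valid_txt_value "$2" || exit 1`. Using `case` patterns rather than a line-based tool also rejects values that contain embedded newlines.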

Troubleshooting

DNS record not created

Check the server logs:

docker logs certctl-server-dns01

Look for lines like:

  • [certctl DNS-01] Creating DNS record: _acme-challenge.example.com
  • Error: Cloudflare API failed: ...

Common issues:

  • Missing or invalid CLOUDFLARE_API_TOKEN
  • Invalid CLOUDFLARE_ZONE_ID
  • API token doesn't have DNS:write permission
  • Domain not in your Cloudflare account

DNS propagation timeout

If the DNS-01 validation fails even though the record was created, the cause may be DNS propagation delay or caching. Increase the wait time in the script:

sleep 30  # Increase from 10 to 30 seconds
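
A fixed sleep is a guess at how long propagation takes. An alternative is to poll until the record is visible, as in the sketch below. `wait_for_txt` is a hypothetical helper, and it assumes `dig` is available wherever the hook runs.

```shell
# Poll until the TXT record shows the expected value.
# $1 = record name, $2 = expected value,
# $3 = max attempts, $4 = seconds between attempts.
wait_for_txt() {
  attempt=1
  while [ "$attempt" -le "$3" ]; do
    # dig +short prints TXT data wrapped in double quotes, one record per line.
    if dig +short TXT "$1" | grep -qxF "\"$2\""; then
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$4"
  done
  echo "TXT record $1 not visible after $3 attempts" >&2
  return 1
}
```

Usage: `wait_for_txt "_acme-challenge.$DOMAIN" "$VALIDATION_TOKEN" 12 5`. This queries your default resolver, which may cache answers. Querying the zone's authoritative nameserver instead (dig's `@server` argument) gives a closer picture of what the CA will see.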

Let's Encrypt rate limits

Let's Encrypt has strict rate limits:

  • 50 certificates per registered domain per week
  • 5 duplicate certificates (the exact same set of names) per week

For testing, use the staging directory:

CERTCTL_ACME_DIRECTORY_URL: https://acme-staging-v02.api.letsencrypt.org/directory

(Staging certs won't be trusted by browsers, but don't count against rate limits.)

Job fails with "CSR generation timeout"

If your DNS provider is very slow, add a longer wait for propagation in the present script, after the record is created:

sleep 60  # Wait 1 minute for DNS propagation

Next Steps

  1. Monitor renewals: Set up notifications (email, Slack, PagerDuty) for renewal events
  2. Deploy certificates: Configure target connectors (NGINX, HAProxy, Traefik) to automatically deploy issued certs
  3. Multi-domain: Use certificate profiles to group wildcard + subdomain certs
  4. Backup DNS scripts: Version control your DNS provider scripts in git

Files in This Example

  • docker-compose.yml — Container stack definition with ACME DNS-01 configuration
  • dns-hooks/cloudflare-present.sh — Creates _acme-challenge TXT record (Cloudflare)
  • dns-hooks/cloudflare-cleanup.sh — Deletes _acme-challenge TXT record (Cloudflare)
  • README.md — This file

Additional Resources