# Migrating from Certbot to certctl

You have 50 Let's Encrypt certificates across 10 servers, managed by a mix of Certbot cron jobs and manual renewals. Certbot handles issuance, but you lack inventory visibility, centralized alerting, and audit trails. This guide walks you through moving to certctl while keeping your existing certificates and ACME account.

## Why Migrate

Certbot renews certs in isolation. If a renewal fails on one server, you don't know until the cert expires. certctl gives you a single pane of glass: see all certs across all servers, get alerts 30/14/7 days before expiry, track who renewed what when, and verify each deployment succeeded via TLS fingerprint validation.

## What You Keep

- Your existing Certbot ACME account key and Let's Encrypt account
- All issued certificates in `/etc/letsencrypt/live/`
- Certbot's renewal history and hooks

You will not re-issue any certificates. certctl discovers them and takes over renewal scheduling.

## Step-by-Step Migration

### 1. Deploy certctl Control Plane

Option A: Docker Compose (quickest for evaluation)
```bash
cd /opt/certctl
docker compose up -d
# Dashboard & API: https://localhost:8443 (self-signed cert; use --cacert ./deploy/test/certs/ca.crt for the default compose stack)
# Default API key is in the server logs: docker logs certctl-server 2>&1 | grep CERTCTL_API_KEY
```

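Option B: Kubernetes (Helm)
```bash
# The chart requires exactly one TLS source and refuses to render without it.
# selfSigned is for evaluation only; see docs/tls.md for existingSecret and cert-manager modes.
helm install certctl deploy/helm/certctl/ \
  --set auth.apiKey=YOUR_SECURE_KEY \
  --set server.tls.selfSigned.enabled=true
```
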
### 2. Deploy Agents to Each Server

On each of your 10 servers running Certbot:

```bash
# Linux amd64 (adjust for your architecture)
# -f makes curl fail on an HTTP error instead of saving the error page as the binary
sudo curl -fsSL https://github.com/shankar0123/certctl/releases/download/v2.1.0/certctl-agent-linux-amd64 \
  -o /usr/local/bin/certctl-agent
sudo chmod +x /usr/local/bin/certctl-agent

# Create config
# CERTCTL_SERVER_CA_BUNDLE_PATH is for private-CA trust: place the control plane's CA cert at that path
sudo mkdir -p /etc/certctl/tls /var/lib/certctl/keys
sudo tee /etc/certctl/agent.env > /dev/null <<EOF
CERTCTL_SERVER_URL=https://certctl-control-plane.example.com:8443
CERTCTL_SERVER_CA_BUNDLE_PATH=/etc/certctl/tls/ca.crt
CERTCTL_API_KEY=your-api-key-here
CERTCTL_DISCOVERY_DIRS=/etc/letsencrypt/live
CERTCTL_KEY_DIR=/var/lib/certctl/keys
EOF
sudo chmod 600 /etc/certctl/agent.env

# Start agent
sudo systemctl start certctl-agent  # if installed via script
# OR manually:
sudo certctl-agent --server https://... --api-key ... --discovery-dirs /etc/letsencrypt/live
```

The agent will scan `/etc/letsencrypt/live/` and report all discovered certificates to the control plane.

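The agent refuses plaintext `http://` control-plane URLs, so it can help to sanity-check the env file before starting the service. The following is a minimal sketch: `check_agent_env` is a hypothetical helper, not part of certctl, and it only inspects the `CERTCTL_SERVER_URL` line.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped with certctl): verify that the agent env
# file points at an https:// control-plane URL.
check_agent_env() {
  local file=$1 url
  # Use the last CERTCTL_SERVER_URL assignment found in the file.
  url=$(sed -n 's/^CERTCTL_SERVER_URL=//p' "$file" | tail -n 1)
  case "$url" in
    https://*) echo "ok: $url" ;;
    http://*)  echo "error: plaintext URL; the agent requires https://" >&2; return 1 ;;
    "")        echo "error: CERTCTL_SERVER_URL is not set" >&2; return 1 ;;
    *)         echo "error: unrecognized URL: $url" >&2; return 1 ;;
  esac
}
```

Run it as `sudo bash -c 'source ./check.sh; check_agent_env /etc/certctl/agent.env'` (the script filename is a placeholder), since the env file is readable only by root after the `chmod 600` above.
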
### 3. Triage Discovered Certificates

In the certctl dashboard, go to **Discovery**:
- See all discovered certs grouped by agent
- Status shows "Unmanaged" for certificates not yet claimed
- For each Certbot cert, click **Claim** and link it to managed inventory

The control plane now knows about all 50 certs and where they live.

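To cross-check the Discovery page, you can list the Certbot lineages on each server yourself. This is a minimal sketch, assuming Certbot's standard layout of one directory per lineage containing `cert.pem`; `list_lineages` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print one lineage name per line for a Certbot live dir.
list_lineages() {
  local live_dir=${1:-/etc/letsencrypt/live} cert
  for cert in "$live_dir"/*/cert.pem; do
    # Skip the unexpanded pattern (no matches) and dangling symlinks.
    [ -e "$cert" ] || continue
    basename "$(dirname "$cert")"
  done
}
```

Run it as root, because `/etc/letsencrypt/live` is usually not world-readable. Each name printed should appear among the certs listed for that server's agent; `list_lineages | wc -l` gives a per-server count to add up against the expected 50.
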
### 4. Configure ACME Issuer

Go to **Issuers** → **+ New Issuer**:
1. Select **ACME** from the issuer type grid
2. Fill in the type-specific fields: name, directory URL (`https://acme-v02.api.letsencrypt.org/directory`), and any required config

Alternatively, configure via environment variables before starting the server:
```bash
export CERTCTL_ACME_DIRECTORY_URL=https://acme-v02.api.letsencrypt.org/directory
export CERTCTL_ACME_EMAIL=your-email@example.com
export CERTCTL_ACME_CHALLENGE_TYPE=http-01  # or dns-01 for wildcard certs
```

For DNS-01, also set:
```bash
export CERTCTL_ACME_DNS_PRESENT_SCRIPT=/etc/certctl/dns/present.sh
export CERTCTL_ACME_DNS_CLEANUP_SCRIPT=/etc/certctl/dns/cleanup.sh
```

certctl uses the same Let's Encrypt account; no new credentials needed.

### 5. Create Renewal Policies

Go to **Policies** → **+ New Policy** to create enforcement rules:
- Name: e.g., "ACME Renewal Policy"
- Type: `expiration_window` (to enforce renewal thresholds)
- Severity: `high`
- Config: set your renewal threshold (default: 30 days before expiry)

Renewal scheduling is driven by the certificate's assigned profile and issuer. Policies add enforcement guardrails (key algorithm requirements, expiration windows, etc.).

### 6. Disable Certbot Cron, One Server at a Time

On the first server (start with a low-traffic one):

```bash
# Stop Certbot renewal
sudo systemctl disable certbot.timer
sudo systemctl stop certbot.timer

# Or remove the cron job
sudo rm /etc/cron.d/certbot  # if managed by cron
```

Monitor that server in the certctl dashboard. certctl will renew the cert ~30 days before expiry.

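Before relying on certctl for a server, it is worth confirming that nothing on the host will still run Certbot renewals. The sketch below assumes only the two scheduling mechanisms shown above (the `certbot.timer` systemd unit and `/etc/cron.d/certbot`). `certbot_schedulers` and its optional root-prefix argument are hypothetical, and other install methods may schedule renewals differently.

```bash
#!/usr/bin/env bash
# Hypothetical helper: report Certbot renewal schedulers that are still active.
certbot_schedulers() {
  local root=${1:-}   # optional path prefix, e.g. a mounted disk image
  # The systemd check only makes sense for the running host (no prefix given).
  if [ -z "$root" ] && command -v systemctl >/dev/null 2>&1 \
     && systemctl is-enabled certbot.timer >/dev/null 2>&1; then
    echo "timer: certbot.timer is enabled"
  fi
  if [ -e "$root/etc/cron.d/certbot" ]; then
    echo "cron: $root/etc/cron.d/certbot"
  fi
}
```

Empty output means neither of the two schedulers was found. Any line printed names something that would still trigger Certbot alongside certctl.
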
### 7. Verify First Renewal Succeeds

Wait for the renewal to trigger (or manually trigger it in **Certificates** → select cert → **Renew**). Check the dashboard:
- **Certificates** page: status transitions from `Active` to `Renewing` to `Active`
- **Jobs** page: renewal job shows `Completed` status
- **Verification** tab: TLS check confirms the new cert is deployed and live

After verifying, disable Certbot on the remaining 9 servers.

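The dashboard's TLS check can also be approximated by hand with OpenSSL, which is useful for a second opinion on the first renewal. This is a minimal sketch, not certctl's own verification code: `normalize_fp`, `file_fp`, and `live_fp` are hypothetical helper names, and the default port of 443 is an assumption about your web server.

```bash
#!/usr/bin/env bash
# Reduce OpenSSL's "sha256 Fingerprint=AA:BB:..." line to lowercase hex.
normalize_fp() {
  local fp=${1#*=}   # drop everything up to and including the first "="
  printf '%s\n' "$fp" | tr -d ':' | tr 'A-F' 'a-f'
}

# SHA-256 fingerprint of a certificate file on disk.
file_fp() {
  normalize_fp "$(openssl x509 -noout -fingerprint -sha256 -in "$1")"
}

# SHA-256 fingerprint of the leaf certificate a live endpoint serves.
live_fp() {
  local host=$1 port=${2:-443}
  normalize_fp "$(openssl s_client -connect "$host:$port" -servername "$host" \
    </dev/null 2>/dev/null | openssl x509 -noout -fingerprint -sha256)"
}
```

For example, `[ "$(file_fp /path/to/cert.pem)" = "$(live_fp www.example.com)" ] && echo match` prints `match` when the served leaf certificate is the one on disk. The path and hostname are placeholders. A mismatch after a renewal usually means the web server has not reloaded the new cert.
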
### 8. Enable Alerting

Configure notifiers via environment variables before starting the server:
```bash
# Example: Slack alerting
export CERTCTL_SLACK_WEBHOOK_URL=https://hooks.slack.com/services/YOUR/WEBHOOK/URL
docker compose up -d

# Or email alerting
export CERTCTL_SMTP_HOST=smtp.gmail.com
export CERTCTL_SMTP_PORT=587
export CERTCTL_SMTP_USERNAME=your-email@gmail.com
export CERTCTL_SMTP_PASSWORD=your-app-password
export CERTCTL_SMTP_FROM_ADDRESS=certctl@example.com
docker compose up -d

# Other options: CERTCTL_TEAMS_WEBHOOK_URL, CERTCTL_PAGERDUTY_ROUTING_KEY, CERTCTL_OPSGENIE_API_KEY
```

Now you get 30/14/7-day warnings before any cert expires, across all 10 servers, in one place.

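To see which of the 30/14/7-day windows a given cert currently falls into, the following sketch can be run against any PEM file. It illustrates the threshold idea only and is not certctl's internal logic: `days_left` and `alert_threshold` are hypothetical helpers, and `days_left` assumes GNU `date`.

```bash
#!/usr/bin/env bash
# Whole days until the certificate's notAfter date (requires GNU date).
days_left() {
  local not_after end now
  not_after=$(openssl x509 -noout -enddate -in "$1") || return 1
  end=$(date -d "${not_after#notAfter=}" +%s) || return 1
  now=$(date +%s)
  echo $(( (end - now) / 86400 ))
}

# Map days remaining to the tightest warning window it falls inside.
alert_threshold() {
  local days=$1
  if [ "$days" -le 7 ]; then echo 7
  elif [ "$days" -le 14 ]; then echo 14
  elif [ "$days" -le 30 ]; then echo 30
  else echo none
  fi
}
```

For instance, `alert_threshold "$(days_left /path/to/cert.pem)"` prints `14` for a cert with 10 days left and `none` for one with 45 days left. The path is a placeholder.
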
## What Changes

- **Renewal**: The agent polls certctl for work instead of a Certbot cron job triggering locally. Failures surface faster (agent heartbeat every 60 seconds vs. cron running once a day).
- **Deployment**: certctl verifies each deployment by probing the live TLS endpoint and comparing SHA-256 fingerprints. This catches reload failures that would otherwise go unnoticed.
- **Audit Trail**: Every renewal, deployment, and alert is logged immutably. Answer "who renewed cert X, when, and why" within seconds.
- **Alerting**: Threshold-based alerts to Slack/email/webhook 30/14/7 days before expiry, not after the cert has expired.

## Coexistence and Rollback

During migration, certctl and Certbot can run simultaneously. The agent will discover Certbot certs even while Certbot continues renewing them. Run both for a week to build confidence.

**If you need to roll back**: Re-enable Certbot's scheduled renewal on any server:
```bash
sudo systemctl enable certbot.timer
sudo systemctl start certbot.timer
```

certctl will stop renewing that cert when the policy is disabled. Certbot resumes as before. Your certificates and ACME account remain untouched.

## Next Steps

- Try the [ACME + NGINX example](../examples/acme-nginx/acme-nginx.md), a working docker-compose you can run locally before deploying to production
- Review the [Concepts Guide](./concepts.md) for terminology (profiles, policies, agents, jobs)
- Explore [Network Discovery](./quickstart.md#network-discovery-agentless) to find certificates you didn't know about
- See all [Deployment Examples](./examples.md) for other scenarios (wildcard DNS-01, private CA, step-ca, multi-issuer)