docs: Phase 11 (partial) — fix cross-references after Phase 2 moves

Per Phase 1 audit at cowork/docs-overhaul-phase-1-audit-2026-05-04/.
Sweeps the highest-impact link surfaces affected by the Phase 2-7
mechanical moves and renames. Covers README.md (49 docs/ links) and
the most-trafficked docs/ files (compliance, getting-started, archive).

README.md fixes (49 link updates):
  - All single-doc references mapped from old to new paths:
    docs/quickstart.md → docs/getting-started/quickstart.md
    docs/architecture.md → docs/reference/architecture.md
    docs/connectors.md → docs/reference/connectors/index.md
    docs/acme-server.md → docs/reference/protocols/acme-server.md
    docs/{soc2,pci-dss,nist}.md → docs/compliance/{soc2,pci-dss,nist-sp-800-57}.md
    ... (full mapping in the sed pipeline)
  - 3 references to deleted features.md replaced with pointers to
    architecture.md + connectors/index.md.
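
The mapping above was applied with a sed pipeline that is not reproduced in this message. As a rough illustration only, the same kind of rewrite could be sketched in Python. The `PATH_MAP` subset and the `rewrite_links` helper below are hypothetical names introduced for this sketch, not part of the repository:

```python
import re

# Subset of the old -> new mapping listed in this commit message.
# The full mapping lives in the sed pipeline, which is not shown here.
PATH_MAP = {
    "docs/quickstart.md": "docs/getting-started/quickstart.md",
    "docs/architecture.md": "docs/reference/architecture.md",
    "docs/connectors.md": "docs/reference/connectors/index.md",
    "docs/acme-server.md": "docs/reference/protocols/acme-server.md",
}

# Matches the target of an inline markdown link: ](target) or ](target#anchor)
LINK_TARGET = re.compile(r"\]\(([^)#\s]+)(#[^)\s]*)?\)")


def rewrite_links(text: str, path_map: dict = PATH_MAP) -> str:
    """Rewrite link targets found in path_map; leave everything else untouched."""

    def swap(match: re.Match) -> str:
        target = match.group(1)
        anchor = match.group(2) or ""  # keep any #fragment as-is
        return f"]({path_map.get(target, target)}{anchor})"

    return LINK_TARGET.sub(swap, text)
```

Only link targets are rewritten in this sketch. Link text that repeats the path, such as the backticked labels in README.md, would need a separate pass.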

docs/compliance/index.md (3 sibling renames):
  compliance-soc2.md     → soc2.md
  compliance-pci-dss.md  → pci-dss.md
  compliance-nist.md     → nist-sp-800-57.md

docs/compliance/pci-dss.md (3 external refs need ../):
  architecture.md  → ../reference/architecture.md
  connectors.md    → ../reference/connectors/index.md
  quickstart.md    → ../getting-started/quickstart.md

docs/getting-started/concepts.md (4 external refs):
  crl-ocsp.md      → ../reference/protocols/crl-ocsp.md
  architecture.md  → ../reference/architecture.md
  mcp.md           → ../reference/mcp.md
  openapi.md       → ../reference/api.md

docs/getting-started/quickstart.md (3 external refs + 1 sibling):
  tls.md           → ../operator/tls.md
  upgrade-to-tls.md → ../archive/upgrades/to-tls-v2.2.md
  architecture.md  → ../reference/architecture.md
  demo-advanced.md → advanced-demo.md (sibling rename)

docs/getting-started/examples.md (4 external refs):
  migrate-from-certbot.md         → ../migration/from-certbot.md
  migrate-from-acmesh.md          → ../migration/from-acmesh.md
  certctl-for-cert-manager-users.md → ../migration/cert-manager-coexistence.md
  connectors.md                   → ../reference/connectors/index.md

docs/archive/upgrades/to-tls-v2.2.md (3 external refs need ../../):
  tls.md           → ../../operator/tls.md
  quickstart.md    → ../../getting-started/quickstart.md
  test-env.md      → ../../contributor/test-environment.md

docs/archive/upgrades/to-v2-jwt-removal.md (2 external refs need ../../):
  architecture.md  → ../../reference/architecture.md
  tls.md           → ../../operator/tls.md
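
The `../` and `../../` prefixes in the per-file lists above follow from how deep each source file sits under docs/. A minimal sketch of that calculation, assuming repo-relative POSIX paths; `relative_link` is a hypothetical helper written for this note, not a function in the repository:

```python
import posixpath


def relative_link(from_doc: str, to_doc: str) -> str:
    """Return the link target to use inside from_doc so that it reaches to_doc.

    Both arguments are repo-relative POSIX paths, e.g. "docs/operator/tls.md".
    """
    return posixpath.relpath(to_doc, start=posixpath.dirname(from_doc))


# One level deep (docs/compliance/) needs a single "../".
print(relative_link("docs/compliance/pci-dss.md", "docs/reference/architecture.md"))
# Two levels deep (docs/archive/upgrades/) needs "../../".
print(relative_link("docs/archive/upgrades/to-tls-v2.2.md", "docs/operator/tls.md"))
# Siblings in the same directory need no prefix at all.
print(relative_link("docs/getting-started/quickstart.md", "docs/getting-started/advanced-demo.md"))
```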

Verified that all README.md docs/ links resolve to existing files. The
only remaining top-level link is testing-guide.md, which still exists at
the top of docs/ (Phase 5 will prune it later).
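
This message does not record how the verification was run. One way such a check could be done is sketched below, assuming inline markdown links and paths relative to the file being checked. The `missing_links` helper is hypothetical and is not the tool used for this commit:

```python
import re
from pathlib import Path

# Captures the path part of an inline markdown link, ignoring any #fragment.
LOCAL_LINK = re.compile(r"\]\(([^)#\s]+)(?:#[^)\s]*)?\)")


def missing_links(markdown_file: Path) -> list:
    """Return link targets in markdown_file that do not exist on disk."""
    text = markdown_file.read_text(encoding="utf-8")
    missing = []
    for target in LOCAL_LINK.findall(text):
        # Skip external URLs; only local file targets are checked.
        if "://" in target or target.startswith("mailto:"):
            continue
        # Targets are resolved relative to the directory of the markdown file.
        if not (markdown_file.parent / target).exists():
            missing.append(target)
    return missing
```

Running `missing_links(Path("README.md"))` from the repository root would be expected to return an empty list if every local link resolves.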

Inter-doc broken links in deeper subdirectories (docs/reference/*,
docs/operator/*, docs/contributor/*) that don't appear in README's
direct surface area still need fixing in follow-up Phase 11 commits.
This commit handles the operator-facing entry points.
shankar0123
2026-05-05 03:19:21 +00:00
parent 633e440787
commit dca1900815
8 changed files with 71 additions and 72 deletions
+47 -48
@@ -41,27 +41,26 @@ gantt
| Guide | Description |
|-------|-------------|
| [Why certctl?](docs/why-certctl.md) | How certctl compares to ACME clients, agent-based SaaS, and enterprise platforms |
| [Concepts](docs/concepts.md) | TLS certificates explained from scratch — for beginners who know nothing about certs |
| [Quick Start](docs/quickstart.md) | 5-minute setup — dashboard, API, CLI, discovery, stakeholder demo flow |
| [Why certctl?](docs/getting-started/why-certctl.md) | How certctl compares to ACME clients, agent-based SaaS, and enterprise platforms |
| [Concepts](docs/getting-started/concepts.md) | TLS certificates explained from scratch — for beginners who know nothing about certs |
| [Quick Start](docs/getting-started/quickstart.md) | 5-minute setup — dashboard, API, CLI, discovery, stakeholder demo flow |
| [Docker Compose Environments](deploy/ENVIRONMENTS.md) | Service-by-service walkthrough of all 4 compose files, env var reference |
| [Deployment Examples](docs/examples.md) | 5 turnkey scenarios (ACME+NGINX, wildcard DNS-01, private CA, step-ca, multi-issuer) with migration guides |
| [Advanced Demo](docs/demo-advanced.md) | Issue a certificate end-to-end with technical deep-dives |
| [Architecture](docs/architecture.md) | System design, data flow diagrams, security model |
| [Feature Inventory](docs/features.md) | Complete reference of all capabilities, API endpoints, and configuration |
| [Connector Reference](docs/connectors.md) | Configuration for all issuer, target, and notifier connectors |
| [ACME Server](docs/acme-server.md) | Run certctl as a drop-in ACME server — cert-manager / Caddy / Traefik walkthroughs + [threat model](docs/acme-server-threat-model.md) |
| [Approval Workflow](docs/approval-workflow.md) | Two-person-integrity gate for certificate issuance — RBAC, audit, bypass mode |
| [CA Hierarchy](docs/intermediate-ca-hierarchy.md) | Multi-level intermediate CA management — FedRAMP boundary CA, financial-services policy CA, internal-PKI patterns |
| [Cloud Target Runbook](docs/runbook-cloud-targets.md) | AWS ACM + Azure Key Vault deploy connectors — config, debugging, atomic-rollback semantics |
| [Expiry Alert Runbook](docs/runbook-expiry-alerts.md) | Per-policy multi-channel routing matrix — severity tiers, fault-isolating dispatch |
| [MCP Server](docs/mcp.md) | AI integration via Model Context Protocol — setup, available tools, examples |
| [OpenAPI 3.1 Spec](docs/openapi.md) | API reference guide with endpoint overview ([raw spec](api/openapi.yaml)) |
| [Compliance Mapping](docs/compliance.md) | SOC 2 Type II, PCI-DSS 4.0, NIST SP 800-57 alignment guides |
| [Migrate from certbot](docs/migrate-from-certbot.md) | Step-by-step migration from certbot cron jobs to certctl |
| [Migrate from acme.sh](docs/migrate-from-acmesh.md) | Migration guide for acme.sh users, DNS hook compatibility |
| [certctl for cert-manager users](docs/certctl-for-cert-manager-users.md) | How certctl complements cert-manager for mixed infrastructure |
| [Test Environment](docs/test-env.md) | Docker Compose test environment with real CA backends |
| [Deployment Examples](docs/getting-started/examples.md) | 5 turnkey scenarios (ACME+NGINX, wildcard DNS-01, private CA, step-ca, multi-issuer) with migration guides |
| [Advanced Demo](docs/getting-started/advanced-demo.md) | Issue a certificate end-to-end with technical deep-dives |
| [Architecture](docs/reference/architecture.md) | System design, data flow diagrams, security model |
| [Connector Reference](docs/reference/connectors/index.md) | Configuration for all issuer, target, and notifier connectors |
| [ACME Server](docs/reference/protocols/acme-server.md) | Run certctl as a drop-in ACME server — cert-manager / Caddy / Traefik walkthroughs + [threat model](docs/reference/protocols/acme-server-threat-model.md) |
| [Approval Workflow](docs/operator/approval-workflow.md) | Two-person-integrity gate for certificate issuance — RBAC, audit, bypass mode |
| [CA Hierarchy](docs/reference/intermediate-ca-hierarchy.md) | Multi-level intermediate CA management — FedRAMP boundary CA, financial-services policy CA, internal-PKI patterns |
| [Cloud Target Runbook](docs/operator/runbooks/cloud-targets.md) | AWS ACM + Azure Key Vault deploy connectors — config, debugging, atomic-rollback semantics |
| [Expiry Alert Runbook](docs/operator/runbooks/expiry-alerts.md) | Per-policy multi-channel routing matrix — severity tiers, fault-isolating dispatch |
| [MCP Server](docs/reference/mcp.md) | AI integration via Model Context Protocol — setup, available tools, examples |
| [OpenAPI 3.1 Spec](docs/reference/api.md) | API reference guide with endpoint overview ([raw spec](api/openapi.yaml)) |
| [Compliance Mapping](docs/compliance/index.md) | SOC 2 Type II, PCI-DSS 4.0, NIST SP 800-57 alignment guides |
| [Migrate from certbot](docs/migration/from-certbot.md) | Step-by-step migration from certbot cron jobs to certctl |
| [Migrate from acme.sh](docs/migration/from-acmesh.md) | Migration guide for acme.sh users, DNS hook compatibility |
| [certctl for cert-manager users](docs/migration/cert-manager-coexistence.md) | How certctl complements cert-manager for mixed infrastructure |
| [Test Environment](docs/contributor/test-environment.md) | Docker Compose test environment with real CA backends |
| [Testing Guide](docs/testing-guide.md) | Comprehensive test procedures, smoke tests, and release sign-off checklist |
## Supported Integrations
@@ -70,7 +69,7 @@ gantt
| Issuer | Type | Notes |
|--------|------|-------|
| Local CA (self-signed + sub-CA + tree mode) | `GenericCA` | Sub-CA mode chains to enterprise root (ADCS, etc.). **Tree mode (Rank 8)** manages multi-level intermediate CAs (`intermediate_cas` table) with RFC 5280 §3.2 / §4.2.1.9 / §4.2.1.10 enforcement — FedRAMP boundary CAs, financial-services policy CAs, internal PKI. See [`docs/intermediate-ca-hierarchy.md`](docs/intermediate-ca-hierarchy.md). |
| Local CA (self-signed + sub-CA + tree mode) | `GenericCA` | Sub-CA mode chains to enterprise root (ADCS, etc.). **Tree mode (Rank 8)** manages multi-level intermediate CAs (`intermediate_cas` table) with RFC 5280 §3.2 / §4.2.1.9 / §4.2.1.10 enforcement — FedRAMP boundary CAs, financial-services policy CAs, internal PKI. See [`docs/reference/intermediate-ca-hierarchy.md`](docs/reference/intermediate-ca-hierarchy.md). |
| ACME v2 (Let's Encrypt, ZeroSSL, etc.) | `ACME` | HTTP-01, DNS-01, DNS-PERSIST-01 challenges. EAB auto-fetch from ZeroSSL. Profile selection (`tlsserver`, `shortlived`). |
| step-ca (Smallstep) | `StepCA` | JWK provisioner auth, issuance + renewal + revocation |
| OpenSSL / Custom CA | `OpenSSL` | Shell script adapter — any CA with a CLI |
@@ -103,20 +102,20 @@ gantt
| Windows Certificate Store | `WinCertStore` | PowerShell Import-PfxCertificate + Get-ChildItem snapshot for rollback |
| Java Keystore | `JavaKeystore` | PEM→PKCS#12→keytool pipeline + keytool snapshot for rollback |
| Kubernetes Secrets | `KubernetesSecrets` | `kubernetes.io/tls` Secrets, atomic API + SHA-256 verify + kubelet sync poll |
| **AWS Certificate Manager** | `AWSACM` | SDK-driven `ImportCertificate` (fresh ARN or rotate-in-place) + `DescribeCertificate` snapshot for atomic rollback + tag re-application. See [`docs/runbook-cloud-targets.md`](docs/runbook-cloud-targets.md). |
| **AWS Certificate Manager** | `AWSACM` | SDK-driven `ImportCertificate` (fresh ARN or rotate-in-place) + `DescribeCertificate` snapshot for atomic rollback + tag re-application. See [`docs/operator/runbooks/cloud-targets.md`](docs/operator/runbooks/cloud-targets.md). |
| **Azure Key Vault** | `AzureKeyVault` | SDK-driven PEM→PKCS#12 import via `ImportCertificate` (always new version) + snapshot CER bytes for atomic rollback + tag carry-forward. |
**Deploy-hardening I** (post-2026-04-30 master bundle): every connector now goes through `internal/deploy.Apply` for atomic-write + ownership-preservation + SHA-256 idempotency + per-target-type Prometheus counters (`certctl_deploy_*_total`). See [`docs/deployment-atomicity.md`](docs/deployment-atomicity.md) for the operator guide.
**Deploy-hardening I** (post-2026-04-30 master bundle): every connector now goes through `internal/deploy.Apply` for atomic-write + ownership-preservation + SHA-256 idempotency + per-target-type Prometheus counters (`certctl_deploy_*_total`). See [`docs/reference/deployment-model.md`](docs/reference/deployment-model.md) for the operator guide.
### Enrollment Protocols
| Protocol | Standard | Use Case |
|----------|----------|----------|
| **EST (production-grade)** | RFC 7030 + RFC 9266 channel binding | Native EST server hardened for enterprise WiFi/802.1X, IoT bootstrap, and corporate device enrollment (post-2026-04-29 hardening master bundle). All six RFC 7030 endpoints — `cacerts` / `simpleenroll` / `simplereenroll` / `csrattrs` (profile-driven) / `serverkeygen` (CMS EnvelopedData wire format). Multi-profile dispatch (`/.well-known/est/<pathID>/`). Per-profile auth modes: mTLS sibling route at `/.well-known/est-mtls/<pathID>/`, HTTP Basic enrollment-password (constant-time compare + per-source-IP failed-auth limiter), RFC 9266 `tls-exporter` channel binding (TLS 1.3, opt-in per profile). Per-(CN, sourceIP) sliding-window rate limit. EST-source-scoped bulk revoke (`POST /api/v1/est/certificates/bulk-revoke`, M-008 admin-gated). Tabbed admin GUI at `/est` (Profiles / Recent Activity / Trust Bundle). `SIGHUP`-equivalent trust-bundle reload. libest reference-client interop tested in CI (`deploy/test/libest/Dockerfile` + `deploy/test/est_e2e_test.go`). Typed audit-action codes per failure dimension (`est_simple_enroll_success`/`_failed`, `est_auth_failed_basic`/`_mtls`/`_channel_binding`, `est_rate_limited`, `est_csr_policy_violation`, `est_bulk_revoke`, `est_trust_anchor_reloaded`, etc. — full set in `internal/service/est_audit_actions.go`). CLI + matching MCP tool family (rebuild count via `grep -cE '"est_' internal/mcp/tools_est.go`). See [`docs/est.md`](docs/est.md) for the operator guide — WiFi/802.1X + FreeRADIUS recipe, IoT bootstrap, troubleshooting matrix per audit-action code. |
| **EST (production-grade)** | RFC 7030 + RFC 9266 channel binding | Native EST server hardened for enterprise WiFi/802.1X, IoT bootstrap, and corporate device enrollment (post-2026-04-29 hardening master bundle). All six RFC 7030 endpoints — `cacerts` / `simpleenroll` / `simplereenroll` / `csrattrs` (profile-driven) / `serverkeygen` (CMS EnvelopedData wire format). Multi-profile dispatch (`/.well-known/est/<pathID>/`). Per-profile auth modes: mTLS sibling route at `/.well-known/est-mtls/<pathID>/`, HTTP Basic enrollment-password (constant-time compare + per-source-IP failed-auth limiter), RFC 9266 `tls-exporter` channel binding (TLS 1.3, opt-in per profile). Per-(CN, sourceIP) sliding-window rate limit. EST-source-scoped bulk revoke (`POST /api/v1/est/certificates/bulk-revoke`, M-008 admin-gated). Tabbed admin GUI at `/est` (Profiles / Recent Activity / Trust Bundle). `SIGHUP`-equivalent trust-bundle reload. libest reference-client interop tested in CI (`deploy/test/libest/Dockerfile` + `deploy/test/est_e2e_test.go`). Typed audit-action codes per failure dimension (`est_simple_enroll_success`/`_failed`, `est_auth_failed_basic`/`_mtls`/`_channel_binding`, `est_rate_limited`, `est_csr_policy_violation`, `est_bulk_revoke`, `est_trust_anchor_reloaded`, etc. — full set in `internal/service/est_audit_actions.go`). CLI + matching MCP tool family (rebuild count via `grep -cE '"est_' internal/mcp/tools_est.go`). See [`docs/reference/protocols/est.md`](docs/reference/protocols/est.md) for the operator guide — WiFi/802.1X + FreeRADIUS recipe, IoT bootstrap, troubleshooting matrix per audit-action code. |
| SCEP (Simple Certificate Enrollment Protocol) | RFC 8894 | MDM platforms (Jamf, Intune), network devices, ChromeOS. Full RFC 8894 wire format: EnvelopedData decryption, signerInfo POPO verification, CertRep PKIMessage builder; PKCSReq + RenewalReq + GetCertInitial messageType dispatch; multi-profile dispatch (`/scep/<pathID>`); per-profile RA cert + key. Lightweight raw-CSR clients keep working via the legacy MVP fall-through path. |
| **Microsoft Intune SCEP fleet (drop-in NDES replacement)** | RFC 8894 + Intune Connector signed-challenge dispatcher | Per-profile Intune dispatcher validates the Connector's signed challenge against an operator-supplied trust anchor; binds device claim to CSR (set-equality on CN + SAN-DNS/RFC822/UPN); replay cache + per-device rate limit; `SIGHUP`-reloadable trust pool; admin GUI **SCEP Administration** page at `/scep` (Profiles tab with per-profile RA cert expiry + mTLS status, Intune Monitoring tab with per-status counters + reload, Recent Activity tab with full SCEP audit log filter). See [`docs/scep-intune.md`](docs/scep-intune.md) for the migration playbook + Microsoft support statement. |
| **Microsoft Intune SCEP fleet (drop-in NDES replacement)** | RFC 8894 + Intune Connector signed-challenge dispatcher | Per-profile Intune dispatcher validates the Connector's signed challenge against an operator-supplied trust anchor; binds device claim to CSR (set-equality on CN + SAN-DNS/RFC822/UPN); replay cache + per-device rate limit; `SIGHUP`-reloadable trust pool; admin GUI **SCEP Administration** page at `/scep` (Profiles tab with per-profile RA cert expiry + mTLS status, Intune Monitoring tab with per-status counters + reload, Recent Activity tab with full SCEP audit log filter). See [`docs/reference/protocols/scep-intune.md`](docs/reference/protocols/scep-intune.md) for the migration playbook + Microsoft support statement. |
| ACME v2 client | RFC 8555 | Public CA automated issuance (Let's Encrypt, ZeroSSL) |
| **ACME v2 server (drop-in for cert-manager / Caddy / Traefik)** | RFC 8555 + RFC 9773 ARI | Run certctl as your internal ACME CA. Per-profile endpoints at `/acme/profile/{id}/*` (directory, new-nonce, new-account, new-order, finalize, account, order, authz, challenge, key-change, revoke-cert, renewal-info). Per-profile `acme_auth_mode`: `trust_authenticated` for internal PKI; `challenge` for HTTP-01 / DNS-01 / TLS-ALPN-01 validation. Doubly-signed key rollover (§7.3.5), revoke-cert (§7.6, both kid-path and jwk-path auth), per-account rate limiting (orders/hour, key-change/hour, challenge-respond/hour), scheduler-driven nonce/authz/order GC. Three client walkthroughs: [cert-manager](docs/acme-cert-manager-walkthrough.md), [Caddy](docs/acme-caddy-walkthrough.md), [Traefik](docs/acme-traefik-walkthrough.md). Reference: [`docs/acme-server.md`](docs/acme-server.md) + [threat model](docs/acme-server-threat-model.md). |
| **ACME v2 server (drop-in for cert-manager / Caddy / Traefik)** | RFC 8555 + RFC 9773 ARI | Run certctl as your internal ACME CA. Per-profile endpoints at `/acme/profile/{id}/*` (directory, new-nonce, new-account, new-order, finalize, account, order, authz, challenge, key-change, revoke-cert, renewal-info). Per-profile `acme_auth_mode`: `trust_authenticated` for internal PKI; `challenge` for HTTP-01 / DNS-01 / TLS-ALPN-01 validation. Doubly-signed key rollover (§7.3.5), revoke-cert (§7.6, both kid-path and jwk-path auth), per-account rate limiting (orders/hour, key-change/hour, challenge-respond/hour), scheduler-driven nonce/authz/order GC. Three client walkthroughs: [cert-manager](docs/migration/acme-from-cert-manager.md), [Caddy](docs/migration/acme-from-caddy.md), [Traefik](docs/migration/acme-from-traefik.md). Reference: [`docs/reference/protocols/acme-server.md`](docs/reference/protocols/acme-server.md) + [threat model](docs/reference/protocols/acme-server-threat-model.md). |
| ACME ARI (Renewal Information) | RFC 9773 | CA-directed renewal timing — the CA tells you when to renew (client-side and server-side) |
### Standards & Revocation
@@ -130,7 +129,7 @@ gantt
| Per-endpoint rate limits | — | **Production hardening II.** OCSP per-source-IP cap at `CERTCTL_OCSP_RATE_LIMIT_PER_IP_MIN` (default 1000/min, zero disables); cert-export per-actor cap at `CERTCTL_CERT_EXPORT_RATE_LIMIT_PER_ACTOR_HR` (default 50/hr, zero disables). OCSP rate-limit trip returns the canonical "unauthorized" OCSP blob plus `Retry-After: 60`; cert-export trip returns HTTP 429. The OCSP limiter does NOT honor `X-Forwarded-For` (publicly reachable; spoofed headers would bypass the cap). |
| Cert-export typed audit | — | **Production hardening II.** Typed action constants (`cert_export_pem` / `cert_export_pkcs12` / `cert_export_pem_with_key` reserved / `cert_export_failed`) emitted via split-emit alongside the legacy bare codes for back-compat. Detail map carries `has_private_key` (always false in V2) and `cipher` (`AES-256-CBC-PBE2-SHA256` — pinned so a future dependency upgrade that changes the encoder default surfaces in audit drift review). |
| Prometheus per-area metrics | OpenMetrics | `GET /api/v1/metrics/prometheus` — production hardening II surfaces `certctl_ocsp_counter_total{label="..."}` per-event series (`request_get`/`_post`, `request_success`/`_invalid`, `nonce_echoed`/`_malformed`, `rate_limited`, `signing_failed`, etc.) wired from the shared counter table that ticks in the cache hot path. CRL / cert-export / EST / SCEP / Intune per-area counters plug in via the same `SetXxxCounters` setter pattern as follow-up commits. |
| Disaster-recovery runbook | — | **Production hardening II.** [`docs/disaster-recovery.md`](docs/disaster-recovery.md) — 8-section operator-grade runbook: CRL cache recovery, OCSP responder cert recovery, OCSP response cache recovery, CA private-key rotation 9-step playbook, Postgres restore + operator-managed-artifacts list, trust-bundle reload semantics, printable DR checklist. The SOC 2 / PCI procurement-team deliverable. |
| Disaster-recovery runbook | — | **Production hardening II.** [`docs/operator/runbooks/disaster-recovery.md`](docs/operator/runbooks/disaster-recovery.md) — 8-section operator-grade runbook: CRL cache recovery, OCSP responder cert recovery, OCSP response cache recovery, CA private-key rotation 9-step playbook, Postgres restore + operator-managed-artifacts list, trust-bundle reload semantics, printable DR checklist. The SOC 2 / PCI procurement-team deliverable. |
| S/MIME certificates | RFC 8551 | Email protection EKU, adaptive KeyUsage flags (`DigitalSignature \| ContentCommitment` instead of the TLS default `DigitalSignature \| KeyEncipherment`). |
| Certificate export | — | PEM (JSON/file) and PKCS#12 (cert-only trust-store mode via `pkcs12.Modern` — AES-256-CBC PBE2 with SHA-256 KDF). Key-bearing PKCS#12 export deferred — V2 export is cert-only by design (private keys live on agents, never touch the control plane). |
| ACME DNS-PERSIST-01 | IETF draft | Standing validation record, no per-renewal DNS updates |
@@ -146,7 +145,7 @@ gantt
| PagerDuty | `PagerDuty` |
| OpsGenie | `OpsGenie` |
All connectors are pluggable — build your own by implementing the [connector interface](docs/connectors.md).
All connectors are pluggable — build your own by implementing the [connector interface](docs/reference/connectors/index.md).
### Screenshots
@@ -167,9 +166,9 @@ All connectors are pluggable — build your own by implementing the [connector i
Certificate lifecycle tooling falls into two camps: enterprise platforms (Venafi, Keyfactor) that cost six figures and take months to deploy, or single-purpose tools (certbot, cert-manager) that handle one slice of the problem. certctl fills the gap — full lifecycle automation, self-hosted, free, CA-agnostic, and target-agnostic. If you're running certbot cron jobs, manually renewing certs, or stitching together scripts across mixed infrastructure, certctl replaces all of that.
Built for **platform engineering and DevOps teams** managing 10500+ certificates, **security and compliance teams** who need audit trails and policy enforcement for SOC 2, PCI-DSS 4.0, or NIST SP 800-57 ([compliance mapping included](docs/compliance.md)), and **small teams without enterprise budgets** who need Venafi-grade automation for a 50-server environment. For a detailed comparison, see [Why certctl?](docs/why-certctl.md)
Built for **platform engineering and DevOps teams** managing 10500+ certificates, **security and compliance teams** who need audit trails and policy enforcement for SOC 2, PCI-DSS 4.0, or NIST SP 800-57 ([compliance mapping included](docs/compliance/index.md)), and **small teams without enterprise budgets** who need Venafi-grade automation for a 50-server environment. For a detailed comparison, see [Why certctl?](docs/getting-started/why-certctl.md)
**Architecture.** Go 1.25 control plane with handler→service→repository layering, PostgreSQL 16 backend (35+ tables), and a pull-only deployment model — the server never initiates outbound connections. Agents poll for work. For network appliances and agentless servers, a proxy agent in the same network zone handles deployment via the target's API (WinRM, iControl REST, SSH/SFTP). Background scheduler runs 7 loops: renewal with ARI integration (1h), job processing (30s), agent health (2m), notifications (1m), short-lived cert expiry (30s), network scanning (6h), certificate digest (24h). See [Architecture Guide](docs/architecture.md) for full system diagrams.
**Architecture.** Go 1.25 control plane with handler→service→repository layering, PostgreSQL 16 backend (35+ tables), and a pull-only deployment model — the server never initiates outbound connections. Agents poll for work. For network appliances and agentless servers, a proxy agent in the same network zone handles deployment via the target's API (WinRM, iControl REST, SSH/SFTP). Background scheduler runs 7 loops: renewal with ARI integration (1h), job processing (30s), agent health (2m), notifications (1m), short-lived cert expiry (30s), network scanning (6h), certificate digest (24h). See [Architecture Guide](docs/reference/architecture.md) for full system diagrams.
**Security-first.** Agents generate ECDSA P-256 keys locally — private keys never touch the control plane. API key auth enforced by default with SHA-256 hashing and constant-time comparison. CORS deny-by-default. Shell injection prevention on all connector scripts. SSRF protection (reserved IP filtering) on the network scanner. Atomic idempotency guards on scheduler loops. Issuer and target credentials encrypted at rest with AES-256-GCM. Every API call recorded to an immutable audit trail with actor attribution, body hash, and latency tracking. CI runs race detection, 11 linters, and vulnerability scanning on every commit.
@@ -187,27 +186,27 @@ Built for **platform engineering and DevOps teams** managing 10500+ certifica
**Policy engine.** Certificate profiles constrain key types, max TTL, and EKUs — with crypto policy enforcement that validates every CSR against profile rules before it reaches the issuer. MaxTTL caps are enforced per issuer connector. Ownership tracking routes notifications to the right team. Agent groups match devices by OS, architecture, IP CIDR, and version.
**Two-person integrity for issuance (compliance-grade).** Set `requires_approval=true` on a `CertificateProfile` and every renewal-loop tick or manual `POST /api/v1/certificates/{id}/renew` blocks at `JobStatusAwaitingApproval` until a different actor approves via `POST /api/v1/approvals/{id}/approve`. Same-actor self-approval is rejected at the service layer with `ErrApproveBySameActor` → HTTP 403. Bypass mode (`CERTCTL_APPROVAL_BYPASS=true`) is auditable — every auto-approve records `actor=system-bypass` so audit-tier review surfaces it. Closes the procurement-checklist question for PCI-DSS Level 1, FedRAMP Moderate / High, SOC 2 Type II, HIPAA. See [`docs/approval-workflow.md`](docs/approval-workflow.md).
**Two-person integrity for issuance (compliance-grade).** Set `requires_approval=true` on a `CertificateProfile` and every renewal-loop tick or manual `POST /api/v1/certificates/{id}/renew` blocks at `JobStatusAwaitingApproval` until a different actor approves via `POST /api/v1/approvals/{id}/approve`. Same-actor self-approval is rejected at the service layer with `ErrApproveBySameActor` → HTTP 403. Bypass mode (`CERTCTL_APPROVAL_BYPASS=true`) is auditable — every auto-approve records `actor=system-bypass` so audit-tier review surfaces it. Closes the procurement-checklist question for PCI-DSS Level 1, FedRAMP Moderate / High, SOC 2 Type II, HIPAA. See [`docs/operator/approval-workflow.md`](docs/operator/approval-workflow.md).
**Multi-level CA hierarchy management.** Set `Issuer.HierarchyMode = "tree"` and certctl manages a real N-level CA tree backed by the `intermediate_cas` table — root → policy → issuing leaves. RFC 5280 §3.2 (self-signed root validation), §4.2.1.9 (path-length tightening), and §4.2.1.10 (NameConstraints subset semantics) are all enforced at the service layer fail-closed. Drain-first retirement (active → retiring → retired) refuses terminal transitions while active children remain. Patterns documented for FedRAMP boundary CAs (4-level), financial-services policy CAs (3-level with per-BU `PermittedDNSDomains`), and internal PKI (2-level). The pre-Rank-8 single-sub-CA flow stays byte-identical for unmigrated deployments — pinned by `TestLocal_HierarchyMode_SingleVsTree_ByteIdentical`. See [`docs/intermediate-ca-hierarchy.md`](docs/intermediate-ca-hierarchy.md).
**Multi-level CA hierarchy management.** Set `Issuer.HierarchyMode = "tree"` and certctl manages a real N-level CA tree backed by the `intermediate_cas` table — root → policy → issuing leaves. RFC 5280 §3.2 (self-signed root validation), §4.2.1.9 (path-length tightening), and §4.2.1.10 (NameConstraints subset semantics) are all enforced at the service layer fail-closed. Drain-first retirement (active → retiring → retired) refuses terminal transitions while active children remain. Patterns documented for FedRAMP boundary CAs (4-level), financial-services policy CAs (3-level with per-BU `PermittedDNSDomains`), and internal PKI (2-level). The pre-Rank-8 single-sub-CA flow stays byte-identical for unmigrated deployments — pinned by `TestLocal_HierarchyMode_SingleVsTree_ByteIdentical`. See [`docs/reference/intermediate-ca-hierarchy.md`](docs/reference/intermediate-ca-hierarchy.md).
**Run certctl as your ACME server.** Beyond consuming public ACME CAs (Let's Encrypt, ZeroSSL), certctl now *serves* RFC 8555 — point cert-manager, Caddy, or Traefik at certctl's per-profile ACME endpoints (`/acme/profile/{id}/*`) and you get internal-PKI cert issuance with the same wire protocol the public CAs use. Full surface: directory, new-nonce, new-account, new-order, finalize, key-change (§7.3.5), revoke-cert (§7.6), renewal-info (RFC 9773 ARI), HTTP-01 / DNS-01 / TLS-ALPN-01 validation, per-account rate limiting, scheduler-driven nonce / authz / order GC. Three client walkthroughs ship — [cert-manager](docs/acme-cert-manager-walkthrough.md), [Caddy](docs/acme-caddy-walkthrough.md), [Traefik](docs/acme-traefik-walkthrough.md) — plus the [operator reference](docs/acme-server.md) and [threat model](docs/acme-server-threat-model.md).
**Run certctl as your ACME server.** Beyond consuming public ACME CAs (Let's Encrypt, ZeroSSL), certctl now *serves* RFC 8555 — point cert-manager, Caddy, or Traefik at certctl's per-profile ACME endpoints (`/acme/profile/{id}/*`) and you get internal-PKI cert issuance with the same wire protocol the public CAs use. Full surface: directory, new-nonce, new-account, new-order, finalize, key-change (§7.3.5), revoke-cert (§7.6), renewal-info (RFC 9773 ARI), HTTP-01 / DNS-01 / TLS-ALPN-01 validation, per-account rate limiting, scheduler-driven nonce / authz / order GC. Three client walkthroughs ship — [cert-manager](docs/migration/acme-from-cert-manager.md), [Caddy](docs/migration/acme-from-caddy.md), [Traefik](docs/migration/acme-from-traefik.md) — plus the [operator reference](docs/reference/protocols/acme-server.md) and [threat model](docs/reference/protocols/acme-server-threat-model.md).
**Enrollment protocols.** EST server (RFC 7030) for device and WiFi enrollment. SCEP server (RFC 8894) for MDM platforms and network devices — full wire format (EnvelopedData decrypt + signerInfo POPO verify + CertRep PKIMessage builder), tested against ChromeOS-shape requests; multi-profile dispatch (`/scep/<pathID>`); RenewalReq + GetCertInitial messageType support; lightweight raw-CSR fallback for legacy clients. See [docs/legacy-est-scep.md](docs/legacy-est-scep.md) for the operator + device-integration guide. S/MIME issuance with email protection EKU.
**Enrollment protocols.** EST server (RFC 7030) for device and WiFi enrollment. SCEP server (RFC 8894) for MDM platforms and network devices — full wire format (EnvelopedData decrypt + signerInfo POPO verify + CertRep PKIMessage builder), tested against ChromeOS-shape requests; multi-profile dispatch (`/scep/<pathID>`); RenewalReq + GetCertInitial messageType support; lightweight raw-CSR fallback for legacy clients. See [docs/reference/protocols/scep-server.md](docs/reference/protocols/scep-server.md) for the operator + device-integration guide. S/MIME issuance with email protection EKU.
**Revocation.** Single and bulk revocation (by profile, owner, agent, or issuer). RFC 5280 reason codes. Production-grade revocation status surface for relying parties: DER-encoded X.509 CRL per issuer, scheduler-pre-generated and cached so HTTP fetches do not rebuild per request; embedded OCSP responder serving both GET and POST forms (RFC 6960 §A.1.1) with responses signed by a per-issuer dedicated OCSP responder cert (RFC 6960 §2.6, `id-pkix-ocsp-nocheck` per §4.2.2.2.1) — the CA private key is never used directly for OCSP signing. Both endpoints live unauthenticated under `/.well-known/pki/` per RFC 8615. Short-lived certs (TTL < 1 hour) are exempt — expiry is sufficient revocation. See [docs/crl-ocsp.md](docs/crl-ocsp.md) for the relying-party integration guide.
**Revocation.** Single and bulk revocation (by profile, owner, agent, or issuer). RFC 5280 reason codes. Production-grade revocation status surface for relying parties: DER-encoded X.509 CRL per issuer, scheduler-pre-generated and cached so HTTP fetches do not rebuild per request; embedded OCSP responder serving both GET and POST forms (RFC 6960 §A.1.1) with responses signed by a per-issuer dedicated OCSP responder cert (RFC 6960 §2.6, `id-pkix-ocsp-nocheck` per §4.2.2.2.1) — the CA private key is never used directly for OCSP signing. Both endpoints live unauthenticated under `/.well-known/pki/` per RFC 8615. Short-lived certs (TTL < 1 hour) are exempt — expiry is sufficient revocation. See [docs/reference/protocols/crl-ocsp.md](docs/reference/protocols/crl-ocsp.md) for the relying-party integration guide.
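For relying parties wiring up the GET form, RFC 6960 Appendix A.1 defines the request URL as the responder URL followed by the URL-encoded base64 of the DER-encoded `OCSPRequest`. The Go sketch below builds such a URL. The responder path `/.well-known/pki/ocsp/iss-example` and the helper name `ocspGetURL` are illustrative assumptions, not certctl's documented routes; the relying-party guide has the real paths.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"net/url"
	"strings"
)

// ocspGetURL builds an RFC 6960 Appendix A.1 GET URL:
// {responder}/{url-encoding of base64(DER OCSPRequest)}.
// Hypothetical helper, for illustration only.
func ocspGetURL(responder string, derRequest []byte) string {
	b64 := base64.StdEncoding.EncodeToString(derRequest)
	// QueryEscape percent-encodes '+', '/' and '=' so the base64 text
	// stays a single, unambiguous path segment.
	return strings.TrimRight(responder, "/") + "/" + url.QueryEscape(b64)
}

func main() {
	// Placeholder bytes standing in for a real DER-encoded OCSPRequest.
	der := []byte{0xfb, 0xff, 0xfe}
	// The responder path below is an assumed example, not a documented route.
	fmt.Println(ocspGetURL("https://localhost:8443/.well-known/pki/ocsp/iss-example/", der))
}
```

A real client would produce the DER bytes with an OCSP library and fall back to POST for requests too large for a URL.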
**Audit and observability.** Immutable append-only audit trail records every lifecycle action, every API call, and every approval decision. Prometheus metrics endpoint. Scheduled certificate digest emails. Continuous endpoint health monitoring with state machine transitions and real-time alerts.
**Notifications + per-policy multi-channel routing.** Slack, Teams, PagerDuty, OpsGenie, SMTP, webhooks. Routed by certificate owner. Daily digest emails with stats and expiring certs. Each `RenewalPolicy` carries an `AlertChannels` matrix (per-severity-tier channel set) + `AlertSeverityMap` (per-threshold tier resolution) so production-tier 7-day alerts page PagerDuty *and* Slack while informational 30-day alerts go email-only. Per-channel dispatch is fault-isolating — a PagerDuty failure does NOT skip Slack/Email at the same threshold. Per-channel dedup row + audit row + Prometheus counter (`certctl_expiry_alerts_total{channel,threshold,result}`). See [`docs/runbook-expiry-alerts.md`](docs/runbook-expiry-alerts.md).
**Notifications + per-policy multi-channel routing.** Slack, Teams, PagerDuty, OpsGenie, SMTP, webhooks. Routed by certificate owner. Daily digest emails with stats and expiring certs. Each `RenewalPolicy` carries an `AlertChannels` matrix (per-severity-tier channel set) + `AlertSeverityMap` (per-threshold tier resolution) so production-tier 7-day alerts page PagerDuty *and* Slack while informational 30-day alerts go email-only. Per-channel dispatch is fault-isolating — a PagerDuty failure does NOT skip Slack/Email at the same threshold. Per-channel dedup row + audit row + Prometheus counter (`certctl_expiry_alerts_total{channel,threshold,result}`). See [`docs/operator/runbooks/expiry-alerts.md`](docs/operator/runbooks/expiry-alerts.md).
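The fault-isolating dispatch described above can be sketched in a few lines of Go. This is a minimal illustration under assumed types: `Notifier`, `alertChannels`, `dispatch`, and `stub` are hypothetical names, and the map shape only loosely mirrors the `AlertChannels` matrix. It is not certctl's implementation.

```go
package main

import (
	"errors"
	"fmt"
)

// Notifier is a hypothetical per-channel sender (Slack, PagerDuty, email, ...).
type Notifier interface {
	Name() string
	Send(msg string) error
}

// alertChannels maps a severity tier to the channels that tier should reach.
// The shape is an assumption made for this sketch.
type alertChannels map[string][]Notifier

// dispatch sends msg to every channel configured for tier. A failing channel
// is recorded and skipped; it never prevents later channels from being tried.
func dispatch(matrix alertChannels, tier, msg string) (delivered []string, err error) {
	var errs []error
	for _, ch := range matrix[tier] {
		if sendErr := ch.Send(msg); sendErr != nil {
			errs = append(errs, fmt.Errorf("%s: %w", ch.Name(), sendErr))
			continue
		}
		delivered = append(delivered, ch.Name())
	}
	// errors.Join returns nil when no channel failed.
	return delivered, errors.Join(errs...)
}

// stub is a stand-in channel that either always succeeds or always fails.
type stub struct {
	name string
	fail bool
}

func (s stub) Name() string { return s.name }

func (s stub) Send(string) error {
	if s.fail {
		return errors.New("unreachable")
	}
	return nil
}

func main() {
	matrix := alertChannels{
		"critical": {stub{"pagerduty", true}, stub{"slack", false}, stub{"email", false}},
		"info":     {stub{"email", false}},
	}
	delivered, err := dispatch(matrix, "critical", "cert expires in 7 days")
	fmt.Println("delivered:", delivered)
	fmt.Println("errors:", err)
}
```

Collecting errors instead of returning on the first one is what keeps a PagerDuty outage from suppressing the Slack and email alerts at the same threshold.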
**Cloud-managed targets.** Beyond on-server deploys (NGINX, Apache, IIS, F5, ...), certctl pushes renewed certs directly into AWS Certificate Manager (`ImportCertificate` + `DescribeCertificate` snapshot for atomic rollback + tag re-application) and Azure Key Vault (PEM→PKCS#12 import + snapshot CER bytes for rollback + tag carry-forward). The control plane never touches the cloud credentials — agents own them. See [`docs/runbook-cloud-targets.md`](docs/runbook-cloud-targets.md).
**Cloud-managed targets.** Beyond on-server deploys (NGINX, Apache, IIS, F5, ...), certctl pushes renewed certs directly into AWS Certificate Manager (`ImportCertificate` + `DescribeCertificate` snapshot for atomic rollback + tag re-application) and Azure Key Vault (PEM→PKCS#12 import + snapshot CER bytes for rollback + tag carry-forward). The control plane never touches the cloud credentials — agents own them. See [`docs/operator/runbooks/cloud-targets.md`](docs/operator/runbooks/cloud-targets.md).
**Multiple interfaces.** REST API (180+ routes), CLI (`certs` / `agents` / `jobs` / `import` / `est` / `status` / `version` command groups), MCP server (85+ tools for Claude, Cursor, Windsurf), Helm chart, web dashboard. Certificate export in PEM and PKCS#12.
**First-run onboarding.** Wizard guides you through connecting a CA, deploying an agent, and issuing your first certificate. Or start with the pre-populated demo — 32 certificates, 10 issuers, 180 days of history.
For the complete capability breakdown, see the [Feature Inventory](docs/features.md).
For the complete capability breakdown, see the [Architecture Guide](docs/reference/architecture.md) and the [Connector Reference](docs/reference/connectors/index.md).
## Quick Start
@@ -227,14 +226,14 @@ Wait ~30 seconds, then open **https://localhost:8443** in your browser. (The shi
docker compose -f deploy/docker-compose.yml -f deploy/docker-compose.demo.yml up -d --build
```
The `deploy/` directory has four compose files: `docker-compose.yml` (base platform), `docker-compose.demo.yml` (demo data overlay), `docker-compose.dev.yml` (PgAdmin + debug logging), and `docker-compose.test.yml` (standalone integration tests with real CA backends). See the [Docker Compose Environments Guide](deploy/ENVIRONMENTS.md) for a service-by-service walkthrough, or the [Quick Start](docs/quickstart.md#docker-compose-environments) for a summary.
The `deploy/` directory has four compose files: `docker-compose.yml` (base platform), `docker-compose.demo.yml` (demo data overlay), `docker-compose.dev.yml` (PgAdmin + debug logging), and `docker-compose.test.yml` (standalone integration tests with real CA backends). See the [Docker Compose Environments Guide](deploy/ENVIRONMENTS.md) for a service-by-service walkthrough, or the [Quick Start](docs/getting-started/quickstart.md#docker-compose-environments) for a summary.
```bash
# --cacert expects a file path, so the CA cert is passed via process substitution
curl --cacert <(docker compose -f deploy/docker-compose.yml exec -T certctl-server cat /etc/certctl/tls/ca.crt) https://localhost:8443/health
# {"status":"healthy"}
```
The control plane is HTTPS-only (TLS 1.3, no plaintext listener). See [`docs/tls.md`](docs/tls.md) for cert provisioning patterns and [`docs/upgrade-to-tls.md`](docs/upgrade-to-tls.md) if you're upgrading from a pre-v2.2 release.
The control plane is HTTPS-only (TLS 1.3, no plaintext listener). See [`docs/operator/tls.md`](docs/operator/tls.md) for cert provisioning patterns and [`docs/archive/upgrades/to-tls-v2.2.md`](docs/archive/upgrades/to-tls-v2.2.md) if you're upgrading from a pre-v2.2 release.
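For programmatic clients, the same health check can be made from Go with a client that trusts only the control plane's CA and refuses anything below TLS 1.3. This is a sketch under assumptions: the helper name `newClient` is hypothetical, and `ca.crt` is assumed to be a local copy of the CA certificate used in the curl example.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
	"net/http"
	"os"
	"time"
)

// newClient returns an HTTP client that trusts only the CAs in caPEM and
// requires TLS 1.3. Hypothetical helper, for illustration only.
func newClient(caPEM []byte) (*http.Client, error) {
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(caPEM) {
		return nil, errors.New("no CA certificate found in PEM input")
	}
	return &http.Client{
		Timeout: 10 * time.Second,
		Transport: &http.Transport{
			TLSClientConfig: &tls.Config{
				RootCAs:    pool,
				MinVersion: tls.VersionTLS13,
			},
		},
	}, nil
}

func main() {
	// Assumption: ca.crt was copied out of the certctl-server container.
	caPEM, err := os.ReadFile("ca.crt")
	if err != nil {
		fmt.Fprintln(os.Stderr, "read CA:", err)
		return
	}
	client, err := newClient(caPEM)
	if err != nil {
		fmt.Fprintln(os.Stderr, "build client:", err)
		return
	}
	resp, err := client.Get("https://localhost:8443/health")
	if err != nil {
		fmt.Fprintln(os.Stderr, "health check:", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```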
### Agent Install (One-Liner)
@@ -415,21 +414,21 @@ Core lifecycle management — Local CA + ACME v2 issuers, NGINX target connector
### V2: Operational Maturity — Shipped
40+ milestones shipping enterprise-grade features for free. Highlights below; the [Feature Inventory](docs/features.md) has the complete reference.
40+ milestones shipping enterprise-grade features for free. Highlights below; the [Architecture Guide](docs/reference/architecture.md) and the [Connector Reference](docs/reference/connectors/index.md) cover the complete surface.
- **Issuers (12).** Local CA (self-signed + sub-CA + tree-mode N-level hierarchy), ACME (DNS-01 / DNS-PERSIST-01 / EAB / ARI / profile selection), step-ca, Vault PKI (with auto-token-renewal at TTL/2), DigiCert CertCentral, Sectigo SCM, Google CAS, AWS ACM PCA, Entrust (mTLS), GlobalSign Atlas HVCA, EJBCA (mTLS auto-reload via `mtlscache`), OpenSSL/Custom CA shell adapter.
- **On-server deploy targets (14).** NGINX, Apache, HAProxy, Traefik, Caddy, Envoy, Postfix, Dovecot, IIS (WinRM), F5 BIG-IP, SSH, Windows Certificate Store, Java Keystore, Kubernetes Secrets — every connector goes through `internal/deploy.Apply` for atomic-write + ownership preservation + SHA-256 idempotency + per-target Prometheus counters + pre-deploy snapshot + on-failure rollback.
- **Cloud-managed deploy targets (2).** AWS Certificate Manager + Azure Key Vault — SDK-driven import with snapshot bytes for atomic rollback, tag carry-forward, no cloud creds touch the control plane. ([runbook](docs/runbook-cloud-targets.md))
- **certctl as an ACME server.** Full RFC 8555 surface (per-profile endpoints, accounts, orders, finalize, key-change §7.3.5, revoke-cert §7.6) + RFC 9773 ARI + HTTP-01 / DNS-01 / TLS-ALPN-01 validation + per-account rate limiting + scheduler-driven nonce/authz/order GC. Drop in for cert-manager / Caddy / Traefik. ([reference](docs/acme-server.md), [threat model](docs/acme-server-threat-model.md))
- **Cloud-managed deploy targets (2).** AWS Certificate Manager + Azure Key Vault — SDK-driven import with snapshot bytes for atomic rollback, tag carry-forward, no cloud creds touch the control plane. ([runbook](docs/operator/runbooks/cloud-targets.md))
- **certctl as an ACME server.** Full RFC 8555 surface (per-profile endpoints, accounts, orders, finalize, key-change §7.3.5, revoke-cert §7.6) + RFC 9773 ARI + HTTP-01 / DNS-01 / TLS-ALPN-01 validation + per-account rate limiting + scheduler-driven nonce/authz/order GC. Drop in for cert-manager / Caddy / Traefik. ([reference](docs/reference/protocols/acme-server.md), [threat model](docs/reference/protocols/acme-server-threat-model.md))
- **Enrollment protocols.** EST server (RFC 7030 + RFC 9266 channel binding, multi-profile dispatch, libest-tested CI). SCEP server (RFC 8894 full wire format, Microsoft Intune Connector signed-challenge dispatcher with replay cache + per-device rate limit, ChromeOS-shape interop).
- **Two-person-integrity approval workflow.** Per-profile `requires_approval=true` gate, `JobStatusAwaitingApproval` scheduler skip, same-actor RBAC reject, auditable bypass mode. Compliance-grade for PCI-DSS Level 1, FedRAMP Moderate / High, SOC 2 Type II, HIPAA. ([playbook](docs/approval-workflow.md))
- **First-class CA hierarchy management.** `intermediate_cas` table, RFC 5280 §3.2 / §4.2.1.9 / §4.2.1.10 service-layer enforcement, drain-first retire (active → retiring → retired), 4 admin-gated endpoints, GUI tree view. Patterns documented for FedRAMP / financial-services / internal PKI. ([runbook](docs/intermediate-ca-hierarchy.md))
- **Multi-channel expiry alerts.** Per-policy `AlertChannels` matrix + `AlertSeverityMap`, fault-isolating per-channel dispatch (PagerDuty failure does not skip Slack/Email at the same threshold), per-channel dedup + audit + Prometheus counter. ([runbook](docs/runbook-expiry-alerts.md))
- **Revocation infrastructure.** RFC 5280 DER CRL per issuer (scheduler-pre-generated + ETag-cached) + embedded RFC 6960 OCSP responder (dedicated per-issuer responder cert per §2.6, `id-pkix-ocsp-nocheck`, RFC §4.4.1 nonce echo, OCSP response cache with revoke-invalidate hot path). Single + bulk revocation. ([guide](docs/crl-ocsp.md))
- **Two-person-integrity approval workflow.** Per-profile `requires_approval=true` gate, `JobStatusAwaitingApproval` scheduler skip, same-actor RBAC reject, auditable bypass mode. Compliance-grade for PCI-DSS Level 1, FedRAMP Moderate / High, SOC 2 Type II, HIPAA. ([playbook](docs/operator/approval-workflow.md))
- **First-class CA hierarchy management.** `intermediate_cas` table, RFC 5280 §3.2 / §4.2.1.9 / §4.2.1.10 service-layer enforcement, drain-first retire (active → retiring → retired), 4 admin-gated endpoints, GUI tree view. Patterns documented for FedRAMP / financial-services / internal PKI. ([runbook](docs/reference/intermediate-ca-hierarchy.md))
- **Multi-channel expiry alerts.** Per-policy `AlertChannels` matrix + `AlertSeverityMap`, fault-isolating per-channel dispatch (PagerDuty failure does not skip Slack/Email at the same threshold), per-channel dedup + audit + Prometheus counter. ([runbook](docs/operator/runbooks/expiry-alerts.md))
- **Revocation infrastructure.** RFC 5280 DER CRL per issuer (scheduler-pre-generated + ETag-cached) + embedded RFC 6960 OCSP responder (dedicated per-issuer responder cert per §2.6, `id-pkix-ocsp-nocheck`, RFC 6960 §4.4.1 nonce echo, OCSP response cache with revoke-invalidate hot path). Single + bulk revocation. ([guide](docs/reference/protocols/crl-ocsp.md))
- **Discovery & lifecycle.** Filesystem, network-CIDR, and cloud secret manager (AWS SM / Azure KV / GCP SM) certificate discovery with triage GUI. Continuous endpoint health monitoring. ACME ARI client-driven renewal timing. Approval workflows. Ownership routing. Agent groups (OS / arch / IP CIDR / version match).
- **Secrets at rest.** Issuer + target config encrypted with AES-256-GCM (versioned blob format, PBKDF2-SHA256 100K rounds, fail-closed sentinel `ErrEncryptionKeyRequired`). Vault token + DigiCert API key + EJBCA / GlobalSign / Sectigo credentials migrated to opaque `*secret.Ref` references.
- **Operator interfaces.** REST API (180+ routes), CLI (`certs` / `agents` / `jobs` / `import` / `est` / `status` / `version` command groups), MCP server (85+ tools for Claude / Cursor / Windsurf), Helm chart, 30+ page web dashboard with first-run onboarding wizard.
- **Compliance.** SOC 2 Type II, PCI-DSS 4.0, NIST SP 800-57 mapping ([compliance docs](docs/compliance.md)). Disaster-recovery runbook (8-section operator-grade procedure). Migration guides from [certbot](docs/migrate-from-certbot.md), [acme.sh](docs/migrate-from-acmesh.md), and [cert-manager](docs/certctl-for-cert-manager-users.md).
- **Compliance.** SOC 2 Type II, PCI-DSS 4.0, NIST SP 800-57 mapping ([compliance docs](docs/compliance/index.md)). Disaster-recovery runbook (8-section operator-grade procedure). Migration guides from [certbot](docs/migration/from-certbot.md), [acme.sh](docs/migration/from-acmesh.md), and [cert-manager](docs/migration/cert-manager-coexistence.md).
### Forward-looking work — all free, all self-hostable
Everything ships free under BSL 1.1. No paid tier, no V3 / V4 gating, no enterprise edition. Future revenue path is a managed-service hosting offering — operate certctl-server as a hosted service while customers self-install only the agent.