docs(README): strategic refresh — surface Rank 4/5/7/8 + ACME server + cloud targets

README audit found six classes of drift between the README and the
shipped repo. Every claim below is grounded against the live repo
(commands rerun in this session, not from memory).

Stale numeric claims fixed:
  '111 routes'   → '180+ routes'
                   (live: grep -cE 'r\.Register' router.go = 184)
  '80 tools'     → '85+ tools'
                   (live: grep -cE 'mcp\.AddTool' tools.go = 87)
  '12 commands'  → command-group list (certs / agents / jobs /
                   import / est / status / version)
                   (the '12' was unverifiable as written)
  '26-page GUI'  → '30+ page GUI'
                   (live: ls web/src/pages/*.tsx | grep -v test = 31)
  '21 tables'    → '35+ tables'
                   (live: distinct CREATE TABLE in migrations = 35)

Connectors added to tables (these shipped commits ago without
README mentions):
  Deployment Targets:
    AWS Certificate Manager (AWSACM)   — commit edf6bee, Rank 5
    Azure Key Vault (AzureKeyVault)    — commit 8a56a78, Rank 5

  Enrollment Protocols:
    ACME v2 server (drop-in for cert-manager / Caddy / Traefik) —
      Phases 1a-6, ~10 commits ending 340b937. Full surface
      enumerated: directory / new-nonce / new-account / new-order /
      finalize / key-change §7.3.5 / revoke-cert §7.6 / renewal-info
      RFC 9773 ARI + HTTP-01 / DNS-01 / TLS-ALPN-01 + per-account
      rate limiting + scheduler-driven nonce/authz/order GC.

  Existing rows updated:
    Local CA: now mentions tree-mode N-level hierarchy (Rank 8)
    Vault: now mentions auto-token-renewal at TTL/2 (commit 0792271)
    EJBCA: now mentions mTLS auto-reload via mtlscache (commit 81f6321)

Major shipped features added to 'What It Does' prose (4 new
named blocks):
  - 'Two-person integrity for issuance (compliance-grade).'
    — Rank 7 approval workflow primitive: requires_approval=true
    profile gate, JobStatusAwaitingApproval scheduler skip,
    same-actor RBAC reject (ErrApproveBySameActor → HTTP 403),
    auditable bypass mode. Procurement-checklist closer for PCI-DSS
    Level 1 / FedRAMP / SOC 2 / HIPAA.

  - 'Multi-level CA hierarchy management.'
    — Rank 8 first-class CA hierarchy: intermediate_cas table,
    RFC 5280 §3.2 / §4.2.1.9 / §4.2.1.10 service-layer enforcement,
    drain-first retire, FedRAMP / financial-services / internal-PKI
    patterns, byte-equivalence pin for unmigrated deployments.

  - 'Run certctl as your ACME server.'
    — Beyond consuming public ACME CAs, certctl now serves RFC 8555.
    Three client walkthroughs (cert-manager, Caddy, Traefik) cited.

  - 'Cloud-managed targets.'
    — AWS ACM + Azure Key Vault SDK-driven import + atomic rollback.

  - 'Notifications + per-policy multi-channel routing.'
    — Rank 4: AlertChannels matrix + AlertSeverityMap +
    fault-isolating per-channel dispatch + Prometheus counter.

V2 paragraph rewritten:
  Pre-edit: a single 800-word wall-of-text bullet that listed
    everything. Buried Rank 4-8 features in the middle.
  Post-edit: 12 named feature blocks, each one to two sentences.
    Scannable. Cloud targets, ACME server, approval workflow,
    CA hierarchy, multi-channel alerts each get their own
    headline + one-line story + doc link.

Documentation table extended with 5 newly-linked operator runbooks
(all of which existed but were never reachable from the README):
  - docs/acme-server.md
  - docs/approval-workflow.md
  - docs/intermediate-ca-hierarchy.md
  - docs/runbook-cloud-targets.md
  - docs/runbook-expiry-alerts.md

Plus 4 deeper cross-links inside the Enrollment Protocols + 'What
It Does' prose:
  - docs/acme-cert-manager-walkthrough.md
  - docs/acme-caddy-walkthrough.md
  - docs/acme-traefik-walkthrough.md
  - docs/acme-server-threat-model.md

Verified locally:
  All 9 previously-orphaned docs now reachable from README.md.
  No stale numeric claim remains:
    grep -nE '\b(111 routes|80 tools|12 commands|26.page|21 tables)' README.md
    → no matches.
  README size: 426 → 457 lines (+31). Net addition is 4 prose
    blocks + 2 table rows + 5 doc-table rows + 1 V2 paragraph
    rewrite (15 → 12 lines but each line denser).

Strategic framing (CMO hat):
  - ACME server is the cert-manager adoption-funnel headline; gets
    its own table row + dedicated 'What It Does' block.
  - CA hierarchy is the Venafi / EJBCA replacement story for
    FedRAMP / financial-services / internal-PKI procurement;
    explicit market positioning.
  - Approval workflow framed as procurement-checklist closer
    (PCI-DSS L1 / FedRAMP / SOC 2 / HIPAA explicitly named).
  - Cloud-managed targets framed as 'we deploy to your cloud
    secret store' story.

Doc-only commit. No code, no test changes.
This commit is contained in:
shankar0123
2026-05-04 03:58:21 +00:00
parent 7d48bd0367
commit e50ba168ac
+42 -11
View File
@@ -50,6 +50,11 @@ gantt
| [Architecture](docs/architecture.md) | System design, data flow diagrams, security model |
| [Feature Inventory](docs/features.md) | Complete reference of all capabilities, API endpoints, and configuration |
| [Connector Reference](docs/connectors.md) | Configuration for all issuer, target, and notifier connectors |
| [ACME Server](docs/acme-server.md) | Run certctl as a drop-in ACME server — cert-manager / Caddy / Traefik walkthroughs + [threat model](docs/acme-server-threat-model.md) |
| [Approval Workflow](docs/approval-workflow.md) | Two-person-integrity gate for certificate issuance — RBAC, audit, bypass mode |
| [CA Hierarchy](docs/intermediate-ca-hierarchy.md) | Multi-level intermediate CA management — FedRAMP boundary CA, financial-services policy CA, internal-PKI patterns |
| [Cloud Target Runbook](docs/runbook-cloud-targets.md) | AWS ACM + Azure Key Vault deploy connectors — config, debugging, atomic-rollback semantics |
| [Expiry Alert Runbook](docs/runbook-expiry-alerts.md) | Per-policy multi-channel routing matrix — severity tiers, fault-isolating dispatch |
| [MCP Server](docs/mcp.md) | AI integration via Model Context Protocol — setup, available tools, examples |
| [OpenAPI 3.1 Spec](docs/openapi.md) | API reference guide with endpoint overview ([raw spec](api/openapi.yaml)) |
| [Compliance Mapping](docs/compliance.md) | SOC 2 Type II, PCI-DSS 4.0, NIST SP 800-57 alignment guides |
@@ -65,18 +70,18 @@ gantt
| Issuer | Type | Notes |
|--------|------|-------|
| Local CA (self-signed + sub-CA) | `GenericCA` | Sub-CA mode chains to enterprise root (ADCS, etc.) |
| Local CA (self-signed + sub-CA + tree mode) | `GenericCA` | Sub-CA mode chains to enterprise root (ADCS, etc.). **Tree mode (Rank 8)** manages multi-level intermediate CAs (`intermediate_cas` table) with RFC 5280 §3.2 / §4.2.1.9 / §4.2.1.10 enforcement — FedRAMP boundary CAs, financial-services policy CAs, internal PKI. See [`docs/intermediate-ca-hierarchy.md`](docs/intermediate-ca-hierarchy.md). |
| ACME v2 (Let's Encrypt, ZeroSSL, etc.) | `ACME` | HTTP-01, DNS-01, DNS-PERSIST-01 challenges. EAB auto-fetch from ZeroSSL. Profile selection (`tlsserver`, `shortlived`). |
| step-ca (Smallstep) | `StepCA` | JWK provisioner auth, issuance + renewal + revocation |
| OpenSSL / Custom CA | `OpenSSL` | Shell script adapter — any CA with a CLI |
| HashiCorp Vault PKI | `VaultPKI` | Token auth, synchronous issuance, CRL/OCSP delegated to Vault |
| HashiCorp Vault PKI | `VaultPKI` | Token auth with **automatic renewal at TTL/2** + Prometheus metric, synchronous issuance, CRL/OCSP delegated to Vault, opaque `*secret.Ref` credential storage |
| DigiCert CertCentral | `DigiCert` | Async order model, OV/EV support, PEM bundle parsing |
| Sectigo SCM | `Sectigo` | 3-header auth, DV/OV/EV, collect-not-ready graceful handling |
| Google Cloud CAS | `GoogleCAS` | OAuth2 service account, synchronous issuance, CA pool selection |
| AWS ACM Private CA | `AWSACMPCA` | Synchronous issuance, configurable signing algorithm/template ARN |
| Entrust Certificate Services | `Entrust` | mTLS client certificate auth, synchronous/approval-pending issuance |
| GlobalSign Atlas HVCA | `GlobalSign` | mTLS + API key/secret dual auth, serial-based tracking |
| EJBCA (Keyfactor) | `EJBCA` | Dual auth (mTLS or OAuth2), self-hosted open-source CA |
| EJBCA (Keyfactor) | `EJBCA` | Dual auth (mTLS with auto-reload-on-mtime via `mtlscache`, or OAuth2), self-hosted open-source CA |
**Note:** ADCS integration is handled via the Local CA's sub-CA mode — certctl operates as a subordinate CA with its signing certificate issued by ADCS. Any CA with a shell-accessible signing interface can be integrated via the OpenSSL/Custom CA connector.
@@ -98,6 +103,8 @@ gantt
| Windows Certificate Store | `WinCertStore` | PowerShell Import-PfxCertificate + Get-ChildItem snapshot for rollback |
| Java Keystore | `JavaKeystore` | PEM→PKCS#12→keytool pipeline + keytool snapshot for rollback |
| Kubernetes Secrets | `KubernetesSecrets` | `kubernetes.io/tls` Secrets, atomic API + SHA-256 verify + kubelet sync poll |
| **AWS Certificate Manager** | `AWSACM` | SDK-driven `ImportCertificate` (fresh ARN or rotate-in-place) + `DescribeCertificate` snapshot for atomic rollback + tag re-application. See [`docs/runbook-cloud-targets.md`](docs/runbook-cloud-targets.md). |
| **Azure Key Vault** | `AzureKeyVault` | SDK-driven PEM→PKCS#12 import via `ImportCertificate` (always new version) + snapshot CER bytes for atomic rollback + tag carry-forward. |
**Deploy-hardening I** (post-2026-04-30 master bundle): every connector now goes through `internal/deploy.Apply` for atomic-write + ownership-preservation + SHA-256 idempotency + per-target-type Prometheus counters (`certctl_deploy_*_total`). See [`docs/deployment-atomicity.md`](docs/deployment-atomicity.md) for the operator guide.
@@ -108,8 +115,9 @@ gantt
| **EST (production-grade)** | RFC 7030 + RFC 9266 channel binding | Native EST server hardened for enterprise WiFi/802.1X, IoT bootstrap, and corporate device enrollment (post-2026-04-29 hardening master bundle). All six RFC 7030 endpoints — `cacerts` / `simpleenroll` / `simplereenroll` / `csrattrs` (profile-driven) / `serverkeygen` (CMS EnvelopedData wire format). Multi-profile dispatch (`/.well-known/est/<pathID>/`). Per-profile auth modes: mTLS sibling route at `/.well-known/est-mtls/<pathID>/`, HTTP Basic enrollment-password (constant-time compare + per-source-IP failed-auth limiter), RFC 9266 `tls-exporter` channel binding (TLS 1.3, opt-in per profile). Per-(CN, sourceIP) sliding-window rate limit. EST-source-scoped bulk revoke (`POST /api/v1/est/certificates/bulk-revoke`, M-008 admin-gated). Tabbed admin GUI at `/est` (Profiles / Recent Activity / Trust Bundle). `SIGHUP`-equivalent trust-bundle reload. libest reference-client interop tested in CI (`deploy/test/libest/Dockerfile` + `deploy/test/est_e2e_test.go`). Typed audit-action codes per failure dimension (`est_simple_enroll_success`/`_failed`, `est_auth_failed_basic`/`_mtls`/`_channel_binding`, `est_rate_limited`, `est_csr_policy_violation`, `est_bulk_revoke`, `est_trust_anchor_reloaded`, etc. — full set in `internal/service/est_audit_actions.go`). CLI + matching MCP tool family (rebuild count via `grep -cE '"est_' internal/mcp/tools_est.go`). See [`docs/est.md`](docs/est.md) for the operator guide — WiFi/802.1X + FreeRADIUS recipe, IoT bootstrap, troubleshooting matrix per audit-action code. |
| SCEP (Simple Certificate Enrollment Protocol) | RFC 8894 | MDM platforms (Jamf, Intune), network devices, ChromeOS. Full RFC 8894 wire format: EnvelopedData decryption, signerInfo POPO verification, CertRep PKIMessage builder; PKCSReq + RenewalReq + GetCertInitial messageType dispatch; multi-profile dispatch (`/scep/<pathID>`); per-profile RA cert + key. Lightweight raw-CSR clients keep working via the legacy MVP fall-through path. |
| **Microsoft Intune SCEP fleet (drop-in NDES replacement)** | RFC 8894 + Intune Connector signed-challenge dispatcher | Per-profile Intune dispatcher validates the Connector's signed challenge against an operator-supplied trust anchor; binds device claim to CSR (set-equality on CN + SAN-DNS/RFC822/UPN); replay cache + per-device rate limit; `SIGHUP`-reloadable trust pool; admin GUI **SCEP Administration** page at `/scep` (Profiles tab with per-profile RA cert expiry + mTLS status, Intune Monitoring tab with per-status counters + reload, Recent Activity tab with full SCEP audit log filter). See [`docs/scep-intune.md`](docs/scep-intune.md) for the migration playbook + Microsoft support statement. |
| ACME v2 | RFC 8555 | Public CA automated issuance (Let's Encrypt, ZeroSSL) |
| ACME ARI (Renewal Information) | RFC 9773 | CA-directed renewal timing — the CA tells you when to renew |
| ACME v2 client | RFC 8555 | Public CA automated issuance (Let's Encrypt, ZeroSSL) |
| **ACME v2 server (drop-in for cert-manager / Caddy / Traefik)** | RFC 8555 + RFC 9773 ARI | Run certctl as your internal ACME CA. Per-profile endpoints at `/acme/profile/{id}/*` (directory, new-nonce, new-account, new-order, finalize, account, order, authz, challenge, key-change, revoke-cert, renewal-info). Per-profile `acme_auth_mode`: `trust_authenticated` for internal PKI; `challenge` for HTTP-01 / DNS-01 / TLS-ALPN-01 validation. Doubly-signed key rollover (§7.3.5), revoke-cert (§7.6, both kid-path and jwk-path auth), per-account rate limiting (orders/hour, key-change/hour, challenge-respond/hour), scheduler-driven nonce/authz/order GC. Three client walkthroughs: [cert-manager](docs/acme-cert-manager-walkthrough.md), [Caddy](docs/acme-caddy-walkthrough.md), [Traefik](docs/acme-traefik-walkthrough.md). Reference: [`docs/acme-server.md`](docs/acme-server.md) + [threat model](docs/acme-server-threat-model.md). |
| ACME ARI (Renewal Information) | RFC 9773 | CA-directed renewal timing — the CA tells you when to renew (client-side and server-side) |
### Standards & Revocation
@@ -161,7 +169,7 @@ Certificate lifecycle tooling falls into two camps: enterprise platforms (Venafi
Built for **platform engineering and DevOps teams** managing 10500+ certificates, **security and compliance teams** who need audit trails and policy enforcement for SOC 2, PCI-DSS 4.0, or NIST SP 800-57 ([compliance mapping included](docs/compliance.md)), and **small teams without enterprise budgets** who need Venafi-grade automation for a 50-server environment. For a detailed comparison, see [Why certctl?](docs/why-certctl.md)
**Architecture.** Go 1.25 control plane with handler→service→repository layering, PostgreSQL 16 backend (21 tables), and a pull-only deployment model — the server never initiates outbound connections. Agents poll for work. For network appliances and agentless servers, a proxy agent in the same network zone handles deployment via the target's API (WinRM, iControl REST, SSH/SFTP). Background scheduler runs 7 loops: renewal with ARI integration (1h), job processing (30s), agent health (2m), notifications (1m), short-lived cert expiry (30s), network scanning (6h), certificate digest (24h). See [Architecture Guide](docs/architecture.md) for full system diagrams.
**Architecture.** Go 1.25 control plane with handler→service→repository layering, PostgreSQL 16 backend (35+ tables), and a pull-only deployment model — the server never initiates outbound connections. Agents poll for work. For network appliances and agentless servers, a proxy agent in the same network zone handles deployment via the target's API (WinRM, iControl REST, SSH/SFTP). Background scheduler runs 7 loops: renewal with ARI integration (1h), job processing (30s), agent health (2m), notifications (1m), short-lived cert expiry (30s), network scanning (6h), certificate digest (24h). See [Architecture Guide](docs/architecture.md) for full system diagrams.
**Security-first.** Agents generate ECDSA P-256 keys locally — private keys never touch the control plane. API key auth enforced by default with SHA-256 hashing and constant-time comparison. CORS deny-by-default. Shell injection prevention on all connector scripts. SSRF protection (reserved IP filtering) on the network scanner. Atomic idempotency guards on scheduler loops. Issuer and target credentials encrypted at rest with AES-256-GCM. Every API call recorded to an immutable audit trail with actor attribution, body hash, and latency tracking. CI runs race detection, 11 linters, and vulnerability scanning on every commit.
@@ -171,13 +179,19 @@ Built for **platform engineering and DevOps teams** managing 10500+ certifica
**Automated lifecycle.** Certificates renew and deploy themselves. The scheduler monitors expiration, issues through your CA, and deploys to targets — zero human intervention. ACME ARI (RFC 9773) lets the CA direct renewal timing. Ready for 47-day (SC-081v3) and 6-day (Let's Encrypt shortlived) certificate lifetimes.
**Operational dashboard.** 26-page GUI covers the entire lifecycle: certificate inventory with bulk ops, deployment timeline with rollback, discovery triage, network scan management, agent fleet health, short-lived credential countdown, approval workflows, and observability metrics. Configure issuers and targets from the dashboard — no env var editing, no server restarts.
**Operational dashboard.** 30+ page GUI covers the entire lifecycle: certificate inventory with bulk ops, deployment timeline with rollback, discovery triage, network scan management, agent fleet health, short-lived credential countdown, approval workflows, CA-hierarchy management, and observability metrics. Configure issuers and targets from the dashboard — no env var editing, no server restarts.
**Private keys stay on your servers.** Agents generate ECDSA P-256 keys locally, submit only the CSR. The control plane never touches private keys. After deployment, agents probe the live TLS endpoint and compare SHA-256 fingerprints to confirm the right certificate is actually being served.
**Discovery.** Agents scan filesystems for existing PEM/DER certificates. The network scanner probes TLS endpoints across CIDR ranges without agents. Cloud discovery finds certificates in AWS Secrets Manager, Azure Key Vault, and GCP Secret Manager. Continuous TLS health monitoring tracks endpoint status (healthy/degraded/down/cert_mismatch) with configurable thresholds and historical probe data. All discovery modes feed into a unified triage workflow — claim, dismiss, or import what you find.
**Policy engine.** Certificate profiles constrain key types, max TTL, and EKUs — with crypto policy enforcement that validates every CSR against profile rules before it reaches the issuer. MaxTTL caps are enforced per issuer connector. Approval workflows pause jobs for human review. Ownership tracking routes notifications to the right team. Agent groups match devices by OS, architecture, IP CIDR, and version.
**Policy engine.** Certificate profiles constrain key types, max TTL, and EKUs — with crypto policy enforcement that validates every CSR against profile rules before it reaches the issuer. MaxTTL caps are enforced per issuer connector. Ownership tracking routes notifications to the right team. Agent groups match devices by OS, architecture, IP CIDR, and version.
**Two-person integrity for issuance (compliance-grade).** Set `requires_approval=true` on a `CertificateProfile` and every renewal-loop tick or manual `POST /api/v1/certificates/{id}/renew` blocks at `JobStatusAwaitingApproval` until a different actor approves via `POST /api/v1/approvals/{id}/approve`. Same-actor self-approval is rejected at the service layer with `ErrApproveBySameActor` → HTTP 403. Bypass mode (`CERTCTL_APPROVAL_BYPASS=true`) is auditable — every auto-approve records `actor=system-bypass` so audit-tier review surfaces it. Closes the procurement-checklist question for PCI-DSS Level 1, FedRAMP Moderate / High, SOC 2 Type II, HIPAA. See [`docs/approval-workflow.md`](docs/approval-workflow.md).
**Multi-level CA hierarchy management.** Set `Issuer.HierarchyMode = "tree"` and certctl manages a real N-level CA tree backed by the `intermediate_cas` table — root → policy → issuing leaves. RFC 5280 §3.2 (self-signed root validation), §4.2.1.9 (path-length tightening), and §4.2.1.10 (NameConstraints subset semantics) are all enforced at the service layer fail-closed. Drain-first retirement (active → retiring → retired) refuses terminal transitions while active children remain. Patterns documented for FedRAMP boundary CAs (4-level), financial-services policy CAs (3-level with per-BU `PermittedDNSDomains`), and internal PKI (2-level). The pre-Rank-8 single-sub-CA flow stays byte-identical for unmigrated deployments — pinned by `TestLocal_HierarchyMode_SingleVsTree_ByteIdentical`. See [`docs/intermediate-ca-hierarchy.md`](docs/intermediate-ca-hierarchy.md).
**Run certctl as your ACME server.** Beyond consuming public ACME CAs (Let's Encrypt, ZeroSSL), certctl now *serves* RFC 8555 — point cert-manager, Caddy, or Traefik at certctl's per-profile ACME endpoints (`/acme/profile/{id}/*`) and you get internal-PKI cert issuance with the same wire protocol the public CAs use. Full surface: directory, new-nonce, new-account, new-order, finalize, key-change (§7.3.5), revoke-cert (§7.6), renewal-info (RFC 9773 ARI), HTTP-01 / DNS-01 / TLS-ALPN-01 validation, per-account rate limiting, scheduler-driven nonce / authz / order GC. Three client walkthroughs ship — [cert-manager](docs/acme-cert-manager-walkthrough.md), [Caddy](docs/acme-caddy-walkthrough.md), [Traefik](docs/acme-traefik-walkthrough.md) — plus the [operator reference](docs/acme-server.md) and [threat model](docs/acme-server-threat-model.md).
**Enrollment protocols.** EST server (RFC 7030) for device and WiFi enrollment. SCEP server (RFC 8894) for MDM platforms and network devices — full wire format (EnvelopedData decrypt + signerInfo POPO verify + CertRep PKIMessage builder), tested against ChromeOS-shape requests; multi-profile dispatch (`/scep/<pathID>`); RenewalReq + GetCertInitial messageType support; lightweight raw-CSR fallback for legacy clients. See [docs/legacy-est-scep.md](docs/legacy-est-scep.md) for the operator + device-integration guide. S/MIME issuance with email protection EKU.
@@ -185,9 +199,11 @@ Built for **platform engineering and DevOps teams** managing 10500+ certifica
**Audit and observability.** Immutable append-only audit trail records every lifecycle action, every API call, and every approval decision. Prometheus metrics endpoint. Scheduled certificate digest emails. Continuous endpoint health monitoring with state machine transitions and real-time alerts.
**Notifications.** Slack, Teams, PagerDuty, OpsGenie, SMTP, webhooks. Routed by certificate owner. Daily digest emails with stats and expiring certs.
**Notifications + per-policy multi-channel routing.** Slack, Teams, PagerDuty, OpsGenie, SMTP, webhooks. Routed by certificate owner. Daily digest emails with stats and expiring certs. Each `RenewalPolicy` carries an `AlertChannels` matrix (per-severity-tier channel set) + `AlertSeverityMap` (per-threshold tier resolution) so production-tier 7-day alerts page PagerDuty *and* Slack while informational 30-day alerts go email-only. Per-channel dispatch is fault-isolating — a PagerDuty failure does NOT skip Slack/Email at the same threshold. Per-channel dedup row + audit row + Prometheus counter (`certctl_expiry_alerts_total{channel,threshold,result}`). See [`docs/runbook-expiry-alerts.md`](docs/runbook-expiry-alerts.md).
**Multiple interfaces.** REST API (111 routes), CLI (12 commands), MCP server (80 tools for Claude, Cursor, Windsurf), Helm chart, web dashboard. Certificate export in PEM and PKCS#12.
**Cloud-managed targets.** Beyond on-server deploys (NGINX, Apache, IIS, F5, ...), certctl pushes renewed certs directly into AWS Certificate Manager (`ImportCertificate` + `DescribeCertificate` snapshot for atomic rollback + tag re-application) and Azure Key Vault (PEM→PKCS#12 import + snapshot CER bytes for rollback + tag carry-forward). The control plane never touches the cloud credentials — agents own them. See [`docs/runbook-cloud-targets.md`](docs/runbook-cloud-targets.md).
**Multiple interfaces.** REST API (180+ routes), CLI (`certs` / `agents` / `jobs` / `import` / `est` / `status` / `version` command groups), MCP server (85+ tools for Claude, Cursor, Windsurf), Helm chart, web dashboard. Certificate export in PEM and PKCS#12.
**First-run onboarding.** Wizard guides you through connecting a CA, deploying an agent, and issuing your first certificate. Or start with the pre-populated demo — 32 certificates, 10 issuers, 180 days of history.
@@ -398,7 +414,22 @@ CI runs on every push: `go vet`, `go test -race`, `golangci-lint`, `govulncheck`
Core lifecycle management — Local CA + ACME v2 issuers, NGINX target connector, agent-side key generation, API auth + rate limiting, React dashboard, CI pipeline with coverage gates, Docker images on GHCR.
### V2: Operational Maturity — Shipped
30+ milestones shipping enterprise-grade features for free. Sub-CA mode, ACME DNS-01/DNS-PERSIST-01/EAB/ARI (RFC 9773)/profile selection, step-ca, Vault PKI, DigiCert CertCentral, Sectigo SCM, Google CAS, AWS ACM PCA, Entrust, GlobalSign, EJBCA, OpenSSL/Custom CA issuers. NGINX, Apache, HAProxy, Traefik, Caddy, Envoy, Postfix, Dovecot, IIS (WinRM), F5 BIG-IP, SSH, Windows Certificate Store, Java Keystore, Kubernetes Secrets targets. EST server (RFC 7030) and SCEP server (RFC 8894) enrollment protocols. RFC 5280 revocation with DER CRL + embedded OCSP responder. Certificate profiles, ownership tracking, team assignment, agent groups, interactive approval workflows. Filesystem, network, and cloud secret manager (AWS SM, Azure KV, GCP SM) certificate discovery with triage GUI. Dynamic issuer/target configuration via GUI with AES-256-GCM encrypted storage. First-run onboarding wizard. Post-deployment TLS verification. Certificate export (PEM/PKCS#12). S/MIME support. Prometheus metrics. Scheduled certificate digest emails. Slack, Teams, PagerDuty, OpsGenie, SMTP notifications. MCP server (80 tools), CLI (12 commands), Helm chart. Compliance mapping (SOC 2, PCI-DSS 4.0, NIST SP 800-57). 5 turnkey deployment examples. Agent install script. Migration guides from certbot, acme.sh, and cert-manager. See the [Feature Inventory](docs/features.md) for details.
40+ milestones shipping enterprise-grade features for free. Highlights below; the [Feature Inventory](docs/features.md) has the complete reference.
- **Issuers (12).** Local CA (self-signed + sub-CA + tree-mode N-level hierarchy), ACME (DNS-01 / DNS-PERSIST-01 / EAB / ARI / profile selection), step-ca, Vault PKI (with auto-token-renewal at TTL/2), DigiCert CertCentral, Sectigo SCM, Google CAS, AWS ACM PCA, Entrust (mTLS), GlobalSign Atlas HVCA, EJBCA (mTLS auto-reload via `mtlscache`), OpenSSL/Custom CA shell adapter.
- **On-server deploy targets (14).** NGINX, Apache, HAProxy, Traefik, Caddy, Envoy, Postfix, Dovecot, IIS (WinRM), F5 BIG-IP, SSH, Windows Certificate Store, Java Keystore, Kubernetes Secrets — every connector goes through `internal/deploy.Apply` for atomic-write + ownership preservation + SHA-256 idempotency + per-target Prometheus counters + pre-deploy snapshot + on-failure rollback.
- **Cloud-managed deploy targets (2).** AWS Certificate Manager + Azure Key Vault — SDK-driven import with snapshot bytes for atomic rollback, tag carry-forward, no cloud creds touch the control plane. ([runbook](docs/runbook-cloud-targets.md))
- **certctl as an ACME server.** Full RFC 8555 surface (per-profile endpoints, accounts, orders, finalize, key-change §7.3.5, revoke-cert §7.6) + RFC 9773 ARI + HTTP-01 / DNS-01 / TLS-ALPN-01 validation + per-account rate limiting + scheduler-driven nonce/authz/order GC. Drop in for cert-manager / Caddy / Traefik. ([reference](docs/acme-server.md), [threat model](docs/acme-server-threat-model.md))
- **Enrollment protocols.** EST server (RFC 7030 + RFC 9266 channel binding, multi-profile dispatch, libest-tested CI). SCEP server (RFC 8894 full wire format, Microsoft Intune Connector signed-challenge dispatcher with replay cache + per-device rate limit, ChromeOS-shape interop).
- **Two-person-integrity approval workflow.** Per-profile `requires_approval=true` gate, `JobStatusAwaitingApproval` scheduler skip, same-actor RBAC reject, auditable bypass mode. Compliance-grade for PCI-DSS Level 1, FedRAMP Moderate / High, SOC 2 Type II, HIPAA. ([playbook](docs/approval-workflow.md))
- **First-class CA hierarchy management.** `intermediate_cas` table, RFC 5280 §3.2 / §4.2.1.9 / §4.2.1.10 service-layer enforcement, drain-first retire (active → retiring → retired), 4 admin-gated endpoints, GUI tree view. Patterns documented for FedRAMP / financial-services / internal PKI. ([runbook](docs/intermediate-ca-hierarchy.md))
- **Multi-channel expiry alerts.** Per-policy `AlertChannels` matrix + `AlertSeverityMap`, fault-isolating per-channel dispatch (PagerDuty failure does not skip Slack/Email at the same threshold), per-channel dedup + audit + Prometheus counter. ([runbook](docs/runbook-expiry-alerts.md))
- **Revocation infrastructure.** RFC 5280 DER CRL per issuer (scheduler-pre-generated + ETag-cached) + embedded RFC 6960 OCSP responder (dedicated per-issuer responder cert per §2.6, `id-pkix-ocsp-nocheck`, RFC §4.4.1 nonce echo, OCSP response cache with revoke-invalidate hot path). Single + bulk revocation. ([guide](docs/crl-ocsp.md))
- **Discovery & lifecycle.** Filesystem, network-CIDR, and cloud secret manager (AWS SM / Azure KV / GCP SM) certificate discovery with triage GUI. Continuous endpoint health monitoring. ACME ARI client-driven renewal timing. Approval workflows. Ownership routing. Agent groups (OS / arch / IP CIDR / version match).
- **Secrets at rest.** Issuer + target config encrypted with AES-256-GCM (versioned blob format, PBKDF2-SHA256 100K rounds, fail-closed sentinel `ErrEncryptionKeyRequired`). Vault token + DigiCert API key + EJBCA / GlobalSign / Sectigo credentials migrated to opaque `*secret.Ref` references.
- **Operator interfaces.** REST API (180+ routes), CLI (`certs` / `agents` / `jobs` / `import` / `est` / `status` / `version` command groups), MCP server (85+ tools for Claude / Cursor / Windsurf), Helm chart, 30+ page web dashboard with first-run onboarding wizard.
- **Compliance.** SOC 2 Type II, PCI-DSS 4.0, NIST SP 800-57 mapping ([compliance docs](docs/compliance.md)). Disaster-recovery runbook (8-section operator-grade procedure). Migration guides from [certbot](docs/migrate-from-certbot.md), [acme.sh](docs/migrate-from-acmesh.md), and [cert-manager](docs/certctl-for-cert-manager-users.md).
### Forward-looking work — all free, all self-hostable
Everything ships free under BSL 1.1. No paid tier, no V3 / V4 gating, no enterprise edition. Future revenue path is a managed-service hosting offering — operate certctl-server as a hosted service while customers self-install only the agent.