mirror of
https://github.com/shankar0123/certctl.git
synced 2026-06-07 22:01:36 +00:00
acme-server: foundation — directory + new-nonce + per-profile routing (Phase 1a/7)
First slice of the RFC 8555 ACME server endpoint (master plan at cowork/acme-server-endpoint-prompt.md, per-phase prompts at cowork/acme-server-prompts/). This commit lands the smallest viable end-to-end deployable slice: an ACME client running

    curl -sk https://certctl/acme/profile/<id>/directory
    curl -sk -I https://certctl/acme/profile/<id>/new-nonce

successfully fetches the directory document and a Replay-Nonce. Account creation, JWS verification, orders, challenges, and revocation are all out of scope for this phase and arrive in Phases 1b-4.

Closes the Rank 1 LHF from the 2026-05-03 Infisical deep-research (cowork/infisical-deep-research-results.md). Pre-fix, certctl was an ACME consumer only: no /acme/directory endpoint, no JWS verifier, no challenge validators. K8s customers running cert-manager could not point at certctl as an ACME issuer; they had to deploy a certctl agent on every node.

What ships:

- internal/api/acme/{directory,nonce,errors}.go (+ tests).
- internal/api/handler/acme.go + acme_handler_test.go.
- internal/repository/postgres/acme.go (nonce ops only; Phase 1b extends with account CRUD; Phases 2-4 extend with order / authz / challenge CRUD).
- internal/service/acme.go (BuildDirectory + IssueNonce stubs; Phase 1b adds VerifyJWS / NewAccount / etc.).
- migrations/000025_acme_server.{up,down}.sql ships the full 5-table ACME schema (acme_accounts / acme_orders / acme_authorizations / acme_challenges / acme_nonces) PLUS the per-profile certificate_profiles.acme_auth_mode column. Phase 1a actively uses only acme_nonces; remaining tables are empty until Phases 1b-4 plug in.
- internal/config/config.go: ACMEServerConfig struct + ACMEServer field on Config. Env vars use CERTCTL_ACME_SERVER_* prefix to avoid colliding with the existing consumer-side ACMEConfig at config.go:1746 (CERTCTL_ACME_DIRECTORY_URL / PROFILE / CHALLENGE_TYPE etc.). Phase 1a wires Enabled + DefaultAuthMode + DefaultProfileID + NonceTTL + DirectoryMeta; Order/Authz TTLs + per-challenge-type concurrency caps + DNS01 resolver are reserved fields parsed in 1a so operators can set them ahead of Phases 2/3.
- cmd/server/main.go: wire ACMEHandler into the HandlerRegistry literal alongside the existing certificate / EST / SCEP / etc. handlers.
- internal/api/router/router.go: HandlerRegistry.ACME field + 6 Register calls (3 per-profile + 3 shorthand).
- internal/api/router/openapi_parity_test.go: 6 new entries in SpecParityExceptions. ACME is a wire-protocol surface (JWS-signed JSON over HTTPS per RFC 7515) whose semantics are dictated by RFC 8555 + RFC 9773 rather than by an OpenAPI document, same precedent as SCEP/EST. The canonical reference is docs/acme-server.md.
- docs/acme-server.md: Phase-1a-shaped reference. Configuration table for every CERTCTL_ACME_SERVER_* env var. Per-profile auth-mode decision tree skeleton. TLS trust bootstrap section flagging cert-manager's ClusterIssuer.spec.acme.caBundle requirement (the single biggest first-time-deploy footgun; the full cert-manager walkthrough lands in Phase 6 but the requirement is documented up front).

Architecture decisions baked in:

- URL family is /acme/profile/<id>/* (per-profile, canonical) with /acme/* shorthand active when CERTCTL_ACME_SERVER_DEFAULT_PROFILE_ID is set. Path matches existing per-profile precedent in EST + SCEP.
- Auth mode is per-profile (acme_auth_mode column on certificate_profiles), NOT server-wide. One certctl-server can serve trust_authenticated for an internal-PKI profile and challenge for a public-trust-style profile simultaneously. The column is read at request time, not cached at server start, so a mode change made via SQL takes effect on the next order without restart.
- Nonces are DB-backed (acme_nonces table). Survive server restart. The RFC 8555 §6.5 replay defense requires the store to outlast the client's nonce caching window; an in-memory-only nonce store would lose every in-flight order on restart.
- Per-op atomic counters on service.ACMEService.Metrics(): certctl_acme_directory_total, certctl_acme_directory_failures_total, certctl_acme_new_nonce_total, certctl_acme_new_nonce_failures_total. Naming follows certctl frozen decision 0.10 cardinality discipline. Phase 1b will extend with new_account counters; Phase 2 with order / finalize / cert; Phase 3 with per-challenge-type counters.

Audit fixes #11 + #12 (cowork/acme-server-prompts/audit-additions.md) applied:

- #11: CERTCTL_ACME_SERVER_* prefix avoids the consumer-side CERTCTL_ACME_* namespace collision.
- #12: prior-attempt WIP from two failed Phase-1 dispatches was discarded at phase start; this commit starts from a clean tree.

Tests:

- 14 unit tests in internal/api/acme/ (directory, nonce, errors).
- 7 handler-level tests via httptest.NewServer + mockACMEService (mirrors the mockSCEPService pattern at scep_handler_test.go).
- 7 service-layer tests with mocked repo + injected profileLookup.
- All pass under -race -count=1 -short.

Deferred to Phase 1b:

- JWS verification (go-jose v4; see master-prompt §8a for the API surface and audit doc for the speculation pitfalls).
- new-account / account/<id> endpoints + AccountService.
- Nonce *consumption* path (issue path is in this commit; consume is only invoked by JWS-verified POSTs which Phase 1b adds).

Engineering history: cowork/WORKSPACE-CHANGELOG.md "ACME-Server-1a".
Per-phase implementation plan: cowork/acme-server-prompts/.
Master plan + audit fixes: cowork/acme-server-endpoint-prompt.md + cowork/acme-server-prompt-audit.md + cowork/acme-server-prompts/audit-additions.md.
@@ -0,0 +1,160 @@
# certctl ACME Server (Built-in)

certctl ships an RFC 8555 + RFC 9773 ARI ACME server endpoint at
`/acme/profile/<profile-id>/*`. Any RFC 8555 client (cert-manager 1.15+,
Caddy, Traefik, win-acme, certbot, Posh-ACME) can integrate with certctl
as an ACME issuer with no certctl-side modification. This removes the
"deploy a certctl agent on every K8s node" friction that costs deals to
external PKI vendors today.

> **Phase status (2026-05-03):** Phase 1a (foundation: directory +
> new-nonce + per-profile routing). The directory document is live and
> ACME clients can fetch nonces. Account creation, JWS verification,
> orders, challenges, key rollover, revocation, and ARI all land in
> subsequent phases. Track shipped phases via
> `git log --grep='acme-server:'`.

## Configuration

All ACME-server config uses the `CERTCTL_ACME_SERVER_*` env-var prefix
(distinct from `CERTCTL_ACME_*` which configures the consumer-side
issuer connector). The struct definition lives in
`internal/config/config.go::ACMEServerConfig`.

| Env var | Default | Phase | Description |
|---------|---------|-------|-------------|
| `CERTCTL_ACME_SERVER_ENABLED` | `false` | 1a | Master enable flag. Phase 1a's handler is constructed unconditionally so the registry shape stays stable; routes are registered in `internal/api/router/router.go::RegisterHandlers` regardless. Operators flip this on after configuring per-profile auth_mode. |
| `CERTCTL_ACME_SERVER_DEFAULT_AUTH_MODE` | `trust_authenticated` | 1a | Default value for `certificate_profiles.acme_auth_mode` on newly-created profiles. Existing profiles retain their stored value. Per-profile column is the source of truth at request time. |
| `CERTCTL_ACME_SERVER_DEFAULT_PROFILE_ID` | `""` | 1a | When set, `/acme/*` shorthand mirrors `/acme/profile/<DefaultProfileID>/*` for single-profile deployments. When empty, requests to the shorthand return an RFC 7807 problem document carrying the RFC 8555 §6.7 `userActionRequired` error type. |
| `CERTCTL_ACME_SERVER_NONCE_TTL` | `5m` | 1a | How long an issued ACME nonce remains valid before the JWS verifier (Phase 1b) returns `urn:ietf:params:acme:error:badNonce` per RFC 8555 §6.5.1. Tune up if cert-manager + certctl clocks frequently skew. |
| `CERTCTL_ACME_SERVER_TOS_URL` | `""` | 1a | Optional `meta.termsOfService` URL in the directory document. |
| `CERTCTL_ACME_SERVER_WEBSITE` | `""` | 1a | Optional `meta.website` URL in the directory document. |
| `CERTCTL_ACME_SERVER_CAA_IDENTITIES` | (empty) | 1a | Comma-separated `meta.caaIdentities` list. |
| `CERTCTL_ACME_SERVER_EAB_REQUIRED` | `false` | 1a | `meta.externalAccountRequired` advertisement. EAB enforcement is a follow-up; Phase 1a only advertises. |
| `CERTCTL_ACME_SERVER_ORDER_TTL` | `24h` | 2 | Reserved field, parsed in Phase 1a so operators can set it ahead of Phase 2's order endpoints. |
| `CERTCTL_ACME_SERVER_AUTHZ_TTL` | `24h` | 2 | Reserved. |
| `CERTCTL_ACME_SERVER_HTTP01_CONCURRENCY` | `10` | 3 | Reserved. |
| `CERTCTL_ACME_SERVER_DNS01_RESOLVER` | `8.8.8.8:53` | 3 | Reserved. |
| `CERTCTL_ACME_SERVER_DNS01_CONCURRENCY` | `10` | 3 | Reserved. |
| `CERTCTL_ACME_SERVER_TLSALPN01_CONCURRENCY` | `10` | 3 | Reserved. |
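
To make the defaults in the table concrete, here is a minimal Go sketch of how a subset of these variables could be parsed. It is an illustration only: `acmeServerEnv` and `loadACMEServerEnv` are hypothetical names, and the real `ACMEServerConfig` in `internal/config/config.go` may use different fields and validation.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// acmeServerEnv is a hypothetical stand-in for ACMEServerConfig; the real
// struct may differ in field names and coverage.
type acmeServerEnv struct {
	Enabled          bool
	DefaultAuthMode  string
	DefaultProfileID string
	NonceTTL         time.Duration
	CAAIdentities    []string
}

// loadACMEServerEnv reads a subset of the CERTCTL_ACME_SERVER_* variables via
// getenv and falls back to the defaults from the table when one is unset.
func loadACMEServerEnv(getenv func(string) string) (acmeServerEnv, error) {
	cfg := acmeServerEnv{
		DefaultAuthMode: "trust_authenticated",
		NonceTTL:        5 * time.Minute,
	}
	if v := getenv("CERTCTL_ACME_SERVER_ENABLED"); v != "" {
		b, err := strconv.ParseBool(v)
		if err != nil {
			return cfg, fmt.Errorf("CERTCTL_ACME_SERVER_ENABLED: %w", err)
		}
		cfg.Enabled = b
	}
	if v := getenv("CERTCTL_ACME_SERVER_DEFAULT_AUTH_MODE"); v != "" {
		if v != "trust_authenticated" && v != "challenge" {
			return cfg, fmt.Errorf("CERTCTL_ACME_SERVER_DEFAULT_AUTH_MODE: unknown mode %q", v)
		}
		cfg.DefaultAuthMode = v
	}
	cfg.DefaultProfileID = getenv("CERTCTL_ACME_SERVER_DEFAULT_PROFILE_ID")
	if v := getenv("CERTCTL_ACME_SERVER_NONCE_TTL"); v != "" {
		d, err := time.ParseDuration(v)
		if err != nil {
			return cfg, fmt.Errorf("CERTCTL_ACME_SERVER_NONCE_TTL: %w", err)
		}
		cfg.NonceTTL = d
	}
	if v := getenv("CERTCTL_ACME_SERVER_CAA_IDENTITIES"); v != "" {
		// Comma-separated list; surrounding whitespace and empty items are dropped.
		for _, id := range strings.Split(v, ",") {
			if id = strings.TrimSpace(id); id != "" {
				cfg.CAAIdentities = append(cfg.CAAIdentities, id)
			}
		}
	}
	return cfg, nil
}

func main() {
	// A map stands in for the process environment so the sketch is repeatable.
	env := map[string]string{
		"CERTCTL_ACME_SERVER_ENABLED":   "true",
		"CERTCTL_ACME_SERVER_NONCE_TTL": "10m",
	}
	cfg, err := loadACMEServerEnv(func(k string) string { return env[k] })
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", cfg)
}
```

Passing `os.Getenv` as the `getenv` argument would read the real environment; injecting the lookup keeps the parser testable without mutating process state.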

## Per-profile auth mode

Two modes per `certificate_profiles.acme_auth_mode`:

- **`trust_authenticated`** (default for internal PKI). The
  JWS-authenticated ACME account is trusted to issue certs for any
  identifier the profile policy allows; there is no per-identifier
  ownership proof. The most common certctl use case.
- **`challenge`**. Full HTTP-01 + DNS-01 + TLS-ALPN-01 validation per
  RFC 8555 §8. Required when certctl is exposing public-trust-style PKI.

A single certctl-server can serve both modes simultaneously: the mode
is read from the bound profile's column at request time, not cached at
server start. Operators can flip a profile's mode via SQL and the next
order picks up the new mode without restart.

The `CERTCTL_ACME_SERVER_DEFAULT_AUTH_MODE` env var sets the default
value for newly-created profiles (e.g. via the certctl API). Existing
profile rows retain whatever value they were created with.
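
The "read at request time, not cached" behaviour can be sketched in Go as follows. This is a hedged illustration, not certctl's code: `profileStore`, `setMode`, and `requiresChallenges` are hypothetical names, and a map stands in for the `certificate_profiles` table.

```go
package main

import (
	"fmt"
	"sync"
)

// profileStore is a hypothetical in-memory stand-in for certificate_profiles.
type profileStore struct {
	mu    sync.RWMutex
	modes map[string]string // profile ID -> acme_auth_mode
}

func newProfileStore() *profileStore {
	return &profileStore{modes: make(map[string]string)}
}

// setMode plays the role of an operator updating the column via SQL.
func (s *profileStore) setMode(profileID, mode string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.modes[profileID] = mode
}

// requiresChallenges looks the mode up on every call instead of caching it,
// so a changed mode applies to the very next request.
func (s *profileStore) requiresChallenges(profileID string) (bool, error) {
	s.mu.RLock()
	mode, ok := s.modes[profileID]
	s.mu.RUnlock()
	if !ok {
		return false, fmt.Errorf("profile %q not found", profileID)
	}
	switch mode {
	case "trust_authenticated":
		return false, nil // no per-identifier ownership proof
	case "challenge":
		return true, nil // HTTP-01 / DNS-01 / TLS-ALPN-01 validation
	default:
		return false, fmt.Errorf("profile %q: unknown acme_auth_mode %q", profileID, mode)
	}
}

func main() {
	store := newProfileStore()
	store.setMode("prof-corp", "trust_authenticated")
	before, _ := store.requiresChallenges("prof-corp")

	store.setMode("prof-corp", "challenge") // flipped without a restart
	after, _ := store.requiresChallenges("prof-corp")

	fmt.Println(before, after) // prints "false true"
}
```

The trade-off of a per-request lookup is one extra read per order in exchange for never serving a stale mode.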

## TLS trust bootstrap (read this before configuring cert-manager)

When certctl-server uses a self-signed TLS bootstrap cert
(`deploy/test/certs/server.crt` is the demo default; see
[`docs/tls.md`](./tls.md)), cert-manager 1.15+ will refuse to talk to
the directory URL unless the certctl root is trusted. The fix lives in
`ClusterIssuer.spec.acme.caBundle`:

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: certctl-test
spec:
  acme:
    server: https://certctl.example.com:8443/acme/profile/prof-corp/directory
    email: ops@example.com
    # base64-encoded PEM of certctl's self-signed root
    # (kept outside the block scalar, where a "#" would become part of the value)
    caBundle: |
      LS0tLS1CRUdJTi...
    privateKeySecretRef:
      name: certctl-test-account-key
    solvers:
      - http01:
          ingress:
            class: nginx
```

The `caBundle` value is the base64-encoded PEM of the root that signed
your certctl-server's TLS certificate. Extract it from your operator
bootstrap (e.g. `cat deploy/test/certs/ca.crt | base64 -w0`).
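
Where `base64 -w0` is not available (the flag is GNU-specific), the same value can be produced with a few lines of Go. The sketch below is an assumption-labelled helper, not part of certctl: `encodeCABundle` is a hypothetical name, and it only checks that the input begins with a PEM `CERTIFICATE` block before encoding the whole file on one line.

```go
package main

import (
	"encoding/base64"
	"encoding/pem"
	"errors"
	"fmt"
	"os"
)

// encodeCABundle verifies that pemBytes starts with a PEM CERTIFICATE block
// and returns the single-line base64 encoding of the entire input, which is
// the shape the caBundle field expects.
func encodeCABundle(pemBytes []byte) (string, error) {
	block, _ := pem.Decode(pemBytes)
	if block == nil || block.Type != "CERTIFICATE" {
		return "", errors.New("input is not a PEM-encoded certificate")
	}
	return base64.StdEncoding.EncodeToString(pemBytes), nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: cabundle <path-to-ca.crt>")
		return
	}
	data, err := os.ReadFile(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	out, err := encodeCABundle(data)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(out)
}
```

Run it against the root certificate (for the demo layout, `deploy/test/certs/ca.crt`) and paste the output into `caBundle`. A correct value always begins with `LS0tLS1CRUdJTi`, the base64 form of `-----BEGIN`.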

This is the single biggest first-time-deploy footgun on the cert-manager
integration path. The full cert-manager walkthrough lands in Phase 6;
the `caBundle` requirement is flagged here in Phase 1a's docs because
operators hit it the moment they try to point a real ACME client at
certctl.

## Endpoints (Phase 1a)

Routes registered in `internal/api/router/router.go::RegisterHandlers`:

| Method | Path | RFC ref | Auth | Description |
|--------|------|---------|------|-------------|
| GET | `/acme/profile/{id}/directory` | RFC 8555 §7.1.1 | unauth | Per-profile directory document. |
| HEAD | `/acme/profile/{id}/new-nonce` | RFC 8555 §7.2 | unauth | Returns 200 + Replay-Nonce header. |
| GET | `/acme/profile/{id}/new-nonce` | RFC 8555 §7.2 | unauth | Returns 204 + Replay-Nonce header. |
| GET | `/acme/directory` | RFC 8555 §7.1.1 | unauth | Shorthand path; mirrors per-profile when `CERTCTL_ACME_SERVER_DEFAULT_PROFILE_ID` is set. |
| HEAD | `/acme/new-nonce` | RFC 8555 §7.2 | unauth | Shorthand. |
| GET | `/acme/new-nonce` | RFC 8555 §7.2 | unauth | Shorthand. |

The remaining RFC 8555 endpoints (`new-account`, `account/{id}`,
`new-order`, `order/{id}`, `order/{id}/finalize`, `authz/{id}`,
`challenge/{id}`, `cert/{id}`, `key-change`, `revoke-cert`,
`renewal-info`) are advertised in the directory document but not yet
served; clients hitting them get a 404 until subsequent phases land.
The directory document includes their URLs because RFC 8555 doesn't
permit a partial directory.
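
As a rough client-side illustration of the Phase 1a surface, the Go sketch below builds the per-profile or shorthand URL and fetches a nonce with `HEAD new-nonce`. The helper names `acmeURL` and `fetchNonce` are hypothetical, and an `httptest` stand-in server replaces a real certctl instance so the example runs anywhere; a real deployment would also need the TLS trust setup described above.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// acmeURL returns the per-profile URL for resource, or the /acme/* shorthand
// when profileID is empty.
func acmeURL(base, profileID, resource string) string {
	base = strings.TrimRight(base, "/")
	if profileID == "" {
		return base + "/acme/" + resource
	}
	return base + "/acme/profile/" + profileID + "/" + resource
}

// fetchNonce issues HEAD new-nonce and returns the Replay-Nonce header value.
func fetchNonce(c *http.Client, newNonceURL string) (string, error) {
	resp, err := c.Head(newNonceURL)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("new-nonce: unexpected status %s", resp.Status)
	}
	nonce := resp.Header.Get("Replay-Nonce")
	if nonce == "" {
		return "", errors.New("new-nonce: response carried no Replay-Nonce header")
	}
	return nonce, nil
}

func main() {
	// Stand-in server that mimics only the HEAD new-nonce route.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method == http.MethodHead && r.URL.Path == "/acme/profile/prof-corp/new-nonce" {
			w.Header().Set("Replay-Nonce", "example-nonce")
			w.Header().Set("Cache-Control", "no-store")
			w.WriteHeader(http.StatusOK)
			return
		}
		http.NotFound(w, r)
	}))
	defer srv.Close()

	nonce, err := fetchNonce(srv.Client(), acmeURL(srv.URL, "prof-corp", "new-nonce"))
	if err != nil {
		panic(err)
	}
	fmt.Println("Replay-Nonce:", nonce) // prints "Replay-Nonce: example-nonce"
}
```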

## Phases (cross-reference)

| Phase | Status | Surface |
|-------|---------|---------|
| 1a | live | directory + new-nonce + per-profile routing |
| 1b | not yet | new-account + JWS verifier (RFC 7515) |
| 2 | not yet | orders + authzs + finalize + cert download (trust_authenticated mode end-to-end) |
| 3 | not yet | HTTP-01 + DNS-01 + TLS-ALPN-01 challenge validation |
| 4 | not yet | key rollover + revocation + ARI (RFC 9773) |
| 5 | not yet | cert-manager integration test + production hardening |
| 6 | not yet | full operator-facing reference + walkthroughs + threat model |

Track shipped phases via `git log --grep='acme-server:' --oneline`.

## Operational notes (Phase 1a)

- **Schema:** `migrations/000025_acme_server.up.sql` adds 5 ACME tables
  + the `certificate_profiles.acme_auth_mode` column. Phase 1a actively
  uses only `acme_nonces`. The full schema ships now so the migration
  is stable and Phases 1b-4 don't need additional `CREATE TABLE`
  migrations.

- **Replay protection:** nonces are persisted in `acme_nonces` (NOT
  in-memory). They survive server restart and live in the shared
  database rather than in any one process, which the RFC 8555 §6.5
  replay defense needs in order to hold for a multi-replica
  certctl-server fleet behind a load balancer.

- **Metrics:** the service layer exposes per-op atomic counters via
  `service.ACMEService.Metrics().Snapshot()`:
  - `certctl_acme_directory_total`
  - `certctl_acme_directory_failures_total`
  - `certctl_acme_new_nonce_total`
  - `certctl_acme_new_nonce_failures_total`

  Phase 1b will extend with `new_account` counters; Phase 2 with order
  / finalize / cert; Phase 3 with per-challenge-type counters.

- **Audit:** Phase 1a is read-mostly (directory + nonce). Phase 1b's
  account-creation path will route through the canonical
  `s.tx.WithinTx(...)` + `auditService.RecordEventWithTx(...)` pattern
  so every account state mutation is paired with an `audit_events`
  row.
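
The nonce semantics behind the replay-protection note (issue with a TTL, accept each nonce at most once) can be sketched in Go. This is deliberately an in-memory illustration of the semantics only: certctl persists nonces in `acme_nonces`, the consume path arrives in Phase 1b, and `nonceStore`, `fakeClock`, and `defaultNonceTTL` are hypothetical names.

```go
package main

import (
	"crypto/rand"
	"encoding/base64"
	"errors"
	"fmt"
	"sync"
	"time"
)

// defaultNonceTTL mirrors the CERTCTL_ACME_SERVER_NONCE_TTL default of 5m.
const defaultNonceTTL = 5 * time.Minute

var errBadNonce = errors.New("urn:ietf:params:acme:error:badNonce")

// fakeClock is a controllable clock so expiry can be shown without sleeping.
type fakeClock struct{ t time.Time }

func (c *fakeClock) now() time.Time          { return c.t }
func (c *fakeClock) advance(d time.Duration) { c.t = c.t.Add(d) }

// nonceStore illustrates the issue/consume contract; a map replaces the table.
type nonceStore struct {
	mu      sync.Mutex
	ttl     time.Duration
	now     func() time.Time
	expires map[string]time.Time
}

func newNonceStore(ttl time.Duration, now func() time.Time) *nonceStore {
	return &nonceStore{ttl: ttl, now: now, expires: make(map[string]time.Time)}
}

// issue creates a random nonce and records when it stops being valid.
func (s *nonceStore) issue() (string, error) {
	raw := make([]byte, 16)
	if _, err := rand.Read(raw); err != nil {
		return "", err
	}
	nonce := base64.RawURLEncoding.EncodeToString(raw)
	s.mu.Lock()
	defer s.mu.Unlock()
	s.expires[nonce] = s.now().Add(s.ttl)
	return nonce, nil
}

// consume accepts a nonce once; unknown, reused, or expired nonces fail.
func (s *nonceStore) consume(nonce string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	exp, ok := s.expires[nonce]
	if !ok {
		return errBadNonce
	}
	delete(s.expires, nonce) // one-time use, whether or not it has expired
	if !s.now().Before(exp) {
		return errBadNonce
	}
	return nil
}

func main() {
	clk := &fakeClock{}
	store := newNonceStore(defaultNonceTTL, clk.now)

	n, err := store.issue()
	if err != nil {
		panic(err)
	}
	fmt.Println("first use:", store.consume(n))
	fmt.Println("replay:   ", store.consume(n))

	stale, _ := store.issue()
	clk.advance(2 * defaultNonceTTL)
	fmt.Println("expired:  ", store.consume(stale))
}
```

A process-local map like this is exactly what the note warns against for production: it is lost on restart and invisible to other replicas, which is why the real store is database-backed.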