certctl/internal/api/acme/errors.go
shankar0123 ec88a61274 acme-server: foundation — directory + new-nonce + per-profile routing (Phase 1a/7)
First slice of the RFC 8555 ACME server endpoint (master plan at
cowork/acme-server-endpoint-prompt.md, per-phase prompts at
cowork/acme-server-prompts/). This commit lands the smallest viable
end-to-end deployable slice: an ACME client running

  curl -sk https://certctl/acme/profile/<id>/directory
  curl -sk -I https://certctl/acme/profile/<id>/new-nonce

successfully fetches the directory document and a Replay-Nonce.
Account creation, JWS verification, orders, challenges, and
revocation are all out of scope for this phase and arrive in Phases
1b–4.
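
For readers who prefer Go to curl, the two requests above can be sketched roughly as follows. This is an illustration only, not certctl code: fetchDirectory, fetchNonce, and newStubServer are hypothetical helpers, the profile ID "demo" is made up, and an httptest server stands in for a real deployment so the example runs without network access (the stub serves plain HTTP, whereas the curl commands use -k against HTTPS).

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// fetchDirectory GETs the directory document and decodes it into a
// generic map keyed by resource name (for example "newNonce").
func fetchDirectory(c *http.Client, url string) (map[string]any, error) {
	resp, err := c.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("directory: unexpected status %d", resp.StatusCode)
	}
	var dir map[string]any
	if err := json.NewDecoder(resp.Body).Decode(&dir); err != nil {
		return nil, err
	}
	return dir, nil
}

// fetchNonce sends a HEAD request and returns the Replay-Nonce header.
func fetchNonce(c *http.Client, url string) (string, error) {
	resp, err := c.Head(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	nonce := resp.Header.Get("Replay-Nonce")
	if nonce == "" {
		return "", fmt.Errorf("new-nonce: no Replay-Nonce header")
	}
	return nonce, nil
}

// newStubServer is a stand-in for certctl (an assumption for this
// sketch): it serves a one-entry directory and a fixed nonce.
func newStubServer() *httptest.Server {
	mux := http.NewServeMux()
	var srv *httptest.Server
	mux.HandleFunc("/acme/profile/demo/directory", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		_ = json.NewEncoder(w).Encode(map[string]string{
			"newNonce": srv.URL + "/acme/profile/demo/new-nonce",
		})
	})
	mux.HandleFunc("/acme/profile/demo/new-nonce", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Replay-Nonce", "stub-nonce")
		w.Header().Set("Cache-Control", "no-store")
		w.WriteHeader(http.StatusOK)
	})
	srv = httptest.NewServer(mux)
	return srv
}

func main() {
	srv := newStubServer()
	defer srv.Close()

	dir, err := fetchDirectory(srv.Client(), srv.URL+"/acme/profile/demo/directory")
	if err != nil {
		panic(err)
	}
	nonceURL, _ := dir["newNonce"].(string)
	nonce, err := fetchNonce(srv.Client(), nonceURL)
	if err != nil {
		panic(err)
	}
	fmt.Println("Replay-Nonce:", nonce)
}
```

Against a real server the client would also need to trust the server's CA rather than skip verification as curl -k does.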

Closes the Rank 1 LHF from the 2026-05-03 Infisical deep-research
(cowork/infisical-deep-research-results.md). Pre-fix, certctl was an
ACME consumer only — no /acme/directory endpoint, no JWS verifier,
no challenge validators. K8s customers running cert-manager could
not point at certctl as an ACME issuer; they had to deploy a certctl
agent on every node.

What ships:
  - internal/api/acme/{directory,nonce,errors}.go (+ tests).
  - internal/api/handler/acme.go + acme_handler_test.go.
  - internal/repository/postgres/acme.go (nonce ops only — Phase 1b
    extends with account CRUD; Phases 2-4 extend with order / authz /
    challenge CRUD).
  - internal/service/acme.go (BuildDirectory + IssueNonce stubs;
    Phase 1b adds VerifyJWS / NewAccount / etc.).
  - migrations/000025_acme_server.{up,down}.sql ships the full 5-table
    ACME schema (acme_accounts / acme_orders / acme_authorizations /
    acme_challenges / acme_nonces) PLUS the per-profile
    certificate_profiles.acme_auth_mode column. Phase 1a actively
    uses only acme_nonces; remaining tables are empty until Phases
    1b-4 plug in.
  - internal/config/config.go: ACMEServerConfig struct + ACMEServer
    field on Config. Env vars use CERTCTL_ACME_SERVER_* prefix to
    avoid colliding with the existing consumer-side ACMEConfig at
    config.go:1746 (CERTCTL_ACME_DIRECTORY_URL / PROFILE /
    CHALLENGE_TYPE etc.). Phase 1a wires Enabled +
    DefaultAuthMode + DefaultProfileID + NonceTTL + DirectoryMeta;
    Order/Authz TTLs + per-challenge-type concurrency caps + DNS01
    resolver are reserved fields parsed in 1a so operators can set
    them ahead of Phases 2/3.
  - cmd/server/main.go: wire ACMEHandler into the HandlerRegistry
    literal alongside the existing certificate / EST / SCEP / etc.
    handlers.
  - internal/api/router/router.go: HandlerRegistry.ACME field + 6
    Register calls (3 per-profile + 3 shorthand).
  - internal/api/router/openapi_parity_test.go: 6 new entries in
    SpecParityExceptions. ACME is a wire-protocol surface (JWS-signed
    JSON over HTTPS per RFC 7515) whose semantics are dictated by
    RFC 8555 + RFC 9773 rather than by an OpenAPI document, same
    precedent as SCEP/EST. The canonical reference is
    docs/acme-server.md.
  - docs/acme-server.md: Phase-1a-shaped reference. Configuration
    table for every CERTCTL_ACME_SERVER_* env var. Per-profile
    auth-mode decision tree skeleton. TLS trust bootstrap section
    flagging cert-manager's ClusterIssuer.spec.acme.caBundle
    requirement (the single biggest first-time-deploy footgun;
    the full cert-manager walkthrough lands in Phase 6 but the
    requirement is documented up front).

Architecture decisions baked in:
  - URL family is /acme/profile/<id>/* (per-profile, canonical) with
    /acme/* shorthand active when CERTCTL_ACME_SERVER_DEFAULT_PROFILE_ID
    is set. Path matches existing per-profile precedent in EST + SCEP.
  - Auth mode is per-profile (acme_auth_mode column on
    certificate_profiles), NOT server-wide. One certctl-server can
    serve trust_authenticated for an internal-PKI profile and
    challenge for a public-trust-style profile simultaneously. The
    column is read at request time, not cached at server start, so a
    mode change an operator makes via SQL takes effect on the next
    order without a restart.
  - Nonces are DB-backed (acme_nonces table) and survive server restart.
    The RFC 8555 §6.5 replay defense requires the store to outlast
    the client's nonce caching window; an in-memory-only nonce
    store would invalidate every outstanding nonce on restart,
    forcing in-flight clients to retry after a badNonce error.
  - Per-op atomic counters on service.ACMEService.Metrics() —
    certctl_acme_directory_total, certctl_acme_directory_failures_total,
    certctl_acme_new_nonce_total, certctl_acme_new_nonce_failures_total.
    Naming follows certctl frozen decision 0.10 cardinality discipline.
    Phase 1b will extend with new_account counters; Phase 2 with
    order / finalize / cert; Phase 3 with per-challenge-type counters.
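
As a rough picture of the per-op atomic counters, the sketch below keeps one atomic.Uint64 per metric and exposes a snapshot keyed by the metric names listed above. The acmeMetrics type, its methods, and the choice to count a failed request in both the total and the failure counter are assumptions for illustration; the actual service.ACMEService.Metrics() shape is not shown in this commit message.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// errLookup is a placeholder failure used by the demo in main.
var errLookup = errors.New("profile lookup failed")

// acmeMetrics holds one lock-free counter per operation outcome.
// It must be used through a pointer; atomic values must not be copied.
type acmeMetrics struct {
	directoryTotal    atomic.Uint64
	directoryFailures atomic.Uint64
	newNonceTotal     atomic.Uint64
	newNonceFailures  atomic.Uint64
}

// observeDirectory records one directory request and, if err is
// non-nil, one failure.
func (m *acmeMetrics) observeDirectory(err error) {
	m.directoryTotal.Add(1)
	if err != nil {
		m.directoryFailures.Add(1)
	}
}

// observeNewNonce records one new-nonce request and, if err is
// non-nil, one failure.
func (m *acmeMetrics) observeNewNonce(err error) {
	m.newNonceTotal.Add(1)
	if err != nil {
		m.newNonceFailures.Add(1)
	}
}

// Snapshot returns the current counter values keyed by metric name.
func (m *acmeMetrics) Snapshot() map[string]uint64 {
	return map[string]uint64{
		"certctl_acme_directory_total":          m.directoryTotal.Load(),
		"certctl_acme_directory_failures_total": m.directoryFailures.Load(),
		"certctl_acme_new_nonce_total":          m.newNonceTotal.Load(),
		"certctl_acme_new_nonce_failures_total": m.newNonceFailures.Load(),
	}
}

func main() {
	m := &acmeMetrics{}
	m.observeDirectory(nil)
	m.observeDirectory(errLookup)
	m.observeNewNonce(nil)

	snap := m.Snapshot()
	fmt.Println(snap["certctl_acme_directory_total"])          // 2
	fmt.Println(snap["certctl_acme_directory_failures_total"]) // 1
	fmt.Println(snap["certctl_acme_new_nonce_total"])          // 1
}
```

Fixed metric names with no per-profile or per-client labels keep the cardinality constant, which fits the cardinality discipline mentioned above.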

Audit fixes #11 + #12 (cowork/acme-server-prompts/audit-additions.md)
applied:
  - #11: CERTCTL_ACME_SERVER_* prefix avoids the consumer-side
    CERTCTL_ACME_* namespace collision.
  - #12: prior-attempt WIP from two failed Phase-1 dispatches was
    discarded at phase start; this commit starts from a clean tree.

Tests:
  - 14 unit tests in internal/api/acme/ (directory, nonce, errors).
  - 7 handler-level tests via httptest.NewServer + mockACMEService
    (mirrors the mockSCEPService pattern at scep_handler_test.go).
  - 7 service-layer tests with mocked repo + injected profileLookup.
  - All pass under -race -count=1 -short.

Deferred to Phase 1b:
  - JWS verification (go-jose v4 — see master-prompt §8a for the API
    surface and audit doc for the speculation pitfalls).
  - new-account / account/<id> endpoints + AccountService.
  - Nonce *consumption* path (issue path is in this commit; consume
    is only invoked by JWS-verified POSTs which Phase 1b adds).
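
The issue/consume split can be sketched with an in-memory stand-in for the acme_nonces table. Everything here is hypothetical (the nonceStore type, its method names, the 16-byte nonce size); the real store is Postgres-backed, and a database implementation would presumably need the lookup and delete in Consume to be a single atomic statement. The base64url encoding follows RFC 8555 §6.5.1.

```go
package main

import (
	"crypto/rand"
	"encoding/base64"
	"errors"
	"fmt"
	"sync"
	"time"
)

// errBadNonce covers unknown, already-used, and expired nonces alike.
var errBadNonce = errors.New("bad nonce")

// nonceStore is an in-memory stand-in for a persistent nonce table.
type nonceStore struct {
	mu     sync.Mutex
	ttl    time.Duration
	expiry map[string]time.Time // nonce -> expiry time
}

func newNonceStore(ttl time.Duration) *nonceStore {
	return &nonceStore{ttl: ttl, expiry: make(map[string]time.Time)}
}

// Issue generates a random nonce, records its expiry, and returns it
// base64url-encoded without padding.
func (s *nonceStore) Issue() (string, error) {
	buf := make([]byte, 16)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	nonce := base64.RawURLEncoding.EncodeToString(buf)

	s.mu.Lock()
	defer s.mu.Unlock()
	s.expiry[nonce] = time.Now().Add(s.ttl)
	return nonce, nil
}

// Consume accepts a nonce at most once. The entry is deleted before
// the expiry check so that an expired nonce cannot be retried.
func (s *nonceStore) Consume(nonce string) error {
	s.mu.Lock()
	defer s.mu.Unlock()

	exp, ok := s.expiry[nonce]
	if !ok {
		return errBadNonce
	}
	delete(s.expiry, nonce)
	if time.Now().After(exp) {
		return errBadNonce
	}
	return nil
}

func main() {
	store := newNonceStore(time.Minute)
	nonce, err := store.Issue()
	if err != nil {
		panic(err)
	}
	fmt.Println("first use: ", store.Consume(nonce))
	fmt.Println("second use:", store.Consume(nonce))
}
```

The first Consume succeeds and the second fails, which is the replay defense the nonce exists to provide.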

Engineering history: cowork/WORKSPACE-CHANGELOG.md "ACME-Server-1a".
Per-phase implementation plan: cowork/acme-server-prompts/.
Master plan + audit fixes: cowork/acme-server-endpoint-prompt.md +
cowork/acme-server-prompt-audit.md +
cowork/acme-server-prompts/audit-additions.md.
2026-05-03 12:55:40 +00:00

// Copyright (c) certctl
// SPDX-License-Identifier: BSL-1.1

package acme

import (
	"encoding/json"
	"net/http"
)

// ProblemContentType is the MIME type RFC 7807 §3 mandates for the
// JSON-Problem error envelope. ACME inherits this from RFC 8555 §6.7.
const ProblemContentType = "application/problem+json"

// ACME error type URN prefix per RFC 8555 §6.7.
const acmeErrorPrefix = "urn:ietf:params:acme:error:"

// Problem is the RFC 7807 Problem Details document. ACME extends it
// per RFC 8555 §6.7 with subproblems (per-identifier-rejection
// breakdowns) and identifier (the failing identifier on
// rejectedIdentifier). Both extension fields land in Phase 2 along
// with the order endpoints; Phase 1a only emits the base shape.
type Problem struct {
	Type        string      `json:"type"`
	Detail      string      `json:"detail"`
	Status      int         `json:"status"`
	Subproblems []Problem   `json:"subproblems,omitempty"`
	Identifier  *Identifier `json:"identifier,omitempty"`
}

// Identifier is the ACME identifier shape (RFC 8555 §7.4). Defined here
// (rather than in a Phase-2-only file) so Phase 1a's Problem struct can
// reference *Identifier without a forward-package-dependency.
type Identifier struct {
	Type  string `json:"type"`
	Value string `json:"value"`
}

// Malformed is RFC 8555 §6.7's error for a request that could not be
// parsed: the request body did not decode, the JWS was malformed, or
// the payload JSON was malformed. HTTP status 400.
func Malformed(detail string) Problem {
	return Problem{
		Type:   acmeErrorPrefix + "malformed",
		Detail: detail,
		Status: http.StatusBadRequest,
	}
}

// ServerInternal is the catch-all for unexpected server-side errors.
// HTTP status 500. The detail string is operator-facing; per the
// master prompt's acquisition-readiness criterion #10 it MUST NOT
// echo SQL errors, internal trace IDs, or credential bytes.
func ServerInternal(detail string) Problem {
	return Problem{
		Type:   acmeErrorPrefix + "serverInternal",
		Detail: detail,
		Status: http.StatusInternalServerError,
	}
}

// UserActionRequired is RFC 8555 §6.7's "the user has to do something
// out of band before this request will succeed" error. We return it
// from the /acme/* shorthand path family when
// CERTCTL_ACME_SERVER_DEFAULT_PROFILE_ID is not set. In that case the
// operator has to either set the env var or update the client to use
// /acme/profile/<id>/*. HTTP status 403 per RFC 8555.
func UserActionRequired(detail string) Problem {
	return Problem{
		Type:   acmeErrorPrefix + "userActionRequired",
		Detail: detail,
		Status: http.StatusForbidden,
	}
}

// UnsupportedContentType is RFC 7807-shaped (no ACME error type) for
// requests with a Content-Type the endpoint doesn't accept. Phase 1b
// will switch the JWS endpoints to require
// "application/jose+json" specifically; Phase 1a's directory + nonce
// have no Content-Type requirements and never emit this.
func UnsupportedContentType(got string) Problem {
	return Problem{
		Type:   "about:blank",
		Detail: "unsupported content type: " + got,
		Status: http.StatusUnsupportedMediaType,
	}
}

// AccountDoesNotExist (RFC 8555 §7.3.1) is what the JWS verifier returns
// when the request's `kid` points at an unknown account. Phase 1b
// implements the verifier; this shape is exposed in Phase 1a for the
// errors_test.go round-trip cases.
func AccountDoesNotExist(detail string) Problem {
	return Problem{
		Type:   acmeErrorPrefix + "accountDoesNotExist",
		Detail: detail,
		Status: http.StatusBadRequest,
	}
}

// BadNonce is what the JWS verifier returns on a missing / replayed /
// expired nonce per RFC 8555 §6.5. Phase 1b wires the verifier;
// shape exposed now so errors_test.go can round-trip it.
func BadNonce(detail string) Problem {
	return Problem{
		Type:   acmeErrorPrefix + "badNonce",
		Detail: detail,
		Status: http.StatusBadRequest,
	}
}

// WriteProblem renders a Problem as RFC 7807 JSON to w, with the
// appropriate Content-Type and status. A Problem whose Status is zero
// (for example, a zero-value Problem from a forgotten error path) is
// rendered as 500 + serverInternal so the handler never panics on an
// invalid status code.
func WriteProblem(w http.ResponseWriter, p Problem) {
	if p.Status == 0 {
		p = ServerInternal("unspecified error")
	}
	w.Header().Set("Content-Type", ProblemContentType)
	w.WriteHeader(p.Status)
	// Problem contains only strings, ints, slices, and a pointer to a
	// plain struct, so the encoding step cannot fail. Encode can still
	// return an error if the write to w fails (for example, the client
	// disconnected), but the status line has already been sent by then,
	// so the error is discarded. This mirrors how response.go handles
	// json.Encoder errors.
	_ = json.NewEncoder(w).Encode(p)
}
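
To see what WriteProblem puts on the wire, the sketch below re-declares a trimmed copy of the type and function so it can run on its own; the lower-case problem, writeProblem, and render names are stand-ins for this example, not part of the package. It records the response with httptest.NewRecorder and prints the status, Content-Type, and body for a badNonce problem.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// problem is a trimmed copy of the Problem struct (base fields only).
type problem struct {
	Type   string `json:"type"`
	Detail string `json:"detail"`
	Status int    `json:"status"`
}

// writeProblem mirrors WriteProblem: a zero Status falls back to a
// 500 serverInternal problem before anything is written.
func writeProblem(w http.ResponseWriter, p problem) {
	if p.Status == 0 {
		p = problem{
			Type:   "urn:ietf:params:acme:error:serverInternal",
			Detail: "unspecified error",
			Status: http.StatusInternalServerError,
		}
	}
	w.Header().Set("Content-Type", "application/problem+json")
	w.WriteHeader(p.Status)
	_ = json.NewEncoder(w).Encode(p)
}

// render captures the response writeProblem produces for p.
func render(p problem) (status int, contentType, body string) {
	rec := httptest.NewRecorder()
	writeProblem(rec, p)
	return rec.Code, rec.Header().Get("Content-Type"), rec.Body.String()
}

func main() {
	status, ct, body := render(problem{
		Type:   "urn:ietf:params:acme:error:badNonce",
		Detail: "nonce expired",
		Status: http.StatusBadRequest,
	})
	fmt.Println(status, ct)
	fmt.Print(body)

	// A zero-value problem takes the serverInternal fallback.
	status, _, _ = render(problem{})
	fmt.Println(status)
}
```

The fallback matters because net/http panics on WriteHeader(0), so a forgotten error path would otherwise crash the handler.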