certctl/internal/api/acme/ari.go
shankar0123 4dc8d3fa5b acme-server: key rollover + revocation + ARI (Phase 4/7)
Closes the RFC 8555 + RFC 9773 surface beyond the issuance happy-path:
  - POST /acme/profile/<id>/key-change   (RFC 8555 §7.3.5)
  - POST /acme/profile/<id>/revoke-cert  (RFC 8555 §7.6)
  - GET  /acme/profile/<id>/renewal-info/<cert-id>  (RFC 9773 ARI)

After this commit, ACME clients can rotate account keys, revoke certs
through the ACME surface (rather than only via the certctl GUI/API),
and fetch ARI for proactive renewal scheduling.

Architecture:
  - Key rollover: outer JWS verified against the registered account key
    (existing kid path); the inner JWS — embedded as the outer's payload
    — verified against the embedded NEW jwk in a new dedicated routine
    (ParseAndVerifyKeyChangeInner) that enforces RFC 8555 §7.3.5
    inner-only invariants: MUST use jwk + MUST NOT use kid, payload
    .account == outer.kid, payload.oldKey thumbprint-equals registered.
    A single WithinTx swaps the stored thumbprint+pem and writes the
    audit row. Concurrent-rollover safety via SELECT…FOR UPDATE on the
    conflicting account row in UpdateAccountJWKWithTx; the loser
    observes the winner's new thumbprint and is told to retry (409).
  - Revocation: two auth paths. kid → AccountOwnsCertificate single-
    indexed COUNT lookup over acme_orders. jwk → constant-time RFC 7638
    thumbprint compare against the cert's pubkey. Both paths route
    through service.RevocationSvc.RevokeCertificateWithActor so the
    existing CRL/OCSP refresh + audit + metrics pipeline applies. RFC
    5280 §5.3.1 numeric reason codes clamp to certctl's
    domain.ValidRevocationReasons; codes 8 (removeFromCRL) + 10
    (aACompromise) clamp to 'unspecified' since they aren't in the set.
  - ARI is GET-only and unauth per RFC 9773 §4. Cert-id wire shape is
    base64url(AKI).base64url(serial); ParseARICertID strict-decodes,
    SerialHex emits the canonical certctl-shape lowercase-no-leading-
    zeros hex used in certificate_versions.serial_number.
    ComputeRenewalWindow has 3 branches: bound RenewalPolicy →
    [notAfter - days, notAfter - days/2]; no policy → inside the last
    third of validity, [notAfter - v/3, notAfter - v/9]; past expiry →
    [now, now + 1d] (renew immediately).
    Retry-After honors CERTCTL_ACME_SERVER_ARI_POLL_INTERVAL.
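
The reason-code clamping described in the revocation bullet can be sketched roughly as follows. The validReasons map is a hypothetical stand-in for domain.ValidRevocationReasons, and clampReason is an illustrative name. Only the absence of codes 8 and 10 comes from the text above; the rest of the set is assumed.

```go
package main

import "fmt"

// validReasons is a HYPOTHETICAL stand-in for domain.ValidRevocationReasons.
// Names follow RFC 5280 §5.3.1; the real certctl set may differ.
var validReasons = map[int]string{
	0: "unspecified",
	1: "keyCompromise",
	2: "cACompromise",
	3: "affiliationChanged",
	4: "superseded",
	5: "cessationOfOperation",
	6: "certificateHold",
	9: "privilegeWithdrawn",
}

// clampReason maps a numeric reason code to its name, falling back to
// "unspecified" for any code outside the accepted set.
func clampReason(code int) string {
	if name, ok := validReasons[code]; ok {
		return name
	}
	return "unspecified"
}

func main() {
	for _, code := range []int{1, 4, 8, 10} {
		fmt.Printf("%d -> %s\n", code, clampReason(code))
	}
}
```

Clamping rather than rejecting means a client that sends removeFromCRL (8) or aACompromise (10) still gets its certificate revoked, just with a less specific reason.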

What ships:
  - internal/api/acme/{keychange,ari}.go (+ phase4_test.go: 15 tests).
  - internal/api/acme/order.go: RevokeCertRequest wire shape.
  - internal/api/handler/acme.go: KeyChange, RevokeCert, RenewalInfo
    + 11 new writeServiceError mappings.
  - internal/repository/postgres/acme.go: UpdateAccountJWKWithTx (FOR
    UPDATE + expectedOldThumbprint precondition; ErrACMEAccountKey-
    ConcurrentUpdate sentinel) + AccountOwnsCertificate.
  - internal/service/acme.go: RotateAccountKey + RevokeCert +
    RenewalInfo; CertificateRevoker + RenewalPolicyLookup interfaces;
    SetRevocationDelegate + SetRenewalPolicyLookup wiring; 11 new
    sentinels; 6 new metrics.
  - internal/service/acme_phase4_test.go: service-layer tests for
    RotateAccountKey (happy + duplicate-key) + RevokeCert (kid mismatch
    + jwk mismatch + jwk happy + already-revoked + reason-clamping) +
    RenewalInfo (disabled + bad cert-id).
  - internal/api/router/router.go: 6 new register calls (3 per-profile
    + 3 shorthand). Router parity exceptions extended in lockstep
    (in-tree SpecParityExceptions + CI-only openapi-handler-exceptions
    .yaml).
  - cmd/server/main.go: SetRevocationDelegate(revocationSvc) +
    SetRenewalPolicyLookup(renewalPolicyRepo) at startup.
  - internal/config/config.go: CERTCTL_ACME_SERVER_ARI_ENABLED (default
    true) + CERTCTL_ACME_SERVER_ARI_POLL_INTERVAL (default 6h);
    BuildDirectory's ariEnabled flag now flips on under
    cfg.ARIEnabled.
  - docs/acme-server.md: phase status flipped to Phase 4; endpoints
    table grows 6 rows (3 per-profile + 3 shorthand); FAQ section
    appended explaining how to rotate keys, revoke certs, and consume
    ARI.

Tests:
  - 'go vet ./...' clean across the repo.
  - 'go test -short -count=1 ./...' green across every package.
  - phase4_test.go covers: keychange happy-path + 5 negatives +
    MapKeyChangeErrorToProblem coverage; ARI cert-id round-trip + 6
    malformed cases + BuildARICertID from a generated cert; window-
    math 3 branches.
  - service-layer tests confirm: RotateAccountKey atomically swaps the
    thumbprint (verifies persisted state) and rejects duplicate keys;
    RevokeCert routes through the stub RevocationSvc with the right
    actor string + reason on the jwk path, rejects mismatched keys,
    rejects already-revoked certs, clamps reason codes correctly;
    RenewalInfo respects ARIEnabled + cert-id format.
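
For readers who want to see what the cert-id round-trip and malformed-input checks exercise, here is a minimal standalone sketch using only the standard library. The function names (encodeCertID, decodeCertID, serialHex) are illustrative and are not the ones in ari.go; the byte values are arbitrary.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"math/big"
	"strings"
)

// encodeCertID joins base64url-no-pad(AKI) and base64url-no-pad(serial) with ".".
func encodeCertID(aki, serial []byte) string {
	enc := base64.RawURLEncoding
	return enc.EncodeToString(aki) + "." + enc.EncodeToString(serial)
}

// decodeCertID splits and strictly decodes a cert-id. Padded input is
// rejected because RawURLEncoding does not accept "=".
func decodeCertID(id string) (aki, serial []byte, err error) {
	parts := strings.Split(id, ".")
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		return nil, nil, fmt.Errorf("cert-id is not <aki>.<serial>: %q", id)
	}
	if aki, err = base64.RawURLEncoding.DecodeString(parts[0]); err != nil {
		return nil, nil, fmt.Errorf("decode AKI: %w", err)
	}
	if serial, err = base64.RawURLEncoding.DecodeString(parts[1]); err != nil {
		return nil, nil, fmt.Errorf("decode serial: %w", err)
	}
	return aki, serial, nil
}

// serialHex renders serial bytes as lowercase hex without leading zeros.
func serialHex(serial []byte) string {
	return new(big.Int).SetBytes(serial).Text(16)
}

func main() {
	id := encodeCertID([]byte{0x01, 0x02, 0x03}, []byte{0x00, 0x0a, 0xbc})
	fmt.Println(id) // AQID.AAq8

	_, serial, err := decodeCertID(id)
	fmt.Println(serialHex(serial), err) // abc <nil>

	_, _, err = decodeCertID("AQID") // missing separator
	fmt.Println(err != nil)

	_, _, err = decodeCertID("AQ==.AAq8") // padded base64 is rejected
	fmt.Println(err != nil)
}
```

The leading 0x00 byte in the serial survives the base64url round-trip but disappears in the hex form, which is why the hex form works as a canonical lookup key.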

Engineering history: cowork/WORKSPACE-CHANGELOG.md 'ACME-Server-4'.
2026-05-03 16:51:06 +00:00

225 lines
7.8 KiB
Go

// Copyright (c) certctl
// SPDX-License-Identifier: BSL-1.1

package acme

import (
	"crypto/x509"
	"encoding/base64"
	"encoding/hex"
	"encoding/pem"
	"errors"
	"fmt"
	"math/big"
	"strings"
	"time"

	"github.com/shankar0123/certctl/internal/domain"
)

// Phase 4: RFC 9773 ACME Renewal Information.
//
// RFC 9773 §4.1: a client computes the cert-id as
//
//	base64url-no-pad(authorityKeyIdentifier) || "." || base64url-no-pad(serial)
//
// and GETs /acme/.../renewal-info/<cert-id>. The server responds with a
// JSON document carrying a `suggestedWindow` (start, end) the client
// SHOULD plan its renewal inside, plus an optional `explanationURL`.
// Response also carries a Retry-After header (RFC 9773 §4.2) hinting
// at the next-poll cadence.
//
// This file:
//
//   - parses the cert-id wire format → (akiBytes, serialBytes).
//   - converts the serial bytes to a hex string in the canonical
//     certctl shape (lowercase, no leading zeros, matching how
//     internal/repository/postgres/certificate.go stores them).
//   - computes the suggested-window from a cert's NotAfter and an
//     optional bound RenewalPolicy (last 33% of validity if no policy
//     is bound).

// RenewalInfoResponse is the JSON document returned by the renewal-
// info endpoint per RFC 9773 §4.1.
type RenewalInfoResponse struct {
	SuggestedWindow RenewalWindow `json:"suggestedWindow"`
	ExplanationURL  string        `json:"explanationURL,omitempty"`
}

// RenewalWindow is the embedded {start, end} pair. RFC 9773 mandates
// start ≤ end; the server is responsible for emitting RFC 3339 UTC
// timestamps.
type RenewalWindow struct {
	Start time.Time `json:"start"`
	End   time.Time `json:"end"`
}

// ARICertID is the parsed shape of an RFC 9773 §4.1 cert-id:
// authorityKeyIdentifier and serial bytes after base64url-no-pad
// decoding. Callers compare against the certificate they already have
// in the database; AKI is informational on the server side because
// certctl's serial-uniqueness invariant is per-issuer.
type ARICertID struct {
	// AKI is the raw bytes of the certificate's authorityKeyIdentifier
	// extension.
	AKI []byte
	// Serial is the raw bytes of the certificate's serial number, in
	// big-endian unsigned-integer form.
	Serial []byte
}

// SerialHex returns the canonical certctl-shape hex representation of
// the serial number: lowercase, no leading zeros (matches what's
// stored in certificate_versions.serial_number).
func (a ARICertID) SerialHex() string {
	if len(a.Serial) == 0 {
		return ""
	}
	n := new(big.Int).SetBytes(a.Serial)
	if n.Sign() == 0 {
		return "0"
	}
	return strings.ToLower(n.Text(16))
}

// AKIHex returns the AKI as a lowercase hex string. Useful for logging
// + future per-AKI lookup paths.
func (a ARICertID) AKIHex() string {
	return strings.ToLower(hex.EncodeToString(a.AKI))
}

// Sentinel errors. ChooseProblem in writeServiceError translates the
// not-found cases to RFC 7807 + RFC 8555 §6.7 problems.
var (
	ErrARICertIDMalformed   = errors.New("acme ari: cert-id is not <aki>.<serial>")
	ErrARICertIDDecodeAKI   = errors.New("acme ari: cert-id AKI is not valid base64url")
	ErrARICertIDDecodeSeria = errors.New("acme ari: cert-id serial is not valid base64url")
	ErrARICertIDEmpty       = errors.New("acme ari: cert-id has empty AKI or serial")
)

// ParseARICertID decodes an RFC 9773 §4.1 cert-id. The wire format is
// strictly base64url-NO-PADDING; RFC 9773 §4.1 forbids regular base64.
//
// Common malformations:
//   - missing or extra `.` separator → ErrARICertIDMalformed.
//   - either side fails base64url decode → ErrARICertIDDecode*.
//   - either side is empty, before or after decoding → ErrARICertIDEmpty.
func ParseARICertID(certID string) (*ARICertID, error) {
	parts := strings.Split(certID, ".")
	if len(parts) != 2 {
		return nil, fmt.Errorf("%w: got %d parts", ErrARICertIDMalformed, len(parts))
	}
	if parts[0] == "" || parts[1] == "" {
		return nil, ErrARICertIDEmpty
	}
	aki, err := base64.RawURLEncoding.DecodeString(parts[0])
	if err != nil {
		return nil, fmt.Errorf("%w: %v", ErrARICertIDDecodeAKI, err)
	}
	serial, err := base64.RawURLEncoding.DecodeString(parts[1])
	if err != nil {
		return nil, fmt.Errorf("%w: %v", ErrARICertIDDecodeSeria, err)
	}
	if len(aki) == 0 || len(serial) == 0 {
		return nil, ErrARICertIDEmpty
	}
	return &ARICertID{AKI: aki, Serial: serial}, nil
}

// BuildARICertID is the inverse of ParseARICertID, useful for tests
// and operator tools that want to construct a cert-id from a leaf cert.
//
// The input is the leaf certificate's PEM. We extract the
// authorityKeyIdentifier extension and the serial number, then
// base64url-no-pad-encode each + join with a `.`.
func BuildARICertID(certPEM string) (string, error) {
	block, _ := pem.Decode([]byte(certPEM))
	if block == nil {
		return "", fmt.Errorf("acme ari: pem decode failed")
	}
	cert, err := x509.ParseCertificate(block.Bytes)
	if err != nil {
		return "", fmt.Errorf("acme ari: parse cert: %w", err)
	}
	if len(cert.AuthorityKeyId) == 0 {
		return "", fmt.Errorf("acme ari: certificate has no authorityKeyIdentifier extension")
	}
	if cert.SerialNumber == nil {
		return "", fmt.Errorf("acme ari: certificate has no serial number")
	}
	akiB64 := base64.RawURLEncoding.EncodeToString(cert.AuthorityKeyId)
	serialB64 := base64.RawURLEncoding.EncodeToString(cert.SerialNumber.Bytes())
	return akiB64 + "." + serialB64, nil
}

// ComputeRenewalWindow returns the RFC 9773 suggestedWindow for a
// (cert, optional renewal-policy) pair.
//
// Algorithm:
//
//   - When the cert is past NotAfter (checked first): the window
//     starts at "now" and ends at "now + 1 day" so a client polling on
//     an expired cert gets a "renew immediately" answer rather than a
//     window in the past.
//   - When policy is non-nil and policy.RenewalWindowDays > 0: the
//     window starts at NotAfter - RenewalWindowDays + spans half of
//     RenewalWindowDays. So a 30-day-renewal-window cert with NotAfter
//     2026-06-30 emits start=2026-05-31, end=2026-06-15. This matches
//     boulder's default ARI behavior + ensures a Let's-Encrypt-shaped
//     client can plan its renewals exactly inside our renewal window.
//   - When policy is nil OR RenewalWindowDays ≤ 0: the window opens
//     one third of the validity period before NotAfter and closes one
//     ninth before it. So a cert with NotBefore 2026-01-01 + NotAfter
//     2026-04-01 (90d validity) emits start=2026-03-02 (30d before
//     expiry), end=2026-03-22 (10d before expiry).
//   - When there is no policy and NotBefore is unknown (nil version)
//     or not before NotAfter: a 1-day window ending at NotAfter.
//
// In the policy and validity-based branches a start that already lies
// in the past is clamped to "now", and end is raised to start if needed.
//
// Returns (start, end). start ≤ end is invariant. A nil cert yields
// two zero times.
func ComputeRenewalWindow(cert *domain.ManagedCertificate, version *domain.CertificateVersion, policy *domain.RenewalPolicy, now time.Time) (time.Time, time.Time) {
	if cert == nil {
		return time.Time{}, time.Time{}
	}
	notAfter := cert.ExpiresAt.UTC()
	notBefore := notAfter
	if version != nil && !version.NotBefore.IsZero() {
		notBefore = version.NotBefore.UTC()
	}
	// Past expiry: emit a 1-day "renew now" window.
	if !now.IsZero() && now.UTC().After(notAfter) {
		nowUTC := now.UTC()
		return nowUTC, nowUTC.Add(24 * time.Hour)
	}
	if policy != nil && policy.RenewalWindowDays > 0 {
		windowDays := time.Duration(policy.RenewalWindowDays) * 24 * time.Hour
		start := notAfter.Add(-windowDays)
		end := start.Add(windowDays / 2)
		// Defensive: never emit a start earlier than "now".
		if !now.IsZero() && start.Before(now.UTC()) {
			start = now.UTC()
		}
		if end.Before(start) {
			end = start
		}
		return start, end
	}
	// No policy → window inside the last third of validity.
	validity := notAfter.Sub(notBefore)
	if validity <= 0 {
		// Degenerate cert (nb >= na). Use a 1-day default window
		// ending at notAfter.
		return notAfter.Add(-24 * time.Hour), notAfter
	}
	thirty3 := validity / 3
	start := notAfter.Add(-thirty3)
	// End sits one third of the renewal third (validity/9) before
	// expiry, so the window covers the first two thirds of that third.
	end := notAfter.Add(-thirty3 / 3)
	if !now.IsZero() && start.Before(now.UTC()) {
		start = now.UTC()
	}
	if end.Before(start) {
		end = start
	}
	return start, end
}