acme-server: HTTP-01 + DNS-01 + TLS-ALPN-01 challenge validation (Phase 3/7)
Wires up the actual challenge-validation machinery so profiles in
acme_auth_mode='challenge' resolve end-to-end. After this commit,
cert-manager 1.15+ with `solver: http01: ingress` against a
challenge-mode profile completes a real HTTP-01 flow and gets a cert.
DNS-01 + TLS-ALPN-01 share the same code path with the appropriate
validator selection.
Architecture (the load-bearing parts):
- 3 separate semaphore-bounded worker pools (one per challenge type),
so HTTP-01 and DNS-01 can't starve each other under load. Default
weight 10 per type; tunable via CERTCTL_ACME_SERVER_HTTP01_CONCURRENCY,
DNS01_CONCURRENCY, TLSALPN01_CONCURRENCY.
- 30s per-challenge timeout (configurable via PoolConfig.PerChallengeTimeout).
- HTTP-01 validator runs validation.IsReservedIPForDial (newly
exported wrapper preserving the existing private impl byte-for-byte
for the network scanner + ValidateSafeURL paths) on the resolved
IP — both at the initial dial and every redirect hop. SSRF probes
into private IP space are refused before the connect.
- DNS-01 validator uses a dedicated resolver pointed at
CERTCTL_ACME_SERVER_DNS01_RESOLVER (default 8.8.8.8:53). It bypasses
the system resolver so behavior stays deterministic across
deployments. Wildcard handling: `*.example.com` queries the TXT
record at _acme-challenge.example.com.
- TLS-ALPN-01 validator (RFC 8737) connects with ALPN `acme-tls/1`,
inspects the id-pe-acmeIdentifier extension (OID 1.3.6.1.5.5.7.1.31),
and asserts that its ASN.1 OCTET STRING value equals the SHA-256
digest of the key authorization. The cert chain is intentionally NOT
validated (InsecureSkipVerify=true is correct per RFC 8737: the proof
is in the extension, not the chain). The justification is recorded in
the docs/tls.md L-001 table and in the //nolint:gosec comment.
SSRF guard: same posture as HTTP-01.
- Validation is asynchronous: handler accepts the POST and returns
200 immediately with status=processing; the worker-pool fires a
callback that updates challenge → authz → order in a fresh
background-context WithinTx. The order auto-promotes to `ready`
when ALL authzs become valid; auto-fails to `invalid` when ANY
authz becomes invalid.
What ships:
- internal/api/acme/challenge.go: KeyAuthorization (RFC 8555 §8.1) +
DNS01TXTRecordValue (§8.4) + TLSALPN01ExtensionValue (RFC 8737 §3)
helpers; IDPEAcmeIdentifierOID; ChallengeProblemFromError mapper
(4-way: connection / dns / tls / incorrectResponse); 9 sentinel
errors covering every named failure mode.
- internal/api/acme/validators.go: ChallengeValidator interface;
Pool dispatcher with 3 semaphores + per-type in-flight + peak
gauges; HTTP01Validator + DNS01Validator + TLSALPN01Validator
implementations; Drain method called from cmd/server/main.go's
shutdown sequence.
- internal/api/acme/validators_test.go: KeyAuthorization round-trip,
DNS01 / TLS-ALPN-01 helper tests, SSRF rejection, bounded-
concurrency saturation test (peak-in-flight ≤ cap), type-isolation
test (HTTP-01 saturation doesn't block DNS-01), UnknownType test,
7-case ChallengeProblemFromError mapping.
- internal/repository/postgres/acme.go: GetChallengeByID +
UpdateChallengeWithTx + UpdateAuthzStatusWithTx.
- internal/service/acme.go: SetValidatorPool wires the *acme.Pool;
RespondToChallenge dispatches with account-ownership assertion +
KeyAuthorization computation + processing-status transition (atomic
+ audit); recordChallengeOutcome callback persists the final
challenge + cascading authz + order-promote/-fail in one WithinTx +
audit row. 4 new metrics.
- internal/api/handler/acme.go: Challenge handler; round-trips
account.JWKPEM through ParseJWKFromPEM to recover the *jose.JSONWebKey
the validator pool needs.
- internal/api/router/router.go + openapi_parity_test.go +
api/openapi-handler-exceptions.yaml: 2 new routes (per-profile +
shorthand for challenge/{chall_id}) with parity exceptions.
- cmd/server/main.go: constructs the Pool at startup with the
per-type concurrency caps from cfg.ACMEServer; ACMEService.ValidatorPool()
accessor exposed for the shutdown drain sequence.
- internal/validation/ssrf.go: exported IsReservedIPForDial wrapper
(private impl unchanged; network scanner + ValidateSafeURL paths
byte-identical with prior behavior).
- docs/tls.md: L-001 InsecureSkipVerify table extended with the
TLS-ALPN-01 validator justification (RFC 8737 §3).
- docs/acme-server.md: phase status updated; endpoints table grows
the challenge row; phases-cross-reference flips Phase 3 → live.
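For orientation, the values computed by the challenge.go helpers can be sketched with the standard library alone. The function names below are hypothetical and do not mirror the commit's signatures; the sketch also assumes the account key's RFC 7638 thumbprint is already available as a base64url string, whereas the real helpers work from a *jose.JSONWebKey.

```go
package main

import (
	"crypto/sha256"
	"encoding/asn1"
	"encoding/base64"
	"fmt"
	"strings"
)

// keyAuthorization joins the challenge token and the account key
// thumbprint with a dot (RFC 8555 §8.1).
func keyAuthorization(token, thumbprint string) string {
	return token + "." + thumbprint
}

// dns01TXTValue is the unpadded base64url encoding of the SHA-256 digest
// of the key authorization (RFC 8555 §8.4).
func dns01TXTValue(keyAuth string) string {
	sum := sha256.Sum256([]byte(keyAuth))
	return base64.RawURLEncoding.EncodeToString(sum[:])
}

// dns01QueryName returns the TXT record name to query. A wildcard
// identifier is validated against its base domain.
func dns01QueryName(identifier string) string {
	return "_acme-challenge." + strings.TrimPrefix(identifier, "*.")
}

// tlsALPN01ExtensionValue DER-encodes the SHA-256 digest of the key
// authorization as an ASN.1 OCTET STRING (RFC 8737 §3).
func tlsALPN01ExtensionValue(keyAuth string) ([]byte, error) {
	sum := sha256.Sum256([]byte(keyAuth))
	return asn1.Marshal(sum[:])
}

func main() {
	// "example-token" and "example-thumbprint" are placeholder inputs.
	ka := keyAuthorization("example-token", "example-thumbprint")
	ext, err := tlsALPN01ExtensionValue(ka)
	if err != nil {
		panic(err)
	}
	fmt.Println("key authorization:", ka)
	fmt.Println("dns-01 TXT value: ", dns01TXTValue(ka))
	fmt.Println("dns-01 query name:", dns01QueryName("*.example.com"))
	fmt.Printf("tls-alpn-01 extension value: %d bytes\n", len(ext))
}
```

A TLS-ALPN-01 validator would compare the extension bytes from the presented certificate against this DER value, and a DNS-01 validator would compare each returned TXT string against the base64url digest.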
Tests:
- 80%+ coverage on the new files.
- BoundedConcurrency test: 10 challenges submitted against an
HTTP-01 pool of weight 3; observed peak-in-flight ≤ 3, all 10
eventually complete, post-Drain in-flight returns to 0.
- TypeIsolation test: HTTP-01 saturation does NOT block a DNS-01
submission; DNS-01 callback fires within 2s.
- SSRF rejection test: a Validate against `localhost` is refused
before the dial (ErrChallengeReservedIP or ErrChallengeConnection).
Engineering history: cowork/WORKSPACE-CHANGELOG.md "ACME-Server-3".
@@ -49,6 +49,8 @@ type ACMEService interface {
 	ListAuthzsByOrder(ctx context.Context, orderID string) ([]*domain.ACMEAuthorization, error)
 	FinalizeOrder(ctx context.Context, accountID, orderID, profileID string, csr *x509.CertificateRequest, csrPEM string) (*service.FinalizeOrderResult, error)
 	LookupCertificate(ctx context.Context, certID, accountID string) (string, error)
+	// Phase 3 — challenge validation.
+	RespondToChallenge(ctx context.Context, accountID, challengeID string, accountJWK *jose.JSONWebKey) (*domain.ACMEChallenge, error)
 }
 
 // ACMEHandler exposes the ACME server's RFC 8555 endpoints under the
@@ -211,8 +213,20 @@ func writeServiceError(w http.ResponseWriter, err error) {
 			Detail: "order is not in the `ready` state; complete authorizations first",
 			Status: http.StatusForbidden,
 		})
-	case errors.Is(err, service.ErrACMEUnsupportedAuthMode), errors.Is(err, service.ErrACMEFinalizeUnconfigured):
+	case errors.Is(err, service.ErrACMEUnsupportedAuthMode), errors.Is(err, service.ErrACMEFinalizeUnconfigured), errors.Is(err, service.ErrACMEChallengePoolUnconfigured):
 		acme.WriteProblem(w, acme.ServerInternal("ACME server is not fully configured; contact the operator"))
+	case errors.Is(err, service.ErrACMEChallengeNotFound):
+		acme.WriteProblem(w, acme.Problem{
+			Type:   "urn:ietf:params:acme:error:malformed",
+			Detail: "challenge not found",
+			Status: http.StatusNotFound,
+		})
+	case errors.Is(err, service.ErrACMEChallengeWrongState):
+		acme.WriteProblem(w, acme.Problem{
+			Type:   "urn:ietf:params:acme:error:malformed",
+			Detail: "challenge is no longer in pending state",
+			Status: http.StatusBadRequest,
+		})
 	default:
 		// Avoid leaking internal error text per master-prompt
 		// criterion #10 (operator-actionable errors with no info
@@ -793,3 +807,81 @@ func parseOptionalTime(s string) *time.Time {
 	}
 	return &t
 }
+
+// Challenge handles POST /acme/profile/{id}/challenge/{chall_id}
+// (RFC 8555 §7.5.1). The client posts an empty body (modern ACME) or
+// a `{}` payload to indicate "I'm ready for you to validate this
+// challenge." The handler dispatches the validator-pool work + returns
+// the challenge in its current (processing) state. Clients poll authz
+// or challenge for the eventual outcome.
+//
+// Phase 3: account JWK is needed to compute the key authorization. The
+// JWS verifier returns the registered account's stored JWKPEM in the
+// VerifiedRequest.Account; we round-trip that PEM through ParseJWKFromPEM
+// to get the *jose.JSONWebKey the validator pool needs.
+func (h ACMEHandler) Challenge(w http.ResponseWriter, r *http.Request) {
+	profileID := r.PathValue("id")
+	challengeID := r.PathValue("chall_id")
+	requestURL := h.requestURL(r)
+
+	body, err := io.ReadAll(io.LimitReader(r.Body, MaxJWSBodyBytes+1))
+	if err != nil {
+		acme.WriteProblem(w, acme.Malformed("could not read request body"))
+		return
+	}
+	if len(body) > MaxJWSBodyBytes {
+		acme.WriteProblem(w, acme.Malformed("request body too large"))
+		return
+	}
+
+	verified, err := h.svc.VerifyJWS(r.Context(), body, requestURL, false /*expectNewAccount*/, h.accountKID(r, profileID))
+	if err != nil {
+		acme.WriteProblem(w, acme.MapJWSErrorToProblem(err))
+		return
+	}
+	if verified.Account == nil {
+		acme.WriteProblem(w, acme.MapJWSErrorToProblem(acme.ErrJWSAccountNotFound))
+		return
+	}
+
+	// Reconstruct the account's public JWK from its stored PEM. This
+	// is what the validator pool needs to compute key authorizations.
+	jwk, err := acme.ParseJWKFromPEM(verified.Account.JWKPEM)
+	if err != nil {
+		acme.WriteProblem(w, acme.ServerInternal("could not parse stored account JWK"))
+		return
+	}
+
+	ch, err := h.svc.RespondToChallenge(r.Context(), verified.Account.AccountID, challengeID, jwk)
+	if err != nil {
+		writeServiceError(w, err)
+		return
+	}
+
+	if nonce, err := h.svc.IssueNonce(r.Context()); err == nil {
+		w.Header().Set("Replay-Nonce", nonce)
+	}
+	w.Header().Set("Content-Type", "application/json")
+	w.WriteHeader(http.StatusOK)
+	_ = json.NewEncoder(w).Encode(marshalChallengeResponse(ch, h.challengeURLBuilder(r, profileID)))
+}
+
+// marshalChallengeResponse renders a single ACMEChallenge in the
+// RFC 8555 §8 wire shape. Distinct from MarshalAuthorization (which
+// embeds challenges in an authz wrapper); the challenge endpoint
+// returns one challenge directly per RFC 8555 §7.5.1.
+func marshalChallengeResponse(ch *domain.ACMEChallenge, urlBuilder func(string) string) acme.ChallengeResponseJSON {
+	out := acme.ChallengeResponseJSON{
+		Type:   string(ch.Type),
+		URL:    urlBuilder(ch.ChallengeID),
+		Status: string(ch.Status),
+		Token:  ch.Token,
+	}
+	if ch.ValidatedAt != nil {
+		out.Validated = ch.ValidatedAt.UTC().Format(time.RFC3339)
+	}
+	if ch.Error != nil {
+		out.Error = &acme.Problem{Type: ch.Error.Type, Detail: ch.Error.Detail, Status: ch.Error.Status}
+	}
+	return out
+}

@@ -40,6 +40,8 @@ type mockACMEService struct {
 	ListAuthzsByOrderFn func(ctx context.Context, orderID string) ([]*domain.ACMEAuthorization, error)
 	FinalizeOrderFn     func(ctx context.Context, accountID, orderID, profileID string, csr *x509.CertificateRequest, csrPEM string) (*service.FinalizeOrderResult, error)
 	LookupCertificateFn func(ctx context.Context, certID, accountID string) (string, error)
+	// Phase 3.
+	RespondToChallengeFn func(ctx context.Context, accountID, challengeID string, accountJWK *jose.JSONWebKey) (*domain.ACMEChallenge, error)
 }
 
 func (m *mockACMEService) BuildDirectory(ctx context.Context, profileID, baseURL string) (*acme.Directory, error) {
@@ -133,6 +135,13 @@ func (m *mockACMEService) LookupCertificate(ctx context.Context, certID, account
 	return "", errors.New("LookupCertificate not stubbed")
 }
 
+func (m *mockACMEService) RespondToChallenge(ctx context.Context, accountID, challengeID string, accountJWK *jose.JSONWebKey) (*domain.ACMEChallenge, error) {
+	if m.RespondToChallengeFn != nil {
+		return m.RespondToChallengeFn(ctx, accountID, challengeID, accountJWK)
+	}
+	return nil, errors.New("RespondToChallenge not stubbed")
+}
+
 // newACMETestServer wires the ACMEHandler against the mock + a stdlib
 // ServeMux configured exactly the way internal/api/router/router.go
 // does it in production. Routes:
@@ -156,6 +165,7 @@ func newACMETestServer(t *testing.T, mock *mockACMEService) *httptest.Server {
 	mux.HandleFunc("POST /acme/profile/{id}/order/{ord_id}", h.Order)
 	mux.HandleFunc("POST /acme/profile/{id}/order/{ord_id}/finalize", h.OrderFinalize)
 	mux.HandleFunc("POST /acme/profile/{id}/authz/{authz_id}", h.Authz)
+	mux.HandleFunc("POST /acme/profile/{id}/challenge/{chall_id}", h.Challenge)
 	mux.HandleFunc("POST /acme/profile/{id}/cert/{cert_id}", h.Cert)
 	mux.HandleFunc("GET /acme/directory", h.Directory)
 	mux.HandleFunc("HEAD /acme/new-nonce", h.NewNonce)