mirror of
https://github.com/shankar0123/certctl.git
synced 2026-06-07 14:21:37 +00:00
refactor(service/acme): split into sibling files — Option B (Phase 9, 9 of N — partial)
Phase 9 ARCH-M2 closure Sprint 9. Splits internal/service/acme.go
(was 1965 LOC, the top hotspot after Sprints 1-8 finished the
config + main-binary cuts) via the Option B sibling-file pattern —
new files stay in `package service` so every external caller of
`service.ACMEService.{IssueNonce,LookupAuthz,ListAuthzsByOrder,
RespondToChallenge,GarbageCollect}` resolves the same way. Pure
mechanical relocation; no signature, no behavior, no import-graph
change.
Why Option B (not a subpackage)
================================
A subpackage (e.g. `internal/service/acme/`) would have meant
rebadging every public method receiver to its new package — that's
import-path churn for ~70 call sites across handlers, scheduler,
cmd/server wiring, MCP tools, and tests, plus the cyclic-import
risk of pulling acme back into `service` for the shared interfaces.
Option B sacrifices the encapsulation discipline a subpackage
would have given (sibling files can still reach into each other's
unexported state because Go scopes are per-package), but in
exchange the diff is restricted to file moves + four sed deletes;
zero importer touches anywhere outside this directory. The
trade-off matches every prior Sprint 1-7 config cut.
What moved
==========
New `internal/service/acme_nonces.go` (46 LOC)
----------------------------------------------
The IssueNonce method (RFC 8555 §6.5 Replay-Nonce issuance). The
nonceAdapter type — which wraps ACMERepo.ConsumeNonce for the JWS
verifier — stays in acme.go alongside VerifyJWS because it's
verification-infrastructure plumbing, not a server-issues-nonce
concern.
New `internal/service/acme_authz.go` (45 LOC)
---------------------------------------------
LookupAuthz + ListAuthzsByOrder (the authz read-side). Authz write-
side (status cascade after challenge validation) lives in
acme_challenges.go alongside recordChallengeOutcome where it
belongs operationally; the authz creation path stays inside
CreateOrder in acme.go (orders own per-order authz row creation).
New `internal/service/acme_challenges.go` (267 LOC)
---------------------------------------------------
The whole Phase 3 challenge dispatch + validator callback concern:
the `// --- Phase 3 — challenge dispatch + validator callback ---`
banner, the ChallengeResponseShape struct, the HTTP-facing
RespondToChallenge method (which transitions challenge → processing
and submits to the validator pool), and the asynchronous
recordChallengeOutcome callback (which persists final challenge
status and cascades the parent authz + order status). Largest
single extract this sprint by line count.
New `internal/service/acme_gc.go` (74 LOC)
------------------------------------------
The Phase 5 ACME GC sweep: scheduler-invoked GarbageCollect entry
point (3 sweeps: nonces, expired authzs, expired orders) and the
atomicAddUint64 counter helper (only consumed by the sweep body
for the rows-affected-N case the default `bump` doesn't cover).
What deferred
=============
Sprint 9 was originally scoped to ship 5 sub-files (nonces / authz /
challenges / orders / gc). The orders cut — CreateOrder +
LookupOrder + FinalizeOrder + LookupCertificate + the orders
helpers (randIDSuffix / base32encode / identifierStrings /
firstAvailableIssuer / accountOwnsACMECert / mapACMERevocationReason) +
FinalizeOrderResult — is ~700 LOC spread across multiple non-
contiguous regions in acme.go, with the orders helpers also feeding
into RevokeCert / RenewalInfo on the Phase 4 side. Disentangling
which helpers move with orders vs which stay with Phase 4 needs a
focused sprint of its own to avoid leaving a half-cut helper
declared in one file but called from a sibling — which works
(same package) but defeats the point of organising by concern.
Deferred to a potential Sprint 9b.
Net effect
==========
acme.go: 1965 → 1634 LOC (-331). Four new sibling files at 432 LOC
total. The headline 1965-LOC hotspot drops below the next-tier
candidates (mcp/tools.go, auth_session_oidc.go, cmd/agent/main.go).
Behavior preservation contract
==============================
1. gofmt -l clean across all 5 affected files.
2. go vet ./internal/service/... — no findings.
3. staticcheck ./internal/service/... — no findings.
4. go test -short -count=1 ./internal/service/... — green.
5. Broader-importer build green:
go build ./cmd/server/... ./internal/api/handler/...
./internal/scheduler/... ./internal/mcp/...
6. Broader-importer tests green:
go test -short -count=1 ./cmd/server/... ./internal/api/handler/...
./internal/scheduler/...
7. Per-import-symbol audit: all 8 imports remaining in acme.go
(context, cryptorand, x509, errors, fmt, strings, sync/atomic,
time, jose, internal/api/acme, internal/config, internal/domain,
internal/repository) verified used by surviving code. New
sibling files carry only the imports their extracted code needs.
The Option B sibling-file shape means same-package resolution
preserves access to ACMEService's unexported state from every
extracted method without any visibility tweaks. Worth noting for
the future: this also means a careless future caller could reach
through file boundaries and re-tangle concerns; the file headers
document the intended boundary but Go's tooling won't enforce it.
Why this is a partial sprint
============================
Splitting into 4 of 5 named sub-files now (vs blocking until orders
is also clean) keeps the hotspot count down with this commit and
lets a follow-up Sprint 9b focus exclusively on the orders cut
without re-touching the four files this sprint ships. Same
"smallest useful slice, document the rest" cadence as Sprint 8
splitting into 8a (mechanical) + 8b (behavior-aware).
Refs: ARCH-M2 (god-files), Phase 9 audit. Last in the config /
service hotspot chain before the agent + mcp + auth-session cuts
land in Sprints 10-12.
This commit is contained in:
@@ -420,28 +420,6 @@ func (s *ACMEService) BuildDirectory(ctx context.Context, profileID, baseURL str
|
||||
return dir, nil
|
||||
}
|
||||
|
||||
// IssueNonce generates a fresh ACME nonce, persists it with the
|
||||
// configured TTL, and returns the encoded string for the
|
||||
// Replay-Nonce header.
|
||||
//
|
||||
// RFC 8555 §6.5: every successful ACME response carries a
|
||||
// Replay-Nonce. Phase 1a wires this via the directory + new-nonce
|
||||
// handlers; Phase 1b extends with new-account + account/<id> POST
|
||||
// responses (the JWS-authenticated paths).
|
||||
func (s *ACMEService) IssueNonce(ctx context.Context) (string, error) {
|
||||
nonce, err := acme.GenerateNonce()
|
||||
if err != nil {
|
||||
s.metrics.bump(&s.metrics.NewNonceFailureTotal)
|
||||
return "", fmt.Errorf("acme: generate nonce: %w", err)
|
||||
}
|
||||
if err := s.repo.IssueNonce(ctx, nonce, s.cfg.NonceTTL); err != nil {
|
||||
s.metrics.bump(&s.metrics.NewNonceFailureTotal)
|
||||
return "", fmt.Errorf("acme: persist nonce: %w", err)
|
||||
}
|
||||
s.metrics.bump(&s.metrics.NewNonceTotal)
|
||||
return nonce, nil
|
||||
}
|
||||
|
||||
// resolveProfile applies the default-profile fallback and confirms the
|
||||
// profile exists. Returns the resolved (canonical) profileID on
|
||||
// success. Centralizing the resolution here keeps every Phase
|
||||
@@ -996,26 +974,6 @@ func (s *ACMEService) LookupOrder(ctx context.Context, orderID, accountID string
|
||||
return order, nil
|
||||
}
|
||||
|
||||
// LookupAuthz returns an authz by ID. Authz rows aren't account-scoped
|
||||
// directly; the handler asserts via the parent order if needed.
|
||||
func (s *ACMEService) LookupAuthz(ctx context.Context, authzID string) (*domain.ACMEAuthorization, error) {
|
||||
authz, err := s.repo.GetAuthzByID(ctx, authzID)
|
||||
if err != nil {
|
||||
if errors.Is(err, repository.ErrNotFound) {
|
||||
return nil, ErrACMEAuthzNotFound
|
||||
}
|
||||
return nil, fmt.Errorf("acme: lookup authz: %w", err)
|
||||
}
|
||||
s.metrics.bump(&s.metrics.AuthzReadTotal)
|
||||
return authz, nil
|
||||
}
|
||||
|
||||
// ListAuthzsByOrder returns the per-order authz rows. Used by
|
||||
// MarshalOrder to compute the authorizations URL list.
|
||||
func (s *ACMEService) ListAuthzsByOrder(ctx context.Context, orderID string) ([]*domain.ACMEAuthorization, error) {
|
||||
return s.repo.ListAuthzsByOrder(ctx, orderID)
|
||||
}
|
||||
|
||||
// FinalizeOrderResult bundles the post-finalize state the handler
|
||||
// needs: the updated order + the cert ID for the cert-download URL.
|
||||
type FinalizeOrderResult struct {
|
||||
@@ -1323,242 +1281,6 @@ func identifierStrings(ids []domain.ACMEIdentifier) []string {
|
||||
return out
|
||||
}
|
||||
|
||||
// --- Phase 3 — challenge dispatch + validator callback -----------------
|
||||
|
||||
// ChallengeResponseShape is what RespondToChallenge returns to the
|
||||
// handler: the post-dispatch challenge row (status=processing) so the
|
||||
// handler can render it via acme.MarshalAuthorization-equivalent. The
|
||||
// validator goroutine writes the final status (valid/invalid) as a
|
||||
// callback after dispatch completes — clients fetching the challenge
|
||||
// via authz GET get the eventual state.
|
||||
type ChallengeResponseShape struct {
|
||||
Challenge *domain.ACMEChallenge
|
||||
}
|
||||
|
||||
// RespondToChallenge handles POST /acme/profile/<id>/challenge/<chall_id>
|
||||
// per RFC 8555 §7.5.1.
|
||||
//
|
||||
// Behavior:
|
||||
// - Look up the challenge + parent authz + parent order; assert the
|
||||
// account owns the order.
|
||||
// - If the challenge is already valid/invalid → idempotent return.
|
||||
// - If pending: transition to processing (atomic via WithinTx + audit).
|
||||
// - Submit to the validator pool with an onComplete callback that
|
||||
// transitions the challenge to valid/invalid in another WithinTx
|
||||
// (and cascades the parent authz status).
|
||||
// - Return the challenge in its current (processing) state; the
|
||||
// client polls authz/challenge for the eventual outcome.
|
||||
func (s *ACMEService) RespondToChallenge(
|
||||
ctx context.Context,
|
||||
accountID, challengeID string,
|
||||
accountJWK *jose.JSONWebKey,
|
||||
) (*domain.ACMEChallenge, error) {
|
||||
if s.tx == nil || s.auditService == nil {
|
||||
s.metrics.bump(&s.metrics.ChallengeRespondFailTotal)
|
||||
return nil, fmt.Errorf("acme: respond-to-challenge requires SetTransactor + SetAuditService")
|
||||
}
|
||||
if s.validatorPool == nil {
|
||||
s.metrics.bump(&s.metrics.ChallengeRespondFailTotal)
|
||||
return nil, ErrACMEChallengePoolUnconfigured
|
||||
}
|
||||
// Phase 5 — per-challenge respond rate limit. Defends against retry
|
||||
// storms from a misbehaving client. Keyed by challengeID (not
|
||||
// accountID) so a flood against one challenge doesn't drain the
|
||||
// account's whole budget.
|
||||
if s.rateLimiter != nil && s.cfg.RateLimitChallengeRespondsPerHour > 0 {
|
||||
if !s.rateLimiter.Allow(acme.ActionChallengeRespond, challengeID, s.cfg.RateLimitChallengeRespondsPerHour) {
|
||||
s.metrics.bump(&s.metrics.ChallengeRespondFailTotal)
|
||||
return nil, ErrACMERateLimited
|
||||
}
|
||||
}
|
||||
|
||||
ch, err := s.repo.GetChallengeByID(ctx, challengeID)
|
||||
if err != nil {
|
||||
s.metrics.bump(&s.metrics.ChallengeRespondFailTotal)
|
||||
if errors.Is(err, repository.ErrNotFound) {
|
||||
return nil, ErrACMEChallengeNotFound
|
||||
}
|
||||
return nil, fmt.Errorf("acme: lookup challenge: %w", err)
|
||||
}
|
||||
|
||||
// Idempotent re-POST: already valid/invalid → just return.
|
||||
if ch.Status == domain.ACMEChallengeStatusValid || ch.Status == domain.ACMEChallengeStatusInvalid {
|
||||
s.metrics.bump(&s.metrics.ChallengeRespondTotal)
|
||||
return ch, nil
|
||||
}
|
||||
if ch.Status == domain.ACMEChallengeStatusProcessing {
|
||||
// In-flight. Return the row as-is.
|
||||
s.metrics.bump(&s.metrics.ChallengeRespondTotal)
|
||||
return ch, nil
|
||||
}
|
||||
|
||||
// Confirm the requesting account owns the parent authz/order.
|
||||
authz, err := s.repo.GetAuthzByID(ctx, ch.AuthzID)
|
||||
if err != nil {
|
||||
s.metrics.bump(&s.metrics.ChallengeRespondFailTotal)
|
||||
return nil, fmt.Errorf("acme: lookup parent authz: %w", err)
|
||||
}
|
||||
order, err := s.repo.GetOrderByID(ctx, authz.OrderID)
|
||||
if err != nil {
|
||||
s.metrics.bump(&s.metrics.ChallengeRespondFailTotal)
|
||||
return nil, fmt.Errorf("acme: lookup parent order: %w", err)
|
||||
}
|
||||
if order.AccountID != accountID {
|
||||
s.metrics.bump(&s.metrics.ChallengeRespondFailTotal)
|
||||
return nil, ErrACMEOrderUnauthorized
|
||||
}
|
||||
|
||||
// Compute the key authorization the validator needs.
|
||||
expected, err := acme.KeyAuthorization(ch.Token, accountJWK)
|
||||
if err != nil {
|
||||
s.metrics.bump(&s.metrics.ChallengeRespondFailTotal)
|
||||
return nil, fmt.Errorf("acme: key authorization: %w", err)
|
||||
}
|
||||
|
||||
// Transition challenge → processing (atomic with audit row).
|
||||
ch.Status = domain.ACMEChallengeStatusProcessing
|
||||
if err := s.tx.WithinTx(ctx, func(q repository.Querier) error {
|
||||
if err := s.repo.UpdateChallengeWithTx(ctx, q, ch); err != nil {
|
||||
return err
|
||||
}
|
||||
return s.auditService.RecordEventWithTx(ctx, q,
|
||||
fmt.Sprintf("acme:%s", accountID), domain.ActorTypeUser,
|
||||
"acme_challenge_processing", "acme_challenge", ch.ChallengeID,
|
||||
map[string]interface{}{
|
||||
"authz_id": ch.AuthzID,
|
||||
"type": string(ch.Type),
|
||||
"identifier": authz.Identifier.Value,
|
||||
})
|
||||
}); err != nil {
|
||||
s.metrics.bump(&s.metrics.ChallengeRespondFailTotal)
|
||||
return nil, err
|
||||
}
|
||||
|
||||
// Submit to the pool. The onComplete callback persists the final
|
||||
// challenge status + cascades the parent authz status. We detach
|
||||
// from the request context via context.WithoutCancel so the
|
||||
// callback's WithinTx survives the HTTP handler returning, while
|
||||
// preserving inherited values (logger, trace IDs, audit actor).
|
||||
bgctx := context.WithoutCancel(ctx)
|
||||
chSnapshot := *ch
|
||||
authzSnapshot := *authz
|
||||
identifier := authz.Identifier.Value
|
||||
s.validatorPool.Submit(bgctx, string(ch.Type), identifier, ch.Token, expected, func(verr error) {
|
||||
s.recordChallengeOutcome(bgctx, accountID, &chSnapshot, &authzSnapshot, verr)
|
||||
})
|
||||
|
||||
s.metrics.bump(&s.metrics.ChallengeRespondTotal)
|
||||
return ch, nil
|
||||
}
|
||||
|
||||
// recordChallengeOutcome is the validator-pool callback. Persists the
|
||||
// challenge's final status + cascades the parent authz status.
|
||||
//
|
||||
// Authz cascade: if the challenge succeeded, the authz becomes valid
|
||||
// (RFC 8555 §7.1.6: any one challenge passing makes the authz valid).
|
||||
// If the challenge failed, the authz becomes invalid only if no other
|
||||
// pending challenges remain (Phase 3 minimal-viable path: we mark the
|
||||
// authz invalid on first failure since Phase 3 emits 1 challenge per
|
||||
// authz; Phase 4+ extending to multi-challenge-per-authz revisits this).
|
||||
func (s *ACMEService) recordChallengeOutcome(
|
||||
ctx context.Context,
|
||||
accountID string,
|
||||
ch *domain.ACMEChallenge,
|
||||
authz *domain.ACMEAuthorization,
|
||||
verr error,
|
||||
) {
|
||||
now := time.Now().UTC()
|
||||
var newAuthzStatus domain.ACMEAuthzStatus
|
||||
if verr == nil {
|
||||
ch.Status = domain.ACMEChallengeStatusValid
|
||||
ch.ValidatedAt = &now
|
||||
ch.Error = nil
|
||||
newAuthzStatus = domain.ACMEAuthzStatusValid
|
||||
s.metrics.bump(&s.metrics.ChallengeValidateValid)
|
||||
} else {
|
||||
ch.Status = domain.ACMEChallengeStatusInvalid
|
||||
if p := acme.ChallengeProblemFromError(string(ch.Type), verr); p != nil {
|
||||
ch.Error = &domain.ACMEProblem{
|
||||
Type: p.Type,
|
||||
Detail: p.Detail,
|
||||
Status: p.Status,
|
||||
}
|
||||
}
|
||||
newAuthzStatus = domain.ACMEAuthzStatusInvalid
|
||||
s.metrics.bump(&s.metrics.ChallengeValidateInvalid)
|
||||
}
|
||||
|
||||
auditDetails := map[string]interface{}{
|
||||
"authz_id": ch.AuthzID,
|
||||
"type": string(ch.Type),
|
||||
"identifier": authz.Identifier.Value,
|
||||
"valid": verr == nil,
|
||||
}
|
||||
if verr != nil {
|
||||
auditDetails["error"] = verr.Error()
|
||||
}
|
||||
|
||||
_ = s.tx.WithinTx(ctx, func(q repository.Querier) error {
|
||||
if err := s.repo.UpdateChallengeWithTx(ctx, q, ch); err != nil {
|
||||
return err
|
||||
}
|
||||
if err := s.repo.UpdateAuthzStatusWithTx(ctx, q, ch.AuthzID, newAuthzStatus); err != nil {
|
||||
return err
|
||||
}
|
||||
// Cascade: if the authz turned valid, see whether the order's
|
||||
// authzs are now ALL valid; flip order to ready if so.
|
||||
// Read-after-write to confirm.
|
||||
authzs, err := s.repo.ListAuthzsByOrder(ctx, authz.OrderID)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
allValid := len(authzs) > 0
|
||||
anyInvalid := false
|
||||
for _, a := range authzs {
|
||||
if a.AuthzID == ch.AuthzID {
|
||||
if newAuthzStatus != domain.ACMEAuthzStatusValid {
|
||||
allValid = false
|
||||
}
|
||||
if newAuthzStatus == domain.ACMEAuthzStatusInvalid {
|
||||
anyInvalid = true
|
||||
}
|
||||
continue
|
||||
}
|
||||
if a.Status != domain.ACMEAuthzStatusValid {
|
||||
allValid = false
|
||||
}
|
||||
if a.Status == domain.ACMEAuthzStatusInvalid {
|
||||
anyInvalid = true
|
||||
}
|
||||
}
|
||||
order, err := s.repo.GetOrderByID(ctx, authz.OrderID)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
switch {
|
||||
case allValid && order.Status == domain.ACMEOrderStatusPending:
|
||||
order.Status = domain.ACMEOrderStatusReady
|
||||
if err := s.repo.UpdateOrderWithTx(ctx, q, order); err != nil {
|
||||
return err
|
||||
}
|
||||
case anyInvalid && order.Status == domain.ACMEOrderStatusPending:
|
||||
order.Status = domain.ACMEOrderStatusInvalid
|
||||
order.Error = &domain.ACMEProblem{
|
||||
Type: "urn:ietf:params:acme:error:incorrectResponse",
|
||||
Detail: "one or more authorizations failed",
|
||||
Status: 403,
|
||||
}
|
||||
if err := s.repo.UpdateOrderWithTx(ctx, q, order); err != nil {
|
||||
return err
|
||||
}
|
||||
}
|
||||
return s.auditService.RecordEventWithTx(ctx, q,
|
||||
fmt.Sprintf("acme:%s", accountID), domain.ActorTypeUser,
|
||||
"acme_challenge_completed", "acme_challenge", ch.ChallengeID,
|
||||
auditDetails)
|
||||
})
|
||||
}
|
||||
|
||||
// --- Phase 4 — key rollover + revocation + ARI -------------------------
|
||||
|
||||
// RotateAccountKey is the service-layer entry point for RFC 8555
|
||||
@@ -1910,56 +1632,3 @@ func mapACMERevocationReason(code int) string {
|
||||
return string(domain.RevocationReasonUnspecified)
|
||||
}
|
||||
}
|
||||
|
||||
// GarbageCollect runs a single ACME GC sweep. Phase 5 — the scheduler
|
||||
// invokes this every cfg.GCInterval. Three independent sweeps:
|
||||
//
|
||||
// 1. Delete used / expired nonces.
|
||||
// 2. Transition expired pending authzs to `expired`.
|
||||
// 3. Transition expired pending/ready/processing orders to `invalid`.
|
||||
//
|
||||
// Each sweep is a single SQL statement (no per-row transactions) so a
|
||||
// large reap is one atomic write per sweep. Per-sweep errors are
|
||||
// logged-and-continued: a failing nonces sweep doesn't block the
|
||||
// authzs sweep. Returns the first error encountered (for caller
|
||||
// telemetry); per-sweep counts are recorded on metrics regardless.
|
||||
//
|
||||
// Idempotent — repeated runs are safe; the second run finds 0 rows.
|
||||
func (s *ACMEService) GarbageCollect(ctx context.Context) error {
|
||||
s.metrics.bump(&s.metrics.GCRunsTotal)
|
||||
var firstErr error
|
||||
|
||||
if n, err := s.repo.GCExpiredNonces(ctx); err != nil {
|
||||
s.metrics.bump(&s.metrics.GCRunFailuresTotal)
|
||||
if firstErr == nil {
|
||||
firstErr = fmt.Errorf("acme gc: nonces: %w", err)
|
||||
}
|
||||
} else if n > 0 {
|
||||
atomicAddUint64(&s.metrics.GCNoncesReapedTotal, uint64(n))
|
||||
}
|
||||
|
||||
if n, err := s.repo.GCExpireAuthorizations(ctx); err != nil {
|
||||
s.metrics.bump(&s.metrics.GCRunFailuresTotal)
|
||||
if firstErr == nil {
|
||||
firstErr = fmt.Errorf("acme gc: authzs: %w", err)
|
||||
}
|
||||
} else if n > 0 {
|
||||
atomicAddUint64(&s.metrics.GCAuthzsExpiredTotal, uint64(n))
|
||||
}
|
||||
|
||||
if n, err := s.repo.GCInvalidateExpiredOrders(ctx); err != nil {
|
||||
s.metrics.bump(&s.metrics.GCRunFailuresTotal)
|
||||
if firstErr == nil {
|
||||
firstErr = fmt.Errorf("acme gc: orders: %w", err)
|
||||
}
|
||||
} else if n > 0 {
|
||||
atomicAddUint64(&s.metrics.GCOrdersInvalidatedTotal, uint64(n))
|
||||
}
|
||||
|
||||
return firstErr
|
||||
}
|
||||
|
||||
// atomicAddUint64 adds delta to the counter. The metrics struct exposes
|
||||
// only `bump` (add 1) by default; this helper covers the
|
||||
// rows-affected-N case the GC needs.
|
||||
func atomicAddUint64(c *atomic.Uint64, delta uint64) { c.Add(delta) }
|
||||
|
||||
@@ -0,0 +1,45 @@
|
||||
// Copyright 2026 certctl LLC. All rights reserved.
|
||||
// SPDX-License-Identifier: BUSL-1.1
|
||||
|
||||
package service
|
||||
|
||||
import (
|
||||
"context"
|
||||
"errors"
|
||||
"fmt"
|
||||
|
||||
"github.com/certctl-io/certctl/internal/domain"
|
||||
"github.com/certctl-io/certctl/internal/repository"
|
||||
)
|
||||
|
||||
// Phase 9 ARCH-M2 closure Sprint 9 (2026-05-14): extracted from
|
||||
// internal/service/acme.go via the Option B sibling-file pattern.
|
||||
// Package stays `service`; every external caller of
|
||||
// `service.ACMEService.LookupAuthz(...)` / `ListAuthzsByOrder(...)`
|
||||
// resolves the same way — pure mechanical relocation.
|
||||
//
|
||||
// This file holds the authz read-side concern. The authz write-side
|
||||
// (status cascade after challenge validation) lives in
|
||||
// acme_challenges.go alongside recordChallengeOutcome where it
|
||||
// belongs operationally; the authz creation path stays inside
|
||||
// CreateOrder in acme.go (orders own the per-order authz rows).
|
||||
|
||||
// LookupAuthz returns an authz by ID. Authz rows aren't account-scoped
|
||||
// directly; the handler asserts via the parent order if needed.
|
||||
func (s *ACMEService) LookupAuthz(ctx context.Context, authzID string) (*domain.ACMEAuthorization, error) {
|
||||
authz, err := s.repo.GetAuthzByID(ctx, authzID)
|
||||
if err != nil {
|
||||
if errors.Is(err, repository.ErrNotFound) {
|
||||
return nil, ErrACMEAuthzNotFound
|
||||
}
|
||||
return nil, fmt.Errorf("acme: lookup authz: %w", err)
|
||||
}
|
||||
s.metrics.bump(&s.metrics.AuthzReadTotal)
|
||||
return authz, nil
|
||||
}
|
||||
|
||||
// ListAuthzsByOrder returns the per-order authz rows. Used by
|
||||
// MarshalOrder to compute the authorizations URL list.
|
||||
func (s *ACMEService) ListAuthzsByOrder(ctx context.Context, orderID string) ([]*domain.ACMEAuthorization, error) {
|
||||
return s.repo.ListAuthzsByOrder(ctx, orderID)
|
||||
}
|
||||
@@ -0,0 +1,267 @@
|
||||
// Copyright 2026 certctl LLC. All rights reserved.
|
||||
// SPDX-License-Identifier: BUSL-1.1
|
||||
|
||||
package service
|
||||
|
||||
import (
|
||||
"context"
|
||||
"errors"
|
||||
"fmt"
|
||||
"time"
|
||||
|
||||
jose "github.com/go-jose/go-jose/v4"
|
||||
|
||||
"github.com/certctl-io/certctl/internal/api/acme"
|
||||
"github.com/certctl-io/certctl/internal/domain"
|
||||
"github.com/certctl-io/certctl/internal/repository"
|
||||
)
|
||||
|
||||
// Phase 9 ARCH-M2 closure Sprint 9 (2026-05-14): extracted from
|
||||
// internal/service/acme.go via the Option B sibling-file pattern.
|
||||
// Package stays `service`; every external caller of
|
||||
// `service.ACMEService.RespondToChallenge(...)` resolves the same
|
||||
// way — pure mechanical relocation.
|
||||
//
|
||||
// This file holds the Phase 3 challenge dispatch + validator
|
||||
// callback concern: the HTTP-facing RespondToChallenge entry point
|
||||
// (which transitions the challenge to `processing` and submits it
|
||||
// to the validator pool) plus the asynchronous recordChallengeOutcome
|
||||
// callback (which persists the final challenge status and cascades
|
||||
// the parent authz + order status). The authz read-side
|
||||
// (LookupAuthz / ListAuthzsByOrder) lives in acme_authz.go.
|
||||
|
||||
// --- Phase 3 — challenge dispatch + validator callback -----------------
|
||||
|
||||
// ChallengeResponseShape is what RespondToChallenge returns to the
|
||||
// handler: the post-dispatch challenge row (status=processing) so the
|
||||
// handler can render it via acme.MarshalAuthorization-equivalent. The
|
||||
// validator goroutine writes the final status (valid/invalid) as a
|
||||
// callback after dispatch completes — clients fetching the challenge
|
||||
// via authz GET get the eventual state.
|
||||
type ChallengeResponseShape struct {
|
||||
Challenge *domain.ACMEChallenge
|
||||
}
|
||||
|
||||
// RespondToChallenge handles POST /acme/profile/<id>/challenge/<chall_id>
|
||||
// per RFC 8555 §7.5.1.
|
||||
//
|
||||
// Behavior:
|
||||
// - Look up the challenge + parent authz + parent order; assert the
|
||||
// account owns the order.
|
||||
// - If the challenge is already valid/invalid → idempotent return.
|
||||
// - If pending: transition to processing (atomic via WithinTx + audit).
|
||||
// - Submit to the validator pool with an onComplete callback that
|
||||
// transitions the challenge to valid/invalid in another WithinTx
|
||||
// (and cascades the parent authz status).
|
||||
// - Return the challenge in its current (processing) state; the
|
||||
// client polls authz/challenge for the eventual outcome.
|
||||
func (s *ACMEService) RespondToChallenge(
|
||||
ctx context.Context,
|
||||
accountID, challengeID string,
|
||||
accountJWK *jose.JSONWebKey,
|
||||
) (*domain.ACMEChallenge, error) {
|
||||
if s.tx == nil || s.auditService == nil {
|
||||
s.metrics.bump(&s.metrics.ChallengeRespondFailTotal)
|
||||
return nil, fmt.Errorf("acme: respond-to-challenge requires SetTransactor + SetAuditService")
|
||||
}
|
||||
if s.validatorPool == nil {
|
||||
s.metrics.bump(&s.metrics.ChallengeRespondFailTotal)
|
||||
return nil, ErrACMEChallengePoolUnconfigured
|
||||
}
|
||||
// Phase 5 — per-challenge respond rate limit. Defends against retry
|
||||
// storms from a misbehaving client. Keyed by challengeID (not
|
||||
// accountID) so a flood against one challenge doesn't drain the
|
||||
// account's whole budget.
|
||||
if s.rateLimiter != nil && s.cfg.RateLimitChallengeRespondsPerHour > 0 {
|
||||
if !s.rateLimiter.Allow(acme.ActionChallengeRespond, challengeID, s.cfg.RateLimitChallengeRespondsPerHour) {
|
||||
s.metrics.bump(&s.metrics.ChallengeRespondFailTotal)
|
||||
return nil, ErrACMERateLimited
|
||||
}
|
||||
}
|
||||
|
||||
ch, err := s.repo.GetChallengeByID(ctx, challengeID)
|
||||
if err != nil {
|
||||
s.metrics.bump(&s.metrics.ChallengeRespondFailTotal)
|
||||
if errors.Is(err, repository.ErrNotFound) {
|
||||
return nil, ErrACMEChallengeNotFound
|
||||
}
|
||||
return nil, fmt.Errorf("acme: lookup challenge: %w", err)
|
||||
}
|
||||
|
||||
// Idempotent re-POST: already valid/invalid → just return.
|
||||
if ch.Status == domain.ACMEChallengeStatusValid || ch.Status == domain.ACMEChallengeStatusInvalid {
|
||||
s.metrics.bump(&s.metrics.ChallengeRespondTotal)
|
||||
return ch, nil
|
||||
}
|
||||
if ch.Status == domain.ACMEChallengeStatusProcessing {
|
||||
// In-flight. Return the row as-is.
|
||||
s.metrics.bump(&s.metrics.ChallengeRespondTotal)
|
||||
return ch, nil
|
||||
}
|
||||
|
||||
// Confirm the requesting account owns the parent authz/order.
|
||||
authz, err := s.repo.GetAuthzByID(ctx, ch.AuthzID)
|
||||
if err != nil {
|
||||
s.metrics.bump(&s.metrics.ChallengeRespondFailTotal)
|
||||
return nil, fmt.Errorf("acme: lookup parent authz: %w", err)
|
||||
}
|
||||
order, err := s.repo.GetOrderByID(ctx, authz.OrderID)
|
||||
if err != nil {
|
||||
s.metrics.bump(&s.metrics.ChallengeRespondFailTotal)
|
||||
return nil, fmt.Errorf("acme: lookup parent order: %w", err)
|
||||
}
|
||||
if order.AccountID != accountID {
|
||||
s.metrics.bump(&s.metrics.ChallengeRespondFailTotal)
|
||||
return nil, ErrACMEOrderUnauthorized
|
||||
}
|
||||
|
||||
// Compute the key authorization the validator needs.
|
||||
expected, err := acme.KeyAuthorization(ch.Token, accountJWK)
|
||||
if err != nil {
|
||||
s.metrics.bump(&s.metrics.ChallengeRespondFailTotal)
|
||||
return nil, fmt.Errorf("acme: key authorization: %w", err)
|
||||
}
|
||||
|
||||
// Transition challenge → processing (atomic with audit row).
|
||||
ch.Status = domain.ACMEChallengeStatusProcessing
|
||||
if err := s.tx.WithinTx(ctx, func(q repository.Querier) error {
|
||||
if err := s.repo.UpdateChallengeWithTx(ctx, q, ch); err != nil {
|
||||
return err
|
||||
}
|
||||
return s.auditService.RecordEventWithTx(ctx, q,
|
||||
fmt.Sprintf("acme:%s", accountID), domain.ActorTypeUser,
|
||||
"acme_challenge_processing", "acme_challenge", ch.ChallengeID,
|
||||
map[string]interface{}{
|
||||
"authz_id": ch.AuthzID,
|
||||
"type": string(ch.Type),
|
||||
"identifier": authz.Identifier.Value,
|
||||
})
|
||||
}); err != nil {
|
||||
s.metrics.bump(&s.metrics.ChallengeRespondFailTotal)
|
||||
return nil, err
|
||||
}
|
||||
|
||||
// Submit to the pool. The onComplete callback persists the final
|
||||
// challenge status + cascades the parent authz status. We detach
|
||||
// from the request context via context.WithoutCancel so the
|
||||
// callback's WithinTx survives the HTTP handler returning, while
|
||||
// preserving inherited values (logger, trace IDs, audit actor).
|
||||
bgctx := context.WithoutCancel(ctx)
|
||||
chSnapshot := *ch
|
||||
authzSnapshot := *authz
|
||||
identifier := authz.Identifier.Value
|
||||
s.validatorPool.Submit(bgctx, string(ch.Type), identifier, ch.Token, expected, func(verr error) {
|
||||
s.recordChallengeOutcome(bgctx, accountID, &chSnapshot, &authzSnapshot, verr)
|
||||
})
|
||||
|
||||
s.metrics.bump(&s.metrics.ChallengeRespondTotal)
|
||||
return ch, nil
|
||||
}
|
||||
|
||||
// recordChallengeOutcome is the validator-pool callback. Persists the
|
||||
// challenge's final status + cascades the parent authz status.
|
||||
//
|
||||
// Authz cascade: if the challenge succeeded, the authz becomes valid
|
||||
// (RFC 8555 §7.1.6: any one challenge passing makes the authz valid).
|
||||
// If the challenge failed, the authz becomes invalid only if no other
|
||||
// pending challenges remain (Phase 3 minimal-viable path: we mark the
|
||||
// authz invalid on first failure since Phase 3 emits 1 challenge per
|
||||
// authz; Phase 4+ extending to multi-challenge-per-authz revisits this).
|
||||
func (s *ACMEService) recordChallengeOutcome(
|
||||
ctx context.Context,
|
||||
accountID string,
|
||||
ch *domain.ACMEChallenge,
|
||||
authz *domain.ACMEAuthorization,
|
||||
verr error,
|
||||
) {
|
||||
now := time.Now().UTC()
|
||||
var newAuthzStatus domain.ACMEAuthzStatus
|
||||
if verr == nil {
|
||||
ch.Status = domain.ACMEChallengeStatusValid
|
||||
ch.ValidatedAt = &now
|
||||
ch.Error = nil
|
||||
newAuthzStatus = domain.ACMEAuthzStatusValid
|
||||
s.metrics.bump(&s.metrics.ChallengeValidateValid)
|
||||
} else {
|
||||
ch.Status = domain.ACMEChallengeStatusInvalid
|
||||
if p := acme.ChallengeProblemFromError(string(ch.Type), verr); p != nil {
|
||||
ch.Error = &domain.ACMEProblem{
|
||||
Type: p.Type,
|
||||
Detail: p.Detail,
|
||||
Status: p.Status,
|
||||
}
|
||||
}
|
||||
newAuthzStatus = domain.ACMEAuthzStatusInvalid
|
||||
s.metrics.bump(&s.metrics.ChallengeValidateInvalid)
|
||||
}
|
||||
|
||||
auditDetails := map[string]interface{}{
|
||||
"authz_id": ch.AuthzID,
|
||||
"type": string(ch.Type),
|
||||
"identifier": authz.Identifier.Value,
|
||||
"valid": verr == nil,
|
||||
}
|
||||
if verr != nil {
|
||||
auditDetails["error"] = verr.Error()
|
||||
}
|
||||
|
||||
_ = s.tx.WithinTx(ctx, func(q repository.Querier) error {
|
||||
if err := s.repo.UpdateChallengeWithTx(ctx, q, ch); err != nil {
|
||||
return err
|
||||
}
|
||||
if err := s.repo.UpdateAuthzStatusWithTx(ctx, q, ch.AuthzID, newAuthzStatus); err != nil {
|
||||
return err
|
||||
}
|
||||
// Cascade: if the authz turned valid, see whether the order's
|
||||
// authzs are now ALL valid; flip order to ready if so.
|
||||
// Read-after-write to confirm.
|
||||
authzs, err := s.repo.ListAuthzsByOrder(ctx, authz.OrderID)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
allValid := len(authzs) > 0
|
||||
anyInvalid := false
|
||||
for _, a := range authzs {
|
||||
if a.AuthzID == ch.AuthzID {
|
||||
if newAuthzStatus != domain.ACMEAuthzStatusValid {
|
||||
allValid = false
|
||||
}
|
||||
if newAuthzStatus == domain.ACMEAuthzStatusInvalid {
|
||||
anyInvalid = true
|
||||
}
|
||||
continue
|
||||
}
|
||||
if a.Status != domain.ACMEAuthzStatusValid {
|
||||
allValid = false
|
||||
}
|
||||
if a.Status == domain.ACMEAuthzStatusInvalid {
|
||||
anyInvalid = true
|
||||
}
|
||||
}
|
||||
order, err := s.repo.GetOrderByID(ctx, authz.OrderID)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
switch {
|
||||
case allValid && order.Status == domain.ACMEOrderStatusPending:
|
||||
order.Status = domain.ACMEOrderStatusReady
|
||||
if err := s.repo.UpdateOrderWithTx(ctx, q, order); err != nil {
|
||||
return err
|
||||
}
|
||||
case anyInvalid && order.Status == domain.ACMEOrderStatusPending:
|
||||
order.Status = domain.ACMEOrderStatusInvalid
|
||||
order.Error = &domain.ACMEProblem{
|
||||
Type: "urn:ietf:params:acme:error:incorrectResponse",
|
||||
Detail: "one or more authorizations failed",
|
||||
Status: 403,
|
||||
}
|
||||
if err := s.repo.UpdateOrderWithTx(ctx, q, order); err != nil {
|
||||
return err
|
||||
}
|
||||
}
|
||||
return s.auditService.RecordEventWithTx(ctx, q,
|
||||
fmt.Sprintf("acme:%s", accountID), domain.ActorTypeUser,
|
||||
"acme_challenge_completed", "acme_challenge", ch.ChallengeID,
|
||||
auditDetails)
|
||||
})
|
||||
}
|
||||
@@ -0,0 +1,74 @@
|
||||
// Copyright 2026 certctl LLC. All rights reserved.
|
||||
// SPDX-License-Identifier: BUSL-1.1
|
||||
|
||||
package service
|
||||
|
||||
import (
|
||||
"context"
|
||||
"fmt"
|
||||
"sync/atomic"
|
||||
)
|
||||
|
||||
// Phase 9 ARCH-M2 closure Sprint 9 (2026-05-14): extracted from
|
||||
// internal/service/acme.go via the Option B sibling-file pattern.
|
||||
// Package stays `service`; every external caller of
|
||||
// `service.ACMEService.GarbageCollect(...)` resolves the same way —
|
||||
// pure mechanical relocation.
|
||||
//
|
||||
// This file holds the Phase 5 ACME GC sweep concern: the scheduler-
|
||||
// invoked GarbageCollect entry point plus the atomicAddUint64
|
||||
// counter helper (only consumed inside the sweep body for the
|
||||
// rows-affected-N case the default `bump` doesn't cover).
|
||||
|
||||
// GarbageCollect runs a single ACME GC sweep. Phase 5 — the scheduler
|
||||
// invokes this every cfg.GCInterval. Three independent sweeps:
|
||||
//
|
||||
// 1. Delete used / expired nonces.
|
||||
// 2. Transition expired pending authzs to `expired`.
|
||||
// 3. Transition expired pending/ready/processing orders to `invalid`.
|
||||
//
|
||||
// Each sweep is a single SQL statement (no per-row transactions) so a
|
||||
// large reap is one atomic write per sweep. Per-sweep errors are
|
||||
// logged-and-continued: a failing nonces sweep doesn't block the
|
||||
// authzs sweep. Returns the first error encountered (for caller
|
||||
// telemetry); per-sweep counts are recorded on metrics regardless.
|
||||
//
|
||||
// Idempotent — repeated runs are safe; the second run finds 0 rows.
|
||||
func (s *ACMEService) GarbageCollect(ctx context.Context) error {
|
||||
s.metrics.bump(&s.metrics.GCRunsTotal)
|
||||
var firstErr error
|
||||
|
||||
if n, err := s.repo.GCExpiredNonces(ctx); err != nil {
|
||||
s.metrics.bump(&s.metrics.GCRunFailuresTotal)
|
||||
if firstErr == nil {
|
||||
firstErr = fmt.Errorf("acme gc: nonces: %w", err)
|
||||
}
|
||||
} else if n > 0 {
|
||||
atomicAddUint64(&s.metrics.GCNoncesReapedTotal, uint64(n))
|
||||
}
|
||||
|
||||
if n, err := s.repo.GCExpireAuthorizations(ctx); err != nil {
|
||||
s.metrics.bump(&s.metrics.GCRunFailuresTotal)
|
||||
if firstErr == nil {
|
||||
firstErr = fmt.Errorf("acme gc: authzs: %w", err)
|
||||
}
|
||||
} else if n > 0 {
|
||||
atomicAddUint64(&s.metrics.GCAuthzsExpiredTotal, uint64(n))
|
||||
}
|
||||
|
||||
if n, err := s.repo.GCInvalidateExpiredOrders(ctx); err != nil {
|
||||
s.metrics.bump(&s.metrics.GCRunFailuresTotal)
|
||||
if firstErr == nil {
|
||||
firstErr = fmt.Errorf("acme gc: orders: %w", err)
|
||||
}
|
||||
} else if n > 0 {
|
||||
atomicAddUint64(&s.metrics.GCOrdersInvalidatedTotal, uint64(n))
|
||||
}
|
||||
|
||||
return firstErr
|
||||
}
|
||||
|
||||
// atomicAddUint64 adds delta to the counter. The metrics struct exposes
|
||||
// only `bump` (add 1) by default; this helper covers the
|
||||
// rows-affected-N case the GC needs.
|
||||
func atomicAddUint64(c *atomic.Uint64, delta uint64) { c.Add(delta) }
|
||||
@@ -0,0 +1,46 @@
|
||||
// Copyright 2026 certctl LLC. All rights reserved.
|
||||
// SPDX-License-Identifier: BUSL-1.1
|
||||
|
||||
package service
|
||||
|
||||
import (
|
||||
"context"
|
||||
"fmt"
|
||||
|
||||
"github.com/certctl-io/certctl/internal/api/acme"
|
||||
)
|
||||
|
||||
// Phase 9 ARCH-M2 closure Sprint 9 (2026-05-14): extracted from
|
||||
// internal/service/acme.go via the Option B sibling-file pattern
|
||||
// (operator's choice post-Sprint-8). Package stays `service`; every
|
||||
// external caller of `service.ACMEService.IssueNonce(...)` resolves
|
||||
// the same way — pure mechanical relocation.
|
||||
//
|
||||
// This file holds the SERVER-issues-nonce concern: the IssueNonce
|
||||
// method that generates + persists a fresh ACME nonce for the
|
||||
// Replay-Nonce header per RFC 8555 §6.5. The nonceAdapter type
|
||||
// (which wraps ACMERepo.ConsumeNonce for the JWS verifier) stays
|
||||
// in acme.go alongside VerifyJWS — it's a verification-infrastructure
|
||||
// helper, not a server-side nonce concern.
|
||||
|
||||
// IssueNonce generates a fresh ACME nonce, persists it with the
|
||||
// configured TTL, and returns the encoded string for the
|
||||
// Replay-Nonce header.
|
||||
//
|
||||
// RFC 8555 §6.5: every successful ACME response carries a
|
||||
// Replay-Nonce. Phase 1a wires this via the directory + new-nonce
|
||||
// handlers; Phase 1b extends with new-account + account/<id> POST
|
||||
// responses (the JWS-authenticated paths).
|
||||
func (s *ACMEService) IssueNonce(ctx context.Context) (string, error) {
|
||||
nonce, err := acme.GenerateNonce()
|
||||
if err != nil {
|
||||
s.metrics.bump(&s.metrics.NewNonceFailureTotal)
|
||||
return "", fmt.Errorf("acme: generate nonce: %w", err)
|
||||
}
|
||||
if err := s.repo.IssueNonce(ctx, nonce, s.cfg.NonceTTL); err != nil {
|
||||
s.metrics.bump(&s.metrics.NewNonceFailureTotal)
|
||||
return "", fmt.Errorf("acme: persist nonce: %w", err)
|
||||
}
|
||||
s.metrics.bump(&s.metrics.NewNonceTotal)
|
||||
return nonce, nil
|
||||
}
|
||||
Reference in New Issue
Block a user