auth-bundle-1 Phase 6-7-8: bootstrap path + scope-down CLI + auditor-role split

# Phase 6: day-0 admin bootstrap

* internal/auth/bootstrap/ (new package): Strategy interface +
  EnvTokenStrategy with constant-time compare, one-shot consumption via
  sync.Mutex, optional admin-existence probe. Bundle 2's
  OIDC-first-admin will plug in alongside as an alternate Strategy.
* BootstrapService.ValidateAndMint: validates the operator's
  CERTCTL_BOOTSTRAP_TOKEN, mints a 32-byte (64-hex-char) random API key
  value, persists the SHA-256 hash to api_keys, grants r-admin via
  actor_roles, AddHashed's the runtime keystore so the just-minted key
  authenticates the next request without restart, and records
  bootstrap.consume to the audit trail with category=auth.
* internal/auth/keystore.go (new): KeyStore interface + StaticKeyStore
  (immutable env-var-only path) + MutableKeyStore (env-var keys +
  DB-loaded api_keys + runtime AddHashed). The auth middleware now
  consumes a KeyStore so the bootstrap path can extend the lookup table
  at runtime.
* migrations/000031_api_keys.up/down.sql: api_keys table with (id,
  name UNIQUE, key_hash UNIQUE, tenant_id, admin, created_by,
  created_at, expires_at, last_used_at). Idempotent.
* /v1/auth/bootstrap GET (probe) + POST (mint), both auth-exempt. Both
  routes documented in api/openapi.yaml + AuthExemptRouterRoutes
  allowlist updated. The token never leaves internal/auth/bootstrap;
  the minted plaintext key flows only into the HTTP response body.
* Startup warning emitted when CERTCTL_BOOTSTRAP_TOKEN is set AND admin
  actors already exist (config drift signal).
* Tests: 4 strategy invariants (empty token born disabled, wrong
  token=ErrInvalidToken without consumption, one-shot consumption,
  admin-exists closes path), 5 service tests (happy path + actor-name
  validation + propagation of strategy errors + nil-deps guard +
  32-byte entropy budget), 8 HTTP-handler tests (status 201/410/401/400
  mapping + token-leak hygiene scan of slog + audit details + Location
  header). Token-leak test redirects slog.Default to a buffer for the
  test scope.

# Phase 7: API-key migration + scope-down CLI

* GET /v1/auth/keys handler + service method ListKeys backed by
  ActorRoleRepository.ListDistinctActors. Returns one row per
  (actor_id, actor_type) pair with the slice of role IDs they hold.
  Permission: auth.role.list.
* internal/cli/auth_scope_down.go: AuthListKeys, AuthScopeDown
  (interactive), AuthScopeDownNonInteractive (JSON config),
  AuthScopeDownSuggest (--suggest with optional --apply). The synthetic
  actor-demo-anon is filtered out of every interactive / bulk path;
  non-interactive flow logs and skips it explicitly.
* SuggestRoleFromAuditEvents (pure function): walks 30 days of audit
  events per actor and returns the narrowest matching role (admin / mcp
  / viewer / agent / operator) plus a one-line reason. Classification:
  any admin-shaped action wins; otherwise all-MCP → mcp; all-read-only
  → viewer; all-agent-shaped → agent; otherwise operator. Test table
  pins all six classifications.
* CLI subcommand tree extended: 'auth keys list' + 'auth keys
  scope-down [--non-interactive <cfg>] [--suggest [--apply]]'.
* CHANGELOG.md leads v2.1.0 with the SECURITY: AUDIT YOUR API KEYS
  call-out + four flow examples.

# Phase 8: auditor role + event_category column

* migrations/000032_audit_category.up/down.sql: ALTER TABLE
  audit_events ADD COLUMN event_category TEXT NOT NULL DEFAULT
  'cert_lifecycle' + CHECK constraint (cert_lifecycle/auth/config) +
  (event_category) and (event_category, timestamp DESC) indexes for the
  auditor-filter query path. WORM trigger from migration 000018
  continues to enforce append-only at the DB layer (DDL is not
  blocked).
* domain.AuditEvent gains EventCategory string (omitempty);
  domain.EventCategoryCertLifecycle / Auth / Config constants.
* AuditService.RecordEventWithCategory sibling of RecordEvent; legacy
  callers stay on RecordEvent (defaults to cert_lifecycle). Auth
  callers (RoleService, ActorRoleService, BootstrapService) switched to
  RecordEventWithCategory(..., 'auth', ...).
* GET /v1/audit?category=<cat>: handler accepts the optional query
  param, validates against the enum (400 on invalid value), dispatches
  through ListAuditEventsByCategory. OpenAPI updated with the new query
  param + AuditEvent.event_category schema.
* Postgres AuditRepository.Create now writes event_category;
  AuditRepository.List filters on it; AuditFilter.EventCategory gates
  the WHERE clause.
* Tests: 5 audit-category-filter HTTP tests (dispatch routing,
  back-compat fallback, 400 for invalid values, all 3 enum values
  accepted, page+category combine, JSON output surfaces the field). 3
  auditor-role invariants (auditor holds exactly
  audit.read+audit.export, no mutating perms, disjoint from viewer
  except audit.read).

# Cross-phase wiring

* HandlerRegistry.Bootstrap field added; cmd/server/main.go wires the
  bootstrap service ahead of RegisterHandlers (extracted
  assembleNamedAPIKeys helper into auth_backfill.go, moved the keystore
  + bootstrap construction up alongside the auth repos).
* AuthCheckResolver / AuthActorRoleService extended with ListKeys to
  satisfy the Phase 7 surface; existing fakes updated.
* fakeAudit + mockAuditService stubs in tests gain
  RecordEventWithCategory + ListAuditEventsByCategory; existing tests
  untouched.

# Verifications

* gofmt -l: clean across every modified file.
* go vet ./...: clean.
* staticcheck across internal/auth + handler + router + cli + service
  + repository + cmd + domain: clean.
* go test -short -count=1: green across every Bundle-1-touched package:
  internal/auth (incl. bootstrap), internal/api/handler,
  internal/api/router, internal/cli, internal/service/auth,
  internal/service, internal/domain/auth,
  internal/repository/postgres, cmd/server, cmd/cli, plus
  internal/scheduler, internal/api/middleware, cmd/agent, internal/mcp.
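
The Phase 7 classification rule (any admin-shaped action wins; otherwise all-MCP, all-read-only, all-agent-shaped, else operator) can be sketched as a pure function. This is an illustrative sketch, not the shipped SuggestRoleFromAuditEvents: the `actionKind` type, its constants, and `suggestRole` are hypothetical names, how real audit actions are bucketed into kinds is not shown in this commit, and returning no suggestion for an actor with no events is an assumption.

```go
package main

import "fmt"

// actionKind is a hypothetical bucket for one audit action. How the real
// code derives the bucket from an audit event is not shown in the commit.
type actionKind int

const (
	kindAdmin actionKind = iota
	kindMCP
	kindReadOnly
	kindAgent
	kindOther
)

// suggestRole applies the precedence described in the commit message:
// admin wins outright, then the three "all of one kind" cases, then the
// operator fallback. It returns the role plus a one-line reason.
func suggestRole(kinds []actionKind) (role, reason string) {
	if len(kinds) == 0 {
		// Assumption: with no activity there is nothing to suggest.
		return "", "no audit events in window"
	}
	counts := map[actionKind]int{}
	for _, k := range kinds {
		counts[k]++
	}
	n := len(kinds)
	switch {
	case counts[kindAdmin] > 0:
		return "admin", "at least one admin-shaped action"
	case counts[kindMCP] == n:
		return "mcp", "all actions are MCP-shaped"
	case counts[kindReadOnly] == n:
		return "viewer", "all actions are read-only"
	case counts[kindAgent] == n:
		return "agent", "all actions are agent-shaped"
	default:
		return "operator", "mixed non-admin activity"
	}
}

func main() {
	role, reason := suggestRole([]actionKind{kindReadOnly, kindAgent})
	fmt.Printf("%s (%s)\n", role, reason) // operator (mixed non-admin activity)
}
```

Keeping the function pure (a slice in, a role and reason out) is what makes a table-driven test over every classification straightforward.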
2026-06-07 14:51:30 +00:00 · 2026-05-09 20:15:43 +00:00
parent 60a589ab96
commit 3ef45e2ad4
38 changed files with 3159 additions and 140 deletions
@@ -0,0 +1,194 @@
+// Package bootstrap ships the day-0 admin-creation primitive for Bundle 1
+// Phase 6. The control plane comes up with no admin-roled actors; the
+// operator hands the env-var token to a single curl call; the server
+// mints the first admin API key, returns the key value once, then locks
+// the bootstrap door behind it.
+//
+// The Strategy interface is the forward-compat seam: Bundle 2 plugs in an
+// OIDC-first-admin strategy (the operator logs in via OIDC, the server
+// recognizes their group claim, the first such login auto-grants r-admin)
+// alongside the env-var-token strategy this file ships. Both implementations
+// satisfy the same interface; the boot path picks one based on which
+// CERTCTL_BOOTSTRAP_* env var is set.
+package bootstrap
+
+import (
+	"context"
+	"crypto/subtle"
+	"errors"
+	"sync"
+)
+
+// Sentinel errors the HTTP handler maps to status codes.
+var (
+	// ErrDisabled is returned when the bootstrap path is not callable
+	// because (a) no token was set, (b) admin actors already exist, or
+	// (c) the token was already consumed by an earlier call. Maps to
+	// HTTP 410 Gone.
+	ErrDisabled = errors.New("bootstrap: endpoint disabled")
+
+	// ErrInvalidToken is returned when the supplied token does not
+	// match the env-var token (constant-time compared), or when either
+	// side is zero-length. Maps to HTTP 401 Unauthorized. A strategy
+	// built by NewEnvTokenStrategy with no token configured reports
+	// ErrDisabled (410) instead, before any token is compared.
+	ErrInvalidToken = errors.New("bootstrap: invalid token")
+
+	// ErrInvalidActorName is returned when the requested admin-key
+	// name is empty or contains characters that would break audit
+	// attribution. Maps to HTTP 400.
+	ErrInvalidActorName = errors.New("bootstrap: invalid actor name")
+)
+
+// Strategy is the bundle 1 -> bundle 2 forward-compat seam. Each
+// strategy gates the day-0 admin path with a different credential type:
+// Bundle 1 ships EnvTokenStrategy (CERTCTL_BOOTSTRAP_TOKEN); Bundle 2
+// adds OIDCFirstAdminStrategy (CERTCTL_BOOTSTRAP_OIDC_GROUP). The
+// service holds whichever strategy was wired at boot.
+type Strategy interface {
+	// Available reports whether the strategy is currently callable.
+	// Returns false once the strategy is consumed (one-shot semantics)
+	// OR once the strategy detects an existing admin (via the
+	// AdminExistenceProbe). The HTTP handler maps !Available to 410
+	// Gone before doing any token validation, so probing for "is there
+	// a bootstrap path open" is safe.
+	Available(ctx context.Context) (bool, error)
+
+	// Validate consumes the credential and returns nil when the caller
+	// is permitted to mint the first admin. The strategy MUST
+	// atomically flip its consumed state on the first successful
+	// Validate so a concurrent racing call gets ErrDisabled. Returning
+	// a non-nil error MUST NOT mark the strategy consumed; the operator
+	// can retry with the correct credential.
+	Validate(ctx context.Context, token string) error
+}
+
+// AdminExistenceProbe is the callback the EnvTokenStrategy uses to ask
+// the actor-role repository whether any actor holds r-admin. Lives at
+// this package boundary so the strategy doesn't import internal/repository
+// (would create a cycle: bootstrap -> repository -> postgres -> bootstrap
+// when the postgres adapter is wired).
+type AdminExistenceProbe func(ctx context.Context) (bool, error)
+
+// EnvTokenStrategy is the env-var-token Bundle 1 implementation. The
+// operator sets CERTCTL_BOOTSTRAP_TOKEN, the server boots with this
+// strategy, the first valid Validate call atomically flips the
+// `consumed` flag and the next call returns ErrDisabled.
+//
+// The token comparison is crypto/subtle.ConstantTimeCompare so timing
+// attacks can't leak the token byte-by-byte. The token itself never
+// leaves this package: the strategy holds it in memory, the handler
+// receives only error sentinels, the audit row records the event but
+// not the token value.
+type EnvTokenStrategy struct {
+	token       string              // set once at construction; never mutated
+	probe       AdminExistenceProbe // optional; nil = skip the existence probe
+	mu          sync.Mutex          // guards consumed
+	consumed    bool                // flipped to true after first successful Validate
+	tokenLength int                 // cached for early-reject fast path
+}
+
+// NewEnvTokenStrategy constructs the env-var-token strategy. token must
+// be the raw value of CERTCTL_BOOTSTRAP_TOKEN. probe is optional; when
+// non-nil it gates Available + Validate on "no admin exists yet" so the
+// caller can't bootstrap a second admin after the fleet has stabilized.
+//
+// When token is empty the returned strategy is born consumed —
+// Available returns false, Validate returns ErrDisabled. This matches
+// the boot-path contract that an unset env var disables the endpoint.
+func NewEnvTokenStrategy(token string, probe AdminExistenceProbe) *EnvTokenStrategy {
+	s := &EnvTokenStrategy{
+		token:       token,
+		probe:       probe,
+		tokenLength: len(token),
+	}
+	if token == "" {
+		s.consumed = true
+	}
+	return s
+}
+
+// Available implements Strategy.
+func (s *EnvTokenStrategy) Available(ctx context.Context) (bool, error) {
+	s.mu.Lock()
+	consumed := s.consumed
+	s.mu.Unlock()
+	if consumed {
+		return false, nil
+	}
+	if s.probe != nil {
+		exists, err := s.probe(ctx)
+		if err != nil {
+			return false, err
+		}
+		if exists {
+			return false, nil
+		}
+	}
+	return true, nil
+}
+
+// Validate implements Strategy.
+func (s *EnvTokenStrategy) Validate(ctx context.Context, token string) error {
+	// Fast-path: if the strategy is disabled, return Disabled before
+	// doing any constant-time compare. The state flip below acquires
+	// the same mutex so this read is safe.
+	s.mu.Lock()
+	if s.consumed {
+		s.mu.Unlock()
+		return ErrDisabled
+	}
+	// Refuse zero-length tokens up front. ConstantTimeCompare returns
+	// 1 when both inputs are empty, which would otherwise produce a
+	// permanent backdoor on misconfigured deployments where token=""
+	// at construction; NewEnvTokenStrategy already covers that, but
+	// belt-and-braces here in case a future caller passes the strategy
+	// raw.
+	if s.tokenLength == 0 || len(token) == 0 {
+		s.mu.Unlock()
+		return ErrInvalidToken
+	}
+	// Constant-time compare. ConstantTimeCompare returns 0 immediately
+	// when lengths differ, so timing can leak the token length but
+	// not its contents.
+	if subtle.ConstantTimeCompare([]byte(s.token), []byte(token)) != 1 {
+		s.mu.Unlock()
+		return ErrInvalidToken
+	}
+	// External probe: respect the "admin already exists" gate even
+	// after a valid token was supplied. This closes the race where a
+	// fleet first-admin lands during the gap between Available and
+	// Validate.
+	if s.probe != nil {
+		// Drop the lock for the probe — repo calls may be slow and
+		// holding the mutex through I/O would serialize every
+		// concurrent bootstrap attempt. Re-acquire after.
+		s.mu.Unlock()
+		exists, err := s.probe(ctx)
+		if err != nil {
+			return err
+		}
+		if exists {
+			return ErrDisabled
+		}
+		s.mu.Lock()
+		// Re-check consumed because a concurrent caller might have
+		// flipped it while we were probing.
+		if s.consumed {
+			s.mu.Unlock()
+			return ErrDisabled
+		}
+	}
+	s.consumed = true
+	s.mu.Unlock()
+	return nil
+}
+
+// IsConsumed reports whether the strategy has already been used. Test
+// helper; production callers should use Available which also runs the
+// admin-existence probe.
+func (s *EnvTokenStrategy) IsConsumed() bool {
+	s.mu.Lock()
+	defer s.mu.Unlock()
+	return s.consumed
+}
@@ -0,0 +1,125 @@
+package bootstrap
+
+import (
+	"context"
+	"errors"
+	"testing"
+)
+
+// TestEnvTokenStrategy_EmptyTokenIsBornDisabled pins the load-bearing
+// invariant that an unset CERTCTL_BOOTSTRAP_TOKEN closes the bootstrap
+// path at construction time. The handler depends on this — without it,
+// a misconfigured deploy that forgot to set the env var would expose
+// the endpoint with a token of "" that an attacker could trivially
+// match by also sending "".
+func TestEnvTokenStrategy_EmptyTokenIsBornDisabled(t *testing.T) {
+	s := NewEnvTokenStrategy("", nil)
+	avail, err := s.Available(context.Background())
+	if err != nil {
+		t.Fatalf("Available err = %v, want nil", err)
+	}
+	if avail {
+		t.Errorf("Available = true for empty token, want false")
+	}
+	if got := s.Validate(context.Background(), ""); !errors.Is(got, ErrDisabled) {
+		t.Errorf("Validate('') for empty-token strategy = %v, want ErrDisabled", got)
+	}
+	if got := s.Validate(context.Background(), "anything"); !errors.Is(got, ErrDisabled) {
+		t.Errorf("Validate('anything') for empty-token strategy = %v, want ErrDisabled", got)
+	}
+}
+
+// TestEnvTokenStrategy_WrongTokenReturnsInvalidToken pins that the
+// strategy maps a token mismatch to ErrInvalidToken (HTTP 401), not
+// ErrDisabled (410). Misclassifying a mismatch as ErrDisabled would
+// tell the operator the path is closed when a retry could succeed.
+func TestEnvTokenStrategy_WrongTokenReturnsInvalidToken(t *testing.T) {
+	s := NewEnvTokenStrategy("correct-token", nil)
+	if got := s.Validate(context.Background(), "wrong-token"); !errors.Is(got, ErrInvalidToken) {
+		t.Errorf("Validate(wrong) = %v, want ErrInvalidToken", got)
+	}
+	if got := s.Validate(context.Background(), ""); !errors.Is(got, ErrInvalidToken) {
+		t.Errorf("Validate('') = %v, want ErrInvalidToken", got)
+	}
+	if s.IsConsumed() {
+		t.Errorf("strategy consumed after failed Validate; must remain available for retry")
+	}
+}
+
+// TestEnvTokenStrategy_OneShotConsumption pins the invariant that the
+// first valid Validate call locks the strategy. The bootstrap path is
+// strictly one-shot; the second call MUST return ErrDisabled (HTTP
+// 410), not ErrInvalidToken (which would suggest "wrong token, try
+// again").
+func TestEnvTokenStrategy_OneShotConsumption(t *testing.T) {
+	s := NewEnvTokenStrategy("correct-token", nil)
+	if err := s.Validate(context.Background(), "correct-token"); err != nil {
+		t.Fatalf("first Validate = %v, want nil", err)
+	}
+	if !s.IsConsumed() {
+		t.Errorf("IsConsumed = false after successful Validate, want true")
+	}
+	if got := s.Validate(context.Background(), "correct-token"); !errors.Is(got, ErrDisabled) {
+		t.Errorf("second Validate = %v, want ErrDisabled", got)
+	}
+	avail, err := s.Available(context.Background())
+	if err != nil {
+		t.Fatalf("Available err = %v", err)
+	}
+	if avail {
+		t.Errorf("Available = true after consumption, want false")
+	}
+}
+
+// TestEnvTokenStrategy_AdminExistsClosesPath pins the invariant that
+// the admin-existence probe gates Available + Validate. The strategy
+// must NOT mint a second admin even if the operator forgot to unset
+// CERTCTL_BOOTSTRAP_TOKEN after onboarding.
+func TestEnvTokenStrategy_AdminExistsClosesPath(t *testing.T) {
+	probe := func(_ context.Context) (bool, error) { return true, nil }
+	s := NewEnvTokenStrategy("correct-token", probe)
+	avail, err := s.Available(context.Background())
+	if err != nil {
+		t.Fatalf("Available err = %v", err)
+	}
+	if avail {
+		t.Errorf("Available = true with admin exists probe, want false")
+	}
+	if got := s.Validate(context.Background(), "correct-token"); !errors.Is(got, ErrDisabled) {
+		t.Errorf("Validate = %v with admin exists, want ErrDisabled", got)
+	}
+	if s.IsConsumed() {
+		t.Errorf("strategy must NOT be consumed when admin-existence probe rejects; allows retry after operator removes the duplicate admin")
+	}
+}
+
+// TestEnvTokenStrategy_AdminProbeError surfaces the error to the
+// caller without consuming the strategy. The HTTP handler maps this
+// to 500; the operator can retry once the underlying issue is fixed.
+func TestEnvTokenStrategy_AdminProbeError(t *testing.T) {
+	probeErr := errors.New("boom")
+	probe := func(_ context.Context) (bool, error) { return false, probeErr }
+	s := NewEnvTokenStrategy("correct-token", probe)
+	if _, err := s.Available(context.Background()); !errors.Is(err, probeErr) {
+		t.Errorf("Available err = %v, want probeErr", err)
+	}
+	if got := s.Validate(context.Background(), "correct-token"); !errors.Is(got, probeErr) {
+		t.Errorf("Validate err = %v, want probeErr", got)
+	}
+	if s.IsConsumed() {
+		t.Errorf("strategy must NOT be consumed on probe error")
+	}
+}
+
+// TestEnvTokenStrategy_ZeroLengthRejectedEvenWithMatchingToken is a
+// belt-and-braces guard against the ConstantTimeCompare("","")=1
+// footgun. A strategy built via NewEnvTokenStrategy with token="" is
+// born disabled (ErrDisabled); if a future caller bypasses the
+// constructor, Validate still rejects zero-length tokens up front.
+func TestEnvTokenStrategy_ZeroLengthRejectedEvenWithMatchingToken(t *testing.T) {
+	// Directly construct a strategy with token=""
+	s := &EnvTokenStrategy{token: "", tokenLength: 0, consumed: false}
+	if got := s.Validate(context.Background(), ""); !errors.Is(got, ErrInvalidToken) {
+		t.Errorf("Validate('','') = %v, want ErrInvalidToken (zero-length guard)", got)
+	}
+}
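
The footgun the last test guards against is easy to reproduce in isolation. The sketch below shows `subtle.ConstantTimeCompare` reporting a match for two empty inputs and a mismatch whenever lengths differ, then wraps it in a zero-length guard like the one in Validate. `rawCompare` and `guardedMatch` are hypothetical helper names introduced only for this illustration.

```go
package main

import (
	"crypto/subtle"
	"fmt"
)

// rawCompare exposes ConstantTimeCompare's verdict for two strings.
func rawCompare(a, b string) int {
	return subtle.ConstantTimeCompare([]byte(a), []byte(b))
}

// guardedMatch is a hypothetical helper that refuses zero-length input on
// either side before comparing, mirroring the guard in Validate.
func guardedMatch(configured, supplied string) bool {
	if len(configured) == 0 || len(supplied) == 0 {
		return false
	}
	return rawCompare(configured, supplied) == 1
}

func main() {
	fmt.Println(rawCompare("", ""))         // 1: two empty inputs compare equal
	fmt.Println(rawCompare("abc", "ab"))    // 0: lengths differ
	fmt.Println(guardedMatch("", ""))       // false: guard rejects empty tokens
	fmt.Println(guardedMatch("abc", "abc")) // true
}
```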
@@ -0,0 +1,204 @@
+package bootstrap
+
+import (
+	"context"
+	"crypto/rand"
+	"encoding/hex"
+	"fmt"
+	"regexp"
+	"time"
+
+	"github.com/certctl-io/certctl/internal/domain"
+	authdomain "github.com/certctl-io/certctl/internal/domain/auth"
+)
+
+// actorNameRe matches the operator-supplied admin-key name. Constraints:
+// 3-64 chars, lowercase alphanumeric + hyphen + underscore, starting
+// with an alphanumeric. Strict charset prevents audit-attribution
+// shenanigans (control characters, log-injection sequences, mixed-case
+// look-alikes for an existing admin actor's name).
+var actorNameRe = regexp.MustCompile(`^[a-z0-9][a-z0-9_-]{2,63}$`)
+
+// APIKeyMinter is the slice of APIKeyRepository the bootstrap service
+// needs. Pulled out as a small interface so the service can be unit-
+// tested with an in-memory fake.
+type APIKeyMinter interface {
+	Create(ctx context.Context, key *authdomain.APIKey) error
+	GetByName(ctx context.Context, name string) (*authdomain.APIKey, error)
+}
+
+// RoleGranter is the slice of ActorRoleRepository the bootstrap
+// service needs.
+type RoleGranter interface {
+	Grant(ctx context.Context, ar *authdomain.ActorRole) error
+}
+
+// AuditRecorder is the slice of AuditService the bootstrap service
+// needs. Phase 8 ships RecordEventWithCategory which classifies the
+// row's event_category column directly; the bootstrap path always
+// emits with category=auth.
+type AuditRecorder interface {
+	RecordEventWithCategory(ctx context.Context, actor string, actorType domain.ActorType, action, eventCategory, resourceType, resourceID string, details map[string]interface{}) error
+}
+
+// KeyStoreAdder is the runtime hook the bootstrap service uses to
+// register the just-minted key with the auth middleware so the next
+// request authenticates without a process restart. The HTTP-layer
+// auth middleware exposes this via internal/auth.MutableKeyStore.
+type KeyStoreAdder interface {
+	AddHashed(name, hashHex string, admin bool)
+}
+
+// Service ties the bootstrap Strategy to the persistence layer. Kept
+// separate from the HTTP handler so unit tests can drive it without
+// httptest, and so the same service can back a future
+// `certctl auth bootstrap` CLI command.
+type Service struct {
+	strategy   Strategy
+	keys       APIKeyMinter
+	roles      RoleGranter
+	audit      AuditRecorder
+	keyStore   KeyStoreAdder
+	hashAPIKey func(string) string // injected so this package does not import the auth package for HashAPIKey
+}
+
+// NewService constructs a bootstrap Service.
+//
+// hashAPIKey takes the plaintext key and returns the SHA-256 hex used
+// by the auth middleware's keystore lookup. Pass internal/auth.HashAPIKey
+// at the production wire site; tests can pass a deterministic hash for
+// matching against MutableKeyStore lookups.
+//
+// keyStore is optional. Production wires the same MutableKeyStore the
+// auth middleware reads from so the minted key authenticates the next
+// request; when nil the bootstrap still persists the key to the DB
+// but the operator must restart to pick it up via the boot loader.
+func NewService(strategy Strategy, keys APIKeyMinter, roles RoleGranter, audit AuditRecorder, keyStore KeyStoreAdder, hashAPIKey func(string) string) *Service {
+	return &Service{
+		strategy:   strategy,
+		keys:       keys,
+		roles:      roles,
+		audit:      audit,
+		keyStore:   keyStore,
+		hashAPIKey: hashAPIKey,
+	}
+}
+
+// MintResult is the success payload returned to the HTTP handler.
+// KeyValue is the plaintext value the operator must capture before the
+// response is dropped; the server holds it only briefly and never logs it.
+type MintResult struct {
+	APIKey   *authdomain.APIKey
+	KeyValue string
+}
+
+// Available reports whether the bootstrap endpoint is currently
+// callable. It returns the strategy's verdict, or (false, ErrDisabled)
+// when the service or its strategy is nil. The HTTP handler maps an
+// unavailable result to 410 Gone before reading any token from the
+// request body, so a closed endpoint never evaluates a supplied
+// token.
+func (s *Service) Available(ctx context.Context) (bool, error) {
+	if s == nil || s.strategy == nil {
+		return false, ErrDisabled
+	}
+	return s.strategy.Available(ctx)
+}
+
+// ValidateAndMint consumes the strategy's credential and persists the
+// first admin API key. The response carries the plaintext key value
+// once; the operator MUST capture it before the response goes out the
+// wire. Subsequent calls return ErrDisabled (one-shot semantics).
+//
+// Side effects:
+//  1. Strategy.Validate atomically flips its consumed state.
+//  2. A new row is written to api_keys (id, name, sha256(key), admin=true).
+//  3. A new row is written to actor_roles (actor=name, role=r-admin).
+//  4. The MutableKeyStore (if wired) gains a runtime entry so the next
+//     request authenticates without a restart.
+//  5. An audit event records the bootstrap consumption with
+//     event_category=auth, action=bootstrap.consume.
+//
+// The plaintext key is NEVER logged. It exists in three places:
+//   - the random buffer this function generates,
+//   - the MintResult.KeyValue field (the handler writes it to the
+//     response then discards),
+//   - the HTTP response body itself.
+//
+// If the persistence calls fail AFTER the strategy is consumed, the
+// service does NOT roll back the strategy state — by design. A failed
+// ValidateAndMint call leaves bootstrap closed; the operator must
+// recover via DB seeding (insert into actor_roles directly) rather
+// than retry. The alternative (retry) opens a window for a successful
+// validate-then-fail sequence to mint two admin keys on retry, which
+// silently widens the trust radius.
+func (s *Service) ValidateAndMint(ctx context.Context, token, actorName string) (*MintResult, error) {
+	if s == nil || s.strategy == nil || s.keys == nil || s.roles == nil {
+		return nil, ErrDisabled
+	}
+	if !actorNameRe.MatchString(actorName) {
+		return nil, ErrInvalidActorName
+	}
+	if err := s.strategy.Validate(ctx, token); err != nil {
+		return nil, err
+	}
+	// Strategy is now consumed; if anything below fails the operator
+	// has to recover via DB. See the docstring on ValidateAndMint.
+	keyValue, err := generateAPIKey()
+	if err != nil {
+		return nil, fmt.Errorf("bootstrap: random key generation: %w", err)
+	}
+	keyHash := s.hashAPIKey(keyValue)
+	now := time.Now().UTC()
+	apiKey := &authdomain.APIKey{
+		Name:      actorName,
+		KeyHash:   keyHash,
+		TenantID:  authdomain.DefaultTenantID,
+		Admin:     true,
+		CreatedBy: "bootstrap",
+		CreatedAt: now,
+	}
+	if err := s.keys.Create(ctx, apiKey); err != nil {
+		return nil, fmt.Errorf("bootstrap: persist key: %w", err)
+	}
+	if err := s.roles.Grant(ctx, &authdomain.ActorRole{
+		ActorID:   actorName,
+		ActorType: authdomain.ActorTypeValue(domain.ActorTypeAPIKey),
+		RoleID:    authdomain.RoleIDAdmin,
+		TenantID:  authdomain.DefaultTenantID,
+		GrantedBy: "bootstrap",
+	}); err != nil {
+		return nil, fmt.Errorf("bootstrap: grant admin role: %w", err)
+	}
+	if s.keyStore != nil {
+		s.keyStore.AddHashed(actorName, keyHash, true)
+	}
+	if s.audit != nil {
+		// Phase 8 promotes event_category to a first-class column.
+		// Bootstrap is unambiguously an auth event. Errors from the
+		// audit write are intentionally ignored: the bootstrap mint
+		// succeeded and the consequent audit-row miss is preferable
+		// to surfacing a 500 to the operator after the admin-key
+		// already landed in the DB. The audit-row gap is detectable
+		// in monitoring (every successful mint should have a paired
+		// bootstrap.consume row).
+		_ = s.audit.RecordEventWithCategory(ctx, "bootstrap-token", domain.ActorTypeSystem,
+			"bootstrap.consume", domain.EventCategoryAuth, "api_key", apiKey.ID,
+			map[string]interface{}{
+				"actor_name": actorName,
+				"role_id":    authdomain.RoleIDAdmin,
+			})
+	}
+	return &MintResult{APIKey: apiKey, KeyValue: keyValue}, nil
+}
+
+// generateAPIKey returns 32 random bytes hex-encoded (64-char output).
+// Same entropy budget as `openssl rand -hex 32` which the agent
+// bootstrap docs recommend.
+func generateAPIKey() (string, error) {
+	buf := make([]byte, 32)
+	if _, err := rand.Read(buf); err != nil {
+		return "", err
+	}
+	return hex.EncodeToString(buf), nil
+}
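
The mint-then-authenticate round trip described in ValidateAndMint can be sketched end to end: generate 32 random bytes, hex-encode them, store only the SHA-256 hex of that string, and look the key up by hash on the next request. This is a sketch under assumptions: `mintKey`, `hashKey`, and `authenticate` are hypothetical names, and the map stands in for the MutableKeyStore, whose real lookup API is not shown in this excerpt.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hashKey returns the SHA-256 hex digest of the plaintext key string,
// matching the shape the service test's sha helper uses.
func hashKey(plaintext string) string {
	sum := sha256.Sum256([]byte(plaintext))
	return hex.EncodeToString(sum[:])
}

// mintKey returns a 64-hex-char plaintext key and its hash. Only the
// hash would be persisted; the plaintext goes to the caller once.
func mintKey() (string, string, error) {
	buf := make([]byte, 32)
	if _, err := rand.Read(buf); err != nil {
		return "", "", err
	}
	plaintext := hex.EncodeToString(buf)
	return plaintext, hashKey(plaintext), nil
}

// authenticate looks up the presented key by its hash. The map (hash to
// actor name) is a stand-in for the runtime keystore.
func authenticate(store map[string]string, presented string) (string, bool) {
	name, ok := store[hashKey(presented)]
	return name, ok
}

func main() {
	plaintext, hash, err := mintKey()
	if err != nil {
		panic(err)
	}
	store := map[string]string{hash: "first-admin"}
	name, ok := authenticate(store, plaintext)
	fmt.Println(len(plaintext), len(hash), name, ok) // 64 64 first-admin true
}
```

Because the store is keyed by hash, a database or memory dump exposes no usable credential, which is why the plaintext must be captured from the single response that carries it.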
@@ -0,0 +1,215 @@
+package bootstrap
+
+import (
+	"context"
+	"crypto/sha256"
+	"encoding/hex"
+	"errors"
+	"strings"
+	"testing"
+
+	"github.com/certctl-io/certctl/internal/domain"
+	authdomain "github.com/certctl-io/certctl/internal/domain/auth"
+)
+
+type fakeMinter struct {
+	created   []*authdomain.APIKey
+	createErr error
+}
+
+func (f *fakeMinter) Create(_ context.Context, k *authdomain.APIKey) error {
+	if f.createErr != nil {
+		return f.createErr
+	}
+	f.created = append(f.created, k)
+	return nil
+}
+func (f *fakeMinter) GetByName(_ context.Context, _ string) (*authdomain.APIKey, error) {
+	return nil, errors.New("not implemented for these tests")
+}
+
+type fakeGranter struct {
+	grants []*authdomain.ActorRole
+	err    error
+}
+
+func (f *fakeGranter) Grant(_ context.Context, ar *authdomain.ActorRole) error {
+	f.grants = append(f.grants, ar)
+	return f.err
+}
+
+type fakeAudit struct {
+	calls    []map[string]interface{}
+	category string
+}
+
+func (f *fakeAudit) RecordEventWithCategory(_ context.Context, _ string, _ domain.ActorType, _ string, eventCategory, _ string, _ string, details map[string]interface{}) error {
+	f.calls = append(f.calls, details)
+	f.category = eventCategory
+	return nil
+}
+
+type fakeKeyStore struct {
+	added []addedEntry
+}
+
+type addedEntry struct {
+	name  string
+	hash  string
+	admin bool
+}
+
+func (f *fakeKeyStore) AddHashed(name, hash string, admin bool) {
+	f.added = append(f.added, addedEntry{name: name, hash: hash, admin: admin})
+}
+
+func sha(s string) string {
+	h := sha256.Sum256([]byte(s))
+	return hex.EncodeToString(h[:])
+}
+
+// TestService_ValidateAndMint_HappyPath pins the load-bearing flow:
+// valid token → strategy consumed → api_keys row created → admin role
+// granted → keystore updated → audit row recorded → result carries the
+// plaintext key + the persisted APIKey row.
+func TestService_ValidateAndMint_HappyPath(t *testing.T) {
+	strategy := NewEnvTokenStrategy("the-token", nil)
+	minter := &fakeMinter{}
+	granter := &fakeGranter{}
+	audit := &fakeAudit{}
+	store := &fakeKeyStore{}
+	svc := NewService(strategy, minter, granter, audit, store, sha)
+
+	result, err := svc.ValidateAndMint(context.Background(), "the-token", "first-admin")
+	if err != nil {
+		t.Fatalf("ValidateAndMint err = %v", err)
+	}
+	if result == nil || result.KeyValue == "" {
+		t.Fatalf("result.KeyValue empty")
+	}
+	if len(result.KeyValue) < 32 {
+		t.Errorf("KeyValue length = %d, want >= 32 (entropy budget)", len(result.KeyValue))
+	}
+	if !strategy.IsConsumed() {
+		t.Errorf("strategy not consumed after successful mint")
+	}
+	if len(minter.created) != 1 {
+		t.Fatalf("minter.Create call count = %d, want 1", len(minter.created))
+	}
+	apiKey := minter.created[0]
+	if apiKey.Name != "first-admin" || !apiKey.Admin || apiKey.CreatedBy != "bootstrap" {
+		t.Errorf("api_key wrong fields: %+v", apiKey)
+	}
+	if apiKey.KeyHash != sha(result.KeyValue) {
+		t.Errorf("KeyHash != sha(KeyValue); persistence shape is wrong")
+	}
+	if len(granter.grants) != 1 {
+		t.Fatalf("granter.Grant call count = %d, want 1", len(granter.grants))
+	}
+	if granter.grants[0].RoleID != authdomain.RoleIDAdmin {
+		t.Errorf("granted role = %q, want %q", granter.grants[0].RoleID, authdomain.RoleIDAdmin)
+	}
+	if granter.grants[0].ActorID != "first-admin" {
+		t.Errorf("granted actor = %q, want first-admin", granter.grants[0].ActorID)
+	}
+	if granter.grants[0].GrantedBy != "bootstrap" {
+		t.Errorf("GrantedBy = %q, want bootstrap", granter.grants[0].GrantedBy)
+	}
+	if len(store.added) != 1 || store.added[0].name != "first-admin" || !store.added[0].admin {
+		t.Errorf("keystore.AddHashed not called with first-admin/admin=true: %+v", store.added)
+	}
+	if store.added[0].hash != apiKey.KeyHash {
+		t.Errorf("keystore hash != api_key hash; runtime auth would fail")
+	}
+	if len(audit.calls) != 1 {
+		t.Fatalf("audit RecordEventWithCategory calls = %d, want 1", len(audit.calls))
+	}
+	if audit.calls[0]["actor_name"] != "first-admin" {
+		t.Errorf("audit details lost actor_name: %+v", audit.calls[0])
+	}
+	if audit.category != "auth" {
+		t.Errorf("audit category = %q, want auth", audit.category)
+	}
+}
+
+// TestService_ValidateAndMint_RejectsInvalidActorName pins the
+// ErrInvalidActorName mapping (HTTP 400). Strict charset prevents
+// log-injection / lookalike actor names.
+func TestService_ValidateAndMint_RejectsInvalidActorName(t *testing.T) {
+	svc := NewService(NewEnvTokenStrategy("t", nil), &fakeMinter{}, &fakeGranter{}, nil, nil, sha)
+	cases := []string{
+		"",                      // empty
+		"ab",                    // too short
+		"Has-Caps",              // uppercase rejected
+		"contains spaces",       // space rejected
+		strings.Repeat("a", 65), // 65 chars > 64 max
+		"newline\nsuffix",       // log injection
+		"💀-evil",                // non-ASCII
+	}
+	for _, name := range cases {
+		_, err := svc.ValidateAndMint(context.Background(), "t", name)
+		if !errors.Is(err, ErrInvalidActorName) {
+			t.Errorf("name=%q err = %v, want ErrInvalidActorName", name, err)
+		}
+	}
+}
+
+// TestService_ValidateAndMint_PropagatesStrategyError pins that a
+// failed Validate (wrong token / disabled / probe error) propagates
+// without persisting anything.
+func TestService_ValidateAndMint_PropagatesStrategyError(t *testing.T) {
+	strategy := NewEnvTokenStrategy("the-token", nil)
+	minter := &fakeMinter{}
+	granter := &fakeGranter{}
+	store := &fakeKeyStore{}
+	svc := NewService(strategy, minter, granter, nil, store, sha)
+
+	_, err := svc.ValidateAndMint(context.Background(), "wrong-token", "first-admin")
+	if !errors.Is(err, ErrInvalidToken) {
+		t.Fatalf("err = %v, want ErrInvalidToken", err)
+	}
+	if len(minter.created) != 0 || len(granter.grants) != 0 || len(store.added) != 0 {
+		t.Errorf("persistence side effects fired despite Validate failure: minter=%d grants=%d keystore=%d", len(minter.created), len(granter.grants), len(store.added))
+	}
+}
+
+// TestService_ValidateAndMint_NilDepsReturnDisabled exercises the
+// no-strategy / no-repo guard. Returns ErrDisabled (handler maps to
+// 410). Belt-and-braces for partially-wired test or future call sites.
+func TestService_ValidateAndMint_NilDepsReturnDisabled(t *testing.T) {
+	cases := []struct {
+		name string
+		svc  *Service
+	}{
+		{"nil service", nil},
+		{"nil strategy", NewService(nil, &fakeMinter{}, &fakeGranter{}, nil, nil, sha)},
+		{"nil minter", NewService(NewEnvTokenStrategy("t", nil), nil, &fakeGranter{}, nil, nil, sha)},
+		{"nil granter", NewService(NewEnvTokenStrategy("t", nil), &fakeMinter{}, nil, nil, nil, sha)},
+	}
+	for _, tc := range cases {
+		_, err := tc.svc.ValidateAndMint(context.Background(), "t", "first-admin")
+		if !errors.Is(err, ErrDisabled) {
+			t.Errorf("%s: err = %v, want ErrDisabled", tc.name, err)
+		}
+	}
+}
+
+// TestService_GenerateAPIKey_HighEntropy pins the generated key shape:
+// 64 hex chars (32 random bytes). Belt-and-braces against future
+// refactors that might shrink the entropy budget.
+func TestService_GenerateAPIKey_HighEntropy(t *testing.T) {
+	seen := map[string]bool{}
+	for i := 0; i < 100; i++ {
+		k, err := generateAPIKey()
+		if err != nil {
+			t.Fatalf("iter %d: %v", i, err)
+		}
+		if len(k) != 64 {
+			t.Errorf("len = %d, want 64", len(k))
+		}
+		if seen[k] {
+			t.Errorf("key collision in 100 iters — entropy budget regressed")
+		}
+		seen[k] = true
+	}
+}
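
The tests in this excerpt exercise the one-shot contract sequentially. The contract on Strategy.Validate also requires that a concurrent racing call loses, and that property can be sketched with a stripped-down gate using the same mutex-guarded flag. `oneShot`, `tryConsume`, and `raceWinners` are hypothetical names for this illustration; the token compare and admin probe are deliberately left out.

```go
package main

import (
	"fmt"
	"sync"
)

// oneShot is a minimal stand-in for the consumed flag in
// EnvTokenStrategy: a bool guarded by a mutex.
type oneShot struct {
	mu       sync.Mutex
	consumed bool
}

// tryConsume flips the flag and reports true for exactly one caller;
// every later or losing caller gets false.
func (o *oneShot) tryConsume() bool {
	o.mu.Lock()
	defer o.mu.Unlock()
	if o.consumed {
		return false
	}
	o.consumed = true
	return true
}

// raceWinners starts n goroutines against one gate and counts how many
// of them succeed in consuming it.
func raceWinners(n int) int {
	var (
		gate    oneShot
		wg      sync.WaitGroup
		mu      sync.Mutex
		winners int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if gate.tryConsume() {
				mu.Lock()
				winners++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return winners
}

func main() {
	fmt.Println(raceWinners(64)) // 1
}
```

Running a check like this under `go test -race` would additionally flag any unguarded access to the flag.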