auth-bundle-1 Phase 6-7-8: bootstrap path + scope-down CLI + auditor-role split

# Phase 6 — day-0 admin bootstrap

* internal/auth/bootstrap/ (new package): Strategy interface +
  EnvTokenStrategy with constant-time compare, one-shot consumption
  via sync.Mutex, optional admin-existence probe. Bundle 2's OIDC-
  first-admin will plug in alongside as an alternate Strategy.
* BootstrapService.ValidateAndMint: validates the operator's
  CERTCTL_BOOTSTRAP_TOKEN, mints a 32-byte (64-hex-char) random API
  key value, persists the SHA-256 hash to api_keys, grants r-admin
  via actor_roles, AddHashed's the runtime keystore so the just-
  minted key authenticates the next request without restart, and
  records bootstrap.consume to the audit trail with category=auth.
* internal/auth/keystore.go (new): KeyStore interface +
  StaticKeyStore (immutable env-var-only path) + MutableKeyStore
  (env-var keys + DB-loaded api_keys + runtime AddHashed). The auth
  middleware now consumes a KeyStore so the bootstrap path can
  extend the lookup table at runtime.
* migrations/000031_api_keys.up/down.sql: api_keys table with
  (id, name UNIQUE, key_hash UNIQUE, tenant_id, admin, created_by,
  created_at, expires_at, last_used_at). Idempotent.
* /v1/auth/bootstrap GET (probe) + POST (mint) — auth-exempt. Both
  routes documented in api/openapi.yaml + AuthExemptRouterRoutes
  allowlist updated. The token never leaves internal/auth/bootstrap;
  the minted plaintext key flows only into the HTTP response body.
* Startup warning emitted when CERTCTL_BOOTSTRAP_TOKEN is set AND
  admin actors already exist (config drift signal).
* Tests: 4 strategy invariants (empty token born disabled, wrong
  token=ErrInvalidToken without consumption, one-shot consumption,
  admin-exists closes path), 5 service tests (happy path + actor-
  name validation + propagation of strategy errors + nil-deps
  guard + 32-byte entropy budget), 8 HTTP-handler tests (status
  201/410/401/400 mapping + token-leak hygiene scan of slog +
  audit details + Location header). Token-leak test redirects
  slog.Default to a buffer for the test scope.

# Phase 7 — API-key migration + scope-down CLI

* GET /v1/auth/keys handler + service method ListKeys backed by
  ActorRoleRepository.ListDistinctActors. Returns one row per
  (actor_id, actor_type) pair with the slice of role IDs they hold.
  Permission: auth.role.list.
* internal/cli/auth_scope_down.go: AuthListKeys, AuthScopeDown
  (interactive), AuthScopeDownNonInteractive (JSON config),
  AuthScopeDownSuggest (--suggest with optional --apply). The
  synthetic actor-demo-anon is filtered out of every interactive /
  bulk path; non-interactive flow logs and skips it explicitly.
* SuggestRoleFromAuditEvents (pure function): walks 30 days of
  audit events per actor and returns the narrowest matching role
  (admin / mcp / viewer / agent / operator) plus a one-line reason.
  Classification: any admin-shaped action wins; otherwise all-MCP
  → mcp; all-read-only → viewer; all-agent-shaped → agent;
  otherwise operator. Test table pins all six classifications.
* CLI subcommand tree extended: 'auth keys list' + 'auth keys
  scope-down [--non-interactive <cfg>] [--suggest [--apply]]'.
* CHANGELOG.md leads v2.1.0 with the SECURITY: AUDIT YOUR API KEYS
  call-out + four flow examples.

# Phase 8 — auditor role + event_category column

* migrations/000032_audit_category.up/down.sql: ALTER TABLE
  audit_events ADD COLUMN event_category TEXT NOT NULL DEFAULT
  'cert_lifecycle' + CHECK constraint (cert_lifecycle/auth/config)
  + (event_category) and (event_category, timestamp DESC) indexes
  for the auditor-filter query path. WORM trigger from migration
  000018 continues to enforce append-only at the DB layer (DDL is
  not blocked).
* domain.AuditEvent gains EventCategory string (omitempty);
  domain.EventCategoryCertLifecycle / Auth / Config constants.
* AuditService.RecordEventWithCategory sibling of RecordEvent;
  legacy callers stay on RecordEvent (defaults to cert_lifecycle).
  Auth callers (RoleService, ActorRoleService, BootstrapService)
  switched to RecordEventWithCategory(..., 'auth', ...).
* GET /v1/audit?category=<cat>: handler accepts the optional query
  param, validates against the enum (400 on invalid value),
  dispatches through ListAuditEventsByCategory. OpenAPI updated
  with the new query param + AuditEvent.event_category schema.
* Postgres AuditRepository.Create now writes event_category;
  AuditRepository.List filters on it; AuditFilter.EventCategory
  gates the WHERE clause.
* Tests: 5 audit-category-filter HTTP tests (dispatch routing,
  back-compat fallback, 400 for invalid values, all 3 enum values
  accepted, page+category combine, JSON output surfaces the
  field). 3 auditor-role invariants (auditor holds exactly
  audit.read+audit.export, no mutating perms, disjoint from
  viewer except audit.read).

# Cross-phase wiring

* HandlerRegistry.Bootstrap field added; cmd/server/main.go wires
  the bootstrap service ahead of RegisterHandlers (extracted
  assembleNamedAPIKeys helper into auth_backfill.go, moved the
  keystore + bootstrap construction up alongside the auth repos).
* AuthCheckResolver / AuthActorRoleService extended with ListKeys
  to satisfy the Phase 7 surface; existing fakes updated.
* fakeAudit + mockAuditService stubs in tests gain
  RecordEventWithCategory + ListAuditEventsByCategory; existing
  tests untouched.

# Verifications

* gofmt -l: clean across every modified file.
* go vet ./...: clean.
* staticcheck across internal/auth + handler + router + cli +
  service + repository + cmd + domain: clean.
* go test -short -count=1: green across every Bundle-1-touched
  package — internal/auth (incl. bootstrap), internal/api/handler,
  internal/api/router, internal/cli, internal/service/auth,
  internal/service, internal/domain/auth, internal/repository/postgres,
  cmd/server, cmd/cli, plus internal/scheduler, internal/api/middleware,
  cmd/agent, internal/mcp.
This commit is contained in:
shankar0123
2026-05-09 20:15:43 +00:00
parent 60a589ab96
commit 3ef45e2ad4
38 changed files with 3159 additions and 140 deletions
+37 -1
View File
@@ -427,10 +427,12 @@ func handleAuthPermissions(client *cli.Client, args []string) error {
func handleAuthKeys(client *cli.Client, args []string) error {
if len(args) == 0 {
fmt.Fprintf(os.Stderr, "usage: auth keys <assign|revoke> [...]\n")
fmt.Fprintf(os.Stderr, "usage: auth keys <list|assign|revoke|scope-down> [...]\n")
return nil
}
switch args[0] {
case "list":
return client.AuthListKeys()
case "assign":
// auth keys assign <key-id> --role <role-id>
if len(args) < 4 || args[2] != "--role" {
@@ -445,8 +447,42 @@ func handleAuthKeys(client *cli.Client, args []string) error {
return nil
}
return client.AuthRevokeRoleFromKey(args[1], args[3])
case "scope-down":
// Bundle 1 Phase 7 — interactive (default), --non-interactive
// <config.json>, or --suggest [--apply].
return handleAuthKeysScopeDown(client, args[1:])
default:
fmt.Fprintf(os.Stderr, "unknown keys subcommand: %s\n", args[0])
return nil
}
}
// handleAuthKeysScopeDown dispatches the three scope-down modes:
//
// auth keys scope-down → interactive
// auth keys scope-down --non-interactive <config> → JSON-driven
// auth keys scope-down --suggest [--apply] → audit-driven suggestions
func handleAuthKeysScopeDown(client *cli.Client, args []string) error {
if len(args) == 0 {
return client.AuthScopeDown()
}
switch args[0] {
case "--non-interactive":
if len(args) < 2 {
fmt.Fprintf(os.Stderr, "usage: auth keys scope-down --non-interactive <config.json>\n")
return nil
}
return client.AuthScopeDownNonInteractive(args[1])
case "--suggest":
apply := false
for _, a := range args[1:] {
if a == "--apply" {
apply = true
}
}
return client.AuthScopeDownSuggest(apply)
default:
fmt.Fprintf(os.Stderr, "unknown scope-down flag: %s\n", args[0])
return nil
}
}
+47
View File
@@ -2,13 +2,60 @@ package main
import (
"context"
"fmt"
"log/slog"
"strings"
"github.com/certctl-io/certctl/internal/auth"
"github.com/certctl-io/certctl/internal/config"
"github.com/certctl-io/certctl/internal/domain"
authdomain "github.com/certctl-io/certctl/internal/domain/auth"
)
// assembleNamedAPIKeys translates the operator's CERTCTL_API_KEYS_NAMED
// env-var (preferred) or CERTCTL_AUTH_SECRET (legacy) into the
// auth.NamedAPIKey slice the rest of the boot path consumes.
//
// Authentication unification (M-002): every authenticated request now
// carries a named actor in the request context so audit events record
// the real key identity instead of the hardcoded "api-key-user"
// string. Named keys come from CERTCTL_API_KEYS_NAMED (preferred). For
// backward compatibility CERTCTL_AUTH_SECRET is synthesized into
// legacy-key-N entries with Admin=false.
func assembleNamedAPIKeys(cfg *config.Config, logger *slog.Logger) []auth.NamedAPIKey {
if config.AuthType(cfg.Auth.Type) == config.AuthTypeNone {
return nil
}
var out []auth.NamedAPIKey
for _, nk := range cfg.Auth.NamedKeys {
out = append(out, auth.NamedAPIKey{
Name: nk.Name,
Key: nk.Key,
Admin: nk.Admin,
})
}
if len(out) == 0 && cfg.Auth.Secret != "" {
idx := 0
for _, p := range strings.Split(cfg.Auth.Secret, ",") {
p = strings.TrimSpace(p)
if p == "" {
continue
}
out = append(out, auth.NamedAPIKey{
Name: fmt.Sprintf("legacy-key-%d", idx),
Key: p,
Admin: false,
})
idx++
}
if len(out) > 0 && logger != nil {
logger.Warn("CERTCTL_AUTH_SECRET is deprecated — set CERTCTL_API_KEYS_NAMED for named actor attribution and admin gating",
"synthesized_keys", len(out))
}
}
return out
}
// actorRoleGranter is the narrow interface backfillNamedKeyActorRoles
// needs from the postgres ActorRoleRepository. Pulled out so the unit
// test can inject a fake without spinning up the full repo / DB.
+69 -63
View File
@@ -22,6 +22,7 @@ import (
"github.com/certctl-io/certctl/internal/api/middleware"
"github.com/certctl-io/certctl/internal/api/router"
"github.com/certctl-io/certctl/internal/auth"
"github.com/certctl-io/certctl/internal/auth/bootstrap"
"github.com/certctl-io/certctl/internal/config"
discoveryawssm "github.com/certctl-io/certctl/internal/connector/discovery/awssm"
discoveryazurekv "github.com/certctl-io/certctl/internal/connector/discovery/azurekv"
@@ -264,11 +265,68 @@ func main() {
authRoleRepo := postgres.NewRoleRepository(db)
authPermRepo := postgres.NewPermissionRepository(db)
authActorRoleRepo := postgres.NewActorRoleRepository(db)
authAPIKeyRepo := postgres.NewAPIKeyRepository(db)
authAuthorizer := authsvc.NewAuthorizer(authActorRoleRepo)
// authCheckerAdapter bridges authsvc.Authorizer (typed-string args)
// to the auth.PermissionChecker interface (plain-string args) so
// internal/auth doesn't have to import internal/service/auth.
authCheckerAdapter := authPermissionCheckerAdapter{a: authAuthorizer}
// Bundle 1 Phase 6 — parse env-var named API keys + assemble the
// runtime keystore + wire the bootstrap service. The keystore +
// bootstrap handler must exist before the HandlerRegistry is
// constructed below; the auth middleware that reads from the same
// keystore is wired further down (next to the rest of the
// middleware stack) but holds a reference to the same keystore so
// runtime additions from bootstrap propagate without restart.
//
// boot-path operations use context.Background() because the long-
// lived request context isn't constructed until later in main();
// this matches the convention used by other one-shot setup calls
// in this section (issuerService.SeedFromEnvVars, etc.).
bootCtx := context.Background()
namedKeys := assembleNamedAPIKeys(cfg, logger)
backfillNamedKeyActorRoles(bootCtx, authActorRoleRepo, namedKeys, logger)
authKeyStore := auth.NewMutableKeyStore(namedKeys)
if persistedKeys, err := authAPIKeyRepo.List(bootCtx, authdomainAlias.DefaultTenantID); err == nil {
for _, pk := range persistedKeys {
authKeyStore.AddHashed(pk.Name, pk.KeyHash, pk.Admin)
}
if len(persistedKeys) > 0 {
logger.Info("loaded persisted api_keys into runtime keystore",
"count", len(persistedKeys))
}
} else {
logger.Warn("api_keys boot loader failed; bootstrap-minted keys will not authenticate until next restart that succeeds",
"err", err)
}
bootstrapStrategy := bootstrap.NewEnvTokenStrategy(
cfg.Auth.BootstrapToken,
func(ctx context.Context) (bool, error) {
return authActorRoleRepo.AdminExists(ctx, authdomainAlias.DefaultTenantID)
},
)
bootstrapService := bootstrap.NewService(
bootstrapStrategy,
authAPIKeyRepo,
authActorRoleRepo,
auditService,
authKeyStore,
auth.HashAPIKey,
)
if cfg.Auth.BootstrapToken != "" {
// Honour the prompt's "warn at startup if token set + admin
// exists" requirement. The strategy re-probes on every Validate
// so this boot-time warning is purely informational.
if exists, probeErr := authActorRoleRepo.AdminExists(bootCtx, authdomainAlias.DefaultTenantID); probeErr == nil && exists {
logger.Warn("CERTCTL_BOOTSTRAP_TOKEN set but admin actors already exist; bootstrap endpoint will return 410 Gone — unset the env var to silence this warning")
} else if probeErr != nil {
logger.Warn("CERTCTL_BOOTSTRAP_TOKEN admin-existence probe failed at startup; behaviour will be determined by the live probe at request time", "err", probeErr)
} else {
logger.Info("bootstrap endpoint enabled — POST /api/v1/auth/bootstrap to mint the first admin key (one-shot)")
}
}
bootstrapHandler := handler.NewBootstrapHandler(bootstrapService)
policyService := service.NewPolicyService(policyRepo, auditService)
policyService.SetCertRepo(certificateRepo) // D-008: CertificateLifetime arm needs CertificateVersion.NotBefore/NotAfter
// G-1: RenewalPolicyService — distinct from PolicyService (compliance rules).
@@ -1001,6 +1059,10 @@ func main() {
authsvc.NewActorRoleService(authActorRoleRepo, authRoleRepo, authAuthorizer, auditService),
authCheckerAdapter,
),
// Bundle 1 Phase 6 — bootstrap day-0 admin endpoint. The
// service is wired above; handler is auth-exempt at the
// router (gated by the bootstrap.Strategy itself).
Bootstrap: bootstrapHandler,
// Checker is the load-bearing auth.PermissionChecker that
// auth.RequirePermission middleware uses to gate the legacy admin
// handlers (Bundle 1 Phase 3.5: bulk_revocation, admin_crl_cache,
@@ -1523,75 +1585,19 @@ func main() {
// Build middleware stack.
//
// Authentication unification (M-002): every authenticated request now
// carries a named actor in the request context so audit events record
// the real key identity instead of the hardcoded "api-key-user" string.
// Named keys come from CERTCTL_API_KEYS_NAMED (preferred). For backward
// compatibility CERTCTL_AUTH_SECRET is synthesized into legacy-key-N
// entries with Admin=false.
var namedKeys []auth.NamedAPIKey
if config.AuthType(cfg.Auth.Type) != config.AuthTypeNone {
// Translate typed config.NamedAPIKey -> auth.NamedAPIKey. The
// two structs are field-compatible but live in different packages to
// preserve the config→middleware dependency direction.
for _, nk := range cfg.Auth.NamedKeys {
namedKeys = append(namedKeys, auth.NamedAPIKey{
Name: nk.Name,
Key: nk.Key,
Admin: nk.Admin,
})
}
// Back-compat: if no named keys but legacy Secret is configured,
// synthesize named entries so the audit trail still attributes the
// action (instead of falling back to "api-key-user" / "anonymous").
if len(namedKeys) == 0 && cfg.Auth.Secret != "" {
parts := strings.Split(cfg.Auth.Secret, ",")
idx := 0
for _, p := range parts {
p = strings.TrimSpace(p)
if p == "" {
continue
}
namedKeys = append(namedKeys, auth.NamedAPIKey{
Name: fmt.Sprintf("legacy-key-%d", idx),
Key: p,
Admin: false,
})
idx++
}
if len(namedKeys) > 0 {
logger.Warn("CERTCTL_AUTH_SECRET is deprecated — set CERTCTL_API_KEYS_NAMED for named actor attribution and admin gating",
"synthesized_keys", len(namedKeys))
}
}
}
// Bundle 1 Phase 3 closure (C2): backfill actor_roles rows for every
// CERTCTL_API_KEYS_NAMED entry (and the legacy CERTCTL_AUTH_SECRET
// synthesized fallbacks) so RBAC checks have a row to match against.
// Without this, named keys would land on a Phase-3 actor context
// that authorizes every request through the legacy in-handler
// auth.IsAdmin path but fails every Phase-3.5 rbacGate (no
// actor_roles row → empty EffectivePermissions → 403). The helper
// lives in cmd/server/auth_backfill.go so the role-mapping invariant
// is pinned by a focused unit test without dragging in the full
// server bootstrap path.
backfillNamedKeyActorRoles(ctx, authActorRoleRepo, namedKeys, logger)
// Bundle 1 Phase 3 closure (C1): when CERTCTL_AUTH_TYPE=none the
// legacy NewAuthWithNamedKeys returns a no-op pass-through, which
// would leave ActorIDKey / ActorTypeKey / TenantIDKey unpopulated
// in context. Phase 3.5's rbacGate + Phase 4's RBAC handlers all
// require an actor in context (or they 401), so demo mode would be
// completely broken. NewDemoModeAuth injects the synthetic
// `actor-demo-anon` actor seeded by migration 000029, which holds
// the admin role at global scope; the demo + 5 examples in
// examples/*/docker-compose.yml continue to work end-to-end.
// Bundle 1 Phase 6: namedKeys + authKeyStore + bootstrap service
// are now constructed earlier (right after the auth repos) so the
// HandlerRegistry can wire the bootstrap handler. The auth
// middleware below reads from the same authKeyStore reference, so
// runtime additions from bootstrap propagate without restart.
var authMiddleware func(http.Handler) http.Handler
switch config.AuthType(cfg.Auth.Type) {
case config.AuthTypeNone:
authMiddleware = auth.NewDemoModeAuth()
default:
authMiddleware = auth.NewAuthWithNamedKeys(namedKeys)
authMiddleware = auth.NewAuthWithKeyStore(authKeyStore)
}
_ = bootstrapHandler // referenced by HandlerRegistry above
corsMiddleware := middleware.NewCORS(middleware.CORSConfig{
AllowedOrigins: cfg.CORS.AllowedOrigins,
})