certctl/internal/service/auth/role_service.go
shankar0123 3ef45e2ad4 auth-bundle-1 Phase 6-7-8: bootstrap path + scope-down CLI + auditor-role split
# Phase 6 — day-0 admin bootstrap

* internal/auth/bootstrap/ (new package): Strategy interface +
  EnvTokenStrategy with constant-time compare, one-shot consumption
  via sync.Mutex, optional admin-existence probe. Bundle 2's OIDC-
  first-admin will plug in alongside as an alternate Strategy.
* BootstrapService.ValidateAndMint: validates the operator's
  CERTCTL_BOOTSTRAP_TOKEN, mints a 32-byte (64-hex-char) random API
  key value, persists the SHA-256 hash to api_keys, grants r-admin
  via actor_roles, registers the hash with the runtime keystore
  (AddHashed) so the just-minted key authenticates the next request
  without a restart, and records bootstrap.consume to the audit
  trail with category=auth.
* internal/auth/keystore.go (new): KeyStore interface +
  StaticKeyStore (immutable env-var-only path) + MutableKeyStore
  (env-var keys + DB-loaded api_keys + runtime AddHashed). The auth
  middleware now consumes a KeyStore so the bootstrap path can
  extend the lookup table at runtime.
* migrations/000031_api_keys.up/down.sql: api_keys table with
  (id, name UNIQUE, key_hash UNIQUE, tenant_id, admin, created_by,
  created_at, expires_at, last_used_at). Idempotent.
* /v1/auth/bootstrap GET (probe) + POST (mint) — auth-exempt. Both
  routes documented in api/openapi.yaml + AuthExemptRouterRoutes
  allowlist updated. The token never leaves internal/auth/bootstrap;
  the minted plaintext key flows only into the HTTP response body.
* Startup warning emitted when CERTCTL_BOOTSTRAP_TOKEN is set AND
  admin actors already exist (config drift signal).
* Tests: 4 strategy invariants (empty token born disabled, wrong
  token=ErrInvalidToken without consumption, one-shot consumption,
  admin-exists closes path), 5 service tests (happy path + actor-
  name validation + propagation of strategy errors + nil-deps
  guard + 32-byte entropy budget), 8 HTTP-handler tests (status
  201/410/401/400 mapping + token-leak hygiene scan of slog +
  audit details + Location header). Token-leak test redirects
  slog.Default to a buffer for the test scope.
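
The strategy invariants listed above (constant-time compare, one-shot consumption under a sync.Mutex, a wrong token rejected without consuming the path) and the 32-byte key mint can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in internal/auth/bootstrap: envTokenStrategy, newEnvTokenStrategy, Consume, mintKey, ErrConsumed and ErrDisabled are hypothetical names, and hashing the hex-encoded key (rather than the raw bytes) is an assumption.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"crypto/subtle"
	"encoding/hex"
	"errors"
	"sync"
)

// ErrInvalidToken is named in the commit message; ErrConsumed and
// ErrDisabled are hypothetical sentinels used only in this sketch.
var (
	ErrInvalidToken = errors.New("bootstrap: invalid token")
	ErrConsumed     = errors.New("bootstrap: token already consumed")
	ErrDisabled     = errors.New("bootstrap: disabled")
)

// envTokenStrategy is an illustrative stand-in for EnvTokenStrategy.
type envTokenStrategy struct {
	mu       sync.Mutex
	expected [sha256.Size]byte // digest of the configured token
	enabled  bool              // false when the configured token is empty
	consumed bool
}

func newEnvTokenStrategy(token string) *envTokenStrategy {
	return &envTokenStrategy{
		expected: sha256.Sum256([]byte(token)),
		enabled:  token != "",
	}
}

// Consume validates the presented token and burns the path on success.
// A wrong token returns ErrInvalidToken and leaves the path open.
func (s *envTokenStrategy) Consume(presented string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if !s.enabled {
		return ErrDisabled
	}
	if s.consumed {
		return ErrConsumed
	}
	// Comparing fixed-size digests keeps both inputs the same length,
	// which subtle.ConstantTimeCompare needs to run in constant time.
	got := sha256.Sum256([]byte(presented))
	if subtle.ConstantTimeCompare(got[:], s.expected[:]) != 1 {
		return ErrInvalidToken
	}
	s.consumed = true
	return nil
}

// mintKey returns a 32-byte random key as 64 hex characters, plus the
// hex SHA-256 digest that would be persisted instead of the plaintext.
func mintKey() (string, string, error) {
	buf := make([]byte, 32)
	if _, err := rand.Read(buf); err != nil {
		return "", "", err
	}
	plaintext := hex.EncodeToString(buf)
	sum := sha256.Sum256([]byte(plaintext))
	return plaintext, hex.EncodeToString(sum[:]), nil
}
```

Holding the mutex across both the comparison and the consumed flag is what makes the path one-shot under concurrent requests: two callers presenting the correct token cannot both succeed.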

# Phase 7 — API-key migration + scope-down CLI

* GET /v1/auth/keys handler + service method ListKeys backed by
  ActorRoleRepository.ListDistinctActors. Returns one row per
  (actor_id, actor_type) pair with the slice of role IDs they hold.
  Permission: auth.role.list.
* internal/cli/auth_scope_down.go: AuthListKeys, AuthScopeDown
  (interactive), AuthScopeDownNonInteractive (JSON config),
  AuthScopeDownSuggest (--suggest with optional --apply). The
  synthetic actor-demo-anon is filtered out of every interactive /
  bulk path; non-interactive flow logs and skips it explicitly.
* SuggestRoleFromAuditEvents (pure function): walks 30 days of
  audit events per actor and returns the narrowest matching role
  (admin / mcp / viewer / agent / operator) plus a one-line reason.
  Classification: any admin-shaped action wins; otherwise all-MCP
  → mcp; all-read-only → viewer; all-agent-shaped → agent;
  otherwise operator. Test table pins all six classifications.
* CLI subcommand tree extended: 'auth keys list' + 'auth keys
  scope-down [--non-interactive <cfg>] [--suggest [--apply]]'.
* CHANGELOG.md leads v2.1.0 with the SECURITY: AUDIT YOUR API KEYS
  call-out + four flow examples.
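
The classification order described for SuggestRoleFromAuditEvents can be sketched as a pure function. This is a hedged illustration only: the shape type, its constants, and suggestRole are hypothetical, the mapping from real audit action strings to shapes is not shown in this commit message, and returning viewer for an empty history is an assumption of the sketch rather than documented behaviour.

```go
package main

// shape is a hypothetical pre-classification of a single audit event.
type shape int

const (
	shapeRead  shape = iota // read-only action
	shapeMCP                // action issued through MCP
	shapeAgent              // agent-shaped action
	shapeAdmin              // admin-shaped action
	shapeOther              // any other mutating action
)

// suggestRole applies the precedence from the commit message: any
// admin-shaped action wins, then all-MCP, all-read-only, all-agent,
// and finally operator as the fallback.
func suggestRole(events []shape) (role, reason string) {
	if len(events) == 0 {
		// Assumption: with no evidence, suggest the narrowest role.
		return "viewer", "no audit events in the window"
	}
	allMCP, allRead, allAgent := true, true, true
	for _, e := range events {
		if e == shapeAdmin {
			return "admin", "performed at least one admin-shaped action"
		}
		allMCP = allMCP && e == shapeMCP
		allRead = allRead && e == shapeRead
		allAgent = allAgent && e == shapeAgent
	}
	switch {
	case allMCP:
		return "mcp", "every action came through MCP"
	case allRead:
		return "viewer", "every action was read-only"
	case allAgent:
		return "agent", "every action was agent-shaped"
	default:
		return "operator", "mixed or mutating non-admin actions"
	}
}
```

Keeping the function free of I/O means a table-driven test can pin each branch without a database or audit store.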

# Phase 8 — auditor role + event_category column

* migrations/000032_audit_category.up/down.sql: ALTER TABLE
  audit_events ADD COLUMN event_category TEXT NOT NULL DEFAULT
  'cert_lifecycle' + CHECK constraint (cert_lifecycle/auth/config)
  + (event_category) and (event_category, timestamp DESC) indexes
  for the auditor-filter query path. WORM trigger from migration
  000018 continues to enforce append-only at the DB layer (DDL is
  not blocked).
* domain.AuditEvent gains EventCategory string (omitempty);
  domain.EventCategoryCertLifecycle / Auth / Config constants.
* AuditService.RecordEventWithCategory sibling of RecordEvent;
  legacy callers stay on RecordEvent (defaults to cert_lifecycle).
  Auth callers (RoleService, ActorRoleService, BootstrapService)
  switched to RecordEventWithCategory(..., 'auth', ...).
* GET /v1/audit?category=<cat>: handler accepts the optional query
  param, validates against the enum (400 on invalid value),
  dispatches through ListAuditEventsByCategory. OpenAPI updated
  with the new query param + AuditEvent.event_category schema.
* Postgres AuditRepository.Create now writes event_category;
  AuditRepository.List filters on it; AuditFilter.EventCategory
  gates the WHERE clause.
* Tests: 5 audit-category-filter HTTP tests (dispatch routing,
  back-compat fallback, 400 for invalid values, all 3 enum values
  accepted, page+category combine, JSON output surfaces the
  field). 3 auditor-role invariants (auditor holds exactly
  audit.read+audit.export, no mutating perms, disjoint from
  viewer except audit.read).
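
The enum validation behind GET /v1/audit?category=<cat> can be sketched as below. The three category values come from the CHECK constraint above; validCategory and categoryFromQuery are hypothetical helpers, and treating an absent or empty parameter as "no filter" is an assumption based on the back-compat fallback test.

```go
package main

import (
	"net/http"
	"net/url"
)

// Values permitted by the audit_events.event_category CHECK constraint.
const (
	categoryCertLifecycle = "cert_lifecycle"
	categoryAuth          = "auth"
	categoryConfig        = "config"
)

// validCategory reports whether raw is empty (no filter) or one of the
// three enum values.
func validCategory(raw string) bool {
	switch raw {
	case "", categoryCertLifecycle, categoryAuth, categoryConfig:
		return true
	default:
		return false
	}
}

// categoryFromQuery extracts the optional category parameter from a raw
// query string and returns the HTTP status a handler would respond with:
// 200 when the value is usable, 400 when it is not in the enum.
func categoryFromQuery(rawQuery string) (category string, status int) {
	q, err := url.ParseQuery(rawQuery)
	if err != nil {
		return "", http.StatusBadRequest
	}
	raw := q.Get("category")
	if !validCategory(raw) {
		return "", http.StatusBadRequest
	}
	return raw, http.StatusOK
}
```

Rejecting unknown values at the handler keeps a typo from silently returning an empty result set, which matters for an auditor who relies on the filter being exact.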

# Cross-phase wiring

* HandlerRegistry.Bootstrap field added; cmd/server/main.go wires
  the bootstrap service ahead of RegisterHandlers (extracted
  assembleNamedAPIKeys helper into auth_backfill.go, moved the
  keystore + bootstrap construction up alongside the auth repos).
* AuthCheckResolver / AuthActorRoleService extended with ListKeys
  to satisfy the Phase 7 surface; existing fakes updated.
* fakeAudit + mockAuditService stubs in tests gain
  RecordEventWithCategory + ListAuditEventsByCategory; existing
  tests untouched.

# Verifications

* gofmt -l: clean across every modified file.
* go vet ./...: clean.
* staticcheck across internal/auth + handler + router + cli +
  service + repository + cmd + domain: clean.
* go test -short -count=1: green across every Bundle-1-touched
  package — internal/auth (incl. bootstrap), internal/api/handler,
  internal/api/router, internal/cli, internal/service/auth,
  internal/service, internal/domain/auth, internal/repository/postgres,
  cmd/server, cmd/cli, plus internal/scheduler, internal/api/middleware,
  cmd/agent, internal/mcp.
2026-05-09 20:15:43 +00:00


package auth

import (
	"context"
	"fmt"

	"github.com/certctl-io/certctl/internal/domain"
	authdomain "github.com/certctl-io/certctl/internal/domain/auth"
	"github.com/certctl-io/certctl/internal/repository"
)

// RoleService manages roles + role-permission grants.
type RoleService struct {
	repo       repository.RoleRepository
	permRepo   repository.PermissionRepository
	authorizer *Authorizer
	audit      AuditService
}

// NewRoleService constructs a RoleService.
func NewRoleService(repo repository.RoleRepository, permRepo repository.PermissionRepository, authorizer *Authorizer, audit AuditService) *RoleService {
	return &RoleService{
		repo:       repo,
		permRepo:   permRepo,
		authorizer: authorizer,
		audit:      audit,
	}
}

// List returns every role in the caller's tenant. Requires
// `auth.role.list`.
func (s *RoleService) List(ctx context.Context, caller *Caller) ([]*authdomain.Role, error) {
	if err := s.requirePermission(ctx, caller, "auth.role.list"); err != nil {
		return nil, err
	}
	tenantID := caller.TenantID
	if tenantID == "" {
		tenantID = authdomain.DefaultTenantID
	}
	return s.repo.List(ctx, tenantID)
}

// Get returns the role with the given ID. Requires `auth.role.list`.
func (s *RoleService) Get(ctx context.Context, caller *Caller, id string) (*authdomain.Role, error) {
	if err := s.requirePermission(ctx, caller, "auth.role.list"); err != nil {
		return nil, err
	}
	return s.repo.Get(ctx, id)
}

// Create stores a new role. Requires `auth.role.create`.
func (s *RoleService) Create(ctx context.Context, caller *Caller, role *authdomain.Role) error {
	if err := s.requirePermission(ctx, caller, "auth.role.create"); err != nil {
		return err
	}
	if role.TenantID == "" {
		role.TenantID = authdomain.DefaultTenantID
	}
	if err := s.repo.Create(ctx, role); err != nil {
		return err
	}
	s.recordAudit(ctx, caller, "role.create", "role", role.ID, map[string]interface{}{"name": role.Name, "tenant_id": role.TenantID})
	return nil
}

// Update modifies an existing role. Requires `auth.role.edit`.
func (s *RoleService) Update(ctx context.Context, caller *Caller, role *authdomain.Role) error {
	if err := s.requirePermission(ctx, caller, "auth.role.edit"); err != nil {
		return err
	}
	if err := s.repo.Update(ctx, role); err != nil {
		return err
	}
	s.recordAudit(ctx, caller, "role.update", "role", role.ID, map[string]interface{}{"name": role.Name})
	return nil
}

// Delete removes a role. Requires `auth.role.delete`. Returns
// repository.ErrAuthRoleInUse when active actor_roles still reference
// the role (FK ON DELETE RESTRICT).
func (s *RoleService) Delete(ctx context.Context, caller *Caller, id string) error {
	if err := s.requirePermission(ctx, caller, "auth.role.delete"); err != nil {
		return err
	}
	if err := s.repo.Delete(ctx, id); err != nil {
		return err
	}
	s.recordAudit(ctx, caller, "role.delete", "role", id, nil)
	return nil
}

// ListPermissions returns the (permission, scope) grants on the role.
// Requires `auth.role.list`.
func (s *RoleService) ListPermissions(ctx context.Context, caller *Caller, roleID string) ([]*authdomain.RolePermission, error) {
	if err := s.requirePermission(ctx, caller, "auth.role.list"); err != nil {
		return nil, err
	}
	return s.repo.ListPermissions(ctx, roleID)
}

// AddPermission grants a permission to a role at the given scope.
// Requires `auth.role.edit`. Returns ErrInvalidPermission if the
// permission name is not in the canonical catalogue.
func (s *RoleService) AddPermission(ctx context.Context, caller *Caller, roleID, permissionName string, scopeType authdomain.ScopeType, scopeID *string) error {
	if err := s.requirePermission(ctx, caller, "auth.role.edit"); err != nil {
		return err
	}
	if !s.permRepo.IsCanonical(permissionName) {
		return fmt.Errorf("%w: %q", ErrInvalidPermission, permissionName)
	}
	perm, err := s.permRepo.GetByName(ctx, permissionName)
	if err != nil {
		return err
	}
	grant := &authdomain.RolePermission{
		RoleID:       roleID,
		PermissionID: perm.ID,
		ScopeType:    scopeType,
		ScopeID:      scopeID,
	}
	if err := s.repo.AddPermission(ctx, grant); err != nil {
		return err
	}
	details := map[string]interface{}{
		"role_id":    roleID,
		"permission": permissionName,
		"scope_type": string(scopeType),
	}
	if scopeID != nil {
		details["scope_id"] = *scopeID
	}
	s.recordAudit(ctx, caller, "role.permission.add", "role", roleID, details)
	return nil
}

// RemovePermission revokes a previously-granted permission from a role.
// Requires `auth.role.edit`.
func (s *RoleService) RemovePermission(ctx context.Context, caller *Caller, roleID, permissionName string, scopeType authdomain.ScopeType, scopeID *string) error {
	if err := s.requirePermission(ctx, caller, "auth.role.edit"); err != nil {
		return err
	}
	perm, err := s.permRepo.GetByName(ctx, permissionName)
	if err != nil {
		return err
	}
	grant := &authdomain.RolePermission{
		RoleID:       roleID,
		PermissionID: perm.ID,
		ScopeType:    scopeType,
		ScopeID:      scopeID,
	}
	if err := s.repo.RemovePermission(ctx, grant); err != nil {
		return err
	}
	details := map[string]interface{}{
		"role_id":    roleID,
		"permission": permissionName,
		"scope_type": string(scopeType),
	}
	if scopeID != nil {
		details["scope_id"] = *scopeID
	}
	s.recordAudit(ctx, caller, "role.permission.remove", "role", roleID, details)
	return nil
}

// requirePermission is the gate every public method runs first. System
// callers bypass; everyone else must hold the named permission globally.
// Returns ErrUnauthenticated when caller is nil, ErrForbidden when the
// caller exists but lacks the permission.
func (s *RoleService) requirePermission(ctx context.Context, caller *Caller, perm string) error {
	if caller == nil {
		return ErrUnauthenticated
	}
	if caller.IsSystem {
		return nil
	}
	tenantID := caller.TenantID
	if tenantID == "" {
		tenantID = authdomain.DefaultTenantID
	}
	ok, err := s.authorizer.CheckPermission(ctx, caller.ActorID, authdomain.ActorTypeValue(caller.ActorType), tenantID, perm, authdomain.ScopeTypeGlobal, nil)
	if err != nil {
		return err
	}
	if !ok {
		return fmt.Errorf("%w: %q", ErrForbidden, perm)
	}
	return nil
}

// recordAudit emits an audit row tied to the caller. Best-effort: any
// error returned by the audit service is discarded here, so an audit
// failure never fails the operation. A nil audit service or nil caller
// makes this a no-op.
//
// Bundle 1 Phase 8: every role mutation is an authentication /
// authorization event. The auditor role queries
// /v1/audit?category=auth to surface this slice.
func (s *RoleService) recordAudit(ctx context.Context, caller *Caller, action, resourceType, resourceID string, details map[string]interface{}) {
	if s.audit == nil || caller == nil {
		return
	}
	_ = s.audit.RecordEventWithCategory(ctx, caller.ActorID, caller.ActorType, action, domain.EventCategoryAuth, resourceType, resourceID, details)
}

// Compile-time pin: a domain.ActorType value must stay convertible to
// authdomain.ActorTypeValue. If the underlying types ever diverge,
// this declaration stops compiling.
var _ authdomain.ActorTypeValue = authdomain.ActorTypeValue(domain.ActorTypeAPIKey)
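
Because requirePermission wraps its sentinel with %w, callers can still match it with errors.Is even though the message carries the permission name. The sketch below shows one way an HTTP layer might map these errors to status codes. The sentinel names ErrUnauthenticated and ErrForbidden come from the file above, but their messages here, the forbidden helper, statusFor, and the 401/403/500 mapping are assumptions for illustration, not the project's handler code.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// Stand-ins for the package-level sentinels referenced in role_service.go;
// the message text is made up for this sketch.
var (
	ErrUnauthenticated = errors.New("unauthenticated")
	ErrForbidden       = errors.New("forbidden")
)

// forbidden mirrors the wrapping used by requirePermission.
func forbidden(perm string) error {
	return fmt.Errorf("%w: %q", ErrForbidden, perm)
}

// statusFor is a hypothetical mapping from service errors to HTTP codes.
func statusFor(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, ErrUnauthenticated):
		return http.StatusUnauthorized
	case errors.Is(err, ErrForbidden):
		// errors.Is unwraps the %w chain, so the appended
		// permission name does not defeat the match.
		return http.StatusForbidden
	default:
		return http.StatusInternalServerError
	}
}
```

Comparing with == instead of errors.Is would miss the wrapped ErrForbidden and fall through to the default branch.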