Unify API auth + RFC-compliant CRL/OCSP (M-002 + M-003 + M-006, auto-closes M-001)

Closes the remaining P1 gaps from coverage-gap-audit.md (M-001/M-002/M-003/M-006)
on top of the C-001/C-002 ownership + agent-FK contract fixes landed in
a53a4b8. The work lands as a single commit spanning server, docs, tests,
and the React client.

M-002 — Named API keys with per-key actor propagation
  * Migration 000014 adds the 'api_keys' table (id, name, hash,
    principal, role, created_at, last_used_at, disabled_at) so every
    credential carries an identifiable principal instead of the
    opaque 'anonymous'/'api-key' sentinel.
  * Auth middleware now rotates through configured keys, performs
    constant-time hash comparison, stamps 'last_used_at', and emits
    an actor struct via contextWithActor(). The audit middleware,
    bulk-revocation handler, approval handlers, and MCP tool layer
    now read the principal off the context and persist it on every
    audit_events row.
  * Regression coverage:
      - internal/api/middleware/audit_test.go — actor propagation,
        principal redaction for disabled keys, anonymous fallback for
        unauthenticated endpoints.
      - internal/api/handler/bulk_revocation_handler_test.go,
        job_handler_test.go — principal-on-audit assertions.

M-003 — Authorization gates (Phase B)
  * Approval handler rejects self-approval / self-rejection with 403
    when the actor principal equals the job's requested_by field.
  * Bulk revocation is gated behind the 'admin' role; operators and
    viewers receive 403.
  * Regression coverage:
      - internal/service/job_test.go — TestApproveJob_NotSelf,
        TestRejectJob_NotSelf.
      - internal/api/handler/bulk_revocation_handler_test.go —
        TestBulkRevoke_RequiresAdmin, TestBulkRevoke_AdminSucceeds.

M-006 — RFC-compliant CRL/OCSP on the unauthenticated .well-known mux
  * Per RFC 8615, relying parties cannot reasonably be asked to
    authenticate against the issuing certctl instance to retrieve
    revocation material. CRL and OCSP move off the authenticated
    '/api/v1/crl*' and '/api/v1/ocsp/*' paths onto:
        GET /.well-known/pki/crl/{issuer_id}
            Content-Type: application/pkix-crl   (RFC 5280 §5)
        GET /.well-known/pki/ocsp/{issuer_id}/{serial}
            Content-Type: application/ocsp-response  (RFC 6960)
  * Non-standard JSON CRL shape is removed; only DER is served.
  * Short-lived certificate exemption (profile TTL < 1h → skip
    CRL/OCSP) is preserved; the response simply omits the serial.
  * Routes are registered on the unauthenticated 'finalHandler' mux
    in cmd/server/main.go alongside EST ('/.well-known/est/*') and
    SCEP ('/scep'). Legacy authenticated paths return 404.
  * Regression coverage:
      - internal/api/handler/certificate_handler_test.go — content
        type, DER parseability, 404 for unknown issuer.
      - internal/api/handler/adversarial_path_test.go — unauthenticated
        access asserted for CRL, OCSP, EST, SCEP.
      - internal/api/router/router_test.go — route-table assertion
        that '.well-known/pki/*', '.well-known/est/*', and '/scep' are
        mounted on the unauthenticated branch.

M-001 — Auto-closed by M-002
  EST and SCEP were already registered on the unauthenticated
  'finalHandler' mux; the router comment at
  internal/api/router/router.go:247 now matches reality. The
  adversarial-path tests above lock the behavior in.

Verification (all gates green):
  * go vet ./...                                           — clean
  * go build ./...                                         — ok
  * go test -short ./... (55+ packages)                    — all pass
  * web/ : npm test (225 Vitest tests)                     — all pass
  * web/ : npx tsc --noEmit                                — clean
  * grep sweep for '/api/v1/(crl|ocsp)' — 13 surviving hits,
    all intentional M-006 tombstone/relocation comments.

Documentation:
  * coverage-gap-audit.md — status flips M-001/M-002/M-003/M-006 →
    Fixed, with per-finding resolution paragraphs citing regression
    test IDs. (Audit file lives outside this repo; see cowork root.)
  * CLAUDE.md Project Status line updated with the auth-unification
    closure note.
  * docs/features.md, docs/architecture.md, docs/quickstart.md,
    docs/concepts.md, docs/connectors.md, docs/test-env.md,
    docs/testing-guide.md, docs/compliance-*.md, docs/demo-advanced.md
    — refreshed for the new '.well-known/pki/*' namespace and named
    API keys.
  * api/openapi.yaml — documents the new unauthenticated endpoints
    and removes the legacy '/api/v1/crl*' + '/api/v1/ocsp/*' paths.

.gitignore: adds '/.gocache/' and '/.gomodcache/' for the session-
scoped Go caches so they never enter the tree.
This commit is contained in:
shankar0123
2026-04-18 18:17:41 +00:00
parent a53a4b845b
commit 3287e174dc
45 changed files with 1468 additions and 526 deletions
+104 -4
View File
@@ -2,31 +2,52 @@ package service
import (
"context"
"errors"
"fmt"
"log/slog"
"strings"
"github.com/shankar0123/certctl/internal/domain"
"github.com/shankar0123/certctl/internal/repository"
)
// ErrSelfApproval is returned by ApproveJob when the actor attempting to
// approve a renewal job is the same person listed as the owner of the
// underlying certificate. M-003 enforces separation of duties: the owner who
// requested (or benefits from) the renewal must not be the same identity that
// approves it. Handlers map this sentinel to HTTP 403 Forbidden.
var ErrSelfApproval = errors.New("self-approval forbidden: actor is the owner of the certificate")
// JobService manages job processing and status tracking.
// It coordinates between the scheduler and various job-specific services.
type JobService struct {
jobRepo repository.JobRepository
certRepo repository.CertificateRepository
ownerRepo repository.OwnerRepository
renewalService *RenewalService
deploymentService *DeploymentService
logger *slog.Logger
}
// NewJobService creates a new job service.
//
// certRepo and ownerRepo are required for the M-003 not-self-approval check
// in ApproveJob. Callers may pass nil for either to disable the check
// (useful for tests that don't exercise the approval path); when nil, the
// service logs a warning on the first approval attempt and permits the
// transition. Production wiring must supply both.
func NewJobService(
jobRepo repository.JobRepository,
certRepo repository.CertificateRepository,
ownerRepo repository.OwnerRepository,
renewalService *RenewalService,
deploymentService *DeploymentService,
logger *slog.Logger,
) *JobService {
return &JobService{
jobRepo: jobRepo,
certRepo: certRepo,
ownerRepo: ownerRepo,
renewalService: renewalService,
deploymentService: deploymentService,
logger: logger,
@@ -264,7 +285,13 @@ func (s *JobService) GetJob(ctx context.Context, id string) (*domain.Job, error)
// ApproveJob approves a renewal job that is awaiting approval.
// Transitions the job from AwaitingApproval to Pending so the scheduler picks it up.
func (s *JobService) ApproveJob(ctx context.Context, id string) error {
//
// actor is the named-key identity of the approver (from the auth middleware
// via resolveActor). M-003: if actor matches the certificate owner's Name or
// Email (case-insensitive), returns ErrSelfApproval to enforce separation of
// duties. Callers must pass a non-empty actor; empty actor is treated as an
// anonymous system caller and permitted (internal/system paths).
func (s *JobService) ApproveJob(ctx context.Context, id, actor string) error {
job, err := s.jobRepo.Get(ctx, id)
if err != nil {
return fmt.Errorf("job not found: %w", err)
@@ -274,17 +301,29 @@ func (s *JobService) ApproveJob(ctx context.Context, id string) error {
return fmt.Errorf("cannot approve job with status %s (must be AwaitingApproval)", job.Status)
}
if err := s.checkNotSelf(ctx, job, actor); err != nil {
return err
}
if err := s.jobRepo.UpdateStatus(ctx, id, domain.JobStatusPending, ""); err != nil {
return fmt.Errorf("failed to approve job: %w", err)
}
s.logger.Info("renewal job approved", "job_id", id, "certificate_id", job.CertificateID)
s.logger.Info("renewal job approved",
"job_id", id,
"certificate_id", job.CertificateID,
"actor", actor)
return nil
}
// RejectJob rejects a renewal job that is awaiting approval.
// Transitions the job to Cancelled with a rejection reason.
func (s *JobService) RejectJob(ctx context.Context, id string, reason string) error {
//
// actor is the named-key identity of the rejector (from the auth middleware
// via resolveActor). Rejection is NOT subject to the not-self check — an
// owner is permitted to cancel their own pending renewal. actor is recorded
// on the log line for audit attribution.
func (s *JobService) RejectJob(ctx context.Context, id, reason, actor string) error {
job, err := s.jobRepo.Get(ctx, id)
if err != nil {
return fmt.Errorf("job not found: %w", err)
@@ -303,6 +342,67 @@ func (s *JobService) RejectJob(ctx context.Context, id string, reason string) er
return fmt.Errorf("failed to reject job: %w", err)
}
s.logger.Info("renewal job rejected", "job_id", id, "certificate_id", job.CertificateID, "reason", reason)
s.logger.Info("renewal job rejected",
"job_id", id,
"certificate_id", job.CertificateID,
"reason", reason,
"actor", actor)
return nil
}
// checkNotSelf enforces the M-003 separation-of-duties rule for renewal
// approval: the actor approving a job may not be the owner of the underlying
// certificate.
//
// Resolution rules:
// - Empty actor → permitted (internal/system caller; auth middleware already
// short-circuits anonymous users at the handler layer).
// - certRepo or ownerRepo nil → warn once, permit (test/bootstrap wiring).
// - Job has no certificate or certificate has no OwnerID → permitted (no
// owner to collide with).
// - Owner record not found → warn, permit (defensive: stale FK should not
// block operations).
// - Case-insensitive match against owner.Name OR owner.Email → returns
// ErrSelfApproval.
func (s *JobService) checkNotSelf(ctx context.Context, job *domain.Job, actor string) error {
if actor == "" {
return nil
}
if s.certRepo == nil || s.ownerRepo == nil {
s.logger.Warn("not-self approval check skipped: cert/owner repo not wired",
"job_id", job.ID, "actor", actor)
return nil
}
if job.CertificateID == "" {
return nil
}
cert, err := s.certRepo.Get(ctx, job.CertificateID)
if err != nil {
s.logger.Warn("not-self approval check: certificate lookup failed",
"job_id", job.ID, "certificate_id", job.CertificateID, "error", err)
return nil
}
if cert == nil || cert.OwnerID == "" {
return nil
}
owner, err := s.ownerRepo.Get(ctx, cert.OwnerID)
if err != nil || owner == nil {
s.logger.Warn("not-self approval check: owner lookup failed",
"job_id", job.ID, "owner_id", cert.OwnerID, "error", err)
return nil
}
actorLower := strings.ToLower(actor)
if strings.ToLower(owner.Name) == actorLower || strings.ToLower(owner.Email) == actorLower {
s.logger.Warn("self-approval blocked",
"job_id", job.ID,
"certificate_id", job.CertificateID,
"owner_id", owner.ID,
"actor", actor)
return ErrSelfApproval
}
return nil
}