Commit Graph

12 Commits

Author SHA1 Message Date
shankar0123 0e29c416b1 refactor(handler,repo): replace strings.Contains error dispatch with typed sentinels (S-2)
Closes one 2026-04-24 audit finding (P2):

  - cat-s6-efc7f6f6bd50: 30 strings.Contains(err.Error(), ...) sites
    in internal/api/handler/ — brittle to repository-layer message
    changes, untyped against the actual failure mode.

Approach (Option B from prompt design notes):
  - New typed sentinels in internal/repository/errors.go:
      ErrNotFound, ErrForeignKeyConstraint
      IsForeignKeyError(err) helper (the only place substring
      matching at the lib/pq boundary is allowed; isolates the
      DB-driver string knowledge to one function).
  - New typed sentinel in internal/domain/errors.go:
      ErrValidation (reserved for future per-entity validation
      wrappers; not yet used by all handlers).
  - 49 sites in internal/repository/postgres/*.go updated to wrap
    sql.ErrNoRows-derived errors via fmt.Errorf("...: %w",
    repository.ErrNotFound).
  - 18 not-found handler sites + 2 FK-constraint handler sites
    refactored to errors.Is(err, repository.ErrNotFound) /
    repository.IsForeignKeyError(err).
  - 23 inline `fmt.Errorf("X not found")` test fixtures across
    handler tests rewrapped to wrap repository.ErrNotFound.
  - test_utils.go::ErrMockNotFound rewrapped to wrap
    repository.ErrNotFound; renewal_policy.go closure docblock
    updated to reflect the new convention.
  - integration test mockJobRepository.Get wraps repository.ErrNotFound.

CI regression guardrail:
- .github/workflows/ci.yml::"Forbidden strings.Contains(err.Error())
  regression guard (S-2)" greps for the three patterns ("not found",
  "violates foreign key", "RESTRICT") under internal/api/handler/
  and fails the build on regression.

Verification:
- go build ./... — clean
- go vet ./... — clean
- go test ./... -short -count=1 — all packages pass (handler +
  repository + service + integration)
- golangci-lint v2.11.4 run ./... — 0 issues
- S-2 guardrail dry-run on post-fix tree → empty (good)
- All sibling guardrails (S-1, G-3, D-1+D-2, B-1, L-1, H-1, C-1, F-1, P-1) pass

Audit findings closed:
- cat-s6-efc7f6f6bd50 (P2)

Deferred follow-ups:
- 6 domain-specific substring patterns still inline in handlers
  ("cannot approve", "cannot reject", "cannot be parsed",
  "no certificates found", "challenge password", "invalid"/
  "required" validation chains in profiles + agent_groups). Each
  needs its own typed sentinel, scoped per service. Documented
  by the S-2 CI guardrail's allowlist for closure-comments only.
- Per-entity not-found sentinels (Option A — ErrCertificateNotFound,
  ErrAgentNotFound, etc.) deferred. Generic ErrNotFound covers the
  current dispatch needs; per-entity precision would let handlers
  return entity-aware error bodies without a domain.Type field,
  but not blocking.
2026-04-25 17:54:14 +00:00
shankar0123 3287e174dc Unify API auth + RFC-compliant CRL/OCSP (M-002 + M-003 + M-006, auto-closes M-001)
Closes the remaining P1 gaps from coverage-gap-audit.md (M-001/M-002/M-003/M-006)
on top of the C-001/C-002 ownership + agent-FK contract fixes landed in
a53a4b8. The work lands as a single commit spanning server, docs, tests,
and the React client.

M-002 — Named API keys with per-key actor propagation
  * Migration 000014 adds the 'api_keys' table (id, name, hash,
    principal, role, created_at, last_used_at, disabled_at) so every
    credential carries an identifiable principal instead of the
    opaque 'anonymous'/'api-key' sentinel.
  * Auth middleware now rotates through configured keys, performs
    constant-time hash comparison, stamps 'last_used_at', and emits
    an actor struct via contextWithActor(). The audit middleware,
    bulk-revocation handler, approval handlers, and MCP tool layer
    now read the principal off the context and persist it on every
    audit_events row.
  * Regression coverage:
      - internal/api/middleware/audit_test.go — actor propagation,
        principal redaction for disabled keys, anonymous fallback for
        unauthenticated endpoints.
      - internal/api/handler/bulk_revocation_handler_test.go,
        job_handler_test.go — principal-on-audit assertions.

M-003 — Authorization gates (Phase B)
  * Approval handler rejects self-approval / self-rejection with 403
    when the actor principal equals the job's requested_by field.
  * Bulk revocation is gated behind the 'admin' role; operators and
    viewers receive 403.
  * Regression coverage:
      - internal/service/job_test.go — TestApproveJob_NotSelf,
        TestRejectJob_NotSelf.
      - internal/api/handler/bulk_revocation_handler_test.go —
        TestBulkRevoke_RequiresAdmin, TestBulkRevoke_AdminSucceeds.

M-006 — RFC-compliant CRL/OCSP on the unauthenticated .well-known mux
  * Per RFC 8615, relying parties cannot reasonably be asked to
    authenticate against the issuing certctl instance to retrieve
    revocation material. CRL and OCSP move off the authenticated
    '/api/v1/crl*' and '/api/v1/ocsp/*' paths onto:
        GET /.well-known/pki/crl/{issuer_id}
            Content-Type: application/pkix-crl   (RFC 5280 §5)
        GET /.well-known/pki/ocsp/{issuer_id}/{serial}
            Content-Type: application/ocsp-response  (RFC 6960)
  * Non-standard JSON CRL shape is removed; only DER is served.
  * Short-lived certificate exemption (profile TTL < 1h → skip
    CRL/OCSP) is preserved; the response simply omits the serial.
  * Routes are registered on the unauthenticated 'finalHandler' mux
    in cmd/server/main.go alongside EST ('/.well-known/est/*') and
    SCEP ('/scep'). Legacy authenticated paths return 404.
  * Regression coverage:
      - internal/api/handler/certificate_handler_test.go — content
        type, DER parseability, 404 for unknown issuer.
      - internal/api/handler/adversarial_path_test.go — unauthenticated
        access asserted for CRL, OCSP, EST, SCEP.
      - internal/api/router/router_test.go — route-table assertion
        that '.well-known/pki/*', '.well-known/est/*', and '/scep' are
        mounted on the unauthenticated branch.

M-001 — Auto-closed by M-002
  EST and SCEP were already registered on the unauthenticated
  'finalHandler' mux; the router comment at
  internal/api/router/router.go:247 now matches reality. The
  adversarial-path tests above lock the behavior in.

Verification (all gates green):
  * go vet ./...                                           — clean
  * go build ./...                                         — ok
  * go test -short ./... (55+ packages)                    — all pass
  * web/ : npm test (225 Vitest tests)                     — all pass
  * web/ : npx tsc --noEmit                                — clean
  * grep sweep for '/api/v1/(crl|ocsp)' — 13 surviving hits,
    all intentional M-006 tombstone/relocation comments.

Documentation:
  * coverage-gap-audit.md — status flips M-001/M-002/M-003/M-006 →
    Fixed, with per-finding resolution paragraphs citing regression
    test IDs. (Audit file lives outside this repo; see cowork root.)
  * CLAUDE.md Project Status line updated with the auth-unification
    closure note.
  * docs/features.md, docs/architecture.md, docs/quickstart.md,
    docs/concepts.md, docs/connectors.md, docs/test-env.md,
    docs/testing-guide.md, docs/compliance-*.md, docs/demo-advanced.md
    — refreshed for the new '.well-known/pki/*' namespace and named
    API keys.
  * api/openapi.yaml — documents the new unauthenticated endpoints
    and removes the legacy '/api/v1/crl*' + '/api/v1/ocsp/*' paths.

.gitignore: adds '/.gocache/' and '/.gomodcache/' for the session-
scoped Go caches so they never enter the tree.
2026-04-18 18:17:41 +00:00
shankar0123 cdc9d03d5b fix(m-2): thread context through CertificateService cluster
Collapses CertificateService, RevocationSvc, and CAOperationsSvc to
ctx-accepting method signatures. Removes context.Background() synthesis
at 24 internal call sites across certificate.go, revocation_svc.go, and
ca_operations.go.

- Primary repo calls inherit request cancellation via the passed ctx.
- Audit and notification dispatches use context.WithoutCancel(ctx) so
  they survive client disconnect.
- Collapses TriggerRenewal/TriggerRenewalWithActor,
  TriggerDeployment/TriggerDeploymentWithActor, and
  RevokeCertificate/RevokeCertificateWithActor sibling pairs into single
  canonical ctx-accepting methods (decisions D-1, D-2).

Handlers pass r.Context(). Mocks and tests updated to match new
signatures. No HTTP surface change, no OpenAPI change.

PR 1 of 6 in the M-2 remediation chain. Master green at this commit.

Refs: certctl-audit-report.md M-2 (L143, L224)
2026-04-18 00:29:37 +00:00
shankar0123 836534f2a7 feat: add issuer catalog page with type discovery + fix cert creation defaults (M33)
Issuer Catalog (M33):
- Shared issuer type config (issuerTypes.ts) with 6 supported + 2 coming-soon types
- Composable wizard components (TypeSelector, ConfigForm, ConfigDetailModal)
- Catalog card layout with Connected/Available/Coming Soon badges
- VaultPKI and DigiCert added to create wizard with full config fields
- ACME EAB fields (eab_kid, eab_hmac with sensitive flag)
- Issuer type filter dropdown on configured issuers table
- Config detail modal replacing 60-char truncation
- IssuerDetailPage uses shared typeLabels/redactConfig, Edit button, enabled/disabled status
- StatusBadge extended with Enabled/Disabled styles
- 2 new frontend tests (VaultPKI + DigiCert create payload verification)

Bug fixes:
- CertificateService.CreateCertificate now defaults Status to Pending and Tags to
  empty map when not set (DB column DEFAULTs only apply when columns are omitted
  from INSERT, but our repo always includes all columns)
- CreateCertificate handler now logs actual error via slog.Error before returning
  generic 500, enabling root cause debugging

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-30 18:58:23 -04:00
shankar0123 a8fc177118 fix: resolve NULL csr_pem scan errors and QA smoke test failures
Root cause: certificate_versions.csr_pem is nullable in the schema but
Go code scanned it into a plain string. Used sql.NullString in
ListVersions and GetLatestVersion to handle NULL values correctly.

Also includes: partial update fetch-merge-update pattern to prevent FK
violations, nil directory guard in discovery service, diagnostic slog
logging in handlers, export handler 422 for unparseable PEM, OpenAPI
spec corrections, MCP tool description improvements, and test fixes.

Rewrites the Release Sign-Off section in testing-guide.md to individual
test-level granularity (320 rows) with smoke test results audited and
checked off (121 pass, 5 skip, 194 manual remaining).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-30 00:51:18 -04:00
shankar0123 e078a686bf feat: M20 Enhanced Query API — sort, time-range filters, cursor pagination, sparse fields, deployments endpoint
V2 (free) query enhancements for certificates:
- `sort` param with direction (`?sort=-notAfter` for descending)
- Time-range filters: `expires_before`, `expires_after`, `created_after`, `updated_after`
- Cursor-based pagination (`?cursor=token&page_size=100`) alongside page-based
- Sparse field selection (`?fields=id,commonName,status`)
- Additional filters: `agent_id`, `profile_id`
- New endpoint: `GET /api/v1/certificates/{id}/deployments`

25 new tests (12 handler + 13 e2e) covering all M20 features.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-23 18:56:02 -04:00
shankar0123 43a03c168c fix: Go 1.25 upgrade, codebase audit fixes, MCP server tests
Upgrade from Go 1.22 to 1.25 (minimum for MCP SDK, actively supported).
CI updated to match.

Codebase audit fixes:
- Local CA parseIP() now uses net.ParseIP — IP SANs no longer silently dropped
- Nil pointer guards in agent.go GetWorkWithTargets for target/cert enrichment
- MCP CreateCertificateInput marks owner_id/team_id as required
- NGINX connector uses CombinedOutput() — captures diagnostic output on failure
- Jobs handler validates JSON decode on rejection body — returns 400 on malformed
- CRL/OCSP handlers propagate requestID for error tracing

MCP server tests (26 tests):
- client_test.go: HTTP client coverage (GET/POST/PUT/DELETE, auth, 204, errors, binary)
- tools_test.go: tool registration, pagination, end-to-end flows with mock API

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-23 17:36:25 -04:00
shankar0123 28bef63569 fix: handle 'not found' errors as 404 in CRL/OCSP handlers
The error routing only checked for "issuer not found" but not
"certificate not found", causing cert-not-found errors to fall
through to a generic 500. Broadened the check to match any
"not found" error string for both CRL and OCSP handlers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-22 19:06:37 -04:00
shankar0123 762c523d59 feat: M15b — OCSP responder, DER CRL, short-lived exemption, revocation GUI
Backend:
- Embedded OCSP responder: GET /api/v1/ocsp/{issuer_id}/{serial} returns
  signed OCSP responses (good/revoked/unknown) using CA key
- DER-encoded X.509 CRL: GET /api/v1/crl/{issuer_id} returns proper DER CRL
  signed by issuing CA with 24h validity window
- Short-lived cert exemption: certs with profile TTL < 1 hour skip CRL/OCSP
  (expiry is sufficient revocation for ephemeral workloads)
- Extended issuer connector interface with GenerateCRL and SignOCSPResponse
- Local CA implements full CRL/OCSP signing; ACME and step-ca return
  appropriate "use native endpoint" errors
- IssuerConnectorAdapter bridges new methods between layers

Frontend:
- Revoke button on certificate detail page with RFC 5280 reason modal
- Revocation banner with reason display and timestamp
- Revocation status indicators in lifecycle section
- "Revoked" filter option in certificates list
- API client: revokeCertificate() function and Certificate type extensions

Tests: ~31 new tests across connector, service, handler, and adapter layers
Docs: milestones renumbered (M13-M14, M16-M18), M15b marked complete

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-22 14:39:10 -04:00
shankar0123 5d98e373e3 feat: M15a — certificate revocation API, CRL endpoint, and revocation notifications
Implements core revocation infrastructure: POST /api/v1/certificates/{id}/revoke
with all 8 RFC 5280 reason codes, JSON-formatted CRL at GET /api/v1/crl, webhook
and email revocation notifications, best-effort issuer notification, and immutable
revocation audit trail. Includes 48 new tests across service, handler, integration,
and domain layers (600+ total). Fixes 3 pre-existing test bugs (team_test error
matching, agent_group delete status code, team handler per_page validation).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-22 10:59:18 -04:00
shankar0123 9e6756d02f Implement M5: hardening, input validation, and Vite+React+TS dashboard
Backend hardening:
- Fix 6 nginx.go non-constant format string build errors
- Add validation.go with hostname, PEM, and enum validators
- Apply input validation to all POST/PUT handlers (certificates,
  agents, CSR, policies, teams, owners, targets, issuers)
- Fix unchecked JSON decode in TriggerDeployment handler

Frontend (Vite + React + TypeScript):
- Migrate from single-file SPA to proper build pipeline
- 7 pages: Dashboard, Certificates (list+detail), Agents, Jobs,
  Notifications, Policies, Audit Trail
- TanStack Query for server state with auto-refetch intervals
- Certificate detail with version history and renewal trigger
- Job cancellation, status/type filtering, expiry countdowns
- Reusable components: DataTable, StatusBadge, ErrorState, PageHeader
- Dark theme with Tailwind CSS, sidebar nav via React Router

Server integration:
- Go server serves web/dist/ (Vite output) with SPA fallback
- Falls back to web/index.html for legacy mode
- .gitignore updated for web/node_modules/ and web/dist/

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-15 01:19:19 -04:00
shankar0123 d395776a95 Initial scaffold: certificate control plane v0.1.0 2026-03-14 08:22:17 -04:00