Commit Graph

12 Commits

Author SHA1 Message Date
shankar0123 b059ec930f fix: end-to-end certificate lifecycle bugs + integration test environment
Fixes 12 production bugs preventing the full issuance→deployment flow
from working with ACME (Pebble/Let's Encrypt) and step-ca issuers:

ACME connector (acme.go):
- Save orderURI before WaitOrder overwrites it (Go crypto/acme bug)
- Add CreateOrderCert fallback via WaitOrder+FetchCert
- Remove defer-reset in ValidateConfig that caused nil pointer panic
- Add Insecure TLS option for self-signed ACME servers (Pebble)

step-ca connector (stepca.go, jwe.go):
- Real JWE provisioner key loading + decryption (was using ephemeral keys)
- Fix JWT audience (/1.0/sign), sha claim (key fingerprint), kid header
- Custom root CA trust via RootCertPath config
- Remove hardcoded 90-day validity default (let step-ca decide)

NGINX target connector (nginx.go):
- Use sh -c for validate/reload commands (shell interpretation)
- Use filepath.Dir instead of fragile string slicing
- Add private key file writing (agent-mode keys were never deployed)
- Make chain_path write conditional

Server/service layer:
- TriggerRenewalWithActor now creates actual Job records (was no-op)
- createDeploymentJobs falls back to DB query when cert.TargetIDs empty
- ProcessPendingJobs skips agent-routed deployment jobs
- Agent cert pickup path parsing: len(parts)<4 → len(parts)<3
- Health/ready/auth-info endpoints bypass auth middleware
- Write timeout 15s→120s for ACME issuance
- Cert fingerprint computed on CSR submission

Integration test environment (deploy/test/):
- 10-phase test script covering Local CA, ACME, step-ca, revocation,
  discovery, renewal, and API spot checks
- Docker Compose with 7 containers (server, agent, postgres, nginx,
  pebble, challtestsrv, step-ca) on isolated network
- TLS verification checks SAN (not just Subject CN) for modern CA compat

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-02 17:02:20 -04:00
shankar0123 836534f2a7 feat: add issuer catalog page with type discovery + fix cert creation defaults (M33)
Issuer Catalog (M33):
- Shared issuer type config (issuerTypes.ts) with 6 supported + 2 coming-soon types
- Composable wizard components (TypeSelector, ConfigForm, ConfigDetailModal)
- Catalog card layout with Connected/Available/Coming Soon badges
- VaultPKI and DigiCert added to create wizard with full config fields
- ACME EAB fields (eab_kid, eab_hmac with sensitive flag)
- Issuer type filter dropdown on configured issuers table
- Config detail modal replacing 60-char truncation
- IssuerDetailPage uses shared typeLabels/redactConfig, Edit button, enabled/disabled status
- StatusBadge extended with Enabled/Disabled styles
- 2 new frontend tests (VaultPKI + DigiCert create payload verification)

Bug fixes:
- CertificateService.CreateCertificate now defaults Status to Pending and Tags to
  empty map when not set (DB column DEFAULTs only apply when columns are omitted
  from INSERT, but our repo always includes all columns)
- CreateCertificate handler now logs actual error via slog.Error before returning
  generic 500, enabling root cause debugging

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-30 18:58:23 -04:00
shankar0123 a8fc177118 fix: resolve NULL csr_pem scan errors and QA smoke test failures
Root cause: certificate_versions.csr_pem is nullable in the schema but
Go code scanned it into a plain string. Used sql.NullString in
ListVersions and GetLatestVersion to handle NULL values correctly.

Also includes: partial update fetch-merge-update pattern to prevent FK
violations, nil directory guard in discovery service, diagnostic slog
logging in handlers, export handler 422 for unparseable PEM, OpenAPI
spec corrections, MCP tool description improvements, and test fixes.

Rewrites the Release Sign-Off section in testing-guide.md to individual
test-level granularity (320 rows) with smoke test results audited and
checked off (121 pass, 5 skip, 194 manual remaining).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-30 00:51:18 -04:00
shankar0123 de9264baf7 docs: synchronize project documentation with codebase
Implements 3 deferred security tickets (TICKET-003, TICKET-007, TICKET-010)
and performs comprehensive documentation audit to eliminate drift between
code and docs.

Code changes:
- TICKET-003: Repository integration tests with testcontainers-go (50+ subtests)
- TICKET-007: CertificateService decomposition into RevocationSvc + CAOperationsSvc
- TICKET-010: Request body size limits via http.MaxBytesReader middleware
- Fix missing slog import in certificate.go after service decomposition

Documentation updates:
- README: Fix endpoint count (97→93), expand env var reference (15→39 vars)
- CLAUDE.md: Fix OpenAPI operation count (85→93), update file locations
- architecture.md: Add body size limits section, middleware chain ordering
- CONTRIBUTING.md: New contributor guide with architecture conventions,
  test patterns, middleware ordering, CI thresholds
- SECURITY_REMEDIATION.md: Removed from repo (moved to cowork, gitignored)
- Test files: Add doc comments to all new test files

Documentation that should exist but doesn't yet:
- Architecture diagrams (C4 model or similar)
- Threat model document
- Testing philosophy guide
- Disaster recovery runbook
- Upgrade guide (migration between versions)
- API versioning strategy document

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-27 22:28:54 -04:00
shankar0123 e078a686bf feat: M20 Enhanced Query API — sort, time-range filters, cursor pagination, sparse fields, deployments endpoint
V2 (free) query enhancements for certificates:
- `sort` param with direction (`?sort=-notAfter` for descending)
- Time-range filters: `expires_before`, `expires_after`, `created_after`, `updated_after`
- Cursor-based pagination (`?cursor=token&page_size=100`) alongside page-based
- Sparse field selection (`?fields=id,commonName,status`)
- Additional filters: `agent_id`, `profile_id`
- New endpoint: `GET /api/v1/certificates/{id}/deployments`

25 new tests (12 handler + 13 e2e) covering all M20 features.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-23 18:56:02 -04:00
shankar0123 762c523d59 feat: M15b — OCSP responder, DER CRL, short-lived exemption, revocation GUI
Backend:
- Embedded OCSP responder: GET /api/v1/ocsp/{issuer_id}/{serial} returns
  signed OCSP responses (good/revoked/unknown) using CA key
- DER-encoded X.509 CRL: GET /api/v1/crl/{issuer_id} returns proper DER CRL
  signed by issuing CA with 24h validity window
- Short-lived cert exemption: certs with profile TTL < 1 hour skip CRL/OCSP
  (expiry is sufficient revocation for ephemeral workloads)
- Extended issuer connector interface with GenerateCRL and SignOCSPResponse
- Local CA implements full CRL/OCSP signing; ACME and step-ca return
  appropriate "use native endpoint" errors
- IssuerConnectorAdapter bridges new methods between layers

Frontend:
- Revoke button on certificate detail page with RFC 5280 reason modal
- Revocation banner with reason display and timestamp
- Revocation status indicators in lifecycle section
- "Revoked" filter option in certificates list
- API client: revokeCertificate() function and Certificate type extensions

Tests: ~31 new tests across connector, service, handler, and adapter layers
Docs: milestones renumbered (M13-M14, M16-M18), M15b marked complete

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-22 14:39:10 -04:00
shankar0123 5d98e373e3 feat: M15a — certificate revocation API, CRL endpoint, and revocation notifications
Implements core revocation infrastructure: POST /api/v1/certificates/{id}/revoke
with all 8 RFC 5280 reason codes, JSON-formatted CRL at GET /api/v1/crl, webhook
and email revocation notifications, best-effort issuer notification, and immutable
revocation audit trail. Includes 48 new tests across service, handler, integration,
and domain layers (600+ total). Fixes 3 pre-existing test bugs (team_test error
matching, agent_group delete status code, team handler per_page validation).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-22 10:59:18 -04:00
shankar0123 e03a75ed9a fix: replace fmt.Printf with structured slog logging across all services
All 10 service files now use slog.Error for failure logging instead of
fmt.Printf. Audit event recording errors are checked and logged rather
than silently discarded. Adds consistent structured context (resource IDs,
operation names) to all error log statements.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 01:20:03 -04:00
shankar0123 66f04f7afe style: run gofmt -s across all Go files
Fixes Go Report Card gofmt score from 52% to 100%.
Pure formatting changes — no logic modifications.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 19:32:29 -04:00
shankar0123 3f6b0aa995 Fix Create methods: respect user-provided IDs and set timestamps
All service-layer Create methods (team, owner, target, issuer,
certificate) were unconditionally overwriting user-provided IDs with
auto-generated ones and leaving CreatedAt/UpdatedAt as zero values.

This caused three user-visible bugs:
- POST /api/v1/teams with {"id": "t-demo"} returned a generated ID
  like "team-1773601137949154216" instead of "t-demo"
- POST /api/v1/owners referencing the user-provided team_id failed
  with Internal Server Error (FK constraint on non-existent generated ID)
- created_at/updated_at came back as "0001-01-01T00:00:00Z"

Fix: all 9 affected Create methods (both context-aware and handler
interface variants) now check if ID is empty before generating, and
set timestamps to time.Now() if zero-valued. Follows the existing
correct pattern in policy.go CreateRule/CreatePolicy.

Also removes two stale temp files (audit.go.* and issuer.go.*) that
were accidentally committed to the repo.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-15 14:52:48 -04:00
shankar0123 9b4122b159 Fix runtime bugs, implement service layer, and overhaul documentation
Runtime fixes:
- Fix env var mismatch (CERTCTL_DB_URL → CERTCTL_DATABASE_URL)
- Fix table name mismatches (certificates → managed_certificates, notifications → notification_events)
- Add renewal_policy_id to certificate queries
- Remove non-existent created_at from notification queries
- Add env var fallback for agent CLI flags
- Graceful degradation for missing notifiers/issuers in demo mode
- Copy web/ directory in Dockerfile for dashboard serving

Service layer:
- Implement handler-service interface pattern across all services
- Wire up certificate, agent, job, policy, team, owner, audit, notification services

Documentation:
- Add concepts.md: beginner-friendly guide to TLS, CAs, private keys
- Rewrite quickstart.md with accurate API examples matching actual handlers
- Add demo-advanced.md: interactive demo with cert issuance and automated script
- Update architecture.md with correct table names and connector interfaces
- Update connectors.md to match actual Go interface signatures
- Update demo-guide.md with cross-references to new docs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-14 21:38:11 -04:00
shankar0123 d395776a95 Initial scaffold: certificate control plane v0.1.0 2026-03-14 08:22:17 -04:00