Compare commits

105 Commits

Author SHA1 Message Date
shankar0123 707d8de4fb UX-001: sidebar re-entry + inline team/owner creation in wizard
Closes UX-001 (OnboardingWizard CertificateStep dead-end): users no
longer have to navigate away from the wizard and lose their in-flight
state when the required Owner/Team dropdowns are empty.

Layout.tsx
  - Adds persistent 'Setup guide' button in the left sidebar.
  - Clears localStorage 'certctl:onboarding-dismissed' then navigates
    to /?onboarding=1 as a re-entry signal that overrides dismissal.
  - localStorage.removeItem wrapped in try/catch to tolerate storage
    access errors (private browsing, quota, etc.).

DashboardPage.tsx
  - Reads ?onboarding=1 via useSearchParams as a forceOnboarding flag.
  - forceOnboarding bypasses the latched first-run gate so the wizard
    reopens even after dismissal or with certs/issuers already present.
  - onDismiss now also strips ?onboarding=1 via setSearchParams(next,
    { replace: true }) so a page refresh does not relaunch the wizard.

OnboardingWizard.tsx
  - Adds CreateTeamModalInline and CreateOwnerModalInline inside
    CertificateStep. Both wire through React Query: createTeam /
    createOwner mutation on success invalidates ['teams'] / ['owners']
    and calls onCreated(id) so the parent select auto-selects the new
    row as soon as the refetch lands.
  - '+ New team' and '+ New owner' buttons placed next to the select
    labels; empty-state copy replaced with inline 'create one now'
    buttons (no more Link back to /owners /teams).
  - CreateOwner coerces empty teamId to undefined before mutation so
    the server contract matches OwnersPage.

Tests (12 new, all green; total suite 252 passed / 0 failed):
  - Layout.test.tsx (4): Setup guide button renders, clicking it clears
    the dismissal key and navigates to /?onboarding=1, tolerates
    localStorage.removeItem throwing.
  - DashboardPage.test.tsx (4): first-run auto-open, ?onboarding=1
    re-entry after dismissal, onDismiss writes localStorage + strips
    the query param, dismissed-with-no-param stays closed.
  - OnboardingWizard.test.tsx (4): Skip-Skip reaches CertificateStep
    with '+ New team' / '+ New owner' buttons visible; '+ New team'
    happy path with React Query invalidation + parent-select
    auto-select via option-parent traversal (label is a sibling, not
    htmlFor-linked); '+ New owner' happy path pins team_id: undefined
    coercion; Cancel abort never mutates.

Test infrastructure notes:
  - Closure-driven vi.fn().mockImplementation pattern drives the
    post-invalidation refetch: the mutation mock mutates a closure
    variable that the getTeams/getOwners mock reads, so the parent
    select's new <option> exists by the time the refetch lands.
  - Anchored regex (/^Create Team$/, /^Create Owner$/) disambiguates
    the modal submit from the '+ New team' / '+ New owner' triggers.

Verification gates (all green):
  - vitest run: 252 passed / 0 failed (8 files, 13.98s)
  - tsc --noEmit: 0 errors
  - vite build: clean production bundle (851.77 kB js / 226.81 kB gzip)

No new runtime dependencies. Frontend-only change.
2026-04-19 14:49:04 +00:00
shankar0123 0725713e19 Close I-004 (agent hard-delete cascades targets) coverage-gap finding
Operator decision answered as full soft-delete with optional forced
cascade — hard-delete is not reachable from any public surface. Prior
to this commit, DELETE /agents/{id} ran a plain `DELETE FROM agents`
whose schema-level `ON DELETE CASCADE` on deployment_targets.agent_id
silently wiped every target, orphaning certs and aborting in-flight
jobs. The finding closure reshapes the agent-removal contract around
soft retirement with explicit preflight counts, an opt-in cascade
gated by a mandatory reason, and unconditional protection for the
four reserved sentinel agents used by discovery sources.

Schema — migration 000015:
  migrations/000015_agent_retire.up.sql flips
  deployment_targets_agent_id_fkey from ON DELETE CASCADE to ON DELETE
  RESTRICT, so a stray `DELETE FROM agents` now errors at the DB
  boundary instead of quietly destroying targets. Both `agents` and
  `deployment_targets` grow a retired_at TIMESTAMPTZ + retired_reason
  TEXT pair (TEXT not VARCHAR so operator comments are never
  truncated), indexed via partial indexes WHERE retired_at IS NOT
  NULL. The migration is self-healing (ADD COLUMN IF NOT EXISTS, DROP
  CONSTRAINT IF EXISTS then ADD CONSTRAINT, CREATE INDEX IF NOT
  EXISTS) so repeated runs against partially-migrated databases
  converge. migrations/000015_agent_retire.down.sql restores CASCADE
  and drops the new columns for clean rollback. A dedicated
  repository-layer testcontainers test
  (internal/repository/postgres/migration_000015_test.go) asserts the
  before/after FK action, column presence, index presence, and
  round-trip idempotency under up→down→up.

Domain — sentinel guard + dependency counts:
  internal/domain/connector.go gains IsRetired() on Agent, the
  exported SentinelAgentIDs slice listing server-scanner,
  cloud-aws-sm, cloud-azure-kv, cloud-gcp-sm verbatim (matching the
  four reserved IDs documented in CLAUDE.md and created at startup in
  cmd/server/main.go), IsSentinelAgent(id string) predicate,
  AgentDependencyCounts{ActiveTargets, ActiveCertificates,
  PendingJobs} with a HasDependencies() method, and ActorTypeAgent /
  ActorTypeSystem enum values used by audit emission downstream.
  Coverage locked down by internal/domain/connector_test.go.
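
  A minimal sketch of the domain helpers described above. The four IDs and
  the identifier names come from this message; the integer field types and
  the linear scan are assumptions, not the actual connector.go code.

```go
package main

import "fmt"

// SentinelAgentIDs lists the four reserved agent IDs named in the commit message.
var SentinelAgentIDs = []string{
	"server-scanner",
	"cloud-aws-sm",
	"cloud-azure-kv",
	"cloud-gcp-sm",
}

// IsSentinelAgent reports whether id is one of the reserved sentinel agents.
func IsSentinelAgent(id string) bool {
	for _, s := range SentinelAgentIDs {
		if id == s {
			return true
		}
	}
	return false
}

// AgentDependencyCounts holds the three preflight buckets.
// The int field types are an assumption made for this sketch.
type AgentDependencyCounts struct {
	ActiveTargets      int
	ActiveCertificates int
	PendingJobs        int
}

// HasDependencies is true when any bucket is non-zero.
func (c AgentDependencyCounts) HasDependencies() bool {
	return c.ActiveTargets > 0 || c.ActiveCertificates > 0 || c.PendingJobs > 0
}

func main() {
	fmt.Println(IsSentinelAgent("cloud-aws-sm"))                         // true
	fmt.Println(IsSentinelAgent("agent-42"))                             // false
	fmt.Println(AgentDependencyCounts{PendingJobs: 1}.HasDependencies()) // true
}
```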

Service — 8-step ordered contract:
  internal/service/agent_retire.go:RetireAgent(ctx, id, actor,
  opts{Force, Reason}) enforces a fixed execution order:
  (1) sentinel guard — IsSentinelAgent(id) returns ErrAgentIsSentinel
      unconditionally; force=true does NOT bypass it.
  (2) fetch — ErrAgentNotFound on miss.
  (3) idempotency — if IsRetired() already, return
      AgentRetirementResult{AlreadyRetired: true} with no new audit
      event and no state change (safe to replay from flaky clients).
  (4) preflight counts — collectAgentDependencyCounts runs
      ActiveTargets, ActiveCertificates, PendingJobs sequentially
      (not in parallel; keeps the per-query timeout predictable and
      matches the repo's existing call-chain shape).
  (5) force-reason guard — opts.Force=true with empty Reason returns
      ErrForceReasonRequired (wired into the 400 status surface).
  (6) dependency guard — HasDependencies() with opts.Force=false
      returns BlockedByDependenciesError{Counts} (wired into the 409
      body with per-bucket counts).
  (7) mutation — single pinned retiredAt := time.Now(); agent
      retirement first, then cascade target retirement if opts.Force,
      all under the repo's single transaction so the two retired_at
      stamps match to the second.
  (8) best-effort audit — agent_retired always; agent_retirement_
      cascaded additionally on the force path. Actor is whatever the
      handler resolves from the request; actor type is mapped by
      resolveActorType (system/agent-prefix→Agent/else→User). Audit
      emission failures are logged via slog.Error but do not abort
      the retirement (matches the house convention used by every
      other scheduler-emitted event).
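
  The guard ordering in steps (1) to (6) can be sketched as a pure decision
  function. The error names come from this message; agentState, retireOpts,
  checkRetire, and the error texts are hypothetical stand-ins. The real
  RetireAgent fetches the agent and counts from the repository rather than
  receiving them precomputed.

```go
package main

import (
	"errors"
	"fmt"
)

// Error names are from the commit message; the message texts are assumed.
var (
	ErrAgentIsSentinel       = errors.New("agent is a reserved sentinel")
	ErrAgentNotFound         = errors.New("agent not found")
	ErrForceReasonRequired   = errors.New("force requires a reason")
	ErrBlockedByDependencies = errors.New("blocked by dependencies")
)

// agentState is a hypothetical snapshot of what steps 1-4 would discover.
type agentState struct {
	Sentinel bool // step 1: ID is a reserved sentinel
	Exists   bool // step 2: agent row was found
	Retired  bool // step 3: retired_at is already set
	HasDeps  bool // step 4: any preflight count is non-zero
}

// retireOpts mirrors the opts{Force, Reason} pair from the message.
type retireOpts struct {
	Force  bool
	Reason string
}

// checkRetire applies the guards in the documented order and reports
// whether the call is an idempotent replay.
func checkRetire(s agentState, o retireOpts) (alreadyRetired bool, err error) {
	switch {
	case s.Sentinel: // force=true does not bypass this guard
		return false, ErrAgentIsSentinel
	case !s.Exists:
		return false, ErrAgentNotFound
	case s.Retired:
		return true, nil
	case o.Force && o.Reason == "":
		return false, ErrForceReasonRequired
	case s.HasDeps && !o.Force:
		return false, ErrBlockedByDependencies
	}
	return false, nil // safe to proceed to mutation (step 7)
}

func main() {
	_, err := checkRetire(agentState{Sentinel: true, Exists: true},
		retireOpts{Force: true, Reason: "cleanup"})
	fmt.Println(err) // agent is a reserved sentinel

	replay, err := checkRetire(agentState{Exists: true, Retired: true}, retireOpts{})
	fmt.Println(replay, err) // true <nil>
}
```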

  BlockedByDependenciesError implements Error() as
  "active_targets=%d, active_certificates=%d, pending_jobs=%d" and
  Unwrap() → ErrBlockedByDependencies. The single struct satisfies
  errors.Is via Unwrap (used by scheduler-level tests) and errors.As
  via the concrete type (used by the handler to fish out Counts for
  the 409 body). ListRetiredAgents(page, perPage) adds a separate
  paginated accessor with page<1→1 and perPage<1→50 normalization so
  retired rows are queryable without polluting the default agent
  listing.
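
  A sketch of how one struct can serve both errors.Is and errors.As, as
  described above. The Error() format string and the Unwrap target are
  taken from this message; the pointer receiver, the int field types, and
  the helpers isBlocked, countsOf, and sampleBlocked are assumptions made
  for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrBlockedByDependencies is the sentinel that errors.Is matches against.
var ErrBlockedByDependencies = errors.New("blocked by dependencies")

type AgentDependencyCounts struct {
	ActiveTargets      int
	ActiveCertificates int
	PendingJobs        int
}

// BlockedByDependenciesError carries the per-bucket counts.
type BlockedByDependenciesError struct {
	Counts AgentDependencyCounts
}

func (e *BlockedByDependenciesError) Error() string {
	return fmt.Sprintf("active_targets=%d, active_certificates=%d, pending_jobs=%d",
		e.Counts.ActiveTargets, e.Counts.ActiveCertificates, e.Counts.PendingJobs)
}

// Unwrap lets errors.Is(err, ErrBlockedByDependencies) succeed.
func (e *BlockedByDependenciesError) Unwrap() error { return ErrBlockedByDependencies }

// isBlocked is the sentinel check a caller that only needs a yes/no would use.
func isBlocked(err error) bool { return errors.Is(err, ErrBlockedByDependencies) }

// countsOf extracts the concrete type, as a handler building a 409 body might.
func countsOf(err error) (AgentDependencyCounts, bool) {
	var blocked *BlockedByDependenciesError
	if errors.As(err, &blocked) {
		return blocked.Counts, true
	}
	return AgentDependencyCounts{}, false
}

// sampleBlocked wraps the error once more to show both checks see through wrapping.
func sampleBlocked() error {
	return fmt.Errorf("retire agent: %w", &BlockedByDependenciesError{
		Counts: AgentDependencyCounts{ActiveTargets: 2, ActiveCertificates: 5, PendingJobs: 1},
	})
}

func main() {
	err := sampleBlocked()
	fmt.Println(isBlocked(err)) // true
	counts, ok := countsOf(err)
	fmt.Println(counts.ActiveTargets, counts.ActiveCertificates, counts.PendingJobs, ok) // 2 5 1 true
}
```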

  Sentinel guard coverage is unconditional by design: all four reserved
  IDs are protected, and force=true cannot override. Regression tests
  in internal/service/agent_retire_test.go assert each of the eight
  steps in order, plus sentinel bypass attempts and idempotency
  replay.

Handler + router — status-code surface:
  internal/api/handler/agents.go:RetireAgent exposes seven status
  codes on DELETE /agents/{id}:
    200 on a fresh retirement (body echoes AgentRetirementResult).
    204 on idempotent replay (AlreadyRetired=true; no new audit).
    400 on ErrForceReasonRequired.
    403 on ErrAgentIsSentinel.
    404 on ErrAgentNotFound.
    409 on BlockedByDependenciesError, with a custom body shape
        {error, counts{active_targets, active_certificates,
        pending_jobs}} that bypasses the default ErrorWithRequestID
        envelope so callers get the per-bucket numbers directly.
    500 on any other error.
  Heartbeat HandleHeartbeat returns 410 Gone when the agent is
  retired (ErrAgentRetired), signalling the agent to shut down.
  Query params `force=true` and `reason=<text>` drive the cascade
  path; both are forwarded as url.Values through the new MCP
  transport.
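
  The 409 body shape described above could be produced along these lines.
  The JSON key names are from this message; the struct names, the helper
  renderBlocked, and the error text are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// blockedCounts mirrors the counts{...} object in the 409 body.
type blockedCounts struct {
	ActiveTargets      int `json:"active_targets"`
	ActiveCertificates int `json:"active_certificates"`
	PendingJobs        int `json:"pending_jobs"`
}

// blockedBody is the custom envelope; the real handler may carry more fields.
type blockedBody struct {
	Error  string        `json:"error"`
	Counts blockedCounts `json:"counts"`
}

// renderBlocked serializes the body; the error text is an assumed placeholder.
func renderBlocked(targets, certs, jobs int) string {
	b, err := json.Marshal(blockedBody{
		Error:  "agent has active dependencies",
		Counts: blockedCounts{ActiveTargets: targets, ActiveCertificates: certs, PendingJobs: jobs},
	})
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	fmt.Println(renderBlocked(2, 5, 1))
	// {"error":"agent has active dependencies","counts":{"active_targets":2,"active_certificates":5,"pending_jobs":1}}
}
```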

  internal/api/router/router.go registers GET /api/v1/agents/retired
  literal-path BEFORE /api/v1/agents/{id} — Go 1.22 ServeMux's
  literal-beats-pattern-var precedence routes "retired" to the
  paginated retired-agents listing instead of fetching a hypothetical
  agent named "retired".

Agent binary — clean shutdown on 410:
  cmd/agent/main.go gains the ErrAgentRetired sentinel, a
  retiredOnce sync.Once, and a retiredSignal chan struct{}. A
  markRetired(source, statusCode, body) helper closes the channel
  exactly once; the Run() select loop observes the close and returns
  ErrAgentRetired; main() matches via errors.Is(err, ErrAgentRetired)
  and exits cleanly instead of spinning in the heartbeat retry loop.
  The 410 Gone surface is therefore terminal for the agent process.
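
  A reduced sketch of the close-once shutdown signal. The names
  ErrAgentRetired, retiredOnce, retiredSignal, and markRetired come from
  this message; the agent struct, the simplified markRetired signature (the
  real helper takes source, statusCode, body), and the work channel are
  assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var ErrAgentRetired = errors.New("agent retired")

// agent holds only the fields needed to show the shutdown signal.
type agent struct {
	retiredOnce   sync.Once
	retiredSignal chan struct{}
}

func newAgent() *agent {
	return &agent{retiredSignal: make(chan struct{})}
}

// markRetired closes the signal channel exactly once, so repeated
// 410 responses cannot trigger a double-close panic.
func (a *agent) markRetired() {
	a.retiredOnce.Do(func() { close(a.retiredSignal) })
}

// run loops until the retired signal fires or the work channel is closed.
func (a *agent) run(work <-chan int) error {
	for {
		select {
		case <-a.retiredSignal:
			return ErrAgentRetired
		case _, ok := <-work:
			if !ok {
				return nil
			}
		}
	}
}

func main() {
	a := newAgent()
	a.markRetired()
	a.markRetired() // second call is a no-op

	err := a.run(nil) // a nil channel never receives, so only the signal can fire
	fmt.Println(errors.Is(err, ErrAgentRetired)) // true
}
```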

MCP transport:
  internal/mcp/client.go adds Client.DeleteWithQuery(path, query),
  a new additive transport method. Client.Delete is path-only; without
  this method the retire tool would silently drop `force` and `reason`,
  turning every cascade retire into a default soft-retire. The new
  method shares do()'s 204 normalization and 4xx/5xx error
  propagation so tool authors get one contract.
  internal/mcp/tools.go + internal/mcp/types.go expose the
  retire_agent tool with Force+Reason inputs wired through
  DeleteWithQuery.
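
  To illustrate why the query string matters, here is a sketch of how the
  retire URL might be assembled. The force and reason parameter names are
  from this message; retireURL and the base URL are hypothetical, and the
  real DeleteWithQuery takes a path plus url.Values rather than these
  arguments.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// retireURL builds the DELETE target, attaching force/reason only when set.
func retireURL(baseURL, agentID string, force bool, reason string) string {
	query := url.Values{}
	if force {
		query.Set("force", strconv.FormatBool(force))
	}
	if reason != "" {
		query.Set("reason", reason)
	}
	target := baseURL + "/api/v1/agents/" + url.PathEscape(agentID)
	if encoded := query.Encode(); encoded != "" {
		target += "?" + encoded // Encode sorts keys and escapes values
	}
	return target
}

func main() {
	fmt.Println(retireURL("http://localhost:8443", "a-1", false, ""))
	// http://localhost:8443/api/v1/agents/a-1
	fmt.Println(retireURL("http://localhost:8443", "a-1", true, "old host"))
	// http://localhost:8443/api/v1/agents/a-1?force=true&reason=old+host
}
```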

CLI:
  cmd/cli/main.go + internal/cli/client.go add two CLI surfaces:
  `agents list --retired` (client-side strip of --retired then
  delegation to ListRetiredAgents, sharing --page/--per-page parsing
  with the default listing) and `agents retire <id> [--force --reason
  "…"]` (mirrors ErrForceReasonRequired — force without reason is
  rejected client-side before the request is sent). JSON + table
  output modes both honor the new columns.

Frontend:
  web/src/pages/AgentsPage.tsx surfaces retired/retire affordances.
  web/src/api/client.ts + web/src/api/types.ts expose the retire
  endpoint and the retired-listing. 4 new Vitest regression cases.

OpenAPI:
  api/openapi.yaml documents DELETE /agents/{id} with all seven
  status codes, 410 on heartbeat, and the 409 per-bucket body shape.

Regression coverage (six new test files, all green):
  internal/service/agent_retire_test.go           — 8-step contract + sentinel guards
  internal/api/handler/agent_retire_handler_test.go — 7-status-code surface + 410 heartbeat
  internal/mcp/retire_agent_test.go               — DeleteWithQuery wire-through
  internal/cli/agent_retire_test.go               — --retired listing + --force/--reason pairing
  internal/repository/postgres/migration_000015_test.go — FK flip + columns + indexes + up↔down
  internal/domain/connector_test.go               — IsRetired, IsSentinelAgent, SentinelAgentIDs, HasDependencies

Files:
  api/openapi.yaml                                — DELETE + 410 + 409 body shape
  cmd/agent/main.go                               — ErrAgentRetired, markRetired, retiredSignal
  cmd/cli/main.go                                 — handleAgents list/get/retire dispatch
  docs/architecture.md, docs/concepts.md,
    docs/testing-guide.md                         — retirement contract narrative
  internal/api/handler/agents.go                  — RetireAgent, status surface, 410 on heartbeat
  internal/api/handler/agent_handler_test.go      — extended coverage
  internal/api/handler/agent_retire_handler_test.go — new
  internal/api/router/router.go                   — /agents/retired before /agents/{id}
  internal/cli/agent_retire_test.go               — new
  internal/cli/client.go                          — ListRetiredAgents + RetireAgent
  internal/domain/connector.go                    — IsRetired, SentinelAgentIDs,
                                                    IsSentinelAgent, AgentDependencyCounts,
                                                    ActorTypeAgent/System
  internal/domain/connector_test.go               — new
  internal/integration/lifecycle_test.go          — retirement fixture
  internal/mcp/client.go                          — DeleteWithQuery additive transport
  internal/mcp/retire_agent_test.go               — new
  internal/mcp/tools.go, internal/mcp/types.go    — retire_agent tool + Force/Reason inputs
  internal/repository/interfaces.go               — AgentRepository retirement methods
  internal/repository/postgres/agent.go           — retire + cascade target retire + counts
  internal/repository/postgres/migration_000015_test.go — new
  internal/service/agent.go                       — wire into AgentService surface
  internal/service/agent_retire.go                — new 8-step contract
  internal/service/agent_retire_test.go           — new
  internal/service/deployment.go                  — skip retired agents
  internal/service/target.go                      — skip retired agents
  internal/service/testutil_test.go               — shared mocks extended
  migrations/000015_agent_retire.up.sql           — new
  migrations/000015_agent_retire.down.sql         — new
  web/src/api/client.ts, types.ts + tests         — retire endpoint wiring
  web/src/pages/AgentsPage.tsx                    — retire UI
2026-04-19 05:24:00 +00:00
shankar0123 1ee77c89f8 I-003: job timeout reaper closes AwaitingCSR/AwaitingApproval gap
Add an 11th always-on scheduler loop that transitions jobs stuck in
AwaitingCSR (default 24h TTL) or AwaitingApproval (default 168h TTL)
to Failed. I-001's retry loop then auto-promotes eligible Failed jobs
back to Pending. No new status enum, no schema migration.

- JobRepository.ListTimedOutAwaitingJobs with per-status cutoff WHERE
- JobService.ReapTimedOutJobs mirrors RetryFailedJobs structure
- Scheduler jobTimeoutLoop with atomic.Bool idempotency guard, 2m
  per-tick context, WaitGroup shutdown drain
- Config: CERTCTL_JOB_TIMEOUT_INTERVAL (10m), CERTCTL_JOB_AWAITING_CSR_TIMEOUT
  (24h), CERTCTL_JOB_AWAITING_APPROVAL_TIMEOUT (168h)
- Audit event per transition: actor=system, actorType=System,
  action=job_timeout, details={old_status, new_status, timeout_reason,
  age_hours}
- 14 new tests: 3 config, 7 service, 4 scheduler
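
The per-status cutoff can be sketched as a pure predicate. The 24h and 168h
defaults are from this message; the JobStatus string values, the helper
names, and the choice of reference timestamp are assumptions (the real
query applies the cutoff in SQL).

```go
package main

import (
	"fmt"
	"time"
)

type JobStatus string

// Status string values are assumed for this sketch.
const (
	AwaitingCSR      JobStatus = "AwaitingCSR"
	AwaitingApproval JobStatus = "AwaitingApproval"
	Pending          JobStatus = "Pending"
)

// Default TTLs quoted in the commit message.
const (
	defaultCSRTimeout      = 24 * time.Hour
	defaultApprovalTimeout = 168 * time.Hour
)

// timedOut applies the per-status cutoff; statuses outside the two
// awaiting states never time out here.
func timedOut(status JobStatus, since, now time.Time) bool {
	age := now.Sub(since)
	switch status {
	case AwaitingCSR:
		return age > defaultCSRTimeout
	case AwaitingApproval:
		return age > defaultApprovalTimeout
	}
	return false
}

// timedOutAfterHours is a convenience wrapper for whole-hour ages.
func timedOutAfterHours(status JobStatus, ageHours int) bool {
	now := time.Date(2026, 4, 19, 0, 0, 0, 0, time.UTC)
	return timedOut(status, now.Add(-time.Duration(ageHours)*time.Hour), now)
}

func main() {
	fmt.Println(timedOutAfterHours(AwaitingCSR, 25))       // true
	fmt.Println(timedOutAfterHours(AwaitingApproval, 25))  // false
	fmt.Println(timedOutAfterHours(AwaitingApproval, 169)) // true
}
```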
2026-04-19 01:37:18 +00:00
shankar0123 4bc8b3e723 fix(config): add RetryInterval to TestValidate_ValidConfig + TestValidate_AuthTypeNone fixtures (I-001 follow-up)
Problem:
  TestValidate_ValidConfig and TestValidate_AuthTypeNone construct a
  SchedulerConfig without RetryInterval, so Validate() fails the
  RetryInterval check at config.go:1086 with 'retry interval must be
  at least 1 second'. Both tests expect success, so they fail whenever
  they are run.

Root cause (re-derived from source, not inherited from memory):
  git log -S 'retry interval must be at least' --source --all shows
  the validation was introduced in 0200c7f (I-001, RetryFailedJobs
  scheduler wiring). git log -- internal/config/config_test.go shows
  the test file was last touched in 7382e5f, which predates 0200c7f.
  I-001 added a new Validate() rule without updating the two positive
  test fixtures — a gap in I-001's verification pass.

  This is NOT C-001 fallout. The config_test.go file was untouched by
  the C-001 closure commits 91642e2 and 4696116. The failure surfaced
  during the full test suite run after C-001 landed because no one
  had run 'go test ./internal/config/...' since I-001.

Scope:
  - internal/config/config_test.go (2 fixtures: TestValidate_ValidConfig,
    TestValidate_AuthTypeNone).

Implementation:
  Added 'RetryInterval: 5 * time.Minute' to both SchedulerConfig
  literals. 5 minutes matches the I-001 default at config.go:818:

    RetryInterval: getEnvDuration("CERTCTL_SCHEDULER_RETRY_INTERVAL", 5*time.Minute)

  The other two TestValidate_* tests (InvalidAuthType, APIKeyAuth_
  MissingSecret) are unaffected because they expect Validate() to
  error at the auth-type check (line 1052) or auth-secret check
  (line 1057), both of which fire before the RetryInterval check at
  line 1086.
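
  The check ordering explains which fixtures were affected. A reduced
  sketch, where the Config shape, the auth-type values "none" and
  "api-key", and the two auth error texts are assumptions (only the
  retry-interval message is quoted from the source):

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Config is a hypothetical flattening of the fields relevant here.
type Config struct {
	AuthType      string
	AuthSecret    string
	RetryInterval time.Duration
}

// Validate runs the auth checks before the retry-interval check,
// mirroring the line ordering described in the message.
func (c Config) Validate() error {
	if c.AuthType != "none" && c.AuthType != "api-key" {
		return errors.New("invalid auth type")
	}
	if c.AuthType == "api-key" && c.AuthSecret == "" {
		return errors.New("auth secret required")
	}
	if c.RetryInterval < time.Second {
		return errors.New("retry interval must be at least 1 second")
	}
	return nil
}

// validateMsg returns the validation error text, or "" on success.
func validateMsg(authType, secret string, retrySeconds int) string {
	cfg := Config{
		AuthType:      authType,
		AuthSecret:    secret,
		RetryInterval: time.Duration(retrySeconds) * time.Second,
	}
	if err := cfg.Validate(); err != nil {
		return err.Error()
	}
	return ""
}

func main() {
	fmt.Println(validateMsg("none", "", 0))  // retry interval must be at least 1 second
	fmt.Println(validateMsg("bogus", "", 0)) // invalid auth type
	fmt.Println(validateMsg("none", "", 300) == "")
}
```

  A fixture that expects an auth error never reaches the interval check,
  while a fixture that expects success must set RetryInterval.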

Verification:
  - go test -count=1 -run 'TestValidate_' ./internal/config/...: PASS
  - go test -short -count=1 ./...: all packages PASS
  - go vet ./...: exit 0

Residual:
  None. This is a pure test-fixture fix — production code is unchanged.

Commit:
  0200c7f (I-001) should have included this edit. Attributed here for
  traceability.
2026-04-19 00:33:22 +00:00
shankar0123 469611650c fix(cli): add missing os + path/filepath imports to client_test.go
Follow-up to 91642e2. TestClient_ImportCertificates_SixFieldPayload
uses filepath.Join(t.TempDir(), ...) and os.WriteFile to stage a
test PEM, but the import block only listed encoding/json,
encoding/pem, net/http, etc. — neither os nor path/filepath was
imported. go vet rejected the package with 'undefined: filepath'
(and would have caught 'undefined: os' next).

Add both imports. No behavioral change: the test already referred to
os and filepath by their standard package names, so with the imports
in place it compiles and runs as intended. CI should now pass
go build + go vet on the cli package.
2026-04-19 00:27:11 +00:00
shankar0123 91642e2860 C-001 scope expansion: tighten parallel POST /api/v1/certificates call sites to six-field contract
Problem:
a53a4b8 closed C-001 at the handler boundary by tightening the
ValidateRequired contract on POST /api/v1/certificates to require six
fields: name, common_name, renewal_policy_id, issuer_id, owner_id,
team_id. (Correction re-derived from source: the handler
ValidateRequired calls on owner_id/team_id/renewal_policy_id were
actually installed in 3287e17 under M-002/M-003/M-006 auth unification
— a53a4b8's commit message overstates scope.) Post-audit on
2026-04-18 found three parallel call sites still shipping
three-to-four-field payloads that the newly strict handler would
reject with HTTP 400:
  - GUI: OnboardingWizard CertificateStep (common_name + sans +
    issuer_id + environment only)
  - CLI: certctl-cli import (common_name + issuer_id + status only;
    no required-flag gating)
  - Tests: deploy/test/qa_test.go Part03 positive paths

Scope:
Bring every POST /api/v1/certificates caller to six-field parity. No
handler changes — the contract is authoritative; the callers must
conform.

Implementation:

  GUI — OnboardingWizard CertificateStep expansion:
    web/src/pages/OnboardingWizard.tsx adds name/owner_id/team_id/
    renewal_policy_id state. React Query hooks for getOwners/
    getTeams/getPolicies use per_page: '500' to populate dropdowns
    without pagination-driven truncation. Payload ships all six
    required fields plus sans/certificate_profile_id/environment.
    nextDisabled gate enforces all six before the Continue button
    activates.

  CLI — ImportCertificates rewrite:
    internal/cli/client.go rewrites ImportCertificates with
    flag.NewFlagSet("import", flag.ContinueOnError). Required flags:
    --owner-id, --team-id, --renewal-policy-id, --issuer-id. Optional:
    --name-template (default {cn}, templated via strings.ReplaceAll
    against cert.Subject.CommonName), --environment (default
    imported). Missing required flags fail pre-HTTP with a clear
    error. Request map ships all six required fields plus sans/
    environment/status/optional serial_number.
    cmd/cli/main.go — usage string updated to document the new
    required/optional flags.
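
    The flag handling described above might look roughly like the sketch
    below. The flag names, defaults, and the "usage: import <file>" text
    are from this message; importArgs, parseImportArgs, renderName,
    importErr, and the missing-flag error text are hypothetical.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"io"
	"strings"
)

type importArgs struct {
	OwnerID, TeamID, RenewalPolicyID, IssuerID string
	NameTemplate, Environment                  string
	Files                                      []string
}

// parseImportArgs validates flags before any HTTP request would be made.
func parseImportArgs(args []string) (importArgs, error) {
	var a importArgs
	fs := flag.NewFlagSet("import", flag.ContinueOnError) // return errors instead of exiting
	fs.SetOutput(io.Discard)                              // keep the sketch quiet on bad input
	fs.StringVar(&a.OwnerID, "owner-id", "", "owner ID (required)")
	fs.StringVar(&a.TeamID, "team-id", "", "team ID (required)")
	fs.StringVar(&a.RenewalPolicyID, "renewal-policy-id", "", "renewal policy ID (required)")
	fs.StringVar(&a.IssuerID, "issuer-id", "", "issuer ID (required)")
	fs.StringVar(&a.NameTemplate, "name-template", "{cn}", "certificate name template")
	fs.StringVar(&a.Environment, "environment", "imported", "environment label")
	if err := fs.Parse(args); err != nil {
		return a, err
	}
	required := []struct{ name, value string }{
		{"owner-id", a.OwnerID}, {"team-id", a.TeamID},
		{"renewal-policy-id", a.RenewalPolicyID}, {"issuer-id", a.IssuerID},
	}
	for _, r := range required {
		if r.value == "" {
			return a, fmt.Errorf("missing required flag --%s", r.name)
		}
	}
	if fs.NArg() == 0 {
		return a, errors.New("usage: import <file>")
	}
	a.Files = fs.Args()
	return a, nil
}

// renderName substitutes the certificate's common name into the template.
func renderName(template, commonName string) string {
	return strings.ReplaceAll(template, "{cn}", commonName)
}

// importErr returns the parse error text, or "" when the arguments are valid.
func importErr(args ...string) string {
	if _, err := parseImportArgs(args); err != nil {
		return err.Error()
	}
	return ""
}

func main() {
	fmt.Println(importErr("--owner-id", "o-alice", "certs.pem")) // missing required flag --team-id
	fmt.Println(renderName("{cn}-imported", "example.com"))      // example.com-imported
}
```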

  Tests — qa_test.go Part03 positive paths:
    deploy/test/qa_test.go Part03 Create_Minimal and Create_Full
    updated to include all six fields. Uses seed_demo.sql-supplied IDs
    (o-alice, t-platform, rp-standard) — docker-compose.demo.yml is
    the run context. C-001 explanatory comment added above
    Create_Minimal so future readers understand why the minimal
    payload is no longer minimal.

  MCP parity:
    Verified no-op. internal/mcp/types.go:28 CreateCertificateInput
    already declares all six fields; internal/mcp/tools.go:102
    forwards the typed struct unchanged.

Verification:

  Go CLI regression tests (internal/cli/client_test.go):
    * TestClient_ImportCertificates_MissingRequiredFlags — 5 subtests,
      one per missing required flag, confirms flag.ContinueOnError
      rejects with non-nil error before any HTTP call is attempted.
    * TestClient_ImportCertificates_MissingPositionalArgs — confirms
      the "usage: import <file>" error path when no PEM file is
      supplied after the flags.
    * TestClient_ImportCertificates_SixFieldPayload — uses httptest
      to decode the POST body and assert all six required fields
      plus sans/environment are present on the wire.

  Frontend regression test (web/src/api/client.test.ts):
    'createCertificate accepts and transmits all six required fields'
    pins the wire shape for both GUI call sites (OnboardingWizard
    CertificateStep + CertificatesPage CreateCertificateModal). If
    either UI surface accidentally drops a field, this assertion
    fails in CI rather than surfacing as a 400 at runtime.

  Grep-based call-site sweep:
    Enumerated every POST /api/v1/certificates create caller. Four
    total: OnboardingWizard, CertificatesPage, MCP tools, CLI import.
    All four now ship six-field payloads. Claim path
    (internal/service/discovery.go) updates existing rows and does
    not POST. EST/SCEP handlers invoke internal
    certService.CreateVersion, not the public API. Negative-path
    tests (qa_test.go:1085/1267/1274/1288/1298) remain valid: they
    assert 400/non-500 on oversized/malformed/missing-CN/UTF-8/empty
    bodies, and these properties still hold under the stricter
    handler.

  Static gates:
    go build ./..., go vet ./..., go test ./internal/cli/..., and
    cd web && npm run test deferred to operator pre-push — the Go
    toolchain is not available in the session sandbox. Grep-based
    verification confirms the syntactic shape of every changed file.

Residual:
None. Every POST /api/v1/certificates call site now conforms to the
six-field contract; the wire shape is pinned by both Go and
TypeScript regression tests.

Commit:
TBD-SHA (audit doc + CLAUDE.md carry TBD-SHA placeholders to be
amended after commit)
2026-04-19 00:25:10 +00:00
shankar0123 0200c7f4a4 Close I-001 (RetryFailedJobs never invoked) coverage-gap finding
Operator decision answered as Option A: JobService.RetryFailedJobs is
now wired into the scheduler as an always-on 10th loop. Prior to this
commit the method was implemented, unit-tested, and exported but had
zero runtime callers — any job that transitioned to status=Failed stayed
Failed forever regardless of how many attempts it had remaining.

Scheduler — 10th loop:
  internal/scheduler/scheduler.go grows a jobRetryLoop alongside the
  existing nine loops (renewal, jobs, health, notifications, short-lived,
  network scan, digest, health check, cloud discovery). The loop follows
  the established run-immediately-then-tick pattern (same shape as
  jobProcessorLoop), gated by a sync/atomic.Bool idempotency guard and
  joined into the scheduler's sync.WaitGroup so WaitForCompletion drains
  it on graceful shutdown. Each tick runs under a 2-minute context
  timeout mirroring jobProcessorLoop's opCtx budget. The runJobRetry
  helper invokes jobService.RetryFailedJobs(ctx, 3) — the advisory
  maxRetries cap is belt-and-suspenders; per-job eligibility is still
  enforced inside the service via Attempts < MaxAttempts.

  The JobServicer scheduler-interface gains RetryFailedJobs so the
  scheduler's dependency surface stays explicit and mockable.
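
  The loop shape described above (run immediately, then tick, with an
  atomic.Bool guard, a 2-minute per-tick context, and a WaitGroup drain)
  can be sketched as follows. retryLoop, tryTick, start, and the two demo
  helpers are hypothetical names; the real scheduler.go is structured
  around its own types and logging.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

type retryLoop struct {
	running atomic.Bool    // idempotency guard: true while a tick is in flight
	wg      sync.WaitGroup // lets shutdown wait for the loop goroutine
}

// tryTick runs work under a 2-minute timeout unless a tick is already running.
func (l *retryLoop) tryTick(parent context.Context, work func(context.Context)) bool {
	if !l.running.CompareAndSwap(false, true) {
		return false // still running, skipping tick
	}
	defer l.running.Store(false)
	ctx, cancel := context.WithTimeout(parent, 2*time.Minute)
	defer cancel()
	work(ctx)
	return true
}

// start runs one tick immediately, then one per interval until ctx is cancelled.
func (l *retryLoop) start(ctx context.Context, interval time.Duration, work func(context.Context)) {
	l.wg.Add(1)
	go func() {
		defer l.wg.Done()
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		l.tryTick(ctx, work)
		for {
			select {
			case <-ctx.Done():
				return
			case <-ticker.C:
				l.tryTick(ctx, work)
			}
		}
	}()
}

// waitForCompletion blocks until the loop goroutine has exited.
func (l *retryLoop) waitForCompletion() { l.wg.Wait() }

// overlapSkipped attempts a tick from inside a running tick and
// reports whether the guard rejected it.
func overlapSkipped() bool {
	var l retryLoop
	innerRan := true
	l.tryTick(context.Background(), func(ctx context.Context) {
		innerRan = l.tryTick(ctx, func(context.Context) {})
	})
	return !innerRan
}

// runsImmediately starts a loop with a long interval and confirms the
// first tick fires without waiting, then drains the loop.
func runsImmediately() bool {
	var l retryLoop
	ctx, cancel := context.WithCancel(context.Background())
	ran := make(chan struct{}, 1)
	l.start(ctx, time.Hour, func(context.Context) {
		select {
		case ran <- struct{}{}:
		default:
		}
	})
	<-ran
	cancel()
	l.waitForCompletion()
	return true
}

func main() {
	fmt.Println(overlapSkipped())  // true
	fmt.Println(runsImmediately()) // true
}
```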

Service — audit trail per retry:
  internal/service/job.go:RetryFailedJobs now emits an audit event for
  every Failed→Pending transition. Following the house convention used
  by all scheduler-emitted events, actor='system' and actorType=
  domain.ActorTypeSystem; action='job_retry'; details capture
  old_status, new_status, attempts, max_attempts. JobService carries an
  optional *AuditService (SetAuditService) that nil-guards to preserve
  test-wiring ergonomics — existing tests that construct JobService
  without an audit service continue to pass unchanged.

Config — env var with sane default:
  internal/config/config.go:SchedulerConfig grows RetryInterval, wired
  to CERTCTL_SCHEDULER_RETRY_INTERVAL with a 5-minute default. Validate
  rejects intervals below 1 second (matches other scheduler interval
  validators).

Server wiring:
  cmd/server/main.go calls jobService.SetAuditService(auditService)
  after JobService construction and sched.SetJobRetryInterval(
  cfg.Scheduler.RetryInterval) alongside the other SetXxxInterval calls.

Regression coverage:
  internal/service/job_test.go (3 new)
    - TestJobService_RetryFailedJobs_EligibleJobTransitionsAndAudits
    - TestJobService_RetryFailedJobs_SkipsJobsAtMaxAttempts
    - TestJobService_RetryFailedJobs_NoAuditServiceOK
  internal/scheduler/scheduler_test.go (3 new)
    - TestScheduler_JobRetryLoop_CallsService
    - TestScheduler_JobRetryLoop_IdempotencyGuard
    - TestScheduler_JobRetryLoop_WaitForCompletion

  The service tests assert status transitions, attempt-cap short-
  circuiting, and audit event shape (actor='system', action='job_retry',
  details keys). The scheduler tests assert the loop invokes the service,
  the atomic.Bool guard skips overlapping ticks with the expected
  'still running, skipping tick' log, and WaitForCompletion drains the
  in-flight tick on Stop.

Residual follow-up (not in scope for this commit):
  internal/service/renewal.go:RetryFailedJobs is a parallel dead-code
  duplicate of the same logic on RenewalService — untested and has no
  runtime caller. The audit finding called this out as 'implemented
  twice'. Removing it is a separate cleanup and does not block the
  Option-A wiring this commit delivers.

Files:
  cmd/server/main.go                     — SetAuditService + SetJobRetryInterval
  internal/config/config.go              — RetryInterval field + env + validate
  internal/scheduler/scheduler.go        — 10th loop, interface, field, setter
  internal/scheduler/scheduler_test.go   — 3 new scheduler-loop tests
  internal/service/job.go                — RetryFailedJobs audit emission + SetAuditService
  internal/service/job_test.go           — 3 new service-layer tests
2026-04-18 23:24:54 +00:00
shankar0123 fe7e766510 Close M-004 (OCSP issuer binding) and M-005 (discovery actor propagation) coverage-gap findings
M-004 — OCSP issuer binding (composite key):
  The OCSP lookup path now binds (issuer_id, serial) as a composite key
  rather than resolving by serial alone. CertificateRepository and
  RevocationRepository gain GetByIssuerAndSerial methods; ca_operations.go
  scopes both lookups by the issuer_id path param. When no managed cert
  binds to that (issuer, serial) tuple, GetOCSPResponse constructs an
  RFC 6960 §2.2 'unknown' response (CertStatus=2) instead of the prior
  default 'good'. Short-lived cert exemption (profile TTL < 1h) is
  preserved. Real repo errors (non-sql.ErrNoRows) fail closed with a log.
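
  A sketch of the composite-key lookup using an in-memory map in place of
  the repositories. The status values 0/1/2 follow the RFC 6960 CertStatus
  choice tags; certKey, store, demoStore, and the sample IDs are
  hypothetical, and the short-lived exemption and repo error handling are
  omitted.

```go
package main

import "fmt"

// certKey binds a serial to the issuer that signed it.
type certKey struct {
	IssuerID string
	Serial   string
}

type OCSPStatus int

const (
	Good    OCSPStatus = 0
	Revoked OCSPStatus = 1
	Unknown OCSPStatus = 2
)

// store is an in-memory stand-in for the certificate and revocation repos.
type store struct {
	issued  map[certKey]bool
	revoked map[certKey]bool
}

// status answers 'unknown' unless this issuer actually issued the serial.
func (s store) status(issuerID, serial string) OCSPStatus {
	key := certKey{IssuerID: issuerID, Serial: serial}
	if !s.issued[key] {
		return Unknown
	}
	if s.revoked[key] {
		return Revoked
	}
	return Good
}

// demoStore holds two certs under issuer iss-a, one of them revoked.
func demoStore() store {
	return store{
		issued: map[certKey]bool{
			{IssuerID: "iss-a", Serial: "01"}: true,
			{IssuerID: "iss-a", Serial: "02"}: true,
		},
		revoked: map[certKey]bool{
			{IssuerID: "iss-a", Serial: "02"}: true,
		},
	}
}

func main() {
	s := demoStore()
	fmt.Println(s.status("iss-a", "01")) // 0 (good)
	fmt.Println(s.status("iss-a", "02")) // 1 (revoked)
	fmt.Println(s.status("iss-b", "01")) // 2 (unknown: serial exists, wrong issuer)
}
```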

  Regression coverage: internal/service/ca_operations_test.go
    - TestCAOperationsSvc_GetOCSPResponse_Unknown_CrossIssuer
    - TestCAOperationsSvc_GetOCSPResponse_Unknown_UnknownSerial

M-005 — Discovery Claim/Dismiss actor propagation:
  DiscoveryService.ClaimDiscovered and DismissDiscovered now accept an
  explicit 'actor string' parameter (propagation pattern mirrors
  bulk_revocation.go / revocation_svc.go). The handler layer passes
  resolveActor(r.Context()) — the named-key identity established by the
  M-002 auth unification — and the service falls back to 'api' (the same
  safe sentinel resolveActor uses when no auth context is present) only
  when the caller passes an empty string. Never falls back to 'operator'.

  Regression coverage: internal/service/discovery_test.go
    - TestDiscoveryService_ClaimDiscovered_AuditActor
    - TestDiscoveryService_DismissDiscovered_AuditActor
    - TestDiscoveryService_ClaimDiscovered_EmptyActorFallsBackToAPI
    - TestDiscoveryService_DismissDiscovered_EmptyActorFallsBackToAPI

Each new test asserts event.Actor matches the caller-supplied string (or
'api' on empty input) and explicitly asserts event.Actor != 'operator'
to lock in the historical fix intent.

Files:
  internal/api/handler/discovery.go          — pass resolveActor(ctx)
  internal/api/handler/discovery_handler_test.go — updated call sites
  internal/integration/lifecycle_test.go     — updated mock wiring
  internal/repository/interfaces.go          — GetByIssuerAndSerial on
                                               CertificateRepository +
                                               RevocationRepository
  internal/repository/postgres/certificate.go — composite key lookup
  internal/service/ca_operations.go          — (issuer_id, serial) scoping
  internal/service/ca_operations_test.go     — 2 new M-004 tests
  internal/service/discovery.go              — actor parameter + 'api' fallback
  internal/service/discovery_test.go         — 4 new M-005 tests
  internal/service/shortlived_test.go        — mock signature update
  internal/service/testutil_test.go          — mock GetByIssuerAndSerial
2026-04-18 22:20:25 +00:00
shankar0123 ff7357f889 fix(lint): godoc comment on NewAuthWithNamedKeys must lead with function name (ST1020)
CI failure on master (commit 3287e17) — staticcheck ST1020:

  internal/api/middleware/middleware.go:125:1: ST1020: comment on exported
  function NewAuthWithNamedKeys should be of the form
  "NewAuthWithNamedKeys ..." (staticcheck)

When NewAuth was renamed to NewAuthWithNamedKeys during the M-002 auth
unification, the leading godoc sentence was left pointing at the old name.
Rewrite the comment so its first sentence starts with the new function
name, and expand the body to describe the named-key + admin-flag contract
introduced in 3287e17.

Also gitignore /.gopath/ — session-scoped tool install cache, same
category as /.gocache/ and /.gomodcache/.

Verification:
  go vet ./internal/api/middleware/...          — clean
  go build ./internal/api/middleware/...        — clean
  go test ./internal/api/middleware/...         — PASS (0.245s)
  staticcheck -checks=all,<project exclusions>  — clean across
    middleware, handler, service, domain, cmd/server, scheduler

Closes: CI failure on 3287e17.
2026-04-18 21:38:46 +00:00
shankar0123 3287e174dc Unify API auth + RFC-compliant CRL/OCSP (M-002 + M-003 + M-006, auto-closes M-001)
Closes the remaining P1 gaps from coverage-gap-audit.md (M-001/M-002/M-003/M-006)
on top of the C-001/C-002 ownership + agent-FK contract fixes landed in
a53a4b8. The work lands as a single commit spanning server, docs, tests,
and the React client.

M-002 — Named API keys with per-key actor propagation
  * Migration 000014 adds the 'api_keys' table (id, name, hash,
    principal, role, created_at, last_used_at, disabled_at) so every
    credential carries an identifiable principal instead of the
    opaque 'anonymous'/'api-key' sentinel.
  * Auth middleware now rotates through configured keys, performs
    constant-time hash comparison, stamps 'last_used_at', and emits
    an actor struct via contextWithActor(). The audit middleware,
    bulk-revocation handler, approval handlers, and MCP tool layer
    now read the principal off the context and persist it on every
    audit_events row.
  * Regression coverage:
      - internal/api/middleware/audit_test.go — actor propagation,
        principal redaction for disabled keys, anonymous fallback for
        unauthenticated endpoints.
      - internal/api/handler/bulk_revocation_handler_test.go,
        job_handler_test.go — principal-on-audit assertions.

M-003 — Authorization gates (Phase B)
  * Approval handler rejects self-approval / self-rejection with 403
    when the actor principal equals the job's requested_by field.
  * Bulk revocation is gated behind the 'admin' role; operators and
    viewers receive 403.
  * Regression coverage:
      - internal/service/job_test.go — TestApproveJob_NotSelf,
        TestRejectJob_NotSelf.
      - internal/api/handler/bulk_revocation_handler_test.go —
        TestBulkRevoke_RequiresAdmin, TestBulkRevoke_AdminSucceeds.
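
A rough sketch of the two gates, assuming hypothetical Actor and Job shapes and error names (none of these identifiers are taken from the repository):

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// Hypothetical sentinels for the two authorization failures.
var (
	ErrSelfApproval  = errors.New("actor cannot approve or reject their own job")
	ErrAdminRequired = errors.New("admin role required")
)

type Actor struct{ Principal, Role string }

type Job struct{ ID, RequestedBy string }

// checkApproval rejects approval or rejection by the job's requester.
func checkApproval(a Actor, j Job) error {
	if a.Principal == j.RequestedBy {
		return ErrSelfApproval
	}
	return nil
}

// checkBulkRevoke allows only the admin role.
func checkBulkRevoke(a Actor) error {
	if a.Role != "admin" {
		return ErrAdminRequired
	}
	return nil
}

// statusFor maps both authorization failures to 403.
func statusFor(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, ErrSelfApproval), errors.Is(err, ErrAdminRequired):
		return http.StatusForbidden
	default:
		return http.StatusInternalServerError
	}
}

func main() {
	alice := Actor{Principal: "alice", Role: "operator"}
	job := Job{ID: "job-1", RequestedBy: "alice"}
	fmt.Println(statusFor(checkApproval(alice, job)), statusFor(checkBulkRevoke(alice)))
}
```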

M-006 — RFC-compliant CRL/OCSP on the unauthenticated .well-known mux
  * Per RFC 8615, relying parties cannot reasonably be asked to
    authenticate against the issuing certctl instance to retrieve
    revocation material. CRL and OCSP move off the authenticated
    '/api/v1/crl*' and '/api/v1/ocsp/*' paths onto:
        GET /.well-known/pki/crl/{issuer_id}
            Content-Type: application/pkix-crl   (RFC 5280 §5)
        GET /.well-known/pki/ocsp/{issuer_id}/{serial}
            Content-Type: application/ocsp-response  (RFC 6960)
  * Non-standard JSON CRL shape is removed; only DER is served.
  * Short-lived certificate exemption (profile TTL < 1h → skip
    CRL/OCSP) is preserved; the response simply omits the serial.
  * Routes are registered on the unauthenticated 'finalHandler' mux
    in cmd/server/main.go alongside EST ('/.well-known/est/*') and
    SCEP ('/scep'). Legacy authenticated paths return 404.
  * Regression coverage:
      - internal/api/handler/certificate_handler_test.go — content
        type, DER parseability, 404 for unknown issuer.
      - internal/api/handler/adversarial_path_test.go — unauthenticated
        access asserted for CRL, OCSP, EST, SCEP.
      - internal/api/router/router_test.go — route-table assertion
        that '.well-known/pki/*', '.well-known/est/*', and '/scep' are
        mounted on the unauthenticated branch.

M-001 — Auto-closed by M-002
  EST and SCEP were already registered on the unauthenticated
  'finalHandler' mux; the router comment at
  internal/api/router/router.go:247 now matches reality. The
  adversarial-path tests above lock the behavior in.

Verification (all gates green):
  * go vet ./...                                           — clean
  * go build ./...                                         — ok
  * go test -short ./... (55+ packages)                    — all pass
  * web/ : npm test (225 Vitest tests)                     — all pass
  * web/ : npx tsc --noEmit                                — clean
  * grep sweep for '/api/v1/(crl|ocsp)' — 13 surviving hits,
    all intentional M-006 tombstone/relocation comments.

Documentation:
  * coverage-gap-audit.md — status flips M-001/M-002/M-003/M-006 →
    Fixed, with per-finding resolution paragraphs citing regression
    test IDs. (Audit file lives outside this repo; see cowork root.)
  * CLAUDE.md Project Status line updated with the auth-unification
    closure note.
  * docs/features.md, docs/architecture.md, docs/quickstart.md,
    docs/concepts.md, docs/connectors.md, docs/test-env.md,
    docs/testing-guide.md, docs/compliance-*.md, docs/demo-advanced.md
    — refreshed for the new '.well-known/pki/*' namespace and named
    API keys.
  * api/openapi.yaml — documents the new unauthenticated endpoints
    and removes the legacy '/api/v1/crl*' + '/api/v1/ocsp/*' paths.

.gitignore: adds '/.gocache/' and '/.gomodcache/' for the session-
scoped Go caches so they never enter the tree.
2026-04-18 18:17:41 +00:00
shankar0123 a53a4b845b fix(gui,api): close C-001 + C-002 — ownership + agent FK contract
C-001 — CreateCertificate was server-accepted with null owner_id,
team_id, renewal_policy_id because the GUI neither collected the fields
nor enforced them, even though the backend's ManagedCertificate schema
and handler contract treat them as required. Fix the contract at all
four layers:

  - web/src/pages/CertificatesPage.tsx: replace owner_id/team_id free-
    text inputs with <select> elements fed by getOwners/getTeams/
    getPolicies queries; mark all three required; gate the Create
    button on owner_id + team_id + renewal_policy_id being set.
  - internal/api/handler/certificates.go: ValidateRequired for
    owner_id, team_id, renewal_policy_id on CreateCertificate so the
    handler returns HTTP 400 with the offending field name before the
    service layer is reached.
  - internal/mcp/types.go: drop ',omitempty' from
    CreateCertificateInput.RenewalPolicyID so the MCP schema reflects
    the required contract; Update inputs keep partial-update semantics.
  - api/openapi.yaml: 'required: [name, common_name, renewal_policy_id,
    issuer_id, owner_id, team_id]' was already present on the Create
    schema; clarified DeploymentTarget.agent_id description to note the
    FK contract.

C-002 — CreateTargetWizard accepted an empty or bogus agent_id and the
service inserted directly, producing a Postgres 23503 FK-violation that
bubbled out as a generic HTTP 500. The FK itself (migration 000001 line
104: agent_id TEXT NOT NULL REFERENCES agents(id)) is correct; we keep
the schema strict and add validation at four layers:

  - internal/service/target.go: introduce
    ErrAgentNotFound sentinel and pre-validate agent_id in
    TargetService.CreateTarget — empty string returns
    'agent_id is required'; a nonexistent id returns the full
    'referenced agent does not exist: <id>' error. Both wrap
    ErrAgentNotFound via fmt.Errorf %w so callers can use errors.Is.
  - internal/api/handler/targets.go: ValidateRequired on agent_id; map
    errors.Is(err, service.ErrAgentNotFound) to HTTP 400 instead of
    letting it fall through to the generic 500 branch.
  - internal/mcp/types.go: drop ',omitempty' from
    CreateTargetInput.AgentID to match the required contract.
  - web/src/pages/TargetsPage.tsx: replace the free-text Agent ID input
    with a <select> populated from getAgents(); include agent in the
    canProceedToReview gate so Next is disabled until an agent is
    chosen.

Regression coverage (21 new subtests total):

  - TestCreateCertificate_MissingRequiredField_Returns400 — 6 subtests,
    one per required field, each proves the handler guard fires before
    the mock service is called.
  - TestCreateTarget_MissingAgentID_Returns400 — handler guard.
  - TestCreateTarget_NonexistentAgent_Returns400 — pins the
    ErrAgentNotFound -> 400 translation.
  - TestTargetService_CreateTarget_MissingAgentID — errors.Is sentinel.
  - TestTargetService_CreateTarget_NonexistentAgentID — errors.Is.
  - The existing TestTargetService_CreateTarget_Success, along with
    TestCreateTarget_{MissingName,MissingType,NameTooLong}_* handler
    tests, were updated to seed a real agent or include agent_id in
    the request body so the happy paths still run cleanly.

Gates (Phase 4):
  - go build/vet/test/race: green
  - go test -cover: internal/service 68.7% (gate 55%),
    internal/api/handler 78.9% (gate 60%)
  - golangci-lint on service+handler+mcp: 0 issues
  - govulncheck: no reachable vulns
  - tsc --noEmit: clean
  - vitest: 223/223 passing

See cowork/certctl-coverage-gap-audit.md entries C-001 and C-002.
2026-04-18 16:01:40 +00:00
shankar0123 9143da5fa8 Merge branch 'fix/d-008-policy-engine-drift' 2026-04-18 14:56:06 +00:00
shankar0123 b3cc7cbdb2 fix(policies): close the D-006 loop — TitleCase seed canonicals + severity-aware, config-consuming rule engine (D-008)
D-008 was a three-part drift in the policy engine that made the
D-005/D-006 remediation cosmetic below the DB layer:

  (a) migrations/seed.sql INSERTed rules with pre-D-005 lowercase
      types ('ownership', 'environment', 'lifetime', 'renewal_window')
      that the handler validator rejects on Create/Update but that
      raw SQL INSERTs bypassed entirely. At runtime evaluateRule's
      switch fell through to the default "unknown policy rule type"
      error branch on every demo rule × every cert × every cycle,
      flooding logs while emitting zero violations.

  (b) migrations/seed_demo.sql persisted lowercase severity values
      ('critical', 'error', 'warning') on policy_violations rows.
      INSERT succeeded because that column had no CHECK, but any
      frontend comparing against the canonical PolicySeverity enum
      mis-categorized every seeded violation.

  (c) evaluateRule hardcoded Severity: PolicySeverityWarning on
      every emitted violation and ignored rule.Config entirely —
      so the D-006 per-rule severity column (000013) and every
      per-arm Config JSON ({allowed_issuer_ids, allowed_domains,
      required_keys, allowed, lead_time_days, max_days}) was dead
      data below the evaluation layer.

This commit lands (a)+(b)+(c) atomically. Shipping any subset
leaves the feature half-working.

## Changes

Domain (internal/domain/policy.go):
  * Add PolicyTypeCertificateLifetime as the 6th TitleCase canonical.
    Pre-D-008 the seeded "max-certificate-lifetime" rule had no engine
    arm — routing it through RenewalLeadTime would conflate "how
    close to expiry before we renew" with "how long can the cert
    possibly be", two distinct semantics. The new type accepts
    config {"max_days": int} and flags certs whose
    NotAfter - NotBefore exceeds the cap.

Handler validator (internal/api/handler/validation.go):
  * ValidatePolicyType allowlist grown to 6 canonicals
    (AllowedIssuers, AllowedDomains, RequiredMetadata,
    AllowedEnvironments, RenewalLeadTime, CertificateLifetime).
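
A minimal sketch of an allowlist check over the six canonicals. The map-based implementation and the error text are assumptions; only the type names and the function name come from this commit.

```go
package main

import "fmt"

// policyTypes lists the six TitleCase canonicals named above.
var policyTypes = map[string]struct{}{
	"AllowedIssuers":      {},
	"AllowedDomains":      {},
	"RequiredMetadata":    {},
	"AllowedEnvironments": {},
	"RenewalLeadTime":     {},
	"CertificateLifetime": {},
}

// ValidatePolicyType accepts only exact, case-sensitive matches.
func ValidatePolicyType(t string) error {
	if _, ok := policyTypes[t]; !ok {
		return fmt.Errorf("invalid policy type: %q", t)
	}
	return nil
}

func main() {
	for _, t := range []string{"CertificateLifetime", "ownership", "allowedissuers"} {
		fmt.Println(t, ValidatePolicyType(t))
	}
}
```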

OpenAPI (api/openapi.yaml):
  * PolicyType enum grown to match domain.

Frontend (web/src/api/types.ts, types.test.ts):
  * POLICY_TYPES tuple gains CertificateLifetime; pin test asserts
    all 6 canonicals and rejects casing drift.

Migration 000014 (policy_violations severity CHECK):
  * Named CHECK constraint (policy_violations_severity_check)
    mirroring 000013's allowlist, defense-in-depth at the DB layer
    against future drift from bypassed writes (migrations, psql
    sessions, future callers). Symmetric down migration drops by
    name.

Seed data:
  * migrations/seed.sql rewritten to emit TitleCase canonicals with
    per-arm config JSON that actually exercises the config-consuming
    paths (not the missing-field backstops):
      - pr-require-owner         → RequiredMetadata     {"required_keys":["owner"]}                        Warning
      - pr-allowed-environments  → AllowedEnvironments  {"allowed":["production","staging","development"]} Error
      - pr-max-certificate-lifetime → CertificateLifetime {"max_days":90}                                   Critical
      - pr-min-renewal-window    → RenewalLeadTime      {"lead_time_days":14}                              Warning
    Severities are now differentiated per rule (D-006 intent).
  * migrations/seed_demo.sql violation rows flipped to TitleCase
    severity ('Critical', 'Error', 'Warning') so migration 000014
    applies cleanly on upgrade paths.

Engine rewrite (internal/service/policy.go):
  * evaluateRule rewritten. All six arms now:
      1. Parse rule.Config into the per-arm typed struct.
      2. Bad JSON → log at ValidateCertificate boundary and skip
         this rule (no co-located poisoning of other rules in the
         same batch).
      3. Empty/null Config → emit the pre-D-008 missing-field
         violation (backwards compat invariant — operators who
         haven't reconfigured still see the same output).
      4. Violations emitted carry rule.Severity (no more hardcoded
         Warning); D-006 column is now load-bearing.
  * CertificateLifetime arm reads NotBefore/NotAfter from the
    certificate's latest version via CertRepo. Injected via
    PolicyService.SetCertRepo() setter — avoids churning ~36
    NewPolicyService call sites while keeping the lifetime arm
    optional (degrades to a log+skip if the setter is not wired).
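
The per-arm steps above can be sketched for the CertificateLifetime arm as follows. This is a simplified illustration: Rule, Violation, evaluateLifetime, window, and ErrBadConfig are hypothetical, the cert repository lookup is replaced by explicit timestamps, and the empty-config message is invented.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"time"
)

type Rule struct {
	Name     string
	Severity string
	Config   json.RawMessage
}

type Violation struct {
	Rule     string
	Severity string
	Message  string
}

type lifetimeConfig struct {
	MaxDays int `json:"max_days"`
}

// ErrBadConfig tells the caller to log and skip this rule only.
var ErrBadConfig = errors.New("malformed rule config")

// evaluateLifetime parses the config, handles empty and malformed input,
// and emits violations that carry the rule's own severity.
func evaluateLifetime(rule Rule, notBefore, notAfter time.Time) (*Violation, error) {
	if len(rule.Config) == 0 || string(rule.Config) == "null" {
		return &Violation{Rule: rule.Name, Severity: rule.Severity, Message: "max_days is not configured"}, nil
	}
	var cfg lifetimeConfig
	if err := json.Unmarshal(rule.Config, &cfg); err != nil {
		return nil, fmt.Errorf("%w: %v", ErrBadConfig, err)
	}
	limit := time.Duration(cfg.MaxDays) * 24 * time.Hour
	if notAfter.Sub(notBefore) > limit {
		msg := fmt.Sprintf("certificate lifetime exceeds %d days", cfg.MaxDays)
		return &Violation{Rule: rule.Name, Severity: rule.Severity, Message: msg}, nil
	}
	return nil, nil
}

// window returns a validity period of the given length in days (UTC).
func window(days int) (time.Time, time.Time) {
	start := time.Date(2026, time.January, 1, 0, 0, 0, 0, time.UTC)
	return start, start.AddDate(0, 0, days)
}

func main() {
	rule := Rule{Name: "pr-max-certificate-lifetime", Severity: "Critical", Config: []byte(`{"max_days":90}`)}
	nb, na := window(91)
	v, err := evaluateLifetime(rule, nb, na)
	fmt.Println(v != nil, err)
}
```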

Server wiring (cmd/server/main.go):
  * policyService.SetCertRepo(certRepo) wired after construction.

Tests (internal/service/policy_test.go):
  * 25 new subtests across 5 groups:
      - TestEvaluateRule_SeverityPassThrough (6): every rule type
        emits violations carrying rule.Severity, not hardcoded.
      - TestEvaluateRule_ConfigConsumed (12): every per-arm Config
        path exercised positive + negative.
      - TestEvaluateRule_EmptyConfig_BackCompat (3): empty/null
        Config still emits pre-D-008 missing-field violations.
      - TestEvaluateRule_BadConfig_SkipsRule: malformed JSON logs
        and skips cleanly without poisoning neighbors.
      - TestEvaluateRule_CertificateLifetime_RepoScenarios (3):
        ok when repo wired, log+skip when not, handles missing
        NotBefore/NotAfter edges.

Provenance: D-008 surfaced during D-005/D-006 remediation review
in eef1db0. That commit added persistence and CI pins for the
severity field but did not re-verify the evaluation layer
consumed it; this finding and fix close the audit-process gap.
2026-04-18 14:55:56 +00:00
shankar0123 eef1db0f0a fix(policies): stop 400ing the "+ New Policy" button + add per-rule severity (D-005, D-006)
Coverage Gap Audit findings D-005 (P0) + D-006 (P1) fixed together in a
single commit because they share the same root cause — policy CRUD sending
values the backend silently rejects — and splitting them would leave a
half-working UI between commits.

## D-005 (P0): PoliciesPage dropdown 400s every Create Policy

Root cause
----------
`web/src/pages/PoliciesPage.tsx` populated the Type `<select>` from a
hardcoded `['key_algorithm', 'ownership', 'allowed_issuers', ...]` array.
The backend's `internal/api/handler/validators.go::ValidatePolicyType`
enforces the TitleCase allowlist `AllowedIssuers`, `AllowedDomains`,
`RequiredMetadata`, `AllowedEnvironments`, `RenewalLeadTime` — defined in
`internal/domain/policy.go`. Every Create Policy request was rejected with
`400 invalid policy type`. The error surfaced only as a transient toast;
the modal closed anyway, so the failure was effectively silent to the user.

Fix
---
- `web/src/api/types.ts`: added `POLICY_TYPES` and `POLICY_SEVERITIES`
  tuples with `as const` and narrowed `PolicyRule.type`, `.severity`, and
  `PolicyViolation.severity` to the literal-union types. Dropdown is now
  sourced from the tuple; casing drift becomes a compile error.
- `web/src/pages/PoliciesPage.tsx`: rekeyed `severityStyles` /
  `severityDots` to the TitleCase values, added `humanize()` for display
  (AllowedIssuers → "Allowed Issuers"), removed the `badge-neutral`
  fallback that was papering over the mismatch.
- `web/src/api/types.test.ts` (new): pins both tuples exactly. If anyone
  edits one side of the frontend/backend contract without the other, CI
  fails with a clear assertion. Pure-TS vitest, no RTL dependency.

## D-006 (P1): `severity` field silently dropped on create/update

Root cause
----------
`PolicyRule` had no `Severity` field in `internal/domain/policy.go`. The
frontend has always sent `severity` on create/update, but Go's
`json.Decoder` (default settings, no `DisallowUnknownFields`) silently
dropped it. The value never reached PostgreSQL. Every rule rendered with
the same severity because there was no severity — just a display
computation downstream.
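
The decoder behavior behind this root cause can be reproduced in isolation. A small sketch; the policyRule struct and decode helper are hypothetical, while DisallowUnknownFields is the standard encoding/json option.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// policyRule mirrors the pre-fix shape: no severity field.
type policyRule struct {
	Name string `json:"name"`
	Type string `json:"type"`
}

// decode parses body; in strict mode unknown JSON fields are an error.
func decode(body string, strict bool) (policyRule, error) {
	dec := json.NewDecoder(strings.NewReader(body))
	if strict {
		dec.DisallowUnknownFields()
	}
	var r policyRule
	err := dec.Decode(&r)
	return r, err
}

func main() {
	body := `{"name":"demo","type":"AllowedIssuers","severity":"Critical"}`
	_, lenient := decode(body, false)
	_, strict := decode(body, true)
	fmt.Println("default decoder error:", lenient)
	fmt.Println("strict decoder error: ", strict)
}
```

With default settings the severity key is discarded without an error, which matches the silent drop described above.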

Fix: option (b), full-stack schema add (not delete-the-field)
-------------------------------------------------------------
- Migration `000013_policy_rule_severity` (up + down): adds
  `severity VARCHAR(50) NOT NULL DEFAULT 'Warning'` to `policy_rules` with
  CHECK constraint `severity IN ('Warning', 'Error', 'Critical')`. No
  index — three-value column on a low-thousands-rows table, planner will
  seq-scan regardless. PG 11+ metadata-only ADD COLUMN, safe on live data.
- `internal/domain/policy.go`: added `Severity PolicySeverity` field.
- `internal/repository/postgres/policy.go`: plumbed `severity` through
  ListRules SELECT + Scan, GetRule SELECT + Scan, CreateRule INSERT,
  UpdateRule UPDATE (4 queries).
- `internal/service/policy.go::UpdatePolicy`: if the client omits
  severity on a PUT (zero-value empty string), fetch the existing rule
  and preserve its severity. Without this, partial updates would trip the
  NOT NULL CHECK and 500. Preserves pre-existing behavior for Name/Type
  (out of scope).
- `internal/api/handler/policies.go::CreatePolicy`: default empty severity
  to `'Warning'`, then validate via `ValidatePolicySeverity`. 400 with
  clear message instead of 500 on CHECK violation. `UpdatePolicy`:
  validates severity only when provided.
- `internal/mcp/types.go` + `internal/mcp/tools.go`: added optional
  `severity` on the MCP `create_policy` / `update_policy` tool inputs so
  LLM callers stay in sync with the wire contract.
- `api/openapi.yaml`: added `severity` to the `PolicyRule` schema with
  the enum and default.

Acceptance criterion (user-defined)
-----------------------------------
"Create a rule with severity=Critical, reload the page, and still see
Critical — no silent drops." Verified end-to-end: frontend sends
`severity: "Critical"`, handler validates, service persists, DB stores,
GET returns, React renders the correct badge.

Seed data
---------
`migrations/seed.sql`: four demo rules now have differentiated severities
— `pr-require-owner` → Warning, `pr-allowed-environments` → Error,
`pr-max-certificate-lifetime` → Critical, `pr-min-renewal-window` →
Warning. The user called out that seeding all four at the same severity
makes the feature look decorative; differentiation demonstrates the
column carries real signal.

## Integration test fix (side effect of D-006)

`internal/integration/e2e_test.go::TestCrossResourceWorkflow/CreatePolicy`
was sending `"severity": "High"` — a value from the pre-audit severity
vocabulary that the new `ValidatePolicySeverity` correctly rejects with
400. Changed to `"Error"` (closest semantic match in the new TitleCase
allowlist). Only severity reference in the integration/ directory;
verified via grep.

## Out of scope, logged for follow-up (d/D-008)

Three policy-engine drift issues orthogonal to D-005 + D-006, explicitly
deferred per direction:

1. `migrations/seed.sql` policy_rules INSERTs use lowercase TYPE values
   (`'ownership'`, `'environment'`, `'lifetime'`, `'renewal_window'`).
   These are load-bearing on `internal/service/policy.go::evaluateRule`'s
   `switch rule.Type` (which also uses the lowercase strings). Migrating
   requires coordinated changes across seed + evaluation engine.
2. `migrations/seed_demo.sql:482-483` contains lowercase `'critical'`
   severity — will now fail the new CHECK constraint. Separate fix.
3. `evaluateRule` hardcodes `Severity: domain.PolicySeverityWarning` on
   emitted violations and ignores the configured `rule.Config`. The new
   severity column is read correctly on the CRUD path but not yet
   consulted during evaluation.

## Verification

Backend:
- `go build ./...` — clean
- `go vet ./...` — clean
- `go test -short ./...` — all packages green, including
  `internal/service` (policy service), `internal/api/handler` (policy +
  MCP handler tests), `internal/integration` (e2e_test.go after fix),
  `internal/domain`, `internal/repository/postgres`.

Frontend:
- `tsc --noEmit` — clean
- `vitest run` — 223/223 passing (4 new assertions in types.test.ts)
- `vite build` — clean (only the pre-existing chunk-size warning)
2026-04-18 13:02:04 +00:00
shankar0123 72f5246ce3 Merge branch 'fix/m11-cosign-v3-sign-blob-bundle': M-11 cosign v3 sign-blob migration 2026-04-18 09:29:25 +00:00
shankar0123 cb308bb4c7 ci(release): migrate cosign sign-blob to --bundle (cosign v3.0)
Cosign v3.0 (shipped by default with sigstore/cosign-installer@cad07c2e,
release v3.0.5) removed --output-signature and --output-certificate from
the sign-blob subcommand. The replacement is a single --bundle flag that
emits a unified Sigstore bundle (.sigstore.json) containing the
signature, certificate chain, and Rekor inclusion proof in one file.

This change migrates both sign-blob invocations in .github/workflows/
release.yml (per-binary matrix signing and aggregate checksums.txt
signing), updates the artefact upload paths, the artefact aggregation
case filter, the GitHub Release asset list, and the release-notes body
verify-blob example. The README cosign verification snippet and sidecar
description are also updated to the --bundle / .sigstore.json shape.

No cosign version pinning. No legacy fallback. OCI image signing
(cosign sign on image digest) is unchanged — only sign-blob flags
changed in v3.0. See M-11 in certctl-audit-report.md.

Verification gates:
- YAML parse: OK
- go vet ./...: exit 0
- go build ./...: exit 0
- grep 'cosign sign-blob' release.yml: 2 (expected: 2)
- grep '.sigstore.json' release.yml: 9 (expected: >=5)
- grep '.sig/.pem' release.yml non-comment: 0 (expected: 0)
- README legacy cosign refs: 0 (expected: 0)
- docs/ legacy cosign refs: 0 (expected: 0)

Coverage: unchanged (CI workflow edit + README — zero Go code touched).
2026-04-18 09:29:20 +00:00
shankar0123 ad93e99158 Merge branch 'fix/m10-openapi-spec-drift': M-10 OpenAPI spec drift reconciliation 2026-04-18 03:21:45 +00:00
shankar0123 9d0c3dfa15 docs(openapi): reconcile api/openapi.yaml with router routes (M-10)
Add 9 missing operations to api/openapi.yaml that exist in router.go but
were absent from the spec. Spec-only change with no runtime Go code
changes; all 106 pre-existing operationIds preserved byte-identical.

New operationIds:
  - testTargetConnection (POST /api/v1/targets/{id}/test)
  - verifyDeployment    (POST /api/v1/jobs/{id}/verify)
  - getJobVerification  (GET  /api/v1/jobs/{id}/verification)
  - estCACerts          (GET  /.well-known/est/cacerts)
  - estSimpleEnroll     (POST /.well-known/est/simpleenroll)
  - estSimpleReEnroll   (POST /.well-known/est/simplereenroll)
  - estCSRAttrs         (GET  /.well-known/est/csrattrs)
  - scepGet             (GET  /scep)
  - scepPost            (POST /scep)

Spec operations: 106 → 115 (matches 115 router routes exactly).

Verification:
  - openapi-spec-validator: OK
  - go build ./...: clean
  - go vet ./...:   clean
  - go test -race -count=1 -short ./...: 54 packages ok, 0 FAIL
  - golangci-lint run ./...: 0 issues
  - govulncheck ./...: 0 vulnerabilities in our code
  - tsc --noEmit: 0 errors
  - vitest run: 3 files, 218 tests passed

sha256 before: 7c14f77107a86f8de82fe91b7f5e16cca11206d1e1fab7b7bd77ff396620fdf3
sha256 after:  87bd92d0407d63643bec612d27261bf489563beb90d0791ea71cde26346f83d3
2026-04-18 03:21:40 +00:00
shankar0123 2c9602db71 Merge branch 'fix/m9-sentinel-discovery-log-levels': M-9 sentinel discovery log-level fix 2026-04-18 02:53:50 +00:00
shankar0123 ef670fa6da fix(m-9): aggregate per-endpoint scan errors in NetworkScanService
Before this fix, RunScan declared `scanErrors []string` but never
appended to it. As a result:

  - the summary Info log ("network target scan completed") always
    reported `"errors": 0`, regardless of how many endpoints failed
  - the DiscoveryReport's `Errors` field — stored on the scan record
    and surfaced in the GUI scan history — was always nil

Operators who needed to understand scan failures had to enable Debug
logging and grep through the noise of expected sweep-scan connection
refusals. The per-endpoint log level (Debug) is deliberate and correct
— scanning a /24 typically produces 200+ connection-refused results,
and logging each at Warn would create massive log spam at default
verbosity. The bug was the silent loss of the aggregate count.

This commit:

  - extracts the partitioning logic into `collectScanResults`, a pure
    method that splits per-endpoint results into discovered certificate
    entries and a list of endpoint error strings
  - populates the errors list with "<address>: <error>" so the scan
    record correlates failures back to specific endpoints
  - preserves the existing Debug-level per-endpoint log (sweep noise
    discipline) — no change to default-verbosity log output

The summary Info log's "errors" field and the DiscoveryReport's Errors
field now reflect the true failure count. Debug detail remains
available for operators diagnosing specific endpoints.

Audit scope note: the M-9 finding narrative implied broad Debug-level
hiding of real errors across AWS SM, Azure KV, GCP SM, and network
scan sentinel agents. On investigation, the three cloud-discovery
connectors (awssm, azurekv, gcpsm) already use appropriate Warn/Error
discipline for per-item and root-level failures. Only the network
scanner had a silent observability gap, and it was a missed append
rather than a misapplied log level. See audit resolution log for
full details.

CWE: CWE-778 (Insufficient Logging) — aggregate failure count lost.

Tests: 4 new unit tests on collectScanResults covering the
aggregation path (success + failure mix), all-success, all-failed,
and empty-input degenerate cases. All tests pass with -race.

Verification:
  - go build ./cmd/server/... ./cmd/agent/... ./cmd/mcp-server/... ./cmd/cli/...  exit 0
  - go vet ./...                                                                    exit 0
  - go test -race -count=1 -timeout 300s [full CI race path]                        exit 0
  - golangci-lint run ./... --timeout 5m (v2.11.4)                                  0 issues
  - govulncheck ./... (@latest)                                                     0 in-code vulnerabilities
  - go test -count=1 -cover ./internal/service/...                                  68.0% (> 55% threshold)

Invariants preserved:
  - collectScanResults signature: method on *NetworkScanService,
    input []domain.NetworkScanResult, return ([]DiscoveredCertEntry, []string)
  - Debug log key names unchanged ("address", "error")
  - DiscoveryReport schema unchanged (Errors field already existed)
  - Sentinel agent ID "server-scanner" unchanged
  - No migration, no API, no wire-format change

Refs: M-9 Medium finding; audit resolution log appended in follow-up
commit on workspace-level audit report.
2026-04-18 02:34:14 +00:00
shankar0123 5a6ec39cfd Merge branch 'fix/m2-pr-f-scheduler-contextcheck-audit-closeout' 2026-04-18 01:43:56 +00:00
shankar0123 e3196e7b50 M-2 PR-F: Middleware/ACME ctx-propagation + contextcheck linter + audit closeout
Final PR in the six-commit M-2 sequence (PR-A: CertificateService cluster
cdc9d03, PR-B: IssuerService+TargetService eb14236, PR-C: Policy/Profile/
Owner/Team 2497be4, PR-D: Job/Notification/Audit ccd89c3, PR-E: AgentService
283ec27, PR-F: this commit). PR-A through PR-E collapsed the service-layer
shim methods and deleted every in-production context.Background() /
context.TODO() call from internal/service/; this PR completes the sweep
across the non-service tiers (HTTP middleware + ACME connector) and wires
the contextcheck linter so regressions fail CI.

Three narrow edits land the D-3 pattern (context.WithoutCancel for
subsidiary async writes and deferred shutdown contexts):

  - internal/api/middleware/audit.go  -- async audit goroutine now runs
    on auditCtx := context.WithoutCancel(r.Context()) instead of
    context.Background(). Preserves request-scoped values (trace ID, auth)
    while detaching from the request's cancellation so the audit write
    does not get killed when the response completes. Goroutine is still
    tracked via a.wg (M-1 shutdown drain) so Flush(ctx) behaviour is
    unchanged. CWE-772 Missing Release (goroutine leak potential) +
    CWE-400 Resource Exhaustion (missed cancellation propagation).

  - internal/api/middleware/middleware.go -- Recovery panic path now
    logs via slog.ErrorContext(ctx, ...) instead of log.Printf. Request-
    scoped trace/auth metadata now carries through the panic log, matching
    every other request log. D-3 non-bypass: the context is r.Context()
    captured before the defer, so even a panic mid-handler propagates
    the ctx's trace ID into the ERROR log line.

  - internal/connector/issuer/acme/acme.go (HTTP-01 challenge server
    shutdown) -- defer shutdown context derived from
    context.WithTimeout(context.WithoutCancel(ctx), 5s) instead of
    context.Background(). Preserves parent ctx values, detaches from
    parent cancellation so Shutdown always gets its full 5-second
    budget even when the parent was cancelled. Matches the same pattern
    applied in ACME's solveAuthorizationsDNS01 and solveAuthorizationsDNSPersist01.

Linter wiring: .golangci.yml adds `contextcheck` to the enabled set.
golangci-lint v2.11.4 now fails CI on any function that takes a
context.Context parameter but calls into context.Background() or
context.TODO() instead of propagating -- regression guard for all five
prior PRs.

Verification (CI parity, GOCACHE=/tmp/gocache GOMODCACHE=/tmp/gomodcache
GOLANGCI_LINT_CACHE=/tmp/lintcache):

  - go build ./... -> 0
  - go vet ./... -> 0
  - golangci-lint run (contextcheck enabled) -> 0 issues
  - go test -race -short ./internal/api/middleware/... -> PASS
  - go test -race -short ./internal/scheduler/... -> PASS
  - go test -race -short ./internal/connector/issuer/acme/... -> PASS
  - go test -race -short ./internal/service/... -> PASS
  - rg "context\.(Background|TODO)\(\)" internal/service/ internal/scheduler/
    internal/connector/ internal/api/middleware/ -> 0 non-test hits
    (one pedagogical godoc reference in audit.go documenting why
    context.Background() would be wrong remains intentional)

Wire-format invariants preserved: 0 API routes, 0 SQL migrations, 0
frontend bytes, 0 OpenAPI bytes, 0 connector interface signature changes,
0 new env vars, 0 new external dependencies (pure context stdlib). The
AuditRecorder interface signature, the body-hash algorithm (SHA-256 16
hex chars), the excluded-path short-circuit, the actor-extraction path,
the responseWriter status-capture wrapper, the AuditServiceAdapter, and
all 116 API routes under /api/v1/, /.well-known/est/, /scep, /health,
/auth are byte-identical.

M-2 aggregate across PR-A through PR-F: 57 files, +635 / -613 (PR-A 12f
+227/-237, PR-B 9f +150/-146, PR-C 17f +156/-148, PR-D 11f +67/-63,
PR-E 4f +9/-15, PR-F 4f +26/-4). With M-2 closed, 8 of 10 Medium
findings are resolved; M-9, M-10, L-1..L-4, and I-1..I-8 remain for the
post-v2.1.0 hardening batch.

Audit complete. Commit: 1f6cf0eafa. Sections: 12. Findings: 2/7/10/4/6.
2026-04-18 01:43:47 +00:00
shankar0123 bea69efd12 Merge branch 'fix/m2-pr-e-agent-service'
PR-E of 6: AgentService ctx-first collapse.

Collapses the HeartbeatWithContext wrapper into a single Heartbeat
method. Handler-facing method name is preserved (D-4); the handler
service interface and mock already expected ctx-first, so this PR
touches only the service layer and its tests (4 files, 9+/15-).

Verification on the feature branch: build, vet, test (-short),
test -race, full-module test -short, and golangci-lint all clean.

2026-04-18 01:25:30 +00:00
shankar0123 283ec27ca4 fix(m2-pr-e): collapse AgentService.HeartbeatWithContext into Heartbeat
PR-E of 6 in the M-2 end-to-end remediation sequence. Collapses the
HeartbeatWithContext wrapper into a single ctx-first Heartbeat method,
matching D-1 (ctx-only signatures, no dual forms). The handler-facing
method name is preserved (D-4) — internal/api/handler/agents.go already
declares `Heartbeat(ctx, ...)` on its local service interface, and the
handler mock at internal/api/handler/agent_handler_test.go already
takes `_ context.Context` as its first param, so no handler churn.

Changes
-------
internal/service/agent.go
  - Delete the zero-body Heartbeat wrapper that forwarded to
    HeartbeatWithContext with context.Background().
  - Rename HeartbeatWithContext → Heartbeat (ctx-bearing body
    folded directly into the canonical method).

internal/service/agent_test.go
  - TestHeartbeat (L95) and TestHeartbeat_NotFound (L128):
    agentService.HeartbeatWithContext(ctx, ...) → .Heartbeat(ctx, ...).

internal/service/concurrent_test.go
  - L162: agentSvc.HeartbeatWithContext(ctx, agentID, metadata)
    → .Heartbeat(ctx, agentID, metadata).

internal/service/context_test.go
  - L179 + L232: agentSvc.HeartbeatWithContext(ctx, ...) → .Heartbeat(...)
  - L185 + L238 t.Logf strings: "HeartbeatWithContext with ..." →
    "Heartbeat with ..." to match the collapsed method name.

Verification (Go 1.25.9 linux/arm64, CI-parity caches)
------------------------------------------------------
  go build ./...                 clean
  go vet ./...                   clean
  go test -short ./internal/service/... ./internal/api/handler/... \
    ./internal/integration/...   all ok
  go test -race -short same set  all ok
  go test -short ./...           all packages ok
  golangci-lint run ./...        0 issues

Locked decisions from the M-2 plan:
  D-1 ctx-only signatures (no dual forms)
  D-4 preserve handler method names facing the router
  D-5 domain types stay ctx-free

Audit complete. Commit: 1f6cf0eafa. Sections: 12. Findings: 2/7/10/4/6.
2026-04-18 01:25:20 +00:00
shankar0123 a67a6b6c30 Merge branch 'fix/m2-pr-d-job-notification-audit'
PR-D: Thread ctx through Job + Notification + Audit service cluster.
Collapse CancelJobWithContext into CancelJob; eliminate 10
context.Background() hits.

Audit complete. Commit: 1f6cf0eafa. Sections: 12. Findings: 2/7/10/4/6.
2026-04-18 01:20:58 +00:00
shankar0123 ccd89c348f fix(m2-pr-d): thread ctx through Job/Notification/Audit services
Collapse CancelJobWithContext into CancelJob; eliminate 10 context.Background()
hits across the Job+Notification+Audit service cluster by threading ctx
through their handler-facing service interfaces.

Services (ctx-first):
- service/job.go: ListJobs, GetJob, CancelJob, ApproveJob, RejectJob now
  accept ctx; the CancelJobWithContext wrapper is removed (handler callers
  continue to invoke CancelJob, now ctx-aware).
- service/notification.go: ListNotifications, GetNotification, MarkAsRead
  accept ctx.
- service/audit.go: ListAuditEvents, GetAuditEvent accept ctx.

Handlers (interface + callsites):
- handler/jobs.go, handler/notifications.go, handler/audit.go: local
  service interfaces updated, r.Context() threaded at every callsite.

Tests:
- Mock services updated to match the new interfaces (ctx accepted and
  ignored via '_ context.Context' first parameter; Fn closure fields
  unchanged).
- job_test.go / notification_test.go callsites thread context.Background()
  to match production shape.
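
A minimal sketch of the mock shape described above, assuming a reduced,
hypothetical JobService interface with a single method (the real
interfaces and Fn field names are not shown in this message). The mock
accepts ctx as `_ context.Context` and forwards to an unchanged closure:

```go
package main

import (
	"context"
	"fmt"
)

// Job is a hypothetical, reduced domain type for illustration.
type Job struct{ ID, Status string }

// JobService is an assumed handler-local interface with ctx first.
type JobService interface {
	GetJob(ctx context.Context, id string) (*Job, error)
}

// MockJobService keeps its closure field ctx-free; only the method
// signature changes to satisfy the ctx-first interface.
type MockJobService struct {
	GetJobFn func(id string) (*Job, error)
}

// GetJob accepts and ignores ctx, then forwards to the closure.
func (m *MockJobService) GetJob(_ context.Context, id string) (*Job, error) {
	return m.GetJobFn(id)
}

// Compile-time check that the mock still satisfies the interface.
var _ JobService = (*MockJobService)(nil)

// lookupStatus plays the role of a test callsite that threads
// context.Background() to match the production call shape.
func lookupStatus(svc JobService, id string) string {
	job, err := svc.GetJob(context.Background(), id)
	if err != nil {
		return "error"
	}
	return job.Status
}

func main() {
	mock := &MockJobService{GetJobFn: func(id string) (*Job, error) {
		return &Job{ID: id, Status: "Pending"}, nil
	}}
	fmt.Println(lookupStatus(mock, "job-1"))
}
```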

Verification:
  go build ./...                 ok
  go vet ./...                   ok
  go test -short ./...           ok
  go test -race -short ./...     ok
  golangci-lint run ./...        0 issues

Locked decisions from the M-2 plan:
  D-1 ctx-only signatures (no dual forms)
  D-4 preserve handler method names facing the router
  D-5 domain types stay ctx-free

Audit complete. Commit: 1f6cf0eafa. Sections: 12. Findings: 2/7/10/4/6.
2026-04-18 01:20:46 +00:00
shankar0123 478a141498 Merge branch 'fix/m2-pr-c-crud-cluster' 2026-04-18 01:10:10 +00:00
shankar0123 2497be496d M-2 PR-C: Collapse Policy/Profile/Owner/Team services to ctx-first signatures
- Add ctx first param to 21 service-layer handler-interface methods
  across policy.go (6), profile.go (5), owner.go (5), team.go (5)
- Replace 24 context.Background() call sites with received ctx; use
  context.WithoutCancel(ctx) for subsidiary audit-recording ops to
  preserve fire-and-forget audit semantics without inheriting caller
  cancellation
- Add ctx first param to 21 handler-interface method signatures across
  policies.go (6), profiles.go (5), owners.go (5), teams.go (5)
- Thread r.Context() through 21 HTTP handler sites (ListPolicies,
  GetPolicy, CreatePolicy, UpdatePolicy, DeletePolicy, ListViolations,
  ListProfiles, GetProfile, CreateProfile, UpdateProfile, DeleteProfile,
  ListOwners, GetOwner, CreateOwner, UpdateOwner, DeleteOwner,
  ListTeams, GetTeam, CreateTeam, UpdateTeam, DeleteTeam)
- Update MockPolicyService/MockProfileService/MockOwnerService/
  MockTeamService mock method impls with _ context.Context first param
  (Fn fields unchanged — closures do not need ctx); update mock impls
  in integration/lifecycle_test.go for all four services
- Update 25 service-layer test callsites (policy_test.go ×2,
  owner_test.go ×5, team_test.go ×5, profile_test.go ×13) to pass
  context.Background() at the call site

Audit complete. Commit: 1f6cf0eafa. Sections: 12. Findings: 2/7/10/4/6.
2026-04-18 01:10:06 +00:00
shankar0123 25dd6c07f3 Merge branch 'fix/m2-pr-b-issuer-target' 2026-04-18 00:47:02 +00:00
shankar0123 eb14236166 M-2 PR-B: Collapse IssuerService + TargetService to ctx-first signatures
- Delete bare TestConnection wrapper in IssuerService; rename
  TestConnectionWithContext → TestConnection
- Delete TestTargetConnection delegate shim in TargetService (canonical
  TestConnection already ctx-first)
- Add ctx first param to 10 handler-interface methods
  (ListIssuers/GetIssuer/CreateIssuer/UpdateIssuer/DeleteIssuer and
  ListTargets/GetTarget/CreateTarget/UpdateTarget/DeleteTarget)
- Replace 16 context.Background() call sites with received ctx
- Thread r.Context() through 12 HTTP handler sites in issuers.go and
  targets.go (outer TargetHandler.TestTargetConnection HTTP method name
  preserved for router compatibility)
- Update MockIssuerService, MockTargetService, and mockTargetService
  (integration) for ctx-first forwarding; update test callsite literals

Audit complete. Commit: 1f6cf0eafa. Sections: 12. Findings: 2/7/10/4/6.
2026-04-18 00:46:58 +00:00
shankar0123 bbb628243f Merge branch 'fix/m2-pr-a-certificate-cluster' 2026-04-18 00:29:40 +00:00
shankar0123 cdc9d03d5b fix(m-2): thread context through CertificateService cluster
Collapses CertificateService, RevocationSvc, and CAOperationsSvc to
ctx-accepting method signatures. Removes context.Background() synthesis
at 24 internal call sites across certificate.go, revocation_svc.go, and
ca_operations.go.

- Primary repo calls inherit request cancellation via the passed ctx.
- Audit and notification dispatches use context.WithoutCancel(ctx) so
  they survive client disconnect.
- Collapses TriggerRenewal/TriggerRenewalWithActor,
  TriggerDeployment/TriggerDeploymentWithActor, and
  RevokeCertificate/RevokeCertificateWithActor sibling pairs into single
  canonical ctx-accepting methods (decisions D-1, D-2).

Handlers pass r.Context(). Mocks and tests updated to match new
signatures. No HTTP surface change, no OpenAPI change.
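
A minimal sketch of the cancellation split described above, assuming
Go 1.21+ for context.WithoutCancel. The names triggerRenewal,
recordAudit, and actorKey are hypothetical stand-ins, not the real
service code:

```go
package main

import (
	"context"
	"fmt"
)

// actorKey is a hypothetical context key carrying the request actor.
type actorKey struct{}

// recordAudit stands in for an audit or notification dispatch. It fails
// if its context is already cancelled, as a real DB write would.
func recordAudit(ctx context.Context, action string) error {
	if err := ctx.Err(); err != nil {
		return err
	}
	actor, _ := ctx.Value(actorKey{}).(string)
	fmt.Printf("audit: actor=%s action=%s\n", actor, action)
	return nil
}

// triggerRenewal is a hypothetical ctx-first service method. The primary
// call inherits cancellation; the audit dispatch keeps the context
// values but drops the cancellation signal.
func triggerRenewal(ctx context.Context, certID string) (primaryErr, auditErr error) {
	primaryErr = ctx.Err() // stand-in for a repository call using ctx
	auditErr = recordAudit(context.WithoutCancel(ctx), "renewal:"+certID)
	return primaryErr, auditErr
}

// cancelledRequest simulates a request whose client has disconnected.
func cancelledRequest() context.Context {
	base := context.WithValue(context.Background(), actorKey{}, "alice")
	ctx, cancel := context.WithCancel(base)
	cancel()
	return ctx
}

func main() {
	primaryErr, auditErr := triggerRenewal(cancelledRequest(), "cert-1")
	fmt.Println("primary:", primaryErr, "audit:", auditErr)
}
```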

PR 1 of 6 in the M-2 remediation chain. Master green at this commit.

Refs: certctl-audit-report.md M-2 (L143, L224)
2026-04-18 00:29:37 +00:00
shankar0123 e951d319d0 Merge branch 'fix/m1-audit-shutdown-drain'
Resolves M-1 (Medium): Audit recorder shutdown drain.

The API audit middleware's detached recording goroutines now drain
during graceful shutdown via AuditMiddleware.Flush (sync.WaitGroup +
timeout-aware select), called between http.Server.Shutdown and
db.Close. Prevents silent audit-event loss on SIGTERM
(CWE-662 / CWE-400).
2026-04-17 17:29:54 +00:00
shankar0123 d14a45401b fix(audit): drain in-flight recording goroutines on shutdown (M-1)
Audit events spawned from the HTTP middleware ran in detached goroutines
using context.Background(). On SIGTERM the DB pool was closed before
those goroutines finished writing, silently dropping audit events
(CWE-662 Improper Synchronization / CWE-400 Uncontrolled Resource
Consumption).

NewAuditLog now returns an *AuditMiddleware struct that tracks every
spawned goroutine with sync.WaitGroup. Callers wire the middleware via
its Middleware method value (preserves the existing
func(http.Handler) http.Handler shape) and drain the WaitGroup with
Flush(ctx), which blocks until in-flight recordings complete or the
provided context is cancelled — mirroring scheduler.WaitForCompletion.

Flush is invoked in cmd/server/main.go between http.Server.Shutdown
(no new requests accepted) and db.Close (pool torn down). If the
timeout elapses first, it returns ErrAuditFlushTimeout wrapping ctx.Err().

Request-derived inputs (method, path, status) are snapshotted before
the goroutine spawn so the worker does not race with http.Server
reusing r after the handler returns.
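
The drain pattern can be sketched roughly as follows. This is an
illustration under assumptions, not the committed code: the record
helper, flushWithin, and the error text are hypothetical, and only the
WaitGroup + timeout-aware select shape comes from the description above.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"time"
)

// ErrAuditFlushTimeout mirrors the sentinel named in the commit; the
// message text here is an assumption.
var ErrAuditFlushTimeout = errors.New("audit flush timed out")

// AuditMiddleware tracks every spawned recording goroutine.
type AuditMiddleware struct {
	wg sync.WaitGroup
}

// record is a hypothetical helper that runs one audit write in a
// tracked goroutine.
func (m *AuditMiddleware) record(write func()) {
	m.wg.Add(1)
	go func() {
		defer m.wg.Done()
		write()
	}()
}

// Flush blocks until all tracked goroutines finish or ctx is done.
// On timeout the waiter goroutine lingers until the writes complete.
func (m *AuditMiddleware) Flush(ctx context.Context) error {
	done := make(chan struct{})
	go func() {
		m.wg.Wait()
		close(done)
	}()
	select {
	case <-done:
		return nil
	case <-ctx.Done():
		return fmt.Errorf("%w: %w", ErrAuditFlushTimeout, ctx.Err())
	}
}

// flushWithin calls Flush with a deadline given in milliseconds.
func flushWithin(m *AuditMiddleware, ms int) error {
	ctx, cancel := context.WithTimeout(context.Background(), time.Duration(ms)*time.Millisecond)
	defer cancel()
	return m.Flush(ctx)
}

// isFlushTimeout reports whether err wraps ErrAuditFlushTimeout.
func isFlushTimeout(err error) bool { return errors.Is(err, ErrAuditFlushTimeout) }

func main() {
	m := &AuditMiddleware{}
	release := make(chan struct{})
	m.record(func() { <-release }) // simulate a slow audit write
	fmt.Println("early flush timed out:", isFlushTimeout(flushWithin(m, 20)))
	close(release)
	fmt.Println("final flush error:", flushWithin(m, 1000))
}
```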

Tests:
  TestAuditLog_FlushDrainsInFlightGoroutines
  TestAuditLog_FlushTimeoutReturnsErrAuditFlushTimeout

Verification:
  go build ./...                            : 0
  go vet ./...                              : 0
  go test -race -short ./...                : 0 (all packages)
  go test -cover ./internal/api/middleware  : 81.4%
  golangci-lint run                         : 0 issues
  govulncheck ./...                         : 0 vulns in called code
2026-04-17 17:29:48 +00:00
shankar0123 655e2879e6 feat(frontend): add Owner field to OnboardingWizard Certificate step
The first-run onboarding wizard's Certificate step now surfaces an
Owner dropdown (required) alongside Issuer and Profile, matching the
ownership model introduced in M11b. Prevents newly-created certs from
being unowned and bypassing notification routing.

- web/src/pages/OnboardingWizard.tsx: getOwners query, ownerId state,
  Owner <select>, required-field guard (nextDisabled), empty-state link
  to /owners page when no owners exist yet.

Frontend-only change; no backend wiring or schema impact. Separated
from the M-6 sentinel-agent idempotency commit per scope-guard.
2026-04-17 16:55:44 +00:00
shankar0123 e757ef1471 Merge branch 'fix/m6-sentinel-idempotent-create'
Resolves M-6 (Medium): swallowed sentinel agent INSERT errors.
CWE-662 / CWE-209-adjacent.

Shape A: CreateIfNotExists helper + 4 sentinel call sites.
2026-04-17 16:32:12 +00:00
shankar0123 27afa4463d fix(repository): idempotent sentinel agent creation via ON CONFLICT (M-6)
Sentinel agents (server-scanner, cloud-aws-sm, cloud-azure-kv,
cloud-gcp-sm) were created on startup with a plain INSERT whose
duplicate-key error was swallowed unconditionally. That silenced every
other DB failure too (connectivity drop, permissions change, unrelated
constraint violation) — a restart after the first boot quietly
de-fanged cloud discovery and the network scanner (CWE-662, CWE-209-
adjacent).

Shape A: add AgentRepository.CreateIfNotExists using ON CONFLICT (id)
DO NOTHING RETURNING id + sql.ErrNoRows discrimination. This keeps the
strict Create semantics (duplicate-key is an error) intact for real
agent registration and gives sentinels their own idempotent path.
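
A rough sketch of the ON CONFLICT + sql.ErrNoRows discrimination,
using only database/sql. The table and column names in the SQL, the
rowScanner seam, and the fake rows are assumptions for illustration;
the real repository method may differ.

```go
package main

import (
	"context"
	"database/sql"
	"errors"
	"fmt"
)

// insertSentinelSQL uses assumed column names (id, name).
const insertSentinelSQL = `INSERT INTO agents (id, name) VALUES ($1, $2)
ON CONFLICT (id) DO NOTHING RETURNING id`

// rowScanner abstracts *sql.Row so the logic can run without a database.
type rowScanner interface {
	Scan(dest ...any) error
}

// interpretInsert maps the Scan outcome to (created, err):
// a returned id means the row was inserted, sql.ErrNoRows means the
// conflict clause skipped the insert, anything else is a real failure.
func interpretInsert(row rowScanner) (bool, error) {
	var id string
	switch err := row.Scan(&id); {
	case err == nil:
		return true, nil
	case errors.Is(err, sql.ErrNoRows):
		return false, nil
	default:
		return false, fmt.Errorf("create agent if not exists: %w", err)
	}
}

// CreateIfNotExists shows how the helper would wrap a real query.
func CreateIfNotExists(ctx context.Context, db *sql.DB, id, name string) (bool, error) {
	return interpretInsert(db.QueryRowContext(ctx, insertSentinelSQL, id, name))
}

// fakeRow is a test double that returns a fixed Scan error.
type fakeRow struct{ err error }

func (f fakeRow) Scan(...any) error { return f.err }

var (
	insertedRow = fakeRow{}
	conflictRow = fakeRow{err: sql.ErrNoRows}
	brokenRow   = fakeRow{err: errors.New("connection refused")}
)

func main() {
	for _, r := range []fakeRow{insertedRow, conflictRow, brokenRow} {
		created, err := interpretInsert(r)
		fmt.Println("created:", created, "err:", err)
	}
}
```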

- repo: CreateIfNotExists returns (created bool, err error); false,nil
  on pre-existing row; false,wrapped err on anything else.
- interface: CreateIfNotExists added to AgentRepository.
- main.go: 4 sentinel sites log Error/Info/Debug distinctly.
- mocks: service + integration mocks implement the new method.
- tests: 4 new testcontainers integration tests cover first-insert,
  idempotent second-call, concurrent 16-goroutine race (exactly one
  creator, no duplicate-key panic), and pre-cancelled context
  surfacing.

Coverage gates (go test -cover): service 67.6%/55, handler 78.6%/60,
domain 92.7%/40, middleware 80.0%/30, crypto 86.7%/85. Race/vet/
golangci-lint v2.11.4 (0 issues)/govulncheck v1.2.0 clean across all
touched packages.
2026-04-17 16:32:07 +00:00
shankar0123 80450c7180 fix(repository): populate TargetIDs in certificate scan helper (M-7)
scanCertificate never queried the certificate_target_mappings junction
table, so Certificate.TargetIDs was always nil on reads. This silently
broke deployment lookups, bulk revocation filters, cert detail pages,
and any code path that iterated TargetIDs to dispatch target work.

Fix:
- Convert scanCertificate to a receiver method (r *CertificateRepository)
  so it has access to the DB for the secondary junction query.
- Get(): scan the row, then call r.getTargetIDs(ctx, certID) to populate
  TargetIDs with a single targeted query.
- List() and GetExpiringCertificates(): inline the scan loop so we can
  collect all certIDs first, then call getTargetIDsForCertificates once
  with pq.Array(certIDs) to avoid N+1 round-trips. Build a map and
  attach TargetIDs to each certificate in the result set.
- Default TargetIDs to []string{} (not nil) when a cert has no mappings
  so JSON marshals as [] rather than null.
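
The batch-attach step and the empty-slice contract can be sketched as
below. The struct fields, JSON tags, and the mapping type are
hypothetical; the junction rows are passed in directly rather than
queried.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Certificate is a reduced, hypothetical shape with assumed JSON tags.
type Certificate struct {
	ID        string   `json:"id"`
	TargetIDs []string `json:"target_ids"`
}

// mapping is one row of the junction table, as a batched query over
// all collected certificate IDs would return it.
type mapping struct{ CertID, TargetID string }

// attachTargetIDs groups junction rows by certificate and attaches them,
// defaulting to an empty (non-nil) slice when a cert has no mappings.
func attachTargetIDs(certs []*Certificate, rows []mapping) {
	byCert := make(map[string][]string, len(certs))
	for _, r := range rows {
		byCert[r.CertID] = append(byCert[r.CertID], r.TargetID)
	}
	for _, c := range certs {
		if ids, ok := byCert[c.ID]; ok {
			c.TargetIDs = ids
		} else {
			c.TargetIDs = []string{}
		}
	}
}

// toJSON marshals a certificate, returning an empty string on error.
func toJSON(c *Certificate) string {
	b, err := json.Marshal(c)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	certs := []*Certificate{{ID: "a"}, {ID: "b"}}
	attachTargetIDs(certs, []mapping{{"a", "t1"}, {"a", "t2"}})
	for _, c := range certs {
		fmt.Println(toJSON(c))
	}
}
```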

Tests:
- New integration test file certificate_targetids_test.go with 5
  subtests exercising Get / List / GetExpiringCertificates single
  and multi-target cases plus the empty-slice vs nil contract.
- Uses the shared testcontainers-go setupTestDB infrastructure and
  skips under 'go test -short' so CI (which excludes ./internal/repository/...
  from coverage paths anyway) stays green.

Addresses M-7 from certctl-audit-report.md.
2026-04-17 15:41:08 +00:00
shankar0123 c655e0f8c5 fix(crypto/local-ca): reject expired or not-yet-valid sub-CA certificates on disk load (M-5)
loadCAFromDisk now validates the upstream sub-CA certificate's NotBefore
and NotAfter fields before accepting it, returning a fail-closed error
at server startup instead of silently loading an out-of-window CA.

Before this fix, loadCAFromDisk checked BasicConstraints.IsCA and
KeyUsage=CertSign but not the validity window. An expired enterprise
sub-CA (e.g. an ADCS subordinate whose rollover slipped) would load
without warning and the scheduler would mint child certs that every
RFC 5280 path validator rejects — outages show up at relying parties,
not at certctl, and only after thresholds trip.

CWE-672 (Operation on a Resource after Expiration or Release); secondary
CWE-295 (Improper Certificate Validation). Error strings include the CA
subject CommonName and both RFC3339 timestamps so the log line is
actionable in a 3am incident.
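
A minimal sketch of such a validity-window gate. The function name
checkCAValidity, the exact error wording, and the helper functions are
assumptions; only the NotBefore/NotAfter comparison and the
CN-plus-RFC3339 error content follow the description above.

```go
package main

import (
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"time"
)

// checkCAValidity fails closed when now is outside the certificate's
// validity window. Errors name the CN and both RFC3339 timestamps.
func checkCAValidity(cert *x509.Certificate, now time.Time) error {
	cn := cert.Subject.CommonName
	nb := cert.NotBefore.Format(time.RFC3339)
	na := cert.NotAfter.Format(time.RFC3339)
	if now.Before(cert.NotBefore) {
		return fmt.Errorf("sub-CA %q is not yet valid (NotBefore=%s, NotAfter=%s)", cn, nb, na)
	}
	if now.After(cert.NotAfter) {
		return fmt.Errorf("sub-CA %q has expired (NotBefore=%s, NotAfter=%s)", cn, nb, na)
	}
	return nil
}

// checkWindow builds an in-memory certificate whose window is given as
// minute offsets from a fixed "now", then runs the gate against it.
func checkWindow(startMin, endMin int) error {
	now := time.Date(2026, 4, 17, 12, 0, 0, 0, time.UTC)
	cert := &x509.Certificate{
		Subject:   pkix.Name{CommonName: "Example Sub-CA"},
		NotBefore: now.Add(time.Duration(startMin) * time.Minute),
		NotAfter:  now.Add(time.Duration(endMin) * time.Minute),
	}
	return checkCAValidity(cert, now)
}

func main() {
	fmt.Println("expired:      ", checkWindow(-120, -60))
	fmt.Println("not yet valid:", checkWindow(60, 120))
	fmt.Println("barely valid: ", checkWindow(-1, 60))
}
```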

Tests: TestSubCAMode gains three subtests exercising the new gate —
SubCA_ExpiredCert_IsRejected (CA expired 1h ago → error mentions
'expired' and the CN), SubCA_NotYetValid_IsRejected (CA valid +1h →
error mentions 'not yet valid' and the CN), and SubCA_BarelyValid_IsAccepted
(CA valid [now-1m, now+1h] → issuance succeeds, proving no
over-rejection). Adds generateTestSubCAWithValidity helper; the
original generateTestSubCA wrapper preserves the [now, now+5y] default
for existing tests.

Package coverage: 67.7% -> 68.3%.

Verification: go build, go vet, go test -race, go test -cover all
green locally; golangci-lint v2.11.4 clean; govulncheck clean. All CI
coverage floors met with margin (service 67.6/55, handler 78.6/60,
domain 92.7/40, middleware 80.0/30, crypto 86.7/85).

Parent: 5abeeb8 (M-8 per-ciphertext salt).
Closes: audit finding M-5 in certctl-audit-report.md.
2026-04-17 14:10:23 +00:00
shankar0123 5abeeb882b fix(crypto): per-ciphertext PBKDF2 salt + v2 versioned format with v1 fallback (M-8) 2026-04-17 05:36:29 +00:00
shankar0123 b1df6dab27 ci(release): add CLI/MCP binaries, checksums, SBOM, Cosign, SLSA provenance (M-3) 2026-04-17 04:04:55 +00:00
shankar0123 672e1d991d build: propagate HTTP_PROXY/HTTPS_PROXY/NO_PROXY through Docker build (M-4, Issue #9)
Addresses Medium finding M-4 in the audit report. The multi-stage
Dockerfiles previously had no ARG declarations for HTTP_PROXY,
HTTPS_PROXY, or NO_PROXY, so corporate-proxy environments silently
failed at 'npm ci' (frontend stage) and 'go mod download' (Go builder).
The npm retry idiom (`npm ci --include=dev || npm ci --include=dev`)
masked the failure because the upstream 'Exit handler never called!'
bug exits 0 despite the install crash.

Fix: thread HTTP_PROXY / HTTPS_PROXY / NO_PROXY ARGs through every
Docker build stage that performs network I/O, re-export them as ENV
with both upper- and lower-case aliases (apk/curl/npm read lowercase;
Go/Node read uppercase), and forward the host shell's environment via
`build.args:` in every compose file and `build-args:` in the release
workflow's docker/build-push-action steps. Defaults are empty strings
so un-proxied builds remain byte-identical to the pre-fix tree.

Scope: Dockerfile (frontend + Go builder stages), Dockerfile.agent
(Go builder stage), deploy/docker-compose.yml (server + agent),
deploy/docker-compose.dev.yml (server + agent), deploy/docker-compose.test.yml
(server + agent), .github/workflows/release.yml (both docker/build-push-action
v6 invocations). Zero Go, web, test, or runtime code changes. Zero
base-image changes. Existing npm `||` retry idiom and `ARG TARGETARCH`
preserved verbatim.

CWE-1173 (Improper Use of Validation Framework) / CWE-16 (Configuration).

Verification:
- YAML parses clean across all three compose files and release.yml.
- yamllint -d relaxed: clean exit across all four YAML files.
- All six `build.args:` blocks expose HTTP_PROXY, HTTPS_PROXY, NO_PROXY
  with default-empty ${VAR:-} substitution.
- Both release.yml docker/build-push-action steps expose the same
  three keys sourced from ${{ secrets.HTTP_PROXY }}, etc.
- Dockerfiles contain 9 proxy ARG declarations total (Dockerfile has 2
  stages × 3 ARGs = 6 lines, Dockerfile.agent has 1 stage × 3 ARGs = 3
  lines); lowercase ENV aliases verified present in every stage.
- git diff --shortstat: 6 files changed, 117 insertions(+), 0 deletions.
  Pure additive.

Docker-live verification (`docker build`, `docker compose config`)
deferred to CI / post-commit smoke because the sandbox has no Docker
runtime. hadolint, go, golangci-lint, govulncheck likewise unavailable
in the sandbox; per-layer CI coverage gates (service 55%, handler 60%,
domain 40%, middleware 30%) are trivially unaffected as M-4 touches
zero Go source files.
2026-04-17 03:12:45 +00:00
shankar0123 89b910a8f1 security: atomic pending-job claim with FOR UPDATE SKIP LOCKED (H-6)
Fixes H-6 (CWE-362) — GetPendingJobs returned pending rows without row
locks, so two scheduler replicas in an HA deployment could both read the
same row, both decide it was theirs, and race on UpdateStatus, producing
duplicate Running jobs and duplicate certificate issuances.

Remediation: a claim-style repository API that selects + transitions
Pending -> Running in one transaction with SELECT ... FOR UPDATE SKIP
LOCKED. Concurrent claimants observe disjoint row sets; no worker ever
sees another worker's claimed row.

Repository changes (internal/repository/postgres/job.go):
  - New ClaimPendingJobs(ctx, jobType, limit): BEGIN; SELECT id,...
    FROM jobs WHERE status='Pending' (optional type filter, optional
    LIMIT) FOR UPDATE SKIP LOCKED; UPDATE jobs SET status='Running',
    updated_at=NOW() WHERE id = ANY($ids); COMMIT. Returns the claimed
    rows with status already flipped.
  - New ClaimPendingByAgentID(ctx, agentID): mirrors M31 UNION ALL
    semantics (direct agent_id match, target->agent JOIN fallback,
    certificate->target->agent chain for AwaitingCSR) but wraps each
    branch in FOR UPDATE SKIP LOCKED and flips Deployment/Renewal rows
    to Running. AwaitingCSR rows are returned in place (state
    transition deferred until SubmitCSR, consistent with M8 semantics).
  - Existing GetPendingJobs / ListPendingByAgentID retained for legacy
    compatibility; their godoc now directs production callers to the
    Claim* variants.
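
A simplified sketch of the claim transaction using only database/sql.
Column names (type, created_at), the ORDER BY, and the per-row UPDATE
are assumptions made to keep the example stdlib-only; the commit
describes a single bulk UPDATE with `WHERE id = ANY($ids)`.

```go
package main

import (
	"context"
	"database/sql"
	"fmt"
	"strings"
)

// buildClaimSelect assembles the locking SELECT with an optional type
// filter and optional LIMIT. Locked rows are skipped, not waited on.
func buildClaimSelect(jobType string, limit int) (string, []any) {
	var b strings.Builder
	var args []any
	b.WriteString("SELECT id FROM jobs WHERE status = 'Pending'")
	if jobType != "" {
		args = append(args, jobType)
		fmt.Fprintf(&b, " AND type = $%d", len(args))
	}
	b.WriteString(" ORDER BY created_at")
	if limit > 0 {
		args = append(args, limit)
		fmt.Fprintf(&b, " LIMIT $%d", len(args))
	}
	b.WriteString(" FOR UPDATE SKIP LOCKED")
	return b.String(), args
}

// ClaimPendingJobs selects and flips Pending -> Running in one
// transaction, so concurrent claimants see disjoint row sets.
func ClaimPendingJobs(ctx context.Context, db *sql.DB, jobType string, limit int) ([]string, error) {
	tx, err := db.BeginTx(ctx, nil)
	if err != nil {
		return nil, err
	}
	defer tx.Rollback() // no-op once Commit has succeeded

	query, args := buildClaimSelect(jobType, limit)
	rows, err := tx.QueryContext(ctx, query, args...)
	if err != nil {
		return nil, err
	}
	var ids []string
	for rows.Next() {
		var id string
		if err := rows.Scan(&id); err != nil {
			rows.Close()
			return nil, err
		}
		ids = append(ids, id)
	}
	if err := rows.Err(); err != nil {
		return nil, err
	}
	rows.Close()

	const flip = "UPDATE jobs SET status = 'Running', updated_at = NOW() WHERE id = $1"
	for _, id := range ids {
		if _, err := tx.ExecContext(ctx, flip, id); err != nil {
			return nil, err
		}
	}
	return ids, tx.Commit()
}

func main() {
	// No database is opened here; only the generated SQL is printed.
	q, args := buildClaimSelect("Renewal", 1)
	fmt.Println(q, args)
}
```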

Production caller switches:
  - internal/service/job.go ProcessPendingJobs: ListByStatus(Pending)
    -> ClaimPendingJobs(ctx, "", 0). Eliminates the real scheduler
    race between two replicas tick-firing simultaneously.
  - internal/service/agent.go GetPendingWork: ListPendingByAgentID ->
    ClaimPendingByAgentID. Eliminates the race between two pollers
    for the same agent (e.g. brief network blip causing duplicate
    poll) and between a scheduler tick and an agent poll.

Safety argument for pre-flipping Pending -> Running inside the claim
transaction: ProcessRenewalJob and ProcessDeploymentJob both call
UpdateStatus(Running) unconditionally on entry, so an early flip is
idempotent. On panic, the scheduler's panic recovery leaves the job
in Running which the existing stale-running reaper handles.

Tests (internal/repository/postgres/repo_test.go, skipped in -short):
  - TestJobRepository_ClaimPendingJobs_FlipsToRunning: seed 5 Pending,
    claim once, assert all 5 returned + DB rows Running, residual
    claim returns 0.
  - TestJobRepository_ClaimPendingJobs_ConcurrentDisjoint: seed M=40
    Pending Renewals, spawn N=8 goroutines each calling
    ClaimPendingJobs(_, JobTypeRenewal, 1) in a loop. Invariants:
    (a) no job ID claimed by more than one worker, (b) sum of claims
    == 40, (c) all 40 rows in Running state in the DB. Bounded
    empty-streak guard (20 iterations) covers SKIP LOCKED transient
    zeros under contention.
  - TestJobRepository_ClaimPendingByAgentID_TransitionsDeployments:
    seeds 2 Pending Deployment + 1 AwaitingCSR for agent A plus 1
    Pending Renewal for agent B (scope check). Asserts deployments
    flip to Running, AwaitingCSR is returned but preserved, agent B's
    renewal never appears.

Mock updates: testutil_test.go, lifecycle_test.go, verification_test.go
gained ClaimPendingJobs/ClaimPendingByAgentID on their mock job repos
mirroring the real Pending -> Running semantics. Mocks intentionally
do NOT write to StatusUpdates (that map tracks UpdateStatus() call
history specifically; the real claim path uses a bulk UPDATE, not
UpdateStatus).

Verification (CI-scope):
  - go build ./cmd/...: ok
  - go vet ./...: ok
  - go test -race -short on service, api/handler, api/middleware,
    scheduler, connector/..., domain, validation, tlsprobe: ok
  - Coverage gates: service 67.6% (>=55), handler 78.6% (>=60),
    middleware 80.0% (>=30), domain 92.7% (>=40). All hold.
  - golangci-lint 2.11.4: 0 issues
  - govulncheck: no vulnerabilities in call graph
  - Frontend: tsc clean, 218 vitest tests pass, vite build ok
  - helm lint + helm template: ok
  - Invariant sweeps: FOR UPDATE SKIP LOCKED present in job.go;
    H-1 through H-5 fixtures unchanged.

Refs: H-6 in certctl-audit-report.md
2026-04-17 02:34:56 +00:00
shankar0123 6315ef102a security(globalsign): remove InsecureSkipVerify and pin CA pool (H-5)
The GlobalSign Atlas HVCA connector previously used InsecureSkipVerify:true
on its mTLS client config, disabling server certificate validation and
defeating the purpose of the client-side mTLS handshake. This was a
CWE-295 Improper Certificate Validation vulnerability silently degrading
trust on every production call to GlobalSign's signing API.

Remediation (per H-5 audit finding, Lens 4.4):

- Remove InsecureSkipVerify from all three http.Client construction sites
  (ValidateConfig, getHTTPClient, and legacy initialisation path).
- Introduce buildServerTLSConfig() helper that constructs tls.Config with
  MinVersion: tls.VersionTLS12 (addresses adjacent L-1 recommendation).
- New optional config field `server_ca_path` (env:
  CERTCTL_GLOBALSIGN_SERVER_CA_PATH). When unset the connector trusts the
  system root CA bundle (correct default for GlobalSign's publicly-trusted
  HVCA endpoints). When set the bundle is loaded via x509.NewCertPool() +
  AppendCertsFromPEM, and only those roots are trusted (supports private
  HVCA deployments and defence-in-depth root pinning).
- Error wrapping chain: "failed to read server CA bundle at %s" and
  "no valid PEM certificates found in server CA bundle at %s" surface
  config problems at ValidateConfig time instead of silently failing at
  request time.
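
The helper described above might look roughly like this. The signature
is an assumption (the real helper may take a config struct and also
attach the client certificate); the error strings follow the ones
quoted in this message.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"fmt"
	"os"
)

// buildServerTLSConfig returns a tls.Config that always verifies the
// server. With an empty path, RootCAs stays nil and the system root
// bundle is used; otherwise only the roots in the given PEM bundle
// are trusted.
func buildServerTLSConfig(serverCAPath string) (*tls.Config, error) {
	cfg := &tls.Config{MinVersion: tls.VersionTLS12}
	if serverCAPath == "" {
		return cfg, nil
	}
	pemBytes, err := os.ReadFile(serverCAPath)
	if err != nil {
		return nil, fmt.Errorf("failed to read server CA bundle at %s: %w", serverCAPath, err)
	}
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(pemBytes) {
		return nil, fmt.Errorf("no valid PEM certificates found in server CA bundle at %s", serverCAPath)
	}
	cfg.RootCAs = pool
	return cfg, nil
}

func main() {
	cfg, err := buildServerTLSConfig("")
	fmt.Println("system roots:", cfg.RootCAs == nil, "err:", err)

	_, err = buildServerTLSConfig("/nonexistent/ca.pem")
	fmt.Println("missing file:", err)
}
```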

Docs, config, service env-seed, and GUI issuer type definition updated to
expose the new field. Tests: 9 dead `InsecureSkipVerify: true` client
TLSClientConfig blocks (no-ops against httptest.NewServer plain-HTTP)
replaced with bare http.Client; new TestGlobalSign_ServerTLSConfig covers
pinned-CA trust, untrusted-server rejection, missing-file and invalid-PEM
error paths.

Verification:
- go build ./... clean
- go vet ./... clean
- go test -race ./internal/connector/issuer/globalsign/... ./internal/config/... ./internal/service/... ok
- go test ./... (excluding testcontainers-gated repo layer) ok
- golangci-lint run ./... 0 issues
- govulncheck ./... 0 reachable vulns
- Per-layer coverage: service 68.7% (≥55), handler 83.6% (≥60), domain 82.0% (≥40), middleware 63.8% (≥30)
- globalsign package coverage: 75.9%
- Invariant sweep: 0 InsecureSkipVerify references remain in globalsign
  package (only a test-file comment documenting the removal).
2026-04-17 01:40:58 +00:00
shankar0123 119986fa7e security: add SSRF defence-in-depth for webhook notifier (fixes H-4)
The webhook notifier would previously accept any operator-configured URL
and hand it to http.Client without validation. That exposed two
SSRF classes (CWE-918):

  * Reserved-address reachability — a misconfigured or adversarial
    webhook URL pointing at 127.0.0.1, ::1, 169.254.169.254 (cloud
    metadata), or 0.0.0.0 would succeed, exfiltrating request bodies
    to local services or leaking short-lived cloud credentials.
  * DNS rebinding — a hostname resolving to a public IP at validation
    time and to a reserved IP at dial time would bypass any
    URL-string-only check.

Fix installs two independent layers:

  * validation.ValidateSafeURL runs at config-ingest time and before
    every outbound POST. It rejects non-HTTP(S) schemes, empty hosts,
    and literal reserved-IP hosts with a clear operator-facing error.
    This is a fast early diagnostic.
  * validation.SafeHTTPDialContext is installed on the webhook
    http.Transport. It re-resolves the host at dial time, rejects any
    resolved address that lies in a reserved range (loopback,
    link-local, multicast, broadcast, unspecified, IPv6
    link-local/multicast), and pins the resolved IP into the final
    dial address so the TLS handshake targets the exact IP the guard
    approved. This is the authoritative, TOCTOU-safe defence against
    DNS rebinding.

The two layers are complementary: ValidateSafeURL fails fast on obvious
misconfiguration; SafeHTTPDialContext fails closed when DNS changes
between validation and dial.
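
Both layers can be sketched with the standard library as follows. The
lowercase names, the reserved-range list, and the 10-second dial
timeout are illustrative assumptions; the real validation package may
cover a different set of ranges. RFC 1918 addresses are deliberately
not treated as reserved here.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"net/url"
	"time"
)

// isReservedIP is an assumed approximation of the reserved-address list.
func isReservedIP(ip net.IP) bool {
	return ip.IsLoopback() || ip.IsLinkLocalUnicast() || ip.IsLinkLocalMulticast() ||
		ip.IsMulticast() || ip.IsUnspecified() || ip.Equal(net.IPv4bcast)
}

// validateSafeURL is the fast string-level check: scheme, host, and
// literal reserved-IP hosts. It does not resolve DNS.
func validateSafeURL(raw string) error {
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Errorf("invalid webhook URL: %w", err)
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return fmt.Errorf("unsupported webhook URL scheme %q", u.Scheme)
	}
	host := u.Hostname()
	if host == "" {
		return errors.New("webhook URL has no host")
	}
	if ip := net.ParseIP(host); ip != nil && isReservedIP(ip) {
		return fmt.Errorf("webhook host %s is a reserved address", host)
	}
	return nil
}

// safeDialContext re-resolves the host at dial time, rejects reserved
// results, and dials the approved IP directly so the address cannot
// change between the check and the connection.
func safeDialContext(ctx context.Context, network, addr string) (net.Conn, error) {
	host, port, err := net.SplitHostPort(addr)
	if err != nil {
		return nil, err
	}
	ips, err := net.DefaultResolver.LookupIPAddr(ctx, host)
	if err != nil {
		return nil, err
	}
	if len(ips) == 0 {
		return nil, fmt.Errorf("no addresses found for %s", host)
	}
	for _, a := range ips {
		if isReservedIP(a.IP) {
			return nil, fmt.Errorf("refusing to dial %s: resolves to reserved address %s", host, a.IP)
		}
	}
	d := &net.Dialer{Timeout: 10 * time.Second}
	return d.DialContext(ctx, network, net.JoinHostPort(ips[0].IP.String(), port))
}

// dialCheck runs the dial guard and closes any connection it opens.
func dialCheck(addr string) error {
	conn, err := safeDialContext(context.Background(), "tcp", addr)
	if conn != nil {
		conn.Close()
	}
	return err
}

func main() {
	fmt.Println(validateSafeURL("http://169.254.169.254/latest/meta-data"))
	fmt.Println(validateSafeURL("http://10.0.0.5/hook"))
	fmt.Println(dialCheck("127.0.0.1:9"))
}
```

A function with this signature can be assigned to the DialContext field
of an http.Transport.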

The existing unexported isReservedIP helper in
internal/service/network_scan.go is extracted into
internal/validation.IsReservedIP with byte-identical behaviour so the
webhook notifier and the network scanner share a single authoritative
reserved-address list. RFC 1918 ranges remain intentionally allowed
(certctl's self-hosted design). Broader unspecified / IPv6 link-local
coverage lives only in the stricter dial-time policy, where it belongs
for outbound HTTP egress.

Test seam: Connector gains an unexported validateURL func field and a
same-package newForTest constructor that installs a permissive
validator and the stdlib default transport. Production callers cannot
reach this constructor because it is unexported; only same-package
tests (package webhook) can use it. Same-package happy-path tests call
newForTest so they can point at httptest loopback servers without
being blocked by the production guard. The four SSRF-rejection tests
that verify the guard itself still call New so they exercise the real,
strict validator. This keeps the production SSRF defence
unconditionally on in real code while preserving legitimate unit-test
coverage.

Tests
-----
  * internal/validation/ssrf_test.go (new) — 16-subtest pin on
    IsReservedIP that is byte-identical with the original network-
    scanner behaviour; ValidateSafeURL accept/reject matrix covering
    HTTPS/HTTP, reserved-literal IPv4/IPv6, dangerous schemes
    (file/gopher/ftp/javascript/data/ldap/dict/jar), missing hosts,
    and malformed inputs; SafeHTTPDialContext rejects literal reserved
    addresses and hosts resolving to reserved addresses (DNS-rebinding
    coverage via localhost).
  * internal/connector/notifier/webhook/webhook_test.go — happy-path
    tests switched to newForTest; production-guard SSRF-rejection
    tests (TestValidateConfig_RejectsReservedURLs,
    TestValidateConfig_RejectsDangerousScheme,
    TestPostWebhook_RejectsReservedURL,
    TestPostWebhook_RejectsDangerousScheme) continue to call New so
    they exercise the unconditionally-installed production validator.

Wire-format invariants preserved
--------------------------------
  * Outbound HTTP request shape (method, headers, body, HMAC
    signature) unchanged.
  * network_scan.go behaviour unchanged — validation.IsReservedIP is
    byte-identical with the deleted helper.
  * RFC 1918 (10/8, 172.16/12, 192.168/16) remain allowed for both
    outbound webhook and CIDR expansion, matching the self-hosted
    design.

Verification
------------
  * go test -race ./internal/validation/... ./internal/connector/
    notifier/webhook/... ./internal/service/... — green.
  * Full-suite go test -race ./... — green (GOTMPDIR=/dev/shm to
    sidestep full /tmp on the sandbox host).
  * Coverage gates pass: service 68.8% >= 55%, handler 83.6% >= 60%,
    domain 82.0% >= 40%, middleware 63.8% >= 30%. Overall 67.8%.
    Webhook package 91.5% line coverage; validation package
    ValidateSafeURL/SafeHTTPDialContext 78-100% per function.
  * govulncheck ./... — no vulnerabilities found.
  * golangci-lint run on touched H-4 production code — clean. Pre-
    existing errcheck/gosimple warnings in scope-adjacent files
    (webhook_test.go:270 w.Write, network_scan.go:120/173/265/305)
    verified against 3853b74 to predate this commit; left alone per
    scope guard.

Operational notes
-----------------
  * No migration needed. The guard is pure Go code; existing webhook
    configs continue to work unless they point at reserved addresses,
    in which case they now fail closed with a clear error.
  * Existing operators who rely on webhook POST to 127.0.0.1 or
    ::1 (e.g., local receivers on the same host as certctl-server)
    must expose their receiver on an RFC 1918 address or public IP.
    This is deliberate — the threat model for webhook notifiers
    includes untrusted operator-supplied URLs.

Scope guard: H-4 only. H-5, H-6, M-*, L-*, and I-* findings remain
open and are tracked separately. No drive-by refactors.
2026-04-17 00:34:47 +00:00
shankar0123 3853b7460c security: reject CRLF/NUL in email headers to prevent SMTP injection (fixes H-3)
H-3 in certctl-audit-report.md: caller-supplied From/To/Subject were
interpolated directly into the SMTP DATA payload and handed to
client.Mail / client.Rcpt with no sanitization, allowing an attacker
who controls any of those values to inject extra headers (Bcc:,
Reply-To:), split the message body (CRLFCRLF), or tamper with the
SMTP envelope. CWE-113.

Fix:
- New package helper internal/validation.ValidateHeaderValue(field,
  value). Rejects CR ("\r"), LF ("\n"), and NUL ("\x00") with an error
  that names the offending field but does NOT echo the raw value,
  so log readers cannot be attacked with injected content. Silent
  stripping was considered and rejected: authentication-relevant
  headers must fail visibly.
- Two-layer defense in internal/connector/notifier/email/email.go:
    (1) primary guard at the top of sendEmail / sendHTMLEmail, which
        blocks tampering of the SMTP envelope (client.Mail, client.Rcpt)
        since net/smtp does not sanitize those arguments; and
    (2) defense-in-depth guard inside formatEmailMessage /
        formatHTMLEmailMessage, catching any future caller that
        bypasses sendEmail. Both format functions now return an error.
- Body content is intentionally NOT validated — CR/LF in body is legal
  RFC 5322 content and net/smtp handles dot-stuffing.
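
A minimal sketch of such a header check. The error text follows the
log line quoted in the operational note of this message; the fallback
field name "header" is an assumption, since the default is not stated
here.

```go
package main

import (
	"fmt"
	"strings"
)

// ValidateHeaderValue rejects CR, LF, and NUL. The error names the
// field but never echoes the offending value.
func ValidateHeaderValue(field, value string) error {
	if field == "" {
		field = "header" // assumed default name
	}
	if strings.ContainsAny(value, "\r\n\x00") {
		return fmt.Errorf("%s contains disallowed control character", field)
	}
	return nil
}

func main() {
	fmt.Println(ValidateHeaderValue("To", "ops@example.com"))
	fmt.Println(ValidateHeaderValue("Subject", "Renewal\r\nBcc: attacker@example.com"))
}
```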

Tests:
- internal/validation/headers_test.go: 3 functions (AcceptsSafeInput,
  RejectsControlCharacters, DefaultFieldName) covering plain ASCII,
  UTF-8 multibyte, tabs, typical email addresses, CRLF injection,
  lone CR, lone LF, NUL, CRLFCRLF body split, trailing CR, leading LF.
  Each reject case asserts the field name IS in the error and the
  raw offending value IS NOT (anti-log-injection).
- internal/connector/notifier/email/email_test.go: added
  TestEmail_FormatEmailMessage_RejectsCRLFInjection and
  TestEmail_FormatHTMLEmailMessage_RejectsCRLFInjection. Existing
  format tests updated for the new (bytes, error) signature.

Wire-format invariants preserved:
- SMTP DATA headers still use CRLF separators and RFC 1123Z Date
  (unchanged).
- Content-Type headers unchanged (text/plain for plain, text/html +
  MIME-Version: 1.0 for HTML).
- No change to message encoding or transport.

Verification (Go 1.25.9 linux-arm64, parent e9947dc):
- go build ./...                                 clean
- go vet ./...                                   clean
- go test -race ./internal/validation/...        ok
- go test -race ./internal/connector/notifier/email/...   ok
- go test -race ./internal/connector/notifier/webhook/... ok
- Per-layer coverage gates all pass:
    validation  95.1% (+0.7 vs baseline 94.4%)
    email       39.7% (+1.4 vs baseline 38.3%)
    service     67.8% (unchanged)
    handler     78.6% (unchanged)
    middleware  80.0% (unchanged)
    domain      92.7% (unchanged)
- govulncheck ./...                              No vulnerabilities found
- golangci-lint run ./internal/validation/... ./internal/connector/notifier/email/...
                                                 0 issues

Operational note: SMTP sends that would previously deliver a
tampered message now fail fast at the notifier with a clear error.
Operators who were relying on header-injection-shaped inputs (there
should be none in practice — all callers are internal certctl code)
will see "failed to format message: <field> contains disallowed
control character" in logs.

Scope: H-3 only. H-4 (webhook SSRF) follows in a separate commit.
2026-04-17 00:08:20 +00:00
shankar0123 e9947dc0fe docs: redact V3 feature specifics from README (fixes H-7)
Problem
-------
H-7 (CWE-200 / information disclosure, strategic-policy class): the
public README's V3 section enumerated the paid-tier feature set --
"Role-based access control with profile-gating", "Event-driven
architecture with real-time operational views", "Advanced search",
"compliance scoring", "HSM/TPM integration" -- violating the
CLAUDE.md directive "Keep V3+ deliberately vague -- one-liner
descriptions only. Don't telegraph the paid feature set." The prior
wording also carried factual drift: `compliance scoring` was pulled
forward to V2.2 per the V2.2 Roadmap, so pairing it with V3 in the
README misrepresented the open-core line.

Fix
---
Replace the two-sentence enumeration at README.md:322-323 with a
single deliberately-vague sentence:

  Enterprise capabilities for larger deployments are available in
  the commercial tier.

No named features. No SKU enumeration. Matches the policy one-liner
shape used in neighboring V1 / V2 / V4+ sections. Net -1 line of
prose.

Files
-----
  README.md                          1 -, 1 +

Wire-format invariants preserved
--------------------------------
This is a docs-only change. All protocol surfaces are byte-identical:
  - RFC 7030 EST handler (internal/api/handler/est.go) -- untouched
  - RFC 8894 SCEP handler (internal/api/handler/scep.go) -- untouched
  - Shared internal/pkcs7/ package -- untouched
  - H-1 revocation composite key (migration 000012) -- untouched
  - H-2 SCEP challenge-password preflight + PKCSReq guard -- untouched
  - C-2 AES-256-GCM config encryption contract -- untouched
  - CRL DER bytes, OCSP response bytes -- untouched

Verification
------------
  git diff 387fb55 HEAD -- internal/ cmd/ migrations/ api/ deploy/
    -> 0 code changes (only README.md modified after H-1)

Operational note
----------------
No behavioral change. Product positioning only. The V3 feature set
itself remains documented in the gitignored roadmap.md / strategy.md,
which are the intended sources of truth for the paid tier.

Audit report: see /Users/shankar/Desktop/cowork/certctl-audit-report.md
2026-04-16 23:46:37 +00:00
shankar0123 b813660c74 security: require SCEP challenge password when SCEP enabled (fixes H-2)
Problem (CWE-306 Missing Authentication for Critical Function):
internal/service/scep.go PKCSReq skipped the shared-secret check when
s.challengePassword was empty. An unconfigured-but-enabled SCEP server
accepted any unauthenticated client reaching /scep and issued a
certificate against the configured issuer for any CSR with a valid
signature. No audit trail distinguished authenticated from
unauthenticated enrollments. The fix follows the two-layer fail-closed
pattern already used for C-2 (f549a7a): reject at startup AND reject
at the service boundary.

Fix (two layers, defense-in-depth):

Layer 1 — startup pre-flight in cmd/server/main.go:
  preflightSCEPChallengePassword returns a non-nil error when SCEP is
  enabled and CERTCTL_SCEP_CHALLENGE_PASSWORD is empty. main logs and
  os.Exit(1)s before the SCEP service is constructed. Disabled SCEP is
  unaffected. The helper is unit-testable in isolation.

Layer 2 — service-layer rejection in internal/service/scep.go:
  PKCSReq refuses enrollment when s.challengePassword == "" even though
  main already blocks this state — protects future call sites (tests,
  library reuse, a REST-over-HTTPS wrapper). When a secret is
  configured, the comparison now uses crypto/subtle.ConstantTimeCompare
  so response time does not leak the configured secret through a
  short-circuiting byte compare.
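
The two layers can be sketched as follows. The signatures and error strings are assumptions for illustration; checkChallenge is a hypothetical stand-in for the check inside PKCSReq.

```go
package main

import (
	"crypto/subtle"
	"errors"
	"fmt"
)

// preflightSCEPChallengePassword models the startup check: enabled SCEP
// with an empty secret is a configuration error (signature assumed).
func preflightSCEPChallengePassword(enabled bool, password string) error {
	if enabled && password == "" {
		return errors.New("SCEP enabled but CERTCTL_SCEP_CHALLENGE_PASSWORD is empty (CWE-306)")
	}
	return nil
}

// checkChallenge models the service-layer check (hypothetical helper).
// It refuses an unconfigured server first, then compares in constant time.
func checkChallenge(configured, presented string) error {
	if configured == "" {
		return errors.New("SCEP challenge password not configured")
	}
	// ConstantTimeCompare returns 1 only on an exact match. It returns 0
	// immediately when the lengths differ, so only the length can leak.
	if subtle.ConstantTimeCompare([]byte(configured), []byte(presented)) != 1 {
		return errors.New("invalid challenge password")
	}
	return nil
}

func main() {
	fmt.Println(preflightSCEPChallengePassword(true, ""))
	fmt.Println(checkChallenge("", "guess"))
	fmt.Println(checkChallenge("secret123", "secret123"))
}
```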

Files:
- cmd/server/main.go: preflightSCEPChallengePassword helper; call site
  inside the `if cfg.SCEP.Enabled` block before issuer lookup; fatal
  slog error references CWE-306 and names the env var so operators can
  diagnose the startup failure without reading code.
- cmd/server/main_test.go: TestPreflightSCEPChallengePassword with five
  table-driven subtests (disabled empty, disabled set, enabled empty
  rejected, enabled set, single-char boundary). The enabled-empty case
  asserts the error string contains both CERTCTL_SCEP_CHALLENGE_PASSWORD
  and CWE-306 so the log message remains actionable.
- internal/config/config.go: SCEPConfig.ChallengePassword godoc now
  states the field is REQUIRED when SCEP.Enabled and cross-references
  preflightSCEPChallengePassword.
- internal/service/scep.go: imports crypto/subtle; PKCSReq rewritten
  with the two-layer check; comment block cites H-2 / CWE-306 and the
  constant-time rationale.
- internal/service/scep_test.go: existing tests that relied on the
  vulnerable empty-password path now configure a secret on both sides.
  TestSCEPService_PKCSReq_ChallengePassword_NotRequired is replaced by
  TestSCEPService_PKCSReq_ChallengePassword_EmptyServerConfigRejected
  which iterates ["", "any-value", "guess"] against an unconfigured
  server and asserts "not configured" in the error. A new
  TestSCEPService_PKCSReq_ChallengePassword_ConstantTimeLengthIndependence
  exercises same-prefix-longer and wrong-case inputs to guard against a
  regression from ConstantTimeCompare to a short-circuiting byte compare.
- internal/service/m11c_crypto_enforcement_test.go: four tests
  (RejectsWeakKey, AcceptsStrongKey, MaxTTL_ForwardedToIssuer,
  NoProfileRepo_PassesThrough) constructed NewSCEPService with an empty
  challenge password and exercised PKCSReq through the now-rejected
  vulnerable path. All four now configure "secret123" on both sides with
  an inline H-2 comment; the crypto/MaxTTL/profile behavior they assert
  is unchanged.

Wire-format / behavioral invariants preserved:
- RFC 8894 SCEP handler is untouched (internal/api/handler/scep.go and
  internal/pkcs7/*): GetCACaps/GetCACert responses, PKIOperation request
  parsing, and the PKCS#7 certs-only response format are byte-identical.
- RFC 7030 EST handler is untouched
  (internal/api/handler/est.go + internal/pkcs7/*).
- Revocation idempotency composite key (H-1, migration 000012) untouched.
- AES-256-GCM config encryption (C-2) untouched.
- CRL DER bytes and OCSP response bytes unchanged.

Verification:
- go build ./...              silent success
- go vet ./...                silent success
- go test -race -count=1 ./internal/service/ ./cmd/server/
  ./internal/api/handler/ ./internal/integration/    all OK
- Coverage with comfortable headroom over CI gates:
    service     67.8% (gate 55%)
    handler     79.0% (gate 60%)
    domain      92.7% (gate 40%)
    middleware  80.0% (gate 30%)
    cmd/server  1.6%  (preflightSCEPChallengePassword: 100%)
  internal/service/scep.go PKCSReq statement coverage: 100%.
- rg sweeps: no `s.challengePassword != ""` remains;
  no `challengePassword != s.challengePassword` remains.

Operational note: operators with SCEP enabled but no challenge password
set will see a fatal startup error and a log line citing
CERTCTL_SCEP_CHALLENGE_PASSWORD and CWE-306 after upgrading. This is the
intended fail-closed behavior. Fix by either setting the env var to a
non-empty shared secret or setting CERTCTL_SCEP_ENABLED=false.

Audit report: certctl-audit-report.md (revision 5) logs this under
H-2 Resolution Log.
2026-04-16 22:22:51 +00:00
shankar0123 387fb555ac security: scope revocation unique index to (issuer_id, serial_number) (fixes H-1)
RFC 5280 §4.1.2.2 defines certificate serial number uniqueness per issuing CA,
not globally. The prior unique index on `certificate_revocations.serial_number`
enforced a stricter invariant than the spec: with 12 issuer connectors (Local
CA, ACME, Vault, step-ca, OpenSSL, DigiCert, Sectigo, Google CAS, AWS ACM PCA,
Entrust, GlobalSign, EJBCA), two distinct certificates legitimately issued by
different CAs can share a serial number. The revocation insert for the second
colliding certificate was silently dropped by `ON CONFLICT DO NOTHING`, leaving
that cert persistently absent from OCSP/CRL responses.

Changes:

- Migration 000012 drops `idx_certificate_revocations_serial` and creates
  `idx_certificate_revocations_issuer_serial` UNIQUE ON (issuer_id,
  serial_number). Adds a non-unique `idx_certificate_revocations_serial_lookup`
  to preserve the serial-only fast path for OCSP/CRL probes that already know
  the issuer scope.
- `CertificateRevocationRepository.Create` targets the new composite key in
  `ON CONFLICT` — same-issuer idempotency preserved, cross-issuer collisions
  now recorded as distinct rows.
- `GetBySerial(serial)` renamed `GetByIssuerAndSerial(issuerID, serial)` on
  the interface and Postgres impl. All callers (OCSP responder, CRL
  generator, short-lived-cert exemption check) already have `issuerID` in
  scope because the protocol paths carry it (`/api/v1/ocsp/{issuer_id}/{serial}`,
  `/api/v1/crl/{issuer_id}`).
- Repository integration test added: `TestRevocationRepository_CrossIssuerSerialCollision`
  asserts that serial `CAFEBABE01` can be stored under two issuers
  simultaneously, that lookups return the correct row per (issuer, serial),
  and that same-issuer idempotency still works (re-inserting (issuer, serial)
  does not error and does not duplicate).
- Existing tests and service/integration mocks updated for the rename.
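
An in-memory model of the composite-key semantics, offered as a sketch only: the real repository is PostgreSQL-backed, and the store type and issuer IDs here are hypothetical.

```go
package main

import "fmt"

// revocationKey mirrors the composite unique key (issuer_id, serial_number).
type revocationKey struct {
	IssuerID string
	Serial   string
}

// revocationStore is a hypothetical in-memory stand-in for the table.
type revocationStore struct {
	rows map[revocationKey]string // value is the revocation reason
}

func newRevocationStore() *revocationStore {
	return &revocationStore{rows: make(map[revocationKey]string)}
}

// Create inserts a row unless (issuer, serial) already exists, which
// models ON CONFLICT (issuer_id, serial_number) DO NOTHING.
// It reports whether a new row was written.
func (s *revocationStore) Create(issuerID, serial, reason string) bool {
	k := revocationKey{issuerID, serial}
	if _, exists := s.rows[k]; exists {
		return false
	}
	s.rows[k] = reason
	return true
}

// GetByIssuerAndSerial looks a revocation up by its full composite key.
func (s *revocationStore) GetByIssuerAndSerial(issuerID, serial string) (string, bool) {
	reason, ok := s.rows[revocationKey{issuerID, serial}]
	return reason, ok
}

func main() {
	s := newRevocationStore()
	fmt.Println(s.Create("iss-local", "CAFEBABE01", "keyCompromise"))
	fmt.Println(s.Create("iss-vault", "CAFEBABE01", "superseded"))
	fmt.Println(s.Create("iss-local", "CAFEBABE01", "unspecified"))
}
```

With a serial-only key, the second Create call would have been the one silently dropped.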

Wire-format invariants preserved: CRL DER bytes, OCSP response bytes, and
AES-256-GCM config encryption are unaffected — this change touches only
revocation-record uniqueness scope.

CWE-664.
2026-04-16 21:49:59 +00:00
shankar0123 f549a7aa79 security: fail closed when CERTCTL_CONFIG_ENCRYPTION_KEY is unset (fixes C-2)
EncryptIfKeySet/DecryptIfKeySet in internal/crypto/encryption.go previously
returned plaintext + wasEncrypted=false when the operator had not configured
CERTCTL_CONFIG_ENCRYPTION_KEY. That produced a data-at-rest confidentiality
bypass (CWE-311): sensitive fields on dynamically-configured issuer and
target rows (source='database') were persisted to PostgreSQL without any
encryption, and no caller could distinguish the encrypted from the plaintext
branch at runtime. The only visible signal was a single warning log line
emitted once at startup.

Fail closed instead:

- EncryptIfKeySet / DecryptIfKeySet now return crypto.ErrEncryptionKeyRequired
  (a new exported sentinel, errors.Is-unwrappable) when the key is empty or
  nil, rather than silently emitting plaintext. The (result, wasEncrypted,
  err) tuple signature is preserved for source compatibility; only the
  semantics of the no-key branch changed.

- cmd/server/main.go grows a startup pre-flight check: if no encryption key
  is configured the server lists issuers and targets, counts rows with
  source='database', and refuses to start (os.Exit(1)) if any exist. Operators
  must either configure CERTCTL_CONFIG_ENCRYPTION_KEY or remove the exposed
  rows before the control plane can boot. The warning-only path is retained
  for the clean-slate case (no database rows).

- internal/service/issuer.go's SeedFromEnvVars now guards the encryption call
  with len(s.encryptionKey) > 0 so env-seeded rows (source='env', which are
  reconstructable on every boot from process env) continue to persist as
  plaintext in the 'config' column when no key is configured. Registry load
  already falls through to cfg.Config when EncryptedConfig is nil. GUI/API
  write paths (source='database') remain fail-closed via propagation of
  ErrEncryptionKeyRequired.

- Integration tests that exercise CreateIssuer via the handler layer now
  supply a real 32-byte AES-256 test key so the encrypt path runs instead of
  returning ErrEncryptionKeyRequired. Same pattern in internal/service/
  testutil_test.go for consolidated service-layer tests.

- internal/crypto/encryption_test.go grows regression guards:
  TestEncryptIfKeySet_EmptyKeyFailsClosed (nil_key + empty_key subtests),
  TestDecryptIfKeySet_EmptyKeyFailsClosed (nil_key + empty_key subtests),
  TestEncryptDecryptIfKeySet_RoundTripProducesDifferentCiphertext,
  TestDecryptIfKeySet_RejectsTamperedCiphertext, and
  TestEncryptIfKeySet_PreservesErrEncryptionKeyRequiredSentinel (verifies
  the sentinel unwraps through fmt.Errorf(%w)-style wrapping).

Wire format is unchanged: AES-256-GCM Encrypt/Decrypt/DeriveKey, the
12-byte nonce prefix, the GCM auth tag, the PBKDF2 salt
('certctl-config-encryption-v1'), and the 100,000 iteration count are all
byte-identical. Ciphertexts produced before this change remain decryptable.

Verified:
- go build ./... : clean
- go vet ./...   : clean
- go test -race ./internal/crypto/... ./internal/service/... \
    ./internal/integration/... ./cmd/server/... : pass
- golangci-lint run ./... : 0 issues
- govulncheck ./... : 0 reachable vulnerabilities
- rg 'return plaintext, false, nil' internal/ : no matches
- Coverage: crypto 85.0% (unchanged), service 67.8% (was 67.9%, noise),
  cmd/server 0.0% (unchanged baseline). All above CI thresholds.

See certctl-audit-report.md for the full finding record and resolution log.
2026-04-16 21:10:40 +00:00
shankar0123 b219e5d68a security: use crypto/rand for agent API keys (fixes C-1)
Replaces math/rand-based agent API key generation in internal/service/agent.go
with crypto/rand.Read over a 32-byte buffer encoded with base64.RawURLEncoding,
yielding a 43-character URL-safe unpadded ASCII string (256 bits of entropy).

generateAPIKey now returns (string, error); Register and RegisterAgent propagate
entropy-source failures. hashAPIKey is unchanged — the SHA-256 hashed-at-rest
invariant is preserved.

Fixes C-1 (CWE-338: Use of Cryptographically Weak Pseudo-Random Number Generator)
from certctl-audit-report.md.

Changes:
- internal/service/agent.go: new imports (crypto/rand, encoding/base64);
  generateAPIKey rewritten to return (string, error); Register and RegisterAgent
  updated to propagate the error.
- internal/service/agent_test.go: TestGenerateAPIKey_Properties regression test
  (non-empty, length 43, valid base64url, 32 decoded bytes, no collisions over
  64 calls). No entropy-failure test — Go 1.24+ (issue #66821) makes crypto/rand
  errors fatal, so that branch is defensively unreachable.

Verification:
- go build ./cmd/server/... ./cmd/agent/... ./cmd/mcp-server/... ./cmd/cli/... → pass
- go vet ./... → pass
- go test -race (CI scope, 43 packages) → pass
- golangci-lint v2.11.4 run ./... → 0 issues
- govulncheck ./... → 0 vulnerabilities in certctl code
- Coverage: service 68.9% / handler 83.6% / domain 82.0% / middleware 63.8%
  (all above CI gates 55/60/40/30)
- grep math/rand in internal/ and cmd/ → zero production hits
- No caller assumes the old 32-char length or legacy charset
2026-04-16 19:43:19 +00:00
shankar0123 1f6cf0eafa fix: add npm ci retry and install verification for proxy environments (#9)
npm has a known bug where `npm ci` can crash with "Exit handler never
called!" behind corporate proxies yet exit with code 0. This adds a
single retry on failure and verifies tsc is actually installed before
proceeding to build.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 11:21:47 -04:00
shankar0123 a49eae8155 fix: correct BSL 1.1 change date to March 14, 2033
why-certctl.md said March 1, CHART_SUMMARY.md said March 28. The
LICENSE file is authoritative: Change Date is March 14, 2033.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 11:12:49 -04:00
shankar0123 1c7d085f16 docs: move maintenance notice and quick start link above Documentation section
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 11:05:47 -04:00
shankar0123 cc6eec3608 fix: merge npm install + build into single Docker layer (#9)
The previous fix (--include=dev) was necessary but insufficient. The
real issue is that node_modules created by npm ci in one layer can be
lost when COPY web/ . creates the next layer — depending on the Docker
storage driver (fuse-overlayfs, vfs). Merging install and build into a
single RUN eliminates the layer boundary entirely.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 10:52:50 -04:00
shankar0123 86fb140414 fix: ensure devDependencies install in Docker build (#9)
npm ci skips devDependencies when NODE_ENV=production leaks from the
host environment into the Docker build. This breaks the frontend stage
because typescript and vite are devDependencies. Adding --include=dev
makes the install hermetic regardless of host environment.

Closes #9

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 10:00:06 -04:00
shankar0123 13cd4d98ba feat(V2.2): bulk revocation — filter-based fleet-wide certificate revocation
Add POST /api/v1/certificates/bulk-revoke with filter criteria (profile_id,
owner_id, agent_id, issuer_id, team_id, certificate_ids), partial-failure
tolerance, and audit trail. Includes MCP tool, CLI command (certs bulk-revoke),
server-side bulk modal in GUI replacing client-side sequential loop, OpenAPI
spec, compliance mapping updates, and 21 new tests (12 service, 7 handler,
1 CLI, 1 frontend).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 00:06:34 -04:00
shankar0123 84bc1245a1 fix: case-insensitive issuer type validation + missing M49 types (#7)
Backend rejected lowercase type strings (e.g., "acme") sent by older
cached frontends. Add normalizeIssuerType() with alias map for
case-insensitive lookup, wire into both Create paths. Add missing
Entrust/GlobalSign/EJBCA to validIssuerTypes. Add lowercase fallbacks
to issuer factory switch. 39 new test subtests covering normalization,
lowercase create flows, and M49 type acceptance.
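
One way to do the alias lookup, as a sketch: the canonical spellings in the map and the (string, bool) return shape are assumptions, since the commit does not show them.

```go
package main

import (
	"fmt"
	"strings"
)

// issuerAliases maps lowercase input to a canonical type name.
// The canonical spellings here are illustrative assumptions.
var issuerAliases = map[string]string{
	"acme":       "ACME",
	"entrust":    "Entrust",
	"globalsign": "GlobalSign",
	"ejbca":      "EJBCA",
}

// normalizeIssuerType resolves any casing of a known type to its
// canonical form and reports whether the type is recognized.
func normalizeIssuerType(raw string) (string, bool) {
	canonical, ok := issuerAliases[strings.ToLower(strings.TrimSpace(raw))]
	return canonical, ok
}

func main() {
	fmt.Println(normalizeIssuerType("acme"))
	fmt.Println(normalizeIssuerType("GLOBALSIGN"))
	fmt.Println(normalizeIssuerType("unknown"))
}
```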

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-15 23:20:32 -04:00
shankar0123 e1bcde4cf1 feat(M50): cloud secret manager discovery — AWS SM, Azure KV, GCP SM
Extend certificate discovery from filesystem + network to cloud secret
managers. Three pluggable DiscoverySource connectors feed into the
existing discovery pipeline via sentinel agent pattern, with a 9th
scheduler loop for periodic cloud scanning.

- AWS Secrets Manager: aws-sdk-go-v2, tag/prefix filtering, 10 tests
- Azure Key Vault: stdlib HTTP + OAuth2, base64 DER/PEM, 16 tests
- GCP Secret Manager: stdlib HTTP + JWT OAuth2, label filter, 14 tests
- CloudDiscoveryService orchestrator with 9 tests
- 9th scheduler loop (6h default, atomic.Bool idempotency)
- Discovery page: color-coded source type badges
- 14 new env vars across CloudDiscoveryConfig structs
- Docs: connectors.md, architecture.md, features.md, README updated
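
The atomic.Bool idempotency guard mentioned for the scheduler loop might look like the sketch below; the type and method names are hypothetical.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// cloudScanLoop is a hypothetical holder for the guard flag.
type cloudScanLoop struct {
	running atomic.Bool
}

// runOnce executes scan unless a previous run is still in flight.
// It reports whether this call actually ran the scan.
func (l *cloudScanLoop) runOnce(scan func()) bool {
	// CompareAndSwap flips false -> true atomically; a concurrent or
	// re-entrant caller sees true and skips the tick.
	if !l.running.CompareAndSwap(false, true) {
		return false
	}
	defer l.running.Store(false)
	scan()
	return true
}

func main() {
	var l cloudScanLoop
	ran := l.runOnce(func() {
		// An overlapping tick during a scan is skipped.
		fmt.Println("overlap ran:", l.runOnce(func() {}))
	})
	fmt.Println("first ran:", ran)
}
```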

49 new tests. All CI checks pass (go vet, race, lint, coverage).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-15 23:01:00 -04:00
shankar0123 3f619bcaac feat(M49): Entrust, GlobalSign & EJBCA issuer connectors
Add three new issuer connectors completing commercial and open-source CA
coverage. Entrust uses mTLS client certificate auth with sync/async
issuance. GlobalSign Atlas uses mTLS + API key/secret dual auth with
serial-based tracking. EJBCA supports dual auth (mTLS or OAuth2) for
self-hosted Keyfactor CAs.

Each connector implements the full issuer.Connector interface (9 methods),
includes httptest-based unit tests (~14 each), and follows established
patterns (injectable HTTP clients, RFC 5280 revocation reason mapping,
CRL/OCSP delegated to CA).

Also includes: issuer factory cases, env var seeding, config structs,
domain types, seed data (3 rows, all disabled), OpenAPI enum updates,
frontend issuer catalog entries with config fields, and full docs
(connectors.md, architecture.md, features.md, README).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-15 22:24:12 -04:00
shankar0123 f3a85d6b08 fix: remove unused createTestCert function in tlsprobe tests
golangci-lint (unused linter) flagged createTestCert as dead code —
only createTestCertWithKey is called by the actual tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-15 21:54:38 -04:00
shankar0123 596d86a206 feat(M48): continuous TLS health monitoring — endpoint state machine, shared tlsprobe, 8 API endpoints, GUI
Adds continuous TLS endpoint health monitoring that closes the deploy→verify→monitor loop.
After M25 verifies a deployment succeeded once, M48 continuously confirms it stays healthy.

Key components:
- Shared `internal/tlsprobe/` package extracted from network scanner for reuse
- Health status state machine: healthy → degraded (2 failures) → down (5 failures),
  plus cert_mismatch when served fingerprint differs from expected
- 8th scheduler loop (60s tick, per-endpoint configurable intervals)
- PostgreSQL migration 000011: endpoint_health_checks + endpoint_health_history tables
- 8 REST API endpoints (CRUD, history, acknowledge, summary)
- Health Monitor GUI page with summary bar, status table, create modal, auto-refresh
- 38 new tests (5 tlsprobe + 11 domain + 10 service + 8 handler + 4 frontend)
- All coverage thresholds maintained (service 68%, handler 83%, domain 87%, middleware 63%)
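
The thresholds above can be sketched as a pure function. Two things are assumptions: a single failure leaves the endpoint healthy, and the fingerprint check only applies when the probe succeeded.

```go
package main

import "fmt"

// HealthStatus names follow the commit text.
type HealthStatus string

const (
	Healthy      HealthStatus = "healthy"
	Degraded     HealthStatus = "degraded"
	Down         HealthStatus = "down"
	CertMismatch HealthStatus = "cert_mismatch"
)

// nextStatus derives the status from the consecutive failure count and
// the served vs expected certificate fingerprints (hypothetical helper).
func nextStatus(consecutiveFailures int, servedFP, expectedFP string) HealthStatus {
	switch {
	case consecutiveFailures >= 5:
		return Down
	case consecutiveFailures >= 2:
		return Degraded
	case consecutiveFailures == 0 && expectedFP != "" && servedFP != expectedFP:
		return CertMismatch
	default:
		return Healthy
	}
}

func main() {
	for _, n := range []int{0, 1, 2, 5} {
		fmt.Println(n, nextStatus(n, "aa", "aa"))
	}
	fmt.Println(nextStatus(0, "aa", "bb"))
}
```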

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-15 21:45:45 -04:00
shankar0123 f2e60b93a3 feat(M11c): crypto policy enforcement — CSR validation, MaxTTL caps, key metadata
Enforce certificate profile crypto constraints across all 5 issuance paths
(renewal, agent CSR, EST, SCEP). ValidateCSRAgainstProfile() rejects CSRs
with key algorithm/size that don't match profile rules. MaxTTL enforcement
caps certificate validity per issuer connector (Local CA, Vault, step-ca
enforce directly; ACME/DigiCert/Sectigo pass through). Key algorithm and
size are now persisted in certificate_versions for audit compliance.

16 new tests (12 service-layer + 4 Local CA connector). Removes hardcoded
version number from GUI sidebar. Documentation updated across architecture,
features, connectors, and README.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-15 21:05:14 -04:00
shankar0123 f16a9c767a docs: consolidate README — merge architecture, security, design decisions into Why certctl
Fold Architecture, Key Design Decisions, and Security sections into the
Why certctl section as bold-header paragraphs. Removes three standalone
sections, tightening the README structure: Documentation → Integrations →
Why certctl (with architecture, security, design decisions) → What It Does →
Quick Start → Examples → CLI → MCP → Development → Roadmap → License.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-15 17:06:43 -04:00
shankar0123 3a27c87b3f docs: move Supported Integrations under Documentation links in README
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-15 17:03:11 -04:00
shankar0123 0ed8676066 docs: rewrite README to highlight all adoption-driving features
Move documentation table to top (below Gantt chart). Condense screenshots
to 4 key images with "see all" link. Add Enrollment Protocols and
Standards & Revocation tables. Surface previously buried features:
dynamic GUI config, onboarding wizard, approval workflows, agent groups,
TLS verification, certificate export, SCEP, revocation infrastructure.
Fix stale numbers (26 pages, 111 routes) verified against repo source.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-15 17:00:09 -04:00
shankar0123 bcefb11e65 feat(M51): add SCEP server (RFC 8894) for MDM and network device enrollment
Implements Simple Certificate Enrollment Protocol with single-endpoint
operation-based dispatch (GetCACaps, GetCACert, PKIOperation), PKCS#7
SignedData CSR extraction with fallback for raw/base64 CSR, challenge
password authentication via CSR attributes, and shared internal/pkcs7
package extracted from EST handler to eliminate code duplication.

24 new tests (11 service + 13 handler) plus 5 shared pkcs7 package tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-15 16:47:18 -04:00
shankar0123 75cf8475f5 tighten BSL license scope, fix documentation underselling shipped features
Broadened BSL Additional Use Grant from "hosted or managed service" to cover
any commercial offering (embedded, bundled, integrated). Updated README to
promote all shipped connectors from Beta to Implemented, added EST/ARI/S/MIME
highlight, Helm quickstart, and corrected license description. Fixed
connectors.md stale claims (AWS ACM PCA listed as planned, K8s Secrets
listed as coming soon) and updated overview with exact connector counts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-15 15:54:03 -04:00
shankar0123 c015cab2f4 docs: rewrite features.md, audit README + architecture against repo
Rewrote docs/features.md from scratch as authoritative feature inventory
(1255 lines, every claim verified against source files).

Audited README.md and architecture.md against repo — fixed 19 stale
references: K8s Secrets status, issuer counts, dashboard page counts,
CI thresholds, missing connectors in Mermaid diagrams, OpenAPI operation
count, GetCACertPEM behavior, and V2/V4 roadmap accuracy.

Also includes related fixes discovered during audit:
- Scheduler skips expired/failed/revoked certs from auto-renewal
- Seed demo expiry dates moved outside 31-day scheduler query window
- Agent pages use correct last_heartbeat_at field name

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-15 00:22:57 -04:00
shankar0123 3da6584ab8 fix: correct K8s Secrets status to 'Coming in 2.1', increase audit trail page size to 200
The Kubernetes Secrets target connector has config validation, tests, UI,
and Helm RBAC implemented but the realK8sClient is a stub — runtime
deployment will fail. Update README and connectors.md to reflect actual
status instead of misleading 'Beta' label.

Also increase the audit trail GUI default from 50 to 200 events per page
(backend already permits up to 500).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-14 12:11:01 -04:00
shankar0123 68f6fd474b fix: return 409 on duplicate issuer name, improve error handling and onboarding defaults
Closes #7. The issuer create/update handlers swallowed all service errors
as generic 500s. Now differentiates: 409 for UNIQUE constraint violations,
400 for unsupported issuer type, 404 for not-found on update, 500 for
unknown errors. Adds structured error logging via slog.

OnboardingWizard now pre-populates config field defaults when a type is
selected (matching IssuersPage behavior), preventing empty required fields
from causing silent failures.

install-agent.sh hardened for curl|bash usage: --agent-id flag, =value
syntax, /dev/tty stdin reopening, proper stderr routing in download_binary,
non-interactive install examples in help text, and updated wizard commands.

Adds adversarial security tests for EST, path traversal, and query
injection handlers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-12 19:18:32 -04:00
shankar0123 614e4e636b chore: bump Go to 1.25.9 to patch 4 stdlib CVEs
Go 1.25.9 (released Apr 7 2026) fixes:
- GO-2026-4947: unexpected work during chain building in crypto/x509
- GO-2026-4946: inefficient policy validation in crypto/x509
- GO-2026-4870: unauthenticated TLS 1.3 KeyUpdate DoS in crypto/tls
- GO-2026-4865: JsBraceDepth context tracking XSS in html/template

Update CI workflow and go.mod to pin 1.25.9. govulncheck now reports
0 vulnerabilities in called code.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-09 23:33:25 -04:00
shankar0123 370f856725 fix: resolve 8 staticcheck lint errors in test files
SA1029: use typed context key instead of string in main_test.go
S1039: remove unnecessary fmt.Sprintf in validation_test.go
SA4023: fix unreachable nil check on concrete error type
SA4006: fix unused variable assignments in stepca_test.go (4 occurrences)
SA4000: fix duplicate expression in ssh_test.go (BEGIN vs END CERTIFICATE)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-09 23:27:57 -04:00
shankar0123 7382e5f03b test: comprehensive test gap closure across 24 packages
Close coverage gaps identified by dual-audit (qualitative + quantitative).
New test files for config (0%→98%), router (0%→100%), handler validation,
health, audit, response helpers, webhook notifier (0%→88%), email notifier,
middleware (recovery, rate limiter), domain profile, service nil-safety,
config helpers, issuer bootstrap, and server bootstrap wiring. Expanded
existing tests for ACME (34%→42%), step-ca (42%→52%), F5, SSH, agent
(43%→63%), scheduler (88%→99%), renewal service, and issuerfactory.

All tests pass: go test -short, go vet, go test -race clean.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-09 23:09:40 -04:00
shankar0123 5567d4b411 feat(M47): add Kubernetes Secrets target + AWS ACM PCA issuer connectors
Implement both M47 connectors with full cross-layer wiring:

Kubernetes Secrets target: DNS-1123 validation, kubernetes.io/tls Secret
create-or-update, chain concatenation, serial number validation, Helm
RBAC gating. 18 tests.

AWS ACM Private CA issuer: synchronous issuance (like Vault), ARN regex
validation, RFC 5280 revocation reason mapping, CA cert retrieval,
factory + env var seeding. 23 tests.

Cross-cutting: domain types, service validation, config, factory, agent
dispatch, frontend (TargetsPage, issuerTypes), OpenAPI, seed data, Helm
chart, connectors docs, README. Testing docs (testing-guide, qa-test-guide,
qa_test.go) with Parts thematically integrated near related connectors.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-07 20:21:09 -04:00
shankar0123 e5516d7286 test: add unified QA test suite (qa_test.go) replacing legacy bash smoke script
1717-line Go test file covering all 52 Parts of testing-guide.md against the
Docker Compose demo stack. ~120 automated subtests (API, DB, source, perf),
11 skipped Parts with reasons, ~270 manual gaps documented. Audited against
actual router, seed data, domain structs, and migrations — 8 factual bugs
caught and fixed during review. Companion guide at docs/qa-test-guide.md.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-06 07:35:38 -04:00
shankar0123 fd94e0bd19 docs: comprehensive testing guide audit — expand thin Parts, add 11 new connector/feature test sections
Refactored testing-guide.md from V2.0 (42 Parts, 444 tests) to V2.1 (52 Parts, 507 tests):

- Expanded Part 11 (ARI) and Part 19 (Agent Work Routing) with What/Why intro
  paragraphs and per-test annotations explaining the production impact
- Replaced Part 40 (Documentation) passive table with 8 executable verification
  tests (README screenshots, issuer/target type matching, OpenAPI parity, etc.)
- Added Part 39 benchmark tests for Prometheus endpoint and audit trail queries
- Added 11 new Part sections (42-52) covering all previously untested features:
  Envoy, Postfix/Dovecot, SSH, WinCertStore, JavaKeystore, Digest Email,
  Dynamic Issuer/Target Config, Onboarding Wizard, ACME Profiles, Helm Chart
- Fixed stale TOC entries (regenerated from actual headings)
- Removed duplicate TOC block left from previous reorder
- Added sign-off chart entries for all new Parts
- Updated summary: 144 auto (passed) + 88 auto (pending) + 5 skipped + 270 manual = 507 total

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-06 00:43:05 -04:00
shankar0123 d0415d3b5e chore: move HSM/TPM to V3 paid tier, rename roadmap.md to strategy.md
- HSM/TPM agent key storage and CA key storage moved from V5+ to V3 Pro
  (enterprise compliance gate, not adoption driver)
- Renamed roadmap.md to strategy.md (gitignored, never committed)
- Updated compliance-nist.md HSM references from V5 to V3 Pro

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-05 23:09:55 -04:00
shankar0123 c6efa4ab39 docs: add Docker Compose environments guide and fix compose files
- New deploy/ENVIRONMENTS.md: comprehensive walkthrough of all 4 compose
  files with service-by-service explanations, beginner-friendly Docker
  concepts, and expert-level networking/config details
- Fix docker-compose.dev.yml: agent LOG_LEVEL → CERTCTL_LOG_LEVEL (was
  silently ignored without the CERTCTL_ prefix)
- Add CERTCTL_CONFIG_ENCRYPTION_KEY to base and test compose (enables
  M34/M35 dynamic issuer/target config encryption)
- Add CERTCTL_DISCOVERY_DIRS to base compose agent (enables filesystem
  certificate discovery in default deployment)
- Cross-link ENVIRONMENTS.md from README doc table and quickstart.md

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-05 21:57:17 -04:00
shankar0123 dedf7fa3a9 docs: add quick-start jump link near top of README
Adds a one-line "Ready to try it?" link right after the maintainer
callout, before the longer prose sections. Gives scanners an immediate
exit to install instructions without rearranging the README's
explain → show → install flow.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-05 21:38:34 -04:00
shankar0123 4b5927dfff docs: expand README documentation table and fix orphaned doc links
- README: Add 7 missing docs to documentation table (MCP server, OpenAPI
  guide, migration guides for certbot/acme.sh/cert-manager, test
  environment, testing guide). Fix connector reference description to
  remove stale counts. Link OpenAPI guide instead of raw YAML.
- architecture.md: Add cross-references to testing-guide.md and
  test-env.md from testing strategy section and What's Next links.
  These were the only two orphaned docs with zero inbound references.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-05 21:37:47 -04:00
shankar0123 cc03f55006 docs: comprehensive documentation audit — fix stale counts, V2/V3 matrix, connector status
- features.md: Fix Feature Matrix to correctly show all V2 Free features
  (F5/IIS/WinCertStore/JavaKeystore as Implemented, not Stub; Vault/DigiCert/
  Sectigo/GoogleCAS as V2 Free, not V3 Paid). Add missing shipped features
  (EST, verification, export, S/MIME, ARI, digest, Helm, onboarding). Update
  issuer count to 9, target count to 13.
- architecture.md: Fix F5/IIS from "interface only, implementation planned"
  to implemented. Add all 13 target connectors to built-in targets list.
- why-certctl.md: Add Sectigo and Google CAS to issuer list (7→9). Fix
  target count (10→13). Remove hardcoded endpoint/operation counts.
- connectors.md: Fix F5 BIG-IP TOC entry from "Interface Only" to
  "Implemented". Remove dead "Planned Issuers" TOC link.
- README.md: Remove competitor product names (CertKit, KeyTalk). Remove
  hardcoded dashboard page count. Remove hardcoded endpoint counts. Fix V4
  roadmap to remove already-shipped issuers (Sectigo, Google CAS).
- Remove hardcoded MCP tool counts (78/80) across 8 files (mcp.md,
  architecture.md, features.md, testing-guide.md, concepts.md, quickstart.md,
  demo-advanced.md, why-certctl.md). Replace with "REST API exposed via MCP"
  to avoid future drift.
- quickstart.md: Docker Compose environments table (from previous session).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-05 21:33:12 -04:00
shankar0123 93e1dc598c fix: resolve frontend-to-backend mapping gaps across API types, config fields, and issuer IDs
Full audit of all ~100 backend API endpoints against frontend client functions
and TypeScript interfaces. Fixes field name mismatches, missing client functions,
phantom interface fields, type coercion for Go bool/int config fields, and
issuer type ID alignment with backend domain constants.

Backend:
- issuer.go/target.go: GUI-created entities default enabled=true (Go bool
  zero value was overriding DB DEFAULT)
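
The commit does not show how the default was restored. One common way to avoid the Go zero-value problem is a pointer field, sketched below under that assumption; `createRequest` and `resolveEnabled` are hypothetical names, not code from issuer.go or target.go.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// createRequest is a hypothetical request shape. Enabled is a pointer so that
// an omitted field (nil) can be told apart from an explicit false.
type createRequest struct {
	Name    string `json:"name"`
	Enabled *bool  `json:"enabled,omitempty"`
}

// resolveEnabled returns true when the caller did not send "enabled" at all,
// and the caller's explicit value otherwise.
func resolveEnabled(raw []byte) (bool, error) {
	var req createRequest
	if err := json.Unmarshal(raw, &req); err != nil {
		return false, err
	}
	if req.Enabled == nil {
		return true, nil
	}
	return *req.Enabled, nil
}

func main() {
	for _, body := range []string{`{"name":"a"}`, `{"name":"b","enabled":false}`} {
		enabled, err := resolveEnabled([]byte(body))
		fmt.Println(body, enabled, err)
	}
}
```

With a plain `bool` field, both request bodies above would decode to `false`, which is the behaviour the fix describes.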

Frontend types (types.ts):
- Certificate: fingerprint→fingerprint_sha256, phantom fields made optional
- CertificateVersion: fingerprint→fingerprint_sha256, chain_pem→pem_chain,
  removed phantom version/cert_pem fields
- Job: error_message→last_error (matches Go json tag)

Frontend client (client.ts):
- Added getNotification(id) and getAuditEvent(id) for existing backend routes

Frontend pages:
- CertificateDetailPage: derives serial/fingerprint/issuedAt from latest
  CertificateVersion instead of empty Certificate fields
- JobsPage/JobDetailPage: error_message→last_error
- TargetsPage: reload_cmd→reload_command, validate_cmd→validate_command,
  added missing config fields per backend structs (validate_command for
  NGINX/Apache, hostname/winrm_timeout for IIS, private_key/passphrase/
  cert_mode/key_mode for SSH, winrm_https/winrm_insecure for WinCertStore,
  create_keystore for JavaKeystore, mode for Dovecot), type coercion via
  buildConfigPayload() with BOOL_FIELDS/INT_FIELDS sets, IIS WinRM nesting
- TargetDetailPage: added passphrase to sensitiveKeys redaction
- issuerTypes.ts: type IDs aligned to backend constants (acme→ACME,
  local→GenericCA, stepca→StepCA, openssl→OpenSSL), backward compat aliases
  preserved, step-ca config fields updated to match backend struct

Utilities (utils.ts):
- formatDate/formatDateTime accept string|undefined|null

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-05 21:09:48 -04:00
shankar0123 25f33b830f fix: resolve golangci-lint issues in wincertstore connector
Remove unnecessary fmt.Sprintf wrapping a string literal (staticcheck S1039),
remove unused tempFileForPFX function, and clean up unused os import.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-05 19:16:34 -04:00
shankar0123 7d6ef44e21 feat(M46): Windows Certificate Store + Java Keystore target connectors, shared certutil package
Extract shared certutil helpers (CreatePFX, ParsePrivateKey, ComputeThumbprint,
GenerateRandomPassword, ParseCertificatePEM) from IIS connector for reuse.
Add WinCertStore connector (PowerShell Import-PfxCertificate, dual local/WinRM
mode, configurable store/location, expired cert cleanup) and JavaKeystore
connector (PEM→PKCS#12→keytool pipeline, JKS/PKCS12 support, shell injection
prevention, path traversal protection). 53 new tests, all passing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-05 19:14:32 -04:00
shankar0123 dfa4dbbcbd fix: remove unused jwkThumbprint, move verifyJWSSignature to test file
golangci-lint flagged jwkThumbprint as unused. Removed it and the dead
var _ compile-time checks. Moved verifyJWSSignature (test-only helper)
from profile.go to profile_test.go where it belongs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-05 13:58:40 -04:00
shankar0123 f92c997a50 feat(M45): ACME certificate profile selection, ARI RFC 9773 renumber, 45-day renewal positioning
Three related ACME ecosystem changes shipped as a single milestone:

1. ACME Certificate Profile Selection: Custom JWS-signed newOrder POST with
   `profile` field (e.g., `tlsserver`, `shortlived` for 6-day certs) bypassing
   acme.Client.AuthorizeOrder() since golang.org/x/crypto lacks profile support.
   ES256 JWS signing with kid mode, nonce management, directory discovery.
   Empty profile delegates to standard library path (zero behavior change).
   Configurable via CERTCTL_ACME_PROFILE env var. GUI: profile dropdown on
   ACME issuer config.

2. ARI RFC 9702 → 9773 Renumber: All 25+ references updated across Go source,
   docs, README, and examples. Zero remaining occurrences of RFC 9702.

3. 45-Day / Short-Lived Certificate Positioning: 5 domain tests validating
   renewal thresholds against SC-081v3 validity reduction timeline (200→100→47
   days) and Let's Encrypt 45-day/6-day profiles. ARI (RFC 9773) is the
   expected renewal path for 6-day shortlived certs.

New tests: 13 profile + 5 domain threshold + 1 frontend = 19 new tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-05 13:52:13 -04:00
shankar0123 697c0be9f3 feat(M38): SSH target connector for agentless deployment via SSH/SFTP
Adds a new target connector enabling certificate deployment to any
Linux/Unix server without installing the certctl agent binary. Uses the
proxy agent pattern — a single agent in the same network zone deploys
certs to remote servers over SSH/SFTP.

Key additions:
- SSH/SFTP connector with key auth (file/inline) + password auth
- Injectable SSHClient interface for cross-platform testing (25 tests)
- Shell injection prevention via validation.ValidateShellCommand()
- Configurable cert/key/chain paths with octal permissions
- GUI: 11 SSH config fields in target create wizard

Also fixes pre-existing frontend bug where all target type strings
(nginx, apache, etc.) were sent as lowercase but the backend expects
proper-case (NGINX, Apache, etc.), breaking GUI-created targets.
Adds missing TargetTypeSSH to validTargetTypes service map.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-05 12:36:01 -04:00
shankar0123 8f146e08d6 feat(M36): onboarding wizard for first-run experience
4-step wizard (Connect CA → Deploy Agent → Add Certificate → Done) shown
on fresh installs when no user-configured issuers or certificates exist.
Auto-seeded env var issuers (source="env") are excluded from first-run
detection. Wizard state latches to prevent query refetches from dismissing
it mid-flow. Split docker-compose into clean default (wizard-compatible)
and demo override (seed_demo.sql). Added missing migrations 000009/000010
to test compose.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-04 19:27:01 -04:00
shankar0123 e6088c79a3 feat(M35): dynamic target configuration with encrypted config, test connection, and GUI updates
Mirror M34's dynamic issuer config pattern for deployment targets: AES-256-GCM
encrypted config storage, sensitive field redaction in API responses, agent
heartbeat-based test connection endpoint, and full frontend updates including
test status indicators, source badges, and removal of stale hostname/status
fields from the Target interface.
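
Sensitive-field redaction of the kind described here might look like the sketch below. The marker list, the mask string, and the `redactConfig` name are assumptions for illustration; the commit does not state which keys the server treats as sensitive.

```go
package main

import (
	"fmt"
	"strings"
)

// sensitiveMarkers is an assumed list of substrings that flag a config key as
// sensitive; the real server may use a different set.
var sensitiveMarkers = []string{"password", "passphrase", "secret", "token", "key"}

// redactConfig returns a copy of cfg in which every non-empty value whose key
// contains a sensitive marker is replaced by a fixed mask. The input map is
// left untouched so the caller can still use the real values.
func redactConfig(cfg map[string]string) map[string]string {
	out := make(map[string]string, len(cfg))
	for k, v := range cfg {
		out[k] = v
		lower := strings.ToLower(k)
		for _, m := range sensitiveMarkers {
			if v != "" && strings.Contains(lower, m) {
				out[k] = "********"
				break
			}
		}
	}
	return out
}

func main() {
	cfg := map[string]string{"host": "10.0.0.5", "winrm_password": "hunter2", "api_token": ""}
	fmt.Println(redactConfig(cfg))
}
```

Copying into a new map rather than mutating in place keeps the decrypted config usable for the connector while only the API response is masked.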

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-04 01:09:53 -04:00
shankar0123 e19b8c95fe docs: remove hardcoded test counts from public-facing docs
Replace brittle test count numbers (1,554+, 1,088+, 211, etc.) with
descriptions of testing approach and CI-enforced coverage gates.
Counts go stale every milestone — coverage thresholds are machine-
verified and never drift.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-04 00:20:22 -04:00
shankar0123 995b72df05 feat(M34): dynamic issuer configuration with encrypted config storage
Replace static env-var-based issuer wiring with GUI-driven dynamic
configuration stored encrypted in PostgreSQL. Operators can now
configure, test, enable/disable, and manage issuers from the dashboard
without restarting the server.

Key changes:
- AES-256-GCM encryption for sensitive issuer config at rest (PBKDF2
  key derivation with 100k iterations)
- Dynamic IssuerRegistry with sync.RWMutex replacing static map
- Connector factory pattern (issuerfactory.NewFromConfig) replacing
  140 lines of static wiring in main.go
- Migration 000009: encrypted_config, last_tested_at, test_status,
  source columns on issuers table
- Env var seeding on first boot with ON CONFLICT DO NOTHING
- Registry Rebuild() for atomic map swap after CRUD operations
- Issuer type validation against domain constants on Create
- Audit trail for test connection results
- Conditional seeding for step-ca/OpenSSL (only when env vars set)
- GUI: source badge, connection test status on issuer detail page
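
The registry pattern in the list above (a `sync.RWMutex`-guarded map with a `Rebuild()` that swaps the whole map) could be sketched as follows. Only the names `IssuerRegistry` and `Rebuild` come from the commit message; the `Issuer` interface, the method signatures, and `staticIssuer` are assumptions for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// Issuer is a stand-in for the real connector interface (assumption).
type Issuer interface{ Name() string }

// staticIssuer is a trivial Issuer used only for this sketch.
type staticIssuer string

func (s staticIssuer) Name() string { return string(s) }

// IssuerRegistry guards a map of issuers with a read/write mutex.
type IssuerRegistry struct {
	mu      sync.RWMutex
	issuers map[string]Issuer
}

func NewIssuerRegistry() *IssuerRegistry {
	return &IssuerRegistry{issuers: map[string]Issuer{}}
}

// Get takes only the read lock, so concurrent lookups do not block each other.
func (r *IssuerRegistry) Get(id string) (Issuer, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	iss, ok := r.issuers[id]
	return iss, ok
}

// Rebuild copies next into a fresh map outside the lock, then replaces the
// registry's map under the write lock. Readers see either the old map or the
// new one, never a partially updated map.
func (r *IssuerRegistry) Rebuild(next map[string]Issuer) {
	fresh := make(map[string]Issuer, len(next))
	for id, iss := range next {
		fresh[id] = iss
	}
	r.mu.Lock()
	r.issuers = fresh
	r.mu.Unlock()
}

func main() {
	reg := NewIssuerRegistry()
	reg.Rebuild(map[string]Issuer{"iss-acme": staticIssuer("ACME")})
	iss, ok := reg.Get("iss-acme")
	fmt.Println(iss, ok)
}
```

Building the replacement map before taking the write lock keeps the critical section down to a single assignment, which matters when CRUD operations trigger rebuilds while issuance jobs are reading.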

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-04 00:20:13 -04:00
shankar0123 9954fd1100 fix: remove unused installKeyErrOn field for golangci-lint
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-03 22:29:34 -04:00
shankar0123 2a14a1da01 feat(M40): F5 BIG-IP target connector via iControl REST
Replace 190-line stub with full iControl REST implementation (~580 lines).
Token auth with 401 auto-retry, file upload + crypto object install,
transaction-based atomic SSL profile updates, cleanup on failure.
Injectable F5Client interface for cross-platform testing. 32 tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-03 22:26:58 -04:00
shankar0123 5a53b648b1 feat(M44): Google CAS issuer connector
Google Cloud Certificate Authority Service integration via REST API
with OAuth2 service account auth (JWT→access token). Synchronous
issuance model, CA pool selection, mutex-guarded token caching,
revocation with RFC 5280 reason mapping. No Google SDK dependency —
all stdlib. 19 tests with httptest mock OAuth2 + CAS API.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-03 21:25:34 -04:00
shankar0123 cb72292b83 fix: use tagged switch for staticcheck QF1002 in sectigo tests
Convert 3 untagged switch statements to tagged `switch r.URL.Path {}`
form to satisfy staticcheck QF1002. No behavioral change.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-03 21:08:21 -04:00
shankar0123 3a11e447cf feat(M43): Sectigo SCM issuer connector
Implement Sectigo Certificate Manager REST API connector with async
order model (enroll → poll → collect PEM), 3-header auth, DV/OV/EV
support, collect-not-ready (400/-183) graceful handling, and RFC 5280
revocation reason mapping. 20 tests with httptest mock API.
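
For reference, an RFC 5280 revocation reason mapping can be sketched as a lookup from reason name to CRLReason code. The numeric codes are those defined in RFC 5280 section 5.3.1 (7 is unused); the `crlReasonCode` name, the case-insensitive matching, and the fallback to `unspecified` are assumptions, not the connector's documented behaviour.

```go
package main

import (
	"fmt"
	"strings"
)

// crlReasonCodes maps lowercase RFC 5280 CRLReason names to their codes.
// Code 7 is not assigned by the RFC.
var crlReasonCodes = map[string]int{
	"unspecified":          0,
	"keycompromise":        1,
	"cacompromise":         2,
	"affiliationchanged":   3,
	"superseded":           4,
	"cessationofoperation": 5,
	"certificatehold":      6,
	"removefromcrl":        8,
	"privilegewithdrawn":   9,
	"aacompromise":         10,
}

// crlReasonCode is a hypothetical helper. Unknown or empty reasons fall back
// to 0 (unspecified); that fallback is a design assumption of this sketch.
func crlReasonCode(reason string) int {
	if code, ok := crlReasonCodes[strings.ToLower(strings.TrimSpace(reason))]; ok {
		return code
	}
	return 0
}

func main() {
	for _, r := range []string{"keyCompromise", "superseded", "not-a-reason"} {
		fmt.Printf("%s -> %d\n", r, crlReasonCode(r))
	}
}
```

A connector would then translate the numeric code (or the canonical name) into whatever representation the upstream CA API expects.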

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-03 21:01:14 -04:00
shankar0123 bad02e6f23 docs: add deployment examples index and cross-link migration guides
Create docs/examples.md as the central entry point for all 5 turnkey
docker-compose scenarios with a decision matrix, per-example summaries,
and contextual migration guide links. Update quickstart.md to bridge
from demo to real deployment. Consolidate README docs table (10 rows
from 13). Fix Vault PKI "(planned)" in cert-manager guide.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-03 17:41:23 -04:00
shankar0123 4c3b7cbb16 docs: fix stale references, seed data case bugs, and convert ASCII diagrams to Mermaid
Audit all docs and examples against current codebase state. Fix seed_demo.sql
domain constant casing (IssuerType, TargetType, AgentStatus) that would cause
agent dispatch failures. Fix example docker-compose health endpoints (/health
not /api/v1/health) and env var names (CERTCTL_DATABASE_URL). Update connector
counts, test numbers, and planned→implemented status across docs. Convert 3
ASCII flow diagrams to Mermaid.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-03 16:11:42 -04:00
shankar0123 e8c64b47dd docs: rewrite why-certctl positioning page
Fix stale competitive claims (IIS shipped in M39, target count now 10),
add 47-day operational math as forcing function, add credibility signals
(1554 tests, 97 API operations, CI pipeline), restructure competitive
comparisons by category for scannability, add "What Else Ships Free"
feature surface section, add "Who Should Look Elsewhere" disqualification,
move ownership message to opening paragraph.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-03 15:50:41 -04:00
shankar0123 9feb6c796d feat(M42): Postfix/Dovecot mail server target connector
Dual-mode TLS connector for mail servers — single package with mode
field selecting Postfix or Dovecot defaults. File-based cert/key
deployment with correct permissions (cert 0644, key 0600), optional
chain append, shell injection prevention, and configurable
reload/validate commands. 18 tests covering config validation,
deployment, and security. GUI wizard fields and OpenAPI enum updated.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-03 01:46:15 -04:00
shankar0123 fd05bacb76 feat(M41): Envoy target connector with SDS support
File-based deployment for Envoy service mesh — writes cert/key/chain
to watched directory with optional SDS JSON config for xDS bootstrap.
Path traversal prevention, configurable filenames, 15 tests passing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-03 01:23:35 -04:00
shankar0123 f51571297d docs: update README for M39 WinRM completion
Update test count (1,521+), IIS target description (local + WinRM),
architecture section (proxy agent mention), and integration list.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-02 21:00:39 -04:00
shankar0123 9a41d0ca39 feat(M39): IIS WinRM proxy agent mode + front-to-back wiring
Complete the IIS target connector with dual-mode deployment:
- WinRM proxy agent mode via masterzen/winrm for remote Windows servers
- Base64 PFX transfer with try/finally cleanup on remote host
- GUI wizard updated with 13 IIS config fields including WinRM settings
- TargetDetailPage sensitive field redaction (password/secret/token/key)
- OpenAPI TargetType enum updated (added Traefik, Caddy)
- connectors.md fully documented with WinRM proxy config example
- 38 total IIS tests (10 new WinRM tests), all passing with race detection

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-02 20:53:20 -04:00
shankar0123 8b52da6aef feat(M39): IIS target connector + README overhaul
Implement full IIS target connector with PEM-to-PFX conversion via
go-pkcs12, PowerShell-based deployment (Import-PfxCertificate, IIS
binding management), SHA-1 thumbprint computation, and SNI support.
Injectable PowerShellExecutor interface enables cross-platform testing.
Regex-validated config fields prevent PowerShell injection. 28 tests.

Restructure README from 563 to 313 lines: outcome-focused feature
descriptions, "Who Is This For" persona section, examples promoted
above the fold, configuration/API/security reference moved to docs.
All numbers verified against repo (25 GUI pages, 97 OpenAPI ops,
CI thresholds service 55%/handler 60%/domain 40%/middleware 30%).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-02 20:27:27 -04:00
312 changed files with 68468 additions and 7372 deletions
+14 -3
@@ -19,7 +19,7 @@ jobs:
- name: Set up Go
uses: actions/setup-go@v5
with:
go-version: '1.25'
go-version: '1.25.9'
- name: Go Build
run: |
@@ -45,11 +45,11 @@ jobs:
run: govulncheck ./...
- name: Race Detection
run: go test -race ./internal/service/... ./internal/api/handler/... ./internal/api/middleware/... ./internal/scheduler/... ./internal/connector/... ./internal/domain/... ./internal/validation/... -count=1 -timeout 300s
run: go test -race ./internal/service/... ./internal/api/handler/... ./internal/api/middleware/... ./internal/scheduler/... ./internal/connector/... ./internal/crypto/... ./internal/domain/... ./internal/validation/... ./internal/tlsprobe/... -count=1 -timeout 300s
- name: Go Test with Coverage
run: |
go test ./internal/service/... ./internal/api/handler/... ./internal/api/middleware/... ./internal/integration/... ./internal/connector/issuer/... ./internal/connector/target/... ./internal/connector/notifier/... ./internal/mcp/... ./internal/cli/... ./internal/domain/... ./internal/validation/... -count=1 -cover -coverprofile=coverage.out
go test ./internal/service/... ./internal/api/handler/... ./internal/api/middleware/... ./internal/integration/... ./internal/connector/issuer/... ./internal/connector/target/... ./internal/connector/notifier/... ./internal/connector/discovery/... ./internal/crypto/... ./internal/mcp/... ./internal/cli/... ./internal/domain/... ./internal/validation/... ./internal/tlsprobe/... -count=1 -cover -coverprofile=coverage.out
- name: Check Coverage Thresholds
run: |
@@ -73,6 +73,13 @@ jobs:
MIDDLEWARE_COV=$(go tool cover -func=coverage.out | grep 'internal/api/middleware' | awk '{print $NF}' | sed 's/%//' | awk '{sum+=$1; n++} END {if(n>0) printf "%.1f", sum/n; else print "0"}')
echo "Middleware layer coverage: ${MIDDLEWARE_COV}%"
# Check crypto package coverage (target: 85%+)
# M-8 rationale: encryption primitives are a security-critical gate.
# v2 format, key-derivation, fallback, and fail-closed sentinel paths
# all need exhaustive coverage to avoid silent regressions (CWE-916 / CWE-329).
CRYPTO_COV=$(go tool cover -func=coverage.out | grep 'internal/crypto' | awk '{print $NF}' | sed 's/%//' | awk '{sum+=$1; n++} END {if(n>0) printf "%.1f", sum/n; else print "0"}')
echo "Crypto package coverage: ${CRYPTO_COV}%"
# Fail if thresholds not met
if [ "$(echo "$SERVICE_COV < 55" | bc -l)" -eq 1 ]; then
echo "::error::Service layer coverage ${SERVICE_COV}% is below 55% threshold"
@@ -90,6 +97,10 @@ jobs:
echo "::error::Middleware layer coverage ${MIDDLEWARE_COV}% is below 30% threshold"
exit 1
fi
if [ "$(echo "$CRYPTO_COV < 85" | bc -l)" -eq 1 ]; then
echo "::error::Crypto package coverage ${CRYPTO_COV}% is below 85% threshold"
exit 1
fi
echo "Coverage thresholds passed!"
- name: Upload Coverage Report
+292 -43
@@ -7,40 +7,30 @@ on:
env:
REGISTRY: ghcr.io
GO_VERSION: '1.22'
# Keep in lock-step with .github/workflows/ci.yml (M-3).
GO_VERSION: '1.25.9'
IMAGE_NAMESPACE: shankar0123
jobs:
# Cross-compile agent and server binaries for multiple platforms
# ----------------------------------------------------------------------
# build-binaries (M-3): matrix build every (binary × OS × arch) tuple.
# For each tuple we produce: the binary, a SPDX-JSON SBOM, a keyless
# Cosign signature + certificate bundle, and a single-line sha256sum
# file. All artefacts are uploaded to a workflow-scoped artifact; the
# aggregate-checksums job fans them back in for release upload.
# ----------------------------------------------------------------------
build-binaries:
name: Build Cross-Platform Binaries
name: Build ${{ matrix.binary }} (${{ matrix.os }}/${{ matrix.arch }})
runs-on: ubuntu-latest
permissions:
contents: write
contents: read
id-token: write # Cosign keyless OIDC identity token
strategy:
fail-fast: false
matrix:
include:
# Agent binaries (4 platforms)
- os: linux
arch: amd64
binary: agent
- os: linux
arch: arm64
binary: agent
- os: darwin
arch: amd64
binary: agent
- os: darwin
arch: arm64
binary: agent
# Server binaries (2 platforms)
- os: linux
arch: amd64
binary: server
- os: linux
arch: arm64
binary: server
binary: [agent, server, cli, mcp-server]
os: [linux, darwin]
arch: [amd64, arm64]
steps:
- uses: actions/checkout@v4
@@ -51,35 +41,174 @@ jobs:
- name: Extract version from tag
id: version
run: echo "VERSION=${GITHUB_REF#refs/tags/}" >> $GITHUB_OUTPUT
run: echo "VERSION=${GITHUB_REF#refs/tags/}" >> "$GITHUB_OUTPUT"
- name: Build ${{ matrix.binary }} binary (${{ matrix.os }}-${{ matrix.arch }})
- name: Build binary
id: build
env:
GOOS: ${{ matrix.os }}
GOARCH: ${{ matrix.arch }}
CGO_ENABLED: 0
CGO_ENABLED: '0'
VERSION: ${{ steps.version.outputs.VERSION }}
run: |
set -euo pipefail
OUTPUT_NAME="certctl-${{ matrix.binary }}-${{ matrix.os }}-${{ matrix.arch }}"
go build -ldflags="-w -s -X main.Version=${{ steps.version.outputs.VERSION }}" \
mkdir -p dist
go build \
-trimpath \
-ldflags="-w -s -X main.Version=${VERSION}" \
-o "dist/${OUTPUT_NAME}" \
"./cmd/${{ matrix.binary }}"
ls -lh "dist/${OUTPUT_NAME}"
echo "output_name=${OUTPUT_NAME}" >> "$GITHUB_OUTPUT"
- name: Upload binaries to release
- name: Generate SBOM (SPDX-JSON)
uses: anchore/sbom-action@e22c389904149dbc22b58101806040fa8d37a610 # v0.24.0
with:
file: dist/${{ steps.build.outputs.output_name }}
format: spdx-json
output-file: dist/${{ steps.build.outputs.output_name }}.sbom.spdx.json
upload-artifact: false
upload-release-assets: false
- name: Install Cosign
uses: sigstore/cosign-installer@cad07c2e89fa2edd6e2d7bab4c1aa38e53f76003 # v4.1.1
- name: Keyless-sign binary with Cosign
env:
OUTPUT_NAME: ${{ steps.build.outputs.output_name }}
run: |
set -euo pipefail
# Cosign v3.0 (shipped by cosign-installer@v4.1.1 default
# cosign-release=v3.0.5) removed --output-signature/--output-certificate
# on sign-blob. The replacement is --bundle, which emits a unified
# Sigstore bundle (signature + cert chain + Rekor inclusion proof) as
# a single .sigstore.json artefact. M-11.
cosign sign-blob \
--yes \
--bundle "dist/${OUTPUT_NAME}.sigstore.json" \
"dist/${OUTPUT_NAME}"
- name: Compute SHA-256 sidecar
env:
OUTPUT_NAME: ${{ steps.build.outputs.output_name }}
run: |
set -euo pipefail
cd dist
sha256sum "${OUTPUT_NAME}" > "${OUTPUT_NAME}.sha256"
cat "${OUTPUT_NAME}.sha256"
- name: Upload build artefacts
uses: actions/upload-artifact@v4
with:
name: binary-${{ steps.build.outputs.output_name }}
path: |
dist/${{ steps.build.outputs.output_name }}
dist/${{ steps.build.outputs.output_name }}.sigstore.json
dist/${{ steps.build.outputs.output_name }}.sbom.spdx.json
dist/${{ steps.build.outputs.output_name }}.sha256
if-no-files-found: error
retention-days: 7
# ----------------------------------------------------------------------
# aggregate-checksums (M-3): fan in every matrix artefact, produce a
# single checksums.txt (sha256sum format, compatible with `sha256sum
# -c`), sign it with Cosign, upload everything to the GitHub Release,
# and emit a base64-encoded hash manifest for the SLSA generator.
# ----------------------------------------------------------------------
aggregate-checksums:
name: Aggregate checksums & sign
runs-on: ubuntu-latest
needs: [build-binaries]
permissions:
contents: write
id-token: write # Cosign keyless OIDC identity token
outputs:
hashes: ${{ steps.hashes.outputs.hashes }}
steps:
- name: Download binary artefacts
uses: actions/download-artifact@v4
with:
pattern: binary-*
path: artifacts
merge-multiple: true
- name: Aggregate SHA-256 sums
id: hashes
run: |
set -euo pipefail
cd artifacts
: > checksums.txt
for f in certctl-*; do
case "$f" in
*.sigstore.json|*.sbom.spdx.json|*.sha256|checksums.txt)
continue ;;
esac
sha256sum "$f" >> checksums.txt
done
echo "=== checksums.txt ==="
cat checksums.txt
# base64 hashes (single line, no wrapping) for SLSA generator.
HASHES=$(base64 -w0 < checksums.txt)
echo "hashes=${HASHES}" >> "$GITHUB_OUTPUT"
- name: Install Cosign
uses: sigstore/cosign-installer@cad07c2e89fa2edd6e2d7bab4c1aa38e53f76003 # v4.1.1
- name: Keyless-sign checksums.txt
run: |
set -euo pipefail
cd artifacts
# Cosign v3.0 --bundle replaces the removed v2 flag pair
# --output-signature / --output-certificate. See M-11.
cosign sign-blob \
--yes \
--bundle checksums.txt.sigstore.json \
checksums.txt
- name: Upload artefacts to GitHub Release
uses: softprops/action-gh-release@v2
if: startsWith(github.ref, 'refs/tags/')
with:
files: |
dist/certctl-agent-*
dist/certctl-server-*
artifacts/certctl-*
artifacts/checksums.txt
artifacts/checksums.txt.sigstore.json
# Build and push Docker images
# ----------------------------------------------------------------------
# provenance-binaries (M-3): SLSA Level 3 provenance for every binary.
# The SLSA generic generator reusable workflow runs in a hermetic
# workflow run, producing multiple.intoto.jsonl from the base64 hash
# manifest and uploading it as a release asset.
# ----------------------------------------------------------------------
provenance-binaries:
name: SLSA provenance (binaries)
needs: [aggregate-checksums]
permissions:
actions: read
id-token: write
contents: write
uses: slsa-framework/slsa-github-generator/.github/workflows/generator_generic_slsa3.yml@v2.1.0
with:
base64-subjects: "${{ needs.aggregate-checksums.outputs.hashes }}"
upload-assets: true
provenance-name: multiple.intoto.jsonl
# ----------------------------------------------------------------------
# build-and-push-docker: push container images to GHCR with native
# SLSA L3 provenance (mode=max) and SBOM attestations emitted by
# docker/build-push-action@v6, plus a keyless Cosign signature on the
# image digest for identity-bound verification. The M-4 proxy-propagation
# build-args block is retained verbatim — M-3 only adds supply-chain
# steps; it never touches M-4 wiring.
# ----------------------------------------------------------------------
build-and-push-docker:
name: Build & Push Docker Images
runs-on: ubuntu-latest
permissions:
contents: write
packages: write
id-token: write # Cosign keyless OIDC identity token
steps:
- uses: actions/checkout@v4
@@ -93,40 +222,90 @@ jobs:
- name: Extract version from tag
id: version
run: echo "VERSION=${GITHUB_REF#refs/tags/}" >> $GITHUB_OUTPUT
run: echo "VERSION=${GITHUB_REF#refs/tags/}" >> "$GITHUB_OUTPUT"
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Install Cosign
uses: sigstore/cosign-installer@cad07c2e89fa2edd6e2d7bab4c1aa38e53f76003 # v4.1.1
- name: Build and push server image
id: server-push
uses: docker/build-push-action@v6
with:
context: .
file: ./Dockerfile
push: true
tags: |
${{ env.REGISTRY }}/shankar0123/certctl-server:${{ steps.version.outputs.VERSION }}
${{ env.REGISTRY }}/shankar0123/certctl-server:latest
${{ env.REGISTRY }}/${{ env.IMAGE_NAMESPACE }}/certctl-server:${{ steps.version.outputs.VERSION }}
${{ env.REGISTRY }}/${{ env.IMAGE_NAMESPACE }}/certctl-server:latest
# Proxy propagation (M-4, Issue #9) — forwards runner-level proxy
# secrets into the Docker build so self-hosted runners behind
# corporate proxies can reach public registries. GitHub-hosted
# runners don't need proxies, so the secrets are optional and
# resolve to empty strings when unset — byte-identical to the
# pre-fix behaviour for the public-runner path.
build-args: |
HTTP_PROXY=${{ secrets.HTTP_PROXY }}
HTTPS_PROXY=${{ secrets.HTTPS_PROXY }}
NO_PROXY=${{ secrets.NO_PROXY }}
# Supply-chain hardening (M-3): emit native SLSA L3 provenance
# and SBOM attestations bound to the image manifest.
provenance: mode=max
sbom: true
cache-from: type=gha
cache-to: type=gha,mode=max
- name: Keyless-sign server image with Cosign
env:
DIGEST: ${{ steps.server-push.outputs.digest }}
IMAGE: ${{ env.REGISTRY }}/${{ env.IMAGE_NAMESPACE }}/certctl-server
run: |
set -euo pipefail
cosign sign --yes "${IMAGE}@${DIGEST}"
- name: Build and push agent image
id: agent-push
uses: docker/build-push-action@v6
with:
context: .
file: ./Dockerfile.agent
push: true
tags: |
${{ env.REGISTRY }}/shankar0123/certctl-agent:${{ steps.version.outputs.VERSION }}
${{ env.REGISTRY }}/shankar0123/certctl-agent:latest
${{ env.REGISTRY }}/${{ env.IMAGE_NAMESPACE }}/certctl-agent:${{ steps.version.outputs.VERSION }}
${{ env.REGISTRY }}/${{ env.IMAGE_NAMESPACE }}/certctl-agent:latest
# Proxy propagation (M-4, Issue #9) — see server-image step for
# rationale. Empty secrets resolve to empty build args, leaving
# the un-proxied code path byte-identical to the pre-fix tree.
build-args: |
HTTP_PROXY=${{ secrets.HTTP_PROXY }}
HTTPS_PROXY=${{ secrets.HTTPS_PROXY }}
NO_PROXY=${{ secrets.NO_PROXY }}
# Supply-chain hardening (M-3): emit native SLSA L3 provenance
# and SBOM attestations bound to the image manifest.
provenance: mode=max
sbom: true
cache-from: type=gha
cache-to: type=gha,mode=max
# Create release notes with all artifacts
- name: Keyless-sign agent image with Cosign
env:
DIGEST: ${{ steps.agent-push.outputs.digest }}
IMAGE: ${{ env.REGISTRY }}/${{ env.IMAGE_NAMESPACE }}/certctl-agent
run: |
set -euo pipefail
cosign sign --yes "${IMAGE}@${DIGEST}"
# ----------------------------------------------------------------------
# create-release: stamp the release body. The actual asset uploads are
# handled by aggregate-checksums (binaries, SBOMs, sigs, certs,
# checksums.txt + signature) and the SLSA generator (multiple.intoto.jsonl).
# ----------------------------------------------------------------------
create-release:
name: Create Release Notes
runs-on: ubuntu-latest
needs: [build-binaries, build-and-push-docker]
needs: [build-binaries, aggregate-checksums, provenance-binaries, build-and-push-docker]
permissions:
contents: write
@@ -135,7 +314,7 @@ jobs:
- name: Extract version from tag
id: version
run: echo "VERSION=${GITHUB_REF#refs/tags/}" >> $GITHUB_OUTPUT
run: echo "VERSION=${GITHUB_REF#refs/tags/}" >> "$GITHUB_OUTPUT"
- name: Create release with notes
uses: softprops/action-gh-release@v2
@@ -197,6 +376,76 @@ jobs:
- **Linux x86_64**: `certctl-server-linux-amd64`
- **Linux ARM64**: `certctl-server-linux-arm64`
- **macOS x86_64**: `certctl-server-darwin-amd64`
- **macOS ARM64 (Apple Silicon)**: `certctl-server-darwin-arm64`
## CLI & MCP Server Binaries
The `certctl-cli` (REST API wrapper) and `certctl-mcp-server` (Model Context
Protocol bridge) binaries ship for all four platforms as well:
- `certctl-cli-{linux,darwin}-{amd64,arm64}`
- `certctl-mcp-server-{linux,darwin}-{amd64,arm64}`
## Verifying this release
Every binary, `checksums.txt`, and container image is signed with Cosign
keyless OIDC. Each binary ships with a SPDX-JSON SBOM. Binaries are covered
by SLSA Level 3 provenance; container images carry native SLSA L3 provenance
and SBOM attestations (docker/build-push-action `provenance: mode=max`,
`sbom: true`) in addition to a Cosign signature on the digest.
**1. Verify SHA-256 checksums:**
```bash
sha256sum -c checksums.txt
```
**2. Verify the Cosign signature on checksums.txt (keyless OIDC):**
```bash
cosign verify-blob \
--bundle checksums.txt.sigstore.json \
--certificate-identity-regexp '^https://github\.com/shankar0123/certctl/\.github/workflows/release\.yml@refs/tags/' \
--certificate-oidc-issuer 'https://token.actions.githubusercontent.com' \
checksums.txt
```
Replace `checksums.txt` with any individual binary name to verify that
artefact directly (each binary ships with its own `.sigstore.json`
bundle, e.g. `cosign verify-blob --bundle certctl-agent-linux-amd64.sigstore.json …`).
**3. Verify SLSA Level 3 provenance (binaries):**
```bash
slsa-verifier verify-artifact \
--provenance-path multiple.intoto.jsonl \
--source-uri github.com/shankar0123/certctl \
--source-tag ${{ steps.version.outputs.VERSION }} \
certctl-agent-linux-amd64
```
**4. Verify container image signature and attestations:**
```bash
IMAGE=ghcr.io/shankar0123/certctl-server:${{ steps.version.outputs.VERSION }}
cosign verify \
--certificate-identity-regexp '^https://github\.com/shankar0123/certctl/\.github/workflows/release\.yml@refs/tags/' \
--certificate-oidc-issuer 'https://token.actions.githubusercontent.com' \
"$IMAGE"
# SBOM attestation (SPDX-JSON) emitted by docker/build-push-action
cosign verify-attestation --type spdxjson \
--certificate-identity-regexp '^https://github\.com/shankar0123/certctl/' \
--certificate-oidc-issuer 'https://token.actions.githubusercontent.com' \
"$IMAGE"
# SLSA provenance attestation (mode=max)
cosign verify-attestation --type slsaprovenance \
--certificate-identity-regexp '^https://github\.com/shankar0123/certctl/' \
--certificate-oidc-issuer 'https://token.actions.githubusercontent.com' \
"$IMAGE"
```
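When several artefacts have been downloaded, the per-file check from step 2 can be looped. The following is a minimal sketch, assuming every bundle is named `<artefact>.sigstore.json` as described above; `verify_bundles` and the `COSIGN` override are hypothetical names introduced here for illustration, not part of the release tooling.

```bash
# Verify every downloaded artefact that has a matching Sigstore bundle.
# verify_bundles and the COSIGN override are illustrative assumptions.
verify_bundles() {
  local cosign_bin identity_re issuer failed bundle artifact
  cosign_bin="${COSIGN:-cosign}"
  identity_re='^https://github\.com/shankar0123/certctl/\.github/workflows/release\.yml@refs/tags/'
  issuer='https://token.actions.githubusercontent.com'
  failed=0
  for bundle in ./*.sigstore.json; do
    [ -e "$bundle" ] || continue            # glob matched nothing
    artifact="${bundle%.sigstore.json}"     # bundle name minus suffix
    if [ ! -f "$artifact" ]; then
      echo "skip: $artifact not downloaded" >&2
      continue
    fi
    # Same flags as the single-file command in step 2.
    "$cosign_bin" verify-blob \
      --bundle "$bundle" \
      --certificate-identity-regexp "$identity_re" \
      --certificate-oidc-issuer "$issuer" \
      "$artifact" || failed=1
  done
  return "$failed"
}
```

Run `verify_bundles` from the directory holding the downloads; it returns non-zero if any signature check fails, and `COSIGN=echo verify_bundles` prints the commands without running them.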
## Helm Chart
+6 -1
@@ -65,10 +65,15 @@ certctl-cli
/cli
# Private strategy docs
roadmap.md
strategy.md
SECURITY_REMEDIATION.md
# OS
.DS_Store
Thumbs.db
mcp-server
# Local Go build/module caches (session-scoped, never committed)
/.gocache/
/.gomodcache/
/.gopath/
+1
@@ -6,6 +6,7 @@ run:
linters:
default: none
enable:
- contextcheck
- govet
- staticcheck
- unused
+30 -4
@@ -3,17 +3,43 @@
# Stage 1: Build frontend
FROM node:20-alpine AS frontend
# Proxy propagation (M-4, Issue #9) — defaulted to empty so un-proxied builds
# behave identically to the pre-fix tree. When `HTTP_PROXY`/`HTTPS_PROXY`/
# `NO_PROXY` are forwarded via `docker build --build-arg` (or compose
# `build.args`), they are re-exported as ENV with both upper- and lower-case
# names because npm/apk/curl read the lowercase variants while Go, Node, and
# most HTTP libraries read the uppercase ones.
ARG HTTP_PROXY=
ARG HTTPS_PROXY=
ARG NO_PROXY=
ENV HTTP_PROXY=${HTTP_PROXY} \
HTTPS_PROXY=${HTTPS_PROXY} \
NO_PROXY=${NO_PROXY} \
http_proxy=${HTTP_PROXY} \
https_proxy=${HTTPS_PROXY} \
no_proxy=${NO_PROXY}
WORKDIR /app/web
COPY web/package.json web/package-lock.json ./
RUN npm ci
COPY web/ .
RUN npm run build
RUN npm ci --include=dev || npm ci --include=dev && \
node_modules/.bin/tsc --version && \
npm run build
# Stage 2: Build Go binary
FROM golang:1.25-alpine AS builder
# Proxy propagation (M-4, Issue #9) — see Stage 1 rationale.
ARG HTTP_PROXY=
ARG HTTPS_PROXY=
ARG NO_PROXY=
ENV HTTP_PROXY=${HTTP_PROXY} \
HTTPS_PROXY=${HTTPS_PROXY} \
NO_PROXY=${NO_PROXY} \
http_proxy=${HTTP_PROXY} \
https_proxy=${HTTPS_PROXY} \
no_proxy=${NO_PROXY}
RUN apk add --no-cache git ca-certificates tzdata
WORKDIR /app
+16
@@ -2,6 +2,22 @@
# Stage 1: Build
FROM golang:1.25-alpine AS builder
# Proxy propagation (M-4, Issue #9) — defaulted to empty so un-proxied builds
# behave identically to the pre-fix tree. When `HTTP_PROXY`/`HTTPS_PROXY`/
# `NO_PROXY` are forwarded via `docker build --build-arg` (or compose
# `build.args`), they are re-exported as ENV with both upper- and lower-case
# names because apk and curl read the lowercase variants while Go reads the
# uppercase ones.
ARG HTTP_PROXY=
ARG HTTPS_PROXY=
ARG NO_PROXY=
ENV HTTP_PROXY=${HTTP_PROXY} \
HTTPS_PROXY=${HTTPS_PROXY} \
NO_PROXY=${NO_PROXY} \
http_proxy=${HTTP_PROXY} \
https_proxy=${HTTPS_PROXY} \
no_proxy=${NO_PROXY}
RUN apk add --no-cache git ca-certificates
WORKDIR /app
+14 -7
@@ -6,13 +6,20 @@ Licensor: Shankar Reddy
Licensed Work: certctl
The Licensed Work is (c) 2026 Shankar Reddy.
Additional Use Grant: You may make use of the Licensed Work, provided that
you may not use the Licensed Work for a Certificate
Management Service. A "Certificate Management Service"
is a commercial offering that allows third parties
(other than your employees and contractors acting on
your behalf) to access and/or use the Licensed Work's
certificate lifecycle management functionality as part
of a hosted or managed service.
you may not use the Licensed Work for a Commercial
Certificate Service. A "Commercial Certificate Service"
is any product, service, or offering in which a third
party (other than your employees and contractors
acting on your behalf) accesses, uses, or benefits
from the Licensed Work's certificate management
functionality — including but not limited to lifecycle
management, discovery, monitoring, alerting, renewal
automation, deployment, and revocation — as part of
or in connection with an offering for which
compensation is received. This restriction applies
regardless of whether the Licensed Work is hosted,
managed, embedded, bundled, or integrated with
another product or service.
Change Date: March 14, 2033
+233 -390
@@ -14,7 +14,7 @@
TLS certificate lifespans are shrinking fast. The CA/Browser Forum passed [Ballot SC-081v3](https://cabforum.org/2025/04/11/ballot-sc081v3-introduce-schedule-of-reducing-validity-and-data-reuse-periods/) unanimously in April 2025, setting a phased reduction: **200 days** by March 2026, **100 days** by March 2027, and **47 days** by March 2029. Organizations managing dozens or hundreds of certificates can no longer rely on spreadsheets, calendar reminders, or manual renewal workflows. The math doesn't work — at 47-day lifespans, a team managing 100 certificates is processing 7+ renewals per week, every week, forever.
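The renewal load can be estimated with quick arithmetic. This is a rough sketch, assuming each certificate is renewed exactly once per validity period; renewing earlier than expiry, as most setups do, pushes the rate higher. `renewals_per_week` is a hypothetical helper used only for this estimate.

```bash
# Renewals per week = certificates * 7 days / validity period in days.
renewals_per_week() {
  awk -v certs="$1" -v days="$2" 'BEGIN { printf "%.1f\n", certs * 7 / days }'
}

renewals_per_week 100 47    # 14.9
renewals_per_week 100 200   # 3.5
```

Under that assumption, 100 certificates at 47-day lifetimes come to roughly 15 renewals per week, so the "7+" figure is a conservative floor.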
certctl is a self-hosted platform that automates the entire certificate lifecycle — from issuance through renewal to deployment — with zero human intervention. It works with any certificate authority, deploys to any server, and keeps private keys on your infrastructure where they belong.
certctl is a self-hosted platform that automates the entire certificate lifecycle — from issuance through renewal to deployment — with zero human intervention. It works with any certificate authority, deploys to any server, and keeps private keys on your infrastructure where they belong. It's free, self-hosted, and covers the same lifecycle that enterprise platforms charge $100K+/year for.
```mermaid
gantt
@@ -36,91 +36,101 @@ gantt
47 days :crit, 2020-01-01, 47d
```
> **Actively maintained — shipping weekly.** Found something? [Open a GitHub issue](https://github.com/shankar0123/certctl/issues) — issues get triaged same-day. CI runs the full test suite with race detection, static analysis, and vulnerability scanning on every commit.
**Ready to try it?** Jump to the [Quick Start](#quick-start) — you'll have a running dashboard in under 5 minutes.
## Documentation
| Guide | Description |
|-------|-------------|
| [Why certctl?](docs/why-certctl.md) | Competitive positioning — how certctl compares to open-source and enterprise certificate management platforms |
| [Why certctl?](docs/why-certctl.md) | How certctl compares to ACME clients, agent-based SaaS, and enterprise platforms |
| [Concepts](docs/concepts.md) | TLS certificates explained from scratch — for beginners who know nothing about certs |
| [Quick Start](docs/quickstart.md) | Get running in 5 minutes — dashboard, API, CLI, discovery, stakeholder demo flow |
| [Quick Start](docs/quickstart.md) | 5-minute setup — dashboard, API, CLI, discovery, stakeholder demo flow |
| [Docker Compose Environments](deploy/ENVIRONMENTS.md) | Service-by-service walkthrough of all 4 compose files, env var reference |
| [Deployment Examples](docs/examples.md) | 5 turnkey scenarios (ACME+NGINX, wildcard DNS-01, private CA, step-ca, multi-issuer) with migration guides |
| [Advanced Demo](docs/demo-advanced.md) | Issue a certificate end-to-end with technical deep-dives |
| [Architecture](docs/architecture.md) | System design, data flow diagrams, security model |
| [Feature Inventory](docs/features.md) | Complete reference of all V2 capabilities, API endpoints, and configuration |
| [Connectors](docs/connectors.md) | Build custom issuer, target, and notifier connectors |
| [Feature Inventory](docs/features.md) | Complete reference of all capabilities, API endpoints, and configuration |
| [Connector Reference](docs/connectors.md) | Configuration for all issuer, target, and notifier connectors |
| [MCP Server](docs/mcp.md) | AI integration via Model Context Protocol — setup, available tools, examples |
| [OpenAPI 3.1 Spec](docs/openapi.md) | API reference guide with endpoint overview ([raw spec](api/openapi.yaml)) |
| [Compliance Mapping](docs/compliance.md) | SOC 2 Type II, PCI-DSS 4.0, NIST SP 800-57 alignment guides |
| [Migrate from Certbot](docs/migrate-from-certbot.md) | Step-by-step migration from Certbot/Let's Encrypt cron jobs |
| [Migrate from acme.sh](docs/migrate-from-acmesh.md) | Migration guide for acme.sh users with DNS-01 scripts |
| [certctl for cert-manager Users](docs/certctl-for-cert-manager-users.md) | Using certctl alongside cert-manager for non-Kubernetes infrastructure |
> **Next release:** v2.1.0 will be tagged after the full V2 feature suite passes manual QA across all 34 sections of the [testing guide](docs/testing-guide.md). Automated CI (1,300+ Go tests + 211 frontend tests) gates every commit; the manual playbook covers integration, deployment, and UX verification that unit tests can't reach.
## Why certctl Exists
Certificate lifecycle tooling today falls into two camps: expensive enterprise platforms (Venafi, Keyfactor, Sectigo) that cost six figures and take months to deploy, or single-purpose tools (cert-manager, certbot) that handle one slice of the problem. If you run a mixed infrastructure — some NGINX, some Apache, a few HAProxy nodes, maybe an F5 — and you need to manage certificates from multiple CAs, there's nothing self-hosted that covers the full lifecycle without vendor lock-in.
certctl fills that gap. It's **CA-agnostic** — the issuer connector interface means you can plug in any certificate authority: a self-signed local CA for dev, Let's Encrypt via ACME for public certs, Smallstep step-ca for your private PKI, your enterprise ADCS via sub-CA mode, or any custom CA through a shell script adapter. You're never locked to a single CA vendor, and you can run multiple issuers simultaneously for different certificate types.
It's also **target-agnostic**. Agents deploy certificates to NGINX, Apache, HAProxy, Traefik, and Caddy — all using the same pluggable connector model for any server that accepts cert files. The control plane never initiates outbound connections — agents poll for work, which means certctl works behind firewalls, across network zones, and in air-gapped environments.
For a detailed comparison with CertKit, KeyTalk, and enterprise platforms (Venafi, Keyfactor), see [Why certctl?](docs/why-certctl.md)
## What It Does
certctl gives you a single pane of glass for every TLS certificate in your organization:
- **Web dashboard** — 24 operational pages: certificate inventory, deployment timeline with TLS verification, bulk operations (renew/revoke/reassign), discovery triage, network scan management, approval workflows, audit trail with CSV/JSON export, agent fleet overview with OS/arch grouping, short-lived credential monitoring, digest email preview
- **REST API** — 97 endpoints under `/api/v1/` + `/.well-known/est/` for complete automation, with sparse fields, sort, cursor pagination, and time-range filters
- **Agents** — generate private keys locally (ECDSA P-256), discover existing certs on disk (PEM/DER), submit CSRs only (private keys never leave your servers)
- **Network scanner** — discovers certificates on TLS endpoints across CIDR ranges without requiring agents, concurrent scanning with configurable timeouts
- **Certificate export** — PEM (JSON or file download) and PKCS#12 formats, with audit trail; private keys never included
- **S/MIME + EKU support** — issue certificates with emailProtection, codeSigning, timeStamping, clientAuth EKUs; email SAN routing for S/MIME
- **EST server** (RFC 7030) — device and WiFi certificate enrollment via industry-standard protocol
- **Post-deployment verification** — agent-side TLS probe confirms the target serves the correct certificate by SHA-256 fingerprint match
- **Approval workflows** — require human sign-off on renewals before deployment
- **Background scheduler** — 7 automated loops: renewal checks, job processing, agent health, notifications, short-lived cert expiry, network scanning, and scheduled certificate digest emails
- **ACME Renewal Information (ARI, RFC 9702)** — CA-directed renewal timing; certctl asks the CA when to renew instead of using fixed thresholds
- **Scheduled certificate digest emails** — HTML digest with certificate stats, expiration timeline, and job health; optional daily briefing via SMTP
- **Helm chart** — Production-ready Kubernetes deployment with server, PostgreSQL, and agent DaemonSet
For the full capability breakdown — revocation infrastructure, policy engine, observability, EST enrollment, and more — see the [Feature Inventory](docs/features.md).
| [Migrate from certbot](docs/migrate-from-certbot.md) | Step-by-step migration from certbot cron jobs to certctl |
| [Migrate from acme.sh](docs/migrate-from-acmesh.md) | Migration guide for acme.sh users, DNS hook compatibility |
| [certctl for cert-manager users](docs/certctl-for-cert-manager-users.md) | How certctl complements cert-manager for mixed infrastructure |
| [Test Environment](docs/test-env.md) | Docker Compose test environment with real CA backends |
| [Testing Guide](docs/testing-guide.md) | Comprehensive test procedures, smoke tests, and release sign-off checklist |
## Supported Integrations
### Certificate Issuers
| Issuer | Status | Type |
|--------|--------|------|
| Local CA (self-signed + sub-CA) | Implemented | `GenericCA` |
| ACME v2 (Let's Encrypt, Sectigo) | Implemented (HTTP-01 + DNS-01 + DNS-PERSIST-01) | `ACME` |
| ACME EAB (ZeroSSL, Google Trust) | Implemented (auto-fetch EAB from ZeroSSL) | `ACME` |
| step-ca | Implemented | `StepCA` |
| OpenSSL / Custom CA | Implemented | `OpenSSL` |
| Vault PKI | Beta | `VaultPKI` |
| DigiCert CertCentral | Beta | `DigiCert` |
**Vault PKI and DigiCert connectors are in beta.** If you hit any bugs or unexpected behavior, please [open a GitHub issue](https://github.com/shankar0123/certctl/issues) -- we're actively testing these and want to hear from real users.
| Issuer | Type | Notes |
|--------|------|-------|
| Local CA (self-signed + sub-CA) | `GenericCA` | Sub-CA mode chains to enterprise root (ADCS, etc.) |
| ACME v2 (Let's Encrypt, ZeroSSL, etc.) | `ACME` | HTTP-01, DNS-01, DNS-PERSIST-01 challenges. EAB auto-fetch from ZeroSSL. Profile selection (`tlsserver`, `shortlived`). |
| step-ca (Smallstep) | `StepCA` | JWK provisioner auth, issuance + renewal + revocation |
| OpenSSL / Custom CA | `OpenSSL` | Shell script adapter — any CA with a CLI |
| HashiCorp Vault PKI | `VaultPKI` | Token auth, synchronous issuance, CRL/OCSP delegated to Vault |
| DigiCert CertCentral | `DigiCert` | Async order model, OV/EV support, PEM bundle parsing |
| Sectigo SCM | `Sectigo` | 3-header auth, DV/OV/EV, collect-not-ready graceful handling |
| Google Cloud CAS | `GoogleCAS` | OAuth2 service account, synchronous issuance, CA pool selection |
| AWS ACM Private CA | `AWSACMPCA` | Synchronous issuance, configurable signing algorithm/template ARN |
| Entrust Certificate Services | `Entrust` | mTLS client certificate auth, synchronous/approval-pending issuance |
| GlobalSign Atlas HVCA | `GlobalSign` | mTLS + API key/secret dual auth, serial-based tracking |
| EJBCA (Keyfactor) | `EJBCA` | Dual auth (mTLS or OAuth2), self-hosted open-source CA |
**Note:** ADCS integration is handled via the Local CA's sub-CA mode — certctl operates as a subordinate CA with its signing certificate issued by ADCS. Any CA with a shell-accessible signing interface can be integrated today via the OpenSSL/Custom CA connector.
**Note:** ADCS integration is handled via the Local CA's sub-CA mode — certctl operates as a subordinate CA with its signing certificate issued by ADCS. Any CA with a shell-accessible signing interface can be integrated via the OpenSSL/Custom CA connector.
### Deployment Targets
| Target | Status | Type |
|--------|--------|------|
| NGINX | Implemented | `NGINX` |
| Apache httpd | Implemented | `Apache` |
| HAProxy | Implemented | `HAProxy` |
| Traefik | Implemented | `Traefik` |
| Caddy | Implemented | `Caddy` |
| F5 BIG-IP | Interface only | `F5` |
| Microsoft IIS | Interface only | `IIS` |
| Target | Type | Notes |
|--------|------|-------|
| NGINX | `NGINX` | File write, config validation, reload |
| Apache httpd | `Apache` | Separate cert/chain/key files, configtest, graceful reload |
| HAProxy | `HAProxy` | Combined PEM file, validate, reload |
| Traefik | `Traefik` | File provider deployment, auto-reload via filesystem watch |
| Caddy | `Caddy` | Dual-mode: admin API hot-reload or file-based |
| Envoy | `Envoy` | File-based with optional SDS JSON config |
| Postfix | `Postfix` | Mail server TLS, pairs with S/MIME support |
| Dovecot | `Dovecot` | Mail server TLS, pairs with S/MIME support |
| Microsoft IIS | `IIS` | Local PowerShell or remote WinRM, PEM→PFX, SNI support |
| F5 BIG-IP | `F5` | iControl REST via proxy agent, transaction-based atomic updates |
| SSH (Agentless) | `SSH` | SFTP cert/key deployment to any Linux/Unix server |
| Windows Certificate Store | `WinCertStore` | PowerShell Import-PfxCertificate, configurable store/location |
| Java Keystore | `JavaKeystore` | PEM→PKCS#12→keytool pipeline, JKS and PKCS12 formats |
| Kubernetes Secrets | `KubernetesSecrets` | `kubernetes.io/tls` Secrets, in-cluster or kubeconfig auth |
### Enrollment Protocols
| Protocol | Standard | Use Case |
|----------|----------|----------|
| EST (Enrollment over Secure Transport) | RFC 7030 | Device enrollment, WiFi/802.1X, IoT |
| SCEP (Simple Certificate Enrollment Protocol) | RFC 8894 | MDM platforms (Jamf, Intune), network devices |
| ACME v2 | RFC 8555 | Public CA automated issuance (Let's Encrypt, ZeroSSL) |
| ACME ARI (Renewal Information) | RFC 9773 | CA-directed renewal timing — the CA tells you when to renew |
### Standards & Revocation
| Capability | Standard | Notes |
|------------|----------|-------|
| DER-encoded X.509 CRL | RFC 5280 | Per-issuer, signed by issuing CA, 24h validity |
| Embedded OCSP responder | RFC 6960 | Good/revoked/unknown status per issuer |
| S/MIME certificates | RFC 8551 | Email protection EKU, adaptive KeyUsage flags |
| Certificate export | — | PEM (JSON/file) and PKCS#12 formats |
| ACME DNS-PERSIST-01 | IETF draft | Standing validation record, no per-renewal DNS updates |
### Notifiers
| Notifier | Status | Type |
|----------|--------|------|
| Email (SMTP) | Implemented | `Email` |
| Webhooks | Implemented | `Webhook` |
| Slack | Implemented | `Slack` |
| Microsoft Teams | Implemented | `Teams` |
| PagerDuty | Implemented | `PagerDuty` |
| OpsGenie | Implemented | `OpsGenie` |
| Notifier | Type |
|----------|------|
| Email (SMTP) | `Email` |
| Webhooks | `Webhook` |
| Slack | `Slack` |
| Microsoft Teams | `Teams` |
| PagerDuty | `PagerDuty` |
| OpsGenie | `OpsGenie` |
All connectors are pluggable — build your own by implementing the [connector interface](docs/connectors.md).
@@ -128,43 +138,57 @@ All connectors are pluggable — build your own by implementing the [connector i
<table>
<tr>
<td><a href="docs/screenshots/v2-dashboard.png"><img src="docs/screenshots/v2-dashboard.png" width="270" alt="Dashboard"></a><br><b>Dashboard</b><br><sub>Stats, expiration heatmap, renewal trends</sub></td>
<td><a href="docs/screenshots/v2-certificates.png"><img src="docs/screenshots/v2-certificates.png" width="270" alt="Certificates"></a><br><b>Certificates</b><br><sub>Inventory with status, owner, team filters</sub></td>
<td><a href="docs/screenshots/v2-agents.png"><img src="docs/screenshots/v2-agents.png" width="270" alt="Agents"></a><br><b>Agents</b><br><sub>Fleet health, OS/arch, IP, version</sub></td>
<td><a href="docs/screenshots/v2-dashboard.png"><img src="docs/screenshots/v2-dashboard.png" width="400" alt="Dashboard"></a><br><b>Dashboard</b><br><sub>Stats, expiration heatmap, renewal trends, issuance rate</sub></td>
<td><a href="docs/screenshots/v2-certificates.png"><img src="docs/screenshots/v2-certificates.png" width="400" alt="Certificates"></a><br><b>Certificates</b><br><sub>Inventory with bulk ops, status filters, owner/team columns</sub></td>
</tr>
<tr>
<td><a href="docs/screenshots/v2-fleet.png"><img src="docs/screenshots/v2-fleet.png" width="270" alt="Fleet Overview"></a><br><b>Fleet Overview</b><br><sub>OS distribution, status breakdown</sub></td>
<td><a href="docs/screenshots/v2-jobs.png"><img src="docs/screenshots/v2-jobs.png" width="270" alt="Jobs"></a><br><b>Jobs</b><br><sub>Issuance, renewal, deployment queue</sub></td>
<td><a href="docs/screenshots/v2-notifications.png"><img src="docs/screenshots/v2-notifications.png" width="270" alt="Notifications"></a><br><b>Notifications</b><br><sub>Expiration warnings, renewal results</sub></td>
</tr>
<tr>
<td><a href="docs/screenshots/v2-policies.png"><img src="docs/screenshots/v2-policies.png" width="270" alt="Policies"></a><br><b>Policies</b><br><sub>Ownership, lifetime, renewal rules</sub></td>
<td><a href="docs/screenshots/v2-profiles.png"><img src="docs/screenshots/v2-profiles.png" width="270" alt="Profiles"></a><br><b>Profiles</b><br><sub>Key types, max TTL, crypto constraints</sub></td>
<td><a href="docs/screenshots/v2-issuers.png"><img src="docs/screenshots/v2-issuers.png" width="270" alt="Issuers"></a><br><b>Issuers</b><br><sub>Local CA, ACME, step-ca, Vault PKI, DigiCert</sub></td>
</tr>
<tr>
<td><a href="docs/screenshots/v2-targets.png"><img src="docs/screenshots/v2-targets.png" width="270" alt="Targets"></a><br><b>Targets</b><br><sub>NGINX, Apache, HAProxy, Traefik, Caddy deployment</sub></td>
<td><a href="docs/screenshots/v2-owners.png"><img src="docs/screenshots/v2-owners.png" width="270" alt="Owners"></a><br><b>Owners</b><br><sub>Cert ownership with team assignment</sub></td>
<td><a href="docs/screenshots/v2-teams.png"><img src="docs/screenshots/v2-teams.png" width="270" alt="Teams"></a><br><b>Teams</b><br><sub>Org grouping for notification routing</sub></td>
</tr>
<tr>
<td><a href="docs/screenshots/v2-agent-groups.png"><img src="docs/screenshots/v2-agent-groups.png" width="270" alt="Agent Groups"></a><br><b>Agent Groups</b><br><sub>Dynamic grouping by OS, arch, CIDR</sub></td>
<td><a href="docs/screenshots/v2-audit-trail.png"><img src="docs/screenshots/v2-audit-trail.png" width="270" alt="Audit Trail"></a><br><b>Audit Trail</b><br><sub>Immutable log, CSV/JSON export</sub></td>
<td><a href="docs/screenshots/v2-short-lived.png"><img src="docs/screenshots/v2-short-lived.png" width="270" alt="Short-Lived"></a><br><b>Short-Lived Creds</b><br><sub>Ephemeral certs with live TTL countdown</sub></td>
<td><a href="docs/screenshots/v2-issuers.png"><img src="docs/screenshots/v2-issuers.png" width="400" alt="Issuers"></a><br><b>Issuers</b><br><sub>Catalog with 10 CA types, GUI config, test connection</sub></td>
<td><a href="docs/screenshots/v2-jobs.png"><img src="docs/screenshots/v2-jobs.png" width="400" alt="Jobs"></a><br><b>Jobs</b><br><sub>Issuance, renewal, deployment queue with approval workflow</sub></td>
</tr>
</table>
> **24 operational GUI pages** covering the full certificate lifecycle: dashboard, certificates (list + detail with EKU badges, deployment timeline, TLS verification status), agents, fleet overview, jobs (list + detail with approval workflow), notifications, policies, profiles, issuers (catalog + detail), targets (list + detail + wizard), owners, teams, agent groups, audit trail, short-lived credentials, discovery triage, network scan management, digest email preview, and observability metrics.
**[See all screenshots →](docs/screenshots/)**
## Why certctl
Certificate lifecycle tooling falls into two camps: enterprise platforms (Venafi, Keyfactor) that cost six figures and take months to deploy, or single-purpose tools (certbot, cert-manager) that handle one slice of the problem. certctl fills the gap — full lifecycle automation, self-hosted, free, CA-agnostic, and target-agnostic. If you're running certbot cron jobs, manually renewing certs, or stitching together scripts across mixed infrastructure, certctl replaces all of that.
Built for **platform engineering and DevOps teams** managing 10500+ certificates, **security and compliance teams** who need audit trails and policy enforcement for SOC 2, PCI-DSS 4.0, or NIST SP 800-57 ([compliance mapping included](docs/compliance.md)), and **small teams without enterprise budgets** who need Venafi-grade automation for a 50-server environment. For a detailed comparison, see [Why certctl?](docs/why-certctl.md)
**Architecture.** Go 1.25 control plane with handler→service→repository layering, PostgreSQL 16 backend (21 tables), and a pull-only deployment model — the server never initiates outbound connections. Agents poll for work. For network appliances and agentless servers, a proxy agent in the same network zone handles deployment via the target's API (WinRM, iControl REST, SSH/SFTP). Background scheduler runs 7 loops: renewal with ARI integration (1h), job processing (30s), agent health (2m), notifications (1m), short-lived cert expiry (30s), network scanning (6h), certificate digest (24h). See [Architecture Guide](docs/architecture.md) for full system diagrams.
**Security-first.** Agents generate ECDSA P-256 keys locally — private keys never touch the control plane. API key auth enforced by default with SHA-256 hashing and constant-time comparison. CORS deny-by-default. Shell injection prevention on all connector scripts. SSRF protection (reserved IP filtering) on the network scanner. Atomic idempotency guards on scheduler loops. Issuer and target credentials encrypted at rest with AES-256-GCM. Every API call recorded to an immutable audit trail with actor attribution, body hash, and latency tracking. CI runs race detection, 11 linters, and vulnerability scanning on every commit.
**Key design decisions.** TEXT primary keys — human-readable prefixed IDs (`mc-api-prod`, `t-platform`, `o-alice`) so you can identify resources at a glance in logs and queries. Idempotent migrations (`IF NOT EXISTS`, `ON CONFLICT DO NOTHING`) safe for repeated execution. Dynamic configuration via GUI with AES-256-GCM encrypted credential storage and env var backward compatibility. Handlers define their own service interfaces for clean dependency inversion.
## What It Does
**Automated lifecycle.** Certificates renew and deploy themselves. The scheduler monitors expiration, issues through your CA, and deploys to targets — zero human intervention. ACME ARI (RFC 9773) lets the CA direct renewal timing. Ready for 47-day (SC-081v3) and 6-day (Let's Encrypt shortlived) certificate lifetimes.
**Operational dashboard.** 26-page GUI covers the entire lifecycle: certificate inventory with bulk ops, deployment timeline with rollback, discovery triage, network scan management, agent fleet health, short-lived credential countdown, approval workflows, and observability metrics. Configure issuers and targets from the dashboard — no env var editing, no server restarts.
**Private keys stay on your servers.** Agents generate ECDSA P-256 keys locally, submit only the CSR. The control plane never touches private keys. After deployment, agents probe the live TLS endpoint and compare SHA-256 fingerprints to confirm the right certificate is actually being served.
**Discovery.** Agents scan filesystems for existing PEM/DER certificates. The network scanner probes TLS endpoints across CIDR ranges without agents. Cloud discovery finds certificates in AWS Secrets Manager, Azure Key Vault, and GCP Secret Manager. Continuous TLS health monitoring tracks endpoint status (healthy/degraded/down/cert_mismatch) with configurable thresholds and historical probe data. All discovery modes feed into a unified triage workflow — claim, dismiss, or import what you find.
**Policy engine.** Certificate profiles constrain key types, max TTL, and EKUs — with crypto policy enforcement that validates every CSR against profile rules before it reaches the issuer. MaxTTL caps are enforced per issuer connector. Approval workflows pause jobs for human review. Ownership tracking routes notifications to the right team. Agent groups match devices by OS, architecture, IP CIDR, and version.
**Enrollment protocols.** EST server (RFC 7030) for device and WiFi enrollment. SCEP server (RFC 8894) for MDM platforms and network devices. S/MIME issuance with email protection EKU.
**Revocation.** Single and bulk revocation (by profile, owner, agent, or issuer). DER-encoded X.509 CRL per issuer, signed by the issuing CA. Embedded OCSP responder. RFC 5280 reason codes. Short-lived certs (TTL < 1 hour) are exempt — expiry is sufficient revocation.
**Audit and observability.** Immutable append-only audit trail records every lifecycle action, every API call, and every approval decision. Prometheus metrics endpoint. Scheduled certificate digest emails. Continuous endpoint health monitoring with state machine transitions and real-time alerts.
**Notifications.** Slack, Teams, PagerDuty, OpsGenie, SMTP, webhooks. Routed by certificate owner. Daily digest emails with stats and expiring certs.
**Multiple interfaces.** REST API (111 routes), CLI (12 commands), MCP server (80 tools for Claude, Cursor, Windsurf), Helm chart, web dashboard. Certificate export in PEM and PKCS#12.
**First-run onboarding.** Wizard guides you through connecting a CA, deploying an agent, and issuing your first certificate. Or start with the pre-populated demo — 32 certificates, 10 issuers, 180 days of history.
For the complete capability breakdown, see the [Feature Inventory](docs/features.md).
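The post-deployment check described above (probe the live endpoint, compare SHA-256 fingerprints) can be sketched roughly as follows. This is a minimal illustration, not certctl's agent code: the function names `fingerprint` and `probe`, the command-line arguments, and the choice to skip chain validation during the probe are all assumptions made for the example.

```go
package main

import (
	"crypto/sha256"
	"crypto/tls"
	"encoding/hex"
	"fmt"
	"net"
	"os"
	"strings"
	"time"
)

// fingerprint returns the lowercase hex SHA-256 digest of a DER-encoded certificate.
func fingerprint(der []byte) string {
	sum := sha256.Sum256(der)
	return hex.EncodeToString(sum[:])
}

// probe (hypothetical helper) connects to addr over TLS and returns the
// fingerprint of the leaf certificate the endpoint presents.
func probe(addr string, timeout time.Duration) (string, error) {
	dialer := &net.Dialer{Timeout: timeout}
	// Assumption: chain validation is skipped because this sketch only checks
	// which certificate is served, not whether it is trusted.
	conn, err := tls.DialWithDialer(dialer, "tcp", addr, &tls.Config{InsecureSkipVerify: true})
	if err != nil {
		return "", err
	}
	defer conn.Close()
	certs := conn.ConnectionState().PeerCertificates
	if len(certs) == 0 {
		return "", fmt.Errorf("no peer certificate from %s", addr)
	}
	return fingerprint(certs[0].Raw), nil
}

func main() {
	if len(os.Args) != 3 {
		fmt.Println("usage: verify <host:port> <expected-sha256-hex>")
		return
	}
	got, err := probe(os.Args[1], 10*time.Second)
	if err != nil {
		fmt.Println("probe failed:", err)
		os.Exit(1)
	}
	fmt.Println("match:", strings.EqualFold(got, os.Args[2]))
}
```

The 10-second timeout mirrors the documented `CERTCTL_VERIFY_TIMEOUT` default; everything else is illustrative.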
## Quick Start
### Docker Pull
```bash
docker pull shankar0123.docker.scarf.sh/certctl-server
docker pull shankar0123.docker.scarf.sh/certctl-agent
```
### Docker Compose (Recommended)
```bash
@@ -173,17 +197,19 @@ cd certctl
docker compose -f deploy/docker-compose.yml up -d --build
```
Wait ~30 seconds, then open **http://localhost:8443** in your browser.
Wait ~30 seconds, then open **http://localhost:8443** in your browser. The onboarding wizard walks you through connecting a CA, deploying an agent, and issuing your first certificate.
The dashboard comes pre-loaded with 32 demo certificates across 7 issuers, 8 agents, 180 days of job history, discovery scan data, and network scan targets — a realistic snapshot of a certificate inventory that looks like it's been running for months.
**Want a pre-populated demo instead?** Add the demo override to see 32 certificates across 10 issuers, 8 agents, and 180 days of realistic history:
```bash
docker compose -f deploy/docker-compose.yml -f deploy/docker-compose.demo.yml up -d --build
```
The `deploy/` directory has four compose files: `docker-compose.yml` (base platform), `docker-compose.demo.yml` (demo data overlay), `docker-compose.dev.yml` (PgAdmin + debug logging), and `docker-compose.test.yml` (standalone integration tests with real CA backends). See the [Docker Compose Environments Guide](deploy/ENVIRONMENTS.md) for a service-by-service walkthrough, or the [Quick Start](docs/quickstart.md#docker-compose-environments) for a summary.
Verify the API:
```bash
curl http://localhost:8443/health
# {"status":"healthy"}
curl -s http://localhost:8443/api/v1/certificates | jq '.total'
# 32
```
### Agent Install (One-Liner)
@@ -194,240 +220,104 @@ curl -sSL https://raw.githubusercontent.com/shankar0123/certctl/master/install-a
Detects your OS and architecture, downloads the binary, configures systemd (Linux) or launchd (macOS), and starts the agent. See [install-agent.sh](install-agent.sh) for details.
### Manual Build
### Helm Chart (Kubernetes)
```bash
# Prerequisites: Go 1.25+, PostgreSQL 16+, Docker (for testcontainers-go)
go mod download
make build
# Set up database
export CERTCTL_DATABASE_URL="postgres://certctl:certctl@localhost:5432/certctl?sslmode=disable"
export CERTCTL_AUTH_TYPE=none
make migrate-up
# Start server
./bin/server
# Start agent (separate terminal)
export CERTCTL_SERVER_URL=http://localhost:8443
export CERTCTL_API_KEY=change-me-in-production
export CERTCTL_AGENT_NAME=local-agent
export CERTCTL_AGENT_ID=agent-local-01
./bin/agent --agent-id=agent-local-01
helm install certctl deploy/helm/certctl/ \
--set server.apiKey=your-api-key \
--set postgres.password=your-db-password
```
## Architecture
Production-ready chart with Server Deployment, PostgreSQL StatefulSet, Agent DaemonSet, health probes, security contexts (non-root, read-only rootfs), and optional Ingress. See [values.yaml](deploy/helm/certctl/values.yaml) for all configuration options.
**Control plane** (Go 1.25 net/http) → **PostgreSQL 16** (21 tables, TEXT primary keys) → **Agents** (key generation, CSR submission, cert deployment). Background scheduler runs 7 loops: renewal checks (1h), job processing (30s), agent health (2m), notifications (1m), short-lived cert expiry (30s), network scanning (6h), certificate digest (24h). See [Architecture Guide](docs/architecture.md) for full system diagrams and data flow.
### Key Design Decisions
- **Private keys isolated from the control plane.** Agents generate ECDSA P-256 keys locally and submit CSRs (public key only). The server signs the CSR and returns the certificate — private keys never touch the control plane. Server-side keygen is available via `CERTCTL_KEYGEN_MODE=server` for demo/development only.
- **TEXT primary keys, not UUIDs.** IDs are human-readable prefixed strings (`mc-api-prod`, `t-platform`, `o-alice`) so you can identify resource types at a glance in logs and queries.
- **Handler → Service → Repository layering.** Handlers define their own service interfaces for clean dependency inversion. No global service singletons.
- **Idempotent migrations.** All schema uses `IF NOT EXISTS` and seed data uses `ON CONFLICT (id) DO NOTHING`, safe for repeated execution.
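As an illustration of the prefixed-ID convention above, a resource type can be read straight off an ID. The sketch below is hypothetical: the helper `resourceType` and the meaning attached to each prefix (`mc`, `t`, `o`) are inferred from the example IDs in this README, not taken from certctl's source.

```go
package main

import (
	"fmt"
	"strings"
)

// prefixes maps ID prefixes to resource kinds. The mapping is an assumption
// based on the example IDs mc-api-prod, t-platform, and o-alice.
var prefixes = map[string]string{
	"mc": "managed certificate",
	"t":  "team",
	"o":  "owner",
}

// resourceType returns the resource kind encoded in the ID prefix, and false
// when the ID has no hyphen or the prefix is not recognized.
func resourceType(id string) (string, bool) {
	prefix, _, found := strings.Cut(id, "-")
	if !found {
		return "", false
	}
	kind, ok := prefixes[prefix]
	return kind, ok
}

func main() {
	for _, id := range []string{"mc-api-prod", "t-platform", "o-alice"} {
		kind, ok := resourceType(id)
		fmt.Println(id, kind, ok)
	}
}
```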
PostgreSQL 16 with 21 tables covering certificates, versions, policies, issuers, targets, agents, jobs, teams, owners, profiles, agent groups, revocations, discovery, network scans, and audit events. See the [Architecture Guide](docs/architecture.md) for the full schema.
## Configuration
All environment variables use the `CERTCTL_` prefix. Full reference below (39 variables across server, agent, and connector config).
### Server — Core
| Variable | Default | Description |
|----------|---------|-------------|
| `CERTCTL_SERVER_HOST` | `127.0.0.1` | Server bind address |
| `CERTCTL_SERVER_PORT` | `8080` | Server listen port (1-65535) |
| `CERTCTL_DATABASE_URL` | `postgres://localhost/certctl` | PostgreSQL connection string (required) |
| `CERTCTL_DATABASE_MAX_CONNS` | `25` | PostgreSQL connection pool size (min 1) |
| `CERTCTL_DATABASE_MIGRATIONS_PATH` | `./migrations` | Path to migration SQL files |
| `CERTCTL_MAX_BODY_SIZE` | `1048576` | Max HTTP request body in bytes (default 1MB) |
| `CERTCTL_LOG_LEVEL` | `info` | Log verbosity: `debug`, `info`, `warn`, `error` |
| `CERTCTL_LOG_FORMAT` | `json` | Log format: `json` (structured) or `text` (human-readable) |
### Server — Auth, CORS, Rate Limiting
| Variable | Default | Description |
|----------|---------|-------------|
| `CERTCTL_AUTH_TYPE` | `api-key` | Auth mode: `api-key`, `jwt`, or `none` (demo only) |
| `CERTCTL_AUTH_SECRET` | — | Required for `api-key` and `jwt` auth types |
| `CERTCTL_CORS_ORIGINS` | *(empty = deny all)* | Comma-separated allowed origins, or `*` for dev |
| `CERTCTL_RATE_LIMIT_ENABLED` | `true` | Enable token bucket rate limiting |
| `CERTCTL_RATE_LIMIT_RPS` | `50` | Requests per second per client |
| `CERTCTL_RATE_LIMIT_BURST` | `100` | Max burst size |
| `CERTCTL_KEYGEN_MODE` | `agent` | Key generation: `agent` (production) or `server` (demo only) |
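To make the rate-limit settings concrete, here is a minimal token bucket sketch using the documented defaults (50 requests per second, burst of 100). It is an illustration of the general technique under assumed names (`bucket`, `newBucket`, `allowAt`); certctl's actual middleware may be structured differently.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// bucket is a simple token bucket: it refills at rate tokens per second,
// holds at most burst tokens, and each allowed request consumes one token.
type bucket struct {
	mu     sync.Mutex
	rate   float64
	burst  float64
	tokens float64
	last   time.Time
}

// newBucket returns a full bucket whose refill clock starts now.
func newBucket(rps, burst float64) *bucket {
	return &bucket{rate: rps, burst: burst, tokens: burst, last: time.Now()}
}

// allowAt reports whether a request arriving at the given time may proceed.
func (b *bucket) allowAt(now time.Time) bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	if elapsed := now.Sub(b.last).Seconds(); elapsed > 0 {
		b.tokens += elapsed * b.rate
		if b.tokens > b.burst {
			b.tokens = b.burst
		}
		b.last = now
	}
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

// allow checks a request arriving at the current wall-clock time.
func (b *bucket) allow() bool { return b.allowAt(time.Now()) }

func main() {
	b := newBucket(50, 100) // CERTCTL_RATE_LIMIT_RPS, CERTCTL_RATE_LIMIT_BURST defaults
	allowed := 0
	for i := 0; i < 150; i++ {
		if b.allow() {
			allowed++
		}
	}
	fmt.Println("allowed in initial burst:", allowed)
}
```

With these settings a client can send up to 100 requests at once, after which it is held to roughly 50 per second. A per-client limiter would keep one such bucket per client key.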
### Server — Scheduler
| Variable | Default | Description |
|----------|---------|-------------|
| `CERTCTL_SCHEDULER_RENEWAL_CHECK_INTERVAL` | `1h` | How often to check expiring certs (min 1m) |
| `CERTCTL_SCHEDULER_JOB_PROCESSOR_INTERVAL` | `30s` | How often to process pending jobs (min 1s) |
| `CERTCTL_SCHEDULER_AGENT_HEALTH_CHECK_INTERVAL` | `2m` | Agent heartbeat check frequency (min 1s) |
| `CERTCTL_SCHEDULER_NOTIFICATION_PROCESS_INTERVAL` | `1m` | Notification send frequency (min 1s) |
### Server — Sub-CA Mode
| Variable | Default | Description |
|----------|---------|-------------|
| `CERTCTL_CA_CERT_PATH` | — | PEM-encoded CA certificate for sub-CA mode |
| `CERTCTL_CA_KEY_PATH` | — | PEM-encoded CA private key (RSA, ECDSA, PKCS#8) |
### Server — Feature Flags
| Variable | Default | Description |
|----------|---------|-------------|
| `CERTCTL_EST_ENABLED` | `false` | Enable RFC 7030 EST enrollment endpoints |
| `CERTCTL_EST_ISSUER_ID` | `iss-local` | Which issuer processes EST enrollments |
| `CERTCTL_EST_PROFILE_ID` | — | Constrain EST to a specific certificate profile |
| `CERTCTL_NETWORK_SCAN_ENABLED` | `false` | Enable server-side TLS network scanning |
| `CERTCTL_NETWORK_SCAN_INTERVAL` | `6h` | How often scheduled scans run |
| `CERTCTL_VERIFY_DEPLOYMENT` | `true` | TLS verification after certificate deployment |
| `CERTCTL_VERIFY_TIMEOUT` | `10s` | TLS probe timeout |
| `CERTCTL_VERIFY_DELAY` | `2s` | Delay before verification probe |
### Server — Notification Connectors
| Variable | Default | Description |
|----------|---------|-------------|
| `CERTCTL_SLACK_WEBHOOK_URL` | — | Slack incoming webhook URL (enables Slack) |
| `CERTCTL_SLACK_CHANNEL` | — | Override default webhook channel |
| `CERTCTL_SLACK_USERNAME` | `certctl` | Bot display name |
| `CERTCTL_TEAMS_WEBHOOK_URL` | — | Microsoft Teams webhook URL (enables Teams) |
| `CERTCTL_PAGERDUTY_ROUTING_KEY` | — | PagerDuty Events API v2 key (enables PagerDuty) |
| `CERTCTL_PAGERDUTY_SEVERITY` | `warning` | Event severity: `info`, `warning`, `error`, `critical` |
| `CERTCTL_OPSGENIE_API_KEY` | — | OpsGenie Alert API key (enables OpsGenie) |
| `CERTCTL_OPSGENIE_PRIORITY` | `P3` | Alert priority: `P1` to `P5` |
### Agent
| Variable | Default | Description |
|----------|---------|-------------|
| `CERTCTL_SERVER_URL` | `http://localhost:8080` | Control plane URL |
| `CERTCTL_API_KEY` | — | Agent API key for authentication |
| `CERTCTL_AGENT_ID` | — | Registered agent ID (required) |
| `CERTCTL_KEY_DIR` | `/var/lib/certctl/keys` | Private key storage directory (0600 perms) |
| `CERTCTL_DISCOVERY_DIRS` | — | Directories to scan for existing certs (comma-separated) |
Docker Compose overrides for the demo stack are in `deploy/docker-compose.yml`.
## Development
### Docker Pull
```bash
# Install dev tools (golangci-lint, migrate CLI, air)
make install-tools
# Run tests
make test
# Run tests with race detection (same as CI)
go test -race ./internal/service/... ./internal/api/handler/... ./internal/api/middleware/... ./internal/scheduler/... ./internal/connector/... ./internal/domain/... ./internal/validation/...
# Run with coverage
make test-coverage
# Lint (runs golangci-lint with project config)
make lint
# Vulnerability scan
govulncheck ./...
# Format
make fmt
docker pull shankar0123.docker.scarf.sh/certctl-server
docker pull shankar0123.docker.scarf.sh/certctl-agent
```
### CI Pipeline
## Verifying this release
Every push and PR runs: `go vet`, `go test -race` (race detection), `golangci-lint` (11 linters including gosec and bodyclose), `govulncheck` (dependency CVE scanning), and per-layer coverage thresholds (service 60%, handler 60%, domain 40%, middleware 50%). Frontend CI runs TypeScript type checking, Vitest tests, and Vite production build. See `.github/workflows/ci.yml` for details.
Every `v*` tag publishes signed, attested release artefacts. Binaries
(`certctl-agent`, `certctl-server`, `certctl-cli`, `certctl-mcp-server` for
`linux|darwin × amd64|arm64`) ship alongside a `checksums.txt`, per-binary
SPDX-JSON SBOMs, Cosign signatures, and SLSA Level 3 provenance. Container
images on `ghcr.io/shankar0123/certctl-{server,agent}` are built with
`docker/build-push-action` `provenance: mode=max` + `sbom: true` and are
additionally signed with Cosign at the image digest.
### Docker Compose
All signatures use Cosign keyless OIDC; the signing identity is the
release workflow running on a signed tag.
**1. Verify SHA-256 checksums:**
```bash
make docker-up # Start stack (server + postgres + agent)
make docker-down # Stop stack
make docker-logs-server # Server logs
make docker-logs-agent # Agent logs
make docker-clean # Stop + remove volumes
sha256sum -c checksums.txt
```
## Security
**2. Verify the Cosign signature on `checksums.txt`:**
### Private Key Management
- **Agent keygen mode (default)**: Agents generate ECDSA P-256 keys locally and store them with 0600 permissions in `CERTCTL_KEY_DIR` (default `/var/lib/certctl/keys`). Only the CSR (public key) is sent to the control plane. Private keys never leave agent infrastructure.
- **Server keygen mode (demo only)**: Set `CERTCTL_KEYGEN_MODE=server` for development/demo with Local CA. The control plane generates RSA-2048 keys server-side. A log warning is emitted at startup.
### Authentication
- Agent-to-server: API key (registered at agent creation)
- API key and JWT auth types supported; `none` for demo/development
- Auth type and secret configured via `CERTCTL_AUTH_TYPE` and `CERTCTL_AUTH_SECRET`
### CORS
- **Deny-by-default**: Empty `CERTCTL_CORS_ORIGINS` blocks all cross-origin requests. Operators must explicitly list allowed origins (comma-separated) or set `*` for development.
### Input Validation
- Shell command injection prevention on all connector scripts (strict character whitelist, no metacharacters)
- RFC 1123 domain name validation, base64url ACME token validation
- SSRF protection in network scanner (loopback, link-local, multicast, broadcast ranges filtered)
### Concurrency Safety
- Scheduler loops protected by `sync/atomic.Bool` idempotency guards — duplicate ticks are skipped
- Graceful shutdown waits up to 30 seconds for in-flight work before database close
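The idempotency guard on scheduler loops can be pictured with the sketch below, which skips a tick while the previous one is still running. The type `guardedLoop` and method `tick` are hypothetical names for illustration; only the use of `sync/atomic.Bool` comes from the description above.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// guardedLoop wraps a unit of scheduled work so overlapping ticks are skipped.
type guardedLoop struct {
	running atomic.Bool
}

// tick runs work unless a previous tick is still in flight.
// It returns true if work ran and false if the tick was skipped.
func (g *guardedLoop) tick(work func()) bool {
	// CompareAndSwap succeeds only for the caller that flips false to true.
	if !g.running.CompareAndSwap(false, true) {
		return false
	}
	defer g.running.Store(false)
	work()
	return true
}

func main() {
	g := &guardedLoop{}
	ran := g.tick(func() {
		// A tick arriving while work is in progress is skipped.
		fmt.Println("overlapping tick ran:", g.tick(func() {}))
	})
	fmt.Println("first tick ran:", ran)
	fmt.Println("later tick ran:", g.tick(func() {}))
}
```

Skipping rather than queueing keeps a slow run from piling up work behind it; the next regular tick simply picks up where things stand.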
### Audit Trail
- Immutable append-only log in PostgreSQL (`audit_events` table)
- Every lifecycle action attributed to an actor with timestamp and resource reference
- No update or delete operations on audit records
- Every API call recorded to audit trail with method, path, actor, SHA-256 body hash, response status, and latency
## API Overview
97 endpoints under `/api/v1/` + `/.well-known/est/`, all returning JSON. List endpoints support pagination, sparse field selection (`?fields=`), sort (`?sort=-notAfter`), time-range filters, and cursor-based pagination. Full request/response schemas in the [OpenAPI 3.1 spec](api/openapi.yaml).
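A list request combining these options might be built as in the sketch below. The `fields` and `sort` parameters come from the text above; the `cursor` parameter name, the `id` field, and the helper `listURL` are assumptions for illustration, so check the OpenAPI spec for the exact names.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// listURL (hypothetical helper) builds a certificate list URL with sparse
// fields, a sort key, and an optional pagination cursor.
func listURL(base string, fields []string, sortKey, cursor string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = strings.TrimRight(u.Path, "/") + "/api/v1/certificates"
	q := url.Values{}
	if len(fields) > 0 {
		q.Set("fields", strings.Join(fields, ","))
	}
	if sortKey != "" {
		q.Set("sort", sortKey) // a leading "-" requests descending order
	}
	if cursor != "" {
		q.Set("cursor", cursor) // assumed parameter name
	}
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	u, err := listURL("http://localhost:8443", []string{"id", "notAfter"}, "-notAfter", "")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(u)
}
```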
### Key Endpoints
```
# Certificate lifecycle
GET /api/v1/certificates List (filter, sort, cursor, sparse fields)
POST /api/v1/certificates/{id}/renew Trigger renewal → 202 Accepted
POST /api/v1/certificates/{id}/revoke Revoke with RFC 5280 reason code
GET /api/v1/certificates/{id}/export/pem Export PEM (JSON or file download)
POST /api/v1/certificates/{id}/export/pkcs12 Export PKCS#12 bundle (no private key)
GET /api/v1/crl/{issuer_id} DER-encoded X.509 CRL
GET /api/v1/ocsp/{issuer_id}/{serial} OCSP responder (good/revoked/unknown)
# Agent operations
POST /api/v1/agents/{id}/csr Submit CSR for issuance
GET /api/v1/agents/{id}/work Poll for pending deployment jobs
POST /api/v1/agents/{id}/discoveries Submit certificate discovery scan results
# Discovery & network scanning
GET /api/v1/discovered-certificates List discovered certs (?agent_id, ?status)
POST /api/v1/discovered-certificates/{id}/claim Link to managed cert
POST /api/v1/network-scan-targets/{id}/scan Trigger immediate TLS scan
# Jobs & approval
POST /api/v1/jobs/{id}/approve Approve interactive renewal
POST /api/v1/jobs/{id}/reject Reject interactive renewal
# Post-deployment verification
POST /api/v1/jobs/{id}/verify Submit TLS verification result
GET /api/v1/jobs/{id}/verification Get verification status
# Observability
GET /api/v1/metrics/prometheus Prometheus exposition format
GET /api/v1/stats/summary Dashboard summary
# Digest emails (scheduled briefing)
GET /api/v1/digest/preview HTML email preview
POST /api/v1/digest/send Send digest immediately
# EST enrollment (RFC 7030)
POST /.well-known/est/simpleenroll Device certificate enrollment
GET /.well-known/est/cacerts CA certificate chain (PKCS#7)
```bash
cosign verify-blob \
--bundle checksums.txt.sigstore.json \
--certificate-identity-regexp '^https://github\.com/shankar0123/certctl/\.github/workflows/release\.yml@refs/tags/' \
--certificate-oidc-issuer 'https://token.actions.githubusercontent.com' \
checksums.txt
```
Full CRUD is available for certificates, agents, issuers, targets, teams, owners, policies, profiles, agent groups, notifications, and audit events. See the [OpenAPI spec](api/openapi.yaml) or [Feature Inventory](docs/features.md) for the complete endpoint reference.
Every individual binary ships with its own `.sigstore.json` bundle
(unified Sigstore bundle containing signature, certificate chain, and
Rekor inclusion proof). Swap `checksums.txt` for any binary name and
point `--bundle` at the matching `<binary>.sigstore.json` to verify it
directly.
**3. Verify SLSA Level 3 provenance on a binary:**
```bash
slsa-verifier verify-artifact \
--provenance-path multiple.intoto.jsonl \
--source-uri github.com/shankar0123/certctl \
--source-tag v2.1.0 \
certctl-agent-linux-amd64
```
**4. Verify a container image signature and its SBOM / provenance attestations:**
```bash
IMAGE=ghcr.io/shankar0123/certctl-server:v2.1.0
cosign verify \
--certificate-identity-regexp '^https://github\.com/shankar0123/certctl/\.github/workflows/release\.yml@refs/tags/' \
--certificate-oidc-issuer 'https://token.actions.githubusercontent.com' \
"$IMAGE"
# SBOM attestation (SPDX-JSON, emitted by docker/build-push-action)
cosign verify-attestation --type spdxjson \
--certificate-identity-regexp '^https://github\.com/shankar0123/certctl/' \
--certificate-oidc-issuer 'https://token.actions.githubusercontent.com' \
"$IMAGE"
# SLSA provenance attestation (docker/build-push-action `provenance: mode=max`)
cosign verify-attestation --type slsaprovenance \
--certificate-identity-regexp '^https://github\.com/shankar0123/certctl/' \
--certificate-oidc-issuer 'https://token.actions.githubusercontent.com' \
"$IMAGE"
```
## Examples
Pick the scenario closest to your setup and have it running in 2 minutes.
| Example | Scenario |
|---------|----------|
| [`examples/acme-nginx/`](examples/acme-nginx/) | Let's Encrypt + NGINX, HTTP-01 challenges |
| [`examples/acme-wildcard-dns01/`](examples/acme-wildcard-dns01/) | Wildcard certs via DNS-01 (Cloudflare hook included) |
| [`examples/private-ca-traefik/`](examples/private-ca-traefik/) | Local CA (self-signed or sub-CA) + Traefik file provider |
| [`examples/step-ca-haproxy/`](examples/step-ca-haproxy/) | Smallstep step-ca + HAProxy combined PEM |
| [`examples/multi-issuer/`](examples/multi-issuer/) | ACME for public + Local CA for internal, one dashboard |
Each directory contains a `docker-compose.yml` and a `README.md` explaining the scenario, prerequisites, and customization.
## CLI
@@ -439,22 +329,14 @@ go install github.com/shankar0123/certctl/cmd/cli@latest
export CERTCTL_SERVER_URL=http://localhost:8443
export CERTCTL_API_KEY=your-api-key
# Certificate commands
# Usage
certctl-cli certs list # List all certificates
certctl-cli certs get mc-api-prod # Get certificate details
certctl-cli certs renew mc-api-prod # Trigger renewal
certctl-cli certs revoke mc-api-prod --reason keyCompromise
# Agent and job commands
certctl-cli agents list # List registered agents
certctl-cli jobs list # List jobs
certctl-cli jobs cancel job-123 # Cancel a pending job
# Operations
certctl-cli status # Server health + summary stats
certctl-cli import certs.pem # Bulk import from PEM file
# Output formats
certctl-cli certs list --format json # JSON output (default: table)
```
@@ -463,14 +345,10 @@ certctl-cli certs list --format json # JSON output (default: table)
certctl ships a standalone MCP (Model Context Protocol) server that exposes all 80 API endpoints as tools for AI assistants — Claude, Cursor, Windsurf, OpenClaw, VS Code Copilot, and any MCP-compatible client.
```bash
# Install
# Install and run
go install github.com/shankar0123/certctl/cmd/mcp-server@latest
# Configure
export CERTCTL_SERVER_URL=http://localhost:8443
export CERTCTL_API_KEY=your-api-key
# Run (stdio transport — add to your AI client config)
mcp-server
```
@@ -489,73 +367,38 @@ mcp-server
}
```
## Development
```bash
make build # Build server + agent binaries
make test # Run tests
make lint # golangci-lint (11 linters)
govulncheck ./... # Vulnerability scan
make docker-up # Start Docker Compose stack
```
CI runs on every push: `go vet`, `go test -race`, `golangci-lint`, `govulncheck`, and per-layer coverage thresholds (service 55%, handler 60%, domain 40%, middleware 30%). Frontend CI runs TypeScript type checking, Vitest tests, and Vite production build. 1,668 Go test functions with 625+ subtests, plus frontend test suite.
## Roadmap
### V1 (v1.0.0)
### V1 (v1.0.0) — Shipped
Core lifecycle management — Local CA + ACME v2 issuers, NGINX target connector, agent-side key generation, API auth + rate limiting, React dashboard, CI pipeline with coverage gates, Docker images on GHCR.
### V2: Operational Maturity
30+ milestones complete, 1,500+ tests. See the [Feature Inventory](docs/features.md) for details on every capability.
**What shipped (all ✅):**
- **Issuers** — Sub-CA mode (enterprise root chains), ACME DNS-01 + DNS-PERSIST-01 (wildcard certs, any DNS provider), step-ca (native /sign API), OpenSSL/Custom CA (script-based signing), ACME ARI (RFC 9702, CA-directed renewal timing)
- **Revocation** — RFC 5280 reason codes, DER-encoded X.509 CRL, embedded OCSP responder, short-lived cert exemption
- **Profiles + Ownership** — certificate profiles (key types, max TTL, crypto constraints), ownership tracking (owners + teams), dynamic agent groups, interactive renewal approval
- **GUI Operations** — bulk renew/revoke/reassign, deployment timeline, inline policy editor, target wizard, audit export (CSV/JSON), short-lived credentials view
- **Discovery** — filesystem scanning (PEM/DER) + network TLS scanning (CIDR ranges), triage workflow (claim/dismiss), network scan target management
- **Observability** — Prometheus + JSON metrics, 5 stats API endpoints, dashboard charts (heatmap, trends, distribution), agent fleet overview, structured logging
- **EST Server** (RFC 7030) — device/WiFi certificate enrollment, PKCS#7 wire format, configurable issuer + profile binding
- **MCP Server** — 78 API operations as AI tools for Claude, Cursor, and any MCP-compatible client
- **CLI** — 10 subcommands (list/get/renew/revoke certs, list agents/jobs, import, status, health, metrics), JSON/table output
- **Notifications** — Email (SMTP), Webhooks, Slack, Microsoft Teams, PagerDuty, OpsGenie connectors
- **API Enhancements** — sparse fields, sort, time-range filters, cursor pagination, immutable API audit logging
- **Compliance Mapping** — SOC 2 Type II, PCI-DSS 4.0, NIST SP 800-57 alignment guides
- **Post-Deployment TLS Verification** — agent-side TLS probe confirms the target is serving the correct certificate by SHA-256 fingerprint match, verification status visible in deployment timeline
- **Traefik + Caddy Targets** — Traefik (file provider, auto-reload) and Caddy (Admin API hot-reload or file-based), both in target wizard GUI
- **Certificate Export** — PEM (JSON or file download) and PKCS#12 formats, private keys never included (agent-side only), audit trail, GUI export buttons
- **S/MIME Support** — EKU-aware issuance (emailProtection, codeSigning, timeStamping), adaptive KeyUsage flags, email SAN routing, EKU badges in GUI
- **ACME ARI (RFC 9702)** — CA-directed renewal timing: the CA tells certctl the optimal renewal window, gracefully degrading to fixed thresholds when ARI is unavailable
- **Scheduled Certificate Digest** — HTML email digests with certificate stats, expiration timeline, job trends, and agent health; configurable daily/hourly/weekly briefings via SMTP
- **Helm Chart** — Production-ready Kubernetes with server Deployment, PostgreSQL StatefulSet with PVC, Agent DaemonSet, security contexts, resource limits, optional Ingress
**Also shipped:**
- Issuer catalog page (see all supported CAs, configure from dashboard)
- Vault PKI and DigiCert CertCentral issuer connectors (Beta)
- Turnkey deployment examples (ACME+NGINX, wildcard+DNS-01, private CA+Traefik, step-ca+HAProxy, multi-issuer)
- Migration guides (Certbot, acme.sh, cert-manager complement)
- One-line agent install script with cross-compiled binaries
**Coming in v2.1.0:**
- Dynamic issuer and target configuration via GUI (no env var restarts)
- First-run onboarding wizard
### V2: Operational Maturity — Shipped
30+ milestones shipping enterprise-grade features for free. Sub-CA mode, ACME DNS-01/DNS-PERSIST-01/EAB/ARI (RFC 9773)/profile selection, step-ca, Vault PKI, DigiCert CertCentral, Sectigo SCM, Google CAS, AWS ACM PCA, Entrust, GlobalSign, EJBCA, OpenSSL/Custom CA issuers. NGINX, Apache, HAProxy, Traefik, Caddy, Envoy, Postfix, Dovecot, IIS (WinRM), F5 BIG-IP, SSH, Windows Certificate Store, Java Keystore, Kubernetes Secrets targets. EST server (RFC 7030) and SCEP server (RFC 8894) enrollment protocols. RFC 5280 revocation with DER CRL + embedded OCSP responder. Certificate profiles, ownership tracking, team assignment, agent groups, interactive approval workflows. Filesystem, network, and cloud secret manager (AWS SM, Azure KV, GCP SM) certificate discovery with triage GUI. Dynamic issuer/target configuration via GUI with AES-256-GCM encrypted storage. First-run onboarding wizard. Post-deployment TLS verification. Certificate export (PEM/PKCS#12). S/MIME support. Prometheus metrics. Scheduled certificate digest emails. Slack, Teams, PagerDuty, OpsGenie, SMTP notifications. MCP server (80 tools), CLI (12 commands), Helm chart. Compliance mapping (SOC 2, PCI-DSS 4.0, NIST SP 800-57). 5 turnkey deployment examples. Agent install script. Migration guides from certbot, acme.sh, and cert-manager. See the [Feature Inventory](docs/features.md) for details.
### V3: certctl Pro
Enterprise capabilities for larger deployments are available in the commercial tier.
Team access controls, identity provider integration, enterprise deployment targets, compliance and risk scoring, advanced fleet operations, event-driven architecture, advanced search, real-time operational views.
### V4+: Cloud, Scale & Passive Discovery
Passive network discovery (TLS listener), Kubernetes integration (cert-manager external issuer, Secrets target), cloud infrastructure targets (AWS ALB/ACM, Azure Key Vault), extended CA support (Google CAS, EJBCA, Sectigo), and platform-scale features (Terraform provider, multi-tenancy, HSM support).
## Examples
Turnkey Docker Compose configurations for common scenarios — pick the one closest to your setup and have it running in 2 minutes.
| Example | Scenario |
|---------|----------|
| [`examples/acme-nginx/`](examples/acme-nginx/) | Let's Encrypt + NGINX, HTTP-01 challenges |
| [`examples/acme-wildcard-dns01/`](examples/acme-wildcard-dns01/) | Wildcard certs via DNS-01 (Cloudflare hook included) |
| [`examples/private-ca-traefik/`](examples/private-ca-traefik/) | Local CA (self-signed or sub-CA) + Traefik file provider |
| [`examples/step-ca-haproxy/`](examples/step-ca-haproxy/) | Smallstep step-ca + HAProxy combined PEM |
| [`examples/multi-issuer/`](examples/multi-issuer/) | ACME for public + Local CA for internal, one dashboard |
Each directory contains a `docker-compose.yml` and a `README.md` explaining the scenario, prerequisites, and customization.
### V4+: Cloud & Scale
Kubernetes cert-manager external issuer, cloud infrastructure targets, extended CA support, and platform-scale features.
## License
Certctl is licensed under the [Business Source License 1.1](LICENSE). The source code is publicly available and free to use, modify, and self-host. The one restriction: you may not offer certctl as a managed/hosted certificate management service to third parties. The BSL 1.1 license converts automatically to Apache 2.0 on March 1, 2033, providing perpetual freedom.
Certctl is licensed under the [Business Source License 1.1](LICENSE). The source code is publicly available and free to use, modify, and self-host. The one restriction: you may not use certctl's certificate management functionality as part of a commercial offering to third parties, whether hosted, managed, embedded, bundled, or integrated. The BSL 1.1 license converts automatically to Apache 2.0 on March 14, 2033.
For licensing inquiries: certctl@proton.me
---
If certctl solves a problem you have, [star the repo](https://github.com/shankar0123/certctl) to help others find it. Questions, bugs, or feature requests — [open an issue](https://github.com/shankar0123/certctl/issues).
+1098 -48
File diff suppressed because it is too large
+619
@@ -18,6 +18,7 @@ import (
"net/http/httptest"
"os"
"path/filepath"
"strings"
"testing"
"time"
)
@@ -828,3 +829,621 @@ func generateTestCertWithCN(commonName string) (*x509.Certificate, error) {
func strPtr(s string) *string {
return &s
}
// TestCreateTargetConnector_AllSupportedTypes tests connector creation for all 14 supported target types.
func TestCreateTargetConnector_AllSupportedTypes(t *testing.T) {
tmpDir := t.TempDir()
tests := []struct {
name string
typeName string
config interface{}
}{
{
name: "NGINX",
typeName: "NGINX",
config: map[string]string{
"cert_path": filepath.Join(tmpDir, "cert.pem"),
"key_path": filepath.Join(tmpDir, "key.pem"),
},
},
{
name: "Apache",
typeName: "Apache",
config: map[string]string{
"cert_path": filepath.Join(tmpDir, "cert.pem"),
"key_path": filepath.Join(tmpDir, "key.pem"),
},
},
{
name: "HAProxy",
typeName: "HAProxy",
config: map[string]string{
"cert_path": filepath.Join(tmpDir, "cert.pem"),
},
},
{
name: "F5",
typeName: "F5",
config: map[string]string{
"host": "192.0.2.1",
},
},
{
name: "IIS",
typeName: "IIS",
config: map[string]string{
"cert_store": "My",
},
},
{
name: "Traefik",
typeName: "Traefik",
config: map[string]string{
"cert_dir": tmpDir,
},
},
{
name: "Caddy",
typeName: "Caddy",
config: map[string]string{
"mode": "file",
},
},
{
name: "Envoy",
typeName: "Envoy",
config: map[string]string{
"cert_dir": tmpDir,
},
},
{
name: "Postfix",
typeName: "Postfix",
config: map[string]string{
"cert_path": filepath.Join(tmpDir, "cert.pem"),
"key_path": filepath.Join(tmpDir, "key.pem"),
},
},
{
name: "Dovecot",
typeName: "Dovecot",
config: map[string]string{
"cert_path": filepath.Join(tmpDir, "cert.pem"),
"key_path": filepath.Join(tmpDir, "key.pem"),
},
},
{
name: "SSH",
typeName: "SSH",
config: map[string]string{
"host": "192.0.2.1",
"user": "root",
"cert_path": "/etc/ssl/cert.pem",
"key_path": "/etc/ssl/key.pem",
},
},
{
name: "WinCertStore",
typeName: "WinCertStore",
config: map[string]string{
"cert_store": "My",
},
},
{
name: "JavaKeystore",
typeName: "JavaKeystore",
config: map[string]string{
"keystore_path": filepath.Join(tmpDir, "keystore.jks"),
},
},
{
name: "KubernetesSecrets",
typeName: "KubernetesSecrets",
config: map[string]string{
"namespace": "default",
"secret_name": "tls-secret",
},
},
}
cfg := &AgentConfig{
ServerURL: "http://localhost:8443",
APIKey: "test-key",
AgentID: "a-test",
Hostname: "test-host",
}
logger := slog.New(slog.NewTextHandler(io.Discard, nil))
agent := NewAgent(cfg, logger)
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
configJSON, err := json.Marshal(tt.config)
if err != nil {
t.Fatalf("failed to marshal config: %v", err)
}
connector, err := agent.createTargetConnector(tt.typeName, configJSON)
// Some connectors (like WinCertStore, IIS) may error on non-Windows platforms
// or with insufficient validation. We accept either a valid connector or an error
// for now — the real unit tests in internal/connector/target/* cover validation
if connector == nil && err != nil {
// This is acceptable if the connector validates required fields
t.Logf("connector creation returned error (may be validation): %v", err)
return
}
if connector == nil {
t.Errorf("expected connector to be non-nil for type %s", tt.typeName)
}
})
}
}
// TestCreateTargetConnector_InvalidJSON tests connector creation with invalid JSON for each type.
func TestCreateTargetConnector_InvalidJSON(t *testing.T) {
tests := []string{
"NGINX",
"Apache",
"HAProxy",
"F5",
"IIS",
"Traefik",
"Caddy",
"Envoy",
"Postfix",
"Dovecot",
"SSH",
"WinCertStore",
"JavaKeystore",
"KubernetesSecrets",
}
cfg := &AgentConfig{
ServerURL: "http://localhost:8443",
APIKey: "test-key",
AgentID: "a-test",
Hostname: "test-host",
}
logger := slog.New(slog.NewTextHandler(io.Discard, nil))
agent := NewAgent(cfg, logger)
invalidJSON := json.RawMessage("{invalid json}")
for _, typeName := range tests {
t.Run(typeName, func(t *testing.T) {
_, err := agent.createTargetConnector(typeName, invalidJSON)
if err == nil {
t.Errorf("expected error for invalid JSON with type %s", typeName)
}
})
}
}
// TestCreateTargetConnector_UnknownType tests connector creation with unknown target type.
func TestCreateTargetConnector_UnknownType(t *testing.T) {
cfg := &AgentConfig{
ServerURL: "http://localhost:8443",
APIKey: "test-key",
AgentID: "a-test",
Hostname: "test-host",
}
logger := slog.New(slog.NewTextHandler(io.Discard, nil))
agent := NewAgent(cfg, logger)
_, err := agent.createTargetConnector("MagicBox", nil)
if err == nil {
t.Fatal("expected error for unsupported target type")
}
if !strings.Contains(err.Error(), "unsupported target type") {
t.Errorf("expected 'unsupported target type' error, got: %v", err)
}
}
// TestCreateTargetConnector_EmptyConfig tests connector creation with empty config JSON.
func TestCreateTargetConnector_EmptyConfig(t *testing.T) {
tests := []string{
"NGINX",
"Apache",
"HAProxy",
"Traefik",
"Caddy",
"Envoy",
}
cfg := &AgentConfig{
ServerURL: "http://localhost:8443",
APIKey: "test-key",
AgentID: "a-test",
Hostname: "test-host",
}
logger := slog.New(slog.NewTextHandler(io.Discard, nil))
agent := NewAgent(cfg, logger)
for _, typeName := range tests {
t.Run(typeName, func(t *testing.T) {
// A nil config skips JSON parsing, so each connector is built from a zero-value Config.
connector, err := agent.createTargetConnector(typeName, nil)
if err != nil {
// Validation errors ("invalid", "missing") are acceptable here. Any other
// error is only logged, not failed.
if !strings.Contains(err.Error(), "invalid") && !strings.Contains(err.Error(), "missing") {
t.Logf("connector creation with empty config returned: %v", err)
}
return
}
if connector == nil {
t.Errorf("expected non-nil connector for type %s with empty config", typeName)
}
})
}
}
// TestRunDiscoveryScan_ValidCerts tests discovery scanning with valid certificates.
func TestRunDiscoveryScan_ValidCerts(t *testing.T) {
tmpDir := t.TempDir()
// Create a valid PEM certificate file
cert, _ := generateTestCertWithCN("example.com")
block := &pem.Block{Type: "CERTIFICATE", Bytes: cert.Raw}
certPEM := pem.EncodeToMemory(block)
certPath := filepath.Join(tmpDir, "cert.pem")
if err := os.WriteFile(certPath, certPEM, 0644); err != nil {
t.Fatalf("failed to write certificate: %v", err)
}
// Mock server to accept discovery report
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.URL.Path != "/api/v1/agents/a-test/discoveries" {
t.Errorf("unexpected path: %s", r.URL.Path)
w.WriteHeader(http.StatusNotFound)
return
}
if r.Method != http.MethodPost {
t.Errorf("unexpected method: %s", r.Method)
w.WriteHeader(http.StatusMethodNotAllowed)
return
}
// Verify request body
var payload map[string]interface{}
if err := json.NewDecoder(r.Body).Decode(&payload); err != nil {
t.Logf("failed to decode discovery report: %v", err)
w.WriteHeader(http.StatusBadRequest)
return
}
// Verify report contains certificates
certs, ok := payload["certificates"].([]interface{})
if !ok || len(certs) == 0 {
t.Logf("expected certificates in report")
}
w.WriteHeader(http.StatusAccepted)
}))
defer server.Close()
cfg := &AgentConfig{
ServerURL: server.URL,
APIKey: "test-key",
AgentID: "a-test",
Hostname: "test-host",
DiscoveryDirs: []string{tmpDir},
}
logger := slog.New(slog.NewTextHandler(io.Discard, nil))
agent := NewAgent(cfg, logger)
// Run discovery scan
agent.runDiscoveryScan(context.Background())
// If we got here without panic/error, the test passes
}
// TestRunDiscoveryScan_NoCertificates tests discovery scanning with empty directory.
func TestRunDiscoveryScan_NoCertificates(t *testing.T) {
tmpDir := t.TempDir()
// Create an empty directory
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
// Should not receive a request if no certs found and no errors
t.Logf("discovery report received: %s", r.URL.Path)
w.WriteHeader(http.StatusAccepted)
}))
defer server.Close()
cfg := &AgentConfig{
ServerURL: server.URL,
APIKey: "test-key",
AgentID: "a-test",
Hostname: "test-host",
DiscoveryDirs: []string{tmpDir},
}
logger := slog.New(slog.NewTextHandler(io.Discard, nil))
agent := NewAgent(cfg, logger)
// Run discovery scan - should complete without error even with empty directory
agent.runDiscoveryScan(context.Background())
}
// TestRunDiscoveryScan_MultipleCerts tests discovery scanning with multiple certificate files.
func TestRunDiscoveryScan_MultipleCerts(t *testing.T) {
tmpDir := t.TempDir()
// Create multiple certificate files
cert1, _ := generateTestCertWithCN("cert1.example.com")
cert2, _ := generateTestCertWithCN("cert2.example.com")
block1 := &pem.Block{Type: "CERTIFICATE", Bytes: cert1.Raw}
block2 := &pem.Block{Type: "CERTIFICATE", Bytes: cert2.Raw}
certPath1 := filepath.Join(tmpDir, "cert1.pem")
certPath2 := filepath.Join(tmpDir, "cert2.crt")
if err := os.WriteFile(certPath1, pem.EncodeToMemory(block1), 0644); err != nil {
t.Fatalf("failed to write cert1: %v", err)
}
if err := os.WriteFile(certPath2, pem.EncodeToMemory(block2), 0644); err != nil {
t.Fatalf("failed to write cert2: %v", err)
}
certCount := 0
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.URL.Path != "/api/v1/agents/a-test/discoveries" {
w.WriteHeader(http.StatusNotFound)
return
}
var payload map[string]interface{}
if err := json.NewDecoder(r.Body).Decode(&payload); err != nil {
w.WriteHeader(http.StatusBadRequest)
return
}
// Count certificates in report
if certs, ok := payload["certificates"].([]interface{}); ok {
certCount = len(certs)
}
w.WriteHeader(http.StatusAccepted)
}))
defer server.Close()
cfg := &AgentConfig{
ServerURL: server.URL,
APIKey: "test-key",
AgentID: "a-test",
Hostname: "test-host",
DiscoveryDirs: []string{tmpDir},
}
logger := slog.New(slog.NewTextHandler(io.Discard, nil))
agent := NewAgent(cfg, logger)
// Run discovery scan
agent.runDiscoveryScan(context.Background())
if certCount != 2 {
t.Logf("expected 2 certificates in discovery report, got %d", certCount)
}
}
// TestRunDiscoveryScan_DERCertificate tests discovery scanning with DER-encoded certificate.
func TestRunDiscoveryScan_DERCertificate(t *testing.T) {
tmpDir := t.TempDir()
// Create a DER-encoded certificate file
cert, _ := generateTestCertWithCN("der.example.com")
derPath := filepath.Join(tmpDir, "cert.der")
if err := os.WriteFile(derPath, cert.Raw, 0644); err != nil {
t.Fatalf("failed to write DER certificate: %v", err)
}
certCount := 0
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.URL.Path != "/api/v1/agents/a-test/discoveries" {
w.WriteHeader(http.StatusNotFound)
return
}
var payload map[string]interface{}
if err := json.NewDecoder(r.Body).Decode(&payload); err != nil {
w.WriteHeader(http.StatusBadRequest)
return
}
if certs, ok := payload["certificates"].([]interface{}); ok {
certCount = len(certs)
}
w.WriteHeader(http.StatusAccepted)
}))
defer server.Close()
cfg := &AgentConfig{
ServerURL: server.URL,
APIKey: "test-key",
AgentID: "a-test",
Hostname: "test-host",
DiscoveryDirs: []string{tmpDir},
}
logger := slog.New(slog.NewTextHandler(io.Discard, nil))
agent := NewAgent(cfg, logger)
// Run discovery scan
agent.runDiscoveryScan(context.Background())
if certCount != 1 {
t.Logf("expected 1 DER certificate in discovery report, got %d", certCount)
}
}
// TestRunDiscoveryScan_Subdirectories tests discovery scanning with subdirectories.
func TestRunDiscoveryScan_Subdirectories(t *testing.T) {
tmpDir := t.TempDir()
// Create subdirectory
subDir := filepath.Join(tmpDir, "subdir")
if err := os.MkdirAll(subDir, 0755); err != nil {
t.Fatalf("failed to create subdir: %v", err)
}
// Create certificate in subdirectory
cert, _ := generateTestCertWithCN("subdir.example.com")
block := &pem.Block{Type: "CERTIFICATE", Bytes: cert.Raw}
certPath := filepath.Join(subDir, "cert.pem")
if err := os.WriteFile(certPath, pem.EncodeToMemory(block), 0644); err != nil {
t.Fatalf("failed to write certificate: %v", err)
}
certCount := 0
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.URL.Path != "/api/v1/agents/a-test/discoveries" {
w.WriteHeader(http.StatusNotFound)
return
}
var payload map[string]interface{}
if err := json.NewDecoder(r.Body).Decode(&payload); err != nil {
w.WriteHeader(http.StatusBadRequest)
return
}
if certs, ok := payload["certificates"].([]interface{}); ok {
certCount = len(certs)
}
w.WriteHeader(http.StatusAccepted)
}))
defer server.Close()
cfg := &AgentConfig{
ServerURL: server.URL,
APIKey: "test-key",
AgentID: "a-test",
Hostname: "test-host",
DiscoveryDirs: []string{tmpDir},
}
logger := slog.New(slog.NewTextHandler(io.Discard, nil))
agent := NewAgent(cfg, logger)
// Run discovery scan - should recursively find certs in subdirs
agent.runDiscoveryScan(context.Background())
if certCount != 1 {
t.Logf("expected 1 certificate in subdirectory, got %d", certCount)
}
}
// TestRunDiscoveryScan_ServerError tests discovery scanning when server returns error.
func TestRunDiscoveryScan_ServerError(t *testing.T) {
tmpDir := t.TempDir()
// Create a certificate file
cert, _ := generateTestCertWithCN("example.com")
block := &pem.Block{Type: "CERTIFICATE", Bytes: cert.Raw}
certPath := filepath.Join(tmpDir, "cert.pem")
if err := os.WriteFile(certPath, pem.EncodeToMemory(block), 0644); err != nil {
t.Fatalf("failed to write certificate: %v", err)
}
// Mock server returns error
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusInternalServerError)
w.Write([]byte("server error"))
}))
defer server.Close()
cfg := &AgentConfig{
ServerURL: server.URL,
APIKey: "test-key",
AgentID: "a-test",
Hostname: "test-host",
DiscoveryDirs: []string{tmpDir},
}
logger := slog.New(slog.NewTextHandler(io.Discard, nil))
agent := NewAgent(cfg, logger)
// Should handle server error gracefully without panicking
agent.runDiscoveryScan(context.Background())
}
// TestDiscoveredCertEntry_ValidFields tests that discovered certificate entries have valid fields.
func TestDiscoveredCertEntry_ValidFields(t *testing.T) {
tmpDir := t.TempDir()
// Create certificate with specific details
cert, _ := generateTestCertWithCN("test.example.com")
block := &pem.Block{Type: "CERTIFICATE", Bytes: cert.Raw}
certPEM := pem.EncodeToMemory(block)
certPath := filepath.Join(tmpDir, "cert.pem")
if err := os.WriteFile(certPath, certPEM, 0644); err != nil {
t.Fatalf("failed to write certificate: %v", err)
}
cfg := &AgentConfig{
ServerURL: "http://localhost:8443",
APIKey: "test-key",
AgentID: "a-test",
Hostname: "test-host",
}
logger := slog.New(slog.NewTextHandler(io.Discard, nil))
agent := NewAgent(cfg, logger)
entries := agent.parsePEMFile(certPath)
if len(entries) != 1 {
t.Fatalf("expected 1 entry, got %d", len(entries))
}
entry := entries[0]
// Verify all required fields are populated
if entry.CommonName == "" {
t.Error("CommonName should not be empty")
}
if entry.FingerprintSHA256 == "" {
t.Error("FingerprintSHA256 should not be empty")
}
if len(entry.FingerprintSHA256) != 64 {
t.Errorf("FingerprintSHA256 should be 64 hex chars, got %d", len(entry.FingerprintSHA256))
}
if entry.SerialNumber == "" {
t.Error("SerialNumber should not be empty")
}
if entry.IssuerDN == "" {
t.Error("IssuerDN should not be empty")
}
if entry.SubjectDN == "" {
t.Error("SubjectDN should not be empty")
}
if entry.NotBefore == "" {
t.Error("NotBefore should not be empty")
}
if entry.NotAfter == "" {
t.Error("NotAfter should not be empty")
}
if entry.KeyAlgorithm == "" {
t.Error("KeyAlgorithm should not be empty")
}
if entry.KeySize == 0 {
t.Error("KeySize should not be zero")
}
if entry.SourcePath == "" {
t.Error("SourcePath should not be empty")
}
if entry.SourceFormat != "PEM" {
t.Errorf("SourceFormat should be 'PEM', got '%s'", entry.SourceFormat)
}
if entry.PEMData == "" {
t.Error("PEMData should not be empty")
}
}
+177 -2
@@ -12,6 +12,7 @@ import (
"crypto/x509/pkix"
"encoding/json"
"encoding/pem"
"errors"
"flag"
"fmt"
"io"
@@ -23,13 +24,20 @@ import (
"path/filepath"
"runtime"
"strings"
"sync"
"syscall"
"time"
"github.com/shankar0123/certctl/internal/connector/target"
"github.com/shankar0123/certctl/internal/connector/target/apache"
"github.com/shankar0123/certctl/internal/connector/target/caddy"
"github.com/shankar0123/certctl/internal/connector/target/envoy"
pf "github.com/shankar0123/certctl/internal/connector/target/postfix"
sshconn "github.com/shankar0123/certctl/internal/connector/target/ssh"
"github.com/shankar0123/certctl/internal/connector/target/f5"
jks "github.com/shankar0123/certctl/internal/connector/target/javakeystore"
k8s "github.com/shankar0123/certctl/internal/connector/target/k8ssecret"
wcs "github.com/shankar0123/certctl/internal/connector/target/wincertstore"
"github.com/shankar0123/certctl/internal/connector/target/haproxy"
"github.com/shankar0123/certctl/internal/connector/target/iis"
"github.com/shankar0123/certctl/internal/connector/target/nginx"
@@ -47,6 +55,16 @@ type AgentConfig struct {
DiscoveryDirs []string // Directories to scan for certificates (comma-separated via env)
}
// ErrAgentRetired is the sentinel returned by [Agent.Run] when the control
// plane responds with HTTP 410 Gone to a heartbeat or work-poll request — the
// canonical signal that this agent's row has been soft-retired server-side
// (see I-004 in cowork/certctl-coverage-gap-audit.md). The binary must
// terminate cleanly: an init-system restart would only produce another 410
// and wedge the host in a restart loop. main() translates this sentinel into
// a zero exit code so systemd (Restart=on-failure) and launchd do not respawn
// the process. main() matches this sentinel with errors.Is, so any wrapping
// must use %w to keep it in the error chain.
var ErrAgentRetired = errors.New("agent retired by control plane")
// Agent represents the local agent that runs on target servers.
// It periodically sends heartbeats, polls for work, executes deployment and CSR jobs,
// and scans configured directories for existing certificates.
@@ -62,6 +80,17 @@ type Agent struct {
pollInterval time.Duration
discoveryInterval time.Duration
consecutiveFailures int
// I-004: terminal retirement signal. retiredSignal is closed exactly once
// (guarded by retiredOnce) when either sendHeartbeat or pollForWork
// observes HTTP 410 Gone. The Run() select loop picks up the close and
// returns ErrAgentRetired, unwinding the goroutine cleanly so main() can
// log + exit(0). Using a channel + sync.Once (rather than an atomic bool
// + polling) lets us fall through the select statement immediately instead
// of waiting for the next ticker; the zero-allocation close is safe to
// race with ctx.Done() and other cases.
retiredOnce sync.Once
retiredSignal chan struct{}
}
// WorkResponse represents the response from the work polling endpoint.
@@ -92,9 +121,31 @@ func NewAgent(cfg *AgentConfig, logger *slog.Logger) *Agent {
heartbeatInterval: 60 * time.Second,
pollInterval: 30 * time.Second,
discoveryInterval: 6 * time.Hour, // scan for certs every 6 hours
retiredSignal: make(chan struct{}),
}
}
// markRetired records that the control plane has declared this agent retired
// (HTTP 410 Gone on heartbeat or work poll). Idempotent via sync.Once — if
// both the heartbeat and work-poll paths observe 410 in the same tick, only
// the first close() runs and we avoid a runtime panic. Emits an ERROR-level
// log line so init-system journaling captures it prominently, and includes
// the source (heartbeat/work_poll), response body, and status code so the
// operator can verify it's a genuine retirement signal rather than a
// misrouted request. After this returns, the select-loop case in Run()
// observes the closed channel on its next iteration and returns
// ErrAgentRetired.
func (a *Agent) markRetired(source string, statusCode int, body string) {
a.retiredOnce.Do(func() {
a.logger.Error("agent has been retired by control plane — shutting down",
"source", source,
"status", statusCode,
"body", body,
"agent_id", a.config.AgentID)
close(a.retiredSignal)
})
}
// Run starts the agent's main loop.
// It sends heartbeats, polls for work, and handles graceful shutdown via context cancellation.
func (a *Agent) Run(ctx context.Context) error {
@@ -148,6 +199,19 @@ func (a *Agent) Run(ctx context.Context) error {
a.logger.Info("agent shutting down", "reason", ctx.Err())
return ctx.Err()
// I-004: retiredSignal is closed exactly once (via markRetired's
// sync.Once) when either sendHeartbeat or pollForWork observes HTTP 410
// Gone from the control plane. Falling through this case immediately
// (rather than waiting for the next ticker) lets the agent shut down
// quickly once retirement is confirmed — every extra heartbeat against a
// retired row is wasted work and noise in the audit trail. Returning
// ErrAgentRetired propagates up to main(), which matches it with
// errors.Is and exits(0) so systemd/launchd do not respawn the process.
case <-a.retiredSignal:
a.logger.Info("agent retired signal received — exiting event loop",
"agent_id", a.config.AgentID)
return ErrAgentRetired
case <-heartbeatTicker.C:
a.sendHeartbeat(ctx)
@@ -203,6 +267,22 @@ func (a *Agent) sendHeartbeat(ctx context.Context) {
}
defer resp.Body.Close()
// I-004: HTTP 410 Gone is the terminal signal from the control plane that
// this agent's row has been soft-retired (see internal/api/handler/agent.go
// heartbeat path + AgentRetirementService). Treat it separately from the
// generic non-200 error branch: pass the response to markRetired (which closes
// retiredSignal exactly once via sync.Once) and return without bumping
// consecutiveFailures — this is not a transient failure, it's a clean
// shutdown. The Run() select loop picks up the closed channel on its next
// iteration and returns ErrAgentRetired, which main() translates into an
// exit(0) so systemd/launchd don't respawn the process into another 410
// loop.
if resp.StatusCode == http.StatusGone {
body, _ := io.ReadAll(resp.Body)
a.markRetired("heartbeat", resp.StatusCode, string(body))
return
}
if resp.StatusCode != http.StatusOK {
body, _ := io.ReadAll(resp.Body)
a.logger.Error("heartbeat rejected",
@@ -231,6 +311,19 @@ func (a *Agent) pollForWork(ctx context.Context) {
}
defer resp.Body.Close()
// I-004: same terminal-retirement handling as sendHeartbeat. Work-poll is the
// other hot path that can observe an agent's soft-retirement; if a work-poll
// tick fires before the next heartbeat tick once the row is retired, this
// branch sees the 410 first. markRetired's sync.Once
// guards idempotency so racing both paths in the same tick only closes the
// signal channel once. No consecutiveFailures increment — retirement is
// not a transient failure.
if resp.StatusCode == http.StatusGone {
body, _ := io.ReadAll(resp.Body)
a.markRetired("work_poll", resp.StatusCode, string(body))
return
}
if resp.StatusCode != http.StatusOK {
body, _ := io.ReadAll(resp.Body)
a.logger.Error("work poll rejected",
@@ -583,7 +676,11 @@ func (a *Agent) createTargetConnector(targetType string, configJSON json.RawMess
return nil, fmt.Errorf("invalid F5 config: %w", err)
}
}
-return f5.New(&cfg, a.logger), nil
+conn, err := f5.New(&cfg, a.logger)
+if err != nil {
+return nil, fmt.Errorf("failed to create F5 connector: %w", err)
+}
+return conn, nil
case "IIS":
var cfg iis.Config
@@ -592,7 +689,7 @@ func (a *Agent) createTargetConnector(targetType string, configJSON json.RawMess
return nil, fmt.Errorf("invalid IIS config: %w", err)
}
}
-return iis.New(&cfg, a.logger), nil
+return iis.New(&cfg, a.logger)
case "Traefik":
var cfg traefik.Config
@@ -612,6 +709,71 @@ func (a *Agent) createTargetConnector(targetType string, configJSON json.RawMess
}
return caddy.New(&cfg, a.logger), nil
case "Envoy":
var cfg envoy.Config
if len(configJSON) > 0 {
if err := json.Unmarshal(configJSON, &cfg); err != nil {
return nil, fmt.Errorf("invalid Envoy config: %w", err)
}
}
return envoy.New(&cfg, a.logger), nil
case "Postfix":
var cfg pf.Config
cfg.Mode = "postfix"
if len(configJSON) > 0 {
if err := json.Unmarshal(configJSON, &cfg); err != nil {
return nil, fmt.Errorf("invalid Postfix config: %w", err)
}
}
return pf.New(&cfg, a.logger), nil
case "Dovecot":
var cfg pf.Config
cfg.Mode = "dovecot"
if len(configJSON) > 0 {
if err := json.Unmarshal(configJSON, &cfg); err != nil {
return nil, fmt.Errorf("invalid Dovecot config: %w", err)
}
}
return pf.New(&cfg, a.logger), nil
case "SSH":
var cfg sshconn.Config
if len(configJSON) > 0 {
if err := json.Unmarshal(configJSON, &cfg); err != nil {
return nil, fmt.Errorf("invalid SSH config: %w", err)
}
}
return sshconn.New(&cfg, a.logger)
case "WinCertStore":
var cfg wcs.Config
if len(configJSON) > 0 {
if err := json.Unmarshal(configJSON, &cfg); err != nil {
return nil, fmt.Errorf("invalid WinCertStore config: %w", err)
}
}
return wcs.New(&cfg, a.logger)
case "JavaKeystore":
var cfg jks.Config
if len(configJSON) > 0 {
if err := json.Unmarshal(configJSON, &cfg); err != nil {
return nil, fmt.Errorf("invalid JavaKeystore config: %w", err)
}
}
return jks.New(&cfg, a.logger), nil
case "KubernetesSecrets":
var cfg k8s.Config
if len(configJSON) > 0 {
if err := json.Unmarshal(configJSON, &cfg); err != nil {
return nil, fmt.Errorf("invalid KubernetesSecrets config: %w", err)
}
}
return k8s.New(&cfg, a.logger)
default:
return nil, fmt.Errorf("unsupported target type: %s", targetType)
}
@@ -1042,6 +1204,19 @@ func main() {
cancel()
<-errChan
case err := <-errChan:
// I-004: ErrAgentRetired is a terminal, *clean* shutdown — the control
// plane responded HTTP 410 Gone on heartbeat/work-poll, meaning this
// agent's row has been soft-retired and will never be reachable again.
// Exit 0 so systemd's Restart=on-failure and launchd's KeepAlive do NOT
// respawn the process into another 410 loop (which would wedge the host
// and spam the control plane). Operators can observe the retirement via
// audit_events or the AgentsPage retired tab; the terminal log line on
// the way out is enough for post-mortem forensics.
if errors.Is(err, ErrAgentRetired) {
logger.Info("agent retired by control plane — exiting without restart",
"agent_id", agentCfg.AgentID)
return
}
if !errors.Is(err, context.Canceled) {
logger.Error("agent error", "error", err)
os.Exit(1)
+40 -4
@@ -27,14 +27,17 @@ Commands:
certs renew ID Trigger certificate renewal
certs revoke ID Revoke a certificate
-agents list List agents
-agents get ID Get agent details
+agents list List agents (add --retired to list soft-retired agents)
+agents get ID Get agent details
+agents retire ID Soft-retire an agent (add --force --reason "…" to cascade)
jobs list List jobs
jobs get ID Get job details
jobs cancel ID Cancel a pending job
import FILE Bulk import certificates from PEM file(s)
+Required: --owner-id, --team-id, --renewal-policy-id, --issuer-id
+Optional: --name-template (default {cn}), --environment (default imported)
status Show server health + summary stats
version Show CLI version
@@ -130,15 +133,27 @@ func handleCerts(client *cli.Client, args []string) error {
reason = subArgs[2]
}
return client.RevokeCertificate(id, reason)
case "bulk-revoke":
return client.BulkRevokeCertificates(subArgs)
default:
fmt.Fprintf(os.Stderr, "unknown subcommand: certs %s\n", subcommand)
return nil
}
}
// handleAgents dispatches the `agents` subcommands.
//
// I-004 additions:
//
// agents list --retired — hit the opt-in /agents/retired endpoint
// instead of the default listing (which
// filters retired rows out).
// agents retire <id> — soft-retire an agent (DELETE /agents/{id}).
// --force cascades; --reason is required with
// --force (mirrors ErrForceReasonRequired).
func handleAgents(client *cli.Client, args []string) error {
if len(args) == 0 {
-fmt.Fprintf(os.Stderr, "usage: agents <list|get> [options]\n")
+fmt.Fprintf(os.Stderr, "usage: agents <list|get|retire> [options]\n")
return nil
}
@@ -147,13 +162,34 @@ func handleAgents(client *cli.Client, args []string) error {
switch subcommand {
case "list":
-return client.ListAgents(subArgs)
+// --retired flag splits to a separate endpoint. We intercept it
+// client-side and strip it before delegating, so both code paths
+// share the --page/--per-page flag parsing inside the client.
+retired := false
+rest := make([]string, 0, len(subArgs))
+for _, a := range subArgs {
+if a == "--retired" {
+retired = true
+continue
+}
+rest = append(rest, a)
+}
+if retired {
+return client.ListRetiredAgents(rest)
+}
+return client.ListAgents(rest)
case "get":
if len(subArgs) == 0 {
fmt.Fprintf(os.Stderr, "usage: agents get <id>\n")
return nil
}
return client.GetAgent(subArgs[0])
case "retire":
if len(subArgs) == 0 {
fmt.Fprintf(os.Stderr, "usage: agents retire <id> [--force] [--reason <reason>]\n")
return nil
}
return client.RetireAgent(subArgs)
default:
fmt.Fprintf(os.Stderr, "unknown subcommand: agents %s\n", subcommand)
return nil
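The `agents list` branch above intercepts one boolean flag and forwards the remaining arguments untouched. The same idea, pulled out into a reusable helper, can be sketched as follows. `splitFlag` is a hypothetical name and is not part of the CLI package.

```go
package main

import "fmt"

// splitFlag reports whether flag occurs in args and returns a copy of args
// with every occurrence removed. The input slice is left unmodified.
func splitFlag(args []string, flag string) (bool, []string) {
	found := false
	rest := make([]string, 0, len(args))
	for _, a := range args {
		if a == flag {
			found = true
			continue
		}
		rest = append(rest, a)
	}
	return found, rest
}

func main() {
	retired, rest := splitFlag([]string{"--page", "2", "--retired"}, "--retired")
	fmt.Println(retired, rest) // prints "true [--page 2]"
}
```

This only works for value-less flags. A flag that takes an argument, such as `--reason`, would need the following element removed as well.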
+355 -135
@@ -9,6 +9,7 @@ import (
"os"
"os/signal"
"strconv"
"strings"
"syscall"
"time"
@@ -17,12 +18,9 @@ import (
"github.com/shankar0123/certctl/internal/api/router"
"github.com/shankar0123/certctl/internal/config"
"github.com/shankar0123/certctl/internal/domain"
acmeissuer "github.com/shankar0123/certctl/internal/connector/issuer/acme"
"github.com/shankar0123/certctl/internal/connector/issuer/local"
digicertissuer "github.com/shankar0123/certctl/internal/connector/issuer/digicert"
opensslissuer "github.com/shankar0123/certctl/internal/connector/issuer/openssl"
stepcaissuer "github.com/shankar0123/certctl/internal/connector/issuer/stepca"
vaultissuer "github.com/shankar0123/certctl/internal/connector/issuer/vault"
discoveryawssm "github.com/shankar0123/certctl/internal/connector/discovery/awssm"
discoveryazurekv "github.com/shankar0123/certctl/internal/connector/discovery/azurekv"
discoverygcpsm "github.com/shankar0123/certctl/internal/connector/discovery/gcpsm"
notifyemail "github.com/shankar0123/certctl/internal/connector/notifier/email"
notifyopsgenie "github.com/shankar0123/certctl/internal/connector/notifier/opsgenie"
notifypagerduty "github.com/shankar0123/certctl/internal/connector/notifier/pagerduty"
@@ -83,107 +81,64 @@ func main() {
ownerRepo := postgres.NewOwnerRepository(db)
logger.Info("initialized all repositories")
// Initialize Local CA issuer connector.
// In sub-CA mode (CERTCTL_CA_CERT_PATH + CERTCTL_CA_KEY_PATH set), loads a pre-signed
// CA cert+key from disk. All issued certs chain to the upstream root (e.g., ADCS).
// Otherwise, generates an ephemeral self-signed CA for development/demo.
localCAConfig := &local.Config{}
if cfg.CA.CertPath != "" && cfg.CA.KeyPath != "" {
localCAConfig.CACertPath = cfg.CA.CertPath
localCAConfig.CAKeyPath = cfg.CA.KeyPath
logger.Info("Local CA configured in sub-CA mode",
"cert_path", cfg.CA.CertPath,
"key_path", cfg.CA.KeyPath)
// Initialize dynamic issuer registry.
// Issuers are loaded from the database (with AES-256-GCM encrypted config).
// On first boot with an empty database, env var issuers are seeded automatically.
//
// M-8 (CWE-916 / CWE-329): the encryption passphrase is passed as a raw
// string into IssuerService / TargetService / IssuerRegistry. Each call to
// crypto.EncryptIfKeySet generates a fresh 16-byte PBKDF2 salt and emits a
// v2 blob (magic 0x02 || salt || nonce || sealed). Decryption auto-detects
// v1 legacy blobs (no magic) and falls back to the fixed v1 salt for
// backward compatibility; v1 blobs transparently upgrade to v2 on next
// write. DO NOT pre-derive the key here with crypto.DeriveKey — that was
// the v1 fixed-salt behaviour that M-8 removes.
encryptionKey := cfg.Encryption.ConfigEncryptionKey
if encryptionKey != "" {
logger.Info("config encryption enabled (AES-256-GCM, per-ciphertext PBKDF2 salt)")
} else {
logger.Info("Local CA configured in self-signed mode (ephemeral)")
}
localCA := local.New(localCAConfig, logger)
logger.Info("initialized Local CA issuer connector")
// Initialize ACME issuer connector (for Let's Encrypt, ZeroSSL, Sectigo, Google Trust Services, etc.)
// Supports HTTP-01 (default), DNS-01 (for wildcards), and DNS-PERSIST-01 (standing record) challenge types.
// EAB (External Account Binding) required by ZeroSSL, Google Trust Services, SSL.com.
acmeConnector := acmeissuer.New(&acmeissuer.Config{
DirectoryURL: os.Getenv("CERTCTL_ACME_DIRECTORY_URL"),
Email: os.Getenv("CERTCTL_ACME_EMAIL"),
EABKid: os.Getenv("CERTCTL_ACME_EAB_KID"),
EABHmac: os.Getenv("CERTCTL_ACME_EAB_HMAC"),
ChallengeType: os.Getenv("CERTCTL_ACME_CHALLENGE_TYPE"),
DNSPresentScript: os.Getenv("CERTCTL_ACME_DNS_PRESENT_SCRIPT"),
DNSCleanUpScript: os.Getenv("CERTCTL_ACME_DNS_CLEANUP_SCRIPT"),
DNSPersistIssuerDomain: os.Getenv("CERTCTL_ACME_DNS_PERSIST_ISSUER_DOMAIN"),
Insecure: cfg.ACME.Insecure,
}, logger)
logger.Info("initialized ACME issuer connector")
// Initialize step-ca issuer connector (for Smallstep private CA).
// Uses the native /sign API with JWK provisioner authentication.
stepcaConnector := stepcaissuer.New(&stepcaissuer.Config{
CAURL: os.Getenv("CERTCTL_STEPCA_URL"),
RootCertPath: os.Getenv("CERTCTL_STEPCA_ROOT_CERT"),
ProvisionerName: os.Getenv("CERTCTL_STEPCA_PROVISIONER"),
ProvisionerKeyPath: os.Getenv("CERTCTL_STEPCA_KEY_PATH"),
ProvisionerPassword: os.Getenv("CERTCTL_STEPCA_PASSWORD"),
}, logger)
logger.Info("initialized step-ca issuer connector")
// Initialize OpenSSL/Custom CA issuer connector (for script-based CA integrations).
// Delegates certificate signing to user-provided scripts.
opensslConnector := opensslissuer.New(&opensslissuer.Config{
SignScript: os.Getenv("CERTCTL_OPENSSL_SIGN_SCRIPT"),
RevokeScript: os.Getenv("CERTCTL_OPENSSL_REVOKE_SCRIPT"),
CRLScript: os.Getenv("CERTCTL_OPENSSL_CRL_SCRIPT"),
TimeoutSeconds: getEnvIntDefault(os.Getenv("CERTCTL_OPENSSL_TIMEOUT_SECONDS"), 30),
}, logger)
logger.Info("initialized OpenSSL/Custom CA issuer connector")
// Initialize Vault PKI issuer connector (for HashiCorp Vault internal PKI).
// Uses the Vault HTTP API with token authentication.
vaultConnector := vaultissuer.New(&vaultissuer.Config{
Addr: os.Getenv("CERTCTL_VAULT_ADDR"),
Token: os.Getenv("CERTCTL_VAULT_TOKEN"),
Mount: getEnvDefault("CERTCTL_VAULT_MOUNT", "pki"),
Role: os.Getenv("CERTCTL_VAULT_ROLE"),
TTL: getEnvDefault("CERTCTL_VAULT_TTL", "8760h"),
}, logger)
logger.Info("initialized Vault PKI issuer connector")
// Initialize DigiCert CertCentral issuer connector (for enterprise public CA).
// Uses the DigiCert REST API with async order model.
digicertConnector := digicertissuer.New(&digicertissuer.Config{
APIKey: os.Getenv("CERTCTL_DIGICERT_API_KEY"),
OrgID: os.Getenv("CERTCTL_DIGICERT_ORG_ID"),
ProductType: getEnvDefault("CERTCTL_DIGICERT_PRODUCT_TYPE", "ssl_basic"),
BaseURL: getEnvDefault("CERTCTL_DIGICERT_BASE_URL", "https://www.digicert.com/services/v2"),
}, logger)
logger.Info("initialized DigiCert CertCentral issuer connector")
// Build issuer registry: maps issuer IDs (from database) to connector implementations.
// "iss-local" matches the seed data issuer ID for the Local CA.
// "iss-acme-staging" and "iss-acme-prod" are conventional IDs for ACME issuers.
// "iss-stepca" is the step-ca private CA connector.
// "iss-openssl" is the custom CA/OpenSSL connector.
issuerRegistry := map[string]service.IssuerConnector{
"iss-local": service.NewIssuerConnectorAdapter(localCA),
"iss-acme-staging": service.NewIssuerConnectorAdapter(acmeConnector),
"iss-acme-prod": service.NewIssuerConnectorAdapter(acmeConnector),
"iss-stepca": service.NewIssuerConnectorAdapter(stepcaConnector),
"iss-openssl": service.NewIssuerConnectorAdapter(opensslConnector),
// C-2 fix: fail closed at startup when database-sourced issuer or target
// rows exist without a configured encryption key. Previously the server
// would emit a one-line warning and silently persist new GUI-created
// configs as plaintext (CWE-311). Refuse to start instead: the operator
// must either configure CERTCTL_CONFIG_ENCRYPTION_KEY or remove the
// vulnerable rows before the control plane can boot.
ctx := context.Background()
dbIssuers, ierr := issuerRepo.List(ctx)
if ierr != nil {
logger.Error("startup check: failed to list issuers", "error", ierr)
os.Exit(1)
}
dbTargets, terr := targetRepo.List(ctx)
if terr != nil {
logger.Error("startup check: failed to list targets", "error", terr)
os.Exit(1)
}
var dbIssuerCount, dbTargetCount int
for _, iss := range dbIssuers {
if iss != nil && iss.Source == "database" {
dbIssuerCount++
}
}
for _, tgt := range dbTargets {
if tgt != nil && tgt.Source == "database" {
dbTargetCount++
}
}
if dbIssuerCount > 0 || dbTargetCount > 0 {
logger.Error(
"startup refused: CERTCTL_CONFIG_ENCRYPTION_KEY is not set but database-sourced configs exist "+
"(would expose sensitive fields as plaintext, CWE-311). "+
"Set the encryption key or remove the affected rows before restarting.",
"database_sourced_issuers", dbIssuerCount,
"database_sourced_targets", dbTargetCount,
)
os.Exit(1)
}
logger.Warn("CERTCTL_CONFIG_ENCRYPTION_KEY not set — env-seeded issuers will be stored in plaintext; GUI-created issuers and targets will be rejected until a key is configured")
}
// Conditionally register Vault PKI (only if CERTCTL_VAULT_ADDR is set)
if os.Getenv("CERTCTL_VAULT_ADDR") != "" {
issuerRegistry["iss-vault"] = service.NewIssuerConnectorAdapter(vaultConnector)
logger.Info("Vault PKI issuer registered", "id", "iss-vault")
}
// Conditionally register DigiCert (only if CERTCTL_DIGICERT_API_KEY is set)
if os.Getenv("CERTCTL_DIGICERT_API_KEY") != "" {
issuerRegistry["iss-digicert"] = service.NewIssuerConnectorAdapter(digicertConnector)
logger.Info("DigiCert CertCentral issuer registered", "id", "iss-digicert")
}
logger.Info("issuer registry configured", "issuers", len(issuerRegistry))
issuerRegistry := service.NewIssuerRegistry(logger)
// Initialize revocation repository
revocationRepo := postgres.NewRevocationRepository(db)
@@ -191,6 +146,7 @@ func main() {
// Initialize services (following the dependency graph)
auditService := service.NewAuditService(auditRepo)
policyService := service.NewPolicyService(policyRepo, auditService)
policyService.SetCertRepo(certificateRepo) // D-008: CertificateLifetime arm needs CertificateVersion.NotBefore/NotAfter
certificateService := service.NewCertificateService(certificateRepo, policyService, auditService)
notifierRegistry := make(map[string]service.Notifier)
@@ -268,11 +224,21 @@ func main() {
renewalService := service.NewRenewalService(certificateRepo, jobRepo, renewalPolicyRepo, profileRepo, auditService, notificationService, issuerRegistry, cfg.Keygen.Mode)
renewalService.SetTargetRepo(targetRepo)
deploymentService := service.NewDeploymentService(jobRepo, targetRepo, agentRepo, certificateRepo, auditService, notificationService)
jobService := service.NewJobService(jobRepo, renewalService, deploymentService, logger)
jobService := service.NewJobService(jobRepo, certificateRepo, ownerRepo, renewalService, deploymentService, logger)
// I-001: emit "job_retry" audit events when the scheduler resets Failed→Pending.
// SetAuditService is optional — JobService falls back to nil-guarded no-op if unwired.
jobService.SetAuditService(auditService)
agentService := service.NewAgentService(agentRepo, certificateRepo, jobRepo, targetRepo, auditService, issuerRegistry, renewalService)
agentService.SetProfileRepo(profileRepo)
issuerService := service.NewIssuerService(issuerRepo, auditService)
targetService := service.NewTargetService(targetRepo, auditService)
issuerService := service.NewIssuerService(issuerRepo, auditService, issuerRegistry, encryptionKey, logger)
// Seed issuers from env vars on first boot (empty database only), then build registry
issuerService.SeedFromEnvVars(context.Background(), cfg)
if err := issuerService.BuildRegistry(context.Background()); err != nil {
logger.Error("failed to build issuer registry from database", "error", err)
}
logger.Info("issuer registry loaded", "issuers", issuerRegistry.Len())
targetService := service.NewTargetService(targetRepo, auditService, agentRepo, encryptionKey, logger)
profileService := service.NewProfileService(profileRepo, auditService)
teamService := service.NewTeamService(teamRepo, auditService)
ownerService := service.NewOwnerService(ownerRepo, auditService)
@@ -292,14 +258,99 @@ func main() {
Name: "Network Scanner (Server-Side)",
Status: domain.AgentStatusOnline,
}
if err := agentRepo.Create(context.Background(), sentinelAgent); err != nil {
// Ignore duplicate key errors (agent already exists)
logger.Debug("sentinel agent creation", "status", "exists or created", "id", service.SentinelAgentID)
// M-6: use CreateIfNotExists so duplicate rows on restart/upgrade are
// idempotent without swallowing unrelated DB failures (CWE-662).
created, err := agentRepo.CreateIfNotExists(context.Background(), sentinelAgent)
if err != nil {
logger.Error("sentinel agent creation failed", "id", service.SentinelAgentID, "error", err)
} else if created {
logger.Info("sentinel agent created", "id", service.SentinelAgentID)
} else {
logger.Debug("sentinel agent already exists", "id", service.SentinelAgentID)
}
}
// Initialize cloud discovery sources (M50)
var cloudDiscoveryService *service.CloudDiscoveryService
if cfg.CloudDiscovery.Enabled {
cloudDiscoveryService = service.NewCloudDiscoveryService(discoveryService, logger)
// AWS Secrets Manager
if cfg.CloudDiscovery.AWSSM.Enabled {
awsSource := discoveryawssm.New(&cfg.CloudDiscovery.AWSSM, logger)
cloudDiscoveryService.RegisterSource(awsSource)
// Create sentinel agent for AWS SM
sentinelAWS := &domain.Agent{
ID: service.SentinelAWSSecretsMgr,
Name: "AWS Secrets Manager Discovery",
Status: domain.AgentStatusOnline,
}
// M-6: idempotent create (CWE-662).
created, err := agentRepo.CreateIfNotExists(context.Background(), sentinelAWS)
if err != nil {
logger.Error("sentinel agent creation failed", "id", service.SentinelAWSSecretsMgr, "error", err)
} else if created {
logger.Info("sentinel agent created", "id", service.SentinelAWSSecretsMgr)
} else {
logger.Debug("sentinel agent already exists", "id", service.SentinelAWSSecretsMgr)
}
}
// Azure Key Vault
if cfg.CloudDiscovery.AzureKV.Enabled {
azureSource := discoveryazurekv.New(discoveryazurekv.Config{
VaultURL: cfg.CloudDiscovery.AzureKV.VaultURL,
TenantID: cfg.CloudDiscovery.AzureKV.TenantID,
ClientID: cfg.CloudDiscovery.AzureKV.ClientID,
ClientSecret: cfg.CloudDiscovery.AzureKV.ClientSecret,
}, logger)
cloudDiscoveryService.RegisterSource(azureSource)
sentinelAzure := &domain.Agent{
ID: service.SentinelAzureKeyVault,
Name: "Azure Key Vault Discovery",
Status: domain.AgentStatusOnline,
}
// M-6: idempotent create (CWE-662).
created, err := agentRepo.CreateIfNotExists(context.Background(), sentinelAzure)
if err != nil {
logger.Error("sentinel agent creation failed", "id", service.SentinelAzureKeyVault, "error", err)
} else if created {
logger.Info("sentinel agent created", "id", service.SentinelAzureKeyVault)
} else {
logger.Debug("sentinel agent already exists", "id", service.SentinelAzureKeyVault)
}
}
// GCP Secret Manager
if cfg.CloudDiscovery.GCPSM.Enabled {
gcpSource := discoverygcpsm.New(&cfg.CloudDiscovery.GCPSM, logger)
cloudDiscoveryService.RegisterSource(gcpSource)
sentinelGCP := &domain.Agent{
ID: service.SentinelGCPSecretMgr,
Name: "GCP Secret Manager Discovery",
Status: domain.AgentStatusOnline,
}
// M-6: idempotent create (CWE-662).
created, err := agentRepo.CreateIfNotExists(context.Background(), sentinelGCP)
if err != nil {
logger.Error("sentinel agent creation failed", "id", service.SentinelGCPSecretMgr, "error", err)
} else if created {
logger.Info("sentinel agent created", "id", service.SentinelGCPSecretMgr)
} else {
logger.Debug("sentinel agent already exists", "id", service.SentinelGCPSecretMgr)
}
}
logger.Info("cloud discovery enabled",
"sources", cloudDiscoveryService.SourceCount(),
"interval", cfg.CloudDiscovery.Interval.String())
}
logger.Info("initialized all services")
// Initialize bulk revocation service
bulkRevocationService := service.NewBulkRevocationService(revocationSvc, certificateRepo, auditService, logger)
// Initialize stats and metrics services
statsService := service.NewStatsService(certificateRepo, jobRepo, agentRepo)
logger.Info("initialized stats service")
@@ -327,6 +378,8 @@ func main() {
exportService := service.NewExportService(certificateRepo, auditService)
exportHandler := handler.NewExportHandler(exportService)
bulkRevocationHandler := handler.NewBulkRevocationHandler(bulkRevocationService)
// Initialize digest service (requires email notifier)
var digestService *service.DigestService
var digestHandler *handler.DigestHandler
@@ -346,6 +399,29 @@ func main() {
}
}
// Initialize health check service (M48)
var healthCheckService *service.HealthCheckService
var healthCheckHandler *handler.HealthCheckHandler
if cfg.HealthCheck.Enabled {
healthCheckRepo := postgres.NewHealthCheckRepository(db)
healthCheckService = service.NewHealthCheckService(
healthCheckRepo,
auditService,
logger,
cfg.HealthCheck.MaxConcurrent,
time.Duration(cfg.HealthCheck.DefaultTimeout)*time.Millisecond,
cfg.HealthCheck.HistoryRetention,
cfg.HealthCheck.AutoCreate,
)
healthCheckHandler = handler.NewHealthCheckHandler(healthCheckService)
logger.Info("health check service enabled",
"interval", cfg.HealthCheck.CheckInterval.String(),
"max_concurrent", cfg.HealthCheck.MaxConcurrent)
} else {
// Create a no-op health check handler for route registration
healthCheckHandler = handler.NewHealthCheckHandler(nil)
}
logger.Info("initialized all handlers")
// Create context with cancellation
@@ -365,6 +441,10 @@ func main() {
// Configure scheduler intervals from config
sched.SetRenewalCheckInterval(cfg.Scheduler.RenewalCheckInterval)
sched.SetJobProcessorInterval(cfg.Scheduler.JobProcessorInterval)
// I-001: drive the failed-job retry loop. Runs on start + every RetryInterval
// (default 5m, CERTCTL_SCHEDULER_RETRY_INTERVAL). Kept adjacent to the job
// processor setter because they share the JobServicer dependency.
sched.SetJobRetryInterval(cfg.Scheduler.RetryInterval)
sched.SetAgentHealthCheckInterval(cfg.Scheduler.AgentHealthCheckInterval)
sched.SetNotificationProcessInterval(cfg.Scheduler.NotificationProcessInterval)
if cfg.NetworkScan.Enabled {
@@ -376,6 +456,29 @@ func main() {
sched.SetDigestInterval(cfg.Digest.Interval)
logger.Info("digest scheduler enabled", "interval", cfg.Digest.Interval.String())
}
if healthCheckService != nil {
sched.SetHealthCheckService(healthCheckService)
sched.SetHealthCheckInterval(cfg.HealthCheck.CheckInterval)
logger.Info("health check scheduler enabled", "interval", cfg.HealthCheck.CheckInterval.String())
}
if cloudDiscoveryService != nil && cloudDiscoveryService.SourceCount() > 0 {
sched.SetCloudDiscoveryService(cloudDiscoveryService)
sched.SetCloudDiscoveryInterval(cfg.CloudDiscovery.Interval)
logger.Info("cloud discovery scheduler enabled",
"interval", cfg.CloudDiscovery.Interval.String(),
"sources", cloudDiscoveryService.SourceCount())
}
// Wire job timeout reaper (I-003)
sched.SetJobReaperService(jobService)
sched.SetJobTimeoutInterval(cfg.Scheduler.JobTimeoutInterval)
sched.SetAwaitingCSRTimeout(cfg.Scheduler.AwaitingCSRTimeout)
sched.SetAwaitingApprovalTimeout(cfg.Scheduler.AwaitingApprovalTimeout)
logger.Info("job timeout reaper enabled",
"interval", cfg.Scheduler.JobTimeoutInterval.String(),
"csr_timeout", cfg.Scheduler.AwaitingCSRTimeout.String(),
"approval_timeout", cfg.Scheduler.AwaitingApprovalTimeout.String())
// Start scheduler
logger.Info("starting scheduler")
@@ -406,15 +509,18 @@ func main() {
Verification: verificationHandler,
Export: exportHandler,
Digest: *digestHandler,
HealthChecks: healthCheckHandler,
BulkRevocation: bulkRevocationHandler,
})
// Register EST (RFC 7030) handlers if enabled
if cfg.EST.Enabled {
issuerConn, ok := issuerRegistry[cfg.EST.IssuerID]
issuerConn, ok := issuerRegistry.Get(cfg.EST.IssuerID)
if !ok {
logger.Error("EST issuer not found in registry", "issuer_id", cfg.EST.IssuerID)
os.Exit(1)
}
estService := service.NewESTService(cfg.EST.IssuerID, issuerConn, auditService, logger)
estService.SetProfileRepo(profileRepo)
if cfg.EST.ProfileID != "" {
estService.SetProfileID(cfg.EST.ProfileID)
}
@@ -426,13 +532,102 @@ func main() {
"endpoints", "/.well-known/est/{cacerts,simpleenroll,simplereenroll,csrattrs}")
}
// Register SCEP (RFC 8894) handlers if enabled
if cfg.SCEP.Enabled {
// H-2 fix: fail closed at startup when SCEP is enabled without a
// challenge password configured. Previously the service-layer guard
// at internal/service/scep.go:72-79 skipped the password check when
// s.challengePassword == "", meaning any client that could reach the
// /scep endpoint could enroll an arbitrary CSR against the configured
// issuer (CWE-306, missing authentication for a critical function).
// Refuse to start instead: the operator must set
// CERTCTL_SCEP_CHALLENGE_PASSWORD (or disable SCEP) before the control
// plane can boot.
if err := preflightSCEPChallengePassword(cfg.SCEP.Enabled, cfg.SCEP.ChallengePassword); err != nil {
logger.Error(
"startup refused: SCEP is enabled but CERTCTL_SCEP_CHALLENGE_PASSWORD is not set "+
"(would allow unauthenticated certificate enrollment, CWE-306). "+
"Set a non-empty challenge password or disable SCEP before restarting.",
"error", err,
)
os.Exit(1)
}
issuerConn, ok := issuerRegistry.Get(cfg.SCEP.IssuerID)
if !ok {
logger.Error("SCEP issuer not found in registry", "issuer_id", cfg.SCEP.IssuerID)
os.Exit(1)
}
scepService := service.NewSCEPService(cfg.SCEP.IssuerID, issuerConn, auditService, logger, cfg.SCEP.ChallengePassword)
scepService.SetProfileRepo(profileRepo)
if cfg.SCEP.ProfileID != "" {
scepService.SetProfileID(cfg.SCEP.ProfileID)
}
scepHandler := handler.NewSCEPHandler(scepService)
apiRouter.RegisterSCEPHandlers(scepHandler)
logger.Info("SCEP server enabled",
"issuer_id", cfg.SCEP.IssuerID,
"profile_id", cfg.SCEP.ProfileID,
"challenge_password_set", cfg.SCEP.ChallengePassword != "",
"endpoints", "/scep?operation={GetCACaps,GetCACert,PKIOperation}")
}
// Register RFC 5280 CRL and RFC 6960 OCSP handlers under /.well-known/pki/.
// These are always enabled (no config gate) — revocation data must be
// reachable to relying parties for any cert certctl issues. The finalHandler
// routing gate below strips auth middleware for this prefix so browsers,
// OpenSSL, OCSP stapling sidecars, and mTLS clients can fetch without
// presenting certctl Bearer tokens.
apiRouter.RegisterPKIHandlers(certificateHandler)
logger.Info("PKI endpoints registered",
"endpoints", "/.well-known/pki/{crl/{issuer_id},ocsp/{issuer_id}/{serial}}")
logger.Info("registered all API handlers")
// Build middleware stack
authMiddleware := middleware.NewAuth(middleware.AuthConfig{
Type: cfg.Auth.Type,
Secret: cfg.Auth.Secret,
})
// Build middleware stack.
//
// Authentication unification (M-002): every authenticated request now
// carries a named actor in the request context so audit events record
// the real key identity instead of the hardcoded "api-key-user" string.
// Named keys come from CERTCTL_API_KEYS_NAMED (preferred). For backward
// compatibility CERTCTL_AUTH_SECRET is synthesized into legacy-key-N
// entries with Admin=false.
var namedKeys []middleware.NamedAPIKey
if cfg.Auth.Type != "none" {
// Translate typed config.NamedAPIKey -> middleware.NamedAPIKey. The
// two structs are field-compatible but live in different packages to
// preserve the config→middleware dependency direction.
for _, nk := range cfg.Auth.NamedKeys {
namedKeys = append(namedKeys, middleware.NamedAPIKey{
Name: nk.Name,
Key: nk.Key,
Admin: nk.Admin,
})
}
// Back-compat: if no named keys but legacy Secret is configured,
// synthesize named entries so the audit trail still attributes the
// action (instead of falling back to "api-key-user" / "anonymous").
if len(namedKeys) == 0 && cfg.Auth.Secret != "" {
parts := strings.Split(cfg.Auth.Secret, ",")
idx := 0
for _, p := range parts {
p = strings.TrimSpace(p)
if p == "" {
continue
}
namedKeys = append(namedKeys, middleware.NamedAPIKey{
Name: fmt.Sprintf("legacy-key-%d", idx),
Key: p,
Admin: false,
})
idx++
}
if len(namedKeys) > 0 {
logger.Warn("CERTCTL_AUTH_SECRET is deprecated — set CERTCTL_API_KEYS_NAMED for named actor attribution and admin gating",
"synthesized_keys", len(namedKeys))
}
}
}
authMiddleware := middleware.NewAuthWithNamedKeys(namedKeys)
corsMiddleware := middleware.NewCORS(middleware.CORSConfig{
AllowedOrigins: cfg.CORS.AllowedOrigins,
})
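Review note: the back-compat branch above turns a comma-separated `CERTCTL_AUTH_SECRET` into `legacy-key-N` entries. The loop can be sketched in isolation as below. `NamedAPIKey` here is a local copy of the three fields visible in the diff, and `synthesizeLegacyKeys` is a hypothetical helper name; neither is claimed to exist in the repository.

```go
package main

import (
	"fmt"
	"strings"
)

// NamedAPIKey is a local stand-in carrying the three fields the diff
// populates on middleware.NamedAPIKey.
type NamedAPIKey struct {
	Name  string
	Key   string
	Admin bool
}

// synthesizeLegacyKeys splits a comma-separated secret into named,
// non-admin entries. Blank segments are skipped and do not consume an
// index, matching the idx handling in the hunk above.
func synthesizeLegacyKeys(secret string) []NamedAPIKey {
	var keys []NamedAPIKey
	idx := 0
	for _, p := range strings.Split(secret, ",") {
		p = strings.TrimSpace(p)
		if p == "" {
			continue
		}
		keys = append(keys, NamedAPIKey{
			Name:  fmt.Sprintf("legacy-key-%d", idx),
			Key:   p,
			Admin: false,
		})
		idx++
	}
	return keys
}
```

With the input `" alpha, ,beta ,"` this yields two entries, `legacy-key-0` and `legacy-key-1`, both with `Admin` set to false; an empty secret yields no entries.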
@@ -464,7 +659,7 @@ func main() {
bodyLimitMiddleware,
corsMiddleware,
authMiddleware,
auditMiddleware,
auditMiddleware.Middleware,
}
// Add rate limiter if enabled
@@ -481,7 +676,7 @@ func main() {
rateLimiter,
corsMiddleware,
authMiddleware,
auditMiddleware,
auditMiddleware.Middleware,
}
logger.Info("rate limiting enabled", "rps", cfg.RateLimit.RPS, "burst", cfg.RateLimit.BurstSize)
}
@@ -528,6 +723,14 @@ func main() {
noAuthHandler.ServeHTTP(w, r)
return
}
// RFC 5280 CRL and RFC 6960 OCSP live under /.well-known/pki/ and
// MUST be served unauthenticated — relying parties (browsers,
// OpenSSL, OCSP stapling sidecars, mTLS clients) cannot present
// certctl Bearer tokens. See router.RegisterPKIHandlers.
if len(path) >= 16 && path[:16] == "/.well-known/pki" {
noAuthHandler.ServeHTTP(w, r)
return
}
// All other API and EST routes go through the full middleware stack (with auth)
if (len(path) >= 8 && path[:8] == "/api/v1/") ||
(len(path) >= 16 && path[:16] == "/.well-known/est") {
@@ -544,13 +747,18 @@ func main() {
})
logger.Info("dashboard available at /", "web_dir", webDir)
} else {
// No dashboard: route health/auth-info without auth, everything else through full stack
// No dashboard: route health/auth-info and /.well-known/pki without
// auth, everything else through full stack.
finalHandler = http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
path := r.URL.Path
if path == "/health" || path == "/ready" || path == "/api/v1/auth/info" {
noAuthHandler.ServeHTTP(w, r)
return
}
if len(path) >= 16 && path[:16] == "/.well-known/pki" {
noAuthHandler.ServeHTTP(w, r)
return
}
apiHandler.ServeHTTP(w, r)
})
logger.Info("dashboard directory not found, serving API only")
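Review note: both `finalHandler` variants above decide which paths skip the auth middleware using manual length-and-slice prefix checks. The same decision can be sketched with `strings.HasPrefix`, which is equivalent to `len(path) >= 16 && path[:16] == "/.well-known/pki"`. `isPublicPath` is a hypothetical helper name used only for illustration.

```go
package main

import "strings"

// isPublicPath reports whether a request path should be served without
// authentication. The exact paths and the PKI prefix are taken from the
// diff; the function itself is an illustrative sketch.
func isPublicPath(path string) bool {
	switch path {
	case "/health", "/ready", "/api/v1/auth/info":
		return true
	}
	// Prefix match, same semantics as the slice comparison in the diff.
	return strings.HasPrefix(path, "/.well-known/pki")
}
```

One property worth keeping in mind: a bare prefix match without a trailing slash also accepts paths such as `/.well-known/pkix`. Whether that matters depends on which routes the router actually registers.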
@@ -599,6 +807,17 @@ func main() {
logger.Error("HTTP server shutdown error", "error", err)
}
// Drain in-flight audit-recording goroutines before closing the DB pool.
// The audit middleware spawns one goroutine per non-excluded request; those
// goroutines run detached from the request context and write to the
// audit_events table via the same *sql.DB. Without this drain, SIGTERM
// would close the DB pool while recordings were mid-flight, silently
// dropping audit events (M-1, CWE-662 / CWE-400).
logger.Info("flushing audit middleware in-flight recordings")
if err := auditMiddleware.Flush(shutdownCtx); err != nil {
logger.Warn("audit middleware flush did not complete in time", "error", err)
}
// Close database connection
if err := db.Close(); err != nil {
logger.Error("error closing database connection", "error", err)
@@ -607,22 +826,23 @@ func main() {
logger.Info("certctl server stopped")
}
// getEnvDefault reads an environment variable with a default fallback.
func getEnvDefault(key, defaultVal string) string {
if val := os.Getenv(key); val != "" {
return val
// preflightSCEPChallengePassword enforces the H-2 fix: if SCEP is enabled, a
// non-empty challenge password MUST be configured. Returns a non-nil error
// otherwise so the caller can refuse to start the control plane (CWE-306,
// missing authentication for a critical function).
//
// This helper is extracted so the check can be unit tested without booting
// the full server. The caller (main) is responsible for translating the
// returned error into a structured log line and os.Exit(1).
func preflightSCEPChallengePassword(enabled bool, challengePassword string) error {
if !enabled {
return nil
}
return defaultVal
if challengePassword == "" {
return fmt.Errorf("SCEP enabled but CERTCTL_SCEP_CHALLENGE_PASSWORD is empty: " +
"SCEP enrollment would accept any client (CWE-306); " +
"configure a non-empty shared secret or set CERTCTL_SCEP_ENABLED=false")
}
return nil
}
// getEnvIntDefault parses an integer from a string with a default fallback.
func getEnvIntDefault(s string, defaultVal int) int {
if s == "" {
return defaultVal
}
val, err := strconv.Atoi(s)
if err != nil {
return defaultVal
}
return val
}
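Review note: the comment on `preflightSCEPChallengePassword` says it was extracted so it can be unit tested without booting the server. A table-driven test along the following lines would cover it. The function body is copied from the diff with a shortened error message, and the test name and cases are illustrative assumptions rather than tests present in this changeset.

```go
package main

import (
	"fmt"
	"testing"
)

// preflightSCEPChallengePassword mirrors the helper in the diff: when SCEP
// is enabled, an empty challenge password is an error.
func preflightSCEPChallengePassword(enabled bool, challengePassword string) error {
	if !enabled {
		return nil
	}
	if challengePassword == "" {
		return fmt.Errorf("SCEP enabled but CERTCTL_SCEP_CHALLENGE_PASSWORD is empty")
	}
	return nil
}

// TestPreflightSCEPChallengePassword is a hypothetical table-driven test
// covering all four enabled/password combinations.
func TestPreflightSCEPChallengePassword(t *testing.T) {
	cases := []struct {
		name     string
		enabled  bool
		password string
		wantErr  bool
	}{
		{"disabled, no password", false, "", false},
		{"disabled, password set", false, "s3cret", false},
		{"enabled, no password", true, "", true},
		{"enabled, password set", true, "s3cret", false},
	}
	for _, tc := range cases {
		t.Run(tc.name, func(t *testing.T) {
			err := preflightSCEPChallengePassword(tc.enabled, tc.password)
			if (err != nil) != tc.wantErr {
				t.Errorf("got err=%v, wantErr=%v", err, tc.wantErr)
			}
		})
	}
}
```

Note that the check is a plain comparison against the empty string, so a whitespace-only password would pass the preflight as written in the diff.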
+606
@@ -0,0 +1,606 @@
package main
import (
"context"
"fmt"
"log/slog"
"net/http"
"net/http/httptest"
"os"
"strings"
"testing"
"github.com/shankar0123/certctl/internal/api/middleware"
"github.com/shankar0123/certctl/internal/api/router"
"github.com/shankar0123/certctl/internal/config"
"github.com/shankar0123/certctl/internal/service"
)
// TestMain_HealthEndpointBypassesAuth verifies that health check endpoints
// bypass auth middleware while protected API endpoints require auth.
// This is the most critical test — it validates the core routing pattern used in main.go.
func TestMain_HealthEndpointBypassesAuth(t *testing.T) {
// Simulate the finalHandler logic from main.go with minimal setup
// Create handler functions for health endpoints
healthHandler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
w.Write([]byte(`{"status":"ok"}`))
})
readyHandler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
w.Write([]byte(`{"status":"ready"}`))
})
authInfoHandler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
w.Write([]byte(`{"auth_type":"api-key"}`))
})
// Protected API endpoint
certHandler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
w.Write([]byte(`[]`))
})
// Build the handler chain the same way main.go does
authMiddleware := middleware.NewAuth(middleware.AuthConfig{
Type: "api-key",
Secret: "test-secret-key",
})
// API handler with auth
authHandler := middleware.Chain(certHandler,
middleware.RequestID,
middleware.Recovery,
authMiddleware,
)
// Create finalHandler matching main.go logic
finalHandler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
path := r.URL.Path
switch path {
case "/health":
healthHandler.ServeHTTP(w, r)
case "/ready":
readyHandler.ServeHTTP(w, r)
case "/api/v1/auth/info":
authInfoHandler.ServeHTTP(w, r)
case "/api/v1/certificates":
authHandler.ServeHTTP(w, r)
default:
http.Error(w, "Not Found", http.StatusNotFound)
}
})
tests := []struct {
name string
path string
method string
bypassesAuth bool
expectedStatus int
}{
{
name: "GET /health without auth",
path: "/health",
method: "GET",
bypassesAuth: true,
expectedStatus: http.StatusOK,
},
{
name: "GET /ready without auth",
path: "/ready",
method: "GET",
bypassesAuth: true,
expectedStatus: http.StatusOK,
},
{
name: "GET /api/v1/auth/info without auth",
path: "/api/v1/auth/info",
method: "GET",
bypassesAuth: true,
expectedStatus: http.StatusOK,
},
{
name: "GET /api/v1/certificates without auth (should fail)",
path: "/api/v1/certificates",
method: "GET",
bypassesAuth: false,
expectedStatus: http.StatusUnauthorized,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
req := httptest.NewRequest(tt.method, tt.path, nil)
w := httptest.NewRecorder()
finalHandler.ServeHTTP(w, req)
if tt.bypassesAuth && w.Code != tt.expectedStatus {
t.Errorf("endpoint %s should bypass auth, got status %d, expected %d",
tt.path, w.Code, tt.expectedStatus)
}
if !tt.bypassesAuth && w.Code != tt.expectedStatus {
t.Logf("endpoint %s requires auth, got status %d, expected %d (auth middleware working)",
tt.path, w.Code, tt.expectedStatus)
}
})
}
}
// TestMain_HealthHandlersRespond verifies health endpoints return correct responses.
func TestMain_HealthHandlersRespond(t *testing.T) {
healthHandler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
w.Write([]byte(`{"status":"ok"}`))
})
req := httptest.NewRequest("GET", "/health", nil)
w := httptest.NewRecorder()
healthHandler.ServeHTTP(w, req)
if w.Code != http.StatusOK {
t.Errorf("expected status 200, got %d", w.Code)
}
if body := w.Body.String(); body != `{"status":"ok"}` {
t.Errorf("expected body '{\"status\":\"ok\"}', got '%s'", body)
}
}
// TestMain_AuthMiddlewareRejectsUnauthorized verifies auth middleware works.
func TestMain_AuthMiddlewareRejectsUnauthorized(t *testing.T) {
// Create a protected endpoint
protectedHandler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
w.Write([]byte(`{"data":"protected"}`))
})
// Wrap with auth middleware
authMiddleware := middleware.NewAuth(middleware.AuthConfig{
Type: "api-key",
Secret: "test-secret-key",
})
chainedHandler := middleware.Chain(protectedHandler, authMiddleware)
// Request without auth should be rejected
req := httptest.NewRequest("GET", "/api/v1/protected", nil)
w := httptest.NewRecorder()
chainedHandler.ServeHTTP(w, req)
if w.Code != http.StatusUnauthorized {
t.Errorf("expected status 401 for unauthorized request, got %d", w.Code)
}
}
// TestMain_AuthMiddlewareAllowsWithValidKey verifies auth middleware allows valid keys.
func TestMain_AuthMiddlewareAllowsWithValidKey(t *testing.T) {
testKey := "test-secret-key"
// Create a protected endpoint
protectedHandler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
w.Write([]byte(`{"data":"protected"}`))
})
// Wrap with auth middleware
authMiddleware := middleware.NewAuth(middleware.AuthConfig{
Type: "api-key",
Secret: testKey,
})
chainedHandler := middleware.Chain(protectedHandler, authMiddleware)
// Request with valid auth should be allowed
req := httptest.NewRequest("GET", "/api/v1/protected", nil)
req.Header.Set("Authorization", "Bearer "+testKey)
w := httptest.NewRecorder()
chainedHandler.ServeHTTP(w, req)
if w.Code != http.StatusOK {
t.Errorf("expected status 200 for authorized request, got %d", w.Code)
}
}
// TestMain_ServerConfigFromEnvironment verifies config.Load() reads env vars correctly.
func TestMain_ServerConfigFromEnvironment(t *testing.T) {
// Save original env vars
oldAuthType := os.Getenv("CERTCTL_AUTH_TYPE")
oldServerHost := os.Getenv("CERTCTL_SERVER_HOST")
oldServerPort := os.Getenv("CERTCTL_SERVER_PORT")
defer func() {
if oldAuthType != "" {
os.Setenv("CERTCTL_AUTH_TYPE", oldAuthType)
} else {
os.Unsetenv("CERTCTL_AUTH_TYPE")
}
if oldServerHost != "" {
os.Setenv("CERTCTL_SERVER_HOST", oldServerHost)
} else {
os.Unsetenv("CERTCTL_SERVER_HOST")
}
if oldServerPort != "" {
os.Setenv("CERTCTL_SERVER_PORT", oldServerPort)
} else {
os.Unsetenv("CERTCTL_SERVER_PORT")
}
}()
// Set test env vars
os.Setenv("CERTCTL_AUTH_TYPE", "none")
os.Setenv("CERTCTL_SERVER_HOST", "127.0.0.1")
os.Setenv("CERTCTL_SERVER_PORT", "8080")
cfg, err := config.Load()
if err != nil {
t.Fatalf("Failed to load config from env vars: %v", err)
}
if cfg.Auth.Type != "none" {
t.Errorf("Expected auth type 'none', got '%s'", cfg.Auth.Type)
}
if cfg.Server.Host != "127.0.0.1" {
t.Errorf("Expected server host '127.0.0.1', got '%s'", cfg.Server.Host)
}
if cfg.Server.Port != 8080 {
t.Errorf("Expected server port 8080, got %d", cfg.Server.Port)
}
}
// TestMain_AuthTypeConfiguration verifies auth type is read from config.
func TestMain_AuthTypeConfiguration(t *testing.T) {
// Save original env vars
oldAuthType := os.Getenv("CERTCTL_AUTH_TYPE")
oldAuthSecret := os.Getenv("CERTCTL_AUTH_SECRET")
defer func() {
if oldAuthType != "" {
os.Setenv("CERTCTL_AUTH_TYPE", oldAuthType)
} else {
os.Unsetenv("CERTCTL_AUTH_TYPE")
}
if oldAuthSecret != "" {
os.Setenv("CERTCTL_AUTH_SECRET", oldAuthSecret)
} else {
os.Unsetenv("CERTCTL_AUTH_SECRET")
}
}()
// Set auth secret for api-key mode
os.Setenv("CERTCTL_AUTH_SECRET", "test-secret")
testCases := []string{"api-key", "none"}
for _, authType := range testCases {
t.Run(fmt.Sprintf("auth_type_%s", authType), func(t *testing.T) {
os.Setenv("CERTCTL_AUTH_TYPE", authType)
cfg, err := config.Load()
if err != nil {
t.Fatalf("Failed to load config: %v", err)
}
if cfg.Auth.Type != authType {
t.Errorf("Expected auth type '%s', got '%s'", authType, cfg.Auth.Type)
}
})
}
}
// TestMain_MiddlewareChainConstruction tests that middleware can be properly chained.
func TestMain_MiddlewareChainConstruction(t *testing.T) {
// Test that the middleware.Chain function works as expected
baseHandler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
w.Write([]byte("success"))
})
// Chain with RequestID and Recovery middleware
chainedHandler := middleware.Chain(baseHandler,
middleware.RequestID,
middleware.Recovery,
)
req := httptest.NewRequest("GET", "/test", nil)
w := httptest.NewRecorder()
chainedHandler.ServeHTTP(w, req)
if w.Code != http.StatusOK {
t.Errorf("expected status 200, got %d", w.Code)
}
if body := w.Body.String(); body != "success" {
t.Errorf("expected body 'success', got '%s'", body)
}
}
// TestMain_RequestIDMiddleware verifies RequestID is added to responses.
func TestMain_RequestIDMiddleware(t *testing.T) {
baseHandler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
})
// Wrap with RequestID middleware
chainedHandler := middleware.Chain(baseHandler, middleware.RequestID)
req := httptest.NewRequest("GET", "/test", nil)
w := httptest.NewRecorder()
chainedHandler.ServeHTTP(w, req)
// RequestID should be set in response header
if rid := w.Header().Get("X-Request-ID"); rid == "" {
t.Logf("X-Request-ID header not present (middleware may work differently)")
} else {
t.Logf("X-Request-ID header set: %s", rid)
}
}
// TestMain_RecoveryMiddlewareHandlesPanic verifies recovery middleware works.
func TestMain_RecoveryMiddlewareHandlesPanic(t *testing.T) {
panicHandler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
panic("test panic")
})
// Wrap with recovery middleware
chainedHandler := middleware.Chain(panicHandler, middleware.Recovery)
req := httptest.NewRequest("GET", "/test", nil)
w := httptest.NewRecorder()
// Should not panic
chainedHandler.ServeHTTP(w, req)
// Should return 500 error
if w.Code != http.StatusInternalServerError {
t.Logf("Expected 500 for panicked handler, got %d", w.Code)
}
}
// TestMain_ServiceInitialization tests that services can be instantiated.
// This validates the initialization pattern from main.go without needing a real DB.
func TestMain_ServiceInitialization(t *testing.T) {
logger := slog.New(slog.NewTextHandler(os.Stdout, &slog.HandlerOptions{
Level: slog.LevelInfo,
}))
// Create test issuer registry (same as main.go does)
issuerRegistry := service.NewIssuerRegistry(logger)
if issuerRegistry == nil {
t.Fatal("issuer registry should not be nil")
}
// Verify the registry has a Len() method (used in main.go)
count := issuerRegistry.Len()
if count < 0 {
t.Errorf("issuer registry length should be >= 0, got %d", count)
}
}
// TestMain_CORSMiddlewareSetHeaders verifies CORS headers are set.
func TestMain_CORSMiddlewareSetHeaders(t *testing.T) {
baseHandler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
})
corsMiddleware := middleware.NewCORS(middleware.CORSConfig{
AllowedOrigins: []string{"http://example.com"},
})
chainedHandler := middleware.Chain(baseHandler, corsMiddleware)
req := httptest.NewRequest("GET", "/test", nil)
req.Header.Set("Origin", "http://example.com")
w := httptest.NewRecorder()
chainedHandler.ServeHTTP(w, req)
// CORS middleware should set access control headers
if acah := w.Header().Get("Access-Control-Allow-Origin"); acah == "" {
t.Logf("Access-Control-Allow-Origin not set (may be by design)")
}
}
// TestMain_AuthNoneMode verifies auth can be disabled.
func TestMain_AuthNoneMode(t *testing.T) {
protectedHandler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
w.Write([]byte(`{"data":"protected"}`))
})
// Wrap with auth middleware in "none" mode
authMiddleware := middleware.NewAuth(middleware.AuthConfig{
Type: "none",
})
chainedHandler := middleware.Chain(protectedHandler, authMiddleware)
// Request without auth should be allowed in "none" mode
req := httptest.NewRequest("GET", "/api/v1/protected", nil)
w := httptest.NewRecorder()
chainedHandler.ServeHTTP(w, req)
if w.Code != http.StatusOK {
t.Errorf("expected status 200 in 'none' auth mode, got %d", w.Code)
}
}
// TestMain_RouterRegistration tests that router registration works.
func TestMain_RouterRegistration(t *testing.T) {
r := router.New()
// Register a test handler
r.RegisterFunc("GET /test", func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
w.Write([]byte("test"))
})
// Request the route
req := httptest.NewRequest("GET", "/test", nil)
w := httptest.NewRecorder()
r.ServeHTTP(w, req)
// The handler writes 200, so any other status means the route was not matched.
if w.Code != http.StatusOK {
t.Errorf("expected status 200 from registered route, got %d", w.Code)
}
}
// TestMain_RateLimiterIntegration tests rate limiter middleware works.
func TestMain_RateLimiterIntegration(t *testing.T) {
baseHandler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
})
// Create rate limiter with 10 RPS, 1 burst
rateLimiter := middleware.NewRateLimiter(middleware.RateLimitConfig{
RPS: 10,
BurstSize: 1,
})
chainedHandler := middleware.Chain(baseHandler, rateLimiter)
// First request should succeed
req := httptest.NewRequest("GET", "/test", nil)
w := httptest.NewRecorder()
chainedHandler.ServeHTTP(w, req)
if w.Code != http.StatusOK {
t.Errorf("expected first request within burst to succeed, got status %d", w.Code)
}
}
// TestMain_ContentTypeMiddleware verifies content type is set correctly.
func TestMain_ContentTypeMiddleware(t *testing.T) {
baseHandler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
w.Write([]byte(`{"status":"ok"}`))
})
// Wrap with middleware that sets Content-Type
chainedHandler := middleware.Chain(baseHandler, middleware.ContentType)
req := httptest.NewRequest("GET", "/api/v1/test", nil)
w := httptest.NewRecorder()
chainedHandler.ServeHTTP(w, req)
// Verify response
if w.Code != http.StatusOK {
t.Errorf("expected status 200, got %d", w.Code)
}
// ContentType middleware should set header
if ct := w.Header().Get("Content-Type"); ct != "" {
t.Logf("Content-Type header set: %s", ct)
}
}
// TestMain_ContextPropagation verifies context is propagated through middleware.
func TestMain_ContextPropagation(t *testing.T) {
type contextKey string
testKey := contextKey("test-key")
testValue := "test-value"
baseHandler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
val := r.Context().Value(testKey)
if val == testValue {
w.WriteHeader(http.StatusOK)
} else {
w.WriteHeader(http.StatusInternalServerError)
}
})
chainedHandler := middleware.Chain(baseHandler, middleware.RequestID)
req := httptest.NewRequest("GET", "/test", nil)
// Add context value before request
req = req.WithContext(context.WithValue(req.Context(), testKey, testValue))
w := httptest.NewRecorder()
chainedHandler.ServeHTTP(w, req)
if w.Code != http.StatusOK {
t.Logf("Context value may not be propagated (status %d), this may be expected", w.Code)
}
}
// TestPreflightSCEPChallengePassword is the H-2 regression guard for the
// startup pre-flight check. The helper MUST return a non-nil error whenever
// SCEP is enabled with an empty challenge password — that configuration
// previously allowed unauthenticated certificate enrollment (CWE-306).
// Disabled-SCEP and configured-password cases must pass cleanly.
func TestPreflightSCEPChallengePassword(t *testing.T) {
tests := []struct {
name string
enabled bool
challengePassword string
wantErr bool
wantErrSubstring string
}{
{
name: "disabled_empty_password_ok",
enabled: false,
challengePassword: "",
wantErr: false,
},
{
name: "disabled_with_password_ok",
enabled: false,
challengePassword: "leftover-value",
wantErr: false,
},
{
name: "enabled_empty_password_rejected",
enabled: true,
challengePassword: "",
wantErr: true,
wantErrSubstring: "CERTCTL_SCEP_CHALLENGE_PASSWORD",
},
{
name: "enabled_with_password_ok",
enabled: true,
challengePassword: "hunter2",
wantErr: false,
},
{
name: "enabled_single_char_password_ok",
enabled: true,
challengePassword: "x",
wantErr: false,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
err := preflightSCEPChallengePassword(tt.enabled, tt.challengePassword)
if tt.wantErr {
if err == nil {
t.Fatalf("expected error, got nil")
}
if tt.wantErrSubstring != "" && !strings.Contains(err.Error(), tt.wantErrSubstring) {
t.Errorf("expected error to mention %q, got: %v", tt.wantErrSubstring, err)
}
if !strings.Contains(err.Error(), "CWE-306") {
t.Errorf("expected error to cite CWE-306 for traceability, got: %v", err)
}
} else if err != nil {
t.Errorf("expected no error, got: %v", err)
}
})
}
}
+520
@@ -0,0 +1,520 @@
# certctl Docker Compose Environments
This guide walks through every Docker Compose file in the `deploy/` directory. Each section explains what the environment does, when to use it, every service and environment variable, and the commands to run it. If you've never used Docker before, start with the [Prerequisites](#prerequisites) section. If you're experienced, skip to the environment you need.
## Contents
1. [Prerequisites](#prerequisites)
2. [How Docker Compose Works (30-Second Version)](#how-docker-compose-works)
3. [Base Environment (docker-compose.yml)](#base-environment)
4. [Demo Overlay (docker-compose.demo.yml)](#demo-overlay)
5. [Development Overlay (docker-compose.dev.yml)](#development-overlay)
6. [Test Environment (docker-compose.test.yml)](#test-environment)
7. [Environment Variable Reference](#environment-variable-reference)
8. [Common Operations](#common-operations)
---
## Prerequisites
You need two things: **Docker** (the container runtime) and **Docker Compose** (an orchestration tool that ships with Docker Desktop).
On macOS:
```bash
brew install --cask docker
```
On Linux (Ubuntu/Debian):
```bash
curl -fsSL https://get.docker.com | sh
sudo usermod -aG docker $USER
# Log out and back in for group changes to take effect
```
Verify the install:
```bash
docker --version # Docker Engine 24+ recommended
docker compose version # Docker Compose v2+ required (note: no hyphen)
```
**What Docker actually does:** Docker packages an application and all its dependencies (OS libraries, runtimes, config files) into an isolated unit called a container. When you run `docker compose up`, Docker reads a YAML file that describes multiple containers, creates a private network between them, and starts everything in the right order. Each container sees only its own filesystem and network unless you explicitly share volumes or ports.
**Why this matters for certctl:** Instead of installing PostgreSQL, building Go binaries, configuring the agent, and wiring everything together by hand, one command gives you the complete platform. Each compose file targets a different use case.
---
## How Docker Compose Works
A compose file defines **services** (containers), **networks** (how they talk to each other), and **volumes** (persistent storage). The key concepts:
**Services** are named containers. `certctl-server` is the API and web dashboard. `postgres` is the database. `certctl-agent` polls the server for certificate work.
**Depends_on + healthchecks** control startup order. The server won't start until PostgreSQL reports healthy. The agent won't start until the server reports healthy. This prevents connection errors during boot.
**Volumes** persist data across restarts. `postgres_data` keeps your database between `docker compose down` and `docker compose up`. Adding `-v` to `down` deletes volumes for a clean slate.
**Overlay files** let you layer changes. Running `docker compose -f base.yml -f overlay.yml up` merges both files. The overlay can add services, change environment variables, or mount extra volumes without editing the base.
**Port mapping** (`"8443:8443"`) maps host port (left) to container port (right). After startup, `http://localhost:8443` on your machine reaches the certctl server inside its container.
---
## Base Environment
**File:** `docker-compose.yml`
**When to use:** Production deployments, first-time setup, or any time you want a clean dashboard with the onboarding wizard.
### What it runs
Three services on a private bridge network:
| Service | Image | Purpose | Ports |
|---------|-------|---------|-------|
| `postgres` | `postgres:16-alpine` | Database. Stores certificates, agents, jobs, audit trail, policies, discovery results. | 5432 |
| `certctl-server` | Built from `Dockerfile` | API server + web dashboard + background scheduler. | 8443 |
| `certctl-agent` | Built from `Dockerfile.agent` | Polls server for work, generates keys, deploys certificates, discovers existing certs. | none |
### Starting it
```bash
git clone https://github.com/shankar0123/certctl.git
cd certctl
docker compose -f deploy/docker-compose.yml up -d --build
```
`--build` compiles the Go server and agent from source, including the React frontend. Without it, Docker may reuse a stale image from a previous build.
`-d` runs in detached mode (background). Omit it to see logs in your terminal.
Wait about 30 seconds, then verify:
```bash
docker compose -f deploy/docker-compose.yml ps
# All three services should show "Up (healthy)"
curl http://localhost:8443/health
# {"status":"healthy"}
```
Open **http://localhost:8443** in your browser. You'll see the onboarding wizard guiding you through: connecting a CA, deploying an agent, and adding your first certificate.
### Service-by-service walkthrough
#### PostgreSQL
```yaml
postgres:
  image: postgres:16-alpine
  environment:
    POSTGRES_DB: certctl
    POSTGRES_USER: certctl
    POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-certctl}
```
Alpine-based PostgreSQL 16. The `${POSTGRES_PASSWORD:-certctl}` syntax means: use `POSTGRES_PASSWORD` from your shell environment or `.env` file if it is set and non-empty, otherwise default to `certctl`. For production, create a `.env` file:
```bash
echo 'POSTGRES_PASSWORD=your-secure-password-here' > deploy/.env
```
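The `${NAME:-fallback}` rule described above can be sketched in a few lines of Go. This is only an illustration of the substitution semantics, not Compose's own code; `resolveDefault` and the map standing in for the environment are hypothetical names.

```go
package main

import "fmt"

// resolveDefault mimics the ${NAME:-fallback} rule: the fallback is used
// when NAME is unset OR set to an empty string.
func resolveDefault(env map[string]string, name, fallback string) string {
	if v, ok := env[name]; ok && v != "" {
		return v
	}
	return fallback
}

func main() {
	withValue := map[string]string{"POSTGRES_PASSWORD": "your-secure-password-here"}
	empty := map[string]string{}

	fmt.Println(resolveDefault(withValue, "POSTGRES_PASSWORD", "certctl"))
	fmt.Println(resolveDefault(empty, "POSTGRES_PASSWORD", "certctl"))
}
```

The second call falls back to `certctl`, which is why a missing `.env` file silently leaves the default password in place.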
The `volumes` section mounts 10 migration files into PostgreSQL's init directory (`/docker-entrypoint-initdb.d/`). PostgreSQL runs these SQL files in alphabetical order on first boot only. They create the schema (tables, indexes, constraints) and seed the base data (default issuer, default policy). If the `postgres_data` volume already exists with an initialized database, these scripts are skipped entirely.
**Expert note:** The numbered prefix pattern (`001_`, `002_`, ..., `020_`) ensures deterministic execution order. All migrations use `IF NOT EXISTS` and `ON CONFLICT DO NOTHING` for idempotency, so re-running them against an existing database is safe.
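To see why zero-padded prefixes give a stable order, here is a small Go sketch that sorts a few init-script names lexically. It assumes plain byte-wise ordering (the PostgreSQL image sorts by name according to its locale, which agrees for digit prefixes); `initOrder` is a hypothetical helper, and the file names are taken from the compose mounts.

```go
package main

import (
	"fmt"
	"sort"
)

// initOrder returns the file names in plain lexical order, which is how
// zero-padded numeric prefixes produce a predictable sequence.
func initOrder(names []string) []string {
	sorted := append([]string(nil), names...) // copy so the input is untouched
	sort.Strings(sorted)
	return sorted
}

func main() {
	mounted := []string{
		"020_seed.sql",
		"009_issuer_config.sql",
		"030_seed_demo.sql",
		"010_target_config.sql",
	}
	for _, name := range initOrder(mounted) {
		fmt.Println(name)
	}
}
```

Without the padding, a name such as `10_x.sql` would sort before `9_x.sql`, because `1` compares lower than `9`.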
#### certctl Server
```yaml
certctl-server:
  depends_on:
    postgres:
      condition: service_healthy
  environment:
    CERTCTL_DATABASE_URL: postgres://certctl:${POSTGRES_PASSWORD:-certctl}@postgres:5432/certctl?sslmode=disable
    CERTCTL_SERVER_HOST: 0.0.0.0
    CERTCTL_SERVER_PORT: 8443
    CERTCTL_LOG_LEVEL: info
    CERTCTL_AUTH_TYPE: none
    CERTCTL_KEYGEN_MODE: server
    CERTCTL_NETWORK_SCAN_ENABLED: "true"
    CERTCTL_CONFIG_ENCRYPTION_KEY: ${CERTCTL_CONFIG_ENCRYPTION_KEY:-change-me-32-char-encryption-key}
```
The server is the control plane. It serves the REST API, the React dashboard, runs 7 background scheduler loops (renewal, job processing, health checks, notifications, short-lived cert expiry, network scanning, digest emails), and manages the issuer/target registry.
Key environment variables explained:
- `CERTCTL_DATABASE_URL` references the `postgres` service by hostname. Docker's internal DNS resolves `postgres` to the container's IP on the bridge network. `sslmode=disable` is appropriate because traffic stays on the private Docker network.
- `CERTCTL_AUTH_TYPE: none` disables API key authentication so you can explore immediately. For production, set `api-key` and configure `CERTCTL_AUTH_SECRET`.
- `CERTCTL_KEYGEN_MODE: server` means the server generates private keys. This is convenient for demos but insecure for production. In production, set `agent` so keys are generated on agent machines and never transmitted.
- `CERTCTL_CONFIG_ENCRYPTION_KEY` enables AES-256-GCM encryption for issuer and target configurations stored in the database (credentials, API keys). Without this, the dynamic configuration GUI (adding issuers/targets from the dashboard) won't encrypt sensitive fields. For production, generate a strong random key.
- `CERTCTL_NETWORK_SCAN_ENABLED` activates the scheduler loop that probes TLS endpoints on your network to discover certificates you might not be managing.
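The pieces of `CERTCTL_DATABASE_URL` can be inspected with Go's standard `net/url` package. The sketch below is only a way to read the connection string; it is not how certctl itself parses it, and `parseDSN` and `dsnParts` are hypothetical names.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// dsnParts holds the fields of a postgres:// URL discussed above.
type dsnParts struct {
	Host, Port, Database, SSLMode string
}

// parseDSN splits a postgres:// connection URL into its components.
func parseDSN(dsn string) (dsnParts, error) {
	u, err := url.Parse(dsn)
	if err != nil {
		return dsnParts{}, err
	}
	return dsnParts{
		Host:     u.Hostname(), // the Compose service name, resolved by Docker DNS
		Port:     u.Port(),
		Database: strings.TrimPrefix(u.Path, "/"),
		SSLMode:  u.Query().Get("sslmode"),
	}, nil
}

func main() {
	p, err := parseDSN("postgres://certctl:certctl@postgres:5432/certctl?sslmode=disable")
	if err != nil {
		panic(err)
	}
	fmt.Printf("host=%s port=%s db=%s sslmode=%s\n", p.Host, p.Port, p.Database, p.SSLMode)
}
```

The host component is the service name `postgres`, not an IP address, which is why the same URL keeps working when the container's address changes.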
**Expert note:** The healthcheck hits `GET /health` every 10 seconds with 5 retries. The `depends_on: condition: service_healthy` on the agent means Docker holds agent startup until this check passes. Resource limits (`cpus: '1.0'`, `memory: 512M`) prevent the server from consuming unbounded resources in shared environments.
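The retry behavior can be approximated in Go as a bounded polling loop. This is a sketch of the idea only: Docker's real healthcheck keeps running and tracks consecutive failures, whereas `waitHealthy` simply stops after a fixed number of attempts. All names here are hypothetical, and a stand-in `httptest` server replaces a running certctl instance.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

var errNotReady = errors.New("service not healthy yet")

// waitHealthy calls check up to retries times, sleeping interval between
// failed attempts, and returns nil on the first success.
func waitHealthy(check func() error, retries int, interval time.Duration) error {
	var last error
	for attempt := 1; attempt <= retries; attempt++ {
		if last = check(); last == nil {
			return nil
		}
		if attempt < retries {
			time.Sleep(interval)
		}
	}
	return fmt.Errorf("unhealthy after %d attempts: %w", retries, last)
}

// httpCheck reports success only when GET url returns 200 OK.
func httpCheck(url string) func() error {
	client := &http.Client{Timeout: 2 * time.Second}
	return func() error {
		resp, err := client.Get(url)
		if err != nil {
			return err
		}
		defer resp.Body.Close()
		if resp.StatusCode != http.StatusOK {
			return errNotReady
		}
		return nil
	}
}

func main() {
	// Stand-in server so the sketch runs without a certctl instance.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte(`{"status":"healthy"}`))
	}))
	defer srv.Close()

	err := waitHealthy(httpCheck(srv.URL+"/health"), 5, 10*time.Millisecond)
	fmt.Println("healthy:", err == nil)
}
```

Against a real deployment, the URL would be `http://localhost:8443/health` and the interval closer to the 10 seconds used by the compose healthcheck.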
#### certctl Agent
```yaml
certctl-agent:
  depends_on:
    certctl-server:
      condition: service_healthy
  environment:
    CERTCTL_SERVER_URL: http://certctl-server:8443
    CERTCTL_API_KEY: ${CERTCTL_API_KEY:-change-me-in-production}
    CERTCTL_AGENT_NAME: docker-agent
    CERTCTL_LOG_LEVEL: info
    CERTCTL_DISCOVERY_DIRS: /var/lib/certctl/keys
  volumes:
    - agent_keys:/var/lib/certctl/keys
```
The agent is a lightweight Go binary that polls the server for pending work (certificate deployments, CSR generation requests), executes that work locally, and reports results back. It also scans configured directories for existing certificates (filesystem discovery).
- `CERTCTL_SERVER_URL` uses the Docker internal hostname `certctl-server`. This resolves inside the Docker network only.
- `CERTCTL_DISCOVERY_DIRS` tells the agent which directories to scan for existing certificates. The agent walks these directories recursively, parses PEM and DER files, and reports findings to the server for triage.
- The `agent_keys` volume persists private keys generated by the agent across container restarts. Without this volume, keys would be lost when the container stops.
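Filesystem discovery as described above can be sketched with the Go standard library. This is not the agent's actual implementation: it handles PEM files only (the agent is described as parsing DER as well), and `findCerts`, `writeDemoCert`, and `discoverDemo` are hypothetical helpers that run against a temporary directory.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"fmt"
	"io/fs"
	"math/big"
	"os"
	"path/filepath"
	"time"
)

// findCerts walks root recursively and returns the subject common name of
// every PEM-encoded certificate it can parse. Unreadable files and
// non-certificate PEM blocks are skipped rather than treated as errors.
func findCerts(root string) ([]string, error) {
	var names []string
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, walkErr error) error {
		if walkErr != nil || d.IsDir() {
			return walkErr
		}
		data, err := os.ReadFile(path)
		if err != nil {
			return nil
		}
		for block, rest := pem.Decode(data); block != nil; block, rest = pem.Decode(rest) {
			if block.Type != "CERTIFICATE" {
				continue
			}
			if cert, err := x509.ParseCertificate(block.Bytes); err == nil {
				names = append(names, cert.Subject.CommonName)
			}
		}
		return nil
	})
	return names, err
}

// writeDemoCert creates a throwaway self-signed certificate in dir.
func writeDemoCert(dir, commonName string) error {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: commonName},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(24 * time.Hour),
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return err
	}
	pemBytes := pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der})
	return os.WriteFile(filepath.Join(dir, "demo.pem"), pemBytes, 0o644)
}

// discoverDemo places one certificate in a nested temp directory and scans it.
func discoverDemo() ([]string, error) {
	dir, err := os.MkdirTemp("", "discovery")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	nested := filepath.Join(dir, "nested")
	if err := os.Mkdir(nested, 0o755); err != nil {
		return nil, err
	}
	if err := writeDemoCert(nested, "demo.example.com"); err != nil {
		return nil, err
	}
	return findCerts(dir)
}

func main() {
	names, err := discoverDemo()
	if err != nil {
		panic(err)
	}
	fmt.Println(names)
}
```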
**Expert note:** The agent's healthcheck uses `pgrep` because the agent doesn't expose an HTTP endpoint. The `restart: unless-stopped` policy means Docker automatically restarts the agent on crashes but respects manual `docker compose stop` commands.
### Stopping and cleaning up
```bash
# Stop containers but keep data
docker compose -f deploy/docker-compose.yml down
# Stop and delete all data (database, keys, volumes)
docker compose -f deploy/docker-compose.yml down -v
```
---
## Demo Overlay
**File:** `docker-compose.demo.yml`
**When to use:** Demos, screenshots, stakeholder presentations, or any time you want a populated dashboard on first boot.
### What it adds
One line: mounts `seed_demo.sql` into PostgreSQL's init directory. This 667-line SQL file inserts 180 days of simulated operational history: teams, owners, certificates across multiple issuers, agents on different platforms, jobs with realistic timestamps, discovery scan results, audit events, policies, and profiles.
### Starting it
```bash
docker compose -f deploy/docker-compose.yml -f deploy/docker-compose.demo.yml up -d --build
```
The `-f` flags are ordered: base first, overlay second. Docker merges them. The demo overlay adds the seed_demo.sql volume mount to the `postgres` service defined in the base file.
### What you see
The dashboard shows pre-populated charts: expiration heatmap with upcoming renewals, status distribution across Active/Expiring/Expired/Failed states, 30-day job trends, and issuance rates. The sidebar pages (Certificates, Agents, Discovery, Jobs, etc.) all have data to explore.
### Resetting demo data
```bash
docker compose -f deploy/docker-compose.yml -f deploy/docker-compose.demo.yml down -v
docker compose -f deploy/docker-compose.yml -f deploy/docker-compose.demo.yml up -d --build
```
The `down -v` deletes the `postgres_data` volume. On next boot, PostgreSQL re-runs all init scripts including the demo seed, giving you a clean starting point.
**Expert note:** The demo overlay is a pure data layer, not a configuration change. The server, agent, and their environment variables remain identical to the base. This means any behavior you see in the demo is exactly what the base environment produces once you populate data through normal operations.
---
## Development Overlay
**File:** `docker-compose.dev.yml`
**When to use:** When you're contributing to certctl and need debug logging, database inspection, or a debugger attached to the server process.
### What it adds
| Addition | Purpose |
|----------|---------|
| Debug-level logging on server and agent | See every HTTP request, scheduler tick, and connector operation |
| PgAdmin on port 5050 | Visual database browser for inspecting tables, running queries |
| Delve debugger port 40000 | Attach a Go debugger to the running server process |
### Starting it
```bash
docker compose -f deploy/docker-compose.yml -f deploy/docker-compose.dev.yml up --build
```
Omit `-d` during development so you see logs streaming in your terminal.
### Using PgAdmin
Open **http://localhost:5050** in your browser. PgAdmin is pre-configured in desktop mode (no login required). To connect to the certctl database:
1. Right-click "Servers" in the left panel, choose "Register" > "Server"
2. Name: `certctl`
3. Connection tab: Host = `postgres`, Port = `5432`, Username = `certctl`, Password = `certctl` (or whatever you set in `.env`)
From there you can browse all 19 tables, inspect certificate records, view audit events, check the scheduler's job queue, and run arbitrary SQL.
### Using the Delve debugger
Port 40000 is exposed for remote debugging. To use it, you'd need to modify the Dockerfile to build with debug symbols and start the server under Delve:
```dockerfile
# In Dockerfile, replace the CMD with:
CMD ["dlv", "--listen=:40000", "--headless=true", "--api-version=2", "exec", "/app/server"]
```
Then attach from your IDE (VS Code, GoLand) using remote debug configuration pointing to `localhost:40000`.
### Hot reload
The dev overlay includes commented-out volume mounts for source code directories. Uncomment them and install [air](https://github.com/air-verse/air) (formerly `cosmtrek/air`) to get automatic recompilation on file changes:
```bash
go install github.com/air-verse/air@latest
```
**Expert note:** The `build: context: ..` in the dev overlay overrides the base service's image reference, forcing a local build from the repository root. This means changes to your Go source code are compiled fresh on each `docker compose up --build`.
---
## Test Environment
**File:** `docker-compose.test.yml`
**When to use:** Integration testing against real CA backends. This is a standalone environment (not an overlay) with 7 containers on a static-IP subnet.
### What it runs
| Service | IP | Purpose |
|---------|----|---------|
| `postgres` | 10.30.50.2 | Database (clean, no demo data) |
| `pebble-challtestsrv` | 10.30.50.3 | DNS/HTTP challenge test server for Pebble |
| `pebble` | 10.30.50.4 | ACME test server (simulates Let's Encrypt) |
| `step-ca` | 10.30.50.5 | Private CA (Smallstep, JWK provisioner) |
| `certctl-server` | 10.30.50.6 | Control plane with all issuers configured |
| `nginx` | 10.30.50.7 | TLS target server for deployment testing |
| `certctl-agent` | 10.30.50.8 | Agent with NGINX volume + discovery |
### Why static IPs?
Pebble (the ACME test server) validates HTTP-01 challenges by connecting to the challenge URL. It resolves domain names via `pebble-challtestsrv`, which is configured to return `10.30.50.6` (the certctl server) for all lookups. Without static IPs, container IPs would be assigned randomly on each boot, breaking the challenge validation chain.
The `/24` subnet (10.30.50.0/24) provides 254 usable addresses, far more than needed but standard practice for test networks.
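The subnet arithmetic can be checked with Go's `net/netip` package. The helper names `usableHosts` and `inSubnet` are hypothetical; the prefix and addresses come from the table above.

```go
package main

import (
	"fmt"
	"net/netip"
)

// usableHosts returns the number of assignable IPv4 addresses in prefix,
// excluding the network and broadcast addresses.
func usableHosts(prefix string) (int, error) {
	p, err := netip.ParsePrefix(prefix)
	if err != nil {
		return 0, err
	}
	if !p.Addr().Is4() || p.Bits() > 30 {
		return 0, fmt.Errorf("expected an IPv4 prefix of /30 or shorter, got %s", prefix)
	}
	return (1 << (32 - p.Bits())) - 2, nil
}

// inSubnet reports whether addr falls inside prefix.
func inSubnet(prefix, addr string) (bool, error) {
	p, err := netip.ParsePrefix(prefix)
	if err != nil {
		return false, err
	}
	a, err := netip.ParseAddr(addr)
	if err != nil {
		return false, err
	}
	return p.Contains(a), nil
}

func main() {
	hosts, err := usableHosts("10.30.50.0/24")
	if err != nil {
		panic(err)
	}
	ok, err := inSubnet("10.30.50.0/24", "10.30.50.6")
	if err != nil {
		panic(err)
	}
	fmt.Println("usable hosts:", hosts)
	fmt.Println("certctl-server in subnet:", ok)
}
```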
### Starting it
```bash
docker compose -f deploy/docker-compose.test.yml up --build
```
Wait for all health checks to pass (about 60 seconds for step-ca's first-run bootstrap). Then:
```bash
# Dashboard with auth enabled
open http://localhost:8443
# API key: test-key-2026
# NGINX serving a self-signed placeholder
curl -k https://localhost:8444
```
### What's different from the base
The test environment is configured for production-like behavior:
- **API key auth enabled** (`CERTCTL_AUTH_TYPE: api-key`, `CERTCTL_AUTH_SECRET: test-key-2026`). Every API request needs `Authorization: Bearer test-key-2026`.
- **Agent-side key generation** (`CERTCTL_KEYGEN_MODE: agent`). The agent generates ECDSA P-256 keys locally and submits only the CSR to the server. Private keys never leave the agent container.
- **Three real issuers configured:**
- **Local CA** (self-signed) for instant issuance testing
- **ACME via Pebble** for Let's Encrypt-compatible flow testing (HTTP-01 challenges validated through the challenge test server)
- **step-ca** for private CA testing with JWK provisioner authentication
- **EST server enabled** (`CERTCTL_EST_ENABLED: "true"`) for RFC 7030 enrollment testing
- **Post-deployment verification enabled** (`CERTCTL_VERIFY_DEPLOYMENT: "true"`) so the agent probes NGINX after deploying a cert and confirms the TLS fingerprint matches
- **Dynamic config encryption enabled** (`CERTCTL_CONFIG_ENCRYPTION_KEY`) so issuer/target configs added through the GUI are encrypted at rest
- **TLS trust bootstrapping:** The server runs a `setup-trust.sh` entrypoint that fetches Pebble's root CA from its management API and copies step-ca's root cert from a shared volume, then runs `update-ca-certificates` before starting the server binary. This is necessary because both CAs use self-signed roots that aren't in Alpine's default trust store.
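The agent-side key generation flow from the list above (generate an ECDSA P-256 key locally, hand over only the CSR) can be sketched with the Go standard library. This is an illustration under stated assumptions, not the agent's code: `newCSR` and `csrCommonName` are hypothetical helpers, and the subject and SAN layout are guesses.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"errors"
	"fmt"
)

// newCSR generates a P-256 private key and a CSR signed with it.
// Only csrPEM would be sent to the server; keyPEM stays on the agent.
func newCSR(commonName string) (keyPEM, csrPEM []byte, err error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	keyDER, err := x509.MarshalECPrivateKey(key)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.CertificateRequest{
		Subject:  pkix.Name{CommonName: commonName},
		DNSNames: []string{commonName},
	}
	csrDER, err := x509.CreateCertificateRequest(rand.Reader, tmpl, key)
	if err != nil {
		return nil, nil, err
	}
	keyPEM = pem.EncodeToMemory(&pem.Block{Type: "EC PRIVATE KEY", Bytes: keyDER})
	csrPEM = pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE REQUEST", Bytes: csrDER})
	return keyPEM, csrPEM, nil
}

// csrCommonName parses a PEM CSR, verifies its self-signature, and
// returns the subject common name.
func csrCommonName(csrPEM []byte) (string, error) {
	block, _ := pem.Decode(csrPEM)
	if block == nil || block.Type != "CERTIFICATE REQUEST" {
		return "", errors.New("not a PEM certificate request")
	}
	csr, err := x509.ParseCertificateRequest(block.Bytes)
	if err != nil {
		return "", err
	}
	if err := csr.CheckSignature(); err != nil {
		return "", err
	}
	return csr.Subject.CommonName, nil
}

func main() {
	_, csrPEM, err := newCSR("app.example.com")
	if err != nil {
		panic(err)
	}
	cn, err := csrCommonName(csrPEM)
	if err != nil {
		panic(err)
	}
	fmt.Println("CSR subject:", cn)
}
```

The CSR carries the public key and a proof of possession, so the issuing side never needs the private key.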
### Running the Go integration tests
The test environment is designed to support the Go integration test suite at `deploy/test/integration_test.go`:
```bash
# Start the environment
docker compose -f deploy/docker-compose.test.yml up --build -d
# Wait for health checks (step-ca's first-run bootstrap can take about 60 seconds)
sleep 60
# Run integration tests (from repo root)
go test -tags integration -v ./deploy/test/...
```
The integration tests exercise 12 phases: health, agent heartbeat, Local CA issuance, ACME issuance, renewal, step-ca issuance, revocation + CRL + OCSP, EST enrollment, S/MIME issuance, discovery, network scan, and deployment verification. PostgreSQL port 5432 is exposed so the test binary can query the database directly for assertions.
See [docs/test-env.md](../docs/test-env.md) for the full walkthrough and manual QA procedures.
### Stopping and cleaning up
```bash
# Stop but keep data (volumes persist)
docker compose -f deploy/docker-compose.test.yml down
# Full reset (delete step-ca bootstrap, database, agent keys, NGINX certs)
docker compose -f deploy/docker-compose.test.yml down -v
```
**Expert note:** The step-ca container auto-bootstraps on first run: generates a root CA, creates a JWK provisioner named "admin" with password "password123", and writes everything to the `stepca_data` volume. Subsequent starts reuse this volume. If you `down -v`, the next boot generates a new root CA, which means all previously issued step-ca certs become untrusted.
---
## Environment Variable Reference
Every `CERTCTL_*` environment variable is read by the server's `internal/config/config.go` via `os.Getenv`. If the prefix is missing, the variable is silently ignored.
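The effect of the prefix rule can be illustrated with a short Go sketch. It filters an `os.Environ`-style list rather than calling `os.Getenv` per variable, so it shows the outcome and is not a copy of `config.go`; `certctlVars` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
)

const prefix = "CERTCTL_"

// certctlVars returns the entries from environ (in "KEY=value" form)
// whose key starts with the CERTCTL_ prefix.
func certctlVars(environ []string) map[string]string {
	out := make(map[string]string)
	for _, kv := range environ {
		key, value, ok := strings.Cut(kv, "=")
		if ok && strings.HasPrefix(key, prefix) {
			out[key] = value
		}
	}
	return out
}

func main() {
	environ := []string{
		"LOG_LEVEL=debug", // missing prefix: never seen by the config loader
		"CERTCTL_LOG_LEVEL=info",
		"CERTCTL_SERVER_PORT=8443",
	}
	vars := certctlVars(environ)
	fmt.Println(len(vars), vars["CERTCTL_LOG_LEVEL"])
}
```

In this example the unprefixed `LOG_LEVEL=debug` has no effect, so the server would keep logging at `info`.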
### Server
| Variable | Default | Description |
|----------|---------|-------------|
| `CERTCTL_DATABASE_URL` | (required) | PostgreSQL connection string |
| `CERTCTL_SERVER_HOST` | `0.0.0.0` | Listen address |
| `CERTCTL_SERVER_PORT` | `8443` | Listen port |
| `CERTCTL_LOG_LEVEL` | `info` | Log verbosity: `debug`, `info`, `warn`, `error` |
| `CERTCTL_AUTH_TYPE` | `api-key` | Auth mode: `api-key` or `none` |
| `CERTCTL_AUTH_SECRET` | (none) | API key(s), comma-separated for rotation |
| `CERTCTL_KEYGEN_MODE` | `agent` | Key generation: `agent` (production) or `server` (demo) |
| `CERTCTL_CONFIG_ENCRYPTION_KEY` | (none) | AES-256-GCM key for encrypting issuer/target configs in DB |
| `CERTCTL_NETWORK_SCAN_ENABLED` | `false` | Enable network TLS scanning scheduler loop |
| `CERTCTL_NETWORK_SCAN_INTERVAL` | `6h` | How often the network scanner runs |
| `CERTCTL_MAX_BODY_SIZE` | `1048576` | Max request body size in bytes (1 MiB) |
| `CERTCTL_CORS_ORIGINS` | (empty) | Allowed CORS origins, comma-separated. Empty = deny all cross-origin |
| `CERTCTL_RATE_LIMIT_RPS` | `10` | Requests per second per client |
| `CERTCTL_RATE_LIMIT_BURST` | `20` | Burst allowance above RPS |
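One common way to read the `CERTCTL_RATE_LIMIT_RPS` and `CERTCTL_RATE_LIMIT_BURST` pair is as a token bucket. The sketch below is a generic token bucket under that assumption; whether certctl's middleware uses this exact algorithm is not confirmed by this guide. Time is passed in as seconds so the example stays deterministic, and `bucket` and `newBucket` are hypothetical names.

```go
package main

import "fmt"

// bucket is a minimal token bucket for a single client.
type bucket struct {
	rate   float64 // tokens added per second
	burst  float64 // maximum tokens the bucket can hold
	tokens float64 // tokens currently available
	last   float64 // time of the previous call, in seconds
}

// newBucket starts full, so an idle client can spend the whole burst at once.
func newBucket(rate, burst float64) *bucket {
	return &bucket{rate: rate, burst: burst, tokens: burst}
}

// allow refills the bucket for the time elapsed since the last call and
// spends one token if one is available.
func (b *bucket) allow(now float64) bool {
	b.tokens += (now - b.last) * b.rate
	if b.tokens > b.burst {
		b.tokens = b.burst
	}
	b.last = now
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

func main() {
	b := newBucket(10, 20) // the documented defaults
	allowed := 0
	for i := 0; i < 25; i++ {
		if b.allow(0) {
			allowed++
		}
	}
	fmt.Println("allowed at t=0:", allowed)
	fmt.Println("allowed at t=0.5:", b.allow(0.5))
}
```

Under this model, a client can send up to the burst size back to back, after which requests are admitted at the steady RPS rate.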
### Agent
| Variable | Default | Description |
|----------|---------|-------------|
| `CERTCTL_SERVER_URL` | (required) | Server API URL |
| `CERTCTL_API_KEY` | (none) | API key for authenticating with server |
| `CERTCTL_AGENT_NAME` | (hostname) | Display name in dashboard |
| `CERTCTL_AGENT_ID` | (auto-generated) | Stable agent identifier |
| `CERTCTL_KEYGEN_MODE` | `agent` | Must match server setting |
| `CERTCTL_LOG_LEVEL` | `info` | Log verbosity |
| `CERTCTL_KEY_DIR` | `/var/lib/certctl/keys` | Directory for private key storage (0600 perms) |
| `CERTCTL_DISCOVERY_DIRS` | (none) | Comma-separated paths to scan for existing certs |
### Issuers (Server)
| Variable | Description |
|----------|-------------|
| `CERTCTL_ACME_DIRECTORY_URL` | ACME CA directory (e.g., Let's Encrypt, Pebble) |
| `CERTCTL_ACME_EMAIL` | ACME account email |
| `CERTCTL_ACME_CHALLENGE_TYPE` | `http-01`, `dns-01`, or `dns-persist-01` |
| `CERTCTL_ACME_INSECURE` | Skip TLS verification for ACME CA (test only) |
| `CERTCTL_ACME_EAB_KID` / `CERTCTL_ACME_EAB_HMAC` | External Account Binding for ZeroSSL, Google Trust Services |
| `CERTCTL_ACME_ARI_ENABLED` | Enable RFC 9773 Renewal Information |
| `CERTCTL_ACME_PROFILE` | ACME profile (`tlsserver`, `shortlived`) |
| `CERTCTL_STEPCA_URL` | step-ca server URL |
| `CERTCTL_STEPCA_ROOT_CERT` | Path to step-ca root CA cert |
| `CERTCTL_STEPCA_PROVISIONER` | Provisioner name |
| `CERTCTL_STEPCA_PASSWORD` | Provisioner password |
| `CERTCTL_STEPCA_KEY_PATH` | Path to provisioner key |
| `CERTCTL_CA_CERT_PATH` / `CERTCTL_CA_KEY_PATH` | Sub-CA mode: load CA cert+key from disk |
| `CERTCTL_VAULT_ADDR` | Vault server address |
| `CERTCTL_VAULT_TOKEN` | Vault auth token |
| `CERTCTL_VAULT_MOUNT` | PKI secrets engine mount (default: `pki`) |
| `CERTCTL_VAULT_ROLE` | PKI role name |
| `CERTCTL_DIGICERT_API_KEY` | DigiCert CertCentral API key |
| `CERTCTL_DIGICERT_ORG_ID` | DigiCert organization ID |
| `CERTCTL_SECTIGO_CUSTOMER_URI` / `_LOGIN` / `_PASSWORD` | Sectigo SCM auth |
| `CERTCTL_GOOGLE_CAS_PROJECT` / `_LOCATION` / `_CA_POOL` / `_CREDENTIALS` | Google CAS config |
### EST Server
| Variable | Default | Description |
|----------|---------|-------------|
| `CERTCTL_EST_ENABLED` | `false` | Enable RFC 7030 EST endpoints |
| `CERTCTL_EST_ISSUER_ID` | `iss-local` | Which issuer processes EST enrollments |
| `CERTCTL_EST_PROFILE_ID` | (none) | Optional profile constraint |
### Post-Deployment Verification
| Variable | Default | Description |
|----------|---------|-------------|
| `CERTCTL_VERIFY_DEPLOYMENT` | `false` | Agent probes TLS after deploying |
| `CERTCTL_VERIFY_TIMEOUT` | `10s` | TLS probe timeout |
| `CERTCTL_VERIFY_DELAY` | `2s` | Wait before probing (let service reload) |
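The verification step (probe the endpoint and compare fingerprints) can be sketched as follows. It assumes a SHA-256 fingerprint over the leaf certificate's DER bytes, which this guide does not specify, and it uses a local `httptest` TLS server in place of NGINX. `fingerprint`, `probeFingerprint`, and `verifyDemo` are hypothetical names.

```go
package main

import (
	"crypto/sha256"
	"crypto/tls"
	"encoding/hex"
	"errors"
	"fmt"
	"net"
	"net/http"
	"net/http/httptest"
	"time"
)

// fingerprint returns the lowercase hex SHA-256 digest of DER bytes.
func fingerprint(der []byte) string {
	sum := sha256.Sum256(der)
	return hex.EncodeToString(sum[:])
}

// probeFingerprint connects to addr and fingerprints the leaf certificate
// the endpoint presents. Chain validation is skipped on purpose: the probe
// only compares fingerprints and sends no application data.
func probeFingerprint(addr string, timeout time.Duration) (string, error) {
	dialer := &net.Dialer{Timeout: timeout}
	conn, err := tls.DialWithDialer(dialer, "tcp", addr, &tls.Config{InsecureSkipVerify: true})
	if err != nil {
		return "", err
	}
	defer conn.Close()
	certs := conn.ConnectionState().PeerCertificates
	if len(certs) == 0 {
		return "", errors.New("no peer certificate presented")
	}
	return fingerprint(certs[0].Raw), nil
}

// verifyDemo starts a local TLS server and checks that the probed
// fingerprint matches the certificate the server was configured with.
func verifyDemo() (bool, error) {
	srv := httptest.NewTLSServer(http.NotFoundHandler())
	defer srv.Close()
	want := fingerprint(srv.Certificate().Raw)
	got, err := probeFingerprint(srv.Listener.Addr().String(), 10*time.Second)
	if err != nil {
		return false, err
	}
	return got == want, nil
}

func main() {
	ok, err := verifyDemo()
	if err != nil {
		panic(err)
	}
	fmt.Println("fingerprint matches:", ok)
}
```

The 10-second timeout mirrors the `CERTCTL_VERIFY_TIMEOUT` default; a real probe would also wait `CERTCTL_VERIFY_DELAY` first so the target has time to reload.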
### Notifications
| Variable | Description |
|----------|-------------|
| `CERTCTL_SMTP_HOST` / `_PORT` / `_USERNAME` / `_PASSWORD` / `_FROM_ADDRESS` / `_USE_TLS` | SMTP email |
| `CERTCTL_SLACK_WEBHOOK_URL` / `_CHANNEL` / `_USERNAME` | Slack notifications |
| `CERTCTL_TEAMS_WEBHOOK_URL` | Microsoft Teams |
| `CERTCTL_PAGERDUTY_ROUTING_KEY` / `_SEVERITY` | PagerDuty alerts |
| `CERTCTL_OPSGENIE_API_KEY` / `_PRIORITY` | OpsGenie alerts |
| `CERTCTL_DIGEST_ENABLED` / `_INTERVAL` / `_RECIPIENTS` | Scheduled digest email |
---
## Common Operations
### Viewing logs
```bash
# All services
docker compose -f deploy/docker-compose.yml logs -f
# Single service
docker compose -f deploy/docker-compose.yml logs -f certctl-server
# Last 100 lines
docker compose -f deploy/docker-compose.yml logs --tail 100 certctl-server
```
### Rebuilding after code changes
```bash
docker compose -f deploy/docker-compose.yml up -d --build
```
Docker's layer cache re-runs only the build steps whose inputs changed, so unchanged stages are reused. The `--build` flag is essential after editing Go code or frontend files; without it, Compose keeps using the existing image.
### Connecting to the database directly
```bash
docker exec -it certctl-postgres psql -U certctl -d certctl
```
Useful queries:
```sql
-- Certificate inventory
SELECT id, common_name, status, expires_at FROM managed_certificates ORDER BY expires_at;
-- Recent jobs
SELECT id, type, status, certificate_id, created_at FROM jobs ORDER BY created_at DESC LIMIT 20;
-- Audit trail
SELECT event_type, actor, resource_id, created_at FROM audit_events ORDER BY created_at DESC LIMIT 20;
-- Issuer configurations (encrypted_config is AES-256-GCM)
SELECT id, type, source, enabled, test_status FROM issuers;
```
### Checking container resource usage
```bash
docker stats --no-stream
```
### Upgrading
```bash
git pull
docker compose -f deploy/docker-compose.yml up -d --build
```
Migrations are idempotent (`IF NOT EXISTS`), so re-applying them is safe. However, PostgreSQL runs init scripts only on the first boot of a fresh volume, so an upgrade that ships new migrations requires applying them manually (the file name below is a placeholder for whichever migration the release adds):
```bash
docker exec -i certctl-postgres psql -U certctl -d certctl < migrations/000011_new_feature.up.sql
```
Or, for a clean upgrade: `down -v` and `up --build` (loses existing data).
+14
@@ -0,0 +1,14 @@
# Demo mode: pre-populated dashboard with 32 certificates, 8 agents, 10 issuers, etc.
# Use this to showcase certctl's dashboard with realistic data.
#
# Usage:
# docker compose -f docker-compose.yml -f docker-compose.demo.yml up --build
#
# To start fresh (wipe previous data):
# docker compose -f docker-compose.yml -f docker-compose.demo.yml down -v
# docker compose -f docker-compose.yml -f docker-compose.demo.yml up --build
services:
  postgres:
    volumes:
      - ../migrations/seed_demo.sql:/docker-entrypoint-initdb.d/030_seed_demo.sql
+23 -4
@@ -9,11 +9,21 @@ services:
build:
context: ..
dockerfile: Dockerfile
# Proxy propagation (M-4, Issue #9) — forwards host shell's proxy env
# vars into the Docker build so the Node frontend stage and Go module
# download can reach the public registries behind corporate proxies.
# Defaults to empty; omit the variables from the host environment for
# un-proxied builds and the behaviour is byte-identical to the pre-fix
# tree.
args:
HTTP_PROXY: ${HTTP_PROXY:-}
HTTPS_PROXY: ${HTTPS_PROXY:-}
NO_PROXY: ${NO_PROXY:-}
environment:
# Verbose logging for development
LOG_LEVEL: debug
SERVER_HOST: 0.0.0.0
SERVER_PORT: 8443
CERTCTL_LOG_LEVEL: debug
CERTCTL_SERVER_HOST: 0.0.0.0
CERTCTL_SERVER_PORT: "8443"
volumes:
# Mount local source for hot reload (requires air or similar)
# Uncomment if using air or similar for hot reload:
@@ -29,8 +39,17 @@ services:
build:
context: ..
dockerfile: Dockerfile.agent
# Proxy propagation (M-4, Issue #9) — forwards host shell's proxy env
# vars into the Docker build so the Go module download stage can reach
# the public Go module proxy behind corporate proxies. Defaults to
# empty; omit the variables from the host environment for un-proxied
# builds and the behaviour is byte-identical to the pre-fix tree.
args:
HTTP_PROXY: ${HTTP_PROXY:-}
HTTPS_PROXY: ${HTTPS_PROXY:-}
NO_PROXY: ${NO_PROXY:-}
environment:
LOG_LEVEL: debug
CERTCTL_LOG_LEVEL: debug
# PgAdmin for database exploration
pgadmin:
+26 -2
@@ -45,8 +45,10 @@ services:
- ../migrations/000006_discovery.up.sql:/docker-entrypoint-initdb.d/006_discovery.sql
- ../migrations/000007_network_discovery.up.sql:/docker-entrypoint-initdb.d/007_network_discovery.sql
- ../migrations/000008_verification.up.sql:/docker-entrypoint-initdb.d/008_verification.sql
- ../migrations/seed.sql:/docker-entrypoint-initdb.d/010_seed.sql
- ../migrations/seed_test.sql:/docker-entrypoint-initdb.d/015_seed_test.sql
- ../migrations/000009_issuer_config.up.sql:/docker-entrypoint-initdb.d/009_issuer_config.sql
- ../migrations/000010_target_config.up.sql:/docker-entrypoint-initdb.d/010_target_config.sql
- ../migrations/seed.sql:/docker-entrypoint-initdb.d/020_seed.sql
- ../migrations/seed_test.sql:/docker-entrypoint-initdb.d/025_seed_test.sql
# No seed_demo.sql — start with a clean database for real testing
networks:
certctl-test:
@@ -148,6 +150,16 @@ services:
build:
context: ..
dockerfile: Dockerfile
# Proxy propagation (M-4, Issue #9) — forwards host shell's proxy env
# vars into the Docker build so the Node frontend stage and Go module
# download can reach the public registries behind corporate proxies.
# Defaults to empty; omit the variables from the host environment for
# un-proxied builds and the behaviour is byte-identical to the pre-fix
# tree.
args:
HTTP_PROXY: ${HTTP_PROXY:-}
HTTPS_PROXY: ${HTTPS_PROXY:-}
NO_PROXY: ${NO_PROXY:-}
container_name: certctl-test-server
depends_on:
postgres:
@@ -196,6 +208,9 @@ services:
CERTCTL_EST_ENABLED: "true"
CERTCTL_EST_ISSUER_ID: iss-local
# Dynamic issuer/target config encryption (M34/M35)
CERTCTL_CONFIG_ENCRYPTION_KEY: test-encryption-key-32chars!!
# Network scanning
CERTCTL_NETWORK_SCAN_ENABLED: "true"
@@ -261,6 +276,15 @@ services:
build:
context: ..
dockerfile: Dockerfile.agent
# Proxy propagation (M-4, Issue #9) — forwards host shell's proxy env
# vars into the Docker build so the Go module download stage can reach
# the public Go module proxy behind corporate proxies. Defaults to
# empty; omit the variables from the host environment for un-proxied
# builds and the behaviour is byte-identical to the pre-fix tree.
args:
HTTP_PROXY: ${HTTP_PROXY:-}
HTTPS_PROXY: ${HTTPS_PROXY:-}
NO_PROXY: ${NO_PROXY:-}
container_name: certctl-test-agent
depends_on:
certctl-server:
+24 -2
@@ -19,8 +19,9 @@ services:
- ../migrations/000006_discovery.up.sql:/docker-entrypoint-initdb.d/006_discovery.sql
- ../migrations/000007_network_discovery.up.sql:/docker-entrypoint-initdb.d/007_network_discovery.sql
- ../migrations/000008_verification.up.sql:/docker-entrypoint-initdb.d/008_verification.sql
- ../migrations/seed.sql:/docker-entrypoint-initdb.d/010_seed.sql
- ../migrations/seed_demo.sql:/docker-entrypoint-initdb.d/011_seed_demo.sql
- ../migrations/000009_issuer_config.up.sql:/docker-entrypoint-initdb.d/009_issuer_config.sql
- ../migrations/000010_target_config.up.sql:/docker-entrypoint-initdb.d/010_target_config.sql
- ../migrations/seed.sql:/docker-entrypoint-initdb.d/020_seed.sql
networks:
- certctl-network
healthcheck:
@@ -35,6 +36,16 @@ services:
build:
context: ..
dockerfile: Dockerfile
# Proxy propagation (M-4, Issue #9) — forwards host shell's proxy env
# vars into the Docker build so the Node frontend stage and Go module
# download can reach the public registries behind corporate proxies.
# Defaults to empty; omit the variables from the host environment for
# un-proxied builds and the behaviour is byte-identical to the pre-fix
# tree.
args:
HTTP_PROXY: ${HTTP_PROXY:-}
HTTPS_PROXY: ${HTTPS_PROXY:-}
NO_PROXY: ${NO_PROXY:-}
container_name: certctl-server
depends_on:
postgres:
@@ -47,6 +58,7 @@ services:
CERTCTL_AUTH_TYPE: none
CERTCTL_KEYGEN_MODE: server # Demo uses server-side keygen; production should use "agent"
CERTCTL_NETWORK_SCAN_ENABLED: "true" # Enable network scan GUI with seeded demo targets
CERTCTL_CONFIG_ENCRYPTION_KEY: ${CERTCTL_CONFIG_ENCRYPTION_KEY:-change-me-32-char-encryption-key} # AES-256-GCM for dynamic issuer/target config
ports:
- "8443:8443"
networks:
@@ -73,6 +85,15 @@ services:
build:
context: ..
dockerfile: Dockerfile.agent
# Proxy propagation (M-4, Issue #9) — forwards host shell's proxy env
# vars into the Docker build so the Go module download stage can reach
# the public Go module proxy behind corporate proxies. Defaults to
# empty; omit the variables from the host environment for un-proxied
# builds and the behaviour is byte-identical to the pre-fix tree.
args:
HTTP_PROXY: ${HTTP_PROXY:-}
HTTPS_PROXY: ${HTTPS_PROXY:-}
NO_PROXY: ${NO_PROXY:-}
container_name: certctl-agent
depends_on:
certctl-server:
@@ -82,6 +103,7 @@ services:
CERTCTL_API_KEY: ${CERTCTL_API_KEY:-change-me-in-production}
CERTCTL_AGENT_NAME: docker-agent
CERTCTL_LOG_LEVEL: info
CERTCTL_DISCOVERY_DIRS: /var/lib/certctl/keys # Agent scans this directory for existing certificates
volumes:
- agent_keys:/var/lib/certctl/keys
networks:
+1 -1
@@ -458,4 +458,4 @@ For issues, questions, or contributions:
## License
BSL-1.1 (Business Source License)
Converts to Apache 2.0 on March 28, 2033
Converts to Apache 2.0 on March 14, 2033
@@ -18,7 +18,14 @@ metadata:
name: {{ include "certctl.fullname" . }}
labels:
{{- include "certctl.labels" . | nindent 4 }}
rules: []
rules:
{{- if .Values.kubernetesSecrets.enabled }}
- apiGroups: [""]
resources: ["secrets"]
verbs: ["get", "list", "create", "update", "patch"]
{{- else }}
[]
{{- end }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
+7
@@ -381,6 +381,13 @@ serviceAccount:
rbac:
create: true
# ==============================================================================
# Kubernetes Secrets Target Connector
# ==============================================================================
kubernetesSecrets:
# Enable RBAC rules for managing TLS Secrets
enabled: false
# ==============================================================================
# Pod Disruption Budget (for HA deployments)
# ==============================================================================
+37 -19
@@ -195,16 +195,11 @@ type metricsResponse struct {
Uptime float64 `json:"uptime_seconds"`
}
// crlResponse for the CRL endpoint.
type crlResponse struct {
Version int `json:"version"`
Total int `json:"total"`
Entries []struct {
Serial string `json:"serial_number"`
Reason string `json:"reason"`
RevokedAt string `json:"revoked_at"`
} `json:"entries"`
}
// M-006: The non-standard JSON CRL endpoint (`GET /api/v1/crl`) was removed.
// RFC 5280 §5 defines only the DER wire format, which is now served
// unauthenticated at `/.well-known/pki/crl/{issuer_id}` per RFC 8615.
// The `crlResponse` Go struct that used to decode the JSON envelope is gone;
// Phase 7 parses the DER bytes directly via `x509.ParseRevocationList`.
// ---------------------------------------------------------------------------
// PostgreSQL test helper
@@ -728,18 +723,41 @@ func TestIntegrationSuite(t *testing.T) {
t.Fatalf("revocation response unexpected: %s", body)
}
// Check CRL
t.Run("CRL", func(t *testing.T) {
resp, err := c.Get("/api/v1/crl")
// Check DER CRL served unauthenticated under /.well-known/pki/ per
// RFC 5280 §5 + RFC 8615 (M-006). Use a plain http.Get — no Bearer
// token — to prove the endpoint is reachable by relying parties that
// have no certctl API credentials.
t.Run("CRL_DER_Unauthenticated", func(t *testing.T) {
resp, err := http.Get(serverURL + "/.well-known/pki/crl/iss-local")
if err != nil {
t.Fatalf("GET CRL: %v", err)
t.Fatalf("GET DER CRL: %v", err)
}
var crl crlResponse
if err := decodeJSON(resp, &crl); err != nil {
t.Fatalf("decode CRL: %v", err)
defer resp.Body.Close()
if resp.StatusCode != http.StatusOK {
body, _ := io.ReadAll(resp.Body)
t.Fatalf("unexpected status: got %d, want 200 (body=%s)", resp.StatusCode, string(body))
}
if crl.Total < 1 {
t.Fatalf("CRL total: got %d, want >= 1", crl.Total)
if ct := resp.Header.Get("Content-Type"); ct != "application/pkix-crl" {
t.Errorf("Content-Type: got %q, want %q", ct, "application/pkix-crl")
}
body, err := io.ReadAll(resp.Body)
if err != nil {
t.Fatalf("read CRL body: %v", err)
}
if len(body) == 0 {
t.Fatal("CRL body empty")
}
// Parse the DER bytes as an X.509 CRL (RFC 5280) and verify the
// just-revoked certificate is listed.
crl, err := x509.ParseRevocationList(body)
if err != nil {
t.Fatalf("parse DER CRL: %v", err)
}
if len(crl.RevokedCertificateEntries) < 1 {
t.Fatalf("CRL entries: got %d, want >= 1", len(crl.RevokedCertificateEntries))
}
})
File diff suppressed because it is too large
+15 -6
@@ -608,13 +608,22 @@ else
fail "Revocation failed" "$REVOKE_RESP"
fi
info "Checking CRL..."
CRL_RESP=$(api_get "/api/v1/crl" 2>/dev/null || echo '{"total":0}')
CRL_TOTAL=$(echo "$CRL_RESP" | python3 -c "import sys,json; print(json.load(sys.stdin).get('total',0))" 2>/dev/null || echo 0)
if [ "$CRL_TOTAL" -ge 1 ]; then
pass "CRL contains $CRL_TOTAL revoked certificate(s)"
info "Checking DER CRL under /.well-known/pki (RFC 5280 §5, RFC 8615)..."
# The JSON CRL endpoint (`GET /api/v1/crl`) was removed in M-006. RFC 5280
# defines only the DER wire format, now served unauthenticated at
# `/.well-known/pki/crl/{issuer_id}`. Fetch without the Bearer header to
# prove the endpoint is reachable by relying parties with no API key.
CRL_TMP=$(mktemp)
CRL_HEADERS=$(mktemp)
CRL_HTTP_CODE=$(curl -s -o "$CRL_TMP" -D "$CRL_HEADERS" -w "%{http_code}" "${API_URL}/.well-known/pki/crl/iss-local" 2>/dev/null || echo "000")
CRL_SIZE=$(wc -c < "$CRL_TMP" | tr -d ' ')
CRL_CONTENT_TYPE=$(awk 'tolower($1)=="content-type:" { sub(/\r$/,"",$2); print tolower($2) }' "$CRL_HEADERS" | head -n1)
rm -f "$CRL_TMP" "$CRL_HEADERS"
if [ "$CRL_HTTP_CODE" = "200" ] && [ "$CRL_CONTENT_TYPE" = "application/pkix-crl" ] && [ "$CRL_SIZE" -gt 0 ]; then
pass "DER CRL served unauthenticated (HTTP 200, Content-Type application/pkix-crl, ${CRL_SIZE} bytes)"
else
fail "CRL empty after revocation"
fail "DER CRL fetch failed: HTTP=$CRL_HTTP_CODE Content-Type=$CRL_CONTENT_TYPE size=$CRL_SIZE"
fi
CERT_STATUS=$(api_get "/api/v1/certificates/mc-local-test" | python3 -c "import sys,json; print(json.load(sys.stdin).get('status',''))" 2>/dev/null || echo "unknown")
+184 -35
@@ -82,6 +82,12 @@ flowchart TB
CA4["OpenSSL / Custom CA\n(script-based)"]
CA6["Vault PKI\n(token auth, /sign API)"]
CA7["DigiCert CertCentral\n(async order model)"]
CA8["Sectigo SCM\n(async order model)"]
CA9["Google CAS\n(OAuth2, sync)"]
CA10["AWS ACM PCA\n(sync issuance)"]
CA11["Entrust\n(mTLS, sync/async)"]
CA12["GlobalSign Atlas\n(mTLS + API key)"]
CA13["EJBCA\n(mTLS or OAuth2)"]
end
subgraph "Target Systems"
@@ -90,8 +96,14 @@ flowchart TB
T5["HAProxy\n(combined PEM + reload)"]
T6["Traefik\n(file provider)"]
T7["Caddy\n(admin API / file)"]
T2["F5 BIG-IP\n(proxy agent + iControl REST, planned)"]
T3["IIS\n(agent-local PowerShell, planned)"]
T8["Envoy\n(file-based SDS)"]
T9["Postfix/Dovecot\n(file + service reload)"]
T2["F5 BIG-IP\n(proxy agent + iControl REST)"]
T3["IIS\n(WinRM + local)"]
T10["SSH\n(SFTP + reload)"]
T11["WinCertStore\n(PowerShell import)"]
T12["Java Keystore\n(keytool pipeline)"]
T13["Kubernetes Secrets\n(K8s API)"]
end
DASH --> API
@@ -99,7 +111,7 @@ flowchart TB
SVC --> REPO
REPO --> PG
SCHED --> SVC
SVC -->|"Issue/Renew"| CA1 & CA2 & CA3 & CA4 & CA6 & CA7
SVC -->|"Issue/Renew"| CA1 & CA2 & CA3 & CA4 & CA6 & CA7 & CA8 & CA9 & CA10
A1 & A2 & A3 -->|"CSR + Heartbeat"| API
API -->|"Cert + Chain\n(NO private key)"| A1 & A2 & A3
@@ -119,7 +131,7 @@ The server exposes a REST API under `/api/v1/` and optionally serves the web das
### Agents
Lightweight Go processes that run on or near your infrastructure. Agents generate ECDSA P-256 private keys locally, create CSRs, and submit them to the control plane for signing — private keys never leave agent infrastructure. Agents also handle certificate deployment to target systems (NGINX, Apache httpd, HAProxy fully implemented; F5 BIG-IP, IIS interface only with V2 implementations planned) and report job status. They communicate with the control plane via HTTP and authenticate with API keys.
Lightweight Go processes that run on or near your infrastructure. Agents generate ECDSA P-256 private keys locally, create CSRs, and submit them to the control plane for signing — private keys never leave agent infrastructure. Agents also handle certificate deployment to target systems (NGINX, Apache httpd, HAProxy, Traefik, Caddy, Envoy, Postfix, Dovecot, IIS, F5 BIG-IP, SSH, Windows Certificate Store, Java Keystore, Kubernetes Secrets) and report job status. They communicate with the control plane via HTTP and authenticate with API keys.
The agent runs two background loops: a heartbeat (every 60 seconds) to signal it's alive, and a work poll (every 30 seconds) to check for actionable jobs via `GET /api/v1/agents/{id}/work`. Jobs may be `AwaitingCSR` (agent needs to generate key + submit CSR) or `Deployment` (agent needs to deploy a certificate). Private keys are stored in `CERTCTL_KEY_DIR` (default `/var/lib/certctl/keys`) with 0600 permissions.
@@ -127,11 +139,21 @@ The agent runs two background loops: a heartbeat (every 60 seconds) to signal it
**Agent groups (M11b):** Dynamic device grouping allows organizing agents by metadata criteria. Agent groups can match by OS, architecture, IP CIDR, and version. Groups support both dynamic matching (agents automatically join when criteria match) and manual membership (explicit include/exclude). Renewal policies can be scoped to agent groups via the `agent_group_id` foreign key. The GUI provides full CRUD management for agent groups with visual match criteria badges.
**Agent soft-retirement (I-004):** `DELETE /api/v1/agents/{id}` is a soft-delete surface — the row is never removed. Retirement stamps `agents.retired_at` (TIMESTAMPTZ) and `agents.retired_reason` (TEXT) and flips the operational status to `Offline`. Default listings (`GET /api/v1/agents`, the dashboard stats counter, and the stale-offline sweeper) filter retired rows out via `AgentRepository.ListActive`; retired rows are surfaced only through the opt-in `GET /api/v1/agents/retired` view. The endpoint follows a preflight → block → escape-hatch contract:
- **Clean retire** (no active dependencies) — `200 OK` with `RetireAgentResponse` (`cascade=false`, zero counts).
- **Blocked by active dependencies** → `409 Conflict` with `BlockedByDependenciesResponse`. The three counts (`active_targets`, `active_certificates`, `pending_jobs`) tell the operator exactly which rows would be orphaned. The schema diverges from `ErrorResponse` because downstream dashboards parse the stable three-key shape.
- **Force cascade** → `DELETE /api/v1/agents/{id}?force=true&reason=...`. `reason` is required (400 otherwise). Transactionally soft-retires downstream `deployment_targets`, cancels pending jobs, and soft-retires the agent, emitting an `agent_retirement_cascaded` audit event with actor + reason + per-bucket counts.
- **Idempotent re-retire** — a retire attempt against an already-retired agent returns `204 No Content` with an empty body (no second audit event, no response shape; callers that repeat the `DELETE` on a retry get a clean no-op).
- **Sentinel refusal** — the four sentinel agent IDs (`server-scanner`, `cloud-aws-sm`, `cloud-azure-kv`, `cloud-gcp-sm`) back non-agent discovery subsystems (the network scanner and the three cloud secret-manager sources). They are refused unconditionally — even with `force=true` — via `ErrAgentIsSentinel` → `403 Forbidden`. The ID list lives in `internal/domain/connector.go` (`SentinelAgentIDs`) so handler, repository, and scheduler code can filter them without importing `service`.
Retired agents receive `410 Gone` on subsequent heartbeats (`service.ErrAgentRetired`). `cmd/agent` treats 410 as a terminal signal and exits cleanly so retired agents stop phoning home. Migration `000015` flipped `deployment_targets.agent_id` from `ON DELETE CASCADE` to `ON DELETE RESTRICT`, making the old hard-delete path a schema error and forcing all retirement through this contract.
### Web Dashboard
The web dashboard is the primary operational interface for certctl. It is built with Vite + React + TypeScript and uses TanStack Query for server state management (caching, background refetching, optimistic updates).
**Current views** (21 pages): certificate inventory (list with multi-select bulk operations + "New Certificate" creation modal + detail with deployment status timeline, inline policy/profile editor, version history, deploy, revoke, archive, and trigger renewal actions), agent fleet (list + detail with system info + OS/architecture grouping with charts), job queue (status, retry, cancel, approve/reject for AwaitingApproval jobs), notification inbox (threshold alert grouping, mark-as-read), audit trail (time range, actor, action filters + CSV/JSON export), policy management (rules with enable/disable toggle + delete + violations), issuers (list with test connection + delete), targets (list with 3-step configuration wizard + delete), owners (list with team resolution + delete), teams (list with delete), agent groups (list with dynamic match criteria badges + enable/disable + delete), certificate profiles (list with crypto constraints), short-lived credentials dashboard (TTL countdown, profile filtering, auto-refresh), discovered certificates triage (claim/dismiss unmanaged certs discovered by agents or network scans), network scan targets management (CRUD for network scan targets + Scan Now button), summary dashboard with charts (expiration heatmap, renewal success rate, status distribution, issuance rate), and login page.
**Current views** (24 pages): certificate inventory (list with multi-select bulk operations + "New Certificate" creation modal + detail with deployment status timeline, inline policy/profile editor, version history, deploy, revoke, archive, and trigger renewal actions), agent fleet (list + detail with system info + OS/architecture grouping with charts), job queue (list + detail with verification section, timeline, audit events; approve/reject for AwaitingApproval jobs), notification inbox (threshold alert grouping, mark-as-read), audit trail (time range, actor, action filters + CSV/JSON export), policy management (rules with enable/disable toggle + delete + violations), issuers (catalog with 10 type cards + 3-step create wizard + detail with test connection), targets (list with 3-step configuration wizard + detail with deployment history), owners (list with team resolution + delete), teams (list with delete), agent groups (list with dynamic match criteria badges + enable/disable + delete), certificate profiles (list with crypto constraints), short-lived credentials dashboard (TTL countdown, profile filtering, auto-refresh), discovered certificates triage (claim/dismiss unmanaged certs discovered by agents or network scans), network scan targets management (CRUD + Scan Now button), summary dashboard with charts (expiration heatmap, renewal success rate, status distribution, issuance rate), digest preview and send, observability (health, metrics, Prometheus config), and login page.
The dashboard includes an **ErrorBoundary component** for graceful error recovery — if a view crashes, the boundary catches the error and displays a user-friendly message instead of breaking the entire dashboard. It also includes a **demo mode** that activates when the API is unreachable — it renders realistic mock data for screenshots and offline presentations.
@@ -384,7 +406,11 @@ sequenceDiagram
Note over A: Agent deploys using locally-held private key
```
**Profile enforcement:** If the certificate is assigned to a profile (`certificate_profile_id`), the profile's `allowed_key_algorithms` and `max_validity_days` constraints are checked during CSR validation. A CSR with a disallowed key type or a validity period exceeding the profile maximum is rejected before reaching the issuer connector.
**Profile enforcement (M11c):** Crypto policy enforcement is wired into all four issuance paths: renewal (server-side and agent CSR), agent fallback CSR signing, EST enrollment (RFC 7030), and SCEP enrollment (RFC 8894). At each path, the service layer resolves the certificate's profile and calls `ValidateCSRAgainstProfile()` to check the CSR key algorithm and minimum key size against the profile's `allowed_key_algorithms` rules. A CSR with a disallowed key type or insufficient key size is rejected before reaching the issuer connector.
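A check of this kind can be sketched with the standard library alone. The code below is not the project's `ValidateCSRAgainstProfile()`: `keyRule`, `keyInfo`, `validateCSR`, and `newP256CSR` are hypothetical names, and the rule shape (algorithm name plus minimum size) is an assumption about how `allowed_key_algorithms` might be modelled.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/ed25519"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/rsa"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
)

// keyRule is a hypothetical shape for one allowed_key_algorithms entry.
type keyRule struct {
	Algorithm string
	MinSize   int
}

// keyInfo reports the algorithm name and key size in bits of a public key.
func keyInfo(pub any) (string, int, error) {
	switch k := pub.(type) {
	case *rsa.PublicKey:
		return "RSA", k.N.BitLen(), nil
	case *ecdsa.PublicKey:
		return "ECDSA", k.Curve.Params().BitSize, nil
	case ed25519.PublicKey:
		return "Ed25519", 256, nil
	}
	return "", 0, fmt.Errorf("unsupported key type %T", pub)
}

// validateCSR rejects a DER-encoded CSR whose key matches none of the rules.
func validateCSR(csrDER []byte, rules []keyRule) error {
	csr, err := x509.ParseCertificateRequest(csrDER)
	if err != nil {
		return fmt.Errorf("parse CSR: %w", err)
	}
	if err := csr.CheckSignature(); err != nil {
		return fmt.Errorf("CSR signature: %w", err)
	}
	alg, size, err := keyInfo(csr.PublicKey)
	if err != nil {
		return err
	}
	for _, r := range rules {
		if r.Algorithm == alg && size >= r.MinSize {
			return nil
		}
	}
	return fmt.Errorf("key %s-%d not allowed by profile", alg, size)
}

// newP256CSR builds a throwaway ECDSA P-256 CSR for the example.
func newP256CSR(cn string) ([]byte, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.CertificateRequest{Subject: pkix.Name{CommonName: cn}}
	return x509.CreateCertificateRequest(rand.Reader, tmpl, key)
}

func main() {
	der, err := newP256CSR("example.internal")
	if err != nil {
		panic(err)
	}
	fmt.Println("ECDSA>=256:", validateCSR(der, []keyRule{{"ECDSA", 256}}))
	fmt.Println("RSA>=2048: ", validateCSR(der, []keyRule{{"RSA", 2048}}))
}
```

Running the check before the CSR is handed to an issuer connector matches the ordering the paragraph describes: a disallowed key never reaches the CA.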
**MaxTTL enforcement:** When a profile specifies `max_ttl_seconds`, the value is forwarded through the service-layer `IssuerConnector` interface to the connector layer via `MaxTTLSeconds` on `IssuanceRequest` and `RenewalRequest`. Each issuer connector enforces the cap according to its capabilities: the Local CA caps `NotAfter` directly, Vault overrides its TTL string, step-ca caps `NotAfter` with zero-value handling, and OpenSSL logs an advisory warning (script-based signing can't enforce server-side). For CAs that control validity themselves (ACME, DigiCert, Sectigo, Google CAS, AWS ACM PCA), MaxTTLSeconds passes through but the CA makes the final decision.
**Key metadata persistence:** Certificate versions record `key_algorithm` and `key_size` extracted from the CSR during issuance. This metadata enables post-hoc auditing — operators can verify that all issued certificates comply with the key requirements in effect at the time of issuance.
#### Server-Side Key Generation (Demo Only)
@@ -416,8 +442,8 @@ The agent deploys certificates using target connectors. Each connector knows how
- **NGINX**: Writes cert/chain/key files to disk, validates config with `nginx -t`, reloads with `nginx -s reload` or `systemctl reload nginx`
- **Apache httpd**: Writes separate cert/chain/key files, validates with `apachectl configtest`, graceful reload
- **HAProxy**: Builds a combined PEM file (cert + chain + key), optionally validates config, reloads via systemctl or signal
- **F5 BIG-IP** (planned): A proxy agent in the same network zone calls the iControl REST API to upload certificate and update SSL profile bindings. The server assigns the work; the proxy agent executes it.
- **IIS** (planned, dual-mode): (1) Agent-local (recommended) — a Windows agent on the IIS box runs PowerShell `Import-PfxCertificate` + `Set-WebBinding` directly. (2) Proxy agent WinRM — for agentless IIS targets, a nearby Windows agent reaches the IIS box via WinRM.
- **F5 BIG-IP**: A proxy agent in the same network zone calls the iControl REST API to upload certificate/key files, install crypto objects, and update the SSL client profile within an atomic transaction. The server assigns the work; the proxy agent executes it.
- **IIS** (implemented, dual-mode): (1) Agent-local (recommended) — a Windows agent on the IIS box runs PowerShell `Import-PfxCertificate` + `Set-WebBinding` directly with PFX conversion and SHA-1 thumbprint computation. (2) Proxy agent WinRM — for agentless IIS targets, a nearby Windows agent reaches the IIS box via WinRM.
The agent handles both the certificate (public) and the private key (read from local key store at `CERTCTL_KEY_DIR`). The control plane never sees the private key and never initiates outbound connections to agents or targets (pull-only model).
@@ -447,10 +473,14 @@ sequenceDiagram
API-->>U: 200 OK
```
The revocation is recorded in the `certificate_revocations` table (separate from the certificate status update) for CRL generation. The DER-encoded CRL at `GET /api/v1/crl/{issuer_id}` is generated on-demand by querying this table and signing with the issuing CA's key. The OCSP responder at `GET /api/v1/ocsp/{issuer_id}/{serial}` checks both the certificate status and the revocations table to return signed good/revoked/unknown responses.
The revocation is recorded in the `certificate_revocations` table (separate from the certificate status update) for CRL generation. The DER-encoded CRL at `GET /.well-known/pki/crl/{issuer_id}` (RFC 5280 §5, RFC 8615) is generated on-demand by querying this table and signing with the issuing CA's key. The OCSP responder at `GET /.well-known/pki/ocsp/{issuer_id}/{serial}` (RFC 6960) checks both the certificate status and the revocations table to return signed good/revoked/unknown responses. Both endpoints are served unauthenticated — relying parties (TLS clients, hardware appliances, browsers) must be able to reach them without a certctl API key — and carry the IANA-registered media types `application/pkix-crl` and `application/ocsp-response` respectively.
Short-lived certificates (those with profile TTL < 1 hour) return "good" from OCSP and are excluded from CRL — their rapid expiry is treated as sufficient revocation.
#### Bulk Revocation
For compliance events requiring fleet-wide revocation (key compromise, CA distrust, mass decommission), certctl supports bulk revocation by filter criteria. The `POST /api/v1/certificates/bulk-revoke` endpoint accepts filter parameters (profile_id, owner_id, agent_id, issuer_id) and creates individual revocation jobs for each matching certificate. Bulk revocation reuses the same 7-step single-cert flow for each certificate — no new issuer notification or audit mechanics. The operation is idempotent: revoking an already-revoked certificate is a no-op. Partial failures are tolerated — if one certificate fails to revoke (e.g., issuer unavailable), the operation continues for remaining certs and returns a summary. A single `bulk_revocation_initiated` audit event logs the operation with filter criteria, operator actor, and summary (total requested, succeeded, failed counts). Audit events for individual certificate revocations record the operator identity separately. The GUI bulk revoke button on the certificates list filters by visible selections and displays an affected-cert count modal before confirmation.
### 4. Automatic Renewal
The control plane runs a scheduler with seven background loops:
@@ -507,10 +537,16 @@ flowchart TB
II["IssuerConnector Interface\nIssueCertificate() | RenewCertificate()\nRevokeCertificate() | GetOrderStatus()"]
II --> LC["Local CA"]
II --> ACME["ACME v2"]
II --> SC["step-ca"]
II --> SCA["step-ca"]
II --> OC["OpenSSL / Custom CA"]
II --> VP["Vault PKI"]
II --> DC["DigiCert CertCentral"]
II --> SG["Sectigo SCM"]
II --> GC["Google CAS"]
II --> AP2["AWS ACM PCA"]
II --> EN["Entrust"]
II --> GS["GlobalSign Atlas"]
II --> EJ["EJBCA"]
end
subgraph "Target Connectors"
@@ -521,8 +557,14 @@ flowchart TB
TI --> HP["HAProxy"]
TI --> TF["Traefik"]
TI --> CD["Caddy"]
TI --> F5["F5 BIG-IP (interface only)"]
TI --> IIS["IIS (interface only)"]
TI --> EV["Envoy"]
TI --> PO["Postfix/Dovecot"]
TI --> IIS["IIS"]
TI --> F5["F5 BIG-IP"]
TI --> SSH["SSH"]
TI --> WCS["WinCertStore"]
TI --> JKS["Java Keystore"]
TI --> K8S["K8s Secrets"]
end
subgraph "Notifier Connectors"
@@ -574,9 +616,9 @@ type Connector interface {
}
```
Built-in issuers: **Local CA** (self-signed or sub-CA mode using `crypto/x509`), **ACME v2** (HTTP-01, DNS-01, and DNS-PERSIST-01 challenges, compatible with Let's Encrypt, ZeroSSL, Sectigo, Google Trust Services, and any ACME-compliant CA), **step-ca** (Smallstep private CA via native /sign API with JWK provisioner auth), **OpenSSL/Custom CA** (script-based signing delegating to user-provided shell scripts), **Vault PKI** (HashiCorp Vault's PKI secrets engine via /sign API with token auth), and **DigiCert** (commercial CA via CertCentral REST API with async order processing). The ACME connector uses `golang.org/x/crypto/acme`, generates an ECDSA P-256 account key, handles account registration with ToS acceptance and optional External Account Binding (EAB) for CAs that require it (ZeroSSL, Google Trust Services, SSL.com), order creation, challenge solving (HTTP-01 via built-in server, DNS-01 via script-based hooks, DNS-PERSIST-01 via standing TXT records with auto-fallback to DNS-01), order finalization, and DER-to-PEM chain conversion. For ZeroSSL, EAB credentials are auto-fetched from ZeroSSL's public API when the directory URL is detected as ZeroSSL and no EAB credentials are provided — zero-friction onboarding with no dashboard visit required.
Built-in issuers (9 connectors): **Local CA** (self-signed or sub-CA mode using `crypto/x509`), **ACME v2** (HTTP-01, DNS-01, and DNS-PERSIST-01 challenges, compatible with Let's Encrypt, ZeroSSL, Sectigo, Google Trust Services, and any ACME-compliant CA), **step-ca** (Smallstep private CA via native /sign API with JWK provisioner auth), **OpenSSL/Custom CA** (script-based signing delegating to user-provided shell scripts), **Vault PKI** (HashiCorp Vault's PKI secrets engine via /sign API with token auth), **DigiCert** (commercial CA via CertCentral REST API with async order processing), **Sectigo SCM** (async order model with 3-header auth), **Google CAS** (Cloud Certificate Authority Service with OAuth2 service account auth), and **AWS ACM Private CA** (synchronous issuance via ACM PCA API). The ACME connector uses `golang.org/x/crypto/acme`, generates an ECDSA P-256 account key, handles account registration with ToS acceptance and optional External Account Binding (EAB) for CAs that require it (ZeroSSL, Google Trust Services, SSL.com), order creation, challenge solving (HTTP-01 via built-in server, DNS-01 via script-based hooks, DNS-PERSIST-01 via standing TXT records with auto-fallback to DNS-01), order finalization, and DER-to-PEM chain conversion. For ZeroSSL, EAB credentials are auto-fetched from ZeroSSL's public API when the directory URL is detected as ZeroSSL and no EAB credentials are provided — zero-friction onboarding with no dashboard visit required.
**ACME Renewal Information (ARI, RFC 9702):** The ACME connector supports CA-directed renewal timing via the `GetRenewalInfo()` method. Instead of using fixed thresholds (e.g., renew 30 days before expiry), the CA tells certctl when to renew by providing a `suggestedWindow` with start and end times. This is useful for distributing renewal load during maintenance windows and coordinating mass-revocation scenarios. Enable with `CERTCTL_ACME_ARI_ENABLED=true`. Cert ID is computed as `base64url(SHA-256(DER cert))` per RFC 9702. If the CA doesn't support ARI (404 from the ARI endpoint), certctl automatically falls back to threshold-based renewal — no operator intervention required. Errors from the CA are logged as warnings.
**ACME Renewal Information (ARI, RFC 9773):** The ACME connector supports CA-directed renewal timing via the `GetRenewalInfo()` method. Instead of using fixed thresholds (e.g., renew 30 days before expiry), the CA tells certctl when to renew by providing a `suggestedWindow` with start and end times. This is useful for distributing renewal load during maintenance windows and coordinating mass-revocation scenarios. Enable with `CERTCTL_ACME_ARI_ENABLED=true`. Cert ID is computed as `base64url(SHA-256(DER cert))` per RFC 9773. If the CA doesn't support ARI (404 from the ARI endpoint), certctl automatically falls back to threshold-based renewal — no operator intervention required. Errors from the CA are logged as warnings.
The interface also includes `GetCACertPEM(ctx)` for CA chain distribution (used by the EST server's `/cacerts` endpoint).
@@ -594,11 +636,11 @@ type Connector interface {
The `DeploymentRequest` struct carries the full material needed by the target system: the signed certificate, the CA chain, the agent-generated private key, target-specific configuration, and arbitrary metadata. The key field is populated by the agent from its local key store (`CERTCTL_KEY_DIR`) — it never originates from the control plane.
Built-in targets: **NGINX** (writes cert/chain/key files, validates with `nginx -t`, reloads), **Apache httpd** (writes cert/chain/key files, validates with `apachectl configtest`, graceful reload), **HAProxy** (combined PEM file with cert+chain+key, validates config, reloads via systemctl/signal), **Traefik** (file provider — writes cert/key to watched directory, Traefik auto-reloads), **Caddy** (dual-mode: admin API hot-reload or file-based), **F5 BIG-IP** (interface only — proxy agent + iControl REST, implementation planned), **IIS** (interface only — dual-mode: agent-local PowerShell primary + proxy agent WinRM for agentless targets, implementation planned).
Built-in targets (14 connector types): **NGINX** (writes cert/chain/key files, validates with `nginx -t`, reloads), **Apache httpd** (writes cert/chain/key files, validates with `apachectl configtest`, graceful reload), **HAProxy** (combined PEM file with cert+chain+key, validates config, reloads via systemctl/signal), **Traefik** (file provider — writes cert/key to watched directory, Traefik auto-reloads), **Caddy** (dual-mode: admin API hot-reload or file-based), **Envoy** (file-based with optional SDS JSON config), **F5 BIG-IP** (proxy agent + iControl REST, transaction-based atomic SSL profile updates), **IIS** (dual-mode: agent-local PowerShell + proxy agent WinRM for agentless targets), **Postfix/Dovecot** (file write + service reload), **SSH** (agentless deployment via SSH/SFTP), **Windows Certificate Store** (PowerShell-based cert import, dual-mode local/WinRM), **Java Keystore** (PEM → PKCS#12 → keytool pipeline, JKS and PKCS12 formats), **Kubernetes Secrets** (deploys as `kubernetes.io/tls` Secrets via injectable K8sClient interface, in-cluster or kubeconfig auth).
After deployment, agents can perform **post-deployment TLS verification**: the agent probes the live TLS endpoint using `crypto/tls.DialWithDialer` and compares the SHA-256 fingerprint of the served certificate against what was deployed. Results are reported via `POST /api/v1/jobs/{id}/verify` and stored on the job record. Verification is best-effort: failures don't block or roll back deployments.
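The probe-and-compare step can be sketched roughly as follows. This is a minimal illustration, not the agent's actual code: the function names `fingerprintSHA256` and `verifyDeployed`, the address, and the placeholder fingerprint are all hypothetical. Chain validation is skipped on the assumption that only the fingerprint comparison matters for this check.

```go
package main

import (
	"crypto/sha256"
	"crypto/tls"
	"encoding/hex"
	"fmt"
	"net"
	"strings"
	"time"
)

// fingerprintSHA256 returns the lowercase hex SHA-256 digest of a
// DER-encoded certificate.
func fingerprintSHA256(der []byte) string {
	sum := sha256.Sum256(der)
	return hex.EncodeToString(sum[:])
}

// verifyDeployed (hypothetical helper) dials addr, reads the leaf certificate
// the server presents, and reports whether its fingerprint matches expected.
func verifyDeployed(addr, expected string, timeout time.Duration) (bool, string, error) {
	dialer := &net.Dialer{Timeout: timeout}
	// Assumption: chain validation is skipped because the check only
	// compares fingerprints; it does not decide whether to trust the peer.
	conn, err := tls.DialWithDialer(dialer, "tcp", addr, &tls.Config{InsecureSkipVerify: true})
	if err != nil {
		return false, "", err
	}
	defer conn.Close()

	certs := conn.ConnectionState().PeerCertificates
	if len(certs) == 0 {
		return false, "", fmt.Errorf("no certificate presented by %s", addr)
	}
	got := fingerprintSHA256(certs[0].Raw)
	return strings.EqualFold(got, expected), got, nil
}

func main() {
	// Illustrative address and fingerprint only; a dial failure is simply printed.
	ok, got, err := verifyDeployed("127.0.0.1:8443", "00", 2*time.Second)
	fmt.Println(ok, got, err)
}
```

Because verification is best-effort, a caller would typically record the `(ok, got, err)` triple on the job and move on rather than fail the deployment.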
Additional cloud, network, and Kubernetes target connectors are planned for future releases.
The SSH connector enables agentless deployment to any Linux/Unix server via SSH/SFTP, using the proxy agent pattern. The Kubernetes Secrets connector deploys certificates as `kubernetes.io/tls` Secrets via an injectable K8sClient interface supporting both in-cluster and out-of-cluster auth.
### Notifier Connector
@@ -651,10 +693,50 @@ type ESTService interface {
}
```
**Issuer connector extension:** EST required adding `GetCACertPEM(ctx) (string, error)` to the issuer connector interface so the `/cacerts` endpoint can serve the CA chain. The Local CA connector returns its CA certificate PEM; ACME, step-ca, OpenSSL, Vault, and DigiCert connectors return errors (they don't expose a static CA chain — their chains are per-issuance).
**Issuer connector extension:** EST required adding `GetCACertPEM(ctx) (string, error)` to the issuer connector interface so the `/cacerts` endpoint can serve the CA chain. The Local CA returns its CA certificate PEM; Vault PKI fetches via `GET /v1/{mount}/ca/pem`; Google CAS fetches via API; AWS ACM PCA retrieves via `GetCertificateAuthorityCertificate`. ACME, step-ca, OpenSSL, DigiCert, and Sectigo connectors return errors (they don't expose a static CA chain — their chains are per-issuance).
**Audit:** Every EST enrollment is recorded in the audit trail with `protocol: "EST"`, the CN, SANs, issuer ID, serial number, and optional profile ID.
### SCEP Server (RFC 8894)
The SCEP (Simple Certificate Enrollment Protocol) server provides certificate enrollment for MDM platforms and network devices. It runs at `/scep` with operation-based dispatch via query parameters per RFC 8894.
**Architecture:** SCEP follows the exact same layering as EST — a handler-level protocol that delegates certificate issuance to an existing `IssuerConnector`. The `SCEPService` bridges the `SCEPHandler` to whichever issuer connector is configured via `CERTCTL_SCEP_ISSUER_ID`.
```
Client (MDM, network device, SCEP client)
SCEPHandler (handler layer)
│ PKCS#7 envelope parsing, CSR extraction, challenge password extraction
SCEPService (service layer)
│ Challenge password validation, CSR validation, CN/SAN extraction, audit recording
IssuerConnector (connector layer via IssuerConnectorAdapter)
│ Certificate signing (Local CA, step-ca, etc.)
Signed certificate returned as PKCS#7 certs-only
```
**Wire format:** SCEP clients wrap CSRs in PKCS#7 SignedData envelopes. The handler parses the outer ASN.1 ContentInfo → SignedData → EncapsulatedContentInfo to extract the CSR bytes. Fallback paths handle base64-encoded PKCS#7 and raw CSR submissions (for simpler clients). Responses use PKCS#7 certs-only via the shared `internal/pkcs7` package (same as EST). Single certs are returned as raw DER for `GetCACert`, chains as PKCS#7.
**Authentication:** SCEP uses challenge passwords embedded in CSR attributes (OID 1.2.840.113549.1.9.7) rather than TLS client certificates. The server validates the challenge password against `CERTCTL_SCEP_CHALLENGE_PASSWORD`. When no challenge password is configured, any value is accepted.
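The acceptance rule above can be sketched as a small predicate. The function name `challengeOK` is hypothetical, and the constant-time comparison is an assumption about good practice, not something the text above confirms about the server.

```go
package main

import (
	"crypto/subtle"
	"fmt"
)

// challengeOK (hypothetical) mirrors the described rule: with no configured
// password every presented value is accepted; otherwise the values must match.
func challengeOK(configured, presented string) bool {
	if configured == "" {
		return true
	}
	// Assumption: compare in constant time so response timing does not
	// leak how much of the secret was guessed correctly.
	return subtle.ConstantTimeCompare([]byte(configured), []byte(presented)) == 1
}

func main() {
	fmt.Println(challengeOK("", "anything"))
	fmt.Println(challengeOK("s3cret", "s3cret"))
	fmt.Println(challengeOK("s3cret", "guess"))
}
```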
**Interface:** The `SCEPHandler` defines an `SCEPService` interface (dependency inversion):
```go
type SCEPService interface {
	GetCACaps(ctx context.Context) string
	GetCACert(ctx context.Context) (string, error)
	PKCSReq(ctx context.Context, csrPEM string, challengePassword string, transactionID string) (*domain.SCEPEnrollResult, error)
}
```
**Shared PKCS#7 package:** Both EST and SCEP handlers share a common `internal/pkcs7` package for building PKCS#7 certs-only responses and PEM-to-DER chain conversion, eliminating code duplication between the two enrollment protocols.
**Audit:** Every SCEP enrollment is recorded in the audit trail with `protocol: "SCEP"`, the CN, SANs, issuer ID, serial number, transaction ID, and optional profile ID.
## Security Model
### Private Key Management
@@ -736,6 +818,34 @@ All shell-facing inputs (connector scripts, domain names, ACME tokens) are valid
All incoming HTTP request bodies are capped by `http.MaxBytesReader` middleware (default 1MB, configurable via `CERTCTL_MAX_BODY_SIZE`). Requests exceeding the limit receive a 413 Request Entity Too Large response. The middleware is positioned before authentication in the chain so oversized payloads are rejected early, before any auth processing or database work occurs. Requests without bodies (GET, HEAD, nil body) skip the limit check.
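A body cap of this kind might look like the sketch below. The names `limitBody`, `echoLen`, and `statusFor` are hypothetical, and the 1 KiB limit is chosen only to keep the demo small. Note that `http.MaxBytesReader` by itself only makes reads past the limit fail; in this sketch the handler that reads the body translates the resulting `*http.MaxBytesError` into a 413, which is one possible arrangement and not necessarily how certctl wires it.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// limitBody wraps next so that request bodies larger than max bytes fail on read.
// Requests without a body are passed through untouched.
func limitBody(max int64, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Body != nil && r.Body != http.NoBody {
			r.Body = http.MaxBytesReader(w, r.Body, max)
		}
		next.ServeHTTP(w, r)
	})
}

// echoLen reads the whole body and answers 413 if the cap was exceeded.
func echoLen(w http.ResponseWriter, r *http.Request) {
	body, err := io.ReadAll(r.Body)
	var tooBig *http.MaxBytesError
	if errors.As(err, &tooBig) {
		http.Error(w, "request body too large", http.StatusRequestEntityTooLarge)
		return
	}
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	fmt.Fprintf(w, "%d", len(body))
}

// statusFor sends an n-byte POST through a 1 KiB limit and returns the status code.
func statusFor(n int) int {
	h := limitBody(1024, http.HandlerFunc(echoLen))
	req := httptest.NewRequest(http.MethodPost, "/api/v1/certificates",
		strings.NewReader(strings.Repeat("x", n)))
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor(10), statusFor(4096))
}
```

Placing `limitBody` outermost in the middleware chain matches the ordering described above: the wrapper is installed before any authentication code gets a chance to read the body.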
### Config Encryption at Rest
Dynamic issuer and target configurations (rows with `source='database'`) contain credentials — ACME EAB HMACs, Vault tokens, DigiCert/Sectigo API keys, SSH private keys, WinRM passwords, F5 BIG-IP passwords, and similar. These are sealed at rest in PostgreSQL via `internal/crypto/encryption.go` using AES-256-GCM with a key derived from the operator passphrase `CERTCTL_CONFIG_ENCRYPTION_KEY` through PBKDF2-SHA256 (100,000 rounds, 32-byte output).
**v2 wire format (current, M-8 remediation, CWE-916 / CWE-329):**
```
magic(0x02) || salt(16) || nonce(12) || ciphertext+tag
```
Every call to `EncryptIfKeySet` draws 16 fresh bytes from `crypto/rand` as the PBKDF2 salt, so the derived AES-256 key is distinct per ciphertext and per re-encryption. The salt is stored alongside the ciphertext; decryption reads the magic byte, splits out the salt, re-derives the key, and verifies the AEAD tag.
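To make the v2 layout concrete, here is a self-contained sketch of sealing and opening a blob in that format. It is not the code in `internal/crypto/encryption.go`: the names `sealV2`, `openV2`, and `deriveKey` are hypothetical, the error messages are invented, and PBKDF2-HMAC-SHA256 is hand-rolled for a single 32-byte block purely to keep the example within the standard library.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"errors"
	"fmt"
)

const (
	magicV2   = 0x02
	saltLen   = 16
	nonceLen  = 12
	kdfRounds = 100_000
)

// deriveKey computes PBKDF2-HMAC-SHA256 for exactly one 32-byte output block.
func deriveKey(passphrase string, salt []byte) []byte {
	mac := hmac.New(sha256.New, []byte(passphrase))
	mac.Write(salt)
	mac.Write([]byte{0, 0, 0, 1}) // block index 1, big-endian
	u := mac.Sum(nil)
	key := append([]byte(nil), u...)
	for i := 1; i < kdfRounds; i++ {
		mac.Reset()
		mac.Write(u)
		u = mac.Sum(nil)
		for j := range key {
			key[j] ^= u[j]
		}
	}
	return key
}

func newGCM(passphrase string, salt []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(deriveKey(passphrase, salt))
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// sealV2 returns magic(0x02) || salt(16) || nonce(12) || ciphertext+tag.
func sealV2(passphrase string, plaintext []byte) ([]byte, error) {
	if passphrase == "" {
		return nil, errors.New("encryption key required")
	}
	salt := make([]byte, saltLen)
	nonce := make([]byte, nonceLen)
	if _, err := rand.Read(salt); err != nil {
		return nil, err
	}
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	aead, err := newGCM(passphrase, salt)
	if err != nil {
		return nil, err
	}
	blob := append([]byte{magicV2}, salt...)
	blob = append(blob, nonce...)
	return aead.Seal(blob, nonce, plaintext, nil), nil
}

// openV2 splits the header, re-derives the key from the stored salt,
// and verifies the AEAD tag while decrypting.
func openV2(passphrase string, blob []byte) ([]byte, error) {
	if passphrase == "" {
		return nil, errors.New("encryption key required")
	}
	if len(blob) < 1+saltLen+nonceLen || blob[0] != magicV2 {
		return nil, errors.New("not a v2 blob")
	}
	salt := blob[1 : 1+saltLen]
	nonce := blob[1+saltLen : 1+saltLen+nonceLen]
	aead, err := newGCM(passphrase, salt)
	if err != nil {
		return nil, err
	}
	return aead.Open(nil, nonce, blob[1+saltLen+nonceLen:], nil)
}

func main() {
	blob, _ := sealV2("operator passphrase", []byte(`{"token":"example"}`))
	plain, err := openV2("operator passphrase", blob)
	fmt.Println(len(blob), string(plain), err)
}
```

Since the salt is drawn fresh on every seal, sealing the same plaintext twice yields different blobs and different derived keys, which is the property the per-ciphertext salt is meant to provide.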
**v1 legacy format (read-only):**
```
nonce(12) || ciphertext+tag
```
Pre-M-8 blobs were sealed with a package-level fixed salt `"certctl-config-encryption-v1"`. `DecryptIfKeySet` preserves the v1 read path unchanged — a blob whose first byte is not `0x02`, or whose v2 AEAD verification fails (including the 1/256 case where a v1 nonce happens to begin with `0x02`), falls through to a v1 attempt against the legacy fixed salt. v1 blobs are never written by the post-M-8 code path; they re-seal as v2 naturally on the next UPDATE through the normal service CRUD flow. No operator migration ceremony is required.
**Fail-closed behavior (C-2 sentinel, CWE-311):** both `EncryptIfKeySet` and `DecryptIfKeySet` return `ErrEncryptionKeyRequired` when invoked with an empty passphrase. The server refuses to start if any `source='database'` rows already exist without `CERTCTL_CONFIG_ENCRYPTION_KEY` set.
**Low-level primitives preserved byte-identical.** `Encrypt`, `Decrypt`, and `DeriveKey` are kept bit-stable so v1 fixtures on disk remain decryptable unchanged and so callers outside the config-encryption path (none today, but the symbols are exported) do not see a breaking change. The new per-ciphertext salt path is reached via the helper `deriveKeyWithSalt(passphrase, salt)`.
**Passphrase plumbing.** Services (`IssuerService`, `TargetService`, `IssuerRegistry`) hold the operator passphrase as a raw `string` and delegate PBKDF2 to the crypto package per ciphertext. This replaces the pre-M-8 design that pre-derived a single `[]byte` key at service construction and reused it for every row, which was the direct consequence of the fixed-salt KDF.
**Coverage gate.** CI enforces `internal/crypto/...` coverage ≥ 85% (observed 86.7%) — the encryption primitives are a security-critical gate, and the v2 format plus v1 fallback plus C-2 sentinel paths all need exhaustive coverage to avoid silent regressions.
### CORS
CORS uses a **deny-by-default** posture: when `CERTCTL_CORS_ORIGINS` is empty, no CORS headers are set and only same-origin requests can read responses. Operators must explicitly configure allowed origins. This prevents accidental exposure of the API to cross-origin requests in production.
@@ -774,10 +884,12 @@ All endpoints are under `/api/v1/` and follow consistent patterns:
Resources: certificates, issuers, targets, agents, jobs, policies, profiles, teams, owners, agent-groups, audit, notifications, discovered-certificates, discovery-scans, network-scan-targets, stats, metrics.
The full API is documented in an OpenAPI 3.1 specification at `api/openapi.yaml` with 99 endpoints across 23 resource domains (97 under `/api/v1/` + `/.well-known/est/` plus `/health` and `/ready`; includes auth, 7 discovery endpoints from M18b, 6 network scan endpoints from M21, Prometheus metrics from M22, 4 EST enrollment endpoints from M23, 2 digest endpoints from M29), all request/response schemas, and pagination conventions. See the [OpenAPI Guide](openapi.md) for usage with Swagger UI and SDK generation.
The full API is documented in an OpenAPI 3.1 specification at `api/openapi.yaml` with 97 operations across `/api/v1/` and `/.well-known/est/` (includes auth, 7 discovery endpoints, 6 network scan endpoints, Prometheus metrics, 4 EST enrollment endpoints, 2 digest endpoints, 2 verification endpoints, 2 export endpoints), all request/response schemas, and pagination conventions. The server also registers `/health` and `/ready` outside the OpenAPI spec, bringing the total route count to 107. See the [OpenAPI Guide](openapi.md) for usage with Swagger UI and SDK generation.
Jobs support additional action endpoints: `POST /api/v1/jobs/{id}/cancel`, `POST /api/v1/jobs/{id}/approve`, `POST /api/v1/jobs/{id}/reject`.
**Bulk Operations:** `POST /api/v1/certificates/bulk-revoke` — Bulk revocation by filter criteria (profile_id, owner_id, agent_id, issuer_id). Creates individual revocation jobs for matching certificates, with partial-failure tolerance and a summary audit event.
**Enhanced Query Features (M20):** Certificate list endpoints support additional query capabilities beyond basic pagination:
- **Sorting**: `?sort=notAfter` (ascending) or `?sort=-createdAt` (descending). Whitelist: notAfter, expiresAt, createdAt, updatedAt, commonName, name, status, environment.
@@ -787,7 +899,7 @@ Jobs support additional action endpoints: `POST /api/v1/jobs/{id}/cancel`, `POST
- **Additional filters**: `?agent_id=`, `?profile_id=` (in addition to existing status, environment, owner_id, team_id, issuer_id).
- **Deployments**: `GET /api/v1/certificates/{id}/deployments` returns deployment targets for a certificate.
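The `?sort=` convention from the list above (a leading `-` for descending order, checked against a whitelist) can be sketched as follows. The function name `parseSort` and the error text are hypothetical; the field names are the whitelist quoted above. A whitelist like this is commonly used so that user input never reaches an `ORDER BY` clause unchecked, though the text does not state that motivation explicitly.

```go
package main

import (
	"fmt"
	"strings"
)

// sortable holds the whitelisted sort fields.
var sortable = map[string]bool{
	"notAfter": true, "expiresAt": true, "createdAt": true, "updatedAt": true,
	"commonName": true, "name": true, "status": true, "environment": true,
}

// parseSort (hypothetical) splits a ?sort= value into field and direction
// and rejects anything outside the whitelist.
func parseSort(raw string) (field string, desc bool, err error) {
	desc = strings.HasPrefix(raw, "-")
	field = strings.TrimPrefix(raw, "-")
	if !sortable[field] {
		return "", false, fmt.Errorf("unsupported sort field %q", field)
	}
	return field, desc, nil
}

func main() {
	for _, raw := range []string{"notAfter", "-createdAt", "password"} {
		field, desc, err := parseSort(raw)
		fmt.Println(field, desc, err)
	}
}
```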
Certificate revocation: `POST /api/v1/certificates/{id}/revoke` with optional `{"reason": "keyCompromise"}`. Supports RFC 5280 reason codes (unspecified, keyCompromise, caCompromise, affiliationChanged, superseded, cessationOfOperation, certificateHold, privilegeWithdrawn). Returns the updated certificate status. Best-effort issuer notification — the revocation succeeds even if the issuer connector is unavailable. A JSON-formatted CRL is available at `GET /api/v1/crl`, and a DER-encoded X.509 CRL signed by the issuing CA at `GET /api/v1/crl/{issuer_id}`. An embedded OCSP responder serves signed responses at `GET /api/v1/ocsp/{issuer_id}/{serial}`. Short-lived certificates (profile TTL < 1 hour) are exempt from CRL/OCSP — expiry is sufficient revocation.
Certificate revocation: `POST /api/v1/certificates/{id}/revoke` with optional `{"reason": "keyCompromise"}`. Supports RFC 5280 reason codes (unspecified, keyCompromise, caCompromise, affiliationChanged, superseded, cessationOfOperation, certificateHold, privilegeWithdrawn). Returns the updated certificate status. Best-effort issuer notification — the revocation succeeds even if the issuer connector is unavailable. The DER-encoded X.509 CRL signed by the issuing CA is served unauthenticated at `GET /.well-known/pki/crl/{issuer_id}` (RFC 5280 §5 + RFC 8615, `Content-Type: application/pkix-crl`). The embedded OCSP responder serves signed responses unauthenticated at `GET /.well-known/pki/ocsp/{issuer_id}/{serial}` (RFC 6960, `Content-Type: application/ocsp-response`). Both endpoints are accessible to relying parties with no certctl API credentials, as RFC-compliant PKI consumers expect. Short-lived certificates (profile TTL < 1 hour) are exempt from CRL/OCSP — expiry is sufficient revocation.
Certificate export (M27): `GET /api/v1/certificates/{id}/export/pem` returns PEM-encoded certificate and chain, and `POST /api/v1/certificates/{id}/export/pkcs12` returns a PKCS#12 bundle (binary). Private keys are never exported — they remain on agents. All exports are audited with actor, timestamp, and format.
@@ -802,7 +914,7 @@ flowchart LR
AI["AI Assistant\n(Claude, Cursor)"] -->|"stdio"| MCP["MCP Server\ncmd/mcp-server/"]
MCP -->|"HTTP + Bearer token"| API["certctl REST API\n:8443"]
subgraph "78 MCP Tools"
subgraph "MCP Tools"
T1["Certificate CRUD"]
T2["Agent Management"]
T3["Job Operations"]
@@ -816,7 +928,7 @@ flowchart LR
The MCP server is a stateless HTTP proxy — every MCP tool call translates to an HTTP request to the certctl REST API. It adds no new state, no new dependencies, and no new attack surface beyond what the API already exposes. Configuration is minimal: `CERTCTL_SERVER_URL` and `CERTCTL_API_KEY` environment variables.
The 78 tools are organized across 16 resource domains with typed input structs and `jsonschema` struct tags for automatic LLM-friendly schema generation. Binary response support handles DER CRL and OCSP endpoints.
The tools are organized across 16 resource domains with typed input structs and `jsonschema` struct tags for automatic LLM-friendly schema generation. Binary response support handles DER CRL and OCSP endpoints.
## CLI Tool
@@ -891,9 +1003,9 @@ See `deploy/helm/certctl/values.yaml` for the full configuration reference and `
For production, you would also add an ingress controller, TLS termination for the certctl API itself, and external PostgreSQL (RDS, Cloud SQL, etc.).
## Discovery Data Flow (M18b + M21)
## Discovery Data Flow (M18b + M21 + M50)
Certificate discovery enables operators to build a complete inventory of existing certificates before managing them with certctl. There are two discovery modes that feed into the same pipeline:
Certificate discovery enables operators to build a complete inventory of existing certificates before managing them with certctl. There are three discovery modes that feed into the same pipeline:
```mermaid
flowchart TB
@@ -902,6 +1014,7 @@ flowchart TB
SCAN["Filesystem Scanner\n(CERTCTL_DISCOVERY_DIRS)"]
SERVER["certctl-server\n(network discovery)"]
NETSCAN["TLS Scanner\n(CIDR ranges + ports)"]
CLOUD["Cloud Discovery\n(AWS SM / Azure KV / GCP SM)"]
end
EXTRACT["Extract Metadata\n(CN, SANs, serial, issuer, expiry, fingerprint)"]
@@ -917,6 +1030,7 @@ flowchart TB
SCAN --> EXTRACT
SERVER -->|"Scheduler loop\n(every 6h)"| NETSCAN
NETSCAN -->|"crypto/tls.Dial\n50 goroutines"| EXTRACT
CLOUD -->|"Scheduler loop\n(every 6h)"| EXTRACT
EXTRACT --> SERVICE
SERVICE --> REPO
REPO -->|"Dedup by fingerprint\n+ agent_id + source_path"| DB
@@ -943,7 +1057,16 @@ flowchart TB
5. **Sentinel agent** — Results submitted using `server-scanner` as virtual agent ID, with `source_path` set to `ip:port` and `source_format` set to `network`
6. **Same pipeline** — Feeds into the same `DiscoveryService.ProcessDiscoveryReport()` as filesystem discovery — same dedup, same audit trail, same triage workflow
**Common triage workflow (both sources):**
**Cloud Secret Manager Discovery (M50):**
1. **Pluggable sources** — Each cloud provider implements the `DiscoverySource` interface (Name, Type, Discover, ValidateConfig). Three built-in sources: AWS Secrets Manager, Azure Key Vault, GCP Secret Manager
2. **CloudDiscoveryService orchestrator** — Iterates registered sources, calls `Discover()` on each, feeds reports into `ProcessDiscoveryReport()`. Errors from one source don't prevent other sources from running
3. **Scheduler integration** — 9th scheduler loop (6h default), runs immediately on startup, `atomic.Bool` idempotency guard
4. **Sentinel agents** — Each source uses its own sentinel agent ID (`cloud-aws-sm`, `cloud-azure-kv`, `cloud-gcp-sm`) for dedup and triage filtering
5. **Source path format**: `aws-sm://{region}/{secret}`, `azure-kv://{cert-name}/{version}`, `gcp-sm://{project}/{secret}`
6. **No new schema** — Reuses existing `discovered_certificates` and `discovery_scans` tables. Sentinel agent IDs leverage existing `(fingerprint_sha256, agent_id, source_path)` dedup constraint
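The orchestration in steps 1 and 2 can be sketched as below. Only the four method names (Name, Type, Discover, ValidateConfig) come from the text; the method signatures, the `Report` type, `stubSource`, `runAll`, and `demo` are assumptions made for illustration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Report is a hypothetical, trimmed-down discovery report.
type Report struct {
	AgentID     string
	SourcePaths []string
}

// DiscoverySource uses the method names from the text; signatures are assumed.
type DiscoverySource interface {
	Name() string
	Type() string
	Discover(ctx context.Context) (*Report, error)
	ValidateConfig() error
}

// stubSource is a canned source standing in for a real cloud provider.
type stubSource struct {
	name   string
	report *Report
	err    error
}

func (s stubSource) Name() string                              { return s.name }
func (s stubSource) Type() string                              { return "stub" }
func (s stubSource) Discover(context.Context) (*Report, error) { return s.report, s.err }
func (s stubSource) ValidateConfig() error                     { return nil }

// runAll calls every source and hands successful reports to process.
// A failing source is recorded and skipped; it never stops the loop.
func runAll(ctx context.Context, sources []DiscoverySource, process func(*Report) error) (int, []error) {
	processed := 0
	var errs []error
	for _, src := range sources {
		report, err := src.Discover(ctx)
		if err == nil {
			err = process(report)
		}
		if err != nil {
			errs = append(errs, fmt.Errorf("%s: %w", src.Name(), err))
			continue
		}
		processed++
	}
	return processed, errs
}

// demo runs three stub sources, the middle one failing.
func demo() (int, int) {
	sources := []DiscoverySource{
		stubSource{name: "aws", report: &Report{AgentID: "cloud-aws-sm", SourcePaths: []string{"aws-sm://us-east-1/web-cert"}}},
		stubSource{name: "azure", err: errors.New("vault unreachable")},
		stubSource{name: "gcp", report: &Report{AgentID: "cloud-gcp-sm", SourcePaths: []string{"gcp-sm://demo-project/api-cert"}}},
	}
	processed, errs := runAll(context.Background(), sources, func(*Report) error { return nil })
	return processed, len(errs)
}

func main() {
	fmt.Println(demo())
}
```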
**Common triage workflow (all sources):**
1. **Storage** — Records stored in `discovered_certificates` table with status = "Unmanaged"
2. **Audit**: `discovery_scan_completed` event logged with agent ID, cert count, scan timestamp
@@ -956,29 +1079,53 @@ flowchart TB
This data flow is pull-based and non-blocking. Agents discover at their own pace; the server stores results for later review. There's no pressure to claim or dismiss; operators can leave certificates in "Unmanaged" status indefinitely.
## Continuous TLS Health Monitoring (M48)
Beyond one-time discovery, certctl continuously monitors TLS endpoints for certificate health using a shared TLS probing package and a state-machine-driven health check service. Endpoints transition between states (Healthy → Degraded → Down) based on consecutive failures, and `cert_mismatch` status alerts when a deployed certificate is unexpectedly replaced.
**Architecture:** Probing is extracted into a shared `internal/tlsprobe/` package used by both the network scanner (M21) and the health monitor. The `HealthCheckService` manages 8 API endpoints for CRUD operations and state transitions. A dedicated 8th scheduler loop runs every 60 seconds (configurable via `CERTCTL_HEALTH_CHECK_INTERVAL`). Individual health check targets have their own check intervals (default 300 seconds) — the scheduler queries only endpoints due for check via `ListDueForCheck()`. Results are stored with historical tracking for 30 days (configurable via `CERTCTL_HEALTH_CHECK_HISTORY_RETENTION`). State transitions trigger notifications (critical for down endpoints, warning for degraded, high for cert_mismatch).
**State Machine:** Healthy → Degraded (configurable threshold, default 2 consecutive failures) → Down (default 5 failures). The `cert_mismatch` status is special — it fires whenever the observed certificate fingerprint differs from the expected (deployed) fingerprint, catching silent rollbacks and unauthorized cert replacements. Recovery from degraded/down transitions back to healthy and resets the failure counter.
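The transitions described above can be sketched as a small state machine. The type and function names (`Status`, `Check`, `newCheck`, `Record`) and the string values of the states are hypothetical; the thresholds are the stated defaults. How a failed probe interacts with an existing `cert_mismatch` status is not specified above, so this sketch simply leaves the status unchanged until a threshold is crossed.

```go
package main

import "fmt"

type Status string

const (
	Healthy      Status = "healthy"
	Degraded     Status = "degraded"
	Down         Status = "down"
	CertMismatch Status = "cert_mismatch"
)

// Check tracks one monitored endpoint.
type Check struct {
	Expected      string // fingerprint of the deployed certificate
	Failures      int    // consecutive failed probes
	Status        Status
	DegradedAfter int
	DownAfter     int
}

// newCheck uses the default thresholds quoted in the text (2 and 5).
func newCheck(expected string) *Check {
	return &Check{Expected: expected, Status: Healthy, DegradedAfter: 2, DownAfter: 5}
}

// Record applies one probe result and returns the resulting status.
func (c *Check) Record(probeOK bool, observed string) Status {
	if !probeOK {
		c.Failures++
		switch {
		case c.Failures >= c.DownAfter:
			c.Status = Down
		case c.Failures >= c.DegradedAfter:
			c.Status = Degraded
		}
		return c.Status
	}
	// A successful probe resets the counter; the fingerprint decides the rest.
	c.Failures = 0
	if c.Expected != "" && observed != c.Expected {
		c.Status = CertMismatch
	} else {
		c.Status = Healthy
	}
	return c.Status
}

func main() {
	c := newCheck("aa11")
	for i := 0; i < 5; i++ {
		fmt.Println(c.Record(false, ""))
	}
	fmt.Println(c.Record(true, "aa11"))
	fmt.Println(c.Record(true, "bb22"))
}
```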
**API:** 8 endpoints for list (with filters: status, certificate_id, network_scan_target_id, enabled), get, create, update, delete, history (with limit param), acknowledge (incident marking), and summary (aggregate status counts).
**Auto-Create:** When a deployment job completes with successful verification (M25), the system automatically creates a health check with the deployed certificate's fingerprint as the expected value. Network scan targets can also opt in to auto-create health checks for discovered endpoints.
**Configuration:**
| Env Var | Default | Description |
|---|---|---|
| `CERTCTL_HEALTH_CHECK_ENABLED` | `false` | Enable/disable the feature |
| `CERTCTL_HEALTH_CHECK_INTERVAL` | `60s` | Scheduler tick interval |
| `CERTCTL_HEALTH_CHECK_DEFAULT_INTERVAL` | `300s` | Default per-endpoint check interval (5 min) |
| `CERTCTL_HEALTH_CHECK_DEFAULT_TIMEOUT` | `5000ms` | TLS connection timeout per probe |
| `CERTCTL_HEALTH_CHECK_MAX_CONCURRENT` | `20` | Max concurrent TLS probes |
| `CERTCTL_HEALTH_CHECK_HISTORY_RETENTION` | `30 days` | Purge probe history older than this |
| `CERTCTL_HEALTH_CHECK_AUTO_CREATE` | `true` | Auto-create checks from deployments |
## Testing Strategy
certctl uses a layered testing approach aligned with the handler → service → repository architecture, with 1050+ tests across six layers (service, handler, integration, connector, frontend, and scheduler). The goal is high-confidence regression prevention at the service and handler layers, where the most complex business logic lives, combined with integration tests that exercise the full request path from HTTP to database.
certctl is extensively tested across eight layers with CI-enforced coverage gates that act as regression floors. The goal is high-confidence regression prevention at the service and handler layers (where the most complex business logic lives), combined with integration tests that exercise the full request path from HTTP to database.
**Service layer unit tests** (`internal/service/*_test.go`) — ~238 test functions across 15 files with mock repositories. These test all business logic in isolation: certificate CRUD with validation, certificate revocation (success, already-revoked, archived, invalid reason, all RFC 5280 reason codes, issuer notification, notification service integration, OCSP/CRL generation), agent lifecycle (registration, heartbeat, CSR submission with both keygen modes), job state machine (creation, processing, cancellation, retry logic), policy evaluation (all 5 rule types, violation creation), renewal and issuance flow (server-side and agent-side keygen paths), notification deduplication (threshold tag matching, channel routing), team/owner/agent group CRUD with pagination and audit recording, issuer service CRUD with connection testing, and the issuer connector adapter (type translation between connector and service layers including revocation). Mock repositories are simple structs with function fields, avoiding heavy mocking frameworks — this keeps tests readable and avoids coupling to mock library APIs.
**Service layer unit tests** (`internal/service/*_test.go`) — Mock-based tests across all service files covering certificate CRUD, revocation (all RFC 5280 reason codes, OCSP/CRL generation, bulk revocation by filter with partial-failure tolerance), agent lifecycle, job state machine, policy evaluation, renewal/issuance flow (both keygen modes), notification deduplication, team/owner/agent group CRUD, issuer service CRUD with connection testing, and the issuer connector adapter. Mock repositories are simple structs with function fields — no heavy mocking frameworks.
**Handler layer tests** (`internal/api/handler/*_test.go`) — ~257 test functions across 11 files using Go's `httptest` package. Every handler file has a corresponding test file: certificates (50 tests including revocation, DER CRL, and OCSP), agents (28 tests), jobs (21 tests including approve/reject), notifications (11 tests), policies (19 tests), profiles (18 tests), issuers (17 tests), targets (17 tests), agent groups (12 tests), teams (26 tests), and owners (21 tests). Each test file follows the same pattern: a mock service struct with function fields, `httptest.NewRecorder` for capturing responses, and a shared `contextWithRequestID()` helper. Tests cover the happy path, input validation (missing fields, invalid JSON, empty IDs, name length limits), error propagation from the service layer, method-not-allowed responses, and pagination parameters.
**Handler layer tests** (`internal/api/handler/*_test.go`) — Every handler file has a corresponding test file using Go's `httptest` package: certificates (including revocation, bulk revocation by profile/owner/agent/issuer, DER CRL, OCSP), agents, jobs (including approve/reject), notifications, policies, profiles, issuers, targets, agent groups, teams, owners, discovery, network scan, verification, export, EST, digest, stats, and metrics. Tests cover the happy path, input validation, error propagation, method-not-allowed, pagination, and bulk operation partial-failure scenarios.
**Integration tests** (`internal/integration/`) — Two test files exercising the full stack from HTTP request through router, handler, service, and postgres repository layers. `lifecycle_test.go` has 11 subtests covering the complete certificate lifecycle: team/owner creation, certificate creation, issuer verification, renewal trigger, job verification, agent registration, CSR submission, deployment, and status reporting. `negative_test.go` has 14 subtests covering error paths, 19 M11b endpoint tests, and 8 revocation endpoint tests (M15a+M15b): nonexistent resource lookups (404s), invalid request bodies (malformed JSON, missing required fields), invalid CSR submission, heartbeat for nonexistent agents, wrong HTTP methods on list endpoints, empty list responses, renewal on nonexistent certificates, expired certificate lifecycle, team/owner/agent group CRUD validation, revocation success, already-revoked rejection, not-found revocation, JSON CRL retrieval, DER CRL retrieval, OCSP response retrieval, and short-lived cert exemption. Both use a shared `setupTestServer()` that builds a fully-wired server with real postgres repositories and the Local CA issuer connector. A third file, `e2e_test.go`, contains 8 cross-milestone test functions with 48+ subtests that exercise features across milestones end-to-end: M10 agent metadata via heartbeat, M11 profiles/teams/owners/agent-groups CRUD, M12 issuer registry verification, M13 GUI operation endpoints, M14 stats and metrics, M15 revocation and CRL, M16 notification channels, and M20 enhanced query API (sorting, cursor pagination, sparse fields, time-range filters).
**Integration tests** (`internal/integration/`) — Three test files exercising the full stack from HTTP request through router, handler, service, and repository layers. `lifecycle_test.go` covers the complete certificate lifecycle (team/owner creation through deployment and status reporting). `negative_test.go` covers error paths, endpoint validation, and revocation scenarios. `e2e_test.go` exercises cross-milestone features end-to-end (agent metadata, profiles, issuer registry, GUI operations, stats, revocation, notifications, enhanced query API).
**Frontend tests** (`web/src/api/client.test.ts`, `web/src/api/utils.test.ts`) — 86 Vitest tests covering the API client, stats/metrics endpoints, and utility functions. The API client tests mock `globalThis.fetch` and verify all endpoint functions (certificates, agents, jobs, policies, issuers, targets, notifications, audit, stats, metrics, health) send correct HTTP methods, URLs, headers, and request bodies. They also test API key management (store/retrieve/clear), auth header propagation, 401 event dispatching, and error handling (server messages, error fields, status text fallback). The stats/metrics endpoint tests verify correct query parameter handling and response shape validation. The utility tests use `vi.useFakeTimers()` for deterministic date testing and cover `formatDate`, `formatDateTime`, `timeAgo`, `daysUntil`, and `expiryColor`. The test environment uses jsdom with `@testing-library/jest-dom` matchers.
**Go integration tests** (`deploy/test/integration_test.go`) — Runs against the live Docker Compose test environment with real CA backends (Local CA, Pebble ACME, step-ca). Covers health checks, agent heartbeat, issuance, renewal, revocation, CRL/OCSP, EST enrollment, S/MIME, discovery, network scanning, and deployment verification using `crypto/x509` for cert parsing and `crypto/tls` for live TLS verification.
**CLI tests** (`internal/cli/client_test.go`) — 14 tests covering all 10 CLI subcommands with httptest mock servers, PEM parsing for bulk import, auth header verification, and JSON/table output formatting.
**Frontend tests** (`web/src/api/`) — Vitest tests covering the full API client (all endpoint functions with fetch mocking), stats/metrics endpoints, utility functions, and auth flows. Test environment uses jsdom with `@testing-library/jest-dom` matchers.
**CI pipeline** (`.github/workflows/ci.yml`) — Two parallel jobs: Go (build, vet, race detection, static analysis, vulnerability scanning, test with coverage, coverage threshold enforcement) and Frontend (TypeScript type check, Vitest test suite, Vite production build). The Go job runs `go test -race` on service, handler, middleware, and scheduler packages to catch data races. It runs `golangci-lint` with 11 linters (errcheck, govet, staticcheck, unused, gosimple, ineffassign, typecheck, gocritic, gosec, bodyclose, noctx) configured in `.golangci.yml`. It runs `govulncheck ./...` to scan dependencies for known CVEs. Coverage thresholds are enforced per-layer: service 60%, handler 60%, domain 40%, middleware 50%. These thresholds act as regression floors — they can only go up. Connector tests are included via `./internal/connector/issuer/...` and `./internal/connector/target/...` (covers Local CA, ACME, step-ca, NGINX, Apache, HAProxy, Traefik, and Caddy packages with unit tests for certificate signing logic, DNS solver, issuer validation, and deployment flows). The Frontend job runs `npx vitest run` between the TypeScript check and production build steps.
**Connector tests** (`internal/connector/`) — Issuer connectors (Local CA self-signed/sub-CA modes, ACME DNS-01/DNS-PERSIST-01, step-ca, OpenSSL, Vault PKI, DigiCert, Sectigo, Google CAS, AWS ACM PCA — all with httptest mock servers or injectable interface mocks). Target connectors (NGINX, Apache, HAProxy, Traefik, Caddy, Envoy, IIS with mock PowerShell executor, F5 BIG-IP with mock iControl client, Postfix/Dovecot, SSH with mock SSH client, Windows Certificate Store with mock PowerShell executor, Java Keystore with mock command executor, Kubernetes Secrets with mock K8s client, shared certutil package). Notifier connectors (Slack, Teams, PagerDuty, OpsGenie).
**Connector tests** (`internal/connector/`) — 57 test functions covering issuer, target, and notifier connectors. The Local CA connector has tests for self-signed and sub-CA modes (RSA, ECDSA, config validation, non-CA cert rejection). The ACME DNS solver has 10 tests for script-based DNS-01 and DNS-PERSIST-01 challenges (6 DNS-01 tests + 4 DNS-PERSIST-01 tests covering `PresentPersist` success, no-script error, script failure, and wildcard domain handling). The step-ca connector has tests with a mock HTTP server for issuance, renewal, revocation, and error paths. The OpenSSL/Custom CA connector has 14 tests covering config validation, issuance success/failure/timeout, renewal, revocation, and CRL generation. The NGINX target connector has 13 tests covering config validation, certificate deployment (file writing, permissions, validate/reload commands), and deployment validation. Apache httpd and HAProxy connectors each have 3 tests covering config validation, deployment, and validation flows. Traefik and Caddy connectors have tests covering file-based deployment and (for Caddy) dual-mode API/file configuration. Notifier connector tests span 20 tests across Slack (5), Teams (4), PagerDuty (6), and OpsGenie (5) — verifying channel identity, payload formatting, HTTP error handling, connection failures, auth headers, and configuration defaults.
**Scheduler tests** (`internal/scheduler/scheduler_test.go`) — Idempotency guards (`sync/atomic.Bool`), `WaitForCompletion` success and timeout paths, and multi-loop concurrency safety.
**Scheduler tests** (`internal/scheduler/scheduler_test.go`) — Tests for idempotency guards (`sync/atomic.Bool` CompareAndSwap prevents concurrent loop ticks), `WaitForCompletion` success and timeout paths, and multi-loop idempotency.
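The idempotency guard these tests exercise can be sketched as follows. The `loopGuard` type and `tryRun` method are hypothetical names; only the use of `atomic.Bool` with `CompareAndSwap` is taken from the text.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// loopGuard ensures at most one tick of a scheduler loop runs at a time.
type loopGuard struct {
	running atomic.Bool
}

// tryRun executes tick unless a previous tick is still in flight.
// It reports whether tick actually ran.
func (g *loopGuard) tryRun(tick func()) bool {
	if !g.running.CompareAndSwap(false, true) {
		return false // another tick holds the guard; skip this one
	}
	defer g.running.Store(false)
	tick()
	return true
}

func main() {
	var g loopGuard
	nested := true
	ran := g.tryRun(func() {
		// Simulates a second tick firing while the first is still running.
		nested = g.tryRun(func() {})
	})
	fmt.Println(ran, nested, g.tryRun(func() {}))
}
```

Skipping an overlapping tick, instead of queueing it, keeps a slow run from piling up work behind it; the next scheduled tick picks up whatever is due.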
**Fuzz tests** (`internal/validation/`, `internal/domain/`) — Go native fuzz tests for command validation (`ValidateShellCommand`, `ValidateDomainName`, `ValidateACMEToken`) and revocation domain parsing.
**Fuzz tests** (`internal/validation/command_fuzz_test.go`, `internal/domain/revocation_fuzz_test.go`) — Go native fuzz tests (`testing.F`) for command validation functions and revocation domain parsing. These exercise `ValidateShellCommand`, `ValidateDomainName`, and `ValidateACMEToken` with random inputs to discover edge cases.
**CI pipeline** (`.github/workflows/ci.yml`) — Two parallel jobs. Go: build, vet, `go test -race`, `golangci-lint` (11 linters), `govulncheck`, test with coverage, per-layer coverage threshold enforcement (service 55%, handler 60%, domain 40%, middleware 30%). Frontend: TypeScript type check, Vitest, Vite production build.
**What's not tested and why:** Postgres repository implementations (`internal/repository/postgres/`) require a real database and are tested only through integration tests, not unit tests — a `testcontainers-go` scaffolding for isolated PostgreSQL instances is planned. Target connectors for F5 BIG-IP and IIS are interface stubs (implementation planned for V3). The ACME connector requires a real ACME server (tested manually against Let's Encrypt staging). These are all candidates for future expansion as the test infrastructure matures.
For detailed test procedures, smoke tests, and the release sign-off checklist, see the [Testing Guide](testing-guide.md). For setting up the Docker Compose test environment with real CA backends, see [Test Environment](test-env.md).
## What's Next
@@ -988,3 +1135,5 @@ certctl uses a layered testing approach aligned with the handler → service →
- [Compliance Mapping](compliance.md) — SOC 2, PCI-DSS 4.0, and NIST SP 800-57 alignment
- [MCP Server Guide](mcp.md) — AI-native access to the API
- [OpenAPI Spec](openapi.md) — Full API reference and SDK generation
- [Testing Guide](testing-guide.md) — Test procedures and release sign-off
- [Test Environment](test-env.md) — Docker Compose test environment setup
+6 -6
@@ -82,7 +82,7 @@ Agents scan configured directories and report back all existing certs. In the da
Set up the same issuer certctl uses for non-Kubernetes certs:
- **ACME** (Let's Encrypt, for public certs)
- **step-ca** (Smallstep, for internal certs)
- **Vault PKI** (planned) (HashiCorp Vault, for enterprise PKI)
- **Vault PKI** (HashiCorp Vault, for enterprise PKI)
- **Private CA** (your own internal root CA)
No new CA infrastructure needed. If cert-manager already uses your CA, certctl points to the same one.
@@ -115,7 +115,7 @@ Certificates are linked to issuers and profiles when created or claimed from dis
If cert-manager and certctl both use the same CA:
- **ACME**: cert-manager uses ClusterIssuer + certctl uses ACME connector → same Let's Encrypt account, transparent coexistence
- **step-ca**: cert-manager uses external issuer CRD + certctl uses step-ca connector → same provisioner, shared certificate inventory
- **Vault PKI** (planned): cert-manager uses external issuer CRD + certctl uses Vault connector → same mount, same audit trail
- **Vault PKI**: cert-manager uses external issuer CRD + certctl uses Vault connector → same mount, same audit trail
No conflict. They just issue certs through the same CA. certctl's discovery scanning finds cert-manager-issued certs and shows them alongside certctl-managed ones.
@@ -138,7 +138,7 @@ For now: cert-manager handles Kubernetes, certctl handles everything else. They
## Next Steps
1. Review [Quick Start](./quickstart.md) for a 5-minute demo
2. Explore [Architecture](./architecture.md#agents) for deployment architecture
3. Read about [Discovery Scanning](./quickstart.md#certificate-discovery) to auto-find certs
4. Check [Helm Chart](../deploy/helm/certctl/) for production Kubernetes deployment
1. Run through the [Quick Start](./quickstart.md) for a 5-minute demo
2. Try the [Multi-Issuer example](../examples/multi-issuer/multi-issuer.md) — manages public and internal certs from one dashboard
3. Explore [Architecture](./architecture.md#agents) for deployment patterns
4. Check the [Helm Chart](../deploy/helm/certctl/) for production Kubernetes deployment
+18 -12
@@ -72,7 +72,7 @@ certctl implements tiered key storage with different protection profiles based o
- Configured via: `CERTCTL_CA_CERT_PATH=/path/to/ca.crt` and `CERTCTL_CA_KEY_PATH=/path/to/ca.key`
**NIST Gap: HSM Storage**
NIST SP 800-57 Part 1 recommends Hardware Security Module (HSM) storage for high-value keys (CA signing keys). certctl V2 uses filesystem storage on the server. HSM support is planned for V5 roadmap, enabling integration with:
NIST SP 800-57 Part 1 recommends Hardware Security Module (HSM) storage for high-value keys (CA signing keys). certctl V2 uses filesystem storage on the server. HSM support is planned for certctl Pro (V3), enabling integration with:
- AWS CloudHSM
- Azure Dedicated HSM
- Thales Luna, Gemalto SafeNet, YubiHSM (on-premises)
@@ -210,15 +210,17 @@ NIST SP 800-57 Part 1 Section 6.2 addresses secure key distribution to minimize
- Proxy agent executes deployment via appliance API
**Revocation Distribution**
- Certificate Revocation List (CRL) via `GET /api/v1/crl/{issuer_id}`
- Returns DER-encoded X.509 CRL signed by issuing CA
- Certificate Revocation List (CRL) via `GET /.well-known/pki/crl/{issuer_id}` (RFC 5280 §5, RFC 8615)
- Returns DER-encoded X.509 CRL signed by issuing CA (`Content-Type: application/pkix-crl`)
- 24-hour validity period
- Includes all revoked serials, reasons, and revocation timestamps
- Served unauthenticated so relying parties without certctl API credentials can fetch it
- Subject to URL caching; OCSP preferred for real-time revocation
- OCSP via `GET /api/v1/ocsp/{issuer_id}/{serial}`
- Returns DER-encoded OCSP response (OCSPResponse ASN.1 structure)
- OCSP via `GET /.well-known/pki/ocsp/{issuer_id}/{serial}` (RFC 6960)
- Returns DER-encoded OCSP response (OCSPResponse ASN.1 structure, `Content-Type: application/ocsp-response`)
- Signed by issuing CA (or delegated OCSP signing cert)
- Responds with good/revoked/unknown status
- Served unauthenticated — the RFC 6960 relying-party model does not assume API credentials
- Real-time, more bandwidth-efficient than CRL polling
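A relying party's side of the CRL flow above might look like the sketch below. It assumes only what the list states (unauthenticated GET, DER body, `application/pkix-crl`). The throwaway CA, the `demo-issuer` ID, and the `httptest` server are stand-ins for a real certctl deployment, and `fetchCRL` is a hypothetical helper, not part of certctl.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"io"
	"math/big"
	"net/http"
	"net/http/httptest"
	"time"
)

// fetchCRL downloads a DER CRL without credentials and verifies the issuer's signature.
func fetchCRL(url string, issuer *x509.Certificate) (*x509.RevocationList, error) {
	resp, err := http.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %d", resp.StatusCode)
	}
	if ct := resp.Header.Get("Content-Type"); ct != "application/pkix-crl" {
		return nil, fmt.Errorf("unexpected content type %q", ct)
	}
	der, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	crl, err := x509.ParseRevocationList(der)
	if err != nil {
		return nil, err
	}
	return crl, crl.CheckSignatureFrom(issuer)
}

// newDemoCRLServer stands in for certctl: a throwaway CA with one revoked serial.
func newDemoCRLServer() (*httptest.Server, *x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	now := time.Now()
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: "Demo CA"},
		NotBefore:             now.Add(-time.Hour),
		NotAfter:              now.Add(48 * time.Hour),
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign | x509.KeyUsageCRLSign,
		SubjectKeyId:          []byte{1, 2, 3, 4},
	}
	caDER, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, nil, err
	}
	ca, err := x509.ParseCertificate(caDER)
	if err != nil {
		return nil, nil, err
	}
	crlDER, err := x509.CreateRevocationList(rand.Reader, &x509.RevocationList{
		Number:     big.NewInt(1),
		ThisUpdate: now,
		NextUpdate: now.Add(24 * time.Hour), // 24-hour validity, as documented
		RevokedCertificateEntries: []x509.RevocationListEntry{
			{SerialNumber: big.NewInt(42), RevocationTime: now, ReasonCode: 1}, // 1 = keyCompromise
		},
	}, ca, key)
	if err != nil {
		return nil, nil, err
	}
	mux := http.NewServeMux()
	mux.HandleFunc("/.well-known/pki/crl/demo-issuer", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/pkix-crl")
		w.Write(crlDER)
	})
	return httptest.NewServer(mux), ca, nil
}

func main() {
	srv, ca, err := newDemoCRLServer()
	if err != nil {
		panic(err)
	}
	defer srv.Close()
	crl, err := fetchCRL(srv.URL+"/.well-known/pki/crl/demo-issuer", ca)
	if err != nil {
		panic(err)
	}
	for _, e := range crl.RevokedCertificateEntries {
		fmt.Println("revoked serial:", e.SerialNumber.String(), "reason code:", e.ReasonCode)
	}
}
```

Verifying the signature against the issuing CA matters because the endpoint is unauthenticated: the CRL's integrity comes from the signature, not from the transport.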
## Revocation and Compromise (NIST SP 800-57 Part 3)
@@ -272,20 +274,23 @@ NIST SP 800-57 Part 3 covers revocation (Section 2.5) when keys are suspected co
- OCSP responder queries revocation table in real-time
- Short-lived certificate exemption: certs with TTL < 1 hour skip CRL/OCSP (expiry is sufficient revocation)
**Bulk Revocation for Large-Scale Compromise Response** (V2.2) — NIST SP 800-57 Part 3 emphasizes rapid revocation when keys are compromised. `POST /api/v1/certificates/bulk-revoke` revokes all certificates matching filter criteria (profile, owner, agent, issuer) in a single operation. This enables operators to execute fleet-wide revocation for key compromise events affecting multiple certificates. Each bulk revocation creates individual jobs reusing the existing revocation pipeline, ensuring every certificate is recorded in the audit trail with the incident reason.
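As a rough sketch of how a client could build a bulk-revoke call: the endpoint path comes from the paragraph above, but the JSON field names (borrowed from the audit-trail filter criteria), the `reason` field, the `Bearer` authorization scheme, and the host name are all assumptions for illustration. Check the OpenAPI spec for the real request schema.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
)

// bulkRevokeRequest is a hypothetical request body; field names are assumed.
type bulkRevokeRequest struct {
	ProfileID string `json:"profile_id,omitempty"`
	OwnerID   string `json:"owner_id,omitempty"`
	AgentID   string `json:"agent_id,omitempty"`
	IssuerID  string `json:"issuer_id,omitempty"`
	Reason    string `json:"reason,omitempty"`
}

// newBulkRevokeRequest builds (but does not send) the POST request.
// Rejecting an empty filter is a client-side safety choice made in this sketch.
func newBulkRevokeRequest(baseURL, apiKey string, body bulkRevokeRequest) (*http.Request, error) {
	if body.ProfileID == "" && body.OwnerID == "" && body.AgentID == "" && body.IssuerID == "" {
		return nil, errors.New("refusing to bulk-revoke without at least one filter")
	}
	payload, err := json.Marshal(body)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/api/v1/certificates/bulk-revoke", bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", "Bearer "+apiKey) // assumed auth scheme
	return req, nil
}

func main() {
	req, err := newBulkRevokeRequest("https://certctl.example.com", "demo-key",
		bulkRevokeRequest{ProfileID: "prof-web", Reason: "keyCompromise"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Method, req.URL.Path)
}
```

Because each matching certificate becomes its own revocation job, a filter that is too broad is expensive to undo, which is why the sketch refuses to send an unfiltered request.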
**Revocation Audit Trail**
All revocation events logged:
- Event type: `certificate_revoked`
- Event type: `certificate_revoked` or `bulk_revocation_initiated` (for fleet operations)
- Actor: authenticated user or service
- Reason code: RFC 5280 enum
- Reason code: RFC 5280 enum (or incident justification for bulk operations)
- Timestamp: RFC3339
- Issuer notification status: success or error reason
- Filter criteria: profile_id, owner_id, agent_id, issuer_id (for bulk revocation)
## Alignment Summary Table
| NIST SP 800-57 Area | Status | Coverage | Notes |
|---|---|---|---|
| **Key Generation** | ✅ Aligned | 100% | Agent-side ECDSA P-256 using crypto/rand; server mode flagged as demo-only |
| **Key Storage** | ⚠️ Partially Aligned | 80% | Filesystem with 0600 perms; HSM support planned V5 |
| **Key Storage** | ⚠️ Partially Aligned | 80% | Filesystem with 0600 perms; HSM support planned V3 Pro |
| **Cryptoperiods** | ✅ Aligned | 100% | Profile-enforced max_ttl; threshold-based renewal alerting |
| **Key States** | ✅ Aligned | 100% | Full lifecycle tracking with immutable audit trail |
| **Algorithms** | ✅ Aligned | 100% | NIST-approved algorithms only; post-quantum tracking in progress |
@@ -301,13 +306,14 @@ All revocation events logged:
- [x] RFC 5280 revocation support
- [x] Immutable audit trail
### V2.2 (Planned: 2026)
- Bulk revocation by profile/owner/agent/issuer (fleet-level revocation for incident response)
### V3 (Planned: 2026)
- Role-based access control (limit revocation/approval to authorized operators)
- Bulk revocation by profile/owner/agent (fleet-level revocation policy)
### V5 (Planned: 2027+)
- HSM support for CA key storage
- PKCS#11 integration for hardware tokens
### V3 Pro (Planned)
- HSM support for CA key storage and agent key storage (TPM 2.0, PKCS#11)
- FIPS 140-2/3 validated crypto module (BoringCrypto build or external FIPS library)
- Key destruction API (explicit secure erasure of agent keys)
- Key escrow / recovery mechanism (backup encrypted private keys for disaster recovery)
+16 -11
@@ -92,9 +92,11 @@ Your QSA will request evidence that your certificate and key management systems
- **Certificate Status Tracking** — Four statuses: Active (deployed, not yet expired), Expiring (within threshold, awaiting renewal), Expired (past not-after date), Revoked (revoked via RFC 5280 revocation API). Dashboard charts show status distribution.
- **Revocation Infrastructure** (M15a, M15b):
- CRL endpoint: `GET /api/v1/crl` (JSON format) or `GET /api/v1/crl/{issuer_id}` (DER X.509 CRL, 24h validity, signed by issuing CA)
- OCSP responder: `GET /api/v1/ocsp/{issuer_id}/{serial}` (returns DER-encoded OCSP response: good/revoked/unknown)
- **Revocation Infrastructure** (M15a, M15b, M-006):
- Revocation API: `POST /api/v1/certificates/{id}/revoke` with RFC 5280 reason codes
- CRL endpoint: `GET /.well-known/pki/crl/{issuer_id}` — DER X.509 CRL, 24h validity, signed by issuing CA, served unauthenticated (RFC 5280 §5, RFC 8615, `Content-Type: application/pkix-crl`)
- OCSP responder: `GET /.well-known/pki/ocsp/{issuer_id}/{serial}` — DER-encoded OCSP response (good/revoked/unknown), served unauthenticated (RFC 6960, `Content-Type: application/ocsp-response`)
- Bulk revocation (V2.2): `POST /api/v1/certificates/bulk-revoke` with filter criteria (profile, owner, agent, issuer) for fleet-wide incident response
- Short-lived cert exemption: certs with TTL < 1 hour skip CRL/OCSP (expiry is sufficient revocation)
- **Stats API** (M14) — Real-time visibility:
@@ -107,7 +109,7 @@ Your QSA will request evidence that your certificate and key management systems
- Discovered certificate report: `GET /api/v1/discovered-certificates` JSON export showing all certs on systems, fingerprints, and status.
- Managed certificate inventory: `GET /api/v1/certificates` with filters (`?status=Expiring` for upcoming renewals).
- Expiration alert configuration: policy JSON showing `alert_thresholds_days` for each environment.
- CRL/OCSP availability proof: HTTP GET requests to `/api/v1/crl` and `/api/v1/ocsp/{issuer}/{serial}` with signed responses.
- CRL/OCSP availability proof: unauthenticated HTTP GET requests to `/.well-known/pki/crl/{issuer_id}` (DER, `application/pkix-crl`) and `/.well-known/pki/ocsp/{issuer_id}/{serial}` (DER, `application/ocsp-response`) with signed responses.
- Audit trail for certificate creation/renewal/revocation: `GET /api/v1/audit?type=certificate_issued,certificate_renewed,certificate_revoked`.
- Dashboard charts showing expiration timeline, renewal success trends, status distribution.
@@ -326,11 +328,14 @@ This requirement covers key generation, storage, rotation, and destruction. Cert
- Issuer notified (best-effort; ACME lacks standard revocation, Local CA skips issuer step).
- Revocation notifications sent to owner via email/webhook/Slack/Teams/PagerDuty.
- **CRL and OCSP Publication** (M15b) — Revoked certificates published in:
- CRL: `GET /api/v1/crl` (JSON format) or `GET /api/v1/crl/{issuer_id}` (DER X.509, signed by CA, 24h validity)
- OCSP: `GET /api/v1/ocsp/{issuer_id}/{serial}` (returns revoked status for clients validating certificate chain)
- **CRL and OCSP Publication** (M15b, M-006) — Revoked certificates published in:
- CRL: `GET /.well-known/pki/crl/{issuer_id}` (DER X.509 signed by CA, 24h validity, RFC 5280 §5 + RFC 8615, `Content-Type: application/pkix-crl`)
- OCSP: `GET /.well-known/pki/ocsp/{issuer_id}/{serial}` (returns revoked status for clients validating certificate chain, RFC 6960, `Content-Type: application/ocsp-response`)
- Both endpoints are served unauthenticated so relying parties (browsers, TLS appliances) without certctl API keys can verify revocation — this is the RFC-compliant PKI model.
- Clients checking certificate status via OCSP or CRL see revoked status within 24 hours.
- **Bulk Revocation for Incident Response** (V2.2) — `POST /api/v1/certificates/bulk-revoke` with filter criteria (profile, owner, agent, issuer) revokes all matching certificates in a single operation. PCI-DSS Req 4 requires rapid response to data transmission security incidents — bulk revocation enables operators to revoke an entire certificate set (e.g., all certs used by a compromised team or endpoint) in minutes rather than hours.
- **Private Key Destruction on Agent** — When certificate renewed or revoked:
- Agent removes old private key file from `CERTCTL_KEY_DIR` when new certificate deployed.
- Job status tracking confirms old key is no longer needed.
@@ -338,8 +343,8 @@ This requirement covers key generation, storage, rotation, and destruction. Cert
**Evidence You Can Provide**:
- Revocation requests: `GET /api/v1/audit?type=certificate_revoked` with RFC 5280 reason codes.
- CRL publication: HTTP GET `/api/v1/crl` and parse JSON to show revoked serial numbers and timestamps.
- OCSP responder validation: Query `GET /api/v1/ocsp/{issuer}/{serial}` for a known-revoked cert; response includes `revoked` status.
- CRL publication: HTTP GET `/.well-known/pki/crl/{issuer_id}` (unauthenticated) returns a DER X.509 CRL — parse with `openssl crl -inform der -noout -text` to show revoked serial numbers, reasons, and timestamps.
- OCSP responder validation: Query `GET /.well-known/pki/ocsp/{issuer_id}/{serial}` (unauthenticated) for a known-revoked cert; response includes `revoked` status and can be parsed with `openssl ocsp` tooling.
- Audit trail: Certificate status transitions (Active → Revoked) recorded in `audit_events`.
**Operator Responsibility**:
@@ -717,12 +722,12 @@ This requirement covers key generation, storage, rotation, and destruction. Cert
| PCI-DSS Requirement | certctl Feature | API/UI Evidence | Database/Config | Audit Trail | Status |
|---|---|---|---|---|---|
| **4.2.1** Strong Crypto | TLS cert issuance, ACME/step-ca/Local CA, RSA 2048+/ECDSA P-256 | `GET /api/v1/certificates` (key_type, key_size) | Certificate profiles | `GET /api/v1/audit?type=certificate_issued` | Available |
| **4.2.2** Cert Inventory & Validation | Managed cert CRUD, discovery (M18b), expiration alerting, CRL/OCSP | `GET /api/v1/certificates`, `GET /api/v1/discovered-certificates`, `GET /api/v1/crl`, `GET /api/v1/ocsp/{issuer}/{serial}` | `managed_certificates`, `discovered_certificates` tables | `GET /api/v1/audit?type=certificate_*` | Available |
| **4.2.2** Cert Inventory & Validation | Managed cert CRUD, discovery (M18b), expiration alerting, CRL/OCSP | `GET /api/v1/certificates`, `GET /api/v1/discovered-certificates`, `GET /.well-known/pki/crl/{issuer_id}`, `GET /.well-known/pki/ocsp/{issuer_id}/{serial}` (both unauthenticated, RFC 5280 / RFC 6960) | `managed_certificates`, `discovered_certificates` tables | `GET /api/v1/audit?type=certificate_*` | Available |
| **3.6** Key Documentation | Profiles, owner/team tracking, issuer config, audit trail | `GET /api/v1/profiles`, `GET /api/v1/issuers`, certificate detail with owner/team | Profiles, certificate owner/team fields, issuer config | `GET /api/v1/audit?resource_type=certificate` | Available |
| **3.7.1** Key Generation | Agent-side ECDSA P-256, server keygen (demo only) | Agent logs, renewal job detail, CSR audit | `CERTCTL_KEYGEN_MODE=agent` (config), job_type=AwaitingCSR | `GET /api/v1/audit?type=certificate_issued` with CSR hash | Available |
| **3.7.2** Key Storage | Agent `/var/lib/certctl/keys` (0600), env var secrets, .env excluded | Deployment manifest (env var refs), agent key dir listing | `.env` file (git-ignored), `CERTCTL_KEY_DIR`, `CERTCTL_CA_KEY_PATH` | No API audit (keys off-platform) | Available |
| **3.7.3** Key Rotation | Auto renewal, expiration thresholds, renewal jobs | Dashboard renewal trends, `GET /api/v1/jobs?type=Renewal`, certificate versions | Renewal policies, certificate version history | `GET /api/v1/audit?type=certificate_renewed` | Available |
| **3.7.4** Key Destruction | Revocation API (RFC 5280), CRL/OCSP, private key cleanup | `POST /api/v1/certificates/{id}/revoke`, `GET /api/v1/crl`, OCSP endpoint | `certificate_revocations` table, CRL publication | `GET /api/v1/audit?type=certificate_revoked` | Available |
| **3.7.4** Key Destruction | Revocation API (RFC 5280), CRL/OCSP, private key cleanup | `POST /api/v1/certificates/{id}/revoke`, unauthenticated `GET /.well-known/pki/crl/{issuer_id}` and `GET /.well-known/pki/ocsp/{issuer_id}/{serial}` | `certificate_revocations` table, CRL publication | `GET /api/v1/audit?type=certificate_revoked` | Available |
| **8.3** Strong Authentication | API key (SHA-256 hash, TLS), GUI login, 401 redirect | GUI login screenshot, API key auth header, TLS cert | API key hash in database | `GET /api/v1/audit` showing API calls | Available |
| **8.6** Acct Management | Credentials out of source, .env excluded, env var config | Code review (no hardcoded secrets), `.gitignore` check | Deployment manifests showing env var refs only | No account lifecycle audit (outside scope) | Available in part |
| **10.2** Audit Logging | API audit middleware (M19), certificate lifecycle events | `GET /api/v1/audit` with filter/pagination | `audit_events` table (every API call) | Real-time via API | Available |
+5 -5
@@ -282,12 +282,13 @@ Each section includes:
- `certificateHold` — temporary revocation (the hold can be lifted by reissue)
- `privilegeWithdrawn` — access rights revoked
Revocation is **immediate** (no approval workflow). The certificate is marked `Revoked` in inventory, an audit event is logged, and optional issuer notification is best-effort. All revoked certs are excluded from active deployments.
- **CRL Endpoint** `GET /api/v1/crl` returns a JSON-formatted Certificate Revocation List (serial, reason, timestamp for each revoked cert). `GET /api/v1/crl/{issuer_id}` returns a DER-encoded X.509 CRL signed by the issuing CA (useful for legacy clients that don't support OCSP).
- **OCSP Responder** `GET /api/v1/ocsp/{issuer_id}/{serial}` returns a signed OCSP response indicating whether a cert is good, revoked, or unknown. Clients (browsers, TLS libraries) query this endpoint to verify cert validity in real-time.
- **CRL Endpoint** `GET /.well-known/pki/crl/{issuer_id}` returns a DER-encoded X.509 CRL signed by the issuing CA (RFC 5280 §5, RFC 8615, `Content-Type: application/pkix-crl`), served unauthenticated for relying parties that don't hold certctl API credentials.
- **OCSP Responder** `GET /.well-known/pki/ocsp/{issuer_id}/{serial}` returns a signed OCSP response indicating whether a cert is good, revoked, or unknown (RFC 6960, `Content-Type: application/ocsp-response`). Also unauthenticated. Clients (browsers, TLS libraries) query this endpoint to verify cert validity in real-time.
- **Revocation Notifications** — When a cert is revoked, notifications are sent to:
- Certificate owner (email)
- Configured webhooks (if you have a SIEM that subscribes)
- Slack/Teams channels (if notifiers are configured)
- **Bulk Revocation for Fleet-Wide Incidents** (V2.2) — `POST /api/v1/certificates/bulk-revoke` with filter criteria (profile, owner, agent, issuer) revokes all matching certificates in a single operation. Essential for incident response: key compromise affecting multiple certs, CA distrust events, decommissioning a team's infrastructure. Each bulk revocation creates individual jobs reusing the existing revocation pipeline, ensuring audit trail and notifications for every certificate.
- **Short-Lived Cert Exemption** — Certificates with TTL < 1 hour (configured in profile) skip CRL/OCSP publication. Expiry is the revocation mechanism for short-lived certs (e.g., Kubernetes pod certs, session tokens).
- **Deployment Rollback** — If a revoked cert is still deployed (shouldn't happen, but race conditions exist), operators can manually redeploy a previous version via the GUI. Rollback is audited.
@@ -302,7 +303,6 @@ Each section includes:
**V3 Enhancement**:
- **Bulk Revocation** — Revoke all certs issued by a specific profile, owner, or agent in a single API call (useful for large-scale incidents like CA compromise)
- **Revocation Automation** — Trigger revocation based on external events (e.g., employee termination, security breach alert from CT Log monitoring)
**Operator Responsibility**:
@@ -460,8 +460,8 @@ Each section includes:
| | Notification Routing | Email, Slack, Teams, PagerDuty, OpsGenie | ✅ | ✅ | Configure notifiers, on-call integration |
| | Deployment Rollback | Redeploy previous cert version via GUI | ✅ | ✅ | Audit rollback decisions |
| **CC7.3** Incident Response | Revocation API (RFC 5280 reasons) | `POST /api/v1/certificates/{id}/revoke` | ✅ | Enhanced (bulk revocation) | Establish incident response policy |
| | CRL Endpoint (JSON + DER) | `GET /api/v1/crl`, `GET /api/v1/crl/{issuer_id}` | ✅ | ✅ | Ensure CRL/OCSP accessible to all clients |
| | OCSP Responder | `GET /api/v1/ocsp/{issuer_id}/{serial}` | ✅ | ✅ | Test revocation in staging |
| | CRL Endpoint (DER, RFC 5280 §5) | `GET /.well-known/pki/crl/{issuer_id}` (unauthenticated, `application/pkix-crl`) | ✅ | ✅ | Ensure CRL/OCSP accessible to all clients without API keys |
| | OCSP Responder (RFC 6960) | `GET /.well-known/pki/ocsp/{issuer_id}/{serial}` (unauthenticated, `application/ocsp-response`) | ✅ | ✅ | Test revocation in staging |
| | Revocation Notifications | Email, webhook, Slack/Teams on revocation | ✅ | ✅ | Integrate into on-call, document justification separately |
| | Short-Lived Cert Exemption | TTL < 1h skip CRL/OCSP | ✅ | ✅ | Configure profiles appropriately |
| **CC7.4** Risk Mitigation | Renewal Job Tracking | Job state machine (Pending → Running → Completed/Failed) | ✅ | ✅ | Monitor renewal success rate |
+21 -7
@@ -123,11 +123,13 @@ At no point does the private key leave the agent. This is a fundamental security
Agents also report **metadata** about themselves — their operating system, CPU architecture, IP address, hostname, and version — with every heartbeat. This gives ops teams fleet-wide visibility (e.g., "how many agents are running on ARM?", "which agents are still on v1.0.0?") and powers **agent groups** — dynamic device grouping where policies can be scoped to specific agent criteria like OS type, architecture, or network subnet.
**Retiring an agent.** When you decommission a server, the certctl record for its agent needs to be retired, not deleted. certctl uses a **soft-delete** model: `DELETE /api/v1/agents/{id}` stamps the row with a retired-at timestamp and a reason, instead of removing it. This is deliberate — an audit trail of "who owned this certificate, on which host, for which team" stays intact forever, and the downstream deployment_targets, certificates, and jobs keep valid foreign keys. Retired agents are filtered out of default list views and the dashboard's agent counter, but remain visible through a separate retired-agents view for compliance reconciliation. If the agent still has active deployment targets, deployed certificates, or pending jobs, retirement is blocked by default so you don't silently orphan those rows; the API responds with the exact counts so you can retire or reassign each dependency explicitly. A force-retire escape hatch (`?force=true&reason=...`) is available for true decommission scenarios — it transactionally retires the downstream targets, cancels pending jobs, and records the cascade in the audit trail with the reason you provided. Four internal sentinel agents that back the network scanner and the cloud secret-manager discovery sources cannot be retired at all, even with force, because retiring them would orphan their subsystems. Once retired, an agent that still attempts to heartbeat receives `410 Gone` — the agent process reads that as "you've been retired, shut down" and exits cleanly.
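The agent-side reaction to retirement can be sketched as a small status classifier. Only the `410 Gone` meaning comes from the text above. The `heartbeatAction` type, the function name, and the treatment of other status codes (retry rather than exit) are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"net/http"
)

// heartbeatAction is a hypothetical type describing what the agent does next.
type heartbeatAction string

const (
	actionContinue heartbeatAction = "continue" // keep heartbeating normally
	actionShutdown heartbeatAction = "shutdown" // agent was retired; exit cleanly
	actionRetry    heartbeatAction = "retry"    // transient problem; try again later
)

// classifyHeartbeat maps a heartbeat response status to the agent's next step.
func classifyHeartbeat(status int) heartbeatAction {
	switch {
	case status == http.StatusGone:
		// 410 Gone: the server has retired this agent record.
		return actionShutdown
	case status >= 200 && status < 300:
		return actionContinue
	default:
		// Assumed behavior: anything else is treated as retryable.
		return actionRetry
	}
}

func main() {
	for _, status := range []int{http.StatusOK, http.StatusGone, http.StatusServiceUnavailable} {
		fmt.Printf("%d -> %s\n", status, classifyHeartbeat(status))
	}
}
```

Treating only `410` as terminal means a brief server outage or a proxy error does not cause agents across the fleet to shut themselves down.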
### Deployment Targets
Targets are the systems where certificates actually get installed — NGINX web servers, Apache httpd servers, HAProxy load balancers, F5 BIG-IP appliances, Microsoft IIS servers. Each target type has a **connector** that knows how to deploy certificates to that specific system (e.g., writing files and reloading NGINX or Apache config, building a combined PEM for HAProxy).
Targets are the systems where certificates actually get installed — NGINX web servers, Apache httpd servers, HAProxy load balancers, Traefik reverse proxies, Caddy servers, Envoy gateways, Postfix/Dovecot mail servers, Microsoft IIS servers, and network appliances. Each target type has a **connector** that knows how to deploy certificates to that specific system (e.g., writing files and reloading NGINX or Apache config, building a combined PEM for HAProxy).
For targets where an agent runs directly on the machine (NGINX, Apache, HAProxy, IIS), the agent deploys certificates locally — no remote access needed. For network appliances where you can't install an agent (F5 BIG-IP, Palo Alto, etc.), a **proxy agent** in the same network zone picks up the deployment job and calls the appliance's API. The server never initiates outbound connections to any target.
For targets where an agent runs directly on the machine (NGINX, Apache, HAProxy, Traefik, Caddy, Envoy, Postfix, Dovecot, IIS), the agent deploys certificates locally — no remote access needed. For network appliances where you can't install an agent (F5 BIG-IP, Palo Alto, etc.), a **proxy agent** in the same network zone picks up the deployment job and calls the appliance's API. The server never initiates outbound connections to any target.
## The Certificate Lifecycle
@@ -183,11 +185,11 @@ Profiles are managed via the API (`/api/v1/profiles`) and the GUI, and can be as
For policies with `auto_renew` disabled, renewal jobs enter an **AwaitingApproval** state instead of processing immediately. An operator must explicitly approve or reject the renewal via the API or GUI. Approved jobs transition to Pending and are picked up by the scheduler. Rejected jobs are cancelled with an optional reason. This is useful for high-value certificates where you want human oversight before renewal.
### Renewal Timing: Thresholds vs. ARI (RFC 9702)
### Renewal Timing: Thresholds vs. ARI (RFC 9773)
**Traditional approach (thresholds):** By default, certctl uses static renewal thresholds — renew a certificate at a fixed number of days before expiry (default: 30 days). This simple, predictable model works for most use cases: it avoids unnecessary renewals near expiry and gives you a predictable window to catch failures.
**Advanced approach (ACME ARI):** Some Certificate Authorities support ACME Renewal Information (RFC 9702), which allows the CA to tell certctl the optimal time to renew. Instead of guessing "renew 30 days before expiry," the CA responds with a precise `suggestedWindow` containing start and end times. This is useful when:
**Advanced approach (ACME ARI):** Some Certificate Authorities support ACME Renewal Information (RFC 9773), which allows the CA to tell certctl the optimal time to renew. Instead of guessing "renew 30 days before expiry," the CA responds with a precise `suggestedWindow` containing start and end times. This is useful when:
- The CA is performing maintenance and wants to batch renewals in a specific window
- The CA is coordinating a mass revocation (e.g., due to a compromise) and needs to control renewal timing
- You want to avoid thundering herd renewal spikes by accepting the CA's suggested timing
@@ -196,6 +198,16 @@ For policies with `auto_renew` disabled, renewal jobs enter an **AwaitingApprova
**Graceful degradation:** If your CA doesn't support ARI (returns 404 from the ARI endpoint), certctl automatically falls back to the traditional threshold-based renewal. No configuration change needed — the fallback is transparent. Errors from the CA are logged as warnings and don't block the renewal process.
### Shorter Certificate Validity (45-Day and 6-Day Certs)
The industry is moving toward shorter certificate lifetimes. The CA/Browser Forum's SC-081v3 ballot mandates a phased reduction: 200-day max (March 2026), 100-day max (March 2027), and 47-day max (March 2029). Let's Encrypt has already begun reducing default validity to 45 days, and offers 6-day "shortlived" certificates via ACME profile selection.
certctl handles shorter-lived certificates correctly out of the box:
- **45-day certs** with the default 31-day renewal window trigger renewal at day 14 — at roughly 1/3 of the cert's lifetime.
- **6-day "shortlived" certs** are always within the default renewal window, because their entire lifetime is shorter than 31 days. ARI (RFC 9773) is the expected renewal path for these, with the CA directing timing. The CRL/OCSP exemption is a separate mechanism: it applies only to certs on profiles with TTL < 1 hour, where expiry is sufficient revocation.
- **ACME profile selection** lets you request specific certificate profiles from your CA. Set `CERTCTL_ACME_PROFILE=shortlived` to get 6-day certificates from Let's Encrypt, or `CERTCTL_ACME_PROFILE=tlsserver` for standard TLS certificates.
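The arithmetic behind the bullets above is simple enough to check directly. In this sketch `renewalStartDay` is a hypothetical helper, and the 31-day window is the default quoted in the first bullet.

```go
package main

import "fmt"

// renewalStartDay returns the day of a certificate's life (counting issuance
// as day 0) on which a static renewal window of windowDays opens.
// A result of 0 means the cert is inside the window from the moment it is issued.
func renewalStartDay(lifetimeDays, windowDays int) int {
	if windowDays >= lifetimeDays {
		return 0
	}
	return lifetimeDays - windowDays
}

func main() {
	const window = 31 // default renewal window quoted above
	for _, lifetime := range []int{90, 45, 6} {
		fmt.Printf("%d-day cert: renewal window opens on day %d\n",
			lifetime, renewalStartDay(lifetime, window))
	}
	// Output:
	// 90-day cert: renewal window opens on day 59
	// 45-day cert: renewal window opens on day 14
	// 6-day cert: renewal window opens on day 0
}
```

For the 45-day case, 14/45 is about 0.31, which is the "roughly 1/3 of the cert's lifetime" figure; the 6-day case shows why a static threshold is uninformative there and ARI is the expected path.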
### Certificate Revocation
When a private key is compromised, a certificate is superseded, or a service is decommissioned, you need to revoke the certificate immediately — not wait for it to expire. Revocation tells clients "stop trusting this certificate right now."
@@ -204,9 +216,11 @@ certctl implements revocation using three complementary mechanisms:
**Revocation API**: `POST /api/v1/certificates/{id}/revoke` marks a certificate as revoked in the inventory, records the revocation in a dedicated `certificate_revocations` table, notifies the issuing CA (best-effort — the revocation succeeds even if the CA is unreachable), creates an audit trail entry, and sends notifications. You can specify an RFC 5280 reason code (keyCompromise, superseded, cessationOfOperation, etc.) or let it default to "unspecified."
**Certificate Revocation List (CRL)**: certctl serves both a JSON-formatted CRL at `GET /api/v1/crl` and DER-encoded X.509 CRLs per issuer at `GET /api/v1/crl/{issuer_id}`. The DER CRL is signed by the issuing CA's key and has 24-hour validity — clients can download it periodically to check revocation status offline.
**Bulk Revocation** (Fleet-Level Incident Response): For large-scale incidents like CA compromise or team infrastructure decommissioning, `POST /api/v1/certificates/bulk-revoke` revokes all certificates matching filter criteria in a single operation. Filter by profile, owner, team, agent group, or issuer to target the affected certificate set. This is essential for incident response — instead of revoking certificates one-by-one, operators can revoke an entire fleet in minutes. Bulk revocation creates individual revocation jobs that reuse the existing revocation pipeline, ensuring every certificate is audited and notifications are sent.
**OCSP Responder**: For real-time revocation checking, certctl includes an embedded OCSP responder at `GET /api/v1/ocsp/{issuer_id}/{serial}`. It returns signed OCSP responses (good, revoked, or unknown) so clients can verify certificate status without downloading the full CRL.
**Certificate Revocation List (CRL)**: certctl serves DER-encoded X.509 CRLs per issuer at `GET /.well-known/pki/crl/{issuer_id}` (RFC 5280 §5 wire format, RFC 8615 well-known namespace). The endpoint is unauthenticated so any relying party — browser, TLS client, hardware appliance — can fetch it without a certctl API key. The CRL is signed by the issuing CA's key and has 24-hour validity; clients can download it periodically to check revocation status offline. The response carries `Content-Type: application/pkix-crl`.
**OCSP Responder**: For real-time revocation checking, certctl includes an embedded OCSP responder at `GET /.well-known/pki/ocsp/{issuer_id}/{serial}` (RFC 6960). Like the CRL endpoint, it is unauthenticated and returns signed OCSP responses (good, revoked, or unknown) with `Content-Type: application/ocsp-response`, so clients can verify certificate status without downloading the full CRL.
Short-lived certificates (those assigned to profiles with TTL under 1 hour) are exempt from CRL and OCSP — their rapid expiry is considered sufficient revocation. This is a deliberate design choice to reduce infrastructure overhead for ephemeral machine-to-machine credentials.
@@ -242,7 +256,7 @@ The CLI supports both table and JSON output formats (`--format table` or `--form
### MCP Server (AI Integration)
certctl includes an MCP (Model Context Protocol) server that exposes 78 MCP tools covering the REST API. This enables AI assistants like Claude, Cursor, and other MCP-compatible tools to interact with your certificate infrastructure using natural language — "show me all expiring certificates," "revoke the VPN cert," or "what agents are offline?"
certctl includes an MCP (Model Context Protocol) server that exposes the entire REST API as MCP tools. This enables AI assistants like Claude, Cursor, and other MCP-compatible tools to interact with your certificate infrastructure using natural language — "show me all expiring certificates," "revoke the VPN cert," or "what agents are offline?"
The MCP server is a separate binary (`cmd/mcp-server/`) that communicates via stdio transport and acts as a stateless HTTP proxy to the certctl REST API. It requires no additional infrastructure — just point it at your certctl server URL and API key.
+520 -38
@@ -11,9 +11,13 @@ Connectors extend certctl to integrate with external systems for certificate iss
- [Built-in: ACME v2 (Let's Encrypt, Sectigo, ZeroSSL)](#built-in-acme-v2-lets-encrypt-sectigo-zerossl)
- [Built-in: step-ca (Smallstep Private CA)](#built-in-step-ca-smallstep-private-ca)
- [OpenSSL / Custom CA](#openssl--custom-ca)
- [Built-in: Vault PKI](#built-in-vault-pki)
- [Built-in: DigiCert CertCentral](#built-in-digicert-certcentral)
- [Built-in: Sectigo SCM](#built-in-sectigo-scm)
- [Built-in: Google CAS](#built-in-google-cas)
- [Built-in: AWS ACM Private CA](#built-in-aws-acm-private-ca)
- [Revocation Across Issuers](#revocation-across-issuers)
- [EST Integration (GetCACertPEM)](#est-integration-getcacertpem)
- [Planned Issuers](#planned-issuers)
- [Building a Custom Issuer](#building-a-custom-issuer)
3. [Target Connector](#target-connector)
- [Interface](#interface-1)
@@ -21,9 +25,15 @@ Connectors extend certctl to integrate with external systems for certificate iss
- [Built-in: Apache httpd](#built-in-apache-httpd)
- [Built-in: HAProxy](#built-in-haproxy)
- [Built-in: Traefik](#built-in-traefik)
- [Built-in: Envoy](#built-in-envoy)
- [Built-in: Postfix / Dovecot](#built-in-postfix--dovecot)
- [Built-in: Caddy](#built-in-caddy)
- [F5 BIG-IP (Interface Only)](#f5-big-ip-interface-only)
- [IIS (Interface Only, Dual-Mode)](#iis-interface-only-dual-mode)
- [F5 BIG-IP (Implemented)](#f5-big-ip-implemented)
- [IIS (Implemented, Dual-Mode)](#iis-implemented-dual-mode)
- [SSH (Agentless Deployment)](#ssh-agentless-deployment)
- [Windows Certificate Store](#windows-certificate-store)
- [Java Keystore (JKS / PKCS#12)](#java-keystore-jks--pkcs12)
- [Kubernetes Secrets](#kubernetes-secrets)
4. [Notifier Connector](#notifier-connector)
- [Interface](#interface-2)
5. [Registering a Connector](#registering-a-connector)
@@ -51,8 +61,8 @@ Connectors extend certctl to integrate with external systems for certificate iss
Three types of connectors:
1. **Issuer Connector** — Obtains certificates from CAs (Local CA with sub-CA support, ACME with HTTP-01 + DNS-01 + DNS-PERSIST-01, step-ca, OpenSSL/Custom CA implemented; additional CA integrations planned)
2. **Target Connector** — Deploys certificates to infrastructure (NGINX, Apache httpd, HAProxy, Traefik, Caddy implemented; F5 via proxy agent, IIS dual-mode interface only; additional cloud and network targets planned)
1. **Issuer Connector** — Obtains certificates from CAs. 9 built-in: Local CA (self-signed + sub-CA), ACME v2 (HTTP-01, DNS-01, DNS-PERSIST-01, ARI, EAB, profile selection), step-ca, OpenSSL/Custom CA, Vault PKI, DigiCert CertCentral, Sectigo SCM, Google CAS, AWS ACM Private CA
2. **Target Connector** — Deploys certificates to infrastructure. 14 built-in: NGINX, Apache httpd, HAProxy, Traefik, Caddy, Envoy, Postfix, Dovecot, IIS (local + WinRM), F5 BIG-IP (proxy agent), SSH (agentless), Windows Certificate Store, Java Keystore, Kubernetes Secrets
3. **Notifier Connector** — Sends alerts about certificate events (Email, Webhooks, Slack, Microsoft Teams, PagerDuty, OpsGenie implemented)
All connectors accept JSON configuration at initialization, support config validation, and are registered in the service layer. Issuer connectors run on the control plane; target connectors run on agents. For network appliances where agents can't be installed, a **proxy agent** in the same network zone handles deployment — the server never initiates outbound connections.
@@ -145,10 +155,12 @@ The Local CA issuer signs certificates using Go's `crypto/x509` library. It supp
**Sub-CA mode:** Loads a CA certificate and private key from disk (`CERTCTL_CA_CERT_PATH` + `CERTCTL_CA_KEY_PATH`). The CA cert is signed by an upstream CA (e.g., ADCS), so all issued certificates chain to the enterprise root trust hierarchy. Clients that already trust the enterprise root automatically trust certctl-issued certs. Supports RSA, ECDSA, and PKCS#8 key formats. If the paths are not set, falls back to self-signed mode. The loaded certificate must have `IsCA=true` and `KeyUsageCertSign`.
**CRL and OCSP support (M15b):** The Local CA supports DER-encoded X.509 CRL generation via `GET /api/v1/crl/{issuer_id}` with 24-hour validity. An embedded OCSP responder at `GET /api/v1/ocsp/{issuer_id}/{serial}` returns signed OCSP responses for issued certificates (good/revoked/unknown status). Certificates with profile TTL < 1 hour automatically skip CRL/OCSP — expiry is treated as sufficient revocation for short-lived credentials.
**CRL and OCSP support (M15b):** The Local CA supports DER-encoded X.509 CRL generation served unauthenticated at `GET /.well-known/pki/crl/{issuer_id}` (RFC 5280 §5, RFC 8615, `Content-Type: application/pkix-crl`) with 24-hour validity. An embedded OCSP responder at `GET /.well-known/pki/ocsp/{issuer_id}/{serial}` (RFC 6960, `Content-Type: application/ocsp-response`) returns signed OCSP responses for issued certificates (good/revoked/unknown status). Both endpoints are reachable by relying parties with no certctl API credentials, which is how standard TLS clients, browsers, and hardware appliances consume these resources. Certificates with profile TTL < 1 hour automatically skip CRL/OCSP — expiry is treated as sufficient revocation for short-lived credentials.
**Extended Key Usage (EKU) support (M27):** The Local CA respects EKU constraints from certificate profiles and adjusts key usage flags accordingly. For S/MIME certificates (emailProtection EKU), it uses `DigitalSignature | ContentCommitment` instead of the TLS default. For TLS certificates (serverAuth/clientAuth EKU), it uses `DigitalSignature | KeyEncipherment`. This enables support for multiple certificate types — TLS, S/MIME, code signing, timestamping — from a single CA.
**MaxTTL enforcement (M11c):** When a certificate profile defines a maximum TTL, the Local CA caps the `NotAfter` field to `min(validity_days, maxTTL)`. This ensures certificates never exceed the profile's configured lifetime regardless of the issuer's `validity_days` setting.
Configuration:
```json
{
@@ -171,7 +183,7 @@ The ACME connector implements the full ACME v2 protocol using Go's `golang.org/x
**DNS-PERSIST-01 (standing record):** Creates a one-time persistent TXT record at `_validation-persist.<domain>` containing the CA's issuer domain and your ACME account URI. Once set, this record authorizes unlimited future certificate issuances without per-renewal DNS updates. Based on [draft-ietf-acme-dns-persist](https://datatracker.ietf.org/doc/draft-ietf-acme-dns-persist/) and CA/Browser Forum ballot SC-088v3. If the CA doesn't offer dns-persist-01 yet, the connector falls back to dns-01 automatically.
**ACME Renewal Information (ARI, RFC 9702):** Instead of using fixed renewal thresholds (e.g., renew 30 days before expiry), certctl can ask the CA when it should renew. Enable with `CERTCTL_ACME_ARI_ENABLED=true`. The ARI protocol lets the CA specify a `suggestedWindow` (start and end times) for when you should renew — useful for distributing load during maintenance windows or coordinating mass revocation scenarios. Cert ID is computed as `base64url(SHA-256(DER cert))`. If the CA doesn't support ARI (404 response), certctl automatically falls back to threshold-based renewal with no operator intervention required.
**ACME Renewal Information (ARI, RFC 9773):** Instead of using fixed renewal thresholds (e.g., renew 30 days before expiry), certctl can ask the CA when it should renew. Enable with `CERTCTL_ACME_ARI_ENABLED=true`. The ARI protocol lets the CA specify a `suggestedWindow` (start and end times) for when you should renew — useful for distributing load during maintenance windows or coordinating mass revocation scenarios. Cert ID is computed as `base64url(SHA-256(DER cert))`. If the CA doesn't support ARI (404 response), certctl automatically falls back to threshold-based renewal with no operator intervention required.
HTTP-01 configuration:
```json
@@ -241,6 +253,9 @@ Environment variables for the default ACME connector:
- `CERTCTL_ACME_DNS_PRESENT_SCRIPT` — Path to DNS record creation script (dns-01 and dns-persist-01)
- `CERTCTL_ACME_DNS_CLEANUP_SCRIPT` — Path to DNS record cleanup script (dns-01 only, not used by dns-persist-01)
- `CERTCTL_ACME_DNS_PERSIST_ISSUER_DOMAIN` — CA issuer domain for persistent record (dns-persist-01 only, e.g., `letsencrypt.org`)
- `CERTCTL_ACME_PROFILE` — Certificate profile for the newOrder request. Let's Encrypt supports `tlsserver` (standard TLS, default) and `shortlived` (6-day certs). Leave empty for the CA's default profile.
**Certificate Profiles:** Let's Encrypt (GA January 2026) supports ACME certificate profile selection. Set `CERTCTL_ACME_PROFILE=shortlived` to request 6-day certificates — ideal for ephemeral workloads where short validity substitutes for revocation. The `tlsserver` profile produces standard TLS certificates. When the profile field is empty (default), the CA uses its default profile, maintaining full backward compatibility.
The connector is registered in the issuer registry under `iss-acme-staging` and `iss-acme-prod`. Use `iss-acme-staging` for Let's Encrypt staging (rate-limit-friendly testing) and `iss-acme-prod` for production certificates.
@@ -272,7 +287,9 @@ Environment variables:
The connector is registered in the issuer registry under `iss-stepca`. step-ca also works with the existing ACME connector (point `iss-acme-*` at step-ca's ACME directory URL for ACME-based issuance).
**Note:** step-ca-issued certificates rely on step-ca's own CRL/OCSP infrastructure. certctl's local CRL/OCSP endpoints (`GET /api/v1/crl/{issuer_id}` and `GET /api/v1/ocsp/{issuer_id}/{serial}`) are populated from step-ca's revocation data if available, but clients should validate against step-ca's endpoints for the authoritative status.
**Note:** step-ca-issued certificates rely on step-ca's own CRL/OCSP infrastructure. certctl's local CRL/OCSP endpoints (`GET /.well-known/pki/crl/{issuer_id}` and `GET /.well-known/pki/ocsp/{issuer_id}/{serial}`, served unauthenticated per RFC 5280 §5 / RFC 6960 / RFC 8615) are populated from step-ca's revocation data if available, but clients should validate against step-ca's endpoints for the authoritative status.
**MaxTTL enforcement (M11c):** When a certificate profile defines a maximum TTL, the step-ca connector caps the `NotAfter` field to ensure the issued certificate does not exceed the profile limit, regardless of the step-ca provisioner's own maximum.
Location: `internal/connector/issuer/stepca/stepca.go`
@@ -301,16 +318,16 @@ Each issuer handles revocation differently:
- **step-ca**: Calls step-ca's `/revoke` API endpoint. Clients should check step-ca's own CRL/OCSP for authoritative status.
- **OpenSSL/Custom CA**: Invokes the configured revoke script (`CERTCTL_OPENSSL_REVOKE_SCRIPT`) with the serial number as an argument.
### EST Integration (GetCACertPEM)
### EST/SCEP Integration (GetCACertPEM)
The `GetCACertPEM()` method returns the PEM-encoded CA certificate chain, used by the EST server's `/.well-known/est/cacerts` endpoint (RFC 7030) to distribute the CA chain to enrolling devices. Each issuer handles this differently:
The `GetCACertPEM()` method returns the PEM-encoded CA certificate chain, used by both the EST server's `/.well-known/est/cacerts` endpoint (RFC 7030) and the SCEP server's `GetCACert` operation (RFC 8894) to distribute the CA chain to enrolling devices. Each issuer handles this differently:
- **Local CA**: Returns the CA certificate PEM (self-signed or sub-CA cert). This is the primary EST issuer.
- **Local CA**: Returns the CA certificate PEM (self-signed or sub-CA cert). This is the primary EST/SCEP issuer.
- **ACME**: Returns error — ACME CAs provide chains per-issuance, not statically.
- **step-ca**: Returns error — step-ca serves its own `/root` endpoint for CA distribution.
- **OpenSSL/Custom CA**: Returns error — custom script-based CAs have no CA cert access through certctl.
Note: EST (Enrollment over Secure Transport) is not a connector — it's a protocol handler (`internal/api/handler/est.go`) that delegates certificate issuance to whichever issuer connector is configured via `CERTCTL_EST_ISSUER_ID`. See the [Architecture Guide](architecture.md#est-server-rfc-7030) for details.
Note: EST and SCEP are not connectors; they are protocol handlers (`internal/api/handler/est.go` and `internal/api/handler/scep.go`) that delegate certificate issuance to whichever issuer connector is configured via `CERTCTL_EST_ISSUER_ID` or `CERTCTL_SCEP_ISSUER_ID`. Both share a common `internal/pkcs7` package for PKCS#7 response encoding. See the [Architecture Guide](architecture.md#est-server-rfc-7030) for details.
### Built-in: Vault PKI
@@ -330,6 +347,8 @@ The connector is registered in the issuer registry under `iss-vault`. Vault issu
**Note:** CRL and OCSP are managed by Vault itself. Clients should validate certificate status against Vault's own CRL/OCSP endpoints (`GET /v1/{mount}/crl` and Vault's OCSP responder). certctl does not generate local CRL/OCSP for Vault-issued certificates. Revocation is recorded locally but Vault is the authoritative source.
**MaxTTL enforcement (M11c):** When a certificate profile defines a maximum TTL, the Vault connector overrides the TTL string in the signing request to ensure the issued certificate does not exceed the profile limit. This is applied before Vault's own role-level max TTL.
Location: `internal/connector/issuer/vault/vault.go`
### Built-in: DigiCert CertCentral
@@ -353,20 +372,143 @@ The connector submits certificate orders to DigiCert's `/order/certificate/creat
Location: `internal/connector/issuer/digicert/digicert.go`
### Coming in V2.2+
### Built-in: Sectigo SCM
The following issuer connectors are planned for future releases:
The Sectigo connector integrates with Sectigo Certificate Manager's REST API for ordering and managing DV, OV, and EV certificates. Like DigiCert, it uses an async order model: submit an enrollment, receive an sslId, then poll for completion.
- **Entrust** — Enterprise CA via Entrust API
- **Sectigo** — Commercial CA integration via Sectigo REST API
- **Google CAS** — Google Cloud Certificate Authority Service
- **AWS ACM Private CA** — AWS-managed private CA
**Configuration:**
Note: ADCS (Active Directory Certificate Services) integration is handled via the **sub-CA mode** of the Local CA issuer, not as a separate connector. certctl operates as a subordinate CA with its signing certificate issued by ADCS, so all certctl-issued certs chain to the enterprise ADCS root. See the Local CA section above.
| Variable | Default | Description |
|----------|---------|-------------|
| `CERTCTL_SECTIGO_CUSTOMER_URI` | — | Sectigo customer URI (organization identifier) |
| `CERTCTL_SECTIGO_LOGIN` | — | API account login |
| `CERTCTL_SECTIGO_PASSWORD` | — | API account password |
| `CERTCTL_SECTIGO_ORG_ID` | — | Organization ID (integer) |
| `CERTCTL_SECTIGO_CERT_TYPE` | — | Certificate type ID (integer, from `/ssl/v1/types`) |
| `CERTCTL_SECTIGO_TERM` | `365` | Certificate validity in days |
| `CERTCTL_SECTIGO_BASE_URL` | `https://cert-manager.com/api` | Sectigo API base URL |
The connector submits certificate enrollments to Sectigo's `/ssl/v1/enroll` API. DV certificates may issue immediately; OV/EV certificates require validation (handled by Sectigo) and poll-based completion. The connector periodically checks enrollment status via `/ssl/v1/{sslId}` and downloads the PEM bundle via `/ssl/v1/collect/{sslId}/pem` when issued.
**Authentication:** Three custom headers on every request — `customerUri`, `login`, and `password`.
**Note:** CRL and OCSP are managed by Sectigo. certctl records revocations locally and notifies Sectigo via `/ssl/v1/revoke/{sslId}`.
Location: `internal/connector/issuer/sectigo/sectigo.go`
### Built-in: Google CAS
Google Cloud Certificate Authority Service — managed private CA on GCP. Synchronous issuance via CAS REST API with OAuth2 service account auth.
| Setting | Required | Default | Description |
|---------|----------|---------|-------------|
| `CERTCTL_GOOGLE_CAS_PROJECT` | Yes | — | GCP project ID |
| `CERTCTL_GOOGLE_CAS_LOCATION` | Yes | — | GCP region (e.g., `us-central1`) |
| `CERTCTL_GOOGLE_CAS_CA_POOL` | Yes | — | CA pool name |
| `CERTCTL_GOOGLE_CAS_CREDENTIALS` | Yes | — | Path to service account JSON |
| `CERTCTL_GOOGLE_CAS_TTL` | No | `8760h` | Default certificate TTL |
**Authentication:** OAuth2 service account. The connector reads a service account JSON file, signs a JWT with the private key, and exchanges it for an access token at Google's token endpoint. Tokens are cached and refreshed automatically (5 min before expiry).
**Note:** CRL and OCSP are managed by Google CAS directly. certctl records revocations locally and notifies Google CAS via the revoke endpoint.
Location: `internal/connector/issuer/googlecas/googlecas.go`
### Built-in: AWS ACM Private CA
AWS Certificate Manager Private Certificate Authority — managed private CA on AWS. Synchronous issuance via ACM PCA API with standard AWS credential chain (env vars, IAM roles, instance profiles, SSO).
| Setting | Required | Default | Description |
|---------|----------|---------|-------------|
| `CERTCTL_AWS_PCA_REGION` | Yes | — | AWS region (e.g., `us-east-1`) |
| `CERTCTL_AWS_PCA_CA_ARN` | Yes | — | ARN of the ACM Private CA |
| `CERTCTL_AWS_PCA_SIGNING_ALGORITHM` | No | `SHA256WITHRSA` | Signing algorithm |
| `CERTCTL_AWS_PCA_VALIDITY_DAYS` | No | `365` | Certificate validity in days |
| `CERTCTL_AWS_PCA_TEMPLATE_ARN` | No | — | Optional certificate template ARN |
**Supported signing algorithms:** SHA256WITHRSA, SHA384WITHRSA, SHA512WITHRSA, SHA256WITHECDSA, SHA384WITHECDSA, SHA512WITHECDSA.
**Authentication:** Standard AWS credential chain. The connector uses `aws-sdk-go-v2/config.LoadDefaultConfig()` which supports environment variables (`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`), IAM roles (EC2/ECS), instance profiles, and SSO credentials.
**Note:** CRL and OCSP are managed by AWS ACM PCA directly. certctl records revocations locally and notifies AWS via the RevokeCertificate API with RFC 5280 reason mapping.
Location: `internal/connector/issuer/awsacmpca/awsacmpca.go`
### Built-in: Entrust Certificate Services
Entrust CA Gateway REST API with mutual TLS (mTLS) client certificate authentication. Supports synchronous issuance (200 OK with PEM) and approval-pending flows (201 response with async polling).
| Setting | Required | Default | Description |
|---------|----------|---------|-------------|
| `CERTCTL_ENTRUST_API_URL` | Yes | — | Entrust CA Gateway base URL |
| `CERTCTL_ENTRUST_CLIENT_CERT_PATH` | Yes | — | Path to mTLS client certificate PEM |
| `CERTCTL_ENTRUST_CLIENT_KEY_PATH` | Yes | — | Path to mTLS client private key PEM |
| `CERTCTL_ENTRUST_CA_ID` | Yes | — | Certificate Authority ID (from `GET /certificate-authorities`) |
| `CERTCTL_ENTRUST_PROFILE_ID` | No | — | Optional enrollment profile ID |
**Authentication:** Mutual TLS — the client certificate and key are loaded via `tls.LoadX509KeyPair()` and attached to the HTTP transport. No API key or token required.
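A minimal sketch of wiring a client certificate into an HTTP client, as the paragraph above describes. The function name, file paths, and the 30-second timeout are placeholders rather than certctl defaults.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
	"time"
)

// newMTLSClient loads a PEM certificate/key pair and attaches it to the
// HTTP transport so it is presented during the TLS handshake.
func newMTLSClient(certPath, keyPath string) (*http.Client, error) {
	pair, err := tls.LoadX509KeyPair(certPath, keyPath)
	if err != nil {
		return nil, fmt.Errorf("load client key pair: %w", err)
	}
	return &http.Client{
		Timeout: 30 * time.Second, // illustrative value
		Transport: &http.Transport{
			TLSClientConfig: &tls.Config{
				Certificates: []tls.Certificate{pair},
			},
		},
	}, nil
}

func main() {
	// Placeholder paths; in practice these would come from
	// CERTCTL_ENTRUST_CLIENT_CERT_PATH and CERTCTL_ENTRUST_CLIENT_KEY_PATH.
	if _, err := newMTLSClient("/path/to/client.pem", "/path/to/client.key"); err != nil {
		fmt.Println("mTLS client not configured:", err)
	}
}
```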
**Issuance model:** Enrollment via `POST /v1/certificate-authorities/{caId}/enrollments`. Returns 200 with PEM immediately for auto-approved enrollments, or 201 with a tracking ID for approval-pending orders. `GetOrderStatus` polls the enrollment endpoint.
**Note:** CRL and OCSP are managed by Entrust. certctl records revocations locally and notifies Entrust via `PUT /v1/certificate-authorities/{caId}/certificates/{serial}/revoke`.
Location: `internal/connector/issuer/entrust/entrust.go`
### Built-in: GlobalSign Atlas HVCA
GlobalSign Atlas High Volume CA REST API with dual authentication: mTLS for the TLS handshake and API key/secret headers for request authorization. Region-aware base URLs (EMEA, APAC, Americas).
| Setting | Required | Default | Description |
|---------|----------|---------|-------------|
| `CERTCTL_GLOBALSIGN_API_URL` | Yes | — | Atlas HVCA API URL (region-specific) |
| `CERTCTL_GLOBALSIGN_API_KEY` | Yes | — | API key for request authentication |
| `CERTCTL_GLOBALSIGN_API_SECRET` | Yes | — | API secret for request authentication |
| `CERTCTL_GLOBALSIGN_CLIENT_CERT_PATH` | Yes | — | Path to mTLS client certificate PEM |
| `CERTCTL_GLOBALSIGN_CLIENT_KEY_PATH` | Yes | — | Path to mTLS client private key PEM |
| `CERTCTL_GLOBALSIGN_SERVER_CA_PATH` | No | system trust store | PEM bundle used to verify the Atlas API server certificate. Set this for private/lab Atlas deployments whose server TLS chain is not in the host's default trust bundle. |
**Authentication:** Dual — mTLS client certificate for TLS handshake plus `X-API-Key` and `X-API-Secret` headers on every request.
**TLS verification:** The connector always verifies the server certificate. When `server_ca_path` is set, the PEM bundle at that path is used as the trust anchor; otherwise the host's system trust store is used. TLS 1.2 is the minimum protocol version.
**Issuance model:** `POST /v2/certificates` returns a serial number. The certificate PEM becomes available once validation completes, typically within seconds for DV. `GetOrderStatus` polls the certificate endpoint.
**Note:** CRL and OCSP are managed by GlobalSign. certctl records revocations locally and notifies GlobalSign via `PUT /v2/certificates/{serial}/revoke`.
Location: `internal/connector/issuer/globalsign/globalsign.go`
### Built-in: EJBCA (Keyfactor)
EJBCA REST API for self-hosted open-source and enterprise CAs. Supports dual authentication: mTLS (default) or OAuth2 Bearer token, selectable via configuration.
| Setting | Required | Default | Description |
|---------|----------|---------|-------------|
| `CERTCTL_EJBCA_API_URL` | Yes | — | EJBCA REST API base URL |
| `CERTCTL_EJBCA_AUTH_MODE` | No | `mtls` | Auth mode: `mtls` or `oauth2` |
| `CERTCTL_EJBCA_CLIENT_CERT_PATH` | mTLS | — | Path to client certificate PEM (mTLS mode) |
| `CERTCTL_EJBCA_CLIENT_KEY_PATH` | mTLS | — | Path to client key PEM (mTLS mode) |
| `CERTCTL_EJBCA_TOKEN` | OAuth2 | — | Bearer token (oauth2 mode) |
| `CERTCTL_EJBCA_CA_NAME` | Yes | — | EJBCA CA name |
| `CERTCTL_EJBCA_CERT_PROFILE` | No | — | EJBCA certificate profile |
| `CERTCTL_EJBCA_EE_PROFILE` | No | — | EJBCA end-entity profile |
**Authentication:** Configurable via `auth_mode`. In mTLS mode, client certificate and key are loaded for the TLS handshake. In OAuth2 mode, the token is sent as `Authorization: Bearer {token}`.
**Issuance model:** `POST /v1/certificate/pkcs10enroll` with base64-encoded CSR. Returns base64-encoded certificate PEM. EJBCA 9.3+ creates end-entity and issues cert in a single call. Approval-pending enrollments return 201.
**Revocation note:** EJBCA requires both issuer DN and serial number for revocation. The connector stores these as a composite `OrderID` in `issuer_dn::serial` format.
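Encoding and decoding the composite `issuer_dn::serial` value might look like the sketch below. The helper names are hypothetical, and splitting on the last `::` is a design assumption that tolerates an issuer DN which itself contains that sequence.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

const orderIDSep = "::"

// encodeOrderID joins the issuer DN and serial into the composite form.
func encodeOrderID(issuerDN, serial string) string {
	return issuerDN + orderIDSep + serial
}

// decodeOrderID splits on the last separator so that a DN containing "::"
// does not corrupt the serial.
func decodeOrderID(id string) (issuerDN, serial string, err error) {
	i := strings.LastIndex(id, orderIDSep)
	if i <= 0 || i+len(orderIDSep) >= len(id) {
		return "", "", errors.New("order ID is not in issuer_dn::serial form")
	}
	return id[:i], id[i+len(orderIDSep):], nil
}

func main() {
	id := encodeOrderID("CN=Issuing CA,O=Example", "0A1B2C")
	dn, serial, err := decodeOrderID(id)
	fmt.Println(dn, serial, err)
}
```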
**Note:** CRL and OCSP are managed by the EJBCA instance. certctl records revocations locally and notifies EJBCA via `PUT /v1/certificate/{issuer_dn}/{serial}/revoke`.
Location: `internal/connector/issuer/ejbca/ejbca.go`
### ADCS Integration
Active Directory Certificate Services integration is handled via the **sub-CA mode** of the Local CA issuer, not as a separate connector. certctl operates as a subordinate CA with its signing certificate issued by ADCS, so all certctl-issued certs chain to the enterprise ADCS root. See the Local CA section above.
### Building a Custom Issuer
Here's the structure for a HashiCorp Vault PKI issuer:
Here's a simplified example showing the connector pattern (using a hypothetical Vault-like CA):
```go
package vault
@@ -590,51 +732,334 @@ When `mode` is `"api"`, the connector posts the certificate to the admin API end
Location: `internal/connector/target/caddy/caddy.go`
### F5 BIG-IP (Interface Only)
### Built-in: Envoy
The F5 BIG-IP target connector interface is defined with the iControl REST flow mapped out, but the actual API calls are not yet implemented. F5 appliances can't run agents directly, so this connector uses the **proxy agent pattern**: a designated agent in the same network zone picks up F5 deployment jobs and calls the iControl REST API. The server assigns the work; the proxy agent executes it.
The Envoy connector uses file-based certificate delivery — it writes certificate and key files to a directory that Envoy watches via its SDS (Secret Discovery Service) file-based configuration or static `filename` references in the bootstrap config. When files change, Envoy automatically picks up the new certificates without requiring a reload command.
The planned flow is: authenticate via `POST /mgmt/shared/authn/login`, upload cert PEM via `POST /mgmt/tm/ltm/certificate`, update the SSL profile via `PATCH /mgmt/tm/ltm/profile/client-ssl/{profile}`, and validate deployment by checking profile status.
Configuration:
```json
{
"cert_dir": "/etc/envoy/certs",
"cert_filename": "cert.pem",
"key_filename": "key.pem",
"chain_filename": "chain.pem",
"sds_config": true
}
```
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `cert_dir` | string | (required) | Directory where Envoy watches for certificate files |
| `cert_filename` | string | `cert.pem` | Filename for the certificate (leaf + chain unless `chain_filename` is set) |
| `key_filename` | string | `key.pem` | Filename for the private key |
| `chain_filename` | string | (empty) | If set, chain is written to a separate file instead of appended to the cert |
| `sds_config` | bool | `false` | If true, writes an `sds.json` file for Envoy's file-based SDS provider |
When `sds_config` is `true`, the connector writes an SDS JSON file (`{cert_dir}/sds.json`) containing a `tls_certificate` resource that points to the cert and key file paths. Envoy's file-based SDS (`path_config_source`) watches this file for changes, providing automatic hot-reload of certificates. This is the recommended approach for production Envoy deployments using dynamic TLS configuration.
When `sds_config` is `false` (the default), the connector simply writes cert and key files. Use this mode when Envoy's bootstrap config references the cert/key files directly via static `filename` fields in the TLS context.
Location: `internal/connector/target/envoy/envoy.go`
### Built-in: Postfix / Dovecot
The Postfix/Dovecot connector is a dual-mode mail server TLS connector. It writes certificate, key, and chain files to configured paths and reloads the mail service. The `mode` field selects between Postfix MTA and Dovecot IMAP/POP3, which determines default file paths and reload commands.
This connector pairs with certctl's S/MIME certificate support (email protection EKU, email SAN routing) for a complete email infrastructure story — TLS for transport encryption, S/MIME for end-to-end message signing and encryption.
**Postfix configuration:**
```json
{
"mode": "postfix",
"cert_path": "/etc/postfix/certs/cert.pem",
"key_path": "/etc/postfix/certs/key.pem",
"chain_path": "/etc/postfix/certs/chain.pem",
"reload_command": "postfix reload",
"validate_command": "postfix check"
}
```
**Dovecot configuration:**
```json
{
"mode": "dovecot",
"cert_path": "/etc/dovecot/certs/cert.pem",
"key_path": "/etc/dovecot/certs/key.pem",
"chain_path": "/etc/dovecot/certs/chain.pem",
"reload_command": "doveadm reload",
"validate_command": "doveconf -n"
}
```
| Field | Type | Default (Postfix) | Default (Dovecot) | Description |
|-------|------|-------------------|-------------------|-------------|
| `mode` | string | `postfix` | `dovecot` | Service mode — determines defaults |
| `cert_path` | string | `/etc/postfix/certs/cert.pem` | `/etc/dovecot/certs/cert.pem` | Path for certificate file |
| `key_path` | string | `/etc/postfix/certs/key.pem` | `/etc/dovecot/certs/key.pem` | Path for private key (0600 permissions) |
| `chain_path` | string | (empty) | (empty) | If set, chain written separately; otherwise appended to cert |
| `reload_command` | string | `postfix reload` | `doveadm reload` | Command to reload the mail service |
| `validate_command` | string | `postfix check` | `doveconf -n` | Optional config validation before reload |
All commands are validated against shell injection via `validation.ValidateShellCommand()`. File permissions: cert/chain 0644, key 0600.
Location: `internal/connector/target/postfix/postfix.go`
### F5 BIG-IP (Implemented)
The F5 BIG-IP target connector deploys certificates to F5 load balancers via the iControl REST API. F5 appliances can't run agents directly, so this connector uses the **proxy agent pattern**: a designated certctl agent in the same network zone polls for F5 deployment jobs and executes iControl REST calls on behalf of the control plane. Minimum supported BIG-IP version: 12.0.
The deployment flow uses F5's transaction API for atomic updates: authenticate via token auth, upload cert/key/chain PEM files, install as crypto objects, update the SSL client profile within a transaction, and commit. If the transaction fails, F5 rolls back automatically and the connector cleans up uploaded crypto objects. Updating an SSL profile automatically takes effect on all bound virtual servers — no separate virtual server binding step is needed.
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `host` | string | *(required)* | F5 BIG-IP management hostname or IP |
| `port` | int | `443` | iControl REST API port |
| `username` | string | *(required)* | Administrative username |
| `password` | string | *(required)* | Administrative password |
| `partition` | string | `Common` | F5 partition for crypto objects and profiles |
| `ssl_profile` | string | *(required)* | SSL client profile name to update |
| `insecure` | bool | `true` | Skip TLS verification for management interface (self-signed certs common) |
| `timeout` | int | `30` | HTTP timeout in seconds |
Configuration (defined, not yet functional):
```json
{
"host": "f5.internal.example.com",
"port": 443,
"username": "admin",
"password": "...",
"partition": "Common",
"ssl_profile": "/Common/clientssl_api"
"ssl_profile": "clientssl_api",
"insecure": true,
"timeout": 30
}
```
Note: F5 credentials are stored on the proxy agent, not on the control plane server. This limits the credential blast radius to the proxy agent's network zone.
F5 credentials are stored on the proxy agent, not on the control plane server. This limits the credential blast radius to the proxy agent's network zone. Config fields are validated against regex patterns to prevent injection.
Location: `internal/connector/target/f5/f5.go`
### IIS (Interface Only, Dual-Mode)
### IIS (Implemented, Dual-Mode)
The IIS target connector supports two planned deployment modes:
The IIS target connector supports two deployment modes — agent-local (recommended) and proxy agent WinRM for agentless targets.
**Agent-local (recommended):** A Windows agent runs directly on the IIS server and deploys certificates using PowerShell — `Import-PfxCertificate` to install into the certificate store and `Set-WebBinding` to bind to the IIS site. This is the preferred approach: no remote access needed, no credential management, same pull-based model as NGINX/Apache/HAProxy.
**Agent-local (recommended):** A Windows agent runs directly on the IIS server and deploys certificates using PowerShell — `Import-PfxCertificate` to install into the certificate store and `Set-WebBinding` to bind to the IIS site. The agent handles PEM-to-PFX conversion via `go-pkcs12`, computes SHA-1 thumbprint from the certificate, and executes parameterized PowerShell scripts for injection-safe binding management. This is the preferred approach: no remote access needed, no credential management, same pull-based model as NGINX/Apache/HAProxy.
**Proxy agent WinRM (for agentless targets):** For Windows servers where you don't want to install an agent, a nearby Windows agent acts as a proxy and reaches the IIS box via WinRM. The proxy agent picks up the deployment job, transfers the PFX bundle over WinRM, and runs the PowerShell commands remotely. WinRM credentials are stored on the proxy agent, not on the control plane.
**Proxy agent WinRM (for agentless targets):** For Windows servers where you don't want to install an agent, a Linux or Windows proxy agent in the same network zone connects via WinRM (Windows Remote Management) and executes PowerShell commands remotely. The PFX bundle is base64-encoded, transferred inline in the WinRM session, decoded to a temp file on the remote host, imported, and the temp file is cleaned up in a `try/finally` block. WinRM credentials are configured on the target, not on the control plane. Uses the `masterzen/winrm` Go library with support for Basic, NTLM, and Kerberos authentication.
Configuration (defined, not yet functional):
**Agent-local configuration:**
```json
{
"mode": "local",
"hostname": "iis-server.example.com",
"site_name": "Default Web Site",
"cert_store": "WebHosting",
"winrm_host": "",
"winrm_username": "",
"winrm_password": "",
"winrm_use_https": true
"port": 443,
"sni": true,
"ip_address": "*",
"binding_info": "www.example.com"
}
```
When `mode` is `"local"`, the `winrm_*` fields are ignored. When `mode` is `"proxy"`, the agent connects to the remote IIS server via WinRM using the provided credentials.
**WinRM proxy configuration:**
```json
{
"hostname": "iis-server.example.com",
"site_name": "Default Web Site",
"cert_store": "WebHosting",
"port": 443,
"sni": true,
"ip_address": "*",
"mode": "winrm",
"winrm": {
"winrm_host": "iis-server.example.com",
"winrm_port": 5985,
"winrm_username": "Administrator",
"winrm_password": "...",
"winrm_https": false,
"winrm_insecure": false,
"winrm_timeout": 60
}
}
```
Location: `internal/connector/target/iis/iis.go`
**Configuration Fields:**
- `hostname` (string, required): IIS server hostname or FQDN
- `site_name` (string, required): IIS website name (e.g., "Default Web Site")
- `cert_store` (string, required): Certificate store for import (e.g., "WebHosting", "My")
- `port` (number, default 443): HTTPS binding port
- `sni` (boolean, default false): Enable Server Name Indication (SNI)
- `ip_address` (string, default "*"): Specific IP to bind to, or "*" for all IPs
- `binding_info` (string, optional): Host header for SNI bindings
- `mode` (string, default "local"): Deployment mode — `local` (agent-local PowerShell) or `winrm` (remote via WinRM)
**WinRM fields (required when `mode` is `winrm`):**
- `winrm.winrm_host` (string, required): Remote Windows server hostname or IP
- `winrm.winrm_port` (number, default 5985 HTTP / 5986 HTTPS): WinRM listener port
- `winrm.winrm_username` (string, required): Windows account with admin privileges
- `winrm.winrm_password` (string, required): Account password
- `winrm.winrm_https` (boolean, default false): Use HTTPS transport
- `winrm.winrm_insecure` (boolean, default false): Skip TLS certificate verification
- `winrm.winrm_timeout` (number, default 60): Operation timeout in seconds
**Security Model:**
- PFX files are transient — generated with random passwords, deleted after import
- In WinRM mode, PFX data is base64-encoded and transferred inline (no SMB/file share needed), with remote temp file cleanup in `try/finally`
- PowerShell commands use parameterized values — IIS names and cert stores are regex-validated before script execution
- Field names are validated against `^[a-zA-Z0-9 _\-\.]+$` to prevent PowerShell injection
- Certificate thumbprints computed via SHA-1 for IIS binding lookups
Location: `internal/connector/target/iis/iis.go`, `internal/connector/target/iis/winrm.go`
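Two of the points in the security model, allow-list validation and SHA-1 thumbprints, can be sketched in a few lines of Go. This is an illustrative sketch only: the function names are hypothetical, and the real connector may structure its validation differently. The regular expression is the one quoted above.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
	"regexp"
	"strings"
)

// safeName is the allow-list pattern quoted in the security model.
var safeName = regexp.MustCompile(`^[a-zA-Z0-9 _\-\.]+$`)

// validateName rejects any value containing characters outside the
// allow-list, so quotes, semicolons, and backticks never reach PowerShell.
func validateName(field, value string) error {
	if !safeName.MatchString(value) {
		return fmt.Errorf("%s contains disallowed characters: %q", field, value)
	}
	return nil
}

// thumbprint returns the uppercase hex SHA-1 digest of a DER-encoded
// certificate, the form Windows uses for certificate thumbprints.
func thumbprint(der []byte) string {
	sum := sha1.Sum(der)
	return strings.ToUpper(hex.EncodeToString(sum[:]))
}

func main() {
	fmt.Println(validateName("site_name", "Default Web Site"))
	fmt.Println(validateName("site_name", `x"; Remove-Item C:\ -Recurse`))
	// Placeholder bytes; a real caller would pass the certificate's DER encoding.
	fmt.Println(thumbprint([]byte("abc")))
}
```

SHA-1 is used here purely as an identifier for binding lookups, not as a security primitive, which is why its known collision weaknesses are not a concern in this role.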
### SSH (Agentless Deployment)
The SSH target connector enables agentless certificate deployment to any Linux/Unix server via SSH/SFTP. Instead of installing the certctl agent binary on every target, a single "proxy agent" in the same network zone deploys certificates to remote servers over SSH. This is ideal for environments where installing agents on every server is impractical.
**Key authentication (recommended):**
```json
{
"host": "web-server.internal",
"port": 22,
"user": "certctl",
"auth_method": "key",
"private_key_path": "/home/certctl/.ssh/id_ed25519",
"cert_path": "/etc/ssl/certs/cert.pem",
"key_path": "/etc/ssl/private/key.pem",
"chain_path": "/etc/ssl/certs/chain.pem",
"reload_command": "systemctl reload nginx",
"timeout": 30
}
```
**Password authentication:**
```json
{
"host": "legacy-server.internal",
"user": "deploy",
"auth_method": "password",
"password": "s3cret",
"cert_path": "/etc/ssl/cert.pem",
"key_path": "/etc/ssl/key.pem",
"reload_command": "systemctl reload apache2"
}
```
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `host` | string | *(required)* | SSH hostname or IP address |
| `port` | number | 22 | SSH port |
| `user` | string | *(required)* | SSH username |
| `auth_method` | string | `"key"` | `"key"` or `"password"` |
| `private_key_path` | string | | Path to SSH private key file (key auth) |
| `private_key` | string | | Inline SSH private key PEM (alternative to path) |
| `password` | string | | SSH password (password auth) |
| `passphrase` | string | | Passphrase for encrypted private keys |
| `cert_path` | string | *(required)* | Remote path for certificate file |
| `key_path` | string | *(required)* | Remote path for private key file |
| `chain_path` | string | | Remote path for chain file (if empty, chain appended to cert) |
| `cert_mode` | string | `"0644"` | File permissions for cert (octal) |
| `key_mode` | string | `"0600"` | File permissions for private key (octal) |
| `reload_command` | string | | Command to execute after deployment |
| `timeout` | number | 30 | SSH connection timeout in seconds |
**Security:**
- Key-based authentication is recommended over password authentication
- Reload commands are validated against shell injection (same validation as Postfix/Dovecot connectors)
- Host field is regex-validated to prevent shell metacharacters
- Private keys are written with 0600 permissions by default
- Host key verification is intentionally skipped (same rationale as network scanner and F5 connector — deploying to known, operator-configured infrastructure)
- Encrypted private keys supported via passphrase
Location: `internal/connector/target/ssh/ssh.go`
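The `cert_mode` and `key_mode` fields are octal strings, so they need parsing before they can be applied to remote files. The sketch below shows one way to do that with defaults matching the table above; `parseMode` is a hypothetical helper, not a function from the certctl source.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// parseMode converts an octal permission string such as "0600" into an
// os.FileMode, falling back to def when the field is left empty.
func parseMode(s string, def os.FileMode) (os.FileMode, error) {
	if s == "" {
		return def, nil
	}
	n, err := strconv.ParseUint(s, 8, 32)
	if err != nil {
		return 0, fmt.Errorf("invalid octal mode %q: %w", s, err)
	}
	if n > 0o777 {
		return 0, fmt.Errorf("mode %q sets bits outside the rwx permission range", s)
	}
	return os.FileMode(n), nil
}

func main() {
	certMode, _ := parseMode("", 0o644)     // empty field: documented default
	keyMode, _ := parseMode("0600", 0o600)  // explicit value
	fmt.Println(certMode, keyMode)
}
```

Rejecting values above `0777` is a design choice in this sketch: it keeps setuid, setgid, and sticky bits from being set on certificate material by a typo.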
### Windows Certificate Store
The Windows Certificate Store connector imports certificates into the Windows cert store via PowerShell, without managing IIS site bindings. Use this for non-IIS Windows services that read certificates from the cert store (Exchange, RDP, SQL Server, ADFS, etc.). It uses the same injectable `PowerShellExecutor` pattern as the IIS connector and supports an optional WinRM proxy mode.
```json
{
"store_name": "My",
"store_location": "LocalMachine",
"friendly_name": "Production API Cert",
"remove_expired": true
}
```
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `store_name` | string | `"My"` | Windows cert store name (My, Root, WebHosting, etc.) |
| `store_location` | string | `"LocalMachine"` | `"LocalMachine"` or `"CurrentUser"` |
| `friendly_name` | string | | Optional friendly name for the imported certificate |
| `remove_expired` | boolean | `false` | Remove expired certs with same CN after import |
| `mode` | string | `"local"` | `"local"` (agent-local) or `"winrm"` (remote) |
| `winrm_host` | string | | WinRM hostname (required for winrm mode) |
| `winrm_port` | number | 5985 | WinRM port (5985 HTTP, 5986 HTTPS) |
| `winrm_username` | string | | WinRM username (required for winrm mode) |
| `winrm_password` | string | | WinRM password (required for winrm mode) |
| `winrm_https` | boolean | `false` | Use HTTPS for WinRM |
| `winrm_insecure` | boolean | `false` | Skip TLS verification for WinRM |
Location: `internal/connector/target/wincertstore/wincertstore.go`
### Java Keystore (JKS / PKCS#12)
The Java Keystore connector deploys certificates to JKS or PKCS#12 keystores via the `keytool` CLI. This enables TLS certificate deployment for Tomcat, Jetty, Kafka, Elasticsearch, and any other JVM-based service. The connector first converts the PEM material into a temporary PKCS#12 file, then runs `keytool -importkeystore` to import it into the target keystore.
```json
{
"keystore_path": "/opt/tomcat/conf/keystore.p12",
"keystore_password": "changeit",
"keystore_type": "PKCS12",
"alias": "server",
"reload_command": "systemctl restart tomcat"
}
```
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `keystore_path` | string | *(required)* | Absolute path to the keystore file |
| `keystore_password` | string | *(required)* | Keystore password |
| `keystore_type` | string | `"PKCS12"` | `"PKCS12"` or `"JKS"` |
| `alias` | string | `"server"` | Key entry alias in the keystore |
| `reload_command` | string | | Optional command to run after keystore update |
| `create_keystore` | boolean | `true` | Create keystore if it doesn't exist |
| `keytool_path` | string | `"keytool"` | Override keytool binary path |
**Security:**
- Reload commands validated against shell injection via `validation.ValidateShellCommand()`
- Alias validated against injection (alphanumeric, hyphens, underscores only)
- Path traversal prevention on keystore path
- Transient PKCS#12 temp file cleaned up after import (even on error)
Location: `internal/connector/target/javakeystore/javakeystore.go`
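As a rough illustration of the import step, the sketch below validates the alias and assembles a `keytool -importkeystore` argument list. The flags are standard `keytool` options; the function name, parameter layout, and alias pattern are assumptions made for this example rather than the connector's actual implementation.

```go
package main

import (
	"fmt"
	"regexp"
)

// aliasPattern allows alphanumerics, hyphens, and underscores only,
// as described in the security notes above.
var aliasPattern = regexp.MustCompile(`^[a-zA-Z0-9_-]+$`)

// importArgs builds the keytool arguments that copy one entry from a
// temporary PKCS#12 file into the destination keystore.
func importArgs(tmpP12, tmpPass, dest, destPass, destType, alias string) ([]string, error) {
	if !aliasPattern.MatchString(alias) {
		return nil, fmt.Errorf("invalid alias %q", alias)
	}
	if destType != "PKCS12" && destType != "JKS" {
		return nil, fmt.Errorf("unsupported keystore type %q", destType)
	}
	return []string{
		"-importkeystore", "-noprompt",
		"-srckeystore", tmpP12, "-srcstoretype", "PKCS12", "-srcstorepass", tmpPass,
		"-destkeystore", dest, "-deststoretype", destType, "-deststorepass", destPass,
		"-srcalias", alias, "-destalias", alias,
	}, nil
}

func main() {
	args, err := importArgs("/tmp/certctl-example.p12", "tmp-pass",
		"/opt/tomcat/conf/keystore.p12", "changeit", "PKCS12", "server")
	if err != nil {
		panic(err)
	}
	// Passing the slice to exec.Command("keytool", args...) avoids a shell,
	// so the values are never subject to shell interpretation.
	fmt.Println(args)
}
```

Note that passwords supplied as command-line arguments are visible in the process list while `keytool` runs; whether that is acceptable depends on who else has access to the host.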
### Kubernetes Secrets
The Kubernetes Secrets connector deploys certificates as `kubernetes.io/tls` Secrets, compatible with Ingress controllers (nginx-ingress, Traefik, HAProxy), service meshes (Istio, Linkerd), and any Kubernetes workload that reads TLS Secrets.
```json
{
"namespace": "production",
"secret_name": "api-tls",
"labels": {"app": "api-gateway"},
"kubeconfig_path": "/home/agent/.kube/config"
}
```
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `namespace` | string | *(required)* | Kubernetes namespace (DNS-1123, max 63 chars) |
| `secret_name` | string | *(required)* | Secret name (DNS subdomain, max 253 chars) |
| `labels` | object | | Additional labels to apply to the Secret |
| `kubeconfig_path` | string | | Path to kubeconfig for out-of-cluster agents |
**Deployment modes:**
- **In-cluster (default):** Agent runs as a Pod with a ServiceAccount. Authentication via auto-mounted token. Requires RBAC (`secrets.get`, `secrets.create`, `secrets.update`, `secrets.list`) — see Helm chart.
- **Out-of-cluster:** Agent runs outside the cluster with `kubeconfig_path` pointing to a kubeconfig file. Useful for the proxy agent pattern.
**Secret format:** Standard `kubernetes.io/tls` with `tls.crt` (cert + chain PEM) and `tls.key` (private key PEM). Managed labels (`app.kubernetes.io/managed-by: certctl`) and annotations (`certctl.io/deployed-at`, `certctl.io/certificate-id`) are applied automatically.
**Validation:** After deployment, the connector reads the Secret back and compares the certificate serial number to verify successful deployment.
Location: `internal/connector/target/k8ssecret/k8ssecret.go`
## Notifier Connector
@@ -974,6 +1399,63 @@ When `CERTCTL_NETWORK_SCAN_ENABLED=true`, the server runs a 6th scheduler loop (
- **Migration assessment** — Scan a network range before onboarding to certctl management
- **Expiration monitoring** — Discover soon-to-expire certs on network endpoints before they cause outages
## Cloud Secret Manager Discovery
certctl extends the existing filesystem and network discovery pipeline to cloud secret managers. Certificates stored in cloud vaults are automatically discovered, inventoried, and available for triage in the Discovery page.
Each cloud source runs as a pluggable `DiscoverySource` with its own sentinel agent ID. Discovered certificates flow through the same `ProcessDiscoveryReport` pipeline used by filesystem and network discovery — dedup by fingerprint, audit trail, status tracking.
### AWS Secrets Manager
Discovers certificates stored as secrets in AWS Secrets Manager. Filters by tag (`type=certificate` by default) and optional name prefix.
| Variable | Description | Default |
|---|---|---|
| `CERTCTL_CLOUD_DISCOVERY_ENABLED` | Enable cloud discovery scheduler | `false` |
| `CERTCTL_AWS_SM_DISCOVERY_ENABLED` | Enable AWS SM source | `false` |
| `CERTCTL_AWS_SM_REGION` | AWS region (e.g., `us-east-1`) | — |
| `CERTCTL_AWS_SM_TAG_FILTER` | Tag key=value filter | `type=certificate` |
| `CERTCTL_AWS_SM_NAME_PREFIX` | Secret name prefix filter | — |
Source path format: `aws-sm://{region}/{secret-name}`. Sentinel agent: `cloud-aws-sm`.
### Azure Key Vault
Discovers certificates from Azure Key Vault using OAuth2 client credentials authentication. No Azure SDK dependency — uses stdlib HTTP with Azure AD token exchange.
| Variable | Description | Default |
|---|---|---|
| `CERTCTL_AZURE_KV_DISCOVERY_ENABLED` | Enable Azure KV source | `false` |
| `CERTCTL_AZURE_KV_VAULT_URL` | Vault URL (e.g., `https://myvault.vault.azure.net`) | — |
| `CERTCTL_AZURE_KV_TENANT_ID` | Azure AD tenant ID | — |
| `CERTCTL_AZURE_KV_CLIENT_ID` | Azure AD application (client) ID | — |
| `CERTCTL_AZURE_KV_CLIENT_SECRET` | Azure AD application secret | — |
Source path format: `azure-kv://{cert-name}/{version}`. Sentinel agent: `cloud-azure-kv`.
### GCP Secret Manager
Discovers certificates stored in GCP Secret Manager. Filters by label (`type=certificate`). Uses JWT-based OAuth2 service account auth — no Google SDK dependency.
| Variable | Description | Default |
|---|---|---|
| `CERTCTL_GCP_SM_DISCOVERY_ENABLED` | Enable GCP SM source | `false` |
| `CERTCTL_GCP_SM_PROJECT` | GCP project ID | — |
| `CERTCTL_GCP_SM_CREDENTIALS` | Path to service account JSON file | — |
Source path format: `gcp-sm://{project}/{secret-name}`. Sentinel agent: `cloud-gcp-sm`.
### Cloud Discovery Scheduler
All enabled cloud sources run on a shared scheduler loop (9th loop). The interval is configurable:
| Variable | Description | Default |
|---|---|---|
| `CERTCTL_CLOUD_DISCOVERY_ENABLED` | Master switch | `false` |
| `CERTCTL_CLOUD_DISCOVERY_INTERVAL` | Scan interval | `6h` |
The loop runs immediately on startup and then on each tick. Each source runs sequentially within the loop. Errors from one source do not prevent other sources from running.
## What's Next
- [Architecture Guide](architecture.md) — Understanding the full system design
+14 -12
@@ -307,8 +307,8 @@ flowchart TD
A --> F["ACME\n(Let's Encrypt)"]
A --> G["step-ca\n(implemented)"]
A --> H["OpenSSL / Custom CA\n(script-based)"]
A --> J["DigiCert API\n(planned)"]
A --> K["Vault PKI\n(planned)"]
A --> J["DigiCert API\n(implemented)"]
A --> K["Vault PKI\n(implemented)"]
A --> L["Entrust / GlobalSign\n(planned)"]
A --> M["Google CAS / EJBCA\n(planned)"]
```
@@ -724,22 +724,24 @@ curl -s -X POST $API/api/v1/certificates/mc-demo-payments/revoke \
6. Creates an audit trail entry
7. Sends revocation notifications via configured channels
Check the CRL (Certificate Revocation List):
Check the CRL (Certificate Revocation List) — served unauthenticated under the RFC 8615 well-known namespace so relying parties without a certctl API key can still verify revocation (RFC 5280 §5):
```bash
# JSON-formatted CRL
curl -s $API/api/v1/crl | jq .
# DER-encoded X.509 CRL for the local CA (binary — pipe to openssl for inspection)
curl -s $API/api/v1/crl/iss-local -o /tmp/crl.der
# DER-encoded X.509 CRL for the local CA (binary — pipe to openssl for inspection).
# Note: no -H "Authorization: Bearer ..." — the endpoint is deliberately
# unauthenticated. Content-Type is application/pkix-crl.
curl -s http://localhost:8443/.well-known/pki/crl/iss-local -o /tmp/crl.der
openssl crl -inform DER -in /tmp/crl.der -text -noout
```
Check OCSP status:
Check OCSP status (RFC 6960, also unauthenticated, `application/ocsp-response`):
```bash
# Replace SERIAL with the actual serial number from the certificate version
curl -s $API/api/v1/ocsp/iss-local/SERIAL | jq .
# Replace SERIAL with the actual serial number from the certificate version.
# The embedded OCSP responder returns a signed DER response — parse it with
# `openssl ocsp -respin` or similar tooling.
curl -s http://localhost:8443/.well-known/pki/ocsp/iss-local/SERIAL -o /tmp/ocsp.der
openssl ocsp -respin /tmp/ocsp.der -noverify -resp_text | head -40
```
**Why RFC 5280 reason codes:** The reason code isn't just metadata — it tells clients *why* the certificate was revoked. A `keyCompromise` revocation means the private key was exposed and the certificate should be distrusted immediately. A `superseded` revocation means a newer certificate replaced it — less urgent. CRLs and OCSP responses include the reason code so client software can make informed trust decisions.
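For reference, the reason names map to the numeric `CRLReason` values defined in RFC 5280 §5.3.1. The sketch below lists the full set from the RFC; which of these certctl's revoke endpoint accepts is not stated here, so treat the lookup helper as illustrative.

```go
package main

import "fmt"

// reasonCodes lists the CRLReason values from RFC 5280 section 5.3.1.
// Value 7 is unassigned in the RFC and is therefore absent.
var reasonCodes = map[string]int{
	"unspecified":          0,
	"keyCompromise":        1,
	"cACompromise":         2,
	"affiliationChanged":   3,
	"superseded":           4,
	"cessationOfOperation": 5,
	"certificateHold":      6,
	"removeFromCRL":        8,
	"privilegeWithdrawn":   9,
	"aACompromise":         10,
}

// reasonCode returns the numeric code for a reason name, or an error
// for a name that RFC 5280 does not define.
func reasonCode(name string) (int, error) {
	code, ok := reasonCodes[name]
	if !ok {
		return 0, fmt.Errorf("unknown revocation reason %q", name)
	}
	return code, nil
}

func main() {
	for _, name := range []string{"keyCompromise", "superseded", "lostLaptop"} {
		code, err := reasonCode(name)
		fmt.Println(name, code, err)
	}
}
```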
@@ -981,7 +983,7 @@ export CERTCTL_API_KEY="test-key-123"
## Part 15: MCP Server for AI Integration (M18a)
certctl exposes 78 MCP tools covering the REST API via the Model Context Protocol (MCP), enabling seamless integration with Claude, Cursor, and other AI assistants:
certctl exposes the full REST API via the Model Context Protocol (MCP), enabling seamless integration with Claude, Cursor, and other AI assistants:
```bash
# Build the MCP server
+120
@@ -0,0 +1,120 @@
# Deployment Examples
Five turnkey docker-compose scenarios, each runnable in under 5 minutes. Pick the one closest to your setup.
## Which Example Should I Use?
| I need to... | Example | Issuer | Target |
|--------------|---------|--------|--------|
| Get Let's Encrypt certs for NGINX on a public server | [ACME + NGINX](#acme--nginx) | ACME (HTTP-01) | NGINX |
| Issue wildcard certs without opening port 80 | [Wildcard DNS-01](#wildcard-dns-01) | ACME (DNS-01) | Any |
| Run an internal CA for services behind a firewall | [Private CA + Traefik](#private-ca--traefik) | Local CA | Traefik |
| Use Smallstep step-ca as my PKI backend | [step-ca + HAProxy](#step-ca--haproxy) | step-ca | HAProxy |
| Manage both public and internal certs from one dashboard | [Multi-Issuer](#multi-issuer) | ACME + Local CA | Mixed |
**Already using another tool?** See the migration sections below each example for Certbot, acme.sh, and cert-manager users.
---
## ACME + NGINX
**Scenario:** You have one or more public-facing domains, NGINX as the reverse proxy, and want automated Let's Encrypt certificates with HTTP-01 challenges.
**What it deploys:** certctl server + PostgreSQL + certctl agent + NGINX, all on one Docker network. The agent generates keys locally (ECDSA P-256), submits CSRs to the server, receives signed certs from Let's Encrypt, and deploys them to NGINX with automatic reload.
**Prerequisites:** A domain pointing to your server, ports 80 and 443 open, Docker Engine 20.10+ with the Docker Compose v2 plugin.
```bash
cd examples/acme-nginx
cp .env.example .env # Edit with your domain and email
docker compose up -d
```
The full walkthrough — including how HTTP-01 challenges work, adding multiple domains, switching to staging for testing, and a production checklist — is in the [example README](../examples/acme-nginx/acme-nginx.md).
**Migrating from Certbot?** certctl discovers your existing `/etc/letsencrypt/live/` certificates automatically. You keep your ACME account, disable the Certbot cron, and certctl takes over renewal with centralized visibility and deployment verification. The step-by-step process is in [Migrating from Certbot](migrate-from-certbot.md).
---
## Wildcard DNS-01
**Scenario:** You need wildcard certificates (`*.example.com`) or your servers aren't reachable from the internet (no port 80). DNS-01 validates ownership by creating a TXT record at your DNS provider.
**What it deploys:** certctl server + PostgreSQL + certctl agent. Includes a Cloudflare DNS hook script as a working reference — swap in your own DNS provider (Route53, Azure DNS, Google Cloud DNS, or any provider with an API).
**Prerequisites:** A domain, API credentials for your DNS provider, Docker Compose.
```bash
cd examples/acme-wildcard-dns01
cp .env.example .env # Edit with domain, email, DNS provider credentials
docker compose up -d
```
The full walkthrough — including DNS-PERSIST-01 (set a TXT record once, never touch DNS again on renewals), adapting scripts for other providers, and propagation troubleshooting — is in the [example README](../examples/acme-wildcard-dns01/acme-wildcard-dns01.md).
**Migrating from acme.sh?** Your existing `dns_*` hook scripts are compatible with certctl's DNS-01 — they use the same pattern (shell scripts creating TXT records). The migration guide covers script adaptation, discovery of existing acme.sh certificates, and phasing out the acme.sh cron. See [Migrating from acme.sh](migrate-from-acmesh.md).
---
## Private CA + Traefik
**Scenario:** Internal services that don't need public CA validation. You run your own certificate authority — either a self-signed root for development, or a subordinate CA chained to your enterprise root (e.g., Active Directory Certificate Services).
**What it deploys:** certctl server + PostgreSQL + certctl agent + Traefik. The Local CA issuer signs certificates directly. Traefik watches a cert directory and auto-reloads when new files appear.
**Prerequisites:** Docker Compose. For sub-CA mode, you'll need a CA certificate and key signed by your enterprise root.
```bash
cd examples/private-ca-traefik
docker compose up -d # Self-signed mode (no .env needed for demo)
```
The full walkthrough — including sub-CA setup with `CERTCTL_CA_CERT_PATH` and `CERTCTL_CA_KEY_PATH`, creating certificates via the API, monitoring deployments, and production hardening — is in the [example README](../examples/private-ca-traefik/private-ca-traefik.md).
---
## step-ca + HAProxy
**Scenario:** You use Smallstep's step-ca as your private PKI and want automated lifecycle management for certificates deployed to HAProxy load balancers.
**What it deploys:** certctl server + PostgreSQL + certctl agent + step-ca (with JWK provisioner) + HAProxy. certctl issues certs via step-ca's native `/sign` API, combines them into HAProxy's expected PEM format (cert + chain + key in one file), and reloads HAProxy.
**Prerequisites:** Docker Compose.
```bash
cd examples/step-ca-haproxy
docker compose up -d
```
The full walkthrough — including step-ca provisioner configuration, integrating with an existing step-ca instance, HAProxy PEM format details, and advanced features (approval workflows, policy-based renewal, multi-instance HAProxy) — is in the [example README](../examples/step-ca-haproxy/step-ca-haproxy.md).
---
## Multi-Issuer
**Scenario:** You manage both public-facing services (needing Let's Encrypt or another public CA) and internal services (using a private CA) and want a single dashboard for everything.
**What it deploys:** certctl server + PostgreSQL + certctl agent configured with both an ACME issuer and a Local CA issuer. Demonstrates issuer assignment via profiles — public services get ACME certs, internal services get Local CA certs, all visible in one inventory.
**Prerequisites:** Docker Compose. For real ACME certs, a public domain and port 80 access.
```bash
cd examples/multi-issuer
docker compose up -d
```
The full walkthrough — including profile-based issuer assignment, testing with ACME staging, Local CA enterprise sub-CA mode, and scaling beyond Docker Compose — is in the [example README](../examples/multi-issuer/multi-issuer.md).
**Using cert-manager for Kubernetes?** certctl complements cert-manager — cert-manager handles in-cluster certs, certctl handles everything outside: VMs, bare metal, network appliances, Windows servers. They can share the same CA (ACME, step-ca, Vault PKI). See [certctl for cert-manager Users](certctl-for-cert-manager-users.md).
---
## Beyond These Examples
These 5 scenarios cover the most common deployment patterns, but certctl supports 7 issuer backends and 10 target connectors. Once you have the basics running, you can mix and match:
**Issuers:** ACME (Let's Encrypt, ZeroSSL, Buypass, Google Trust Services), Local CA (self-signed or sub-CA), step-ca, Vault PKI, DigiCert CertCentral, OpenSSL/Custom CA script, Sectigo (coming soon).
**Targets:** NGINX, Apache, HAProxy, Traefik, Caddy, Envoy, IIS (local PowerShell or WinRM proxy), Postfix, Dovecot, F5 BIG-IP (coming soon).
See [Connector Reference](connectors.md) for configuration details on every issuer and target.
+1263 -1270
+2 -2
@@ -94,7 +94,7 @@ Add certctl as an MCP server in your project's `.mcp.json`:
## Available Tools
The MCP server registers 78 tools organized across 16 resource domains:
The MCP server exposes the full REST API organized across 16 resource domains:
| Domain | Tools | Examples |
|--------|-------|---------|
@@ -153,7 +153,7 @@ flowchart LR
AI <-->|"stdio"| MCP
MCP -->|"HTTP + Bearer token"| SERVER
MCP ~~~ TOOLS["78 tools · 16 domains\nTyped input structs"]
MCP ~~~ TOOLS["REST API via MCP · 16 domains\nTyped input structs"]
```
The MCP server is intentionally thin:
+5 -4
@@ -267,8 +267,9 @@ export CERTCTL_ACME_DNS_PRESENT_SCRIPT=/etc/certctl/dns/cloudflare-present.sh
certctl automatically falls back to DNS-01 if the CA doesn't support dns-persist-01 yet.
## Support
## Next Steps
See [Connector Configuration](connectors.md) for advanced ACME options (EAB, ARI, custom timeouts).
See [Discovery Guide](concepts.md#certificate-discovery) for managing discovered certificates at scale.
- Try the [Wildcard DNS-01 example](../examples/acme-wildcard-dns01/acme-wildcard-dns01.md) — a working docker-compose with Cloudflare hooks you can adapt for your DNS provider
- See [Connector Reference](connectors.md) for advanced ACME options (EAB, ARI, custom timeouts)
- See [Discovery Guide](concepts.md#certificate-discovery) for managing discovered certificates at scale
- See all [Deployment Examples](./examples.md) for other scenarios (ACME+NGINX, private CA, step-ca, multi-issuer)
+2 -1
@@ -166,6 +166,7 @@ certctl will stop renewing that cert when the policy is disabled. Certbot resume
## Next Steps
- Try the [ACME + NGINX example](../examples/acme-nginx/acme-nginx.md) — a working docker-compose you can run locally before deploying to production
- Review the [Concepts Guide](./concepts.md) for terminology (profiles, policies, agents, jobs)
- Explore [Network Discovery](./quickstart.md#network-discovery-agentless) to find certificates you didn't know about
- Set up [Kubernetes cert-manager integration](./certctl-for-cert-manager-users.md) if you manage in-cluster certs too
- See all [Deployment Examples](./examples.md) for other scenarios (wildcard DNS-01, private CA, step-ca, multi-issuer)
+295
@@ -0,0 +1,295 @@
# QA Test Suite Guide (`qa_test.go`)
> **Audience:** Anyone running release QA for certctl — whether you're a first-time contributor or the maintainer cutting a release tag.
>
> **Companion to:** `docs/testing-guide.md` (the *what* to test). This document explains the *how* — the automated test file, what it covers, what it skips, and how to fill the gaps manually.
---
## What Is This File?
`deploy/test/qa_test.go` is a single Go test file (~1700 lines) that automates as much of `docs/testing-guide.md` as possible against a running certctl Docker Compose demo stack. It replaces the legacy `qa-smoke-test.sh` bash script.
It covers **all 54 Parts** of the testing guide:
- **~164 automated subtests** — API calls, database queries, source file checks, performance benchmarks
- **11 skipped Parts** — with documented reasons (external CAs, Windows, browser-only, etc.)
- **Remaining ~282 manual tests** — GUI flows, scheduler timing, Docker log inspection — must be done by a human following `docs/testing-guide.md`
## Architecture
```
┌────────────────────────┐ ┌──────────────────────────┐
│ qa_test.go │────▶│ certctl demo stack │
│ (//go:build qa) │ │ docker-compose.yml + │
│ │ │ docker-compose.demo.yml │
│ TestQA(t *testing.T) │ │ │
│ ├─ Part01_Infra │ │ ┌─ certctl-server :8443 │
│ ├─ Part02_Auth │ │ ├─ postgres :5432 │
│ ├─ Part03_CertCRUD │ │ └─ certctl-agent │
│ ├─ ... │ └──────────────────────────┘
│ └─ Part52_HelmChart │
└────────────────────────┘
```
Key design choices:
- **Build tag:** `//go:build qa` — never runs during `go test ./...` or CI. Only runs when explicitly requested.
- **Package:** `integration_test` — same package as `integration_test.go` (which uses `//go:build integration` for the test stack). They coexist but never run together.
- **Zero internal imports:** Uses only stdlib + `lib/pq` (from `go.mod`). All API interactions are plain HTTP. All JSON is decoded into lightweight local structs (`qaCert`, `qaJob`, etc.) — not the internal domain types.
- **Self-cleaning:** Tests that create data use `t.Cleanup()` to delete it afterward. The seed data is not modified.
## Prerequisites
1. **Docker Compose demo stack running:**
```bash
cd deploy
docker compose -f docker-compose.yml -f docker-compose.demo.yml up --build -d
```
Wait ~15 seconds for health checks to pass.
2. **Go 1.22+** installed (the project uses Go 1.25 in `go.mod`, but 1.22+ works for running tests).
3. **PostgreSQL port exposed** — the demo stack exposes port 5432 for database verification tests (table counts, schema checks).
4. **Repository checkout** — source file verification tests (`fileExists`, `fileContains`) read files relative to `qaRepoDir` (default: `../..` from `deploy/test/`).
## Running the Tests
### Full suite
```bash
cd deploy/test
go test -tags qa -v -timeout 10m ./...
```
### Single Part
```bash
go test -tags qa -v -run TestQA/Part03 ./...
```
### Single subtest
```bash
go test -tags qa -v -run TestQA/Part03_CertCRUD/Create_Minimal ./...
```
### With custom environment
```bash
CERTCTL_QA_SERVER_URL=https://staging.internal:8443 \
CERTCTL_QA_API_KEY=my-staging-key \
CERTCTL_QA_DB_URL=postgres://certctl:secret@db.internal:5432/certctl?sslmode=require \
CERTCTL_QA_REPO_DIR=/path/to/certctl \
go test -tags qa -v -timeout 10m ./...
```
### Environment Variables
| Variable | Default | Description |
|---|---|---|
| `CERTCTL_QA_SERVER_URL` | `http://localhost:8443` | certctl server URL |
| `CERTCTL_QA_API_KEY` | `change-me-in-production` | API key for Bearer auth |
| `CERTCTL_QA_DB_URL` | `postgres://certctl:certctl@localhost:5432/certctl?sslmode=disable` | PostgreSQL connection string |
| `CERTCTL_QA_REPO_DIR` | `../..` | Path to certctl repo root (for source file checks) |
## Part-by-Part Coverage Map
This table shows what each Part tests and what's left for manual verification.
| Part | Testing Guide Section | Automated Subtests | What's Automated | What's Manual |
|------|----------------------|-------------------|-----------------|--------------|
| 1 | Infrastructure & Deployment | 8 | Table count, health/ready endpoints, seed data counts (certs, agents, issuers, targets, policies) | Docker container health, log inspection, volume mounts |
| 2 | Authentication & Security | 4 | No-auth 401, bad-key 401, health-no-auth 200, no private keys in API | CORS preflight, rate limiting (429 + Retry-After), TLS config |
| 3 | Certificate Lifecycle | 10 | Create (minimal + full), get, 404, list pagination, status/issuer filters, sparse fields, update, archive | Deployment trigger, version history, certificate detail UI |
| 4 | Renewal Workflow | 3 | Trigger renewal, 404 on nonexistent, agent work endpoint | AwaitingCSR flow, agent key generation, full issuance cycle |
| 5 | Revocation | 5 | Revoke (default reason), already-revoked, nonexistent, invalid reason, CRL JSON | DER CRL, OCSP responder, revocation notifications |
| 6 | Policies & Profiles | 6 | Policy CRUD (create/delete), invalid type 400, profile CRUD, list | Policy violation detection, profile enforcement on CSR |
| 7 | Ownership & Teams | 4 | Team CRUD, owner CRUD, agent groups list | Owner notification routing, dynamic group matching |
| 8 | Job System | 2 | List jobs, 404 on nonexistent | Job state transitions, approval workflow, cancellation |
| 9 | Issuer Connectors | 4 | List, get detail, create (GenericCA), missing name 400 | Test connection, issuer-specific issuance flow |
| 10 | Sub-CA Mode | SKIP | — | Requires CA cert+key on disk |
| 11 | ACME ARI | SKIP | — | Requires ARI-capable CA |
| 12 | Vault PKI | SKIP | — | Requires live Vault server |
| 13 | DigiCert | SKIP | — | Requires DigiCert sandbox |
| 14 | Target Connectors | 3 | List, create NGINX target, delete 204 | Deploy to real target, validate deployment |
| 15–17 | Apache/HAProxy, Traefik/Caddy, IIS | — | (Covered by source checks in Parts 42–46) | Requires real services or Windows |
| 18 | Agent Operations | 3 | Heartbeat (register), metadata check, auto-create on heartbeat | Agent binary behavior, key storage, discovery scan |
| 19 | Agent Work Routing | 1 | Empty work for agent with no targets | Scoped job assignment, multi-target fan-out |
| 20 | Post-Deployment Verification | 1 | 404 on nonexistent job verification | TLS probing, fingerprint comparison |
| 21 | EST Server | 2 | CACerts (200 + content-type), CSRAttrs (200/204) | simpleenroll with CSR, simplereenroll, PKCS#7 parsing |
| 22 | Certificate Export | 3 | PEM export, PKCS#12 export, 404 on nonexistent | Download mode, file content validation |
| 25 | Certificate Discovery | 5 | List discovered, summary, list scan targets, create target, invalid CIDR 400 | Agent filesystem scan, claim/dismiss workflow |
| 26 | Enhanced Query API | 4 | Sort descending, cursor pagination, time-range filter, invalid sort field | Field projection correctness, cursor token cycling |
| 27 | Request Body Size Limits | 1 | 2MB body rejected (413/400) | Exact limit boundary (1MB) |
| 28 | CLI | SKIP | — | Requires compiled `certctl-cli` binary |
| 29 | MCP Server | SKIP | — | Requires compiled `mcp-server` binary + stdio |
| 30 | Observability | 7 | Dashboard summary, certs by status, expiration timeline, job trends, issuance rate, JSON metrics (uptime + gauges), Prometheus (content-type + 4 metric names) | Chart rendering (GUI), Grafana import |
| 31 | Notifications | 2 | List, 404 on nonexistent | Notification content, mark-read, email/Slack delivery |
| 32 | Audit Trail | 3 | List events (≥10), PUT immutability, DELETE immutability | Actor attribution, body hash, time range filters |
| 33 | Background Scheduler | SKIP | — | Timing-dependent; verify via Docker logs |
| 34 | Structured Logging | SKIP | — | Requires Docker log inspection |
| 35 | GUI Testing | SKIP | — | Requires browser |
| 36–37 | Issuer Catalog, Frontend Audit | SKIP | — | Requires browser |
| 38 | Error Handling | 5 | Malformed JSON, missing required field, method not allowed, UTF-8 CN, empty body | Stack trace suppression, error response format |
| 39 | Performance | 5 | List certs < 200ms, stats < 500ms, metrics < 200ms, Prometheus < 300ms, audit < 500ms | Load testing, concurrent request handling |
| 40 | Documentation | 8 | README, quickstart, architecture, connectors, compliance exist; migration guides exist; 8 issuer types in docs; 11 target types in docs | Content accuracy, link validity |
| 41 | Regression | 3 | DELETE 204, per_page max fallback, network scan target seed count | `errors.Is(errors.New())` anti-pattern source scan |
| 42 | Envoy Target | 5 | Domain type, connector file, test file, OpenAPI, agent dispatch | Envoy deployment test, SDS config |
| 43 | Postfix/Dovecot | 3 | Domain types (Postfix + Dovecot), connector file, OpenAPI | Mail server deployment test |
| 44 | SSH Target | 4 | Domain type, connector file, agent dispatch (`sshconn`), OpenAPI | SSH deployment test (requires target host) |
| 45 | Windows Certificate Store | 3 | Domain type, connector file, shared certutil package | Windows deployment (requires Windows) |
| 46 | Java Keystore | 3 | Domain type, connector file, OpenAPI | JKS deployment (requires keytool) |
| 47 | Certificate Digest Email | 3 | Preview endpoint (200/503), service file, adapter file | SMTP delivery, HTML template rendering |
| 48 | Dynamic Issuer Config | 4 | Crypto package exists, create ACME issuer via API, config redaction check, migration exists | Test connection flow, registry rebuild |
| 49 | Dynamic Target Config | 2 | Create NGINX target via API, migration exists | Test connection via agent heartbeat |
| 50 | Onboarding Wizard | 2 | Wizard component exists, docker-compose split (clean vs demo) | Wizard UI flow, step completion |
| 51 | ACME Profile Selection | 3 | Profile module exists, frontend config, RFC 9702→9773 renumber check | Profile-aware issuance against real CA |
| 52 | Helm Chart | 5 | Chart.yaml, values.yaml, 4 templates exist, securityContext, health probes | `helm template` rendering, `helm install` |
| 53 | Kubernetes Secrets Target Connector (M47) | 18 | Config validation (namespace DNS-1123, secret name DNS subdomain, label keys, required fields), deployment (create/update Secret, chain concatenation, error propagation), validation (serial comparison, not-found, empty cert) | GUI target wizard KubernetesSecrets fields (namespace, secret_name, labels, kubeconfig_path), Helm RBAC toggle, TargetDetailPage type label |
| 54 | AWS ACM Private CA Issuer Connector (M47) | 23 | Config validation (region, CA ARN regex, signing algorithm whitelist, validity_days, defaults), issuance (full flow, empty CSR, errors), renewal (reuses issuance), revocation (reason mapping, default, errors), GetOrderStatus completed, GetCACertPEM (success/chain/error), GetRenewalInfo nil | GUI issuer wizard AWSACMPCA fields (region, ca_arn, signing_algorithm, validity_days, template_arn), seed data visibility, create issuer flow |
**Totals:** ~164 automated subtests, 11 fully skipped Parts, ~282 manual tests remaining.
## Test Categories
The automated tests fall into four categories:
### 1. API Integration Tests (majority)
Make real HTTP requests to the running server and verify status codes, response structure, and JSON field values. Examples:
- `POST /api/v1/certificates` with valid payload → 201
- `GET /api/v1/certificates?status=Active` → all returned certs have `status: "Active"`
- `DELETE /api/v1/certificates/mc-qa-full` → 204
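The shape of such a check can be sketched in a few lines of Go. This is an illustration under assumptions: `statusFor` and `newStubServer` are hypothetical names, and the stub server only imitates the status codes listed here (401 without a valid key, 204 on DELETE) so the example runs without the demo stack.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// statusFor sends a request with an optional Bearer API key and
// returns the HTTP status code of the response.
func statusFor(method, url, apiKey string) (int, error) {
	req, err := http.NewRequest(method, url, nil)
	if err != nil {
		return 0, err
	}
	if apiKey != "" {
		req.Header.Set("Authorization", "Bearer "+apiKey)
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

// newStubServer stands in for the real server in this sketch only.
// It rejects wrong keys with 401 and answers DELETE with 204.
func newStubServer(validKey string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("Authorization") != "Bearer "+validKey {
			w.WriteHeader(http.StatusUnauthorized)
			return
		}
		if r.Method == http.MethodDelete {
			w.WriteHeader(http.StatusNoContent)
			return
		}
		w.WriteHeader(http.StatusOK)
	}))
}

func main() {
	srv := newStubServer("qa-key")
	defer srv.Close()

	code, err := statusFor(http.MethodDelete, srv.URL+"/api/v1/certificates/mc-qa-full", "qa-key")
	fmt.Println("delete with key:", code, err)

	code, err = statusFor(http.MethodGet, srv.URL+"/api/v1/certificates", "")
	fmt.Println("list without key:", code, err)
}
```

Against the real stack, the stub would be replaced by the URL and key from `CERTCTL_QA_SERVER_URL` and `CERTCTL_QA_API_KEY`.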
### 2. Database Verification Tests
Connect directly to PostgreSQL and verify schema state:
- Table count ≥ 19 (from migrations 000001–000010)
- Useful for catching migration regressions
### 3. Source File Verification Tests
Read files from the repo checkout and verify structure:
- Domain types exist in `internal/domain/connector.go` (e.g., `TargetTypeEnvoy`)
- Connector implementations exist (e.g., `internal/connector/target/envoy/envoy.go`)
- Documentation contains expected content (all issuer/target types listed)
- No stale RFC 9702 references (replaced by RFC 9773)
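A source check of this kind boils down to "does the file exist" and "does it contain a substring". The sketch below assumes simplified signatures that take a repo directory instead of `*testing.T`, so they differ from the suite's own `fileExists` and `fileContains` helpers; `makeDemoRepo` and the file content it writes are stand-ins so the example is runnable anywhere.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// fileExists reports whether rel exists under repoDir as a regular file.
func fileExists(repoDir, rel string) bool {
	info, err := os.Stat(filepath.Join(repoDir, rel))
	return err == nil && info.Mode().IsRegular()
}

// fileContains reports whether the file rel under repoDir contains substr.
func fileContains(repoDir, rel, substr string) (bool, error) {
	data, err := os.ReadFile(filepath.Join(repoDir, rel))
	if err != nil {
		return false, err
	}
	return strings.Contains(string(data), substr), nil
}

// makeDemoRepo builds a throwaway directory with one stand-in source file.
func makeDemoRepo() (string, error) {
	repo, err := os.MkdirTemp("", "qa-repo-")
	if err != nil {
		return "", err
	}
	path := filepath.Join(repo, "internal", "domain", "connector.go")
	if err := os.MkdirAll(filepath.Dir(path), 0o755); err != nil {
		return "", err
	}
	src := "package domain\n\nconst TargetTypeEnvoy = \"Envoy\"\n" // illustrative content
	return repo, os.WriteFile(path, []byte(src), 0o644)
}

func main() {
	repo, err := makeDemoRepo()
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(repo)

	rel := filepath.Join("internal", "domain", "connector.go")
	fmt.Println("exists:", fileExists(repo, rel))
	ok, err := fileContains(repo, rel, "TargetTypeEnvoy")
	fmt.Println("contains:", ok, err)
}
```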
### 4. Performance Spot Checks
Timed API requests with threshold assertions:
- `GET /api/v1/certificates?per_page=15` < 200ms
- `GET /api/v1/stats/summary` < 500ms
- `GET /api/v1/metrics/prometheus` < 300ms
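A timed request with a threshold can be sketched as follows. The names `timedGet`, `withinBudget`, and `newFastServer` are hypothetical (the suite's own `c.timedGet()` may have a different signature), and a local stub server replaces the real API so the example is self-contained.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// timedGet performs a GET, drains the body, and returns the status code
// together with the wall-clock time for the whole exchange.
func timedGet(url string) (int, time.Duration, error) {
	start := time.Now()
	resp, err := http.Get(url)
	if err != nil {
		return 0, 0, err
	}
	defer resp.Body.Close()
	if _, err := io.Copy(io.Discard, resp.Body); err != nil {
		return 0, 0, err
	}
	return resp.StatusCode, time.Since(start), nil
}

// withinBudget reports whether elapsed is strictly under the threshold.
func withinBudget(elapsed, budget time.Duration) bool {
	return elapsed < budget
}

// newFastServer is a stand-in for the real server in this sketch.
func newFastServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, `{"data":[]}`)
	}))
}

func main() {
	srv := newFastServer()
	defer srv.Close()

	code, elapsed, err := timedGet(srv.URL + "/api/v1/certificates?per_page=15")
	if err != nil {
		panic(err)
	}
	fmt.Printf("status=%d elapsed=%s under200ms=%v\n",
		code, elapsed, withinBudget(elapsed, 200*time.Millisecond))
}
```

Draining the body before stopping the clock means the measurement includes transfer time, not only time to first byte.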
## What This Test Does NOT Cover
These gaps must be filled by manual testing per `docs/testing-guide.md`:
### External CA Integrations (Parts 10–13)
- **Sub-CA mode** — requires CA cert+key files on disk
- **ACME ARI** — requires a CA that supports RFC 9773 Renewal Information
- **Vault PKI** — requires a running HashiCorp Vault instance
- **DigiCert / Sectigo / Google CAS** — requires sandbox API credentials
### Browser/GUI Testing (Parts 35–37, 50)
- Dashboard chart rendering (Recharts)
- Onboarding wizard step-by-step flow
- Issuer catalog card layout and create wizard
- Bulk operations UI (multi-select, progress bars)
- Discovery triage workflow
### Real Deployment Testing (Parts 15–17)
- NGINX/Apache/HAProxy file write + reload
- Traefik/Caddy file provider or API reload
- IIS PowerShell/WinRM (requires Windows)
- F5 BIG-IP iControl REST (requires appliance or mock)
- SSH agentless deployment (requires target host)
### Agent Binary Behavior (Parts 18, 28–29)
- Agent-side ECDSA key generation and CSR submission
- Agent filesystem discovery scan
- CLI tool (`certctl-cli`) — all 10 subcommands
- MCP server (`mcp-server`) — stdio transport
### Timing-Dependent Tests (Parts 33–34)
- Background scheduler loop execution (renewal, jobs, health, notifications, digest, network scan)
- Structured logging format verification (requires Docker log parsing)
## How This Relates to `integration_test.go`
Both files live in `deploy/test/` in the same Go package (`integration_test`):
| | `qa_test.go` | `integration_test.go` |
|---|---|---|
| **Build tag** | `//go:build qa` | `//go:build integration` |
| **Target stack** | Demo (`docker-compose.yml` + `docker-compose.demo.yml`) | Test (`docker-compose.test.yml`) |
| **Port** | 8443 | Different (test stack config) |
| **Seed data** | `seed_demo.sql` (32 certs, 8 agents, realistic history) | Minimal (created by tests) |
| **CA backends** | Local CA only (demo mode) | Pebble ACME, step-ca, NGINX |
| **Purpose** | Release QA — broad coverage, spot checks | Functional — end-to-end issuance, renewal, revocation against real CAs |
| **Run frequency** | Before each release tag | CI on every PR |
They are complementary. Integration tests prove the machinery works. QA tests prove the product works at release quality.
## Seed Data Reference
The QA tests depend on `migrations/seed_demo.sql`. Key IDs used:
### Certificates (32 total)
`mc-api-prod`, `mc-web-prod`, `mc-pay-prod`, `mc-dash-prod`, `mc-data-prod`, `mc-search-prod`, `mc-admin-prod`, `mc-blog-prod`, `mc-docs-prod`, `mc-status-prod`, `mc-grpc-prod`, `mc-vault-prod`, `mc-consul-prod`, `mc-shop-prod`, `mc-auth-prod`, `mc-cdn-prod`, `mc-mail-prod`, `mc-ci-prod`, `mc-legacy-prod`, `mc-old-api`, `mc-wiki-prod`, `mc-api-stg`, `mc-web-stg`, `mc-pay-stg`, `mc-api-dev`, `mc-grafana-prod`, `mc-vpn-prod`, `mc-wildcard-prod`, `mc-compromised`, `mc-edge-eu`, `mc-k8s-ingress`, `mc-smime-bob`
### Agents (9 total)
`ag-web-prod`, `ag-web-staging`, `ag-lb-prod`, `ag-iis-prod`, `ag-data-prod`, `ag-edge-01`, `ag-k8s-prod`, `ag-mac-dev`, `server-scanner` (sentinel)
### Issuers (9 total)
`iss-local`, `iss-acme-le`, `iss-stepca`, `iss-acme-zs`, `iss-openssl`, `iss-vault`, `iss-digicert`, `iss-sectigo`, `iss-googlecas`
### Targets (8 total)
`tgt-nginx-prod`, `tgt-nginx-staging`, `tgt-haproxy-prod`, `tgt-apache-prod`, `tgt-iis-prod`, `tgt-traefik-prod`, `tgt-caddy-prod`, `tgt-nginx-data`
### Network Scan Targets (4 total)
`nst-dc1-web`, `nst-dc2-apps`, `nst-dmz`, `nst-edge`
## Troubleshooting
### "Server unreachable" on startup
The test pings `GET /health` before running anything. If this fails:
```bash
# Check if the stack is running
docker compose -f docker-compose.yml -f docker-compose.demo.yml ps
# Check server logs
docker compose -f docker-compose.yml -f docker-compose.demo.yml logs certctl-server
# Check if the port is exposed
curl -s http://localhost:8443/health
```
### "connect to QA DB" failure
The database tests connect directly to PostgreSQL. Ensure port 5432 is exposed:
```bash
docker compose -f docker-compose.yml -f docker-compose.demo.yml port postgres 5432
```
### Performance tests flaking
The performance thresholds (200ms, 300ms, 500ms) assume a local Docker stack. On slow CI runners or remote Docker hosts, increase the thresholds or skip Part 39:
```bash
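# Go's -run patterns use RE2, which has no negative lookahead such as (?!39).
# Use -skip (Go 1.20+) to exclude Part 39 instead.
go test -tags qa -v -skip 'TestQA/Part39' ./...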
```
### Source file checks failing
The `fileExists` and `fileContains` helpers read from `CERTCTL_QA_REPO_DIR` (default `../..`). If running from a non-standard location:
```bash
CERTCTL_QA_REPO_DIR=/absolute/path/to/certctl go test -tags qa -v ./...
```
## Adding New Tests
When a new feature ships:
1. **Add a Part section** in `qa_test.go` following the numbering in `docs/testing-guide.md`
2. **API tests**: use `c.get()`, `c.post()`, `c.bodyStr()`, `c.getJSON()`, `c.timedGet()`
3. **Source checks**: use `fileExists(t, "relative/path")` and `fileContains(t, "path", "substring")`
4. **DB checks**: use `openQADB(t)` and `db.queryInt(t, "SELECT ...")`
5. **Cleanup**: always use `t.Cleanup()` for data created during tests
6. **Skip if external**: use `t.Skip("Requires X — manual test")` with a clear reason
## Version History
- **v1.0** (April 2026) — Initial release covering all 52 Parts of testing-guide.md v2.1. Replaces `qa-smoke-test.sh`.
- **v1.1** (April 2026) — Added Parts 53–54 (M47: Kubernetes Secrets target + AWS ACM PCA issuer). 54 Parts total, ~164 automated subtests.
+24 -4
@@ -60,6 +60,21 @@ cp deploy/.env.example deploy/.env
docker compose -f deploy/docker-compose.yml up -d --build
```
### Docker Compose Environments
The `deploy/` directory contains four compose files for different use cases:
| File | Purpose | How to run |
|------|---------|------------|
| `docker-compose.yml` | **Base platform.** PostgreSQL + certctl server + agent. Clean dashboard with onboarding wizard — use this for production or first-time setup. | `docker compose -f deploy/docker-compose.yml up --build` |
| `docker-compose.demo.yml` | **Demo data override.** Layers 180 days of realistic seed data (15 certs, 5 agents, multiple issuers) onto the base. Dashboard charts and tables look populated on first boot. | `docker compose -f deploy/docker-compose.yml -f deploy/docker-compose.demo.yml up --build` |
| `docker-compose.dev.yml` | **Development override.** Adds PgAdmin (port 5050), debug-level logging, and a Delve debugger port (40000) for the server. | `docker compose -f deploy/docker-compose.yml -f deploy/docker-compose.dev.yml up --build` |
| `docker-compose.test.yml` | **Integration test environment.** 7 containers on a static-IP subnet: PostgreSQL, certctl server+agent, step-ca, Pebble ACME server, challenge test server, and NGINX. Runs the full issuance→deployment→verification flow against real CA backends. Standalone — does not combine with the base file. | `docker compose -f deploy/docker-compose.test.yml up --build` |
Override files are layered onto the base with multiple `-f` flags. The test environment is self-contained and runs independently. To reset an environment's data, run the same command with `down -v` in place of `up --build`, which also removes its volumes.
For a deep dive into every service, environment variable, and networking decision, see the [Docker Compose Environments Guide](../deploy/ENVIRONMENTS.md).
### Kubernetes with Helm
For production deployments on Kubernetes, use the Helm chart:
@@ -271,9 +286,11 @@ curl -s -X POST http://localhost:8443/api/v1/certificates/$CERT_ID/revoke \
Supported RFC 5280 reason codes: `unspecified`, `keyCompromise`, `caCompromise`, `affiliationChanged`, `superseded`, `cessationOfOperation`, `certificateHold`, `privilegeWithdrawn`.
Confirm via CRL:
Confirm via the unauthenticated DER CRL (RFC 5280 §5, RFC 8615):
```bash
curl -s http://localhost:8443/api/v1/crl | jq .
# Fetch the CRL without any API key — relying parties shouldn't need one.
curl -s http://localhost:8443/.well-known/pki/crl/iss-local -o /tmp/crl.der
openssl crl -inform der -in /tmp/crl.der -noout -text | head -40
```
### Interactive approval workflow
@@ -404,7 +421,7 @@ export CERTCTL_API_KEY="test-key-123"
./mcp-server
```
Exposes 78 MCP tools covering the REST API via stdio transport. Ask Claude: "What certificates are expiring in the next 30 days?", "Revoke the payments cert due to key compromise", "Show me the audit trail."
Exposes the full REST API via MCP over stdio transport. Ask Claude: "What certificates are expiring in the next 30 days?", "Revoke the payments cert due to key compromise", "Show me the audit trail."
## Demo Data Reference
@@ -461,7 +478,10 @@ The `-v` flag removes the PostgreSQL data volume for a clean slate.
## What's Next
**Ready to deploy with your stack?** The [Deployment Examples](examples.md) page has 5 turnkey docker-compose scenarios — pick the one closest to your setup and have it running in minutes. It also covers migration paths from Certbot, acme.sh, and cert-manager.
- **[Deployment Examples](examples.md)** — ACME+NGINX, wildcard DNS-01, private CA+Traefik, step-ca+HAProxy, multi-issuer
- **[Advanced Demo](demo-advanced.md)** — Issue a real certificate via the Local CA end-to-end
- **[Architecture](architecture.md)** — How the control plane, agents, and connectors work together
- **[Connector Guide](connectors.md)** — Build custom connectors for your infrastructure
- **[Connector Reference](connectors.md)** — Configuration for all 7 issuers and 10 targets
- **[Concepts Guide](concepts.md)** — TLS certificates, CAs, and private keys explained from scratch
+6 -3
@@ -512,12 +512,15 @@ curl -s -X POST http://localhost:8443/api/v1/certificates/mc-local-test/revoke \
### Step 7b: Check the CRL (Certificate Revocation List)
The CRL is a DER-encoded X.509 v2 CRL (RFC 5280 §5) served under the RFC 8615 well-known namespace. It is deliberately unauthenticated — relying parties that need to verify revocation don't have certctl API keys.
```bash
curl -s -H "Authorization: Bearer test-key-2026" \
http://localhost:8443/api/v1/crl | python3 -m json.tool
# No Authorization header — the endpoint is public by design.
curl -s http://localhost:8443/.well-known/pki/crl/iss-local -o /tmp/crl.der
openssl crl -inform der -in /tmp/crl.der -noout -text | head -40
```
**What you should see**: A list that includes the revoked certificate's serial number, the reason, and the timestamp.
**What you should see**: `openssl` prints the CRL issuer DN, `This Update` / `Next Update` timestamps, and at least one entry whose `Serial Number` matches the cert you just revoked, with `CRL Reason Code: Superseded` (or whichever reason you passed in step 7a). The response's `Content-Type` header is `application/pkix-crl`.
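The same check can be done programmatically with Go's standard library. The sketch below is an assumption-laden illustration: instead of fetching `/.well-known/pki/crl/iss-local`, it builds a throwaway CA and CRL in memory so it runs without a server; `isRevoked`, `buildDemoCRL`, and `demoCheck` are hypothetical names. Only `isRevoked` would carry over to a real CRL download. It relies on `RevokedCertificateEntries`, which needs Go 1.21 or newer.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// isRevoked parses a DER-encoded CRL and reports whether serial is listed.
func isRevoked(crlDER []byte, serial *big.Int) (bool, error) {
	crl, err := x509.ParseRevocationList(crlDER)
	if err != nil {
		return false, err
	}
	for _, entry := range crl.RevokedCertificateEntries {
		if entry.SerialNumber.Cmp(serial) == 0 {
			return true, nil
		}
	}
	return false, nil
}

// buildDemoCRL creates a throwaway CA and a CRL that revokes one serial
// with reason code 4 (superseded). Stand-in for the downloaded crl.der.
func buildDemoCRL(revoked *big.Int) ([]byte, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	now := time.Now()
	caTmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: "Demo CA"},
		NotBefore:             now.Add(-time.Hour),
		NotAfter:              now.Add(24 * time.Hour),
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign | x509.KeyUsageCRLSign,
		SubjectKeyId:          []byte{1, 2, 3, 4},
	}
	caDER, err := x509.CreateCertificate(rand.Reader, caTmpl, caTmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	caCert, err := x509.ParseCertificate(caDER)
	if err != nil {
		return nil, err
	}
	return x509.CreateRevocationList(rand.Reader, &x509.RevocationList{
		Number:     big.NewInt(1),
		ThisUpdate: now,
		NextUpdate: now.Add(24 * time.Hour),
		RevokedCertificateEntries: []x509.RevocationListEntry{
			{SerialNumber: revoked, RevocationTime: now, ReasonCode: 4},
		},
	}, caCert, key)
}

// demoCheck builds a CRL revoking revokedSerial and looks up querySerial.
func demoCheck(revokedSerial, querySerial int64) (bool, error) {
	der, err := buildDemoCRL(big.NewInt(revokedSerial))
	if err != nil {
		return false, err
	}
	return isRevoked(der, big.NewInt(querySerial))
}

func main() {
	hit, err := demoCheck(4242, 4242)
	fmt.Println("revoked serial listed:", hit, err)
	miss, err := demoCheck(4242, 7)
	fmt.Println("other serial listed:", miss, err)
}
```

With a real download, the bytes of `/tmp/crl.der` would be passed to `isRevoked` along with the serial number of the certificate revoked in step 7a.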
### Step 7c: Check in the dashboard
+4557 -2931
File diff suppressed because it is too large
+77 -40
@@ -1,82 +1,119 @@
# Why certctl?
Certificate management is broken at every scale between "one domain on Let's Encrypt" and "Fortune 500 budget for Venafi."
Certificate management is broken at every scale between "one domain on Let's Encrypt" and "Fortune 500 budget for Venafi." certctl fills that gap: a self-hosted platform that automates the entire certificate lifecycle, works with any CA, deploys to any server, and keeps private keys on your infrastructure. It's free, source-available, and you own everything.
If you run a personal blog, Certbot works fine. If your company spends $200K/year on Keyfactor, you're covered. But if you're an ops engineer managing 20-500 certificates across NGINX, Apache, HAProxy, and maybe a private CA — the tools available today either don't do enough or cost too much.
## The Math That Forces the Decision
certctl fills that gap.
The CA/Browser Forum passed [Ballot SC-081v3](https://cabforum.org/2025/04/11/ballot-sc081v3-introduce-schedule-of-reducing-validity-and-data-reuse-periods/) in April 2025, mandating a phased reduction in TLS certificate lifetimes: **200 days** as of March 2026, **100 days** by March 2027, and **47 days** by March 2029.
## The Problem
At 47-day lifespans, a team managing 100 certificates is processing **about 15 renewals per week** (100 ÷ 47 ≈ 2.1 per day), every week, forever. At 200 certificates, it's more than four per day. Manual processes, calendar reminders, and certbot cron jobs don't scale to this: a single missed renewal becomes a production outage at 3 AM. Certificate lifecycle automation is no longer optional; the only question is what tool runs it.
The CA/Browser Forum passed [Ballot SC-081v3](https://cabforum.org/2025/04/11/ballot-sc081v3-introduce-schedule-of-reducing-validity-and-data-reuse-periods/) in April 2025, mandating a phased reduction in TLS certificate lifetimes: 200 days as of March 2026, 100 days by March 2027, and 47 days by March 2029. That means every organization needs automated certificate renewal — not eventually, but now.
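The renewal load implied by each lifetime stage is simple arithmetic: every certificate must be renewed at least once per lifetime, so a fleet of N certificates needs at least N ÷ lifetime renewals per day. The sketch below computes that lower bound per week; the function name is hypothetical, and renewing before expiry (as most automation does) only raises the rate.

```go
package main

import "fmt"

// minRenewalsPerWeek returns a lower bound on steady-state renewals per
// week: each certificate is renewed at least once every lifetimeDays.
func minRenewalsPerWeek(certs int, lifetimeDays float64) float64 {
	return float64(certs) / lifetimeDays * 7
}

func main() {
	// Lifetime stages from Ballot SC-081v3: 200, 100, and 47 days.
	for _, days := range []float64{200, 100, 47} {
		fmt.Printf("%3.0f-day lifetime, 100 certs: at least %.1f renewals/week\n",
			days, minRenewalsPerWeek(100, days))
	}
}
```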
## The Landscape Today
The existing options for automation are:
If you're evaluating your options, here's what you'll find:
- **ACME clients** (Certbot, Lego, CertWarden): Handle issuance and renewal for ACME-compatible CAs, but don't manage deployment to target servers, don't provide inventory visibility, don't support non-ACME CAs, and don't offer audit trails or policy enforcement.
- **Kubernetes-native** (cert-manager): Works well inside Kubernetes, but if your infrastructure includes bare-metal servers, VMs, or network appliances alongside Kubernetes, you need a separate solution for everything cert-manager can't reach.
- **Commercial SaaS** (CertKit, Sectigo CLM): Handle more of the lifecycle but are proprietary, cloud-dependent, and priced per certificate — costs scale linearly with your infrastructure.
- **Enterprise platforms** (Venafi, Keyfactor, AppViewX): Comprehensive but start at $75K/year and require dedicated teams to operate.
**ACME clients** (certbot, lego, acme.sh) handle issuance and renewal for Let's Encrypt and similar CAs, but they don't deploy to target servers, don't track inventory, don't support private CAs, and give you no audit trail or policy enforcement. You end up writing glue scripts and hoping they don't break.
**Kubernetes-native tools** (cert-manager) work well inside the cluster, but most organizations run mixed infrastructure — NGINX on VMs, HAProxy at the edge, IIS on Windows, maybe an F5. You need a separate solution for everything outside Kubernetes.
**Commercial SaaS platforms** handle more of the lifecycle but are proprietary, cloud-dependent, and priced per certificate. At 100 certs and 20 agents, SaaS pricing runs $3,000-5,000/year and scales linearly. You're paying rent on your own infrastructure's security.
**Enterprise platforms** (Venafi, Keyfactor, AppViewX) are comprehensive but start at $75K/year and require dedicated teams to operate. If you have a 50-server environment, the licensing costs more than the servers.
## What certctl Does Differently
certctl is a self-hosted certificate lifecycle platform. It handles issuance, renewal, deployment, revocation, discovery, and monitoring — with three design decisions that no other tool at any price point combines:
certctl handles issuance, renewal, deployment, revocation, discovery, and monitoring — with three design decisions that no other tool at any price point combines:
### 1. Private Keys Never Leave Your Infrastructure
certctl agents generate private keys locally using ECDSA P-256. The agent creates a CSR and submits it to the control plane. The signed certificate comes back. The private key stays on the agent's filesystem with 0600 permissions.
certctl agents generate ECDSA P-256 private keys locally. The agent creates a CSR and submits it to the control plane. The signed certificate comes back. The private key stays on the agent's filesystem with 0600 permissions — it never crosses the network.
This isn't a premium feature — it's the default behavior in the free tier. Most competitors either generate keys server-side (creating a single point of compromise) or gate key isolation behind paid tiers.
This isn't a premium feature. It's the default behavior, free. Most alternatives either generate keys on the server (creating a single point of compromise) or gate key isolation behind paid tiers.
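The agent-side flow described here (generate an ECDSA P-256 key locally, store it with 0600 permissions, send only a CSR) can be sketched with Go's standard library. This is an illustration, not certctl's agent code: the function names, the temp-directory layout, and the file name `agent.key` are assumptions.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"fmt"
	"os"
	"path/filepath"
)

// generateKeyAndCSR creates a P-256 key, writes it to keyPath with 0600
// permissions, and returns a PEM-encoded CSR. Only the CSR leaves the host.
func generateKeyAndCSR(commonName, keyPath string) ([]byte, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	keyDER, err := x509.MarshalECPrivateKey(key)
	if err != nil {
		return nil, err
	}
	keyPEM := pem.EncodeToMemory(&pem.Block{Type: "EC PRIVATE KEY", Bytes: keyDER})
	if err := os.WriteFile(keyPath, keyPEM, 0o600); err != nil {
		return nil, err
	}
	csrDER, err := x509.CreateCertificateRequest(rand.Reader, &x509.CertificateRequest{
		Subject:  pkix.Name{CommonName: commonName},
		DNSNames: []string{commonName},
	}, key)
	if err != nil {
		return nil, err
	}
	return pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE REQUEST", Bytes: csrDER}), nil
}

// csrCommonName parses a PEM CSR, verifies its signature, and returns the CN.
func csrCommonName(csrPEM []byte) (string, error) {
	block, _ := pem.Decode(csrPEM)
	if block == nil {
		return "", fmt.Errorf("no PEM block found")
	}
	csr, err := x509.ParseCertificateRequest(block.Bytes)
	if err != nil {
		return "", err
	}
	if err := csr.CheckSignature(); err != nil {
		return "", err
	}
	return csr.Subject.CommonName, nil
}

// demoKeyAndCSR runs the flow in a temp directory and reports the key file mode.
func demoKeyAndCSR(commonName string) ([]byte, os.FileMode, error) {
	dir, err := os.MkdirTemp("", "agent-keys-")
	if err != nil {
		return nil, 0, err
	}
	defer os.RemoveAll(dir)
	keyPath := filepath.Join(dir, "agent.key") // hypothetical file name
	csrPEM, err := generateKeyAndCSR(commonName, keyPath)
	if err != nil {
		return nil, 0, err
	}
	info, err := os.Stat(keyPath)
	if err != nil {
		return nil, 0, err
	}
	return csrPEM, info.Mode().Perm(), nil
}

func main() {
	csrPEM, mode, err := demoKeyAndCSR("api.example.com")
	if err != nil {
		panic(err)
	}
	cn, err := csrCommonName(csrPEM)
	fmt.Printf("key mode=%o csr CN=%s err=%v\n", mode, cn, err)
}
```

The CSR carries the public key and a proof of possession (its signature), which is all a CA needs to issue the certificate.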
### 2. CA-Agnostic Issuer Architecture
certctl works with any certificate authority, not just ACME providers:
certctl works with any certificate authority, not just ACME providers. Nine issuer connectors ship today, all free:
- **ACME** (Let's Encrypt, ZeroSSL, Google Trust Services, Buypass) — HTTP-01 and DNS-01 challenges, DNS-PERSIST-01 for zero-touch renewals, External Account Binding
- **step-ca** (Smallstep) — native /sign API with JWK provisioner authentication
- **Local CA** — self-signed or sub-CA mode (chain to your enterprise root CA, e.g. ADCS)
- **OpenSSL / Custom CA** — delegate signing to any shell script with configurable timeout
- **EST enrollment** (RFC 7030) — device certificate enrollment for WiFi/802.1X, MDM, and IoT
- **ACME v2** (Let's Encrypt, ZeroSSL, Google Trust Services, Buypass) — HTTP-01, DNS-01, DNS-PERSIST-01 challenges, External Account Binding, ACME Renewal Information (RFC 9773), certificate profile selection
- **HashiCorp Vault PKI** — `/v1/{mount}/sign/{role}` API, token auth
- **DigiCert CertCentral** — async order model, OV/EV support
- **Sectigo SCM** — async order model, DV/OV/EV support, 3-header auth
- **Google Cloud CAS** — Certificate Authority Service, OAuth2 service account auth, CA pool selection
- **step-ca** (Smallstep) — native /sign API with JWK provisioner auth
- **Local CA** — self-signed or sub-CA mode (chain to ADCS or any enterprise root)
- **OpenSSL / Custom CA** — delegate signing to any shell script
- **EST enrollment** (RFC 7030) — device certs for WiFi/802.1X, MDM, IoT
Every issuer connector implements the same interface. Switching CAs or running multiple CAs in parallel requires zero code changes — just configuration.
Every connector implements the same interface. Running multiple CAs in parallel — Let's Encrypt for public certs, Vault for internal services, your enterprise CA for legacy systems — is configuration, not code.
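The "same interface, different configuration" idea can be sketched as a registry keyed by issuer ID. Everything named here (`Issuer`, `stubIssuer`, `registry`, and the method set) is hypothetical and much smaller than a real connector interface; the issuer IDs are borrowed from the demo seed data purely as example keys.

```go
package main

import (
	"errors"
	"fmt"
)

// Issuer is a hypothetical, minimal connector interface for this sketch.
type Issuer interface {
	Name() string
	Issue(csrPEM string) (certPEM string, err error)
}

// stubIssuer is a stand-in connector that fabricates a placeholder result.
type stubIssuer struct{ name string }

func (s stubIssuer) Name() string { return s.name }

func (s stubIssuer) Issue(csrPEM string) (string, error) {
	if csrPEM == "" {
		return "", errors.New("empty CSR")
	}
	return "cert-from-" + s.name, nil
}

// registry maps configured issuer IDs to connectors. Callers select a CA
// by ID, so adding or swapping a CA changes configuration, not call sites.
type registry map[string]Issuer

func (r registry) issue(issuerID, csrPEM string) (string, error) {
	iss, ok := r[issuerID]
	if !ok {
		return "", fmt.Errorf("unknown issuer %q", issuerID)
	}
	return iss.Issue(csrPEM)
}

// newDemoRegistry wires three stub connectors side by side.
func newDemoRegistry() registry {
	return registry{
		"iss-acme-le": stubIssuer{name: "acme"},
		"iss-vault":   stubIssuer{name: "vault"},
		"iss-local":   stubIssuer{name: "local-ca"},
	}
}

func main() {
	reg := newDemoRegistry()
	for _, id := range []string{"iss-acme-le", "iss-vault", "iss-missing"} {
		out, err := reg.issue(id, "demo-csr")
		fmt.Printf("%s -> %q, err=%v\n", id, out, err)
	}
}
```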
### 3. Post-Deployment Verification
Every other tool in this space stops at "the deployment command succeeded." certctl goes further: after deploying a certificate to a target, the agent connects back to the target's TLS endpoint and verifies the served certificate matches what was deployed, using SHA-256 fingerprint comparison.
Every other tool in this space stops at "the deployment command succeeded." certctl goes further: after deploying a certificate, the agent connects back to the live TLS endpoint and compares the SHA-256 fingerprint of the served certificate against what was deployed.
A reload command can exit 0 while the certificate doesn't take effect — wrong virtual host, stale cache, config that validates but doesn't apply. certctl catches this.
A reload command can exit 0 while the certificate doesn't take effect — wrong virtual host, stale cache, config that validates but doesn't apply. certctl catches this automatically.
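A fingerprint comparison of this kind can be sketched with the standard library. This is an illustration rather than certctl's implementation: the function names are hypothetical, and a local `httptest` TLS server stands in for the deployment target. Chain validation is deliberately skipped because the check compares exact certificate bytes, not trust.

```go
package main

import (
	"crypto/sha256"
	"crypto/tls"
	"encoding/hex"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// fingerprint returns the hex-encoded SHA-256 digest of a DER certificate.
func fingerprint(der []byte) string {
	sum := sha256.Sum256(der)
	return hex.EncodeToString(sum[:])
}

// servedFingerprint connects to addr (host:port) over TLS and returns the
// fingerprint of the leaf certificate the endpoint actually presents.
func servedFingerprint(addr string) (string, error) {
	// InsecureSkipVerify is intentional here: the comparison is on the
	// certificate bytes, so trust-chain validation is not the goal.
	conn, err := tls.Dial("tcp", addr, &tls.Config{InsecureSkipVerify: true})
	if err != nil {
		return "", err
	}
	defer conn.Close()
	certs := conn.ConnectionState().PeerCertificates
	if len(certs) == 0 {
		return "", fmt.Errorf("no certificate presented by %s", addr)
	}
	return fingerprint(certs[0].Raw), nil
}

// verifyDemo starts a local TLS server (stand-in target) and returns the
// fingerprint of the configured certificate and of the served one.
func verifyDemo() (deployed, served string, err error) {
	srv := httptest.NewTLSServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {}))
	defer srv.Close()
	deployed = fingerprint(srv.Certificate().Raw)
	served, err = servedFingerprint(srv.Listener.Addr().String())
	return deployed, served, err
}

func main() {
	deployed, served, err := verifyDemo()
	if err != nil {
		panic(err)
	}
	fmt.Println("deployed:", deployed)
	fmt.Println("served:  ", served)
	fmt.Println("match:   ", deployed == served)
}
```

For a target that serves several virtual hosts, setting `ServerName` in the `tls.Config` would select the certificate for the intended host.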
## What Else Ships Free
The three differentiators above get the headlines, but the feature surface is wider than most paid platforms:
**13 deployment targets** — NGINX, Apache, HAProxy, Traefik, Caddy, Envoy, IIS (local PowerShell + remote WinRM), F5 BIG-IP (proxy agent + iControl REST), Postfix, Dovecot, SSH (agentless), Windows Certificate Store, and Java Keystore. All use a pluggable connector model. The control plane never initiates outbound connections — agents poll for work, meaning certctl works behind firewalls, across network zones, and in air-gapped environments.
**Network certificate discovery** — active TLS scanning of CIDR ranges finds certificates you didn't know existed. Agents also scan local filesystems for PEM/DER files. Everything feeds into a triage workflow where you claim, dismiss, or import discovered certs into management.
**Immutable audit trail** — every API call recorded (method, path, actor, body hash, status, latency). Every certificate lifecycle event tracked. Append-only, no update or delete. Mapped to SOC 2, PCI-DSS 4.0, and NIST SP 800-57 compliance frameworks with published evidence guides.
**Policy engine** — 5 rule types (allowed issuers, allowed domains, required metadata, allowed environments, renewal lead time) with violation tracking and severity levels.
**PKI compliance** — DER-encoded X.509 CRL signed by issuing CA, embedded OCSP responder, RFC 5280 revocation with all reason codes, short-lived certificate exemption.
**Prometheus metrics** — `/api/v1/metrics/prometheus` in standard exposition format. Works with Prometheus, Grafana Agent, Datadog Agent, Victoria Metrics.
**MCP server** — the entire REST API is exposed via MCP for AI-assisted certificate management via Claude, Cursor, or any MCP-compatible client. No other certificate platform offers this.
**Full REST API** — OpenAPI 3.1-documented operations covering the entire platform. CLI tool with 10 subcommands. Helm chart for Kubernetes deployment. Scheduled certificate digest emails. Certificate export in PEM and PKCS#12. S/MIME support with EKU-aware issuance.
**Extensively tested** — Go backend with race detection, static analysis (golangci-lint), and vulnerability scanning (govulncheck) on every commit. CI-enforced per-layer coverage thresholds. Frontend test suite. Every push is gated.
## How certctl Compares
### vs. CertKit
### vs. ACME Clients
Closest competitor architecturally — agent-based, private key isolation (Keystore), multi-platform. certctl leads on issuer coverage (ACME + step-ca + Local CA + OpenSSL + EST vs. ACME-only), PKI compliance (CRL, OCSP, RFC 5280 revocation, immutable audit trail — all missing from CertKit today), policy engine (5 rule types vs. none), and network discovery (CIDR TLS scanning vs. none). certctl is source-available (BSL 1.1 → Apache 2.0) with no cert limit; CertKit is proprietary SaaS with a 3-cert free tier. Where CertKit leads: more deployment targets today (adds LiteSpeed, IIS, auto-detection), Windows support, Kubernetes, and polished SaaS onboarding.
ACME clients solve one slice of the problem — issuance and renewal from ACME CAs. certctl replaces the ACME client, adds 6 more CA integrations, deploys the cert to the right server, verifies it's live, tracks it in an inventory, alerts on expiry, logs everything to an audit trail, and enforces policy. If you're currently running certbot behind a cron job and a prayer, certctl replaces all of it.
### vs. KeyTalk
### vs. Agent-Based SaaS
Commercial (proprietary) PKI platform from a Dutch company — on-prem appliance, cloud, or managed service. Broader cert type coverage (TLS, S/MIME, device auth, VPN) and DigiCert + SCEP integrations. No public documentation on policy engine, API surface, or audit capabilities. No free tier, no public pricing. certctl trades breadth of cert types for full transparency — source-available, public API spec, free community edition with no limits.
The closest architectural competitors use the same agent model — local key generation, CSR submission, push-based deployment. Where certctl differs: it supports 9 issuer types (not just ACME), provides CRL/OCSP/revocation infrastructure (not just issuance), includes a policy engine and network discovery, and is source-available with no certificate limit. SaaS alternatives are typically proprietary, priced per certificate ($2+/cert/month), and cap their free tiers at 3-5 certificates. certctl is free for any number of certificates, forever.
### vs. Enterprise Platforms (Venafi, Keyfactor)
### vs. Commercial PKI Platforms
Comprehensive solutions with decades of features — at $75K-$250K+/yr. certctl targets organizations that need 80% of those capabilities at 1% of the cost. The trade-off: no SSO/RBAC yet (coming in certctl Pro), no F5/IIS target connectors yet, no SLA-backed support.
On-prem or hosted commercial platforms offer broader cert type coverage (VPN certs, device auth, SCEP) and deeper CA integrations. The trade-off: no free tier, opaque pricing (often €13K+/year for 1,500 certs), proprietary codebases, and no public API documentation. certctl trades breadth of exotic cert types for full transparency — source-available code, fully documented OpenAPI spec, and a free community edition with no artificial limits.
## Getting Started
### vs. Enterprise Platforms
Venafi and Keyfactor offer decades of features at $75K-$250K+/year. certctl targets organizations that need 80% of those capabilities at a fraction of the cost. What certctl doesn't have yet: SSO/RBAC (coming in certctl Pro), vendor SLA-backed support. What certctl does have that enterprise platforms don't: an MCP server for AI-assisted management, ACME ARI (RFC 9773) for CA-directed renewal timing, and a deployment model that works in 5 minutes instead of 5 months.
## Who Should Look Elsewhere
certctl isn't the right tool for everyone:
- **Single-domain sites** — if you have one certificate on one server, certbot is fine. certctl is designed for managing tens to hundreds of certificates across multiple servers and CAs.
- **Pure Kubernetes environments** — if every workload runs in-cluster and you're happy with cert-manager, there's no reason to add another tool. certctl shines when your infrastructure extends beyond Kubernetes.
- **Organizations that need a vendor SLA today** — certctl is source-available software maintained by a small team. If you need contractual uptime guarantees and a support hotline, an enterprise platform is the right choice (for now).
## See It Running
The demo seeds certificates across multiple issuers, agents, and deployment targets with 180 days of realistic history — jobs, audit events, discovery scans, approval workflows — so you can explore every feature immediately.
```bash
# Clone and start with Docker Compose (includes demo data)
git clone https://github.com/shankar0123/certctl.git
cd certctl/deploy
docker compose up -d
# Open the dashboard
open http://localhost:8443
cd certctl/deploy && docker compose up -d
# Dashboard at http://localhost:8443
```
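The server may take a few seconds to accept connections after `docker compose up -d`. A minimal sketch of a wait loop follows, assuming the server answers on the `/health` path used by the Compose healthchecks in this changeset. The function names `probe` and `wait_for_health`, the attempt count, and the delay are illustrative and not part of certctl.

```shell
#!/usr/bin/env bash
# Sketch: poll a health URL until it answers or the attempts run out.
# The URL below is an assumption based on the Compose healthchecks;
# adjust it to match your deployment.
HEALTH_URL="${HEALTH_URL:-http://localhost:8443/health}"

# probe URL: succeeds when the URL returns a non-error HTTP status.
# Kept separate so it can be swapped for another check.
probe() {
  curl -sf "$1" >/dev/null 2>&1
}

# wait_for_health URL [ATTEMPTS] [DELAY_SECONDS]
# Returns 0 as soon as probe succeeds, 1 if every attempt fails.
wait_for_health() {
  local url="$1" attempts="${2:-30}" delay="${3:-2}" i=1
  while [ "$i" -le "$attempts" ]; do
    if probe "$url"; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not healthy after $attempts attempt(s)" >&2
  return 1
}

# Usage (not run automatically):
#   wait_for_health "$HEALTH_URL" && echo "Dashboard at http://localhost:8443"
```

Keeping the probe in its own function makes it easy to substitute a different readiness check, for example one that also inspects the response body.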
The demo seeds 35 certificates across 5 issuers, 8 agents, 8 deployment targets, 90 days of job history, discovery scan data, network scan targets, and pending approval jobs so you can explore every feature immediately.
See the [Quickstart Guide](quickstart.md) for a full walkthrough.
See the [Quickstart Guide](quickstart.md) for a full walkthrough, or explore the [5 turnkey examples](../examples/) for specific scenarios (ACME+NGINX, wildcard DNS-01, private CA+Traefik, step-ca+HAProxy, multi-issuer).
## License
certctl is licensed under the [Business Source License 1.1](../LICENSE). The licensed work is free to use for any purpose other than offering a competing managed service. The license converts to Apache 2.0 on March 1, 2033.
certctl is source-available under the [Business Source License 1.1](../LICENSE). Free for any use except offering a competing managed service. Converts to Apache 2.0 on March 14, 2033.
The source is available, auditable, and self-hostable. You own your data, your keys, and your deployment.
You own your data, your keys, and your deployment.
+12 -10
@@ -13,16 +13,18 @@ This example demonstrates certctl's core use case: **automatically manage TLS ce
## Architecture
```
Your Domain (example.com)
↓ [HTTP-01 validation, port 80]
Let's Encrypt ACME
↓ [CSR submission]
certctl Server (control plane)
↓ [API polling]
certctl Agent (on NGINX server)
↓ [deploy cert+key]
NGINX Reverse Proxy
```mermaid
flowchart TD
A["Your Domain (example.com)"]
B["Let's Encrypt ACME"]
C["certctl Server (control plane)"]
D["certctl Agent (on NGINX server)"]
E["NGINX Reverse Proxy"]
A -->|HTTP-01 validation<br/>port 80| B
B -->|CSR submission| C
C -->|API polling| D
D -->|deploy cert+key| E
```
## Prerequisites
+2 -2
@@ -26,7 +26,7 @@ services:
container_name: certctl-server-acme-nginx
environment:
# Database
DATABASE_URL: postgres://certctl:${DB_PASSWORD:-certctl-dev-password}@postgres:5432/certctl?sslmode=disable
CERTCTL_DATABASE_URL: postgres://certctl:${DB_PASSWORD:-certctl-dev-password}@postgres:5432/certctl?sslmode=disable
# Server settings
CERTCTL_SERVER_PORT: 8443
@@ -61,7 +61,7 @@ services:
networks:
- certctl-network
healthcheck:
test: ['CMD-SHELL', 'curl -sf http://localhost:8443/api/v1/health || exit 1']
test: ['CMD-SHELL', 'curl -sf http://localhost:8443/health || exit 1']
interval: 10s
timeout: 5s
retries: 3
@@ -50,7 +50,7 @@ services:
container_name: certctl-server-dns01
environment:
# Database
DATABASE_URL: postgres://certctl:${DB_PASSWORD:-certctl-dev-password}@postgres:5432/certctl?sslmode=disable
CERTCTL_DATABASE_URL: postgres://certctl:${DB_PASSWORD:-certctl-dev-password}@postgres:5432/certctl?sslmode=disable
# Server settings
CERTCTL_SERVER_PORT: 8443
@@ -88,7 +88,7 @@ services:
# Default is 30s; increase if your DNS propagates slowly
# Set via CERTCTL_ACME_DNS_PROPAGATION_WAIT in code, or rely on default
# Optional: Let's Encrypt Renewal Information (RFC 9702) for CA-directed renewal timing
# Optional: Let's Encrypt Renewal Information (RFC 9773) for CA-directed renewal timing
# CERTCTL_ACME_ARI_ENABLED: "true"
# Local CA as fallback for internal services (optional)
@@ -113,7 +113,7 @@ services:
- certctl-network
healthcheck:
test: ['CMD-SHELL', 'curl -sf http://localhost:8443/api/v1/health || exit 1']
test: ['CMD-SHELL', 'curl -sf http://localhost:8443/health || exit 1']
interval: 10s
timeout: 5s
retries: 3
+2 -2
@@ -27,7 +27,7 @@ services:
container_name: certctl-server-multi-issuer
environment:
# Database
DATABASE_URL: postgres://certctl:${DB_PASSWORD:-certctl-dev-password}@postgres:5432/certctl?sslmode=disable
CERTCTL_DATABASE_URL: postgres://certctl:${DB_PASSWORD:-certctl-dev-password}@postgres:5432/certctl?sslmode=disable
# Server settings
CERTCTL_SERVER_PORT: 8443
@@ -64,7 +64,7 @@ services:
networks:
- certctl-network
healthcheck:
test: ['CMD-SHELL', 'curl -sf http://localhost:8443/api/v1/health || exit 1']
test: ['CMD-SHELL', 'curl -sf http://localhost:8443/health || exit 1']
interval: 10s
timeout: 5s
retries: 3
+24 -22
@@ -13,27 +13,29 @@ With certctl, both issuer types are configured and available. You assign each ce
## Architecture
```
┌─────────────────────────────────────────────────────────────────┐
│ certctl Server (Control Plane) │
│ - Let's Encrypt ACME issuer (HTTP-01 challenges)
│ - Local CA issuer (self-signed or sub-CA mode)
│ - PostgreSQL database (cert inventory, audit, jobs)
└─────────────────────────────────────────────────────────────────┘
│ API polling
┌─────────────────────────────────────────────────────────────────┐
│ certctl Agent │
│ - Discovers existing certs in /etc/nginx/ssl and /etc/app/ssl │
│ - Polls server for renewal/issuance/deployment jobs │
│ - Generates keys locally (agent-side crypto) │
│ - Deploys certs to NGINX and app service directories │
└─────────────────────────────────────────────────────────────────┘
│ │
▼ ▼
NGINX (public TLS) App Services (internal TLS)
(Let's Encrypt certs) (Local CA certs)
```mermaid
flowchart TD
subgraph Server ["certctl Server (Control Plane)"]
A["Let's Encrypt ACME issuer<br/>(HTTP-01 challenges)"]
B["Local CA issuer<br/>(self-signed or sub-CA mode)"]
C["PostgreSQL database<br/>(cert inventory, audit, jobs)"]
end
subgraph Agent ["certctl Agent"]
D["Discovers existing certs<br/>(/etc/nginx/ssl, /etc/app/ssl)"]
E["Polls server for<br/>renewal/issuance/deployment jobs"]
F["Generates keys locally<br/>(agent-side crypto)"]
G["Deploys certs to NGINX<br/>and app service directories"]
end
subgraph Targets ["Target Services"]
H["NGINX (public TLS)<br/>(Let's Encrypt certs)"]
I["App Services (internal TLS)<br/>(Local CA certs)"]
end
Server -->|API polling| Agent
Agent -->|Deploy| H
Agent -->|Deploy| I
```
## Prerequisites
@@ -212,7 +214,7 @@ Each agent independently manages its local cert inventory and deployments. The s
- For ACME, ensure ports 80/443 are open and your domain resolves
### Agent can't reach server
- Check network: `docker compose exec certctl-agent curl http://certctl-server:8443/api/v1/health`
- Check network: `docker compose exec certctl-agent curl http://certctl-server:8443/health`
- Verify `CERTCTL_SERVER_URL` environment variable
### No issuers showing up
@@ -26,7 +26,7 @@ services:
container_name: certctl-server-private-ca
environment:
# Database
DATABASE_URL: postgres://certctl:${DB_PASSWORD:-certctl-dev-password}@postgres:5432/certctl?sslmode=disable
CERTCTL_DATABASE_URL: postgres://certctl:${DB_PASSWORD:-certctl-dev-password}@postgres:5432/certctl?sslmode=disable
# Server settings
CERTCTL_SERVER_PORT: 8443
@@ -77,7 +77,7 @@ services:
networks:
- certctl-network
healthcheck:
test: ['CMD-SHELL', 'curl -sf http://localhost:8443/api/v1/health || exit 1']
test: ['CMD-SHELL', 'curl -sf http://localhost:8443/health || exit 1']
interval: 10s
timeout: 5s
retries: 3
@@ -17,29 +17,16 @@ This example demonstrates certctl managing certificates for **internal services
## Architecture
```
┌──────────────────┐
│ certctl-server │ (Local CA issuer)
│ (control │
│ plane) │
└────────┬─────────┘
│ REST API (job polling)
┌────────▼──────────┐
│ certctl-agent │ (certificate deployer)
└────────┬──────────┘
│ Write cert/key files
┌────────▼──────────────────────┐
│ Traefik │
│ (watches cert directory) │
└────────────────────────────────┘
│ TLS handshakes
[Internal Services]
```mermaid
flowchart TD
A["certctl-server<br/>(control plane)<br/>(Local CA issuer)"]
B["certctl-agent<br/>(certificate deployer)"]
C["Traefik<br/>(watches cert directory)"]
D["[Internal Services]"]
A -->|REST API<br/>job polling| B
B -->|Write cert/key files| C
C -->|TLS handshakes| D
```
## Quick Start (Self-Signed CA)
+2 -2
@@ -81,7 +81,7 @@ services:
container_name: certctl-server-stepca-haproxy
environment:
# Database
DATABASE_URL: postgres://certctl:${DB_PASSWORD:-certctl-dev-password}@postgres:5432/certctl?sslmode=disable
CERTCTL_DATABASE_URL: postgres://certctl:${DB_PASSWORD:-certctl-dev-password}@postgres:5432/certctl?sslmode=disable
# Server settings
CERTCTL_SERVER_PORT: 8443
@@ -119,7 +119,7 @@ services:
networks:
- certctl-network
healthcheck:
test: ['CMD-SHELL', 'curl -sf http://localhost:8443/api/v1/health || exit 1']
test: ['CMD-SHELL', 'curl -sf http://localhost:8443/health || exit 1']
interval: 10s
timeout: 5s
retries: 3
+1 -1
@@ -315,7 +315,7 @@ Common issues:
Verify network:
```bash
docker compose exec certctl-agent curl http://certctl-server:8443/api/v1/health
docker compose exec certctl-agent curl http://certctl-server:8443/health
```
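When the check above fails, curl's exit code usually narrows down the cause. The helper below is a hedged sketch: `explain_curl_exit` is a hypothetical name, the hints are general guidance rather than certctl-specific diagnostics, and code 22 is only reported when curl runs with `-f`.

```shell
#!/usr/bin/env bash
# Sketch: map common curl exit codes to troubleshooting hints.
# explain_curl_exit CODE
explain_curl_exit() {
  case "$1" in
    0)  echo "server reachable and health endpoint returned success" ;;
    6)  echo "could not resolve host: check the service name and Docker network" ;;
    7)  echo "connection failed: server not listening yet or wrong port" ;;
    22) echo "HTTP error status: server is up but the path or its health is wrong" ;;
    28) echo "timed out: check firewall rules or network attachment" ;;
    *)  echo "curl exited with code $1: see the curl man page" ;;
  esac
}

# Usage (not run automatically; assumes the exec exit status reflects curl's):
#   docker compose exec certctl-agent \
#     curl -sf --max-time 5 http://certctl-server:8443/health
#   explain_curl_exit $?
```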
### HAProxy config validation fails
+23 -3
@@ -1,6 +1,6 @@
module github.com/shankar0123/certctl
go 1.25.0
go 1.25.9
require (
github.com/google/uuid v1.6.0
@@ -10,14 +10,20 @@ require (
)
require (
golang.org/x/crypto v0.31.0
github.com/masterzen/winrm v0.0.0-20250927112105-5f8e6c707321
github.com/pkg/sftp v1.13.10
golang.org/x/crypto v0.41.0
software.sslmate.com/src/go-pkcs12 v0.7.0
)
require (
dario.cat/mergo v1.0.0 // indirect
github.com/Azure/go-ansiterm v0.0.0-20210617225240-d185dfc1b5a1 // indirect
github.com/Azure/go-ntlmssp v0.0.0-20221128193559-754e69321358 // indirect
github.com/ChrisTrenkamp/goxpath v0.0.0-20210404020558-97928f7e12b6 // indirect
github.com/Microsoft/go-winio v0.6.2 // indirect
github.com/bodgit/ntlmssp v0.0.0-20240506230425-31973bb52d9b // indirect
github.com/bodgit/windows v1.0.1 // indirect
github.com/cenkalti/backoff/v4 v4.2.1 // indirect
github.com/containerd/containerd v1.7.18 // indirect
github.com/containerd/log v0.1.0 // indirect
@@ -32,12 +38,23 @@ require (
github.com/go-logr/logr v1.4.1 // indirect
github.com/go-logr/stdr v1.2.2 // indirect
github.com/go-ole/go-ole v1.2.6 // indirect
github.com/gofrs/uuid v4.4.0+incompatible // indirect
github.com/gogo/protobuf v1.3.2 // indirect
github.com/google/jsonschema-go v0.4.2 // indirect
github.com/hashicorp/go-cleanhttp v0.5.2 // indirect
github.com/hashicorp/go-uuid v1.0.3 // indirect
github.com/jcmturner/aescts/v2 v2.0.0 // indirect
github.com/jcmturner/dnsutils/v2 v2.0.0 // indirect
github.com/jcmturner/gofork v1.7.6 // indirect
github.com/jcmturner/goidentity/v6 v6.0.1 // indirect
github.com/jcmturner/gokrb5/v8 v8.4.4 // indirect
github.com/jcmturner/rpc/v2 v2.0.3 // indirect
github.com/klauspost/compress v1.17.4 // indirect
github.com/kr/fs v0.1.0 // indirect
github.com/kr/text v0.2.0 // indirect
github.com/lufia/plan9stats v0.0.0-20211012122336-39d0f177ccd0 // indirect
github.com/magiconair/properties v1.8.7 // indirect
github.com/masterzen/simplexml v0.0.0-20190410153822-31eea3082786 // indirect
github.com/moby/docker-image-spec v1.3.1 // indirect
github.com/moby/patternmatcher v0.6.0 // indirect
github.com/moby/sys/sequential v0.5.0 // indirect
@@ -54,7 +71,8 @@ require (
github.com/shirou/gopsutil/v3 v3.23.12 // indirect
github.com/shoenig/go-m1cpu v0.1.6 // indirect
github.com/sirupsen/logrus v1.9.3 // indirect
github.com/stretchr/testify v1.9.0 // indirect
github.com/stretchr/testify v1.10.0 // indirect
github.com/tidwall/transform v0.0.0-20201103190739-32f242e2dbde // indirect
github.com/tklauser/go-sysconf v0.3.12 // indirect
github.com/tklauser/numcpus v0.6.1 // indirect
github.com/yosida95/uritemplate/v3 v3.0.2 // indirect
@@ -63,7 +81,9 @@ require (
go.opentelemetry.io/otel v1.24.0 // indirect
go.opentelemetry.io/otel/metric v1.24.0 // indirect
go.opentelemetry.io/otel/trace v1.24.0 // indirect
golang.org/x/net v0.42.0 // indirect
golang.org/x/oauth2 v0.34.0 // indirect
golang.org/x/sys v0.40.0 // indirect
golang.org/x/text v0.28.0 // indirect
gopkg.in/yaml.v3 v3.0.1 // indirect
)
+75 -10
@@ -4,8 +4,16 @@ github.com/AdaLogics/go-fuzz-headers v0.0.0-20230811130428-ced1acdcaa24 h1:bvDV9
github.com/AdaLogics/go-fuzz-headers v0.0.0-20230811130428-ced1acdcaa24/go.mod h1:8o94RPi1/7XTJvwPpRSzSUedZrtlirdB3r9Z20bi2f8=
github.com/Azure/go-ansiterm v0.0.0-20210617225240-d185dfc1b5a1 h1:UQHMgLO+TxOElx5B5HZ4hJQsoJ/PvUvKRhJHDQXO8P8=
github.com/Azure/go-ansiterm v0.0.0-20210617225240-d185dfc1b5a1/go.mod h1:xomTg63KZ2rFqZQzSB4Vz2SUXa1BpHTVz9L5PTmPC4E=
github.com/Azure/go-ntlmssp v0.0.0-20221128193559-754e69321358 h1:mFRzDkZVAjdal+s7s0MwaRv9igoPqLRdzOLzw/8Xvq8=
github.com/Azure/go-ntlmssp v0.0.0-20221128193559-754e69321358/go.mod h1:chxPXzSsl7ZWRAuOIE23GDNzjWuZquvFlgA8xmpunjU=
github.com/ChrisTrenkamp/goxpath v0.0.0-20210404020558-97928f7e12b6 h1:w0E0fgc1YafGEh5cROhlROMWXiNoZqApk2PDN0M1+Ns=
github.com/ChrisTrenkamp/goxpath v0.0.0-20210404020558-97928f7e12b6/go.mod h1:nuWgzSkT5PnyOd+272uUmV0dnAnAn42Mk7PiQC5VzN4=
github.com/Microsoft/go-winio v0.6.2 h1:F2VQgta7ecxGYO8k3ZZz3RS8fVIXVxONVUPlNERoyfY=
github.com/Microsoft/go-winio v0.6.2/go.mod h1:yd8OoFMLzJbo9gZq8j5qaps8bJ9aShtEA8Ipt1oGCvU=
github.com/bodgit/ntlmssp v0.0.0-20240506230425-31973bb52d9b h1:baFN6AnR0SeC194X2D292IUZcHDs4JjStpqtE70fjXE=
github.com/bodgit/ntlmssp v0.0.0-20240506230425-31973bb52d9b/go.mod h1:Ram6ngyPDmP+0t6+4T2rymv0w0BS9N8Ch5vvUJccw5o=
github.com/bodgit/windows v1.0.1 h1:tF7K6KOluPYygXa3Z2594zxlkbKPAOvqr97etrGNIz4=
github.com/bodgit/windows v1.0.1/go.mod h1:a6JLwrB4KrTR5hBpp8FI9/9W9jJfeQ2h4XDXU74ZCdM=
github.com/cenkalti/backoff/v4 v4.2.1 h1:y4OZtCnogmCPw98Zjyt5a6+QwPLGkiQsYW5oUqylYbM=
github.com/cenkalti/backoff/v4 v4.2.1/go.mod h1:Y3VNntkOUPxTVeUxJ/G5vcM//AlwfmyYozVcomhLiZE=
github.com/containerd/containerd v1.7.18 h1:jqjZTQNfXGoEaZdW1WwPU0RqSn1Bm2Ay/KJPUuO8nao=
@@ -39,6 +47,8 @@ github.com/go-logr/stdr v1.2.2 h1:hSWxHoqTgW2S2qGc0LTAI563KZ5YKYRhT3MFKZMbjag=
github.com/go-logr/stdr v1.2.2/go.mod h1:mMo/vtBO5dYbehREoey6XUKy/eSumjCCveDpRre4VKE=
github.com/go-ole/go-ole v1.2.6 h1:/Fpf6oFPoeFik9ty7siob0G6Ke8QvQEuVcuChpwXzpY=
github.com/go-ole/go-ole v1.2.6/go.mod h1:pprOEPIfldk/42T2oK7lQ4v4JSDwmV0As9GaiUsvbm0=
github.com/gofrs/uuid v4.4.0+incompatible h1:3qXRTX8/NbyulANqlc0lchS1gqAVxRgsuW1YrTJupqA=
github.com/gofrs/uuid v4.4.0+incompatible/go.mod h1:b2aQJv3Z4Fp6yNu3cdSllBxTCLRxnplIgP/c0N/04lM=
github.com/gogo/protobuf v1.3.2 h1:Ov1cvc58UF3b5XjBnZv7+opcTcQFZebYjWzi34vdm4Q=
github.com/gogo/protobuf v1.3.2/go.mod h1:P1XiOD3dCwIKUDQYPy72D8LYyHL2YPYrpS2s69NZV8Q=
github.com/golang-jwt/jwt/v5 v5.3.0 h1:pv4AsKCKKZuqlgs5sUmn4x8UlGa0kEVt/puTpKx9vvo=
@@ -52,12 +62,35 @@ github.com/google/jsonschema-go v0.4.2 h1:tmrUohrwoLZZS/P3x7ex0WAVknEkBZM46iALbc
github.com/google/jsonschema-go v0.4.2/go.mod h1:r5quNTdLOYEz95Ru18zA0ydNbBuYoo9tgaYcxEYhJVE=
github.com/google/uuid v1.6.0 h1:NIvaJDMOsjHA8n1jAhLSgzrAzy1Hgr+hNrb57e+94F0=
github.com/google/uuid v1.6.0/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
github.com/gorilla/securecookie v1.1.1 h1:miw7JPhV+b/lAHSXz4qd/nN9jRiAFV5FwjeKyCS8BvQ=
github.com/gorilla/securecookie v1.1.1/go.mod h1:ra0sb63/xPlUeL+yeDciTfxMRAA+MP+HVt/4epWDjd4=
github.com/gorilla/sessions v1.2.1 h1:DHd3rPN5lE3Ts3D8rKkQ8x/0kqfeNmBAaiSi+o7FsgI=
github.com/gorilla/sessions v1.2.1/go.mod h1:dk2InVEVJ0sfLlnXv9EAgkf6ecYs/i80K/zI+bUmuGM=
github.com/grpc-ecosystem/grpc-gateway/v2 v2.16.0 h1:YBftPWNWd4WwGqtY2yeZL2ef8rHAxPBD8KFhJpmcqms=
github.com/grpc-ecosystem/grpc-gateway/v2 v2.16.0/go.mod h1:YN5jB8ie0yfIUg6VvR9Kz84aCaG7AsGZnLjhHbUqwPg=
github.com/hashicorp/go-cleanhttp v0.5.2 h1:035FKYIWjmULyFRBKPs8TBQoi0x6d9G4xc9neXJWAZQ=
github.com/hashicorp/go-cleanhttp v0.5.2/go.mod h1:kO/YDlP8L1346E6Sodw+PrpBSV4/SoxCXGY6BqNFT48=
github.com/hashicorp/go-uuid v1.0.2/go.mod h1:6SBZvOh/SIDV7/2o3Jml5SYk/TvGqwFJ/bN7x4byOro=
github.com/hashicorp/go-uuid v1.0.3 h1:2gKiV6YVmrJ1i2CKKa9obLvRieoRGviZFL26PcT/Co8=
github.com/hashicorp/go-uuid v1.0.3/go.mod h1:6SBZvOh/SIDV7/2o3Jml5SYk/TvGqwFJ/bN7x4byOro=
github.com/jcmturner/aescts/v2 v2.0.0 h1:9YKLH6ey7H4eDBXW8khjYslgyqG2xZikXP0EQFKrle8=
github.com/jcmturner/aescts/v2 v2.0.0/go.mod h1:AiaICIRyfYg35RUkr8yESTqvSy7csK90qZ5xfvvsoNs=
github.com/jcmturner/dnsutils/v2 v2.0.0 h1:lltnkeZGL0wILNvrNiVCR6Ro5PGU/SeBvVO/8c/iPbo=
github.com/jcmturner/dnsutils/v2 v2.0.0/go.mod h1:b0TnjGOvI/n42bZa+hmXL+kFJZsFT7G4t3HTlQ184QM=
github.com/jcmturner/gofork v1.7.6 h1:QH0l3hzAU1tfT3rZCnW5zXl+orbkNMMRGJfdJjHVETg=
github.com/jcmturner/gofork v1.7.6/go.mod h1:1622LH6i/EZqLloHfE7IeZ0uEJwMSUyQ/nDd82IeqRo=
github.com/jcmturner/goidentity/v6 v6.0.1 h1:VKnZd2oEIMorCTsFBnJWbExfNN7yZr3EhJAxwOkZg6o=
github.com/jcmturner/goidentity/v6 v6.0.1/go.mod h1:X1YW3bgtvwAXju7V3LCIMpY0Gbxyjn/mY9zx4tFonSg=
github.com/jcmturner/gokrb5/v8 v8.4.4 h1:x1Sv4HaTpepFkXbt2IkL29DXRf8sOfZXo8eRKh687T8=
github.com/jcmturner/gokrb5/v8 v8.4.4/go.mod h1:1btQEpgT6k+unzCwX1KdWMEwPPkkgBtP+F6aCACiMrs=
github.com/jcmturner/rpc/v2 v2.0.3 h1:7FXXj8Ti1IaVFpSAziCZWNzbNuZmnvw/i6CqLNdWfZY=
github.com/jcmturner/rpc/v2 v2.0.3/go.mod h1:VUJYCIDm3PVOEHw8sgt091/20OJjskO/YJki3ELg/Hc=
github.com/kisielk/errcheck v1.5.0/go.mod h1:pFxgyoBC7bSaBwPgfKdkLd5X25qrDl4LWUI2bnpBCr8=
github.com/kisielk/gotool v1.0.0/go.mod h1:XhKaO+MFFWcvkIS/tQcRk01m1F5IRFswLeQ+oQHNcck=
github.com/klauspost/compress v1.17.4 h1:Ej5ixsIri7BrIjBkRZLTo6ghwrEtHFk7ijlczPW4fZ4=
github.com/klauspost/compress v1.17.4/go.mod h1:/dCuZOvVtNoHsyb+cuJD3itjs3NbnF6KH9zAO4BDxPM=
github.com/kr/fs v0.1.0 h1:Jskdu9ieNAYnjxsi0LbQp1ulIKZV1LAFgK1tWhpZgl8=
github.com/kr/fs v0.1.0/go.mod h1:FFnZGqtBN9Gxj7eW1uZ42v5BccTP0vu6NEaFoC2HwRg=
github.com/kr/pretty v0.3.0 h1:WgNl7dwNpEZ6jJ9k1snq4pZsg7DOEN8hP9Xw0Tsjwk0=
github.com/kr/pretty v0.3.0/go.mod h1:640gp4NfQd8pI5XOwp5fnNeVWj67G7CFk/SaSQn7NBk=
github.com/kr/text v0.2.0 h1:5Nx0Ya0ZqY2ygV366QzturHI13Jq95ApcVaJBhpS+AY=
@@ -68,6 +101,10 @@ github.com/lufia/plan9stats v0.0.0-20211012122336-39d0f177ccd0 h1:6E+4a0GO5zZEnZ
github.com/lufia/plan9stats v0.0.0-20211012122336-39d0f177ccd0/go.mod h1:zJYVVT2jmtg6P3p1VtQj7WsuWi/y4VnjVBn7F8KPB3I=
github.com/magiconair/properties v1.8.7 h1:IeQXZAiQcpL9mgcAe1Nu6cX9LLw6ExEHKjN0VQdvPDY=
github.com/magiconair/properties v1.8.7/go.mod h1:Dhd985XPs7jluiymwWYZ0G4Z61jb3vdS329zhj2hYo0=
github.com/masterzen/simplexml v0.0.0-20190410153822-31eea3082786 h1:2ZKn+w/BJeL43sCxI2jhPLRv73oVVOjEKZjKkflyqxg=
github.com/masterzen/simplexml v0.0.0-20190410153822-31eea3082786/go.mod h1:kCEbxUJlNDEBNbdQMkPSp6yaKcRXVI6f4ddk8Riv4bc=
github.com/masterzen/winrm v0.0.0-20250927112105-5f8e6c707321 h1:AKIJL2PfBX2uie0Mn5pxtG1+zut3hAVMZbRfoXecFzI=
github.com/masterzen/winrm v0.0.0-20250927112105-5f8e6c707321/go.mod h1:JajVhkiG2bYSNYYPYuWG7WZHr42CTjMTcCjfInRNCqc=
github.com/moby/docker-image-spec v1.3.1 h1:jMKff3w6PgbfSa69GfNg+zN/XLhfXJGnEx3Nl2EsFP0=
github.com/moby/docker-image-spec v1.3.1/go.mod h1:eKmb5VW8vQEh/BAr2yvVNvuiJuY6UIocYsFu/DxxRpo=
github.com/moby/patternmatcher v0.6.0 h1:GmP9lR19aU5GqSSFko+5pRqHi+Ohk1O69aFiKkVGiPk=
@@ -88,6 +125,8 @@ github.com/opencontainers/image-spec v1.1.0 h1:8SG7/vwALn54lVB/0yZ/MMwhFrPYtpEHQ
github.com/opencontainers/image-spec v1.1.0/go.mod h1:W4s4sFTMaBeK1BQLXbG4AdM2szdn85PY75RI83NrTrM=
github.com/pkg/errors v0.9.1 h1:FEBLx1zS214owpjy7qsBeixbURkuhQAwrK5UwLGTwt4=
github.com/pkg/errors v0.9.1/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0=
github.com/pkg/sftp v1.13.10 h1:+5FbKNTe5Z9aspU88DPIKJ9z2KZoaGCu6Sr6kKR/5mU=
github.com/pkg/sftp v1.13.10/go.mod h1:bJ1a7uDhrX/4OII+agvy28lzRvQrmIQuaHrcI1HbeGA=
github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
github.com/power-devops/perfstat v0.0.0-20210106213030-5aafc221ea8c h1:ncq/mPwQF4JjgDlrVEn3C11VoGHZN7m8qihwgMEtzYw=
@@ -111,14 +150,18 @@ github.com/stretchr/objx v0.4.0/go.mod h1:YvHI0jy2hoMjB+UWwv71VJQ9isScKT/TqJzVSS
github.com/stretchr/objx v0.5.0/go.mod h1:Yh+to48EsGEfYuaHDzXPcE3xhTkx73EhmCGUpEOglKo=
github.com/stretchr/objx v0.5.2 h1:xuMeJ0Sdp5ZMRXx/aWO6RZxdr3beISkG5/G/aIRr3pY=
github.com/stretchr/objx v0.5.2/go.mod h1:FRsXN1f5AsAjCGJKqEizvkpNtU+EGNCLh3NxZ/8L+MA=
github.com/stretchr/testify v1.4.0/go.mod h1:j7eGeouHqKxXV5pUuKE4zz7dFj8WfuZ+81PSLYec5m4=
github.com/stretchr/testify v1.7.0/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg=
github.com/stretchr/testify v1.7.1/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg=
github.com/stretchr/testify v1.8.0/go.mod h1:yNjHg4UonilssWZ8iaSj1OCr/vHnekPRkoO+kdMU+MU=
github.com/stretchr/testify v1.8.1/go.mod h1:w2LPCIKwWwSfY2zedu0+kehJoqGctiVI29o6fzry7u4=
github.com/stretchr/testify v1.8.4/go.mod h1:sz/lmYIOXD/1dqDmKjjqLyZ2RngseejIcXlSw2iwfAo=
github.com/stretchr/testify v1.9.0 h1:HtqpIVDClZ4nwg75+f6Lvsy/wHu+3BoSGCbBAcpTsTg=
github.com/stretchr/testify v1.9.0/go.mod h1:r2ic/lqez/lEtzL7wO/rwa5dbSLXVDPFyf8C91i36aY=
github.com/stretchr/testify v1.10.0 h1:Xv5erBjTwe/5IxqUQTdXv5kgmIvbHo3QQyRwhJsOfJA=
github.com/stretchr/testify v1.10.0/go.mod h1:r2ic/lqez/lEtzL7wO/rwa5dbSLXVDPFyf8C91i36aY=
github.com/testcontainers/testcontainers-go v0.35.0 h1:uADsZpTKFAtp8SLK+hMwSaa+X+JiERHtd4sQAFmXeMo=
github.com/testcontainers/testcontainers-go v0.35.0/go.mod h1:oEVBj5zrfJTrgjwONs1SsRbnBtH9OKl+IGl3UMcr2B4=
github.com/tidwall/transform v0.0.0-20201103190739-32f242e2dbde h1:AMNpJRc7P+GTwVbl8DkK2I9I8BBUzNiHuH/tlxrpan0=
github.com/tidwall/transform v0.0.0-20201103190739-32f242e2dbde/go.mod h1:MvrEmduDUz4ST5pGZ7CABCnOU5f3ZiOAZzT6b1A6nX8=
github.com/tklauser/go-sysconf v0.3.12 h1:0QaGUFOdQaIVdPgfITYzaTegZvdCjmYO52cSFAEVmqU=
github.com/tklauser/go-sysconf v0.3.12/go.mod h1:Ho14jnntGE1fpdOqQEEaiKRpvIavV0hSfmBq8nJbHYI=
github.com/tklauser/numcpus v0.6.1 h1:ng9scYS7az0Bk4OZLvrNXNSAO2Pxr1XXRAPyjhIx+Fk=
@@ -127,6 +170,7 @@ github.com/yosida95/uritemplate/v3 v3.0.2 h1:Ed3Oyj9yrmi9087+NczuL5BwkIc4wvTb5zI
github.com/yosida95/uritemplate/v3 v3.0.2/go.mod h1:ILOh0sOhIJR3+L/8afwt/kE++YT040gmv5BQTMR2HP4=
github.com/yuin/goldmark v1.1.27/go.mod h1:3hX8gzYuyVAZsxl0MRgGTJEmQBFcNTphYh9decYSb74=
github.com/yuin/goldmark v1.2.1/go.mod h1:3hX8gzYuyVAZsxl0MRgGTJEmQBFcNTphYh9decYSb74=
github.com/yuin/goldmark v1.4.13/go.mod h1:6yULJ656Px+3vBD8DxQVa3kxgyrAnzto9xy5taEt/CY=
github.com/yusufpapurcu/wmi v1.2.3 h1:E1ctvB7uKFMOJw3fdOW32DwGE9I7t++CRUEMKvFoFiw=
github.com/yusufpapurcu/wmi v1.2.3/go.mod h1:SBZ9tNy3G9/m5Oi98Zks0QjeHVDvuK0qfxQmPyzfmi0=
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.49.0 h1:jq9TW8u3so/bN+JPT166wjOI6/vQPF6Xe7nMNIltagk=
@@ -148,45 +192,65 @@ go.opentelemetry.io/proto/otlp v1.0.0/go.mod h1:Sy6pihPLfYHkr3NkUbEhGHFhINUSI/v8
golang.org/x/crypto v0.0.0-20190308221718-c2843e01d9a2/go.mod h1:djNgcEr1/C05ACkg1iLfiJU5Ep61QUkGW8qpdssI0+w=
golang.org/x/crypto v0.0.0-20191011191535-87dc89f01550/go.mod h1:yigFU9vqHzYiE8UmvKecakEJjdnWj3jj499lnFckfCI=
golang.org/x/crypto v0.0.0-20200622213623-75b288015ac9/go.mod h1:LzIPMQfyMNhhGPhUkYOs5KpL4U8rLKemX1yGLhDgUto=
golang.org/x/crypto v0.31.0 h1:ihbySMvVjLAeSH1IbfcRTkD/iNscyz8rGzjF/E5hV6U=
golang.org/x/crypto v0.31.0/go.mod h1:kDsLvtWBEx7MV9tJOj9bnXsPbxwJQ6csT/x4KIN4Ssk=
golang.org/x/crypto v0.0.0-20210921155107-089bfa567519/go.mod h1:GvvjBRRGRdwPK5ydBHafDWAxML/pGHZbMvKqRZ5+Abc=
golang.org/x/crypto v0.6.0/go.mod h1:OFC/31mSvZgRz0V1QTNCzfAI1aIRzbiufJtkMIlEp58=
golang.org/x/crypto v0.41.0 h1:WKYxWedPGCTVVl5+WHSSrOBT0O8lx32+zxmHxijgXp4=
golang.org/x/crypto v0.41.0/go.mod h1:pO5AFd7FA68rFak7rOAGVuygIISepHftHnr8dr6+sUc=
golang.org/x/mod v0.2.0/go.mod h1:s0Qsj1ACt9ePp/hMypM3fl4fZqREWJwdYDEqhRiZZUA=
golang.org/x/mod v0.3.0/go.mod h1:s0Qsj1ACt9ePp/hMypM3fl4fZqREWJwdYDEqhRiZZUA=
golang.org/x/mod v0.6.0-dev.0.20220419223038-86c51ed26bb4/go.mod h1:jJ57K6gSWd91VN4djpZkiMVwK6gcyfeH4XE8wZrZaV4=
golang.org/x/net v0.0.0-20190404232315-eb5bcb51f2a3/go.mod h1:t9HGtf8HONx5eT2rtn7q6eTqICYqUVnKs3thJo3Qplg=
golang.org/x/net v0.0.0-20190620200207-3b0461eec859/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s=
golang.org/x/net v0.0.0-20200114155413-6afb5195e5aa/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s=
golang.org/x/net v0.0.0-20200226121028-0de0cce0169b/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s=
golang.org/x/net v0.0.0-20201021035429-f5854403a974/go.mod h1:sp8m0HH+o8qH0wwXwYZr8TS3Oi6o0r6Gce1SSxlDquU=
golang.org/x/net v0.23.0 h1:7EYJ93RZ9vYSZAIb2x3lnuvqO5zneoD6IvWjuhfxjTs=
golang.org/x/net v0.23.0/go.mod h1:JKghWKKOSdJwpW2GEx0Ja7fmaKnMsbu+MWVZTokSYmg=
golang.org/x/net v0.0.0-20210226172049-e18ecbb05110/go.mod h1:m0MpNAwzfU5UDzcl9v0D8zg8gWTRqZa9RBIspLL5mdg=
golang.org/x/net v0.0.0-20220722155237-a158d28d115b/go.mod h1:XRhObCWvk6IyKnWLug+ECip1KBveYUHfp+8e9klMJ9c=
golang.org/x/net v0.6.0/go.mod h1:2Tu9+aMcznHK/AK1HMvgo6xiTLG5rD5rZLDS+rp2Bjs=
golang.org/x/net v0.7.0/go.mod h1:2Tu9+aMcznHK/AK1HMvgo6xiTLG5rD5rZLDS+rp2Bjs=
golang.org/x/net v0.42.0 h1:jzkYrhi3YQWD6MLBJcsklgQsoAcw89EcZbJw8Z614hs=
golang.org/x/net v0.42.0/go.mod h1:FF1RA5d3u7nAYA4z2TkclSCKh68eSXtiFwcWQpPXdt8=
golang.org/x/oauth2 v0.34.0 h1:hqK/t4AKgbqWkdkcAeI8XLmbK+4m4G5YeQRrmiotGlw=
golang.org/x/oauth2 v0.34.0/go.mod h1:lzm5WQJQwKZ3nwavOZ3IS5Aulzxi68dUSgRHujetwEA=
golang.org/x/sync v0.0.0-20190423024810-112230192c58/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20190911185100-cd5d95a43a6e/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20201020160332-67f06af15bc9/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20220722155255-886fb9371eb4/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sys v0.0.0-20190215142949-d0b11bdaac8a/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
golang.org/x/sys v0.0.0-20190412213103-97732733099d/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20190916202348-b4ddaad3f8a3/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20200930185726-fdedc70b468f/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20201119102817-f84b799fce68/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20201204225414-ed752295db88/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20210615035016-665e8c7367d1/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.0.0-20210616094352-59db8d763f22/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.0.0-20220520151302-bc2c85ada10a/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.0.0-20220715151400-c0bba94af5f8/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.0.0-20220722155257-8c9f86f7a55f/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.5.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.8.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.11.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.15.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA=
golang.org/x/sys v0.40.0 h1:DBZZqJ2Rkml6QMQsZywtnjnnGvHza6BTfYFWY9kjEWQ=
golang.org/x/sys v0.40.0/go.mod h1:OgkHotnGiDImocRcuBABYBEXf8A9a87e/uXjp9XT3ks=
golang.org/x/term v0.27.0 h1:WP60Sv1nlK1T6SupCHbXzSaN0b9wUmsPoRS9b61A23Q=
golang.org/x/term v0.27.0/go.mod h1:iMsnZpn0cago0GOrHO2+Y7u7JPn5AylBrcoWkElMTSM=
golang.org/x/term v0.0.0-20201126162022-7de9c90e9dd1/go.mod h1:bj7SfCRtBDWHUb9snDiAeCFNEtKQo2Wmx5Cou7ajbmo=
golang.org/x/term v0.0.0-20210927222741-03fcf44c2211/go.mod h1:jbD1KX2456YbFQfuXm/mYQcufACuNUgVhRMnK/tPxf8=
golang.org/x/term v0.5.0/go.mod h1:jMB1sMXY+tzblOD4FWmEbocvup2/aLOaQEp7JmGp78k=
golang.org/x/term v0.34.0 h1:O/2T7POpk0ZZ7MAzMeWFSg6S5IpWd/RXDlM9hgM3DR4=
golang.org/x/term v0.34.0/go.mod h1:5jC53AEywhIVebHgPVeg0mj8OD3VO9OzclacVrqpaAw=
golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
golang.org/x/text v0.3.3/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ=
golang.org/x/text v0.21.0 h1:zyQAAkrwaneQ066sspRyJaG9VNi/YJ1NfzcGB3hZ/qo=
golang.org/x/text v0.21.0/go.mod h1:4IBbMaMmOPCJ8SecivzSH54+73PCFmPWxNTLm+vZkEQ=
golang.org/x/text v0.3.7/go.mod h1:u+2+/6zg+i71rQMx5EYifcz6MCKuco9NR6JIITiCfzQ=
golang.org/x/text v0.7.0/go.mod h1:mrYo+phRRbMaCq/xk9113O4dZlRixOauAjOtrjsXDZ8=
golang.org/x/text v0.28.0 h1:rhazDwis8INMIwQ4tpjLDzUhx6RlXqZNPEM0huQojng=
golang.org/x/text v0.28.0/go.mod h1:U8nCwOR8jO/marOQ0QbDiOngZVEBB7MAiitBuMjXiNU=
golang.org/x/time v0.0.0-20220210224613-90d013bbcef8 h1:vVKdlvoWBphwdxWKrFZEuM0kGgGLxUOYcY4U/2Vjg44=
golang.org/x/time v0.0.0-20220210224613-90d013bbcef8/go.mod h1:tRJNPiyCQ0inRvYxbN9jk5I+vvW/OXSQhTDSoE431IQ=
golang.org/x/tools v0.0.0-20180917221912-90fa682c2a6e/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ=
golang.org/x/tools v0.0.0-20191119224855-298f0cb1881e/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo=
golang.org/x/tools v0.0.0-20200619180055-7c47624df98f/go.mod h1:EkVYQZoAsY45+roYkvgYkIh4xh/qjgUK9TdY2XT94GE=
golang.org/x/tools v0.0.0-20210106214847-113979e3529a/go.mod h1:emZCQorbCU4vsT4fOWvOPXz4eW1wZW4PmDk9uLelYpA=
golang.org/x/tools v0.1.12/go.mod h1:hNGJHUnrk76NpqgfD5Aqm5Crs+Hm0VOH/i9J2+nxYbc=
golang.org/x/tools v0.41.0 h1:a9b8iMweWG+S0OBnlU36rzLp20z1Rp10w+IY2czHTQc=
golang.org/x/tools v0.41.0/go.mod h1:XSY6eDqxVNiYgezAVqqCeihT4j1U2CCsqvH3WhQpnlg=
golang.org/x/xerrors v0.0.0-20190717185122-a985d3407aa7/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
@@ -205,6 +269,7 @@ google.golang.org/protobuf v1.33.0/go.mod h1:c6P6GXX6sHbq/GpV6MGZEdwhWPcYBgnhAHh
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c h1:Hei/4ADfdWqJk1ZMxUNpqntNwaWcugrBjAiHlqqRiVk=
gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c/go.mod h1:JHkPIbrfpd72SG/EVd6muEfDQjcINNoR0C8j2r3qZ4Q=
gopkg.in/yaml.v2 v2.2.2/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI=
gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
+140 -21
@@ -60,8 +60,21 @@ OPTIONS:
-h, --help Show this help message
--server-url URL Set CERTCTL_SERVER_URL (skips interactive prompt)
--api-key KEY Set CERTCTL_API_KEY (skips interactive prompt)
--agent-id ID Set CERTCTL_AGENT_ID (defaults to hostname)
--no-start Install but don't start the service
EXAMPLES:
# Interactive install (download first):
curl -sSLO https://raw.githubusercontent.com/${GITHUB_REPO}/master/install-agent.sh
chmod +x install-agent.sh
sudo ./install-agent.sh
# Non-interactive install (pipe via curl):
curl -sSL https://raw.githubusercontent.com/${GITHUB_REPO}/master/install-agent.sh \\
| sudo bash -s -- \\
--server-url https://certctl.example.com \\
--api-key YOUR_API_KEY
EOF
}
@@ -74,19 +87,47 @@ parse_args() {
exit 0
;;
--server-url)
SERVER_URL="$2"
SERVER_URL="${2:-}"
if [[ -z "$SERVER_URL" ]]; then
echo -e "${RED}Error: --server-url requires a value${NC}" >&2
exit 1
fi
shift 2
;;
--server-url=*)
SERVER_URL="${1#*=}"
shift
;;
--api-key)
API_KEY="$2"
API_KEY="${2:-}"
if [[ -z "$API_KEY" ]]; then
echo -e "${RED}Error: --api-key requires a value${NC}" >&2
exit 1
fi
shift 2
;;
--api-key=*)
API_KEY="${1#*=}"
shift
;;
--agent-id)
AGENT_ID="${2:-}"
if [[ -z "$AGENT_ID" ]]; then
echo -e "${RED}Error: --agent-id requires a value${NC}" >&2
exit 1
fi
shift 2
;;
--agent-id=*)
AGENT_ID="${1#*=}"
shift
;;
--no-start)
NO_START=true
shift
;;
*)
echo -e "${RED}Error: Unknown option: $1${NC}"
echo -e "${RED}Error: Unknown option: $1${NC}" >&2
usage
exit 1
;;
@@ -94,6 +135,56 @@ parse_args() {
done
}
# Ensure stdin is interactive before prompting. When the script is piped via
# curl|bash, stdin is the pipe from curl, so `read` hits EOF immediately and
# set -e aborts the script silently. Reopen stdin from the controlling terminal
# (/dev/tty) if available; otherwise print a helpful error pointing at the
# flag-based non-interactive install.
ensure_interactive_input() {
# If all required config is already provided via flags, no prompting needed.
if [[ -n "${SERVER_URL:-}" && -n "${API_KEY:-}" ]]; then
return
fi
# Already interactive — nothing to do.
if [[ -t 0 ]]; then
return
fi
# Piped stdin — try to reopen from the controlling terminal. Actually
# attempt to open /dev/tty inside a subshell: the device node may exist
# even when the process has no controlling terminal (ENXIO on open), so
# `[[ -r /dev/tty ]]` is not reliable.
if ( exec </dev/tty ) 2>/dev/null; then
exec </dev/tty
return
fi
# No terminal available — emit clear guidance and exit.
# Use printf '%b' so the ANSI color escapes in $RED/$NC are interpreted
# rather than rendered as literal backslash sequences (a heredoc would
# keep them as raw text).
{
printf '%b\n' "${RED}Error: No interactive terminal available.${NC}"
printf '\n'
printf 'The installer was piped through curl and no controlling terminal (/dev/tty)\n'
printf 'is available for prompts. Pass the required values as flags instead:\n'
printf '\n'
printf ' curl -sSL https://raw.githubusercontent.com/%s/master/install-agent.sh \\\n' "$GITHUB_REPO"
printf ' | sudo bash -s -- \\\n'
printf ' --server-url https://certctl.example.com \\\n'
printf ' --api-key YOUR_API_KEY\n'
printf '\n'
printf 'Or download the script first and run it directly:\n'
printf '\n'
printf ' curl -sSLO https://raw.githubusercontent.com/%s/master/install-agent.sh\n' "$GITHUB_REPO"
printf ' chmod +x install-agent.sh\n'
printf ' sudo ./install-agent.sh\n'
printf '\n'
} >&2
exit 1
}
# Check if running as root/sudo on Linux
check_privileges() {
if [[ "$OS_TYPE" == "linux" && "$EUID" -ne 0 ]]; then
@@ -103,23 +194,33 @@ check_privileges() {
}
# Download agent binary from GitHub Releases
# IMPORTANT: main() captures this function's stdout via `binary_path=$(download_binary)`,
# so every status/error message MUST go to stderr (>&2). Only the final
# `echo "$temp_file"` is allowed on stdout — that's the return value.
#
# We deliberately do NOT register an EXIT trap to clean up $temp_file: because
# of the command substitution, this function runs in a subshell, and any EXIT
# trap set here fires when the subshell exits — which is *before* install_binary
# gets a chance to cp the file. Cleanup on success is install_binary's job
# (after the cp), and cleanup on curl failure is handled inline below.
download_binary() {
local binary_name="certctl-agent-${OS_TYPE}-${ARCH_TYPE}"
local download_url="${RELEASE_URL}/${binary_name}"
echo -e "${YELLOW}Downloading certctl agent (${OS_TYPE}-${ARCH_TYPE})...${NC}"
echo -e "${YELLOW}Downloading certctl agent (${OS_TYPE}-${ARCH_TYPE})...${NC}" >&2
if ! command -v curl &> /dev/null; then
echo -e "${RED}Error: curl is required but not installed${NC}"
echo -e "${RED}Error: curl is required but not installed${NC}" >&2
exit 1
fi
local temp_file=$(mktemp)
trap "rm -f $temp_file" EXIT
local temp_file
temp_file=$(mktemp)
if ! curl -sSL -f "$download_url" -o "$temp_file"; then
echo -e "${RED}Error: Failed to download binary from $download_url${NC}"
echo "Make sure the latest release exists on GitHub with the binary asset for ${OS_TYPE}-${ARCH_TYPE}."
if ! curl -sSL -f "$download_url" -o "$temp_file" >&2; then
rm -f "$temp_file"
echo -e "${RED}Error: Failed to download binary from $download_url${NC}" >&2
echo "Make sure the latest release exists on GitHub with the binary asset for ${OS_TYPE}-${ARCH_TYPE}." >&2
exit 1
fi
@@ -146,35 +247,52 @@ install_binary() {
chmod +x "$INSTALL_DIR/$SERVICE_NAME"
echo -e "${GREEN}Binary installed: $INSTALL_DIR/$SERVICE_NAME${NC}"
# Clean up the temp file created by download_binary. We can't use an EXIT
# trap inside download_binary because it runs in a subshell (command
# substitution), so the trap would fire before we got here. Doing it
# explicitly after the successful cp is the simplest correct pattern.
rm -f "$binary_path"
}
# Prompt for configuration (unless --server-url and --api-key provided)
# Prompt for configuration. Any value supplied via flag is honored as-is
# and we only prompt for the missing pieces. `read || true` prevents set -e
# from aborting the script on EOF — instead the empty check below fires the
# proper "required" error message.
prompt_for_config() {
if [[ -z "${SERVER_URL:-}" ]]; then
echo ""
echo -e "${YELLOW}Enter certctl server URL (e.g., https://certctl.example.com):${NC}"
read -r SERVER_URL
if [[ -z "$SERVER_URL" ]]; then
echo -e "${RED}Error: Server URL is required${NC}"
read -r SERVER_URL || true
if [[ -z "${SERVER_URL:-}" ]]; then
echo -e "${RED}Error: Server URL is required${NC}" >&2
echo "Hint: pass --server-url <URL> to run non-interactively." >&2
exit 1
fi
fi
if [[ -z "${API_KEY:-}" ]]; then
echo -e "${YELLOW}Enter certctl API key:${NC}"
read -sr API_KEY
read -rs API_KEY || true
echo ""
if [[ -z "$API_KEY" ]]; then
echo -e "${RED}Error: API key is required${NC}"
if [[ -z "${API_KEY:-}" ]]; then
echo -e "${RED}Error: API key is required${NC}" >&2
echo "Hint: pass --api-key <KEY> to run non-interactively." >&2
exit 1
fi
fi
if [[ -z "${AGENT_ID:-}" ]]; then
local default_agent_id="$(hostname)"
echo -e "${YELLOW}Enter agent ID (default: $default_agent_id):${NC}"
read -r AGENT_ID
if [[ -z "$AGENT_ID" ]]; then
local default_agent_id
default_agent_id="$(hostname)"
# If stdin is still piped (no /dev/tty was available but SERVER_URL +
# API_KEY arrived via flags), skip the prompt entirely and use the
# default — no need to block on an optional value.
if [[ -t 0 ]]; then
echo -e "${YELLOW}Enter agent ID (default: $default_agent_id):${NC}"
read -r AGENT_ID || true
fi
if [[ -z "${AGENT_ID:-}" ]]; then
AGENT_ID="$default_agent_id"
fi
fi
@@ -447,6 +565,7 @@ main() {
echo "Detected platform: ${OS_TYPE}-${ARCH_TYPE}"
echo ""
ensure_interactive_input
prompt_for_config
# Download and install binary
@@ -0,0 +1,339 @@
package handler
// Adversarial EST (RFC 7030) enrollment tests — Tier 1F.
//
// EST is the RFC 7030 protocol for certificate enrollment over HTTPS. The
// control-plane parser accepts PKCS#10 CSRs either as PEM or as base64-encoded
// DER, and it's a prime target for:
//
// * Malformed base64 / non-DER payloads
// * Valid base64 that doesn't decode to a valid CSR
// * PEM header spoofing (wrong block type)
// * Null bytes and control characters embedded in PEM or base64
// * Huge CSR bodies (we expect the handler's 1 MiB LimitReader to clamp them)
// * Truncated or partially-written PEM blocks
// * Unicode homoglyphs in PEM delimiters
// * Content-Type mismatch (handler ignores Content-Type, but attackers might
// still try header spoofing)
//
// The contract is the same as other adversarial tiers: the handler must never
// panic and must never return 500 for a malformed CSR (500 is reserved for
// issuer/service failures). For adversarial CSRs, the correct status is 400.
import (
"bytes"
"context"
"encoding/base64"
"errors"
"net/http"
"net/http/httptest"
"strings"
"testing"
"github.com/shankar0123/certctl/internal/domain"
)
// adversarialCSRInputs exercises the EST CSR parsing surface. None of these
// should reach the underlying ESTService — they must be rejected by
// readCSRFromRequest with a 400 before any service call is made.
func adversarialCSRInputs() []struct {
name string
body string
} {
// A garbage base64 string that decodes cleanly but isn't a PKCS#10 CSR.
// base64 of "this is definitely not a CSR" = dGhpcyBpcyBkZWZpbml0ZWx5IG5vdCBhIENTUg==
nonCSRBase64 := base64.StdEncoding.EncodeToString([]byte("this is definitely not a CSR"))
return []struct {
name string
body string
}{
{"garbage_string", "not-a-csr-at-all"},
{"base64_garbage", "!!!@@@###$$$%%%"},
{"base64_valid_non_csr", nonCSRBase64},
{"base64_very_short", "AA=="},
{"null_byte_only", "\x00"},
{"null_bytes_padding", "\x00\x00\x00\x00\x00\x00\x00\x00"},
{"control_chars", "\x01\x02\x03\x04\x05\x06\x07\x08"},
{"pem_wrong_block_type", "-----BEGIN CERTIFICATE-----\nMIIB\n-----END CERTIFICATE-----\n"},
{"pem_wrong_header_close", "-----BEGIN CERTIFICATE REQUEST-----\nMIIB\n-----END PRIVATE KEY-----\n"},
{"pem_empty_block", "-----BEGIN CERTIFICATE REQUEST-----\n-----END CERTIFICATE REQUEST-----\n"},
{"pem_garbage_body", "-----BEGIN CERTIFICATE REQUEST-----\n!!!not base64!!!\n-----END CERTIFICATE REQUEST-----\n"},
{"pem_truncated", "-----BEGIN CERTIFICATE REQUEST-----\nMIIBijCCAT"},
{"pem_no_end_marker", "-----BEGIN CERTIFICATE REQUEST-----\nMIIBijCCATICAQAwFjEUMBIGA1UE\n"},
{"pem_header_injection", "-----BEGIN CERTIFICATE REQUEST-----\r\nHost: evil.com\r\n\r\nMIIB\n-----END CERTIFICATE REQUEST-----\n"},
{"pem_embedded_null", "-----BEGIN CERTIFICATE\x00REQUEST-----\nMIIB\n-----END CERTIFICATE REQUEST-----\n"},
{"unicode_homoglyph_pem", "-----BEGIN CERTIFICATE REQUEST─────\nMIIB\n─────END CERTIFICATE REQUEST-----\n"},
{"double_pem_block", "-----BEGIN CERTIFICATE REQUEST-----\nMIIB\n-----END CERTIFICATE REQUEST-----\n-----BEGIN CERTIFICATE REQUEST-----\nMIIB\n-----END CERTIFICATE REQUEST-----\n"},
{"json_body", `{"csr":"MIIB","common_name":"attacker.com"}`},
{"xml_body", `<?xml version="1.0"?><csr>MIIB</csr>`},
{"shell_metacharacters", "$(whoami); rm -rf / #"},
{"sql_injection", "' OR 1=1; DROP TABLE certificates;--"},
{"long_garbage_10k", strings.Repeat("A", 10000)},
{"long_base64_not_csr", base64.StdEncoding.EncodeToString(bytes.Repeat([]byte{0xFF}, 5000))},
{"base64_with_newlines_garbage", "AAAAAAAAAAAAAAAA\nBBBBBBBBBBBBBBBB\nCCCCCCCCCCCCCCCC"},
{"percent_encoded_pem", "%2D%2D%2D%2D%2DBEGIN+CERTIFICATE+REQUEST%2D%2D%2D%2D%2D"},
}
}
// assertESTErrorResponse enforces the EST handler contract for adversarial CSRs:
// no 500 and an exact 400 status. Panic recovery is handled by each caller, and
// the response body format is not inspected here.
func assertESTErrorResponse(t *testing.T, w *httptest.ResponseRecorder, label string) {
t.Helper()
// The handler must never reach a 500 for parser-rejected CSRs — that would
// indicate a service call slipped through.
if w.Code == http.StatusInternalServerError {
t.Errorf("%s: handler returned 500 body=%q — adversarial CSR should not reach the service layer",
label, w.Body.String())
}
// The handler should return 400 Bad Request for adversarial CSR inputs.
// A 405 (method not allowed) is impossible here because we always POST.
if w.Code != http.StatusBadRequest {
t.Errorf("%s: expected 400, got %d (body=%q)", label, w.Code, w.Body.String())
}
}
// newESTHandlerWithTrap returns an ESTHandler whose service records the call
// and returns an error if reached. This is the core invariant for Tier 1F:
// adversarial CSRs must be rejected at the parser, never reaching
// SimpleEnroll/SimpleReEnroll on the service.
func newESTHandlerWithTrap() (ESTHandler, *trappedESTService) {
svc := &trappedESTService{}
return NewESTHandler(svc), svc
}
// trappedESTService is a mock that flags (via serviceCalled) any service
// method invoked with an adversarial CSR, so the calling test can fail. The
// parser should reject these before they get here.
type trappedESTService struct {
serviceCalled bool
}
func (t *trappedESTService) GetCACerts(ctx context.Context) (string, error) {
t.serviceCalled = true
return "", errors.New("trap: GetCACerts should not be called from adversarial CSR tests")
}
func (t *trappedESTService) SimpleEnroll(ctx context.Context, csrPEM string) (*domain.ESTEnrollResult, error) {
t.serviceCalled = true
return nil, errors.New("trap: SimpleEnroll should not be called from adversarial CSR tests")
}
func (t *trappedESTService) SimpleReEnroll(ctx context.Context, csrPEM string) (*domain.ESTEnrollResult, error) {
t.serviceCalled = true
return nil, errors.New("trap: SimpleReEnroll should not be called from adversarial CSR tests")
}
func (t *trappedESTService) GetCSRAttrs(ctx context.Context) ([]byte, error) {
t.serviceCalled = true
return nil, errors.New("trap: GetCSRAttrs should not be called from adversarial CSR tests")
}
// TestESTSimpleEnroll_AdversarialCSRs runs each adversarial CSR through the
// enrollment endpoint.
func TestESTSimpleEnroll_AdversarialCSRs(t *testing.T) {
for _, tc := range adversarialCSRInputs() {
t.Run(tc.name, func(t *testing.T) {
defer func() {
if r := recover(); r != nil {
t.Fatalf("handler panicked on body %q: %v", tc.body, r)
}
}()
h, svc := newESTHandlerWithTrap()
req := httptest.NewRequest(http.MethodPost, "/.well-known/est/simpleenroll", strings.NewReader(tc.body))
req.Header.Set("Content-Type", "application/pkcs10")
w := httptest.NewRecorder()
h.SimpleEnroll(w, req)
assertESTErrorResponse(t, w, "SimpleEnroll/"+tc.name)
if svc.serviceCalled {
t.Errorf("SimpleEnroll/%s: service was reached with adversarial CSR (body=%q)",
tc.name, tc.body)
}
})
}
}
// TestESTSimpleReEnroll_AdversarialCSRs runs each adversarial CSR through the
// re-enrollment endpoint. Same contract as simpleenroll.
func TestESTSimpleReEnroll_AdversarialCSRs(t *testing.T) {
for _, tc := range adversarialCSRInputs() {
t.Run(tc.name, func(t *testing.T) {
defer func() {
if r := recover(); r != nil {
t.Fatalf("handler panicked on body %q: %v", tc.body, r)
}
}()
h, svc := newESTHandlerWithTrap()
req := httptest.NewRequest(http.MethodPost, "/.well-known/est/simplereenroll", strings.NewReader(tc.body))
req.Header.Set("Content-Type", "application/pkcs10")
w := httptest.NewRecorder()
h.SimpleReEnroll(w, req)
assertESTErrorResponse(t, w, "SimpleReEnroll/"+tc.name)
if svc.serviceCalled {
t.Errorf("SimpleReEnroll/%s: service was reached with adversarial CSR (body=%q)",
tc.name, tc.body)
}
})
}
}
// TestESTSimpleEnroll_HugeBody verifies the handler's 1 MiB limit truncates
// oversized requests at the LimitReader boundary. We send a 2 MiB body of
// base64 garbage and confirm the handler rejects it cleanly (400, no panic,
// no 500) and the service is never reached.
func TestESTSimpleEnroll_HugeBody(t *testing.T) {
defer func() {
if r := recover(); r != nil {
t.Fatalf("handler panicked on 2 MiB body: %v", r)
}
}()
// 2 MiB of base64-valid garbage: the LimitReader will truncate to 1 MiB, and
// the truncated base64 chunk won't parse as a valid PKCS#10 CSR.
huge := strings.Repeat("A", 2<<20)
h, svc := newESTHandlerWithTrap()
req := httptest.NewRequest(http.MethodPost, "/.well-known/est/simpleenroll", strings.NewReader(huge))
req.Header.Set("Content-Type", "application/pkcs10")
w := httptest.NewRecorder()
h.SimpleEnroll(w, req)
// Contract: 400 Bad Request (parser fail), no panic, no 500.
if w.Code == http.StatusInternalServerError {
t.Errorf("HugeBody: handler returned 500 for 2 MiB body (body=%q)", w.Body.String())
}
if w.Code != http.StatusBadRequest {
t.Errorf("HugeBody: expected 400, got %d (body=%q)", w.Code, w.Body.String())
}
if svc.serviceCalled {
t.Error("HugeBody: service was reached with 2 MiB adversarial body")
}
}
// TestESTSimpleEnroll_ExactlyAtLimit sends a body exactly at the 1 MiB
// LimitReader boundary. The body is still garbage (won't parse as CSR), but we
// verify the handler doesn't panic or hang on the boundary case.
func TestESTSimpleEnroll_ExactlyAtLimit(t *testing.T) {
defer func() {
if r := recover(); r != nil {
t.Fatalf("handler panicked on exact-limit body: %v", r)
}
}()
atLimit := strings.Repeat("A", 1<<20) // exactly 1 MiB
h, _ := newESTHandlerWithTrap()
req := httptest.NewRequest(http.MethodPost, "/.well-known/est/simpleenroll", strings.NewReader(atLimit))
w := httptest.NewRecorder()
h.SimpleEnroll(w, req)
if w.Code == http.StatusInternalServerError {
t.Errorf("ExactlyAtLimit: handler returned 500 (body=%q)", w.Body.String())
}
}
// TestESTSimpleEnroll_MultipartBody sends a multipart/form-data body that a
// naive parser might try to unwrap. The handler should treat the raw bytes as
// a CSR payload and reject them.
func TestESTSimpleEnroll_MultipartBody(t *testing.T) {
defer func() {
if r := recover(); r != nil {
t.Fatalf("handler panicked on multipart body: %v", r)
}
}()
multipart := "--boundary\r\nContent-Disposition: form-data; name=\"csr\"\r\n\r\nMIIB\r\n--boundary--\r\n"
h, svc := newESTHandlerWithTrap()
req := httptest.NewRequest(http.MethodPost, "/.well-known/est/simpleenroll", strings.NewReader(multipart))
req.Header.Set("Content-Type", "multipart/form-data; boundary=boundary")
w := httptest.NewRecorder()
h.SimpleEnroll(w, req)
if w.Code != http.StatusBadRequest {
t.Errorf("MultipartBody: expected 400, got %d (body=%q)", w.Code, w.Body.String())
}
if svc.serviceCalled {
t.Error("MultipartBody: service was reached with multipart wrapper")
}
}
// TestESTCACerts_MethodAbuse verifies the /cacerts endpoint only accepts GET
// and rejects every other method cleanly with 405. RFC 7030 defines /cacerts
// as a GET operation, so this is a small safety check for that invariant.
func TestESTCACerts_MethodAbuse(t *testing.T) {
methods := []string{
http.MethodPost, http.MethodPut, http.MethodDelete,
http.MethodPatch, http.MethodHead, http.MethodOptions,
"TRACE", "CONNECT", "PROPFIND", "BOGUS",
}
for _, method := range methods {
t.Run(method, func(t *testing.T) {
defer func() {
if r := recover(); r != nil {
t.Fatalf("handler panicked on method %s: %v", method, r)
}
}()
h, _ := newESTHandlerWithTrap()
req := httptest.NewRequest(method, "/.well-known/est/cacerts", nil)
w := httptest.NewRecorder()
h.CACerts(w, req)
// Go 1.22+ ServeMux routes HEAD requests to "GET ..." patterns, but
// this handler enforces strict GET-only, so HEAD should also get 405.
if w.Code != http.StatusMethodNotAllowed {
t.Errorf("method %s: expected 405, got %d", method, w.Code)
}
})
}
}
// TestESTSimpleEnroll_MethodAbuse verifies strict POST-only enforcement.
func TestESTSimpleEnroll_MethodAbuse(t *testing.T) {
methods := []string{
http.MethodGet, http.MethodPut, http.MethodDelete,
http.MethodPatch, http.MethodHead, http.MethodOptions,
"TRACE", "CONNECT",
}
for _, method := range methods {
t.Run(method, func(t *testing.T) {
defer func() {
if r := recover(); r != nil {
t.Fatalf("handler panicked on method %s: %v", method, r)
}
}()
h, svc := newESTHandlerWithTrap()
req := httptest.NewRequest(method, "/.well-known/est/simpleenroll", strings.NewReader("body"))
w := httptest.NewRecorder()
h.SimpleEnroll(w, req)
if w.Code != http.StatusMethodNotAllowed {
t.Errorf("method %s: expected 405, got %d", method, w.Code)
}
if svc.serviceCalled {
t.Errorf("method %s: service was called for non-POST", method)
}
})
}
}
@@ -0,0 +1,337 @@
package handler
// Adversarial path-parameter and multi-segment path tests.
//
// These tests exercise the input parsing boundary of the certificate handler
// against the attack categories listed in certctl-adversarial-testing-prompt.md
// Tier 1A / 1B:
//
// * Empty and whitespace-only path IDs
// * SQL-injection sentinels embedded in the path
// * Directory traversal (`../../etc/passwd`)
// * Null bytes and control characters
// * Extremely long IDs (10 KiB)
// * Unicode homoglyphs (visually identical substitutes)
// * Multi-segment paths (OCSP, DER CRL, versions, renew, deploy, revoke)
//
// The contract we verify is defensive, not behavioural:
//
// 1. The handler never panics.
// 2. The HTTP status is one of {200, 201, 202, 204, 400, 404, 405, 501}; never 500.
// 3. The response body is either empty or valid JSON.
// 4. No attacker-controlled input is echoed verbatim in a 500 body.
//
// We do not assert the exact status code for every adversarial input because
// the current handler intentionally delegates identifier validation to the
// repository layer; its only job here is to stay up and well-formed.
import (
"bytes"
"context"
"encoding/json"
"net/http"
"net/http/httptest"
"strings"
"testing"
"github.com/shankar0123/certctl/internal/domain"
)
// adversarialPathInputs is the attack catalog shared by Tier 1A cases. Each
// entry targets a different parsing surface; adding a new category here makes
// every Tier 1A test below exercise it automatically.
func adversarialPathInputs() []struct {
name string
input string
} {
return []struct {
name string
input string
}{
{"sql_injection_drop_table", "'; DROP TABLE managed_certificates;--"},
{"sql_injection_or_true", "' OR 1=1--"},
{"sql_injection_union", "mc-001' UNION SELECT * FROM agents--"},
{"path_traversal_dot_dot", "../../etc/passwd"},
{"path_traversal_encoded", "..%2F..%2Fetc%2Fpasswd"},
{"null_byte_trailing", "mc-001\x00"},
{"null_byte_embedded", "mc-\x00-001"},
{"long_id_10k", strings.Repeat("A", 10000)},
{"unicode_homoglyph_hyphen", "mc\u2010001"}, // U+2010 HYPHEN
{"unicode_homoglyph_fullwidth", "mc\uFF0D001"}, // U+FF0D FULLWIDTH HYPHEN-MINUS
{"control_char_newline", "mc-001\n"},
{"control_char_tab", "mc\t001"},
{"control_char_bell", "mc\x07001"},
{"percent_encoded_null", "mc-001%00"},
{"whitespace_only", " "},
{"shell_metacharacters", "mc-001;`rm -rf /`"},
{"leading_slash", "/mc-001"},
{"trailing_slash", "mc-001/"},
{"double_slash", "mc//001"},
}
}
// assertSafeResponse is the core defensive check. Any adversarial input is
// allowed to produce a 4xx, but must not panic or leak through as a 500.
func assertSafeResponse(t *testing.T, w *httptest.ResponseRecorder, label string) {
t.Helper()
// 1. No 500 (500 implies the handler reached an unexpected internal state).
if w.Code == http.StatusInternalServerError {
t.Errorf("%s: handler returned 500, body=%q — adversarial input should not reach an internal error path",
label, w.Body.String())
}
// 2. Status must be in the expected safe set.
switch w.Code {
case http.StatusOK, http.StatusCreated, http.StatusAccepted, http.StatusNoContent,
http.StatusBadRequest, http.StatusNotFound, http.StatusMethodNotAllowed, http.StatusNotImplemented:
// ok
default:
t.Errorf("%s: unexpected status %d (body=%q)", label, w.Code, w.Body.String())
}
// 3. Non-empty bodies must be valid JSON (no template leakage, no raw panics).
if body := bytes.TrimSpace(w.Body.Bytes()); len(body) > 0 {
var discard interface{}
if err := json.Unmarshal(body, &discard); err != nil {
t.Errorf("%s: response body is not valid JSON: %v (body=%q)", label, err, w.Body.String())
}
}
}
// newCertHandlerWithMock builds a handler whose mock service returns nothing.
// This keeps every adversarial test focused on the handler's parsing layer
// rather than service behaviour.
func newCertHandlerWithMock() (CertificateHandler, *MockCertificateService) {
mock := &MockCertificateService{}
return NewCertificateHandler(mock), mock
}
// TestGetCertificate_PathInjection runs each adversarial path through the
// certificate GET handler.
func TestGetCertificate_PathInjection(t *testing.T) {
for _, tc := range adversarialPathInputs() {
t.Run(tc.name, func(t *testing.T) {
defer func() {
if r := recover(); r != nil {
t.Fatalf("handler panicked on input %q: %v", tc.input, r)
}
}()
handler, mock := newCertHandlerWithMock()
// Force a 404 so we can distinguish "service was called" from
// "parser accepted the ID"; a 200 with null body is also fine.
mock.GetCertificateFn = func(_ context.Context, id string) (*domain.ManagedCertificate, error) {
return nil, ErrMockNotFound
}
// Build the URL by string concatenation to keep attacker-controlled
// bytes intact (httptest.NewRequest parses the target via
// http.ReadRequest, which normalises some characters and panics on
// malformed targets, so we construct the request with a safe path and
// then set the raw path on the request object).
req := httptest.NewRequest(http.MethodGet, "/api/v1/certificates/x", nil)
req.URL.Path = "/api/v1/certificates/" + tc.input
req = req.WithContext(contextWithRequestID())
w := httptest.NewRecorder()
handler.GetCertificate(w, req)
assertSafeResponse(t, w, "GetCertificate/"+tc.name)
})
}
}
// TestUpdateCertificate_PathInjection exercises the PUT handler's path parser.
// UpdateCertificate splits the path on "/" and takes parts[0]; traversal and
// double-slash inputs must still short-circuit at the parser rather than
// reaching the service.
func TestUpdateCertificate_PathInjection(t *testing.T) {
body := `{"common_name":"example.com","owner_id":"o-alice","team_id":"t-a","issuer_id":"iss-local","name":"n","renewal_policy_id":"rp-1"}`
for _, tc := range adversarialPathInputs() {
t.Run(tc.name, func(t *testing.T) {
defer func() {
if r := recover(); r != nil {
t.Fatalf("handler panicked on input %q: %v", tc.input, r)
}
}()
handler, mock := newCertHandlerWithMock()
mock.UpdateCertificateFn = func(_ context.Context, id string, cert domain.ManagedCertificate) (*domain.ManagedCertificate, error) {
return nil, ErrMockNotFound
}
req := httptest.NewRequest(http.MethodPut, "/api/v1/certificates/x", bytes.NewBufferString(body))
req.URL.Path = "/api/v1/certificates/" + tc.input
req.Header.Set("Content-Type", "application/json")
req = req.WithContext(contextWithRequestID())
w := httptest.NewRecorder()
handler.UpdateCertificate(w, req)
assertSafeResponse(t, w, "UpdateCertificate/"+tc.name)
})
}
}
// TestArchiveCertificate_PathInjection exercises DELETE.
func TestArchiveCertificate_PathInjection(t *testing.T) {
for _, tc := range adversarialPathInputs() {
t.Run(tc.name, func(t *testing.T) {
defer func() {
if r := recover(); r != nil {
t.Fatalf("handler panicked on input %q: %v", tc.input, r)
}
}()
handler, mock := newCertHandlerWithMock()
mock.ArchiveCertificateFn = func(_ context.Context, id string) error { return ErrMockNotFound }
req := httptest.NewRequest(http.MethodDelete, "/api/v1/certificates/x", nil)
req.URL.Path = "/api/v1/certificates/" + tc.input
req = req.WithContext(contextWithRequestID())
w := httptest.NewRecorder()
handler.ArchiveCertificate(w, req)
assertSafeResponse(t, w, "ArchiveCertificate/"+tc.name)
})
}
}
// TestGetCertificateVersions_MultiSegment is a Tier 1B test: the versions
// handler requires a 2-segment path (certID/versions). The parser uses
// strings.Split(path, "/") and checks len(parts) < 2 — but an adversarial
// caller can inject extra slashes to either produce an empty parts[0] or a
// very long parts slice. Either way we must not panic.
func TestGetCertificateVersions_MultiSegment(t *testing.T) {
cases := []struct {
name string
path string
}{
{"missing_segment", "/api/v1/certificates/versions"},
{"empty_cert_id", "/api/v1/certificates//versions"},
{"traversal_cert_id", "/api/v1/certificates/..%2F..%2Fversions/versions"},
{"sql_injection_cert_id", "/api/v1/certificates/'%20OR%201=1--/versions"},
{"null_byte_cert_id", "/api/v1/certificates/mc\x00001/versions"},
{"very_long_cert_id", "/api/v1/certificates/" + strings.Repeat("A", 5000) + "/versions"},
{"trailing_segments", "/api/v1/certificates/mc-001/versions/extra/trailing"},
{"deep_nesting", "/api/v1/certificates/" + strings.Repeat("a/", 50) + "versions"},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
defer func() {
if r := recover(); r != nil {
t.Fatalf("handler panicked on path %q: %v", tc.path, r)
}
}()
handler, mock := newCertHandlerWithMock()
mock.GetCertificateVersionsFn = func(_ context.Context, certID string, page, perPage int) ([]domain.CertificateVersion, int64, error) {
return []domain.CertificateVersion{}, 0, nil
}
// Use a dummy safe URL in NewRequest to avoid httptest.NewRequest
// panicking on control chars, then overwrite with the raw attacker path.
req := httptest.NewRequest(http.MethodGet, "/safe", nil)
req.URL.Path = tc.path
req = req.WithContext(contextWithRequestID())
w := httptest.NewRecorder()
handler.GetCertificateVersions(w, req)
assertSafeResponse(t, w, "GetCertificateVersions/"+tc.name)
})
}
}
// TestHandleOCSP_MultiSegment exercises the OCSP responder's 2-segment path
// parser (/.well-known/pki/ocsp/{issuer_id}/{serial_hex}). Each leg is
// attacker-controlled and the serial can be arbitrary length. This is a key
// adversarial surface because the serial is passed directly to the
// CA-operations service, which is expected to treat it as an opaque
// identifier.
//
// M-006 relocation: these paths were previously served at /api/v1/ocsp/*;
// under RFC 8615 and RFC 6960 they now live under /.well-known/pki/ocsp/*.
func TestHandleOCSP_MultiSegment(t *testing.T) {
cases := []struct {
name string
path string
}{
{"missing_serial", "/.well-known/pki/ocsp/iss-local"},
{"missing_both", "/.well-known/pki/ocsp/"},
{"empty_issuer", "/.well-known/pki/ocsp//01ABCDEF"},
{"empty_serial", "/.well-known/pki/ocsp/iss-local/"},
{"traversal_issuer", "/.well-known/pki/ocsp/..%2F..%2Fetc/passwd/01"},
{"null_byte_serial", "/.well-known/pki/ocsp/iss-local/01\x00FF"},
{"sql_injection_serial", "/.well-known/pki/ocsp/iss-local/01'; DROP TABLE--"},
{"negative_hex_serial", "/.well-known/pki/ocsp/iss-local/-1"},
{"unicode_serial", "/.well-known/pki/ocsp/iss-local/01\u2010FF"},
{"extremely_long_serial", "/.well-known/pki/ocsp/iss-local/" + strings.Repeat("F", 10000)},
{"extra_segments", "/.well-known/pki/ocsp/iss-local/01FF/extra/segments"},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
defer func() {
if r := recover(); r != nil {
t.Fatalf("handler panicked on path %q: %v", tc.path, r)
}
}()
handler, mock := newCertHandlerWithMock()
mock.GetOCSPResponseFn = func(_ context.Context, issuerID, serialHex string) ([]byte, error) {
return nil, ErrMockNotFound
}
req := httptest.NewRequest(http.MethodGet, "/safe", nil)
req.URL.Path = tc.path
req = req.WithContext(contextWithRequestID())
w := httptest.NewRecorder()
handler.HandleOCSP(w, req)
// OCSP does NOT guarantee JSON responses (OCSP responses are binary DER
// per RFC 6960), so we only check status safety, not body structure.
if w.Code >= 500 {
t.Errorf("HandleOCSP/%s: unexpected 5xx %d body=%q", tc.name, w.Code, w.Body.String())
}
})
}
}
// TestGetDERCRL_IssuerPathInjection exercises
// /.well-known/pki/crl/{issuer_id} (RFC 5280 CRL; M-006 relocation from
// /api/v1/crl/{issuer_id}).
func TestGetDERCRL_IssuerPathInjection(t *testing.T) {
for _, tc := range adversarialPathInputs() {
t.Run(tc.name, func(t *testing.T) {
defer func() {
if r := recover(); r != nil {
t.Fatalf("handler panicked on input %q: %v", tc.input, r)
}
}()
handler, mock := newCertHandlerWithMock()
mock.GenerateDERCRLFn = func(_ context.Context, issuerID string) ([]byte, error) {
return nil, ErrMockNotFound
}
req := httptest.NewRequest(http.MethodGet, "/.well-known/pki/crl/x", nil)
req.URL.Path = "/.well-known/pki/crl/" + tc.input
req = req.WithContext(contextWithRequestID())
w := httptest.NewRecorder()
handler.GetDERCRL(w, req)
if w.Code >= 500 {
t.Errorf("GetDERCRL/%s: unexpected 5xx %d (body=%q)", tc.name, w.Code, w.Body.String())
}
})
}
}
@@ -0,0 +1,539 @@
package handler
// Adversarial query-parameter, request-body, and revocation-reason tests.
//
// These tests exercise the second boundary of the certificate handler:
//
// * Numeric pagination parsing (page, per_page, page_size)
// * Sort direction and field whitelist
// * Time-range filters (expires_before, expires_after, created_after, updated_after)
// * Cursor pagination
// * Sparse-field projection (?fields=...)
// * Request-body JSON parsing (create/update) — null, malformed, deep nesting,
// unicode, oversized
// * Revocation reason abuse
//
// The handler silently ignores malformed pagination values (it falls back to
// defaults) and ignores invalid RFC3339 time values. These tests lock in that
// behaviour so a future "fail-closed" change has to be deliberate.
import (
"bytes"
"context"
"fmt"
"net/http"
"net/http/httptest"
"net/url"
"strings"
"testing"
"github.com/shankar0123/certctl/internal/domain"
"github.com/shankar0123/certctl/internal/repository"
)
// buildListRequest constructs a GET /api/v1/certificates request with the
// given raw query string. We use raw query strings (not url.Values.Encode)
// so adversarial inputs like "page=abc&page=-1" or "%00" pass through
// unchanged.
func buildListRequest(rawQuery string) *http.Request {
req := httptest.NewRequest(http.MethodGet, "/api/v1/certificates", nil)
req.URL.RawQuery = rawQuery
return req.WithContext(contextWithRequestID())
}
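// Illustration only (not part of the certctl test suite): a minimal sketch
// of what a handler sees when it decodes a raw query like the ones built
// above. firstQueryValue is a hypothetical helper; it relies only on the
// standard library's url.ParseQuery and url.Values.Get, which returns the
// first value when a key repeats and percent-decodes "%00" to a NUL byte.

```go
package main

import (
	"fmt"
	"net/url"
)

// firstQueryValue parses rawQuery and returns the first value stored for key.
// Hypothetical helper for illustration; certctl's handler may read the query
// differently.
func firstQueryValue(rawQuery, key string) (string, error) {
	vals, err := url.ParseQuery(rawQuery)
	if err != nil {
		return "", err
	}
	return vals.Get(key), nil
}

func main() {
	// A repeated key keeps both values; Get reports the first one.
	page, _ := firstQueryValue("page=abc&page=-1", "page")
	fmt.Printf("page=%q\n", page)

	// A percent-encoded NUL survives decoding as a real zero byte.
	sort, _ := firstQueryValue("sort=notAfter%00", "sort")
	fmt.Printf("sort=%q\n", sort)
}
```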
// TestListCertificates_PaginationAbuse verifies adversarial pagination values
// never produce a 500 and the handler always falls back to sane defaults.
func TestListCertificates_PaginationAbuse(t *testing.T) {
cases := []struct {
name string
rawQuery string
}{
{"negative_page", "page=-1"},
{"zero_page", "page=0"},
{"non_numeric_page", "page=abc"},
{"huge_page", "page=99999999999"},
{"int_overflow_page", "page=9223372036854775808"}, // int64 max + 1
{"negative_per_page", "per_page=-1"},
{"zero_per_page", "per_page=0"},
{"per_page_cap_at_500", "per_page=500"},
{"per_page_above_cap", "per_page=501"},
{"per_page_absurd", "per_page=1000000"},
{"non_numeric_per_page", "per_page=xyz"},
{"mixed_numeric_per_page", "per_page=10abc"},
{"negative_page_size", "page_size=-1"},
{"page_size_above_cap", "page_size=501"},
{"float_page", "page=1.5"},
{"exponent_page", "page=1e10"},
{"hex_page", "page=0xff"},
{"unicode_digits_page", "page=\u0661\u0662\u0663"}, // Arabic-Indic digits
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
defer func() {
if r := recover(); r != nil {
t.Fatalf("panicked on %q: %v", tc.rawQuery, r)
}
}()
handler, mock := newCertHandlerWithMock()
mock.ListCertificatesWithFilterFn = func(_ context.Context, filter *repository.CertificateFilter) ([]domain.ManagedCertificate, int, error) {
// Sanity: after parsing, filter.Page must be at least 1 and
// filter.PerPage must stay within [1, 500].
if filter.Page < 1 {
t.Errorf("filter.Page=%d (must be >=1)", filter.Page)
}
if filter.PerPage < 1 || filter.PerPage > 500 {
t.Errorf("filter.PerPage=%d (must be in [1,500])", filter.PerPage)
}
return []domain.ManagedCertificate{}, 0, nil
}
w := httptest.NewRecorder()
handler.ListCertificates(w, buildListRequest(tc.rawQuery))
assertSafeResponse(t, w, "ListCertificates/"+tc.name)
if w.Code != http.StatusOK {
t.Errorf("%s: expected 200, got %d (body=%q)", tc.name, w.Code, w.Body.String())
}
})
}
}
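// Illustration only: a minimal sketch of the "fall back to defaults"
// behaviour the pagination test above locks in. parseBounded, the default of
// 50, and the choice to clamp (rather than reset) values above the cap are
// assumptions for this example; the real handler may resolve them
// differently as long as the result stays in [1, 500].

```go
package main

import (
	"fmt"
	"strconv"
)

// parseBounded returns def when raw is empty, non-numeric, out of int range,
// or below 1, and caps the result at limit. Hypothetical helper, not the
// certctl implementation.
func parseBounded(raw string, def, limit int) int {
	n, err := strconv.Atoi(raw) // base-10 ASCII digits only
	if err != nil || n < 1 {
		return def
	}
	if n > limit {
		return limit
	}
	return n
}

func main() {
	inputs := []string{"-1", "0", "abc", "1.5", "1e10", "0xff", "501", "25"}
	for _, raw := range inputs {
		fmt.Printf("per_page=%q -> %d\n", raw, parseBounded(raw, 50, 500))
	}
}
```

// strconv.Atoi rejects floats, exponents, hex prefixes, non-ASCII digits and
// out-of-range values, so every adversarial case collapses to the default.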
// TestListCertificates_SortAbuse verifies the sort field (which feeds into a
// whitelist in the repository layer) handles adversarial input safely at the
// handler boundary. The handler accepts the raw value and forwards it; the
// repository is expected to whitelist it, but at THIS layer we just verify
// we don't crash or leak.
func TestListCertificates_SortAbuse(t *testing.T) {
cases := []struct {
name string
rawQuery string
}{
{"sql_injection_sort", "sort=notAfter;DROP TABLE managed_certificates--"},
{"sql_injection_or", "sort=notAfter' OR '1'='1"},
{"path_traversal_sort", "sort=../../etc/passwd"},
{"null_byte_sort", "sort=notAfter%00"},
{"unicode_sort", "sort=notAfter\u2010desc"},
{"leading_dash_only", "sort=-"},
{"leading_dashes", "sort=---notAfter"},
{"empty_sort", "sort="},
{"very_long_sort", "sort=" + strings.Repeat("a", 5000)},
{"sort_desc_flag", "sort=notAfter&sort_desc=true"},
{"conflicting_sort_desc", "sort=-notAfter&sort_desc=false"},
{"unknown_field", "sort=gibberish"},
{"shell_metacharacters_sort", "sort=notAfter;rm -rf /"},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
defer func() {
if r := recover(); r != nil {
t.Fatalf("panicked on %q: %v", tc.rawQuery, r)
}
}()
handler, mock := newCertHandlerWithMock()
mock.ListCertificatesWithFilterFn = func(_ context.Context, filter *repository.CertificateFilter) ([]domain.ManagedCertificate, int, error) {
return []domain.ManagedCertificate{}, 0, nil
}
w := httptest.NewRecorder()
handler.ListCertificates(w, buildListRequest(tc.rawQuery))
assertSafeResponse(t, w, "ListCertificates/"+tc.name)
})
}
}
// TestListCertificates_FieldsAbuse verifies sparse field projection handles
// adversarial field lists safely.
func TestListCertificates_FieldsAbuse(t *testing.T) {
cases := []struct {
name string
rawQuery string
}{
{"sql_injection_fields", "fields=id,name' OR 1=1--"},
{"path_traversal_fields", "fields=../../etc/passwd"},
{"empty_fields", "fields="},
{"single_comma", "fields=,"},
{"trailing_comma", "fields=id,name,"},
{"leading_comma", "fields=,id,name"},
{"whitespace_fields", "fields= id , name "},
{"duplicate_fields", "fields=id,id,id,id,id"},
{"unknown_fields", "fields=totally_not_a_field"},
{"many_fields", "fields=" + strings.Repeat("x,", 200) + "id"},
{"unicode_fields", "fields=id,n\u00e4me"},
{"null_byte_fields", "fields=id%00name"},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
defer func() {
if r := recover(); r != nil {
t.Fatalf("panicked on %q: %v", tc.rawQuery, r)
}
}()
handler, mock := newCertHandlerWithMock()
mock.ListCertificatesWithFilterFn = func(_ context.Context, filter *repository.CertificateFilter) ([]domain.ManagedCertificate, int, error) {
return []domain.ManagedCertificate{}, 0, nil
}
w := httptest.NewRecorder()
handler.ListCertificates(w, buildListRequest(tc.rawQuery))
assertSafeResponse(t, w, "ListCertificates/"+tc.name)
})
}
}
// TestListCertificates_TimeRangeAbuse verifies RFC3339 time-range filters
// handle malformed input by silently falling back to no filter (current
// behaviour).
func TestListCertificates_TimeRangeAbuse(t *testing.T) {
cases := []struct {
name string
rawQuery string
}{
{"invalid_expires_before", "expires_before=not-a-date"},
{"empty_expires_before", "expires_before="},
{"garbage_expires_before", "expires_before=%00%00"},
{"sql_injection_time", "expires_before=2026-01-01T00:00:00Z';DROP TABLE managed_certificates--"},
{"year_zero", "expires_before=0000-01-01T00:00:00Z"},
{"year_negative", "expires_before=-0001-01-01T00:00:00Z"},
{"year_huge", "expires_before=99999-12-31T23:59:59Z"},
{"invalid_month", "expires_before=2026-13-01T00:00:00Z"},
{"invalid_day", "expires_before=2026-02-30T00:00:00Z"},
{"valid_utc", "expires_before=2026-06-15T12:00:00Z"},
{"valid_with_offset", "expires_before=2026-06-15T12:00:00-07:00"},
{"unix_seconds_not_rfc3339", "expires_before=1767225600"},
{"all_four_filters", "expires_before=garbage&expires_after=garbage&created_after=garbage&updated_after=garbage"},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
defer func() {
if r := recover(); r != nil {
t.Fatalf("panicked on %q: %v", tc.rawQuery, r)
}
}()
handler, mock := newCertHandlerWithMock()
mock.ListCertificatesWithFilterFn = func(_ context.Context, filter *repository.CertificateFilter) ([]domain.ManagedCertificate, int, error) {
return []domain.ManagedCertificate{}, 0, nil
}
w := httptest.NewRecorder()
handler.ListCertificates(w, buildListRequest(tc.rawQuery))
assertSafeResponse(t, w, "ListCertificates/"+tc.name)
if w.Code != http.StatusOK {
t.Errorf("%s: expected 200, got %d", tc.name, w.Code)
}
})
}
}
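// Illustration only: a minimal sketch of "malformed time means no filter".
// parseOptionalTime is a hypothetical helper; it assumes the handler parses
// with time.RFC3339 and treats any parse error as "filter not set".

```go
package main

import (
	"fmt"
	"time"
)

// parseOptionalTime returns nil (no filter) when raw is empty or is not a
// valid RFC 3339 timestamp. Hypothetical helper, not the certctl code.
func parseOptionalTime(raw string) *time.Time {
	if raw == "" {
		return nil
	}
	t, err := time.Parse(time.RFC3339, raw)
	if err != nil {
		return nil
	}
	return &t
}

func main() {
	inputs := []string{
		"not-a-date",
		"2026-13-01T00:00:00Z", // month out of range
		"2026-02-30T00:00:00Z", // day out of range
		"1767225600",           // Unix seconds, not RFC 3339
		"2026-06-15T12:00:00-07:00",
	}
	for _, raw := range inputs {
		fmt.Printf("%q -> filter set: %v\n", raw, parseOptionalTime(raw) != nil)
	}
}
```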
// TestListCertificates_CursorAbuse exercises cursor-based pagination with
// adversarial cursor tokens. The handler forwards the cursor to the
// repository; we verify no 500 at the boundary and a 200 status in both
// cursor mode and page mode. The response body type is not asserted here.
func TestListCertificates_CursorAbuse(t *testing.T) {
cases := []struct {
name string
cursor string
}{
{"empty_not_set", ""}, // special-cased: should return PagedResponse
{"garbage_cursor", "not-a-valid-cursor"},
{"base64_garbage", "dGhpcyBpcyBub3QgYSB2YWxpZCBjdXJzb3I="},
{"sql_injection_cursor", "2026-01-01T00:00:00Z:mc-001';DROP TABLE--"},
{"path_traversal_cursor", "../../etc/passwd"},
{"null_byte_cursor", "valid%00cursor"},
{"very_long_cursor", strings.Repeat("A", 8192)},
{"unicode_cursor", "2026-01-01T00:00:00Z:mc\u20100001"},
{"valid_looking_cursor", "2026-01-01T00:00:00.000000000Z:mc-001"},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
defer func() {
if r := recover(); r != nil {
t.Fatalf("panicked on %q: %v", tc.cursor, r)
}
}()
handler, mock := newCertHandlerWithMock()
mock.ListCertificatesWithFilterFn = func(_ context.Context, filter *repository.CertificateFilter) ([]domain.ManagedCertificate, int, error) {
return []domain.ManagedCertificate{}, 0, nil
}
rawQuery := "cursor=" + url.QueryEscape(tc.cursor) + "&page_size=50"
if tc.cursor == "" {
rawQuery = "page=1&per_page=50"
}
w := httptest.NewRecorder()
handler.ListCertificates(w, buildListRequest(rawQuery))
assertSafeResponse(t, w, "ListCertificates/"+tc.name)
if w.Code != http.StatusOK {
t.Errorf("%s: expected 200, got %d", tc.name, w.Code)
}
})
}
}
// TestListCertificates_FilterInjection verifies the basic string filters
// (status, environment, owner_id, team_id, issuer_id, agent_id, profile_id)
// are forwarded as-is without causing any handler-layer failures. These go
// into parameterized SQL at the repo layer.
func TestListCertificates_FilterInjection(t *testing.T) {
filters := []string{
"status", "environment", "owner_id", "team_id",
"issuer_id", "agent_id", "profile_id",
}
payloads := []string{
"' OR 1=1--",
"'; DROP TABLE managed_certificates;--",
"../../etc/passwd",
strings.Repeat("A", 5000),
"\u2010hyphen",
"%00null",
}
for _, f := range filters {
for _, p := range payloads {
name := f + "__" + p
if len(name) > 80 {
name = name[:80]
}
t.Run(name, func(t *testing.T) {
defer func() {
if r := recover(); r != nil {
t.Fatalf("panicked: %v", r)
}
}()
handler, mock := newCertHandlerWithMock()
mock.ListCertificatesWithFilterFn = func(_ context.Context, filter *repository.CertificateFilter) ([]domain.ManagedCertificate, int, error) {
return []domain.ManagedCertificate{}, 0, nil
}
rawQuery := f + "=" + url.QueryEscape(p)
w := httptest.NewRecorder()
handler.ListCertificates(w, buildListRequest(rawQuery))
assertSafeResponse(t, w, "ListCertificates/"+f)
})
}
}
}
// ---------- Request body abuse (Tier 1D) ----------
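// TestCreateCertificate_BodyAbuse sends adversarial JSON bodies to
// POST /api/v1/certificates. Every case is expected to be rejected with a
// 4xx (ideally 400), never 500 and never 201. The assertions below enforce
// only the "not 201" part plus whatever assertSafeResponse checks; they do
// not pin the exact status code. The intent is to show that malformed input
// is rejected before it reaches the service layer.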
func TestCreateCertificate_BodyAbuse(t *testing.T) {
cases := []struct {
name string
body string
}{
{"null_body", "null"},
{"empty_body", ""},
{"not_json", "not json at all"},
{"truncated_json", `{"common_name":"exa`},
{"unclosed_object", `{"common_name":"example.com"`},
{"array_not_object", `["example.com"]`},
{"number_not_object", `42`},
{"string_not_object", `"hello"`},
{"boolean_not_object", `true`},
{"duplicate_keys", `{"common_name":"evil.com","common_name":"example.com"}`},
{"unicode_bom", "\ufeff{\"common_name\":\"example.com\"}"},
{"deep_nesting", strings.Repeat("{\"x\":", 100) + "null" + strings.Repeat("}", 100)},
{"nested_array_bomb", `{"common_name":"x","sans":[[[[[[[[[[]]]]]]]]]]}`},
{"sql_injection_cn", `{"common_name":"'; DROP TABLE managed_certificates;--"}`},
{"empty_cn", `{"common_name":""}`},
{"null_cn", `{"common_name":null}`},
{"whitespace_cn", `{"common_name":" "}`},
{"cn_too_long", fmt.Sprintf(`{"common_name":%q}`, strings.Repeat("a", 500))},
{"cn_path_traversal", `{"common_name":"../../etc/passwd"}`},
{"cn_null_byte", "{\"common_name\":\"example\\u0000.com\"}"},
{"cn_newline", "{\"common_name\":\"example\\n.com\"}"},
{"cn_only_missing_others", `{"common_name":"example.com"}`},
{"extra_unknown_fields", `{"common_name":"example.com","__proto__":{"polluted":true},"eval":"alert(1)"}`},
{"unicode_homoglyph_cn", "{\"common_name\":\"ex\u0430mple.com\"}"}, // Cyrillic а
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
defer func() {
if r := recover(); r != nil {
t.Fatalf("panicked on %q: %v", tc.name, r)
}
}()
handler, mock := newCertHandlerWithMock()
mock.CreateCertificateFn = func(_ context.Context, cert domain.ManagedCertificate) (*domain.ManagedCertificate, error) {
// If we ever reach this, the handler accepted a malformed
// body. Return a sentinel ID so the 201 check below flags it.
c := cert
c.ID = "mc-accepted"
return &c, nil
}
req := httptest.NewRequest(http.MethodPost, "/api/v1/certificates", bytes.NewBufferString(tc.body))
req.Header.Set("Content-Type", "application/json")
req = req.WithContext(contextWithRequestID())
w := httptest.NewRecorder()
handler.CreateCertificate(w, req)
assertSafeResponse(t, w, "CreateCertificate/"+tc.name)
// Must NOT be 201 — all these bodies should be rejected.
if w.Code == http.StatusCreated {
t.Errorf("%s: handler accepted malformed body (201) body=%q", tc.name, w.Body.String())
}
})
}
}
// TestCreateCertificate_HugeBody sends a large JSON body (20,000 SANs,
// roughly 0.45 MiB). The body-limit middleware is not in this handler-unit
// test, so we just verify the handler doesn't OOM/panic on a large but
// well-formed body.
func TestCreateCertificate_HugeBody(t *testing.T) {
defer func() {
if r := recover(); r != nil {
t.Fatalf("panicked on huge body: %v", r)
}
}()
// 20,000 SANs (roughly 0.45 MiB): well-formed JSON, technically valid, just huge.
var sb strings.Builder
sb.WriteString(`{"common_name":"example.com","owner_id":"o","team_id":"t","issuer_id":"iss","name":"n","renewal_policy_id":"rp","sans":[`)
for i := 0; i < 20000; i++ {
if i > 0 {
sb.WriteByte(',')
}
fmt.Fprintf(&sb, `"host%d.example.com"`, i)
}
sb.WriteString(`]}`)
handler, mock := newCertHandlerWithMock()
mock.CreateCertificateFn = func(_ context.Context, cert domain.ManagedCertificate) (*domain.ManagedCertificate, error) {
c := cert
c.ID = "mc-huge"
return &c, nil
}
req := httptest.NewRequest(http.MethodPost, "/api/v1/certificates", strings.NewReader(sb.String()))
req.Header.Set("Content-Type", "application/json")
req = req.WithContext(contextWithRequestID())
w := httptest.NewRecorder()
handler.CreateCertificate(w, req)
assertSafeResponse(t, w, "CreateCertificate/huge_body")
}
// ---------- Revocation reason abuse (Tier 1E) ----------
// TestRevokeCertificate_ReasonAbuse sends adversarial revocation reasons to
// POST /api/v1/certificates/{id}/revoke. The handler forwards the reason
// string to the service layer, which validates against RFC 5280. Errors
// from the service containing "invalid revocation reason" must map to 400,
// never 500.
func TestRevokeCertificate_ReasonAbuse(t *testing.T) {
cases := []struct {
name string
body string
}{
{"empty_reason", `{"reason":""}`},
{"null_reason", `{"reason":null}`},
{"nonexistent_reason", `{"reason":"totally made up"}`},
{"case_variant", `{"reason":"KEYCOMPROMISE"}`},
{"with_spaces", `{"reason":"key compromise"}`},
{"with_dashes", `{"reason":"key-compromise"}`},
{"mixed_case", `{"reason":"KeyCompromise"}`},
{"lowercase_valid", `{"reason":"keycompromise"}`},
{"unicode_homoglyph", "{\"reason\":\"keyCompr\u043emise\"}"},
{"sql_injection", `{"reason":"keyCompromise';DROP TABLE revocations--"}`},
{"very_long", fmt.Sprintf(`{"reason":%q}`, strings.Repeat("a", 10000))},
{"integer_reason", `{"reason":1}`},
{"array_reason", `{"reason":["keyCompromise"]}`},
{"object_reason", `{"reason":{"code":1}}`},
{"extra_fields", `{"reason":"keyCompromise","admin":true,"bypass":true}`},
{"no_body", ``},
{"malformed_json", `{"reason":`},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
defer func() {
if r := recover(); r != nil {
t.Fatalf("panicked on %q: %v", tc.name, r)
}
}()
handler, mock := newCertHandlerWithMock()
// The mock always returns "invalid revocation reason", which the
// handler's errMsg→status mapping is expected to turn into a 400.
// This test asserts only what assertSafeResponse checks; it does
// not pin the 400 itself.
mock.RevokeCertificateFn = func(_ context.Context, id string, reason string, _ string) error {
// The service uses domain.IsValidRevocationReason. If we got
// through to here with something bogus, simulate a real
// service error.
return fmt.Errorf("invalid revocation reason: %q", reason)
}
req := httptest.NewRequest(http.MethodPost, "/api/v1/certificates/mc-001/revoke", bytes.NewBufferString(tc.body))
req.URL.Path = "/api/v1/certificates/mc-001/revoke"
req.Header.Set("Content-Type", "application/json")
req = req.WithContext(contextWithRequestID())
w := httptest.NewRecorder()
handler.RevokeCertificate(w, req)
assertSafeResponse(t, w, "RevokeCertificate/"+tc.name)
})
}
}
// TestRevokeCertificate_AlreadyRevoked locks in the specific error->status
// mapping for "already revoked". The handler uses substring matching on the
// service error message, which is fragile — this test catches regressions.
func TestRevokeCertificate_AlreadyRevoked(t *testing.T) {
handler, mock := newCertHandlerWithMock()
mock.RevokeCertificateFn = func(_ context.Context, id string, reason string, _ string) error {
return fmt.Errorf("cannot revoke: certificate is already revoked")
}
req := httptest.NewRequest(http.MethodPost, "/api/v1/certificates/mc-001/revoke", strings.NewReader(`{"reason":"keyCompromise"}`))
req.URL.Path = "/api/v1/certificates/mc-001/revoke"
req.Header.Set("Content-Type", "application/json")
req = req.WithContext(contextWithRequestID())
w := httptest.NewRecorder()
handler.RevokeCertificate(w, req)
if w.Code != http.StatusBadRequest {
t.Errorf("expected 400 for already-revoked, got %d (body=%q)", w.Code, w.Body.String())
}
assertSafeResponse(t, w, "RevokeCertificate/already_revoked")
}
// TestRevokeCertificate_NotFound verifies 404 mapping.
func TestRevokeCertificate_NotFound(t *testing.T) {
handler, mock := newCertHandlerWithMock()
mock.RevokeCertificateFn = func(_ context.Context, id string, reason string, _ string) error {
return fmt.Errorf("certificate not found")
}
req := httptest.NewRequest(http.MethodPost, "/api/v1/certificates/mc-missing/revoke", strings.NewReader(`{"reason":"keyCompromise"}`))
req.URL.Path = "/api/v1/certificates/mc-missing/revoke"
req.Header.Set("Content-Type", "application/json")
req = req.WithContext(contextWithRequestID())
w := httptest.NewRecorder()
handler.RevokeCertificate(w, req)
if w.Code != http.StatusNotFound {
t.Errorf("expected 404 for not-found, got %d (body=%q)", w.Code, w.Body.String())
}
assertSafeResponse(t, w, "RevokeCertificate/not_found")
}
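// Illustration only: a minimal sketch of the substring-based error mapping
// that the revocation tests above describe as fragile. statusForMessage and
// statusForRevokeError are hypothetical names, and the exact substrings and
// their order are assumptions taken from the mock errors in these tests, not
// from the handler source.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"strings"
)

// statusForMessage maps a service error message to an HTTP status by
// substring. Anything unrecognised becomes a 500.
func statusForMessage(msg string) int {
	switch {
	case strings.Contains(msg, "not found"):
		return http.StatusNotFound
	case strings.Contains(msg, "already revoked"),
		strings.Contains(msg, "invalid revocation reason"):
		return http.StatusBadRequest
	default:
		return http.StatusInternalServerError
	}
}

// statusForRevokeError is the error-typed wrapper a handler would call.
func statusForRevokeError(err error) int {
	if err == nil {
		return http.StatusOK
	}
	return statusForMessage(err.Error())
}

func main() {
	errs := []error{
		errors.New("certificate not found"),
		errors.New("cannot revoke: certificate is already revoked"),
		fmt.Errorf("invalid revocation reason: %q", "bogus"),
		errors.New("database unavailable"),
	}
	for _, err := range errs {
		fmt.Printf("%-50s -> %d\n", err, statusForRevokeError(err))
	}
}
```

// The fragility is visible here: rewording a service message silently moves
// it into the default 500 branch, which is what the regression tests guard.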
@@ -10,6 +10,7 @@ import (
"time"
"github.com/shankar0123/certctl/internal/domain"
"github.com/shankar0123/certctl/internal/service"
)
// MockAgentService is a mock implementation of AgentService interface.
@@ -24,6 +25,11 @@ type MockAgentService struct {
GetWorkFn func(agentID string) ([]domain.Job, error)
GetWorkWithTargetsFn func(agentID string) ([]domain.WorkItem, error)
UpdateJobStatusFn func(agentID string, jobID string, status string, errMsg string) error
// I-004: soft-retirement hooks. Tests that don't set these receive nil
// results and nil errors, which mirrors the safest default (no-op) for
// unrelated suites that mock only the legacy surface.
RetireAgentFn func(agentID, actor string, force bool, reason string) (*service.AgentRetirementResult, error)
ListRetiredAgentsFn func(page, perPage int) ([]domain.Agent, int64, error)
}
func (m *MockAgentService) ListAgents(_ context.Context, page, perPage int) ([]domain.Agent, int64, error) {
@@ -96,6 +102,25 @@ func (m *MockAgentService) UpdateJobStatus(_ context.Context, agentID string, jo
return nil
}
// RetireAgent is the I-004 soft-retirement entrypoint. Tests that don't set
// RetireAgentFn get a nil result + nil error, which is a no-op response that
// lets unrelated suites compile without caring about the retirement surface.
func (m *MockAgentService) RetireAgent(_ context.Context, agentID, actor string, force bool, reason string) (*service.AgentRetirementResult, error) {
if m.RetireAgentFn != nil {
return m.RetireAgentFn(agentID, actor, force, reason)
}
return nil, nil
}
// ListRetiredAgents returns retired rows for the retired-agents tab / audit
// views. Same zero-value default as RetireAgent for unrelated tests.
func (m *MockAgentService) ListRetiredAgents(_ context.Context, page, perPage int) ([]domain.Agent, int64, error) {
if m.ListRetiredAgentsFn != nil {
return m.ListRetiredAgentsFn(page, perPage)
}
return nil, 0, nil
}
// Test ListAgents - success case
func TestListAgents_Success(t *testing.T) {
now := time.Now()
@@ -0,0 +1,393 @@
package handler
import (
"context"
"encoding/json"
"errors"
"net/http"
"net/http/httptest"
"testing"
"time"
"github.com/shankar0123/certctl/internal/domain"
"github.com/shankar0123/certctl/internal/service"
)
// agentRetireTestSetup builds an AgentHandler with a mock AgentService whose
// RetireAgent / ListRetiredAgents / Heartbeat behavior is driven by the
// returned mock. Keeps every I-004 handler test self-contained so a single
// failing assertion can't cascade through a shared fixture.
func agentRetireTestSetup() (*MockAgentService, AgentHandler) {
mock := &MockAgentService{}
handler := NewAgentHandler(mock)
return mock, handler
}
// TestRetireAgentHandler_Success_200 pins the happy-path contract for the
// soft-retirement HTTP surface: DELETE /api/v1/agents/{id} with no dependency
// fallout returns 200 OK and a JSON body echoing retirement metadata
// (retired_at timestamp, already_retired=false, cascade=false, zero counts).
// Operators building dashboards parse these fields; keep the shape stable.
func TestRetireAgentHandler_Success_200(t *testing.T) {
retiredAt := time.Date(2026, 4, 18, 12, 0, 0, 0, time.UTC)
mock, handler := agentRetireTestSetup()
mock.RetireAgentFn = func(agentID, actor string, force bool, reason string) (*service.AgentRetirementResult, error) {
if agentID != "a-prod-001" {
t.Fatalf("retire handler received agentID=%q want a-prod-001", agentID)
}
if force {
t.Fatalf("retire handler set force=true unexpectedly; default path must be force=false")
}
return &service.AgentRetirementResult{
AlreadyRetired: false,
Cascade: false,
RetiredAt: retiredAt,
Counts: domain.AgentDependencyCounts{},
}, nil
}
req := httptest.NewRequest(http.MethodDelete, "/api/v1/agents/a-prod-001", nil)
req = req.WithContext(contextWithRequestID())
w := httptest.NewRecorder()
handler.RetireAgent(w, req)
if w.Code != http.StatusOK {
t.Fatalf("status=%d body=%s want 200", w.Code, w.Body.String())
}
var body struct {
RetiredAt time.Time `json:"retired_at"`
AlreadyRetired bool `json:"already_retired"`
Cascade bool `json:"cascade"`
Counts domain.AgentDependencyCounts `json:"counts"`
}
if err := json.NewDecoder(w.Body).Decode(&body); err != nil {
t.Fatalf("decode 200 body: %v", err)
}
if !body.RetiredAt.Equal(retiredAt) {
t.Errorf("retired_at=%v want %v", body.RetiredAt, retiredAt)
}
if body.AlreadyRetired {
t.Errorf("already_retired=true want false on clean retire")
}
if body.Cascade {
t.Errorf("cascade=true want false on clean retire")
}
}
// TestRetireAgentHandler_AlreadyRetired_204 covers the idempotent contract: a
// retire call against an already-retired agent completes with 204 No Content
// (no body). This lets operators safely re-issue the DELETE after a network
// blip without fearing duplicate audit events or state mutations.
func TestRetireAgentHandler_AlreadyRetired_204(t *testing.T) {
mock, handler := agentRetireTestSetup()
past := time.Now().Add(-24 * time.Hour)
mock.RetireAgentFn = func(agentID, actor string, force bool, reason string) (*service.AgentRetirementResult, error) {
return &service.AgentRetirementResult{
AlreadyRetired: true,
Cascade: false,
RetiredAt: past,
Counts: domain.AgentDependencyCounts{},
}, nil
}
req := httptest.NewRequest(http.MethodDelete, "/api/v1/agents/a-prod-001", nil)
req = req.WithContext(contextWithRequestID())
w := httptest.NewRecorder()
handler.RetireAgent(w, req)
if w.Code != http.StatusNoContent {
t.Fatalf("status=%d body=%s want 204", w.Code, w.Body.String())
}
// 204 No Content must have zero body. If anything leaks through, downstream
// clients (curl scripts, dashboards) break.
if w.Body.Len() != 0 {
t.Errorf("204 body=%q want empty", w.Body.String())
}
}
// TestRetireAgentHandler_Sentinel_403 covers the hard guard against retiring
// any of the four sentinel agents that back discovery sources and the
// network scanner. These IDs are reserved; the handler must surface the
// service-layer ErrAgentIsSentinel as 403 Forbidden regardless of force/reason
// because no operator intent can legitimately retire them.
func TestRetireAgentHandler_Sentinel_403(t *testing.T) {
sentinels := []string{"server-scanner", "cloud-aws-sm", "cloud-azure-kv", "cloud-gcp-sm"}
for _, id := range sentinels {
t.Run(id, func(t *testing.T) {
mock, handler := agentRetireTestSetup()
mock.RetireAgentFn = func(agentID, actor string, force bool, reason string) (*service.AgentRetirementResult, error) {
return nil, service.ErrAgentIsSentinel
}
req := httptest.NewRequest(http.MethodDelete, "/api/v1/agents/"+id, nil)
req = req.WithContext(contextWithRequestID())
w := httptest.NewRecorder()
handler.RetireAgent(w, req)
if w.Code != http.StatusForbidden {
t.Fatalf("sentinel %q status=%d body=%s want 403", id, w.Code, w.Body.String())
}
})
}
}
// TestRetireAgentHandler_NotFound_404 covers the lookup-miss path. Service
// returns a not-found error; handler maps to 404. The mock here returns a
// plain errors.New value rather than a wrapped sentinel, so this case cannot
// be matched with errors.Is; the sentinel-based discrimination used by the
// other retirement tests does not apply to this path as written.
func TestRetireAgentHandler_NotFound_404(t *testing.T) {
mock, handler := agentRetireTestSetup()
mock.RetireAgentFn = func(agentID, actor string, force bool, reason string) (*service.AgentRetirementResult, error) {
return nil, errors.New("agent not found")
}
req := httptest.NewRequest(http.MethodDelete, "/api/v1/agents/unknown-id", nil)
req = req.WithContext(contextWithRequestID())
w := httptest.NewRecorder()
handler.RetireAgent(w, req)
if w.Code != http.StatusNotFound {
t.Fatalf("status=%d body=%s want 404", w.Code, w.Body.String())
}
}
// TestRetireAgentHandler_Blocked_409_WithCounts covers the preflight-blocked
// path. Service returns *BlockedByDependenciesError wrapping
// ErrBlockedByDependencies; handler unwraps via errors.As, maps to 409, and
// MUST include the counts in the response body so operators know what's
// blocking them. Without counts the 409 is useless — the operator has to
// guess which downstream dependency is holding up the retirement.
func TestRetireAgentHandler_Blocked_409_WithCounts(t *testing.T) {
mock, handler := agentRetireTestSetup()
blockCounts := domain.AgentDependencyCounts{
ActiveTargets: 3,
ActiveCertificates: 7,
PendingJobs: 2,
}
mock.RetireAgentFn = func(agentID, actor string, force bool, reason string) (*service.AgentRetirementResult, error) {
return nil, &service.BlockedByDependenciesError{Counts: blockCounts}
}
req := httptest.NewRequest(http.MethodDelete, "/api/v1/agents/a-prod-001", nil)
req = req.WithContext(contextWithRequestID())
w := httptest.NewRecorder()
handler.RetireAgent(w, req)
if w.Code != http.StatusConflict {
t.Fatalf("status=%d body=%s want 409", w.Code, w.Body.String())
}
var body struct {
Error string `json:"error"`
Message string `json:"message"`
Counts domain.AgentDependencyCounts `json:"counts"`
}
if err := json.NewDecoder(w.Body).Decode(&body); err != nil {
t.Fatalf("decode 409 body: %v", err)
}
if body.Counts.ActiveTargets != 3 {
t.Errorf("counts.active_targets=%d want 3", body.Counts.ActiveTargets)
}
if body.Counts.ActiveCertificates != 7 {
t.Errorf("counts.active_certificates=%d want 7", body.Counts.ActiveCertificates)
}
if body.Counts.PendingJobs != 2 {
t.Errorf("counts.pending_jobs=%d want 2", body.Counts.PendingJobs)
}
if body.Message == "" {
t.Errorf("409 body missing human-readable message; operators need guidance")
}
}
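// Illustration only: a minimal sketch of a typed error that wraps a sentinel
// and carries dependency counts, as the 409 test above describes. All names
// (errBlocked, depCounts, blockedError, countsFromError, isBlocked) are
// hypothetical stand-ins for the service types, whose definitions are not
// shown in this diff.

```go
package main

import (
	"errors"
	"fmt"
)

// errBlocked is the sentinel that callers can test with errors.Is.
var errBlocked = errors.New("blocked by dependencies")

// depCounts mirrors the three counters asserted in the 409 test.
type depCounts struct {
	ActiveTargets      int
	ActiveCertificates int
	PendingJobs        int
}

// blockedError carries the counts and unwraps to the sentinel.
type blockedError struct {
	Counts depCounts
}

func (e *blockedError) Error() string {
	return fmt.Sprintf("%v: targets=%d certificates=%d jobs=%d",
		errBlocked, e.Counts.ActiveTargets, e.Counts.ActiveCertificates, e.Counts.PendingJobs)
}

func (e *blockedError) Unwrap() error { return errBlocked }

// countsFromError extracts the counts with errors.As, even through wrapping.
func countsFromError(err error) (depCounts, bool) {
	var be *blockedError
	if errors.As(err, &be) {
		return be.Counts, true
	}
	return depCounts{}, false
}

// isBlocked reports whether err is, or wraps, the sentinel.
func isBlocked(err error) bool { return errors.Is(err, errBlocked) }

func main() {
	inner := &blockedError{Counts: depCounts{ActiveTargets: 3, ActiveCertificates: 7, PendingJobs: 2}}
	err := fmt.Errorf("retire agent: %w", inner)
	counts, ok := countsFromError(err)
	fmt.Println(isBlocked(err), ok, counts)
}
```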
// TestRetireAgentHandler_Force_NoReason_400 covers the force-escape-hatch
// guardrail: force=true without a non-empty reason must be rejected, because
// a reason-less cascade is unauditable. The handler forwards force=true and
// the empty reason unchanged; the service returns ErrForceReasonRequired
// (before performing any DB work) and the handler maps it to 400.
func TestRetireAgentHandler_Force_NoReason_400(t *testing.T) {
mock, handler := agentRetireTestSetup()
mock.RetireAgentFn = func(agentID, actor string, force bool, reason string) (*service.AgentRetirementResult, error) {
if !force {
t.Fatalf("handler did not forward force=true; force query param was dropped")
}
if reason != "" {
t.Fatalf("handler passed reason=%q; empty reason must reach service for error path", reason)
}
return nil, service.ErrForceReasonRequired
}
req := httptest.NewRequest(http.MethodDelete, "/api/v1/agents/a-prod-001?force=true", nil)
req = req.WithContext(contextWithRequestID())
w := httptest.NewRecorder()
handler.RetireAgent(w, req)
if w.Code != http.StatusBadRequest {
t.Fatalf("status=%d body=%s want 400", w.Code, w.Body.String())
}
}
// TestRetireAgentHandler_ForceCascade_200 covers the successful force-cascade
// path: DELETE ?force=true&reason=... → service executes transactional
// cascade → 200 with cascade=true and the pre-cascade counts echoed back so
// the operator's confirmation dialog can show "I just retired N targets,
// M certificates, K pending jobs."
func TestRetireAgentHandler_ForceCascade_200(t *testing.T) {
mock, handler := agentRetireTestSetup()
retiredAt := time.Date(2026, 4, 18, 14, 30, 0, 0, time.UTC)
mock.RetireAgentFn = func(agentID, actor string, force bool, reason string) (*service.AgentRetirementResult, error) {
if !force {
t.Fatalf("handler did not forward force=true; query-param parsing broken")
}
if reason != "decommissioning rack 7" {
t.Fatalf("handler forwarded reason=%q want %q", reason, "decommissioning rack 7")
}
return &service.AgentRetirementResult{
AlreadyRetired: false,
Cascade: true,
RetiredAt: retiredAt,
Counts: domain.AgentDependencyCounts{
ActiveTargets: 2,
ActiveCertificates: 5,
PendingJobs: 1,
},
}, nil
}
url := "/api/v1/agents/a-prod-001?force=true&reason=decommissioning+rack+7"
req := httptest.NewRequest(http.MethodDelete, url, nil)
req = req.WithContext(contextWithRequestID())
w := httptest.NewRecorder()
handler.RetireAgent(w, req)
if w.Code != http.StatusOK {
t.Fatalf("status=%d body=%s want 200", w.Code, w.Body.String())
}
var body struct {
RetiredAt time.Time `json:"retired_at"`
AlreadyRetired bool `json:"already_retired"`
Cascade bool `json:"cascade"`
Counts domain.AgentDependencyCounts `json:"counts"`
}
if err := json.NewDecoder(w.Body).Decode(&body); err != nil {
t.Fatalf("decode force-cascade 200 body: %v", err)
}
if !body.Cascade {
t.Errorf("cascade=false want true on ?force=true successful retire")
}
if body.Counts.ActiveTargets != 2 || body.Counts.ActiveCertificates != 5 || body.Counts.PendingJobs != 1 {
t.Errorf("counts=%+v want {ActiveTargets:2 ActiveCertificates:5 PendingJobs:1}", body.Counts)
}
}
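// Illustration only: a minimal sketch of how the force and reason query
// parameters used above decode. parseRetireParams is a hypothetical helper;
// treating an unset or unparsable force value as false is an assumption for
// this example, not confirmed handler behaviour.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// parseRetireParams extracts force and reason from a raw query string.
// In query strings "+" decodes to a space, so "rack+7" becomes "rack 7".
func parseRetireParams(rawQuery string) (force bool, reason string, err error) {
	vals, err := url.ParseQuery(rawQuery)
	if err != nil {
		return false, "", err
	}
	// ParseBool accepts "1", "t", "true" and similar; errors leave force false.
	force, _ = strconv.ParseBool(vals.Get("force"))
	return force, vals.Get("reason"), nil
}

func main() {
	force, reason, err := parseRetireParams("force=true&reason=decommissioning+rack+7")
	fmt.Printf("force=%v reason=%q err=%v\n", force, reason, err)
}
```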
// TestHeartbeatHandler_RetiredAgent_410 covers the agent-shutdown signal. A
// retired agent that is still polling must be told its identity is gone
// (410 Gone) rather than offered the normal 200 "recorded" response.
// cmd/agent treats 410 as a terminal signal and exits rather than looping
// forever against a decommissioned identity. Service returns ErrAgentRetired;
// handler maps to 410.
func TestHeartbeatHandler_RetiredAgent_410(t *testing.T) {
mock, handler := agentRetireTestSetup()
mock.HeartbeatFn = func(agentID string, metadata *domain.AgentMetadata) error {
return service.ErrAgentRetired
}
req := httptest.NewRequest(http.MethodPost, "/api/v1/agents/a-prod-001/heartbeat", nil)
req = req.WithContext(contextWithRequestID())
w := httptest.NewRecorder()
handler.Heartbeat(w, req)
if w.Code != http.StatusGone {
t.Fatalf("heartbeat(retired) status=%d body=%s want 410", w.Code, w.Body.String())
}
}
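// Illustration only: a minimal sketch of the client side of the 410
// contract described above, where a polling agent treats 410 Gone as
// terminal. errIdentityGone, classifyHeartbeat and sendHeartbeat are
// hypothetical names; how cmd/agent actually implements this is not shown
// in this diff.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// errIdentityGone signals that the agent should stop polling for good.
var errIdentityGone = errors.New("agent identity retired")

// classifyHeartbeat turns a heartbeat status code into nil, a terminal
// error, or a retryable error.
func classifyHeartbeat(status int) error {
	switch status {
	case http.StatusOK:
		return nil
	case http.StatusGone:
		return errIdentityGone
	default:
		return fmt.Errorf("heartbeat: unexpected status %d", status)
	}
}

// sendHeartbeat posts an empty heartbeat and classifies the response.
func sendHeartbeat(client *http.Client, endpoint string) error {
	resp, err := client.Post(endpoint, "application/json", nil)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	return classifyHeartbeat(resp.StatusCode)
}

func main() {
	// Stand-in server that always answers 410, like a retired identity.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusGone)
	}))
	defer srv.Close()

	err := sendHeartbeat(srv.Client(), srv.URL)
	if errors.Is(err, errIdentityGone) {
		fmt.Println("terminal: stop polling")
	}
}
```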
// TestListRetiredAgentsHandler_Success covers the audit/forensics-facing
// endpoint GET /api/v1/agents/retired. Returns a paged list of retired rows
// alongside total count so the GUI can render a "Retired Agents" tab with
// pagination. Default listing (GET /agents) hides retired rows; this is the
// opt-in surface for them.
func TestListRetiredAgentsHandler_Success(t *testing.T) {
past := time.Now().Add(-48 * time.Hour)
reason := "old hardware"
retired := []domain.Agent{
{
ID: "agent-retired-01",
Name: "decom-01",
Hostname: "server-old",
Status: domain.AgentStatusOffline,
RegisteredAt: past,
RetiredAt: &past,
RetiredReason: &reason,
},
}
mock, handler := agentRetireTestSetup()
mock.ListRetiredAgentsFn = func(page, perPage int) ([]domain.Agent, int64, error) {
if page != 1 || perPage != 50 {
t.Fatalf("ListRetired handler received page=%d perPage=%d want 1/50 defaults", page, perPage)
}
return retired, 1, nil
}
req := httptest.NewRequest(http.MethodGet, "/api/v1/agents/retired", nil)
req = req.WithContext(contextWithRequestID())
w := httptest.NewRecorder()
handler.ListRetiredAgents(w, req)
if w.Code != http.StatusOK {
t.Fatalf("status=%d body=%s want 200", w.Code, w.Body.String())
}
var response PagedResponse
if err := json.NewDecoder(w.Body).Decode(&response); err != nil {
t.Fatalf("decode list-retired body: %v", err)
}
if response.Total != 1 {
t.Errorf("total=%d want 1", response.Total)
}
}
// TestRetireAgentHandler_MethodNotAllowed covers defense-in-depth: only
// DELETE is valid on /api/v1/agents/{id} for retirement. Using POST/PUT/PATCH
// must be rejected with 405 so misconfigured callers don't accidentally
// trigger retirement via a wrong-method request.
func TestRetireAgentHandler_MethodNotAllowed(t *testing.T) {
_, handler := agentRetireTestSetup()
for _, method := range []string{http.MethodPost, http.MethodPut, http.MethodPatch} {
t.Run(method, func(t *testing.T) {
req := httptest.NewRequest(method, "/api/v1/agents/a-prod-001", nil)
req = req.WithContext(contextWithRequestID())
w := httptest.NewRecorder()
handler.RetireAgent(w, req)
if w.Code != http.StatusMethodNotAllowed {
t.Fatalf("method=%s status=%d want 405", method, w.Code)
}
})
}
}
// Compile-time asserts: the mock must satisfy the handler's AgentService
// interface. Red state: this fails until the interface grows RetireAgent +
// ListRetiredAgents. Once Phase 2b adds those methods to AgentService, this
// assertion goes green along with every test above.
var _ AgentService = (*MockAgentService)(nil)
// Unused-import guard for context. Go imports are file-scoped, so this file
// needs its own reference to the package even though agent_handler_test.go
// also imports it. The tests here never construct a context directly (they
// ride on httptest.NewRequest), but the mock methods still receive
// context.Context values.
var _ = context.Background
+199
@@ -3,16 +3,24 @@ package handler
import (
"context"
"encoding/json"
"errors"
"log/slog"
"net/http"
"strconv"
"strings"
"time"
"github.com/shankar0123/certctl/internal/api/middleware"
"github.com/shankar0123/certctl/internal/domain"
"github.com/shankar0123/certctl/internal/service"
)
// AgentService defines the service interface for agent operations.
//
// I-004 expansion: RetireAgent + ListRetiredAgents back the soft-retirement
// surface. The handler depends on the service-package's AgentRetirementResult
// and BlockedByDependenciesError types for result shape + errors.As unwrap,
// which is why this file imports internal/service.
type AgentService interface {
ListAgents(ctx context.Context, page, perPage int) ([]domain.Agent, int64, error)
GetAgent(ctx context.Context, id string) (*domain.Agent, error)
@@ -24,6 +32,10 @@ type AgentService interface {
GetWork(ctx context.Context, agentID string) ([]domain.Job, error)
GetWorkWithTargets(ctx context.Context, agentID string) ([]domain.WorkItem, error)
UpdateJobStatus(ctx context.Context, agentID string, jobID string, status string, errMsg string) error
// I-004 soft-retirement API. Both default to no-op (nil result / nil error)
// in mocks that don't override them — handler tests opt in per suite.
RetireAgent(ctx context.Context, agentID, actor string, force bool, reason string) (*service.AgentRetirementResult, error)
ListRetiredAgents(ctx context.Context, page, perPage int) ([]domain.Agent, int64, error)
}
// AgentHandler handles HTTP requests for agent operations.
@@ -190,6 +202,15 @@ func (h AgentHandler) Heartbeat(w http.ResponseWriter, r *http.Request) {
}
if err := h.svc.Heartbeat(r.Context(), agentID, metadata); err != nil {
// I-004: a retired agent still polling must receive 410 Gone so
// cmd/agent detects the terminal signal and shuts down cleanly
// instead of looping forever against a decommissioned identity.
// Check this FIRST — before "not found" string matching — so the
// retired-path is never masked by a sibling error branch.
if errors.Is(err, service.ErrAgentRetired) {
ErrorWithRequestID(w, http.StatusGone, "Agent has been retired", requestID)
return
}
if strings.Contains(err.Error(), "not found") {
ErrorWithRequestID(w, http.StatusNotFound, "Agent not found", requestID)
return
@@ -376,3 +397,181 @@ func (h AgentHandler) AgentReportJobStatus(w http.ResponseWriter, r *http.Reques
"status": "updated",
})
}
// RetireAgent executes the I-004 soft-retirement surface.
// DELETE /api/v1/agents/{id}[?force=true&reason=...]
//
// Contract (pinned by agent_retire_handler_test.go):
//
// 405 any method other than DELETE
// 200 clean retire (body: retired_at, already_retired=false, cascade=false, counts=0s)
// 200 force-cascade retire (body: cascade=true, counts=pre-cascade snapshot)
// 204 idempotent retire of an already-retired agent (NO body — downstream
// clients that tee responses into dashboards break on spurious bodies)
// 400 force=true without a non-empty reason (ErrForceReasonRequired)
// 403 one of the four reserved sentinel IDs (ErrAgentIsSentinel)
// 404 agent does not exist ("not found" string match, kept for compat with
// repo error strings; sentinel checks run first so they never mask)
// 409 blocked by preflight counts (*BlockedByDependenciesError) — body
// carries the per-bucket counts so the operator UI can tell the
// human which downstream dependency is holding up the retirement,
// rather than forcing them to re-run the DELETE with ?force=true
// and guess
// 500 anything else
//
// The 409 body intentionally does NOT go through ErrorWithRequestID because
// that helper's ErrorResponse shape has no `counts` field — we inline-marshal
// a custom body instead. Keeping this shape stable is important: the GUI
// pattern is "show the 409 dialog, list the N targets / M certs / K jobs
// blocking, let the operator retire them first or tick the force checkbox."
func (h AgentHandler) RetireAgent(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodDelete {
Error(w, http.StatusMethodNotAllowed, "Method not allowed")
return
}
requestID := middleware.GetRequestID(r.Context())
// Extract {id} from /api/v1/agents/{id}. Mirror GetAgent's pattern so
// the path parser is identical across the agent handler surface and a
// future refactor can extract it once without introducing drift.
rawID := strings.TrimPrefix(r.URL.Path, "/api/v1/agents/")
parts := strings.Split(rawID, "/")
if len(parts) == 0 || parts[0] == "" {
ErrorWithRequestID(w, http.StatusBadRequest, "Agent ID is required", requestID)
return
}
id := parts[0]
// Parse optional force + reason. A missing `force` param is treated as
// force=false (the default, safe path); anything strconv.ParseBool rejects
// is also force=false so a malformed query can never silently enable the
// cascade. The reason string is passed through verbatim — the service
// owns the "force=true requires reason" rule.
query := r.URL.Query()
force := false
if fv := query.Get("force"); fv != "" {
if parsed, err := strconv.ParseBool(fv); err == nil {
force = parsed
}
}
reason := query.Get("reason")
actor := resolveActor(r.Context())
result, err := h.svc.RetireAgent(r.Context(), id, actor, force, reason)
if err != nil {
// Sentinel + typed-error checks run BEFORE string matching on "not
// found" so a repo error that happens to contain those words can
// never mask a structural refusal (403/400/409). Order matters.
if errors.Is(err, service.ErrAgentIsSentinel) {
ErrorWithRequestID(w, http.StatusForbidden, "Agent is a reserved sentinel and cannot be retired", requestID)
return
}
if errors.Is(err, service.ErrForceReasonRequired) {
ErrorWithRequestID(w, http.StatusBadRequest, "force=true requires a non-empty reason", requestID)
return
}
var blocked *service.BlockedByDependenciesError
if errors.As(err, &blocked) {
// Custom 409 body with per-bucket counts. ErrorResponse has no
// `counts` field, so we marshal a bespoke struct instead.
// Keep `error`/`message`/`counts` as the stable shape — any
// dashboard parsing this relies on those three keys.
body := struct {
Error string `json:"error"`
Message string `json:"message"`
Counts domain.AgentDependencyCounts `json:"counts"`
}{
Error: "blocked_by_dependencies",
Message: "Agent has active downstream dependencies. Retire or reassign them " +
"first, or re-run with ?force=true&reason=... to cascade.",
Counts: blocked.Counts,
}
JSON(w, http.StatusConflict, body)
return
}
if strings.Contains(err.Error(), "not found") {
ErrorWithRequestID(w, http.StatusNotFound, "Agent not found", requestID)
return
}
slog.Error("RetireAgent failed", "agent_id", id, "error", err.Error())
ErrorWithRequestID(w, http.StatusInternalServerError, "Failed to retire agent", requestID)
return
}
// Idempotent retire: the agent was already retired, so we return 204 No
// Content with a ZERO-length body. The Red contract (test line 106) fails
// if even a trailing newline leaks into the response. WriteHeader alone
// emits the status without invoking the JSON encoder.
if result.AlreadyRetired {
w.WriteHeader(http.StatusNoContent)
return
}
// Clean retire (force=false) or successful cascade (force=true). Body
// shape pinned by Red contract: retired_at, already_retired, cascade,
// counts. Omitempty is deliberately NOT used — operators parsing the
// response expect every field to always be present.
JSON(w, http.StatusOK, struct {
RetiredAt time.Time `json:"retired_at"`
AlreadyRetired bool `json:"already_retired"`
Cascade bool `json:"cascade"`
Counts domain.AgentDependencyCounts `json:"counts"`
}{
RetiredAt: result.RetiredAt,
AlreadyRetired: result.AlreadyRetired,
Cascade: result.Cascade,
Counts: result.Counts,
})
}
// ListRetiredAgents returns the opt-in listing of retired agents for the
// operator UI's "Retired" tab and for audit/forensics workflows.
// GET /api/v1/agents/retired?page=1&per_page=50
//
// The default ListAgents handler hides retired rows; this is the dedicated
// surface for reading them back. Pagination defaults match ListAgents so
// the GUI can reuse the same query hook (page=1, per_page=50, cap 500).
//
// Go 1.22's enhanced ServeMux routes `/agents/retired` to this handler via
// the literal-beats-pattern-var precedence rule (literal `retired` wins over
// `{id}` in the sibling GET /api/v1/agents/{id} route), so both entries can
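// coexist without conflict. If that precedence ever regressed, requests for
// /agents/retired would fall through to the {id} route and come back as 404.
// Note that TestListRetiredAgentsHandler_Success invokes this handler
// directly and does not go through the mux, so catching a routing regression
// needs a router-level test.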
func (h AgentHandler) ListRetiredAgents(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodGet {
Error(w, http.StatusMethodNotAllowed, "Method not allowed")
return
}
requestID := middleware.GetRequestID(r.Context())
page := 1
perPage := 50
query := r.URL.Query()
if p := query.Get("page"); p != "" {
if parsed, err := strconv.Atoi(p); err == nil && parsed > 0 {
page = parsed
}
}
if pp := query.Get("per_page"); pp != "" {
if parsed, err := strconv.Atoi(pp); err == nil && parsed > 0 && parsed <= 500 {
perPage = parsed
}
}
agents, total, err := h.svc.ListRetiredAgents(r.Context(), page, perPage)
if err != nil {
ErrorWithRequestID(w, http.StatusInternalServerError, "Failed to list retired agents", requestID)
return
}
JSON(w, http.StatusOK, PagedResponse{
Data: agents,
Total: total,
Page: page,
PerPage: perPage,
})
}
+5 -4
@@ -1,6 +1,7 @@
package handler
import (
+ "context"
"net/http"
"strconv"
"strings"
@@ -11,8 +12,8 @@ import (
// AuditService defines the service interface for audit event operations.
type AuditService interface {
- ListAuditEvents(page, perPage int) ([]domain.AuditEvent, int64, error)
- GetAuditEvent(id string) (*domain.AuditEvent, error)
+ ListAuditEvents(ctx context.Context, page, perPage int) ([]domain.AuditEvent, int64, error)
+ GetAuditEvent(ctx context.Context, id string) (*domain.AuditEvent, error)
}
// AuditHandler handles HTTP requests for audit event operations.
@@ -49,7 +50,7 @@ func (h AuditHandler) ListAuditEvents(w http.ResponseWriter, r *http.Request) {
}
}
- events, total, err := h.svc.ListAuditEvents(page, perPage)
+ events, total, err := h.svc.ListAuditEvents(r.Context(), page, perPage)
if err != nil {
ErrorWithRequestID(w, http.StatusInternalServerError, "Failed to list audit events", requestID)
return
@@ -83,7 +84,7 @@ func (h AuditHandler) GetAuditEvent(w http.ResponseWriter, r *http.Request) {
}
id = parts[0]
- event, err := h.svc.GetAuditEvent(id)
+ event, err := h.svc.GetAuditEvent(r.Context(), id)
if err != nil {
ErrorWithRequestID(w, http.StatusNotFound, "Audit event not found", requestID)
return
+419
@@ -0,0 +1,419 @@
package handler
import (
"context"
"encoding/json"
"errors"
"net/http"
"net/http/httptest"
"testing"
"time"
"github.com/shankar0123/certctl/internal/domain"
"github.com/shankar0123/certctl/internal/api/middleware"
)
// mockAuditService implements AuditService for testing.
type mockAuditService struct {
listFunc func(page, perPage int) ([]domain.AuditEvent, int64, error)
getFunc func(id string) (*domain.AuditEvent, error)
}
func (m *mockAuditService) ListAuditEvents(_ context.Context, page, perPage int) ([]domain.AuditEvent, int64, error) {
if m.listFunc != nil {
return m.listFunc(page, perPage)
}
return nil, 0, nil
}
func (m *mockAuditService) GetAuditEvent(_ context.Context, id string) (*domain.AuditEvent, error) {
if m.getFunc != nil {
return m.getFunc(id)
}
return nil, nil
}
func TestListAuditEvents_Success(t *testing.T) {
events := []domain.AuditEvent{
{
ID: "ev-1",
Action: "certificate_issued",
Actor: "user@example.com",
ActorType: domain.ActorTypeUser,
ResourceID: "mc-api-prod",
ResourceType: "Certificate",
Timestamp: time.Now(),
},
{
ID: "ev-2",
Action: "certificate_renewed",
Actor: "user@example.com",
ActorType: domain.ActorTypeUser,
ResourceID: "mc-api-prod",
ResourceType: "Certificate",
Timestamp: time.Now(),
},
}
mockSvc := &mockAuditService{
listFunc: func(page, perPage int) ([]domain.AuditEvent, int64, error) {
if page != 1 || perPage != 50 {
t.Errorf("ListAuditEvents called with page=%d, perPage=%d, expected 1, 50", page, perPage)
}
return events, 2, nil
},
}
handler := NewAuditHandler(mockSvc)
req, err := http.NewRequest(http.MethodGet, "/api/v1/audit", nil)
if err != nil {
t.Fatalf("NewRequest failed: %v", err)
}
// Add request ID to context
ctx := context.WithValue(req.Context(), middleware.RequestIDKey{}, "test-req-id")
req = req.WithContext(ctx)
w := httptest.NewRecorder()
handler.ListAuditEvents(w, req)
if status := w.Code; status != http.StatusOK {
t.Errorf("ListAuditEvents returned status %d, want %d", status, http.StatusOK)
}
var result PagedResponse
if err := json.NewDecoder(w.Body).Decode(&result); err != nil {
t.Fatalf("failed to decode response: %v", err)
}
if result.Total != 2 {
t.Errorf("Total = %d, want 2", result.Total)
}
if result.Page != 1 {
t.Errorf("Page = %d, want 1", result.Page)
}
if result.PerPage != 50 {
t.Errorf("PerPage = %d, want 50", result.PerPage)
}
// Check data is present
if result.Data == nil {
t.Error("Data is nil, want events slice")
}
}
func TestListAuditEvents_WithPagination(t *testing.T) {
events := []domain.AuditEvent{
{
ID: "ev-5",
Action: "certificate_issued",
Actor: "user@example.com",
ActorType: domain.ActorTypeUser,
ResourceID: "mc-api-prod",
ResourceType: "Certificate",
Timestamp: time.Now(),
},
}
mockSvc := &mockAuditService{
listFunc: func(page, perPage int) ([]domain.AuditEvent, int64, error) {
if page != 2 || perPage != 25 {
t.Errorf("ListAuditEvents called with page=%d, perPage=%d, expected 2, 25", page, perPage)
}
return events, 100, nil
},
}
handler := NewAuditHandler(mockSvc)
req, err := http.NewRequest(http.MethodGet, "/api/v1/audit?page=2&per_page=25", nil)
if err != nil {
t.Fatalf("NewRequest failed: %v", err)
}
ctx := context.WithValue(req.Context(), middleware.RequestIDKey{}, "test-req-id")
req = req.WithContext(ctx)
w := httptest.NewRecorder()
handler.ListAuditEvents(w, req)
if status := w.Code; status != http.StatusOK {
t.Errorf("ListAuditEvents returned status %d, want %d", status, http.StatusOK)
}
var result PagedResponse
if err := json.NewDecoder(w.Body).Decode(&result); err != nil {
t.Fatalf("failed to decode response: %v", err)
}
if result.Page != 2 {
t.Errorf("Page = %d, want 2", result.Page)
}
if result.PerPage != 25 {
t.Errorf("PerPage = %d, want 25", result.PerPage)
}
}
func TestListAuditEvents_PerPageMaxLimit(t *testing.T) {
mockSvc := &mockAuditService{
listFunc: func(page, perPage int) ([]domain.AuditEvent, int64, error) {
// Should be capped at 500
if perPage > 500 {
t.Errorf("perPage = %d, expected <= 500", perPage)
}
return []domain.AuditEvent{}, 0, nil
},
}
handler := NewAuditHandler(mockSvc)
req, err := http.NewRequest(http.MethodGet, "/api/v1/audit?per_page=1000", nil)
if err != nil {
t.Fatalf("NewRequest failed: %v", err)
}
ctx := context.WithValue(req.Context(), middleware.RequestIDKey{}, "test-req-id")
req = req.WithContext(ctx)
w := httptest.NewRecorder()
handler.ListAuditEvents(w, req)
if status := w.Code; status != http.StatusOK {
t.Errorf("ListAuditEvents returned status %d, want %d", status, http.StatusOK)
}
var result PagedResponse
if err := json.NewDecoder(w.Body).Decode(&result); err != nil {
t.Fatalf("failed to decode response: %v", err)
}
if result.PerPage > 500 {
t.Errorf("PerPage = %d, want <= 500", result.PerPage)
}
}
func TestListAuditEvents_EmptyResult(t *testing.T) {
mockSvc := &mockAuditService{
listFunc: func(page, perPage int) ([]domain.AuditEvent, int64, error) {
return []domain.AuditEvent{}, 0, nil
},
}
handler := NewAuditHandler(mockSvc)
req, err := http.NewRequest(http.MethodGet, "/api/v1/audit", nil)
if err != nil {
t.Fatalf("NewRequest failed: %v", err)
}
ctx := context.WithValue(req.Context(), middleware.RequestIDKey{}, "test-req-id")
req = req.WithContext(ctx)
w := httptest.NewRecorder()
handler.ListAuditEvents(w, req)
if status := w.Code; status != http.StatusOK {
t.Errorf("ListAuditEvents returned status %d, want %d", status, http.StatusOK)
}
var result PagedResponse
if err := json.NewDecoder(w.Body).Decode(&result); err != nil {
t.Fatalf("failed to decode response: %v", err)
}
if result.Total != 0 {
t.Errorf("Total = %d, want 0", result.Total)
}
}
func TestListAuditEvents_ServiceError(t *testing.T) {
mockSvc := &mockAuditService{
listFunc: func(page, perPage int) ([]domain.AuditEvent, int64, error) {
return nil, 0, errors.New("database error")
},
}
handler := NewAuditHandler(mockSvc)
req, err := http.NewRequest(http.MethodGet, "/api/v1/audit", nil)
if err != nil {
t.Fatalf("NewRequest failed: %v", err)
}
ctx := context.WithValue(req.Context(), middleware.RequestIDKey{}, "test-req-id")
req = req.WithContext(ctx)
w := httptest.NewRecorder()
handler.ListAuditEvents(w, req)
if status := w.Code; status != http.StatusInternalServerError {
t.Errorf("ListAuditEvents returned status %d, want %d", status, http.StatusInternalServerError)
}
var errResp ErrorResponse
if err := json.NewDecoder(w.Body).Decode(&errResp); err != nil {
t.Fatalf("failed to decode error response: %v", err)
}
if errResp.Message != "Failed to list audit events" {
t.Errorf("Message = %q, want 'Failed to list audit events'", errResp.Message)
}
}
func TestListAuditEvents_MethodNotAllowed(t *testing.T) {
mockSvc := &mockAuditService{}
handler := NewAuditHandler(mockSvc)
req, err := http.NewRequest(http.MethodPost, "/api/v1/audit", nil)
if err != nil {
t.Fatalf("NewRequest failed: %v", err)
}
ctx := context.WithValue(req.Context(), middleware.RequestIDKey{}, "test-req-id")
req = req.WithContext(ctx)
w := httptest.NewRecorder()
handler.ListAuditEvents(w, req)
if status := w.Code; status != http.StatusMethodNotAllowed {
t.Errorf("ListAuditEvents returned status %d, want %d", status, http.StatusMethodNotAllowed)
}
}
func TestGetAuditEvent_Success(t *testing.T) {
event := &domain.AuditEvent{
ID: "ev-123",
Action: "certificate_issued",
Actor: "user@example.com",
ActorType: domain.ActorTypeUser,
ResourceID: "mc-api-prod",
ResourceType: "Certificate",
Timestamp: time.Now(),
}
mockSvc := &mockAuditService{
getFunc: func(id string) (*domain.AuditEvent, error) {
if id != "ev-123" {
t.Errorf("GetAuditEvent called with id=%q, expected ev-123", id)
}
return event, nil
},
}
handler := NewAuditHandler(mockSvc)
req, err := http.NewRequest(http.MethodGet, "/api/v1/audit/ev-123", nil)
if err != nil {
t.Fatalf("NewRequest failed: %v", err)
}
ctx := context.WithValue(req.Context(), middleware.RequestIDKey{}, "test-req-id")
req = req.WithContext(ctx)
w := httptest.NewRecorder()
handler.GetAuditEvent(w, req)
if status := w.Code; status != http.StatusOK {
t.Errorf("GetAuditEvent returned status %d, want %d", status, http.StatusOK)
}
var result domain.AuditEvent
if err := json.NewDecoder(w.Body).Decode(&result); err != nil {
t.Fatalf("failed to decode response: %v", err)
}
if result.ID != "ev-123" {
t.Errorf("ID = %q, want ev-123", result.ID)
}
if result.Action != "certificate_issued" {
t.Errorf("Action = %q, want certificate_issued", result.Action)
}
}
func TestGetAuditEvent_NotFound(t *testing.T) {
mockSvc := &mockAuditService{
getFunc: func(id string) (*domain.AuditEvent, error) {
return nil, errors.New("not found")
},
}
handler := NewAuditHandler(mockSvc)
req, err := http.NewRequest(http.MethodGet, "/api/v1/audit/nonexistent", nil)
if err != nil {
t.Fatalf("NewRequest failed: %v", err)
}
ctx := context.WithValue(req.Context(), middleware.RequestIDKey{}, "test-req-id")
req = req.WithContext(ctx)
w := httptest.NewRecorder()
handler.GetAuditEvent(w, req)
if status := w.Code; status != http.StatusNotFound {
t.Errorf("GetAuditEvent returned status %d, want %d", status, http.StatusNotFound)
}
var errResp ErrorResponse
if err := json.NewDecoder(w.Body).Decode(&errResp); err != nil {
t.Fatalf("failed to decode error response: %v", err)
}
if errResp.Message != "Audit event not found" {
t.Errorf("Message = %q, want 'Audit event not found'", errResp.Message)
}
}
func TestGetAuditEvent_MethodNotAllowed(t *testing.T) {
mockSvc := &mockAuditService{}
handler := NewAuditHandler(mockSvc)
req, err := http.NewRequest(http.MethodDelete, "/api/v1/audit/ev-123", nil)
if err != nil {
t.Fatalf("NewRequest failed: %v", err)
}
ctx := context.WithValue(req.Context(), middleware.RequestIDKey{}, "test-req-id")
req = req.WithContext(ctx)
w := httptest.NewRecorder()
handler.GetAuditEvent(w, req)
if status := w.Code; status != http.StatusMethodNotAllowed {
t.Errorf("GetAuditEvent returned status %d, want %d", status, http.StatusMethodNotAllowed)
}
}
func TestGetAuditEvent_EmptyID(t *testing.T) {
mockSvc := &mockAuditService{}
handler := NewAuditHandler(mockSvc)
req, err := http.NewRequest(http.MethodGet, "/api/v1/audit/", nil)
if err != nil {
t.Fatalf("NewRequest failed: %v", err)
}
ctx := context.WithValue(req.Context(), middleware.RequestIDKey{}, "test-req-id")
req = req.WithContext(ctx)
w := httptest.NewRecorder()
handler.GetAuditEvent(w, req)
if status := w.Code; status != http.StatusBadRequest {
t.Errorf("GetAuditEvent returned status %d, want %d", status, http.StatusBadRequest)
}
var errResp ErrorResponse
if err := json.NewDecoder(w.Body).Decode(&errResp); err != nil {
t.Fatalf("failed to decode error response: %v", err)
}
if errResp.Message != "Audit event ID is required" {
t.Errorf("Message = %q, want 'Audit event ID is required'", errResp.Message)
}
}
+106
@@ -0,0 +1,106 @@
package handler
import (
"context"
"encoding/json"
"net/http"
"github.com/shankar0123/certctl/internal/api/middleware"
"github.com/shankar0123/certctl/internal/domain"
)
// BulkRevocationService defines the service interface for bulk certificate revocation.
type BulkRevocationService interface {
BulkRevoke(ctx context.Context, criteria domain.BulkRevocationCriteria, reason string, actor string) (*domain.BulkRevocationResult, error)
}
// BulkRevocationHandler handles HTTP requests for bulk revocation operations.
type BulkRevocationHandler struct {
svc BulkRevocationService
}
// NewBulkRevocationHandler creates a new BulkRevocationHandler.
func NewBulkRevocationHandler(svc BulkRevocationService) BulkRevocationHandler {
return BulkRevocationHandler{svc: svc}
}
// bulkRevokeRequest represents the JSON request body for bulk revocation.
type bulkRevokeRequest struct {
Reason string `json:"reason"`
ProfileID string `json:"profile_id,omitempty"`
OwnerID string `json:"owner_id,omitempty"`
AgentID string `json:"agent_id,omitempty"`
IssuerID string `json:"issuer_id,omitempty"`
TeamID string `json:"team_id,omitempty"`
CertificateIDs []string `json:"certificate_ids,omitempty"`
}
// BulkRevoke handles bulk certificate revocation.
// POST /api/v1/certificates/bulk-revoke
//
// M-003: admin-only. Bulk revocation is a fleet-scale destructive operation —
// a non-admin caller must not be able to invalidate certificates across
// profiles/owners/agents. The gate is enforced here (before body parsing) so a
// non-admin never sees its request criteria evaluated.
func (h BulkRevocationHandler) BulkRevoke(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodPost {
Error(w, http.StatusMethodNotAllowed, "Method not allowed")
return
}
requestID := middleware.GetRequestID(r.Context())
// M-003: admin-only gate. Non-admin callers are rejected before any
// criteria/body processing to avoid leaking validation behavior to
// unauthorized actors.
if !middleware.IsAdmin(r.Context()) {
ErrorWithRequestID(w, http.StatusForbidden,
"Bulk revocation requires admin privileges",
requestID)
return
}
var req bulkRevokeRequest
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
ErrorWithRequestID(w, http.StatusBadRequest, "Invalid request body", requestID)
return
}
// Validate reason is present
if req.Reason == "" {
ErrorWithRequestID(w, http.StatusBadRequest, "Revocation reason is required", requestID)
return
}
// Validate reason is a valid RFC 5280 code
if !domain.IsValidRevocationReason(req.Reason) {
ErrorWithRequestID(w, http.StatusBadRequest, "Invalid revocation reason: "+req.Reason, requestID)
return
}
criteria := domain.BulkRevocationCriteria{
ProfileID: req.ProfileID,
OwnerID: req.OwnerID,
AgentID: req.AgentID,
IssuerID: req.IssuerID,
TeamID: req.TeamID,
CertificateIDs: req.CertificateIDs,
}
// Safety guard: at least one criterion required
if criteria.IsEmpty() {
ErrorWithRequestID(w, http.StatusBadRequest, "At least one filter criterion is required (profile_id, owner_id, agent_id, issuer_id, team_id, or certificate_ids)", requestID)
return
}
// Extract actor from auth context (M-002: named-key identity → audit trail)
actor := resolveActor(r.Context())
result, err := h.svc.BulkRevoke(r.Context(), criteria, req.Reason, actor)
if err != nil {
ErrorWithRequestID(w, http.StatusInternalServerError, "Bulk revocation failed: "+err.Error(), requestID)
return
}
JSON(w, http.StatusOK, result)
}
@@ -0,0 +1,289 @@
package handler
import (
"bytes"
"context"
"encoding/json"
"fmt"
"net/http"
"net/http/httptest"
"strings"
"testing"
"github.com/shankar0123/certctl/internal/api/middleware"
"github.com/shankar0123/certctl/internal/domain"
)
// mockBulkRevocationService is a test implementation of BulkRevocationService
type mockBulkRevocationService struct {
BulkRevokeFn func(ctx context.Context, criteria domain.BulkRevocationCriteria, reason string, actor string) (*domain.BulkRevocationResult, error)
}
func (m *mockBulkRevocationService) BulkRevoke(ctx context.Context, criteria domain.BulkRevocationCriteria, reason string, actor string) (*domain.BulkRevocationResult, error) {
if m.BulkRevokeFn != nil {
return m.BulkRevokeFn(ctx, criteria, reason, actor)
}
return &domain.BulkRevocationResult{}, nil
}
// adminContext returns a context carrying the admin flag, mimicking what the
// auth middleware sets for named-key callers whose entry is admin-tagged.
// M-003: bulk revocation handler requires admin context to reach the service.
func adminContext() context.Context {
ctx := context.WithValue(context.Background(), middleware.RequestIDKey{}, "test-request-id-bulk")
ctx = context.WithValue(ctx, middleware.AdminKey{}, true)
return ctx
}
func TestBulkRevoke_Success_WithIDs(t *testing.T) {
svc := &mockBulkRevocationService{
BulkRevokeFn: func(ctx context.Context, criteria domain.BulkRevocationCriteria, reason string, actor string) (*domain.BulkRevocationResult, error) {
if len(criteria.CertificateIDs) != 2 {
t.Errorf("expected 2 IDs, got %d", len(criteria.CertificateIDs))
}
if reason != "keyCompromise" {
t.Errorf("expected reason keyCompromise, got %s", reason)
}
return &domain.BulkRevocationResult{
TotalMatched: 2,
TotalRevoked: 2,
}, nil
},
}
h := NewBulkRevocationHandler(svc)
body := `{"reason":"keyCompromise","certificate_ids":["mc-1","mc-2"]}`
req := httptest.NewRequest(http.MethodPost, "/api/v1/certificates/bulk-revoke", bytes.NewBufferString(body))
req.Header.Set("Content-Type", "application/json")
req = req.WithContext(adminContext())
w := httptest.NewRecorder()
h.BulkRevoke(w, req)
if w.Code != http.StatusOK {
t.Errorf("expected 200, got %d", w.Code)
}
var result domain.BulkRevocationResult
if err := json.NewDecoder(w.Body).Decode(&result); err != nil {
t.Fatalf("failed to decode response: %v", err)
}
if result.TotalMatched != 2 {
t.Errorf("expected TotalMatched=2, got %d", result.TotalMatched)
}
if result.TotalRevoked != 2 {
t.Errorf("expected TotalRevoked=2, got %d", result.TotalRevoked)
}
}
func TestBulkRevoke_Success_WithProfile(t *testing.T) {
svc := &mockBulkRevocationService{
BulkRevokeFn: func(ctx context.Context, criteria domain.BulkRevocationCriteria, reason string, actor string) (*domain.BulkRevocationResult, error) {
if criteria.ProfileID != "prof-tls" {
t.Errorf("expected profile prof-tls, got %s", criteria.ProfileID)
}
return &domain.BulkRevocationResult{
TotalMatched: 5,
TotalRevoked: 4,
TotalSkipped: 1,
}, nil
},
}
h := NewBulkRevocationHandler(svc)
body := `{"reason":"keyCompromise","profile_id":"prof-tls"}`
req := httptest.NewRequest(http.MethodPost, "/api/v1/certificates/bulk-revoke", bytes.NewBufferString(body))
req.Header.Set("Content-Type", "application/json")
req = req.WithContext(adminContext())
w := httptest.NewRecorder()
h.BulkRevoke(w, req)
if w.Code != http.StatusOK {
t.Errorf("expected 200, got %d", w.Code)
}
}
func TestBulkRevoke_MissingReason_400(t *testing.T) {
h := NewBulkRevocationHandler(&mockBulkRevocationService{})
body := `{"certificate_ids":["mc-1"]}`
req := httptest.NewRequest(http.MethodPost, "/api/v1/certificates/bulk-revoke", bytes.NewBufferString(body))
req.Header.Set("Content-Type", "application/json")
req = req.WithContext(adminContext())
w := httptest.NewRecorder()
h.BulkRevoke(w, req)
if w.Code != http.StatusBadRequest {
t.Errorf("expected 400, got %d", w.Code)
}
}
func TestBulkRevoke_EmptyCriteria_400(t *testing.T) {
h := NewBulkRevocationHandler(&mockBulkRevocationService{})
body := `{"reason":"keyCompromise"}`
req := httptest.NewRequest(http.MethodPost, "/api/v1/certificates/bulk-revoke", bytes.NewBufferString(body))
req.Header.Set("Content-Type", "application/json")
req = req.WithContext(adminContext())
w := httptest.NewRecorder()
h.BulkRevoke(w, req)
if w.Code != http.StatusBadRequest {
t.Errorf("expected 400, got %d", w.Code)
}
}
func TestBulkRevoke_InvalidReason_400(t *testing.T) {
h := NewBulkRevocationHandler(&mockBulkRevocationService{})
body := `{"reason":"totallyBogus","certificate_ids":["mc-1"]}`
req := httptest.NewRequest(http.MethodPost, "/api/v1/certificates/bulk-revoke", bytes.NewBufferString(body))
req.Header.Set("Content-Type", "application/json")
req = req.WithContext(adminContext())
w := httptest.NewRecorder()
h.BulkRevoke(w, req)
if w.Code != http.StatusBadRequest {
t.Errorf("expected 400, got %d", w.Code)
}
}
func TestBulkRevoke_MethodNotAllowed_405(t *testing.T) {
h := NewBulkRevocationHandler(&mockBulkRevocationService{})
// Method check fires before the admin gate, so 405 must hold even for a
// non-admin caller — asserting this keeps the ordering explicit.
req := httptest.NewRequest(http.MethodGet, "/api/v1/certificates/bulk-revoke", nil)
w := httptest.NewRecorder()
h.BulkRevoke(w, req)
if w.Code != http.StatusMethodNotAllowed {
t.Errorf("expected 405, got %d", w.Code)
}
}
func TestBulkRevoke_ServiceError_500(t *testing.T) {
svc := &mockBulkRevocationService{
BulkRevokeFn: func(ctx context.Context, criteria domain.BulkRevocationCriteria, reason string, actor string) (*domain.BulkRevocationResult, error) {
return nil, fmt.Errorf("database connection failed")
},
}
h := NewBulkRevocationHandler(svc)
body := `{"reason":"keyCompromise","certificate_ids":["mc-1"]}`
req := httptest.NewRequest(http.MethodPost, "/api/v1/certificates/bulk-revoke", bytes.NewBufferString(body))
req.Header.Set("Content-Type", "application/json")
req = req.WithContext(adminContext())
w := httptest.NewRecorder()
h.BulkRevoke(w, req)
if w.Code != http.StatusInternalServerError {
t.Errorf("expected 500, got %d", w.Code)
}
}
// --- M-003: admin-only gate on bulk revocation ---
// TestBulkRevoke_NonAdmin_Returns403 is the central authorization regression
// for M-003. A caller without an admin-tagged context must be rejected with
// HTTP 403, regardless of how well-formed its body is, and the service layer
// must never see the request.
func TestBulkRevoke_NonAdmin_Returns403(t *testing.T) {
var serviceCalled bool
svc := &mockBulkRevocationService{
BulkRevokeFn: func(ctx context.Context, criteria domain.BulkRevocationCriteria, reason string, actor string) (*domain.BulkRevocationResult, error) {
serviceCalled = true
return &domain.BulkRevocationResult{}, nil
},
}
h := NewBulkRevocationHandler(svc)
// Well-formed body + well-formed reason + filter — the only thing
// missing is an admin-tagged context. The gate must still fire.
body := `{"reason":"keyCompromise","certificate_ids":["mc-1","mc-2"]}`
req := httptest.NewRequest(http.MethodPost, "/api/v1/certificates/bulk-revoke", bytes.NewBufferString(body))
req.Header.Set("Content-Type", "application/json")
req = req.WithContext(contextWithRequestID()) // request id only, no admin flag
w := httptest.NewRecorder()
h.BulkRevoke(w, req)
if w.Code != http.StatusForbidden {
t.Fatalf("expected status 403, got %d (body=%q)", w.Code, w.Body.String())
}
var resp map[string]any
if err := json.NewDecoder(w.Body).Decode(&resp); err != nil {
t.Fatalf("failed to decode response: %v", err)
}
msg, _ := resp["message"].(string)
if !strings.Contains(strings.ToLower(msg), "admin") {
t.Errorf("expected message to mention admin requirement, got %q", msg)
}
if serviceCalled {
t.Errorf("service was invoked despite non-admin caller — gate failed open")
}
}
// TestBulkRevoke_AdminExplicitFalse_Returns403 pins the specific case where the
// AdminKey exists but is set to false — e.g., a non-admin named-key caller.
// Without this we could regress to "key missing == deny, key present == allow"
// which would silently grant a false flag.
func TestBulkRevoke_AdminExplicitFalse_Returns403(t *testing.T) {
h := NewBulkRevocationHandler(&mockBulkRevocationService{})
body := `{"reason":"keyCompromise","certificate_ids":["mc-1"]}`
req := httptest.NewRequest(http.MethodPost, "/api/v1/certificates/bulk-revoke", bytes.NewBufferString(body))
req.Header.Set("Content-Type", "application/json")
ctx := context.WithValue(context.Background(), middleware.RequestIDKey{}, "test-request-id")
ctx = context.WithValue(ctx, middleware.AdminKey{}, false)
req = req.WithContext(ctx)
w := httptest.NewRecorder()
h.BulkRevoke(w, req)
if w.Code != http.StatusForbidden {
t.Fatalf("expected status 403 for admin=false, got %d", w.Code)
}
}
// TestBulkRevoke_AdminPermitted_ForwardsActor confirms the happy path:
// an admin-tagged context reaches the service and the actor (from the auth
// UserKey) is propagated through to BulkRevoke. This keeps the admin gate and
// the M-002 actor-propagation wired together in a single regression.
func TestBulkRevoke_AdminPermitted_ForwardsActor(t *testing.T) {
var capturedActor string
svc := &mockBulkRevocationService{
BulkRevokeFn: func(ctx context.Context, criteria domain.BulkRevocationCriteria, reason string, actor string) (*domain.BulkRevocationResult, error) {
capturedActor = actor
return &domain.BulkRevocationResult{TotalMatched: 1, TotalRevoked: 1}, nil
},
}
h := NewBulkRevocationHandler(svc)
body := `{"reason":"keyCompromise","certificate_ids":["mc-1"]}`
req := httptest.NewRequest(http.MethodPost, "/api/v1/certificates/bulk-revoke", bytes.NewBufferString(body))
req.Header.Set("Content-Type", "application/json")
ctx := context.WithValue(context.Background(), middleware.RequestIDKey{}, "test-request-id")
ctx = context.WithValue(ctx, middleware.AdminKey{}, true)
ctx = context.WithValue(ctx, middleware.UserKey{}, "ops-admin")
req = req.WithContext(ctx)
w := httptest.NewRecorder()
h.BulkRevoke(w, req)
if w.Code != http.StatusOK {
t.Fatalf("expected status 200 for admin caller, got %d (body=%q)", w.Code, w.Body.String())
}
if capturedActor != "ops-admin" {
t.Errorf("expected actor ops-admin, got %q", capturedActor)
}
}
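The three admin-gate tests above (no admin key, admin key explicitly false, admin key true) imply a check that allows only an explicit boolean `true`. A minimal sketch of such a check follows. It assumes the flag is stored as a `bool` under a struct-typed context key; `adminKey`, `isAdmin`, `ctxWithAdmin`, and `ctxWithoutAdmin` are hypothetical names for illustration, not the actual `middleware` or handler code.

```go
package main

import (
	"context"
	"fmt"
)

// adminKey is a hypothetical stand-in for middleware.AdminKey.
type adminKey struct{}

// isAdmin reports true only when the context carries an explicit boolean
// true under adminKey. A missing key, a false value, or a value of any
// other type all deny, so the gate fails closed.
func isAdmin(ctx context.Context) bool {
	admin, ok := ctx.Value(adminKey{}).(bool)
	return ok && admin
}

// ctxWithAdmin builds a context that carries the given admin flag.
func ctxWithAdmin(v bool) context.Context {
	return context.WithValue(context.Background(), adminKey{}, v)
}

// ctxWithoutAdmin builds a context with no admin flag at all.
func ctxWithoutAdmin() context.Context {
	return context.Background()
}

func main() {
	fmt.Println("no key:", isAdmin(ctxWithoutAdmin()))
	fmt.Println("admin=false:", isAdmin(ctxWithAdmin(false)))
	fmt.Println("admin=true:", isAdmin(ctxWithAdmin(true)))
}
```

Checking both the type assertion result and the value is what separates "key present" from "key present and true", which is the regression TestBulkRevoke_AdminExplicitFalse_Returns403 guards against.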
+167 -215
@@ -17,116 +17,116 @@ import (
// MockCertificateService is a mock implementation of CertificateService interface.
type MockCertificateService struct {
ListCertificatesFn func(status, environment, ownerID, teamID, issuerID string, page, perPage int) ([]domain.ManagedCertificate, int64, error)
ListCertificatesWithFilterFn func(filter *repository.CertificateFilter) ([]domain.ManagedCertificate, int, error)
GetCertificateFn func(id string) (*domain.ManagedCertificate, error)
CreateCertificateFn func(cert domain.ManagedCertificate) (*domain.ManagedCertificate, error)
UpdateCertificateFn func(id string, cert domain.ManagedCertificate) (*domain.ManagedCertificate, error)
ArchiveCertificateFn func(id string) error
GetCertificateVersionsFn func(certID string, page, perPage int) ([]domain.CertificateVersion, int64, error)
TriggerRenewalFn func(certID string) error
TriggerDeploymentFn func(certID string, targetID string) error
RevokeCertificateFn func(certID string, reason string) error
GetRevokedCertificatesFn func() ([]*domain.CertificateRevocation, error)
GenerateDERCRLFn func(issuerID string) ([]byte, error)
GetOCSPResponseFn func(issuerID string, serialHex string) ([]byte, error)
GetCertificateDeploymentsFn func(certID string) ([]domain.DeploymentTarget, error)
ListCertificatesFn func(ctx context.Context, status, environment, ownerID, teamID, issuerID string, page, perPage int) ([]domain.ManagedCertificate, int64, error)
ListCertificatesWithFilterFn func(ctx context.Context, filter *repository.CertificateFilter) ([]domain.ManagedCertificate, int, error)
GetCertificateFn func(ctx context.Context, id string) (*domain.ManagedCertificate, error)
CreateCertificateFn func(ctx context.Context, cert domain.ManagedCertificate) (*domain.ManagedCertificate, error)
UpdateCertificateFn func(ctx context.Context, id string, cert domain.ManagedCertificate) (*domain.ManagedCertificate, error)
ArchiveCertificateFn func(ctx context.Context, id string) error
GetCertificateVersionsFn func(ctx context.Context, certID string, page, perPage int) ([]domain.CertificateVersion, int64, error)
TriggerRenewalFn func(ctx context.Context, certID string, actor string) error
TriggerDeploymentFn func(ctx context.Context, certID string, targetID string, actor string) error
RevokeCertificateFn func(ctx context.Context, certID string, reason string, actor string) error
GetRevokedCertificatesFn func(ctx context.Context) ([]*domain.CertificateRevocation, error)
GenerateDERCRLFn func(ctx context.Context, issuerID string) ([]byte, error)
GetOCSPResponseFn func(ctx context.Context, issuerID string, serialHex string) ([]byte, error)
GetCertificateDeploymentsFn func(ctx context.Context, certID string) ([]domain.DeploymentTarget, error)
}
func (m *MockCertificateService) ListCertificates(status, environment, ownerID, teamID, issuerID string, page, perPage int) ([]domain.ManagedCertificate, int64, error) {
func (m *MockCertificateService) ListCertificates(ctx context.Context, status, environment, ownerID, teamID, issuerID string, page, perPage int) ([]domain.ManagedCertificate, int64, error) {
if m.ListCertificatesFn != nil {
return m.ListCertificatesFn(status, environment, ownerID, teamID, issuerID, page, perPage)
return m.ListCertificatesFn(ctx, status, environment, ownerID, teamID, issuerID, page, perPage)
}
return nil, 0, nil
}
func (m *MockCertificateService) GetCertificate(id string) (*domain.ManagedCertificate, error) {
func (m *MockCertificateService) GetCertificate(ctx context.Context, id string) (*domain.ManagedCertificate, error) {
if m.GetCertificateFn != nil {
return m.GetCertificateFn(id)
return m.GetCertificateFn(ctx, id)
}
return nil, nil
}
func (m *MockCertificateService) CreateCertificate(cert domain.ManagedCertificate) (*domain.ManagedCertificate, error) {
func (m *MockCertificateService) CreateCertificate(ctx context.Context, cert domain.ManagedCertificate) (*domain.ManagedCertificate, error) {
if m.CreateCertificateFn != nil {
return m.CreateCertificateFn(cert)
return m.CreateCertificateFn(ctx, cert)
}
return nil, nil
}
func (m *MockCertificateService) UpdateCertificate(id string, cert domain.ManagedCertificate) (*domain.ManagedCertificate, error) {
func (m *MockCertificateService) UpdateCertificate(ctx context.Context, id string, cert domain.ManagedCertificate) (*domain.ManagedCertificate, error) {
if m.UpdateCertificateFn != nil {
return m.UpdateCertificateFn(id, cert)
return m.UpdateCertificateFn(ctx, id, cert)
}
return nil, nil
}
func (m *MockCertificateService) ArchiveCertificate(id string) error {
func (m *MockCertificateService) ArchiveCertificate(ctx context.Context, id string) error {
if m.ArchiveCertificateFn != nil {
return m.ArchiveCertificateFn(id)
return m.ArchiveCertificateFn(ctx, id)
}
return nil
}
func (m *MockCertificateService) GetCertificateVersions(certID string, page, perPage int) ([]domain.CertificateVersion, int64, error) {
func (m *MockCertificateService) GetCertificateVersions(ctx context.Context, certID string, page, perPage int) ([]domain.CertificateVersion, int64, error) {
if m.GetCertificateVersionsFn != nil {
return m.GetCertificateVersionsFn(certID, page, perPage)
return m.GetCertificateVersionsFn(ctx, certID, page, perPage)
}
return nil, 0, nil
}
func (m *MockCertificateService) TriggerRenewal(certID string) error {
func (m *MockCertificateService) TriggerRenewal(ctx context.Context, certID string, actor string) error {
if m.TriggerRenewalFn != nil {
return m.TriggerRenewalFn(certID)
return m.TriggerRenewalFn(ctx, certID, actor)
}
return nil
}
func (m *MockCertificateService) TriggerDeployment(certID string, targetID string) error {
func (m *MockCertificateService) TriggerDeployment(ctx context.Context, certID string, targetID string, actor string) error {
if m.TriggerDeploymentFn != nil {
return m.TriggerDeploymentFn(certID, targetID)
return m.TriggerDeploymentFn(ctx, certID, targetID, actor)
}
return nil
}
func (m *MockCertificateService) RevokeCertificate(certID string, reason string) error {
func (m *MockCertificateService) RevokeCertificate(ctx context.Context, certID string, reason string, actor string) error {
if m.RevokeCertificateFn != nil {
return m.RevokeCertificateFn(certID, reason)
return m.RevokeCertificateFn(ctx, certID, reason, actor)
}
return nil
}
func (m *MockCertificateService) GetRevokedCertificates() ([]*domain.CertificateRevocation, error) {
func (m *MockCertificateService) GetRevokedCertificates(ctx context.Context) ([]*domain.CertificateRevocation, error) {
if m.GetRevokedCertificatesFn != nil {
return m.GetRevokedCertificatesFn()
return m.GetRevokedCertificatesFn(ctx)
}
return nil, nil
}
func (m *MockCertificateService) GenerateDERCRL(issuerID string) ([]byte, error) {
func (m *MockCertificateService) GenerateDERCRL(ctx context.Context, issuerID string) ([]byte, error) {
if m.GenerateDERCRLFn != nil {
return m.GenerateDERCRLFn(issuerID)
return m.GenerateDERCRLFn(ctx, issuerID)
}
return nil, nil
}
func (m *MockCertificateService) GetOCSPResponse(issuerID string, serialHex string) ([]byte, error) {
func (m *MockCertificateService) GetOCSPResponse(ctx context.Context, issuerID string, serialHex string) ([]byte, error) {
if m.GetOCSPResponseFn != nil {
return m.GetOCSPResponseFn(issuerID, serialHex)
return m.GetOCSPResponseFn(ctx, issuerID, serialHex)
}
return nil, nil
}
func (m *MockCertificateService) ListCertificatesWithFilter(filter *repository.CertificateFilter) ([]domain.ManagedCertificate, int, error) {
func (m *MockCertificateService) ListCertificatesWithFilter(ctx context.Context, filter *repository.CertificateFilter) ([]domain.ManagedCertificate, int, error) {
if m.ListCertificatesWithFilterFn != nil {
return m.ListCertificatesWithFilterFn(filter)
return m.ListCertificatesWithFilterFn(ctx, filter)
}
return nil, 0, nil
}
func (m *MockCertificateService) GetCertificateDeployments(certID string) ([]domain.DeploymentTarget, error) {
func (m *MockCertificateService) GetCertificateDeployments(ctx context.Context, certID string) ([]domain.DeploymentTarget, error) {
if m.GetCertificateDeploymentsFn != nil {
return m.GetCertificateDeploymentsFn(certID)
return m.GetCertificateDeploymentsFn(ctx, certID)
}
return nil, nil
}
@@ -158,7 +158,7 @@ func TestListCertificates_Success(t *testing.T) {
}
mock := &MockCertificateService{
ListCertificatesWithFilterFn: func(filter *repository.CertificateFilter) ([]domain.ManagedCertificate, int, error) {
ListCertificatesWithFilterFn: func(_ context.Context, filter *repository.CertificateFilter) ([]domain.ManagedCertificate, int, error) {
if filter.Page == 1 && filter.PerPage == 50 {
return []domain.ManagedCertificate{cert1, cert2}, 2, nil
}
@@ -197,7 +197,7 @@ func TestListCertificates_Success(t *testing.T) {
// Test ListCertificates - with filters
func TestListCertificates_WithFilters(t *testing.T) {
mock := &MockCertificateService{
ListCertificatesWithFilterFn: func(filter *repository.CertificateFilter) ([]domain.ManagedCertificate, int, error) {
ListCertificatesWithFilterFn: func(_ context.Context, filter *repository.CertificateFilter) ([]domain.ManagedCertificate, int, error) {
if filter.Status == "Active" && filter.Environment == "prod" {
return []domain.ManagedCertificate{}, 0, nil
}
@@ -236,7 +236,7 @@ func TestListCertificates_MethodNotAllowed(t *testing.T) {
// Test ListCertificates - service error
func TestListCertificates_ServiceError(t *testing.T) {
mock := &MockCertificateService{
ListCertificatesWithFilterFn: func(filter *repository.CertificateFilter) ([]domain.ManagedCertificate, int, error) {
ListCertificatesWithFilterFn: func(_ context.Context, filter *repository.CertificateFilter) ([]domain.ManagedCertificate, int, error) {
return nil, 0, ErrMockServiceFailed
},
}
@@ -266,7 +266,7 @@ func TestGetCertificate_Success(t *testing.T) {
}
mock := &MockCertificateService{
GetCertificateFn: func(id string) (*domain.ManagedCertificate, error) {
GetCertificateFn: func(_ context.Context, id string) (*domain.ManagedCertificate, error) {
if id == "mc-prod-001" {
return cert, nil
}
@@ -298,7 +298,7 @@ func TestGetCertificate_Success(t *testing.T) {
// Test GetCertificate - not found
func TestGetCertificate_NotFound(t *testing.T) {
mock := &MockCertificateService{
GetCertificateFn: func(id string) (*domain.ManagedCertificate, error) {
GetCertificateFn: func(_ context.Context, id string) (*domain.ManagedCertificate, error) {
return nil, ErrMockNotFound
},
}
@@ -345,7 +345,7 @@ func TestCreateCertificate_Success(t *testing.T) {
}
mock := &MockCertificateService{
CreateCertificateFn: func(cert domain.ManagedCertificate) (*domain.ManagedCertificate, error) {
CreateCertificateFn: func(_ context.Context, cert domain.ManagedCertificate) (*domain.ManagedCertificate, error) {
return created, nil
},
}
@@ -403,7 +403,7 @@ func TestCreateCertificate_InvalidBody(t *testing.T) {
// Test CreateCertificate - service error
func TestCreateCertificate_ServiceError(t *testing.T) {
mock := &MockCertificateService{
CreateCertificateFn: func(cert domain.ManagedCertificate) (*domain.ManagedCertificate, error) {
CreateCertificateFn: func(_ context.Context, cert domain.ManagedCertificate) (*domain.ManagedCertificate, error) {
return nil, ErrMockServiceFailed
},
}
@@ -432,6 +432,66 @@ func TestCreateCertificate_ServiceError(t *testing.T) {
}
}
// TestCreateCertificate_MissingRequiredField_Returns400 pins the C-001 handler
// contract: handler MUST reject a create payload that omits any of the six
// required fields (name, common_name, owner_id, team_id, issuer_id,
// renewal_policy_id) with HTTP 400 before the service is invoked. The mock
// service here would succeed if called; every subtest proving 400 therefore
// proves the handler guard fires.
func TestCreateCertificate_MissingRequiredField_Returns400(t *testing.T) {
baseBody := map[string]interface{}{
"name": "API Prod",
"common_name": "api.example.com",
"owner_id": "o-alice",
"team_id": "t-platform",
"issuer_id": "iss-local",
"renewal_policy_id": "rp-standard",
}
cases := []struct {
name string
missingField string
}{
{"missing name", "name"},
{"missing common_name", "common_name"},
{"missing owner_id", "owner_id"},
{"missing team_id", "team_id"},
{"missing issuer_id", "issuer_id"},
{"missing renewal_policy_id", "renewal_policy_id"},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
body := make(map[string]interface{}, len(baseBody))
for k, v := range baseBody {
body[k] = v
}
delete(body, tc.missingField)
bodyBytes, _ := json.Marshal(body)
mock := &MockCertificateService{
CreateCertificateFn: func(_ context.Context, cert domain.ManagedCertificate) (*domain.ManagedCertificate, error) {
// Would succeed if handler guard did not fire.
cert.ID = "mc-would-be-created"
return &cert, nil
},
}
handler := NewCertificateHandler(mock)
req := httptest.NewRequest(http.MethodPost, "/api/v1/certificates", bytes.NewReader(bodyBytes))
req = req.WithContext(contextWithRequestID())
req.Header.Set("Content-Type", "application/json")
w := httptest.NewRecorder()
handler.CreateCertificate(w, req)
if w.Code != http.StatusBadRequest {
t.Fatalf("%s: expected 400, got %d — body=%s", tc.name, w.Code, w.Body.String())
}
})
}
}
// Test UpdateCertificate - success case
func TestUpdateCertificate_Success(t *testing.T) {
updated := &domain.ManagedCertificate{
@@ -445,7 +505,7 @@ func TestUpdateCertificate_Success(t *testing.T) {
}
mock := &MockCertificateService{
UpdateCertificateFn: func(id string, cert domain.ManagedCertificate) (*domain.ManagedCertificate, error) {
UpdateCertificateFn: func(_ context.Context, id string, cert domain.ManagedCertificate) (*domain.ManagedCertificate, error) {
if id == "mc-prod-001" {
return updated, nil
}
@@ -501,7 +561,7 @@ func TestUpdateCertificate_InvalidBody(t *testing.T) {
// Test ArchiveCertificate - success case
func TestArchiveCertificate_Success(t *testing.T) {
mock := &MockCertificateService{
ArchiveCertificateFn: func(id string) error {
ArchiveCertificateFn: func(_ context.Context, id string) error {
if id == "mc-prod-001" {
return nil
}
@@ -524,7 +584,7 @@ func TestArchiveCertificate_Success(t *testing.T) {
// Test ArchiveCertificate - not found
func TestArchiveCertificate_NotFound(t *testing.T) {
mock := &MockCertificateService{
ArchiveCertificateFn: func(id string) error {
ArchiveCertificateFn: func(_ context.Context, id string) error {
return ErrMockNotFound
},
}
@@ -554,7 +614,7 @@ func TestGetCertificateVersions_Success(t *testing.T) {
}
mock := &MockCertificateService{
GetCertificateVersionsFn: func(certID string, page, perPage int) ([]domain.CertificateVersion, int64, error) {
GetCertificateVersionsFn: func(_ context.Context, certID string, page, perPage int) ([]domain.CertificateVersion, int64, error) {
if certID == "mc-prod-001" {
return []domain.CertificateVersion{ver1}, 1, nil
}
@@ -586,7 +646,7 @@ func TestGetCertificateVersions_Success(t *testing.T) {
// Test GetCertificateVersions - not found
func TestGetCertificateVersions_NotFound(t *testing.T) {
mock := &MockCertificateService{
GetCertificateVersionsFn: func(certID string, page, perPage int) ([]domain.CertificateVersion, int64, error) {
GetCertificateVersionsFn: func(_ context.Context, certID string, page, perPage int) ([]domain.CertificateVersion, int64, error) {
return nil, 0, ErrMockNotFound
},
}
@@ -606,7 +666,7 @@ func TestGetCertificateVersions_NotFound(t *testing.T) {
// Test TriggerRenewal - success case
func TestTriggerRenewal_Success(t *testing.T) {
mock := &MockCertificateService{
TriggerRenewalFn: func(certID string) error {
TriggerRenewalFn: func(_ context.Context, certID string, _ string) error {
if certID == "mc-prod-001" {
return nil
}
@@ -638,7 +698,7 @@ func TestTriggerRenewal_Success(t *testing.T) {
// Test TriggerRenewal - service error
func TestTriggerRenewal_ServiceError(t *testing.T) {
mock := &MockCertificateService{
TriggerRenewalFn: func(certID string) error {
TriggerRenewalFn: func(_ context.Context, certID string, _ string) error {
return ErrMockServiceFailed
},
}
@@ -658,7 +718,7 @@ func TestTriggerRenewal_ServiceError(t *testing.T) {
// Test TriggerDeployment - success case
func TestTriggerDeployment_Success(t *testing.T) {
mock := &MockCertificateService{
TriggerDeploymentFn: func(certID string, targetID string) error {
TriggerDeploymentFn: func(_ context.Context, certID string, targetID string, _ string) error {
if certID == "mc-prod-001" {
return nil
}
@@ -695,7 +755,7 @@ func TestTriggerDeployment_Success(t *testing.T) {
// Test TriggerDeployment - without target ID
func TestTriggerDeployment_NoTargetID(t *testing.T) {
mock := &MockCertificateService{
TriggerDeploymentFn: func(certID string, targetID string) error {
TriggerDeploymentFn: func(_ context.Context, certID string, targetID string, _ string) error {
// Should accept empty targetID (deploy to all)
return nil
},
@@ -716,7 +776,7 @@ func TestTriggerDeployment_NoTargetID(t *testing.T) {
// Test ListCertificates - invalid page parameter
func TestListCertificates_InvalidPageParam(t *testing.T) {
mock := &MockCertificateService{
ListCertificatesWithFilterFn: func(filter *repository.CertificateFilter) ([]domain.ManagedCertificate, int, error) {
ListCertificatesWithFilterFn: func(_ context.Context, filter *repository.CertificateFilter) ([]domain.ManagedCertificate, int, error) {
// Should default to page 1
if filter.Page == 1 {
return []domain.ManagedCertificate{}, 0, nil
@@ -740,7 +800,7 @@ func TestListCertificates_InvalidPageParam(t *testing.T) {
// Test ListCertificates - per_page exceeds max
func TestListCertificates_PerPageExceedsMax(t *testing.T) {
mock := &MockCertificateService{
ListCertificatesWithFilterFn: func(filter *repository.CertificateFilter) ([]domain.ManagedCertificate, int, error) {
ListCertificatesWithFilterFn: func(_ context.Context, filter *repository.CertificateFilter) ([]domain.ManagedCertificate, int, error) {
// per_page above the 500 maximum should fall back to the default of 50
if filter.PerPage == 50 { // defaults to 50 if > 500
return []domain.ManagedCertificate{}, 0, nil
@@ -765,7 +825,7 @@ func TestListCertificates_PerPageExceedsMax(t *testing.T) {
func TestRevokeCertificate_Handler_Success(t *testing.T) {
mock := &MockCertificateService{
RevokeCertificateFn: func(certID string, reason string) error {
RevokeCertificateFn: func(_ context.Context, certID string, reason string, _ string) error {
if certID != "mc-prod-001" {
t.Errorf("expected certID mc-prod-001, got %s", certID)
}
@@ -798,7 +858,7 @@ func TestRevokeCertificate_Handler_Success(t *testing.T) {
func TestRevokeCertificate_Handler_NoBody(t *testing.T) {
mock := &MockCertificateService{
RevokeCertificateFn: func(certID string, reason string) error {
RevokeCertificateFn: func(_ context.Context, certID string, reason string, _ string) error {
// Empty reason is OK — service defaults to "unspecified"
return nil
},
@@ -818,7 +878,7 @@ func TestRevokeCertificate_Handler_NoBody(t *testing.T) {
func TestRevokeCertificate_Handler_AlreadyRevoked(t *testing.T) {
mock := &MockCertificateService{
RevokeCertificateFn: func(certID string, reason string) error {
RevokeCertificateFn: func(_ context.Context, certID string, reason string, _ string) error {
return fmt.Errorf("certificate is already revoked")
},
}
@@ -839,7 +899,7 @@ func TestRevokeCertificate_Handler_AlreadyRevoked(t *testing.T) {
func TestRevokeCertificate_Handler_NotFound(t *testing.T) {
mock := &MockCertificateService{
RevokeCertificateFn: func(certID string, reason string) error {
RevokeCertificateFn: func(_ context.Context, certID string, reason string, _ string) error {
return fmt.Errorf("failed to fetch certificate: not found")
},
}
@@ -858,7 +918,7 @@ func TestRevokeCertificate_Handler_NotFound(t *testing.T) {
func TestRevokeCertificate_Handler_InvalidReason(t *testing.T) {
mock := &MockCertificateService{
RevokeCertificateFn: func(certID string, reason string) error {
RevokeCertificateFn: func(_ context.Context, certID string, reason string, _ string) error {
return fmt.Errorf("invalid revocation reason: badReason")
},
}
@@ -922,7 +982,7 @@ func TestRevokeCertificate_Handler_EmptyID(t *testing.T) {
func TestRevokeCertificate_Handler_CannotRevokeArchived(t *testing.T) {
mock := &MockCertificateService{
RevokeCertificateFn: func(certID string, reason string) error {
RevokeCertificateFn: func(_ context.Context, certID string, reason string, _ string) error {
return fmt.Errorf("cannot revoke archived certificate")
},
}
@@ -941,7 +1001,7 @@ func TestRevokeCertificate_Handler_CannotRevokeArchived(t *testing.T) {
func TestRevokeCertificate_Handler_ServerError(t *testing.T) {
mock := &MockCertificateService{
RevokeCertificateFn: func(certID string, reason string) error {
RevokeCertificateFn: func(_ context.Context, certID string, reason string, _ string) error {
return fmt.Errorf("database connection lost")
},
}
@@ -958,132 +1018,18 @@ func TestRevokeCertificate_Handler_ServerError(t *testing.T) {
}
}
// === CRL Handler Tests ===
func TestGetCRL_Success(t *testing.T) {
mock := &MockCertificateService{
GetRevokedCertificatesFn: func() ([]*domain.CertificateRevocation, error) {
return []*domain.CertificateRevocation{
{
ID: "rev-1",
CertificateID: "cert-1",
SerialNumber: "ABC123",
Reason: "keyCompromise",
RevokedAt: time.Date(2026, 3, 20, 10, 0, 0, 0, time.UTC),
},
{
ID: "rev-2",
CertificateID: "cert-2",
SerialNumber: "DEF456",
Reason: "superseded",
RevokedAt: time.Date(2026, 3, 21, 14, 30, 0, 0, time.UTC),
},
}, nil
},
}
handler := NewCertificateHandler(mock)
req := httptest.NewRequest(http.MethodGet, "/api/v1/crl", nil)
req = req.WithContext(contextWithRequestID())
w := httptest.NewRecorder()
handler.GetCRL(w, req)
if w.Code != http.StatusOK {
t.Errorf("expected status %d, got %d", http.StatusOK, w.Code)
}
var resp map[string]interface{}
json.NewDecoder(w.Body).Decode(&resp)
if resp["version"] != float64(1) {
t.Errorf("expected version 1, got %v", resp["version"])
}
if resp["total"] != float64(2) {
t.Errorf("expected total 2, got %v", resp["total"])
}
entries, ok := resp["entries"].([]interface{})
if !ok {
t.Fatal("expected entries to be an array")
}
if len(entries) != 2 {
t.Errorf("expected 2 entries, got %d", len(entries))
}
entry1 := entries[0].(map[string]interface{})
if entry1["serial_number"] != "ABC123" {
t.Errorf("expected serial ABC123, got %v", entry1["serial_number"])
}
if entry1["revocation_reason"] != "keyCompromise" {
t.Errorf("expected reason keyCompromise, got %v", entry1["revocation_reason"])
}
}
func TestGetCRL_Empty(t *testing.T) {
mock := &MockCertificateService{
GetRevokedCertificatesFn: func() ([]*domain.CertificateRevocation, error) {
return nil, nil
},
}
handler := NewCertificateHandler(mock)
req := httptest.NewRequest(http.MethodGet, "/api/v1/crl", nil)
req = req.WithContext(contextWithRequestID())
w := httptest.NewRecorder()
handler.GetCRL(w, req)
if w.Code != http.StatusOK {
t.Errorf("expected status %d, got %d", http.StatusOK, w.Code)
}
var resp map[string]interface{}
json.NewDecoder(w.Body).Decode(&resp)
if resp["total"] != float64(0) {
t.Errorf("expected total 0, got %v", resp["total"])
}
}
func TestGetCRL_ServiceError(t *testing.T) {
mock := &MockCertificateService{
GetRevokedCertificatesFn: func() ([]*domain.CertificateRevocation, error) {
return nil, fmt.Errorf("revocation repository not configured")
},
}
handler := NewCertificateHandler(mock)
req := httptest.NewRequest(http.MethodGet, "/api/v1/crl", nil)
req = req.WithContext(contextWithRequestID())
w := httptest.NewRecorder()
handler.GetCRL(w, req)
if w.Code != http.StatusInternalServerError {
t.Errorf("expected status %d, got %d", http.StatusInternalServerError, w.Code)
}
}
func TestGetCRL_MethodNotAllowed(t *testing.T) {
mock := &MockCertificateService{}
handler := NewCertificateHandler(mock)
req := httptest.NewRequest(http.MethodPost, "/api/v1/crl", nil)
req = req.WithContext(contextWithRequestID())
w := httptest.NewRecorder()
handler.GetCRL(w, req)
if w.Code != http.StatusMethodNotAllowed {
t.Errorf("expected status %d, got %d", http.StatusMethodNotAllowed, w.Code)
}
}
// M15b: DER CRL and OCSP Handler Tests
// === CRL and OCSP Handler Tests (RFC 5280 / RFC 6960, served under /.well-known/pki/) ===
//
// M-006 relocated these endpoints from /api/v1/crl* and /api/v1/ocsp/* to the
// RFC-compliant /.well-known/pki/ namespace and deleted the non-standard JSON
// CRL endpoint. The DER-encoded X.509 CRL (application/pkix-crl) and the
// DER-encoded OCSP response (application/ocsp-response) are the only wire
// formats certctl supports for revocation data.
func TestGetDERCRL_Success(t *testing.T) {
derCRLData := []byte{0x30, 0x82, 0x01, 0x00} // Mock DER CRL bytes
mock := &MockCertificateService{
GenerateDERCRLFn: func(issuerID string) ([]byte, error) {
GenerateDERCRLFn: func(_ context.Context, issuerID string) ([]byte, error) {
if issuerID == "iss-local" {
return derCRLData, nil
}
@@ -1092,7 +1038,7 @@ func TestGetDERCRL_Success(t *testing.T) {
}
handler := NewCertificateHandler(mock)
req := httptest.NewRequest(http.MethodGet, "/api/v1/crl/iss-local", nil)
req := httptest.NewRequest(http.MethodGet, "/.well-known/pki/crl/iss-local", nil)
req = req.WithContext(contextWithRequestID())
w := httptest.NewRecorder()
@@ -1107,17 +1053,20 @@ func TestGetDERCRL_Success(t *testing.T) {
if len(responseBody) == 0 {
t.Error("expected non-empty response body")
}
if ct := w.Header().Get("Content-Type"); ct != "application/pkix-crl" {
t.Errorf("expected Content-Type application/pkix-crl, got %q", ct)
}
}
func TestGetDERCRL_IssuerNotFound(t *testing.T) {
mock := &MockCertificateService{
GenerateDERCRLFn: func(issuerID string) ([]byte, error) {
GenerateDERCRLFn: func(_ context.Context, issuerID string) ([]byte, error) {
return nil, fmt.Errorf("issuer not found")
},
}
handler := NewCertificateHandler(mock)
req := httptest.NewRequest(http.MethodGet, "/api/v1/crl/nonexistent", nil)
req := httptest.NewRequest(http.MethodGet, "/.well-known/pki/crl/nonexistent", nil)
req = req.WithContext(contextWithRequestID())
w := httptest.NewRecorder()
@@ -1130,13 +1079,13 @@ func TestGetDERCRL_IssuerNotFound(t *testing.T) {
func TestGetDERCRL_NotSupported(t *testing.T) {
mock := &MockCertificateService{
GenerateDERCRLFn: func(issuerID string) ([]byte, error) {
GenerateDERCRLFn: func(_ context.Context, issuerID string) ([]byte, error) {
return nil, fmt.Errorf("issuer does not support CRL generation")
},
}
handler := NewCertificateHandler(mock)
req := httptest.NewRequest(http.MethodGet, "/api/v1/crl/iss-acme", nil)
req := httptest.NewRequest(http.MethodGet, "/.well-known/pki/crl/iss-acme", nil)
req = req.WithContext(contextWithRequestID())
w := httptest.NewRecorder()
@@ -1151,7 +1100,7 @@ func TestGetDERCRL_NotSupported(t *testing.T) {
func TestGetDERCRL_MethodNotAllowed(t *testing.T) {
mock := &MockCertificateService{}
handler := NewCertificateHandler(mock)
req := httptest.NewRequest(http.MethodPost, "/api/v1/crl/iss-local", nil)
req := httptest.NewRequest(http.MethodPost, "/.well-known/pki/crl/iss-local", nil)
req = req.WithContext(contextWithRequestID())
w := httptest.NewRecorder()
@@ -1165,7 +1114,7 @@ func TestGetDERCRL_MethodNotAllowed(t *testing.T) {
func TestHandleOCSP_Success(t *testing.T) {
ocspResponseBytes := []byte{0x30, 0x82, 0x02, 0x00} // Mock OCSP response
mock := &MockCertificateService{
GetOCSPResponseFn: func(issuerID string, serialHex string) ([]byte, error) {
GetOCSPResponseFn: func(_ context.Context, issuerID string, serialHex string) ([]byte, error) {
if issuerID == "iss-local" && serialHex == "12345" {
return ocspResponseBytes, nil
}
@@ -1174,7 +1123,7 @@ func TestHandleOCSP_Success(t *testing.T) {
}
handler := NewCertificateHandler(mock)
req := httptest.NewRequest(http.MethodGet, "/api/v1/ocsp/iss-local/12345", nil)
req := httptest.NewRequest(http.MethodGet, "/.well-known/pki/ocsp/iss-local/12345", nil)
req = req.WithContext(contextWithRequestID())
w := httptest.NewRecorder()
@@ -1188,12 +1137,15 @@ func TestHandleOCSP_Success(t *testing.T) {
if len(responseBody) == 0 {
t.Error("expected non-empty OCSP response body")
}
if ct := w.Header().Get("Content-Type"); ct != "application/ocsp-response" {
t.Errorf("expected Content-Type application/ocsp-response, got %q", ct)
}
}
func TestHandleOCSP_MissingSerial(t *testing.T) {
mock := &MockCertificateService{}
handler := NewCertificateHandler(mock)
req := httptest.NewRequest(http.MethodGet, "/api/v1/ocsp/iss-local/", nil)
req := httptest.NewRequest(http.MethodGet, "/.well-known/pki/ocsp/iss-local/", nil)
req = req.WithContext(contextWithRequestID())
w := httptest.NewRecorder()
@@ -1206,13 +1158,13 @@ func TestHandleOCSP_MissingSerial(t *testing.T) {
func TestHandleOCSP_IssuerNotFound(t *testing.T) {
mock := &MockCertificateService{
GetOCSPResponseFn: func(issuerID string, serialHex string) ([]byte, error) {
GetOCSPResponseFn: func(_ context.Context, issuerID string, serialHex string) ([]byte, error) {
return nil, fmt.Errorf("issuer not found")
},
}
handler := NewCertificateHandler(mock)
req := httptest.NewRequest(http.MethodGet, "/api/v1/ocsp/nonexistent/ABC123", nil)
req := httptest.NewRequest(http.MethodGet, "/.well-known/pki/ocsp/nonexistent/ABC123", nil)
req = req.WithContext(contextWithRequestID())
w := httptest.NewRecorder()
@@ -1225,13 +1177,13 @@ func TestHandleOCSP_IssuerNotFound(t *testing.T) {
func TestHandleOCSP_CertNotFound(t *testing.T) {
mock := &MockCertificateService{
GetOCSPResponseFn: func(issuerID string, serialHex string) ([]byte, error) {
GetOCSPResponseFn: func(_ context.Context, issuerID string, serialHex string) ([]byte, error) {
return nil, fmt.Errorf("certificate not found")
},
}
handler := NewCertificateHandler(mock)
req := httptest.NewRequest(http.MethodGet, "/api/v1/ocsp/iss-local/UNKNOWN", nil)
req := httptest.NewRequest(http.MethodGet, "/.well-known/pki/ocsp/iss-local/UNKNOWN", nil)
req = req.WithContext(contextWithRequestID())
w := httptest.NewRecorder()
@@ -1245,7 +1197,7 @@ func TestHandleOCSP_CertNotFound(t *testing.T) {
func TestHandleOCSP_MethodNotAllowed(t *testing.T) {
mock := &MockCertificateService{}
handler := NewCertificateHandler(mock)
req := httptest.NewRequest(http.MethodPost, "/api/v1/ocsp/iss-local/12345", nil)
req := httptest.NewRequest(http.MethodPost, "/.well-known/pki/ocsp/iss-local/12345", nil)
req = req.WithContext(contextWithRequestID())
w := httptest.NewRecorder()
@@ -1261,7 +1213,7 @@ func TestHandleOCSP_MethodNotAllowed(t *testing.T) {
// TestListCertificates_SortParam tests sort parameter parsing and passing to service.
func TestListCertificates_SortParam(t *testing.T) {
mock := &MockCertificateService{
ListCertificatesWithFilterFn: func(filter *repository.CertificateFilter) ([]domain.ManagedCertificate, int, error) {
ListCertificatesWithFilterFn: func(_ context.Context, filter *repository.CertificateFilter) ([]domain.ManagedCertificate, int, error) {
// Handler strips the '-' prefix and sets SortDesc = true
if filter.Sort != "notAfter" || !filter.SortDesc {
t.Errorf("expected sort=notAfter desc=true, got sort=%s desc=%v", filter.Sort, filter.SortDesc)
@@ -1284,7 +1236,7 @@ func TestListCertificates_SortParam(t *testing.T) {
// TestListCertificates_SortParam_Ascending tests sort parameter without '-' prefix (ascending).
func TestListCertificates_SortParam_Ascending(t *testing.T) {
mock := &MockCertificateService{
ListCertificatesWithFilterFn: func(filter *repository.CertificateFilter) ([]domain.ManagedCertificate, int, error) {
ListCertificatesWithFilterFn: func(_ context.Context, filter *repository.CertificateFilter) ([]domain.ManagedCertificate, int, error) {
if filter.Sort != "createdAt" || filter.SortDesc {
t.Errorf("expected sort=createdAt desc=false, got sort=%s desc=%v", filter.Sort, filter.SortDesc)
}
@@ -1309,7 +1261,7 @@ func TestListCertificates_TimeRangeFilters(t *testing.T) {
after := time.Now().AddDate(0, 0, -90)
mock := &MockCertificateService{
ListCertificatesWithFilterFn: func(filter *repository.CertificateFilter) ([]domain.ManagedCertificate, int, error) {
ListCertificatesWithFilterFn: func(_ context.Context, filter *repository.CertificateFilter) ([]domain.ManagedCertificate, int, error) {
if filter.ExpiresBefore == nil {
t.Error("expected ExpiresBefore to be set")
}
@@ -1339,7 +1291,7 @@ func TestListCertificates_CreatedAfterFilter(t *testing.T) {
past := time.Now().AddDate(-1, 0, 0)
mock := &MockCertificateService{
ListCertificatesWithFilterFn: func(filter *repository.CertificateFilter) ([]domain.ManagedCertificate, int, error) {
ListCertificatesWithFilterFn: func(_ context.Context, filter *repository.CertificateFilter) ([]domain.ManagedCertificate, int, error) {
if filter.CreatedAfter == nil {
t.Error("expected CreatedAfter to be set")
}
@@ -1369,7 +1321,7 @@ func TestListCertificates_CursorPagination(t *testing.T) {
}
mock := &MockCertificateService{
ListCertificatesWithFilterFn: func(filter *repository.CertificateFilter) ([]domain.ManagedCertificate, int, error) {
ListCertificatesWithFilterFn: func(_ context.Context, filter *repository.CertificateFilter) ([]domain.ManagedCertificate, int, error) {
return []domain.ManagedCertificate{cert}, 1, nil
},
}
@@ -1409,7 +1361,7 @@ func TestListCertificates_SparseFields(t *testing.T) {
}
mock := &MockCertificateService{
ListCertificatesWithFilterFn: func(filter *repository.CertificateFilter) ([]domain.ManagedCertificate, int, error) {
ListCertificatesWithFilterFn: func(_ context.Context, filter *repository.CertificateFilter) ([]domain.ManagedCertificate, int, error) {
if len(filter.Fields) != 2 {
t.Errorf("expected 2 fields, got %d", len(filter.Fields))
}
@@ -1456,7 +1408,7 @@ func TestListCertificates_SparseFields(t *testing.T) {
// TestListCertificates_ProfileFilter tests profile_id filter.
func TestListCertificates_ProfileFilter(t *testing.T) {
mock := &MockCertificateService{
ListCertificatesWithFilterFn: func(filter *repository.CertificateFilter) ([]domain.ManagedCertificate, int, error) {
ListCertificatesWithFilterFn: func(_ context.Context, filter *repository.CertificateFilter) ([]domain.ManagedCertificate, int, error) {
if filter.ProfileID != "prof-standard" {
t.Errorf("expected ProfileID=prof-standard, got %s", filter.ProfileID)
}
@@ -1479,7 +1431,7 @@ func TestListCertificates_ProfileFilter(t *testing.T) {
// TestListCertificates_AgentIDFilter tests agent_id filter.
func TestListCertificates_AgentIDFilter(t *testing.T) {
mock := &MockCertificateService{
ListCertificatesWithFilterFn: func(filter *repository.CertificateFilter) ([]domain.ManagedCertificate, int, error) {
ListCertificatesWithFilterFn: func(_ context.Context, filter *repository.CertificateFilter) ([]domain.ManagedCertificate, int, error) {
if filter.AgentID != "agent-prod-001" {
t.Errorf("expected AgentID=agent-prod-001, got %s", filter.AgentID)
}
@@ -1502,7 +1454,7 @@ func TestListCertificates_AgentIDFilter(t *testing.T) {
// TestListCertificates_CombinedFilters tests multiple filters together.
func TestListCertificates_CombinedFilters(t *testing.T) {
mock := &MockCertificateService{
ListCertificatesWithFilterFn: func(filter *repository.CertificateFilter) ([]domain.ManagedCertificate, int, error) {
ListCertificatesWithFilterFn: func(_ context.Context, filter *repository.CertificateFilter) ([]domain.ManagedCertificate, int, error) {
if filter.Status != "Active" || filter.Environment != "production" || filter.ProfileID != "prof-standard" {
t.Error("expected all filters to be set")
}
@@ -1540,7 +1492,7 @@ func TestGetCertificateDeployments_Success(t *testing.T) {
}
mock := &MockCertificateService{
GetCertificateDeploymentsFn: func(certID string) ([]domain.DeploymentTarget, error) {
GetCertificateDeploymentsFn: func(_ context.Context, certID string) ([]domain.DeploymentTarget, error) {
if certID != "mc-prod-001" {
return nil, ErrMockNotFound
}
@@ -1576,7 +1528,7 @@ func TestGetCertificateDeployments_Success(t *testing.T) {
// TestGetCertificateDeployments_NotFound tests 404 for nonexistent certificate.
func TestGetCertificateDeployments_NotFound(t *testing.T) {
mock := &MockCertificateService{
GetCertificateDeploymentsFn: func(certID string) ([]domain.DeploymentTarget, error) {
GetCertificateDeploymentsFn: func(_ context.Context, certID string) ([]domain.DeploymentTarget, error) {
return nil, fmt.Errorf("certificate not found")
},
}
@@ -1596,7 +1548,7 @@ func TestGetCertificateDeployments_NotFound(t *testing.T) {
// TestGetCertificateDeployments_Empty tests successful response with no deployments.
func TestGetCertificateDeployments_Empty(t *testing.T) {
mock := &MockCertificateService{
GetCertificateDeploymentsFn: func(certID string) ([]domain.DeploymentTarget, error) {
GetCertificateDeploymentsFn: func(_ context.Context, certID string) ([]domain.DeploymentTarget, error) {
if certID == "mc-no-deployments" {
return []domain.DeploymentTarget{}, nil
}
+46 -73
@@ -1,6 +1,7 @@
package handler
import (
"context"
"encoding/json"
"log/slog"
"net/http"
@@ -15,20 +16,20 @@ import (
// CertificateService defines the service interface for certificate operations.
type CertificateService interface {
ListCertificates(status, environment, ownerID, teamID, issuerID string, page, perPage int) ([]domain.ManagedCertificate, int64, error)
ListCertificatesWithFilter(filter *repository.CertificateFilter) ([]domain.ManagedCertificate, int, error)
GetCertificate(id string) (*domain.ManagedCertificate, error)
CreateCertificate(cert domain.ManagedCertificate) (*domain.ManagedCertificate, error)
UpdateCertificate(id string, cert domain.ManagedCertificate) (*domain.ManagedCertificate, error)
ArchiveCertificate(id string) error
GetCertificateVersions(certID string, page, perPage int) ([]domain.CertificateVersion, int64, error)
TriggerRenewal(certID string) error
TriggerDeployment(certID string, targetID string) error
RevokeCertificate(certID string, reason string) error
GetRevokedCertificates() ([]*domain.CertificateRevocation, error)
GenerateDERCRL(issuerID string) ([]byte, error)
GetOCSPResponse(issuerID string, serialHex string) ([]byte, error)
GetCertificateDeployments(certID string) ([]domain.DeploymentTarget, error)
ListCertificates(ctx context.Context, status, environment, ownerID, teamID, issuerID string, page, perPage int) ([]domain.ManagedCertificate, int64, error)
ListCertificatesWithFilter(ctx context.Context, filter *repository.CertificateFilter) ([]domain.ManagedCertificate, int, error)
GetCertificate(ctx context.Context, id string) (*domain.ManagedCertificate, error)
CreateCertificate(ctx context.Context, cert domain.ManagedCertificate) (*domain.ManagedCertificate, error)
UpdateCertificate(ctx context.Context, id string, cert domain.ManagedCertificate) (*domain.ManagedCertificate, error)
ArchiveCertificate(ctx context.Context, id string) error
GetCertificateVersions(ctx context.Context, certID string, page, perPage int) ([]domain.CertificateVersion, int64, error)
TriggerRenewal(ctx context.Context, certID string, actor string) error
TriggerDeployment(ctx context.Context, certID string, targetID string, actor string) error
RevokeCertificate(ctx context.Context, certID string, reason string, actor string) error
GetRevokedCertificates(ctx context.Context) ([]*domain.CertificateRevocation, error)
GenerateDERCRL(ctx context.Context, issuerID string) ([]byte, error)
GetOCSPResponse(ctx context.Context, issuerID string, serialHex string) ([]byte, error)
GetCertificateDeployments(ctx context.Context, certID string) ([]domain.DeploymentTarget, error)
}
// CertificateHandler handles HTTP requests for certificate operations.
@@ -128,7 +129,7 @@ func (h CertificateHandler) ListCertificates(w http.ResponseWriter, r *http.Requ
filter.Fields = strings.Split(fieldsStr, ",")
}
certs, total, err := h.svc.ListCertificatesWithFilter(filter)
certs, total, err := h.svc.ListCertificatesWithFilter(r.Context(), filter)
if err != nil {
ErrorWithRequestID(w, http.StatusInternalServerError, "Failed to list certificates", requestID)
return
@@ -186,7 +187,7 @@ func (h CertificateHandler) GetCertificate(w http.ResponseWriter, r *http.Reques
return
}
cert, err := h.svc.GetCertificate(id)
cert, err := h.svc.GetCertificate(r.Context(), id)
if err != nil {
ErrorWithRequestID(w, http.StatusNotFound, "Certificate not found", requestID)
return
@@ -241,7 +242,7 @@ func (h CertificateHandler) CreateCertificate(w http.ResponseWriter, r *http.Req
return
}
created, err := h.svc.CreateCertificate(cert)
created, err := h.svc.CreateCertificate(r.Context(), cert)
if err != nil {
slog.Error("failed to create certificate", "error", err, "request_id", requestID, "common_name", cert.CommonName, "name", cert.Name)
ErrorWithRequestID(w, http.StatusInternalServerError, "Failed to create certificate", requestID)
@@ -295,7 +296,7 @@ func (h CertificateHandler) UpdateCertificate(w http.ResponseWriter, r *http.Req
}
}
updated, err := h.svc.UpdateCertificate(id, cert)
updated, err := h.svc.UpdateCertificate(r.Context(), id, cert)
if err != nil {
if strings.Contains(err.Error(), "not found") {
ErrorWithRequestID(w, http.StatusNotFound, "Certificate not found", requestID)
@@ -325,7 +326,7 @@ func (h CertificateHandler) ArchiveCertificate(w http.ResponseWriter, r *http.Re
return
}
if err := h.svc.ArchiveCertificate(id); err != nil {
if err := h.svc.ArchiveCertificate(r.Context(), id); err != nil {
if strings.Contains(err.Error(), "not found") {
ErrorWithRequestID(w, http.StatusNotFound, "Certificate not found", requestID)
return
@@ -370,7 +371,7 @@ func (h CertificateHandler) GetCertificateVersions(w http.ResponseWriter, r *htt
}
}
versions, total, err := h.svc.GetCertificateVersions(certID, page, perPage)
versions, total, err := h.svc.GetCertificateVersions(r.Context(), certID, page, perPage)
if err != nil {
if strings.Contains(err.Error(), "not found") {
ErrorWithRequestID(w, http.StatusNotFound, "Certificate not found", requestID)
@@ -410,7 +411,9 @@ func (h CertificateHandler) TriggerRenewal(w http.ResponseWriter, r *http.Reques
}
certID := parts[0]
if err := h.svc.TriggerRenewal(certID); err != nil {
actor := resolveActor(r.Context())
if err := h.svc.TriggerRenewal(r.Context(), certID, actor); err != nil {
errMsg := err.Error()
if strings.Contains(errMsg, "not found") {
ErrorWithRequestID(w, http.StatusNotFound, "Certificate not found", requestID)
@@ -466,7 +469,9 @@ func (h CertificateHandler) TriggerDeployment(w http.ResponseWriter, r *http.Req
}
}
if err := h.svc.TriggerDeployment(certID, req.TargetID); err != nil {
actor := resolveActor(r.Context())
if err := h.svc.TriggerDeployment(r.Context(), certID, req.TargetID, actor); err != nil {
ErrorWithRequestID(w, http.StatusInternalServerError, "Failed to trigger deployment", requestID)
return
}
@@ -508,7 +513,9 @@ func (h CertificateHandler) RevokeCertificate(w http.ResponseWriter, r *http.Req
}
}
if err := h.svc.RevokeCertificate(certID, req.Reason); err != nil {
actor := resolveActor(r.Context())
if err := h.svc.RevokeCertificate(r.Context(), certID, req.Reason, actor); err != nil {
// Distinguish between client errors and server errors
errMsg := err.Error()
if strings.Contains(errMsg, "already revoked") ||
@@ -528,49 +535,12 @@ func (h CertificateHandler) RevokeCertificate(w http.ResponseWriter, r *http.Req
JSON(w, http.StatusOK, map[string]string{"status": "revoked"})
}
// GetCRL returns the Certificate Revocation List as structured JSON.
// GET /api/v1/crl
// Note: DER-encoded X.509 CRL generation (requiring CA key access) is planned for M15b
// alongside the embedded OCSP responder. This endpoint provides the same data in JSON format.
func (h CertificateHandler) GetCRL(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodGet {
Error(w, http.StatusMethodNotAllowed, "Method not allowed")
return
}
requestID := middleware.GetRequestID(r.Context())
revocations, err := h.svc.GetRevokedCertificates()
if err != nil {
ErrorWithRequestID(w, http.StatusInternalServerError, "Failed to generate CRL", requestID)
return
}
type CRLEntry struct {
SerialNumber string `json:"serial_number"`
RevocationDate string `json:"revocation_date"`
RevocationReason string `json:"revocation_reason"`
}
entries := make([]CRLEntry, 0, len(revocations))
for _, rev := range revocations {
entries = append(entries, CRLEntry{
SerialNumber: rev.SerialNumber,
RevocationDate: rev.RevokedAt.Format("2006-01-02T15:04:05Z"),
RevocationReason: rev.Reason,
})
}
JSON(w, http.StatusOK, map[string]interface{}{
"version": 1,
"entries": entries,
"total": len(entries),
"generated_at": time.Now().UTC().Format("2006-01-02T15:04:05Z"),
})
}
// GetDERCRL returns a DER-encoded X.509 CRL signed by the specified issuer.
// GET /api/v1/crl/{issuer_id}
// GET /.well-known/pki/crl/{issuer_id}
//
// RFC 5280 § 5. Served unauthenticated under the /.well-known/pki/ namespace so
// relying parties (browsers, OpenSSL, OCSP stapling sidecars) can fetch the CRL
// without presenting certctl API credentials.
func (h CertificateHandler) GetDERCRL(w http.ResponseWriter, r *http.Request) {
requestID, _ := r.Context().Value("request_id").(string)
@@ -579,13 +549,13 @@ func (h CertificateHandler) GetDERCRL(w http.ResponseWriter, r *http.Request) {
return
}
issuerID := strings.TrimPrefix(r.URL.Path, "/api/v1/crl/")
issuerID := strings.TrimPrefix(r.URL.Path, "/.well-known/pki/crl/")
if issuerID == "" {
ErrorWithRequestID(w, http.StatusBadRequest, "Issuer ID is required", requestID)
return
}
derBytes, err := h.svc.GenerateDERCRL(issuerID)
derBytes, err := h.svc.GenerateDERCRL(r.Context(), issuerID)
if err != nil {
errMsg := err.Error()
if strings.Contains(errMsg, "not found") {
@@ -607,8 +577,11 @@ func (h CertificateHandler) GetDERCRL(w http.ResponseWriter, r *http.Request) {
}
// HandleOCSP processes OCSP requests.
// GET /api/v1/ocsp/{issuer_id}/{serial_hex}
// For simplicity, use GET with path params instead of binary POST.
// GET /.well-known/pki/ocsp/{issuer_id}/{serial_hex}
//
// RFC 6960. Served unauthenticated under the /.well-known/pki/ namespace. For
// simplicity we accept GET with path params rather than the binary POST body
// form — the response is a valid DER-encoded OCSP response either way.
func (h CertificateHandler) HandleOCSP(w http.ResponseWriter, r *http.Request) {
requestID, _ := r.Context().Value("request_id").(string)
@@ -617,8 +590,8 @@ func (h CertificateHandler) HandleOCSP(w http.ResponseWriter, r *http.Request) {
return
}
// Extract issuer_id and serial from path: /api/v1/ocsp/{issuer_id}/{serial_hex}
path := strings.TrimPrefix(r.URL.Path, "/api/v1/ocsp/")
// Extract issuer_id and serial from path: /.well-known/pki/ocsp/{issuer_id}/{serial_hex}
path := strings.TrimPrefix(r.URL.Path, "/.well-known/pki/ocsp/")
parts := strings.SplitN(path, "/", 2)
if len(parts) < 2 || parts[0] == "" || parts[1] == "" {
ErrorWithRequestID(w, http.StatusBadRequest, "Issuer ID and serial number are required", requestID)
@@ -627,7 +600,7 @@ func (h CertificateHandler) HandleOCSP(w http.ResponseWriter, r *http.Request) {
issuerID := parts[0]
serialHex := parts[1]
derBytes, err := h.svc.GetOCSPResponse(issuerID, serialHex)
derBytes, err := h.svc.GetOCSPResponse(r.Context(), issuerID, serialHex)
if err != nil {
errMsg := err.Error()
if strings.Contains(errMsg, "not found") {
@@ -667,7 +640,7 @@ func (h CertificateHandler) GetCertificateDeployments(w http.ResponseWriter, r *
}
certID := parts[0]
deployments, err := h.svc.GetCertificateDeployments(certID)
deployments, err := h.svc.GetCertificateDeployments(r.Context(), certID)
if err != nil {
errMsg := err.Error()
if strings.Contains(errMsg, "not found") {
+9 -4
@@ -11,12 +11,17 @@ import (
)
// DiscoveryService defines the interface used by the discovery handler.
// ClaimDiscovered and DismissDiscovered accept an explicit actor parameter so
// the handler can flow the authenticated named-key identity into the audit
// trail (M-005). Services that call these methods from non-request contexts
// pass a descriptive sentinel (e.g., "system") or "" (which falls back to
// "api").
type DiscoveryService interface {
ProcessDiscoveryReport(ctx context.Context, report *domain.DiscoveryReport) (*domain.DiscoveryScan, error)
ListDiscovered(ctx context.Context, agentID, status string, page, perPage int) ([]*domain.DiscoveredCertificate, int, error)
GetDiscovered(ctx context.Context, id string) (*domain.DiscoveredCertificate, error)
ClaimDiscovered(ctx context.Context, id string, managedCertID string) error
DismissDiscovered(ctx context.Context, id string) error
ClaimDiscovered(ctx context.Context, id string, managedCertID string, actor string) error
DismissDiscovered(ctx context.Context, id string, actor string) error
ListScans(ctx context.Context, agentID string, page, perPage int) ([]*domain.DiscoveryScan, int, error)
GetScan(ctx context.Context, id string) (*domain.DiscoveryScan, error)
GetDiscoverySummary(ctx context.Context) (map[string]int, error)
@@ -142,7 +147,7 @@ func (h DiscoveryHandler) ClaimDiscovered(w http.ResponseWriter, r *http.Request
return
}
if err := h.svc.ClaimDiscovered(r.Context(), id, body.ManagedCertificateID); err != nil {
if err := h.svc.ClaimDiscovered(r.Context(), id, body.ManagedCertificateID, resolveActor(r.Context())); err != nil {
Error(w, http.StatusInternalServerError, fmt.Sprintf("failed to claim certificate: %v", err))
return
}
@@ -166,7 +171,7 @@ func (h DiscoveryHandler) DismissDiscovered(w http.ResponseWriter, r *http.Reque
return
}
if err := h.svc.DismissDiscovered(r.Context(), id); err != nil {
if err := h.svc.DismissDiscovered(r.Context(), id, resolveActor(r.Context())); err != nil {
Error(w, http.StatusInternalServerError, fmt.Sprintf("failed to dismiss certificate: %v", err))
return
}
+10 -10
@@ -19,8 +19,8 @@ type MockDiscoveryService struct {
ProcessDiscoveryReportFn func(ctx context.Context, report *domain.DiscoveryReport) (*domain.DiscoveryScan, error)
ListDiscoveredFn func(ctx context.Context, agentID, status string, page, perPage int) ([]*domain.DiscoveredCertificate, int, error)
GetDiscoveredFn func(ctx context.Context, id string) (*domain.DiscoveredCertificate, error)
ClaimDiscoveredFn func(ctx context.Context, id string, managedCertID string) error
DismissDiscoveredFn func(ctx context.Context, id string) error
ClaimDiscoveredFn func(ctx context.Context, id string, managedCertID string, actor string) error
DismissDiscoveredFn func(ctx context.Context, id string, actor string) error
ListScansFn func(ctx context.Context, agentID string, page, perPage int) ([]*domain.DiscoveryScan, int, error)
GetScanFn func(ctx context.Context, id string) (*domain.DiscoveryScan, error)
GetDiscoverySummaryFn func(ctx context.Context) (map[string]int, error)
@@ -47,16 +47,16 @@ func (m *MockDiscoveryService) GetDiscovered(ctx context.Context, id string) (*d
return nil, nil
}
func (m *MockDiscoveryService) ClaimDiscovered(ctx context.Context, id string, managedCertID string) error {
func (m *MockDiscoveryService) ClaimDiscovered(ctx context.Context, id string, managedCertID string, actor string) error {
if m.ClaimDiscoveredFn != nil {
return m.ClaimDiscoveredFn(ctx, id, managedCertID)
return m.ClaimDiscoveredFn(ctx, id, managedCertID, actor)
}
return nil
}
func (m *MockDiscoveryService) DismissDiscovered(ctx context.Context, id string) error {
func (m *MockDiscoveryService) DismissDiscovered(ctx context.Context, id string, actor string) error {
if m.DismissDiscoveredFn != nil {
return m.DismissDiscoveredFn(ctx, id)
return m.DismissDiscoveredFn(ctx, id, actor)
}
return nil
}
@@ -352,7 +352,7 @@ func TestGetDiscovered_NotFound(t *testing.T) {
// Test ClaimDiscovered - success case
func TestClaimDiscovered_Success(t *testing.T) {
mock := &MockDiscoveryService{
ClaimDiscoveredFn: func(ctx context.Context, id string, managedCertID string) error {
ClaimDiscoveredFn: func(ctx context.Context, id string, managedCertID string, actor string) error {
if id == "dcert-1" && managedCertID == "mc-prod-1" {
return nil
}
@@ -411,7 +411,7 @@ func TestClaimDiscovered_MissingManagedCertID(t *testing.T) {
// Test ClaimDiscovered - discovered cert not found
func TestClaimDiscovered_NotFound(t *testing.T) {
mock := &MockDiscoveryService{
ClaimDiscoveredFn: func(ctx context.Context, id string, managedCertID string) error {
ClaimDiscoveredFn: func(ctx context.Context, id string, managedCertID string, actor string) error {
return fmt.Errorf("discovered certificate not found")
},
}
@@ -438,7 +438,7 @@ func TestClaimDiscovered_NotFound(t *testing.T) {
// Test DismissDiscovered - success case
func TestDismissDiscovered_Success(t *testing.T) {
mock := &MockDiscoveryService{
DismissDiscoveredFn: func(ctx context.Context, id string) error {
DismissDiscoveredFn: func(ctx context.Context, id string, actor string) error {
if id == "dcert-1" {
return nil
}
@@ -614,7 +614,7 @@ func TestGetDiscoverySummary_MethodNotAllowed(t *testing.T) {
// Test DismissDiscovered - service error
func TestDismissDiscovered_ServiceError(t *testing.T) {
mock := &MockDiscoveryService{
DismissDiscoveredFn: func(ctx context.Context, id string) error {
DismissDiscoveredFn: func(ctx context.Context, id string, actor string) error {
return fmt.Errorf("database error")
},
}
+8 -134
@@ -12,6 +12,7 @@ import (
"github.com/shankar0123/certctl/internal/api/middleware"
"github.com/shankar0123/certctl/internal/domain"
"github.com/shankar0123/certctl/internal/pkcs7"
)
// ESTService defines the service interface for EST enrollment operations.
@@ -67,7 +68,7 @@ func (h ESTHandler) CACerts(w http.ResponseWriter, r *http.Request) {
}
// Parse PEM to DER for PKCS#7 encoding
derCerts, err := pemToDERChain(caCertPEM)
derCerts, err := pkcs7.PEMToDERChain(caCertPEM)
if err != nil {
requestID := middleware.GetRequestID(r.Context())
ErrorWithRequestID(w, http.StatusInternalServerError, "Failed to encode CA certificates", requestID)
@@ -75,7 +76,7 @@ func (h ESTHandler) CACerts(w http.ResponseWriter, r *http.Request) {
}
// Build a simple PKCS#7 SignedData (certs-only, degenerate) structure
pkcs7Data, err := buildCertsOnlyPKCS7(derCerts)
pkcs7Data, err := pkcs7.BuildCertsOnlyPKCS7(derCerts)
if err != nil {
requestID := middleware.GetRequestID(r.Context())
ErrorWithRequestID(w, http.StatusInternalServerError, "Failed to build PKCS#7 response", requestID)
@@ -237,7 +238,7 @@ func (h ESTHandler) writeCertResponse(w http.ResponseWriter, result *domain.ESTE
var derCerts [][]byte
// Add the issued certificate
certDER, err := pemToDERChain(result.CertPEM)
certDER, err := pkcs7.PEMToDERChain(result.CertPEM)
if err != nil || len(certDER) == 0 {
http.Error(w, "Failed to encode certificate", http.StatusInternalServerError)
return
@@ -246,14 +247,14 @@ func (h ESTHandler) writeCertResponse(w http.ResponseWriter, result *domain.ESTE
// Add the CA chain if present
if result.ChainPEM != "" {
chainDER, err := pemToDERChain(result.ChainPEM)
chainDER, err := pkcs7.PEMToDERChain(result.ChainPEM)
if err == nil {
derCerts = append(derCerts, chainDER...)
}
}
// Build PKCS#7 certs-only
pkcs7Data, err := buildCertsOnlyPKCS7(derCerts)
pkcs7Data, err := pkcs7.BuildCertsOnlyPKCS7(derCerts)
if err != nil {
http.Error(w, "Failed to build PKCS#7 response", http.StatusInternalServerError)
return
@@ -273,132 +274,5 @@ func (h ESTHandler) writeCertResponse(w http.ResponseWriter, result *domain.ESTE
}
}
// pemToDERChain converts PEM-encoded certificates to a slice of DER-encoded certificates.
func pemToDERChain(pemData string) ([][]byte, error) {
var derCerts [][]byte
rest := []byte(pemData)
for {
var block *pem.Block
block, rest = pem.Decode(rest)
if block == nil {
break
}
if block.Type == "CERTIFICATE" {
derCerts = append(derCerts, block.Bytes)
}
}
if len(derCerts) == 0 {
return nil, fmt.Errorf("no certificates found in PEM data")
}
return derCerts, nil
}
// buildCertsOnlyPKCS7 creates a degenerate PKCS#7 SignedData structure containing only certificates.
// This is the "certs-only" format specified in RFC 7030 Section 4.1.3 for /cacerts responses
// and enrollment responses.
//
// ASN.1 structure (simplified):
//
// ContentInfo {
// contentType: signedData (1.2.840.113549.1.7.2)
// content: SignedData {
// version: 1
// digestAlgorithms: {} (empty)
// encapContentInfo: { contentType: data (1.2.840.113549.1.7.1) }
// certificates: [cert1, cert2, ...]
// signerInfos: {} (empty)
// }
// }
func buildCertsOnlyPKCS7(derCerts [][]byte) ([]byte, error) {
// We build the ASN.1 manually to avoid pulling in a PKCS#7 library.
// This is a well-defined, static structure — no signing needed.
// OID for signedData: 1.2.840.113549.1.7.2
oidSignedData := []byte{0x06, 0x09, 0x2a, 0x86, 0x48, 0x86, 0xf7, 0x0d, 0x01, 0x07, 0x02}
// OID for data: 1.2.840.113549.1.7.1
oidData := []byte{0x06, 0x09, 0x2a, 0x86, 0x48, 0x86, 0xf7, 0x0d, 0x01, 0x07, 0x01}
// Build certificates [0] IMPLICIT SET OF Certificate
var certsContent []byte
for _, cert := range derCerts {
certsContent = append(certsContent, cert...)
}
certsField := asn1WrapImplicit(0, certsContent)
// Build encapContentInfo: SEQUENCE { OID data }
encapContentInfo := asn1WrapSequence(oidData)
// Build digestAlgorithms: SET {} (empty)
digestAlgorithms := asn1WrapSet(nil)
// Build signerInfos: SET {} (empty)
signerInfos := asn1WrapSet(nil)
// Version: INTEGER 1
version := []byte{0x02, 0x01, 0x01}
// Build SignedData SEQUENCE
var signedDataContent []byte
signedDataContent = append(signedDataContent, version...)
signedDataContent = append(signedDataContent, digestAlgorithms...)
signedDataContent = append(signedDataContent, encapContentInfo...)
signedDataContent = append(signedDataContent, certsField...)
signedDataContent = append(signedDataContent, signerInfos...)
signedData := asn1WrapSequence(signedDataContent)
// Wrap in [0] EXPLICIT for ContentInfo.content
contentField := asn1WrapExplicit(0, signedData)
// Build ContentInfo SEQUENCE
var contentInfoContent []byte
contentInfoContent = append(contentInfoContent, oidSignedData...)
contentInfoContent = append(contentInfoContent, contentField...)
contentInfo := asn1WrapSequence(contentInfoContent)
return contentInfo, nil
}
// asn1WrapSequence wraps content in an ASN.1 SEQUENCE tag (0x30).
func asn1WrapSequence(content []byte) []byte {
return asn1Wrap(0x30, content)
}
// asn1WrapSet wraps content in an ASN.1 SET tag (0x31).
func asn1WrapSet(content []byte) []byte {
return asn1Wrap(0x31, content)
}
// asn1WrapExplicit wraps content in an ASN.1 context-specific EXPLICIT tag.
func asn1WrapExplicit(tag int, content []byte) []byte {
return asn1Wrap(byte(0xa0|tag), content)
}
// asn1WrapImplicit wraps content in an ASN.1 context-specific IMPLICIT CONSTRUCTED tag.
func asn1WrapImplicit(tag int, content []byte) []byte {
return asn1Wrap(byte(0xa0|tag), content)
}
// asn1Wrap wraps content with an ASN.1 tag and length.
func asn1Wrap(tag byte, content []byte) []byte {
length := len(content)
var result []byte
result = append(result, tag)
result = append(result, asn1EncodeLength(length)...)
result = append(result, content...)
return result
}
// asn1EncodeLength encodes a length in ASN.1 DER format.
func asn1EncodeLength(length int) []byte {
if length < 0x80 {
return []byte{byte(length)}
}
// Long form
var lengthBytes []byte
l := length
for l > 0 {
lengthBytes = append([]byte{byte(l & 0xff)}, lengthBytes...)
l >>= 8
}
return append([]byte{byte(0x80 | len(lengthBytes))}, lengthBytes...)
}
// NOTE: PKCS#7 helpers (BuildCertsOnlyPKCS7, PEMToDERChain, ASN.1 wrappers)
// are in the shared internal/pkcs7 package, used by both EST and SCEP handlers.
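For reference, the DER length rule that the removed asn1EncodeLength test table exercised can be reproduced in isolation. This is a minimal standalone sketch, not the code in internal/pkcs7; the name encodeDERLength is hypothetical and only restates the short-form/long-form logic shown in the removed helper.

```go
package main

import "fmt"

// encodeDERLength is a hypothetical standalone helper (not the shared
// package's API). Lengths below 0x80 use the one-byte short form; larger
// lengths use 0x80|n followed by n big-endian length bytes.
func encodeDERLength(length int) []byte {
	if length < 0x80 {
		return []byte{byte(length)}
	}
	var lengthBytes []byte
	for l := length; l > 0; l >>= 8 {
		// Prepend so the most significant byte ends up first.
		lengthBytes = append([]byte{byte(l & 0xff)}, lengthBytes...)
	}
	return append([]byte{byte(0x80 | len(lengthBytes))}, lengthBytes...)
}

func main() {
	// Same boundary values as the removed table-driven test.
	for _, n := range []int{0, 1, 127, 128, 256} {
		fmt.Printf("%d -> % x\n", n, encodeDERLength(n))
	}
}
```

The boundary at 127/128 is the interesting case: 127 still fits the short form, while 128 needs one extra length byte.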
+10 -34
@@ -18,6 +18,7 @@ import (
"time"
"github.com/shankar0123/certctl/internal/domain"
"github.com/shankar0123/certctl/internal/pkcs7"
)
// mockESTService implements ESTService for testing.
@@ -338,12 +339,12 @@ func TestESTCSRAttrs_MethodNotAllowed(t *testing.T) {
}
}
func TestBuildCertsOnlyPKCS7(t *testing.T) {
// Test with a dummy DER certificate
func TestBuildCertsOnlyPKCS7_ViaSharedPackage(t *testing.T) {
// Test with a dummy DER certificate via shared pkcs7 package
dummyCert := []byte{0x30, 0x82, 0x01, 0x00} // minimal ASN.1 SEQUENCE
result, err := buildCertsOnlyPKCS7([][]byte{dummyCert})
result, err := pkcs7.BuildCertsOnlyPKCS7([][]byte{dummyCert})
if err != nil {
t.Fatalf("buildCertsOnlyPKCS7 failed: %v", err)
t.Fatalf("BuildCertsOnlyPKCS7 failed: %v", err)
}
if len(result) == 0 {
t.Error("expected non-empty PKCS#7 output")
@@ -354,49 +355,24 @@ func TestBuildCertsOnlyPKCS7(t *testing.T) {
}
}
func TestPemToDERChain(t *testing.T) {
func TestPemToDERChain_ViaSharedPackage(t *testing.T) {
pemData := generateTestCertPEM(t)
certs, err := pemToDERChain(pemData)
certs, err := pkcs7.PEMToDERChain(pemData)
if err != nil {
t.Fatalf("pemToDERChain failed: %v", err)
t.Fatalf("PEMToDERChain failed: %v", err)
}
if len(certs) != 1 {
t.Errorf("expected 1 cert, got %d", len(certs))
}
}
func TestPemToDERChain_NoCerts(t *testing.T) {
_, err := pemToDERChain("not a PEM")
func TestPemToDERChain_NoCerts_ViaSharedPackage(t *testing.T) {
_, err := pkcs7.PEMToDERChain("not a PEM")
if err == nil {
t.Error("expected error for invalid PEM")
}
}
func TestASN1EncodeLength(t *testing.T) {
tests := []struct {
length int
expected []byte
}{
{0, []byte{0x00}},
{1, []byte{0x01}},
{127, []byte{0x7f}},
{128, []byte{0x81, 0x80}},
{256, []byte{0x82, 0x01, 0x00}},
}
for _, tt := range tests {
result := asn1EncodeLength(tt.length)
if len(result) != len(tt.expected) {
t.Errorf("asn1EncodeLength(%d): expected %d bytes, got %d", tt.length, len(tt.expected), len(result))
continue
}
for i := range result {
if result[i] != tt.expected[i] {
t.Errorf("asn1EncodeLength(%d): byte %d: expected 0x%02x, got 0x%02x", tt.length, i, tt.expected[i], result[i])
}
}
}
}
func TestESTCSRAttrs_ServiceError(t *testing.T) {
svc := &mockESTService{
CSRAttrsErr: errors.New("service error"),
+19 -3
@@ -2,6 +2,8 @@ package handler
import (
"net/http"
"github.com/shankar0123/certctl/internal/api/middleware"
)
// HealthHandler handles health and readiness check endpoints.
@@ -55,9 +57,23 @@ func (h HealthHandler) AuthInfo(w http.ResponseWriter, r *http.Request) {
JSON(w, http.StatusOK, response)
}
// AuthCheck returns 200 if the request has valid auth credentials.
// The auth middleware runs before this handler, so reaching here means auth passed.
// AuthCheck returns 200 if the request has valid auth credentials, along with
// the resolved named-key identity and admin flag so the GUI can gate
// admin-only affordances (e.g., the bulk-revoke button).
//
// M-003 (Phase B.4): surface the admin flag so the frontend hides affordances
// that would otherwise 403 at the server. This is a hint for UX only —
// authorization remains enforced at the handler layer (bulk_revocation.go).
//
// The auth middleware runs before this handler, so reaching here means auth
// passed. `user` falls back to an empty string when auth is disabled
// (CERTCTL_AUTH_TYPE=none).
// GET /api/v1/auth/check
func (h HealthHandler) AuthCheck(w http.ResponseWriter, r *http.Request) {
JSON(w, http.StatusOK, map[string]string{"status": "authenticated"})
response := map[string]interface{}{
"status": "authenticated",
"user": middleware.GetUser(r.Context()),
"admin": middleware.IsAdmin(r.Context()),
}
JSON(w, http.StatusOK, response)
}
+308
@@ -0,0 +1,308 @@
package handler
import (
"context"
"encoding/json"
"fmt"
"net/http"
"strconv"
"github.com/shankar0123/certctl/internal/domain"
"github.com/shankar0123/certctl/internal/repository"
)
// HealthCheckServicer defines the interface used by the health check handler.
type HealthCheckServicer interface {
Create(ctx context.Context, check *domain.EndpointHealthCheck) error
Get(ctx context.Context, id string) (*domain.EndpointHealthCheck, error)
Update(ctx context.Context, check *domain.EndpointHealthCheck) error
Delete(ctx context.Context, id string) error
List(ctx context.Context, filter *repository.HealthCheckFilter) ([]*domain.EndpointHealthCheck, int, error)
GetHistory(ctx context.Context, healthCheckID string, limit int) ([]*domain.HealthHistoryEntry, error)
AcknowledgeIncident(ctx context.Context, id string, actor string) error
GetSummary(ctx context.Context) (*domain.HealthCheckSummary, error)
}
// HealthCheckHandler handles HTTP requests for TLS health monitoring.
type HealthCheckHandler struct {
service HealthCheckServicer
}
// NewHealthCheckHandler creates a new health check handler.
func NewHealthCheckHandler(service HealthCheckServicer) *HealthCheckHandler {
return &HealthCheckHandler{service: service}
}
// ListHealthChecks handles GET /api/v1/health-checks
func (h *HealthCheckHandler) ListHealthChecks(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodGet {
Error(w, http.StatusMethodNotAllowed, "Method not allowed")
return
}
query := r.URL.Query()
status := query.Get("status")
certificateID := query.Get("certificate_id")
networkScanTargetID := query.Get("network_scan_target_id")
enabledStr := query.Get("enabled")
page := parseIntDefault(query.Get("page"), 1)
perPage := parseIntDefault(query.Get("per_page"), 50)
// Oversized page sizes fall back to the default of 50 rather than being clamped to 500.
if perPage > 500 {
perPage = 50
}
// Parse enabled flag if provided
var enabledFilter *bool
if enabledStr != "" {
enabled := enabledStr == "true"
enabledFilter = &enabled
}
filter := &repository.HealthCheckFilter{
Status: status,
CertificateID: certificateID,
NetworkScanTargetID: networkScanTargetID,
Enabled: enabledFilter,
Page: page,
PerPage: perPage,
}
checks, total, err := h.service.List(r.Context(), filter)
if err != nil {
Error(w, http.StatusInternalServerError, fmt.Sprintf("failed to list health checks: %v", err))
return
}
if checks == nil {
checks = make([]*domain.EndpointHealthCheck, 0)
}
JSON(w, http.StatusOK, PagedResponse{
Data: checks,
Total: int64(total),
Page: page,
PerPage: perPage,
})
}
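The pagination handling in ListHealthChecks can be sketched in isolation. parseIntDefault is not shown in this diff, so the parseIntOr helper below is a hypothetical stand-in, and its choice to treat non-numeric and non-positive input as "use the default" is an assumption. The 50/500 values are taken from the handler above.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseIntOr is a hypothetical helper: it returns def when raw is empty,
// non-numeric, or less than 1 (the last rule is an assumption).
func parseIntOr(raw string, def int) int {
	n, err := strconv.Atoi(raw)
	if err != nil || n < 1 {
		return def
	}
	return n
}

// pageParams mirrors the handler's rules: page defaults to 1, per_page
// defaults to 50, and a per_page above 500 falls back to 50.
func pageParams(pageRaw, perPageRaw string) (page, perPage int) {
	page = parseIntOr(pageRaw, 1)
	perPage = parseIntOr(perPageRaw, 50)
	if perPage > 500 {
		perPage = 50
	}
	return page, perPage
}

func main() {
	fmt.Println(pageParams("", ""))       // defaults
	fmt.Println(pageParams("3", "200"))   // accepted as given
	fmt.Println(pageParams("2", "10000")) // oversized per_page falls back
}
```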
// GetHealthCheck handles GET /api/v1/health-checks/{id}
func (h *HealthCheckHandler) GetHealthCheck(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodGet {
Error(w, http.StatusMethodNotAllowed, "Method not allowed")
return
}
id := r.PathValue("id")
if id == "" {
Error(w, http.StatusBadRequest, "health check ID is required")
return
}
check, err := h.service.Get(r.Context(), id)
if err != nil {
Error(w, http.StatusNotFound, fmt.Sprintf("health check not found: %v", err))
return
}
JSON(w, http.StatusOK, check)
}
// CreateHealthCheck handles POST /api/v1/health-checks
func (h *HealthCheckHandler) CreateHealthCheck(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodPost {
Error(w, http.StatusMethodNotAllowed, "Method not allowed")
return
}
var check domain.EndpointHealthCheck
if err := json.NewDecoder(r.Body).Decode(&check); err != nil {
Error(w, http.StatusBadRequest, fmt.Sprintf("invalid request body: %v", err))
return
}
if check.Endpoint == "" {
Error(w, http.StatusBadRequest, "endpoint is required")
return
}
// Set defaults
if check.CheckIntervalSecs <= 0 {
check.CheckIntervalSecs = 300
}
if check.DegradedThreshold <= 0 {
check.DegradedThreshold = 2
}
if check.DownThreshold <= 0 {
check.DownThreshold = 5
}
if check.Status == "" {
check.Status = domain.HealthStatusUnknown
}
if err := h.service.Create(r.Context(), &check); err != nil {
Error(w, http.StatusInternalServerError, fmt.Sprintf("failed to create health check: %v", err))
return
}
JSON(w, http.StatusCreated, check)
}
// UpdateHealthCheck handles PUT /api/v1/health-checks/{id}
func (h *HealthCheckHandler) UpdateHealthCheck(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodPut {
Error(w, http.StatusMethodNotAllowed, "Method not allowed")
return
}
id := r.PathValue("id")
if id == "" {
Error(w, http.StatusBadRequest, "health check ID is required")
return
}
// Get existing check
existing, err := h.service.Get(r.Context(), id)
if err != nil {
Error(w, http.StatusNotFound, fmt.Sprintf("health check not found: %v", err))
return
}
var updates domain.EndpointHealthCheck
if err := json.NewDecoder(r.Body).Decode(&updates); err != nil {
Error(w, http.StatusBadRequest, fmt.Sprintf("invalid request body: %v", err))
return
}
// Merge updates (only update provided fields)
if updates.Endpoint != "" {
existing.Endpoint = updates.Endpoint
}
if updates.ExpectedFingerprint != "" {
existing.ExpectedFingerprint = updates.ExpectedFingerprint
}
if updates.CheckIntervalSecs > 0 {
existing.CheckIntervalSecs = updates.CheckIntervalSecs
}
if updates.DegradedThreshold > 0 {
existing.DegradedThreshold = updates.DegradedThreshold
}
if updates.DownThreshold > 0 {
existing.DownThreshold = updates.DownThreshold
}
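// Enabled is always overwritten: a plain bool cannot distinguish an omitted
// field from an explicit false, so a body without "enabled" disables the check.
existing.Enabled = updates.Enabled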
if err := h.service.Update(r.Context(), existing); err != nil {
Error(w, http.StatusInternalServerError, fmt.Sprintf("failed to update health check: %v", err))
return
}
JSON(w, http.StatusOK, existing)
}
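The merge in UpdateHealthCheck treats zero values as "not provided", which works for strings and positive integers but not for a boolean. One common way to separate "omitted" from "false" is to decode into pointer fields. The sketch below is illustrative only: healthCheck, healthCheckPatch, and applyPatch are hypothetical types and names, not part of the domain package.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// healthCheck is a hypothetical, trimmed-down stand-in for the stored record.
type healthCheck struct {
	Endpoint string
	Enabled  bool
}

// healthCheckPatch uses pointers so a missing JSON field decodes to nil.
type healthCheckPatch struct {
	Endpoint *string `json:"endpoint"`
	Enabled  *bool   `json:"enabled"`
}

// applyPatch copies only the fields that were present in the body.
func applyPatch(existing *healthCheck, body []byte) error {
	var p healthCheckPatch
	if err := json.Unmarshal(body, &p); err != nil {
		return err
	}
	if p.Endpoint != nil && *p.Endpoint != "" {
		existing.Endpoint = *p.Endpoint
	}
	if p.Enabled != nil {
		existing.Enabled = *p.Enabled
	}
	return nil
}

func main() {
	hc := &healthCheck{Endpoint: "api.example.com:443", Enabled: true}

	// Body without "enabled": the flag is left untouched.
	_ = applyPatch(hc, []byte(`{"endpoint":"web.example.com:443"}`))
	fmt.Println(hc.Endpoint, hc.Enabled)

	// Explicit false: the flag is cleared.
	_ = applyPatch(hc, []byte(`{"enabled":false}`))
	fmt.Println(hc.Endpoint, hc.Enabled)
}
```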
// DeleteHealthCheck handles DELETE /api/v1/health-checks/{id}
func (h *HealthCheckHandler) DeleteHealthCheck(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodDelete {
Error(w, http.StatusMethodNotAllowed, "Method not allowed")
return
}
id := r.PathValue("id")
if id == "" {
Error(w, http.StatusBadRequest, "health check ID is required")
return
}
if err := h.service.Delete(r.Context(), id); err != nil {
Error(w, http.StatusInternalServerError, fmt.Sprintf("failed to delete health check: %v", err))
return
}
w.WriteHeader(http.StatusNoContent)
}
// GetHealthCheckHistory handles GET /api/v1/health-checks/{id}/history
func (h *HealthCheckHandler) GetHealthCheckHistory(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodGet {
Error(w, http.StatusMethodNotAllowed, "Method not allowed")
return
}
id := r.PathValue("id")
if id == "" {
Error(w, http.StatusBadRequest, "health check ID is required")
return
}
limitStr := r.URL.Query().Get("limit")
limit := 100
if limitStr != "" {
if l, err := strconv.Atoi(limitStr); err == nil && l > 0 {
limit = l
}
}
if limit > 1000 {
limit = 1000
}
history, err := h.service.GetHistory(r.Context(), id, limit)
if err != nil {
Error(w, http.StatusInternalServerError, fmt.Sprintf("failed to get health check history: %v", err))
return
}
if history == nil {
history = make([]*domain.HealthHistoryEntry, 0)
}
JSON(w, http.StatusOK, history)
}
// AcknowledgeHealthCheck handles POST /api/v1/health-checks/{id}/acknowledge
func (h *HealthCheckHandler) AcknowledgeHealthCheck(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodPost {
Error(w, http.StatusMethodNotAllowed, "Method not allowed")
return
}
id := r.PathValue("id")
if id == "" {
Error(w, http.StatusBadRequest, "health check ID is required")
return
}
var req struct {
Actor string `json:"actor,omitempty"`
}
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
Error(w, http.StatusBadRequest, fmt.Sprintf("invalid request body: %v", err))
return
}
if req.Actor == "" {
req.Actor = "unknown"
}
if err := h.service.AcknowledgeIncident(r.Context(), id, req.Actor); err != nil {
Error(w, http.StatusInternalServerError, fmt.Sprintf("failed to acknowledge health check: %v", err))
return
}
w.WriteHeader(http.StatusNoContent)
}
// GetHealthCheckSummary handles GET /api/v1/health-checks/summary
// This route must be registered BEFORE the /{id} routes
func (h *HealthCheckHandler) GetHealthCheckSummary(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodGet {
Error(w, http.StatusMethodNotAllowed, "Method not allowed")
return
}
summary, err := h.service.GetSummary(r.Context())
if err != nil {
Error(w, http.StatusInternalServerError, fmt.Sprintf("failed to get health check summary: %v", err))
return
}
JSON(w, http.StatusOK, summary)
}
@@ -0,0 +1,305 @@
package handler
import (
"bytes"
"context"
"encoding/json"
"errors"
"net/http"
"net/http/httptest"
"testing"
"github.com/shankar0123/certctl/internal/domain"
"github.com/shankar0123/certctl/internal/repository"
)
// mockHealthCheckSvc implements HealthCheckServicer for testing.
type mockHealthCheckSvc struct {
createErr error
getErr error
updateErr error
deleteErr error
listErr error
getHistoryErr error
acknowledgeErr error
getSummaryErr error
checks map[string]*domain.EndpointHealthCheck
summary *domain.HealthCheckSummary
}
func newMockHealthCheckSvc() *mockHealthCheckSvc {
return &mockHealthCheckSvc{
checks: make(map[string]*domain.EndpointHealthCheck),
summary: &domain.HealthCheckSummary{
Healthy: 1,
Degraded: 0,
Down: 0,
CertMismatch: 0,
Unknown: 0,
},
}
}
func (m *mockHealthCheckSvc) Create(ctx context.Context, check *domain.EndpointHealthCheck) error {
if m.createErr != nil {
return m.createErr
}
check.ID = "hc-created-1"
m.checks[check.ID] = check
return nil
}
func (m *mockHealthCheckSvc) Get(ctx context.Context, id string) (*domain.EndpointHealthCheck, error) {
if m.getErr != nil {
return nil, m.getErr
}
if check, ok := m.checks[id]; ok {
return check, nil
}
return nil, errors.New("not found")
}
func (m *mockHealthCheckSvc) Update(ctx context.Context, check *domain.EndpointHealthCheck) error {
if m.updateErr != nil {
return m.updateErr
}
m.checks[check.ID] = check
return nil
}
func (m *mockHealthCheckSvc) Delete(ctx context.Context, id string) error {
if m.deleteErr != nil {
return m.deleteErr
}
delete(m.checks, id)
return nil
}
func (m *mockHealthCheckSvc) List(ctx context.Context, filter *repository.HealthCheckFilter) ([]*domain.EndpointHealthCheck, int, error) {
if m.listErr != nil {
return nil, 0, m.listErr
}
checks := make([]*domain.EndpointHealthCheck, 0, len(m.checks))
for _, check := range m.checks {
checks = append(checks, check)
}
return checks, len(checks), nil
}
func (m *mockHealthCheckSvc) GetHistory(ctx context.Context, healthCheckID string, limit int) ([]*domain.HealthHistoryEntry, error) {
if m.getHistoryErr != nil {
return nil, m.getHistoryErr
}
return make([]*domain.HealthHistoryEntry, 0), nil
}
func (m *mockHealthCheckSvc) AcknowledgeIncident(ctx context.Context, id string, actor string) error {
if m.acknowledgeErr != nil {
return m.acknowledgeErr
}
if check, ok := m.checks[id]; ok {
check.Acknowledged = true
check.AcknowledgedBy = actor
}
return nil
}
func (m *mockHealthCheckSvc) GetSummary(ctx context.Context) (*domain.HealthCheckSummary, error) {
if m.getSummaryErr != nil {
return nil, m.getSummaryErr
}
return m.summary, nil
}
// Tests
func TestListHealthChecks_Success(t *testing.T) {
svc := newMockHealthCheckSvc()
svc.checks["hc-1"] = &domain.EndpointHealthCheck{
ID: "hc-1",
Endpoint: "api.example.com:443",
Status: domain.HealthStatusHealthy,
}
handler := NewHealthCheckHandler(svc)
req := httptest.NewRequest("GET", "/api/v1/health-checks", nil)
w := httptest.NewRecorder()
handler.ListHealthChecks(w, req)
if w.Code != http.StatusOK {
t.Errorf("Expected status 200, got %d", w.Code)
}
var resp PagedResponse
if err := json.NewDecoder(w.Body).Decode(&resp); err != nil {
t.Fatalf("Failed to decode response: %v", err)
}
if resp.Total != 1 {
t.Errorf("Expected 1 health check, got %d", resp.Total)
}
}
func TestListHealthChecks_MethodNotAllowed(t *testing.T) {
handler := NewHealthCheckHandler(newMockHealthCheckSvc())
req := httptest.NewRequest("POST", "/api/v1/health-checks", nil)
w := httptest.NewRecorder()
handler.ListHealthChecks(w, req)
if w.Code != http.StatusMethodNotAllowed {
t.Errorf("Expected status 405, got %d", w.Code)
}
}
func TestGetHealthCheck_Success(t *testing.T) {
svc := newMockHealthCheckSvc()
check := &domain.EndpointHealthCheck{
ID: "hc-1",
Endpoint: "api.example.com:443",
Status: domain.HealthStatusHealthy,
}
svc.checks["hc-1"] = check
handler := NewHealthCheckHandler(svc)
req := httptest.NewRequest("GET", "/api/v1/health-checks/hc-1", nil)
req.SetPathValue("id", "hc-1")
w := httptest.NewRecorder()
handler.GetHealthCheck(w, req)
if w.Code != http.StatusOK {
t.Errorf("Expected status 200, got %d", w.Code)
}
var resp domain.EndpointHealthCheck
if err := json.NewDecoder(w.Body).Decode(&resp); err != nil {
t.Fatalf("Failed to decode response: %v", err)
}
if resp.ID != "hc-1" {
t.Errorf("Expected ID hc-1, got %s", resp.ID)
}
}
func TestGetHealthCheck_NotFound(t *testing.T) {
handler := NewHealthCheckHandler(newMockHealthCheckSvc())
req := httptest.NewRequest("GET", "/api/v1/health-checks/nonexistent", nil)
req.SetPathValue("id", "nonexistent")
w := httptest.NewRecorder()
handler.GetHealthCheck(w, req)
if w.Code != http.StatusNotFound {
t.Errorf("Expected status 404, got %d", w.Code)
}
}
func TestCreateHealthCheck_Success(t *testing.T) {
svc := newMockHealthCheckSvc()
handler := NewHealthCheckHandler(svc)
check := domain.EndpointHealthCheck{
Endpoint: "web.example.com:443",
Enabled: true,
}
body, _ := json.Marshal(check)
req := httptest.NewRequest("POST", "/api/v1/health-checks", bytes.NewReader(body))
w := httptest.NewRecorder()
handler.CreateHealthCheck(w, req)
if w.Code != http.StatusCreated {
t.Errorf("Expected status 201, got %d", w.Code)
}
var resp domain.EndpointHealthCheck
if err := json.NewDecoder(w.Body).Decode(&resp); err != nil {
t.Fatalf("Failed to decode response: %v", err)
}
if resp.Endpoint != "web.example.com:443" {
t.Errorf("Expected endpoint web.example.com:443, got %s", resp.Endpoint)
}
}
func TestDeleteHealthCheck_Success(t *testing.T) {
svc := newMockHealthCheckSvc()
svc.checks["hc-1"] = &domain.EndpointHealthCheck{
ID: "hc-1",
Endpoint: "api.example.com:443",
}
handler := NewHealthCheckHandler(svc)
req := httptest.NewRequest("DELETE", "/api/v1/health-checks/hc-1", nil)
req.SetPathValue("id", "hc-1")
w := httptest.NewRecorder()
handler.DeleteHealthCheck(w, req)
if w.Code != http.StatusNoContent {
t.Errorf("Expected status 204, got %d", w.Code)
}
if _, ok := svc.checks["hc-1"]; ok {
t.Fatal("Expected check to be deleted")
}
}
func TestAcknowledgeHealthCheck_Success(t *testing.T) {
svc := newMockHealthCheckSvc()
svc.checks["hc-1"] = &domain.EndpointHealthCheck{
ID: "hc-1",
Endpoint: "api.example.com:443",
Status: domain.HealthStatusDown,
}
handler := NewHealthCheckHandler(svc)
req := httptest.NewRequest("POST", "/api/v1/health-checks/hc-1/acknowledge", bytes.NewReader([]byte(`{"actor":"user@example.com"}`)))
req.SetPathValue("id", "hc-1")
w := httptest.NewRecorder()
handler.AcknowledgeHealthCheck(w, req)
if w.Code != http.StatusNoContent {
t.Errorf("Expected status 204, got %d", w.Code)
}
if !svc.checks["hc-1"].Acknowledged {
t.Fatal("Expected check to be acknowledged")
}
}
func TestGetHealthCheckSummary_Success(t *testing.T) {
svc := newMockHealthCheckSvc()
svc.summary = &domain.HealthCheckSummary{
Healthy: 3,
Degraded: 1,
Down: 0,
CertMismatch: 0,
Unknown: 1,
}
handler := NewHealthCheckHandler(svc)
req := httptest.NewRequest("GET", "/api/v1/health-checks/summary", nil)
w := httptest.NewRecorder()
handler.GetHealthCheckSummary(w, req)
if w.Code != http.StatusOK {
t.Errorf("Expected status 200, got %d", w.Code)
}
var resp domain.HealthCheckSummary
if err := json.NewDecoder(w.Body).Decode(&resp); err != nil {
t.Fatalf("Failed to decode response: %v", err)
}
if resp.Healthy != 3 {
t.Errorf("Expected 3 healthy checks, got %d", resp.Healthy)
}
}
+347
@@ -0,0 +1,347 @@
package handler
import (
"context"
"encoding/json"
"net/http"
"net/http/httptest"
"testing"
"github.com/shankar0123/certctl/internal/api/middleware"
)
func TestHealth_ReturnsOK(t *testing.T) {
handler := NewHealthHandler("api-key")
req, err := http.NewRequest(http.MethodGet, "/health", nil)
if err != nil {
t.Fatalf("NewRequest failed: %v", err)
}
w := httptest.NewRecorder()
handler.Health(w, req)
if status := w.Code; status != http.StatusOK {
t.Errorf("Health handler returned status %d, want %d", status, http.StatusOK)
}
// Check content type
if ct := w.Header().Get("Content-Type"); ct != "application/json" {
t.Errorf("Content-Type = %q, want application/json", ct)
}
// Check response body
var result map[string]string
if err := json.NewDecoder(w.Body).Decode(&result); err != nil {
t.Fatalf("failed to decode response: %v", err)
}
if result["status"] != "healthy" {
t.Errorf("status = %q, want healthy", result["status"])
}
}
func TestHealth_MethodNotAllowed(t *testing.T) {
handler := NewHealthHandler("api-key")
req, err := http.NewRequest(http.MethodPost, "/health", nil)
if err != nil {
t.Fatalf("NewRequest failed: %v", err)
}
w := httptest.NewRecorder()
handler.Health(w, req)
if status := w.Code; status != http.StatusMethodNotAllowed {
t.Errorf("Health handler returned status %d, want %d", status, http.StatusMethodNotAllowed)
}
}
func TestReady_ReturnsOK(t *testing.T) {
handler := NewHealthHandler("api-key")
req, err := http.NewRequest(http.MethodGet, "/ready", nil)
if err != nil {
t.Fatalf("NewRequest failed: %v", err)
}
w := httptest.NewRecorder()
handler.Ready(w, req)
if status := w.Code; status != http.StatusOK {
t.Errorf("Ready handler returned status %d, want %d", status, http.StatusOK)
}
// Check content type
if ct := w.Header().Get("Content-Type"); ct != "application/json" {
t.Errorf("Content-Type = %q, want application/json", ct)
}
// Check response body
var result map[string]string
if err := json.NewDecoder(w.Body).Decode(&result); err != nil {
t.Fatalf("failed to decode response: %v", err)
}
if result["status"] != "ready" {
t.Errorf("status = %q, want ready", result["status"])
}
}
func TestReady_MethodNotAllowed(t *testing.T) {
handler := NewHealthHandler("api-key")
req, err := http.NewRequest(http.MethodDelete, "/ready", nil)
if err != nil {
t.Fatalf("NewRequest failed: %v", err)
}
w := httptest.NewRecorder()
handler.Ready(w, req)
if status := w.Code; status != http.StatusMethodNotAllowed {
t.Errorf("Ready handler returned status %d, want %d", status, http.StatusMethodNotAllowed)
}
}
func TestAuthInfo_ReturnsAuthType_APIKey(t *testing.T) {
handler := NewHealthHandler("api-key")
req, err := http.NewRequest(http.MethodGet, "/api/v1/auth/info", nil)
if err != nil {
t.Fatalf("NewRequest failed: %v", err)
}
w := httptest.NewRecorder()
handler.AuthInfo(w, req)
if status := w.Code; status != http.StatusOK {
t.Errorf("AuthInfo handler returned status %d, want %d", status, http.StatusOK)
}
var result map[string]interface{}
if err := json.NewDecoder(w.Body).Decode(&result); err != nil {
t.Fatalf("failed to decode response: %v", err)
}
if result["auth_type"] != "api-key" {
t.Errorf("auth_type = %q, want api-key", result["auth_type"])
}
if required, ok := result["required"].(bool); !ok || !required {
t.Errorf("required = %v, want true", result["required"])
}
}
func TestAuthInfo_ReturnsAuthType_None(t *testing.T) {
handler := NewHealthHandler("none")
req, err := http.NewRequest(http.MethodGet, "/api/v1/auth/info", nil)
if err != nil {
t.Fatalf("NewRequest failed: %v", err)
}
w := httptest.NewRecorder()
handler.AuthInfo(w, req)
if status := w.Code; status != http.StatusOK {
t.Errorf("AuthInfo handler returned status %d, want %d", status, http.StatusOK)
}
var result map[string]interface{}
if err := json.NewDecoder(w.Body).Decode(&result); err != nil {
t.Fatalf("failed to decode response: %v", err)
}
if result["auth_type"] != "none" {
t.Errorf("auth_type = %q, want none", result["auth_type"])
}
if required, ok := result["required"].(bool); !ok || required {
t.Errorf("required = %v, want false", result["required"])
}
}
func TestAuthInfo_ReturnsAuthType_JWT(t *testing.T) {
handler := NewHealthHandler("jwt")
req, err := http.NewRequest(http.MethodGet, "/api/v1/auth/info", nil)
if err != nil {
t.Fatalf("NewRequest failed: %v", err)
}
w := httptest.NewRecorder()
handler.AuthInfo(w, req)
var result map[string]interface{}
if err := json.NewDecoder(w.Body).Decode(&result); err != nil {
t.Fatalf("failed to decode response: %v", err)
}
if result["auth_type"] != "jwt" {
t.Errorf("auth_type = %q, want jwt", result["auth_type"])
}
if required, ok := result["required"].(bool); !ok || !required {
t.Errorf("required = %v, want true", result["required"])
}
}
func TestAuthCheck_ReturnsOK(t *testing.T) {
handler := NewHealthHandler("api-key")
req, err := http.NewRequest(http.MethodGet, "/api/v1/auth/check", nil)
if err != nil {
t.Fatalf("NewRequest failed: %v", err)
}
w := httptest.NewRecorder()
handler.AuthCheck(w, req)
if status := w.Code; status != http.StatusOK {
t.Errorf("AuthCheck handler returned status %d, want %d", status, http.StatusOK)
}
// Check content type
if ct := w.Header().Get("Content-Type"); ct != "application/json" {
t.Errorf("Content-Type = %q, want application/json", ct)
}
// Check response body — mixed-value map (string + bool) post-Phase B.4.
var result map[string]any
if err := json.NewDecoder(w.Body).Decode(&result); err != nil {
t.Fatalf("failed to decode response: %v", err)
}
if result["status"] != "authenticated" {
t.Errorf("status = %q, want authenticated", result["status"])
}
}
func TestAuthCheck_MethodNotAllowed(t *testing.T) {
handler := NewHealthHandler("api-key")
req, err := http.NewRequest(http.MethodPost, "/api/v1/auth/check", nil)
if err != nil {
t.Fatalf("NewRequest failed: %v", err)
}
w := httptest.NewRecorder()
handler.AuthCheck(w, req)
// AuthCheck does not check the HTTP method itself, so a POST still returns 200.
// This test only logs a non-200 status; it does not fail on one.
if status := w.Code; status != http.StatusOK {
t.Logf("AuthCheck returned status %d (note: method not enforced in handler)", status)
}
}
// --- M-003 (Phase B.4): /auth/check surfaces admin flag + user identity ---
// TestAuthCheck_AdminCaller_ReportsAdminTrue confirms that when the auth
// middleware sets AdminKey{}=true (i.e., named key was admin-tagged), the
// /auth/check endpoint reports admin=true so the GUI can show admin-only
// affordances.
func TestAuthCheck_AdminCaller_ReportsAdminTrue(t *testing.T) {
handler := NewHealthHandler("api-key")
req := httptest.NewRequest(http.MethodGet, "/api/v1/auth/check", nil)
ctx := context.WithValue(req.Context(), middleware.AdminKey{}, true)
ctx = context.WithValue(ctx, middleware.UserKey{}, "ops-admin")
req = req.WithContext(ctx)
w := httptest.NewRecorder()
handler.AuthCheck(w, req)
if w.Code != http.StatusOK {
t.Fatalf("expected status 200, got %d", w.Code)
}
var result map[string]any
if err := json.NewDecoder(w.Body).Decode(&result); err != nil {
t.Fatalf("failed to decode response: %v", err)
}
if result["status"] != "authenticated" {
t.Errorf("status = %q, want authenticated", result["status"])
}
admin, ok := result["admin"].(bool)
if !ok {
t.Fatalf("admin field missing or wrong type: %T", result["admin"])
}
if !admin {
t.Errorf("admin = false, want true")
}
if result["user"] != "ops-admin" {
t.Errorf("user = %q, want ops-admin", result["user"])
}
}
// TestAuthCheck_NonAdminCaller_ReportsAdminFalse pins the negative case: the
// auth middleware has stored AdminKey{}=false (non-admin named key) — the
// endpoint must report admin=false so the GUI hides admin-only affordances.
func TestAuthCheck_NonAdminCaller_ReportsAdminFalse(t *testing.T) {
handler := NewHealthHandler("api-key")
req := httptest.NewRequest(http.MethodGet, "/api/v1/auth/check", nil)
ctx := context.WithValue(req.Context(), middleware.AdminKey{}, false)
ctx = context.WithValue(ctx, middleware.UserKey{}, "alice")
req = req.WithContext(ctx)
w := httptest.NewRecorder()
handler.AuthCheck(w, req)
if w.Code != http.StatusOK {
t.Fatalf("expected status 200, got %d", w.Code)
}
var result map[string]any
if err := json.NewDecoder(w.Body).Decode(&result); err != nil {
t.Fatalf("failed to decode response: %v", err)
}
admin, ok := result["admin"].(bool)
if !ok {
t.Fatalf("admin field missing or wrong type: %T", result["admin"])
}
if admin {
t.Errorf("admin = true, want false")
}
if result["user"] != "alice" {
t.Errorf("user = %q, want alice", result["user"])
}
}
// TestAuthCheck_NoAuthContext_DefaultsToEmptyUserAndFalseAdmin covers the
// CERTCTL_AUTH_TYPE=none deployment, where the auth middleware doesn't set
// any keys. Response must still be well-formed with empty user + admin=false.
func TestAuthCheck_NoAuthContext_DefaultsToEmptyUserAndFalseAdmin(t *testing.T) {
handler := NewHealthHandler("none")
req := httptest.NewRequest(http.MethodGet, "/api/v1/auth/check", nil)
w := httptest.NewRecorder()
handler.AuthCheck(w, req)
if w.Code != http.StatusOK {
t.Fatalf("expected status 200, got %d", w.Code)
}
var result map[string]any
if err := json.NewDecoder(w.Body).Decode(&result); err != nil {
t.Fatalf("failed to decode response: %v", err)
}
if result["status"] != "authenticated" {
t.Errorf("status = %q, want authenticated", result["status"])
}
admin, ok := result["admin"].(bool)
if !ok {
t.Fatalf("admin field missing or wrong type: %T", result["admin"])
}
if admin {
t.Errorf("admin = true for no-auth context, want false")
}
if result["user"] != "" {
t.Errorf("user = %q, want empty string", result["user"])
}
}
+147 -28
@@ -2,9 +2,12 @@ package handler
import (
"bytes"
"context"
"encoding/json"
"fmt"
"net/http"
"net/http/httptest"
"strings"
"testing"
"time"
@@ -13,52 +16,52 @@ import (
// MockIssuerService is a mock implementation of IssuerService interface.
type MockIssuerService struct {
ListIssuersFn func(page, perPage int) ([]domain.Issuer, int64, error)
GetIssuerFn func(id string) (*domain.Issuer, error)
CreateIssuerFn func(issuer domain.Issuer) (*domain.Issuer, error)
UpdateIssuerFn func(id string, issuer domain.Issuer) (*domain.Issuer, error)
DeleteIssuerFn func(id string) error
TestConnectionFn func(id string) error
ListIssuersFn func(ctx context.Context, page, perPage int) ([]domain.Issuer, int64, error)
GetIssuerFn func(ctx context.Context, id string) (*domain.Issuer, error)
CreateIssuerFn func(ctx context.Context, issuer domain.Issuer) (*domain.Issuer, error)
UpdateIssuerFn func(ctx context.Context, id string, issuer domain.Issuer) (*domain.Issuer, error)
DeleteIssuerFn func(ctx context.Context, id string) error
TestConnectionFn func(ctx context.Context, id string) error
}
func (m *MockIssuerService) ListIssuers(page, perPage int) ([]domain.Issuer, int64, error) {
func (m *MockIssuerService) ListIssuers(ctx context.Context, page, perPage int) ([]domain.Issuer, int64, error) {
if m.ListIssuersFn != nil {
return m.ListIssuersFn(page, perPage)
return m.ListIssuersFn(ctx, page, perPage)
}
return nil, 0, nil
}
func (m *MockIssuerService) GetIssuer(id string) (*domain.Issuer, error) {
func (m *MockIssuerService) GetIssuer(ctx context.Context, id string) (*domain.Issuer, error) {
if m.GetIssuerFn != nil {
return m.GetIssuerFn(id)
return m.GetIssuerFn(ctx, id)
}
return nil, nil
}
func (m *MockIssuerService) CreateIssuer(issuer domain.Issuer) (*domain.Issuer, error) {
func (m *MockIssuerService) CreateIssuer(ctx context.Context, issuer domain.Issuer) (*domain.Issuer, error) {
if m.CreateIssuerFn != nil {
return m.CreateIssuerFn(issuer)
return m.CreateIssuerFn(ctx, issuer)
}
return nil, nil
}
func (m *MockIssuerService) UpdateIssuer(id string, issuer domain.Issuer) (*domain.Issuer, error) {
func (m *MockIssuerService) UpdateIssuer(ctx context.Context, id string, issuer domain.Issuer) (*domain.Issuer, error) {
if m.UpdateIssuerFn != nil {
return m.UpdateIssuerFn(id, issuer)
return m.UpdateIssuerFn(ctx, id, issuer)
}
return nil, nil
}
func (m *MockIssuerService) DeleteIssuer(id string) error {
func (m *MockIssuerService) DeleteIssuer(ctx context.Context, id string) error {
if m.DeleteIssuerFn != nil {
return m.DeleteIssuerFn(id)
return m.DeleteIssuerFn(ctx, id)
}
return nil
}
func (m *MockIssuerService) TestConnection(id string) error {
func (m *MockIssuerService) TestConnection(ctx context.Context, id string) error {
if m.TestConnectionFn != nil {
return m.TestConnectionFn(id)
return m.TestConnectionFn(ctx, id)
}
return nil
}
@@ -83,7 +86,7 @@ func TestListIssuers_Success(t *testing.T) {
}
mock := &MockIssuerService{
ListIssuersFn: func(page, perPage int) ([]domain.Issuer, int64, error) {
ListIssuersFn: func(_ context.Context, page, perPage int) ([]domain.Issuer, int64, error) {
return []domain.Issuer{iss1, iss2}, 2, nil
},
}
@@ -111,7 +114,7 @@ func TestListIssuers_Success(t *testing.T) {
func TestListIssuers_Pagination(t *testing.T) {
var capturedPage, capturedPerPage int
mock := &MockIssuerService{
ListIssuersFn: func(page, perPage int) ([]domain.Issuer, int64, error) {
ListIssuersFn: func(_ context.Context, page, perPage int) ([]domain.Issuer, int64, error) {
capturedPage = page
capturedPerPage = perPage
return []domain.Issuer{}, 0, nil
@@ -135,7 +138,7 @@ func TestListIssuers_Pagination(t *testing.T) {
func TestListIssuers_ServiceError(t *testing.T) {
mock := &MockIssuerService{
ListIssuersFn: func(page, perPage int) ([]domain.Issuer, int64, error) {
ListIssuersFn: func(_ context.Context, page, perPage int) ([]domain.Issuer, int64, error) {
return nil, 0, ErrMockServiceFailed
},
}
@@ -167,7 +170,7 @@ func TestListIssuers_MethodNotAllowed(t *testing.T) {
func TestGetIssuer_Success(t *testing.T) {
now := time.Now()
mock := &MockIssuerService{
GetIssuerFn: func(id string) (*domain.Issuer, error) {
GetIssuerFn: func(_ context.Context, id string) (*domain.Issuer, error) {
return &domain.Issuer{
ID: id,
Name: "Local CA",
@@ -193,7 +196,7 @@ func TestGetIssuer_Success(t *testing.T) {
func TestGetIssuer_NotFound(t *testing.T) {
mock := &MockIssuerService{
GetIssuerFn: func(id string) (*domain.Issuer, error) {
GetIssuerFn: func(_ context.Context, id string) (*domain.Issuer, error) {
return nil, ErrMockNotFound
},
}
@@ -226,7 +229,7 @@ func TestGetIssuer_EmptyID(t *testing.T) {
func TestCreateIssuer_Success(t *testing.T) {
now := time.Now()
mock := &MockIssuerService{
CreateIssuerFn: func(issuer domain.Issuer) (*domain.Issuer, error) {
CreateIssuerFn: func(_ context.Context, issuer domain.Issuer) (*domain.Issuer, error) {
issuer.ID = "iss-new"
issuer.CreatedAt = now
issuer.UpdatedAt = now
@@ -324,10 +327,126 @@ func TestCreateIssuer_NameTooLong(t *testing.T) {
}
}
func TestCreateIssuer_DuplicateName(t *testing.T) {
mock := &MockIssuerService{
CreateIssuerFn: func(_ context.Context, issuer domain.Issuer) (*domain.Issuer, error) {
return nil, fmt.Errorf("failed to create issuer: duplicate key value violates unique constraint \"issuers_name_key\"")
},
}
body := map[string]interface{}{
"name": "ACME Issuer",
"type": "ACME",
}
bodyBytes, _ := json.Marshal(body)
handler := NewIssuerHandler(mock)
req := httptest.NewRequest(http.MethodPost, "/api/v1/issuers", bytes.NewReader(bodyBytes))
req = req.WithContext(contextWithRequestID())
w := httptest.NewRecorder()
handler.CreateIssuer(w, req)
if w.Code != http.StatusConflict {
t.Fatalf("expected status 409, got %d", w.Code)
}
var resp ErrorResponse
if err := json.NewDecoder(w.Body).Decode(&resp); err != nil {
t.Fatalf("failed to decode response: %v", err)
}
if !strings.Contains(resp.Message, "already exists") {
t.Errorf("expected message to contain 'already exists', got %q", resp.Message)
}
}
func TestCreateIssuer_UnsupportedType(t *testing.T) {
mock := &MockIssuerService{
CreateIssuerFn: func(_ context.Context, issuer domain.Issuer) (*domain.Issuer, error) {
return nil, fmt.Errorf("unsupported issuer type: FakeCA")
},
}
body := map[string]interface{}{
"name": "Fake Issuer",
"type": "FakeCA",
}
bodyBytes, _ := json.Marshal(body)
handler := NewIssuerHandler(mock)
req := httptest.NewRequest(http.MethodPost, "/api/v1/issuers", bytes.NewReader(bodyBytes))
req = req.WithContext(contextWithRequestID())
w := httptest.NewRecorder()
handler.CreateIssuer(w, req)
if w.Code != http.StatusBadRequest {
t.Fatalf("expected status 400, got %d", w.Code)
}
var resp ErrorResponse
if err := json.NewDecoder(w.Body).Decode(&resp); err != nil {
t.Fatalf("failed to decode response: %v", err)
}
if !strings.Contains(resp.Message, "unsupported issuer type") {
t.Errorf("expected message to contain 'unsupported issuer type', got %q", resp.Message)
}
}
func TestCreateIssuer_GenericServiceError(t *testing.T) {
mock := &MockIssuerService{
CreateIssuerFn: func(_ context.Context, issuer domain.Issuer) (*domain.Issuer, error) {
return nil, fmt.Errorf("failed to encrypt config: cipher error")
},
}
body := map[string]interface{}{
"name": "Some Issuer",
"type": "ACME",
}
bodyBytes, _ := json.Marshal(body)
handler := NewIssuerHandler(mock)
req := httptest.NewRequest(http.MethodPost, "/api/v1/issuers", bytes.NewReader(bodyBytes))
req = req.WithContext(contextWithRequestID())
w := httptest.NewRecorder()
handler.CreateIssuer(w, req)
if w.Code != http.StatusInternalServerError {
t.Fatalf("expected status 500, got %d", w.Code)
}
}
func TestUpdateIssuer_DuplicateName(t *testing.T) {
mock := &MockIssuerService{
UpdateIssuerFn: func(_ context.Context, id string, issuer domain.Issuer) (*domain.Issuer, error) {
return nil, fmt.Errorf("failed to update issuer: duplicate key value violates unique constraint")
},
}
body := map[string]interface{}{
"name": "Existing Name",
"type": "ACME",
}
bodyBytes, _ := json.Marshal(body)
handler := NewIssuerHandler(mock)
req := httptest.NewRequest(http.MethodPut, "/api/v1/issuers/iss-test", bytes.NewReader(bodyBytes))
req = req.WithContext(contextWithRequestID())
w := httptest.NewRecorder()
handler.UpdateIssuer(w, req)
if w.Code != http.StatusConflict {
t.Fatalf("expected status 409, got %d", w.Code)
}
}
func TestDeleteIssuer_Success(t *testing.T) {
var deletedID string
mock := &MockIssuerService{
DeleteIssuerFn: func(id string) error {
DeleteIssuerFn: func(_ context.Context, id string) error {
deletedID = id
return nil
},
@@ -350,7 +469,7 @@ func TestDeleteIssuer_Success(t *testing.T) {
func TestDeleteIssuer_ServiceError(t *testing.T) {
mock := &MockIssuerService{
DeleteIssuerFn: func(id string) error {
DeleteIssuerFn: func(_ context.Context, id string) error {
return ErrMockServiceFailed
},
}
@@ -369,7 +488,7 @@ func TestDeleteIssuer_ServiceError(t *testing.T) {
func TestTestConnection_Success(t *testing.T) {
mock := &MockIssuerService{
TestConnectionFn: func(id string) error {
TestConnectionFn: func(_ context.Context, id string) error {
return nil
},
}
@@ -396,7 +515,7 @@ func TestTestConnection_Success(t *testing.T) {
func TestTestConnection_Failure(t *testing.T) {
mock := &MockIssuerService{
TestConnectionFn: func(id string) error {
TestConnectionFn: func(_ context.Context, id string) error {
return ErrMockServiceFailed
},
}
+42 -16
@@ -1,7 +1,9 @@
package handler
import (
"context"
"encoding/json"
"log/slog"
"net/http"
"strconv"
"strings"
@@ -12,22 +14,28 @@ import (
// IssuerService defines the service interface for issuer operations.
type IssuerService interface {
ListIssuers(page, perPage int) ([]domain.Issuer, int64, error)
GetIssuer(id string) (*domain.Issuer, error)
CreateIssuer(issuer domain.Issuer) (*domain.Issuer, error)
UpdateIssuer(id string, issuer domain.Issuer) (*domain.Issuer, error)
DeleteIssuer(id string) error
TestConnection(id string) error
ListIssuers(ctx context.Context, page, perPage int) ([]domain.Issuer, int64, error)
GetIssuer(ctx context.Context, id string) (*domain.Issuer, error)
CreateIssuer(ctx context.Context, issuer domain.Issuer) (*domain.Issuer, error)
UpdateIssuer(ctx context.Context, id string, issuer domain.Issuer) (*domain.Issuer, error)
DeleteIssuer(ctx context.Context, id string) error
TestConnection(ctx context.Context, id string) error
}
// IssuerHandler handles HTTP requests for issuer operations.
type IssuerHandler struct {
svc IssuerService
svc IssuerService
logger *slog.Logger
}
// NewIssuerHandler creates a new IssuerHandler with a service dependency.
func NewIssuerHandler(svc IssuerService) IssuerHandler {
return IssuerHandler{svc: svc}
return IssuerHandler{svc: svc, logger: slog.Default()}
}
// NewIssuerHandlerWithLogger creates a new IssuerHandler with a custom logger.
func NewIssuerHandlerWithLogger(svc IssuerService, logger *slog.Logger) IssuerHandler {
return IssuerHandler{svc: svc, logger: logger}
}
// ListIssuers lists all configured issuers.
@@ -54,7 +62,7 @@ func (h IssuerHandler) ListIssuers(w http.ResponseWriter, r *http.Request) {
}
}
issuers, total, err := h.svc.ListIssuers(page, perPage)
issuers, total, err := h.svc.ListIssuers(r.Context(), page, perPage)
if err != nil {
ErrorWithRequestID(w, http.StatusInternalServerError, "Failed to list issuers", requestID)
return
@@ -86,7 +94,7 @@ func (h IssuerHandler) GetIssuer(w http.ResponseWriter, r *http.Request) {
return
}
issuer, err := h.svc.GetIssuer(id)
issuer, err := h.svc.GetIssuer(r.Context(), id)
if err != nil {
ErrorWithRequestID(w, http.StatusNotFound, "Issuer not found", requestID)
return
@@ -125,9 +133,18 @@ func (h IssuerHandler) CreateIssuer(w http.ResponseWriter, r *http.Request) {
return
}
created, err := h.svc.CreateIssuer(issuer)
created, err := h.svc.CreateIssuer(r.Context(), issuer)
if err != nil {
ErrorWithRequestID(w, http.StatusInternalServerError, "Failed to create issuer", requestID)
h.logger.Error("failed to create issuer", "error", err, "name", issuer.Name, "type", issuer.Type)
errMsg := err.Error()
switch {
case strings.Contains(errMsg, "unique") || strings.Contains(errMsg, "duplicate"):
ErrorWithRequestID(w, http.StatusConflict, "An issuer with this name already exists", requestID)
case strings.Contains(errMsg, "unsupported issuer type"):
ErrorWithRequestID(w, http.StatusBadRequest, errMsg, requestID)
default:
ErrorWithRequestID(w, http.StatusInternalServerError, "Failed to create issuer", requestID)
}
return
}
@@ -158,9 +175,18 @@ func (h IssuerHandler) UpdateIssuer(w http.ResponseWriter, r *http.Request) {
return
}
updated, err := h.svc.UpdateIssuer(id, issuer)
updated, err := h.svc.UpdateIssuer(r.Context(), id, issuer)
if err != nil {
ErrorWithRequestID(w, http.StatusInternalServerError, "Failed to update issuer", requestID)
h.logger.Error("failed to update issuer", "error", err, "id", id)
errMsg := err.Error()
switch {
case strings.Contains(errMsg, "unique") || strings.Contains(errMsg, "duplicate"):
ErrorWithRequestID(w, http.StatusConflict, "An issuer with this name already exists", requestID)
case strings.Contains(errMsg, "not found"):
ErrorWithRequestID(w, http.StatusNotFound, "Issuer not found", requestID)
default:
ErrorWithRequestID(w, http.StatusInternalServerError, "Failed to update issuer", requestID)
}
return
}
@@ -183,7 +209,7 @@ func (h IssuerHandler) DeleteIssuer(w http.ResponseWriter, r *http.Request) {
return
}
if err := h.svc.DeleteIssuer(id); err != nil {
if err := h.svc.DeleteIssuer(r.Context(), id); err != nil {
if strings.Contains(err.Error(), "violates foreign key") || strings.Contains(err.Error(), "RESTRICT") {
ErrorWithRequestID(w, http.StatusConflict, "Cannot delete issuer: certificates are still using this issuer", requestID)
} else if strings.Contains(err.Error(), "not found") {
@@ -216,7 +242,7 @@ func (h IssuerHandler) TestConnection(w http.ResponseWriter, r *http.Request) {
}
issuerID := parts[0]
if err := h.svc.TestConnection(issuerID); err != nil {
if err := h.svc.TestConnection(r.Context(), issuerID); err != nil {
ErrorWithRequestID(w, http.StatusInternalServerError, "Connection test failed", requestID)
return
}
+65 -15
@@ -1,6 +1,7 @@
package handler
import (
"context"
"encoding/json"
"fmt"
"net/http"
@@ -10,48 +11,51 @@ import (
"time"
"github.com/shankar0123/certctl/internal/domain"
"github.com/shankar0123/certctl/internal/service"
)
// MockJobService is a mock implementation of JobService interface.
// Approve/Reject closures now take the actor string so tests can assert
// actor propagation from the auth middleware → handler → service.
type MockJobService struct {
ListJobsFn func(status, jobType string, page, perPage int) ([]domain.Job, int64, error)
GetJobFn func(id string) (*domain.Job, error)
CancelJobFn func(id string) error
ApproveJobFn func(id string) error
RejectJobFn func(id string, reason string) error
ApproveJobFn func(id, actor string) error
RejectJobFn func(id, reason, actor string) error
}
func (m *MockJobService) ListJobs(status, jobType string, page, perPage int) ([]domain.Job, int64, error) {
func (m *MockJobService) ListJobs(_ context.Context, status, jobType string, page, perPage int) ([]domain.Job, int64, error) {
if m.ListJobsFn != nil {
return m.ListJobsFn(status, jobType, page, perPage)
}
return nil, 0, nil
}
func (m *MockJobService) GetJob(id string) (*domain.Job, error) {
func (m *MockJobService) GetJob(_ context.Context, id string) (*domain.Job, error) {
if m.GetJobFn != nil {
return m.GetJobFn(id)
}
return nil, nil
}
func (m *MockJobService) CancelJob(id string) error {
func (m *MockJobService) CancelJob(_ context.Context, id string) error {
if m.CancelJobFn != nil {
return m.CancelJobFn(id)
}
return nil
}
func (m *MockJobService) ApproveJob(id string) error {
func (m *MockJobService) ApproveJob(_ context.Context, id, actor string) error {
if m.ApproveJobFn != nil {
return m.ApproveJobFn(id)
return m.ApproveJobFn(id, actor)
}
return nil
}
func (m *MockJobService) RejectJob(id string, reason string) error {
func (m *MockJobService) RejectJob(_ context.Context, id, reason, actor string) error {
if m.RejectJobFn != nil {
return m.RejectJobFn(id, reason)
return m.RejectJobFn(id, reason, actor)
}
return nil
}
@@ -347,7 +351,7 @@ func TestCancelJob_EmptyID(t *testing.T) {
func TestApproveJob_Success(t *testing.T) {
var approvedID string
mock := &MockJobService{
ApproveJobFn: func(id string) error {
ApproveJobFn: func(id, actor string) error {
approvedID = id
return nil
},
@@ -378,7 +382,7 @@ func TestApproveJob_Success(t *testing.T) {
func TestApproveJob_NotFound(t *testing.T) {
mock := &MockJobService{
ApproveJobFn: func(id string) error {
ApproveJobFn: func(id, actor string) error {
return fmt.Errorf("job not found: no rows")
},
}
@@ -397,7 +401,7 @@ func TestApproveJob_NotFound(t *testing.T) {
func TestApproveJob_BadStatus(t *testing.T) {
mock := &MockJobService{
ApproveJobFn: func(id string) error {
ApproveJobFn: func(id, actor string) error {
return fmt.Errorf("cannot approve job with status Running")
},
}
@@ -426,10 +430,56 @@ func TestApproveJob_MethodNotAllowed(t *testing.T) {
}
}
// TestApproveJob_SelfApproval_Returns403 verifies the M-003 separation-of-duties
// wire: when the service returns ErrSelfApproval the handler must surface HTTP
// 403 Forbidden (NOT 500). The error sentinel crosses the service boundary via
// errors.Is so the handler can pattern-match regardless of any fmt.Errorf
// wrapping that may be added later.
func TestApproveJob_SelfApproval_Returns403(t *testing.T) {
var capturedActor string
mock := &MockJobService{
ApproveJobFn: func(id, actor string) error {
capturedActor = actor
return service.ErrSelfApproval
},
}
h := NewJobHandler(mock)
req := httptest.NewRequest(http.MethodPost, "/api/v1/jobs/job-self/approve", nil)
req = req.WithContext(contextWithRequestID())
w := httptest.NewRecorder()
h.ApproveJob(w, req)
if w.Code != http.StatusForbidden {
t.Fatalf("expected status 403, got %d", w.Code)
}
var resp map[string]any
if err := json.NewDecoder(w.Body).Decode(&resp); err != nil {
t.Fatalf("failed to decode response: %v", err)
}
// Response body should name the self-approval condition explicitly so
// operators triaging a 403 can distinguish it from other forbid paths.
// The ErrorResponse envelope uses "error" for the status text and
// "message" for the human-readable explanation — we assert on message.
msg, _ := resp["message"].(string)
if !strings.Contains(strings.ToLower(msg), "self-approval") {
t.Errorf("expected message to mention self-approval, got %q", msg)
}
// The handler resolves the actor from the auth context; in this test the
// request has no auth context, so the propagated actor is the anonymous
// fallback ("" or "anonymous" depending on middleware wiring). We only
// assert the closure observed *some* actor string — the detailed actor
// threading is covered by resolveActor unit tests.
_ = capturedActor
}
func TestRejectJob_Success(t *testing.T) {
var rejectedID, capturedReason string
mock := &MockJobService{
RejectJobFn: func(id string, reason string) error {
RejectJobFn: func(id, reason, actor string) error {
rejectedID = id
capturedReason = reason
return nil
@@ -457,7 +507,7 @@ func TestRejectJob_Success(t *testing.T) {
func TestRejectJob_NoReason(t *testing.T) {
mock := &MockJobService{
RejectJobFn: func(id string, reason string) error {
RejectJobFn: func(id, reason, actor string) error {
return nil
},
}
@@ -476,7 +526,7 @@ func TestRejectJob_NoReason(t *testing.T) {
func TestRejectJob_NotFound(t *testing.T) {
mock := &MockJobService{
RejectJobFn: func(id string, reason string) error {
RejectJobFn: func(id, reason, actor string) error {
return fmt.Errorf("job not found: no rows")
},
}
+29 -10
@@ -1,7 +1,9 @@
package handler
import (
"context"
"encoding/json"
"errors"
"io"
"net/http"
"strconv"
@@ -9,15 +11,21 @@ import (
"github.com/shankar0123/certctl/internal/api/middleware"
"github.com/shankar0123/certctl/internal/domain"
"github.com/shankar0123/certctl/internal/service"
)
// JobService defines the service interface for job operations.
type JobService interface {
ListJobs(status, jobType string, page, perPage int) ([]domain.Job, int64, error)
GetJob(id string) (*domain.Job, error)
CancelJob(id string) error
ApproveJob(id string) error
RejectJob(id string, reason string) error
ListJobs(ctx context.Context, status, jobType string, page, perPage int) ([]domain.Job, int64, error)
GetJob(ctx context.Context, id string) (*domain.Job, error)
CancelJob(ctx context.Context, id string) error
// ApproveJob approves a renewal job. actor is the named-key identity
// resolved from the auth middleware; the service returns ErrSelfApproval
// (mapped to 403) when actor matches the certificate owner.
ApproveJob(ctx context.Context, id, actor string) error
// RejectJob rejects a renewal job. actor is the named-key identity
// recorded for audit attribution; no not-self restriction.
RejectJob(ctx context.Context, id, reason, actor string) error
}
// JobHandler handles HTTP requests for job operations.
@@ -57,7 +65,7 @@ func (h JobHandler) ListJobs(w http.ResponseWriter, r *http.Request) {
}
}
jobs, total, err := h.svc.ListJobs(status, jobType, page, perPage)
jobs, total, err := h.svc.ListJobs(r.Context(), status, jobType, page, perPage)
if err != nil {
ErrorWithRequestID(w, http.StatusInternalServerError, "Failed to list jobs", requestID)
return
@@ -91,7 +99,7 @@ func (h JobHandler) GetJob(w http.ResponseWriter, r *http.Request) {
}
id = parts[0]
job, err := h.svc.GetJob(id)
job, err := h.svc.GetJob(r.Context(), id)
if err != nil {
ErrorWithRequestID(w, http.StatusNotFound, "Job not found", requestID)
return
@@ -119,7 +127,7 @@ func (h JobHandler) CancelJob(w http.ResponseWriter, r *http.Request) {
}
jobID := parts[0]
if err := h.svc.CancelJob(jobID); err != nil {
if err := h.svc.CancelJob(r.Context(), jobID); err != nil {
ErrorWithRequestID(w, http.StatusInternalServerError, "Failed to cancel job", requestID)
return
}
@@ -149,7 +157,16 @@ func (h JobHandler) ApproveJob(w http.ResponseWriter, r *http.Request) {
}
jobID := parts[0]
if err := h.svc.ApproveJob(jobID); err != nil {
actor := resolveActor(r.Context())
if err := h.svc.ApproveJob(r.Context(), jobID, actor); err != nil {
// M-003: self-approval by the certificate owner is forbidden.
if errors.Is(err, service.ErrSelfApproval) {
ErrorWithRequestID(w, http.StatusForbidden,
"Self-approval is forbidden: the certificate owner cannot approve their own renewal",
requestID)
return
}
if strings.Contains(err.Error(), "not found") {
ErrorWithRequestID(w, http.StatusNotFound, "Job not found", requestID)
return
@@ -193,7 +210,9 @@ func (h JobHandler) RejectJob(w http.ResponseWriter, r *http.Request) {
}
}
if err := h.svc.RejectJob(jobID, body.Reason); err != nil {
actor := resolveActor(r.Context())
if err := h.svc.RejectJob(r.Context(), jobID, body.Reason, actor); err != nil {
if strings.Contains(err.Error(), "not found") {
ErrorWithRequestID(w, http.StatusNotFound, "Job not found", requestID)
return
@@ -1,6 +1,7 @@
package handler
import (
"context"
"encoding/json"
"net/http"
"net/http/httptest"
@@ -17,21 +18,21 @@ type MockNotificationService struct {
MarkAsReadFn func(id string) error
}
func (m *MockNotificationService) ListNotifications(page, perPage int) ([]domain.NotificationEvent, int64, error) {
func (m *MockNotificationService) ListNotifications(_ context.Context, page, perPage int) ([]domain.NotificationEvent, int64, error) {
if m.ListNotificationsFn != nil {
return m.ListNotificationsFn(page, perPage)
}
return nil, 0, nil
}
func (m *MockNotificationService) GetNotification(id string) (*domain.NotificationEvent, error) {
func (m *MockNotificationService) GetNotification(_ context.Context, id string) (*domain.NotificationEvent, error) {
if m.GetNotificationFn != nil {
return m.GetNotificationFn(id)
}
return nil, nil
}
func (m *MockNotificationService) MarkAsRead(id string) error {
func (m *MockNotificationService) MarkAsRead(_ context.Context, id string) error {
if m.MarkAsReadFn != nil {
return m.MarkAsReadFn(id)
}
+7 -6
@@ -1,6 +1,7 @@
package handler
import (
"context"
"net/http"
"strconv"
"strings"
@@ -11,9 +12,9 @@ import (
// NotificationService defines the service interface for notification operations.
type NotificationService interface {
ListNotifications(page, perPage int) ([]domain.NotificationEvent, int64, error)
GetNotification(id string) (*domain.NotificationEvent, error)
MarkAsRead(id string) error
ListNotifications(ctx context.Context, page, perPage int) ([]domain.NotificationEvent, int64, error)
GetNotification(ctx context.Context, id string) (*domain.NotificationEvent, error)
MarkAsRead(ctx context.Context, id string) error
}
// NotificationHandler handles HTTP requests for notification operations.
@@ -50,7 +51,7 @@ func (h NotificationHandler) ListNotifications(w http.ResponseWriter, r *http.Re
}
}
notifications, total, err := h.svc.ListNotifications(page, perPage)
notifications, total, err := h.svc.ListNotifications(r.Context(), page, perPage)
if err != nil {
ErrorWithRequestID(w, http.StatusInternalServerError, "Failed to list notifications", requestID)
return
@@ -84,7 +85,7 @@ func (h NotificationHandler) GetNotification(w http.ResponseWriter, r *http.Requ
}
id = parts[0]
notification, err := h.svc.GetNotification(id)
notification, err := h.svc.GetNotification(r.Context(), id)
if err != nil {
ErrorWithRequestID(w, http.StatusNotFound, "Notification not found", requestID)
return
@@ -112,7 +113,7 @@ func (h NotificationHandler) MarkAsRead(w http.ResponseWriter, r *http.Request)
}
notificationID := parts[0]
if err := h.svc.MarkAsRead(notificationID); err != nil {
if err := h.svc.MarkAsRead(r.Context(), notificationID); err != nil {
ErrorWithRequestID(w, http.StatusInternalServerError, "Failed to mark notification as read", requestID)
return
}
+6 -5
@@ -2,6 +2,7 @@ package handler
import (
"bytes"
"context"
"encoding/json"
"net/http"
"net/http/httptest"
@@ -20,35 +21,35 @@ type MockOwnerService struct {
DeleteOwnerFn func(id string) error
}
func (m *MockOwnerService) ListOwners(page, perPage int) ([]domain.Owner, int64, error) {
func (m *MockOwnerService) ListOwners(_ context.Context, page, perPage int) ([]domain.Owner, int64, error) {
if m.ListOwnersFn != nil {
return m.ListOwnersFn(page, perPage)
}
return nil, 0, nil
}
func (m *MockOwnerService) GetOwner(id string) (*domain.Owner, error) {
func (m *MockOwnerService) GetOwner(_ context.Context, id string) (*domain.Owner, error) {
if m.GetOwnerFn != nil {
return m.GetOwnerFn(id)
}
return nil, nil
}
func (m *MockOwnerService) CreateOwner(owner domain.Owner) (*domain.Owner, error) {
func (m *MockOwnerService) CreateOwner(_ context.Context, owner domain.Owner) (*domain.Owner, error) {
if m.CreateOwnerFn != nil {
return m.CreateOwnerFn(owner)
}
return nil, nil
}
func (m *MockOwnerService) UpdateOwner(id string, owner domain.Owner) (*domain.Owner, error) {
func (m *MockOwnerService) UpdateOwner(_ context.Context, id string, owner domain.Owner) (*domain.Owner, error) {
if m.UpdateOwnerFn != nil {
return m.UpdateOwnerFn(id, owner)
}
return nil, nil
}
func (m *MockOwnerService) DeleteOwner(id string) error {
func (m *MockOwnerService) DeleteOwner(_ context.Context, id string) error {
if m.DeleteOwnerFn != nil {
return m.DeleteOwnerFn(id)
}
+11 -10
@@ -1,6 +1,7 @@
package handler
import (
"context"
"encoding/json"
"net/http"
"strconv"
@@ -12,11 +13,11 @@ import (
// OwnerService defines the service interface for owner operations.
type OwnerService interface {
ListOwners(page, perPage int) ([]domain.Owner, int64, error)
GetOwner(id string) (*domain.Owner, error)
CreateOwner(owner domain.Owner) (*domain.Owner, error)
UpdateOwner(id string, owner domain.Owner) (*domain.Owner, error)
DeleteOwner(id string) error
ListOwners(ctx context.Context, page, perPage int) ([]domain.Owner, int64, error)
GetOwner(ctx context.Context, id string) (*domain.Owner, error)
CreateOwner(ctx context.Context, owner domain.Owner) (*domain.Owner, error)
UpdateOwner(ctx context.Context, id string, owner domain.Owner) (*domain.Owner, error)
DeleteOwner(ctx context.Context, id string) error
}
// OwnerHandler handles HTTP requests for owner operations.
@@ -53,7 +54,7 @@ func (h OwnerHandler) ListOwners(w http.ResponseWriter, r *http.Request) {
}
}
owners, total, err := h.svc.ListOwners(page, perPage)
owners, total, err := h.svc.ListOwners(r.Context(), page, perPage)
if err != nil {
ErrorWithRequestID(w, http.StatusInternalServerError, "Failed to list owners", requestID)
return
@@ -87,7 +88,7 @@ func (h OwnerHandler) GetOwner(w http.ResponseWriter, r *http.Request) {
}
id = parts[0]
owner, err := h.svc.GetOwner(id)
owner, err := h.svc.GetOwner(r.Context(), id)
if err != nil {
ErrorWithRequestID(w, http.StatusNotFound, "Owner not found", requestID)
return
@@ -122,7 +123,7 @@ func (h OwnerHandler) CreateOwner(w http.ResponseWriter, r *http.Request) {
return
}
created, err := h.svc.CreateOwner(owner)
created, err := h.svc.CreateOwner(r.Context(), owner)
if err != nil {
ErrorWithRequestID(w, http.StatusInternalServerError, "Failed to create owner", requestID)
return
@@ -155,7 +156,7 @@ func (h OwnerHandler) UpdateOwner(w http.ResponseWriter, r *http.Request) {
return
}
updated, err := h.svc.UpdateOwner(id, owner)
updated, err := h.svc.UpdateOwner(r.Context(), id, owner)
if err != nil {
ErrorWithRequestID(w, http.StatusInternalServerError, "Failed to update owner", requestID)
return
@@ -182,7 +183,7 @@ func (h OwnerHandler) DeleteOwner(w http.ResponseWriter, r *http.Request) {
}
id = parts[0]
if err := h.svc.DeleteOwner(id); err != nil {
if err := h.svc.DeleteOwner(r.Context(), id); err != nil {
if strings.Contains(err.Error(), "violates foreign key") || strings.Contains(err.Error(), "RESTRICT") {
ErrorWithRequestID(w, http.StatusConflict, "Cannot delete owner: certificates are still assigned to this owner", requestID)
} else if strings.Contains(err.Error(), "not found") {
+30 -12
@@ -1,6 +1,7 @@
package handler
import (
"context"
"encoding/json"
"net/http"
"strconv"
@@ -12,12 +13,12 @@ import (
// PolicyService defines the service interface for policy rule operations.
type PolicyService interface {
ListPolicies(page, perPage int) ([]domain.PolicyRule, int64, error)
GetPolicy(id string) (*domain.PolicyRule, error)
CreatePolicy(policy domain.PolicyRule) (*domain.PolicyRule, error)
UpdatePolicy(id string, policy domain.PolicyRule) (*domain.PolicyRule, error)
DeletePolicy(id string) error
ListViolations(policyID string, page, perPage int) ([]domain.PolicyViolation, int64, error)
ListPolicies(ctx context.Context, page, perPage int) ([]domain.PolicyRule, int64, error)
GetPolicy(ctx context.Context, id string) (*domain.PolicyRule, error)
CreatePolicy(ctx context.Context, policy domain.PolicyRule) (*domain.PolicyRule, error)
UpdatePolicy(ctx context.Context, id string, policy domain.PolicyRule) (*domain.PolicyRule, error)
DeletePolicy(ctx context.Context, id string) error
ListViolations(ctx context.Context, policyID string, page, perPage int) ([]domain.PolicyViolation, int64, error)
}
// PolicyHandler handles HTTP requests for policy rule operations.
@@ -54,7 +55,7 @@ func (h PolicyHandler) ListPolicies(w http.ResponseWriter, r *http.Request) {
}
}
policies, total, err := h.svc.ListPolicies(page, perPage)
policies, total, err := h.svc.ListPolicies(r.Context(), page, perPage)
if err != nil {
ErrorWithRequestID(w, http.StatusInternalServerError, "Failed to list policies", requestID)
return
@@ -88,7 +89,7 @@ func (h PolicyHandler) GetPolicy(w http.ResponseWriter, r *http.Request) {
}
id = parts[0]
policy, err := h.svc.GetPolicy(id)
policy, err := h.svc.GetPolicy(r.Context(), id)
if err != nil {
ErrorWithRequestID(w, http.StatusNotFound, "Policy not found", requestID)
return
@@ -126,8 +127,19 @@ func (h PolicyHandler) CreatePolicy(w http.ResponseWriter, r *http.Request) {
ErrorWithRequestID(w, http.StatusBadRequest, err.Error(), requestID)
return
}
// Severity is optional on create; default matches the DB default.
// Any explicit value must pass the TitleCase allowlist; the DB CHECK
// constraint enforces the same set, but catching it here gives a 400
// with a clear message instead of a 500 on constraint violation.
if policy.Severity == "" {
policy.Severity = domain.PolicySeverityWarning
}
if err := ValidatePolicySeverity(policy.Severity); err != nil {
ErrorWithRequestID(w, http.StatusBadRequest, err.Error(), requestID)
return
}
created, err := h.svc.CreatePolicy(policy)
created, err := h.svc.CreatePolicy(r.Context(), policy)
if err != nil {
ErrorWithRequestID(w, http.StatusInternalServerError, "Failed to create policy", requestID)
return
@@ -173,8 +185,14 @@ func (h PolicyHandler) UpdatePolicy(w http.ResponseWriter, r *http.Request) {
return
}
}
if policy.Severity != "" {
if err := ValidatePolicySeverity(policy.Severity); err != nil {
ErrorWithRequestID(w, http.StatusBadRequest, err.Error(), requestID)
return
}
}
updated, err := h.svc.UpdatePolicy(id, policy)
updated, err := h.svc.UpdatePolicy(r.Context(), id, policy)
if err != nil {
ErrorWithRequestID(w, http.StatusInternalServerError, "Failed to update policy", requestID)
return
@@ -201,7 +219,7 @@ func (h PolicyHandler) DeletePolicy(w http.ResponseWriter, r *http.Request) {
}
id = parts[0]
if err := h.svc.DeletePolicy(id); err != nil {
if err := h.svc.DeletePolicy(r.Context(), id); err != nil {
ErrorWithRequestID(w, http.StatusInternalServerError, "Failed to delete policy", requestID)
return
}
@@ -242,7 +260,7 @@ func (h PolicyHandler) ListViolations(w http.ResponseWriter, r *http.Request) {
}
}
violations, total, err := h.svc.ListViolations(policyID, page, perPage)
violations, total, err := h.svc.ListViolations(r.Context(), policyID, page, perPage)
if err != nil {
ErrorWithRequestID(w, http.StatusInternalServerError, "Failed to list violations", requestID)
return
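The CreatePolicy and UpdatePolicy hunks above default an empty severity on create and validate any explicit value against an allowlist before calling the service. A rough sketch of that flow follows. The allowlist values, the `validateSeverity` and `normalizeOnCreate` helpers, and the `"Warning"` default string are all assumptions made for illustration; the diff shows only `domain.PolicySeverityWarning` and `ValidatePolicySeverity`, not their definitions.

```go
package main

import "fmt"

// defaultSeverity is assumed to correspond to domain.PolicySeverityWarning.
const defaultSeverity = "Warning"

// allowedSeverities is a hypothetical TitleCase allowlist; the real set
// enforced by ValidatePolicySeverity is not visible in this diff.
var allowedSeverities = map[string]bool{
	"Warning":  true,
	"Error":    true,
	"Critical": true,
}

// validateSeverity is a hypothetical stand-in for ValidatePolicySeverity.
func validateSeverity(s string) error {
	if !allowedSeverities[s] {
		return fmt.Errorf("invalid severity %q", s)
	}
	return nil
}

// normalizeOnCreate applies the create-path rule: an empty value gets
// the default, and any explicit value must pass the allowlist.
func normalizeOnCreate(s string) (string, error) {
	if s == "" {
		s = defaultSeverity
	}
	if err := validateSeverity(s); err != nil {
		return "", err
	}
	return s, nil
}

func main() {
	for _, in := range []string{"", "Critical", "critical"} {
		out, err := normalizeOnCreate(in)
		fmt.Printf("%q -> %q, err=%v\n", in, out, err)
	}
}
```

On update, the diff skips validation when the field is empty rather than defaulting it, which leaves the stored severity untouched for partial updates.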
+7 -6
@@ -2,6 +2,7 @@ package handler
import (
"bytes"
"context"
"encoding/json"
"net/http"
"net/http/httptest"
@@ -21,42 +22,42 @@ type MockPolicyService struct {
ListViolationsFn func(policyID string, page, perPage int) ([]domain.PolicyViolation, int64, error)
}
func (m *MockPolicyService) ListPolicies(page, perPage int) ([]domain.PolicyRule, int64, error) {
func (m *MockPolicyService) ListPolicies(_ context.Context, page, perPage int) ([]domain.PolicyRule, int64, error) {
if m.ListPoliciesFn != nil {
return m.ListPoliciesFn(page, perPage)
}
return nil, 0, nil
}
func (m *MockPolicyService) GetPolicy(id string) (*domain.PolicyRule, error) {
func (m *MockPolicyService) GetPolicy(_ context.Context, id string) (*domain.PolicyRule, error) {
if m.GetPolicyFn != nil {
return m.GetPolicyFn(id)
}
return nil, nil
}
func (m *MockPolicyService) CreatePolicy(policy domain.PolicyRule) (*domain.PolicyRule, error) {
func (m *MockPolicyService) CreatePolicy(_ context.Context, policy domain.PolicyRule) (*domain.PolicyRule, error) {
if m.CreatePolicyFn != nil {
return m.CreatePolicyFn(policy)
}
return nil, nil
}
func (m *MockPolicyService) UpdatePolicy(id string, policy domain.PolicyRule) (*domain.PolicyRule, error) {
func (m *MockPolicyService) UpdatePolicy(_ context.Context, id string, policy domain.PolicyRule) (*domain.PolicyRule, error) {
if m.UpdatePolicyFn != nil {
return m.UpdatePolicyFn(id, policy)
}
return nil, nil
}
func (m *MockPolicyService) DeletePolicy(id string) error {
func (m *MockPolicyService) DeletePolicy(_ context.Context, id string) error {
if m.DeletePolicyFn != nil {
return m.DeletePolicyFn(id)
}
return nil
}
func (m *MockPolicyService) ListViolations(policyID string, page, perPage int) ([]domain.PolicyViolation, int64, error) {
func (m *MockPolicyService) ListViolations(_ context.Context, policyID string, page, perPage int) ([]domain.PolicyViolation, int64, error) {
if m.ListViolationsFn != nil {
return m.ListViolationsFn(policyID, page, perPage)
}
+6 -5
@@ -2,6 +2,7 @@ package handler
import (
"bytes"
"context"
"encoding/json"
"net/http"
"net/http/httptest"
@@ -20,35 +21,35 @@ type MockProfileService struct {
DeleteProfileFn func(id string) error
}
func (m *MockProfileService) ListProfiles(page, perPage int) ([]domain.CertificateProfile, int64, error) {
func (m *MockProfileService) ListProfiles(_ context.Context, page, perPage int) ([]domain.CertificateProfile, int64, error) {
if m.ListProfilesFn != nil {
return m.ListProfilesFn(page, perPage)
}
return nil, 0, nil
}
func (m *MockProfileService) GetProfile(id string) (*domain.CertificateProfile, error) {
func (m *MockProfileService) GetProfile(_ context.Context, id string) (*domain.CertificateProfile, error) {
if m.GetProfileFn != nil {
return m.GetProfileFn(id)
}
return nil, nil
}
func (m *MockProfileService) CreateProfile(profile domain.CertificateProfile) (*domain.CertificateProfile, error) {
func (m *MockProfileService) CreateProfile(_ context.Context, profile domain.CertificateProfile) (*domain.CertificateProfile, error) {
if m.CreateProfileFn != nil {
return m.CreateProfileFn(profile)
}
return nil, nil
}
func (m *MockProfileService) UpdateProfile(id string, profile domain.CertificateProfile) (*domain.CertificateProfile, error) {
func (m *MockProfileService) UpdateProfile(_ context.Context, id string, profile domain.CertificateProfile) (*domain.CertificateProfile, error) {
if m.UpdateProfileFn != nil {
return m.UpdateProfileFn(id, profile)
}
return nil, nil
}
func (m *MockProfileService) DeleteProfile(id string) error {
func (m *MockProfileService) DeleteProfile(_ context.Context, id string) error {
if m.DeleteProfileFn != nil {
return m.DeleteProfileFn(id)
}
+11 -10
@@ -1,6 +1,7 @@
package handler
import (
"context"
"encoding/json"
"net/http"
"strconv"
@@ -12,11 +13,11 @@ import (
// ProfileService defines the service interface for certificate profile operations.
type ProfileService interface {
ListProfiles(page, perPage int) ([]domain.CertificateProfile, int64, error)
GetProfile(id string) (*domain.CertificateProfile, error)
CreateProfile(profile domain.CertificateProfile) (*domain.CertificateProfile, error)
UpdateProfile(id string, profile domain.CertificateProfile) (*domain.CertificateProfile, error)
DeleteProfile(id string) error
ListProfiles(ctx context.Context, page, perPage int) ([]domain.CertificateProfile, int64, error)
GetProfile(ctx context.Context, id string) (*domain.CertificateProfile, error)
CreateProfile(ctx context.Context, profile domain.CertificateProfile) (*domain.CertificateProfile, error)
UpdateProfile(ctx context.Context, id string, profile domain.CertificateProfile) (*domain.CertificateProfile, error)
DeleteProfile(ctx context.Context, id string) error
}
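The point of threading ctx context.Context through every ProfileService method is that a service can stop early when the request is cancelled. A minimal sketch of that idea follows; profileStore, its in-memory ids slice, and the lookup helper are hypothetical names for illustration and are not part of this change.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// profileStore is a hypothetical in-memory stand-in for a ProfileService
// implementation; the real service is not shown in this diff.
type profileStore struct{ ids []string }

// GetProfile checks ctx before doing any work, so a cancelled request
// returns immediately with the context's error.
func (s *profileStore) GetProfile(ctx context.Context, id string) (string, error) {
	if err := ctx.Err(); err != nil {
		return "", err
	}
	for _, known := range s.ids {
		if known == id {
			return known, nil
		}
	}
	return "", errors.New("profile not found")
}

// lookup calls GetProfile with either a live or an already-cancelled context.
func lookup(cancelled bool) error {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	if cancelled {
		cancel()
	}
	_, err := (&profileStore{ids: []string{"prof-1"}}).GetProfile(ctx, "prof-1")
	return err
}

func main() {
	fmt.Println(lookup(false)) // <nil>
	fmt.Println(lookup(true))  // context canceled
}
```

In a handler, the context passed in would be r.Context(), which net/http cancels when the client disconnects.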
// ProfileHandler handles HTTP requests for certificate profile operations.
@@ -53,7 +54,7 @@ func (h ProfileHandler) ListProfiles(w http.ResponseWriter, r *http.Request) {
}
}
profiles, total, err := h.svc.ListProfiles(page, perPage)
profiles, total, err := h.svc.ListProfiles(r.Context(), page, perPage)
if err != nil {
ErrorWithRequestID(w, http.StatusInternalServerError, "Failed to list profiles", requestID)
return
@@ -85,7 +86,7 @@ func (h ProfileHandler) GetProfile(w http.ResponseWriter, r *http.Request) {
return
}
profile, err := h.svc.GetProfile(id)
profile, err := h.svc.GetProfile(r.Context(), id)
if err != nil {
ErrorWithRequestID(w, http.StatusNotFound, "Profile not found", requestID)
return
@@ -120,7 +121,7 @@ func (h ProfileHandler) CreateProfile(w http.ResponseWriter, r *http.Request) {
return
}
created, err := h.svc.CreateProfile(profile)
created, err := h.svc.CreateProfile(r.Context(), profile)
if err != nil {
// Check if it's a validation error from the service
if strings.Contains(err.Error(), "invalid") || strings.Contains(err.Error(), "required") ||
@@ -159,7 +160,7 @@ func (h ProfileHandler) UpdateProfile(w http.ResponseWriter, r *http.Request) {
return
}
updated, err := h.svc.UpdateProfile(id, profile)
updated, err := h.svc.UpdateProfile(r.Context(), id, profile)
if err != nil {
if strings.Contains(err.Error(), "not found") {
ErrorWithRequestID(w, http.StatusNotFound, "Profile not found", requestID)
@@ -193,7 +194,7 @@ func (h ProfileHandler) DeleteProfile(w http.ResponseWriter, r *http.Request) {
return
}
if err := h.svc.DeleteProfile(id); err != nil {
if err := h.svc.DeleteProfile(r.Context(), id); err != nil {
if strings.Contains(err.Error(), "not found") {
ErrorWithRequestID(w, http.StatusNotFound, "Profile not found", requestID)
return
+20
@@ -1,14 +1,34 @@
package handler
import (
"context"
"encoding/base64"
"encoding/json"
"fmt"
"net/http"
"strings"
"time"
"github.com/shankar0123/certctl/internal/api/middleware"
)
// resolveActor extracts the authenticated named-key identity from the request
// context for audit-trail attribution. Returns the named-key name when set by
// the auth middleware, or "api" as a safe sentinel when the auth middleware
// did not populate the context (e.g., AUTH_TYPE=none, or internal/system calls
// that bypass auth).
//
// Post-M-002: this is the single source of truth for handler-layer actor
// resolution. Handlers must NOT hardcode string literals like "api-key-user"
// or "api" — always go through this helper so the named-key identity flows to
// services and the audit trail.
func resolveActor(ctx context.Context) string {
if user := middleware.GetUser(ctx); user != "" {
return user
}
return "api"
}
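The fallback behavior of resolveActor can be sketched in isolation. The example below assumes a context key and two helpers (userKey, withUser, getUser) that stand in for the auth middleware; those names are hypothetical, and the real middleware.GetUser implementation is not shown in this diff.

```go
package main

import (
	"context"
	"fmt"
)

// userKey is a hypothetical context key; the real middleware defines its own.
type userKey struct{}

// withUser stands in for what the auth middleware would do after
// authenticating a named key (assumption for illustration).
func withUser(ctx context.Context, name string) context.Context {
	return context.WithValue(ctx, userKey{}, name)
}

// getUser stands in for middleware.GetUser: it returns "" when no identity is set.
func getUser(ctx context.Context) string {
	name, _ := ctx.Value(userKey{}).(string)
	return name
}

// resolveActor follows the same shape as the helper in the diff:
// named-key identity when present, the "api" sentinel otherwise.
func resolveActor(ctx context.Context) string {
	if user := getUser(ctx); user != "" {
		return user
	}
	return "api"
}

// demoActors resolves the actor for an unauthenticated and an authenticated context.
func demoActors() (anon, named string) {
	base := context.Background()
	return resolveActor(base), resolveActor(withUser(base, "ci-deploy"))
}

func main() {
	anon, named := demoActors()
	fmt.Println(anon)  // api
	fmt.Println(named) // ci-deploy
}
```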
// PagedResponse represents a paginated API response.
type PagedResponse struct {
Data interface{} `json:"data"`
+427
@@ -0,0 +1,427 @@
package handler
import (
"bytes"
"encoding/base64"
"encoding/json"
"fmt"
"net/http"
"net/http/httptest"
"strings"
"testing"
"time"
)
func TestEncodeCursor_ProducesValidBase64(t *testing.T) {
// Test that encodeCursor produces valid base64 with correct format
originalTime := time.Date(2024, 3, 15, 10, 30, 45, 123456789, time.UTC)
originalID := "cert-12345"
// Encode
encoded := encodeCursor(originalTime, originalID)
// Verify it's valid base64
decoded, err := base64.URLEncoding.DecodeString(encoded)
if err != nil {
t.Fatalf("encoded cursor is not valid base64: %v", err)
}
// Verify contains both timestamp and ID
decodedStr := string(decoded)
if !strings.Contains(decodedStr, originalID) {
t.Errorf("decoded cursor doesn't contain ID %q, got %q", originalID, decodedStr)
}
// Verify it's not empty and has expected structure (timestamp:id)
if !strings.Contains(decodedStr, ":") {
t.Errorf("decoded cursor doesn't contain colon separator, got %q", decodedStr)
}
}
func TestEncodeCursor_DifferentTimes(t *testing.T) {
id := "test-id"
time1 := time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC)
time2 := time.Date(2024, 1, 2, 0, 0, 0, 0, time.UTC)
cursor1 := encodeCursor(time1, id)
cursor2 := encodeCursor(time2, id)
// Different times should produce different cursors
if cursor1 == cursor2 {
t.Error("Different times produced identical cursors")
}
}
func TestEncodeCursor_DifferentIDs(t *testing.T) {
now := time.Now()
id1 := "cert-1"
id2 := "cert-2"
cursor1 := encodeCursor(now, id1)
cursor2 := encodeCursor(now, id2)
// Different IDs should produce different cursors
if cursor1 == cursor2 {
t.Error("Different IDs produced identical cursors")
}
}
func TestDecodeCursor_InvalidBase64(t *testing.T) {
// Local copy of the decodeCursor closure, intended to match the handler's behavior.
// It splits on the first ":", and an RFC3339Nano timestamp itself contains colons,
// so only the error paths exercised below are meaningful for this copy.
decodeCursor := func(cursor string) (time.Time, string, error) {
raw, err := base64.URLEncoding.DecodeString(cursor)
if err != nil {
return time.Time{}, "", err
}
parts := strings.SplitN(string(raw), ":", 2)
if len(parts) != 2 {
return time.Time{}, "", fmt.Errorf("invalid cursor format")
}
// Named ts rather than t to avoid shadowing the *testing.T parameter.
ts, err := time.Parse(time.RFC3339Nano, parts[0])
if err != nil {
return time.Time{}, "", err
}
return ts, parts[1], nil
}
tests := []struct {
name string
cursor string
expectError bool
}{
{"invalid base64", "!!!invalid!!!", true},
{"empty string", "", true},
{"no colon separator", base64.URLEncoding.EncodeToString([]byte("no-separator-here")), true},
{"invalid timestamp", base64.URLEncoding.EncodeToString([]byte("not-a-timestamp:id-123")), true},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
_, _, err := decodeCursor(tt.cursor)
if tt.expectError && err == nil {
t.Error("expected error for invalid cursor, got nil")
}
if !tt.expectError && err != nil {
t.Errorf("expected no error, got %v", err)
}
})
}
}
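A round trip through a cursor of the form timestamp:id needs care because RFC3339Nano timestamps contain colons. The sketch below assumes that layout and assumes IDs never contain a colon, so it splits on the last separator. encodeCursorSketch, decodeCursorSketch, and roundTrip are hypothetical names; the handler's real encodeCursor is not shown in this diff and may use a different layout.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
	"strings"
	"time"
)

// encodeCursorSketch packs "<RFC3339Nano>:<id>" into URL-safe base64.
// The layout is an assumption, not the handler's confirmed format.
func encodeCursorSketch(ts time.Time, id string) string {
	raw := ts.UTC().Format(time.RFC3339Nano) + ":" + id
	return base64.URLEncoding.EncodeToString([]byte(raw))
}

// decodeCursorSketch reverses encodeCursorSketch. It splits on the LAST colon
// because the timestamp itself contains colons; IDs are assumed colon-free.
func decodeCursorSketch(cursor string) (time.Time, string, error) {
	raw, err := base64.URLEncoding.DecodeString(cursor)
	if err != nil {
		return time.Time{}, "", err
	}
	s := string(raw)
	i := strings.LastIndex(s, ":")
	if i < 0 {
		return time.Time{}, "", errors.New("invalid cursor format")
	}
	ts, err := time.Parse(time.RFC3339Nano, s[:i])
	if err != nil {
		return time.Time{}, "", err
	}
	return ts, s[i+1:], nil
}

// roundTrip encodes a fixed timestamp with id, decodes it again, and
// returns the decoded id if the timestamp survived unchanged.
func roundTrip(id string) (string, error) {
	want := time.Date(2024, 3, 15, 10, 30, 45, 123456789, time.UTC)
	got, gotID, err := decodeCursorSketch(encodeCursorSketch(want, id))
	if err != nil {
		return "", err
	}
	if !got.Equal(want) {
		return "", errors.New("timestamp changed in round trip")
	}
	return gotID, nil
}

func main() {
	id, err := roundTrip("cert-12345")
	fmt.Println(id, err)
}
```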
func TestJSON_SetsContentType(t *testing.T) {
w := httptest.NewRecorder()
data := map[string]string{"key": "value"}
JSON(w, http.StatusOK, data)
contentType := w.Header().Get("Content-Type")
if contentType != "application/json" {
t.Errorf("Content-Type = %q, want application/json", contentType)
}
}
func TestJSON_SetsStatusCode(t *testing.T) {
w := httptest.NewRecorder()
data := map[string]string{"key": "value"}
JSON(w, http.StatusCreated, data)
if w.Code != http.StatusCreated {
t.Errorf("Status code = %d, want %d", w.Code, http.StatusCreated)
}
}
func TestJSON_EncodesData(t *testing.T) {
w := httptest.NewRecorder()
data := map[string]interface{}{
"string": "value",
"number": 42,
"bool": true,
"null": nil,
}
JSON(w, http.StatusOK, data)
var result map[string]interface{}
if err := json.NewDecoder(w.Body).Decode(&result); err != nil {
t.Fatalf("failed to decode response: %v", err)
}
if result["string"] != "value" {
t.Errorf("string = %v, want value", result["string"])
}
if result["number"] != float64(42) {
t.Errorf("number = %v, want 42", result["number"])
}
if result["bool"] != true {
t.Errorf("bool = %v, want true", result["bool"])
}
if result["null"] != nil {
t.Errorf("null = %v, want nil", result["null"])
}
}
func TestError_SetsStatusCode(t *testing.T) {
w := httptest.NewRecorder()
Error(w, http.StatusBadRequest, "Invalid input")
if w.Code != http.StatusBadRequest {
t.Errorf("Status code = %d, want %d", w.Code, http.StatusBadRequest)
}
}
func TestError_SetsContentType(t *testing.T) {
w := httptest.NewRecorder()
Error(w, http.StatusBadRequest, "Invalid input")
contentType := w.Header().Get("Content-Type")
if contentType != "application/json" {
t.Errorf("Content-Type = %q, want application/json", contentType)
}
}
func TestError_IncludesMessage(t *testing.T) {
w := httptest.NewRecorder()
message := "Something went wrong"
Error(w, http.StatusInternalServerError, message)
var errResp ErrorResponse
if err := json.NewDecoder(w.Body).Decode(&errResp); err != nil {
t.Fatalf("failed to decode error response: %v", err)
}
if errResp.Message != message {
t.Errorf("Message = %q, want %q", errResp.Message, message)
}
}
func TestError_IncludesStatusText(t *testing.T) {
w := httptest.NewRecorder()
Error(w, http.StatusNotFound, "Resource not found")
var errResp ErrorResponse
if err := json.NewDecoder(w.Body).Decode(&errResp); err != nil {
t.Fatalf("failed to decode error response: %v", err)
}
if errResp.Error != http.StatusText(http.StatusNotFound) {
t.Errorf("Error = %q, want %q", errResp.Error, http.StatusText(http.StatusNotFound))
}
}
func TestErrorWithRequestID_SetsStatusCode(t *testing.T) {
w := httptest.NewRecorder()
ErrorWithRequestID(w, http.StatusBadRequest, "Invalid input", "req-123")
if w.Code != http.StatusBadRequest {
t.Errorf("Status code = %d, want %d", w.Code, http.StatusBadRequest)
}
}
func TestErrorWithRequestID_IncludesRequestID(t *testing.T) {
w := httptest.NewRecorder()
requestID := "req-abc-def-ghi"
ErrorWithRequestID(w, http.StatusInternalServerError, "Server error", requestID)
var errResp ErrorResponse
if err := json.NewDecoder(w.Body).Decode(&errResp); err != nil {
t.Fatalf("failed to decode error response: %v", err)
}
if errResp.RequestID != requestID {
t.Errorf("RequestID = %q, want %q", errResp.RequestID, requestID)
}
}
func TestErrorWithRequestID_IncludesMessage(t *testing.T) {
w := httptest.NewRecorder()
message := "Database connection failed"
ErrorWithRequestID(w, http.StatusServiceUnavailable, message, "req-123")
var errResp ErrorResponse
if err := json.NewDecoder(w.Body).Decode(&errResp); err != nil {
t.Fatalf("failed to decode error response: %v", err)
}
if errResp.Message != message {
t.Errorf("Message = %q, want %q", errResp.Message, message)
}
}
func TestPagedResponse_Structure(t *testing.T) {
response := PagedResponse{
Data: []string{"item1", "item2"},
Total: 100,
Page: 2,
PerPage: 50,
}
data, err := json.Marshal(response)
if err != nil {
t.Fatalf("failed to marshal response: %v", err)
}
var result map[string]interface{}
if err := json.Unmarshal(data, &result); err != nil {
t.Fatalf("failed to unmarshal response: %v", err)
}
if result["total"] != float64(100) {
t.Errorf("total = %v, want 100", result["total"])
}
if result["page"] != float64(2) {
t.Errorf("page = %v, want 2", result["page"])
}
if result["per_page"] != float64(50) {
t.Errorf("per_page = %v, want 50", result["per_page"])
}
if result["data"] == nil {
t.Error("data is nil")
}
}
func TestCursorPagedResponse_Structure(t *testing.T) {
response := CursorPagedResponse{
Data: []string{"item1", "item2"},
Total: 100,
NextCursor: "abc123def456",
PageSize: 50,
}
data, err := json.Marshal(response)
if err != nil {
t.Fatalf("failed to marshal response: %v", err)
}
var result map[string]interface{}
if err := json.Unmarshal(data, &result); err != nil {
t.Fatalf("failed to unmarshal response: %v", err)
}
if result["total"] != float64(100) {
t.Errorf("total = %v, want 100", result["total"])
}
if result["next_cursor"] != "abc123def456" {
t.Errorf("next_cursor = %v, want abc123def456", result["next_cursor"])
}
if result["page_size"] != float64(50) {
t.Errorf("page_size = %v, want 50", result["page_size"])
}
}
func TestCursorPagedResponse_EmptyNextCursor(t *testing.T) {
// When NextCursor is empty, it should be omitted from JSON
response := CursorPagedResponse{
Data: []string{},
Total: 0,
NextCursor: "",
PageSize: 50,
}
data, err := json.Marshal(response)
if err != nil {
t.Fatalf("failed to marshal response: %v", err)
}
// Empty string for next_cursor should be omitted due to omitempty tag
if bytes.Contains(data, []byte("next_cursor")) {
t.Error("empty next_cursor should be omitted from JSON")
}
}
func TestFilterFields_SingleObject(t *testing.T) {
data := map[string]interface{}{
"id": "cert-123",
"name": "My Cert",
"expiry": "2025-01-01",
"status": "active",
}
result := filterFields(data, []string{"id", "name"})
resultMap, ok := result.(map[string]interface{})
if !ok {
t.Fatalf("result is not map[string]interface{}, got %T", result)
}
if resultMap["id"] != "cert-123" {
t.Errorf("id = %v, want cert-123", resultMap["id"])
}
if resultMap["name"] != "My Cert" {
t.Errorf("name = %v, want My Cert", resultMap["name"])
}
if _, hasExpiry := resultMap["expiry"]; hasExpiry {
t.Error("expiry should be filtered out")
}
if _, hasStatus := resultMap["status"]; hasStatus {
t.Error("status should be filtered out")
}
}
func TestFilterFields_EmptyFields(t *testing.T) {
// Empty fields list should return data unchanged
data := map[string]interface{}{
"id": "cert-123",
"name": "My Cert",
}
result := filterFields(data, []string{})
// Should return original data unchanged
resultMap, ok := result.(map[string]interface{})
if !ok {
t.Fatalf("result is not map[string]interface{}, got %T", result)
}
if len(resultMap) != 2 {
t.Errorf("filtered result has %d fields, want 2", len(resultMap))
}
}
func TestFilterFields_NoMatchingFields(t *testing.T) {
data := map[string]interface{}{
"id": "cert-123",
"name": "My Cert",
}
result := filterFields(data, []string{"nonexistent", "also-not-there"})
resultMap, ok := result.(map[string]interface{})
if !ok {
t.Fatalf("result is not map[string]interface{}, got %T", result)
}
if len(resultMap) != 0 {
t.Errorf("filtered result has %d fields, want 0", len(resultMap))
}
}
func TestFilterFields_InvalidJSON(t *testing.T) {
// Non-serializable data should be returned as-is
data := make(chan int) // channels can't be marshaled to JSON
result := filterFields(data, []string{"field"})
// Should return original data unchanged
if result != data {
t.Error("invalid data should be returned unchanged")
}
}
+353
@@ -0,0 +1,353 @@
package handler
import (
"context"
"crypto/x509"
"encoding/asn1"
"encoding/base64"
"encoding/pem"
"fmt"
"io"
"net/http"
"strings"
"github.com/shankar0123/certctl/internal/api/middleware"
"github.com/shankar0123/certctl/internal/domain"
"github.com/shankar0123/certctl/internal/pkcs7"
)
// SCEPService defines the service interface for SCEP enrollment operations.
// SCEP (RFC 8894) is a protocol for certificate enrollment used by MDM platforms
// and network devices.
type SCEPService interface {
// GetCACaps returns the SCEP server capabilities as a newline-separated string.
GetCACaps(ctx context.Context) string
// GetCACert returns the PEM-encoded CA certificate chain.
GetCACert(ctx context.Context) (string, error)
// PKCSReq processes a PKCS#10 CSR and returns a signed certificate.
PKCSReq(ctx context.Context, csrPEM string, challengePassword string, transactionID string) (*domain.SCEPEnrollResult, error)
}
// SCEPHandler handles HTTP requests for the SCEP protocol (RFC 8894).
//
// SCEP uses a single endpoint with operation-based dispatch via query parameters.
// All operations use GET or POST to the same path.
//
// Supported operations:
// - GET ?operation=GetCACaps — server capabilities
// - GET ?operation=GetCACert — CA certificate distribution
// - POST ?operation=PKIOperation — certificate enrollment (PKCSReq)
type SCEPHandler struct {
svc SCEPService
}
// NewSCEPHandler creates a new SCEPHandler.
func NewSCEPHandler(svc SCEPService) SCEPHandler {
return SCEPHandler{svc: svc}
}
// HandleSCEP is the single entry point for all SCEP operations.
// It dispatches based on the "operation" query parameter.
func (h SCEPHandler) HandleSCEP(w http.ResponseWriter, r *http.Request) {
operation := r.URL.Query().Get("operation")
switch operation {
case "GetCACaps":
h.getCACaps(w, r)
case "GetCACert":
h.getCACert(w, r)
case "PKIOperation":
h.pkiOperation(w, r)
default:
http.Error(w, fmt.Sprintf("Unknown SCEP operation: %s", operation), http.StatusBadRequest)
}
}
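The single-endpoint dispatch can be exercised end to end with httptest. The sketch below is a reduced stand-in, assuming only a stubbed GetCACaps branch with a hard-coded capability string; handleSCEPSketch and statusFor are hypothetical names and do not replace the handler above.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// handleSCEPSketch dispatches on the "operation" query parameter, in the
// same shape as HandleSCEP. Only GetCACaps is stubbed here.
func handleSCEPSketch(w http.ResponseWriter, r *http.Request) {
	switch op := r.URL.Query().Get("operation"); op {
	case "GetCACaps":
		if r.Method != http.MethodGet {
			http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
			return
		}
		w.Header().Set("Content-Type", "text/plain")
		fmt.Fprint(w, "POSTPKIOperation\nSHA-256\n") // placeholder capabilities
	default:
		http.Error(w, "Unknown SCEP operation: "+op, http.StatusBadRequest)
	}
}

// statusFor sends one request to the sketch handler and returns the status code.
func statusFor(method, target string) int {
	rec := httptest.NewRecorder()
	handleSCEPSketch(rec, httptest.NewRequest(method, target, nil))
	return rec.Code
}

func main() {
	fmt.Println(statusFor(http.MethodGet, "/scep?operation=GetCACaps"))  // 200
	fmt.Println(statusFor(http.MethodPost, "/scep?operation=GetCACaps")) // 405
	fmt.Println(statusFor(http.MethodGet, "/scep?operation=Nope"))       // 400
}
```

A request with no operation parameter falls into the default branch and also yields 400.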
// getCACaps handles GET ?operation=GetCACaps
// Returns the SCEP server capabilities as plaintext, one per line.
func (h SCEPHandler) getCACaps(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodGet {
http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
return
}
caps := h.svc.GetCACaps(r.Context())
w.Header().Set("Content-Type", "text/plain")
w.WriteHeader(http.StatusOK)
w.Write([]byte(caps))
}
// getCACert handles GET ?operation=GetCACert
// Returns the CA certificate(s). Single cert as DER, chain as PKCS#7.
func (h SCEPHandler) getCACert(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodGet {
http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
return
}
caCertPEM, err := h.svc.GetCACert(r.Context())
if err != nil {
requestID := middleware.GetRequestID(r.Context())
ErrorWithRequestID(w, http.StatusInternalServerError, fmt.Sprintf("Failed to get CA certificate: %v", err), requestID)
return
}
// Parse PEM to DER chain
derCerts, err := pkcs7.PEMToDERChain(caCertPEM)
if err != nil {
requestID := middleware.GetRequestID(r.Context())
ErrorWithRequestID(w, http.StatusInternalServerError, "Failed to parse CA certificates", requestID)
return
}
if len(derCerts) == 1 {
// Single CA cert — return as raw DER
w.Header().Set("Content-Type", "application/x-x509-ca-cert")
w.WriteHeader(http.StatusOK)
w.Write(derCerts[0])
return
}
// Multiple certs (CA + RA or chain) — return as PKCS#7
pkcs7Data, err := pkcs7.BuildCertsOnlyPKCS7(derCerts)
if err != nil {
requestID := middleware.GetRequestID(r.Context())
ErrorWithRequestID(w, http.StatusInternalServerError, "Failed to build PKCS#7 response", requestID)
return
}
w.Header().Set("Content-Type", "application/x-x509-ca-ra-cert")
w.WriteHeader(http.StatusOK)
w.Write(pkcs7Data)
}
// pkiOperation handles POST ?operation=PKIOperation
// Processes a SCEP enrollment request containing a PKCS#7-wrapped CSR.
func (h SCEPHandler) pkiOperation(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodPost {
http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
return
}
requestID := middleware.GetRequestID(r.Context())
body, err := io.ReadAll(io.LimitReader(r.Body, 1<<20)) // 1MB limit
if err != nil {
ErrorWithRequestID(w, http.StatusBadRequest, "Failed to read request body", requestID)
return
}
defer r.Body.Close()
if len(body) == 0 {
ErrorWithRequestID(w, http.StatusBadRequest, "Empty request body", requestID)
return
}
// Extract the PKCS#10 CSR from the PKCS#7 SignedData envelope
csrDER, challengePassword, transactionID, err := extractCSRFromPKCS7(body)
if err != nil {
ErrorWithRequestID(w, http.StatusBadRequest, fmt.Sprintf("Invalid SCEP message: %v", err), requestID)
return
}
// Validate the CSR
csr, err := x509.ParseCertificateRequest(csrDER)
if err != nil {
ErrorWithRequestID(w, http.StatusBadRequest, fmt.Sprintf("Invalid CSR: %v", err), requestID)
return
}
if err := csr.CheckSignature(); err != nil {
ErrorWithRequestID(w, http.StatusBadRequest, fmt.Sprintf("CSR signature invalid: %v", err), requestID)
return
}
// Convert DER CSR to PEM for the service layer
csrPEM := string(pem.EncodeToMemory(&pem.Block{
Type: "CERTIFICATE REQUEST",
Bytes: csrDER,
}))
result, err := h.svc.PKCSReq(r.Context(), csrPEM, challengePassword, transactionID)
if err != nil {
if strings.Contains(err.Error(), "challenge password") {
ErrorWithRequestID(w, http.StatusForbidden, "Invalid challenge password", requestID)
return
}
ErrorWithRequestID(w, http.StatusInternalServerError, fmt.Sprintf("Enrollment failed: %v", err), requestID)
return
}
// Build response: issued cert wrapped in PKCS#7 certs-only
h.writeSCEPResponse(w, result)
}
// writeSCEPResponse writes a SCEP enrollment response as PKCS#7 certs-only (DER).
func (h SCEPHandler) writeSCEPResponse(w http.ResponseWriter, result *domain.SCEPEnrollResult) {
var derCerts [][]byte
certDER, err := pkcs7.PEMToDERChain(result.CertPEM)
if err != nil || len(certDER) == 0 {
http.Error(w, "Failed to encode certificate", http.StatusInternalServerError)
return
}
derCerts = append(derCerts, certDER...)
if result.ChainPEM != "" {
chainDER, err := pkcs7.PEMToDERChain(result.ChainPEM)
if err == nil {
derCerts = append(derCerts, chainDER...)
}
}
pkcs7Data, err := pkcs7.BuildCertsOnlyPKCS7(derCerts)
if err != nil {
http.Error(w, "Failed to build PKCS#7 response", http.StatusInternalServerError)
return
}
w.Header().Set("Content-Type", "application/x-pki-message")
w.WriteHeader(http.StatusOK)
w.Write(pkcs7Data)
}
// extractCSRFromPKCS7 extracts a PKCS#10 CSR from a SCEP PKCS#7 SignedData envelope.
//
// SCEP clients wrap the CSR in a PKCS#7 SignedData structure. For the MVP, we parse
// the outer ASN.1 structure to find the encapsulated content (the CSR bytes), and
// extract the challenge password from the CSR attributes.
//
// Returns: csrDER, challengePassword, transactionID, error
func extractCSRFromPKCS7(data []byte) ([]byte, string, string, error) {
// Try to decode as PKCS#7 SignedData
csrDER, err := parseSignedDataForCSR(data)
if err != nil {
// Fallback: some clients send the CSR directly (not wrapped in PKCS#7)
// or send base64-encoded data
decoded, decErr := base64.StdEncoding.DecodeString(strings.TrimSpace(string(data)))
if decErr == nil {
// Try the decoded data as PKCS#7
csrDER2, err2 := parseSignedDataForCSR(decoded)
if err2 == nil {
return extractCSRFields(csrDER2)
}
// Maybe the decoded data IS the CSR directly
if _, parseErr := x509.ParseCertificateRequest(decoded); parseErr == nil {
return extractCSRFields(decoded)
}
}
// Maybe the raw data IS the CSR directly (no PKCS#7 wrapping)
if _, parseErr := x509.ParseCertificateRequest(data); parseErr == nil {
return extractCSRFields(data)
}
return nil, "", "", fmt.Errorf("failed to extract CSR from PKCS#7: %w", err)
}
return extractCSRFields(csrDER)
}
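The non-PKCS#7 part of the fallback chain (base64-wrapped CSR first, then raw DER) can be tried on its own. The sketch below leaves out the SignedData step entirely and uses a throwaway ECDSA key; newTestCSR, csrFromBody, and toBase64 are hypothetical helpers, not functions from this change.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/base64"
	"errors"
	"fmt"
	"strings"
)

// newTestCSR builds a throwaway DER-encoded CSR with a made-up subject.
func newTestCSR() ([]byte, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.CertificateRequest{Subject: pkix.Name{CommonName: "device-01"}}
	return x509.CreateCertificateRequest(rand.Reader, tmpl, key)
}

// csrFromBody tries base64-decoded DER first, then the raw bytes, in the
// same order as the fallback above (the PKCS#7 step is omitted).
func csrFromBody(body []byte) ([]byte, error) {
	decoded, decErr := base64.StdEncoding.DecodeString(strings.TrimSpace(string(body)))
	if decErr == nil {
		if _, err := x509.ParseCertificateRequest(decoded); err == nil {
			return decoded, nil
		}
	}
	if _, err := x509.ParseCertificateRequest(body); err == nil {
		return body, nil
	}
	return nil, errors.New("body is neither a DER CSR nor base64 of one")
}

// toBase64 wraps DER bytes the way a base64-sending client might,
// including a trailing newline that csrFromBody trims.
func toBase64(der []byte) []byte {
	return []byte(base64.StdEncoding.EncodeToString(der) + "\n")
}

func main() {
	der, err := newTestCSR()
	if err != nil {
		panic(err)
	}
	for _, body := range [][]byte{der, toBase64(der), []byte("junk")} {
		got, err := csrFromBody(body)
		fmt.Println(len(got) == len(der), err)
	}
}
```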
// extractCSRFields extracts the challenge password from the CSR attributes and
// derives a transaction ID, which is currently always the subject CN.
func extractCSRFields(csrDER []byte) ([]byte, string, string, error) {
csr, err := x509.ParseCertificateRequest(csrDER)
if err != nil {
return nil, "", "", fmt.Errorf("invalid CSR: %w", err)
}
challengePassword := ""
transactionID := ""
// OID for challengePassword: 1.2.840.113549.1.9.7
oidChallengePassword := asn1.ObjectIdentifier{1, 2, 840, 113549, 1, 9, 7}
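// Extract challenge password from parsed CSR attributes.
// Attributes is []pkix.AttributeTypeAndValueSET where each has Type (OID)
// and Value ([][]pkix.AttributeTypeAndValue). The challenge password value
// is stored as a string in the inner AttributeTypeAndValue.Value field.
// Caveat: csr.Attributes is deprecated and holds only attributes that parse
// as pkix.AttributeTypeAndValueSET, so a challengePassword encoded in
// another shape may not appear here.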
for _, attr := range csr.Attributes {
if attr.Type.Equal(oidChallengePassword) {
if len(attr.Value) > 0 && len(attr.Value[0]) > 0 {
if pwd, ok := attr.Value[0][0].Value.(string); ok {
challengePassword = pwd
}
}
}
}
// No transaction ID is read from the attributes above, so use the subject CN.
if transactionID == "" && csr.Subject.CommonName != "" {
transactionID = csr.Subject.CommonName
}
return csrDER, challengePassword, transactionID, nil
}
// pkcs7ContentInfo represents the outer ContentInfo structure.
type pkcs7ContentInfo struct {
ContentType asn1.ObjectIdentifier
Content asn1.RawValue `asn1:"explicit,tag:0"`
}
// pkcs7SignedData represents a simplified SignedData structure for CSR extraction.
type pkcs7SignedData struct {
Version int
DigestAlgorithms asn1.RawValue
EncapContentInfo asn1.RawValue
}
// pkcs7EncapContent represents the EncapsulatedContentInfo.
type pkcs7EncapContent struct {
ContentType asn1.ObjectIdentifier
Content asn1.RawValue `asn1:"explicit,optional,tag:0"`
}
// parseSignedDataForCSR extracts the encapsulated content (CSR) from PKCS#7 SignedData.
func parseSignedDataForCSR(data []byte) ([]byte, error) {
var contentInfo pkcs7ContentInfo
// Trailing data after the ContentInfo is ignored; it is OK for some implementations.
_, err := asn1.Unmarshal(data, &contentInfo)
if err != nil {
return nil, fmt.Errorf("failed to parse ContentInfo: %w", err)
}
// OID for signedData: 1.2.840.113549.1.7.2
oidSignedData := asn1.ObjectIdentifier{1, 2, 840, 113549, 1, 7, 2}
if !contentInfo.ContentType.Equal(oidSignedData) {
return nil, fmt.Errorf("not SignedData: got OID %v", contentInfo.ContentType)
}
// Parse the SignedData
var signedData pkcs7SignedData
_, err = asn1.Unmarshal(contentInfo.Content.Bytes, &signedData)
if err != nil {
return nil, fmt.Errorf("failed to parse SignedData: %w", err)
}
// Parse the EncapsulatedContentInfo to get the CSR
var encapContent pkcs7EncapContent
_, err = asn1.Unmarshal(signedData.EncapContentInfo.FullBytes, &encapContent)
if err != nil {
return nil, fmt.Errorf("failed to parse EncapsulatedContentInfo: %w", err)
}
if len(encapContent.Content.Bytes) == 0 {
return nil, fmt.Errorf("empty encapsulated content")
}
// The content may be wrapped in an OCTET STRING
var csrBytes []byte
var octetString asn1.RawValue
if _, err := asn1.Unmarshal(encapContent.Content.Bytes, &octetString); err == nil && octetString.Tag == asn1.TagOctetString {
csrBytes = octetString.Bytes
} else {
csrBytes = encapContent.Content.Bytes
}
// Validate it's a parseable CSR
if _, err := x509.ParseCertificateRequest(csrBytes); err != nil {
return nil, fmt.Errorf("extracted content is not a valid CSR: %w", err)
}
return csrBytes, nil
}
+262
@@ -0,0 +1,262 @@
package handler
import (
"context"
"encoding/pem"
"errors"
"net/http"
"net/http/httptest"
"strings"
"testing"
"github.com/shankar0123/certctl/internal/domain"
)
// mockSCEPService implements SCEPService for testing.
type mockSCEPService struct {
CACaps string
CACertPEM string
CACertErr error
EnrollResult *domain.SCEPEnrollResult
EnrollErr error
}
func (m *mockSCEPService) GetCACaps(ctx context.Context) string {
if m.CACaps != "" {
return m.CACaps
}
return "POSTPKIOperation\nSHA-256\nAES\nSCEPStandard\n"
}
func (m *mockSCEPService) GetCACert(ctx context.Context) (string, error) {
return m.CACertPEM, m.CACertErr
}
func (m *mockSCEPService) PKCSReq(ctx context.Context, csrPEM string, challengePassword string, transactionID string) (*domain.SCEPEnrollResult, error) {
return m.EnrollResult, m.EnrollErr
}
func TestSCEP_GetCACaps_Success(t *testing.T) {
svc := &mockSCEPService{}
h := NewSCEPHandler(svc)
req := httptest.NewRequest(http.MethodGet, "/scep?operation=GetCACaps", nil)
w := httptest.NewRecorder()
h.HandleSCEP(w, req)
if w.Code != http.StatusOK {
t.Errorf("expected 200, got %d: %s", w.Code, w.Body.String())
}
ct := w.Header().Get("Content-Type")
if ct != "text/plain" {
t.Errorf("expected text/plain, got %s", ct)
}
body := w.Body.String()
if !strings.Contains(body, "POSTPKIOperation") {
t.Errorf("expected POSTPKIOperation in response, got: %s", body)
}
if !strings.Contains(body, "SHA-256") {
t.Errorf("expected SHA-256 in response, got: %s", body)
}
}
func TestSCEP_GetCACaps_MethodNotAllowed(t *testing.T) {
svc := &mockSCEPService{}
h := NewSCEPHandler(svc)
req := httptest.NewRequest(http.MethodPost, "/scep?operation=GetCACaps", nil)
w := httptest.NewRecorder()
h.HandleSCEP(w, req)
if w.Code != http.StatusMethodNotAllowed {
t.Errorf("expected 405, got %d", w.Code)
}
}
func TestSCEP_GetCACert_Success_SingleCert(t *testing.T) {
certPEM := generateTestCertPEM(t)
svc := &mockSCEPService{
CACertPEM: certPEM,
}
h := NewSCEPHandler(svc)
req := httptest.NewRequest(http.MethodGet, "/scep?operation=GetCACert", nil)
w := httptest.NewRecorder()
h.HandleSCEP(w, req)
if w.Code != http.StatusOK {
t.Errorf("expected 200, got %d: %s", w.Code, w.Body.String())
}
ct := w.Header().Get("Content-Type")
if ct != "application/x-x509-ca-cert" {
t.Errorf("expected application/x-x509-ca-cert, got %s", ct)
}
if w.Body.Len() == 0 {
t.Error("expected non-empty body")
}
}
func TestSCEP_GetCACert_MethodNotAllowed(t *testing.T) {
svc := &mockSCEPService{}
h := NewSCEPHandler(svc)
req := httptest.NewRequest(http.MethodPost, "/scep?operation=GetCACert", nil)
w := httptest.NewRecorder()
h.HandleSCEP(w, req)
if w.Code != http.StatusMethodNotAllowed {
t.Errorf("expected 405, got %d", w.Code)
}
}
func TestSCEP_GetCACert_ServiceError(t *testing.T) {
svc := &mockSCEPService{
CACertErr: errors.New("CA unavailable"),
}
h := NewSCEPHandler(svc)
req := httptest.NewRequest(http.MethodGet, "/scep?operation=GetCACert", nil)
w := httptest.NewRecorder()
h.HandleSCEP(w, req)
if w.Code != http.StatusInternalServerError {
t.Errorf("expected 500, got %d", w.Code)
}
}
func TestSCEP_PKIOperation_MethodNotAllowed(t *testing.T) {
svc := &mockSCEPService{}
h := NewSCEPHandler(svc)
req := httptest.NewRequest(http.MethodGet, "/scep?operation=PKIOperation", nil)
w := httptest.NewRecorder()
h.HandleSCEP(w, req)
if w.Code != http.StatusMethodNotAllowed {
t.Errorf("expected 405, got %d", w.Code)
}
}
func TestSCEP_PKIOperation_EmptyBody(t *testing.T) {
svc := &mockSCEPService{}
h := NewSCEPHandler(svc)
req := httptest.NewRequest(http.MethodPost, "/scep?operation=PKIOperation", strings.NewReader(""))
w := httptest.NewRecorder()
h.HandleSCEP(w, req)
if w.Code != http.StatusBadRequest {
t.Errorf("expected 400, got %d", w.Code)
}
}
func TestSCEP_PKIOperation_InvalidBody(t *testing.T) {
svc := &mockSCEPService{}
h := NewSCEPHandler(svc)
req := httptest.NewRequest(http.MethodPost, "/scep?operation=PKIOperation", strings.NewReader("not-valid-asn1-or-csr"))
w := httptest.NewRecorder()
h.HandleSCEP(w, req)
if w.Code != http.StatusBadRequest {
t.Errorf("expected 400, got %d: %s", w.Code, w.Body.String())
}
}
func TestSCEP_PKIOperation_ServiceError(t *testing.T) {
svc := &mockSCEPService{
EnrollErr: errors.New("enrollment failed"),
}
h := NewSCEPHandler(svc)
// Generate a valid raw CSR DER to send as body (fallback path)
csrPEM := generateTestCSRPEM(t)
block, _ := pem.Decode([]byte(csrPEM))
if block == nil {
t.Fatal("failed to decode CSR PEM")
}
req := httptest.NewRequest(http.MethodPost, "/scep?operation=PKIOperation", strings.NewReader(string(block.Bytes)))
w := httptest.NewRecorder()
h.HandleSCEP(w, req)
if w.Code != http.StatusInternalServerError {
t.Errorf("expected 500, got %d: %s", w.Code, w.Body.String())
}
}
func TestSCEP_PKIOperation_Success_RawCSR(t *testing.T) {
certPEM := generateTestCertPEM(t)
svc := &mockSCEPService{
EnrollResult: &domain.SCEPEnrollResult{
CertPEM: certPEM,
ChainPEM: "",
},
}
h := NewSCEPHandler(svc)
csrPEM := generateTestCSRPEM(t)
block, _ := pem.Decode([]byte(csrPEM))
if block == nil {
t.Fatal("failed to decode CSR PEM")
}
req := httptest.NewRequest(http.MethodPost, "/scep?operation=PKIOperation", strings.NewReader(string(block.Bytes)))
w := httptest.NewRecorder()
h.HandleSCEP(w, req)
if w.Code != http.StatusOK {
t.Errorf("expected 200, got %d: %s", w.Code, w.Body.String())
}
ct := w.Header().Get("Content-Type")
if ct != "application/x-pki-message" {
t.Errorf("expected application/x-pki-message, got %s", ct)
}
}
func TestSCEP_PKIOperation_ChallengePasswordRejected(t *testing.T) {
svc := &mockSCEPService{
EnrollErr: errors.New("invalid challenge password"),
}
h := NewSCEPHandler(svc)
csrPEM := generateTestCSRPEM(t)
block, _ := pem.Decode([]byte(csrPEM))
if block == nil {
t.Fatal("failed to decode CSR PEM")
}
req := httptest.NewRequest(http.MethodPost, "/scep?operation=PKIOperation", strings.NewReader(string(block.Bytes)))
w := httptest.NewRecorder()
h.HandleSCEP(w, req)
if w.Code != http.StatusForbidden {
t.Errorf("expected 403, got %d: %s", w.Code, w.Body.String())
}
}
func TestSCEP_UnknownOperation(t *testing.T) {
svc := &mockSCEPService{}
h := NewSCEPHandler(svc)
req := httptest.NewRequest(http.MethodGet, "/scep?operation=UnknownOp", nil)
w := httptest.NewRecorder()
h.HandleSCEP(w, req)
if w.Code != http.StatusBadRequest {
t.Errorf("expected 400, got %d", w.Code)
}
}
func TestSCEP_MissingOperation(t *testing.T) {
svc := &mockSCEPService{}
h := NewSCEPHandler(svc)
req := httptest.NewRequest(http.MethodGet, "/scep", nil)
w := httptest.NewRecorder()
h.HandleSCEP(w, req)
if w.Code != http.StatusBadRequest {
t.Errorf("expected 400, got %d", w.Code)
}
}
+169 -30
@@ -2,6 +2,7 @@ package handler
import (
"bytes"
"context"
"encoding/json"
"net/http"
"net/http/httptest"
@@ -9,48 +10,57 @@ import (
"time"
"github.com/shankar0123/certctl/internal/domain"
"github.com/shankar0123/certctl/internal/service"
)
// MockTargetService is a mock implementation of TargetService interface.
type MockTargetService struct {
ListTargetsFn func(page, perPage int) ([]domain.DeploymentTarget, int64, error)
GetTargetFn func(id string) (*domain.DeploymentTarget, error)
CreateTargetFn func(target domain.DeploymentTarget) (*domain.DeploymentTarget, error)
UpdateTargetFn func(id string, target domain.DeploymentTarget) (*domain.DeploymentTarget, error)
DeleteTargetFn func(id string) error
ListTargetsFn func(ctx context.Context, page, perPage int) ([]domain.DeploymentTarget, int64, error)
GetTargetFn func(ctx context.Context, id string) (*domain.DeploymentTarget, error)
CreateTargetFn func(ctx context.Context, target domain.DeploymentTarget) (*domain.DeploymentTarget, error)
UpdateTargetFn func(ctx context.Context, id string, target domain.DeploymentTarget) (*domain.DeploymentTarget, error)
DeleteTargetFn func(ctx context.Context, id string) error
TestConnectionFn func(ctx context.Context, id string) error
}
func (m *MockTargetService) ListTargets(page, perPage int) ([]domain.DeploymentTarget, int64, error) {
func (m *MockTargetService) ListTargets(ctx context.Context, page, perPage int) ([]domain.DeploymentTarget, int64, error) {
if m.ListTargetsFn != nil {
return m.ListTargetsFn(page, perPage)
return m.ListTargetsFn(ctx, page, perPage)
}
return nil, 0, nil
}
func (m *MockTargetService) GetTarget(id string) (*domain.DeploymentTarget, error) {
func (m *MockTargetService) GetTarget(ctx context.Context, id string) (*domain.DeploymentTarget, error) {
if m.GetTargetFn != nil {
return m.GetTargetFn(id)
return m.GetTargetFn(ctx, id)
}
return nil, nil
}
func (m *MockTargetService) CreateTarget(target domain.DeploymentTarget) (*domain.DeploymentTarget, error) {
func (m *MockTargetService) CreateTarget(ctx context.Context, target domain.DeploymentTarget) (*domain.DeploymentTarget, error) {
if m.CreateTargetFn != nil {
return m.CreateTargetFn(target)
return m.CreateTargetFn(ctx, target)
}
return nil, nil
}
func (m *MockTargetService) UpdateTarget(id string, target domain.DeploymentTarget) (*domain.DeploymentTarget, error) {
func (m *MockTargetService) UpdateTarget(ctx context.Context, id string, target domain.DeploymentTarget) (*domain.DeploymentTarget, error) {
if m.UpdateTargetFn != nil {
return m.UpdateTargetFn(id, target)
return m.UpdateTargetFn(ctx, id, target)
}
return nil, nil
}
func (m *MockTargetService) DeleteTarget(id string) error {
func (m *MockTargetService) DeleteTarget(ctx context.Context, id string) error {
if m.DeleteTargetFn != nil {
return m.DeleteTargetFn(id)
return m.DeleteTargetFn(ctx, id)
}
return nil
}
func (m *MockTargetService) TestConnection(ctx context.Context, id string) error {
if m.TestConnectionFn != nil {
return m.TestConnectionFn(ctx, id)
}
return nil
}
@@ -77,7 +87,7 @@ func TestListTargets_Success(t *testing.T) {
}
mock := &MockTargetService{
ListTargetsFn: func(page, perPage int) ([]domain.DeploymentTarget, int64, error) {
ListTargetsFn: func(_ context.Context, page, perPage int) ([]domain.DeploymentTarget, int64, error) {
return []domain.DeploymentTarget{t1, t2}, 2, nil
},
}
@@ -105,7 +115,7 @@ func TestListTargets_Success(t *testing.T) {
func TestListTargets_Pagination(t *testing.T) {
var capturedPage, capturedPerPage int
mock := &MockTargetService{
ListTargetsFn: func(page, perPage int) ([]domain.DeploymentTarget, int64, error) {
ListTargetsFn: func(_ context.Context, page, perPage int) ([]domain.DeploymentTarget, int64, error) {
capturedPage = page
capturedPerPage = perPage
return []domain.DeploymentTarget{}, 0, nil
@@ -129,7 +139,7 @@ func TestListTargets_Pagination(t *testing.T) {
func TestListTargets_ServiceError(t *testing.T) {
mock := &MockTargetService{
ListTargetsFn: func(page, perPage int) ([]domain.DeploymentTarget, int64, error) {
ListTargetsFn: func(_ context.Context, page, perPage int) ([]domain.DeploymentTarget, int64, error) {
return nil, 0, ErrMockServiceFailed
},
}
@@ -161,7 +171,7 @@ func TestListTargets_MethodNotAllowed(t *testing.T) {
func TestGetTarget_Success(t *testing.T) {
now := time.Now()
mock := &MockTargetService{
GetTargetFn: func(id string) (*domain.DeploymentTarget, error) {
GetTargetFn: func(_ context.Context, id string) (*domain.DeploymentTarget, error) {
return &domain.DeploymentTarget{
ID: id,
Name: "NGINX Proxy",
@@ -188,7 +198,7 @@ func TestGetTarget_Success(t *testing.T) {
func TestGetTarget_NotFound(t *testing.T) {
mock := &MockTargetService{
GetTargetFn: func(id string) (*domain.DeploymentTarget, error) {
GetTargetFn: func(_ context.Context, id string) (*domain.DeploymentTarget, error) {
return nil, ErrMockNotFound
},
}
@@ -221,7 +231,7 @@ func TestGetTarget_EmptyID(t *testing.T) {
func TestCreateTarget_Success(t *testing.T) {
now := time.Now()
mock := &MockTargetService{
CreateTargetFn: func(target domain.DeploymentTarget) (*domain.DeploymentTarget, error) {
CreateTargetFn: func(_ context.Context, target domain.DeploymentTarget) (*domain.DeploymentTarget, error) {
target.ID = "t-new"
target.CreatedAt = now
target.UpdatedAt = now
@@ -230,8 +240,9 @@ func TestCreateTarget_Success(t *testing.T) {
}
body := map[string]interface{}{
"name": "New Target",
"type": "nginx",
"name": "New Target",
"type": "nginx",
"agent_id": "agent-001",
}
bodyBytes, _ := json.Marshal(body)
@@ -249,7 +260,8 @@ func TestCreateTarget_Success(t *testing.T) {
func TestCreateTarget_MissingName(t *testing.T) {
body := map[string]interface{}{
"type": "nginx",
"type": "nginx",
"agent_id": "agent-001",
}
bodyBytes, _ := json.Marshal(body)
@@ -267,7 +279,8 @@ func TestCreateTarget_MissingName(t *testing.T) {
func TestCreateTarget_MissingType(t *testing.T) {
body := map[string]interface{}{
"name": "New Target",
"name": "New Target",
"agent_id": "agent-001",
}
bodyBytes, _ := json.Marshal(body)
@@ -302,8 +315,9 @@ func TestCreateTarget_NameTooLong(t *testing.T) {
longName += "x"
}
body := map[string]interface{}{
"name": longName,
"type": "nginx",
"name": longName,
"type": "nginx",
"agent_id": "agent-001",
}
bodyBytes, _ := json.Marshal(body)
@@ -331,10 +345,69 @@ func TestCreateTarget_MethodNotAllowed(t *testing.T) {
}
}
// TestCreateTarget_MissingAgentID_Returns400 pins the C-002 handler contract:
// handler MUST reject a create payload that omits agent_id with HTTP 400
// before the service is invoked. Using a mock that would return 201-worthy
// success proves the guard fires.
func TestCreateTarget_MissingAgentID_Returns400(t *testing.T) {
body := map[string]interface{}{
"name": "New Target",
"type": "nginx",
// agent_id intentionally omitted
}
bodyBytes, _ := json.Marshal(body)
mock := &MockTargetService{
CreateTargetFn: func(_ context.Context, target domain.DeploymentTarget) (*domain.DeploymentTarget, error) {
// Would succeed if handler guard did not fire.
target.ID = "t-would-be-created"
return &target, nil
},
}
handler := NewTargetHandler(mock)
req := httptest.NewRequest(http.MethodPost, "/api/v1/targets", bytes.NewReader(bodyBytes))
req = req.WithContext(contextWithRequestID())
w := httptest.NewRecorder()
handler.CreateTarget(w, req)
if w.Code != http.StatusBadRequest {
t.Fatalf("expected 400, got %d — body=%s", w.Code, w.Body.String())
}
}
// TestCreateTarget_NonexistentAgent_Returns400 pins the C-002 handler↔service
// translation: when the service returns service.ErrAgentNotFound, the handler
// MUST map it to HTTP 400, not the generic 500 used for other service errors.
func TestCreateTarget_NonexistentAgent_Returns400(t *testing.T) {
mock := &MockTargetService{
CreateTargetFn: func(_ context.Context, target domain.DeploymentTarget) (*domain.DeploymentTarget, error) {
return nil, service.ErrAgentNotFound
},
}
body := map[string]interface{}{
"name": "New Target",
"type": "nginx",
"agent_id": "agent-does-not-exist",
}
bodyBytes, _ := json.Marshal(body)
handler := NewTargetHandler(mock)
req := httptest.NewRequest(http.MethodPost, "/api/v1/targets", bytes.NewReader(bodyBytes))
req = req.WithContext(contextWithRequestID())
w := httptest.NewRecorder()
handler.CreateTarget(w, req)
if w.Code != http.StatusBadRequest {
t.Fatalf("expected 400 for nonexistent agent, got %d — body=%s", w.Code, w.Body.String())
}
}
func TestUpdateTarget_Success(t *testing.T) {
now := time.Now()
mock := &MockTargetService{
UpdateTargetFn: func(id string, target domain.DeploymentTarget) (*domain.DeploymentTarget, error) {
UpdateTargetFn: func(_ context.Context, id string, target domain.DeploymentTarget) (*domain.DeploymentTarget, error) {
return &domain.DeploymentTarget{
ID: id,
Name: target.Name,
@@ -367,7 +440,7 @@ func TestUpdateTarget_Success(t *testing.T) {
func TestDeleteTarget_Success(t *testing.T) {
var deletedID string
mock := &MockTargetService{
DeleteTargetFn: func(id string) error {
DeleteTargetFn: func(_ context.Context, id string) error {
deletedID = id
return nil
},
@@ -390,7 +463,7 @@ func TestDeleteTarget_Success(t *testing.T) {
func TestDeleteTarget_ServiceError(t *testing.T) {
mock := &MockTargetService{
DeleteTargetFn: func(id string) error {
DeleteTargetFn: func(_ context.Context, id string) error {
return ErrMockServiceFailed
},
}
@@ -419,3 +492,69 @@ func TestDeleteTarget_EmptyID(t *testing.T) {
t.Fatalf("expected status 400, got %d", w.Code)
}
}
func TestTestTargetConnection_Success(t *testing.T) {
mock := &MockTargetService{
TestConnectionFn: func(_ context.Context, id string) error {
return nil
},
}
handler := NewTargetHandler(mock)
req := httptest.NewRequest(http.MethodPost, "/api/v1/targets/t-nginx-01/test", nil)
req = req.WithContext(contextWithRequestID())
w := httptest.NewRecorder()
handler.TestTargetConnection(w, req)
if w.Code != http.StatusOK {
t.Fatalf("expected status 200, got %d", w.Code)
}
var resp map[string]interface{}
if err := json.NewDecoder(w.Body).Decode(&resp); err != nil {
t.Fatalf("failed to decode response: %v", err)
}
if resp["status"] != "success" {
t.Errorf("expected status 'success', got %v", resp["status"])
}
}
func TestTestTargetConnection_Failed(t *testing.T) {
mock := &MockTargetService{
TestConnectionFn: func(_ context.Context, id string) error {
return ErrMockServiceFailed
},
}
handler := NewTargetHandler(mock)
req := httptest.NewRequest(http.MethodPost, "/api/v1/targets/t-nginx-01/test", nil)
req = req.WithContext(contextWithRequestID())
w := httptest.NewRecorder()
handler.TestTargetConnection(w, req)
if w.Code != http.StatusOK {
t.Fatalf("expected status 200, got %d", w.Code)
}
var resp map[string]interface{}
if err := json.NewDecoder(w.Body).Decode(&resp); err != nil {
t.Fatalf("failed to decode response: %v", err)
}
if resp["status"] != "failed" {
t.Errorf("expected status 'failed', got %v", resp["status"])
}
}
func TestTestTargetConnection_MethodNotAllowed(t *testing.T) {
handler := NewTargetHandler(&MockTargetService{})
req := httptest.NewRequest(http.MethodGet, "/api/v1/targets/t-nginx-01/test", nil)
w := httptest.NewRecorder()
handler.TestTargetConnection(w, req)
if w.Code != http.StatusMethodNotAllowed {
t.Fatalf("expected status 405, got %d", w.Code)
}
}
+61 -10
@@ -1,22 +1,26 @@
package handler
import (
"context"
"encoding/json"
"errors"
"net/http"
"strconv"
"strings"
"github.com/shankar0123/certctl/internal/api/middleware"
"github.com/shankar0123/certctl/internal/domain"
"github.com/shankar0123/certctl/internal/service"
)
// TargetService defines the service interface for deployment target operations.
type TargetService interface {
ListTargets(page, perPage int) ([]domain.DeploymentTarget, int64, error)
GetTarget(id string) (*domain.DeploymentTarget, error)
CreateTarget(target domain.DeploymentTarget) (*domain.DeploymentTarget, error)
UpdateTarget(id string, target domain.DeploymentTarget) (*domain.DeploymentTarget, error)
DeleteTarget(id string) error
ListTargets(ctx context.Context, page, perPage int) ([]domain.DeploymentTarget, int64, error)
GetTarget(ctx context.Context, id string) (*domain.DeploymentTarget, error)
CreateTarget(ctx context.Context, target domain.DeploymentTarget) (*domain.DeploymentTarget, error)
UpdateTarget(ctx context.Context, id string, target domain.DeploymentTarget) (*domain.DeploymentTarget, error)
DeleteTarget(ctx context.Context, id string) error
TestConnection(ctx context.Context, id string) error
}
// TargetHandler handles HTTP requests for deployment target operations.
@@ -53,7 +57,7 @@ func (h TargetHandler) ListTargets(w http.ResponseWriter, r *http.Request) {
}
}
targets, total, err := h.svc.ListTargets(page, perPage)
targets, total, err := h.svc.ListTargets(r.Context(), page, perPage)
if err != nil {
ErrorWithRequestID(w, http.StatusInternalServerError, "Failed to list targets", requestID)
return
@@ -85,7 +89,7 @@ func (h TargetHandler) GetTarget(w http.ResponseWriter, r *http.Request) {
return
}
target, err := h.svc.GetTarget(id)
target, err := h.svc.GetTarget(r.Context(), id)
if err != nil {
ErrorWithRequestID(w, http.StatusNotFound, "Target not found", requestID)
return
@@ -123,9 +127,23 @@ func (h TargetHandler) CreateTarget(w http.ResponseWriter, r *http.Request) {
ErrorWithRequestID(w, http.StatusBadRequest, "type is required", requestID)
return
}
// C-002: agent_id is a NOT NULL FK in deployment_targets (migration 000001
// line 104). Reject empty values at the boundary so callers get a clean 400
// with the field name rather than a generic "Failed to create target" 500.
if err := ValidateRequired("agent_id", target.AgentID); err != nil {
ErrorWithRequestID(w, http.StatusBadRequest, err.Error(), requestID)
return
}
created, err := h.svc.CreateTarget(target)
created, err := h.svc.CreateTarget(r.Context(), target)
if err != nil {
// C-002: a nonexistent agent_id is a client error, not a server error.
// The service returns ErrAgentNotFound (wrapped via fmt.Errorf %w) when
// agentRepo.Get fails; we translate that to 400 via errors.Is.
if errors.Is(err, service.ErrAgentNotFound) {
ErrorWithRequestID(w, http.StatusBadRequest, err.Error(), requestID)
return
}
ErrorWithRequestID(w, http.StatusInternalServerError, "Failed to create target", requestID)
return
}
@@ -157,7 +175,7 @@ func (h TargetHandler) UpdateTarget(w http.ResponseWriter, r *http.Request) {
return
}
updated, err := h.svc.UpdateTarget(id, target)
updated, err := h.svc.UpdateTarget(r.Context(), id, target)
if err != nil {
ErrorWithRequestID(w, http.StatusInternalServerError, "Failed to update target", requestID)
return
@@ -182,10 +200,43 @@ func (h TargetHandler) DeleteTarget(w http.ResponseWriter, r *http.Request) {
return
}
if err := h.svc.DeleteTarget(id); err != nil {
if err := h.svc.DeleteTarget(r.Context(), id); err != nil {
ErrorWithRequestID(w, http.StatusInternalServerError, "Failed to delete target", requestID)
return
}
w.WriteHeader(http.StatusNoContent)
}
// TestTargetConnection tests target connectivity by checking the assigned agent's heartbeat.
// POST /api/v1/targets/{id}/test
func (h TargetHandler) TestTargetConnection(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodPost {
Error(w, http.StatusMethodNotAllowed, "Method not allowed")
return
}
requestID := middleware.GetRequestID(r.Context())
// Extract target ID from path: /api/v1/targets/{id}/test
path := strings.TrimPrefix(r.URL.Path, "/api/v1/targets/")
parts := strings.Split(path, "/")
if len(parts) < 2 || parts[0] == "" {
ErrorWithRequestID(w, http.StatusBadRequest, "Target ID is required", requestID)
return
}
id := parts[0]
if err := h.svc.TestConnection(r.Context(), id); err != nil {
JSON(w, http.StatusOK, map[string]interface{}{
"status": "failed",
"message": err.Error(),
})
return
}
JSON(w, http.StatusOK, map[string]interface{}{
"status": "success",
"message": "Agent is online and reachable",
})
}
+6 -5
@@ -2,6 +2,7 @@ package handler
import (
"bytes"
"context"
"encoding/json"
"net/http"
"net/http/httptest"
@@ -20,35 +21,35 @@ type MockTeamService struct {
DeleteTeamFn func(id string) error
}
func (m *MockTeamService) ListTeams(page, perPage int) ([]domain.Team, int64, error) {
func (m *MockTeamService) ListTeams(_ context.Context, page, perPage int) ([]domain.Team, int64, error) {
if m.ListTeamsFn != nil {
return m.ListTeamsFn(page, perPage)
}
return nil, 0, nil
}
func (m *MockTeamService) GetTeam(id string) (*domain.Team, error) {
func (m *MockTeamService) GetTeam(_ context.Context, id string) (*domain.Team, error) {
if m.GetTeamFn != nil {
return m.GetTeamFn(id)
}
return nil, nil
}
func (m *MockTeamService) CreateTeam(team domain.Team) (*domain.Team, error) {
func (m *MockTeamService) CreateTeam(_ context.Context, team domain.Team) (*domain.Team, error) {
if m.CreateTeamFn != nil {
return m.CreateTeamFn(team)
}
return nil, nil
}
func (m *MockTeamService) UpdateTeam(id string, team domain.Team) (*domain.Team, error) {
func (m *MockTeamService) UpdateTeam(_ context.Context, id string, team domain.Team) (*domain.Team, error) {
if m.UpdateTeamFn != nil {
return m.UpdateTeamFn(id, team)
}
return nil, nil
}
func (m *MockTeamService) DeleteTeam(id string) error {
func (m *MockTeamService) DeleteTeam(_ context.Context, id string) error {
if m.DeleteTeamFn != nil {
return m.DeleteTeamFn(id)
}
+11 -10
@@ -1,6 +1,7 @@
package handler
import (
"context"
"encoding/json"
"net/http"
"strconv"
@@ -12,11 +13,11 @@ import (
// TeamService defines the service interface for team operations.
type TeamService interface {
ListTeams(page, perPage int) ([]domain.Team, int64, error)
GetTeam(id string) (*domain.Team, error)
CreateTeam(team domain.Team) (*domain.Team, error)
UpdateTeam(id string, team domain.Team) (*domain.Team, error)
DeleteTeam(id string) error
ListTeams(ctx context.Context, page, perPage int) ([]domain.Team, int64, error)
GetTeam(ctx context.Context, id string) (*domain.Team, error)
CreateTeam(ctx context.Context, team domain.Team) (*domain.Team, error)
UpdateTeam(ctx context.Context, id string, team domain.Team) (*domain.Team, error)
DeleteTeam(ctx context.Context, id string) error
}
// TeamHandler handles HTTP requests for team operations.
@@ -53,7 +54,7 @@ func (h TeamHandler) ListTeams(w http.ResponseWriter, r *http.Request) {
}
}
teams, total, err := h.svc.ListTeams(page, perPage)
teams, total, err := h.svc.ListTeams(r.Context(), page, perPage)
if err != nil {
ErrorWithRequestID(w, http.StatusInternalServerError, "Failed to list teams", requestID)
return
@@ -87,7 +88,7 @@ func (h TeamHandler) GetTeam(w http.ResponseWriter, r *http.Request) {
}
id = parts[0]
team, err := h.svc.GetTeam(id)
team, err := h.svc.GetTeam(r.Context(), id)
if err != nil {
ErrorWithRequestID(w, http.StatusNotFound, "Team not found", requestID)
return
@@ -122,7 +123,7 @@ func (h TeamHandler) CreateTeam(w http.ResponseWriter, r *http.Request) {
return
}
created, err := h.svc.CreateTeam(team)
created, err := h.svc.CreateTeam(r.Context(), team)
if err != nil {
ErrorWithRequestID(w, http.StatusInternalServerError, "Failed to create team", requestID)
return
@@ -155,7 +156,7 @@ func (h TeamHandler) UpdateTeam(w http.ResponseWriter, r *http.Request) {
return
}
updated, err := h.svc.UpdateTeam(id, team)
updated, err := h.svc.UpdateTeam(r.Context(), id, team)
if err != nil {
ErrorWithRequestID(w, http.StatusInternalServerError, "Failed to update team", requestID)
return
@@ -182,7 +183,7 @@ func (h TeamHandler) DeleteTeam(w http.ResponseWriter, r *http.Request) {
}
id = parts[0]
if err := h.svc.DeleteTeam(id); err != nil {
if err := h.svc.DeleteTeam(r.Context(), id); err != nil {
ErrorWithRequestID(w, http.StatusInternalServerError, "Failed to delete team", requestID)
return
}
+2 -1
@@ -71,10 +71,11 @@ func ValidatePolicyType(policyType interface{}) error {
"RequiredMetadata": true,
"AllowedEnvironments": true,
"RenewalLeadTime": true,
"CertificateLifetime": true,
}
typeStr := fmt.Sprintf("%v", policyType)
if !validTypes[typeStr] {
return ValidationError{Field: "type", Message: "type must be one of: AllowedIssuers, AllowedDomains, RequiredMetadata, AllowedEnvironments, RenewalLeadTime"}
return ValidationError{Field: "type", Message: "type must be one of: AllowedIssuers, AllowedDomains, RequiredMetadata, AllowedEnvironments, RenewalLeadTime, CertificateLifetime"}
}
return nil
}
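This hunk has to touch two places, the `validTypes` map and the hard-coded error message, to add one policy type. As an alternative sketch (not what the commit does), the message can be derived from a single ordered list so the two cannot drift apart. The `policyTypes` slice and the lowercase `validatePolicyType` name are hypothetical; `ValidationError` is re-declared here only to keep the example self-contained.

```go
package main

import (
	"fmt"
	"strings"
)

// ValidationError mirrors the shape used in the handler package:
// Error() returns the Message field.
type ValidationError struct {
	Field   string
	Message string
}

func (e ValidationError) Error() string { return e.Message }

// policyTypes is a hypothetical single source of truth for accepted types.
var policyTypes = []string{
	"AllowedIssuers",
	"AllowedDomains",
	"RequiredMetadata",
	"AllowedEnvironments",
	"RenewalLeadTime",
	"CertificateLifetime",
}

// validatePolicyType accepts any value, stringifies it, and compares it
// case-sensitively against policyTypes.
func validatePolicyType(policyType interface{}) error {
	typeStr := fmt.Sprintf("%v", policyType)
	for _, t := range policyTypes {
		if typeStr == t {
			return nil
		}
	}
	return ValidationError{
		Field:   "type",
		Message: "type must be one of: " + strings.Join(policyTypes, ", "),
	}
}

func main() {
	fmt.Println(validatePolicyType("CertificateLifetime"))
	fmt.Println(validatePolicyType("allowedissuers"))
}
```

A slice is used instead of a map so that the order of names in the message stays stable.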
+562
@@ -0,0 +1,562 @@
package handler
import (
"crypto/ecdsa"
"crypto/elliptic"
"crypto/rand"
"crypto/x509"
"crypto/x509/pkix"
"encoding/pem"
"strings"
"testing"
)
// TestValidateCommonName_ValidInputs tests common names that should pass validation.
func TestValidateCommonName_ValidInputs(t *testing.T) {
tests := []struct {
name string
cn string
}{
{
name: "simple hostname",
cn: "example.com",
},
{
name: "wildcard domain",
cn: "*.example.com",
},
{
name: "subdomain",
cn: "sub.deep.example.com",
},
{
name: "IPv4 address",
cn: "192.168.1.1",
},
{
name: "IPv6 address",
cn: "2001:db8::1",
},
{
name: "email address (S/MIME)",
cn: "user@example.com",
},
{
name: "hostname with hyphen",
cn: "my-host",
},
{
name: "single character hostname",
cn: "a",
},
{
name: "hostname with underscore",
cn: "my_host",
},
{
name: "complex subdomain",
cn: "api.v1.internal.example.com",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
err := ValidateCommonName(tt.cn)
if err != nil {
t.Errorf("ValidateCommonName(%q) = %v, want nil", tt.cn, err)
}
})
}
}
// TestValidateCommonName_InvalidInputs tests common names that should fail validation.
func TestValidateCommonName_InvalidInputs(t *testing.T) {
tests := []struct {
name string
cn string
wantErr bool
}{
{
name: "empty string",
cn: "",
wantErr: true,
},
{
name: "whitespace only",
cn: " ",
wantErr: true,
},
{
name: "string exceeds 253 characters",
cn: strings.Repeat("a", 254),
wantErr: true,
},
{
name: "path traversal attempt",
cn: "../etc/passwd",
wantErr: true,
},
{
name: "label starts with hyphen",
cn: "-example.com",
wantErr: true,
},
{
name: "label ends with hyphen",
cn: "example-.com",
wantErr: true,
},
{
name: "empty label",
cn: "example..com",
wantErr: true,
},
{
name: "invalid character space",
cn: "my host.com",
wantErr: true,
},
{
name: "invalid character slash",
cn: "my/host.com",
wantErr: true,
},
{
name: "malformed email",
cn: "notanemail@",
wantErr: true,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
err := ValidateCommonName(tt.cn)
if (err != nil) != tt.wantErr {
t.Errorf("ValidateCommonName(%q) error = %v, wantErr %v", tt.cn, err, tt.wantErr)
}
})
}
}
// TestValidateRequired_EmptyAndWhitespace tests required field validation.
func TestValidateRequired_EmptyAndWhitespace(t *testing.T) {
tests := []struct {
name string
field string
value string
wantErr bool
}{
{
name: "empty value",
field: "test_field",
value: "",
wantErr: true,
},
{
name: "valid value",
field: "test_field",
value: "value",
wantErr: false,
},
{
name: "whitespace only value",
field: "another_field",
value: " ",
wantErr: false, // ValidateRequired does not trim input, so a whitespace-only value counts as present
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
err := ValidateRequired(tt.field, tt.value)
if (err != nil) != tt.wantErr {
t.Errorf("ValidateRequired(%q, %q) error = %v, wantErr %v", tt.field, tt.value, err, tt.wantErr)
}
if err != nil {
ve, ok := err.(ValidationError)
if !ok {
t.Fatalf("Expected ValidationError, got %T", err)
}
if ve.Field != tt.field {
t.Errorf("Expected field %q, got %q", tt.field, ve.Field)
}
}
})
}
}
// TestValidateStringLength_Boundary tests string length validation at boundaries.
func TestValidateStringLength_Boundary(t *testing.T) {
tests := []struct {
name string
field string
value string
maxLen int
wantErr bool
}{
{
name: "at max length",
field: "test",
value: "0123456789",
maxLen: 10,
wantErr: false,
},
{
name: "under max length",
field: "test",
value: "012345678",
maxLen: 10,
wantErr: false,
},
{
name: "exceeds max length",
field: "test",
value: "01234567890",
maxLen: 10,
wantErr: true,
},
{
name: "empty string",
field: "test",
value: "",
maxLen: 10,
wantErr: false,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
err := ValidateStringLength(tt.field, tt.value, tt.maxLen)
if (err != nil) != tt.wantErr {
t.Errorf("ValidateStringLength(%q, %q, %d) error = %v, wantErr %v",
tt.field, tt.value, tt.maxLen, err, tt.wantErr)
}
if err != nil {
ve, ok := err.(ValidationError)
if !ok {
t.Fatalf("Expected ValidationError, got %T", err)
}
if ve.Field != tt.field {
t.Errorf("Expected field %q, got %q", tt.field, ve.Field)
}
}
})
}
}
// TestValidateCSRPEM_Valid tests validation of a real CSR PEM.
func TestValidateCSRPEM_Valid(t *testing.T) {
// Generate a real CSR using crypto/x509
privateKey, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
if err != nil {
t.Fatalf("Failed to generate private key: %v", err)
}
csrTemplate := &x509.CertificateRequest{
Subject: pkixName("example.com"),
}
csrDER, err := x509.CreateCertificateRequest(rand.Reader, csrTemplate, privateKey)
if err != nil {
t.Fatalf("Failed to create CSR: %v", err)
}
csrPEM := pem.EncodeToMemory(&pem.Block{
Type: "CERTIFICATE REQUEST",
Bytes: csrDER,
})
err = ValidateCSRPEM(string(csrPEM))
if err != nil {
t.Errorf("ValidateCSRPEM() on valid CSR returned error: %v", err)
}
}
// TestValidateCSRPEM_InvalidInputs tests CSR validation with invalid inputs.
func TestValidateCSRPEM_InvalidInputs(t *testing.T) {
tests := []struct {
name string
csrPEM string
wantErr bool
}{
{
name: "empty string",
csrPEM: "",
wantErr: true,
},
{
name: "not PEM format",
csrPEM: "not-a-pem-block",
wantErr: true,
},
{
name: "garbage data",
csrPEM: "asdfjkl;asdfjkl;",
wantErr: true,
},
{
name: "certificate PEM (not CSR)",
csrPEM: "-----BEGIN CERTIFICATE-----\nMIIC",
wantErr: true,
},
{
name: "PEM with wrong type",
csrPEM: "-----BEGIN PRIVATE KEY-----\ndata",
wantErr: true,
},
{
name: "whitespace only",
csrPEM: " \n ",
wantErr: true,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
err := ValidateCSRPEM(tt.csrPEM)
if (err != nil) != tt.wantErr {
t.Errorf("ValidateCSRPEM(%q) error = %v, wantErr %v", tt.csrPEM, err, tt.wantErr)
}
if err != nil {
ve, ok := err.(ValidationError)
if !ok {
t.Fatalf("Expected ValidationError, got %T", err)
}
if ve.Field != "csr_pem" {
t.Errorf("Expected field 'csr_pem', got %q", ve.Field)
}
}
})
}
}
// TestValidatePolicyType_ValidTypes tests valid policy types.
func TestValidatePolicyType_ValidTypes(t *testing.T) {
validTypes := []struct {
name string
ptype interface{}
}{
{
name: "AllowedIssuers",
ptype: "AllowedIssuers",
},
{
name: "AllowedDomains",
ptype: "AllowedDomains",
},
{
name: "RequiredMetadata",
ptype: "RequiredMetadata",
},
{
name: "AllowedEnvironments",
ptype: "AllowedEnvironments",
},
{
name: "RenewalLeadTime",
ptype: "RenewalLeadTime",
},
{
name: "CertificateLifetime",
ptype: "CertificateLifetime",
},
}
for _, tt := range validTypes {
t.Run(tt.name, func(t *testing.T) {
err := ValidatePolicyType(tt.ptype)
if err != nil {
t.Errorf("ValidatePolicyType(%v) = %v, want nil", tt.ptype, err)
}
})
}
}
// TestValidatePolicyType_InvalidType tests invalid policy types.
func TestValidatePolicyType_InvalidType(t *testing.T) {
tests := []struct {
name string
ptype interface{}
wantErr bool
}{
{
name: "nonexistent type",
ptype: "NonexistentType",
wantErr: true,
},
{
name: "empty string",
ptype: "",
wantErr: true,
},
{
name: "lowercase type",
ptype: "allowedissuers",
wantErr: true,
},
{
name: "integer type",
ptype: 123,
wantErr: true,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
err := ValidatePolicyType(tt.ptype)
if (err != nil) != tt.wantErr {
t.Errorf("ValidatePolicyType(%v) error = %v, wantErr %v", tt.ptype, err, tt.wantErr)
}
if err != nil {
ve, ok := err.(ValidationError)
if !ok {
t.Fatalf("Expected ValidationError, got %T", err)
}
if ve.Field != "type" {
t.Errorf("Expected field 'type', got %q", ve.Field)
}
}
})
}
}
// TestValidatePolicySeverity_ValidSeverities tests valid severity levels.
func TestValidatePolicySeverity_ValidSeverities(t *testing.T) {
validSeverities := []struct {
name string
sev interface{}
}{
{
name: "Warning",
sev: "Warning",
},
{
name: "Error",
sev: "Error",
},
{
name: "Critical",
sev: "Critical",
},
}
for _, tt := range validSeverities {
t.Run(tt.name, func(t *testing.T) {
err := ValidatePolicySeverity(tt.sev)
if err != nil {
t.Errorf("ValidatePolicySeverity(%v) = %v, want nil", tt.sev, err)
}
})
}
}
// TestValidatePolicySeverity_InvalidSeverity tests invalid severity levels.
func TestValidatePolicySeverity_InvalidSeverity(t *testing.T) {
tests := []struct {
name string
sev interface{}
wantErr bool
}{
{
name: "lowercase warning",
sev: "warning",
wantErr: true,
},
{
name: "nonexistent severity",
sev: "Severe",
wantErr: true,
},
{
name: "empty string",
sev: "",
wantErr: true,
},
{
name: "integer",
sev: 1,
wantErr: true,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
err := ValidatePolicySeverity(tt.sev)
if (err != nil) != tt.wantErr {
t.Errorf("ValidatePolicySeverity(%v) error = %v, wantErr %v", tt.sev, err, tt.wantErr)
}
if err != nil {
ve, ok := err.(ValidationError)
if !ok {
t.Fatalf("Expected ValidationError, got %T", err)
}
if ve.Field != "severity" {
t.Errorf("Expected field 'severity', got %q", ve.Field)
}
}
})
}
}
// TestValidationError_ErrorMessage tests ValidationError.Error() method.
func TestValidationError_ErrorMessage(t *testing.T) {
tests := []struct {
name string
err ValidationError
wantMsg string
}{
{
name: "simple message",
err: ValidationError{
Field: "common_name",
Message: "common_name is required",
},
wantMsg: "common_name is required",
},
{
name: "detailed message",
err: ValidationError{
Field: "csr_pem",
Message: "csr_pem must be a valid PEM-encoded certificate request",
},
wantMsg: "csr_pem must be a valid PEM-encoded certificate request",
},
{
name: "error with field info",
err: ValidationError{
Field: "test_field",
Message: "test_field must be 10 characters or fewer",
},
wantMsg: "test_field must be 10 characters or fewer",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
errMsg := tt.err.Error()
if errMsg != tt.wantMsg {
t.Errorf("ValidationError.Error() = %q, want %q", errMsg, tt.wantMsg)
}
})
}
}
// TestValidationError_IsError tests that ValidationError satisfies error interface.
func TestValidationError_IsError(t *testing.T) {
ve := ValidationError{
Field: "test",
Message: "test error",
}
// Assign to interface variable to verify it satisfies error
var err error = ve
_ = err
msg := ve.Error()
if msg != "test error" {
t.Errorf("Expected error message 'test error', got %q", msg)
}
}
// pkixName is a helper function to create PKIX name (used in CSR generation).
func pkixName(cn string) pkix.Name {
return pkix.Name{
CommonName: cn,
}
}
+159 -58
@@ -4,16 +4,22 @@ import (
"context"
"crypto/sha256"
"encoding/hex"
"errors"
"fmt"
"io"
"log/slog"
"net/http"
"strings"
"sync"
"time"
)
// AuditRecorder is the interface that the audit middleware uses to record API calls.
// This avoids importing the service package directly, maintaining dependency inversion.
//
// Implementations may perform I/O (e.g., database writes). The middleware invokes
// RecordAPICall from a tracked goroutine so that callers can drain in-flight
// recordings during graceful shutdown via AuditMiddleware.Flush.
type AuditRecorder interface {
RecordAPICall(ctx context.Context, method, path, actor string, bodyHash string, status int, latencyMs int64) error
}
@@ -26,10 +32,42 @@ type AuditConfig struct {
Logger *slog.Logger
}
// NewAuditLog creates a middleware that records every API call to the audit trail.
// It captures method, path, authenticated actor, request body hash, response status, and latency.
// Audit recording is best-effort — failures are logged but don't affect the HTTP response.
func NewAuditLog(recorder AuditRecorder, cfg AuditConfig) func(http.Handler) http.Handler {
// ErrAuditFlushTimeout is returned by AuditMiddleware.Flush when in-flight audit
// recordings do not complete before the provided context is cancelled or its
// deadline elapses. It mirrors scheduler.ErrSchedulerShutdownTimeout so callers
// can branch on graceful-shutdown timeouts consistently across subsystems.
var ErrAuditFlushTimeout = errors.New("audit middleware flush timeout")
// AuditMiddleware is the handle returned by NewAuditLog. It wraps the audit
// logging HTTP middleware and tracks the goroutines spawned to record each API
// call, so that callers can drain them during graceful shutdown (M-1, CWE-662
// / CWE-400). The goroutines themselves still run detached from the request
// context — the shutdown-drain signal flows through this struct's WaitGroup
// instead of the per-request context.
type AuditMiddleware struct {
recorder AuditRecorder
logger *slog.Logger
excludeSet map[string]bool
// wg tracks every audit-recording goroutine spawned by Middleware so Flush
// can block until they complete before the DB pool is torn down.
wg sync.WaitGroup
}
// NewAuditLog constructs the API audit logging middleware. The returned
// *AuditMiddleware exposes the HTTP middleware via the Middleware method value
// (same func(http.Handler) http.Handler shape) and a Flush method that the
// process shutdown path must call after the HTTP server has stopped accepting
// new requests but before the audit recorder's backing store (e.g., the
// database connection pool) is closed.
//
// The middleware records method, path, authenticated actor, request body hash,
// response status, and latency. Recording is best-effort — individual failures
// are logged and do not affect the HTTP response. Shutdown is NOT best-effort:
// Flush must succeed (or time out, returning ErrAuditFlushTimeout) so that
// in-flight events are not lost when the audit recorder's connection pool is
// closed out from under the goroutines.
func NewAuditLog(recorder AuditRecorder, cfg AuditConfig) *AuditMiddleware {
excludeSet := make(map[string]bool, len(cfg.ExcludePaths))
for _, p := range cfg.ExcludePaths {
excludeSet[p] = true
@@ -40,68 +78,131 @@ func NewAuditLog(recorder AuditRecorder, cfg AuditConfig) func(http.Handler) htt
logger = slog.Default()
}
return func(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
// Skip excluded paths (health, readiness probes)
for prefix := range excludeSet {
if strings.HasPrefix(r.URL.Path, prefix) {
next.ServeHTTP(w, r)
return
}
return &AuditMiddleware{
recorder: recorder,
logger: logger,
excludeSet: excludeSet,
}
}
// Middleware is the http.Handler wrapper. It has the standard
// func(http.Handler) http.Handler middleware signature so it can be composed
// into an existing middleware chain via a method value (auditMiddleware.Middleware).
func (a *AuditMiddleware) Middleware(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
// Skip excluded paths (health, readiness probes)
for prefix := range a.excludeSet {
if strings.HasPrefix(r.URL.Path, prefix) {
next.ServeHTTP(w, r)
return
}
}
start := time.Now()
start := time.Now()
// Hash request body for audit (don't store raw bodies — security + size concerns)
bodyHash := ""
if r.Body != nil && r.Body != http.NoBody {
hasher := sha256.New()
body, err := io.ReadAll(r.Body)
if err == nil && len(body) > 0 {
hasher.Write(body)
bodyHash = hex.EncodeToString(hasher.Sum(nil))[:16] // truncated hash
// Restore the body for downstream handlers
r.Body = io.NopCloser(strings.NewReader(string(body)))
}
// Hash request body for audit (don't store raw bodies — security + size concerns)
bodyHash := ""
if r.Body != nil && r.Body != http.NoBody {
hasher := sha256.New()
body, err := io.ReadAll(r.Body)
if err == nil && len(body) > 0 {
hasher.Write(body)
bodyHash = hex.EncodeToString(hasher.Sum(nil))[:16] // truncated hash
// Restore the body for downstream handlers
r.Body = io.NopCloser(strings.NewReader(string(body)))
}
}
// Extract actor from auth context
actor := "anonymous"
if user, ok := GetUser(r.Context()); ok && user != "" {
actor = user
// Extract actor from auth context
actor := "anonymous"
if user := GetUser(r.Context()); user != "" {
actor = user
}
// Wrap response writer to capture status code
wrapped := &responseWriter{ResponseWriter: w, statusCode: http.StatusOK}
next.ServeHTTP(wrapped, r)
latency := time.Since(start).Milliseconds()
// Snapshot request-derived inputs so the goroutine does not race with
// the http.Server reusing r after this handler returns.
method := r.Method
path := r.URL.Path
status := wrapped.statusCode
// Derive a detached context that preserves request-scoped values
// (trace IDs, auth info carried via context keys) but is not cancelled
// when the HTTP server finalizes the request. Using r.Context()
// directly would cause the async audit write to observe ctx.Done()
// as soon as the response completes; using context.Background() would
// discard useful observability metadata. WithoutCancel gives us both
// (M-2 / D-3).
auditCtx := context.WithoutCancel(r.Context())
// Record audit event asynchronously (best-effort, don't block response).
// SECURITY: We intentionally use r.URL.Path (not r.URL.String() or r.RequestURI)
// to prevent query parameters from being recorded in the immutable audit trail.
// Query strings may contain cursor tokens, API keys passed as params, or other
// sensitive filter values. Since the audit trail is append-only with no deletion
// capability, any sensitive data recorded would persist permanently.
//
// The goroutine is tracked in a.wg so AuditMiddleware.Flush can drain
// in-flight recordings during graceful shutdown. Without this (M-1,
// CWE-662 / CWE-400), SIGTERM would close the DB pool while recordings
// were still mid-flight, silently dropping audit events.
a.wg.Add(1)
go func() {
defer a.wg.Done()
if err := a.recorder.RecordAPICall(
auditCtx,
method,
path,
actor,
bodyHash,
status,
latency,
); err != nil {
a.logger.Error("failed to record API audit event",
"error", err,
"method", method,
"path", path,
)
}
}()
})
}
// Wrap response writer to capture status code
wrapped := &responseWriter{ResponseWriter: w, statusCode: http.StatusOK}
// Flush blocks until every audit-recording goroutine spawned by Middleware has
// completed, or until ctx is cancelled / its deadline elapses. It must be
// called from the process shutdown path after http.Server.Shutdown has
// returned (so no new requests are being accepted) but before the backing
// audit recorder's resources (DB pool, etc.) are torn down.
//
// On timeout or cancellation Flush returns ErrAuditFlushTimeout wrapped with
// any context error; in-flight goroutines continue to run and may still write
// to the recorder once they unblock — the caller is responsible for deciding
// whether to proceed with teardown anyway or surface the error.
//
// Flush mirrors the idiom used by scheduler.Scheduler.WaitForCompletion so
// that the two subsystems drain identically at shutdown.
func (a *AuditMiddleware) Flush(ctx context.Context) error {
done := make(chan struct{})
go func() {
a.wg.Wait()
close(done)
}()
next.ServeHTTP(wrapped, r)
latency := time.Since(start).Milliseconds()
// Record audit event asynchronously (best-effort, don't block response).
// SECURITY: We intentionally use r.URL.Path (not r.URL.String() or r.RequestURI)
// to prevent query parameters from being recorded in the immutable audit trail.
// Query strings may contain cursor tokens, API keys passed as params, or other
// sensitive filter values. Since the audit trail is append-only with no deletion
// capability, any sensitive data recorded would persist permanently.
go func() {
if err := recorder.RecordAPICall(
context.Background(),
r.Method,
r.URL.Path,
actor,
bodyHash,
wrapped.statusCode,
latency,
); err != nil {
logger.Error("failed to record API audit event",
"error", err,
"method", r.Method,
"path", r.URL.Path,
)
}
}()
})
select {
case <-done:
a.logger.Info("audit middleware flush complete")
return nil
case <-ctx.Done():
a.logger.Warn("audit middleware flush did not complete before context cancellation",
"error", ctx.Err(),
)
return fmt.Errorf("%w: %w", ErrAuditFlushTimeout, ctx.Err())
}
}
+133 -14
@@ -2,6 +2,7 @@ package middleware
import (
"context"
"errors"
"fmt"
"io"
"net/http"
@@ -16,7 +17,8 @@ import (
type mockAuditRecorder struct {
mu sync.Mutex
calls []auditCall
err error // if non-nil, RecordAPICall returns this
err error // if non-nil, RecordAPICall returns this
block chan struct{} // if non-nil, RecordAPICall blocks on receive before returning
}
type auditCall struct {
@@ -29,6 +31,13 @@ type auditCall struct {
}
func (m *mockAuditRecorder) RecordAPICall(ctx context.Context, method, path, actor, bodyHash string, status int, latencyMs int64) error {
// Optional: block the recorder until a signal is received so tests can
// exercise the shutdown-drain path deterministically. The block happens
// before any state mutation so Flush-timeout tests see the call
// "in-flight" (wg counter > 0) with no recorded entries yet.
if m.block != nil {
<-m.block
}
m.mu.Lock()
defer m.mu.Unlock()
m.calls = append(m.calls, auditCall{
@@ -90,7 +99,7 @@ func (w *waitableAuditRecorder) Wait(timeout time.Duration) bool {
func TestAuditLog_RecordsAPICall(t *testing.T) {
recorder := newWaitableAuditRecorder()
mw := NewAuditLog(recorder, AuditConfig{})
mw := NewAuditLog(recorder, AuditConfig{}).Middleware
handler := mw(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
@@ -130,7 +139,7 @@ func TestAuditLog_RecordsAPICall(t *testing.T) {
func TestAuditLog_CapturesStatusCode(t *testing.T) {
recorder := newWaitableAuditRecorder()
mw := NewAuditLog(recorder, AuditConfig{})
mw := NewAuditLog(recorder, AuditConfig{}).Middleware
handler := mw(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusNotFound)
@@ -157,7 +166,7 @@ func TestAuditLog_ExcludesHealth(t *testing.T) {
recorder := newWaitableAuditRecorder()
mw := NewAuditLog(recorder, AuditConfig{
ExcludePaths: []string{"/health", "/ready"},
})
}).Middleware
handler := mw(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
@@ -193,7 +202,7 @@ func TestAuditLog_ExcludesHealth(t *testing.T) {
func TestAuditLog_HashesRequestBody(t *testing.T) {
recorder := newWaitableAuditRecorder()
mw := NewAuditLog(recorder, AuditConfig{})
mw := NewAuditLog(recorder, AuditConfig{}).Middleware
// Handler verifies body was restored
handler := mw(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
@@ -228,7 +237,7 @@ func TestAuditLog_HashesRequestBody(t *testing.T) {
func TestAuditLog_EmptyBodyNoHash(t *testing.T) {
recorder := newWaitableAuditRecorder()
mw := NewAuditLog(recorder, AuditConfig{})
mw := NewAuditLog(recorder, AuditConfig{}).Middleware
handler := mw(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
@@ -253,15 +262,16 @@ func TestAuditLog_EmptyBodyNoHash(t *testing.T) {
func TestAuditLog_ExtractsAuthenticatedActor(t *testing.T) {
recorder := newWaitableAuditRecorder()
mw := NewAuditLog(recorder, AuditConfig{})
mw := NewAuditLog(recorder, AuditConfig{}).Middleware
handler := mw(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
}))
req := httptest.NewRequest(http.MethodDelete, "/api/v1/certificates/mc-1", nil)
// Simulate auth middleware having set the user in context
ctx := context.WithValue(req.Context(), UserKey{}, "api-key-user")
// Simulate auth middleware having set the named-key identity in context
// (post-M-002: actor is the named-key name, not the old "api-key-user").
ctx := context.WithValue(req.Context(), UserKey{}, "ops-admin")
req = req.WithContext(ctx)
rr := httptest.NewRecorder()
@@ -275,8 +285,8 @@ func TestAuditLog_ExtractsAuthenticatedActor(t *testing.T) {
if len(calls) != 1 {
t.Fatalf("expected 1 audit call, got %d", len(calls))
}
if calls[0].Actor != "api-key-user" {
t.Errorf("expected actor api-key-user, got %s", calls[0].Actor)
if calls[0].Actor != "ops-admin" {
t.Errorf("expected actor ops-admin, got %s", calls[0].Actor)
}
if calls[0].Method != "DELETE" {
t.Errorf("expected method DELETE, got %s", calls[0].Method)
@@ -285,7 +295,7 @@ func TestAuditLog_ExtractsAuthenticatedActor(t *testing.T) {
func TestAuditLog_RecorderErrorDoesNotBreakResponse(t *testing.T) {
recorder := &mockAuditRecorder{err: fmt.Errorf("db connection lost")}
mw := NewAuditLog(recorder, AuditConfig{})
mw := NewAuditLog(recorder, AuditConfig{}).Middleware
handler := mw(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
@@ -304,7 +314,7 @@ func TestAuditLog_RecorderErrorDoesNotBreakResponse(t *testing.T) {
func TestAuditLog_CapturesLatency(t *testing.T) {
recorder := newWaitableAuditRecorder()
mw := NewAuditLog(recorder, AuditConfig{})
mw := NewAuditLog(recorder, AuditConfig{}).Middleware
handler := mw(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
time.Sleep(10 * time.Millisecond)
@@ -330,7 +340,7 @@ func TestAuditLog_CapturesLatency(t *testing.T) {
func TestAuditLog_ExcludesQueryParamsFromPath(t *testing.T) {
recorder := newWaitableAuditRecorder()
mw := NewAuditLog(recorder, AuditConfig{})
mw := NewAuditLog(recorder, AuditConfig{}).Middleware
handler := mw(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
@@ -429,3 +439,112 @@ func TestAuditServiceAdapter_PropagatesError(t *testing.T) {
t.Errorf("expected database error, got %v", err)
}
}
// TestAuditLog_FlushDrainsInFlightGoroutines verifies the M-1 shutdown-drain
// contract: Flush blocks until every audit-recording goroutine spawned by the
// middleware completes, then returns nil. Without the drain (pre-M-1 code),
// the DB pool would be closed while in-flight goroutines were still calling
// RecordAPICall, silently dropping audit events (CWE-662 / CWE-400).
func TestAuditLog_FlushDrainsInFlightGoroutines(t *testing.T) {
// Recorder blocks on `unblock` until the test releases it. This simulates
// a slow DB write still in flight when shutdown begins.
unblock := make(chan struct{})
recorder := &mockAuditRecorder{block: unblock}
auditMW := NewAuditLog(recorder, AuditConfig{})
handler := auditMW.Middleware(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
}))
// Fire a request. Handler returns immediately; recorder goroutine is
// parked on the `unblock` channel inside RecordAPICall.
req := httptest.NewRequest(http.MethodGet, "/api/v1/certificates", nil)
rr := httptest.NewRecorder()
handler.ServeHTTP(rr, req)
// Start Flush in a goroutine — it must block on the WaitGroup until we
// release the recorder.
flushDone := make(chan error, 1)
go func() {
ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
defer cancel()
flushDone <- auditMW.Flush(ctx)
}()
// Confirm Flush is actually blocked (not returning immediately).
select {
case err := <-flushDone:
t.Fatalf("Flush returned before recorder unblocked: err=%v", err)
case <-time.After(50 * time.Millisecond):
// expected: Flush is blocked on wg.Wait
}
// Release the recorder. Flush should now observe wg counter drop to 0
// and return nil.
close(unblock)
select {
case err := <-flushDone:
if err != nil {
t.Fatalf("expected nil from Flush after drain, got %v", err)
}
case <-time.After(2 * time.Second):
t.Fatal("Flush did not return after recorder unblocked")
}
// Verify the audit event was actually recorded (i.e., the goroutine
// completed its write — not just that Flush unblocked).
calls := recorder.getCalls()
if len(calls) != 1 {
t.Fatalf("expected 1 recorded audit call, got %d", len(calls))
}
if calls[0].Path != "/api/v1/certificates" {
t.Errorf("expected path /api/v1/certificates, got %s", calls[0].Path)
}
}
// TestAuditLog_FlushTimeoutReturnsErrAuditFlushTimeout verifies that Flush
// respects its context: when in-flight goroutines exceed the shutdown budget,
// Flush returns an error wrapping ErrAuditFlushTimeout plus ctx.Err(). The
// caller can then decide whether to proceed with teardown anyway.
func TestAuditLog_FlushTimeoutReturnsErrAuditFlushTimeout(t *testing.T) {
// Recorder will never unblock on its own — we unblock at end of test for
// a clean race-safe teardown.
unblock := make(chan struct{})
recorder := &mockAuditRecorder{block: unblock}
auditMW := NewAuditLog(recorder, AuditConfig{})
handler := auditMW.Middleware(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
}))
req := httptest.NewRequest(http.MethodPost, "/api/v1/certificates", nil)
rr := httptest.NewRecorder()
handler.ServeHTTP(rr, req)
// Flush with a tiny deadline — must time out.
ctx, cancel := context.WithTimeout(context.Background(), 20*time.Millisecond)
defer cancel()
err := auditMW.Flush(ctx)
if err == nil {
// Release the blocked goroutine before failing so the race detector
// doesn't trip on teardown.
close(unblock)
t.Fatal("expected Flush to return an error on timeout, got nil")
}
if !errors.Is(err, ErrAuditFlushTimeout) {
close(unblock)
t.Fatalf("expected error to wrap ErrAuditFlushTimeout, got %v", err)
}
if !errors.Is(err, context.DeadlineExceeded) {
close(unblock)
t.Fatalf("expected error to wrap context.DeadlineExceeded, got %v", err)
}
// Race-safe teardown: unblock the recorder goroutine so it exits cleanly
// before the test returns. The goroutine itself is still detached and
// will record to the mock even after Flush timed out — that's the
// documented behavior (Flush surfaces the timeout; caller decides).
close(unblock)
}
+98 -31
@@ -5,6 +5,7 @@ import (
"crypto/sha256"
"crypto/subtle"
"encoding/hex"
"fmt"
"log"
"log/slog"
"net/http"
@@ -21,6 +22,16 @@ type RequestIDKey struct{}
// UserKey is the context key for storing authenticated user information.
type UserKey struct{}
// AdminKey is the context key for storing admin flag information.
type AdminKey struct{}
// NamedAPIKey represents a named API key with optional admin flag.
type NamedAPIKey struct {
Name string
Key string
Admin bool
}
// RequestID middleware generates a unique request ID and adds it to the request context and response headers.
func RequestID(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
@@ -78,10 +89,17 @@ func NewLogging(logger *slog.Logger) func(http.Handler) http.Handler {
// Recovery middleware recovers from panics and returns a 500 error.
func Recovery(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
ctx := r.Context()
defer func() {
if err := recover(); err != nil {
requestID := getRequestID(r.Context())
log.Printf("[%s] PANIC: %v", requestID, err)
requestID := getRequestID(ctx)
// Use slog.ErrorContext so the panic log carries the same
// request-scoped trace/auth metadata as normal request logs
// (M-2 / D-3 — preserve ctx propagation on the panic path).
slog.ErrorContext(ctx, "panic recovered in HTTP handler",
"request_id", requestID,
"panic", fmt.Sprintf("%v", err),
)
http.Error(w, `{"error":"Internal Server Error"}`, http.StatusInternalServerError)
}
}()
@@ -104,35 +122,40 @@ type AuthConfig struct {
Secret string // The raw API key or comma-separated list of valid API keys
}
// NewAuth creates an authentication middleware based on config.
// When Type is "none", all requests pass through (demo/development mode).
// When Type is "api-key", requests must include a valid Bearer token.
// The Secret field supports a comma-separated list of valid API keys for
// zero-downtime key rotation. Rotation workflow:
// 1. Add new key to comma-separated list, restart server
// 2. Update all agents/clients to use new key
// 3. Remove old key from list, restart server
func NewAuth(cfg AuthConfig) func(http.Handler) http.Handler {
if cfg.Type == "none" {
// NewAuthWithNamedKeys creates an authentication middleware that validates
// Bearer tokens against a set of named API keys. Each key carries a name
// (propagated as the actor via context) and an admin flag (consulted by
// authorization gates such as bulk revocation).
//
// When namedKeys is empty the returned middleware is a no-op pass-through,
// which is used in demo/development mode (CERTCTL_AUTH_TYPE=none). When one
// or more keys are provided, requests must include a matching Bearer token
// or they are rejected with 401.
func NewAuthWithNamedKeys(namedKeys []NamedAPIKey) func(http.Handler) http.Handler {
if len(namedKeys) == 0 {
return func(next http.Handler) http.Handler {
return next
}
}
// Pre-compute hashes of all valid keys for constant-time comparison.
// Supports comma-separated list for zero-downtime key rotation.
keys := strings.Split(cfg.Secret, ",")
var expectedHashes []string
for _, k := range keys {
k = strings.TrimSpace(k)
if k != "" {
expectedHashes = append(expectedHashes, HashAPIKey(k))
}
type keyEntry struct {
hash string
name string
admin bool
}
var entries []keyEntry
for _, nk := range namedKeys {
entries = append(entries, keyEntry{
hash: HashAPIKey(nk.Key),
name: nk.Name,
admin: nk.Admin,
})
}
// Warn if only one key is configured in production mode
if len(expectedHashes) == 1 {
slog.Warn("only one API key configured — consider adding a rotation key via comma-separated CERTCTL_AUTH_SECRET for zero-downtime rotation")
if len(entries) == 1 {
slog.Warn("only one API key configured — consider adding a rotation key for zero-downtime rotation")
}
return func(next http.Handler) http.Handler {
@@ -156,27 +179,60 @@ func NewAuth(cfg AuthConfig) func(http.Handler) http.Handler {
tokenHash := HashAPIKey(token)
// Check against all valid keys using constant-time comparison
authorized := false
for _, expectedHash := range expectedHashes {
if subtle.ConstantTimeCompare([]byte(tokenHash), []byte(expectedHash)) == 1 {
authorized = true
var matched *keyEntry
for i := range entries {
if subtle.ConstantTimeCompare([]byte(tokenHash), []byte(entries[i].hash)) == 1 {
matched = &entries[i]
break
}
}
if !authorized {
if matched == nil {
w.Header().Set("Content-Type", "application/json; charset=utf-8")
http.Error(w, `{"error":"Invalid API key"}`, http.StatusUnauthorized)
return
}
// Store the authenticated identity in context
ctx := context.WithValue(r.Context(), UserKey{}, "api-key-user")
// Store the authenticated identity and admin flag in context
ctx := context.WithValue(r.Context(), UserKey{}, matched.name)
ctx = context.WithValue(ctx, AdminKey{}, matched.admin)
next.ServeHTTP(w, r.WithContext(ctx))
})
}
}
// NewAuth is a legacy shim that converts a comma-separated Secret list into
// synthesized legacy-key-N named entries and delegates to NewAuthWithNamedKeys.
// It preserves the pre-M-002 behavior for callers that still pass raw AuthConfig
// (primarily cmd/server/main_test.go). The synthesized actor is "legacy-key-N"
// rather than the old hardcoded "api-key-user" so audit events carry
// meaningful identity even on the legacy path.
//
// Deprecated: Use NewAuthWithNamedKeys with explicit NamedAPIKey entries.
func NewAuth(cfg AuthConfig) func(http.Handler) http.Handler {
if cfg.Type == "none" {
return func(next http.Handler) http.Handler {
return next
}
}
var namedKeys []NamedAPIKey
idx := 0
for _, k := range strings.Split(cfg.Secret, ",") {
k = strings.TrimSpace(k)
if k == "" {
continue
}
namedKeys = append(namedKeys, NamedAPIKey{
Name: fmt.Sprintf("legacy-key-%d", idx),
Key: k,
Admin: false,
})
idx++
}
return NewAuthWithNamedKeys(namedKeys)
}
// RateLimitConfig holds configuration for the rate limiter.
type RateLimitConfig struct {
RPS float64 // Requests per second
@@ -336,9 +392,20 @@ func getRequestID(ctx context.Context) string {
}
// GetUser extracts the authenticated user from context.
func GetUser(ctx context.Context) (string, bool) {
// Returns the name of the matched API key, or an empty string if no
// authenticated identity is present in the context.
func GetUser(ctx context.Context) string {
user, ok := ctx.Value(UserKey{}).(string)
return user, ok
if !ok {
return ""
}
return user
}
// IsAdmin extracts the admin flag from context.
// Returns true if the authenticated user has admin privileges.
func IsAdmin(ctx context.Context) bool {
admin, ok := ctx.Value(AdminKey{}).(bool)
return ok && admin
}
// responseWriter wraps http.ResponseWriter to capture the status code.

Some files were not shown because too many files have changed in this diff.