17 Commits

Author SHA1 Message Date
shankar0123 21aeed4f4e legal: addlicense headers + normalize legacy variants (Phase 0 RED-4)
Phase 0 closure (Path B2, post-rewrite):

addlicense sweep — adds the canonical certctl LLC copyright + BUSL-1.1
SPDX header to every production Go file. Template:

  // Copyright 2026 certctl LLC. All rights reserved.
  // SPDX-License-Identifier: BUSL-1.1

Coverage: 338 / 338 production Go files (cmd/ + internal/, excluding
*_test.go and **/testdata/**). Pre-sweep coverage was 22 / 338 (6.5%);
post-sweep is 338 / 338 (100%).

Normalized 22 pre-existing legacy headers (`// Copyright (c) certctl`
+ `// SPDX-License-Identifier: BSL-1.1`) and 1 file using a
`Certctl Contributors` attribution. The legacy SPDX ID `BSL-1.1`
is non-standard; the official SPDX identifier for Business Source
License 1.1 is `BUSL-1.1` (capital U). All 338 files now share the
canonical form.

Generated via:
  addlicense -c "certctl LLC" -y 2026 \
    -f cowork/legal/copyright-header.tpl \
    -ignore '**/testdata/**' -ignore '**/*_test.go' \
    cmd/ internal/

Verification:
  find cmd internal -name '*.go' -not -name '*_test.go' \
    -not -path '*/testdata/*' \
    -exec grep -L '^// Copyright 2026 certctl LLC' {} \; | wc -l

  Returns: 0

gofmt clean. Header additions are comments only, no compile impact.

Closes: cowork/certctl-architecture-diligence-audit.html#fix-RED-4
2026-05-13 21:23:35 +00:00
shankar0123 8b75e0311b chore: rename Go module path to github.com/certctl-io/certctl
Mechanical sed across the main go.mod's module declaration, the f5-mock-icontrol
sub-module's go.mod, every Go file's import path (361 files), and a rebuild of
the checked-in f5-mock-icontrol binary so its embedded build-info reflects the
new module path. No behavior change.

Choice B from cowork/transfer-certctl-to-org.md, executed 2026-05-04. Choice A
(keep module path declared as github.com/shankar0123/certctl regardless of
repo URL) shipped on the day of the org transfer (2026-05-03) since we had no
external Go consumers; this commit closes that deferral.

Backward-compat: GitHub HTTP redirects continue to forward
github.com/shankar0123/certctl → github.com/certctl-io/certctl at the URL
level, but Go's module proxy uses the path declared in go.mod as the
canonical name. Pre-fix, anyone trying `go get github.com/certctl-io/certctl/...`
hit a "module path mismatch" error because go.mod said
github.com/shankar0123/certctl and the URL they fetched it from said
certctl-io/certctl. Post-fix, the canonical name and the URL agree, so
go get / go install / external Go consumers / Go-tooling integrations
work cleanly via either the new path (preferred) or the old path (which
redirects and Go follows the redirect for source fetch).

Anyone still importing the old path inside their own code keeps working
provided they update their go.mod's `require` line to match — the module
path declared in their consumer's go.sum / go.mod is the authoritative
import name, so a mass sed across their import statements is the migration
on the consumer side. No external consumers exist today.

Diff shape:
  361 *.go files  — import path replacement only
    2 go.mod     — module declaration replacement only
    1 binary     — deploy/test/f5-mock-icontrol/f5-mock-icontrol rebuilt
                   so embedded build-info reflects the new path (8618965 vs
                   8618933 bytes; 32-byte diff is the build-info change)

  Total: 364 files, 730 insertions / 730 deletions, net-zero size, pure
  mechanical substitution.

Verification:
  gofmt: 17 files needed re-alignment after sed (the new path is one char
    shorter than the old, so column-aligned import groups drifted). Applied
    `gofmt -w` to fix.
  go mod tidy: clean exit on both modules.
  go vet ./...: clean exit.
  go build ./...: clean exit.
  go test -short -count=1 on representative packages: all green
    (internal/domain, internal/validation, internal/crypto, internal/crypto/signer,
    cmd/agent). Test output now reads `ok github.com/certctl-io/certctl/...`
    confirming the module path resolves correctly.
  binary: f5-mock-icontrol rebuilt; `strings | grep shankar0123` returns
    nothing; `strings | grep certctl-io/certctl` shows the new module path
    embedded in build-info.

Files intentionally NOT touched in this commit:
  README.md / CHANGELOG.md / docs/ / etc. — already swept to certctl-io
    URLs in commit 0729ee4 (the post-transfer URL refresh). This commit is
    purely the Go-tooling layer.
  Scarf pixels (`shankar0123.docker.scarf.sh/...`) — Scarf-account
    namespace, not a Go import or GitHub repo URL. Stays.

This is a non-blocking, non-customer-impacting change. Operators pulling
container images, running `make verify`, hitting the API, or installing the
agent see no functional difference. Only Go-tooling consumers (none today)
are affected, and they're enabled — not broken — by this commit.
2026-05-04 00:30:29 +00:00
shankar0123 fefa5a5fd7 acme: support serial-only revocation via local cert-version lookup
Closes the #7 acquisition-readiness blocker from the 2026-05-01 issuer
coverage audit. Pre-fix, ACME RevokeCertificate at acme.go:L519-L529
returned the literal error "ACME revocation by serial not supported in
V1; provide certificate DER". RFC 8555 §7.6 genuinely requires the
cert DER bytes (not just the serial), but a CLM platform's job is to
abstract over that limitation. Operators routinely have only the
serial in hand: lost PEM, rotated key, GUI revoke action driven by a
row in the certs list.

This commit:

- Adds CertificateLookupRepo interface at the ACME connector boundary
  (connector boundary, NOT a service/repository import — the connector
  accepts whatever satisfies the shape). Production wiring in
  cmd/server/main.go injects the postgres CertificateRepository; tests
  inject a fake.

- Adds CertificateRepository.GetVersionBySerial(ctx, issuerID, serial)
  + interface declaration in repository/interfaces.go, returning the
  certificate_versions row whose SerialNumber matches, scoped to the
  issuer via JOIN on managed_certificates. Mirrors the existing
  GetByIssuerAndSerial shape but returns the version (where PEMChain
  lives). Per RFC 5280 §5.2.3 the issuer scope is required for
  determinism.

- Adds SetCertificateLookup + SetIssuerID setters on *acme.Connector.
  Mirror the pattern local.Connector already uses for OCSP responder
  wiring. Both must be wired before serial-only revoke works;
  unwired state falls back to a more actionable error pointing at the
  wiring requirement (the historical "not supported" wording is
  retired).

- Rewrites RevokeCertificate end-to-end: lookup → empty-PEM check →
  pem.Decode → block.Type == "CERTIFICATE" check → ensureClient →
  golang.org/x/crypto/acme.Client.RevokeCert(ctx, accountKey, der,
  reasonCode). RFC 8555 §7.6 case 1 (revocation request signed with
  account key) — the same account key issued the cert, so authority
  is intrinsic. The not-found path returns an actionable operator-
  facing error pointing at the local-store requirement.

- Adds mapRevocationReason translating RFC 5280 §5.3.1 reason strings
  (unspecified, keyCompromise, cACompromise, affiliationChanged,
  superseded, cessationOfOperation, certificateHold, removeFromCRL,
  privilegeWithdrawn, aACompromise) into golang.org/x/crypto/acme.
  CRLReasonCode. Accepts canonical camelCase + underscore_lower +
  ALL_CAPS_UNDERSCORE. Nil reason → 0 (unspecified). Unknown reason
  errors rather than silently demoting (operators rely on the reason
  for compliance reporting).

- Wiring update in service/issuer_registry.go: SetACMECertLookup
  setter on the registry; Rebuild type-asserts *acme.Connector and
  calls SetCertificateLookup + SetIssuerID, mirroring the existing
  *local.Connector branch. cmd/server/main.go calls
  issuerRegistry.SetACMECertLookup(certificateRepo) immediately after
  SetIssuanceMetrics — the postgres repo satisfies the interface via
  GetVersionBySerial.

- Tests:
  * acme_revoke_test.go (new): TestRevokeCertificate_NoCertLookupWired,
    TestRevokeCertificate_NoIssuerIDWired,
    TestRevokeCertificate_LookupReturnsNotFound (operator-facing
    "may not have been issued through certctl" hint pinned),
    TestRevokeCertificate_LookupArbitraryError,
    TestRevokeCertificate_VersionPEMEmpty (corrupt-row guard),
    TestRevokeCertificate_PEMMalformed_NoBlock,
    TestRevokeCertificate_PEMMalformed_WrongType (PRIVATE KEY block
    rejected as not a CERTIFICATE).
  * TestMapRevocationReason_TableDriven: full RFC 5280 reason set
    plus camelCase / underscore / ALL-CAPS variants plus
    nil-reason and unknown-reason cases.
  * acme_failure_test.go: renamed TestRevokeCertificate_AlwaysError
    → TestRevokeCertificate_UnwiredCertLookupFallback; the test
    still exercises the same backward-compat branch but now
    asserts the new "CertificateLookup wiring" error wording.

- Mock-repo updates (3 sites): mockCertificateRepository in
  internal/integration/lifecycle_test.go, mockCertRepo in
  internal/service/testutil_test.go, mockCertRepoWithGetError in
  internal/service/shortlived_test.go each gain a GetVersionBySerial
  implementation that mirrors the GetByIssuerAndSerial logic but
  returns the version row.

- docs/connectors.md ACME section: new "Revocation by serial number"
  subsection covering the workflow, the local-store requirement
  (cert was issued through certctl, not imported), the reason-code
  mapping with the three accepted spelling variants, and a pointer
  to the audit reference.

Out of scope (intentional, per spec):

- Recovering the DER from outside the local cert store (CT logs,
  CSR + signature reconstruction). If the cert wasn't issued through
  certctl, revoke-by-serial via certctl isn't possible.
- Revocation via the cert's private key (RFC 8555 §7.6 case 2). The
  account-key path covers all certctl-issued certs because the same
  account key issued them.
- Pebble-backed integration test for the happy path. Pebble integration
  is the right home for that — the unit tests in this commit pin all
  failure-mode branches before the network call, and the wiring
  branch in Rebuild is exercised by the existing
  TestIssuerRegistryRebuild paths.

Verified locally:
- gofmt -l . clean
- go vet ./... clean
- staticcheck ./... clean
- go test -short -count=1 across connector, service, repository,
  integration, api/middleware, api/handler: green

Audit reference: cowork/issuer-coverage-audit-2026-05-01/RESULTS.md
Top-10 fix #7.
2026-05-02 13:09:30 +00:00
shankar0123 b0efdbe2f8 repo,service: introduce WithinTx and atomic audit rows for issue/renew/revoke
Closes the #3 acquisition-readiness blocker from the 2026-05-01 issuer
coverage audit (Part 1.5 finding #1: audit row not transactional with
issuance). AuditRepository.Create previously ran on the package-level
*sql.DB while the certificate insert / version insert / revocation
insert ran on independent connections — a failed audit INSERT after
a successful operation INSERT was silently lost. SOX §404 over IT
general controls, PCI-DSS §10 audit logging, HIPAA §164.312(b) audit
controls, and CA/B Forum Baseline Requirements §5.4.1 audit log
records all presume audit-with-operation atomicity.

Design — Option A (Querier abstraction). The chosen pattern: a shared
repository.Querier interface (subset of *sql.DB and *sql.Tx) plus a
postgres.WithinTx helper that begins a tx, runs fn, commits on nil
error, rolls back on error or panic, and returns the wrapped result.
Repository methods that participate in a service-layer transaction
expose a *WithTx variant taking repository.Querier; the bare methods
remain for stand-alone use. A repository.Transactor abstracts the
"begin tx, run fn, commit/rollback" lifecycle so service-layer code
runs multi-write operations atomically without holding *sql.DB
directly. Option B (UnitOfWork) was considered but adds boilerplate
without behavioral benefit for the current scope. Option C
(context-carried tx) was explicitly rejected — it hides the
transactional boundary from the type system, reproducing the class
of bug we're fixing.

This commit:
- Adds internal/repository/querier.go with the Querier interface
  (compile-time guards that *sql.DB and *sql.Tx satisfy it) and the
  Transactor interface for service-layer use.
- Adds internal/repository/postgres/tx.go with the WithinTx helper
  (begin/fn/commit/rollback with panic recovery) and a transactor
  type that satisfies repository.Transactor.
- Adds CreateWithTx variants on AuditRepository, CertificateRepository
  (Create + Update + CreateVersion), and RevocationRepository.
  Existing bare methods now delegate to the *WithTx variant using
  the package-level *sql.DB so existing call sites are
  behavior-preserving.
- Updates repository/interfaces.go: AuditRepository, CertificateRepository,
  and RevocationRepository declare the new *WithTx methods. Adds an
  atomicity contract doc-comment on AuditRepository pointing at
  WithinTx + the audit blocker.
- Adds AuditService.RecordEventWithTx, mirroring RecordEvent but
  routing through CreateWithTx so the audit row is part of the
  caller's transaction. Same redaction + marshalling contract.
- Refactors three audit-emitting service paths to use Transactor.WithinTx
  when SetTransactor was wired, with a legacy fallback for backward
  compat:
    * CertificateService.Create — cert insert + audit row in one tx.
    * RevocationSvc.RevokeCertificateWithActor — cert status update +
      revocation row + audit row in one tx. The OCSP cache invalidate
      remains best-effort (out of scope per the prompt).
    * RenewalService CompleteServerRenewal — cert version insert +
      cert update + audit row in one tx. Job status update stays
      outside the audit-atomicity scope (job state lives outside
      the operator-facing audit trail).
- Adds SetTransactor on CertificateService, RevocationSvc, and
  RenewalService. cmd/server/main.go wires a single Transactor
  instance shared across all three so all audit-emitting paths run
  their writes in transactions backed by the same *sql.DB handle.
- Updates 5 mock implementations to satisfy the new interface methods:
  mockCertRepo (testutil_test.go), mockCertRepoWithGetError
  (shortlived_test.go), fakeRevocationRepo (crl_cache_test.go),
  intuneE2EAuditRepo (scep_intune_e2e_test.go), and the integration-
  test mocks (lifecycle_test.go: mockCertificateRepository,
  mockAuditRepository, mockRevocationRepository). All *WithTx mocks
  ignore the Querier and delegate to the bare method (mocks have no
  DB; in-memory state is shared regardless of "tx").
- Adds a service-layer test mockTransactor with BeginTxErr and
  CommitErr knobs so the atomic-audit tests can assert error
  propagation through the transactional boundary.
- Adds internal/repository/postgres/tx_test.go: unit-level test that
  WithinTx surfaces "begin tx" wrap when BeginTx fails, and that
  Transactor.WithinTx delegates correctly. Real-Postgres rollback
  semantics are covered by the testcontainers tests in the postgres
  package — sandbox disk pressure prevented adding a sqlmock dep
  for the in-fn / commit-failure unit test, so those scenarios are
  exercised through atomic_audit_test.go using the mockTransactor's
  CommitErr / BeginTxErr fields.
- Adds internal/service/atomic_audit_test.go:
    * TestCertificateService_Create_AtomicWithTx — asserts audit
      insert failure inside the tx surfaces as the operation's error
      (closes the blocker contract).
    * TestCertificateService_Create_LegacyPathLogs — pins the
      backward-compat behavior when SetTransactor isn't wired:
      audit failure is logged-not-failed, matching pre-fix.
    * TestCertificateService_Create_TransactorBeginFailure — BeginTx
      error path: operation fails, no cert insert, no audit insert.
    * TestCertificateService_Create_TransactorCommitFailure —
      Commit error after successful in-fn writes surfaces as the
      operation's error. Real Postgres can fail Commit on
      serialization conflicts; the service must report this.

Out of scope (separate follow-up commits, same shape):
- Issuer CRUD audit atomicity.
- Target CRUD audit atomicity.
- Agent retire (already transactional via RetireAgentWithCascade;
  verified, not changed).
- Renewal-policy CRUD audit atomicity.
- Owner/team/agent-group CRUD audit atomicity.
- Discovery / health-check audit atomicity.

Verified locally:
- gofmt -l . clean
- go vet ./... clean
- staticcheck ./... clean
- golangci-lint run --timeout 5m ./... → 0 issues
- go test -short -count=1 ./internal/service/ green
- go test -short -count=1 ./internal/api/handler/ green
- go test -short -count=1 ./internal/integration/ green
- go test -short -count=1 ./internal/repository/postgres/ green
- go build ./... success

Audit reference: cowork/issuer-coverage-audit-2026-05-01/RESULTS.md
Top-10 fix #3 (Part 3, narrative section).
2026-05-02 00:29:09 +00:00
shankar0123 5a682db8e2 EST RFC 7030 hardening master bundle Phases 10-11: libest sidecar e2e
+ Cisco IOS quirk fixtures + ManagedCertificate.Source provenance +
EST bulk-revoke endpoint + 13 typed audit action codes.

Phase 10.1 — libest reference-client sidecar:
- deploy/test/libest/Dockerfile: multi-stage Debian-bookworm-slim
  build of Cisco's libest v3.2.0-2 from source (autoconf/automake/
  libtool + libcurl4-openssl-dev + libssl-dev). Runtime stage
  carries only estclient + bash + openssl + ca-certificates so the
  exec surface stays small + predictable.
- docker-compose.test.yml libest-client entry (profiles: [est-e2e])
  with bind mounts for /config/est (test workspace) + /config/certs
  (certctl CA bundle for TLS pinning); IP 10.30.50.9 (10.30.50.8
  was already taken by certctl-agent).
- deploy/test/est/.gitkeep keeps the bind-mount target tracked.

Phase 10.2 — 5 integration tests (//go:build integration) in
deploy/test/est_e2e_test.go:
- TestEST_LibESTClient_Enrollment_Integration (cacerts → simpleenroll
  → cert-shape assertion)
- TestEST_LibESTClient_MTLSEnrollment_Integration (mTLS sibling-route
  cert auth; skip when bootstrap cert absent)
- TestEST_LibESTClient_ServerKeygen_Integration (RFC 7030 §4.4
  multipart; skip when profile gate disabled)
- TestEST_LibESTClient_RateLimited_Integration (4th enroll trips
  per-principal cap, asserts 429-shaped error)
- TestEST_LibESTClient_ChannelBinding_Integration (libest
  --tls-exporter; skip when libest build lacks the flag).
- requireESTSidecar guard skips the suite when the operator forgot
  --profile est-e2e; helpful error message includes the exact
  command to bring the sidecar up.

Phase 10.3 — Cisco IOS quirk fixtures + 3 unit tests in
internal/api/handler/cisco_ios_quirks_test.go:
- testdata/cisco_ios_15x_pem_csr.txt: PEM body sent with
  Content-Type application/x-pem-file. Handler dispatches on
  body-prefix not Content-Type — accepts cleanly.
- testdata/cisco_ios_16x_trailing_newline_csr.txt: extra trailing
  newlines after base64 body. strings.TrimSpace tolerates.
- testdata/cisco_ios_crlf_b64_csr.txt: CRLF-wrapped base64.
  base64.StdEncoding handles CRLF + LF identically.

Phase 11.1 — ManagedCertificate.Source provenance:
- New domain.CertificateSource enum (Unspecified/EST/SCEP/API/Agent).
- Migration 000023_managed_certificates_source.up.sql adds source
  TEXT NOT NULL DEFAULT '' so existing rows scan as
  CertificateSourceUnspecified — back-compat: bulk-revoke filter
  treats empty as "any source".
- Postgres repo Insert/Update/scan paths all wire the new column.

Phase 11.2 — EST bulk-revoke endpoint:
- BulkRevocationCriteria.Source field (Source-only requests rejected
  as too broad — must accompany at least one narrower criterion).
- service.bulk_revocation.resolveCertificates post-filter by Source
  (empty=any, no SQL change so existing CertificateFilter callers
  unaffected).
- New BulkRevocationHandler.BulkRevokeEST method pins Source=EST +
  dispatches; new route POST /api/v1/est/certificates/bulk-revoke
  (M-008 admin-gated). openapi.yaml documented + parity-guard green.

Phase 11.3 — 13 typed audit action codes in
internal/service/est_audit_actions.go:
- est_simple_enroll_success / _failed
- est_simple_reenroll_success / _failed
- est_server_keygen_success / _failed
- est_auth_failed_basic / _mtls / _channel_binding
- est_rate_limited
- est_csr_policy_violation
- est_bulk_revoke
- est_trust_anchor_reloaded
- ESTService.processEnrollment + SimpleServerKeygen + ReloadTrust
  split-emit BOTH the legacy bare action codes (back-compat for the
  GUI activity-tab chip filters that match by exact string +
  existing audit-log analysers) AND the new typed _success / _failed
  variants (operator grep target + per-failure-mode counter).

Tests:
- internal/api/handler/bulk_revocation_est_test.go — 5 cases
  (admin-true happy path pins Source=EST + non-admin 403 +
  empty-criteria 400 + invalid-reason 400 + method-not-allowed).
- internal/service/est_audit_actions_test.go — 5 cases (SimpleEnroll
  legacy+typed emission / SimpleReEnroll typed / IssuerError
  typed-failed / PolicyViolation triple-emit /
  unique-string invariant).

Pre-commit verification (sandbox): gofmt clean, go vet clean
(excluding repository/postgres testcontainers limit), staticcheck
clean across api/handler/api/router/domain/service/deploy/test,
go test -short -count=1 green for every non-postgres Go package +
integration build (`go build -tags integration ./deploy/test/...`)
clean. G-3 docs-drift guard reproduced locally clean (Phases 10-11
added zero new env vars).

Spec preserved at cowork/est-rfc7030-hardening-prompt.md. Phases
12-13 (docs/est.md + WiFi/802.1X / IoT bootstrap / FreeRADIUS
recipes; release prep + tag) remain — post-2.1.0 work.
2026-04-30 00:52:43 +00:00
shankar0123 1b4de3fb2d Bundle E: Mechanical sweeps & defensive polish — 6 findings closed; L-004 deferred
Closes L-009 + L-010 + L-011 + L-013 + L-020 + L-021 from
comprehensive-audit-2026-04-25. L-004 deferred — recon found NO
rotation infrastructure exists at all; building it from scratch is
a feature project, not a Bundle-E mechanical sweep.

L-009 — ZeroSSL EAB URL configurable
  Audit's 'no timeout' claim was wrong: ari.go:329 has 15s timeout.
  internal/connector/issuer/acme/acme.go: zeroSSLEABEndpoint now
  lazily reads CERTCTL_ZEROSSL_EAB_URL from env at package init;
  defaults to ZeroSSL public endpoint. Pre-existing test override
  path preserved.

L-010 — Verified-already-clean
  grep -rn 'mock\.Anything' --include='*_test.go' . returned 0.
  certctl uses hand-rolled struct mocks (mockJobRepo, mockAuditRepo,
  etc.) with explicit method bodies; no testify-style mocks anywhere.

L-011 — IPv6 bracket-aware dialing pinned
  Every production net.Dial / DialTimeout site audited:
    cmd/agent/main.go:293 — intentional IPv4 literal '8.8.8.8:80'
    verify.go / tlsprobe / network_scan — net.Dialer (no string addr)
    email.go — net.JoinHostPort (bracket-aware)
    ssh.go — addr derives from JoinHostPort upstream
    ssrf.go — net.Dialer
  internal/connector/notifier/email/email_ipv6_test.go (NEW):
    TestJoinHostPort_IPv6BracketsRoundTrip pins IPv4/IPv6/zone variants;
    TestSMTPDialerUsesJoinHostPort source-greps email.go and fails CI
    if a future refactor swaps in 'host:port' concatenation.

L-013 — Verified-already-clean (monotonic-safe)
  Only one site uses now.Sub: middleware.go:393 in tokenBucket.allow().
  Both 'now' and tb.lastRefill come from time.Now() which carries
  monotonic-clock readings per Go's time package contract;
  intra-process now.Sub is monotonic-safe by construction. Doc
  comment block added above the call to make the invariant explicit.

L-020 (CWE-563) — ineffassign sweep, 8 unique sites
  certificate.go:135 — sortDir initial value dropped (set
    unconditionally below by SortDesc branch).
  certificate.go:169,175 — argCount post-increments dropped (var
    not read past the LIMIT/OFFSET formatting).
  agent_group.go, profile.go — page/perPage truly vestigial,
    replaced with _ = page; _ = perPage.
  issuer.go:633, owner.go:131, target.go:267, team.go:131 — same
    treatment for the audit-flagged second-function ListXxx clamps.
  First-function List() in issuer/owner/target/team KEEPS its
    clamp because page/perPage is used for in-memory slice
    pagination — ineffassign correctly didn't flag those.
  Build + tests green post-sweep.

L-021 — Transitive CVE bump
  go get golang.org/x/crypto@v0.45.0 golang.org/x/net@v0.47.0
    (crypto required net@0.47.0). go-text@v0.31.0 transitively
    bumped.
  Per tool-output govulncheck-verbose: x/net@v0.45.0 fixes
    GO-2026-4441 + GO-2026-4440; x/crypto@v0.45.0 fixes
    GO-2025-4134 + GO-2025-4135 + GO-2025-4116 — all 5 advisories
    cleared. Bundle B's ISV grep guard + Bundle D's release-time
    govulncheck step are the going-forward monitor + bump pass.

L-004 — Deferred to dedicated bundle
  Recon: zero hits for RotateAPIKey / rotated_at / key_status
    anywhere in source. API keys configured via
    CERTCTL_API_KEYS_NAMED env var; rotation is operator-managed
    (edit env + restart). Building rotation infrastructure from
    scratch is a feature project, not a mechanical sweep.
  Documented in audit-report.md with scope-pivot note.

Audit deliverables:
  audit-report.md: score 46/55 -> 52/55 closed
    (Low 14/19 -> 19/19 — 100% Low closed except L-004 deferred)
  findings.yaml: 6 status flips
  certctl/CHANGELOG.md: Bundle E section

Verification:
  go test -count=1 -short ./internal/service ./internal/connector/issuer/acme
    ./internal/connector/notifier/email                      green
  go vet on changed packages                                  clean
2026-04-27 01:17:15 +00:00
shankar0123 0e29c416b1 refactor(handler,repo): replace strings.Contains error dispatch with typed sentinels (S-2)
Closes one 2026-04-24 audit finding (P2):

  - cat-s6-efc7f6f6bd50: 30 strings.Contains(err.Error(), ...) sites
    in internal/api/handler/ — brittle to repository-layer message
    changes, untyped against the actual failure mode.

Approach (Option B from prompt design notes):
  - New typed sentinels in internal/repository/errors.go:
      ErrNotFound, ErrForeignKeyConstraint
      IsForeignKeyError(err) helper (the only place substring
      matching at the lib/pq boundary is allowed; isolates the
      DB-driver string knowledge to one function).
  - New typed sentinel in internal/domain/errors.go:
      ErrValidation (reserved for future per-entity validation
      wrappers; not yet used by all handlers).
  - 49 sites in internal/repository/postgres/*.go updated to wrap
    sql.ErrNoRows-derived errors via fmt.Errorf("...: %w",
    repository.ErrNotFound).
  - 18 not-found handler sites + 2 FK-constraint handler sites
    refactored to errors.Is(err, repository.ErrNotFound) /
    repository.IsForeignKeyError(err).
  - 23 inline `fmt.Errorf("X not found")` test fixtures across
    handler tests rewrapped to wrap repository.ErrNotFound.
  - test_utils.go::ErrMockNotFound rewrapped to wrap
    repository.ErrNotFound; renewal_policy.go closure docblock
    updated to reflect the new convention.
  - integration test mockJobRepository.Get wraps repository.ErrNotFound.

CI regression guardrail:
- .github/workflows/ci.yml::"Forbidden strings.Contains(err.Error())
  regression guard (S-2)" greps for the three patterns ("not found",
  "violates foreign key", "RESTRICT") under internal/api/handler/
  and fails the build on regression.

Verification:
- go build ./... — clean
- go vet ./... — clean
- go test ./... -short -count=1 — all packages pass (handler +
  repository + service + integration)
- golangci-lint v2.11.4 run ./... — 0 issues
- S-2 guardrail dry-run on post-fix tree → empty (good)
- All sibling guardrails (S-1, G-3, D-1+D-2, B-1, L-1, H-1, C-1, F-1, P-1) pass

Audit findings closed:
- cat-s6-efc7f6f6bd50 (P2)

Deferred follow-ups:
- 6 domain-specific substring patterns still inline in handlers
  ("cannot approve", "cannot reject", "cannot be parsed",
  "no certificates found", "challenge password", "invalid"/
  "required" validation chains in profiles + agent_groups). Each
  needs its own typed sentinel, scoped per service. Documented
  by the S-2 CI guardrail's allowlist for closure-comments only.
- Per-entity not-found sentinels (Option A — ErrCertificateNotFound,
  ErrAgentNotFound, etc.) deferred. Generic ErrNotFound covers the
  current dispatch needs; per-entity precision would let handlers
  return entity-aware error bodies without a domain.Type field,
  but not blocking.
2026-04-25 17:54:14 +00:00
shankar0123 fe7e766510 Close M-004 (OCSP issuer binding) and M-005 (discovery actor propagation) coverage-gap findings
M-004 — OCSP issuer binding (composite key):
  The OCSP lookup path now binds (issuer_id, serial) as a composite key
  rather than resolving by serial alone. CertificateRepository and
  RevocationRepository gain GetByIssuerAndSerial methods; ca_operations.go
  scopes both lookups by the issuer_id path param. When no managed cert
  binds to that (issuer, serial) tuple, GetOCSPResponse constructs an
  RFC 6960 §2.2 'unknown' response (CertStatus=2) instead of the prior
  default 'good'. Short-lived cert exemption (profile TTL < 1h) is
  preserved. Real repo errors (non-sql.ErrNoRows) fail closed with a log.

  Regression coverage: internal/service/ca_operations_test.go
    - TestCAOperationsSvc_GetOCSPResponse_Unknown_CrossIssuer
    - TestCAOperationsSvc_GetOCSPResponse_Unknown_UnknownSerial

M-005 — Discovery Claim/Dismiss actor propagation:
  DiscoveryService.ClaimDiscovered and DismissDiscovered now accept an
  explicit 'actor string' parameter (propagation pattern mirrors
  bulk_revocation.go / revocation_svc.go). The handler layer passes
  resolveActor(r.Context()) — the named-key identity established by the
  M-002 auth unification — and the service falls back to 'api' (the same
  safe sentinel resolveActor uses when no auth context is present) only
  when the caller passes an empty string. Never falls back to 'operator'.

  Regression coverage: internal/service/discovery_test.go
    - TestDiscoveryService_ClaimDiscovered_AuditActor
    - TestDiscoveryService_DismissDiscovered_AuditActor
    - TestDiscoveryService_ClaimDiscovered_EmptyActorFallsBackToAPI
    - TestDiscoveryService_DismissDiscovered_EmptyActorFallsBackToAPI

Each new test asserts event.Actor matches the caller-supplied string (or
'api' on empty input) and explicitly asserts event.Actor != 'operator'
to lock in the historical fix intent.

Files:
  internal/api/handler/discovery.go          — pass resolveActor(ctx)
  internal/api/handler/discovery_handler_test.go — updated call sites
  internal/integration/lifecycle_test.go     — updated mock wiring
  internal/repository/interfaces.go          — GetByIssuerAndSerial on
                                               CertificateRepository +
                                               RevocationRepository
  internal/repository/postgres/certificate.go — composite key lookup
  internal/service/ca_operations.go          — (issuer_id, serial) scoping
  internal/service/ca_operations_test.go     — 2 new M-004 tests
  internal/service/discovery.go              — actor parameter + 'api' fallback
  internal/service/discovery_test.go         — 4 new M-005 tests
  internal/service/shortlived_test.go        — mock signature update
  internal/service/testutil_test.go          — mock GetByIssuerAndSerial
2026-04-18 22:20:25 +00:00
shankar0123 80450c7180 fix(repository): populate TargetIDs in certificate scan helper (M-7)
scanCertificate never queried the certificate_target_mappings junction
table, so Certificate.TargetIDs was always nil on reads. This silently
broke deployment lookups, bulk revocation filters, cert detail pages,
and any code path that iterated TargetIDs to dispatch target work.

Fix:
- Convert scanCertificate to a receiver method (r *CertificateRepository)
  so it has access to the DB for the secondary junction query.
- Get(): scan the row, then call r.getTargetIDs(ctx, certID) to populate
  TargetIDs with a single targeted query.
- List() and GetExpiringCertificates(): inline the scan loop so we can
  collect all certIDs first, then call getTargetIDsForCertificates once
  with pq.Array(certIDs) to avoid N+1 round-trips. Build a map and
  attach TargetIDs to each certificate in the result set.
- Default TargetIDs to []string{} (not nil) when a cert has no mappings
  so JSON marshals as [] rather than null.

Tests:
- New integration test file certificate_targetids_test.go with 5
  subtests exercising Get / List / GetExpiringCertificates single
  and multi-target cases plus the empty-slice vs nil contract.
- Uses the shared testcontainers-go setupTestDB infrastructure and
  skips under 'go test -short' so CI (which excludes ./internal/repository/...
  from coverage paths anyway) stays green.

Addresses M-7 from certctl-audit-report.md.
2026-04-17 15:41:08 +00:00
shankar0123 f2e60b93a3 feat(M11c): crypto policy enforcement — CSR validation, MaxTTL caps, key metadata
Enforce certificate profile crypto constraints across all 5 issuance paths
(renewal, agent CSR, EST, SCEP). ValidateCSRAgainstProfile() rejects CSRs
with key algorithm/size that don't match profile rules. MaxTTL enforcement
caps certificate validity per issuer connector (Local CA, Vault, step-ca
enforce directly; ACME/DigiCert/Sectigo pass through). Key algorithm and
size are now persisted in certificate_versions for audit compliance.

16 new tests (12 service-layer + 4 Local CA connector). Removes hardcoded
version number from GUI sidebar. Documentation updated across architecture,
features, connectors, and README.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-15 21:05:14 -04:00
shankar0123 a8fc177118 fix: resolve NULL csr_pem scan errors and QA smoke test failures
Root cause: certificate_versions.csr_pem is nullable in the schema but
Go code scanned it into a plain string. Used sql.NullString in
ListVersions and GetLatestVersion to handle NULL values correctly.

Also includes: partial update fetch-merge-update pattern to prevent FK
violations, nil directory guard in discovery service, diagnostic slog
logging in handlers, export handler 422 for unparseable PEM, OpenAPI
spec corrections, MCP tool description improvements, and test fixes.

Rewrites the Release Sign-Off section in testing-guide.md to individual
test-level granularity (320 rows) with smoke test results audited and
checked off (121 pass, 5 skip, 194 manual remaining).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-30 00:51:18 -04:00
shankar0123 09ff51c5ae fix(ci): resolve 185 golangci-lint v2 issues — fix unused, tune config
Fix 6 unused function/variable errors (var _ assignment pattern, remove
IIS PowerShell stub). Reduce enabled linter set to govet + staticcheck +
unused with targeted staticcheck check exclusions for pre-existing style
issues (ST1005, QF1001, S1009, etc.). Noisy linters (errcheck, gocritic,
gosec, ineffassign, noctx, bodyclose) temporarily disabled — will be
re-enabled incrementally as pre-existing issues are fixed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-27 23:18:04 -04:00
shankar0123 e078a686bf feat: M20 Enhanced Query API — sort, time-range filters, cursor pagination, sparse fields, deployments endpoint
V2 (free) query enhancements for certificates:
- `sort` param with direction (`?sort=-notAfter` for descending)
- Time-range filters: `expires_before`, `expires_after`, `created_after`, `updated_after`
- Cursor-based pagination (`?cursor=token&page_size=100`) alongside page-based
- Sparse field selection (`?fields=id,commonName,status`)
- Additional filters: `agent_id`, `profile_id`
- New endpoint: `GET /api/v1/certificates/{id}/deployments`

25 new tests (12 handler + 13 e2e) covering all M20 features.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-23 18:56:02 -04:00
shankar0123 5d98e373e3 feat: M15a — certificate revocation API, CRL endpoint, and revocation notifications
Implements core revocation infrastructure: POST /api/v1/certificates/{id}/revoke
with all 8 RFC 5280 reason codes, JSON-formatted CRL at GET /api/v1/crl, webhook
and email revocation notifications, best-effort issuer notification, and immutable
revocation audit trail. Includes 48 new tests across service, handler, integration,
and domain layers (600+ total). Fixes 3 pre-existing test bugs (team_test error
matching, agent_group delete status code, team handler per_page validation).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-22 10:59:18 -04:00
shankar0123 a579a84c7f feat: M11a — certificate profiles, crypto policy enforcement, short-lived cert expiry
Add certificate profiles as named enrollment templates that control allowed
key algorithms, max TTL, permitted EKUs, required SAN patterns, and optional
SPIFFE URI SANs. CSR submissions are validated against profile rules at
signing time (key type + minimum size). Short-lived certs (TTL < 1 hour)
auto-expire via a new scheduler loop — expiry acts as revocation, no
CRL/OCSP needed.

New files:
- Migration 000003: certificate_profiles table, FK columns on
  managed_certificates/renewal_policies, key metadata on certificate_versions
- domain/profile.go: CertificateProfile + KeyAlgorithmRule structs
- repository/postgres/profile.go: full CRUD with JSONB marshaling
- service/profile.go: ProfileService with validation + audit logging
- service/crypto_validation.go: CSR-against-profile validation (RSA/ECDSA/Ed25519)
- handler/profiles.go: 5 HTTP endpoints under /api/v1/profiles
- web/src/pages/ProfilesPage.tsx: profiles management page

Modified:
- renewal.go: CSR validation in CompleteAgentCSRRenewal, ExpireShortLivedCertificates
- scheduler.go: 30s short-lived expiry check loop
- certificate.go (repo): nullable profile FK, key metadata on versions
- main.go: profile repo/service/handler wiring, 8-param NewRenewalService
- router.go: 12-param RegisterHandlers with profile routes
- seed_demo.sql: 4 demo profiles (standard, mtls, short-lived, high-security)
- Frontend: types, API client, routing, sidebar nav

Tests: 40 new tests across handler (15), service (13), crypto validation (12)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 20:39:49 -04:00
shankar0123 9b4122b159 Fix runtime bugs, implement service layer, and overhaul documentation
Runtime fixes:
- Fix env var mismatch (CERTCTL_DB_URL → CERTCTL_DATABASE_URL)
- Fix table name mismatches (certificates → managed_certificates, notifications → notification_events)
- Add renewal_policy_id to certificate queries
- Remove non-existent created_at from notification queries
- Add env var fallback for agent CLI flags
- Graceful degradation for missing notifiers/issuers in demo mode
- Copy web/ directory in Dockerfile for dashboard serving

Service layer:
- Implement handler-service interface pattern across all services
- Wire up certificate, agent, job, policy, team, owner, audit, notification services

Documentation:
- Add concepts.md: beginner-friendly guide to TLS, CAs, private keys
- Rewrite quickstart.md with accurate API examples matching actual handlers
- Add demo-advanced.md: interactive demo with cert issuance and automated script
- Update architecture.md with correct table names and connector interfaces
- Update connectors.md to match actual Go interface signatures
- Update demo-guide.md with cross-references to new docs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-14 21:38:11 -04:00
shankar0123 3a9fe8ba37 Complete V1 scaffold 2026-03-14 20:01:53 -04:00