# Changelog

All notable changes to certctl are documented in this file. Dates use ISO 8601. Versions follow [Semantic Versioning](https://semver.org/).

## [unreleased] — 2026-04-26

### Bundle 9 (Local-Issuer Hardening): 5 audit findings closed + 1 partial

> Closes the audit's local-CA + agent-keystore findings end-to-end: `H-010` (local-issuer coverage 68.3% → 86.7%, CI gate flipped 60% → 85% hard), `L-002` (private-key zeroization helper + agent + local wiring), `L-003` (0700 key-dir hardening), `L-012` (Unicode safety in CN/SAN — IDN homograph + RTL + zero-width + control chars), `L-014` (CA-key-in-process threat-model documentation), and partially closes `M-028` — the `internal/connector/issuer/local/local.go:682` `elliptic.Marshal` → `crypto/ecdh.PublicKey.Bytes()` site only (5 of 6 SA1019 sites remain). Round-trip pin in `TestHashPublicKey_ECDSA_RoundTripPin` proves byte-identical SubjectKeyId output across P-256/P-384/P-521 so the migration cannot silently change the SKI of every previously-issued cert.

#### Added

- **`internal/validation/unicode.go::ValidateUnicodeSafe` (NEW, Audit L-012 / CWE-1007 + CWE-176)** — single chokepoint that rejects RTL/LTR override chars (`U+202A..U+202E`, `U+2066..U+2069`), zero-width chars (`U+200B..U+200D`, `U+2060`, `U+FEFF`), control chars (`<0x20`, `0x7F..0x9F`), and per-DNS-label Latin+non-Latin-letter mixes (the classic Cyrillic-а-in-apple homograph). Pure-IDN labels are allowed. Errors cite the rune codepoint + byte offset so operators can locate the violation in their CSR.
- **`internal/connector/issuer/local/keymem.go::marshalPrivateKeyAndZeroize` (NEW, Audit L-002 / CWE-226)** — wraps `x509.MarshalECPrivateKey` with `defer clear(der)`; bounds the heap-resident private-scalar exposure window to the duration of the caller-supplied `onDER` callback. Used by both the local-CA path and (mirrored as `marshalAgentKeyAndZeroize` in `cmd/agent/keymem.go`) the agent's per-cert key-write site.
- **`internal/connector/issuer/local/keystore.go::ensureKeyDirSecure` (NEW, Audit L-003 / CWE-732)** — creates the key directory at mode `0700` if absent, accepts existing owner-only modes, chmod-tightens any 077-permissive leaf with re-stat verification, and fail-loud-refuses empty/root/dot paths. Mirrored as `ensureAgentKeyDirSecure` in `cmd/agent/keymem.go` and wired ahead of every `os.WriteFile(keyPath, ..., 0600)` site in the agent.
- **`internal/connector/issuer/local/local.go::ecdsaToECDH` (NEW, Audit M-028 / CWE-477 partial)** — replaces the deprecated `elliptic.Marshal(k.Curve, k.X, k.Y)` call inside `hashPublicKey` with `crypto/ecdh.PublicKey.Bytes()`. Dispatches on `Curve.Params().Name` to avoid importing `crypto/elliptic` for sentinel comparisons. Supports P-256/P-384/P-521; P-224 returns an unsupported-curve error and the caller falls back to a stable X+Y `big.Int.Bytes()` hash so SKI generation never panics.
- **L-014 file-header doc comment in `internal/connector/issuer/local/local.go`** — explicit threat-model carve-out documenting what the bundled defense-in-depth measures (disk-at-rest 0600, key-dir 0700, key-bytes-zeroed-after-marshal, M-028 round-trip pin) DO and DO NOT protect against. Operators with stricter requirements (debugger/core-dump/CAP_SYS_PTRACE attacker; unencrypted swap; cold-boot RAM) are directed to the V3 Pro KMS-backed-issuance roadmap entry — heap hygiene is defense-in-depth, not the source of truth.
- **CI hard gate on local-issuer coverage at 85% (`.github/workflows/ci.yml`)** — flipped the Bundle-7 transitional `LOCAL_ISSUER_COV < 60` floor to `< 85` with explicit "add tests, do not lower the gate" comment. The Bundle-9 closure invariant is that every percentage point under 85 is a regression, not a calibration drift.
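
The rejection rules listed for `ValidateUnicodeSafe` can be illustrated with a minimal sketch. This is not the certctl implementation: `checkUnicodeSafe` and its helpers are hypothetical names, the error wording is invented, and the mixed-script check is a simplification that only distinguishes Latin letters from all other letters within one DNS label.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// isBidiControl reports whether r falls in the bidirectional embedding,
// override, or isolate ranges U+202A..U+202E and U+2066..U+2069.
func isBidiControl(r rune) bool {
	return (r >= 0x202A && r <= 0x202E) || (r >= 0x2066 && r <= 0x2069)
}

// isZeroWidth reports whether r is U+200B..U+200D, U+2060, or U+FEFF.
func isZeroWidth(r rune) bool {
	return (r >= 0x200B && r <= 0x200D) || r == 0x2060 || r == 0xFEFF
}

// isControl reports whether r is below 0x20 or in 0x7F..0x9F.
func isControl(r rune) bool {
	return r < 0x20 || (r >= 0x7F && r <= 0x9F)
}

// mixesLatinAndOther reports whether a single DNS label contains both
// Latin letters and letters from any other script.
func mixesLatinAndOther(label string) bool {
	var latin, other bool
	for _, r := range label {
		if !unicode.IsLetter(r) {
			continue
		}
		if unicode.Is(unicode.Latin, r) {
			latin = true
		} else {
			other = true
		}
	}
	return latin && other
}

// checkUnicodeSafe is a hypothetical stand-in for the validator described
// above. Errors cite the code point and byte offset where available.
func checkUnicodeSafe(name string) error {
	for offset, r := range name {
		switch {
		case isBidiControl(r):
			return fmt.Errorf("bidi control %U at byte offset %d", r, offset)
		case isZeroWidth(r):
			return fmt.Errorf("zero-width char %U at byte offset %d", r, offset)
		case isControl(r):
			return fmt.Errorf("control char %U at byte offset %d", r, offset)
		}
	}
	// Script mixing is judged per label, so a pure-IDN label next to a
	// pure-Latin label is accepted.
	for _, label := range strings.Split(name, ".") {
		if mixesLatinAndOther(label) {
			return fmt.Errorf("label %q mixes Latin and non-Latin letters", label)
		}
	}
	return nil
}

func main() {
	names := []string{
		"example.com",
		"ex\u200bample.com",      // zero-width space inside a label
		"\u0430pple.com",         // Cyrillic a followed by Latin letters
		"example.\u0440\u0444",   // pure-Latin label plus pure-Cyrillic label
	}
	for _, n := range names {
		fmt.Printf("%q -> %v\n", n, checkUnicodeSafe(n))
	}
}
```

Checking script mixing per label rather than per name is what lets a name such as a Latin second-level label under a Cyrillic TLD pass while the single-label homograph is refused.
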

#### Tests

- **`internal/connector/issuer/local/bundle9_coverage_test.go` (NEW, ~30 subtests)** — lifts `internal/connector/issuer/local/` coverage from 68.3% (pre-bundle baseline) to 86.7% (package-scoped `go test -cover`). Targets every previously-uncovered hotspot. **`TestHashPublicKey_ECDSA_RoundTripPin` is the regression oracle** that pins the new `crypto/ecdh.PublicKey.Bytes()` output to the legacy `elliptic.Marshal` output across P-256/P-384/P-521 (with explicit `//nolint:staticcheck` on the SA1019 reference) — guarantees the M-028 migration cannot silently change the SubjectKeyId of every previously-issued cert.
- **`internal/validation/unicode_test.go` (NEW, 8 test functions)** — exercises every rejection arm of `ValidateUnicodeSafe`. U+FEFF (BOM) uses the `\uFEFF` escape sequence in source because Go's parser rejects literal BOM bytes inside string literals; all other invisible chars are written as literals (the file-header doc comment notes this).

#### Wired

- **`cmd/agent/main.go`** — agent's per-cert key-write path now calls `ensureAgentKeyDirSecure(filepath.Dir(keyPath))` before writing, marshals via `marshalAgentKeyAndZeroize` (which runs `defer clear(der)` immediately), and runs `defer clear(privKeyPEM)` on the encoded buffer for symmetry.
- **`internal/connector/issuer/local/local.go`** — both `IssueCertificate` and `RenewCertificate` CSR-acceptance paths invoke `validateCSRUnicode(csr, request.SANs)` after `csr.CheckSignature()` and before `c.generateCertificate()`. The validator covers CSR Subject CommonName + DNSNames + EmailAddresses + request-side additional SANs.

#### Audit Deliverables Updated

- `cowork/comprehensive-audit-2026-04-25/audit-report.md` — score 20/55 → 25/55 closed (Critical 0/0, High 6/9 → 7/9, Medium 7/27 unchanged, Low 4/19 → 8/19); H-010 + L-002 + L-003 + L-012 + L-014 boxes flipped `[x]` with closure notes; M-028 annotated as partial-closed (1 of 6 sites migrated).
- `cowork/comprehensive-audit-2026-04-25/findings.yaml` — corresponding status flips with closure notes citing the Bundle-9 mechanism.

### Bundle 8 (Frontend Hardening): 2 audit findings closed + 3 partial + 1 new ID opened

> Closes the audit's remaining frontend findings — `L-015` (target="_blank" rel-noopener) and `L-019` (dangerouslySetInnerHTML) verified-already-clean at HEAD with new chokepoints + CI grep guards preventing regression. Partial closures for `M-009` (mutation invalidation), `M-010` (filter/sort/pagination consistency), `M-026` (XSS deep-dive on 14 untested pages) — Bundle 8 ships the helpers + contract tests + soft CI budget guard; per-page migrations of the existing 56 useMutation sites + ~14 list pages + 14 T-1-deferred pages tracked as new finding `M-029`.

#### Added

- **`web/src/components/ExternalLink.tsx` (NEW, Audit L-015 / CWE-1022)** — single chokepoint anchor that hardcodes `target="_blank"` + `rel="noopener noreferrer"`. Future external-link additions should use this component; the CI grep guard fails the build if any new bare `target="_blank"` lands without the rel pair outside this file.
- **`web/src/utils/safeHtml.ts::sanitizeHtml` (NEW, Audit L-019 / CWE-79)** — placeholder chokepoint for any future code that needs `dangerouslySetInnerHTML`. Throws by default with a clear "add dompurify" activation-procedure message; the CI grep guard fails the build if any new `dangerouslySetInnerHTML` lands outside this file. At Bundle-8 time the codebase has 0 sites — the placeholder is preventive.
- **`web/src/hooks/useListParams.ts` (NEW, Audit M-010)** — URL-state hook for filter / sort / pagination on list pages. Canonicalises the existing `DashboardPage` `useSearchParams` pattern with the contract `?page=2&page_size=25&sort=-created_at&filter[status]=active`. 7-test Vitest suite covers default omission, garbage-value rejection, filter-resets-page invariant, resetParams.
- **`web/src/hooks/useTrackedMutation.ts` (NEW, Audit M-009)** — `useMutation` wrapper whose discriminated-union type REQUIRES the caller to declare `invalidates: QueryKey[]` OR `invalidates: 'noop'` + `noopReason: string`. Migrating the 56 existing useMutation sites to the wrapper tracked as `M-029`.
- **CI regression guards (`.github/workflows/ci.yml`)** — three new steps: "Bundle-8 / L-015 target=_blank rel=noopener" (greps web/src for any bare target=_blank); "Bundle-8 / L-019 dangerouslySetInnerHTML" (greps web/src outside safeHtml.ts); "Bundle-8 / M-009 mutation invalidation contract" (soft budget guard: useMutation sites must not exceed invalidation sites + 5).

#### Tests

- 4 new Vitest test files / 15 tests passing: `ExternalLink.test.tsx` (target/rel preservation), `safeHtml.test.ts` (placeholder throws + activation-hint message), `useListParams.test.tsx` (URL contract), `useTrackedMutation.test.tsx` (invalidate-then-onSuccess + noop variant).

#### Verified at HEAD (no code change required)

- **L-015** — all 3 `target="_blank"` sites in `web/src/pages/OnboardingWizard.tsx` already carry `rel="noopener noreferrer"`. CI guard now prevents regression.
- **L-019** — 0 `dangerouslySetInnerHTML` sites anywhere in `web/src/`. CI guard now prevents regression.

#### Partially addressed (helpers shipped, per-page migrations tracked as M-029)

- **M-009** — 56 useMutation sites across `web/src/`; soft CI budget guard at HEAD (61 mutations / 87 budget). Per-site migration to `useTrackedMutation` is incremental.
- **M-010** — `CertificatesPage.tsx` and other list pages still use local `useState` for pagination. Per-page migration to `useListParams` is incremental.
- **M-026** — 14 T-1-deferred pages still don't have explicit XSS-hardening test blocks. Adding them is incremental.

#### Why this matters

Pre-Bundle-8, the audit-report flagged 5 frontend findings — 2 of them (`L-015`, `L-019`) turned out to already be clean at HEAD but had no enforcement, so a careless future commit could regress. Bundle 8 verifies the clean state, ships the chokepoint helpers, and adds CI guards that fail on regression. The 3 partial findings (`M-009`, `M-010`, `M-026`) require touching every list page + every mutation site — a single PR scope of 5-7 days of mechanical migration work that's better done incrementally per page than as one large bundle. The new finding `M-029` tracks that backlog explicitly so future PRs can chip away at it without reopening this audit.

### Bundle 7 (Verification & Tool Suite Execution): wires mandatory scans + first-run evidence

> Closes the audit's biggest scope gap from `cowork/comprehensive-audit-2026-04-25/tool-output/_SCOPE.txt`: the §12 mandatory tool runs that were deferred in the original audit session due to disk pressure. **Closures:** `D-002` clean; `D-001`, `D-006`, `H-005` partial; `D-003..D-005`, `D-007` wired CI-only. **New tracker IDs opened:** `H-010` (local-issuer coverage gap), `M-028` (6 deprecated-API sites), `L-020` (ineffassign cleanup sweep), `L-021` (5 transitive Go-module CVEs).

#### Added

- **`scripts/install-security-tools.sh` (NEW)** — idempotent installer for the Go-based subset of the §12 tool suite: govulncheck, staticcheck, errcheck, ineffassign, gosec, osv-scanner. Used locally for a Bundle-7-style run and by both CI workflows.
- **`.github/workflows/security-deep-scan.yml` (NEW)** — daily + `workflow_dispatch` heavyweight scans for the container/network-bound subset. Steps: `gosec`, `osv-scanner`, `go test -race -count=10` against the full suite, `go test -cover` on the crypto cluster, `docker build` + `trivy image`, `syft` SBOM, ZAP baseline DAST, `schemathesis` OpenAPI fuzz, `nuclei` template scan, `testssl.sh` TLS audit. Every step `continue-on-error: true`; artefacts uploaded for triage.
- **`staticcheck` CI gate (Audit D-001)** — added to `.github/workflows/ci.yml` alongside the existing govulncheck step. SOFT gate (`continue-on-error: true`) until `M-028` closes the 6 remaining SA1019 deprecated-API call sites; flip to fail-on-non-zero then.
- **Per-package coverage gates for the crypto cluster (Audit H-005)** — `.github/workflows/ci.yml` extended: pkcs7 hard ≥85% (currently 100%), local-issuer soft ≥65% transitional floor (H-010 lifts to ≥85% once the missing CSR-validation + CA-cert-loading + key-rotation tests land).
- **`.govulnignore` (NEW)** — empty placeholder with the suppression contract documented (one OSV ID + justification + review-by date per line). At Bundle-7 time the 5 deferred-call advisories don't need entries because govulncheck's default exit code already passes — the file is ready when an advisory becomes call-affected.
- **`staticcheck.conf` (NEW)** — TOML config explicitly enumerating which checks are enabled. Suppresses 6 style-only rules (ST1005 capitalization, ST1000 package comments, ST1003 naming, S1009 redundant nil check, S1011 append-spread, SA9003 empty branches) with documented per-rule justifications. SA1019 (deprecated API) NOT suppressed.

#### Tool-run evidence

Local first-run receipts at `cowork/comprehensive-audit-2026-04-25/tool-output/2026-04-26/`:

| Tool | Result | Receipt |
|---|---|---|
| govulncheck | clean — 0 affected; 5 deferred-call advisories → L-021 | `govulncheck.txt`, `govulncheck-verbose.txt` |
| staticcheck | 6 SA1019 → M-028; 109 style suppressed via config | `staticcheck.txt`, `staticcheck-after-suppressions.txt` |
| errcheck | 1294 sites — all defer-Close / response-write convention | `errcheck.txt` |
| ineffassign | 15 unique sites — mechanical re-assignment patterns → L-020 | `ineffassign.txt` |
| helm lint | clean (1 INFO-level icon recommendation) | `helm-lint.txt` |
| `go test -race -count=3` | clean across scheduler / middleware / mcp | `go-test-race.txt` |
| `go test -cover` (crypto cluster) | crypto 86.7% ✓ / pkcs7 100% ✓ / local-issuer 68.3% ✗ → H-010 | `go-test-cover.txt` |

Container/network-bound tools (gosec, osv-scanner, semgrep, hadolint, trivy, syft, schemathesis, ZAP, nuclei, testssl.sh, kube-score, checkov) wired in the new deep-scan workflow but not run locally — sandbox lacks docker. Catalog of dispositions in `_BUNDLE-7-CLOSURE.md`.

#### NOT addressed in this bundle (deferred to a Bundle-7-bis)

- `M-007` bulk-operation partial-failure tests
- `M-008` admin-gated role-gate tests
- `L-010` `mock.Anything` overuse audit
- `L-018` defect age analysis on remaining High findings

#### Why this matters

Pre-Bundle-7, the audit-report's "no Critical findings" claim was a manual-review attestation backed by `_SCOPE.txt` warning that "the static-analysis findings in lens-6.* files were derived from manual code review + grep, not automated SAST output." Bundle 7 inverts that: the §12 tool suite is now wired into CI as either a hard or soft gate, with first-run evidence preserved, and every surfaced finding triaged into either a documented suppression OR a new tracker ID. The audit's largest scope gap is now a recurring CI workflow rather than a deferred backlog item.

### Bundle 6 (Audit Integrity + Privacy): 3 audit findings closed

> Closure bundle from the 2026-04-25 comprehensive audit
> (`cowork/comprehensive-audit-2026-04-25/`). Hardens the audit trail
> against tampering and minimizes PII exposure in one cohesive change —
> closes HIPAA §164.312(b), GDPR Art. 32, and the audit-leak finding
> H-008 with two complementary controls that apply automatically.
> Closes H-008 + M-017 + M-022.

#### Added

- **`migrations/000018_audit_events_worm.up.sql` (NEW, Audit M-017 / HIPAA §164.312(b))** — DB-level append-only enforcement on `audit_events`. Two layers: (1) `audit_events_block_modification()` PL/pgSQL function fired by a `BEFORE UPDATE OR DELETE` trigger raises `check_violation` with a diagnostic citing the rationale + a HINT pointing at the compliance-superuser pattern; (2) `REVOKE UPDATE, DELETE ON audit_events FROM certctl` for defence-in-depth, wrapped in a `pg_roles` existence check so test fixtures and single-superuser setups stay idempotent. Pre-Bundle-6 enforcement was app-layer only — a buggy migration script, a manual `psql` session, or an attacker with the app role's DB credentials could rewrite history. Compliance superusers (legal hold, GDPR right-to-be-forgotten, statutory purges) use a separate role provisioned out-of-band — pattern documented in `docs/compliance.md` (NOT auto-created; operators provision per their compliance policy).
- **`internal/service/audit_redact.go::RedactDetailsForAudit` (NEW, Audit H-008 + M-022 / CWE-532 / GDPR Art. 32)** — service-layer redactor chokepoint. Walks every `details` map BEFORE marshaling to JSONB. Two case-insensitive deny-lists: `credentialKeys` (~30 entries — `api_key`, `password`, `token`, `*_pem`, `eab_secret`, `acme_account_key`, `signature`, `bootstrap_token`, ...) replaced with `"[REDACTED:CREDENTIAL]"`; `piiKeys` (~20 entries — `email`, `phone`, `ssn`, `dob`, `name`, `address`, `postal_code`, `ip_address`, ...) replaced with `"[REDACTED:PII]"`.
  Recurses into nested maps + arrays; mutation-free (caller's map unchanged); surfaces a `redacted_keys` array listing scrubbed dotted-paths so operators can audit the redactor itself during a compliance review without exposing values (satisfies GDPR Art. 30 records-of-processing transparency).
- **`migrations/000018_audit_events_worm.down.sql` (NEW)** — clean teardown for dev resets; not for production use.

#### Changed

- **`internal/service/audit.go::RecordEvent`** — now routes every `details` map through `RedactDetailsForAudit` before marshaling. No call-site changes required at any of the ~25 existing `RecordEvent` invocations across the service layer.

#### Tests

- `internal/service/audit_redact_test.go` (NEW, ~250 LOC) — every credential key, every PII key, nested maps, nested arrays, case-insensitivity, mutation-free invariant, JSON round-trip safety, no-redaction path (clean output for the common case), scalar pass-through (no panic on int/bool/nil).
- `internal/repository/postgres/audit_worm_test.go` (NEW, testcontainers, gated by `testing.Short()`) — pins WORM contract: INSERT succeeds, UPDATE fails with `check_violation`, DELETE fails with `check_violation`, second INSERT after blocked modification still succeeds (no trigger-state corruption).

#### Documentation

- `docs/compliance.md` — new section "Audit-Trail Integrity & Privacy (Bundle 6)" with the two-layer enforcement table, verification `psql` snippet, compliance-superuser SQL pattern, redactor before/after JSON example, and a maintenance note for adding new credential-bearing fields.

#### Why this matters

Pre-Bundle-6, three compliance gaps and one direct security finding sat unfixed: (1) any host with the app role's DB credentials could rewrite the audit table — there was no DB-level append-only enforcement, only app-layer convention; (2) future service-layer call sites that accidentally passed a credential field in `RecordEvent` details would persist plaintext to the append-only audit table; (3) routine routes captured PII (email, phone, etc.) far beyond the GDPR Art. 32 minimization threshold via similar paths. Bundle 6 closes all three at once because they share the same code path (audit middleware + audit_events table) and the same fix shape (deny-list redaction + DB constraint).

#### Backwards compatibility

Trigger applies forward only — existing rows unchanged. `nil`/empty `details` from `RecordEvent` callers → `nil` out (preserves prior behaviour for the many existing call sites that pass nil). Compliance superusers (provisioned out-of-band) bypass the trigger by design.

### Bundle 5 (Operational Liveness + Bootstrap): 4 audit findings closed

> Closure bundle from the 2026-04-25 comprehensive audit
> (`cowork/comprehensive-audit-2026-04-25/`). Hardens the
> orchestrator-facing surface — Kubernetes probes, agent enrollment,
> shutdown audit drain — and confirms the L-006 short-lived-expiry
> plumbing already shipped in v2.0.54 via the C-1 master closure. Closes
> H-006 + H-007 + M-011 + L-006.

#### Added

- **`/ready` deep DB probe (Audit H-006 / CWE-754)** — `internal/api/handler/health.go::HealthHandler.Ready` now accepts a `*sql.DB` and runs `db.PingContext` with a 2-second ceiling; returns 503 + `{"status":"db_unavailable","error":""}` when the DB is unreachable. Pre-Bundle-5 `/ready` returned 200 unconditionally — k8s readinessProbe pointed at `/ready` would succeed even when the control plane was disconnected from Postgres, masking outages and routing user traffic to a broken instance.
  Post-Bundle-5: `/health` stays shallow (k8s liveness signal — process alive, never restart for DB hiccups); `/ready` is the new readiness signal. Nil DB pool degrades gracefully to 200 + `db=not_configured` for test fixtures and no-DB deploys. Helm chart already routed readinessProbe to `/ready` so no chart change required — the upgrade is purely behavioural.
- **Agent bootstrap token (Audit H-007 / CWE-306 + CWE-288)** — new env var `CERTCTL_AGENT_BOOTSTRAP_TOKEN` and `internal/api/handler/agent_bootstrap.go::verifyBootstrapToken` helper. When set, `RegisterAgent` requires `Authorization: Bearer <token>` (constant-time compare via `crypto/subtle.ConstantTimeCompare`) BEFORE body parse — defeats both timing oracles and unauth payload allocation. Length-mismatch path runs a dummy compare so timing is uniform regardless of failure mode. 401 returns a fixed string `invalid_or_missing_bootstrap_token` (no echo of presented credential — defence against shape leakage to a token spray probe). Backwards-compat: empty token (the v2.0.x default) = warn-mode pass-through with one-shot startup deprecation WARN announcing v2.2.0 deny-default. Generation guidance: `openssl rand -hex 32` for 256-bit entropy.
- **`CERTCTL_AUDIT_FLUSH_TIMEOUT_SECONDS` env var (Audit M-011)** — `Server.AuditFlushTimeoutSeconds` field; `cmd/server/main.go` shutdown path uses `time.Duration(cfg.Server.AuditFlushTimeoutSeconds) * time.Second` with default 30s preserving prior behaviour. Server logs `graceful shutdown budget` at startup. High-volume operators can extend the window without forking the binary; existing WARN on deadline-exceeded retained.

#### Tests

- `internal/api/handler/agent_bootstrap_test.go` (NEW) — full coverage: missing header, wrong scheme, empty bearer, wrong token, length mismatch, matching bearer, warn-mode pass-through, RegisterAgent E2E gate (401 BEFORE service call).
- `internal/api/handler/health_test.go` (extended) — `/ready` DB-ping failure (503 + db_unavailable), nil-DB pass-through (200 + db=not_configured), `/health` shallow with nil DB.

#### Verified (no code change required)

- **`L-006` Short-lived expiry interval plumb** — re-verified at HEAD: `cmd/server/main.go:557` already calls `sched.SetShortLivedExpiryCheckInterval(cfg.Scheduler.ShortLivedExpiryCheckInterval)` per the C-1 master closure in v2.0.54. Bundle 5 confirms; tracker box flipped, no code change required.

#### Why this matters

Pre-Bundle-5, three operational footguns sat unfixed: (1) k8s readinessProbe couldn't distinguish "process alive" from "DB reachable", so an outage looked healthy until users complained; (2) any host with network reach to the agent registration endpoint could enroll an agent and start polling for work — no shared secret required; (3) the shutdown audit drain was hard-coded 30s, which was too short for high-volume environments and dropped events silently. Bundle 5 closes all three plus verifies a fourth (L-006) that was already silently fixed by C-1.

### Bundle 3 (MCP Trust-Boundary Fencing): 5 audit findings closed

> Second closure bundle from the 2026-04-25 comprehensive audit
> (`cowork/comprehensive-audit-2026-04-25/`). Hardens the MCP↔LLM-consumer
> trust boundary (TB-7) against CWE-1039 LLM Prompt Injection. Closes
> H-002 + H-003 + M-003 + M-004 + M-005.

#### Added

- **MCP wrapper-layer fencing (`internal/mcp/fence.go`, new)** — `FenceUntrusted(label, content)` wraps content in `--- UNTRUSTED