Second CI run surfaced 8 real failures across 7 detail/list pages and 1
mock-shape error. Root causes:
1. Multi-match disambiguation. screen.getByText(...) matched both the
PageHeader <h2> AND duplicated text in InfoRow / detail-row spans
within the same page (e.g., issuer name appears as page title AND
in the Issuer Details panel; cert.common_name appears as page title
AND in the Common Name InfoRow). The regex variants (getByText(/X/i))
were even worse: they matched any element containing the substring.
2. NetworkScanPage mock-shape. xssScanTarget.ports was '443,8443'
(string), but NetworkScanPage.tsx:180 calls t.ports?.join() which
requires a number[] per src/api/types.ts:506. Page errored before
rendering the DataTable, so the XSS test's body.textContent
assertion saw an empty string.
Fixes:
- Every page-title assertion in the 14 Pass 3 test files now uses
screen.getByRole('heading', { level: 2, name: ... }), which matches
ONLY the PageHeader <h2> (PageHeader.tsx:11 renders an actual <h2>).
Detail-row spans and InfoRow text are excluded because they are not
headings; column-header text in lower-level headings (h3) is excluded
by the level filter.
- NetworkScanPage xssScanTarget.ports changed from '443,8443' (string)
to [443, 8443] (number[]) per the NetworkScanTarget TS type.
Pages with assertion fixes (8 tests across 7 files):
- AgentFleetPage /Agent/i -> 'Agent Fleet Overview' (h2)
- AuditPage /Audit/ -> 'Audit Trail' (h2)
- CertificateDetailPage 'plain.example.com' (text) -> heading h2
- HealthMonitorPage /Health/i -> 'Health Monitor' (h2)
- IssuerDetailPage 'Plain Name' (text) -> heading h2
- JobDetailPage /j-xss-001/ (text) -> heading h2
- JobsPage /Jobs/i -> 'Jobs' (h2)
- ProfilesPage /Profile/i -> 'Certificate Profiles' (h2)
- TargetDetailPage 'Plain Name' (text) -> heading h2
Plus 4 already-correct pages updated for consistency:
- DigestPage text 'Certificate Digest' -> heading h2
- ObservabilityPage text 'Observability' -> heading h2
- NetworkScanPage /Network/i -> 'Network Scanning' (h2)
- ShortLivedPage text 'Short-Lived...' -> heading h2
Mock-shape fix:
- NetworkScanPage.test.tsx ports: '443,8443' -> [443, 8443]
End-to-end audit:
Every Pass 3 test now anchors on the unambiguous PageHeader <h2>;
no remaining getByText() with regex or substring that could spuriously
multi-match. Mock data shapes verified against src/api/types.ts
interfaces (NetworkScanTarget, MetricsResponse, ManagedCertificate).
CI surfaced two real failures in the Pass 3 tests:
1. ObservabilityPage.test.tsx — tests 2 + 3 mocked getMetrics with only
the uptime field, but ObservabilityPage.tsx:85 reads
metrics.gauge.certificate_total. Test 2 silently 'passed' because the page error
bailed out before any rendering took place — its assertions (no live
<script>, __xss_pwned__ undefined) became vacuous; test 3 surfaced
the actual TypeError. Fix: every getMetrics mock now returns the full
MetricsResponse shape (gauge / counter / uptime) per
src/api/types.ts:517, sanity-checked against the actual TS interface.
2. CertificateDetailPage.test.tsx — the xssCert mock was missing
updated_at, which CertificateDetailPage.tsx:605 reads through
formatDateTime. formatDateTime tolerates undefined per utils.ts:6,
so the page didn't throw, but the cert mock should mirror the real
ManagedCertificate shape — added updated_at.
Both fixes are mock-shape corrections; no production code changes.
Closes M-029 Pass 3 fully. Every src/pages/*.tsx now has a *.test.tsx peer.
Audit recon: 'comm -23 <pages> <test-peers>' returns zero (all 14 T-1-deferred
pages now covered).
Test files added (each ships render-coverage + an XSS-hardening contract):
- HealthMonitorPage.test.tsx endpoint URL + last_error payloads
- JobsPage.test.tsx type / certificate_id / agent_id /
error_message payloads
- NetworkScanPage.test.tsx network_range / agent_id / last_scan_message
payloads
- ProfilesPage.test.tsx profile name / description / EKUs payloads
- AgentFleetPage.test.tsx agent name / hostname / OS / arch / IP
payloads (mirrors the M-003 MCP fence shape)
Pass 3 totals across batches A + B + C: 14 new test files, 14/14 T-1-deferred
pages closed. Each test pins three invariants:
1. The page renders against mock data without crashing.
2. No live <script data-xss='...'> attaches to the DOM.
3. The literal payload appears as escaped text content (proving the page
surfaces the data without rendering it as HTML).
M-029 status after Pass 3:
Pass 1 — useMutation -> useTrackedMutation COMPLETE (6 batches, 56 -> 0)
Pass 2 — useState pagination -> useListParams COMPLETE (CertificatesPage)
Pass 3 — XSS-hardening test suites COMPLETE (14/14 pages)
M-029 IS NOW READY TO CLOSE.
Continues Pass 3. Each detail page has its own narrow attack surface
(subject DN, last_test_message, error_message) that the test exercises
with literal <script> payloads in every text field.
Test files added:
- CertificateDetailPage.test.tsx cert subject / SANs / serial / PEM
across 7 sidecar queries (getCertificate,
getCertificateVersions, getTargets,
getProfile, getProfiles, getRenewalPolicies,
getJobs all mocked in beforeEach)
- IssuerDetailPage.test.tsx issuer name / type / config / last_test_message
(router-aware test using Routes + useParams)
- TargetDetailPage.test.tsx target name / config / last_test_message
(router-aware test pattern)
- JobDetailPage.test.tsx job error_message / type / details
(3-query mock: getJob + getJobVerification +
getAuditEvents)
Closes 9 of 14 T-1-deferred pages toward M-029 Pass 3 completion (5 in batch A
+ 4 in batch B = 9; 5 to go in batch C).
Pass 3 of M-029 ships per-page render + XSS-hardening test suites for the
14 T-1-deferred pages. Each test:
- Renders the page with mock data containing <script> payloads in every
text-rendering field.
- Asserts no live <script data-xss='...'> element attached to the DOM.
- Asserts the script body produced no global side effect
(window.__xss_pwned__ stays undefined).
- Asserts the literal payload text appears as escaped content (proving
the page surfaces the data without rendering it as HTML).
Batch A: 5 simpler pages (display-only / single-mutation / login).
Test files added:
- DigestPage.test.tsx preview HTML payload + render coverage
- LoginPage.test.tsx useAuth.error payload + form invariants
(mocked AuthProvider via Layout.test pattern)
- ShortLivedPage.test.tsx cert subject DN / SAN / id / environment
payloads through the DataTable rendering
- AuditPage.test.tsx audit-event action / actor / resource_*
payloads through the DataTable rendering
- ObservabilityPage.test.tsx health.status + Prometheus text payloads
through the <pre> rendering surface
Closes 5 of 14 T-1-deferred pages toward M-029 Pass 3 completion.
M-029 Pass 2 surface turned out to be much smaller than the audit estimated:
the only page with real UI-driven pagination + filter state stored in
useState was CertificatesPage. Most other pages either fetch filter-dropdown
data with hardcoded per_page (sidecars, not pagination) or use
useSearchParams directly already. So Pass 2 is a single-page migration.
What changed:
- 10 useState hooks (statusFilter, envFilter, issuerFilter, ownerFilter,
profileFilter, teamFilter, expiresBefore, sortBy, page, perPage) collapse
into a single useListParams({ pageSize: 50 }) call.
- All filter onChange handlers now call setFilter('<key>', value).
- setFilter automatically resets page to 1 on every filter / sort change,
so the manual setPage(1) calls at three sites (team / expires_before /
sort) are no longer needed — the F-1 contract is now enforced by the
hook, not by hand-rolled setPage calls scattered through onChange.
- Pagination handler simplified: onPerPageChange: setPageSize (the hook
drops the page param from the URL when pageSize changes).
Behavior preserved:
- The 8 filter keys (status / environment / issuer_id / owner_id /
profile_id / team_id / expires_before / sort) still flow through
getCertificates with the same param names — pinned by the existing
CertificatesPage.test.tsx F-1 contract tests.
- Default pageSize stays at 50 (matches the F-1 baseline; the hook's
global default is 25 but the per-page override takes precedence).
- Page reset on filter / per_page change preserved (now hook-enforced).
Side benefit: filter / sort / pagination state is now URL-resident (browser
deep-link + back-button correct). Sharing a filtered list view is now a
URL copy, not a 'recreate this filter combo by hand' message.
Verification:
legacy useMutation count still 0 (Pass 1 invariant intact)
CertificatesPage useListParams 0 -> 1 site
CertificatesPage local pagination removed
Pass 1 finished — every src/ useMutation now goes through useTrackedMutation.
Promote the M-009 guard to a hard-zero invariant: any bare useMutation() call
outside web/src/hooks/useTrackedMutation.ts fails CI immediately.
Pre-Bundle-8 the codebase had 56 bare useMutation sites. Bundle 8 shipped the
wrapper. M-029 Pass 1 migrated all 56 sites to the wrapper across 6 batches
(commits 2057e76 / e0a3d50 / ee25f00 / ec3772d / 190a27e / 213b464). With the
soft-budget gate now obsolete, the hard-zero gate prevents drift back into
the discretionary-invalidation pattern that motivated M-009 in the first place.
Rationale: per-site enforcement (the wrapper's discriminated-union invalidates
contract) is strictly stronger than the +5 budget guard. The guard's failure
mode also improves: instead of a count delta the operator has to interpret,
they get the exact file:line(s) of the offending bare useMutation call.
Verification:
python3 yaml.safe_load YAML OK
manual guard simulation PASS: bare useMutation = 0 outside wrapper
Drains the last 10 useMutation sites (10 -> 0). Pass 1 is now COMPLETE:
every legacy useMutation site in src/pages and src/components has been
migrated to useTrackedMutation with explicit invalidates contract. The only
remaining useMutation reference in the codebase is inside useTrackedMutation.ts
itself (the wrapper).
Pages migrated:
- CertificateDetailPage.tsx 5 mutations across 2 components:
InlinePolicyEditor.saveMutation invalidates
[['certificate', certId]];
main page renew/deploy/archive/revoke invalidate
various combinations of [['certificate', id]]
and [['certificates']].
(queryClient + useQueryClient dropped from both)
- OnboardingWizard.tsx 5 mutations across 4 components:
Issuer step create/test invalidates [['issuers']]
(test refreshes last_tested_at server-side);
CreateTeamModalInline.create invalidates [['teams']];
CreateOwnerModalInline.create invalidates [['owners']];
CertificateStep.create invalidates
[['certificates'], ['dashboard-summary']].
(queryClient + useQueryClient dropped from all 4)
Verification:
legacy useMutation calls 10 -> 0 (-10) — Pass 1 COMPLETE
useTrackedMutation count 46 -> 61 (+15; some 5-mutation pages collapse
two invalidate-pairs into one array literal,
hence net is greater than the +10 removal)
Pass 1 totals: 56 useMutation sites -> 0; 0 useTrackedMutation -> 61.
Total work in Pass 1: 6 batches across 21 page files merged --no-ff to master.
Closes the 2026-04-25 audit's final-closure cluster. Score 51/55 -> 54/55
(98% closed); deferred 4/7 -> 7/7 (100%). All severity-graded findings now
closed except M-029 (frontend per-PR migration backlog, by design incremental).
L-004 (CWE-924) — dual-key API rotation overlap window:
internal/config/config.go::ParseNamedAPIKeys rewritten to allow same-name
duplicate entries iff admin flag matches. Mismatched-admin entries rejected
at startup (privilege escalation guard); exact (name,key) duplicates rejected
(typo guard — rotation requires DIFFERENT keys under the same name). Startup
INFO log per name with multiple entries surfaces the active rotation window.
NewAuthWithNamedKeys was already shaped correctly (constant-time hash compare
across all entries, same UserKey + AdminKey for either bearer); Bundle B's
M-025 per-user rate-limit bucket and audit-trail actor inherit consistency
across the rollover automatically. 8 new tests pin the contract end-to-end.
docs/security.md::API key rotation walks the 6-step zero-downtime rollover.
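The two rotation rules can be sketched in Go as follows. This is a minimal illustration, not the actual ParseNamedAPIKeys: the NamedKey type, validateRotation, and both error values are hypothetical names, and the sketch works on an already-parsed list rather than the env-var string.

```go
package main

import (
	"errors"
	"fmt"
)

// NamedKey is a hypothetical stand-in for one parsed API-key entry.
type NamedKey struct {
	Name  string
	Key   string
	Admin bool
}

// Hypothetical error values, one per rejection rule.
var (
	ErrAdminMismatch = errors.New("same-name entries disagree on admin flag")
	ErrDuplicateKey  = errors.New("exact (name, key) duplicate")
)

// validateRotation accepts same-name entries only when their admin flags
// agree and their keys differ.
func validateRotation(keys []NamedKey) error {
	admin := map[string]bool{}   // admin flag first seen for each name
	seen := map[[2]string]bool{} // (name, key) pairs already accepted
	for _, k := range keys {
		if prev, ok := admin[k.Name]; ok && prev != k.Admin {
			return ErrAdminMismatch // privilege-escalation guard
		}
		admin[k.Name] = k.Admin
		pair := [2]string{k.Name, k.Key}
		if seen[pair] {
			return ErrDuplicateKey // typo guard
		}
		seen[pair] = true
	}
	return nil
}

func main() {
	rotating := []NamedKey{{"ops", "old-key", true}, {"ops", "new-key", true}}
	fmt.Println(validateRotation(rotating))
	escalating := []NamedKey{{"ops", "old-key", false}, {"ops", "new-key", true}}
	fmt.Println(validateRotation(escalating))
}
```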
D-003 — Mutation testing wired:
security-deep-scan.yml gets a go-mutesting step covering ./internal/crypto/...,
./internal/pkcs7/..., ./internal/connector/issuer/local/... with per-package
summary lines extracted into go-mutesting.txt artefact.
D-007 — Frontend semgrep wired (recon found Bundle 7's wiring claim was false):
security-deep-scan.yml gets a 'semgrep p/react-security' step running
returntocorp/semgrep:latest --config=p/react-security against /src/web/src;
results uploaded as semgrep-react.json.
D-004 + D-005 — Operator runbook published:
docs/testing-strategy.md (NEW) consolidates per-tool local-run procedures,
acceptance thresholds, and triage paths for go-mutesting, ZAP baseline DAST,
testssl.sh, and semgrep p/react-security. Closes the 'wired CI-only, no
local-run validation' framing for D-004/D-005 by giving operators the same
commands the CI workflow runs.
Verification:
gofmt -l no diff
go vet ./internal/config/... ./internal/api/middleware/... clean
go test -short -count=1 ./internal/config/... ./internal/api/middleware/... PASS
python3 -c 'yaml.safe_load(...)' YAML OK
G-3 env-var docs guard no phantom env-vars
Audit deliverables:
audit-report.md: L-004 + D-003/4/5/7 boxes flipped [x]; score 51/55 -> 54/55
findings.yaml: 5 status flips; new bundle-G-final-closure closure_log entry
CHANGELOG.md: Bundle G entry under [unreleased]; supersedes Bundle E + F
L-004-deferred framing
CI on the bundle-F merge (run #24972730564) failed the G-3 env-var
docs guardrail because docs/legacy-est-scep.md mentioned
CERTCTL_EST_PROXY_TRUSTED_SOURCES
CERTCTL_EST_TRUST_PROXY_CLIENT_CERT_HEADER
which are documented as future-feature env vars but don't exist in
config.go. The G-3 guard treats as drift any env-var name in docs that is
neither defined in source nor on the documented integration-surface
allowlist.
The runbook's 'certctl-side configuration' section was over-promising
features that haven't shipped yet. Rewritten to be honest:
- Current implementation is header-agnostic (X-SSL-Client-Cert is
ignored). EST/SCEP authentication still works correctly because
both protocols carry their own auth (CSR signature for EST,
challengePassword for SCEP) inside the request body.
- The reverse proxy is purely a TLS-version bridge.
- Future-feature description retained in prose form (without
literal env-var names) so an operator who needs proxy-supplied
client identity knows to open an issue.
The nginx config block's comment was also rewritten to reflect the
header-agnostic default. The proxy still SETS the headers (cheap,
no-op when ignored); a future commit can flip certctl to read them
behind a fail-closed CIDR allowlist + opt-in toggle.
Verification:
grep -rnE 'CERTCTL_EST_PROXY|CERTCTL_EST_TRUST' README.md docs/ deploy/helm/
— empty (G-3 guard now passes for these names)
Closes H-001 + M-012 + M-014 from comprehensive-audit-2026-04-25.
H-001 (CWE-829) — Container base images SHA-pinned
Pre-bundle: 5 FROM lines pulled by tag only — registry-side tag
swap could silently change the build.
Post-bundle: every FROM pinned to immutable digest fetched live
from Docker Hub at audit time:
node:20-alpine@sha256:fb4cd12c85ee03686f6af5362a0b0d56d50c58a04632e6c0fb8363f609372293
golang:1.25-alpine@sha256:5caaf1cca9dc351e13deafbc3879fd4754801acba8653fa9540cea125d01a71f (x2)
alpine:3.19@sha256:6baf43584bcb78f2e5847d1de515f23499913ac9f12bdf834811a3145eb11ca1 (x2)
Dockerfile header comment documents the operator bump procedure
(quarterly cadence; docker manifest inspect or Hub Registry API).
CI step Forbidden bare FROM regression guard (H-001) fails build
if any new FROM lacks @sha256.
M-012 (CWE-250) — Verified-already-clean + USER guard
Recon found both Dockerfile:75 and Dockerfile.agent:59 already
carry USER certctl directives; pre-USER RUN calls are build-setup
steps that legitimately need root, each happening before the
USER drop.
CI step Forbidden missing USER regression guard (M-012) greps
every Dockerfile* for the LAST USER directive; fails build if
missing OR equals root/0. Future Dockerfile additions must
preserve the privilege drop.
M-014 — npm ci explicit retry helper
Pre-bundle Dockerfile:25:
RUN npm ci --include=dev || npm ci --include=dev && \
tsc --version && npm run build
Misleading shell grouping: sh gives || and && equal precedence and
evaluates them left to right, so the line parses as
(A || B) && C && D, not A || (B && C && D). tsc + build ran after
either npm ci succeeded, but the retry was a single immediate re-run
with no backoff and no check that node_modules was actually created.
Post-bundle: deterministic 3-attempt retry loop with 5s backoff
plus explicit [ -d node_modules ] post-check that fails loudly
if directory wasn't created. Silent failure is now impossible.
Audit deliverables:
audit-report.md: H-001/M-012/M-014 flipped [x] with closure
notes; score 49/55 closed (High 9/9 = 100%; Medium 24/27;
Low 19/19 with L-004 deferred). All High audit findings now
closed for the first time.
findings.yaml: 3 status flips
CHANGELOG.md: Bundle A section
Verification:
Self-test of both new CI guards locally — PASS for current state
(every FROM has @sha256; every Dockerfile drops to non-root).
Closes L-009 + L-010 + L-011 + L-013 + L-020 + L-021 from
comprehensive-audit-2026-04-25. L-004 deferred — recon found NO
rotation infrastructure exists at all; building it from scratch is
a feature project, not a Bundle-E mechanical sweep.
L-009 — ZeroSSL EAB URL configurable
Audit's 'no timeout' claim was wrong: ari.go:329 has 15s timeout.
internal/connector/issuer/acme/acme.go: zeroSSLEABEndpoint now
lazily reads CERTCTL_ZEROSSL_EAB_URL from env at package init;
defaults to ZeroSSL public endpoint. Pre-existing test override
path preserved.
L-010 — Verified-already-clean
grep -rn 'mock\.Anything' --include='*_test.go' . returned 0.
certctl uses hand-rolled struct mocks (mockJobRepo, mockAuditRepo,
etc.) with explicit method bodies; no testify-style mocks anywhere.
L-011 — IPv6 bracket-aware dialing pinned
Every production net.Dial / DialTimeout site audited:
cmd/agent/main.go:293 — intentional IPv4 literal '8.8.8.8:80'
verify.go / tlsprobe / network_scan — net.Dialer (no string addr)
email.go — net.JoinHostPort (bracket-aware)
ssh.go — addr derives from JoinHostPort upstream
ssrf.go — net.Dialer
internal/connector/notifier/email/email_ipv6_test.go (NEW):
TestJoinHostPort_IPv6BracketsRoundTrip pins IPv4/IPv6/zone variants;
TestSMTPDialerUsesJoinHostPort source-greps email.go and fails CI
if a future refactor swaps in 'host:port' concatenation.
L-013 — Verified-already-clean (monotonic-safe)
Only one site uses now.Sub: middleware.go:393 in tokenBucket.allow().
Both 'now' and tb.lastRefill come from time.Now() which carries
monotonic-clock readings per Go's time package contract;
intra-process now.Sub is monotonic-safe by construction. Doc
comment block added above the call to make the invariant explicit.
L-020 (CWE-563) — ineffassign sweep, 8 unique sites
certificate.go:135 — sortDir initial value dropped (set
unconditionally below by SortDesc branch).
certificate.go:169,175 — argCount post-increments dropped (var
not read past the LIMIT/OFFSET formatting).
agent_group.go, profile.go — page/perPage truly vestigial,
replaced with _ = page; _ = perPage.
issuer.go:633, owner.go:131, target.go:267, team.go:131 — same
treatment for the audit-flagged second-function ListXxx clamps.
First-function List() in issuer/owner/target/team KEEPS its
clamp because page/perPage is used for in-memory slice
pagination — ineffassign correctly didn't flag those.
Build + tests green post-sweep.
L-021 — Transitive CVE bump
go get golang.org/x/crypto@v0.45.0 golang.org/x/net@v0.47.0
(crypto required net@0.47.0). go-text@v0.31.0 transitively
bumped.
Per tool-output govulncheck-verbose: x/net@v0.45.0 fixes
GO-2026-4441 + GO-2026-4440; x/crypto@v0.45.0 fixes
GO-2025-4134 + GO-2025-4135 + GO-2025-4116 — all 5 advisories
cleared. Bundle B's ISV grep guard + Bundle D's release-time
govulncheck step are the going-forward monitor + bump pass.
L-004 — Deferred to dedicated bundle
Recon: zero hits for RotateAPIKey / rotated_at / key_status
anywhere in source. API keys configured via
CERTCTL_API_KEYS_NAMED env var; rotation is operator-managed
(edit env + restart). Building rotation infrastructure from
scratch is a feature project, not a mechanical sweep.
Documented in audit-report.md with scope-pivot note.
Audit deliverables:
audit-report.md: score 46/55 -> 52/55 closed
(Low 14/19 -> 19/19 — 100% Low closed except L-004 deferred)
findings.yaml: 6 status flips
certctl/CHANGELOG.md: Bundle E section
Verification:
go test -count=1 -short ./internal/service ./internal/connector/issuer/acme
./internal/connector/notifier/email green
go vet on changed packages clean
Closes H-009 + L-001 + L-007 + L-008 + L-016 + L-017 + L-018 + M-027
from comprehensive-audit-2026-04-25.
H-009 — README JWT verified-already-clean
README has zero JWT mentions at audit time. docs/architecture.md
correctly documents JWT/OIDC integration via authenticating-gateway
pattern (lines 905-912).
.github/workflows/ci.yml: new step
'Forbidden README JWT advertising regression guard (H-009)'
greps README for JWT-as-supported phrasing; passes verbatim
(gateway / pre-G-1) but fails build on net-new advertising.
L-001 (CWE-295) — InsecureSkipVerify per-site justification
Audit count was 8; recon found 13 production sites.
docs/tls.md: new 'InsecureSkipVerify justifications' table
enumerates each site by file:line with per-site rationale.
cmd/agent/verify.go:78, internal/tlsprobe/probe.go:54,
internal/service/network_scan.go:460: each previously-bare
InsecureSkipVerify: true now carries //nolint:gosec.
.github/workflows/ci.yml: new step
'Forbidden bare InsecureSkipVerify regression guard (L-001)'
fails build if any net-new ISV lands in non-test .go without
nolint:gosec on the same or preceding line.
L-007 — README dependency-audit commands
README.md: new Dependencies section with go list -m all | wc -l,
go mod why, govulncheck ./.... Honors operating-rules invariant.
L-008 — Release-time govulncheck gate
.github/workflows/release.yml: new 'Install govulncheck' +
'Run govulncheck (release gate)' steps in the matrix job.
Pinned to same install path as ci.yml. The default exit-code
semantics (fail on called vulnerabilities only; deferred-call
advisories are tracked on master via L-021) keep the gate appropriate.
L-016 — architecture.md drift fixes
docs/architecture.md: system-components diagram's '21 tables'
annotation removed (current 23; replaced with TEXT-keys
descriptor); connector-architecture '9 connectors' prose
replaced with grep ref + current 12-issuer list (added
Entrust/GlobalSign/EJBCA which were missing); API-design
'97 operations / 107 total' replaced with grep commands.
Connector subgraphs verified-current at 12/13/6.
L-017 — workspace CLAUDE.md verified-already-clean
Bundle B's pre-commit-gate refactor already converted current-
state numeric claims to grep commands. Phase 0 recon confirmed
zero remaining hardcoded counts.
L-018 — Defect age table
cowork/comprehensive-audit-2026-04-25/defect-age.md (NEW):
Tabulates all 9 High findings with first-mentioned commit,
closing bundle, days-open. Methodology snippet for re-running.
Key finding: 8 of 9 closed within 24h of audit publication.
M-027 — OpenAPI parity verified-already-clean
Audit's 'router 121 vs OpenAPI 125 — 4-op gap' was wrong
methodology. The 4-op 'gap' was exactly the 4 routes registered
via r.mux.Handle (auth-exempt allowlist) instead of r.Register.
When you count both dispatch shapes the totals match exactly.
internal/api/router/openapi_parity_test.go (NEW):
TestRouter_OpenAPIParity AST-walks router.go for both
Register and mux.Handle calls + walks api/openapi.yaml's
path/method nesting + asserts the sets match. Adding a route
without updating the spec fails CI permanently.
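A sketch of the AST-walk half of such a parity check, using go/parser and go/ast. The sample source, routePatterns, and the assumption that the route pattern is the first string-literal argument are illustrative only; the real test also walks api/openapi.yaml.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"sort"
	"strconv"
)

// sampleRouter is a made-up source file covering both dispatch shapes.
const sampleRouter = `package router

func (r *Router) routes() {
	r.Register("GET /api/v1/certificates", nil)
	r.mux.Handle("GET /health", nil)
}
`

// routePatterns returns the first string-literal argument of every call to
// a method named Register or Handle, sorted.
func routePatterns(src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "router.go", src, 0)
	if err != nil {
		return nil, err
	}
	var out []string
	ast.Inspect(file, func(n ast.Node) bool {
		call, ok := n.(*ast.CallExpr)
		if !ok || len(call.Args) == 0 {
			return true
		}
		sel, ok := call.Fun.(*ast.SelectorExpr)
		if !ok || (sel.Sel.Name != "Register" && sel.Sel.Name != "Handle") {
			return true
		}
		lit, ok := call.Args[0].(*ast.BasicLit)
		if !ok || lit.Kind != token.STRING {
			return true
		}
		if s, err := strconv.Unquote(lit.Value); err == nil {
			out = append(out, s)
		}
		return true
	})
	sort.Strings(out)
	return out, nil
}

func main() {
	routes, err := routePatterns(sampleRouter)
	fmt.Println(routes, err)
}
```

Counting only Register calls in this sample would miss the mux.Handle route, which is the methodology gap described above.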
Audit deliverables:
audit-report.md: score 38/55 -> 46/55 closed
(High 7/9 -> 8/9; Medium 20/27 -> 21/27; Low 8/19 -> 14/19)
findings.yaml: 8 status flips open -> closed
defect-age.md: new file
certctl/CHANGELOG.md: Bundle D section
Verification:
TestRouter_OpenAPIParity PASS
L-001 grep guard self-test (after //nolint:gosec adds) PASS
H-009 grep guard self-test PASS
go test -count=1 -short on changed packages green
CI on the bundle-C merge (run #24970879984) failed go vet because
internal/integration/lifecycle_test.go::mockJobRepository didn't
implement the new JobRepository.ListJobsWithOfflineAgents method
that Bundle C added.
The lifecycle integration test does not exercise the offline-agent
reaper path (the unit-level test in internal/service covers that),
so the integration-mock stub is a no-op returning (nil, nil) — same
shape as the existing M-7 / I-003 stubs in this file.
Verification:
go vet ./internal/integration clean
go test -count=1 -short ./internal/integration green
Closes M-006 + M-007 + M-008 + M-015 + M-016 + M-019 + M-020 from
comprehensive-audit-2026-04-25. M-028 was already closed by the
Bundle B CI follow-up.
M-006 (CWE-913) — Idempotent migration 000014
migrations/000014_policy_violation_severity_check.up.sql:
Prepended ALTER TABLE ... DROP CONSTRAINT IF EXISTS before the
ADD. Mirrors the down migration's existing IF EXISTS shape and
the M-7 idempotent-index idiom. Re-runs against partially-applied
DBs now succeed.
M-007 — Bulk-op partial-failure tests (3 new)
internal/api/handler/bulk_partial_failure_test.go:
TestBulkRevoke_PartialFailure_ReportsBoth
TestBulkRenew_PartialFailure_ReportsBoth
TestBulkReassign_PartialFailure_ReportsBoth
Each asserts HTTP 200 + both success/failure counters round-trip
+ per-cert errors[] preserved with non-empty messages so operators
can correlate each failure to its certificate ID.
M-008 — Admin-gated handler enumeration pin (verified-already-clean)
Recon: only one admin-gated handler — bulk_revocation.go — with
full 3-branch test triplet already in place. health.go calls
IsAdmin informationally to surface the flag to the GUI without
gating.
internal/api/handler/m008_admin_gate_test.go:
Walks every handler .go file, asserts every middleware.IsAdmin
call site is in AdminGatedHandlers (with required test triplet)
or InformationalIsAdminCallers (justified). Adding a new admin
gate without updating both the constant AND adding the test
triplet fails CI.
M-015 — Single-profile cardinality pin (verified-already-clean)
Audit claim 'no cardinality validation' was wrong — enforced at
struct level. domain.ManagedCertificate.{CertificateProfileID,
RenewalPolicyID,IssuerID,OwnerID} and RenewalPolicy.
CertificateProfileID are bare strings, not slices.
internal/domain/m015_cardinality_test.go:
reflect-based pin on kind=String. Schema change to N:N would
have to update renewal.go's lookup loop in the same commit.
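A reflect-based pin of this kind might look like the sketch below. The struct here is a cut-down hypothetical copy, not the real domain.ManagedCertificate, and nonStringFields is an illustrative helper name.

```go
package main

import (
	"fmt"
	"reflect"
)

// ManagedCertificate is a cut-down hypothetical copy carrying only the four
// reference fields named above.
type ManagedCertificate struct {
	CertificateProfileID string
	RenewalPolicyID      string
	IssuerID             string
	OwnerID              string
}

// nonStringFields returns the named fields of v's struct type that are
// missing or whose kind is not string (for example a slice after a move to N:N).
func nonStringFields(v any, names ...string) []string {
	t := reflect.TypeOf(v)
	var bad []string
	for _, name := range names {
		f, ok := t.FieldByName(name)
		if !ok || f.Type.Kind() != reflect.String {
			bad = append(bad, name)
		}
	}
	return bad
}

func main() {
	bad := nonStringFields(ManagedCertificate{},
		"CertificateProfileID", "RenewalPolicyID", "IssuerID", "OwnerID")
	fmt.Println(len(bad) == 0)
}
```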
M-016 (CWE-754) — Reap stale-agent jobs
internal/repository/postgres/job.go::ListJobsWithOfflineAgents:
JOIN jobs to agents on agent_id, filter (status=Running AND
a.last_heartbeat_at < cutoff), exclude server-keygen jobs.
internal/service/job.go::ReapJobsWithOfflineAgents:
Flips matched jobs to Failed reason agent_offline so I-001
retry loop re-queues them on a healthy agent. Records audit
event per reap.
internal/scheduler/scheduler.go:
Scheduler.runJobTimeout cycle now calls both reaper arms.
agentOfflineJobTTL default 5min (5x agent-health-check default);
SetAgentOfflineJobTTL knob for operator override.
internal/service/job_offline_agent_reaper_test.go: 6 unit tests
cover happy path, server-keygen-skip, non-Running-skip, non-
positive-TTL fail-loud, repo-error propagation, audit-event
recording.
M-019 — Configurable ARI HTTP timeout
Audit claim 'no fallback timeout' was wrong — ari.go:52 already
had a 15s timeout. Bundle C makes it configurable.
internal/connector/issuer/acme/acme.go:
Config.ARIHTTPTimeoutSeconds field with env path
CERTCTL_ACME_ARI_HTTP_TIMEOUT_SECONDS.
internal/connector/issuer/acme/ari.go:
Both HTTP clients (GetRenewalInfo + getARIEndpoint) now use the
new ariHTTPTimeout() helper. Zero / negative / nil-config all
fall back to the historic 15s default.
ari_timeout_test.go: 4 dispatch arm tests.
M-020 (CWE-770) — OCSP DoS hardening
Pre-bundle the noAuthHandler chain had no rate limit. An attacker
could DoS the OCSP responder, which for fail-open relying parties
is a revocation bypass.
cmd/server/main.go:
noAuthHandler refactored from fixed middleware.Chain(...) to a
conditional slice that appends middleware.NewRateLimiter when
cfg.RateLimit.Enabled. Per-IP keying applies; OCSP/CRL/EST/SCEP
are unauth.
docs/security.md (NEW):
Operator runbook documenting Must-Staple TLS Feature extension
RFC 7633 as the architectural fix for fail-open relying parties.
Profile-flip guidance + nginx/Apache/HAProxy/Envoy stapling
snippets + explicit scope statement on what the rate limiter
alone does NOT solve.
Audit deliverables:
cowork/comprehensive-audit-2026-04-25/audit-report.md: score
31/55 -> 38/55 closed (Medium 13/27 -> 20/27).
cowork/comprehensive-audit-2026-04-25/findings.yaml: 7 status
flips open -> closed with closure notes citing the Bundle C
mechanism.
certctl/CHANGELOG.md: Bundle C section under [unreleased].
Verification:
go vet ./internal/service ./internal/scheduler ./internal/connector/issuer/acme
./internal/api/handler ./internal/domain ./cmd/server clean
go test -count=1 -short on the same packages all green
helm template + helm lint clean
internal/repository/postgres setup-fail (sandbox disk pressure;
same on master HEAD before this branch)
Two CI failures on master after Bundle B merge:
1. Frontend Build / G-3 env-var docs guardrail
Bundle B introduced CERTCTL_RATE_LIMIT_PER_USER_RPS and
CERTCTL_RATE_LIMIT_PER_USER_BURST without adding them to
docs/features.md. The guardrail step that scans Go source for
getEnv* calls and asserts each appears in a doc page failed.
Fix: docs/features.md rate-limit section extended with both new
env vars + a paragraph explaining the per-key keying contract
from M-025.
2. Go Build & Test / staticcheck SA1019 hits (6 errors)
The CI workflow runs staticcheck without continue-on-error. Bundle
7 opened M-028 to track 6 deprecated-API sites; Bundle 9 closed 1
of them (the elliptic.Marshal in local.go) but kept a deliberate
regression-oracle reference in bundle9_coverage_test.go protected
only by golangci-lint's //nolint comment — staticcheck-as-CLI does
not honor that, only its native //lint:ignore directive.
Closure of remaining 5 sites:
cmd/server/main_test.go:47, 163, 192, 465 — 4 × middleware.NewAuth
migrated to middleware.NewAuthWithNamedKeys with explicit
NamedAPIKey entries. The auth=none case at line 465 maps to a
nil NamedAPIKey slice (no-op pass-through, matches the
NewAuthWithNamedKeys contract for empty input). Audit count was
3; recon found a 4th at line 465 that was missed.
internal/api/handler/scep.go:266 — csr.Attributes is a real RFC
2985 §5.4.1 challengePassword carve-out. Go's stdlib deprecation
note explicitly applies only to OID 1.2.840.113549.1.9.14
(requestedExtensions), NOT to OID 1.2.840.113549.1.9.7
(challengePassword), for which there is no non-deprecated
stdlib API. Suppressed with native //lint:ignore SA1019 +
comment block citing the RFC.
internal/connector/issuer/local/bundle9_coverage_test.go:342 —
deliberate regression-oracle that calls elliptic.Marshal to
prove the new crypto/ecdh path is byte-identical. Comment
converted from //nolint:staticcheck to native //lint:ignore
SA1019 so staticcheck-as-CLI honors the suppression.
Audit deliverables:
cowork/comprehensive-audit-2026-04-25/audit-report.md: M-028 box
flipped [x]; score 30/55 -> 31/55 (Medium 12/27 -> 13/27).
cowork/comprehensive-audit-2026-04-25/findings.yaml: M-028 status
partial_closed -> closed with closure note.
Verification:
go test -count=1 -short ./cmd/server ./internal/api/handler
./internal/connector/issuer/local ./internal/api/middleware
./internal/config — all green.
staticcheck on each changed package — 0 SA1019 hits.
Bundle C had M-028 in scope; this CI fix pulls it forward so that
master CI goes green immediately. Bundle C's scope drops M-028 and
focuses on M-006 / M-015 / M-016 / M-019 / M-020 plus the
M-007 / M-008 coverage gaps.
Closes M-001 + M-002 + M-013 + M-018 + M-025 from
comprehensive-audit-2026-04-25.
M-001 (CWE-916) — PBKDF2 100k -> 600k via v3 blob format
internal/crypto/encryption.go:
- New v3Magic (0x03), pbkdf2IterationsV3 (600,000 — OWASP 2024
Password Storage Cheat Sheet floor), v3SaltSize (16 bytes),
deriveKeyWithSaltV3 helper.
- EncryptIfKeySet now unconditionally writes v3:
magic(0x03) || salt(16) || nonce(12) || ciphertext+tag
- DecryptIfKeySet falls through v3 -> v2 -> v1 with AEAD verification
at each step. Wrong-passphrase v3 reads cannot be silently
misattributed to v2/v1.
- IsLegacyFormat updated to recognize 0x03 as non-legacy.
internal/crypto/encryption_v3_test.go (NEW, 7 tests):
V3 round-trip / V2 read-fallback against deterministic v2 fixture /
V3 wrong-passphrase fails / V3-vs-V2 dispatch order / V2 vs V3 keys
differ for same (passphrase, salt) / iteration-count pin at OWASP
2024 floor / IsLegacyFormat-recognises-V3.
Coverage internal/crypto: 86.7% -> 88.2%.
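Illustrative sketch only, not the code in encryption.go: the v3 layout
described above could be produced and consumed roughly as follows. The
PRF (HMAC-SHA256), the cipher (AES-256-GCM) and the function names
deriveKeyV3 / encryptV3 / decryptV3 are assumptions for the example;
only the magic byte, salt size, nonce size and iteration count come
from the notes above. The v3 -> v2 -> v1 fall-through is omitted.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"errors"
	"fmt"
)

const (
	v3Magic            = 0x03
	v3SaltSize         = 16
	nonceSize          = 12
	pbkdf2IterationsV3 = 600_000
	headerSize         = 1 + v3SaltSize + nonceSize
)

var errNotV3 = errors.New("not a v3 blob")

// deriveKeyV3 is PBKDF2-HMAC-SHA256 (assumed PRF) producing a 32-byte key.
// The output length equals the digest size, so only PBKDF2 block 1 is needed.
func deriveKeyV3(passphrase string, salt []byte) []byte {
	mac := hmac.New(sha256.New, []byte(passphrase))
	mac.Write(salt)
	mac.Write([]byte{0, 0, 0, 1}) // big-endian block index 1
	u := mac.Sum(nil)
	key := append([]byte(nil), u...)
	for i := 1; i < pbkdf2IterationsV3; i++ {
		mac.Reset()
		mac.Write(u)
		u = mac.Sum(u[:0])
		for j := range key {
			key[j] ^= u[j]
		}
	}
	return key
}

func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// encryptV3 emits magic(0x03) || salt(16) || nonce(12) || ciphertext+tag.
func encryptV3(passphrase string, plaintext []byte) ([]byte, error) {
	salt := make([]byte, v3SaltSize)
	nonce := make([]byte, nonceSize)
	if _, err := rand.Read(salt); err != nil {
		return nil, err
	}
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	aead, err := newGCM(deriveKeyV3(passphrase, salt))
	if err != nil {
		return nil, err
	}
	blob := append([]byte{v3Magic}, salt...)
	blob = append(blob, nonce...)
	return aead.Seal(blob, nonce, plaintext, nil), nil
}

// decryptV3 rejects non-v3 input up front; a wrong passphrase fails the
// AEAD tag check inside Open instead of yielding garbage plaintext.
func decryptV3(passphrase string, blob []byte) ([]byte, error) {
	if len(blob) < headerSize || blob[0] != v3Magic {
		return nil, errNotV3
	}
	salt := blob[1 : 1+v3SaltSize]
	nonce := blob[1+v3SaltSize : headerSize]
	aead, err := newGCM(deriveKeyV3(passphrase, salt))
	if err != nil {
		return nil, err
	}
	return aead.Open(nil, nonce, blob[headerSize:], nil)
}

func main() {
	blob, err := encryptV3("correct horse", []byte("secret"))
	if err != nil {
		panic(err)
	}
	pt, err := decryptV3("correct horse", blob)
	fmt.Println(string(pt), err)
}
```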
M-002 (CWE-862) — Auth-exempt allowlist constants + AST regression test
Recon found that the auth-exempt surface spans TWO layers (the
audit's claim was incomplete):
Layer 1 (router.go direct r.mux.Handle):
GET /health, GET /ready, GET /api/v1/auth/info, GET /api/v1/version
Layer 2 (cmd/server/main.go::buildFinalHandler URL-prefix dispatch):
/.well-known/pki/*, /.well-known/est/*, /scep[/...]*
internal/api/router/router.go:
- New AuthExemptRouterRoutes constant with per-entry justifications.
- New AuthExemptDispatchPrefixes constant.
internal/api/router/auth_exempt_test.go (NEW, 2 tests):
AST-walks router.go for every direct mux.Handle call and asserts
the resulting route set equals AuthExemptRouterRoutes; reads the
source bytes of Register / RegisterFunc and asserts they still wrap
handlers with middleware.Chain.
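Illustrative sketch only, not the code in auth_exempt_test.go: an AST
walk of this kind can be written with go/parser and go/ast. The
embedded routerSrc, the helper name directMuxRoutes and the two-entry
allowlist are stand-ins for the example; the real test reads router.go
and compares against AuthExemptRouterRoutes.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"sort"
	"strconv"
)

// routerSrc is a hypothetical, cut-down stand-in for router.go.
const routerSrc = `package router

func (r *Router) wire() {
	r.mux.Handle("GET /health", r.health)
	r.mux.Handle("GET /ready", r.ready)
	r.Register("GET /api/v1/certificates", r.list)
}
`

// directMuxRoutes returns the sorted first-argument string literals of every
// <x>.mux.Handle(...) call in src. Calls through Register are ignored.
func directMuxRoutes(src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "router.go", src, 0)
	if err != nil {
		return nil, err
	}
	var routes []string
	ast.Inspect(file, func(n ast.Node) bool {
		call, ok := n.(*ast.CallExpr)
		if !ok || len(call.Args) == 0 {
			return true
		}
		sel, ok := call.Fun.(*ast.SelectorExpr)
		if !ok || sel.Sel.Name != "Handle" {
			return true
		}
		recv, ok := sel.X.(*ast.SelectorExpr)
		if !ok || recv.Sel.Name != "mux" {
			return true
		}
		lit, ok := call.Args[0].(*ast.BasicLit)
		if !ok || lit.Kind != token.STRING {
			return true
		}
		if s, err := strconv.Unquote(lit.Value); err == nil {
			routes = append(routes, s)
		}
		return true
	})
	sort.Strings(routes)
	return routes, nil
}

func main() {
	routes, err := directMuxRoutes(routerSrc)
	if err != nil {
		panic(err)
	}
	fmt.Println(routes)
}
```

Because the parser does not type-check, the walk works on a single
file in isolation; a new direct mux.Handle call changes the returned
set and fails the equality assertion.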
cmd/server/auth_exempt_test.go (NEW, 2 tests):
14-case table test on buildFinalHandler asserting documented
prefixes route to noAuthHandler and authenticated routes route to
apiHandler; inverse-overlap pin proves no documented bypass shadows
an authenticated prefix.
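Illustrative sketch only, not the code in cmd/server/main.go: a
URL-prefix dispatch of the Layer 2 kind might look roughly like this.
The signature of buildFinalHandler, the helper names isAuthExempt and
routedTo, and the reading of "/scep[/...]*" as "exactly /scep or
anything under /scep/" are assumptions for the example.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// Prefixes mirror the Layer 2 list. "/scep" itself is matched separately so
// that an unrelated path such as "/scepter" does not bypass auth.
var authExemptPrefixes = []string{"/.well-known/pki/", "/.well-known/est/", "/scep/"}

func isAuthExempt(path string) bool {
	if path == "/scep" {
		return true
	}
	for _, p := range authExemptPrefixes {
		if strings.HasPrefix(path, p) {
			return true
		}
	}
	return false
}

// buildFinalHandler (assumed signature) sends exempt paths to noAuthHandler
// and everything else to apiHandler.
func buildFinalHandler(apiHandler, noAuthHandler http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if isAuthExempt(r.URL.Path) {
			noAuthHandler.ServeHTTP(w, r)
			return
		}
		apiHandler.ServeHTTP(w, r)
	})
}

// routedTo reports which handler served path: "noauth" or "api".
func routedTo(path string) string {
	tag := func(name string) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
			fmt.Fprint(w, name)
		})
	}
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, path, nil)
	buildFinalHandler(tag("api"), tag("noauth")).ServeHTTP(rec, req)
	return rec.Body.String()
}

func main() {
	paths := []string{"/scep", "/scep/pkiclient.exe", "/scepter",
		"/.well-known/est/cacerts", "/api/v1/certificates"}
	for _, p := range paths {
		fmt.Printf("%-26s -> %s\n", p, routedTo(p))
	}
}
```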
M-013 (CWE-942) — CORS deny-by-default verified-already-clean + pin
Audit claim 'default allows all origins if env-var unset' was WRONG.
internal/api/middleware/middleware.go::NewCORS already denies cross-
origin requests when len(cfg.AllowedOrigins) == 0 (no
Access-Control-Allow-Origin header is emitted, same-origin policy
applies).
internal/api/middleware/cors_test.go: adds
TestNewCORS_NilOriginsDeniesAll and
TestNewCORS_M013_ContractDocumentedInOrder (5-case table test
pinning the 3-arm dispatch contract).
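Illustrative sketch only, not the code in middleware.go: deny-by-
default CORS of the kind described above can be reduced to the
following. newCORS and allowOriginHeader are hypothetical names, and
the wildcard and exact-match arms are an assumed reading of the 3-arm
contract; only the empty-list arm (no Access-Control-Allow-Origin
header emitted) is stated in the notes above.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newCORS is a hypothetical stand-in for NewCORS. With a nil or empty
// allowlist the loop body never runs, so no Access-Control-Allow-Origin
// header is emitted and the browser's same-origin policy applies.
func newCORS(allowed []string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		origin := r.Header.Get("Origin")
		for _, a := range allowed {
			if a == "*" { // assumed wildcard arm
				w.Header().Set("Access-Control-Allow-Origin", "*")
				break
			}
			if origin != "" && a == origin { // assumed exact-match arm
				w.Header().Set("Access-Control-Allow-Origin", origin)
				w.Header().Add("Vary", "Origin")
				break
			}
		}
		next.ServeHTTP(w, r)
	})
}

// allowOriginHeader sends one request with the given Origin through the
// middleware and returns the Access-Control-Allow-Origin response header.
func allowOriginHeader(allowed []string, origin string) string {
	ok := http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	req := httptest.NewRequest(http.MethodGet, "/api/v1/version", nil)
	req.Header.Set("Origin", origin)
	rec := httptest.NewRecorder()
	newCORS(allowed, ok).ServeHTTP(rec, req)
	return rec.Header().Get("Access-Control-Allow-Origin")
}

func main() {
	fmt.Printf("nil allowlist:  %q\n", allowOriginHeader(nil, "https://evil.example"))
	fmt.Printf("exact match:    %q\n",
		allowOriginHeader([]string{"https://app.example"}, "https://app.example"))
}
```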
M-018 (CWE-319 / PCI-DSS Req 4) — Postgres TLS opt-in toggle
deploy/helm/certctl/values.yaml: new postgresql.tls.{mode,caSecretRef}
operator-facing knobs. Default 'disable' preserves in-cluster pod-
network behavior; PCI-scoped operators set verify-full.
deploy/helm/certctl/templates/_helpers.tpl: certctl.databaseURL helper
pipes postgresql.tls.mode into ?sslmode=.
deploy/helm/certctl/templates/server-secret.yaml: uses the helper
instead of hardcoded sslmode=disable.
deploy/docker-compose.yml: CERTCTL_DATABASE_URL is now
${CERTCTL_DATABASE_URL:-...} so operators override without editing.
docs/database-tls.md (NEW): operator runbook covering 4 deployment
shapes, RDS verify-full example with PGSSLROOTCERT mount, and
pg_stat_ssl verification query.
helm template + helm lint clean.
M-025 (OWASP ASVS L2 §11.2.1) — Per-key rate limiting
internal/api/middleware/middleware.go::NewRateLimiter rewritten from
a single global tokenBucket to a keyedRateLimiter map keyed on
'user:'+GetUser(ctx) for authenticated callers and
'ip:'+RemoteAddr-host for unauthenticated callers.
- Empty UserKey strings treated as unauthenticated.
- X-Forwarded-For intentionally NOT consulted (header-spoofing risk).
- Create-on-demand bucket allocation under sync.RWMutex with double-
check pattern.
RateLimitConfig.PerUserRPS / PerUserBurstSize fields, set via env vars
CERTCTL_RATE_LIMIT_PER_USER_RPS / CERTCTL_RATE_LIMIT_PER_USER_BURST,
allow per-user budgets distinct from the per-IP budget.
internal/api/middleware/ratelimit_keyed_test.go (NEW, 5 tests):
TwoIPsHaveIndependentBuckets / SameUserDifferentIPsShareBucket /
TwoUsersHaveIndependentBuckets / PerUserBudgetOverride /
EmptyUserKeyTreatedAsAnonymous.
Coverage internal/api/middleware: 82.1% -> 83.7%.
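Illustrative sketch only, not the code in middleware.go: the keying
and the create-on-demand double-check pattern described above might
look roughly like this. The type and function names, the float-based
token bucket and the single shared budget are assumptions for the
example; the per-user budget override is omitted.

```go
package main

import (
	"fmt"
	"math"
	"net"
	"sync"
	"time"
)

type tokenBucket struct {
	mu     sync.Mutex
	tokens float64
	last   time.Time
}

type keyedRateLimiter struct {
	mu      sync.RWMutex
	buckets map[string]*tokenBucket
	rps     float64
	burst   float64
}

func newKeyedRateLimiter(rps, burst float64) *keyedRateLimiter {
	return &keyedRateLimiter{buckets: make(map[string]*tokenBucket), rps: rps, burst: burst}
}

// limiterKey treats an empty user as unauthenticated and keys on the host
// part of RemoteAddr. X-Forwarded-For is deliberately not consulted.
func limiterKey(user, remoteAddr string) string {
	if user != "" {
		return "user:" + user
	}
	host, _, err := net.SplitHostPort(remoteAddr)
	if err != nil {
		host = remoteAddr // no port present; use the value as-is
	}
	return "ip:" + host
}

// bucketFor takes the read lock on the fast path, then re-checks under the
// write lock so two racing callers cannot both create a bucket for one key.
func (k *keyedRateLimiter) bucketFor(key string) *tokenBucket {
	k.mu.RLock()
	b, ok := k.buckets[key]
	k.mu.RUnlock()
	if ok {
		return b
	}
	k.mu.Lock()
	defer k.mu.Unlock()
	if b, ok := k.buckets[key]; ok {
		return b
	}
	b = &tokenBucket{tokens: k.burst, last: time.Now()}
	k.buckets[key] = b
	return b
}

// Allow refills the key's bucket for the elapsed time and spends one token.
func (k *keyedRateLimiter) Allow(key string) bool {
	b := k.bucketFor(key)
	b.mu.Lock()
	defer b.mu.Unlock()
	now := time.Now()
	b.tokens = math.Min(k.burst, b.tokens+now.Sub(b.last).Seconds()*k.rps)
	b.last = now
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

func main() {
	lim := newKeyedRateLimiter(0.001, 1)
	a := limiterKey("", "10.0.0.1:5000")
	b := limiterKey("", "10.0.0.2:5000")
	fmt.Println(a, lim.Allow(a), lim.Allow(a))
	fmt.Println(b, lim.Allow(b))
}
```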
Audit deliverables:
cowork/comprehensive-audit-2026-04-25/audit-report.md: score
25/55 -> 30/55 closed (High 7/9, Medium 7/27 -> 12/27, Low 8/19).
cowork/comprehensive-audit-2026-04-25/findings.yaml: 5 status flips
open -> closed with closure notes citing the Bundle B mechanism.
certctl/CHANGELOG.md: Bundle B section under [unreleased].
Verification:
go test -count=1 -short ./...: all green
staticcheck on changed packages: no new SA*/ST* hits
(the 4 pre-existing SA1019 sites in cmd/server/main_test.go are
Bundle 9 / M-028 partial-closure leftovers tracked in Bundle C)
helm template + helm lint: clean
internal/repository/postgres: setup-fail from sandbox disk pressure;
same failure on master HEAD before this branch, so environmental and
not caused by Bundle B