mirror of
https://github.com/shankar0123/certctl.git
synced 2026-06-07 12:41:30 +00:00
master
2 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
02438ad9e1 |
ci: floor raise + doc drift (Phase 3 closure — TEST-H1/H2/M1/M2/M3/M4/L1, ARCH-H3/L1/L2/L3/L4)
Twelve findings from the architecture diligence audit's Phase 3 bundle
closed in one PR. All touch the CI workflows + small doc-drift fixes
across the production Go tree + migration headers.
CI workflow changes
====================
TEST-H1 — Race detection on ./... -short
.github/workflows/ci.yml:106 was a 9-package explicit list. Audit
finding TEST-H1 flagged that 25+ packages (internal/auth/*,
internal/repository/*, internal/mcp, internal/scep, internal/pkcs7,
internal/api/router, internal/api/acme, internal/cli, internal/cms,
internal/config, internal/deploy, internal/integration,
internal/ratelimit, internal/secret, internal/trustanchor, all of
cmd/) silently dropped off race coverage.
Post-fix: 'go test -race -short ./... -count=1 -timeout 600s'.
76 testing.Short() guards already cover testcontainers + live-DB
integration suites, so -short keeps the long-running tests out.
TEST-H2 — Cross-platform build matrix
New 'cross-platform-build' job in ci.yml. Matrix:
ubuntu-latest + windows-latest + macos-latest, fail-fast: false.
Builds cmd/server + cmd/agent + cmd/cli + cmd/mcp-server on each.
Catches Windows-specific regressions (path separators, file
permissions, exec.Command semantics) the pre-Phase-3 Ubuntu-only
CI missed.
TEST-L1 — actions/setup-go cache: true (explicit)
setup-go v5 defaults cache: true; making it explicit so a future
setup-go upgrade can't silently flip it. Re-runs hit the Go module
+ build cache instead of recompiling cold.
TEST-M1 — Mutation-testing floor at 55%
security-deep-scan.yml::go-mutesting step rewritten. Removed
continue-on-error + per-package '|| true'. New post-loop check
extracts every 'The mutation score is X.YZ' line and fails the
step if any package drops below 0.55. Floor rationale: starter
ratio catches major regressions without rejecting the audit's
'this is OK' steady state; raise quarterly.
TEST-M2 — 3 advisory deep-scan gates promoted to blocking
Removed continue-on-error: true from:
- gosec (filtered to G201/G202/G304/G108 high-signal rules:
SQL-injection + path-traversal + pprof-exposed)
- osv-scanner (multi-ecosystem CVE; complements govulncheck
which is already blocking in ci.yml)
- trivy image scan (--severity HIGH,CRITICAL --exit-code 1)
continue-on-error count: 15 → 11.
ZAP / schemathesis / nuclei / testssl stay advisory because their
false-positive rates on https://localhost:8443-targeted DAST runs
are high.
TEST-M3 — Playwright harness stub
web/package.json adds '@playwright/test' devDep + 'e2e' / 'e2e:install'
npm scripts. web/playwright.config.ts ships single chromium project
with webServer block pointing at 'npm run dev'. web/src/__tests__/
e2e/smoke.spec.ts proves the harness wires through. The full 15-flow
suite ships in frontend-design-audit Phase 8 (TEST-H1 in THAT audit);
this is the wiring + a single smoke test as the regression floor.
New Makefile target: 'make e2e-test'.
Doc/code drift fixes
====================
TEST-M4 + ARCH-L2 — Skip inventory artifact + CI guard
scripts/skip-inventory.sh walks every t.Skip site under cmd/ +
internal/ + deploy/test/ and emits docs/testing/skip-inventory.md
grouped by package with file:line:expression triples. Current
inventory: 142 t.Skip sites, 76 testing.Short() guards.
scripts/ci-guards/skip-inventory-drift.sh regenerates and fails on
diff (excluding the 'Last reviewed' timestamp line which drifts
daily). The Markdown is the canonical acquisition-diligence artifact
for 'what tests are being skipped and why.'
ARCH-H3 — MCP catalogue floor reconciliation
Audit framing was '121 vs floor 150 — doc/code drift.' Live count
via the test's actual regex over all 5 tool files (tools.go +
tools_audit_fix.go + tools_auth.go + tools_auth_bundle2.go +
tools_est.go): 155 unique 'Name: "certctl_*"' declarations.
Pre-Phase-3 audit measured tools.go in isolation (121) and missed
the other 4 files (+34 unique names). The test at
internal/ciparity/surface_parity_test.go::TestSurfaceParity_MCP
passes today (155 ≥ 150). Added a clarifying comment near
mcpBaselineFloor explaining the measurement scope so future
reviewers don't repeat the audit's framing error.
STATUS: stale — no code drift, just a measurement scoping error in
the audit.
ARCH-L1 — panic() rationale comments
5 panic sites in production Go (excluding _test.go):
- internal/repository/postgres/tx.go:84
- internal/service/issuer.go:861 (mustJSON)
- internal/service/est.go:728 (mustParseTime)
- internal/service/acme.go:1288 (rand source failure — already documented)
- internal/pkcs7/certrep.go:270 (OID marshal — already documented)
Added ARCH-L1 rationale comments to the 3 sites that didn't have
them. All 5 are defensible impossible-path / rethrow / hardcoded-
constant guards.
ARCH-L3 — Migration IF-NOT-EXISTS carve-outs
4 migrations skip the literal 'IF NOT EXISTS' token but ARE
idempotent via different Postgres patterns:
- 000014_policy_violation_severity_check.up.sql: ALTER TABLE
ADD CONSTRAINT CHECK doesn't accept IF NOT EXISTS; idempotency
via DROP CONSTRAINT IF EXISTS preamble.
- 000018_audit_events_worm.up.sql: CREATE OR REPLACE FUNCTION
+ DROP TRIGGER IF EXISTS + CREATE TRIGGER + DO $$ pg_roles
existence check. CREATE TRIGGER doesn't take IF NOT EXISTS.
- 000030_rbac_admin_perms.up.sql: INSERT ... ON CONFLICT DO NOTHING.
- 000039_audit_crit1_perms.up.sql: same INSERT + ON CONFLICT pattern.
Added ARCH-L3 header comments to each explaining the carve-out so
reviewers don't flag the missing literal token.
STATUS: largely stale — migrations are already idempotent.
ARCH-L4 — TODO/FIXME → see #<descriptor>
5 TODOs rewritten to the allowed 'see #<descriptor>' pattern:
- internal/repository/postgres/auth.go:220 → see #bundle-2-scope-fk
- internal/connector/discovery/gcpsm/gcpsm.go:547 → see #gcpsm-pagination
- internal/service/audit.go:244 → see #audit-pagination-count
- internal/service/job.go:295, 299 → see #validation-job-impl
New CI guard scripts/ci-guards/no-todo-in-prod.sh grep-fails any
new TODO/FIXME in cmd/ + internal/ (excluding _test.go); allows
'see #N' / 'see #<descriptor>' patterns.
Sandbox limitation
==================
The 6.1 GB certctl working tree fills the sandbox volume; go1.25.10
toolchain download fails with 'no space left on device' (sandbox has
1.25.9; go.mod requires 1.25.10). Local 'go test' / 'go build' NOT
run in this commit. Operator must run 'make verify' on their
workstation before push per CLAUDE.md operating rules.
The smoke.spec.ts NOT executed in the sandbox (no chromium installed).
Operator runs 'cd web && npm install && npx playwright install
--with-deps chromium && npm run e2e' on first wire-up.
All CI guards (no-todo-in-prod, skip-inventory-drift, G-3
env-docs-drift, doc-rot-detector, and every existing guard) verified
clean by running each individually.
Closes: cowork/certctl-architecture-diligence-audit.html#fix-TEST-H1,
cowork/certctl-architecture-diligence-audit.html#fix-TEST-H2,
cowork/certctl-architecture-diligence-audit.html#fix-TEST-M1,
cowork/certctl-architecture-diligence-audit.html#fix-TEST-M2,
cowork/certctl-architecture-diligence-audit.html#fix-TEST-M3,
cowork/certctl-architecture-diligence-audit.html#fix-TEST-M4,
cowork/certctl-architecture-diligence-audit.html#fix-TEST-L1,
cowork/certctl-architecture-diligence-audit.html#fix-ARCH-H3,
cowork/certctl-architecture-diligence-audit.html#fix-ARCH-L1,
cowork/certctl-architecture-diligence-audit.html#fix-ARCH-L2,
cowork/certctl-architecture-diligence-audit.html#fix-ARCH-L3,
cowork/certctl-architecture-diligence-audit.html#fix-ARCH-L4
|
||
|
|
1d6c7a0552 |
fix(bundle-6): Audit Integrity + Privacy — 3 audit findings closed
Closes Audit-2026-04-25 H-008 (High), M-017 (Medium), M-022 (Medium).
Hardens audit-trail tamper-resistance + minimizes PII leakage in one
cohesive change, with both controls applying automatically and no
operator action required at install time.
What changed
- internal/service/audit_redact.go (NEW) — RedactDetailsForAudit:
* credentialKeys deny-list (api_key, password, *_pem, eab_secret, ...)
* piiKeys deny-list (email, phone, ssn, name, address, ip_address, ...)
* case-insensitive key match; recurses into nested maps + arrays
* mutation-free; surfaces redacted_keys array for operator visibility
* nil/empty input → nil out (preserves pre-Bundle-6 behaviour)
- internal/service/audit.go — RecordEvent now routes details through
RedactDetailsForAudit BEFORE marshaling. No call-site changes required.
- internal/service/audit_redact_test.go (NEW) — full coverage:
* credential keys (~30 entries)
* PII keys (~20 entries)
* nested maps + arrays
* case-insensitivity
* mutation-free invariant
* JSON round-trip (catches type-assertion regressions)
* scalar pass-through (no panic on int/bool/nil)
- migrations/000018_audit_events_worm.up.sql (NEW) — DB-level WORM:
* BEFORE UPDATE OR DELETE trigger raises check_violation with
diagnostic citing the rationale + compliance-superuser hint
* REVOKE UPDATE,DELETE ON audit_events FROM certctl (defence-in-depth)
* REVOKE wrapped in pg_roles existence check so test fixtures
without the certctl role stay idempotent
- migrations/000018_audit_events_worm.down.sql (NEW) — clean teardown
for dev resets; not for production use.
- internal/repository/postgres/audit_worm_test.go (NEW, testcontainers,
-short gated) — INSERT succeeds; UPDATE + DELETE fail with
check_violation; second INSERT after blocked modification still
succeeds (no trigger-state corruption).
- docs/compliance.md — new section "Audit-Trail Integrity & Privacy
(Bundle 6)" with verification psql snippet, compliance-superuser
pattern (NOT auto-created), redactor before/after example, and a
maintenance note for adding new credential keys.
Compliance mapping
- H-008 (CWE-532 Insertion of Sensitive Information into Log File)
- M-017 (HIPAA Technical Safeguards §164.312(b) — audit controls)
- M-022 (GDPR Art. 32 — data minimization)
Threat model: TB-3 (audit log tampering), TB-1 (operator/orchestrator).
Verification
- go vet ./... → clean
- go build ./... → clean
- go test -short -count=1 ./... → all packages pass
- go test -count=1 -run TestRedactDetailsForAudit ./internal/service/...
→ all pass
- (testcontainers, gated by -short) audit_worm_test.go pins WORM contract
- npx tsc --noEmit (web) → clean (no frontend changes)
- python3 yaml.safe_load(api/openapi.yaml) → 89 paths
Backward compatibility
- Trigger applies forward only — existing rows unchanged.
- nil/empty details from RecordEvent callers → nil out (preserves prior
behaviour for the many existing call sites that pass nil).
- Compliance superusers (provisioned out-of-band) bypass the trigger.
Bundle 6 of the 2026-04-25 comprehensive audit.
|