docs(contributor): document the Auditable Codebase Bundle guards

Three doc changes for the bundle's discoverability:

1. New docs/contributor/ci-guards.md (185 lines)
   Entry-point doc for new contributors. Explains the four categories of guards (code-shape, contract-parity, build/dep, operational), the discipline that keeps them honest (allowlist + expiration), and how to add a new one. Cross-references scripts/ci-guards/README.md for the exhaustive list.

2. scripts/ci-guards/README.md: added a 'Forward-looking guards' subsection naming complete-path-config-coverage, doc-rot-detector, and cold-db-compose-smoke with their item references + a one-sentence description of what each catches. Replaced the stale '22 guards' header with 'Count: re-derive via ls' per the no-version-stamped-numbers convention from CLAUDE.md.

3. docs/README.md: wired ci-guards.md into the Contributor section navigation table.

Bumped 'Last reviewed:' to 2026-05-12 on the two docs touched (docs/README.md, docs/contributor/ci-pipeline.md).

Verified: doc-rot-detector.sh green at 91 docs scanned, 89 dated, 0 warns, 0 fails.

Audit-Closes: post-v2.1.0-anti-rot/item-1
Audit-Closes: post-v2.1.0-anti-rot/item-2
Audit-Closes: post-v2.1.0-anti-rot/item-5
Audit-Closes: post-v2.1.0-anti-rot/item-6
2026-07-28 12:18:59 +00:00 · 2026-05-12 14:15:13 +00:00
parent ae9e55e860
commit 1ec920e75e
4 changed files with 103 additions and 3 deletions
@@ -1,6 +1,6 @@
 # certctl Documentation

-> Last reviewed: 2026-05-05
+> Last reviewed: 2026-05-12

 The full docs index, organized by audience. Pick the section that matches what you need to do; each link below opens a focused doc rather than a wall of text.

@@ -112,6 +112,7 @@ You're contributing to certctl, running tests locally, or trying to understand t
 | [GUI QA checklist](contributor/gui-qa-checklist.md) | Manual GUI verification pass for release |
 | [Release sign-off](contributor/release-sign-off.md) | Release-day checklist — code state, automated gates, manual QA, artefact verification |
 | [CI pipeline](contributor/ci-pipeline.md) | CI shape, regression guards, adding new checks |
+| [CI guards](contributor/ci-guards.md) | Per-class CI guards (code-shape, contract-parity, build/dep, operational); how to add one |

 ## Archive

@@ -0,0 +1,83 @@
+# CI guards
+
+> Last reviewed: 2026-05-12
+
+CI guards are small scripts (shell + Python) and Go tests that pin invariants the v2 audit history showed are easy to lose. Each one runs on every push, fails the build on regression with a useful error message, and prints at most a one-line confirmation on the happy path. The canonical source is `scripts/ci-guards/` for shell guards and `internal/ciparity/` for Go-based parity tests.
+
+This page lives at `docs/contributor/ci-guards.md` and is the entry point for contributors who want to understand why a CI step is red, how to add a new guard, or where the allowlist for a given guard lives. The exhaustive list of shell guards is at `scripts/ci-guards/README.md`; this doc explains the categories + the discipline.
+
+## Why guards exist
+
+Two failure modes the v2 audit cycle surfaced repeatedly:
+
+The codebase grew faster than the docs and config could keep up. Env vars got added without consumers; OpenAPI ops were registered without router routes; docs went stale; a migration broke on cold-DB without any test catching it. Each one of those classes has a one-time-fix _per-instance_ pattern (re-read the doc, wire the env var) and a structural _per-class_ pattern (write a guard that fails the next time it happens). CI guards are the second.
+
+The team grew. Reviewers had to remember what each commit author had forgotten. CI guards externalize the institutional knowledge into checks — the build refuses to ship the lying field, the stale doc, the broken migration. New contributors don't need to know the audit history.
+
+## Categories
+
+The guards fall into four buckets, organized by what they pin:
+
+### Code-shape guards
+
+Catch defects in source files BEFORE they ship. Examples: `G-3-env-docs-drift.sh` (no env var defined-but-undocumented or documented-but-undefined), `complete-path-config-coverage.sh` (every env var has a non-config consumer), `T-1-frontend-page-coverage.sh` (every new GUI page has a sibling test file).
+
+### Contract-parity guards
+
+Catch drift across the four product surfaces — OpenAPI spec, HTTP router, MCP tool catalogue, CLI verb dispatcher. The router ↔ OpenAPI pin lives at `internal/api/router/openapi_parity_test.go::TestRouter_OpenAPIParity`. The MCP + CLI sweep lives at `internal/ciparity/surface_parity_test.go` (post-v2.1.0 anti-rot item 2). One hard gate: the MCP tool count cannot regress below `mcpBaselineFloor`. The CLI parity sweep is informational until the CLI surface stabilizes.
+
+### Build / dependency guards
+
+`H-001-bare-from.sh` (Dockerfile pin to `@sha256:`), `digest-validity.sh` (every digest actually resolves on the registry), `M-012-no-root-user.sh` (no Dockerfile ends as root), `bundle-8-*.sh` (frontend XSS / reverse-tabnabbing surface). These come out of specific audits and pin the closure.
+
+### Operational guards
+
+`cold-db-compose-smoke.sh` (wipe postgres volume, bring stack up cold, issue/renew/revoke, audit-row check). `doc-rot-detector.sh` (every doc reviewed within 120 days). These pin the operational reality, not the source shape.
+
+## When the build is red
+
+Find the failing step in the GitHub Actions UI. Every guard's output starts with the guard's own identifier and ends with one of:
+
+`::error::<one-line description of the regression>` followed by 2-4 remediation paths. The fastest path: read the remediation list, pick the option that fits, fix.
+
+`exit 1` without an `::error::` annotation: most likely a `set -e` trap on an internal command. Re-run with `bash -x scripts/ci-guards/<id>.sh` locally to see where it died.
+
+If a guard is fundamentally wrong (e.g., refactor moved the code it scans), update the guard in the same PR that triggered the failure. Don't add a one-off allowlist to silence a real bug.
+
+## Adding a new guard
+
+The discipline in five steps. The first three are non-negotiable; the last two are courtesy.
+
+1. Drop a new `<id>.sh` in `scripts/ci-guards/` with a head-comment block that names the bug class, lists the audit finding (if any) it closes, and explains the failure mode. Mirror the shape of an existing guard: `G-3-env-docs-drift.sh` and `digest-validity.sh` are the canonical bash+Python and pure-bash examples, respectively.
+
+2. Use `set -e` early; use `::error::` annotations on regression; exit 0 with one happy-path confirmation line. Take no arguments and require no env vars, because the CI loop iterates every `*.sh` without args.
+
+3. Write the allowlist file alongside (`<id>-exceptions.yaml`). Each entry is keyed by `path:` or `name:` and carries `justification:` and `expires:` fields. Make `expires` a required field: every exception has a hard expiration date, typically 90 days out.
+
+4. Verify on a deliberately broken state: introduce the regression, confirm the guard fires with a useful message, revert, confirm green. Capture the negative-test output in your PR description.
+
+5. Add a row to `scripts/ci-guards/README.md`. The CI loop auto-picks up the new file, so no `ci.yml` edit is required unless the guard needs Docker (in which case it gets its own dedicated job; see `cold-db-compose-smoke` for the pattern).
+
+## Discipline: the allowlist trap
+
+Allowlists are dangerous. They start as a small concession ("this one env var is documented for an external script, not consumed by Go code") and become a junk drawer of unverified exemptions that mask real defects. The discipline that keeps that from happening:
+
+Every entry MUST carry a `justification:` field with a one-line reason. "Tech debt" is not a reason; "documented contract surface consumed by the ACME DNS-01 helper script — see `deploy/test/acme/dns01-export.sh`" is.
+
+Every entry MUST carry an `expires:` field with a hard date, typically 90 days out. The guards reject entries past their expiration. When an entry expires, the only paths forward are (a) close the underlying gap so the entry is no longer needed, (b) re-justify with a fresh expiration. Both force a real review.
+
+If you're adding more than one entry to an allowlist in a single PR, that's a smell — usually the underlying class needs a small refactor, not three allowlist rows.
+
+## Where the bundles live
+
+The `Audit-Closes:` commit trailer convention (post-v2.1.0 anti-rot item 4) is the cross-reference between audit findings and the commits that closed them. Re-derive the closure history of any audit with:
+
+    git log --grep='Audit-Closes: <audit-id>'
+
+The audit folder structure under `cowork/` (workspace-local; not in this repo) carries the per-audit RESULTS.md + findings.yaml. CLAUDE.md's "Audit closures" subsection is the current-state index of which audits are open vs closed.
+
+## Related
+
+- The exhaustive guard list: `scripts/ci-guards/README.md`.
+- The CI pipeline architecture: `docs/contributor/ci-pipeline.md`.
+- The QA test suite: `docs/contributor/qa-test-suite.md`.
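
As a rough illustration of the allowlist discipline described in the new `ci-guards.md` above (not part of this commit), the sketch below checks already-parsed exception entries for a non-empty `justification` and an unexpired `expires` date. The field names come from the doc; the function name `check_entries`, the error wording, the example entries, and the choice to treat the expiry day itself as still valid are assumptions made for illustration. The real guards may implement this differently.

```python
from datetime import date


def check_entries(entries, today):
    """Return a list of error strings; an empty list means the allowlist is clean."""
    errors = []
    for i, entry in enumerate(entries):
        # Entries are keyed by either `path` or `name`, per the doc.
        label = entry.get("path") or entry.get("name") or f"entry #{i}"

        justification = str(entry.get("justification", "")).strip()
        if not justification:
            errors.append(f"{label}: missing justification")

        expires = entry.get("expires")
        if expires is None:
            errors.append(f"{label}: missing expires")
            continue
        try:
            # str() also covers YAML loaders that hand back a date object.
            expiry = date.fromisoformat(str(expires))
        except ValueError:
            errors.append(f"{label}: expires is not a YYYY-MM-DD date")
            continue
        if expiry < today:  # assumption: the expiry day itself still passes
            errors.append(f"{label}: expired on {expiry.isoformat()}")
    return errors


if __name__ == "__main__":
    # Hypothetical entries, shaped like rows of an `<id>-exceptions.yaml` file.
    entries = [
        {"name": "CERTCTL_EXAMPLE_VAR",
         "justification": "consumed by an external helper script",
         "expires": "2026-08-10"},
        {"path": "docs/example.md", "justification": "", "expires": "2026-01-01"},
    ]
    for line in check_entries(entries, today=date(2026, 5, 12)):
        print(f"::error::{line}")
```

Passing `today` in explicitly, rather than reading the clock inside the function, keeps the check reproducible in tests. The second entry is flagged twice: once for the empty justification and once for the past expiry date.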
@@ -1,6 +1,6 @@
 # CI Pipeline — Operator Guide

-> Last reviewed: 2026-05-05
+> Last reviewed: 2026-05-12

 > Authoritative guide to certctl's CI pipeline shape.
 > Per the ci-pipeline-cleanup spec, Phase 12.
@@ -53,7 +53,11 @@ Current helpers:
 4. CI auto-picks up new scripts via the `for g in scripts/ci-guards/*.sh`
   loop in the `Regression guards` step — no ci.yml change required.

-## The 22 guards in this directory
+## Guards in this directory
+
+Count: re-derive on demand via `ls scripts/ci-guards/*.sh | wc -l`. The table below names each one — keep it in sync as guards are added.
+
+### Per-finding regression guards

 | ID | Finding | Catches |
 |---|---|---|
@@ -80,6 +84,18 @@ Current helpers:
 | `H-1-encryption-key-min-length` | H-1 closure follow-up (post-Phase-5 surfacing) | `CERTCTL_CONFIG_ENCRYPTION_KEY` literal in any `deploy/docker-compose*.yml` shorter than the 32-byte floor enforced by `internal/config/config.go::Validate()` |
 | `test-compose-scep-coherence` | post-Phase-5 surfacing of dead SCEP test config | `CERTCTL_SCEP_ENABLED=true` in test compose without (a) a CI job that runs the SCEP integration test, (b) the `ra.crt` + `ra.key` + `intune_trust_anchor.pem` fixtures committed to `deploy/test/fixtures/`, AND (c) the matching volume mount |

+### Forward-looking guards (Auditable Codebase Bundle, post-v2.1.0 anti-rot)
+
+These guards catch defect classes BEFORE they turn into audit findings: they pin invariants on the codebase that the v2.0 audit history showed are easy to lose.
+
+| ID | Item | Catches |
+|---|---|---|
+| `complete-path-config-coverage` | post-v2.1.0 / item-1 | "Lying field" — `CERTCTL_*` env var defined in `internal/config/config.go` that no consumer outside `internal/config/` actually reads. Operator-facing config that the docs claim works but the code never honors. Companion Go test at `internal/config/coverage_test.go`. |
+| `doc-rot-detector` | post-v2.1.0 / item-5 | Docs older than 90 days warn (yellow), older than 120 days fail (red). Uses HEAD commit timestamp for reproducibility. `docs/archive/` allowlisted in bulk. |
+| `cold-db-compose-smoke` | post-v2.1.0 / item-6 | Migration-on-cold-DB regression (canonical: 2026-05-09 migration 000045 broken INSERT, commit `6444e13`). Wipes the postgres volume, brings the stack up cold, issue/renew/revoke + 3 audit rows. **Runs in its own GitHub Actions job** (`cold-db-compose-smoke`), NOT the generic regression-guards loop — needs Docker. |
+
+The fourth Bundle artifact (`internal/ciparity/`) is a set of Go tests, not shell guards, so it runs under the standard Go test step. It pins the MCP tool catalogue floor and naming convention, and reports CLI/MCP/OpenAPI surface counts as a trend metric.
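
As an annotation on the `doc-rot-detector` row in the table above (not part of this commit), the sketch below shows one way the 90-day warn and 120-day fail thresholds could be applied to a doc's `> Last reviewed:` stamp. The thresholds and the stamp format appear in this diff; the function names, the strict "older than" comparison, and the assumption that the detector reads this stamp are illustrative guesses rather than the script's actual logic. The reference date is passed in to mirror the row's note about using the HEAD commit timestamp instead of the wall clock.

```python
import re
from datetime import date, timedelta

WARN_AFTER_DAYS = 90   # table row: older than 90 days warns
FAIL_AFTER_DAYS = 120  # table row: older than 120 days fails

# Matches the "> Last reviewed: YYYY-MM-DD" stamp used by the docs in this diff.
STAMP = re.compile(r"^> Last reviewed: (\d{4}-\d{2}-\d{2})\s*$", re.MULTILINE)


def last_reviewed(text):
    """Return the stamped review date, or None when the doc carries no stamp."""
    match = STAMP.search(text)
    return date.fromisoformat(match.group(1)) if match else None


def classify(reviewed, reference):
    """Classify a doc by its age in days relative to a fixed reference date."""
    age = (reference - reviewed).days
    if age > FAIL_AFTER_DAYS:
        return "fail"
    if age > WARN_AFTER_DAYS:
        return "warn"
    return "ok"


if __name__ == "__main__":
    # Hypothetical reference date standing in for the HEAD commit date.
    head_date = date(2026, 5, 12)
    doc = "# certctl Documentation\n\n> Last reviewed: 2026-05-05\n"
    print(classify(last_reviewed(doc), head_date))
    print(classify(head_date - timedelta(days=100), head_date))
    print(classify(head_date - timedelta(days=121), head_date))
```

Run as a script, the three calls classify a 7-day-old, a 100-day-old, and a 121-day-old doc as `ok`, `warn`, and `fail` respectively.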
+
 ## Guards explicitly NOT here

 - **`QA-doc Part-count drift`** + **`QA-doc seed-count drift`** — these