mirror of
https://github.com/shankar0123/certctl.git
synced 2026-06-07 22:51:30 +00:00
a1c7741e1b
The deploy-vendor-e2e job has been failing with the certctl-test-server container restarting endlessly. Diagnostic dump (added in 3b96b35) finally surfaced the actual cause:

    Failed to load configuration: SCEP profile 0 (PathID="e2eintune") has empty CHALLENGE_PASSWORD — refuse to start (CWE-306: per-profile shared secret is the sole application-layer auth boundary; an empty password would allow any client reaching /scep/e2eintune to enroll a CSR against issuer "iss-local")

Same shape as the encryption-key fix that landed in c4157fd: a config validation gate added in code that the test compose never got updated to satisfy, hidden pre-Phase-5 because the matrix-collapse hadn't yet forced the certctl-server to actually boot in CI.

Root cause is more interesting than just "missing env var." The 2026-04-29 SCEP RFC 8894 + Intune master bundle Phase I added an `e2eintune` SCEP profile to docker-compose.test.yml expecting deploy/test/scep_intune_e2e_test.go to exercise it. That integration test does exist (//go:build integration) but **NO CI job ever selects it** — ci.yml's deploy-vendor-e2e job runs only `-run 'VendorEdge_'` (line 379), and no other job invokes `go test -tags integration` with a SCEP selector. Confirmed via `grep -rnE "scep_intune|SCEPIntune" .github/workflows/` returning empty.

Worse: the supporting fixtures (ra.crt + ra.key + intune_trust_anchor.pem) were documented in deploy/test/fixtures/README.md with the regeneration recipe but never actually committed. Pre-Phase-5 the test stack didn't fully boot the server in CI, so the entire stack of debt — dead config + missing fixtures + no consumer test — sat silent until the matrix collapse forced the boot path.

Fixing this with a fake CHALLENGE_PASSWORD value would silence the immediate validator but leave the real problem in place: maintenance cost on test config that no test exercises. Same critique applies to "let me commit fake fixtures" — the fixtures alone don't add test coverage when no CI job runs the SCEP test.

The complete-path fix is to make the test compose match what CI actually exercises:

- deploy/docker-compose.test.yml: drop CERTCTL_SCEP_ENABLED + the full e2eintune profile env var family (10 lines) + the ./test/fixtures volume mount (1 line). Replace with an in-line comment explaining why SCEP is intentionally disabled and what needs to come back together when SCEP is added to CI for real.
- scripts/ci-guards/test-compose-scep-coherence.sh (new, 22nd guard): refuses any future state where CERTCTL_SCEP_ENABLED=true in test compose without ALL of:
  1. A CI job that runs the SCEP integration test (matched by scep_intune | SCEPIntune | -run [Ss]cep in ci.yml)
  2. The fixture files actually committed (ra.crt, ra.key, intune_trust_anchor.pem)
  3. The ./test/fixtures:/etc/certctl/scep:ro volume mount

  Verified manually with the same pattern as the H-1 guard: clean tree → exit 0; deliberate SCEP_ENABLED=true regression → exit 1 with 5 ::error:: annotations covering each gap; restore → exit 0 again.
- scripts/ci-guards/README.md: 21 → 22 guards, new row.

The fixtures README at deploy/test/fixtures/README.md keeps the regeneration recipe so the eventual SCEP CI job lands cleanly: the operator who adds the SCEP job restores the env vars, regenerates + commits the fixtures, and the guard auto-passes.

Pattern (now firm across this CI-stabilization sequence):

- Pre-existing latent bug
- Old CI structurally hid it (per-vendor matrix, missing boot path)
- Phase-5 matrix collapse + new diagnostic infra exposed it
- Direct fix unblocks today
- Regression guard prevents the same shape of drift forever

Encryption-key (c4157fd) was the same shape; this is its sibling.

98 lines
5.7 KiB
Markdown
# `scripts/ci-guards/` — Regression-guard scripts

Each `<id>.sh` script in this directory keeps one closed audit finding from
regressing. CI runs the full set on every push via the
`Regression guards` step in `.github/workflows/ci.yml`. Operators can
run any script locally:

```bash
bash scripts/ci-guards/G-3-env-docs-drift.sh
```

## Contract

Every script in this directory MUST:

1. Exit with code 0 on a clean repo (no regression present).
2. Exit non-zero on regression, with a `::error::` annotation
   prefix on the message so PR reviewers see the failing line in the
   GitHub Actions UI.
3. **Be runnable from repo root via `bash scripts/ci-guards/<id>.sh`
   with NO arguments and NO env-var requirements.** The CI loop step
   (`for g in scripts/ci-guards/*.sh; do bash "$g"; done`) iterates
   every `.sh` here without args; any script that requires an arg or
   env var WILL fail in that loop.
4. Carry a head-comment block matching the in-source justification
   from the original ci.yml entry: the audit-finding reference, the
   closure rationale, the exempt-surface list (if any).
5. Use `set -e` early to fail fast on internal command errors.
6. Produce no output on the happy path beyond a final
   `echo "<id>: clean."` confirmation line.

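
The contract can be read as a template. Below is a minimal sketch of a guard that follows it. The finding ID `X-000`, the literal `FORBIDDEN_LITERAL`, and the `deploy` scan root are hypothetical and chosen only for illustration; none of them correspond to a real guard in this directory.

```bash
#!/usr/bin/env bash
# X-000-example.sh (hypothetical guard, illustration only)
#
# Finding:  X-000, a forbidden literal re-appearing under deploy/.
# Closure:  the literal was removed; this guard keeps it out.
# Exempt:   none.
set -e

PATTERN='FORBIDDEN_LITERAL'   # hypothetical literal to reject
SCAN_ROOT='deploy'            # hypothetical scan root, relative to repo root

# scan ROOT: print one ::error:: annotation per offending line.
# Returns 0 when nothing matches, 1 otherwise.
scan() {
  # grep exits 1 on "no match" and 2 on a missing path; `|| true` keeps
  # `set -e` from aborting here so the function can decide the exit code.
  hits="$(grep -rn -e "$PATTERN" "$1" 2>/dev/null || true)"
  [ -n "$hits" ] || return 0
  printf '%s\n' "$hits" | while IFS=: read -r file line rest; do
    echo "::error file=${file},line=${line}::X-000 regression: ${PATTERN} found"
  done
  return 1
}

# No arguments, no env vars: everything the guard needs is hard-coded above.
scan "$SCAN_ROOT"
echo "X-000-example: clean."
```

Because `set -e` is active, a non-zero return from `scan` ends the script before the final `echo`, so the happy path prints only the confirmation line and the failing path prints only annotations.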
### Helpers vs guards

Scripts that consume input artifacts (a test-output log, a
`coverage.out` file) or env vars (`PR_NUMBER`, `GH_TOKEN`) are
HELPERS, not guards. They live in `scripts/`, NOT `scripts/ci-guards/`.

Current helpers:

- `scripts/vendor-e2e-skip-check.sh` — consumes `test-output.log`
  arg from the deploy-vendor-e2e job
- `scripts/coverage-pr-comment.sh` — consumes `coverage.out` +
  `PR_NUMBER` + `GH_TOKEN` env from the go-build-and-test job
- `scripts/check-coverage-thresholds.sh` — consumes `coverage.out`
  + `.github/coverage-thresholds.yml`
- `scripts/qa-doc-part-count.sh` + `scripts/qa-doc-seed-count.sh` —
  invoked via `make verify-docs` pre-tag, not in CI

## Adding a new guard

1. Drop a new `<id>.sh` in this directory with the head-comment block
   describing the audit finding it closes.
2. Make it executable: `chmod +x scripts/ci-guards/<id>.sh`.
3. Verify it fails on a deliberate regression and passes on a clean repo.
4. CI auto-picks up new scripts via the `for g in scripts/ci-guards/*.sh`
   loop in the `Regression guards` step — no ci.yml change required.

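
Step 3 is easy to skip, so it can help to make the expected outcome explicit. The following is a small sketch of a verification helper; `expect_guard` is a hypothetical name and is not a script that ships in this repo.

```bash
# expect_guard WANT CMD...: hypothetical helper for step 3.
# WANT is "pass" (exit 0) or "fail" (any non-zero exit); CMD is the guard run.
# The guard's own output is discarded so only the verdict is printed.
expect_guard() {
  want="$1"; shift
  rc=0
  "$@" >/dev/null 2>&1 || rc=$?
  case "$want" in
    pass) [ "$rc" -eq 0 ] && { echo "ok: passed"; return 0; } ;;
    fail) [ "$rc" -ne 0 ] && { echo "ok: failed with exit $rc"; return 0; } ;;
  esac
  echo "unexpected: wanted $want, got exit $rc"
  return 1
}
```

Running `expect_guard pass bash scripts/ci-guards/G-3-env-docs-drift.sh` on a clean tree, then `expect_guard fail ...` after introducing the regression by hand, then `expect_guard pass ...` again after restoring the tree covers the clean, regression, restore sequence.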
## The 22 guards in this directory

| ID | Finding | Catches |
|---|---|---|
| `G-1-jwt-auth-literal` | G-1 JWT silent auth downgrade | `"jwt"` literal in additive auth-type surfaces |
| `L-001-insecure-skip-verify` | L-001 unjustified InsecureSkipVerify | `InsecureSkipVerify: true` without `//nolint:gosec` |
| `H-001-bare-from` | H-001 (CWE-829) tag-swap attack | Bare `FROM` line without `@sha256` digest pin |
| `M-012-no-root-user` | M-012 (CWE-250) container-as-root | Dockerfile missing terminal `USER <non-root>` |
| `H-009-readme-jwt` | H-009 README JWT advertising | README.md re-introducing JWT-as-supported claim |
| `G-2-api-key-hash-json` | G-2 cat-s5-apikey_leak | `api_key_hash` in JSON-emitting surface |
| `U-2-plaintext-healthcheck` | U-2 healthcheck protocol mismatch | Plaintext `http://` in HEALTHCHECK directive |
| `U-3-migration-mount` | U-3 seed initdb schema drift | Migration file mounted into postgres initdb |
| `D-1-D-2-statusbadge-phantom` | D-1 + D-2 dead keys + TS phantoms | StatusBadge dead keys + 5 Certificate / 5 Agent / 1 Issuer / 1 Notification phantom fields |
| `L-1-bulk-action-loop` | L-1 client-side bulk loops | `for ... await triggerRenewal/updateCertificate` in CertificatesPage |
| `B-1-orphan-crud` | B-1 orphan-CRUD client fns | 8 update/create/delete fns lose their page consumer |
| `S-2-strings-contains-err` | S-2 brittle error-dispatch | `strings.Contains(err.Error(), "not found"\|"violates foreign key")` in handlers |
| `G-3-env-docs-drift` | G-3 env-var docs drift | `CERTCTL_*` env var defined OR documented but not both |
| `test-naming-convention` | I-001-extended | `func TestXxx` (lowercase first letter) — Go silently skips |
| `S-1-hardcoded-source-counts` | S-1 stale numeric prose | Hardcoded "N issuer connectors" / "N MCP tools" in README + docs |
| `P-1-documented-orphan-fns` | P-1 documented orphans | 16 read-fn names removed from client.ts exports |
| `T-1-frontend-page-coverage` | T-1 untested frontend pages | New page in `web/src/pages/` without sibling `.test.tsx` and not on the deferred allowlist |
| `bundle-8-L-015-target-blank-rel-noopener` | L-015 (CWE-1022) reverse-tabnabbing | `target="_blank"` without `rel="noopener noreferrer"` |
| `bundle-8-L-019-dangerously-set-inner-html` | L-019 (CWE-79) XSS | `dangerouslySetInnerHTML` outside `safeHtml.ts` |
| `bundle-8-M-009-bare-usemutation` | M-009 + M-029 mutation contract | Bare `useMutation()` outside `useTrackedMutation` wrapper |
| `H-1-encryption-key-min-length` | H-1 closure follow-up (post-Phase-5 surfacing) | `CERTCTL_CONFIG_ENCRYPTION_KEY` literal in any `deploy/docker-compose*.yml` shorter than the 32-byte floor enforced by `internal/config/config.go::Validate()` |
| `test-compose-scep-coherence` | post-Phase-5 surfacing of dead SCEP test config | `CERTCTL_SCEP_ENABLED=true` in test compose without (a) a CI job that runs the SCEP integration test, (b) the `ra.crt` + `ra.key` + `intune_trust_anchor.pem` fixtures committed to `deploy/test/fixtures/`, AND (c) the matching volume mount |

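
The last row combines three conditions, which is easier to follow as code. This is a sketch of that logic only, not the contents of the real `test-compose-scep-coherence.sh`. The `ROOT` parameter is hypothetical (a real guard takes no arguments), the regex used to detect an enabled SCEP setting is an assumption, and fixture presence is checked on disk rather than against git's index.

```bash
# check_scep_coherence ROOT: sketch of the three-way coherence check.
# ROOT is a repo checkout; it is a parameter here only so the sketch can be
# tried against a scratch directory.
check_scep_coherence() {
  root="$1"
  compose="$root/deploy/docker-compose.test.yml"
  ci="$root/.github/workflows/ci.yml"
  fixtures="$root/deploy/test/fixtures"
  gaps=0

  # SCEP not enabled on any non-comment line (or no compose file): nothing
  # to enforce. The pattern is an assumption about how the value is written.
  grep -Eq '^[^#]*CERTCTL_SCEP_ENABLED[=:] *"?true' "$compose" 2>/dev/null || return 0

  # (a) some CI job must select the SCEP integration test.
  if ! grep -Eq -e 'scep_intune|SCEPIntune|-run [Ss]cep' "$ci" 2>/dev/null; then
    echo "::error::SCEP enabled in test compose but no CI job runs the SCEP test"
    gaps=$((gaps + 1))
  fi

  # (b) every fixture must be present.
  for f in ra.crt ra.key intune_trust_anchor.pem; do
    if [ ! -f "$fixtures/$f" ]; then
      echo "::error::missing fixture deploy/test/fixtures/$f"
      gaps=$((gaps + 1))
    fi
  done

  # (c) the fixtures directory must be mounted into the container.
  if ! grep -Fq './test/fixtures:/etc/certctl/scep:ro' "$compose"; then
    echo "::error::missing ./test/fixtures:/etc/certctl/scep:ro volume mount"
    gaps=$((gaps + 1))
  fi

  [ "$gaps" -eq 0 ]
}
```

Each gap gets its own annotation, so enabling SCEP with nothing else in place yields five of them: one for the CI job, three for the fixtures, and one for the mount.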
## Guards explicitly NOT here

- **`QA-doc Part-count drift`** + **`QA-doc seed-count drift`** — these
  protect docs-the-operator-reads, not anything the product depends on.
  Moved to `make verify-docs` (operator runs pre-tag, not on every push).
  See `cowork/ci-pipeline-cleanup-prompt.md` Phase 11.

## Running the full set locally

```bash
for g in scripts/ci-guards/*.sh; do
  echo "=== $(basename "$g") ==="
  bash "$g" || echo " FAILED"
done
```
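
The loop above reports failures but keeps going and does not summarize them. A possible variant, sketched here under the assumption that a single overall exit status is wanted (for example in a pre-push hook), wraps the same loop in a function. `run_guards` and its optional directory argument are hypothetical.

```bash
# run_guards [DIR]: run every *.sh in DIR (default scripts/ci-guards),
# print a tally, and return non-zero if any guard failed.
run_guards() {
  dir="${1:-scripts/ci-guards}"
  failed=0
  for g in "$dir"/*.sh; do
    [ -e "$g" ] || continue   # glob matched nothing
    echo "=== $(basename "$g") ==="
    if ! bash "$g"; then
      echo " FAILED"
      failed=$((failed + 1))
    fi
  done
  echo "$failed guard(s) failed"
  [ "$failed" -eq 0 ]
}
```

Called from repo root as `run_guards`, it behaves like the loop above with one extra summary line and an exit status that reflects the whole set.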