ci(arch-h1): Phase 13 Sprint 13.7 — tighten rest-deferred floor from monotonic-decrease to hard zero-exact pin; close ARCH-H1 + ARCH-M1

Closure commit for Phase 13 (ARCH-H1 OpenAPI ↔ handler gap + ARCH-M1
per-process rate-limit ceiling). Tightens the parity-script CI guard
to a HARD zero-exact pin on the rest-deferred bucket: any future PR
adding a new REST route MUST author its OpenAPI op or fail CI.
The `category: rest-deferred` escape hatch is now closed for good.

The sibling monotonic-decrease guard
(openapi-rest-deferred-monotonic.sh) stays in tree as
belt-and-suspenders; both must hold. The monotonic guard catches
baseline-drift accidents (e.g. an operator edits the baseline up
without surfacing the rationale); this guard catches the underlying
rest-deferred bucket re-growing at all.
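
A minimal sketch of why the zero pin is stricter than a baseline comparison.
The function names are hypothetical, and the monotonic rule shown here
(count must not exceed the checked-in baseline) is an assumption about the
sibling script, not a copy of it:

```python
def monotonic_ok(rest_deferred: int, baseline: int) -> bool:
    # ASSUMED semantics of the sibling guard: the bucket may not
    # exceed the checked-in baseline value.
    return rest_deferred <= baseline


def zero_pin_ok(rest_deferred: int) -> bool:
    # Hard zero-exact pin: the bucket must be empty, whatever the baseline says.
    return rest_deferred == 0


# (rest_deferred count, baseline file value)
scenarios = {
    "closed state": (0, 0),
    "new entry, baseline untouched": (1, 0),
    "new entry, baseline raised in the same PR": (1, 1),
}

for name, (count, baseline) in scenarios.items():
    print(f"{name}: monotonic={monotonic_ok(count, baseline)} "
          f"zero_pin={zero_pin_ok(count)}")
```

Under these assumptions, the third scenario is the one a baseline comparison
alone would let through; the zero pin rejects it.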

Phase 13 commit chain (six prior commits, ordered):

  67f346cd  Sprint 13.1  — two-bucket exception categorization +
                          monotonic guard (rest-deferred=28 baseline,
                          wire-protocol=36, fail-on-drift)
  c8347d74  Sprint 13.2  — ARCH-M1 Postgres sliding-window limiter
                          (SELECT FOR UPDATE arbitration) + migration
                          000046 rate_limit_buckets + falsifiable
                          multi-replica integration test
                          (TestRateLimit_PostgresBackend_CapEnforcedAcrossReplicas:
                          100 concurrent allows across 3 limiters,
                          cap=10 → exactly 10 succeed /
                          90 ErrRateLimited)
  a41fc2d7  Sprint 13.3  — backend selector
                          (CERTCTL_RATE_LIMIT_BACKEND={memory|postgres})
                          + scheduler janitor sweeping
                          updated_at<NOW()-maxWindow + helm chart wiring
                          + docs/operator/observability.md operator
                          decision tree
  952682eb  Sprint 13.4  — OpenAPI authoring batch 1 (13 ops + 8
                          schemas: sessions cluster + OIDC CRUD + JWKS
                          + test + refresh + group-mappings).
                          rest-deferred 28 → 15.
  9135c449  Sprint 13.5  — OpenAPI authoring batch 2 (8 ops + 5
                          schemas: breakglass admin + users +
                          runtime-config). rest-deferred 15 → 7.
  29cb13e7  Sprint 13.6  — OpenAPI authoring batch 3, final (7 ops +
                          2 schemas: audit/export + demo-residual +
                          auth/logout + breakglass/login + 3 OIDC
                          browser flows modeled as 302+Location).
                          rest-deferred 7 → 0. ARCH-H1 substantive
                          close.
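
The cap-across-replicas property from Sprint 13.2 can be illustrated without
a database. The sketch below is a simulation, not the shipped limiter: a
`threading.Lock` stands in for the row lock that SELECT FOR UPDATE would
take, an in-memory dict stands in for rate_limit_buckets, and every class and
function name is hypothetical.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class SharedBucketStore:
    """Stand-in for the shared table; the lock plays the role of the row lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._hits = {}  # key -> timestamps of allowed calls

    def try_record(self, key, cap, window_s, now):
        with self._lock:  # only one caller arbitrates a bucket at a time
            recent = [t for t in self._hits.get(key, []) if t > now - window_s]
            allowed = len(recent) < cap
            if allowed:
                recent.append(now)
            self._hits[key] = recent
            return allowed


class Limiter:
    """One instance per simulated replica; all replicas share one store."""

    def __init__(self, store, cap, window_s):
        self.store, self.cap, self.window_s = store, cap, window_s

    def allow(self, key):
        return self.store.try_record(key, self.cap, self.window_s, time.monotonic())


def run_simulation(replicas=3, calls=100, cap=10):
    store = SharedBucketStore()
    limiters = [Limiter(store, cap, window_s=60.0) for _ in range(replicas)]
    with ThreadPoolExecutor(max_workers=16) as pool:
        results = list(pool.map(
            lambda i: limiters[i % replicas].allow("client-a"), range(calls)))
    return results.count(True), results.count(False)


if __name__ == "__main__":
    allowed, limited = run_simulation()
    print(f"allowed={allowed} limited={limited}")
```

Because every replica arbitrates through the same locked bucket, the cap is
global rather than per process, which is the ceiling ARCH-M1 describes. With
a purely in-process store, each of the three limiters would admit its own 10.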

Sprint 13.7 deliverables (this commit):

  • scripts/ci-guards/openapi-handler-parity.sh: append inline
    hard zero-exact check after the bucket-counts report. Fails CI
    immediately on any rest-deferred entry, enumerating offenders
    with the suggested-fix narrative.
  • Header docstring updated to reflect post-Sprint-13.7 state:
        220 router routes
        186 OpenAPI operations
         36 documented exceptions (36 wire-protocol + 0 rest-deferred)
          0 unaccounted router routes

Falsifiable closure proofs (re-run in CI on every PR):

  $ bash scripts/ci-guards/openapi-handler-parity.sh
    Router routes:                  220
    OpenAPI operations:             186
    Documented exceptions:          36
      wire-protocol:                36
      rest-deferred:                0
    openapi-handler-parity: clean.

  $ bash scripts/ci-guards/openapi-rest-deferred-monotonic.sh
    openapi-rest-deferred-monotonic: clean — rest-deferred = 0,
    baseline = 0.

  $ cat api/openapi-handler-exceptions-baseline.txt
    0

Negative test (synthetic rest-deferred entry, restored after):

  $ # append GET /scep with category: rest-deferred …
  $ bash scripts/ci-guards/openapi-handler-parity.sh
    ::error::rest-deferred bucket is non-empty (1 entries) —
    Phase 13 Sprint 13.7 closure pins this at zero.
    Offending entries: GET /scep
    exit 1   ← guard fails correctly
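
The negative test above can be mirrored on an in-memory document. This is a
sketch under assumptions: the keys `documented_exceptions`, `route`,
`category` and `why` follow the guard's Python block, while the function
names, the wire-protocol example entry, and the message wording are
hypothetical.

```python
def rest_deferred_offenders(exc_doc):
    """Routes whose exception entry is categorized rest-deferred."""
    return [
        entry["route"]
        for entry in (exc_doc.get("documented_exceptions") or [])
        if entry.get("category") == "rest-deferred"
    ]


def check_zero_pin(exc_doc):
    """Return (exit_code, report_lines) for the zero-exact pin."""
    offenders = rest_deferred_offenders(exc_doc)
    if not offenders:
        return 0, ["rest-deferred: 0"]
    lines = [f"rest-deferred bucket is non-empty ({len(offenders)} entries)",
             "Offending entries:"]
    lines += [f"  {route}" for route in offenders]
    return 1, lines


# Hypothetical baseline document with one wire-protocol exception.
clean = {"documented_exceptions": [
    {"route": "GET /example-wire-route", "category": "wire-protocol",
     "why": "placeholder RFC anchor"},
]}
# Same document plus the synthetic entry used in the negative test.
synthetic = {"documented_exceptions": clean["documented_exceptions"] + [
    {"route": "GET /scep", "category": "rest-deferred"},
]}

for doc in (clean, synthetic):
    code, report = check_zero_pin(doc)
    print(code, report)
```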

  $ gofmt -l .
    (no output — clean)

Findings flipped to ✓ Shipped in
cowork/certctl-architecture-diligence-audit.html:

  • ARCH-H1 — OpenAPI surface diverges from REST handlers
    (commit chain 67f346cd + 952682eb + 9135c449 + 29cb13e7)
  • ARCH-M1 — Per-process rate limiter caps single instance only
    (commit chain c8347d74 + a41fc2d7)

Progress widget: 46 / 56 findings shipped (82%) + 2 scaffolded.
The remaining 8 open findings are v3-scope strategic items
(multi-tenancy, EAB/External Account Binding, cluster coordination
primitives) — explicitly out of v2.2 scope per audit triage.

OPERATOR ACTION REQUIRED (one toggle, no code change):

  Promote TestRateLimit_PostgresBackend_CapEnforcedAcrossReplicas
  in deploy/test/integration_test.go to a required status check in
  GitHub branch-protection settings for master. Code-side wiring
  (.github/workflows/ci.yml) is done; the missing piece is the
  GitHub Settings → Branches → Branch protection rules toggle.
  Without that toggle, the test runs on every PR but isn't gating.

  After flipping the toggle, ARCH-M1 closure is fully load-bearing
  at the CI gate — a regression in the Postgres sliding-window
  backend (e.g. a future refactor that breaks SELECT FOR UPDATE
  arbitration) cannot reach master.
shankar0123
2026-05-14 13:06:57 +00:00
parent 29cb13e7a2
commit 155f1fec98
+43 -10
@@ -14,20 +14,31 @@
 # (openapi-rest-deferred-monotonic.sh) against a checked-in baseline
 # at api/openapi-handler-exceptions-baseline.txt.
 #
-# Current state (2026-05-14):
+# Current state (post-Sprint-13.7 / 2026-05-14):
 # 220 r.Register / r.mux.Handle call sites in internal/api/router/router.go
-# 158 operationIds in api/openapi.yaml
-# 64 documented exceptions (36 wire-protocol + 28 rest-deferred)
+# 186 operationIds in api/openapi.yaml
+# 36 documented exceptions (36 wire-protocol + 0 rest-deferred)
 # 0 unaccounted router routes — guard passes clean today.
 #
-# Sprints 13.4-13.6 drive rest-deferred to zero by authoring OpenAPI ops
-# for the 28 REST-shaped routes; each batch deletes the corresponding
-# exception entries + bumps the baseline file downward. Sprint 13.7
-# tightens this guard's rest-deferred floor from "monotonic-decrease"
-# (sibling guard) to a hard zero-exact pin (this guard).
+# Sprints 13.4-13.6 drove rest-deferred to zero by authoring 28 OpenAPI
+# ops + deleting the corresponding exception entries. Sprint 13.7
+# (this comment-block update + the inline fail-on-rest-deferred check
+# at the bottom of the python block) tightens this guard's
+# rest-deferred floor from "monotonic-decrease vs baseline" (the
+# sibling guard openapi-rest-deferred-monotonic.sh) to a HARD
+# zero-exact pin. The `category: rest-deferred` escape hatch is now
+# closed for good: any future PR adding a new REST route MUST author
+# its OpenAPI op or fail CI.
 #
+# The sibling monotonic-decrease guard stays in tree as belt-and-
+# suspenders — both must hold. The monotonic guard catches baseline-
+# drift accidents (e.g. an operator manually edits the baseline up
+# without surfacing the rationale); this guard catches the underlying
+# rest-deferred bucket re-growing at all.
+#
 # Going forward: any new gap (in either direction) fails the build
-# unless documented in the exceptions YAML with a category.
+# unless documented in the exceptions YAML with category=wire-protocol
+# (carry an RFC anchor in `why:` for review-time scrutiny).
 #
 # Subcommand:
 # bash scripts/ci-guards/openapi-handler-parity.sh
@@ -122,7 +133,8 @@ if missing_category:
         print(f" {r}")
     print()
     print("Add `category: wire-protocol` (with an RFC anchor in `why:`) or")
-    print("`category: rest-deferred` (OpenAPI op deferred) to each entry.")
+    print("author the route's OpenAPI op (the rest-deferred bucket is now")
+    print("pinned at zero — see Phase 13 Sprint 13.7 closure).")
     fail = True

 if unknown_category:
@@ -131,6 +143,27 @@ if unknown_category:
         print(f" {r} → category: {c}")
     fail = True

+# Phase 13 Sprint 13.7 — hard zero-exact pin on the rest-deferred
+# bucket. ARCH-H1's substantive close requires that the bucket stay
+# empty in perpetuity: any new REST route MUST land with an
+# OpenAPI op. Categorizing a new exception as `category: rest-deferred`
+# is no longer an escape hatch — it fails CI immediately, surfacing
+# the route + suggesting the fix.
+if bucket_counts['rest-deferred'] > 0:
+    print(f"::error::rest-deferred bucket is non-empty ({bucket_counts['rest-deferred']} entries) — Phase 13 Sprint 13.7 closure pins this at zero.")
+    print()
+    print("Every entry in api/openapi-handler-exceptions.yaml with")
+    print("`category: rest-deferred` represents a REST-shaped route whose")
+    print("OpenAPI op was deferred. Author the OpenAPI op in api/openapi.yaml")
+    print("with a request/response schema mirroring the Go handler's")
+    print("projection types, then delete the exception entry.")
+    print()
+    print("Offending entries:")
+    for entry in (exc_doc.get('documented_exceptions') or []):
+        if entry.get('category') == 'rest-deferred':
+            print(f" {entry['route']}")
+    fail = True
+
 # Routes in router but NOT in openapi AND NOT in exceptions = drift
 router_only_undocumented = router_set - oapi_set - exception_set
 if router_only_undocumented: