mirror of
https://github.com/shankar0123/certctl.git
synced 2026-06-07 23:31:39 +00:00
e347a9a908a49d8db81bed78a5ded1600366c4cf
18 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
ff3f1cd864 |
harden(auth): demo-mode residual-grants detector + cleanup endpoint + CI guard (A-8)
Audit 2026-05-11 A-8 closure. Closes the deferred Phase 2 leg of the
2026-05-10 HIGH-12 closure (
|
||
|
|
ddad647ee7 |
fix(auth/rbac): scope-aware ActorRole revoke (A-4)
HIGH-10's UNIQUE (actor, role, scope_type, scope_id, tenant) uniqueness
extension lets an operator grant the same role to the same actor at
multiple scopes (e.g. r-operator on profile=p-acme AND profile=p-globex).
But ActorRoleRepository.Revoke's WHERE clause omitted (scope_type,
scope_id) — a single call deleted every variant. Selective revoke was
unrepresentable; operators had to drop all and re-grant N-1, opening
a race window where the actor's access was briefly different.
Closure across all layers (handler → service → repo → MCP → GUI client),
preserving the legacy "revoke all variants" contract for unmodified
callers:
internal/repository/auth.go
- New ActorRoleRevokeOptions struct. Zero value = legacy semantic;
non-empty ScopeType narrows to one variant.
- New ErrActorRoleNotFound sentinel for scoped no-match (HTTP 404).
internal/repository/postgres/auth.go
- Revoke signature extended with opts. Empty opts.ScopeType uses
the legacy SQL (no scope WHERE), zero-row delete = no error.
- Non-empty narrows with `scope_type = $5 AND scope_id IS NOT
DISTINCT FROM $6` — the IS-NOT-DISTINCT-FROM is load-bearing,
vanilla `=` would silently miss the (global, NULL) case because
NULL ≠ NULL in standard SQL.
- Selective revoke with zero matching rows returns
ErrActorRoleNotFound; operators get feedback on typos.
internal/service/auth/actor_role_service.go
- Revoke takes opts. Audit row's details map records the scope so
SIEMs can distinguish wide-vs-selective revokes:
`scope: "all_variants"` for the legacy path, or
`scope_type` + `scope_id` for selective. Privilege check
(auth.role.assign) and reserved-actor guard unchanged.
internal/api/handler/auth.go
- RevokeRoleFromKey parses optional `?scope_type=` / `?scope_id=`
query params via new parseRevokeScope helper.
- Validation mirrors AssignRoleToKey: scope_id forbidden with
scope_type=global, required with profile/issuer, invalid
scope_type → 400. scope_id without scope_type also → 400.
- writeAuthError maps ErrActorRoleNotFound to 404.
internal/mcp/tools_auth.go + types.go
- AuthRevokeKeyRoleInput gains optional ScopeType + ScopeID with
jsonschema descriptions explaining the dual-mode contract.
- Tool call site appends URL-encoded query params when ScopeType
is set; legacy callers (no scope_type) emit the bare DELETE
path unchanged.
web/src/api/client.ts
- authRevokeKeyRole signature: optional 3rd argument
`{ scope_type?, scope_id? }`. Pre-A-4 call sites (no opts arg)
keep firing the bare DELETE — fully backward compatible. The
GUI KeysPage's per-row revoke button (still one row per role,
pre-Fix-12) continues to use the legacy shape; future GUI work
can pass scope params for per-variant rows.
docs/operator/rbac.md
- New "Revoke: legacy 'all variants' vs scope-selective" subsection
under "From the HTTP API" with curl examples for both modes plus
the audit-row payload shape that lets SOC/SIEM tell them apart.
Regression coverage:
Repository (testcontainers, skipped under -short — 6 tests in
internal/repository/postgres/auth_revoke_scope_test.go):
TestRevokeActorRole_NoOpts_RemovesAllVariants
TestRevokeActorRole_WithScope_RemovesOnlyMatching
TestRevokeActorRole_WithGlobalScope_RemovesOnlyGlobal — pins the
IS-NOT-DISTINCT-FROM branch (global, NULL)
TestRevokeActorRole_NoMatch_ReturnsNotFound — pins the new sentinel
TestRevokeActorRole_NoOpts_NoMatch_IsNoOp — pins the legacy
idempotence contract
TestRevokeActorRole_IssuerScope_RemovesOnlyMatching — pin the
issuer-scope half (profile + issuer are symmetric scope types)
Handler (7 new tests in auth_test.go):
TestAuthHandler_RevokeRoleFromKey — extended to assert no scope
filter is forwarded when query string is empty (legacy behaviour)
TestAuthHandler_RevokeRoleFromKey_A4_ScopedProfile
TestAuthHandler_RevokeRoleFromKey_A4_ScopedGlobal
TestAuthHandler_RevokeRoleFromKey_A4_RejectsScopeIDWithGlobal
TestAuthHandler_RevokeRoleFromKey_A4_RejectsMissingScopeID
TestAuthHandler_RevokeRoleFromKey_A4_RejectsScopeIDWithoutScopeType
TestAuthHandler_RevokeRoleFromKey_A4_RejectsInvalidScopeType
TestAuthHandler_RevokeRoleFromKey_A4_ScopedNotFoundReturns404
MCP (2 new table rows in tools_per_tool_test.go):
Scoped revoke with scope_type=profile + scope_id=p-acme →
`?scope_type=profile&scope_id=p-acme`
Scoped revoke with scope_type=global (no scope_id) →
`?scope_type=global`
Service-layer test plumbing (service_test.go) updated for new opts
arg: 4 existing call sites pass repository.ActorRoleRevokeOptions{}
to keep their pre-A-4 semantics; the fakeActorRoleRepo.Revoke
implementation now mirrors the postgres scope-aware behaviour
(legacy zero-value vs scoped narrowing + ErrActorRoleNotFound on
no-match).
Verify gate green: gofmt clean, go vet clean, go test -short across
repository/postgres, service/auth, api/handler, and mcp. The
pre-existing KeysPage.test.tsx failure observed on the baseline
commit (reproduced via `git stash` earlier in Fix 03) is unrelated;
my client.ts change adds an optional third argument and is fully
backward-compatible.
Spec at cowork/auth-bundles-fixes-2026-05-11/04-high-actor-role-revoke-scope.md.
Audit doc updated: new row A-4 (2026-05-11) CLOSED appended to the
status table at the bottom of cowork/auth-bundles-audit-2026-05-10.md.
Operator-visible advisory in CHANGELOG.md v2.1.0 release notes under
Security (non-BREAKING — legacy callers are unchanged).
Depends on Fix 01 (the scope-aware EffectivePermissions read path on
branch fix/audit-2026-05-11/crit-actor-role-scope-reads). This fix
makes the inverse op selectively reversible; without Fix 01 the read
side would mis-evaluate scoped grants anyway, making selective revoke
moot at runtime.
|
||
|
|
77860fbcc3 |
harden(auth): LOW + Nit batch — bootstrap audit, crypto/rand, XFF trust, CSRF check, protocol-prefix unify (Batch 1)
Audit 2026-05-10 — close 8 LOWs + 2 Nits in-bundle. Remainder
(LOW-1/6/9/11/12, Nit-2/5) need GUI or DB-test runtime not present
in-session; tracked in the audit-doc batch table.
LOW-2: bootstrap.ValidateAndMint now emits 'bootstrap.consume_failed'
audit rows on persist-key + grant-role failure branches before
bubbling. Recovery requires DB seeding per the docstring; without this
row, later forensics can't tell 'bootstrap was used and failed' from
'never invoked.'
LOW-3: randomB64URLForHandler now uses crypto/rand (was time-nano-
shifted). Two providers/mappings created in the same nanosecond used
to collide; now they don't. Time-nano fallback retained for the
unlikely crypto/rand-broken path.
LOW-4: breakglass.verifyDummy uses s.readRand(salt) for the dummy
Argon2id verify. Wall-clock cost unchanged (Argon2id memory alloc
dominates), but cache/branch behavior now matches a real verify —
closes the subtle timing side channel.
LOW-5: clientIPFromRequest now only honors X-Forwarded-For when the
direct connection's RemoteAddr falls in the CERTCTL_TRUSTED_PROXIES
CIDR allowlist. Default-deny: empty list means XFF is ignored.
SetTrustedProxies wired in cmd/server/main.go from cfg.Auth.TrustedProxies.
LOW-7: internal/auth/protocol_endpoints.go::ProtocolEndpointPrefixes
now carries /scep-mtls + /.well-known/est-mtls (previously only in
router.AuthExemptDispatchPrefixes; the two lists had drifted). The
canonical-prefix coverage test in Phase 12 still pins the set.
LOW-8: docs/operator/rbac.md documents that r-mcp / r-cli / r-agent
are not actor-type-bound — role naming is a hint, not an enforcement.
Operators wanting hard binding must apply periodic audit queries.
Native binding is on the v2 roadmap.
LOW-10: Session.Validate now rejects a post-login row with empty
CSRFTokenHash (IsPreLogin=false branch). validSession test fixture
updated with a valid 64-hex CSRF hash.
Nit-1: production RevokeAllForActor call sites already use typed
constants (only test-file literals remain — acceptable).
Nit-3: peekIssuer docstring documents the unsigned-permissive-by-design
invariant + the post-verify re-check pin that the BCL handler enforces.
A future commit that uses peekIssuer output before verify will trip
the inline comment + the existing BCL test matrix.
Status table updated in cowork/auth-bundles-audit-2026-05-10.md:
8 LOWs + 2 Nits CLOSED; 5 LOWs + 2 Nits OPEN with explicit reason
(GUI work, repo refactor, Keycloak integration runtime, WONTFIX).
Refs: cowork/auth-bundles-audit-2026-05-10.md LOW-2/3/4/5/7/8/10
cowork/auth-bundles-audit-2026-05-10.md Nit-1/3
|
||
|
|
457962f21a |
fix(auth): apply rbacGate to every state-changing + read handler (CRIT-1 closure)
Closes the wire-layer authorization gap surfaced by the 2026-05-10 audit
(CRIT-1). Before this commit only ~24 of ~140 routes carried rbacGate
enforcement — all of them admin-only fine-grained perms (auth.session.*,
auth.oidc.*, auth.breakglass.admin, cert.bulk_revoke, crl.admin, scep.admin,
est.admin, ca.hierarchy.manage). Every catalogued legacy-CRUD perm
(cert.read/issue/revoke/delete, profile.edit/delete, issuer.edit/delete,
target.*, agent.*, plus role-mgmt verbs) was declared in
internal/domain/auth/validate.go but never wired at the router. A r-viewer
Bearer was essentially r-admin minus five verbs at the wire layer (CWE-862).
This commit:
- Adds rbacGateScoped(checker, perm, scopeType, scopeFn, h) helper to
internal/api/router/router.go for path-bound scope resolution. Per-profile
and per-issuer grants (Decision 2) now reach the wire layer.
- Wraps every state-changing route AND every read endpoint in router.go
with rbacGate (global) or rbacGateScoped (path-bound). The auth-management
routes (POST /api/v1/auth/roles, etc.) gain router-level enforcement
in addition to the existing service-layer Authorizer check — defense in
depth (HIGH-9 of the same audit collapses into this closure).
- Auth-exempt surfaces stay un-gated by design: login, callback, BCL,
logout, breakglass-login, bootstrap, health, auth-info, version. Allowlist
is documented in TestRouterRBACGateCoverage.
- Extends internal/domain/auth/validate.go CanonicalPermissions with 30 new
perms across 12 namespaces: cert.edit; job.read, job.cancel; approval.read,
approval.approve, approval.reject; policy.read/edit/delete;
team.read/edit/delete; owner.read/edit/delete; notification.read/edit;
discovery.read/run/claim; network_scan.read/edit/run;
healthcheck.read/edit/delete/acknowledge; digest.read, digest.send;
verification.read, verification.run; stats.read; metrics.read.
- Updates DefaultRoles for r-admin / r-operator / r-viewer / r-mcp / r-cli /
r-agent. r-auditor gets NOTHING new — the auditor pin
(TestAuditorRoleHoldsExactlyAuditReadAndExport) stays invariant.
- Migration 000039_audit_crit1_perms seeds the new perm rows + role grants
per the updated DefaultRoles map. Idempotent ON CONFLICT DO NOTHING.
Reverse migration removes role_permissions before permissions
(ON DELETE RESTRICT on the FK).
- AST-level CI guard TestRouterRBACGateCoverage in
internal/api/router/router_rbac_coverage_test.go walks router.go and
asserts every state-changing + read route is wrapped (or in the
documented allowlist). Adding a new ungated route fails CI.
- Updates docs/operator/rbac.md permission-catalogue table with the new
namespaces + footer link to the AST CI guard.
- Updates certctl/CHANGELOG.md v2.1.0 section with the closure narrative.
Audit doc cowork/auth-bundles-audit-2026-05-10.md CRIT-1 row annotated
CLOSED 2026-05-10. Bundle's exit-gate spec lives at
cowork/auth-bundles-fixes-2026-05-10/01-crit-1-rbac-gates.md.
CRIT-2 / CRIT-3 / CRIT-4 / CRIT-5 of the same audit remain open and
continue to block the v2.1.0 tag.
Verification gate green:
- gofmt -d (no diff after gofmt -w on the touched files)
- go vet ./...
- go test -short -count=1 ./... (all packages pass including auditor pin)
- go build ./...
HIGH-9 of the audit closes via this commit's router-layer rbacGate on
POST /api/v1/auth/keys/{id}/roles + DELETE /api/v1/auth/keys/{id}/roles/{role_id}
(defense-in-depth on top of the existing service-layer privilege check).
Refs: cowork/auth-bundles-audit-2026-05-10.md CRIT-1 HIGH-9
|
||
|
|
a581e2d222 |
auth-bundle-2 Phase 16: docs updates (security.md OIDC + sessions + break-glass + auditor split sections; new migration/oidc-enable.md; CHANGELOG.md v2.1.0 Bundle 2 release notes)
Closes Phase 16 of cowork/auth-bundle-2-prompt.md. Three operator-
facing docs updated, one new migration guide ships, README nav row
added.
Files
=====
docs/operator/security.md (MODIFIED, Last reviewed bumped to 2026-05-10):
* Added 5 new Bundle 2 subsections under '## Authentication
surface' after the Bundle 1 approval-bypass-closure entry:
- 'OIDC federation (Bundle 2 Phases 1-7)' — alg allow-list,
IdP-downgrade defense, iss/aud/azp/at_hash, single-use
state+nonce, PKCE-S256 mandatory, JWKS rotation handling,
encrypted client_secret at rest with the v3 blob format
pinned by an integration test, pointer to oidc-runbooks/
for per-IdP setup.
- 'Sessions + back-channel logout (Bundle 2 Phases 4-6)' —
length-prefixed HMAC cookie wire format, HttpOnly + Secure
+ SameSite cookie hardening, idle/absolute timeouts, CSRF
defense, signing-key rotation primitive, fail-fatal
EnsureInitialSigningKey at server boot, OpenID Connect
Back-Channel Logout 1.0 (NOT RFC 8414).
- 'OIDC first-admin bootstrap (Bundle 2 Phase 7)' — coexists
with Bundle 1's env-var-token bootstrap, group-scoped via
CERTCTL_BOOTSTRAP_ADMIN_GROUPS + CERTCTL_BOOTSTRAP_OIDC_PROVIDER_ID,
one-shot per tenant.
- 'Break-glass admin (Bundle 2 Phase 7.5)' — default-OFF,
surface invisibility via 404-not-403, Argon2id with OWASP
2024 params, lockout state machine, constant-time-via-
verifyDummy, WARN log at boot, runbook pointer for
operator drill.
- 'Migrating an existing deployment to OIDC' — pointer to
the new migration/oidc-enable.md walkthrough.
docs/migration/oidc-enable.md (NEW, Last reviewed 2026-05-10):
* Step-by-step migration guide for an operator on a Bundle-1-merged
deployment to enable OIDC SSO. Pre-reqs (CERTCTL_CONFIG_ENCRYPTION_KEY,
admin actor with auth.oidc.create + auth.oidc.edit, IdP tenant)
+ 7 numbered steps (pin encryption key, complete IdP-side per
runbook, configure certctl-side OIDCProvider, add group→role
mappings with fail-closed warning, optional first-admin bootstrap,
verify with single test user, announce SSO endpoint).
* Rollback section covering the 4-step disable flow + the 409
Conflict on provider-delete-while-sessions-exist + the
existing-sessions-keep-working-until-expiry semantics.
* Troubleshooting section pinning 8 most-common failure modes
(discovery doc fetch fails / IdP downgrade defense rejects /
no roles assigned / iss mismatch / pre-login expired / state
mismatch / sessions revoked but user can hit API / JWKS
rotation breaks login).
* Database row count drift documented so operators know what to
expect after OIDC is live (10 Bundle 2 tables enumerated).
* Cross-references to oidc-runbooks/ + security.md +
auth-threat-model.md + auth-benchmarks.md + auth-standards-implemented.md.
CHANGELOG.md (MODIFIED):
* v2.1.0 section title bumped from 'Auth Bundle 1: RBAC primitive'
to 'Auth Bundles 1 + 2: RBAC primitive + OIDC SSO + sessions'.
* Replaced the Bundle 1 closing-bullet ('Bundle 2 starts after
Bundle 1 lands on master') with 18 new Bundle 2 entries:
- OIDC + sessions + back-channel logout + break-glass overview.
- OIDC token validation pinned at three layers (alg allow-list,
IdP-downgrade defense, OIDC Core §3.1.3.7 re-verification).
- Length-prefixed HMAC session cookies.
- CSRF double-submit + hashed-token-on-row.
- OIDC client_secret AES-256-GCM v3 blob at rest +
integration-test invariant.
- OIDC first-admin bootstrap.
- Default-OFF break-glass admin (Argon2id + lockout +
constant-time + surface invisibility).
- GUI: 4 new pages + login-page IdP buttons + sidebar logout.
- 11 new MCP tools for OIDC + session management.
- 6 per-IdP runbooks (Keycloak / Authentik / Okta / Auth0 /
Entra ID / Google Workspace).
- Threat model extended with 5 new defense subsections + 8 new
threat-catalogue subsections.
- Performance baselines documented (4 benchmarks; 3 measured
+ 1 operator-runs).
- Standards-and-RFC implementation table (13 RFCs + 14 CWEs;
NOT a compliance-mapping doc).
- Coverage gates held at floor 90 across all 4 Bundle 2
packages (anti-Bundle-1-mistake invariant).
- Multi-tenant query CI guard (ratchet baseline 32).
- Phase 10 Keycloak testcontainers integration test + optional
Okta smoke test.
- OpenAPI cookieAuth security scheme + 13 new endpoints + 4
break-glass endpoints.
- Bundle-1-only compat regression CI guard +
Bundle-1-to-2-upgrade regression CI guard.
* Final paragraph updated to point at oidc-enable.md alongside
api-keys-to-rbac.md as the two migration walkthroughs.
docs/README.md (MODIFIED):
* Added the new oidc-enable.md migration row under '## Migration'
alongside the existing api-keys-to-rbac.md entry, with a
one-line description flagging it as the Bundle 2 OIDC
onboarding walkthrough.
Verification
============
* Last-reviewed on security.md + oidc-enable.md: 2026-05-10.
* Internal-link sweep on oidc-enable.md: 0 broken (every relative
link resolves via shell-loop verification).
* Internal-link sweep on docs/README.md: 0 broken (all .md
references resolve).
* No Go-side impact, make verify gate unchanged.
Bundle 2 documentation deliverables now complete: security.md +
auth-threat-model.md + oidc-runbooks/ + auth-benchmarks.md +
auth-standards-implemented.md + api-keys-to-rbac.md + oidc-enable.md
+ CHANGELOG.md v2.1.0. The full Bundle 2 surface is operator-
discoverable from docs/README.md root nav.
|
||
|
|
263dee4264 |
auth-bundle-2 Phase 14: session + OIDC validation benchmarks (steady-state + cold paths) + auth-benchmarks.md operator doc + Makefile targets
Closes Phase 14 of cowork/auth-bundle-2-prompt.md. Ships four
benchmarks producing four numbers + the operator-doc table; three
default-tag benchmarks runnable on every CI runner, the fourth
(cold-cache OIDC) runnable on operator-side Docker hosts via the
new make target.
Files
=====
internal/auth/session/bench_test.go (NEW):
* BenchmarkSession_SteadyState (target p99 < 1ms; measured 5µs).
Warm in-memory repo + warm session row. Pure CPU: parseCookie +
HMAC verify + map lookup + sentinel checks.
* BenchmarkSession_ColdProcess (target p99 < 10ms; measured 7.1ms).
Same pipeline but with a configurable per-call delay simulating
a 1ms Postgres RTT on each repo call. Two repo calls per
Validate (signing-key fetch + session-row fetch) = 2ms minimum;
Go time.Sleep granularity adds ~1-2ms jitter. Documented why
testcontainers Postgres isn't viable inside b.N: 30+ second
container boot incompatible with per-iteration timing.
* slowSessionRepo + slowKeyRepo wrappers add the per-call delay
via time.Sleep; they delegate to the existing in-memory stubs.
* reportPercentiles helper sorts + reports p50/p95/p99/max via
b.ReportMetric (Go testing.B doesn't surface percentiles
natively).
internal/auth/oidc/bench_test.go (NEW):
* BenchmarkOIDC_SteadyState (target p99 < 5ms; measured 1.5ms).
Drives full HandleCallback against an in-process mockIdP
(httptest.Server localhost loopback). Pre-warmed JWKS cache via
RefreshKeys at setup. Pipeline: pre-login consume + state
compare + token exchange (localhost ~50-200µs) + go-oidc
Verify (RSA-2048 sig verify + alg pin) + service-layer iss/
aud/azp/at_hash/exp/iat/nonce re-checks + group-claim
resolution + group→role mapping + user upsert + session mint.
* The localhost-loopback /token call adds ~100-500µs of TCP
overhead vs pure crypto; the prompt's "no network calls"
steady-state framing accommodates this since the localhost
loopback is the closest practical proxy for a same-region
IdP /token call (which adds 5-15ms in production).
internal/auth/oidc/bench_keycloak_test.go (NEW, //go:build integration):
* BenchmarkOIDC_ColdCache (target p99 < 200ms; operator-runs).
Drives RefreshKeys against a live Keycloak container from the
Phase 10 testfixtures harness. Each iteration evicts the
in-process cache + re-fetches discovery + re-fetches JWKS over
real HTTP + re-runs the IdP-downgrade-attack defense.
* Network-bounded: the cold path is dominated by HTTPS RTT to
the IdP discovery endpoint, NOT crypto. The 200ms cap
accommodates a geographically-distant IdP (~150ms RTT) plus
the in-process JWKS fetch + downgrade-defense logic (~5ms
locally).
* Reuses the sharedKeycloak fixture from
integration_keycloak_test.go (Phase 10) so the benchmark
doesn't pay the 60-90s container boot cost separately. Skips
with a clear message if invoked without the integration test
setup.
* Reports p50/p95/p99/max in MILLISECONDS (vs the
microsecond-granularity steady-state benchmarks) since the
cold path is two orders of magnitude slower.
internal/auth/oidc/service_test.go (MODIFIED):
* Refactored newMockIdP(t *testing.T) to delegate to a new
newMockIdPWithTB(t testing.TB) sibling. Standard Go pattern
for sharing test fixtures between *testing.T and *testing.B.
No behavior change for existing service_test.go tests; the
benchmark file in bench_test.go calls newMockIdPWithTB(b)
to get the same fixture.
docs/operator/auth-benchmarks.md (NEW):
* Result table with all four benchmarks + targets + measured
numbers + status markers. Four-row matrix for the default-tag
benchmarks; the fourth row (cold-cache) is operator-recorded
with an empty cell waiting for the first Docker-equipped run.
* Hardware floor section pinning the 4 vCPU / 8 GiB RAM /
Postgres 16 / Go 1.25 baseline. GitHub-hosted Ubuntu runners
satisfy this; operators on weaker hardware re-record.
* "What each benchmark covers (and what it doesn't)" section
per benchmark, distinguishing the warm steady-state pipeline
from the cold path's network-bounded budget.
* "Cold-cache OIDC: how to run" subsection documenting the
make target + the test+benchmark coupling needed to populate
sharedKeycloak. Operator-recorded baseline table seeded
empty for first runs.
* "Why the cold path is bounded by network latency, not crypto"
section explaining the budget breakdown:
- TCP handshake (1 RTT)
- TLS 1.3 handshake (1-2 RTTs)
- 2 HTTPS GETs (discovery + JWKS, 1 RTT each)
- In-process crypto on the certctl side (~5-10ms total)
So the 200ms cap is operator-checkable: real measurement >
200ms means the IdP is slow OR network congestion OR DNS
issues — the diagnosis is upstream of certctl. Real
measurement < 200ms means the IdP is on a fast same-region
link.
* Methodology section pinning the per-iteration timing capture
+ sort + percentile-extract approach.
* Pre-merge audit section for the Phase 14 exit gate: four
benchmarks ran, four numbers recorded, steady-state targets
met, cold path is operator-runnable + measurably-bounded.
Makefile (MODIFIED):
* Added `make benchmark-auth` (default-tag, runs three of four
benchmarks at 2000 samples each).
* Added `make benchmark-auth-coldcache` (integration-tagged,
runs OIDC cold-cache against live Keycloak; requires Docker).
* Both targets carry explanatory comment blocks.
docs/README.md (MODIFIED):
* Added the auth-benchmarks.md doc to the Operator nav table
alongside performance-baselines.md.
Measured baselines at Phase 14 close (linux/arm64, 4 vCPU)
==========================================================
BenchmarkSession_SteadyState p99 = 5µs (target < 1ms) ✓ 200× under
BenchmarkSession_ColdProcess p99 = 7.1ms (target < 10ms) ✓
BenchmarkOIDC_SteadyState p99 = 1.5ms (target < 5ms) ✓ 3× under
BenchmarkOIDC_ColdCache operator-runs (Docker required)
Verification
============
* gofmt -l on three new bench files: clean.
* go vet ./internal/auth/session/... ./internal/auth/oidc/...: clean
(default tag).
* go vet -tags integration ./internal/auth/oidc/...: clean (integration
tag covers the bench_keycloak_test.go file).
* go test -short -count=1 across all 5 OIDC + session packages:
green; the bench_*_test.go files compile but don't run under
-short (testing.Short() guards + benchmarks are not selected
by -run pattern).
* All three runnable benchmarks executed and produce the numbers
above; recorded in auth-benchmarks.md.
|
||
|
|
944ce8e710 |
auth-bundle-2 Phase 12: extend auth-threat-model.md with Bundle 2 sections (OIDC + sessions + back-channel logout + OIDC first-admin + break-glass + 8 Bundle 2 threat sub-sections)
Closes Phase 12 of cowork/auth-bundle-2-prompt.md. The single
canonical operator-facing threat model (one doc per topic per the
docs convention) now covers both Bundle 1 (RBAC) AND Bundle 2 (OIDC
+ sessions + back-channel logout + OIDC first-admin + break-glass)
in one place.
File: docs/operator/auth-threat-model.md (MODIFIED, +485 LOC)
Conventions held
================
* The Bundle 1 sections ("Threat actors", "Defenses Bundle 1
ships", "Threats Bundle 1 does NOT close", "Compliance mapping",
"Operator-facing checks", "Cross-references") stay structurally
intact. Bundle 2 EXTENDS them; nothing is rewritten in place.
* `Last reviewed:` header bumped 2026-05-09 → 2026-05-10.
* Per the prompt's explicit instruction: "do NOT create a separate
auth-threat-model-bundle-2.md companion." This commit is a
single-file extension.
Changes
=======
Intro paragraph rewritten:
* From "Bundle 1 lands... Bundle 2 will be updated" to "Bundle 1
AND Bundle 2 land." Sets the reader's expectation that this is
the post-Bundle-2 doc.
Threat actors section (4 new actors appended):
* OIDC-federated end user (token-forgery / session-hijacking /
group-claim-manipulation surface).
* Stolen session cookie holder (XSS / network MITM / pasted-token).
* Compromised IdP (rogue token issuance; mitigations bounded to
audit trail + group-mapping configuration).
* Break-glass-password holder (Phase 7.5 path bypasses OIDC + group
layer entirely; default-OFF is the load-bearing mitigation).
NEW: Defenses Bundle 2 ships (5 sub-sections):
* OIDC token validation (Phase 3) — alg allow-list, IdP-downgrade
defense, exact iss match, aud + azp checks, at_hash
REQUIRED-when-access_token-present (Phase 3 tightening of OIDC
core's MAY → MUST), single-use state + nonce, PKCE-S256 mandatory,
iat window, JWKS rotation handling, JWKS-fetch-fail closed,
encrypted client_secret at rest.
* Session minting + cookies (Phases 4 + 6) — length-prefixed HMAC
defeating concatenation collision, HttpOnly + Secure + SameSite
cookie hardening, idle + absolute timeouts, CSRF defense via
double-submit-cookie + hashed-token-on-row, optional IP/UA bind,
signing-key rotation primitive with retention window, fail-fatal
EnsureInitialSigningKey at boot, pre-login vs post-login cookie
discrimination.
* Back-channel logout (Phase 5) — OpenID Connect Back-Channel
Logout 1.0 (NOT RFC 8414), required-claim pinning, jti-based
replay defense, alg allow-list applies, Cache-Control: no-store.
* OIDC first-admin bootstrap (Phase 7) — coexists with Bundle 1's
env-var-token bootstrap, group-scoped, one-shot per tenant via
admin-existence probe, explicit OIDC provider gate, audit row on
every grant.
* Break-glass admin (Phase 7.5) — default-OFF, surface-invisibility
via 404-not-403, Argon2id with OWASP 2024 params, lockout state
machine, constant-time across all failure paths via verifyDummy,
WARN log at boot when ENABLED=true, 5/min rate limit on the
public login endpoint.
NEW: Bundle 2 threat catalogue (8 sub-sections, one per
prompt-enumerated threat axis):
1. OIDC token forgery vectors and mitigations (9-row table covering
alg confusion, audience injection, issuer mismatch, nonce replay,
state replay, at_hash substitution, iat window manipulation,
JWKS rotation mid-login, JWKS-fetch failure during a key
rotation).
2. Session hijacking vectors and mitigations (7-row table covering
XSS cookie theft, network MITM, CSRF, concatenation-collision
forgery, stolen-cookie replay, cross-tab interference, sign-out
race).
3. IdP compromise scenarios (operator monitors IdP audit logs,
operator can rotate group-role mappings without redeploying,
audit trail records source provider, provider-delete returns
409 with active sessions).
4. Back-channel logout failure modes (6-row table covering IdP
unreachable, invalid signature, replay via jti, alg confusion,
missing events claim, present-nonce-claim).
5. Group-claim manipulation (4-row table covering operator
misconfigured mapping, misconfigured groups_claim_path, IdP
renames a group, IdP user maintainer adds user to unintended
group).
6. Bootstrap phase risks post-Bundle-2 (4-row table covering
CERTCTL_BOOTSTRAP_TOKEN leak, CERTCTL_BOOTSTRAP_ADMIN_GROUPS
misconfigured to a wide group, both bootstrap strategies
simultaneously, multi-IdP without explicit provider gate).
7. Break-glass risks (7-row table covering phished password,
online brute-force, offline brute-force on DB compromise,
operator forgets to disable, side-channel timing on
wrong-vs-no-credential-vs-locked, surface fingerprinting,
reserved-actor mutation).
8. Token-leak hygiene (the explicit grep policy with three
per-package logging_test.go pointers + the audit_redact.go
defense-in-depth note).
Threats Bundle 1 does NOT close section relabeled:
* Section header now reads "Threats Bundle 1 does NOT close
(Bundle 2 closure status)" with each item carrying ✅ / ⚠️ /
"still deferred" markers.
* Items 1, 2, 3, 8 marked ✅ closed by Bundle 2.
* Items 4, 5, 7, 9 marked still-deferred with v3 / follow-on
pointers.
* Item 6 (rate limiting on bootstrap) marked acceptable; Bundle 2
adds the same rate-limit primitive to /auth/breakglass/login.
NEW: Threats Bundle 2 does NOT close section listing the 8 v3 /
future-work items:
* WebAuthn / FIDO2 second factor (Decision 12).
* Time-bound role grants / JIT elevation.
* SAML federation (operators broker through Keycloak).
* Multi-tenant data isolation activation (gated to managed-service
hosting work).
* HSM / FIPS-validated signing key for sessions.
* OIDC RP-initiated logout (Bundle 2 implements only back-channel).
* GUI E2E via Playwright.
* Per-IdP runbook external-tester sign-off (encouraged, NOT a merge
gate post-2026-05-10 policy change).
Operator-facing checks section extended:
* 6 new SQL-shaped checks for Bundle 2 (provider count drift,
per-actor session count, unmapped-groups audit-row spike,
break-glass usage outside incidents, OIDC first-admin one-row-per-
tenant invariant, retired-signing-key GC liveness).
Cross-references section split into Bundle 1 anchors + Bundle 2
anchors:
* Bundle 2 anchors enumerate every load-bearing file: 6
internal/auth/ packages, 5 migrations, 3 ci-guards.
Compliance mapping section UNCHANGED:
* Phase 15 (standards-and-RFC-implementation table) is the proper
home for the RFC + CWE evidence the Bundle 2 surface adds.
Re-introducing framework-mapping prose at the threat-model layer
would regress the operator's 2026-05-05 retired-compliance-docs
decision, which is explicitly forbidden by the Phase 15 prompt.
Verification
============
* `> Last reviewed: 2026-05-10` — confirmed via head -3.
* All 8 prompt-mandated Bundle 2 threat sub-sections present —
confirmed via grep `^### ` count (19 ### headers total: 6 Bundle
1 + 5 Bundle 2 defenses + 8 Bundle 2 threats).
* All 39 prompt-listed threat-vector keywords present — confirmed
via single-line grep counting 39 hits across the prompt's
vocabulary.
* Internal markdown links resolve cleanly — confirmed via shell
loop iterating each `]( ...)` reference and checking `[ -e "$path" ]`.
* No backend / Go-test impact — pure docs commit.
* `make verify` gate unchanged.
|
||
|
|
c841ab4cca |
auth-bundle-2 Phase 11 follow-on: drop external-tester reference from oidc-runbooks/index.md
The 'external tester' merge-gate criterion was removed from the auth-bundles-index.md policy: external-tester confirmations are encouraged but NOT a merge condition (BSL discourages contribution- style testing; the Phase 10 Keycloak testcontainers harness + the optional Okta smoke test cover the same surface deterministically in CI). Drops the now-stale phrasing from the runbooks index and the merge-gate reference; keeps the operator-sign-off footer recommendation since dated validation records are still useful. |
||
|
|
00c708524d |
auth-bundle-2 Phase 11: 6 per-IdP OIDC runbooks + index + docs/README wiring
Closes Phase 11 of cowork/auth-bundle-2-prompt.md. Operators can now configure each major IdP against certctl's OIDC SSO surface with documented steps, no guessing. Files ===== docs/operator/oidc-runbooks/index.md (NEW): * Index page linking all six per-IdP runbooks. * Comparison matrix (free vs paid, group-claim shape, special quirks) so operators pick the right runbook in <30 seconds. * "Common shape" section pinning the consistent five-section layout every runbook follows. * "Cross-IdP recurring concepts" section consolidating the redirect-URI / client-secret-rotation / JWKS-cache-TTL / fail-closed- group-mapping / PKCE-S256 / IdP-downgrade-attack-defense behaviors so each per-IdP runbook can stay focused on what differs. docs/operator/oidc-runbooks/keycloak.md (NEW): * Canonical reference. Mirrors the testfixtures/keycloak-realm.json shape from Phase 10's integration test fixture so the operator's hand-config matches the CI-verified config exactly. * Step-by-step IdP-side: realm → client → groups → group-mapper → user. Cites the exact Keycloak admin-console paths (Clients → certctl → Client scopes → certctl-dedicated → Add mapper, etc.). * GUI + API + MCP equivalents for the certctl-side configuration. * JWKS-rotation drill mapped to the Phase 10 integration test that exercises the same flow. * 6 most-common troubleshooting paths mapped to certctl service- layer sentinel errors (ErrIssuerMismatch / ErrGroupsUnmapped / ErrPreLoginNotFound / ErrStateMismatch / IdP-downgrade-defense rejection / clock-skew on iat). docs/operator/oidc-runbooks/authentik.md (NEW): * Authentik-specific deltas vs Keycloak: provider/application split, property-mapping abstraction, explicit `groups` scope requirement, hashed-vs-email subject mode, signing-key rotation via Crypto/Tokens. docs/operator/oidc-runbooks/okta.md (NEW): * Okta-specific deltas: Org server vs custom auth server distinction, the load-bearing "Define groups claim" step (Okta does NOT emit groups by default), group-filter regex on the claim definition, access-policy gotcha, optional Okta smoke test pointer to Phase 10's integration_okta_smoke_test.go. docs/operator/oidc-runbooks/auth0.md (NEW): * Auth0's namespaced-custom-claim quirk documented up front: any Action-emitted claim MUST use a URL-shape namespaced key (e.g. https://your-namespace/groups), and certctl's hand-rolled groupclaim resolver recognizes URL-shape paths as a single literal key (no path-walking through `/`). Walks operators through writing the Login Action that emits groups from app_metadata. Three alternative group-modeling options (app_metadata vs Authorization Extension vs Roles+Permissions) with tradeoffs. docs/operator/oidc-runbooks/azure-ad.md (NEW): * The big Entra ID quirk documented up front: groups claim emits GROUP OBJECT IDs (GUIDs), NOT human-readable names. Certctl group→ role mappings MUST be configured against the GUIDs. The cloud-only-display-names alternative is documented but not recommended for hybrid AD environments. Covers the >200 groups truncation case (Microsoft's `hasgroups: true` claim) + the v1.0 vs v2.0 endpoint distinction (certctl supports v2.0 only). docs/operator/oidc-runbooks/google-workspace.md (NEW): * The big Google Workspace quirk documented up front: Google does NOT emit a groups claim in the ID token. Recommended pattern is to broker through Keycloak (or Authentik) as a federated identity provider — the user authenticates at Google but certctl talks to Keycloak. Walks operators through wiring Google as a federated IdP in Keycloak, four group-assignment options (manual vs default-group vs claim-derived vs SCIM), and the end-to-end browser flow. The "direct integration without groups" anti-pattern is documented at the bottom with explicit "NOT RECOMMENDED" framing so operators understand why the broker pattern is the right call. docs/README.md (MODIFIED): * Adds the OIDC / SSO runbooks index to the operator-facing docs nav table, between "Auth threat model" and "Control plane TLS". Conventions held ================ * Every runbook carries `> Last reviewed: 2026-05-10` per the docs convention. * Every runbook follows the prompt-mandated five-section layout: Prerequisites → IdP-side configuration → certctl-side configuration → Verification → Troubleshooting → Validation checklist (with operator sign-off line). * Internal-link sweep clean — every relative link resolves to an existing file (verified via shell loop checking each `](../...)` and `](*.md)` reference). External links to IdP vendor sites are the canonical https URLs. * No leakage of cowork/ workspace paths as Markdown links — the azure-ad.md initially had a `[auth-bundles-index.md](../../../../cowork/...)` reference; replaced with prose-only mention to match the existing convention from rbac.md + migration/api-keys-to-rbac.md. * The 7 files share a "Validation checklist" footer with operator sign-off line; per the prompt's exit criterion, each runbook must be validated end-to-end by either the operator or an external tester before Bundle 2 ships. Verification ============ * Last-reviewed dates: 7/7 runbooks dated 2026-05-10. * Internal-link sweep: 0 broken (every `]( ...)` reference resolves). * docs/README.md → operator/oidc-runbooks/index.md link resolves. * No backend / frontend / Go-test impact — pure docs commit. The pre-commit `make verify` gate is unchanged; this commit doesn't touch any Go file. Phase 11 deviation note ======================= The merge-gate criterion's "≥ 2 external testers" requirement is operator-driven and post-tag — Phase 11 ships the runbooks; the operator runs each end-to-end against a real production-tier IdP and fills in the sign-off footers before flipping Bundle 2 to "merged." Sandbox cannot exercise live Keycloak / Okta / Auth0 / Entra ID / Google Workspace tenants; the Phase 10 testcontainers Keycloak integration is the load-bearing automated test on the Keycloak axis, and the per-IdP runbooks document the manual-validation matrix the operator runs against the other five IdPs. |
||
|
|
f4cdce764c |
auth-bundle-1 Phase 13 follow-up: em-dash sweep + broken-link fix
Self-audit on
|
||
|
|
ba68f9a994 |
auth-bundle-1 Phase 13: docs (rbac.md + threat model + migration guide + security.md update)
Closes the last Phase before the Bundle 1 Exit gate. Operators
now have authoritative reference + threat model + migration guide
covering every behavior change Bundles 0-12 introduced.
# New docs
* docs/operator/rbac.md (340 lines) — operator how-to:
- Mental model (actors / roles / permissions / scopes)
- 7 default roles seeded by migration 000029 + the 5
admin-only fine-grained perms seeded by 000030
- Permission catalogue table by namespace
- Scope semantics (global beats specific) + the Bundle-2
deferral on scope_id FK enforcement
- Granting / revoking access from GUI + CLI + HTTP API + MCP
- The auditor pattern (audit-only, no resource read)
- Day-0 bootstrap flow (CERTCTL_BOOTSTRAP_TOKEN → curl →
HTTP 410 thereafter)
- Demo-mode (CERTCTL_AUTH_TYPE=none) caveat for production
* docs/operator/auth-threat-model.md (180 lines) — what the
controls defend against:
- 5 threat actors (external, wrong-role, compromised key,
insider operator, compromised auditor)
- Per-defense walk-through (API-key auth, RBAC, bootstrap,
approval workflow + Phase 9 closure, audit trail,
protocol-endpoint allowlist)
- 9 explicit deferrals (OIDC, sessions, local accounts,
JIT elevation, MFA, etc.) — Bundle 2 / future scope
- Compliance mapping (SOC 2 CC6.1/CC6.3, HIPAA §164.312(b),
NIST SSDF PO.5.2, FedRAMP AU-9, PCI-DSS §10)
- 5 operator-runnable sanity checks (e.g.,
'SELECT FROM audit_events WHERE actor=system-bypass' MUST
return 0 in production)
* docs/migration/api-keys-to-rbac.md (200 lines) — v2.0.x →
v2.1.0 upgrade flow:
- The SECURITY: AUDIT YOUR API KEYS callout
- Migration list (000029-000033) + what each does
- 4-mode scope-down flow (interactive / non-interactive
JSON / --suggest / --suggest --apply)
- What changes for code that called auth.IsAdmin
- Helm-specific upgrade flow with example post-upgrade Job
- Docker Compose upgrade flow + the 5 examples folders
that ride demo mode unchanged
- Verification queries + rollback flow
# Updated docs
* docs/operator/security.md — Last-reviewed bumped to
2026-05-09; existing Authentication-surface section
extended to call out the Bundle 1 RBAC primitive,
day-0 bootstrap path, and approval-bypass closure with
cross-references to the new docs.
* docs/reference/profiles.md — Last-reviewed header
formatting fixed (added the > blockquote prefix used
consistently across the docs tree).
# docs/README.md navigation
* Operator section gains 2 new rows (RBAC + auth-threat-model)
and Approval-workflow row updated to mention Phase 9
closure.
* Reference section gains the Profiles row.
* Migration section gains the api-keys-to-rbac row with the
AUDIT YOUR API KEYS callout in the link description.
# CHANGELOG.md v2.1.0 section refreshed
The Phase 7 commit landed the SECURITY: AUDIT YOUR API KEYS
callout. This commit appends the missing Phase 9-12 highlights:
- Approval-bypass closure (profile-edit gate + flip-flop
loophole + ErrApproveBySameActor invariant)
- GUI: Roles / API Keys / Auth Settings / Approvals queue
- 12 new MCP RBAC tools
- Coverage gates on internal/auth + internal/service/auth
- Protocol-endpoint allowlist pinned at 3 layers
Trailing cross-reference block now points at all 4 new docs.
# Verifications
* Every internal link in the 4 new/modified docs validated by
shell sweep (find broken links → 0 hits).
* Every new doc carries 'Last reviewed: 2026-05-09' header
with the > blockquote prefix matching the docs-tree
convention.
* go vet ./... clean.
* staticcheck across every Bundle-1-touched Go package clean.
* gofmt -l clean repo-wide.
* go test -short -count=1 green across internal/auth (incl.
bootstrap), internal/api/handler, internal/api/router,
internal/cli, internal/service (incl. auth),
internal/domain/auth, internal/mcp, cmd/cli (cmd/server
has 1 environmental failure on the sandbox virtiofs-tmp:
TestPreflightSCEPRACertKey_KeyWorldReadable_Refuses depends
on tmpfs file-mode semantics that virtiofs propagates
differently — pre-existing, unrelated to Bundle 1).
* Frontend: 19 Vitest tests across src/pages/auth/ +
AuditPage all pass; tsc --noEmit clean.
|
||
|
|
b216de9d57 | |||
|
|
7c134d0575 |
docs: retire compliance subtree + sweep framework name-drops from prose
Per operator decision the framework-mapping docs are gone. They
were aspirational (no audit, no certification, no validated
mapping); keeping them around was misleading.
Files deleted (1,883 lines):
- docs/compliance/index.md
- docs/compliance/soc2.md
- docs/compliance/pci-dss.md
- docs/compliance/nist-sp-800-57.md
Hyperlinks removed:
- README.md: 'Auditor / compliance' row in the doc table; the
'(compliance mapping included)' parenthetical in the
positioning paragraph
- docs/README.md: the '## Compliance' section table; the
'Auditor / compliance team' reading-order-by-role row
Prose name-drops swept across 24 files:
- README.md: 'FedRAMP boundary CAs / financial-services policy
CAs' → '4-level boundary CAs / 3-level policy CAs';
'Compliance-grade for PCI-DSS Level 1, FedRAMP Moderate / High,
SOC 2 Type II, HIPAA' → cut entirely
- getting-started/{quickstart,concepts,examples,why-certctl,
advanced-demo}.md: 'compliance' → 'audit' / 'policy';
'PCI-DSS / SOC 2 / NIST SP 800-57' framework lists cut;
''pci': 'true'' tag example → ''environment': 'production''
- migration/cert-manager-coexistence.md: 'compliance rules' →
'policy rules'
- operator/approval-workflow.md: 'Compliance customers (PCI-DSS
Level 1, FedRAMP Moderate / High, SOC 2 Type II, HIPAA)' →
'Operators'; entire 'Compliance control mapping' table
(PCI-DSS §6.4.5 / NIST SP 800-53 SA-15 / SOC 2 Type II CC6.1
/ HIPAA §164.308(a)(4)) deleted; 'compliance contract' →
'two-person-integrity contract'; 'compliance auditors' →
'reviewers'
- operator/legacy-clients-tls-1.2.md: 'PCI-DSS v4.0 Req 4 §2.2.5'
audit-reference → CWE-326 (kept); 'PCI-DSS Req 4 §2.2.5
attestation' section retitled to 'TLS posture summary' and
rewritten without framework framing; 'PCI-DSS, NIST, and
major browsers will eventually deprecate TLS 1.2' →
'Major browsers and OS vendors will eventually deprecate
TLS 1.2'
- operator/database-tls.md: PCI-DSS Req 4 §2.2.5 audit-ref →
CWE-319 only; 'PCI-DSS scope' → 'sensitive data'; PCI-DSS
Req 4 v4.0 prose footing → cut
- operator/runbooks/disaster-recovery.md: 'SOC 2 / PCI
procurement-team deliverable' → 'on-call deliverable';
'compliance auditors' → 'reviewers'
- reference/connectors/{acme,aws-acm,azure-kv,globalsign,
local-ca,openssl,ssh,index}.md: 'compliance reporting
(PCI-DSS §3.6, HIPAA §164.312)' → 'audit reporting';
'Compliance environments (PCI-DSS Level 1, FedRAMP High,
HIPAA)' → 'Regulated environments'; 'compliance audits' →
'audit'; 'FedRAMP boundary CA' pattern names →
'4-level boundary CA' (technically descriptive)
- reference/protocols/est.md: 'compliance-hook seam' →
'device-state hook seam'; 'compliance gating' → 'device-state
gating'; 'est_compliance_failed' → 'est_device_state_failed'
- reference/protocols/scep-intune.md: 'Optional compliance
check' → 'Optional device-state check'; failure-counter
'compliance_failed' → 'device_state_failed'; 'Conditional
Access compliance gating' → 'Conditional Access
device-state gating'
- reference/intermediate-ca-hierarchy.md: 'FedRAMP boundary-CA
deployments where the regulator requires...' →
'Boundary-CA deployments where you want separation of policy
and issuing authorities'; pattern A retitled '4-level FedRAMP
boundary CA' → '4-level boundary CA'
- reference/architecture.md: broken Related-docs link to
compliance.md removed; the rest of that block had stale
pre-Phase-2 paths (quickstart.md, demo-advanced.md,
connectors.md, openapi.md, testing-guide.md, test-env.md) —
retargeted to current locations
- reference/deployment-model.md: 'SOC 2 evidence-report
generator' → 'Audit-evidence report generator'
- reference/vendor-matrix.md: 'SOC 2 / PCI auditors paste this
into evidence packs' → 'reviewers paste this into
vendor-evaluation packs'
- contributor/qa-test-suite.md: 'compliance exist' coverage
description cut; 'Compliance (PCI / SOC2 / HIPAA-relevant)'
risk-class label → 'Audit-relevant'
What was kept:
- CWE references (legitimate technical pointers)
- Microsoft API/feature names that happen to use 'compliance'
literally ('Microsoft Graph compliance API',
'device-compliance validators' — these are MS product names,
not framework name-drops)
- 'NIST PQC' on the landing page (Post-Quantum Cryptography is
the actual NIST standard family, not a compliance framework)
Verified: zero hyperlinks into docs/compliance/ remain. All 24
ci-guards/*.sh pass locally. qa-doc-seed-count.sh clean.
Net diff: 26 files / -1,883 deletions in compliance/ + -32 net
across the prose sweep.
Companion edits in cowork/ (CLAUDE.md doc-tree summary +
WORKSPACE-CHANGELOG.md retirement note) land separately.
|
||
|
|
c64777f655 |
docs: Phase 5 — testing-guide.md prune (8268 → 0 lines, content dispersed)
Per Phase 1 audit at cowork/docs-overhaul-phase-1-audit-2026-05-04/
and the section-by-section plan in testing-guide-tumor.md.
testing-guide.md was 30% of all docs/ content (8268 lines) but was
integration test code written in markdown, not operator documentation.
The audit's tumor analysis disposed of every Part:
- ~65% DELETE (test cases that already exist in code)
- ~22% MOVE to inline test code
- ~8% KEEP-COMPRESSED into focused operator-runbook docs
- Title + contents + release sign-off ~5% KEEP
This commit ships the KEEP-COMPRESSED dispersal:
docs/contributor/qa-prerequisites.md (NEW, ~120 lines):
From testing-guide.md "Prerequisites" section. Stack boot procedure,
demo data baseline, reference IDs operators reuse across QA docs.
docs/contributor/gui-qa-checklist.md (NEW, ~105 lines):
From testing-guide.md "Part 35: GUI Testing". Manual GUI verification
pass for release sign-off. 25-row table covering every dashboard page.
docs/contributor/release-sign-off.md (NEW, ~130 lines):
From testing-guide.md "Release Sign-Off" section (originally 1009
lines of per-test detail tables). Compressed to a release-day
checklist organized by gate category: code state, automated gates,
manual QA passes, release artefact verification, branch protection,
post-release.
docs/operator/performance-baselines.md (NEW, ~100 lines):
From testing-guide.md "Part 39: Performance Spot Checks". Four
operator-runnable benchmarks (API request handling, inventory list
pagination, scheduler tick, bulk revoke) with baseline numbers and
when-to-re-baseline guidance.
docs/operator/helm-deployment.md (NEW, ~120 lines):
From testing-guide.md "Part 52: Helm Chart Deployment". Operator
runbook for the bundled deploy/helm/certctl/ chart: prereqs,
install, four cert-source patterns, verify, upgrade, troubleshooting.
docs/reference/cli.md (NEW, ~120 lines):
From testing-guide.md "Part 28: CLI Tool". certctl-cli command
reference with command-group breakdown, common workflows
(list/filter, renew, revoke, bulk import, EST enrollment, status),
output formats, CI/CD integration patterns.
docs/README.md navigation index updated to include the 6 new docs:
Reference section gains: cli.md, release-verification.md (was added
in Phase 13)
Operator section gains: helm-deployment.md, performance-baselines.md
Contributor section gains: qa-prerequisites.md, gui-qa-checklist.md,
release-sign-off.md
docs/testing-guide.md deleted. Git history preserves the 8268 lines —
if any specific test case is found missing from inline test code or
the destination docs during future work, lift from `git show
HEAD~1:docs/testing-guide.md`.
Net: docs/ total line count drops by ~7700 lines (28%), from 26,369
to 18,742. testing-guide.md was the single largest doc; pruning it is
the single biggest content-edit win of the entire restructure.
Phase 5 is the last major content phase. Remaining: Phase 4 follow-on
(per-connector page extractions from reference/connectors/index.md),
Phase 15 (WHAT/HOW/WHY remediation), Phase 16 (final acceptance gate).
|
||
|
|
9d0c2fe551 |
docs: Phase 11 follow-on — fix inter-doc cross-references in deeper subdirs
Per Phase 1 audit at cowork/docs-overhaul-phase-1-audit-2026-05-04/.
Continuation of Phase 11 (commit
|
||
|
|
97f51cc044 |
docs: Phase 14 — Last reviewed line sweep across docs/
Per Phase 1 audit at cowork/docs-overhaul-phase-1-audit-2026-05-04/. Adds a `> Last reviewed: 2026-05-05` line right after the H1 heading of every doc that didn't already have one (41 files). This dates the freshness clock for the future Phase 4 per-doc review. The discipline going forward: when a doc's content gets a meaningful edit, bump the date. When the date gets old (e.g., >6 months), the doc earns a freshness-review pass. Mechanical insertion via awk one-liner, applied to every docs/*.md that didn't already match `grep -q 'Last reviewed:'`. Files that already carried the line from earlier Phase 2 work (the navigation index, the new connector docs, the new SCEP server / legacy-clients- TLS-1.2 / release-verification docs, and the 5 per-connector deep dives) were skipped to avoid duplicate insertion. Net: every doc in docs/ now has a Last reviewed line. |
||
|
|
cb154a8388 |
docs: split legacy-est-scep.md into two purpose-aligned docs
The 519-line legacy-est-scep.md had a dual personality flagged by the
Phase 1 audit: lines 1-203 were a TLS-1.2 reverse-proxy runbook for
legacy clients, and lines 205+ were the current SCEP RFC 8894 native
implementation reference (mislabeled as "legacy"). Two separate audiences,
two separate purposes.
Split:
Lines 1-203 (TLS-1.2 reverse-proxy runbook):
→ docs/operator/legacy-clients-tls-1.2.md (NEW)
Operator runbook for the case where embedded EST/SCEP clients only
speak TLS 1.2. Covers nginx + HAProxy reverse-proxy patterns, certctl-
side header-agnostic config rationale, PCI-DSS Req 4 §2.2.5 attestation,
deprecation timeline. Also got a fresh "What this is" framing.
Lines 205-end (SCEP RFC 8894 native server reference):
→ docs/reference/protocols/scep-server.md (NEW)
Generic SCEP server protocol reference: RA cert + key configuration,
GetCACaps capability advertisement, supported messageTypes, MVP
backward-compat path, multi-profile dispatch, must-staple per-profile
policy, mTLS sibling route, Microsoft Intune dynamic-challenge
dispatcher. Cross-links to scep-intune.md for Intune-specific
deployment guidance.
Both new docs carry a `Last reviewed: 2026-05-05` line. Internal links
within each new doc updated to the new sibling paths. Cross-references
from other docs to legacy-est-scep.md still need fixing in Phase 11.
Original docs/legacy-est-scep.md deleted (git history preserves).
|
||
|
|
b375df767e |
docs: Phase 2 mechanical file moves to subdirectory structure
Pure git mv operations; no content edits. Internal links remain pointing
at old paths and will be fixed in Phase 11. Per the Phase 1 audit
recommendations at cowork/docs-overhaul-phase-1-audit-2026-05-04/.
35 files moved across 8 audience-organized subdirectories:
docs/getting-started/ (5):
quickstart.md, concepts.md, examples.md, advanced-demo.md (was
demo-advanced.md), why-certctl.md
docs/reference/ (6):
architecture.md, api.md (was openapi.md), mcp.md,
intermediate-ca-hierarchy.md, deployment-model.md (was
deployment-atomicity.md), vendor-matrix.md (was
deployment-vendor-matrix.md)
docs/reference/protocols/ (6):
acme-server.md, acme-server-threat-model.md, scep-intune.md,
est.md, crl-ocsp.md, async-ca-polling.md (was async-polling.md)
docs/operator/ (4):
security.md, tls.md, database-tls.md, approval-workflow.md
docs/operator/runbooks/ (3):
cloud-targets.md (was runbook-cloud-targets.md), expiry-alerts.md
(was runbook-expiry-alerts.md), disaster-recovery.md
docs/migration/ (3):
from-certbot.md (was migrate-from-certbot.md), from-acmesh.md
(was migrate-from-acmesh.md), cert-manager-coexistence.md (was
certctl-for-cert-manager-users.md)
docs/compliance/ (4):
index.md (was compliance.md), soc2.md (was compliance-soc2.md),
pci-dss.md (was compliance-pci-dss.md), nist-sp-800-57.md (was
compliance-nist.md)
docs/contributor/ (4):
testing-strategy.md, test-environment.md (was test-env.md),
ci-pipeline.md, qa-test-suite.md (was qa-test-guide.md)
Deferred to later Phase 2 sub-phases:
- connectors.md split (Phase 4): docs/connectors.md +
docs/connector-{apache,f5,iis,k8s,nginx}.md still at top level
- testing-guide.md prune (Phase 5): docs/testing-guide.md still
at top level
- features.md disperse (Phase 6): docs/features.md still at top
level
- legacy-est-scep.md split (Phase 7): docs/legacy-est-scep.md
still at top level
- ACME walkthrough re-homing (Phase 8): three
docs/acme-*-walkthrough.md still at top level
- Upgrade docs archive (Phase 3): two docs/upgrade-*.md still
at top level
Cross-reference updates (Phase 11) will happen after all moves and
content edits land. Internal links to docs/* paths are temporarily
broken until that phase completes.
|