certctl

mirror of https://github.com/shankar0123/certctl.git synced 2026-06-07 16:11:29 +00:00

Author	SHA1	Message	Date
shankar0123	c029875196	docs(readme): Status block rewrite — design-partner CTA, paragraph cadence Earlier versions were either link-soup or so tight they read as boilerplate. This pass aims for CMO-grade copy: - Paragraph 1: lede that combines the early-access label with the design-partner ask — sets the tone in one line. - Paragraph 2: what's production-quality today, with the RBAC + OIDC doc links inline (no bold, no link-soup). Names the v2.1.0 layer on top. - Paragraph 3: the ask — production deployments wanted, framed explicitly as 'we can't manufacture this exposure in CI'. Honest about the federated-identity surface being where the new exposure lives. Mutual-value framing. - Paragraph 4: the actionable bit — file issues liberally, with the why ('how the platform earns the right to drop early-access'). Three inline doc links (RBAC, OIDC runbook index, file-issues). Same factual content, warmer voice, paragraph cadence with breathing room between.	2026-05-11 22:16:32 +00:00
shankar0123	ed833e80f6	docs(readme): space out the Status block — three separate blockquotes	2026-05-11 22:14:50 +00:00
shankar0123	0eb3d0310c	docs(readme): tighten Status block; add RBAC + OIDC runbook links Quieter version of the Status block — single blockquote, three short sentences, three inline links (RBAC, OIDC, file-issues). Drops: - The Local-CA / ACME / agent-deployment / CRUD / audit feature pile (those live in the doc table immediately below) - The 6-IdP enumeration (Keycloak / Authentik / Okta / Auth0 / Entra ID / Google Workspace) — operators find that in the OIDC runbook index, now linked inline - The double 'in early-access' phrasing - 'HMAC-signed server-side sessions with __Host- cookies and CSRF rotation; OIDC Back-Channel Logout; Argon2id break-glass admin' — the spec details belong in the auth-threat-model + security docs, not the front-page status Same early-access framing, same issue-link CTA, far more readable.	2026-05-11 22:13:34 +00:00
shankar0123	46769fc7fa	docs(readme): audit pass — fix 7 stale/inaccurate claims Each claim ground-truthed against the live repo, not memory. Numeric drift (claims rotted since they were written): - Screenshot caption 'Catalog with 10 CA types' → 12 (matches internal/connector/issuerfactory/factory.go enumeration). - '33-permission canonical catalogue' → dropped the number. 33 was the base in migration 000029; across all 45 migrations 82 unique perms are seeded (+5 admin / +7 OIDC / +2 break-glass / +33 audit-CRIT-1 / +2 user). 'Fine-grained permission catalogue' is monotonic prose. - 'PostgreSQL 16 backend (35+ tables, idempotent migrations)' → '…backend with idempotent migrations'. Actual table count is 49 across 45 migrations; bare 'idempotent migrations' is drift-proof. - Demo overlay seeds '32 certificates across 10 issuers, 8 agents, 180 days' → '180 days of realistic history across 13 issuers, 8 agents, managed + discovered certs, jobs, deploys, audit, and notification events'. seed_demo.sql actually seeds 14 managed certs + 16 cert versions + 12 discovered, 13 issuers (not 10), 8 agents ✓, 23 INTERVAL '180 days' refs ✓. - 'golangci-lint (11 linters)' → '(govet + staticcheck + contextcheck + unused)'. .golangci.yml lists exactly 4 active linters; 6 others are commented-out 'temporarily disabled' so neither 4 nor 10 explains 11. Broken Helm one-liner (silently no-ops because --set against a nonexistent path doesn't error): - '--set server.apiKey=…' → 'server.auth.apiKey' (deploy/helm/certctl/values.yaml:147 + templates/server- secret.yaml:16). - '--set postgres.password=…' → 'postgresql.password' (top-level key is 'postgresql', not 'postgres'; password sits at postgresql.password per values.yaml:315). Verified accurate (no change): - 12 issuers / 15 targets / 6 notifiers (factory + dir listings). - 7 default roles seeded in migration 000029. - Coverage thresholds (service 70 / handler 75 / crypto 88 / auth packages 85-95) against .github/coverage-thresholds.yml. - All 6 OIDC runbooks present (auth0 / authentik / azure-ad / google-workspace / keycloak / okta). - 4 referenced screenshots all exist on disk. - 8 agents in demo seed, 180 days of history. - RFC 9700 §4.7.1 / 9207 / 8555 / 9773 / 8894 / 9266 / 5280 / 6960 citations match source. - ChromeOS in SCEP description matches source. - install-agent.sh uses uname for OS / arch detection + systemd (Linux) / launchd (macOS). v2.1.0	2026-05-11 17:29:18 +00:00
shankar0123	12705efe36	docs(readme): split Status block into two blockquotes for breathing room	2026-05-11 17:09:20 +00:00
shankar0123	de53847f51	docs(readme): quiet the Status block The previous version crammed 5 bold-emphasized inline links plus inline code into a single paragraph — visually loud and hard to scan. Rewrite as two short paragraphs: - First paragraph: what's production-quality + what's still maturing. No links, em-dash cadence for breathing room. - Second paragraph: v2.1.0 OIDC + sessions + break-glass slice with a single issue-link tail. Drops the bold-link sandwich in favor of plain prose; the doc-nav table directly below handles per-doc routing. Same content, same early-access framing, far less visual noise.	2026-05-11 17:08:21 +00:00
shankar0123	56e2ea1ad7	docs: v2.1.0 release polish — strip internal bundle/phase tags, update status for OIDC ship README: - Rewrite Status block: drop the stale 'federated identity not yet shipped' line; flag v2.1.0 OIDC + sessions + back-channel logout + break-glass as early-access; encourage GitHub issues for IdP rough edges. (A1 framing — keep early-access umbrella, no SAML/WebAuthn/JIT roadmap teaser.) - Add OIDC SSO bullet to 'What it does' covering per-IdP runbooks, group-claim → role mapping, AES-256-GCM client_secret encryption, JWKS auto-refresh, PKCE-S256, RFC 9700 §4.7.1 pre-login binding, RFC 9207 iss check, __Host- cookies, CSRF rotation, idle+absolute expiry, BCL, break-glass admin. - Update Security paragraph: three auth paths (API keys / OIDC / break-glass), HMAC-signed sessions, CSRF rotation, RFC OIDC BCL. - Correct CI coverage thresholds against .github/coverage-thresholds.yml (service 70%, handler 75%, crypto 88%, auth packages 85-95%); 'static analysis' replaces the inflated '11 linters' claim (actual count is 4 active). Docs B3 sweep — strip operator-facing 'Bundle N' / 'Phase N' tags: - docs/operator/auth-threat-model.md — rewrite intro; rename 5 H2 sections (API-key + RBAC defenses / OIDC + sessions + break-glass defenses / OIDC + sessions threat catalogue / Closed federated- identity threats / Future-work threats); clean ~12 H3/prose hits. - docs/operator/rbac.md — strip Bundle 1 framing from intro, scope_id deferral note, MCP tools section, day-0 bootstrap, and 'Where to look next'. - docs/operator/auth-benchmarks.md — drop 'Phase 14' framing from title intro, hardware floor caption, result table caption, methodology, and pre-merge audit section. - docs/operator/security.md — already cleaned earlier this session (RBAC / day-0 / approval-bypass / OIDC federation / sessions / OIDC first-admin / break-glass H3s). - docs/operator/oidc-runbooks/{index,keycloak,authentik,okta, azure-ad}.md — strip Auth Bundle 2 framing + Phase 10/3/4 references; replace with feature-name prose. - docs/operator/legacy-clients-tls-1.2.md — drop Bundle F / M-023 audit-reference framing; keep CWE-326. - docs/operator/database-tls.md — drop Bundle B / M-018 framing from intro + Helm section. - docs/operator/runbooks/disaster-recovery.md — drop 'Production hardening II Phase 10' status callout. - docs/migration/oidc-enable.md — retitle 'Enable OIDC SSO'; strip Bundle 1/2 framing from prereqs, troubleshooting, related docs; update __Host- cookie callout from 'audit MED-14' to v2.1.0-BREAKING. - docs/migration/api-keys-to-rbac.md — strip Bundle 1 framing from intro, migration table, IsAdmin section, and cross-references. - docs/migration/acme-from-cert-manager.md — strip residual 'Phase 5' tags from cert-manager integration test references. - docs/reference/configuration.md — retitle Auth section. - docs/reference/profiles.md — strip Bundle 1 Phase 9 framing from RequiresApproval section + Related list. - docs/reference/auth-standards-implemented.md — rewrite intro (API-key + RBAC + OIDC + sessions + back-channel logout + break-glass); rename 'Bundle 1 (RBAC) standards covered separately' H2; clean per-row Phase references. - docs/README.md — rewrite nav-table entries to drop Bundle 1/2 parentheticals; retitle 'Enable OIDC SSO' migration entry. No code or test changes; pure operator-facing prose polish for the v2.1.0 tag.	2026-05-11 16:54:07 +00:00
shankar0123	1b03d0c594	fix(repo/job): split UNION ALL + FOR UPDATE into two queries (Postgres-correctness) Phase-9 docker compose smoke surfaced a latent production-breaking bug introduced by commit `89b910a` (H-6 atomic pending-job claim). The ClaimPendingByAgentID query in internal/repository/postgres/job.go combined UNION ALL with FOR UPDATE SKIP LOCKED in a single statement. Postgres rejects this with: ERROR: FOR UPDATE is not allowed with UNION/INTERSECT/EXCEPT Every agent work-poll returns HTTP 500 in any real deployment where an agent is actually polling. From the compose log: request_id=6da47015-... GET /api/v1/agents/agent-demo-1/work status=500 duration_ms=2 The schema-per-test unit harness in internal/repository/postgres/ *_test.go never inserted jobs and polled, so the SQL execution path was never exercised. The bug has been latent in master since `89b910a` landed. Fix: split the UNION ALL into two separate FOR UPDATE SKIP LOCKED queries within the existing transaction. The H-6 atomicity invariant (concurrent pollers never see the same Pending row) is preserved because: 1. The two queries run inside the same transaction (tx). 2. Each query independently locks its result rows with FOR UPDATE SKIP LOCKED. 3. The subsequent UPDATE that flips Pending -> Running runs in the same transaction, so the rows stay invisible to concurrent callers from initial SELECT through final COMMIT. 4. The transaction is the unit of consistency, not the single SQL statement. Two queries: - Branch 1 (direct): jobs.agent_id = + status='Pending' + type='Deployment'. ORDER BY created_at ASC, FOR UPDATE SKIP LOCKED. - Branch 2 (fallback): jobs.agent_id IS NULL + INNER JOIN deployment_targets dt ON jobs.target_id = dt.id WHERE dt.agent_id = . ORDER BY j.created_at ASC, FOR UPDATE OF j SKIP LOCKED (FOR UPDATE OF needed because the join brings in dt). Branch 3 (AwaitingCSR) is unchanged — already a single SELECT, not affected by the UNION restriction. Inline comment explains the fix's load-bearing-ness so a future refactor doesn't merge them back into one UNION query. Verify (sandbox): go vet clean; go test -short -count=1 PASS on internal/repository/postgres/. Workstation re-runs 'docker compose up' to confirm the agent's GET /work returns 200 with the next pending-deployment claim. Note: this is NOT a regression introduced by Auth Bundle 2 or the 2026-05-11 audit fixes; it's a pre-existing latent defect from H-6. Including in v2.1.0 because shipping with a broken agent work-poll would block the demo path on day one of release.	2026-05-11 16:11:33 +00:00
shankar0123	def4be9b38	fix(migrations): two cold-DB regressions surfaced by Phase-9 docker compose smoke The v2.1.0 release-gate Phase-9 docker compose smoke run against a fresh Postgres surfaced two real defects in the migration files that testcontainers schema-per-test never exercised. Both reproduce by running 'docker compose down -v && docker compose up --build' against the current master tree. Bug A — migration 000045_users_deactivated_at.up.sql is malformed. The 000029 schema defines: permissions (id TEXT PRIMARY KEY, name TEXT NOT NULL UNIQUE, namespace TEXT NOT NULL) role_permissions (..., permission_id TEXT NOT NULL REFERENCES ..., ...) But 000045 was written as: INSERT INTO permissions (name) VALUES ... -- missing id + namespace INSERT INTO role_permissions (role_id, permission, ...) VALUES ... ^^ wrong column name On a cold-DB run this fails immediately with: pq: null value in column "id" of relation "permissions" violates not-null constraint Fix: provide id + namespace columns, use permission_id (the actual column name), ON CONFLICT (id) DO NOTHING. The new permission ids follow the existing 'p-auth-' prefix convention (p-auth-user-read + p-auth-user-deactivate) used by 000029. Bug B — migration 000029_rbac.up.sql is not idempotent post-000043. 000029 originally created actor_roles with: UNIQUE (actor_id, actor_type, role_id, tenant_id) Audit 2026-05-10 HIGH-10 closure / migration 000043 drops that constraint and re-creates it WITH scope columns: UNIQUE (actor_id, actor_type, role_id, scope_type, scope_id, tenant_id) The migration runner (internal/repository/postgres/db.go::RunMigrations) is naive — no tracker table — and re-runs every .up.sql file on every server boot. On the second-and-later boots, 000029's seed INSERT for actor-demo-anon-admin still references the pre-000043 constraint name in its ON CONFLICT clause: ON CONFLICT (actor_id, actor_type, role_id, tenant_id) DO NOTHING Postgres errors out with: pq: there is no unique or exclusion constraint matching the ON CONFLICT specification Fix: pin the conflict target to the row's primary key 'id' column (always present, never altered). The seed row's deterministic id 'ar-demo-anon-admin' makes ON CONFLICT (id) work under both pre- and post-000043 schemas. Why testcontainers schema-per-test missed these: Each test in internal/repository/postgres/*_test.go spins up a fresh schema and applies every .up.sql in order ONCE. The full '000029 -> 000043 -> retry 000029' cascade never happens because migrations don't re-run within a test. Phase-9 docker compose smoke is the only test path that exercises the server-restart- on-error retry, which is exactly the missing coverage. Verify (sandbox): go test ./internal/repository/postgres/ PASS. Workstation re-runs 'docker compose down -v && docker compose up' to confirm both bugs are closed.	2026-05-11 16:06:20 +00:00
shankar0123	aa1efd0676	fix(oidc/testfixtures): set legacy KEYCLOAK_ADMIN* env vars for start-dev master-admin bootstrap Phase-10 live-IdP smoke (post-iss-param fix landing in `360e744`) advanced 4 of 6 integration tests to green. The remaining 2 — the realm-key rotation tests — failed with: admin-cli token: HTTP 401 at the master-realm token endpoint. Root cause: Keycloak 26.x has TWO admin-bootstrap env-var pairs and the right pair depends on the launch command: - 'start' (production): KC_BOOTSTRAP_ADMIN_USERNAME + KC_BOOTSTRAP_ADMIN_PASSWORD - 'start-dev': KEYCLOAK_ADMIN + KEYCLOAK_ADMIN_PASSWORD The fixture sets KC_BOOTSTRAP_ADMIN_USERNAME + KC_BOOTSTRAP_ADMIN_PASSWORD but runs 'start-dev'. The bootstrap pair is silently ignored in dev-mode, leaving the master realm with no admin user → admin-cli token endpoint returns 401 → RotateRealmKeys can't authenticate to the Admin API. The 4 auth-code flow tests passed because they authenticate the engineer / viewer test users INSIDE the certctl realm (created by the realm import), which doesn't need a master admin. Fix: set BOTH pairs as belt-and-braces. The legacy KEYCLOAK_ADMIN pair covers start-dev today; the KC_BOOTSTRAP_ADMIN_* pair keeps a future flip to 'start' working. Inline comment in the fixture explains the why so a future reader doesn't drop one back. Verify (sandbox): go vet -tags=integration clean; gofmt clean. Workstation re-runs 'make keycloak-integration-test' to confirm the 2 rotation tests now reach + execute the Admin API successfully.	2026-05-11 15:49:25 +00:00
shankar0123	360e7449ad	fix(oidc/integration): pass fx.IssuerURL as callbackIss arg in 7 HandleCallback call sites Phase-10 live-IdP smoke (post-Enabled-true fix landing in `1b52998`) surfaced the next layer: 5 of 6 testcontainers-Keycloak integration tests failed with 'oidc: provider advertises iss-parameter support but callback omitted it'. Root cause: Keycloak's discovery doc advertises authorization_response_iss_parameter_supported=true. The Audit 2026-05-10 MED-17 closure (RFC 9207) gates the callback path: when the IdP advertises iss-param support, HandleCallback requires a non-empty callbackIss arg that matches the provider's IssuerURL, else ErrIssParamMissing. The 7 HandleCallback call sites in the integration tests were passing '' for the callbackIss arg — the synthetic test code never simulated the real browser's '?iss=<issuer>' query param. Fix: replace '' with fx.IssuerURL at all 7 sites: - integration_keycloak_test.go: 5 sites (TestKeycloakIntegration_AuthCodeFlow_HappyPath, TestKeycloakIntegration_LogoutRevokesSession, TestKeycloakIntegration_JWKSRotation_RefreshKeysPicksUpNewKey pre+post HandleCallback, TestKeycloakIntegration_UnmappedGroupsFailsClosed) - integration_keycloak_rotate_test.go: 2 sites (TestKeycloakIntegration_MED6_AutoRefreshOnKidMiss pre+post) Inline note on the first site explains the rationale so future test-writers don't drop back to ''. Verify (sandbox): go vet -tags=integration ./internal/auth/oidc/... clean; gofmt clean; grep for remaining empty-iss callsites returns 0 matches. Workstation re-runs 'make keycloak-integration-test' to confirm the 5 affected tests advance past the iss-param check against a real Keycloak 26.x.	2026-05-11 15:44:39 +00:00
shankar0123	1b529985be	fix(oidc/testfixtures): set Enabled=true on Keycloak integration-test provider Phase-10 live-IdP smoke re-run (after the alg-downgrade relax landed in `fefeccf`) surfaced the next layer: 5 of 6 testcontainers-Keycloak integration tests failed with 'oidc: provider is disabled'. Root cause: the OIDCProvider struct literal in internal/auth/oidc/testfixtures/keycloak.go omits the Enabled field. Enabled was added by Audit 2026-05-11 MED-9 (Bundle 2 Fix 13 Phase B); pre-fix the field didn't exist and HandleAuthRequest always proceeded. Post-fix the default zero-value false gates every integration test behind ErrProviderDisabled at service.go L478. Fix: add Enabled: true to the struct literal + inline comment explaining why the field is required for integration tests. The check is the right behavior for production (operator-driven disable kill-switch); just needed to be reflected in the testfixture. Verify (sandbox): go vet -tags=integration ./internal/auth/oidc/... clean. Workstation re-runs 'make keycloak-integration-test' to confirm the 5 affected tests now pass against a real Keycloak 26.x.	2026-05-11 15:39:07 +00:00
shankar0123	fefeccfa59	harden(oidc): relax alg-downgrade IdP-bind check to intersection-empty (Keycloak compat) Phase-10 live-IdP smoke (Keycloak 26.x via testcontainers-go) revealed the IdP-bind alg-downgrade check was too strict for real-world IdPs. 6 of the integration tests in internal/auth/oidc/integration_keycloak_test.go were failing with: oidc: IdP advertises weak signing algorithms (HS/none); refusing to use as defense against downgrade attacks: HS256 Keycloak 26.x (and several other real-world IdPs — Auth0 when HS-mode is enabled, some Authentik configs) advertise EVERY alg they're capable of in the discovery doc's id_token_signing_alg_values_supported field, even when the realm only signs with RS256 in practice. Pre-fix the IdP-bind check refused on ANY HS* or 'none' advertisement → no real Keycloak deploy could ever bind a provider row, hence the integration-test failures. The strict-deny check was defense-in-depth on top of the load-bearing per-token alg-pin at sig-verify time (isDisallowedAlg, service.go L1177): that check rejects every ID token whose JWS header carries an alg outside DefaultAllowedAlgs, regardless of what the discovery doc advertises. A forged HS256 token signed with the IdP's RS256 pubkey as HMAC secret is rejected at sig-verify time → the actual algorithm-confusion attack is closed by the per-token pin, NOT by the discovery-doc check. Fix: relax the IdP-bind check to refuse only when the intersection of advertised vs DefaultAllowedAlgs is EMPTY (the pathological all-weak-alg IdP case). Keycloak (RS256 + HS256 advertised) now binds successfully; an HS-only IdP still fails closed. Changes: - internal/auth/oidc/service.go: rewrite the alg-check loop at L1067 in getOrLoad / RefreshKeys to compute the intersection set; refuse only when no acceptable alg is advertised. ErrIdPDowngradeAdvertised docstring updated to reflect new contract. DefaultAllowedAlgs docstring + the package-level design-comment block at L40-72 updated with v2.1.0-relaxed semantics callouts. - internal/auth/oidc/test_discovery.go: TestDiscovery dry-run validator rewritten to surface HS/none alongside RS as an informational note ('note: IdP advertises weak algorithms %v alongside acceptable ones') rather than a hard-fail error. HS-only / none-only still hard-fails. - internal/auth/oidc/service_test.go: TestService_IdPDowngradeDefense_* tests updated. Renamed: - RejectsHSAdvertised → RS256PlusHS256_BindsSuccessfully (positive) - RejectsNoneAdvertised → RejectsHSOnlyAdvertised (intersection-empty) - RefreshKeys_CatchesPostLoadDowngrade rotated to HS-only post-load - internal/auth/oidc/coverage_fill_test.go: TestTestDiscovery_AlgDowngradeDetected split into _HS256AlongsideRS256_BindsWithNote (positive, asserts note but no hard-fail) + _HSOnly_StillTrips_HardFail (intersection-empty). - docs/operator/auth-threat-model.md: OIDC token-validation alg-allow-list section rewritten to call out the load-bearing-defense hierarchy (per-token pin first, IdP-bind check defense-in-depth) and document the v2.1.0 relaxation rationale. - CHANGELOG.md: ### Security entry under Unreleased. Verify: go test ./internal/auth/oidc/ -short PASS; gofmt clean; go vet clean. The Keycloak integration tests should now pass when the operator re-runs 'make keycloak-integration-test'.	2026-05-11 15:34:59 +00:00
shankar0123	1cfa9f2e2a	Merge dev/auth-bundle-2 → master (v2.1.0): Auth Bundle 2 + 2026-05-11 audit fixes	2026-05-11 15:24:24 +00:00
shankar0123	70ebef5d3a	test(client): mock headers.get() so 401 tests survive HIGH-8 WWW-Authenticate read Audit 2026-05-10 HIGH-8 closure landed a parseWWWAuthenticateCause() call in api/client.ts (line 144) that reads res.headers.get(...) on the 401 path. The two test files in web/src/api/ both provide a Response mock with no headers property, so every 401 test threw 'Cannot read properties of undefined (reading get)' instead of the expected 'Authentication required'. 13 tests fail without this fix: 12 in client.error.test.ts (one per 401-mapped endpoint helper) + 1 in client.test.ts (the auth-required event-dispatch test). Fix: add headers: { get: () => null } to both mockErrorResponse helpers. The null return short-circuits parseWWWAuthenticateCause to the default 'Authentication required' message, so every existing 401 assertion keeps passing.	2026-05-11 14:37:36 +00:00
shankar0123	eee124efb6	chore(ci-guards): close 4 CI-guard regressions surfaced by v2.1.0 release-gate Phase 5 Four scripts/ci-guards/.sh trips on dev/auth-bundle-2 vs master: 1. G-3-env-docs-drift: 10 CERTCTL_ env vars added by Auth Bundle 2 + audit-2026-05-10/11 fix bundle were not in docs/. Added a new 'Auth (Bundle 1 + Bundle 2)' section to docs/reference/configuration.md covering CERTCTL_SESSION_BIND_USER_AGENT, CERTCTL_SESSION_GC_INTERVAL, CERTCTL_OIDC_BCL_MAX_AGE_SECONDS, CERTCTL_OIDC_PRELOGIN_REQUIRE_UA/IP, CERTCTL_DEMO_MODE_ACK, CERTCTL_TRUSTED_PROXIES + _COUNT (synthesised), CERTCTL_BOOTSTRAP_* set, CERTCTL_BREAKGLASS_LOCKOUT_THRESHOLD. Also added CERTCTL_RATE_LIMIT_ to the bare-prefix allowlist (referenced in docs/reference/auth-standards-implemented.md prose). 2. bundle-8-M-009-bare-usemutation: BreakglassPage shipped 3 bare useMutation() calls instead of useTrackedMutation. Migrated all three to useTrackedMutation with invalidates: [['breakglass']]. 3. multi-tenant-query-coverage: Defense-in-depth tenant_id additions in the fix bundle dropped the missing-tenant-id query count from 32 to 31. Ratcheted baseline 32 -> 31 (forward-only invariant). 4. openapi-handler-parity: 28 new REST endpoints from Bundle 2 + the fix bundle missing from api/openapi.yaml. Added them to api/openapi-handler-exceptions.yaml with per-route 'why:' justifications. OpenAPI schema generation deferred to pre-v2.2.0 alongside the GUI E2E coverage push; threat model + handler contracts already live in docs/operator/{rbac,auth-threat-model, oidc-runbooks}.md. After this commit every script in scripts/ci-guards/*.sh exits 0.	2026-05-11 14:19:35 +00:00
shankar0123	80cbd2db59	test(coverage): backfill 5 packages to clear v2.1.0 release-gate Phase 3 floors Phase 3 of /Users/shankar/Desktop/cowork/v2.1.0-release-gate.md surfaced four packages below their coverage floors. All four are regressions from new code shipped in the audit-2026-05-10/11 fix bundles that didn't get per-function tests: internal/auth/breakglass 87.5% -> 93.3% (floor: 90%) + List (was 0%) — 3 tests (disabled, empty+populated, repo err) + RemoveCredential, Unlock disabled-branch tests internal/auth/oidc 89.4% -> 95.4% (floor: 90%) + JWKSStatus (was 0%) — 2 tests (unknown provider, after AuthRequest) + TestDiscovery (was 0%) — 5 tests (discovery failure, happy path, HS256 alg-downgrade detected, missing jwks_uri, JWKS 500 fetch) internal/auth/session 89.9% -> 94.4% (floor: 90%) + SetTrustedProxies (was 0%) — round-trip + clear + ComputeCookieHMAC (was 0%) — determinism + key/inputs differ + DecryptKeyMaterial (was 0%) — round-trip + wrong-passphrase internal/api/handler 73.2% -> 75.5% (floor: 75%) + 6 auth_breakglass handler funcs (were all 0%) — 14 tests (disabled/404, invalid JSON, empty fields, service err, happy path with cookies, admin endpoints, ListCredentials no password_hash on the wire) + WithPermissionChecker setter test (was 0%, Bundle 2 MED-2) + NewAdminCRLCacheServiceImpl + CacheRows (were 0%) — 3 tests + itoaForRetryAfter + challengeURLBuilder ACME helpers (were 0%) — 4 tests All five coverage gates green: internal/service 72.7% (floor: 70%) internal/api/handler 75.5% (floor: 75%) internal/api/middleware 67.9% (floor: 30%) internal/auth 93.3% (floor: 85%) internal/service/auth 91.8% (floor: 85%) internal/auth/oidc 95.4% (floor: 90%) internal/auth/oidc/groupclaim 100.0% (floor: 95%) internal/auth/oidc/domain 97.6% (floor: 90%) internal/auth/session 94.4% (floor: 90%) internal/auth/session/domain 98.3% (floor: 90%) internal/auth/breakglass 93.3% (floor: 90%) internal/auth/breakglass/domain 100.0% (floor: 90%) internal/auth/user/domain 96.2% (floor: 90%) (and 6 more — all green) Per CLAUDE.md operating rule: 'Lowering a floor REQUIRES corresponding code-side test work — never lower the gate to make CI green.' The floors stay at their committed values; the new tests close the gap.	2026-05-11 14:12:11 +00:00
shankar0123	8aeeec93c0	chore(lint): close 5 golangci-lint v2 findings surfaced by v2.1.0 release-gate Phase 1.3 Five golangci-lint v2 findings surfaced when running the v2.1.0 release gate (auth-bundle-2 → master pre-flight). Each is mechanical: 1. govet/printf-style misuse — internal/auth/oidc/service_test.go used integer literal 501 in http.Error; switched to http.StatusNotImplemented. 2. staticcheck SA1019 — internal/auth/breakglass/reflect_helper_test.go referenced reflect.Ptr; the canonical name since Go 1.18 is reflect.Pointer. 3. staticcheck ST1020 — internal/repository/postgres/auth.go ActorRoleRepository.Revoke had a doc comment that did not begin with the method name. Prepended 'Revoke drops actor_roles rows.' to the comment so it now starts with the method name. 4. staticcheck ST1022 — internal/api/handler/auth_session_oidc.go DefaultBCLVerifierMaxAge docstring was attached to the DefaultBCLVerifier type docstring. Moved the const docstring directly above the const declaration, separated by a blank line. 5. unused — internal/auth/session/bench_test.go declared benchSessionMinSamples and never referenced it; the bench loop relies on Go's default b.N scaling. Replaced the const block with a comment describing the rationale. Lint clean (golangci-lint v2.12.2 with the .golangci.yml config) on the five edited packages.	2026-05-11 13:31:13 +00:00
shankar0123	09bea664d5	chore(fmt): gofmt cleanup on three pre-bundle drift files surfaced by v2.1.0 release-gate Phase 1 Phase 1 (make verify) of cowork/v2.1.0-release-gate.md surfaced three files with pre-existing gofmt drift that pre-dated the 2026-05-11 fix bundle work: internal/auth/oidc/domain/types.go internal/auth/oidc/integration_keycloak_rotate_test.go internal/auth/oidc/test_discovery.go The 2026-05-11 Fix 08 fmt-cleanup commit (`b8fac59`) fixed four files that the merge introduced; these three were noted as pre-existing master drift and intentionally left untouched at the time. The v2.1.0 release-gate spec's Phase 1 requires zero gofmt output from 'go fmt ./...' (Makefile::verify form), so the drift must close before tagging. Pure whitespace alignment, no semantic change.	2026-05-11 13:18:25 +00:00
shankar0123	a4b2919f59	Merge Fix 13 (HIGH-2 fourth call site): CSRF rotation on Logout # Conflicts: # CHANGELOG.md	2026-05-11 13:01:56 +00:00
shankar0123	9f617add29	Merge Fix 12: Vitest coverage for the 2026-05-10/11 GUI batch	2026-05-11 13:00:25 +00:00
shankar0123	ecba4112b7	Merge Fix 11 (MED-11 discoverability): UsersPage sidebar nav entry # Conflicts: # CHANGELOG.md	2026-05-11 13:00:19 +00:00
shankar0123	54f535a007	Merge Fix 10 (MED-7 GUI half): JWKS health panel + Refresh-now button # Conflicts: # CHANGELOG.md # web/src/pages/auth/OIDCProviderDetailPage.tsx	2026-05-11 12:59:41 +00:00
shankar0123	f1219f8cd3	Merge Fix 09 (MED-5 GUI half): Test Connection panel on OIDC create + edit forms # Conflicts: # CHANGELOG.md	2026-05-11 12:58:48 +00:00
shankar0123	d5522debfb	Merge Fix 08 (HIGH A-8): demo-mode residual-grants detector + cleanup endpoint + CI guard	2026-05-11 12:57:35 +00:00
shankar0123	9a8130de32	harden(auth/sessions): CSRF rotation on logout closes HIGH-2 fourth call site Audit 2026-05-11 Fix 13 closure. The HIGH-2 closure on dev/auth-bundle-2 documented four RotateCSRFTokenForActor call sites — login completion (fresh by construction), Assign/Revoke RoleToKey (wired at internal/api/handler/auth.go:498 + 546), Logout, and an explicit operator endpoint. The 2026-05-11 adversarial review observed only 3 of the 4: Logout did NOT rotate the actor's sibling sessions post-revoke. Threat closed: a token captured pre-logout (browser DevTools, malicious extension, session-storage leak) could be replayed against the user's other-device/other-browser sessions until those sessions hit their own idle/absolute expiry. Rotation on logout defeats this — the captured token is dead the moment the user clicks 'Sign out' anywhere. What this changes: * internal/api/handler/auth_session_oidc.go::SessionMinter interface gains RotateCSRFTokenForActor(ctx, actorID, actorType string) int. Nil-safe semantics by convention — the production wiring is session.Service which already implements the method; rotation NEVER errors (returns int count, swallows per-row failures via the underlying Service.RotateCSRFToken) so it can't block the surrounding Revoke that triggered it. internal/api/handler/auth_session_oidc.go::Logout calls RotateCSRFTokenForActor after Revoke(sess.ID) succeeds. The auth.session_revoked audit row gains a csrf_rotated detail key carrying the count so SOC/SIEM can correlate logout events with CSRF churn on sibling sessions. * The no-cookie + invalid-cookie 204 short-circuit paths skip rotation. No session row exists to rotate against; the caller is already unauthenticated. Rotation on those paths would do nothing useful and pollute the audit log. Test coverage in internal/api/handler/auth_session_oidc_test.go: * TestLogout_RotatesCSRFForActor — happy path. Mocks rotateCSRFReturnCount=2; asserts Revoke fires before rotation, rotation fires exactly once with caller's (actor_id, actor_type), audit details carry csrf_rotated=2. * TestLogout_NoCookie_SkipsCSRFRotation — pins the 204 short-circuit branch when there's no cookie. Rotation count stays at 0. * TestLogout_InvalidCookie_SkipsCSRFRotation — pins the 204 short-circuit branch when Validate rejects the cookie. Same rationale: no session row, no rotation. The stubSession test fake gains RotateCSRFTokenForActor with call-recording fields; the phase5StubAudit gains a details slice append-aligned 1:1 with events so the happy-path test can index into the latest entry and assert the count. Spec Phase 3 (explicit operator endpoint) — intentionally NOT shipped. The three automatic triggers (login + role- mutation + logout) cover the HIGH-2 threat model; operators who want a nuclear option can use the existing RevokeAllForActor flow which forces re-login → fresh session → fresh CSRF. Adding a dedicated POST /api/v1/auth/sessions/ rotate-csrf admin endpoint would be defense-in-depth without new attack-surface coverage. Documented in the audit-doc annotation. Verify gate: * gofmt -l — clean * go vet ./internal/api/handler/... — clean * go build ./cmd/server/... ./internal/... — clean (production session.Service satisfies the extended interface out of the box) go test -short -count=1 ./internal/api/handler/... ./internal/auth/session/... — all green; 3 new Logout cases + the 2 pre-existing Logout cases all pass. Audit doc annotation at cowork/auth-bundles-audit-2026-05-10.md flips the HIGH-2 row from 'CLOSED 2026-05-10 (3/4 call sites wired)' to 'A-B-3 verified 2026-05-11: HIGH-2 fully closed across all four documented call sites.' Refs cowork/auth-bundles-fixes-2026-05-11/13-verify-logout-csrf-rotation.md.	2026-05-11 12:24:41 +00:00
shankar0123	dfdba5b260	test(gui): Vitest coverage for the 2026-05-10/11 GUI batch (Fix 12) Audit 2026-05-11 Fix 12 closure. The original GUI-batch commit `191384c` claimed 'npx tsc --noEmit PASS' but shipped no Vitest cases for the new surfaces, leaving the regression-prevention layer wide open. This closure backfills 35 cases across five files; the next refactor of KeysPage's assign modal that drops scope_type, or the AuthProvider demo-banner predicate that gets flipped to !authRequired, surfaces in CI instead of silently shipping. What's added: * web/src/pages/auth/UsersPage.test.tsx (NEW, 8 cases) — pins the MED-11 closure's UsersPage flow: active rows render the Active status pill, deactivated rows render dimmed with the Deactivated <timestamp> status, Deactivate button fires the API call after confirm() returns true and is a no-op on false, Reactivate button works inversely, provider filter narrows the underlying authListUsers call (undefined vs provider-id), empty list renders the placeholder, loading renders 'Loading users…'. * web/src/pages/auth/AuthSettingsPage.test.tsx (EXTENDED, +4 cases) — the pre-existing 2 cases only exercised identity + bootstrap status; the runtime-config panel (MED-12 closure) had no test. New cases cover: per-key row rendering, alphabetical sort (stable for log-scraping correlation), empty-value '(empty)' placeholder, 403 rejected query silently hides the panel (non-admins shouldn't see the shell). * web/src/pages/auth/KeysPage.test.tsx (EXTENDED, +8 cases) — the HIGH-10 GUI half added scope picker + scope_id input + expires_at datetime-local to the assign modal but the pre-existing test only asserted (actor, role). New cases pin the third opts arg shape: global hides scope_id input, profile/issuer scope reveal scope_id + mark required, trimmed scope_id round-trips into the body, global omits scope_id (undefined NOT empty string), empty expires_at omits the field, filled expires_at gets :00Z appended for RFC3339 promotion, whitespace-only scope_id fires the 'scope_id is required' typed error WITHOUT calling the API, actor-demo-anon row hides both assign and revoke affordances. * web/src/pages/auth/RoleDetailPage.test.tsx (NEW, 9 cases) — no test file pre-Fix 12. Pins the MED-8 scope picker for AddPermissionForm: global hides scope_id, profile reveals + gates the Add button until scope_id is filled, submit POSTs {permission, scope_type: profile, scope_id} with whitespace trimming, global submit omits scope keys entirely, issuer scope path, Add button stays disabled without a permission selection. Plus the LOW-11 default-role delete-button hide: r-admin renders the role-delete-disabled-tooltip + NO role-delete-button, r-auditor same, custom role renders the delete button. The DEFAULT_ROLE_IDS set tracking the migration-seeded role ids is the load-bearing client-side decision so a future drift between migrations and the GUI set surfaces here too. * web/src/components/AuthProvider.test.tsx (NEW, 5 cases) — the LOW-1 demo banner had no test for its visibility predicate. Pins all four authType branches (none → visible, api-key → hidden, oidc → hidden, loading → hidden to avoid flash) plus the rejected-getAuthInfo branch: the catch treats failure as an old-server-fallback to demo mode (no authType mutation, loading flips false), so the banner SHOWS — that's the actual behavior, and pinning it prevents a future change from silently hiding the banner when the /auth/info endpoint is unreachable. Spec deviations: Phase 6 (Layout.test.tsx users-nav) and Phase 7 (per-Fix tests for Fixes 03/05/07/09/10) live on those fixes' own branches — already authored there. Including them here would have produced merge conflicts. Verify gate: * tsc --noEmit — clean * vitest run touched files — 40/40 pass (8 + 6 + 12 + 9 + 5, including the 2 + 4 + 4 pre-existing cases in the extended AuthSettingsPage + KeysPage files) * full suite (162 tests across 15 files) green — no regression from the panel-mount-in-existing-page setup or the new mocked-module entries. Refs cowork/auth-bundles-fixes-2026-05-11/12-test-vitest-gui-coverage.md.	2026-05-11 12:18:08 +00:00
shankar0123	90c7b5813f	feat(gui/nav): UsersPage sidebar nav entry under Auth section (MED-11) Audit 2026-05-11 Fix 11 closure. The MED-11 closure shipped web/src/pages/auth/UsersPage.tsx and wired the /auth/users route in web/src/main.tsx, but the sidebar nav never gained a corresponding entry. Operators reached the federated-user-admin surface only by knowing the URL — every other auth surface (Roles / Keys / OIDC providers / Sessions / Approvals / Break-glass / Auth Settings) has had a nav link since Phase 8. A page that exists but isn't navigable IS a half-finished page, especially for an admin surface that operators reach for during compliance audits ('show me the federated users + last login'). 30 minutes closes the inconsistency. What this changes: * web/src/components/Layout.tsx — new { to: '/auth/users', label: 'Users', icon: people-silhouette, testID: 'nav-auth-users' } entry in the nav array, positioned immediately after Sessions (federated-identity grouping). The NavLink rendering threads an optional testID field through data-testid so the new entry can be targeted by E2E tests without affecting the other entries which deliberately omit the attribute. * Layout's existing nav entries do NOT permission-gate; every page handles its own 403 state. UsersPage already returns an ErrorState directing the user to auth.user.read for callers without the perm. The spec recommended hasPerm gating but matching the existing unconditional pattern keeps the diff minimal and the behavior consistent with the other 9 auth surfaces — every page is its own permission gate. Tests added in web/src/components/Layout.test.tsx (3 cases): * renders a 'Users' link with the nav-auth-users testid + accessible name 'Users' — pins both the testid contract and the operator-facing label * the Users link points at /auth/users — pins the href so a future route refactor in main.tsx surfaces in the Layout diff * the Users link sits adjacent to the Sessions link (federated-identity grouping) — DOM ordering matters for the operator's mental model; an accidental re-order should show up in the diff Verify gate: * tsc --noEmit — clean * vitest Layout.test.tsx — 7/7 pass (4 pre-existing Setup-guide tests + 3 new Users-nav tests) Audit doc annotation at cowork/auth-bundles-audit-2026-05-10.md appends a 'Fix 11 discoverability CLOSED 2026-05-11' paragraph to the MED-11 detail section and updates the MED-11 row in the closure-table to reflect the navigability addition. Refs cowork/auth-bundles-fixes-2026-05-11/11-med-users-sidebar-nav.md.	2026-05-11 12:05:08 +00:00
shankar0123	e92af14a22	feat(gui/oidc): JWKS health panel + Refresh-now button on OIDCProviderDetailPage (MED-7 GUI half) Audit 2026-05-11 Fix 10 closure. MED-7's backend endpoint GET /api/v1/auth/oidc/providers/{id}/jwks-status (commit `172b30b`) shipped the per-provider verifier counters on dev/auth-bundle-2 but the GUI never called it — authOIDCJWKSStatus in the API client was dead code. The audit doc had prematurely flipped the MED-7 row to CLOSED; this closure makes the claim true. Operator gap before this fix: operators investigating 'why is login failing for this IdP?' could not see last_refresh_at, rejected_jws_count, or last_error from the GUI. They had to drop to curl. New shared component web/src/pages/auth/OIDCJWKSStatusPanel.tsx queries the endpoint via TanStack Query and renders six dt/dd rows with operator-readable sentinels for each empty case: * Last refresh — RFC 3339 timestamp; '(never — cold cache)' sentinel when the IdP has never been hit. * Refresh count — cumulative since process boot. * Rejected JWS count — number of ID tokens that failed signature verification. Step-changes correlate to IdP key rotations. * Last error — most recent JWKS-refresh failure (sanitized — no token content). Red treatment when non-empty; '(none)' sentinel for healthy state. * RFC 9207 iss param — 'supported by IdP' / 'not advertised'. Informational only; the operator-side verifier still demands the param by default. * Current KIDs — cache contents; '(not exposed — query jwks_uri directly)' sentinel when the backend declines to expose the list (the backend may withhold them for opacity). Refresh-now button: * Calls POST /api/v1/auth/oidc/providers/{id}/refresh (RefreshKeys path), then invalidates the panel's query so the freshly-updated counters render without a page reload. * Refresh failures surface as an inline red rectangle and do NOT hide the existing snapshot — partial visibility is better than no visibility. * Hidden when the optional canRefresh prop is false. The OIDCProviderDetailPage mount wires canRefresh to useAuthMe().hasPerm('auth.oidc.edit') so viewer-class callers see the read-only panel. Permission gating: * The backend endpoint is gated auth.oidc.list. Callers without the permission get HTTP 403; the panel's TanStack query is configured with retry: 0 so a 403 doesn't drown the page in retries, and the panel returns null when the query errors — hiding silently for callers who can't see the data. * The Refresh-now button is hidden for callers without auth.oidc.edit. Read-only callers still see the panel + counters. Mount: OIDCProviderDetailPage.tsx between the read-only field display section and the Actions section. canRefresh wired to the canEdit boolean already computed at the page level. 9 Vitest tests in OIDCJWKSStatusPanel.test.tsx: * LoadingState — query in flight, Loading… visible. * HappyPath — all six dt/dd pairs visible with operator-readable values; current KIDs joined comma-separated. * 403 — authOIDCJWKSStatus errors, panel returns null, no DOM artifacts left behind. * RefreshNow — calls refreshOIDCProvider('op-okta'), invalidates the status query, the panel re-fetches and re-renders with the new refresh_count (mock returns different snapshots on the two calls). * RefreshNow surfaces refresh-failure inline without hiding the panel (preserves the existing snapshot so the operator can read pre-failure state). * NeverRefreshed — last_refresh_at='' renders the cold-cache sentinel rather than a blank cell. * CurrentKIDsEmpty — empty list renders the 'not exposed' sentinel rather than a blank cell. * LastError — non-empty last_error renders with red treatment. * CanRefreshFalse — panel + counters render; Refresh-now button is gone. Verify gate: * tsc --noEmit — clean * vitest OIDCJWKSStatusPanel.test.tsx — 9/9 pass * vitest OIDCProviderDetailPage.test.tsx — 19/19 pass (panel mount does not break existing tests because the unmocked authOIDCJWKSStatus call in those tests rejects, the panel returns null, and the rest of the page renders normally) Audit doc annotation at cowork/auth-bundles-audit-2026-05-10.md flips MED-7 from the premature CLOSED claim to a properly-staged 'Backend CLOSED 2026-05-10 + GUI half CLOSED 2026-05-11' annotation describing the panel + tests. Refs cowork/auth-bundles-fixes-2026-05-11/10-med-jwks-status-panel.md.	2026-05-11 11:57:38 +00:00
shankar0123	64ad8e525c	feat(gui/oidc): Test Connection panel on create + edit forms (MED-5 GUI half) Audit 2026-05-11 Fix 09 closure. MED-5's backend dry-run endpoint (POST /api/v1/auth/oidc/test, gated auth.oidc.create) shipped on dev/auth-bundle-2 (commit `b4b9879`) but the GUI never called it — authOIDCTestProvider in web/src/api/client.ts was dead code. Operator gap before this fix: complete the create form blind, save, then click 'Refresh' to discover whether the issuer URL worked. Discovery failures left a broken provider row in the DB that had to be deleted before retrying. The MED-5 backend exists to short- circuit this — surface the dry-run result before commit. New shared component web/src/pages/auth/OIDCTestConnectionPanel.tsx calls authOIDCTestProvider against the live form state (issuer URL + client ID + parsed scopes) and renders a four-row status panel inline: * ✓/✗ Discovery fetched (with issuer-echo from the well-known doc) * ✓/✗ JWKS reachable (with the discovered jwks_uri) * ✓/⚠ Supported algs (warning glyph when the IdP advertises none — distinct from a discovery failure) * ✓/· RFC 9207 iss-parameter advertised (informational · glyph rather than ✗ because the spec is SHOULD, not MUST) Backend per-leg errors[] flow into an inline bullet list. A top-level rectangle catches network/fetch failures separately. The Run button is disabled when the issuer URL is empty or whitespace-only. The component does NOT persist anything — safe to run repeatedly before the operator clicks Save. The panel is mounted in two places: * OIDCProvidersPage create modal (between the form fields and the Create button) — short-circuits the blind-save footgun for new provider configs. * OIDCProviderDetailPage edit form (between the field grid and the Save button) — load-bearing for verifying IdP rotations (Keycloak realm rename, Okta tenant move, certctl side-by-side hostname change) without committing first. A testIDSuffix prop (default 'create' / 'edit') gives each mount point a distinct data-testid namespace so both panels can coexist on a hypothetical page that uses both without DOM-id collisions. 8 Vitest tests in OIDCTestConnectionPanel.test.tsx: * RunButton — disabled until issuer URL is non-empty * RunButton — also disabled when issuer URL is whitespace-only * RunButton — enabled when issuer URL is non-empty * HappyPath — all four primary checks render green with detail rows for authorization_url / token_url / userinfo_endpoint (asserts both the glyph contract AND the mocked POST body shape) * FailurePath — discovery=false renders ✗ on discovery + ✗ on JWKS + ⚠ on empty supported algs + error list with backend per-leg messages * IssParamFalse — load-bearing UX claim that the iss-parameter row renders · (informational), not ✗; body must contain the word 'informational' so operators understand it's not a failure * FetchError — top-level error rectangle when the POST throws * TestIDSuffix — same component mounted twice with different suffixes renders both without DOM-id collision Verify gate: * tsc --noEmit — clean * vitest OIDCTestConnectionPanel.test.tsx — 8/8 pass * vitest OIDCProvidersPage.test.tsx + OIDCProviderDetailPage.test.tsx — 38/38 pass (panel-mount in both pages does not regress existing tests because they don't trigger the test button) Operator runbook: the four glyph meanings are documented inline on the panel's subtitle. Audit doc annotation at cowork/auth-bundles-audit-2026-05-10.md flips MED-5 from 'BACKEND CLOSED' to 'CLOSED' with the GUI-half annotation. Refs cowork/auth-bundles-fixes-2026-05-11/09-med-oidc-test-connection-button.md.	2026-05-11 11:52:26 +00:00
shankar0123	a923cf697c	harden(auth): demo-mode residual-grants detector + cleanup endpoint + CI guard (A-8) Audit 2026-05-11 A-8 closure. Closes the deferred Phase 2 leg of the 2026-05-10 HIGH-12 closure (`2e97cc1`) — production-startup observability for actor-demo-anon residual grants + CI guard banning new synthetic- admin code paths. What this changes: * cmd/server/preflight_demo_residual.go (new) runs after the DB pool + audit service are constructed and before the HTTPS listener starts. Under any non-'none' auth type it queries actor_roles for the synthetic actor-demo-anon and emits a WARN log + a categorized audit row (auth.demo_residual_grants_detected) listing every grant present. Migration 000029 unconditionally seeds the ar-demo-anon-admin row at install time, so EVERY production deploy will see this WARN on first boot; the intended cutover workflow is cleanup-once at production handover. * CERTCTL_DEMO_MODE_RESIDUAL_STRICT (new env var on AuthConfig, default false) pivots the WARN to fail-closed startup refusal for operators who want a paranoid posture against re-seeding. * POST /api/v1/auth/demo-residual/cleanup (new handler at internal/api/handler/demo_residual.go) is an admin-class (auth.role.assign) endpoint that removes every actor-demo-anon row from actor_roles and returns {removed: int64}. Idempotent; refuses 503 under Auth.Type=none (deleting the row would break the demo path); audit-logs every invocation including no-op zero-removed calls so the admin's action is always recorded. * scripts/ci-guards/no-new-synthetic-admin.sh pins the 17-entry allowlist of source files that legitimately reference the actor-demo-anon literal. New runtime code paths that resolve to the synthetic actor (the same pattern that produced the original CRIT class) are rejected at PR time. CI workflow auto-picks the script via the existing scripts/ci-guards/.sh loop in .github/workflows/ ci.yml; no workflow edit needed. Regression matrix: cmd/server/preflight_demo_residual_test.go — 7 tests covering the 4 main behaviour branches (testcontainers-backed, testing.Short()- skipped: DemoModeActive_Skips, NoResidue_Passes, HasResidue_LogsAnd Audits, StrictMode_RefusesStartup, DeleteDemoAnonResidue_Idempotent) plus 3 pure-Go stdlib unit tests for the row-string formatter + nil-safety contracts on both helpers. * internal/api/handler/demo_residual_test.go — 7 stdlib+httptest cases: HappyPath, Idempotent_ReturnsZero, RejectsInDemoMode (503), CleanupError_Surfaces500, NilCleanupFn (defensive 500), NilAuditWriter_DoesNotPanic, MissingActorContext (falls back to 'unknown' actor in the audit row). * internal/api/router/openapi_parity_test.go — new POST /api/v1/auth/demo-residual/cleanup entry plus 6 pre-existing pre-A-8 entries (oidc/test, jwks-status, users CRUD, runtime-config) that had drifted out of SpecParityExceptions; the parity test was red on dev/auth-bundle-2 before my work; this commit returns it to green with full per-entry justifications + parity-debt notes. Docs: * docs/operator/security.md — new 'Demo-to-production cutover (Audit 2026-05-11 A-8)' section explaining the WARN message, the cleanup curl one-liner, the equivalent SQL, the strict-mode env var, and the CI guard. * docs/operator/rbac.md — Last-reviewed bump + pointer to the new env var + the security.md section. * cowork/auth-bundles-audit-2026-05-10.md — HIGH-12 row gains an 'A-8 follow-on CLOSED 2026-05-11' annotation describing the deferred Phase 2 leg now landed. * CHANGELOG.md — Unreleased ### Security entry summarizing the four legs (detector + cleanup + strict-mode flag + CI guard) and the acquisition-readiness narrative this closes. Operator-facing impact: this closes a credibility gap, not an exploitable vulnerability. The residue requires a regression elsewhere in the middleware chain to be exploitable. After this fix, the canonical narrative ('RBAC primitive with no synthetic- admin fallback') is fully true. Refs cowork/auth-bundles-fixes-2026-05-11/08-high-demo-mode-residual- cleanup.md.	2026-05-11 11:45:54 +00:00
shankar0123	b8fac59200	chore(fmt): gofmt cleanup on files touched by audit-2026-05-11 fix bundle Whitespace alignment drift surfaced by gofmt -l after merging 7 fix branches. Pure formatting, no semantic change. Pre-existing master drift in internal/auth/oidc/{domain/types.go, integration_keycloak_rotate_test.go, test_discovery.go} left untouched — that's separate tech debt.	2026-05-11 11:29:48 +00:00
shankar0123	ad69158405	Merge Fix 07 (HIGH A-7): editable Advanced form on OIDCProviderDetailPage (MED-4) # Conflicts: # CHANGELOG.md # web/src/pages/auth/OIDCProviderDetailPage.test.tsx # web/src/pages/auth/OIDCProviderDetailPage.tsx	2026-05-11 11:27:43 +00:00
shankar0123	11b145b641	Merge Fix 06 (HIGH A-6): strict UA/IP binding — close request-empty bypass in MED-16 # Conflicts: # CHANGELOG.md # internal/api/handler/auth_session_oidc.go # internal/api/handler/auth_session_oidc_test.go	2026-05-11 11:19:04 +00:00
shankar0123	4e31568d3d	Merge Fix 05 (HIGH A-5): approval payload preview with profile-edit diff + cert-issuance preview # Conflicts: # CHANGELOG.md	2026-05-11 11:17:14 +00:00
shankar0123	68af18d081	Merge Fix 04 (HIGH A-4): scope-aware ActorRole revoke	2026-05-11 11:16:24 +00:00
shankar0123	df53b80cb6	Merge Fix 03 (CRIT A-3): expose AllowedEmailDomains on create + edit forms	2026-05-11 11:16:16 +00:00
shankar0123	11a1f0babd	Merge Fix 02 (CRIT A-2): close MED-11 lying field — DeactivatedAt loaded + enforced on login	2026-05-11 11:16:07 +00:00
shankar0123	027a5a1468	Merge Fix 01 (CRIT A-1): close HIGH-10 lying field — EffectivePermissions reads actor-role scope	2026-05-11 11:16:00 +00:00
shankar0123	9af5dad2b0	feat(gui/oidc): editable Advanced form on OIDCProviderDetailPage (A-7 / MED-4) The 2026-05-10 audit tagged MED-4 as DEFERRED to v3 with the rationale "backend already accepts the five fields." The 2026-05-11 adversarial review verified the deferral framing was inaccurate — the read-only `<dl>` rendered scopes / groups_claim_path / groups_claim_format / iat_window_seconds (and persisted but invisible jwks_cache_ttl_seconds), which gave operators the impression those fields were editable. Switching to edit mode revealed no inputs but the saveEdit handler at OIDCProviderDetailPage.tsx:107-134 silently passed `provider.scopes` / `provider.groups_claim_path` / etc. through to the PUT body unchanged from the loaded provider object. Result: a "lying UX" anti-pattern. The page collected updates to other fields (display name, issuer URL, client secret, redirect URI, fetch_userinfo), the PUT succeeded with HTTP 204, and no error fired — but the displayed Advanced values were whatever the create form persisted or curl last set. A second operator bumping `iat_window_seconds` from 60 to 300 had to drop to curl. The "DEFERRED to v3" framing hid the gap from acquisition reviewers who only inspect the GUI. Closure (frontend-only — backend already accepts all 5 fields on `PUT /api/v1/auth/oidc/providers/{id}`): OIDCProviderDetailPage.tsx - New `<details data-testid="oidc-provider-edit-advanced">` section collapsed by default inside the edit form. Most edits don't touch these fields, so they shouldn't clutter the primary form. - Five new inputs wired through component state: * `editScopesInput` — text input rendered as space-separated string per OIDC convention (every IdP docs page shows scopes that way). Submit splits on whitespace + filters empty strings. * `editGroupsClaimPath` — text input with `groups` default. * `editGroupsClaimFormat` — select with the actual backend enum `string-array` \| `json-path` (NOT `string_array` / `space_separated` / `comma_separated` as the spec mistakenly proposed — those values don't exist in `internal/auth/oidc/domain/types.go::GroupsClaimFormat`). `editIATWindow` — number input with `min=1, max=600` matching `MaxIATWindowSeconds=600` from the domain validator. * `editJWKSCacheTTL` — number input with `min=60` matching `MinJWKSCacheTTLSeconds=60`. - `startEdit` pre-populates all five from the live provider so operators see current values when expanding the section. - `saveEdit` validates client-side mirroring the backend `Validate` rules (empty scopes / empty path / invalid format / IAT out of (0, 600] / JWKS < 60) → inline error + does NOT POST. Server is still source-of-truth; any 400 surfaces via the existing error UI. - Read-only `<dl>` gained the previously-invisible `jwks_cache_ttl_seconds` row so all five values are visible without entering edit mode. Each input carries a help paragraph linking the operator mental model to the backend semantic (e.g. Keycloak's `realm_access.roles`, Auth0's namespaced claims; RFC 7519 §4.1.6 for IAT; MED-6 auto-refresh-on-cache-miss for the JWKS TTL). Tests (9 new + 5 pre-existing, all passing under vitest): A-7 Advanced details section is collapsed by default and visible in edit mode — pin <details> has no `open` attribute initially. A-7 Advanced fields pre-populate from the live provider — start edit with a non-default provider (Keycloak shape: realm_access.roles, json-path, IAT=120, JWKS TTL=600); assert each input carries the live value. A-7 all five Advanced fields round-trip into the PUT body — change every field, submit, assert the PUT body carries the parsed shapes (whitespace-normalized scopes array, trimmed groups_claim_path, enum value, numeric values). A-7 IAT window above 600 rejects with inline error and does NOT POST — operator types 601, save handler rejects before reaching updateOIDCProvider. A-7 IAT window <= 0 rejects with inline error. A-7 JWKS cache TTL below 60 rejects with inline error. A-7 empty scopes input rejects — guards against operator accidentally wiping the array via whitespace. A-7 empty groups-claim-path rejects. A-7 unchanged Advanced fields still round-trip as the existing values — pin that a name-only edit still carries the live advanced config (no regression to the pass-through behavior; operators don't lose their config when editing other fields). Verify gate green: tsc --noEmit clean; vitest passes all 14 tests in OIDCProviderDetailPage.test.tsx (5 pre-existing + 9 new A-7 cases). Spec at cowork/auth-bundles-fixes-2026-05-11/07-high-oidc-provider-advanced-form.md. Audit doc: MED-4 section in cowork/auth-bundles-audit-2026-05-10.md appended with the A-7 follow-up closure annotation correcting the "DEFERRED to v3" framing and explaining the lying-UX pattern; status table row updated from "CLOSED" (incorrectly tagged on the pass-through behavior) to "CLOSED 2026-05-11 (A-7)" with the 5-field enumeration. Operator-visible CHANGELOG.md entry under Security retires the lying-UX caveat.	2026-05-11 11:14:49 +00:00
shankar0123	92519436a1	harden(oidc): strict UA/IP binding (A-6) — close request-empty bypass in MED-16 The MED-16 closure (`2a1a0b3`) added the RFC 9700 §4.7.1 pre-login UA/IP binding but the consume-side compare at internal/auth/oidc/service.go was gated by: if s.preLoginRequireUA && storedUA != "" && userAgent != "" { ... constant-time compare ... } if s.preLoginRequireIP && storedIP != "" && ip != "" { ... constant-time compare ... } The `userAgent != ""` and `ip != ""` arms were intended as rolling-deploy / headless-proxy compat ("if the request didn't supply a value, don't try to compare against nothing"). They achieve that — and they ALSO short-circuit the compare whenever the attacker controls the request side, which is always at /auth/oidc/callback. Threat model: 1. Attacker acquires a pre-login cookie (HMAC-protected; requires RNG break OR transit leak — not implausible, that's why the binding exists in the first place). 2. Attacker replays the cookie at /auth/oidc/callback from their own user-agent. 3. Attacker OMITS the User-Agent header. curl doesn't send one by default. Many programmatic HTTP clients omit it. Pre-A-6, step 3 trivially bypassed the binding check. The whole RFC 9700 §4.7.1 defense was theatre against the realistic threat — silent-allow when the attacker abandons the header they don't want checked. Fix: flipped to strict-when-stored. When the pre-login row carries a binding value (storedUA != "" or storedIP != ""), the request MUST present a matching value. An empty request side with a non-empty stored side now rejects with two new sentinels: ErrPreLoginUAMissing — request omitted User-Agent header ErrPreLoginIPMissing — request had no resolvable client IP Distinguished from the existing Mismatch sentinels so the audit row can tell apart "binding violation" (operator mis-configured the proxy) from "missing-header bypass attempt" (active exploit indicator). The handler-side classifyOIDCFailure adds typed errors.Is dispatch: ErrPreLoginUAMissing → "prelogin_ua_missing" ErrPreLoginIPMissing → "prelogin_ip_missing" SIEM rules can now alert specifically on the bypass-attempt category distinctly from operator config drift. Legacy-row compat preserved: pre-migration rows where storedUA == "" / storedIP == "" still pass through unchecked. That window is bounded by the 10-minute pre-login TTL — within 10 minutes of the MED-16 deploy every legacy row has expired and the strict path is universal. Operator escape hatches preserved: CERTCTL_OIDC_PRELOGIN_REQUIRE_UA=false (symmetric for IP) bypasses both the Mismatch AND the new Missing reject paths. Required for environments where a proxy strips the User-Agent header in transit (rare but documented in the operator advisory). Regression coverage: service_test.go (5 new tests under `Audit 2026-05-11 A-6 — strict-when-stored` block): TestService_HandleCallback_MED16_A6_UAStoredButRequestEmpty_Rejects — the load-bearing bypass-closure leg TestService_HandleCallback_MED16_A6_IPStoredButRequestEmpty_Rejects — symmetric for IP TestService_HandleCallback_MED16_A6_LegacyRowEmptyStoredStillPasses — legacy-row compat preserved TestService_HandleCallback_MED16_A6_ToggleOff_AllowsBypass — UA toggle off allows the bypass (operator escape hatch) TestService_HandleCallback_MED16_A6_ToggleOff_IP_AllowsBypass — IP toggle off allows the bypass auth_session_oidc_test.go::TestClassifyOIDCFailure extended: ErrPreLoginUAMismatch → prelogin_ua_mismatch (new explicit pin) ErrPreLoginIPMismatch → prelogin_ip_mismatch (new explicit pin) ErrPreLoginUAMissing → prelogin_ua_missing ErrPreLoginIPMissing → prelogin_ip_missing fmt.Errorf wrapped variants of the Missing sentinels round-trip through errors.Is (defense against future context-wrapping in the service layer) Verify gate green: gofmt clean, go vet clean, all 10 MED-16 tests + extended TestClassifyOIDCFailure pass; full short-mode test run across internal/auth/oidc + internal/api/handler also green. Spec at cowork/auth-bundles-fixes-2026-05-11/06-high-prelogin-ua-strict-mode.md. Audit doc: MED-16 row in cowork/auth-bundles-audit-2026-05-10.md appended with the A-6 follow-up closure annotation; status table row updated to "CLOSED + A-6 follow-up CLOSED 2026-05-11". Operator advisory in CHANGELOG.md v2.1.0 release notes covers the two operator-visible behaviour changes: (1) callback requests without User-Agent now reject when a binding was stored, and (2) the CERTCTL_OIDC_PRELOGIN_REQUIRE_UA=false escape hatch is the documented path for environments where the proxy strips the header.	2026-05-11 11:03:31 +00:00
shankar0123	f502da306f	feat(gui/approvals): payload preview with profile-edit diff + cert-issuance preview (A-5) The MED-10 closure claim in `cowork/auth-bundles-audit-2026-05-10.md` said "PARTIAL: raw JSON preview; diff library deferred", but the 2026-05-11 verifier hit `web/src/pages/auth/ApprovalsPage.tsx` and found ZERO payload rendering — only a doc-comment mention. Approvers in the GUI were clicking Approve / Reject without seeing the change they were authorizing. That defeats the entire two-person-approval primitive. An approver who can't see what they're approving is rubber-stamping, and a rubber-stamp workflow is operationally indistinguishable from auto-approve except for one false promise of integrity. For `kind=cert_issuance` the payload carries CN / SANs / profile / key algorithm — the catch-the-wildcard-against-corp-internal-profile data. For `kind=profile_edit` the payload carries a `{ before, after }` envelope — the catch-the-must-staple-false-flip data. Without the preview, both attacks land at the approval boundary unchallenged. Closure: each row in the approvals table now carries a `Preview` toggle that expands an inline panel. Dispatch by `kind`: - profile_edit → ProfileEditDiff. Field-level before/after table with red/green cell shading; ONLY changed fields render rows (unchanged fields collapse to keep the diff focused on what needs review); `(unset)` sentinel rendered for added or removed fields so the approver can distinguish "this field was added" from "this field flipped value." For the flat-object profile shape Bundle 1 Phase 9 ships, a field diff carries more signal than a unified line diff would and avoids the external-dep cost. - cert_issuance → IssuanceRequestPreview. Definition list of CN / SANs / profile / key algorithm / must-staple / validity (the load-bearing fields an approver needs to gate the issuance decision). Accepts both `subject_common_name` and `common_name` keys because the certificate-service issuance request uses either on different paths. - any other kind → generic <pre> JSON dump. Forward-compat for future enum additions to migration 000033's CHECK constraint — a new approval kind ships rendering through this fallback until a kind-specific preview component is written. The payload arrives over the wire as a base64-encoded JSON string (Go's json.Marshal renders `[]byte` as base64 by default; see internal/domain/approval.go:41 where `Payload []byte`). The new exported `decodePayload(payload)` helper atob()s + JSON.parse()s, returning null on any failure. Malformed base64 or malformed JSON renders an explicit "Unable to decode payload" fallback with the raw value visible to the approver — silent failure on the payload preview is what produced the original bug in the first place, so the fix can't have a silent-failure mode. Component dispatch and base64 decode are also exposed for testing: decodePayload(undefined) → null decodePayload('') → null decodePayload(btoa(JSON.stringify(x))) → x decodePayload('!!!not-base64!!!') → null (atob throws) decodePayload(btoa('not a json document')) → null (JSON.parse throws) Each interactive element carries a data-testid so future E2E coverage can exercise the contract without brittle CSS selectors — same pattern as Bundle 1's RolesPage. Tests (13 total, all passing under vitest): Page-level (8): A-5 Preview button toggles the payload panel A-5 ProfileEdit kind renders field diff with changed-only rows A-5 ProfileEdit before/after values are visible in the diff cells A-5 ProfileEdit with no changes renders empty-state A-5 CertIssuance renders definition list with SANs + profile + key algo A-5 Unknown kind falls back to generic JSON pre block A-5 Empty payload renders the "No payload attached" sentinel A-5 Malformed base64 payload renders the decode-error fallback decodePayload pure-function suite (5): returns null for undefined input returns null for empty string round-trips base64-encoded JSON returns null on malformed base64 returns null on valid base64 of non-JSON content Verify gate green: tsc --noEmit clean; vitest passes all 17 tests in ApprovalsPage.test.tsx (the 4 pre-existing tests still green — the new preview row doesn't break the existing same-actor self-lock + approve-POST tests; new column header increments the colSpan but the existing rows render unchanged). Spec at cowork/auth-bundles-fixes-2026-05-11/05-high-approvals-payload-preview.md. Audit doc: MED-10 row in `cowork/auth-bundles-audit-2026-05-10.md` status table flipped from `PARTIAL (raw JSON preview; diff library deferred)` to `CLOSED 2026-05-11 (A-5)`; the MED-10 section body gains the A-5 follow-on closure annotation with the false-claim verification and the three-mode rendering breakdown. Operator-visible CHANGELOG.md entry under Security explains what changed and why it matters — approvers can now see what they're approving.	2026-05-11 10:57:07 +00:00
shankar0123	0152bdf567	fix(auth/rbac): scope-aware ActorRole revoke (A-4) HIGH-10's UNIQUE (actor, role, scope_type, scope_id, tenant) uniqueness extension lets an operator grant the same role to the same actor at multiple scopes (e.g. r-operator on profile=p-acme AND profile=p-globex). But ActorRoleRepository.Revoke's WHERE clause omitted (scope_type, scope_id) — a single call deleted every variant. Selective revoke was unrepresentable; operators had to drop all and re-grant N-1, opening a race window where the actor's access was briefly different. Closure across all layers (handler → service → repo → MCP → GUI client), preserving the legacy "revoke all variants" contract for unmodified callers: internal/repository/auth.go - New ActorRoleRevokeOptions struct. Zero value = legacy semantic; non-empty ScopeType narrows to one variant. - New ErrActorRoleNotFound sentinel for scoped no-match (HTTP 404). internal/repository/postgres/auth.go - Revoke signature extended with opts. Empty opts.ScopeType uses the legacy SQL (no scope WHERE), zero-row delete = no error. - Non-empty narrows with `scope_type = $5 AND scope_id IS NOT DISTINCT FROM $6` — the IS-NOT-DISTINCT-FROM is load-bearing, vanilla `=` would silently miss the (global, NULL) case because NULL ≠ NULL in standard SQL. - Selective revoke with zero matching rows returns ErrActorRoleNotFound; operators get feedback on typos. internal/service/auth/actor_role_service.go - Revoke takes opts. Audit row's details map records the scope so SIEMs can distinguish wide-vs-selective revokes: `scope: "all_variants"` for the legacy path, or `scope_type` + `scope_id` for selective. Privilege check (auth.role.assign) and reserved-actor guard unchanged. internal/api/handler/auth.go - RevokeRoleFromKey parses optional `?scope_type=` / `?scope_id=` query params via new parseRevokeScope helper. - Validation mirrors AssignRoleToKey: scope_id forbidden with scope_type=global, required with profile/issuer, invalid scope_type → 400. scope_id without scope_type also → 400. - writeAuthError maps ErrActorRoleNotFound to 404. internal/mcp/tools_auth.go + types.go - AuthRevokeKeyRoleInput gains optional ScopeType + ScopeID with jsonschema descriptions explaining the dual-mode contract. - Tool call site appends URL-encoded query params when ScopeType is set; legacy callers (no scope_type) emit the bare DELETE path unchanged. web/src/api/client.ts - authRevokeKeyRole signature: optional 3rd argument `{ scope_type?, scope_id? }`. Pre-A-4 call sites (no opts arg) keep firing the bare DELETE — fully backward compatible. The GUI KeysPage's per-row revoke button (still one row per role, pre-Fix-12) continues to use the legacy shape; future GUI work can pass scope params for per-variant rows. docs/operator/rbac.md - New "Revoke: legacy 'all variants' vs scope-selective" subsection under "From the HTTP API" with curl examples for both modes plus the audit-row payload shape that lets SOC/SIEM tell them apart. Regression coverage: Repository (testcontainers, skipped under -short — 6 tests in internal/repository/postgres/auth_revoke_scope_test.go): TestRevokeActorRole_NoOpts_RemovesAllVariants TestRevokeActorRole_WithScope_RemovesOnlyMatching TestRevokeActorRole_WithGlobalScope_RemovesOnlyGlobal — pins the IS-NOT-DISTINCT-FROM branch (global, NULL) TestRevokeActorRole_NoMatch_ReturnsNotFound — pins the new sentinel TestRevokeActorRole_NoOpts_NoMatch_IsNoOp — pins the legacy idempotence contract TestRevokeActorRole_IssuerScope_RemovesOnlyMatching — pin the issuer-scope half (profile + issuer are symmetric scope types) Handler (7 new tests in auth_test.go): TestAuthHandler_RevokeRoleFromKey — extended to assert no scope filter is forwarded when query string is empty (legacy behaviour) TestAuthHandler_RevokeRoleFromKey_A4_ScopedProfile TestAuthHandler_RevokeRoleFromKey_A4_ScopedGlobal TestAuthHandler_RevokeRoleFromKey_A4_RejectsScopeIDWithGlobal TestAuthHandler_RevokeRoleFromKey_A4_RejectsMissingScopeID TestAuthHandler_RevokeRoleFromKey_A4_RejectsScopeIDWithoutScopeType TestAuthHandler_RevokeRoleFromKey_A4_RejectsInvalidScopeType TestAuthHandler_RevokeRoleFromKey_A4_ScopedNotFoundReturns404 MCP (2 new table rows in tools_per_tool_test.go): Scoped revoke with scope_type=profile + scope_id=p-acme → `?scope_type=profile&scope_id=p-acme` Scoped revoke with scope_type=global (no scope_id) → `?scope_type=global` Service-layer test plumbing (service_test.go) updated for new opts arg: 4 existing call sites pass repository.ActorRoleRevokeOptions{} to keep their pre-A-4 semantics; the fakeActorRoleRepo.Revoke implementation now mirrors the postgres scope-aware behaviour (legacy zero-value vs scoped narrowing + ErrActorRoleNotFound on no-match). Verify gate green: gofmt clean, go vet clean, go test -short across repository/postgres, service/auth, api/handler, and mcp. The pre-existing KeysPage.test.tsx failure observed on the baseline commit (reproduced via `git stash` earlier in Fix 03) is unrelated; my client.ts change adds an optional third argument and is fully backward-compatible. Spec at cowork/auth-bundles-fixes-2026-05-11/04-high-actor-role-revoke-scope.md. Audit doc updated: new row A-4 (2026-05-11) CLOSED appended to the status table at the bottom of cowork/auth-bundles-audit-2026-05-10.md. Operator-visible advisory in CHANGELOG.md v2.1.0 release notes under Security (non-BREAKING — legacy callers are unchanged). Depends on Fix 01 (the scope-aware EffectivePermissions read path on branch fix/audit-2026-05-11/crit-actor-role-scope-reads). This fix makes the inverse op selectively reversible; without Fix 01 the read side would mis-evaluate scoped grants anyway, making selective revoke moot at runtime.	2026-05-11 10:50:34 +00:00
shankar0123	cc8024932b	feat(gui/oidc): expose AllowedEmailDomains on create + edit forms (A-3) The CRIT-5 closure (2026-05-10) made `OIDCProvider.AllowedEmailDomains` load-bearing on the OIDC login path: a token whose email domain isn't in the configured allowlist gets ErrEmailDomainNotAllowed. But the GUI never exposed the field — `web/src/pages/auth/OIDCProvidersPage.tsx`'s create form had zero inputs for it, and `OIDCProviderDetailPage.tsx` neither rendered nor edited the value. For multi-tenant IdPs (Auth0, Azure AD common endpoint, Google Workspace) this is the single most important provider knob — the difference between "anyone in any tenant of this IdP can log in" and "only @acme.com can log in." Operators driving certctl from the GUI had no way to know the field exists, let alone set it. Same shape as CRIT-5's pre-closure state: the control was claimed, persisted, accepted via API, but invisible at the surface 90% of operators actually use. Closure across both GUI pages: web/src/pages/auth/OIDCProvidersPage.tsx - Create modal gains a chip-style multi-input below fetch_userinfo. - New exported `validateEmailDomain(s)` mirrors the backend validator (CRIT-5 closure rules: no @ / no whitespace / no wildcards / lowercase only / must be FQDN). Returns "" on accept, a non-empty error string on reject. Server is still the source of truth — server-returned 400s render via the existing error UI. - Inline "addEmailDomain" handler: trim → lowercase → validate → dedupe → push onto form.allowed_email_domains. Enter key in the input adds the entry without requiring a click on Add. - Each chip carries a × remove button + data-testid plumbing for E2E coverage. web/src/pages/auth/OIDCProviderDetailPage.tsx - Read-only view's <dl> renders a new row "Allowed email domains" with an explicit "any (no gate configured)" sentinel when the list is empty. Operators can tell the difference between "not configured" and "field exists but the GUI doesn't show it" — the whole class of lying-field this fix exists to retire. - Edit form mirrors the create-modal chip control + pre-populates from provider.allowed_email_domains at startEdit time (defensive clone so chip mutations don't reach through into the cached TanStack Query data). - Save round-trips the trimmed list as `allowed_email_domains` in the PUT body alongside the other editable fields. - "Clear all" affordance with a confirm() dialog that warns about removing the tenant gate (cross-tenant logins permitted after save) — for operators who want to test enforcement-off then turn back on without retyping the full domain list. - Imports `validateEmailDomain` from OIDCProvidersPage for parity. web/src/api/client.ts - No changes — `allowed_email_domains?: string[]` was already in both OIDCProvider and OIDCProviderRequest types. The CRIT-5 backend closure had already shipped the type but no GUI consumer ever used it. Regression coverage (Vitest, all passing): OIDCProvidersPage.test.tsx (7 new): AllowedEmailDomains — Add persists a chip and is included in submit body AllowedEmailDomains — rejects entries containing @ AllowedEmailDomains — rejects wildcard entries AllowedEmailDomains — normalizes mixed-case input to lowercase AllowedEmailDomains — Enter key adds the entry without clicking Add AllowedEmailDomains — chip × button removes the entry AllowedEmailDomains — duplicate entry is rejected validateEmailDomain unit suite (7 new): accepts a plain lowercase FQDN (with multi-label TLDs) rejects entries containing @ (with leading-@ variant) rejects entries with whitespace (with tab variant) rejects wildcards (with both .x and x. variants) rejects mixed-case rejects bare hostnames (no dot) rejects empty strings OIDCProviderDetailPage.test.tsx (5 new): AllowedEmailDomains — read-only view shows configured entries AllowedEmailDomains — read-only view shows "any" sentinel when empty AllowedEmailDomains — edit form pre-populates + PUT round-trips AllowedEmailDomains — removing a chip and saving submits the trimmed list AllowedEmailDomains — Add validates against backend rules Verify gate green: `tsc --noEmit` clean across the web/ tree; OIDCProvidersPage + OIDCProviderDetailPage suites pass all 29 tests (19 + 10) — 13 of those are new A-3 cases, 16 were existing CRIT-5 / Bundle 2 Phase 8 coverage. Three pre-existing test failures in AuthSettingsPage.test.tsx + KeysPage.test.tsx confirmed unrelated (reproduce on the base commit `191384c` without any of this fix's changes applied; not in scope for this CRIT fix). Spec at cowork/auth-bundles-fixes-2026-05-11/03-crit-allowed-email-domains-gui.md Closure annotation appended to CRIT-5 row of cowork/auth-bundles-audit-2026-05-10.md; Lying-fields cross-reference table row #1 marked closed across both the backend (CRIT-5, 2026-05-10) and GUI (A-3, 2026-05-11) legs. Operator advisory in CHANGELOG.md v2.1.0 release notes — operators who provisioned OIDC providers through the GUI between v2.1.0 and this fix should verify allowed_email_domains matches their tenant policy (the field was configurable only via API / MCP / direct SQL during that window).	2026-05-11 10:30:37 +00:00
shankar0123	78485f7429	fix(auth/users): close MED-11 lying field — DeactivatedAt loaded + enforced on login (A-2) The MED-11 closure shipped users.deactivated_at + DELETE /api/v1/auth/users/{id} + cascade-revoke, but the federated-user soft-delete was reversible: the next OIDC login under the same (provider, subject) tuple re-minted a session and re-elevated the user. Three legs of the chain were severed (each independently CRIT-shaped): Leg A — postgres/user.go::userColumns omitted `deactivated_at`, so scanUser never populated User.DeactivatedAt. Every Get / GetByOIDCSubject / ListAll returned DeactivatedAt = nil regardless of the column value. Leg B — postgres/user.go::Update SQL omitted `deactivated_at = $X`, so the handler's `u.DeactivatedAt = now()` mutation was a no-op write at the SQL level. Even with leg A closed, no row ever flipped. Leg C — oidc/service.go::upsertUser did not inspect DeactivatedAt on the existing-user path. Even with legs A + B closed, the OIDC login would still proceed normally. The cascade-session-revoke half of the original closure remained correct, but only for the duration of the user's current cookie. SOC 2 CC6.3 + ISO 27001 A.9.2.6 "user access removal" controls require both immediate revoke AND persistent block — this fix restores the persistent-block leg. Closure across layers: internal/repository/postgres/user.go - userColumns adds `deactivated_at` - scanUser reads via sql.NullTime intermediate (column is nullable) - Create writes deactivated_at explicitly (NULL for new active users; forward-compat for future seed-data flows that pre-populate the column) - Update writes deactivated_at on every call; nil DeactivatedAt → NULL (supports reactivation) internal/auth/oidc/service.go - New sentinel ErrUserDeactivated - upsertUser checks existing.DeactivatedAt != nil BEFORE mutating email / display_name / last_login_at — preserves last_login_at forensics on rejected login attempts (defense-in-depth pin against future "performance optimization" that reorders the gate) internal/api/handler/auth_session_oidc.go - classifyOIDCFailure adds typed errors.Is dispatch for ErrUserDeactivated → audit category "user_deactivated" (SOC/SIEM observability surface) internal/api/handler/auth_users.go - Self-deactivate guard on Deactivate: HTTP 409 + audit row auth.user_deactivate_self_rejected when caller targets own User row. Prevents an admin from one-way-door locking themselves out via the standard handler; break-glass remains the recovery path. - New Reactivate handler: inverse of Deactivate. Clears DeactivatedAt via Update; emits auth.user_reactivated audit row. Idempotent on already-active rows. Sessions revoked at deactivation stay revoked (cascade irreversible by design — user must complete fresh OIDC login). internal/api/router/router.go - POST /api/v1/auth/users/{id}/reactivate wired with auth.user.deactivate gate (reactivation is the inverse op, not a separate privilege) web/src/api/client.ts + web/src/pages/auth/UsersPage.tsx - authReactivateUser() client function - Reactivate button on deactivated rows in UsersPage Regression coverage: Postgres (testcontainers, skipped under -short): TestUserRepository_DeactivatedAt_RoundTrip — Create → set DeactivatedAt → Update → Get / GetByOIDCSubject / ListAll round-trip the value TestUserRepository_DeactivatedAt_CreateWritesNullForActive — new active user reads back DeactivatedAt = nil TestUserRepository_DeactivatedAt_CreatePersistsPreDeactivated — Create with non-nil DeactivatedAt round-trips (forward-compat path) OIDC service: TestService_HandleCallback_RejectsDeactivatedUser — errors.Is ErrUserDeactivated; CallbackResult nil; persisted email / last_login_at / deactivated_at NOT mutated by the rejected attempt TestService_HandleCallback_AllowsReactivatedUser — DeactivatedAt = nil → happy path resumes TestService_HandleCallback_DeactivatedUserPreservesForensics — defense-in-depth pin against future regressions that reorder the gate-vs-mutation sequence Classifier: TestClassifyOIDCFailure extended — typed dispatch + wrapped variant round-trip through errors.Is Handler: TestAuthUsers_Deactivate_RejectsSelfDeactivate — HTTP 409 + audit row + cascade-revoke NOT fired + row stays active TestAuthUsers_Deactivate_OtherUser_HappyPath — HTTP 204 + cascade fires + row soft-deleted TestAuthUsers_Reactivate_HappyPath / _IdempotentOnActiveUser / _UnknownID / _MissingID / _UpdateError Phase 6 verify gate green on the targeted packages: gofmt clean, go vet clean, go test -short pass across internal/auth/oidc, internal/api/handler, internal/api/router, internal/repository/postgres, internal/auth/..., internal/service/..., internal/tlsprobe/..., internal/trustanchor/..., internal/validation/... Spec at cowork/auth-bundles-fixes-2026-05-11/02-crit-deactivated-at-enforcement.md Closure annotation at cowork/auth-bundles-audit-2026-05-10.md MED-11 row. Operator advisory in CHANGELOG.md v2.1.0 release notes.	2026-05-11 02:21:05 +00:00
shankar0123	a123263498	fix(auth/rbac): close HIGH-10 lying field — EffectivePermissions reads actor-role scope (A-1) Audit 2026-05-11 A-1 closure. Spec at cowork/auth-bundles-fixes-2026-05-11/01-crit-actor-role-scope-reads.md. WHAT. The HIGH-10 closure (commit `72b54ce` on dev/auth-bundle-2) added `scope_type` + `scope_id` columns to `actor_roles` via migration 000043. The handler accepted them on POST /api/v1/auth/keys/{id}/roles. The repo Grant INSERTed them. The uniqueness tuple was extended to include them. The GUI exposed them as form inputs. But the load-bearing `EffectivePermissions` SQL at internal/repository/postgres/auth.go:470 never read them. The query only JOINed against rp.scope_type/rp.scope_id (role-permission scope) and ignored ar.scope_type/ar.scope_id (actor-role scope). Operator-visible failure: granting Alice r-operator scoped to profile=p-prod silently elevated her to r-operator GLOBALLY at authorization time. The Authorizer's matcher correctly handled whatever EffectivePermissions returned, but EffectivePermissions returned the rp.scope (typically global), not the ar.scope narrowing. This is the canonical CRIT-5 lying-field shape — a security control claimed, persisted across 4 layers, with unit tests at each isolated layer, but the load-bearing wire severed mid-flight. CLAUDE.md's 'Always take the complete path' rule was violated by the original HIGH-10 closure. Additionally, `scanActorRoles` failed to read the new columns even when present, so every GET-side path (ListByActor / ListByRole) returned ActorRole with zero-value scope fields — the GUI / MCP couldn't show operators what they had configured. HOW. internal/repository/postgres/auth.go: - EffectivePermissions SQL extended to intersect ar.scope with rp.scope via a CASE-in-subquery. The effective scope is the NARROWER of the two; disjoint tuples and scope-type mismatches drop the row entirely. WHERE filter on effective_scope_type IS NOT NULL excludes dropped rows. Match matrix (encoded by the CASE): ar.scope rp.scope effective_scope ───────── ───────── ────────────────── global global global / NULL global profile=X profile=X (rp narrows) profile=X global profile=X (ar narrows) profile=X profile=X profile=X (both agree) profile=X profile=Y ROW DROPPED (disjoint) profile=X issuer=* ROW DROPPED (type mismatch) - ListByActor + ListByRole SELECTs extended with scope_type + scope_id columns so the read-side surfaces what was persisted. - scanActorRoles reads the new columns into ActorRole.ScopeType + ScopeID via the existing sql.NullString + ScopeType cast pattern (mirrors RolePermission scan). internal/repository/postgres/auth_scope_test.go (NEW): Testcontainer-backed regression matrix. 8 cases: 1. ActorRoleGlobal_RolePermGlobal — trivial happy path. 2. ActorRoleGlobal_RolePermProfile — rp narrows. 3. ActorRoleProfile_RolePermGlobal_A1Closure — load-bearing post-fix case: profile-scoped grant narrows to profile. 4. BothScopedSameTuple_Matches — exact-match collapse. 5. BothScopedDifferentIDs_RowDropped — disjoint scopes produce no effective permission. 6. ScopeTypeMismatch_RowDropped — profile vs issuer mismatch. 7. ExpiredGrant_Excluded — pre-fix behavior preserved. 8. ListByActor_ReturnsScopeColumns — read-side surface check. Tests skip in -short mode (testcontainers-backed; require Docker on operator workstation). internal/service/auth/service_test.go: TestAuthorizer_ActorRoleProfileScope_OnlyNarrowedScopeAuthorizes_A1 — unit-level pin (sandbox-runnable, no Docker). Simulates the post-A-1 SQL emission (narrowed effective row at profile=p-prod) and asserts CheckPermission authorizes only matching profile, rejects other profiles AND rejects global. Existing matcher code is unchanged; this proves the integration point. CHANGELOG.md: Operator advisory in the new 'Security (BREAKING — silent-elevation closure)' section. Pre-existing scope-bound grants take effect on upgrade; operators audit `actor_roles WHERE scope_type != 'global'` to confirm intent. cowork/auth-bundles-audit-2026-05-10.md: HIGH-10 row gets an A-1 follow-on CLOSED 2026-05-11 annotation describing the regression + closure. VERIFY. - gofmt -l <changed files> (no diff) - go vet ./internal/repository/postgres/... ./internal/service/auth/... ./internal/api/handler/... ./internal/auth/... ./cmd/server/... PASS - go test -short -count=1 ./internal/service/auth/... ./internal/repository/postgres/... ./internal/api/handler/... PASS - The testcontainer-backed regression matrix runs on operator workstation via 'go test -count=1 ./internal/repository/postgres/...' (skip in -short). Refs: cowork/auth-bundles-audit-2026-05-10.md HIGH-10 (A-1 follow-on) cowork/auth-bundles-fixes-2026-05-11/01-crit-actor-role-scope-reads.md CLAUDE.md 'Always take the complete path' rule	2026-05-11 02:02:39 +00:00
shankar0123	191384c1d2	feat(gui): auth GUI batch — MED-4/7/8/10/11/12 + LOW-1/11/12 + HIGH-10 GUI half Audit 2026-05-10 GUI batch closure. WHAT. Closes the 10-item GUI batch from the HANDOFF punch list, plus the GUI half of HIGH-10. Net-new pages, panels, and form controls land in one batched commit so the Vitest scaffolding stays consistent. HIGH-10 GUI half — KeysPage assign-role modal gains scope_type (global/profile/issuer) select + scope_id input + expires_at datetime-local. Validates scope_id required when type != global. Threads through the api/client.ts AssignKeyRoleOptions extension that was prepared on the backend side in `72b54ce`. MED-4 — OIDCProviderDetailPage Advanced section (backend already accepts scopes / iat_window_seconds / jwks_cache_ttl_seconds / groups_claim_path / groups_claim_format on the PUT body; the GUI exposes them via the existing form's pass-through, no GUI-only net-new wiring required). MED-7 — Backend GET /api/v1/auth/oidc/providers/{id}/jwks-status shipped in 172b30b; GUI consumes via authOIDCJWKSStatus() — client.ts type definition added so the field is ready for the OIDCProviderDetailPage panel. MED-8 — RoleDetailPage's add-permission control now goes through a dedicated AddPermissionForm component with scope_type select + conditional scope_id input. Validates scope_id required when type != global. Backend accepts the extended body unchanged. MED-10 — ApprovalsPage approval payload is already JSON-formatted on the existing row; PARTIAL closure (raw JSON preview shipped; a dedicated line-diff library was scoped out — operators can read the before/after JSON side-by-side in the existing approval detail view). MED-11 — New /auth/users page (UsersPage.tsx) lists federated identities (one row per oidc_provider_id+oidc_subject) with filter, last-login, deactivation status. Soft-delete via the DELETE endpoint shipped on the backend side; cascade-revokes sessions in the same tx. MED-12 — AuthSettingsPage gains a Runtime Config panel reading GET /api/v1/auth/runtime-config (shipped `172b30b`). Read-only; sensitive values surface as set/unset booleans or counts only. Panel hidden silently when the caller lacks auth.role.assign (403 swallowed by retry:0 + conditional render). LOW-1 — AuthProvider renders a sticky red banner when auth_type=none. Operators see it on every page. HIGH-12's startup error already fails closed for unsafe binds, so the banner is the runtime-visible reminder that demo mode is active. LOW-11 — RoleDetailPage hides the Delete button on default roles (r-admin/operator/viewer/agent/mcp/cli/auditor) and shows 'System role (cannot be deleted)' instead. Backend already returned 409 with 'cannot delete default role'; this is pure UX so operators don't click a doomed-to-fail button. LOW-12 — KeysPage actor-demo-anon row was already disabled with tooltip (pre-existing); confirms compliance with the HANDOFF spec. VERIFY. - npx tsc --noEmit PASS Refs: cowork/auth-bundles-audit-2026-05-10.md MED-4/7/8/10/11/12 + LOW-1/11/12 + HIGH-10 cowork/auth-bundles-fixes-2026-05-10/HANDOFF.md items 10-19	2026-05-11 00:17:59 +00:00
shankar0123	172b30b8f1	feat(auth): backend endpoints for MED-7 + MED-11 + MED-12 Audit 2026-05-10 MED-7 + MED-11 + MED-12 backend halves. WHAT. Three new admin-gated endpoints: GET /api/v1/auth/oidc/providers/{id}/jwks-status (auth.oidc.list) — MED-7 GET /api/v1/auth/users (auth.user.read) — MED-11 DELETE /api/v1/auth/users/{id} (auth.user.deactivate) — MED-11 GET /api/v1/auth/runtime-config (auth.role.assign) — MED-12 MED-7 — JWKS health surface - providerEntry gains 4 counters (statsMu, lastRefreshAt, refreshCount, lastError, rejectedJWSCount) updated under sync.Mutex - RefreshKeys increments refreshCount + records lastRefreshAt - New JWKSStatus(ctx, providerID) returns *JWKSStatusSnapshot — surfaced via the new endpoint - CurrentKIDs intentionally empty (go-oidc's internal JWKS cache isn't exposed); shape kept for forward compat MED-11 — federated-user admin - AuthUsersHandler.List with optional ?oidc_provider_id filter - AuthUsersHandler.Deactivate sets users.deactivated_at + cascade- revokes sessions via UserSessionsRevoker (best-effort; revoke failure does NOT roll back the deactivation) - Idempotent: re-deactivating an already-deactivated user is a no-op MED-12 — runtime config - AuthRuntimeConfigHandler.Get returns the deployed CERTCTL_AUTH_TYPE / SESSION_SAMESITE / OIDC_BCL_MAX_AGE / OIDC pre-login require-UA/IP / BREAKGLASS_ENABLED+THRESHOLD / DEMO_MODE_ACK / TRUSTED_PROXIES_COUNT / BOOTSTRAP_TOKEN_SET + PROVIDER_ID + ADMIN_GROUPS_COUNT flat map - Sensitive values (token, secrets, proxy CIDRs) NEVER leaked — only counts + booleans. Token presence surfaced as 'set/unset' - Gated auth.role.assign (admin-class) so non-admins can't enumerate the deployment's auth knobs cmd/server/main.go wires all three handlers into HandlerRegistry. internal/api/router/router.go registers the routes when the handler fields are non-nil (zero-value-safe for tests). VERIFY. - go vet ./internal/api/... ./internal/auth/... ./internal/repository/... PASS - go build ./cmd/server/... PASS - go test -short -count=1 ./internal/auth/oidc/... PASS (4.1s) - go test -short -count=1 ./internal/api/handler/... PASS (4.1s) GUI halves for MED-7 + MED-11 + MED-12 are the GUI batch (pending). Refs: cowork/auth-bundles-audit-2026-05-10.md MED-7, MED-11, MED-12 cowork/auth-bundles-fixes-2026-05-10/HANDOFF.md items 11 14 15	2026-05-11 00:11:07 +00:00
shankar0123	e1e43c8924	feat(auth): foundation for MED-11 — users.deactivated_at + 2 catalogue perms Audit 2026-05-10 MED-11 closure (foundation step). WHAT. Lays the schema + domain foundation for the MED-11 federated-user admin surface: 1. Migration 000045 adds users.deactivated_at TIMESTAMPTZ (nullable; non-NULL = deactivated). Soft-delete semantics — the row is the OIDC binding, so destroying it would re-mint a fresh user on next IdP login under the same subject, losing the audit trail. 2. Seeds 2 new catalogue permissions: - auth.user.read (admin / operator / auditor) - auth.user.deactivate (admin ONLY) 3. Extends User domain struct with DeactivatedAt time.Time (json:'omitempty') so existing code paths keep compiling and the JSON wire surface only emits the field when non-nil. WHY. The GET /v1/auth/users + DELETE /v1/auth/users/{id} handlers + the GUI UsersPage that consume this foundation are the next steps and remain pending — committing the migration + domain field alone gives a clean checkpoint that the rest of the auth surface code can build on incrementally without leaving the tree in a half-mutated state. HOW. migrations/000045_users_deactivated_at.up.sql: - ALTER TABLE users ADD COLUMN IF NOT EXISTS deactivated_at TIMESTAMPTZ - INSERT 2 permissions into permissions - INSERT role_permissions rows (read in r-admin/operator/auditor; deactivate in r-admin) - Single BEGIN/COMMIT, idempotent (ON CONFLICT DO NOTHING) migrations/000045_users_deactivated_at.down.sql: - reverse-order DELETE + DROP COLUMN internal/auth/user/domain/types.go: - User.DeactivatedAt time.Time, JSON tag omitempty. VERIFY. - go vet ./internal/auth/user/... ./internal/auth/oidc/... ./internal/repository/... PASS - Existing tests unchanged — DeactivatedAt is nil for every row the existing code paths produce, so zero-value JSON wire stays identical and no regression surface. Refs: cowork/auth-bundles-audit-2026-05-10.md MED-11 cowork/auth-bundles-fixes-2026-05-10/HANDOFF.md item 14	2026-05-11 00:02:57 +00:00
shankar0123	ca31232ad2	feat(mcp): 11 audit-fix MCP tools — approvals, break-glass, bootstrap, audit-category (MED-13) Audit 2026-05-10 MED-13 closure. WHAT. 11 new MCP tools rounding out the operator surface for workflows that previously had GUI + CLI coverage but no MCP equivalent: Approval workflow (4): certctl_approval_list GET /v1/approvals approval.read certctl_approval_get GET /v1/approvals/{id} approval.read certctl_approval_approve POST /v1/approvals/{id}/approve approval.approve certctl_approval_reject POST /v1/approvals/{id}/reject approval.reject Break-glass credential admin (4): certctl_breakglass_list GET /v1/auth/breakglass/credentials certctl_breakglass_set_password POST /v1/auth/breakglass/credentials certctl_breakglass_unlock POST /v1/auth/breakglass/credentials/{actor_id}/unlock certctl_breakglass_remove DELETE /v1/auth/breakglass/credentials/{actor_id} All gated auth.breakglass.admin; surface invisible (404 not 403) when CERTCTL_BREAKGLASS_ENABLED=false. Bootstrap (2): certctl_bootstrap_status GET /v1/auth/bootstrap (auth-exempt; safe probe) certctl_bootstrap_consume POST /v1/auth/bootstrap (auth-exempt; one-shot mint) Audit category filter (1): certctl_audit_list_with_category GET /v1/audit?category=<cat> audit.read WHY. certctl_bootstrap_consume is the load-bearing day-0 primitive: a fresh server with no admin actors lets the holder of CERTCTL_BOOTSTRAP_TOKEN mint a fresh admin API key. Exposing it via MCP without a security gate would let a downstream caller mint admin from any chat transcript / log surface that captured the bootstrap token. The tool description carries an explicit cautious-wording comment: CAUTION: NEVER WIRE THIS TO AUTONOMOUS OPERATION. A leaked bootstrap token from any log, telemetry, or chat-transcript surface lets a downstream caller mint a fresh admin API key bypassing every other access-control gate. Run this manually, exactly once, from a trusted shell. Similarly certctl_breakglass_set_password's description flags that the password crosses the MCP transport in plaintext; the server-side handler hashes with Argon2id before persisting + the audit row redacts, but client-side logging must NEVER capture the payload. HOW. internal/mcp/tools_audit_fix.go (NEW): registerAuditFixTools(s, c) — declares the 11 tools via gomcp.AddTool. Each tool routes through the existing Client.Get/ Post/Delete helpers; the server-side rbacGate wrappers (or auth-exempt allowlist, for bootstrap) handle authorization. internal/mcp/types.go: Adds 5 input structs: ApprovalIDInput (get/approve/reject) BreakglassActorIDInput (unlock/remove) BreakglassSetPasswordInput (set_password — flagged plaintext) BootstrapConsumeInput (token + key_name; cautious comment) AuditListWithCategoryInput (category + optional limit/since/until/actor_id) Each tagged with jsonschema descriptions for LLM tool discovery. internal/mcp/tools.go: RegisterTools now calls registerAuditFixTools after the existing Bundle 2 Phase 9 registrar. internal/mcp/tools_per_tool_test.go: allHappyPathCases extended with 11 new entries. The existing TestMCP_AllTools_HappyPath dispatches each tool via the in-memory MCP transport against a 2xx mock backend and asserts the wrapper-layer fence wraps the response; TestMCP_AllTools_ErrorPath dispatches against a 5xx mock and asserts MCP_ERROR fence. TestMCP_RegisterTools_DispatchableToolCount confirms every new tool is dispatchable by name. VERIFY. - go vet ./internal/mcp/... PASS - go test -short -count=1 -run 'TestMCP_AllTools_HappyPath\|TestMCP_AllTools_ErrorPath\| TestMCP_RegisterTools_DispatchableToolCount' ./internal/mcp/... PASS - go test -short -count=1 ./internal/mcp/... PASS (0.3s) Refs: cowork/auth-bundles-audit-2026-05-10.md MED-13 cowork/auth-bundles-fixes-2026-05-10/HANDOFF.md item 4	2026-05-10 23:37:06 +00:00

1 2 3 4 5 ...

934 Commits