mirror of
https://github.com/shankar0123/certctl.git
synced 2026-06-07 21:11:30 +00:00
47da13e7a1b3ff5e0aaa594bf4ffc542f6dfda37
12 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
9a8130de32 |
harden(auth/sessions): CSRF rotation on logout closes HIGH-2 fourth call site
Audit 2026-05-11 Fix 13 closure. The HIGH-2 closure on dev/auth-bundle-2 documented four RotateCSRFTokenForActor call sites — login completion (fresh by construction), Assign/Revoke RoleToKey (wired at internal/api/handler/auth.go:498 + 546), Logout, and an explicit operator endpoint. The 2026-05-11 adversarial review observed only 3 of the 4: Logout did NOT rotate the actor's sibling sessions post-revoke. Threat closed: a token captured pre-logout (browser DevTools, malicious extension, session-storage leak) could be replayed against the user's other-device/other-browser sessions until those sessions hit their own idle/absolute expiry. Rotation on logout defeats this — the captured token is dead the moment the user clicks 'Sign out' anywhere. What this changes: * internal/api/handler/auth_session_oidc.go::SessionMinter interface gains RotateCSRFTokenForActor(ctx, actorID, actorType string) int. Nil-safe semantics by convention — the production wiring is *session.Service which already implements the method; rotation NEVER errors (returns int count, swallows per-row failures via the underlying Service.RotateCSRFToken) so it can't block the surrounding Revoke that triggered it. * internal/api/handler/auth_session_oidc.go::Logout calls RotateCSRFTokenForActor after Revoke(sess.ID) succeeds. The auth.session_revoked audit row gains a csrf_rotated detail key carrying the count so SOC/SIEM can correlate logout events with CSRF churn on sibling sessions. * The no-cookie + invalid-cookie 204 short-circuit paths skip rotation. No session row exists to rotate against; the caller is already unauthenticated. Rotation on those paths would do nothing useful and pollute the audit log. Test coverage in internal/api/handler/auth_session_oidc_test.go: * TestLogout_RotatesCSRFForActor — happy path. Mocks rotateCSRFReturnCount=2; asserts Revoke fires before rotation, rotation fires exactly once with caller's (actor_id, actor_type), audit details carry csrf_rotated=2. * TestLogout_NoCookie_SkipsCSRFRotation — pins the 204 short-circuit branch when there's no cookie. Rotation count stays at 0. * TestLogout_InvalidCookie_SkipsCSRFRotation — pins the 204 short-circuit branch when Validate rejects the cookie. Same rationale: no session row, no rotation. The stubSession test fake gains RotateCSRFTokenForActor with call-recording fields; the phase5StubAudit gains a details slice append-aligned 1:1 with events so the happy-path test can index into the latest entry and assert the count. Spec Phase 3 (explicit operator endpoint) — intentionally NOT shipped. The three automatic triggers (login + role- mutation + logout) cover the HIGH-2 threat model; operators who want a nuclear option can use the existing RevokeAllForActor flow which forces re-login → fresh session → fresh CSRF. Adding a dedicated POST /api/v1/auth/sessions/ rotate-csrf admin endpoint would be defense-in-depth without new attack-surface coverage. Documented in the audit-doc annotation. Verify gate: * gofmt -l — clean * go vet ./internal/api/handler/... — clean * go build ./cmd/server/... ./internal/... — clean (production *session.Service satisfies the extended interface out of the box) * go test -short -count=1 ./internal/api/handler/... ./internal/auth/session/... — all green; 3 new Logout cases + the 2 pre-existing Logout cases all pass. Audit doc annotation at cowork/auth-bundles-audit-2026-05-10.md flips the HIGH-2 row from 'CLOSED 2026-05-10 (3/4 call sites wired)' to 'A-B-3 verified 2026-05-11: HIGH-2 fully closed across all four documented call sites.' Refs cowork/auth-bundles-fixes-2026-05-11/13-verify-logout-csrf-rotation.md. |
||
|
|
11b145b641 |
Merge Fix 06 (HIGH A-6): strict UA/IP binding — close request-empty bypass in MED-16
# Conflicts: # CHANGELOG.md # internal/api/handler/auth_session_oidc.go # internal/api/handler/auth_session_oidc_test.go |
||
|
|
92519436a1 |
harden(oidc): strict UA/IP binding (A-6) — close request-empty bypass in MED-16
The MED-16 closure (
|
||
|
|
78485f7429 |
fix(auth/users): close MED-11 lying field — DeactivatedAt loaded + enforced on login (A-2)
The MED-11 closure shipped users.deactivated_at + DELETE /api/v1/auth/users/{id}
+ cascade-revoke, but the federated-user soft-delete was reversible: the next
OIDC login under the same (provider, subject) tuple re-minted a session and
re-elevated the user.
Three legs of the chain were severed (each independently CRIT-shaped):
Leg A — postgres/user.go::userColumns omitted `deactivated_at`, so scanUser
never populated User.DeactivatedAt. Every Get / GetByOIDCSubject /
ListAll returned DeactivatedAt = nil regardless of the column value.
Leg B — postgres/user.go::Update SQL omitted `deactivated_at = $X`, so the
handler's `u.DeactivatedAt = now()` mutation was a no-op write at
the SQL level. Even with leg A closed, no row ever flipped.
Leg C — oidc/service.go::upsertUser did not inspect DeactivatedAt on the
existing-user path. Even with legs A + B closed, the OIDC login
would still proceed normally.
The cascade-session-revoke half of the original closure remained correct, but
only for the duration of the user's current cookie. SOC 2 CC6.3 + ISO 27001
A.9.2.6 "user access removal" controls require both immediate revoke AND
persistent block — this fix restores the persistent-block leg.
Closure across layers:
internal/repository/postgres/user.go
- userColumns adds `deactivated_at`
- scanUser reads via sql.NullTime intermediate (column is nullable)
- Create writes deactivated_at explicitly (NULL for new active users;
forward-compat for future seed-data flows that pre-populate the column)
- Update writes deactivated_at on every call; nil DeactivatedAt → NULL
(supports reactivation)
internal/auth/oidc/service.go
- New sentinel ErrUserDeactivated
- upsertUser checks existing.DeactivatedAt != nil BEFORE mutating email /
display_name / last_login_at — preserves last_login_at forensics on
rejected login attempts (defense-in-depth pin against future
"performance optimization" that reorders the gate)
internal/api/handler/auth_session_oidc.go
- classifyOIDCFailure adds typed errors.Is dispatch for ErrUserDeactivated
→ audit category "user_deactivated" (SOC/SIEM observability surface)
internal/api/handler/auth_users.go
- Self-deactivate guard on Deactivate: HTTP 409 + audit row
auth.user_deactivate_self_rejected when caller targets own User row.
Prevents an admin from one-way-door locking themselves out via the
standard handler; break-glass remains the recovery path.
- New Reactivate handler: inverse of Deactivate. Clears DeactivatedAt
via Update; emits auth.user_reactivated audit row. Idempotent on
already-active rows. Sessions revoked at deactivation stay revoked
(cascade irreversible by design — user must complete fresh OIDC
login).
internal/api/router/router.go
- POST /api/v1/auth/users/{id}/reactivate wired with auth.user.deactivate
gate (reactivation is the inverse op, not a separate privilege)
web/src/api/client.ts + web/src/pages/auth/UsersPage.tsx
- authReactivateUser() client function
- Reactivate button on deactivated rows in UsersPage
Regression coverage:
Postgres (testcontainers, skipped under -short):
TestUserRepository_DeactivatedAt_RoundTrip — Create → set DeactivatedAt
→ Update → Get / GetByOIDCSubject / ListAll round-trip the value
TestUserRepository_DeactivatedAt_CreateWritesNullForActive — new active
user reads back DeactivatedAt = nil
TestUserRepository_DeactivatedAt_CreatePersistsPreDeactivated — Create
with non-nil DeactivatedAt round-trips (forward-compat path)
OIDC service:
TestService_HandleCallback_RejectsDeactivatedUser — errors.Is
ErrUserDeactivated; CallbackResult nil; persisted email / last_login_at
/ deactivated_at NOT mutated by the rejected attempt
TestService_HandleCallback_AllowsReactivatedUser — DeactivatedAt = nil
→ happy path resumes
TestService_HandleCallback_DeactivatedUserPreservesForensics —
defense-in-depth pin against future regressions that reorder the
gate-vs-mutation sequence
Classifier:
TestClassifyOIDCFailure extended — typed dispatch + wrapped variant
round-trip through errors.Is
Handler:
TestAuthUsers_Deactivate_RejectsSelfDeactivate — HTTP 409 + audit
row + cascade-revoke NOT fired + row stays active
TestAuthUsers_Deactivate_OtherUser_HappyPath — HTTP 204 + cascade
fires + row soft-deleted
TestAuthUsers_Reactivate_HappyPath / _IdempotentOnActiveUser /
_UnknownID / _MissingID / _UpdateError
Phase 6 verify gate green on the targeted packages: gofmt clean, go vet
clean, go test -short pass across internal/auth/oidc, internal/api/handler,
internal/api/router, internal/repository/postgres, internal/auth/...,
internal/service/..., internal/tlsprobe/..., internal/trustanchor/...,
internal/validation/...
Spec at cowork/auth-bundles-fixes-2026-05-11/02-crit-deactivated-at-enforcement.md
Closure annotation at cowork/auth-bundles-audit-2026-05-10.md MED-11 row.
Operator advisory in CHANGELOG.md v2.1.0 release notes.
|
||
|
|
b4b98799d5 |
feat(oidc): POST /api/v1/auth/oidc/test dry-run endpoint (MED-5)
Audit 2026-05-10 MED-5 closure (backend half).
WHAT.
New POST /api/v1/auth/oidc/test endpoint that validates an OIDC
provider configuration without persisting anything. Mirrors the
read-only legs of the production getOrLoad path so operators can
catch typos / network reachability problems / IdP-advertises-weak-
alg conditions BEFORE creating the provider row.
Request body: {issuer_url, client_id, client_secret, scopes} —
client_secret is accepted but unused (discovery + JWKS reachability
do not require it).
Response body: TestDiscoveryResult{
discovery_succeeded — gooidc.NewProvider returned without error
jwks_reachable — explicit GET against jwks_uri succeeded
supported_alg_values — verbatim id_token_signing_alg_values_supported
iss_param_supported — RFC 9207 advertisement parsed off the disco doc
issuer_echo — the iss URL we were called with
authorization_url,
token_url, jwks_uri,
userinfo_endpoint — discovery doc fields for the GUI to preview
errors[] — per-leg failure messages
}
HTTP status:
- 200 even when individual checks fail (the per-leg errors[] carries
detail so the GUI renders per-check status rows)
- 400 only when the request body is malformed or issuer_url empty
- 500 only when the service-layer call itself errors
WHY.
Pre-fix, operators configuring OIDC had to create a provider, then
hit /refresh, then read the audit log to figure out whether the
discovery doc was reachable / whether the IdP advertises HS256
(the alg-downgrade trap). The GUI rendered no per-check feedback.
MED-5 closes the dry-run gap for the same reason every Issuer +
Target connector has a 'Test connection' button — operator
experience parity.
HOW.
internal/auth/oidc/test_discovery.go (NEW):
- TestDiscoveryResult struct with the per-leg projection.
- Service.TestDiscovery(ctx, issuerURL) drives the read-only
subset of getOrLoad: gooidc.NewProvider, claims parse for
alg-supported + iss-param-supported + jwks_uri + userinfo,
alg-downgrade defense, jwksReachable HTTP GET.
- jwksReachable is a package-level closure so tests can swap.
internal/api/handler/auth_session_oidc.go:
- TestProvider HTTP handler. Uses an inline discoveryTester
interface to type-assert against the OIDCAuthHandshaker stub
(the production Service satisfies; test stubs supply via
explicit method). Audit row 'auth.oidc_provider_tested' carries
the summary fields.
internal/api/router/router.go:
- Wired as POST /api/v1/auth/oidc/test under rbacGate('auth.oidc.create').
internal/api/handler/auth_session_oidc_test.go:
- stubOIDCSvc gains testResult + testErr fields + TestDiscovery
method so it satisfies the inline interface.
- 3 regression tests: happy path, missing issuer_url -> 400,
discovery-failure -> 200 with errors[] populated.
VERIFY.
- go vet ./internal/auth/oidc/... ./internal/api/handler/...
./internal/api/router/... PASS
- go test -short -count=1 -run TestProvider
./internal/api/handler/... PASS (3/3)
- go test -short -count=1 ./internal/auth/oidc/... PASS (3.7s)
- go test -short -count=1 ./internal/api/handler/... PASS (4.7s)
Out of scope for this commit: the GUI 'Test connection' button on
OIDCProviderDetailPage — queued with the GUI batch (items 10-19 of
HANDOFF.md).
Refs: cowork/auth-bundles-audit-2026-05-10.md MED-5
cowork/auth-bundles-fixes-2026-05-10/HANDOFF.md item 2
|
||
|
|
2a1a0b347c |
harden(oidc): pre-login UA/IP binding (MED-16) — RFC 9700 §4.7.1
Audit 2026-05-10 MED-16 closure.
WHAT.
Binds the OIDC pre-login row to the (clientIP, userAgent) tuple of
the /auth/oidc/login request, and enforces a constant-time compare
against the /auth/oidc/callback request at consume time. Defeats
replay of a stolen pre-login cookie by a different browser /
source — the secondary defense layer recommended by RFC 9700 §4.7.1
when the primary layer (HMAC integrity + Path=/ + SameSite=Lax on
the cookie) is bypassed via CSRF / XSS / TLS-termination leak.
WHY.
Pre-fix, the pre-login cookie's HMAC verified only that 'some'
caller of /auth/oidc/login was talking to /auth/oidc/callback; it
did not verify that the SAME browser / source was on both sides.
An attacker who exfiltrated the cookie value via any vector could
replay the bytes through their own user-agent and ride the victim's
authorization. RFC 9700 §4.7.1 calls out the gap explicitly and
recommends binding state to a user-agent fingerprint + source IP.
HOW.
Migration:
migrations/000044_prelogin_uaip.up.sql
ALTER TABLE oidc_pre_login_sessions
ADD COLUMN IF NOT EXISTS client_ip TEXT,
ADD COLUMN IF NOT EXISTS user_agent TEXT;
Both nullable for in-flight rolling-deploy compat — the consume-
side check only enforces when both row AND request carry non-empty
values for the leg in question.
Domain:
internal/repository/oidc.go (PreLoginSession) — adds ClientIP +
UserAgent fields.
Repository:
internal/repository/postgres/oidc_prelogin.go — Create persists
via sql.NullString (empty → NULL); LookupAndConsume reads back.
Re-uses package-local nullableString from discovery.go.
Service:
internal/auth/oidc/service.go
- PreLoginStore.CreatePreLogin signature takes (clientIP,
userAgent) as positions 5–6.
- PreLoginStore.LookupAndConsume returns (clientIP, userAgent)
as positions 5–6.
- HandleAuthRequest signature gains (clientIP, userAgent),
threaded to the store.
- HandleCallback adds Step 1.5 — UA / IP constant-time compare
between stored row and incoming request. Per-leg toggles via
preLoginRequireUA / preLoginRequireIP service fields. Empty
values on either side pass through (rolling-deploy + headless-
proxy compat).
- New sentinels ErrPreLoginUAMismatch, ErrPreLoginIPMismatch.
- SetPreLoginBindingRequirements(requireUA, requireIP) helper
for main.go config wiring.
Adapter:
internal/auth/oidc/prelogin.go — PreLoginAdapter passes the new
fields through to the repo row.
Handler:
internal/api/handler/auth_session_oidc.go
- OIDCAuthHandshaker.HandleAuthRequest signature updated.
- LoginInitiate captures clientIPFromRequest + r.UserAgent()
and passes to the service.
- classifyOIDCFailure adds errors.Is dispatch for the two new
sentinels → prelogin_ua_mismatch / prelogin_ip_mismatch
audit categories.
Config:
internal/config/config.go
+ AuthConfig.OIDCPreLoginRequireUA (default true)
env CERTCTL_OIDC_PRELOGIN_REQUIRE_UA
+ AuthConfig.OIDCPreLoginRequireIP (default true)
env CERTCTL_OIDC_PRELOGIN_REQUIRE_IP
cmd/server/main.go calls oidcService.SetPreLoginBindingRequirements
from cfg.Auth.OIDCPreLoginRequire{UA,IP}.
Tests (internal/auth/oidc/service_test.go):
- TestService_HandleCallback_MED16_UAMismatchRejected
- TestService_HandleCallback_MED16_IPMismatchRejected
- TestService_HandleCallback_MED16_BothMatch_Succeeds
- TestService_HandleCallback_MED16_LegacyRowEmptyValues (rolling-
deploy compat — empty stored values pass through)
- TestService_HandleCallback_MED16_RequireUAFalse_AllowsMismatch
(operator escape-hatch — UA mismatch silently allowed)
Mechanical fan-out:
- stubPreLogin / stubPreLoginRepo signatures updated.
- All existing call sites in service_test.go (~40), prelogin_test.go,
bench_test.go, logging_test.go, provider_enabled_test.go,
integration_keycloak_test.go, integration_okta_smoke_test.go,
auth_session_oidc_test.go updated to pass empty strings for the
new params — pre-existing tests do not exercise UA/IP binding
semantics.
VERIFY.
- go vet ./internal/auth/oidc/... ./internal/api/handler/...
./internal/config/... PASS
- go test -short -count=1 -run MED16 ./internal/auth/oidc/... PASS (5/5)
- go test -short -count=1 ./internal/auth/oidc/... PASS (4.6s)
- go test -short -count=1 ./internal/api/handler/... PASS (4.3s)
- go test -short -count=1 ./internal/config/... PASS
Refs: cowork/auth-bundles-audit-2026-05-10.md MED-16
cowork/auth-bundles-fixes-2026-05-10/HANDOFF.md item 6
RFC 9700 §4.7.1 — OAuth 2.0 Security Best Current Practice
|
||
|
|
2cd2a5c52f |
harden(oidc): RFC 9207 iss URL parameter check on callback (MED-17)
Audit 2026-05-10 MED-17 closure.
WHAT.
When the matched IdP's discovery doc advertises
authorization_response_iss_parameter_supported=true (RFC 9207 §3),
HandleCallback now REQUIRES a non-empty `iss` query parameter on
/auth/oidc/callback and enforces a constant-time compare against the
configured provider's IssuerURL. Mismatch maps to two new sentinel
errors (ErrIssParamMissing / ErrIssParamMismatch) that the handler's
classifyOIDCFailure dispatches via errors.Is BEFORE the substring
fall-through, so the audit failure_category remains distinguishable
between the RFC 9207 leg (iss_param_missing / iss_param_mismatch) and
the in-token iss claim leg (id_token_iss_mismatch).
WHY.
The RFC 9207 iss URL parameter is the load-bearing mix-up-attack
defense for multi-tenant IdPs (Keycloak realms, Authentik tenants,
Auth0 tenants, public-trust CAs). Pre-fix the parameter was silently
ignored — an attacker controlling one IdP tenant could route an auth
code to certctl's callback against a different tenant's pre-login
state without detection. Modern Keycloak / Authentik / public-trust
CAs ship the discovery flag by default; legacy IdPs that don't
advertise are unaffected (back-compat preserved).
HOW.
- internal/auth/oidc/service.go
- providerEntry gains issParamSupported bool.
- getOrLoad extends the discovery-claims read to include
authorization_response_iss_parameter_supported, alongside the
existing id_token_signing_alg_values_supported defense.
- HandleCallback's signature gains callbackIss string at position 5.
Step 2.5 runs after the state compare + provider load: when
issParamSupported is true, an empty callbackIss returns
ErrIssParamMissing; a present-but-mismatched value returns
ErrIssParamMismatch (constant-time compare).
- Two new sentinels: ErrIssParamMissing, ErrIssParamMismatch.
ErrIssuerMismatch's doc-string clarified to note it covers the
in-token leg only.
- internal/api/handler/auth_session_oidc.go
- OIDCAuthHandshaker.HandleCallback signature updated.
- LoginCallback reads r.URL.Query().Get("iss") (no TrimSpace —
byte-strict compare upstream) and threads it through.
- classifyOIDCFailure: typed errors.Is dispatch for the three
iss-family sentinels BEFORE the substring fall-through, so the
three cases stay distinguishable in the audit row.
- internal/api/handler/auth_session_oidc_test.go
- stubOIDCSvc.HandleCallback bumped to 7-arg signature.
- TestClassifyOIDCFailure extended with 5 new cases pinning the
iss-family dispatch + a wrapped-error round-trip.
- internal/auth/oidc/service_test.go
- mockIdP gains advertiseIssParameterSupported bool; the
/.well-known/openid-configuration handler emits the claim only
when set (so existing tests stay back-compat).
- 4 new regression tests:
* MED17_NoSupport_AnyIssAccepted — provider doesn't advertise;
arbitrary callbackIss is ignored (back-compat).
* MED17_SupportButMissing — provider advertises; missing iss →
ErrIssParamMissing.
* MED17_SupportButMismatch — provider advertises; wrong iss →
ErrIssParamMismatch (load-bearing mix-up defense).
* MED17_SupportAndCorrect — provider advertises; matching iss →
success path proves the gate isn't over-eager.
- internal/auth/oidc/bench_test.go,
internal/auth/oidc/logging_test.go,
internal/auth/oidc/integration_keycloak_test.go
- Mechanical: all existing HandleCallback call sites updated to
pass "" for callbackIss (matches pre-fix behavior for IdPs that
don't advertise support — the Keycloak integration suite tests
will be re-evaluated once the Keycloak fixture is run against a
realm with the discovery flag enabled).
VERIFY.
- go vet ./internal/auth/oidc/... ./internal/api/handler/... PASS
- go test -short -count=1 ./internal/auth/oidc/... PASS (3.4s)
- go test -short -count=1 ./internal/api/handler/... PASS (5.4s)
- 4 new MED-17 regression tests + extended TestClassifyOIDCFailure pass.
Refs: cowork/auth-bundles-audit-2026-05-10.md MED-17
cowork/auth-bundles-fixes-2026-05-10/HANDOFF.md item 7
RFC 9207 — OAuth 2.0 Authorization Server Issuer Identification
|
||
|
|
ba0959ddc7 |
feat(auth/sessions): list-all gate + revoke-all-except-current (MED-1/2/3)
Audit 2026-05-10 Fix 13 Phase A — close MED-1, MED-2, MED-3.
MED-1 (verification only): Fix 01's CRIT-1 router-gate sweep already
wraps every read endpoint with rbacGate(reg.Checker, '<resource>.read',
...). Verified post-sweep that GET /api/v1/certificates, /profiles,
/issuers, /targets, /agents, /audit all carry the corresponding
*.read permission gate.
MED-2: ListSessions now gates ?actor_id=<other> on auth.session.list.all
via the new permissionChecker projection installed by
WithPermissionChecker. cmd/server/main.go threads the existing
authCheckerAdapter into the handler. When caller's actor_id !=
caller.ActorID AND the handler has a checker, an inline
CheckPermission(..., 'auth.session.list.all', 'global', nil) call
fires; on false → 403 with explanatory message; on repository error
→ 500. Defense-in-depth: the router-level rbacGate enforces
auth.session.list as the floor; the .list.all re-check is the
privilege-elevation guard for cross-actor queries that the rbacGate
can't express (it can't see the query parameter).
MED-3: ship DELETE /api/v1/auth/sessions?except=current — the
'sign out all other sessions' flow. Gated by auth.session.revoke;
the handler reads the caller's current session ID from
session.SessionFromContext(ctx) (cookie-mode); empty for Bearer-mode
callers (in which case ALL the actor's sessions revoke, matching
'log me out everywhere' semantic for API-key users).
New repository method SessionRepository.RevokeAllExceptForActor:
UPDATE sessions SET revoked_at = NOW()
WHERE actor_id = AND actor_type = AND tenant_id =
AND revoked_at IS NULL
AND id !=
returning rowcount. Added to the interface in internal/repository/session.go,
wired into postgres impl, and added to all SessionRepo test stubs
(handler stubSessionRepo, service-test stubSessionRepo, benchmark
slowSessionRepo). The session.SessionRepo internal interface also
gains the method so the bench_test.go forwarder compiles.
Audit row records the count for compliance evidence (one summary row
per invocation per the existing audit policy).
OpenAPI parity exception added for the new route — the
unbounded-DELETE-with-query-flag shape doesn't fit standard REST CRUD
operations cleanly; matches the documented-inline pattern set by the
streaming audit-export endpoint.
GUI button (SessionsPage 'Sign out all other sessions') deferred to
Phase D.
Refs: cowork/auth-bundles-audit-2026-05-10.md MED-1, MED-2, MED-3
Spec: cowork/auth-bundles-fixes-2026-05-10/13-med-bundle.md Phase A
|
||
|
|
0f340beb14 |
fix(auth/ux): cause-aware OIDC + session error surfacing (HIGH-7 + HIGH-8 closure)
Server (HIGH-7): the OIDC callback failure path now 302-redirects to /login?error=oidc_failed&reason=<category> instead of emitting a blank 400. `category` is the existing audit `failure_category` value; classifyOIDCFailure was extended with three new sentinel paths (email_domain_not_allowed, email_missing_but_required, pkce_invalid) so CRIT-5 + PKCE failures get distinguishable GUI rendering. Audit-log observability is unchanged — the same failure_category is written to the auth.oidc_login_failed audit row; the 302 is purely a UX leg layered on top. Server (HIGH-8): SessionMiddleware now stashes a cause classification on the request context when Validate returns an error, mapping the sentinels via classifySessionError (errors.Is-based, so wrapped sentinels still classify) to the stable wire-strings idle_timeout / absolute_timeout / back_channel_revoked / invalid_token. The 401 emit point in bearerSkipIfAuthenticated reads the stashed cause and emits WWW-Authenticate: Bearer realm="certctl", error="invalid_token", error_description=<cause> per RFC 6750 §3. GUI (HIGH-7): LoginPage reads ?error= + ?reason= from the URL via react-router useSearchParams and renders an operator-friendly amber-bordered banner above the form; OIDC_FAILURE_REASON_TEXT maps all 16 known categories with a defensive 'unspecified' fallback for forward-compat with future server-side categories. GUI (HIGH-8): api/client fetchJSON parses the WWW-Authenticate cause via parseWWWAuthenticateCause and attaches it to the 'certctl:auth-required' CustomEvent detail; AuthProvider redirects to /login?session_expired=<cause> on cause-aware 401s; LoginPage renders a blue-bordered session-cause banner. invalid_token stays on the current page (no hard redirect for opaque failures). Misc cleanup: ErrorState now accepts the title/message/data-testid form added by CRIT-4 BreakglassPage (was erroring tsc on master). Regression matrix: - internal/api/handler/oidc_redirect_categories_test.go pins all 16 failure categories to the 302 + reason= location + audit-row leg - internal/auth/session/www_authenticate_test.go pins the 4 stable cause categories on classifySessionError (incl. errors.Is wrapped sentinels) + the WWW-Authenticate emission across all 4 categories + the no-session-context fallback case - internal/api/handler/auth_session_oidc_test.go: 4 pre-existing TestLoginCallback_*Returns400 tests updated to assert 302 + reason= location (the wire shape changed from 400 to 302, but the audit observability and behaviour-equivalent failure-classification are preserved) - web/src/pages/LoginPage.test.tsx: 6 new cases pinning the failure banner, session-cause banner, unknown-reason fallback, and forward-compat 'unspecified' category Spec: cowork/auth-bundles-fixes-2026-05-10/08-high-7-8-error-surfacing.md Closes: HIGH-7, HIGH-8 of cowork/auth-bundles-audit-2026-05-10.md |
||
|
|
15435ca02b |
fix(oidc/bcl): jti replay-cache + iat freshness check (HIGH-3 closure)
Closes HIGH-3 of the 2026-05-10 audit. Pre-fix the BCL handler
accepted any logout_token whose iat + jti were syntactically present
but never checked (a) that iat fell within a skew window or (b) that
jti hadn't been seen before. A captured logout_token was replayable
indefinitely; once CRIT-2 was fixed, every replay would revoke the
user's current sessions — persistent DoS. RFC 9700 §2.7 + OIDC BCL
1.0 §2.5 require jti replay defense.
- Migration 000040_bcl_replay_cache: oidc_bcl_consumed_jtis table with
composite PK on (jti, issuer_url) — RFC 7519 §4.1.7 per-issuer
uniqueness — and an expires_at index for the GC sweep.
- repository.BCLReplayRepository interface + ErrBCLJTIAlreadyConsumed
sentinel. Postgres impl uses INSERT...ON CONFLICT DO NOTHING
RETURNING true for atomic single-use semantics in one round-trip.
- handler.DefaultBCLVerifier gains WithMaxAge + nowFn clock seam. iat
freshness check rejects tokens whose iat is in the future beyond
max-age OR stale beyond it. Verifier signature extended:
Verify(ctx, jwt) (iss, sub, sid, jti string, iat int64, err error).
- handler.AuthSessionOIDCHandler gains BCLReplayConsumer (interface)
+ WithBCLReplayConsumer(consumer, maxAge) setter. BackChannelLogout
consumes the jti post-verify with TTL = max(24h, 2*maxAge):
- first-receive → 200, sessions revoked, audit outcome=revoked
- replay (ErrBCLJTIAlreadyConsumed) → 200 + Cache-Control: no-store,
audit outcome=jti_replayed, sessions NOT re-revoked
- transient (non-AlreadyConsumed error) → 503 so the IdP retries
- internal/scheduler/scheduler.go: SetBCLReplayGarbageCollector wires
SweepExpired into the existing session-GC tick (no separate ticker
for short-lived replay rows).
- cmd/server/main.go: bclMaxAge from cfg.Auth.OIDCBCLMaxAgeSeconds
(default 60s, env CERTCTL_OIDC_BCL_MAX_AGE_SECONDS); bclReplayRepo
wired into the verifier + handler + scheduler.
- Three regression tests in internal/api/handler/bcl_replay_test.go:
TestBackChannelLogout_FirstReceiveConsumesJTI,
TestBackChannelLogout_ReplayedJTIReturns200WithAudit,
TestBackChannelLogout_TransientConsumeFailureReturns503.
- internal/api/handler/auth_session_oidc_test.go: stubBCLVerifier
gains jti + iat fields; existing TestBackChannelLogout_* tests
rewritten for the new Verify return.
Verification gate green: gofmt clean, go vet clean, go test -short
-count=1 on internal/api/handler / internal/api/router /
internal/scheduler / cmd/server / internal/auth/oidc /
internal/auth/breakglass — all pass.
CRIT-1..CRIT-5 + HIGH-1 + HIGH-2 + HIGH-3 of the 2026-05-10 audit
now closed on this branch. Spec at
cowork/auth-bundles-fixes-2026-05-10/07-high-3-bcl-replay-defense.md.
Refs: cowork/auth-bundles-audit-2026-05-10.md HIGH-3
|
||
|
|
ca1e135aa3 |
fix(oidc/bcl): resolve sub→actor_id via users.GetByOIDCSubject (CRIT-2 closure)
Closes CRIT-2 of the 2026-05-10 audit. The BCL handler previously called
sessionSvc.RevokeAllForActor(sub, "User") but session rows are keyed by
user.ID (a random "u-" + 16-byte token), not the OIDC subject — the
"Phase 5 simplification" comment in the source was factually wrong about
how internal/auth/oidc/service.go::upsertUser seeds user.ID. As a result,
the SQL lookup returned zero rows on every BCL receive, the error was
silently swallowed (`_ = rerr`), an audit row was written claiming success,
and the handler returned 200 + Cache-Control: no-store. OIDC BCL 1.0 §2.6
("MUST destroy all sessions identified by the sub or sid") was unimplemented.
CWE-613.
This commit:
- Adds userRepo (repository.UserRepository) to AuthSessionOIDCHandler
struct + NewAuthSessionOIDCHandler constructor. cmd/server/main.go
injects the existing oidcUserRepo (no new repository instance).
- Replaces the broken sub-as-actor-id path with:
1. providerRepo.List(ctx, tenantID) + IssuerURL filter to map
claims.iss → provider row (N is small; typically 1-5).
2. userRepo.GetByOIDCSubject(ctx, provider.ID, sub) to resolve the
OIDC subject → user.ID.
3. sessionSvc.RevokeAllForActor(user.ID, "User") with the RESOLVED
actor_id (not the OIDC subject).
- Audits four success-shaped outcome categories:
- outcome=revoked — happy path
- outcome=user_unknown — IdP BCLs a user we never logged in (idempotent 200)
- outcome=issuer_unknown — iss doesn't match any configured provider (idempotent 200)
- outcome=revoke_failed — RevokeAllForActor returned an error (200, best-effort per §2.8)
And two transient outcomes that return 503 (IdP retries per §2.8):
- outcome=provider_lookup_failed — providerRepo.List error
- outcome=user_lookup_failed — non-NotFound userRepo error
- Removes the misleading "Phase 5 simplification" comment block; replaces
with a doc explaining the resolution path + outcome taxonomy + spec refs.
- Adds 5 regression tests in internal/api/handler/auth_session_oidc_test.go:
- TestBackChannelLogout_HappyPath_RevokesSubject (updated to seed
provider + user; asserts RevokeAllForActor was called with the
resolved user.ID, not the raw OIDC subject — the test that would
have caught CRIT-2 had it existed)
- TestBackChannelLogout_UnknownUserReturns200WithAudit
- TestBackChannelLogout_IssuerUnknownReturns200WithAudit
- TestBackChannelLogout_TransientUserRepoErrorReturns503
- TestBackChannelLogout_RevokeFailureReturns200WithAuditFailureOutcome
- Introduces stubUserRepo in the handler test file (matching the four
repository.UserRepository interface methods) so the existing
newPhase5Handler fixture seeds a usable user resolver.
Verification gate green:
- gofmt -l . clean
- go vet ./... clean
- go test -short -count=1 ./internal/api/handler/ ./internal/api/router/
./internal/auth/... ./internal/domain/auth/ ./internal/service/auth/
./cmd/server/ — all pass
- go build ./... clean
CRIT-1 from the same audit is already closed on this branch (commit
|
||
|
|
9c679a5960 |
auth-bundle-2 Phase 5: OIDC + session HTTP surface (13 endpoints),
pre-login store, OpenID Connect Back-Channel Logout 1.0, cookieAuth
scheme, 7 new auth permissions, CI guard, handler tests
Phase 5 of the bundle puts the Phase 3 OIDC service + Phase 4 session
service on the wire. 13 HTTP endpoints split into three logical groups:
Public OIDC handshake (auth-exempt; protocol-mediated):
GET /auth/oidc/login?provider=<id> -> 302 to IdP authorization URL
+ sets certctl_oidc_pending cookie
(10-min TTL, Path=/auth/oidc/,
SameSite=Lax)
GET /auth/oidc/callback?code=...&state=... -> consume pre-login row,
run Phase 3's 11-step token
validation, mint post-login
session, 302 to dashboard
POST /auth/oidc/back-channel-logout -> OpenID Connect BCL 1.0 — IdP
POSTs logout_token JWT; certctl
validates signature against IdP
JWKS via Phase 3 alg allow-list,
required claims (iss/aud/iat/jti/
events; exactly one of sub/sid;
nonce ABSENT per spec §2.4),
revokes matching sessions,
returns 200 with
Cache-Control: no-store
POST /auth/logout -> revoke caller's session
Session management (RBAC-gated auth.session.*):
GET /api/v1/auth/sessions -> auth.session.list (own / all)
DELETE /api/v1/auth/sessions/{id} -> auth.session.revoke (own bypass)
OIDC provider + group-mapping CRUD (RBAC-gated auth.oidc.*):
GET /api/v1/auth/oidc/providers -> auth.oidc.list
POST /api/v1/auth/oidc/providers -> auth.oidc.create
(client_secret encrypted
at rest via
internal/crypto.EncryptIfKeySet)
PUT /api/v1/auth/oidc/providers/{id} -> auth.oidc.edit
DELETE /api/v1/auth/oidc/providers/{id} -> auth.oidc.delete
(refused via
ErrOIDCProviderInUse → 409
when users authenticated
via this provider)
POST /api/v1/auth/oidc/providers/{id}/refresh -> auth.oidc.edit
(re-runs IdP downgrade
defense via
OIDCService.RefreshKeys)
GET /api/v1/auth/oidc/group-mappings -> auth.oidc.list
POST /api/v1/auth/oidc/group-mappings -> auth.oidc.edit
DELETE /api/v1/auth/oidc/group-mappings/{id} -> auth.oidc.edit
Migration 000037 ships:
- oidc_pre_login_sessions table (10-min absolute TTL, FK CASCADE on
oidc_provider_id, FK RESTRICT on signing_key_id; index on
absolute_expires_at for the GC sweep);
- 7 new permissions seeded into r-admin only:
auth.session.list, auth.session.list.all, auth.session.revoke,
auth.oidc.list, auth.oidc.create, auth.oidc.edit, auth.oidc.delete
CanonicalPermissions extended in lockstep at internal/domain/auth/
validate.go.
Pre-login machinery:
- internal/repository/oidc.go gains PreLoginRepository interface +
PreLoginSession struct + ErrPreLoginNotFound / ErrPreLoginExpired
sentinels.
- internal/repository/postgres/oidc_prelogin.go ships the impl;
LookupAndConsume uses DELETE ... RETURNING for atomic single-use.
- internal/auth/oidc/prelogin.go is the PreLoginAdapter that bridges
the OIDC service's Phase 3 PreLoginStore interface to the new
repository, signing the cookie value under the active
SessionSigningKey via the same v1.<id>.<key>.<HMAC> wire format
Phase 4 uses for post-login cookies. Defense-in-depth: the
pre-login `pl-` prefix is enforced by ParseCookieValue(prefix);
a stolen pre-login cookie cannot be replayed against the
post-login Validate path (pinned by
TestService_Validate_RejectsPreLoginCookieAtPostLoginGate).
Session package extension:
- internal/auth/session/service.go gains exported SignCookieValue,
ParseCookieValue (with caller-supplied id-1 prefix), ComputeCookieHMAC,
DecryptKeyMaterial wrappers so the OIDC pre-login adapter shares
the same length-prefixed HMAC math without code duplication.
- parseCookie no longer hardcodes the `ses-` prefix check (moved to
Validate as defense-in-depth; pre-login cookie verification uses
the `pl-` prefix via ParseCookieValue).
Cookie attributes (all Phase 5 endpoints honor CERTCTL_SESSION_SAMESITE
+ Secure=true via SessionCookieAttrs from Phase 4 config):
- certctl_oidc_pending: Path=/auth/oidc/, MaxAge=600s, SameSite=Lax
(cannot be Strict because the IdP-initiated callback is a top-level
navigation from a different origin).
- certctl_session: Path=/, Expires=8h, SameSite=Lax|Strict, HttpOnly.
- certctl_csrf: Path=/, Expires=8h, HttpOnly=false (intentional —
GUI must read it to echo into X-CSRF-Token header).
Audit logging on every mutating operation (event_category="auth"):
auth.oidc_login_succeeded / failed / unmapped_groups
auth.oidc_back_channel_logout / failed
auth.session_revoked
auth.oidc_provider_{created,updated,deleted,refreshed}
auth.group_mapping_{added,removed}
OpenAPI updates:
- cookieAuth security scheme added to api/openapi.yaml under
components.securitySchemes (apiKey / cookie / certctl_session).
- The 13 Phase 5 routes are added to SpecParityExceptions with a
deferral note: full per-endpoint OpenAPI rows land in a follow-on
commit alongside the GUI work (Phase 8) so the ergonomic shape can
be validated against the live GUI client.
CI guard: scripts/ci-guards/N-bundle-2-security-empty-preserved.sh
asserts api/openapi.yaml has ≥ 14 'security: []' occurrences (the
pre-Bundle-2 baseline). Reducing the count below 14 would silently
force a Bearer-or-cookie requirement onto an endpoint that legitimately
runs without certctl-issued credentials; the guard fires before that
regression lands.
Handler tests (internal/api/handler/auth_session_oidc_test.go):
- All 6 prompt-mandated negative cases:
BCL with missing events claim -> 400
BCL with nonce present -> 400 (per spec §2.4)
BCL with sig signed by an unknown key -> 400
Callback with replayed state -> 400
Callback with PKCE verifier mismatch -> 400
Callback with expired pre-login row -> 400
- Plus happy paths for every endpoint, edge cases (missing-cookie,
duplicate-name, in-use-409, wrong-tenant), and the Helper-function
coverage (peekIssuer, classifyOIDCFailure, defaultIfBlank,
defaultIntIfZero, clientIPFromRequest, encryptClientSecret).
Coverage on internal/api/handler/auth_session_oidc.go: 80.9% per-function
(above the Phase 5 spec's ≥ 80% floor).
Server wiring (cmd/server/main.go):
Wired AFTER sessionService (Phase 4) so the OIDC PreLoginAdapter can
sign pre-login cookies under the active SessionSigningKey:
oidcProviderRepo + oidcMappingRepo + oidcUserRepo + oidcPreLoginRepo
-> preLoginAdapter -> oidcService -> authSessionOIDCHandler.
sessionMinterAdapter shim bridges *session.Service.Create to the
oidcsvc.SessionMinter port the OIDC service consumes.
Router wiring (internal/api/router/router.go):
4 public OIDC routes via direct r.mux.Handle (auth-exempt; pinned in
AuthExemptRouterRoutes); 9 RBAC-gated routes via r.Register +
rbacGate(checker, perm, h). Routes only register when
reg.AuthSessionOIDC != nil so pre-Phase-5 builds skip the block
entirely.
Verifications: gofmt clean, go vet clean across all touched packages,
go test -short -count=1 green across internal/api/handler (74 tests +
new Phase 5 batch), internal/api/router (parity + auth-exempt
allowlist), internal/auth/oidc + session (no regressions), full domain
+ scheduler + config sweeps green, ci-guard
N-bundle-2-security-empty-preserved.sh green (17 ≥ 14 baseline).
|