harden(auth/session+oidc): 503/401 split + go-oidc string pin (LOW-6 + Nit-2)

Audit 2026-05-10 — close LOW-6 + Nit-2 from the HANDOFF.md backend
batch (items 8 + 9).

LOW-6: introduce ErrSessionTransient sentinel in session.Service.
session.Validate now distinguishes:
  - errors.Is(err, repository.ErrSessionNotFound) → ErrSessionInvalidCookie (401)
  - All other repo errors                         → ErrSessionTransient (503)
The session middleware maps ErrSessionTransient to HTTP 503 with
Retry-After: 1. Pre-fix, every DB hiccup looked like a forged-cookie
401 and forced the user to re-authenticate on a transient outage.
Three new regression tests pin the wire shape:
  - TestService_Validate_TransientSessionGetError (service layer)
  - TestService_Validate_SessionNotFoundMapsToInvalidCookie (negative
    leg: not-found stays 401)
  - TestSessionMiddleware_TransientErrorMappedTo503 (middleware-level
    503 + Retry-After header)

Nit-2: isJWKSFetchError documentation now pins go-oidc/v3 v3.18.0 as
the source-of-truth string set. v3.18.0 exposes only
*oidc.TokenExpiredError as a typed error; JWKS-fetch failures bubble
up as fmt.Errorf-wrapped strings. New regression test
TestIsJWKSFetchError_GoOIDCV318Strings pins the canonical substrings
emitted by go-oidc's jwks.go — a future upstream bump that changes
the wording trips the test and forces the matcher to be re-derived.
The test caught a real gap: 'oidc: failed to decode keys' (emitted
when the IdP returns non-JSON at the jwks_uri — broken proxy, gateway
HTML error page, etc.) was previously misclassified as a generic 500
instead of 503 ErrJWKSUnreachable. Added 'decode keys' substring to
the matcher.

Status: LOW-6 + Nit-2 marked CLOSED in audit-doc table.

Refs: cowork/auth-bundles-fixes-2026-05-10/HANDOFF.md items 8, 9
      cowork/auth-bundles-audit-2026-05-10.md LOW-6, Nit-2
shankar0123
2026-05-10 22:41:19 +00:00
parent 9cce2ab043
commit e7c4654b16
6 changed files with 138 additions and 3 deletions
@@ -90,6 +90,20 @@ func NewSessionMiddleware(svc SessionValidator) func(http.Handler) http.Handler
UserAgent: r.UserAgent(),
})
if verr != nil {
// Audit 2026-05-10 LOW-6 closure — ErrSessionTransient
// means the backend hit a retryable error (DB hiccup,
// connection reset, etc.) rather than the cookie being
// malformed. Surface 503 + Retry-After so well-behaved
// clients that honor it (curl --retry, retrying fetch
// wrappers, MCP clients) retry instead of forcing the user to
// re-auth on a transient issue. Pre-fix, every DB error
// looked like a forged-cookie 401.
if errors.Is(verr, ErrSessionTransient) {
w.Header().Set("Retry-After", "1")
w.Header().Set("Content-Type", "application/json; charset=utf-8")
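// http.Error would reset Content-Type to text/plain, so
// write the status and JSON body directly.
w.WriteHeader(http.StatusServiceUnavailable)
_, _ = w.Write([]byte(`{"error":"transient backend error; retry"}`))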
return
}
// Cookie present but invalid (expired / tampered /
// retired-key / IP-bind / UA-bind / revoked). Defer to
// the next middleware so a valid Bearer can still