harden(auth/session+oidc): 503/401 split + go-oidc string pin (LOW-6 + Nit-2)

Audit 2026-05-10 — close LOW-6 + Nit-2 from the HANDOFF.md backend
batch (items 8 + 9).

LOW-6: introduce ErrSessionTransient sentinel in session.Service.
session.Validate now distinguishes:
  - errors.Is(err, repository.ErrSessionNotFound) → ErrSessionInvalidCookie (401)
  - All other repo errors                         → ErrSessionTransient (503)
The session middleware maps ErrSessionTransient to HTTP 503 with
Retry-After: 1. Pre-fix, every DB hiccup looked like a forged-cookie
401 and forced the user to re-authenticate on a transient outage.
Three new regression tests pin the wire shape:
  - TestService_Validate_TransientSessionGetError (service layer)
  - TestService_Validate_SessionNotFoundMapsToInvalidCookie (negative
    leg: not-found stays 401)
  - TestSessionMiddleware_TransientErrorMappedTo503 (middleware-level
    503 + Retry-After header)

Nit-2: isJWKSFetchError documentation now pins go-oidc/v3 v3.18.0 as
the source-of-truth string set. v3.18.0 exposes only
*oidc.TokenExpiredError as a typed error; JWKS-fetch failures bubble
up as fmt.Errorf-wrapped strings. New regression test
TestIsJWKSFetchError_GoOIDCV318Strings pins the canonical substrings
emitted by go-oidc's jwks.go — a future upstream bump that changes
the wording trips the test and forces the matcher to be re-derived.
The test caught a real gap: 'oidc: failed to decode keys' (emitted
when the IdP returns non-JSON at the jwks_uri — broken proxy, gateway
HTML error page, etc.) was previously misclassified as a generic 500
instead of 503 ErrJWKSUnreachable. Added 'decode keys' substring to
the matcher.

Status: LOW-6 + Nit-2 marked CLOSED in audit-doc table.

Refs: cowork/auth-bundles-fixes-2026-05-10/HANDOFF.md items 8, 9
      cowork/auth-bundles-audit-2026-05-10.md LOW-6, Nit-2
shankar0123
2026-05-10 22:41:19 +00:00
parent 77860fbcc3
commit acaa81472d
6 changed files with 138 additions and 3 deletions
+20 -1
@@ -908,6 +908,19 @@ func atHashMatches(rawIDToken, accessToken, claimAtHash string) bool {
// error talking to the IdP's jwks_uri during a key rotation event).
// Maps to ErrJWKSUnreachable so the handler returns 503 to the
// in-flight login attempt without auto-revoking existing sessions.
//
// Audit 2026-05-10 Nit-2 — pinned against go-oidc/v3 v3.18.0. As of
// that release, the only typed error exposed by the oidc package is
// `*oidc.TokenExpiredError`; JWKS-fetch failures bubble up as
// fmt.Errorf-wrapped strings from jwks.go's `verify` path
// (`failed to verify signature: fetching keys: ...`,
// `oidc: fetching keys ...`, `oidc: failed to get keys for kid ...`).
// The regression test in service_test.go::TestIsJWKSFetchError_GoOIDCV318Strings
// pins the canonical substrings; a future go-oidc bump that changes
// the wording trips the test and forces this function to be re-derived.
// When go-oidc exposes a typed error (track at
// https://github.com/coreos/go-oidc/issues for the upstream RFE),
// switch to errors.As.
func isJWKSFetchError(err error) bool {
if err == nil {
return false
@@ -915,7 +928,13 @@ func isJWKSFetchError(err error) bool {
msg := err.Error()
return strings.Contains(msg, "fetching keys") ||
strings.Contains(msg, "jwks_uri") ||
-strings.Contains(msg, "key set")
+strings.Contains(msg, "key set") ||
+// go-oidc/v3 v3.18.0 jwks.go:260: `oidc: failed to decode keys`,
+// emitted when the IdP returns non-JSON at the jwks_uri
+// (broken proxy, gateway HTML error page, etc.). Audit
+// 2026-05-10 Nit-2 closure: was previously misclassified as
+// a generic 500 instead of 503 ErrJWKSUnreachable.
+strings.Contains(msg, "decode keys")
}
// decryptClientSecret runs the client_secret_encrypted blob through
+35
@@ -1071,6 +1071,41 @@ func TestService_IsJWKSFetchError(t *testing.T) {
}
}
// TestIsJWKSFetchError_GoOIDCV318Strings pins the canonical go-oidc/v3
// v3.18.0 error wordings against isJWKSFetchError. Audit 2026-05-10
// Nit-2: go-oidc's only typed error as of v3.18.0 is
// *oidc.TokenExpiredError; JWKS-fetch failures bubble up as
// fmt.Errorf-wrapped strings. A future go-oidc bump that changes
// these strings will trip this test and force isJWKSFetchError to be
// re-derived (or, ideally, switched to errors.As against a newly-
// exposed typed error). Without this pin, a silent upstream string
// change would make every JWKS-rotation login surface as 500 instead
// of 503 — the operator-distinguishable wire shape promised by
// ErrJWKSUnreachable.
func TestIsJWKSFetchError_GoOIDCV318Strings(t *testing.T) {
// Canonical substrings observed in go-oidc/v3 v3.18.0 verify path.
// Sources (all under github.com/coreos/go-oidc/v3@v3.18.0/oidc/):
// - jwks.go:175 → fmt.Errorf("fetching keys %w", err)
// - jwks.go:260 → fmt.Errorf("oidc: failed to decode keys: %v %s", ...)
// Also stably matched by isJWKSFetchError's "jwks_uri" + "key set"
// fallbacks (substrings inside go-oidc-emitted strings and our
// own /api/v1/auth/oidc/.../refresh wrap errors).
canonical := []string{
// Direct go-oidc v3.18.0 fmt.Errorf outputs.
"fetching keys: dial tcp: lookup idp.example.com: no such host",
"oidc: failed to decode keys: invalid character 'h' looking for beginning of value",
// Wrap from our own RefreshKeys / verify retry path.
"failed to refresh remote key set: timeout",
"unable to load key set: cancelled",
}
for _, msg := range canonical {
if !isJWKSFetchError(errors.New(msg)) {
t.Errorf("canonical go-oidc v3.18.0 string %q not detected as JWKS-fetch error; "+
"update isJWKSFetchError or pin the new substring", msg)
}
}
}
// TestService_DecryptClientSecret_NoKeyReturnsBytesAsIs covers the
// empty-key short-circuit (used by tests with plaintext blobs).
func TestService_DecryptClientSecret_NoKeyReturnsBytesAsIs(t *testing.T) {