auth-bundle-2 Phase 3: OIDC service (HandleAuthRequest, HandleCallback,

RefreshKeys), hand-rolled group-claim resolver, 21+ negative-test
matrix, token-leak hygiene, IdP downgrade-attack defense

Phase 3 of the bundle ships the business logic that turns the Phase 2
storage primitives into a working OpenID Connect 1.0 + RFC 7636 PKCE
authorization-code flow against any enterprise IdP (Okta / Azure AD /
Google Workspace / Keycloak / Authentik / Auth0).

Service surface:

  - Service.HandleAuthRequest(providerID) -> authURL, cookie, preLoginID
    Builds the IdP redirect with PKCE-S256 (mandatory; RFC 9700 §2.1.1),
    server-generated 32-byte state + nonce, persisted to the pre-login
    row keyed by the cookie value.
  - Service.HandleCallback(cookie, code, state, ip, ua) -> *CallbackResult
    11-step validation: pre-login lookup-and-consume (single-use),
    constant-time state compare, code-for-token exchange with PKCE
    verifier, ID-token verify (alg pin via go-oidc/v3), service-layer
    re-checks of iss / aud / azp (multi-aud requires it; mismatch
    rejected) / at_hash (REQUIRED when access_token returned —
    Phase 3 lifts the OIDC core "MAY" to a service-level "MUST") /
    exp / iat-window / nonce, group-claim resolution with userinfo
    fallback, group->role mapping (fail-closed on no match),
    user upsert, session mint via SessionMinter port.
  - Service.RefreshKeys(providerID) — explicit cache eviction +
    re-load. Re-runs the IdP downgrade-attack defense so a provider
    that later rotates to advertising HS* / none is caught BEFORE the
    next user login attempt.

Security posture (every fail-closed branch is a sentinel error +
test):

  - Algorithm pinning: allow-list {RS256, RS512, ES256, ES384, EdDSA};
    deny-list {HS256, HS384, HS512, none}. Belt-and-braces re-check
    via isDisallowedAlg after go-oidc.Verify.
  - PKCE-S256 mandatory (oauth2.GenerateVerifier + S256ChallengeOption);
    `plain` rejection sentinel exists for defense-in-depth.
  - State + nonce: 32-byte crypto/rand, base64url-no-pad,
    constant-time compare, single-use.
  - IdP downgrade-attack defense: at provider creation / RefreshKeys,
    reject any IdP whose discovery doc advertises HS* / none in
    id_token_signing_alg_values_supported.
  - JWKS fail-closed: in-flight login fails 503; existing sessions
    untouched. isJWKSFetchError detects the gooidc verify-error
    shape; ErrJWKSUnreachable is the wire mapping.
  - Token-leak hygiene: ID tokens, access tokens, refresh tokens,
    authorization codes, PKCE verifiers, state, nonce, signing key
    bytes — NEVER logged at any level. logging_test.go pins the
    invariant via a slog buffer + grep-assert across HandleAuthRequest,
    HandleCallback, alg rejection, and provider-load paths.

Group-claim resolver (internal/auth/oidc/groupclaim/):

  - Hand-rolled per Decision 10 (no JSON-path lib; ~150 LOC).
  - URL-shape paths (https:// / http://) treated as a single
    literal key — Auth0 namespaced claims like
    https://your-namespace/groups work without splitting on the
    dots in the URL.
  - Dot-separated paths walked through nested map[string]interface{}.
  - []interface{} / []string / single-string normalized to []string;
    bool / number / object / nil → fail closed.
  - 18 unit tests + sentinels (ErrPathEmpty, ErrSegmentMissing,
    ErrSegmentNotObject, ErrInvalidValueType).

Test surface:

  - service_test.go: 57 test functions including all 21 prompt-mandated
    negative cases (wrong aud / wrong iss / expired / unknown alg /
    alg=none / HMAC alg / azp missing on multi-aud / azp mismatched /
    at_hash missing / at_hash mismatched / iat in future / iat too old /
    nonce mismatched / state mismatched / state replayed / PKCE plain
    sentinel / pre-login replay / forged cookie / IdP downgrade /
    group-claim missing / group-claim unmapped) plus the userinfo
    fallback matrix (happy path + endpoint-missing + endpoint-failing +
    userinfo-also-empty), HandleAuthRequest entry point + RNG-failure
    paths, upsertUser update + create + display-name fallback +
    Validate-error paths, decryptClientSecret real-encrypt round-trip
    + bad-passphrase, alg-parser malformed-header matrix.
  - logging_test.go: 4 hygiene tests pinning no token / code / verifier /
    state / cookie / client_secret / alg name appears in any captured
    log line.
  - groupclaim/resolver_test.go: 18 cases covering Okta string-array,
    Keycloak realm_access.roles, Auth0 namespaced URL claim,
    single-string normalization, deeply-nested 3-segment walks, and
    every fail-closed branch.

Coverage:
  internal/auth/oidc                  92.2%  (floor: 90)
  internal/auth/oidc/groupclaim      100.0%  (floor: 95)
  internal/auth/oidc/domain           96.2%  (floor: 90)

Coverage gates added at .github/coverage-thresholds.yml so a future
regression in any fail-closed branch fails CI before the commit lands.

Phase 3 of cowork/auth-bundle-2-prompt.md is closed. Next up: Phase 4
(Session service: cookies, revocation, sliding-vs-absolute expiry).
This commit is contained in:
shankar0123
2026-05-10 04:56:03 +00:00
parent 95f1d6cf63
commit 854135dfb7
7 changed files with 3057 additions and 22 deletions
+43
View File
@@ -105,3 +105,46 @@ internal/service/auth:
(ErrUnauthenticated / ErrForbidden / ErrSelfRoleAssignment /
ErrAuthReservedActor / ErrAuthUnknownPermission /
ErrAuthRoleInUse).
internal/auth/oidc:
floor: 90
why: |
Bundle 2 Phase 3 — OIDC service coverage gate. Phase 3 spec
pins the floor at 90 explicitly because every fail-closed
branch is load-bearing for the security posture: alg pinning
(deny-list HS*/none + allow-list RS*/ES*/EdDSA), audience
re-check, azp enforcement on multi-aud tokens, at_hash
REQUIRED-when-access-token-present (Phase 3 lifts the OIDC
core "MAY" to a service-level "MUST"), iat-window window,
nonce constant-time-compare, single-use state replay defense,
PKCE-S256 mandatory, IdP downgrade-attack defense at
provider-load + RefreshKeys time, JWKS-fail-closed semantics,
group-claim resolution + userinfo-fallback fail-closed
semantics, token-leak hygiene. A regression in any one of
these branches is a security incident; the floor catches it
before the commit lands. The mock-IdP fixture in
service_test.go is the load-bearing harness.
internal/auth/oidc/groupclaim:
floor: 95
why: |
Bundle 2 Phase 3 — group-claim resolver. Hand-rolled (no
JSON-path dep per Decision 10); ~150 LOC, every branch
exercised by 19 unit tests covering the documented IdP shapes
(Okta string array, Keycloak realm_access.roles, Auth0
namespaced URL claim, single-string normalization,
deeply-nested 3-segment walks) plus every fail-closed branch
(empty path, missing key, missing nested key, non-object
intermediate, bool/number/object/nil values, array with
non-string element, URL-shape with dots-in-path treated as
literal). Resolver should be at 100%; floor at 95 leaves a
1-statement margin for future error-message refactors.
internal/auth/oidc/domain:
floor: 90
why: |
Bundle 2 Phase 1 — OIDCProvider + GroupRoleMapping domain.
Validation-heavy package; constructors + Validate methods
cover all canonical IdP shapes (Okta / Azure AD / Google
Workspace / Keycloak / Authentik / Auth0). Floor at 90 to
catch any future field that ships without a validator.