Commit Graph

3 Commits

Author SHA1 Message Date
shankar0123 757e2ec30c auth-bundle-2 Phase 3: OIDC service (HandleAuthRequest, HandleCallback,
RefreshKeys), hand-rolled group-claim resolver, 21+ negative-test
matrix, token-leak hygiene, IdP downgrade-attack defense

Phase 3 of the bundle ships the business logic that turns the Phase 2
storage primitives into a working OpenID Connect 1.0 + RFC 7636 PKCE
authorization-code flow against any enterprise IdP (Okta / Azure AD /
Google Workspace / Keycloak / Authentik / Auth0).

Service surface:

  - Service.HandleAuthRequest(providerID) -> authURL, cookie, preLoginID
    Builds the IdP redirect with PKCE-S256 (mandatory; RFC 9700 §2.1.1),
    server-generated 32-byte state + nonce, persisted to the pre-login
    row keyed by the cookie value.
  - Service.HandleCallback(cookie, code, state, ip, ua) -> *CallbackResult
    11-step validation: pre-login lookup-and-consume (single-use),
    constant-time state compare, code-for-token exchange with PKCE
    verifier, ID-token verify (alg pin via go-oidc/v3), service-layer
    re-checks of iss / aud / azp (multi-aud requires it; mismatch
    rejected) / at_hash (REQUIRED when access_token returned —
    Phase 3 lifts the OIDC core "MAY" to a service-level "MUST") /
    exp / iat-window / nonce, group-claim resolution with userinfo
    fallback, group->role mapping (fail-closed on no match),
    user upsert, session mint via SessionMinter port.
  - Service.RefreshKeys(providerID) — explicit cache eviction +
    re-load. Re-runs the IdP downgrade-attack defense so a provider
    that later rotates to advertising HS* / none is caught BEFORE the
    next user login attempt.

Security posture (every fail-closed branch is a sentinel error +
test):

  - Algorithm pinning: allow-list {RS256, RS512, ES256, ES384, EdDSA};
    deny-list {HS256, HS384, HS512, none}. Belt-and-braces re-check
    via isDisallowedAlg after go-oidc.Verify.
  - PKCE-S256 mandatory (oauth2.GenerateVerifier + S256ChallengeOption);
    `plain` rejection sentinel exists for defense-in-depth.
  - State + nonce: 32-byte crypto/rand, base64url-no-pad,
    constant-time compare, single-use.
  - IdP downgrade-attack defense: at provider creation / RefreshKeys,
    reject any IdP whose discovery doc advertises HS* / none in
    id_token_signing_alg_values_supported.
  - JWKS fail-closed: in-flight login fails 503; existing sessions
    untouched. isJWKSFetchError detects the gooidc verify-error
    shape; ErrJWKSUnreachable is the wire mapping.
  - Token-leak hygiene: ID tokens, access tokens, refresh tokens,
    authorization codes, PKCE verifiers, state, nonce, signing key
    bytes — NEVER logged at any level. logging_test.go pins the
    invariant via a slog buffer + grep-assert across HandleAuthRequest,
    HandleCallback, alg rejection, and provider-load paths.

Group-claim resolver (internal/auth/oidc/groupclaim/):

  - Hand-rolled per Decision 10 (no JSON-path lib; ~150 LOC).
  - URL-shape paths (https:// / http://) treated as a single
    literal key — Auth0 namespaced claims like
    https://your-namespace/groups work without splitting on the
    dots in the URL.
  - Dot-separated paths walked through nested map[string]interface{}.
  - []interface{} / []string / single-string normalized to []string;
    bool / number / object / nil → fail closed.
  - 18 unit tests + sentinels (ErrPathEmpty, ErrSegmentMissing,
    ErrSegmentNotObject, ErrInvalidValueType).

Test surface:

  - service_test.go: 57 test functions including all 21 prompt-mandated
    negative cases (wrong aud / wrong iss / expired / unknown alg /
    alg=none / HMAC alg / azp missing on multi-aud / azp mismatched /
    at_hash missing / at_hash mismatched / iat in future / iat too old /
    nonce mismatched / state mismatched / state replayed / PKCE plain
    sentinel / pre-login replay / forged cookie / IdP downgrade /
    group-claim missing / group-claim unmapped) plus the userinfo
    fallback matrix (happy path + endpoint-missing + endpoint-failing +
    userinfo-also-empty), HandleAuthRequest entry point + RNG-failure
    paths, upsertUser update + create + display-name fallback +
    Validate-error paths, decryptClientSecret real-encrypt round-trip
    + bad-passphrase, alg-parser malformed-header matrix.
  - logging_test.go: 4 hygiene tests pinning no token / code / verifier /
    state / cookie / client_secret / alg name appears in any captured
    log line.
  - groupclaim/resolver_test.go: 18 cases covering Okta string-array,
    Keycloak realm_access.roles, Auth0 namespaced URL claim,
    single-string normalization, deeply-nested 3-segment walks, and
    every fail-closed branch.

Coverage:
  internal/auth/oidc                  92.2%  (floor: 90)
  internal/auth/oidc/groupclaim      100.0%  (floor: 95)
  internal/auth/oidc/domain           96.2%  (floor: 90)

Coverage gates added at .github/coverage-thresholds.yml so a future
regression in any fail-closed branch fails CI before the commit lands.

Phase 3 of cowork/auth-bundle-2-prompt.md is closed. Next up: Phase 4
(Session service: cookies, revocation, sliding-vs-absolute expiry).
2026-05-10 04:56:03 +00:00
shankar0123 795d7725b8 auth-bundle-2 Phase 1: OIDC + Session + User + Breakglass domain types
Phase 1 ships the persisted-shape types Bundle 2 needs end-to-end.
No DB migrations, no service layer, no HTTP handlers; Phase 2 ships
the SQL, Phase 3+ ship the consumers. Each type has a Validate()
method that enforces the on-disk invariants the schema will mirror,
and a focused _test.go that pins each invariant's failure mode.

Per-package summary:

internal/auth/oidc/domain/ (OIDCProvider + GroupRoleMapping):
* OIDCProvider carries the operator-configured IdP record. Fields
  match the prompt's Phase 1 list plus IATWindowSeconds and
  JWKSCacheTTLSeconds (Phase 3 references these by name; landing
  them in Phase 1's domain type avoids the lying-field gap).
  ClientSecretEncrypted is opaque from this layer; it is the v2 blob
  produced by internal/crypto/encryption.go and is `json:"-"` so it
  never wire-leaks.
* Validate() rejects: invalid id prefix, empty name, non-https
  issuer_url (matches Phase 3's "JWKS endpoint MUST be HTTPS"),
  empty client_id, empty client_secret_encrypted, non-https
  redirect_uri, invalid groups_claim_format, scopes missing openid,
  IAT window outside (0, 600], JWKS cache TTL below 60s. Defaults
  applied in-place: GroupsClaimPath="groups", GroupsClaimFormat=
  "string-array", Scopes=["openid","profile","email"],
  IATWindowSeconds=300, JWKSCacheTTLSeconds=3600,
  TenantID="t-default".
* GroupRoleMapping carries the operator-configured group-to-role
  rule. Validate() pins prefix conventions ("grm-", "op-", "r-")
  and non-empty group name.
* 18 tests across happy-path + every negative invariant.

internal/auth/session/domain/ (Session + SessionSigningKey):
* Session covers BOTH the post-login row (full 1h-idle/8h-absolute
  cookie lifecycle) AND the Phase 5 pre-login row (10-minute TTL,
  carries OIDC state+nonce+PKCE verifier across the IdP redirect).
  IsPreLogin discriminates. CSRFTokenHash holds SHA-256 of the
  CSRF token plaintext (the plaintext lives in a JS-readable
  certctl_csrf cookie; storing only the hash on the row defends
  against DB-read leaks per the Phase 4 CSRF contract).
* Validate() pins: id prefix "ses-", non-empty actor id/type,
  signing key id prefix "sk-", AbsoluteExpiresAt strictly > Idle,
  IdleExpiresAt strictly > CreatedAt, CSRFTokenHash exactly 64
  lowercase hex chars when set.
* Cookie naming constants pinned by a separate test
  (TestCookieNamingConstants) so a future rename can't silently
  break the GUI's web/src/api/client.ts which reads these names by
  string.
* SessionSigningKey stores the v2-encrypted HMAC key material; the
  retired-before-created invariant catches malformed rows. 14
  tests across both types.

internal/auth/user/domain/ (User):
* Federated-human identity for SSO logins. Distinct from Bundle 1's
  free-form actor_id strings: actor_roles.actor_id = User.ID for
  federated humans (per the prompt's note about how the two
  identity systems intersect).
* WebAuthnCredentials JSONB column reserved for v3 (Decision 12);
  defaults to "[]" on Validate() so Bundle 2 + v3 share the same
  on-disk format from day one.
* Email validation is intentionally loose (basic shape: one @,
  non-empty local + domain, no whitespace, dot in domain). RFC 5321
  / 5322 grammars are not enforced; the IdP issued the email and
  we trust its shape, only rejecting gross corruption.
* 8 tests across happy-path + invalid-id + empty-email +
  malformed-email + invalid-provider-id + tenant defaulting +
  WebAuthn-credentials passthrough.

internal/auth/breakglass/domain/ (BreakglassCredential):
* Phase 7.5 type. Argon2id PHC-format password hash; Validate()
  pins the Argon2id magic prefix so non-Argon2id formats (bcrypt,
  pbkdf2, plaintext) are rejected at the persistence boundary.
* MinPasswordLengthBytes (12) + MaxPasswordLengthBytes (256)
  constants pinned by a dedicated test so the operator-facing
  password-strength contract can't drift silently.
* IsLocked(now) helper exposes the lockout state machine for the
  Phase 7.5 service to consume; the lockout window default is
  15min in the service layer.
* 9 tests across happy-path + per-invariant negative + lockout
  state machine + tenant defaulting.

Cross-cutting:
* Every type has json:"-" on the encrypted-credential field
  (ClientSecretEncrypted, KeyMaterialEncrypted, PasswordHash,
  CSRFTokenHash) so even a misconfigured handler that marshals the
  domain type directly into a response body cannot leak the
  secret. Mirrors Bundle 1's pattern for issuer/target credentials.
* Every type carries TenantID with Validate() defaulting to
  authdomain.DefaultTenantID. Forward-compat for the future
  managed-service multi-tenant activation; Bundle 2 ships
  single-tenant.

Verifications:
* gofmt -l clean across all 8 new files (one round-trip required to
  satisfy Go 1.19+ doc-comment list-formatting rules in
  session/domain/types.go).
* go vet clean on internal/auth/oidc/... + session/... + user/... +
  breakglass/...
* go test -short -count=1 green on all four new domain packages
  (49 test functions total).
* go test -short -count=1 still green on Bundle 1 packages
  (internal/auth, internal/auth/bootstrap, internal/service/auth,
  internal/config).
* govulncheck ./... clean (M-024 hard CI gate).
* All 24 ci-guards pass locally.

Phase 1 exit criteria from cowork/auth-bundle-2-prompt.md:
* All types compile: yes.
* Validators have at least 5 test cases each: yes (smallest is
  User with 8 tests; OIDCProvider has 13).
* make verify equivalent green: gofmt + vet + go test pass
  (golangci-lint deferred to CI per the same operating-rule
  pattern Phase 0 used).
2026-05-10 03:41:46 +00:00
shankar0123 7d7bda93ba auth-bundle-2 Phase 0: dependency-add + oidc auth-type literal + runtime guard
Bundle 2 Phase 0 stages the dependencies + auth-type discriminator
literal that later phases consume. No handler chain wired yet; an
operator who sets CERTCTL_AUTH_TYPE=oidc on this commit gets a clear
refuse-to-start error rather than a silent fallback to api-key (the
G-1 failure mode that drove "jwt" out of the allowed set).

Deliverables:

* go.mod: github.com/coreos/go-oidc/v3 v3.18.0 added as a direct
  require. Per the pre-bundle dependency audit (Apache-2.0, zero CVEs
  ever per OSV.dev, 2,400+ stars, used by Hashicorp Vault + Dex +
  Hydra + Authentik + every Kubernetes OIDC integration), this is the
  ecosystem-standard Go OIDC client. Pinned to a specific minor
  (v3.18.0) per the prompt's "no bare latest" rule.
* go.mod: golang.org/x/oauth2 promoted from // indirect to direct,
  bumped from v0.34.0 to v0.36.0 by go mod tidy. Both versions are
  OSV-clean. Maintained by the Go team.
* No JSON-path library added (forbidden by the dependency audit; the
  group-claim resolver is hand-rolled in Phase 3).
* internal/config/config.go: AuthTypeOIDC constant added with a
  load-bearing comment explaining (a) this is the AUTH-TYPE literal,
  not a JWT alg literal, so the G-1 closure invariant is preserved
  ("jwt" stays out of ValidAuthTypes forever); (b) the runtime guard
  in cmd/server/main.go intentionally refuses-to-start when oidc is
  set pre-Phase-6 to avoid the silent-downgrade failure mode.
  ValidAuthTypes() now returns {api-key, none, oidc}.
* internal/config/config_test.go: TestValidAuthTypesIsExactly_APIKey_None
  renamed to TestValidAuthTypesIsExactly_APIKey_None_OIDC and now pins
  the 3-entry set. TestValidAuthTypesDoesNotContainJWT (G-1 closure
  test) still passes because "jwt" is never added back.
  TestValidate_GenericInvalidAuthType's bad-types list updated:
  "oidc" removed (now valid), "saml" added (correctly rejected per
  Decision 5's SAML deferral).
* cmd/server/main.go: defense-in-depth runtime auth-type guard now
  has an explicit AuthTypeOIDC case that exit(1)s with an actionable
  message: "the OIDC auth chain is not yet wired in this build (Auth
  Bundle 2 Phase 6 ships the session middleware that consumes this
  auth-type literal)." This closes the lying-field gap the literal
  would otherwise create. Phase 6 of Bundle 2 relaxes this case to
  fall through alongside api-key + none.
* api/openapi.yaml: /v1/auth/info auth_type enum extended from
  [api-key, none] to [api-key, none, oidc] with an in-line comment
  explaining the Phase-0-vs-Phase-6 timing so an OpenAPI consumer
  isn't surprised by "oidc" appearing here pre-Bundle-2-merge.
* deploy/helm/certctl/templates/_helpers.tpl::certctl.validateAuthType:
  valid set extended to include "oidc". Chart-time validation now
  passes for type=oidc; the binary's runtime guard takes over to
  refuse the start. Once Bundle 2 ships, the runtime guard relaxes
  and OIDC works end-to-end with no further chart edits.
* .env.example: CERTCTL_AUTH_TYPE comment block updated to document
  the three valid values + the Phase-0-vs-Phase-6 timing.
* internal/auth/oidc/doc.go: new package directory with package doc
  + transitional blank imports for coreos/go-oidc/v3 + x/oauth2 so
  go mod tidy keeps both deps as direct requires until Phase 3's
  service.go replaces the blanks with real symbol use. Doc explains
  the package layout (oidc/ + oidc/domain/ + oidc/groupclaim/ +
  oidc/testfixtures/) so the post-Bundle-2 reader can navigate.

Verifications:
* gofmt clean on every changed file.
* go vet clean on internal/config + cmd/server + internal/auth/oidc.
* go test -short -count=1 green on internal/config (including the
  G-1 closure + new validation tests), cmd/server, internal/auth (all
  Bundle 1 packages), internal/service/auth.
* govulncheck ./... clean (M-024 hard CI gate).
* All 24 ci-guards pass locally.

Phase 0 exit criteria from cowork/auth-bundle-2-prompt.md:
* go.mod shows coreos/go-oidc/v3 as direct: yes.
* golang.org/x/oauth2 is direct (not indirect): yes.
* govulncheck ./... clean: yes.
* No JSON-path library in go.mod / go.sum deltas: confirmed (only
  v3 of go-oidc + the x/oauth2 bump landed).
* make verify green: gofmt + vet + go test pass; full make verify
  (which would invoke golangci-lint) deferred to CI since the
  sandbox doesn't have golangci-lint installed; the operator runs
  make verify locally before pushing per CLAUDE.md operating rule.
2026-05-10 03:31:51 +00:00