mirror of https://github.com/shankar0123/certctl.git
synced 2026-06-07 18:21:32 +00:00
master
16 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
5d7bc86451 |
fix(oidc): SEC-020 — wrap fetchUserinfoGroups via SafeOIDCContext
Acquisition-audit Sprint 1 follow-up to SEC-001 (2026-05-16). The
original SEC-001 sweep routed two OIDC discovery legs (test_discovery.go
dry-run + service.go runtime provider load) through
validation.SafeHTTPDialContext via the SafeOIDCContext(ctx) helper.
This commit closes one of the two adjacent call sites the sweep missed:
the userinfo-fallback path at service.go::fetchUserinfoGroups.
Pre-fix:
func (s *Service) fetchUserinfoGroups(ctx, entry, token, path) {
...
ts := entry.oauthConfig.TokenSource(ctx, token)
uinfo, err := entry.provider.UserInfo(ctx, ts)
...
}
go-oidc/v3 Provider.UserInfo (oidc.go:351-374) derives its
http.Client from ctx via getClient(ctx) (oidc.go:61-65). Without an
override, the internal doRequest (oidc.go:87-92) falls through to
http.DefaultClient — no SSRF guard, no DNS-rebinding re-resolve at
dial time. An IdP whose discovery doc advertises a userinfo_endpoint
pointing at a reserved address (loopback / link-local /
169.254.169.254 cloud-metadata) would trigger an unguarded HTTPS
egress at userinfo-fetch time. Operator opt-in to fetch_userinfo=true
turns the gap on; the leg fires whenever the ID token doesn't surface
the configured groups claim.
Post-fix:
safeCtx := SafeOIDCContext(ctx)
ts := entry.oauthConfig.TokenSource(safeCtx, token)
uinfo, err := entry.provider.UserInfo(safeCtx, ts)
Context-key shape: gooidc.ClientContext is implemented as
context.WithValue(ctx, oauth2.HTTPClient, client) (go-oidc v3.18.0
oidc.go:57-59). Both go-oidc's getClient AND golang.org/x/oauth2's
internal.ContextClient read the same oauth2.HTTPClient key, so the
SINGLE SafeOIDCContext wrap covers go-oidc-driven HTTP calls
(Provider.UserInfo / Verifier JWKS) AND oauth2-driven HTTP calls
(Config.TokenSource refresh / Exchange). No additional
context.WithValue(ctx, oauth2.HTTPClient, ...) is required.
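Illustration of the key-sharing point, as a stdlib-only sketch. The names httpClientKey, withClient, and clientFrom are hypothetical stand-ins for oauth2.HTTPClient, gooidc.ClientContext, and the library-side lookups; the real key and helpers live in golang.org/x/oauth2 and go-oidc and are not reproduced here. The sketch only shows why a bare ctx ends up on http.DefaultClient while a single wrap redirects every reader of the same key.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"time"
)

// httpClientKey is a hypothetical stand-in for the oauth2.HTTPClient context key.
type httpClientKey struct{}

// withClient mimics the wrap step: store the client under the shared key.
func withClient(ctx context.Context, c *http.Client) context.Context {
	return context.WithValue(ctx, httpClientKey{}, c)
}

// clientFrom mimics a library-side lookup: use the override when present,
// otherwise fall back to http.DefaultClient (the unguarded path).
func clientFrom(ctx context.Context) *http.Client {
	if c, ok := ctx.Value(httpClientKey{}).(*http.Client); ok && c != nil {
		return c
	}
	return http.DefaultClient
}

// usesDefaultClient reports whether a lookup on the (optionally wrapped)
// context ends up at http.DefaultClient.
func usesDefaultClient(wrapped bool) bool {
	ctx := context.Background()
	if wrapped {
		ctx = withClient(ctx, &http.Client{Timeout: 5 * time.Second})
	}
	return clientFrom(ctx) == http.DefaultClient
}

func main() {
	fmt.Println("bare ctx falls back to DefaultClient:", usesDefaultClient(false))
	fmt.Println("wrapped ctx falls back to DefaultClient:", usesDefaultClient(true))
}
```

Any number of readers that consult the same key see the same override, which is the property the single SafeOIDCContext wrap relies on.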
Files touched:
internal/auth/oidc/service.go — wrap ctx in fetchUserinfoGroups
internal/auth/oidc/safehttp.go — extend SEC-001 header comment block
to enumerate the two newly-patched sites (SEC-020 here +
SEC-021 in the next commit) and the oauth2.HTTPClient key-sharing
rationale, so future audits don't re-flag the design as confused
internal/auth/oidc/service_test.go — new test
TestFetchUserinfoGroups_SSRF_BlocksReservedAddress that
stands up a loopback discovery server whose discovery doc
advertises userinfo_endpoint = http://169.254.169.254/userinfo,
constructs *gooidc.Provider via the test-bypassed
oidcDiscoveryClient (setup_test.go's init() pattern), then
RESTORES the production SafeHTTPDialContext-backed client just
before the fetchUserinfoGroups call. Asserts the error wraps
SafeHTTPDialContext's 'refusing to dial reserved address'
rejection rather than a generic connect-refused. Companion to
the TestDefaultBCLVerifier_SSRF_BlocksReservedAddress that
SEC-021 (next commit) adds.
Verified:
gofmt -l internal/ docs/ (clean)
go vet ./... (clean)
go test -race -short ./internal/auth/oidc/... (all green)
TestFetchUserinfoGroups_SSRF_BlocksReservedAddress (new; green)
All 4 cited CI guards pass (openapi-handler-parity,
openapi-codegen-drift, no-sh-c-in-connectors, skip-inventory-drift)
Acceptance grep:
internal/auth/oidc/service.go:963: uinfo, err := entry.provider.UserInfo(safeCtx, ts)
internal/auth/oidc/service.go:1084: provider, err := gooidc.NewProvider(SafeOIDCContext(ctx), cfgRow.IssuerURL)
No bare-ctx UserInfo / NewProvider remains in service.go.
Closes acquisition-audit SEC-020. SEC-021 (BCL discovery re-fetch)
lands in the next commit.
|
||
|
|
e6cfd756ac |
fix(auth): SEC-001 — gate OIDC discovery through SafeHTTPDialContext + ValidateSafeURL
Sprint 1 unified-master-audit closure. Two OIDC discovery call sites
passed the bare request context to gooidc.NewProvider:
- internal/auth/oidc/test_discovery.go:65 (dry-run validator)
- internal/auth/oidc/service.go:1066 (runtime cache load)
gooidc.NewProvider derives its HTTP client from the context via
oidc.ClientContext; with no override it falls through to
http.DefaultClient — no SSRF guard. An admin with auth.oidc.create
could induce server-side HTTPS egress to loopback (127.0.0.1, ::1),
RFC 1918, link-local (169.254.169.254 — cloud-instance metadata),
and IPv6 link-local (fe80::/10). The companion JWKS reachability
probe was already routed through SafeHTTPDialContext via the
Bundle 5 R6 closure; the discovery + claims path bypassed that.
Fix:
- New internal/auth/oidc/safehttp.go: oidcDiscoveryClient (Transport
DialContext = validation.SafeHTTPDialContext) + SafeOIDCContext
helper. Both call sites now wrap ctx through SafeOIDCContext
before NewProvider runs.
- Defense-in-depth: OIDCProvider.Validate calls
validation.ValidateSafeURL on the IssuerURL after the existing
https/parse checks, refusing reserved-address issuers at
provider-creation time.
- TestDiscovery surfaces the SSRF policy error via the result's
Errors slice up-front (early-fail UX rail) before invoking
NewProvider.
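A dial-time guard of the kind described above might look like the following sketch. It is not the project's validation.SafeHTTPDialContext (its implementation is not shown in this log); isReserved, guardControl, and newGuardedClient are hypothetical names, and the reserved-range policy is an assumption built from the stdlib net.IP predicates. The check runs in net.Dialer.Control, after DNS resolution and before connect, so a hostname that re-resolves to a reserved address is still refused.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"net/http"
	"syscall"
	"time"
)

// errReserved is a hypothetical sentinel for refused dials.
var errReserved = errors.New("refusing to dial reserved address")

// isReserved reports whether s is a loopback, private (RFC 1918 / ULA),
// link-local, or unspecified IP. Unparseable input is treated as reserved
// so the guard fails closed.
func isReserved(s string) bool {
	ip := net.ParseIP(s)
	if ip == nil {
		return true
	}
	return ip.IsLoopback() || ip.IsPrivate() || ip.IsLinkLocalUnicast() ||
		ip.IsLinkLocalMulticast() || ip.IsUnspecified()
}

// guardControl is invoked with the already-resolved IP:port for each
// connection attempt, so the policy applies to the address actually dialed.
func guardControl(network, address string, _ syscall.RawConn) error {
	host, _, err := net.SplitHostPort(address)
	if err != nil {
		return err
	}
	if isReserved(host) {
		return fmt.Errorf("%w: %s", errReserved, host)
	}
	return nil
}

// newGuardedClient builds an http.Client whose transport dials through the guard.
func newGuardedClient() *http.Client {
	d := &net.Dialer{Timeout: 5 * time.Second, Control: guardControl}
	return &http.Client{
		Timeout:   10 * time.Second,
		Transport: &http.Transport{DialContext: d.DialContext},
	}
}

func main() {
	// The guard refuses before any packet leaves the host.
	_, err := newGuardedClient().Get("http://127.0.0.1:1/")
	fmt.Println("request error:", err)
}
```

The issuer-URL validation at provider-creation time is a separate, earlier layer; the dial-time check is what covers redirects and DNS answers that change after validation.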
Test seams:
- setup_test.go swaps oidcDiscoveryClient + validateIssuerSSRF
for httptest loopback compatibility, mirroring the existing
jwksProbeClient pattern.
Regression coverage:
- internal/auth/oidc/domain/types_test.go: 5-case table pinning
loopback v4/v6, cloud metadata, link-local v4/v6 rejection.
- internal/auth/oidc/coverage_fill_test.go: same 5 cases against
Service.TestDiscovery via temporarily restoring the production
gate.
Closes SEC-001.
|
||
|
|
21aeed4f4e |
legal: addlicense headers + normalize legacy variants (Phase 0 RED-4)
Phase 0 closure (Path B2, post-rewrite):
addlicense sweep — adds the canonical certctl LLC copyright + BUSL-1.1
SPDX header to every production Go file. Template:
// Copyright 2026 certctl LLC. All rights reserved.
// SPDX-License-Identifier: BUSL-1.1
Coverage: 338 / 338 production Go files (cmd/ + internal/, excluding
*_test.go and **/testdata/**). Pre-sweep coverage was 22 / 338 (6.5%);
post-sweep is 338 / 338 (100%).
Normalized 22 pre-existing legacy headers (`// Copyright (c) certctl`
+ `// SPDX-License-Identifier: BSL-1.1`) and 1 file using a
`Certctl Contributors` attribution. The legacy SPDX ID `BSL-1.1`
is non-standard; the official SPDX identifier for Business Source
License 1.1 is `BUSL-1.1` (capital U). All 338 files now share the
canonical form.
Generated via:
addlicense -c "certctl LLC" -y 2026 \
-f cowork/legal/copyright-header.tpl \
-ignore '**/testdata/**' -ignore '**/*_test.go' \
cmd/ internal/
Verification:
find cmd internal -name '*.go' -not -name '*_test.go' \
-not -path '*/testdata/*' \
-exec grep -L '^// Copyright 2026 certctl LLC' {} \; | wc -l
Returns: 0
gofmt clean. Header additions are comments only, no compile impact.
Closes: cowork/certctl-architecture-diligence-audit.html#fix-RED-4
|
||
|
|
fefeccfa59 |
harden(oidc): relax alg-downgrade IdP-bind check to intersection-empty (Keycloak compat)
Phase-10 live-IdP smoke (Keycloak 26.x via testcontainers-go) revealed
the IdP-bind alg-downgrade check was too strict for real-world IdPs.
6 of the integration tests in internal/auth/oidc/integration_keycloak*_test.go
were failing with:
oidc: IdP advertises weak signing algorithms (HS*/none);
refusing to use as defense against downgrade attacks: HS256
Keycloak 26.x (and several other real-world IdPs — Auth0 when HS-mode is
enabled, some Authentik configs) advertise EVERY alg they're capable of
in the discovery doc's id_token_signing_alg_values_supported field, even
when the realm only signs with RS256 in practice. Pre-fix the IdP-bind
check refused on ANY HS* or 'none' advertisement → no real Keycloak deploy
could ever bind a provider row, hence the integration-test failures.
The strict-deny check was defense-in-depth on top of the load-bearing
per-token alg-pin at sig-verify time (isDisallowedAlg, service.go L1177):
that check rejects every ID token whose JWS header carries an alg outside
DefaultAllowedAlgs, regardless of what the discovery doc advertises.
A forged HS256 token signed with the IdP's RS256 pubkey as HMAC secret
is rejected at sig-verify time → the actual algorithm-confusion attack
is closed by the per-token pin, NOT by the discovery-doc check.
Fix: relax the IdP-bind check to refuse only when the intersection of
advertised vs DefaultAllowedAlgs is EMPTY (the pathological all-weak-alg
IdP case). Keycloak (RS256 + HS256 advertised) now binds successfully;
an HS-only IdP still fails closed.
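A minimal sketch of the intersection-empty rule. The contents of defaultAllowedAlgs below are an assumption (the commit does not enumerate the real DefaultAllowedAlgs), and acceptableAlgs / checkIdPBind are hypothetical names, not the functions in service.go.

```go
package main

import "fmt"

// defaultAllowedAlgs is an ASSUMED allow-list of asymmetric algorithms;
// the real DefaultAllowedAlgs is not shown in the commit message.
var defaultAllowedAlgs = []string{"RS256", "RS384", "RS512", "ES256", "ES384", "ES512", "PS256"}

// acceptableAlgs returns the advertised algorithms that are also allowed.
func acceptableAlgs(advertised []string) []string {
	allowed := make(map[string]bool, len(defaultAllowedAlgs))
	for _, a := range defaultAllowedAlgs {
		allowed[a] = true
	}
	var out []string
	for _, a := range advertised {
		if allowed[a] {
			out = append(out, a)
		}
	}
	return out
}

// checkIdPBind refuses only when the intersection is empty. An empty
// advertisement also yields an empty intersection in this sketch.
func checkIdPBind(advertised []string) error {
	if len(acceptableAlgs(advertised)) == 0 {
		return fmt.Errorf("oidc: IdP advertises no acceptable signing algorithm: %v", advertised)
	}
	return nil
}

func main() {
	fmt.Println("RS256+HS256:", checkIdPBind([]string{"RS256", "HS256"}))
	fmt.Println("HS256 only: ", checkIdPBind([]string{"HS256"}))
}
```

The bind check is deliberately the weaker layer here; the per-token alg pin at verify time still rejects an HS256-signed token even though the provider row binds.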
Changes:
- internal/auth/oidc/service.go: rewrite the alg-check loop at L1067 in
getOrLoad / RefreshKeys to compute the intersection set; refuse only
when no acceptable alg is advertised. ErrIdPDowngradeAdvertised
docstring updated to reflect new contract. DefaultAllowedAlgs
docstring + the package-level design-comment block at L40-72 updated
with v2.1.0-relaxed semantics callouts.
- internal/auth/oidc/test_discovery.go: TestDiscovery dry-run validator
rewritten to surface HS*/none alongside RS* as an informational note
('note: IdP advertises weak algorithms %v alongside acceptable ones')
rather than a hard-fail error. HS-only / none-only still hard-fails.
- internal/auth/oidc/service_test.go: TestService_IdPDowngradeDefense_*
tests updated. Renamed:
- RejectsHSAdvertised → RS256PlusHS256_BindsSuccessfully (positive)
- RejectsNoneAdvertised → RejectsHSOnlyAdvertised (intersection-empty)
- RefreshKeys_CatchesPostLoadDowngrade rotated to HS-only post-load
- internal/auth/oidc/coverage_fill_test.go: TestTestDiscovery_AlgDowngradeDetected
split into _HS256AlongsideRS256_BindsWithNote (positive, asserts note
but no hard-fail) + _HSOnly_StillTrips_HardFail (intersection-empty).
- docs/operator/auth-threat-model.md: OIDC token-validation alg-allow-list
section rewritten to call out the load-bearing-defense hierarchy
(per-token pin first, IdP-bind check defense-in-depth) and document
the v2.1.0 relaxation rationale.
- CHANGELOG.md: ### Security entry under Unreleased.
Verify: go test ./internal/auth/oidc/ -short PASS; gofmt clean; go vet
clean. The Keycloak integration tests should now pass when the operator
re-runs 'make keycloak-integration-test'.
|
||
|
|
11b145b641 |
Merge Fix 06 (HIGH A-6): strict UA/IP binding — close request-empty bypass in MED-16
# Conflicts:
#	CHANGELOG.md
#	internal/api/handler/auth_session_oidc.go
#	internal/api/handler/auth_session_oidc_test.go
|
||
|
|
92519436a1 |
harden(oidc): strict UA/IP binding (A-6) — close request-empty bypass in MED-16
The MED-16 closure (
|
||
|
|
78485f7429 |
fix(auth/users): close MED-11 lying field — DeactivatedAt loaded + enforced on login (A-2)
The MED-11 closure shipped users.deactivated_at + DELETE /api/v1/auth/users/{id}
+ cascade-revoke, but the federated-user soft-delete was reversible: the next
OIDC login under the same (provider, subject) tuple re-minted a session and
re-elevated the user.
Three legs of the chain were severed (each independently CRIT-shaped):
Leg A — postgres/user.go::userColumns omitted `deactivated_at`, so scanUser
never populated User.DeactivatedAt. Every Get / GetByOIDCSubject /
ListAll returned DeactivatedAt = nil regardless of the column value.
Leg B — postgres/user.go::Update SQL omitted `deactivated_at = $X`, so the
handler's `u.DeactivatedAt = now()` mutation was a no-op write at
the SQL level. Even with leg A closed, no row ever flipped.
Leg C — oidc/service.go::upsertUser did not inspect DeactivatedAt on the
existing-user path. Even with legs A + B closed, the OIDC login
would still proceed normally.
The cascade-session-revoke half of the original closure remained correct, but
only for the duration of the user's current cookie. SOC 2 CC6.3 + ISO 27001
A.9.2.6 "user access removal" controls require both immediate revoke AND
persistent block — this fix restores the persistent-block leg.
Closure across layers:
internal/repository/postgres/user.go
- userColumns adds `deactivated_at`
- scanUser reads via sql.NullTime intermediate (column is nullable)
- Create writes deactivated_at explicitly (NULL for new active users;
forward-compat for future seed-data flows that pre-populate the column)
- Update writes deactivated_at on every call; nil DeactivatedAt → NULL
(supports reactivation)
internal/auth/oidc/service.go
- New sentinel ErrUserDeactivated
- upsertUser checks existing.DeactivatedAt != nil BEFORE mutating email /
display_name / last_login_at — preserves last_login_at forensics on
rejected login attempts (defense-in-depth pin against future
"performance optimization" that reorders the gate)
internal/api/handler/auth_session_oidc.go
- classifyOIDCFailure adds typed errors.Is dispatch for ErrUserDeactivated
→ audit category "user_deactivated" (SOC/SIEM observability surface)
internal/api/handler/auth_users.go
- Self-deactivate guard on Deactivate: HTTP 409 + audit row
auth.user_deactivate_self_rejected when caller targets own User row.
Prevents an admin from one-way-door locking themselves out via the
standard handler; break-glass remains the recovery path.
- New Reactivate handler: inverse of Deactivate. Clears DeactivatedAt
via Update; emits auth.user_reactivated audit row. Idempotent on
already-active rows. Sessions revoked at deactivation stay revoked
(cascade irreversible by design — user must complete fresh OIDC
login).
internal/api/router/router.go
- POST /api/v1/auth/users/{id}/reactivate wired with auth.user.deactivate
gate (reactivation is the inverse op, not a separate privilege)
web/src/api/client.ts + web/src/pages/auth/UsersPage.tsx
- authReactivateUser() client function
- Reactivate button on deactivated rows in UsersPage
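The gate-before-mutation ordering in upsertUser can be sketched as follows. The User struct, upsertExisting, and deactivatedUser are simplified, hypothetical stand-ins; the real function also touches display_name and the repository, which are omitted here.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// ErrUserDeactivated mirrors the sentinel named in the commit; the message text is assumed.
var ErrUserDeactivated = errors.New("oidc: user is deactivated")

// User is a reduced stand-in for the persisted user row.
type User struct {
	Email         string
	LastLoginAt   time.Time
	DeactivatedAt *time.Time // nil means active
}

// upsertExisting checks the deactivation gate BEFORE mutating any field,
// so a rejected login leaves email and last_login_at untouched.
func upsertExisting(u *User, email string) error {
	if u.DeactivatedAt != nil {
		return ErrUserDeactivated
	}
	u.Email = email
	u.LastLoginAt = time.Now()
	return nil
}

// deactivatedUser builds a demo row that was soft-deleted just now.
func deactivatedUser(email string) *User {
	now := time.Now()
	return &User{Email: email, DeactivatedAt: &now}
}

func main() {
	u := deactivatedUser("old@example.com")
	err := upsertExisting(u, "new@example.com")
	fmt.Println("rejected:", errors.Is(err, ErrUserDeactivated), "email:", u.Email)

	u.DeactivatedAt = nil // reactivation clears the column
	fmt.Println("after reactivation:", upsertExisting(u, "new@example.com"), "email:", u.Email)
}
```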
Regression coverage:
Postgres (testcontainers, skipped under -short):
TestUserRepository_DeactivatedAt_RoundTrip — Create → set DeactivatedAt
→ Update → Get / GetByOIDCSubject / ListAll round-trip the value
TestUserRepository_DeactivatedAt_CreateWritesNullForActive — new active
user reads back DeactivatedAt = nil
TestUserRepository_DeactivatedAt_CreatePersistsPreDeactivated — Create
with non-nil DeactivatedAt round-trips (forward-compat path)
OIDC service:
TestService_HandleCallback_RejectsDeactivatedUser — errors.Is
ErrUserDeactivated; CallbackResult nil; persisted email / last_login_at
/ deactivated_at NOT mutated by the rejected attempt
TestService_HandleCallback_AllowsReactivatedUser — DeactivatedAt = nil
→ happy path resumes
TestService_HandleCallback_DeactivatedUserPreservesForensics —
defense-in-depth pin against future regressions that reorder the
gate-vs-mutation sequence
Classifier:
TestClassifyOIDCFailure extended — typed dispatch + wrapped variant
round-trip through errors.Is
Handler:
TestAuthUsers_Deactivate_RejectsSelfDeactivate — HTTP 409 + audit
row + cascade-revoke NOT fired + row stays active
TestAuthUsers_Deactivate_OtherUser_HappyPath — HTTP 204 + cascade
fires + row soft-deleted
TestAuthUsers_Reactivate_HappyPath / _IdempotentOnActiveUser /
_UnknownID / _MissingID / _UpdateError
Phase 6 verify gate green on the targeted packages: gofmt clean, go vet
clean, go test -short pass across internal/auth/oidc, internal/api/handler,
internal/api/router, internal/repository/postgres, internal/auth/...,
internal/service/..., internal/tlsprobe/..., internal/trustanchor/...,
internal/validation/...
Spec at cowork/auth-bundles-fixes-2026-05-11/02-crit-deactivated-at-enforcement.md
Closure annotation at cowork/auth-bundles-audit-2026-05-10.md MED-11 row.
Operator advisory in CHANGELOG.md v2.1.0 release notes.
|
||
|
|
172b30b8f1 |
feat(auth): backend endpoints for MED-7 + MED-11 + MED-12
Audit 2026-05-10 MED-7 + MED-11 + MED-12 backend halves.
WHAT.
Four new admin-gated endpoints, served by three handlers:
GET /api/v1/auth/oidc/providers/{id}/jwks-status (auth.oidc.list) — MED-7
GET /api/v1/auth/users (auth.user.read) — MED-11
DELETE /api/v1/auth/users/{id} (auth.user.deactivate) — MED-11
GET /api/v1/auth/runtime-config (auth.role.assign) — MED-12
MED-7 — JWKS health surface
- providerEntry gains 4 stats fields (lastRefreshAt, refreshCount,
lastError, rejectedJWSCount) guarded by the new statsMu sync.Mutex
- RefreshKeys increments refreshCount + records lastRefreshAt
- New JWKSStatus(ctx, providerID) returns *JWKSStatusSnapshot —
surfaced via the new endpoint
- CurrentKIDs intentionally empty (go-oidc's internal JWKS cache
isn't exposed); shape kept for forward compat
MED-11 — federated-user admin
- AuthUsersHandler.List with optional ?oidc_provider_id filter
- AuthUsersHandler.Deactivate sets users.deactivated_at + cascade-
revokes sessions via UserSessionsRevoker (best-effort; revoke
failure does NOT roll back the deactivation)
- Idempotent: re-deactivating an already-deactivated user is a no-op
MED-12 — runtime config
- AuthRuntimeConfigHandler.Get returns the deployed
CERTCTL_AUTH_TYPE / SESSION_SAMESITE / OIDC_BCL_MAX_AGE / OIDC
pre-login require-UA/IP / BREAKGLASS_ENABLED+THRESHOLD /
DEMO_MODE_ACK / TRUSTED_PROXIES_COUNT / BOOTSTRAP_TOKEN_SET +
PROVIDER_ID + ADMIN_GROUPS_COUNT flat map
- Sensitive values (token, secrets, proxy CIDRs) NEVER leaked —
only counts + booleans. Token presence surfaced as 'set/unset'
- Gated auth.role.assign (admin-class) so non-admins can't
enumerate the deployment's auth knobs
cmd/server/main.go wires all three handlers into HandlerRegistry.
internal/api/router/router.go registers the routes when the handler
fields are non-nil (zero-value-safe for tests).
VERIFY.
- go vet ./internal/api/... ./internal/auth/... ./internal/repository/... PASS
- go build ./cmd/server/... PASS
- go test -short -count=1 ./internal/auth/oidc/... PASS (4.1s)
- go test -short -count=1 ./internal/api/handler/... PASS (4.1s)
GUI halves for MED-7 + MED-11 + MED-12 are deferred to the GUI batch (pending).
Refs: cowork/auth-bundles-audit-2026-05-10.md MED-7, MED-11, MED-12
cowork/auth-bundles-fixes-2026-05-10/HANDOFF.md items 11 14 15
|
||
|
|
e005c004e1 |
harden(oidc): JWKS auto-refresh on kid-not-in-cache (MED-6)
Audit 2026-05-10 MED-6 closure.
WHAT.
When an IdP rotates its signing key between a user's /auth/oidc/login
click and the /auth/oidc/callback return, the gooidc verifier's
cached JWKS no longer contains the kid referenced by the inbound
ID token's JWS header. Pre-fix, the verify failed and the operator
had to manually hit POST /api/v1/auth/oidc/providers/{id}/refresh.
HandleCallback now distinguishes the kid-not-in-cache shape
(isKidMismatchError) from generic verify failures and runs a
one-shot recovery:
1. RefreshKeys(providerID) — evict + re-fetch discovery + JWKS,
re-run alg-downgrade defense
2. getOrLoad(providerID) — refresh the cached providerEntry
3. verifier.Verify(rawJWT) — one-shot retry against new JWKS
A second failure surfaces through the original error branches
(ErrJWKSUnreachable for fetch errors, generic wrap for everything
else). NO retry loop — bounded recovery only.
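The bounded recovery above can be sketched as a single refresh followed by a single retry. The patterns are the ones quoted in this commit message; verifyWithOneRefresh and fakeVerifier are hypothetical names, and the real HandleCallback wires RefreshKeys / getOrLoad / verifier.Verify rather than plain closures.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// kidMismatchRe holds the narrow patterns quoted in the commit message.
var kidMismatchRe = regexp.MustCompile(
	`kid .* not found|signing key .* not found|no matching key|key with id .* not found`)

// isKidMismatch matches only the kid-not-in-cache shape; a generic
// "invalid signature" must not trigger a refresh.
func isKidMismatch(err error) bool {
	return err != nil && kidMismatchRe.MatchString(err.Error())
}

// verifyWithOneRefresh runs verify, and on a kid mismatch performs exactly
// one refresh and one retry. It reports whether a refresh was attempted.
func verifyWithOneRefresh(verify, refresh func() error) (refreshAttempted bool, err error) {
	err = verify()
	if err == nil || !isKidMismatch(err) {
		return false, err
	}
	if rerr := refresh(); rerr != nil {
		return true, rerr
	}
	return true, verify() // no loop: a second failure is returned as-is
}

// fakeVerifier scripts the error text for each Verify call ("" or past the
// end of the script means success) and counts calls.
type fakeVerifier struct {
	errs      []string
	calls     int
	refreshes int
}

func (f *fakeVerifier) Verify() error {
	i := f.calls
	f.calls++
	if i < len(f.errs) && f.errs[i] != "" {
		return errors.New(f.errs[i])
	}
	return nil
}

func (f *fakeVerifier) Refresh() error { f.refreshes++; return nil }

func main() {
	rotated := &fakeVerifier{errs: []string{`kid "new-key" not found`}}
	attempted, err := verifyWithOneRefresh(rotated.Verify, rotated.Refresh)
	fmt.Println("rotation:", attempted, err, "refreshes:", rotated.refreshes)

	forged := &fakeVerifier{errs: []string{"invalid signature"}}
	attempted, err = verifyWithOneRefresh(forged.Verify, forged.Refresh)
	fmt.Println("forgery: ", attempted, err, "refreshes:", forged.refreshes)
}
```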
WHY.
Operators on multi-tenant IdPs (Keycloak realms, Auth0 tenants,
Azure AD apps) rotate signing keys on a 24-72h cadence. Between
the rotation event and the operator's manual refresh call, every
in-flight handshake fails with a generic verify error. The fix is
both a UX improvement (auto-recovery, no operator intervention)
AND a security improvement (the audit row now distinguishes
'transient rotation race' from 'genuine forgery attempt' via the
prelogin_kid_mismatch_recovered category vs generic id_token verify
failures).
HOW.
internal/auth/oidc/service.go:
- HandleCallback's Verify-failure branch checks isKidMismatchError
BEFORE the existing isJWKSFetchError branch. On match, runs
RefreshKeys + getOrLoad + verifier.Verify exactly once. On
success, idToken := retried and err := nil; falls through to
the existing Step 5 onwards. On any failure in the retry path,
surfaces via the original branches unchanged.
- isKidMismatchError matcher: pinned go-oidc/v3 v3.18.0 substrings
('kid .* not found', 'signing key .* not found', 'no matching
key', 'key with id .* not found'). Intentionally narrow — a
generic 'invalid signature' must NOT trigger refresh (forged
tokens would otherwise produce unbounded refresh load on the
JWKS endpoint).
internal/auth/oidc/service_test.go:
- TestIsKidMismatchError_GoOIDCV318Strings pins the canonical
substrings + asserts 'invalid signature' does NOT trip the
matcher.
- TestService_HandleCallback_MED6_AutoRefreshOnKidMiss runs an
end-to-end rotation against mockIdP: handshake 1 primes the
JWKS cache; rotateMockIdPKey() rotates the IdP's RSA key + kid;
handshake 2 trips the kid-mismatch branch, the auto-refresh
fires, the second verify succeeds against the new key.
VERIFY.
- go vet ./internal/auth/oidc/... PASS
- go test -short -count=1 -run 'MED6|KidMismatch'
./internal/auth/oidc/... PASS (2/2)
- go test -short -count=1 ./internal/auth/oidc/... PASS (4.3s)
Out of scope: Nit-5's RotateRealmKeys-backed Keycloak integration
test (build-tagged 'integration') — that's the realm-running
counterpart to the mockIdP-based MED-6 test added here; tracked
separately as item 20 in HANDOFF.md.
Refs: cowork/auth-bundles-audit-2026-05-10.md MED-6
cowork/auth-bundles-fixes-2026-05-10/HANDOFF.md item 3
|
||
|
|
2a1a0b347c |
harden(oidc): pre-login UA/IP binding (MED-16) — RFC 9700 §4.7.1
Audit 2026-05-10 MED-16 closure.
WHAT.
Binds the OIDC pre-login row to the (clientIP, userAgent) tuple of
the /auth/oidc/login request, and enforces a constant-time compare
against the /auth/oidc/callback request at consume time. Defeats
replay of a stolen pre-login cookie by a different browser /
source — the secondary defense layer recommended by RFC 9700 §4.7.1
when the primary layer (HMAC integrity + Path=/ + SameSite=Lax on
the cookie) is bypassed via CSRF / XSS / TLS-termination leak.
WHY.
Pre-fix, the pre-login cookie's HMAC verified only that 'some'
caller of /auth/oidc/login was talking to /auth/oidc/callback; it
did not verify that the SAME browser / source was on both sides.
An attacker who exfiltrated the cookie value via any vector could
replay the bytes through their own user-agent and ride the victim's
authorization. RFC 9700 §4.7.1 calls out the gap explicitly and
recommends binding state to a user-agent fingerprint + source IP.
HOW.
Migration:
migrations/000044_prelogin_uaip.up.sql
ALTER TABLE oidc_pre_login_sessions
ADD COLUMN IF NOT EXISTS client_ip TEXT,
ADD COLUMN IF NOT EXISTS user_agent TEXT;
Both nullable for in-flight rolling-deploy compat — the consume-
side check only enforces when both row AND request carry non-empty
values for the leg in question.
Domain:
internal/repository/oidc.go (PreLoginSession) — adds ClientIP +
UserAgent fields.
Repository:
internal/repository/postgres/oidc_prelogin.go — Create persists
via sql.NullString (empty → NULL); LookupAndConsume reads back.
Re-uses package-local nullableString from discovery.go.
Service:
internal/auth/oidc/service.go
- PreLoginStore.CreatePreLogin signature takes (clientIP,
userAgent) as positions 5–6.
- PreLoginStore.LookupAndConsume returns (clientIP, userAgent)
as positions 5–6.
- HandleAuthRequest signature gains (clientIP, userAgent),
threaded to the store.
- HandleCallback adds Step 1.5 — UA / IP constant-time compare
between stored row and incoming request. Per-leg toggles via
preLoginRequireUA / preLoginRequireIP service fields. Empty
values on either side pass through (rolling-deploy + headless-
proxy compat).
- New sentinels ErrPreLoginUAMismatch, ErrPreLoginIPMismatch.
- SetPreLoginBindingRequirements(requireUA, requireIP) helper
for main.go config wiring.
Adapter:
internal/auth/oidc/prelogin.go — PreLoginAdapter passes the new
fields through to the repo row.
Handler:
internal/api/handler/auth_session_oidc.go
- OIDCAuthHandshaker.HandleAuthRequest signature updated.
- LoginInitiate captures clientIPFromRequest + r.UserAgent()
and passes to the service.
- classifyOIDCFailure adds errors.Is dispatch for the two new
sentinels → prelogin_ua_mismatch / prelogin_ip_mismatch
audit categories.
Config:
internal/config/config.go
+ AuthConfig.OIDCPreLoginRequireUA (default true)
env CERTCTL_OIDC_PRELOGIN_REQUIRE_UA
+ AuthConfig.OIDCPreLoginRequireIP (default true)
env CERTCTL_OIDC_PRELOGIN_REQUIRE_IP
cmd/server/main.go calls oidcService.SetPreLoginBindingRequirements
from cfg.Auth.OIDCPreLoginRequire{UA,IP}.
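The Step 1.5 compare described in the Service section, as a hedged sketch. legMatches and checkBinding are hypothetical names, the sentinel message text is assumed, and the UA-before-IP order is an assumption. The empty-value pass-through reflects the MED-16 semantics stated here; the later A-6 commit tightens the request-empty case and is not modeled.

```go
package main

import (
	"crypto/subtle"
	"errors"
	"fmt"
)

// Sentinel names come from the commit; the message text is assumed.
var (
	ErrPreLoginUAMismatch = errors.New("oidc: pre-login user-agent mismatch")
	ErrPreLoginIPMismatch = errors.New("oidc: pre-login client IP mismatch")
)

// legMatches enforces one leg (UA or IP). It passes when the leg is not
// required or when either side is empty (rolling-deploy / legacy rows);
// otherwise it compares in constant time.
func legMatches(require bool, stored, incoming string) bool {
	if !require || stored == "" || incoming == "" {
		return true
	}
	return subtle.ConstantTimeCompare([]byte(stored), []byte(incoming)) == 1
}

// checkBinding compares the stored pre-login row against the callback request.
func checkBinding(requireUA, requireIP bool, storedUA, reqUA, storedIP, reqIP string) error {
	if !legMatches(requireUA, storedUA, reqUA) {
		return ErrPreLoginUAMismatch
	}
	if !legMatches(requireIP, storedIP, reqIP) {
		return ErrPreLoginIPMismatch
	}
	return nil
}

func main() {
	fmt.Println(checkBinding(true, true, "Firefox", "curl", "203.0.113.7", "203.0.113.7"))
	fmt.Println(checkBinding(true, true, "Firefox", "Firefox", "203.0.113.7", "198.51.100.9"))
	fmt.Println(checkBinding(true, true, "", "Firefox", "", "203.0.113.7"))
}
```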
Tests (internal/auth/oidc/service_test.go):
- TestService_HandleCallback_MED16_UAMismatchRejected
- TestService_HandleCallback_MED16_IPMismatchRejected
- TestService_HandleCallback_MED16_BothMatch_Succeeds
- TestService_HandleCallback_MED16_LegacyRowEmptyValues (rolling-
deploy compat — empty stored values pass through)
- TestService_HandleCallback_MED16_RequireUAFalse_AllowsMismatch
(operator escape-hatch — UA mismatch silently allowed)
Mechanical fan-out:
- stubPreLogin / stubPreLoginRepo signatures updated.
- All existing call sites in service_test.go (~40), prelogin_test.go,
bench_test.go, logging_test.go, provider_enabled_test.go,
integration_keycloak_test.go, integration_okta_smoke_test.go,
auth_session_oidc_test.go updated to pass empty strings for the
new params — pre-existing tests do not exercise UA/IP binding
semantics.
VERIFY.
- go vet ./internal/auth/oidc/... ./internal/api/handler/...
./internal/config/... PASS
- go test -short -count=1 -run MED16 ./internal/auth/oidc/... PASS (5/5)
- go test -short -count=1 ./internal/auth/oidc/... PASS (4.6s)
- go test -short -count=1 ./internal/api/handler/... PASS (4.3s)
- go test -short -count=1 ./internal/config/... PASS
Refs: cowork/auth-bundles-audit-2026-05-10.md MED-16
cowork/auth-bundles-fixes-2026-05-10/HANDOFF.md item 6
RFC 9700 §4.7.1 — OAuth 2.0 Security Best Current Practice
|
||
|
|
2cd2a5c52f |
harden(oidc): RFC 9207 iss URL parameter check on callback (MED-17)
Audit 2026-05-10 MED-17 closure.
WHAT.
When the matched IdP's discovery doc advertises
authorization_response_iss_parameter_supported=true (RFC 9207 §3),
HandleCallback now REQUIRES a non-empty `iss` query parameter on
/auth/oidc/callback and enforces a constant-time compare against the
configured provider's IssuerURL. Mismatch maps to two new sentinel
errors (ErrIssParamMissing / ErrIssParamMismatch) that the handler's
classifyOIDCFailure dispatches via errors.Is BEFORE the substring
fall-through, so the audit failure_category remains distinguishable
between the RFC 9207 leg (iss_param_missing / iss_param_mismatch) and
the in-token iss claim leg (id_token_iss_mismatch).
WHY.
The RFC 9207 iss URL parameter is the load-bearing mix-up-attack
defense for multi-tenant IdPs (Keycloak realms, Authentik tenants,
Auth0 tenants, public-trust CAs). Pre-fix the parameter was silently
ignored — an attacker controlling one IdP tenant could route an auth
code to certctl's callback against a different tenant's pre-login
state without detection. Modern Keycloak / Authentik / public-trust
CAs ship the discovery flag by default; legacy IdPs that don't
advertise are unaffected (back-compat preserved).
HOW.
- internal/auth/oidc/service.go
- providerEntry gains issParamSupported bool.
- getOrLoad extends the discovery-claims read to include
authorization_response_iss_parameter_supported, alongside the
existing id_token_signing_alg_values_supported defense.
- HandleCallback's signature gains callbackIss string at position 5.
Step 2.5 runs after the state compare + provider load: when
issParamSupported is true, an empty callbackIss returns
ErrIssParamMissing; a present-but-mismatched value returns
ErrIssParamMismatch (constant-time compare).
- Two new sentinels: ErrIssParamMissing, ErrIssParamMismatch.
ErrIssuerMismatch's doc-string clarified to note it covers the
in-token leg only.
- internal/api/handler/auth_session_oidc.go
- OIDCAuthHandshaker.HandleCallback signature updated.
- LoginCallback reads r.URL.Query().Get("iss") (no TrimSpace —
byte-strict compare upstream) and threads it through.
- classifyOIDCFailure: typed errors.Is dispatch for the three
iss-family sentinels BEFORE the substring fall-through, so the
three cases stay distinguishable in the audit row.
- internal/api/handler/auth_session_oidc_test.go
- stubOIDCSvc.HandleCallback bumped to 7-arg signature.
- TestClassifyOIDCFailure extended with 5 new cases pinning the
iss-family dispatch + a wrapped-error round-trip.
- internal/auth/oidc/service_test.go
- mockIdP gains advertiseIssParameterSupported bool; the
/.well-known/openid-configuration handler emits the claim only
when set (so existing tests stay back-compat).
- 4 new regression tests:
* MED17_NoSupport_AnyIssAccepted — provider doesn't advertise;
arbitrary callbackIss is ignored (back-compat).
* MED17_SupportButMissing — provider advertises; missing iss →
ErrIssParamMissing.
* MED17_SupportButMismatch — provider advertises; wrong iss →
ErrIssParamMismatch (load-bearing mix-up defense).
* MED17_SupportAndCorrect — provider advertises; matching iss →
success path proves the gate isn't over-eager.
- internal/auth/oidc/bench_test.go,
internal/auth/oidc/logging_test.go,
internal/auth/oidc/integration_keycloak_test.go
- Mechanical: all existing HandleCallback call sites updated to
pass "" for callbackIss (matches pre-fix behavior for IdPs that
don't advertise support — the Keycloak integration suite tests
will be re-evaluated once the Keycloak fixture is run against a
realm with the discovery flag enabled).
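The Step 2.5 gate can be sketched in a few lines. checkIssParam is a hypothetical name and the sentinel message text is assumed; the behavior follows the four cases listed above (no support, missing, mismatch, correct), with a byte-strict constant-time compare and no trimming.

```go
package main

import (
	"crypto/subtle"
	"errors"
	"fmt"
)

// Sentinel names come from the commit; the message text is assumed.
var (
	ErrIssParamMissing  = errors.New("oidc: iss parameter missing on callback")
	ErrIssParamMismatch = errors.New("oidc: iss parameter does not match issuer")
)

// checkIssParam enforces the RFC 9207 iss parameter only when the IdP's
// discovery doc advertised support; otherwise any value is ignored.
func checkIssParam(issParamSupported bool, callbackIss, issuerURL string) error {
	if !issParamSupported {
		return nil // back-compat: legacy IdPs are unaffected
	}
	if callbackIss == "" {
		return ErrIssParamMissing
	}
	if subtle.ConstantTimeCompare([]byte(callbackIss), []byte(issuerURL)) != 1 {
		return ErrIssParamMismatch
	}
	return nil
}

func main() {
	const issuer = "https://idp.example.com/realms/a"
	fmt.Println(checkIssParam(false, "anything", issuer))
	fmt.Println(checkIssParam(true, "", issuer))
	fmt.Println(checkIssParam(true, "https://idp.example.com/realms/b", issuer))
	fmt.Println(checkIssParam(true, issuer, issuer))
}
```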
VERIFY.
- go vet ./internal/auth/oidc/... ./internal/api/handler/... PASS
- go test -short -count=1 ./internal/auth/oidc/... PASS (3.4s)
- go test -short -count=1 ./internal/api/handler/... PASS (5.4s)
- 4 new MED-17 regression tests + extended TestClassifyOIDCFailure pass.
Refs: cowork/auth-bundles-audit-2026-05-10.md MED-17
cowork/auth-bundles-fixes-2026-05-10/HANDOFF.md item 7
RFC 9207 — OAuth 2.0 Authorization Server Issuer Identification
|
||
|
|
e7c4654b16 |
harden(auth/session+oidc): 503/401 split + go-oidc string pin (LOW-6 + Nit-2)
Audit 2026-05-10 — close LOW-6 + Nit-2 from the HANDOFF.md backend
batch (items 8 + 9).
LOW-6: introduce ErrSessionTransient sentinel in session.Service.
session.Validate now distinguishes:
- errors.Is(err, repository.ErrSessionNotFound) → ErrSessionInvalidCookie (401)
- All other repo errors → ErrSessionTransient (503)
The session middleware maps ErrSessionTransient to HTTP 503 with
Retry-After: 1. Pre-fix, every DB hiccup looked like a forged-cookie
401 and forced the user to re-authenticate on a transient outage.
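A sketch of the 401/503 split under simplified assumptions: the sentinels are redeclared locally with assumed message text, and classifyRepoErr, writeSessionError, and respond are hypothetical names rather than the session package's API. Only the mapping itself (not-found to 401, everything else to 503 with Retry-After: 1) is taken from the commit.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

var (
	// Local stand-ins for the repository and session sentinels.
	ErrSessionNotFound      = errors.New("repository: session not found")
	ErrSessionInvalidCookie = errors.New("session: invalid cookie")
	ErrSessionTransient     = errors.New("session: transient backend error")

	// Demo inputs.
	errDBDown          = errors.New("dial tcp: connection refused")
	errWrappedNotFound = fmt.Errorf("get session: %w", ErrSessionNotFound)
)

// classifyRepoErr maps a repository error onto the two session sentinels.
func classifyRepoErr(err error) error {
	switch {
	case err == nil:
		return nil
	case errors.Is(err, ErrSessionNotFound):
		return ErrSessionInvalidCookie
	default:
		return fmt.Errorf("%w: %v", ErrSessionTransient, err)
	}
}

// writeSessionError is the middleware-side mapping to HTTP status codes.
func writeSessionError(w http.ResponseWriter, err error) {
	if errors.Is(err, ErrSessionTransient) {
		w.Header().Set("Retry-After", "1")
		w.WriteHeader(http.StatusServiceUnavailable)
		return
	}
	w.WriteHeader(http.StatusUnauthorized)
}

// respond runs both steps against a recorder and returns the wire shape.
func respond(repoErr error) (status int, retryAfter string) {
	rec := httptest.NewRecorder()
	writeSessionError(rec, classifyRepoErr(repoErr))
	return rec.Code, rec.Header().Get("Retry-After")
}

func main() {
	fmt.Println(respond(errWrappedNotFound))
	fmt.Println(respond(errDBDown))
}
```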
Three new regression tests pin the wire shape:
- TestService_Validate_TransientSessionGetError (service layer)
- TestService_Validate_SessionNotFoundMapsToInvalidCookie (negative
leg: not-found stays 401)
- TestSessionMiddleware_TransientErrorMappedTo503 (middleware-level
503 + Retry-After header)
Nit-2: isJWKSFetchError documentation now pins go-oidc/v3 v3.18.0 as
the source-of-truth string set. v3.18.0 exposes only
*oidc.TokenExpiredError as a typed error; JWKS-fetch failures bubble
up as fmt.Errorf-wrapped strings. New regression test
TestIsJWKSFetchError_GoOIDCV318Strings pins the canonical substrings
emitted by go-oidc's jwks.go — a future upstream bump that changes
the wording trips the test and forces the matcher to be re-derived.
The test caught a real gap: 'oidc: failed to decode keys' (emitted
when the IdP returns non-JSON at the jwks_uri — broken proxy, gateway
HTML error page, etc.) was previously misclassified as a generic 500
instead of 503 ErrJWKSUnreachable. Added 'decode keys' substring to
the matcher.
Status: LOW-6 + Nit-2 marked CLOSED in audit-doc table.
Refs: cowork/auth-bundles-fixes-2026-05-10/HANDOFF.md items 8, 9
cowork/auth-bundles-audit-2026-05-10.md LOW-6, Nit-2
|
||
|
|
925523e06e |
feat(oidc): Enabled toggle on OIDCProvider (MED-9)
Audit 2026-05-10 Fix 13 Phase B — close MED-9. MED-4/5/6/7 deferred to v3.
MED-9: ship the OIDCProvider.Enabled boolean. Pre-fix, the only way
to take a provider offline during an incident was DELETE, which
breaks active user_oidc_provider FK references and orphans any
session that minted under the provider. Post-fix:
- Migration 000042 adds enabled BOOLEAN NOT NULL DEFAULT TRUE.
Default-true means existing pre-migration rows are all enabled
post-deploy; no breaking-change window.
- internal/auth/oidc/domain/types.go::OIDCProvider.Enabled ships
the domain field with JSON tag 'enabled'.
- Repository read/write paths (List, Get, GetByName, Create, Update)
all carry the column.
- internal/auth/oidc/service.go::HandleAuthRequest rejects with
the new ErrProviderDisabled sentinel when cfgRow.Enabled=false.
- cmd/server/main.go::oidcProvidersListAdapter.List filters
disabled providers before constructing OIDCProviderInfo so the
LoginPage's 'Sign in with X' buttons never render for offline
IdPs.
- Defense-in-depth: the ErrProviderDisabled service-layer check
is the guard for direct API / MCP / CLI callers that bypass the
GUI.
Regression test: internal/auth/oidc/provider_enabled_test.go warms
the entry cache via a successful HandleAuthRequest, flips
cfgRow.Enabled=false on the cached entry, then asserts the next call
returns ErrProviderDisabled (errors.Is). Test fixtures (newValidProvider,
makeProvider) updated to set Enabled: true so existing tests stay
green.
Operators can toggle Enabled today via the existing PUT
/api/v1/auth/oidc/providers/{id} body field. A dedicated GUI
toggle on OIDCProviderDetailPage and a single-purpose PUT-just-enabled
endpoint are deferred to the v3 GUI-polish bundle — the load-bearing
wire is in place now.
MED-4 (GUI advanced fields on edit), MED-5 (POST .../test endpoint
+ button), MED-6 (JWKS auto-refresh on cache-miss), MED-7 (JWKS
health endpoint + GUI panel): DEFERRED to v3 with explicit
annotations in the audit doc. Workarounds: MED-4 fields are
PUT-editable via curl/MCP; MED-5 → call refresh post-create;
MED-6 → call refresh manually on key rotation.
Refs: cowork/auth-bundles-audit-2026-05-10.md MED-4, MED-5, MED-6,
MED-7, MED-9
Spec: cowork/auth-bundles-fixes-2026-05-10/13-med-bundle.md Phase B
|
||
|
|
739745e9fe |
fix(oidc): enforce AllowedEmailDomains allowlist in HandleCallback
Closes CRIT-5 of the 2026-05-10 audit — the LAST Critical blocker for
v2.1.0. The OIDCProvider.AllowedEmailDomains field shipped persisted
(internal/auth/oidc/domain/types.go:47), API-surfaced
(internal/api/handler/auth_session_oidc.go), MCP-surfaced
(internal/mcp/tools_auth_bundle2.go), and GUI-editable, but the
verifier in internal/auth/oidc/service.go::HandleCallback NEVER read
it. Operators filling allowed_email_domains: ["acme.com"] expected
"users outside acme.com cannot log in" — the field had zero effect.
Textbook lying-field shape per CLAUDE.md's "complete path" rule.
This commit:
- Adds Step 7.5 to HandleCallback (between profile-claim resolve and
group-claim resolve): when the provider's AllowedEmailDomains slice
is non-empty, the user's email-domain MUST match a list entry (case-
insensitive exact match; subdomains NOT auto-accepted — operators
who want dev.acme.com authorized must list it explicitly).
- Two new sentinel errors at the package level:
- ErrEmailDomainNotAllowed — email is set but domain not in list
- ErrEmailMissingButRequired — allowlist set + ID token has no email
- New extractEmailDomain helper: case-folds + trims whitespace + uses
LastIndex for the @ split + rejects empty input / no-@ / empty
local-part / empty domain-part. Returns the lowercase domain or
an error.
- 21 regression tests in internal/auth/oidc/email_domain_test.go:
- 10 extractEmailDomain shape cases (plain, mixed-case input,
leading/trailing whitespace, subdomain preserved, empty, no @,
empty local-part, empty domain-part, multiple @ via LastIndex).
- 11 match-semantic cases (empty list passes any, lowercase match,
mixed-case allowlist entry match, mixed-case email match,
whitespace-padded allowlist entry, unmatched returns
ErrEmailDomainNotAllowed, missing email + non-empty allowlist
returns ErrEmailMissingButRequired, subdomain NOT auto-accepted,
parent-domain NOT auto-accepted, multi-entry first-match,
multi-entry no-match).
Subdomain matching (alice@dev.acme.com against allowlist=[acme.com])
is intentionally NOT auto-accepted. The audit's MED-line tracks the
wildcard / suffix support story for v3; v2.1 ships strict.
Verification gate green:
- gofmt clean
- go vet clean
- go test -short -count=1 ./internal/auth/oidc/... ./internal/api/...
./internal/domain/auth/ — all pass (incl. existing OIDC service
test suite, the 4 BCL tests, the auditor pin, and the AST
RBAC-gate coverage guard).
Branch dev/auth-bundle-2 status post-commit: CRIT-1 (
|
||
|
|
1d01c87663 |
auth-bundle-2 Phase 7 + Phase 7.5: OIDC first-admin bootstrap +
break-glass admin (Argon2id, lockout, default-OFF, surface-invisibility)
Phase 7 — OIDC first-admin bootstrap (Decision 3):
- Optional AdminBootstrapHook closure on *oidc.Service. When wired,
HandleCallback consults the hook AFTER group resolution + user
upsert and BEFORE the empty-mapping fail-closed check. Hook
receives (providerID, groups, userID); returns grantAdmin=true
when the user matches CERTCTL_BOOTSTRAP_ADMIN_GROUPS AND no
admin exists yet in the tenant.
- cmd/server/main.go wires the hook as a closure that:
* Filters by CERTCTL_BOOTSTRAP_OIDC_PROVIDER_ID (if configured).
* Probes AdminExists via authActorRoleRepo (admin-already-exists
silently returns false; bootstrap mode is one-shot per tenant).
* Walks group intersection.
* On match: grants r-admin via authActorRoleRepo.Grant + emits
the bootstrap.oidc_first_admin audit row with
event_category=auth + INFO log.
- Coexists with the Bundle 1 env-var-token bootstrap. Both paths
can be configured; first match wins (admin-existence probe
short-circuits the second).
- HandleCallback's empty-mapping fail-closed check moved AFTER the
hook so a fresh deployment with zero group_role_mappings can
still mint the first admin.
- 5 tests in service_test.go: hook grants admin on match, hook
returns false preserves empty-mapping fail-closed, admin-already-
exists silently falls through to normal mapping, hook-error wraps
+ bubbles, idempotent when admin is already in the mapped role set.
Phase 7.5 — Break-glass admin (Decision 4, default-OFF):
Migration 000038 ships:
- breakglass_credentials table — at-most-one-credential-per-actor
(UNIQUE(actor_id)), Argon2id PHC-format password_hash, lockout
state machine (failure_count, locked_until, last_failure_at).
FK CASCADE on users(id) so deleting a user atomically removes
their credential.
- Two new permissions seeded into r-admin only:
auth.breakglass.admin — set/rotate/unlock/remove credentials.
auth.breakglass.login — actor uses break-glass to log in.
CanonicalPermissions extended in lockstep.
internal/auth/breakglass/service.go (~580 LOC):
- Service.Enabled() reflects CERTCTL_BREAKGLASS_ENABLED.
- SetPassword: Argon2id with OWASP 2024 params (m=64MiB, t=3, p=4,
salt=16 random bytes, output=32 bytes); per-password random salt;
PHC-format hash output. Min 12 / max 256 byte input.
- Authenticate: constant-time-compare via subtle.ConstantTimeCompare
on every code path. Identical 401 + identical timing across the
wrong-password / locked-account / non-existent-actor paths so an
attacker cannot probe whether a given actor has break-glass
configured. Non-existent-actor + locked-account paths run a
verifyDummy() Argon2id pass for timing parity. Lockout state
machine: failure_count++ on every wrong attempt; threshold (default
5) trips locked_until = NOW() + duration (default 15m). Successful
Authenticate resets the counter. Reset window: failures older than
CERTCTL_BREAKGLASS_LOCKOUT_RESET_INTERVAL (default 1h) are cleared
automatically on the next attempt.
- Unlock + RemoveCredential: admin-only (auth.breakglass.admin
gated at the router via rbacGate). Audit rows on every operation.
- All public methods refuse to act when Enabled()==false (returns
ErrDisabled; the handler maps to HTTP 404 — surface invisibility).
internal/repository/postgres/breakglass.go ships the 5-method
postgres impl with atomic single-statement IncrementFailure (so
concurrent racing wrong-password attempts can't observe an
intermediate state and slip past the threshold) and idempotent
ResetFailureCount.
internal/api/handler/auth_breakglass.go ships the 4-endpoint HTTP
surface:
- POST /auth/breakglass/login (auth-exempt; 5/min rate-limited per
source IP via the existing rate limiter; returns 404 when
disabled). On success sets the post-login session cookie + CSRF
cookie via SessionService.Create + 204. On any failure:
uniform 401 + identical timing (the service has already audited
the specific failure category).
- POST /api/v1/auth/breakglass/credentials (auth.breakglass.admin)
- POST /api/v1/auth/breakglass/credentials/{actor_id}/unlock
(auth.breakglass.admin)
- DELETE /api/v1/auth/breakglass/credentials/{actor_id}
(auth.breakglass.admin)
Admin endpoints share the surface-invisibility property: when
CERTCTL_BREAKGLASS_ENABLED=false, every admin endpoint also returns
404 (not 403) so probing via the admin surface gets the same signal
as probing the login endpoint.
Tests (internal/auth/breakglass/service_test.go):
All 8 Phase 7.5 spec-mandated negative cases:
1. Service.Enabled()==false → all ops return ErrDisabled.
2. Wrong password → ErrInvalidCredentials, failure_count++,
audit row with event_category=auth.
3. Failure_count exceeds threshold → locked, subsequent attempts
(including with the CORRECT password) return identical-shape
401 while the lockout window holds.
4. Lockout window expires → next attempt with correct password
succeeds + resets the counter.
5. Password < 12 bytes (or > 256 bytes) → ErrWeakPassword.
6. Password leak hygiene — the service has zero slog calls; the
audit-row map literal never includes the password plaintext.
7. Argon2id hash never appears in logs OR API responses — pinned
by `json:"-"` tag on BreakglassCredential.PasswordHash + a
belt-and-braces json.Marshal probe asserting the hash bytes
never appear in the marshaled output.
8. Constant-time-compare verified via timing-statistical test:
wrong-password vs no-credential paths must complete within a 5x
time ratio of each other. The verifyDummy()
hash compute on the no-credential + locked paths is what
keeps timing parity; absent that, an attacker could side-
channel "actor doesn't have a credential" via timing.
Plus coverage-lift batch covering: SetPassword first-time vs rotate,
no-caller-id rejection, no-target-id rejection, RNG failure surface,
Authenticate happy-path mints session, no-credential audit row,
session-mint-failure surface, FailureResetInterval recycle, Unlock
+ RemoveCredential happy paths, hash-format unit tests (round-trip,
mismatch, malformed/wrong-version/bad-base64 formats), nil-audit +
nil-session pass-through.
Coverage on internal/auth/breakglass/ at 91.5% per-statement (above
the Phase 7.5 spec ≥ 90% floor).
cmd/server/main.go wiring:
- Constructs breakglassRepo + breakglassService + breakglassHandler
after the OIDC service block.
- breakglassSessionMinterAdapter shim bridges *session.Service.Create
to the breakglass.SessionMinter port.
- Logs WARN at boot when CERTCTL_BREAKGLASS_ENABLED=true (operator
visibility for the deliberate SSO-bypass).
internal/config/config.go gains:
- AuthConfig.BootstrapAdminGroups + BootstrapOIDCProviderID for
Phase 7 (CERTCTL_BOOTSTRAP_ADMIN_GROUPS comma-list +
CERTCTL_BOOTSTRAP_OIDC_PROVIDER_ID).
- AuthConfig.Breakglass nested struct with 4 env vars
(CERTCTL_BREAKGLASS_ENABLED + LOCKOUT_THRESHOLD + LOCKOUT_DURATION
+ LOCKOUT_RESET_INTERVAL).
Router wiring:
- 4 new breakglass routes registered when reg.AuthBreakglass != nil;
public login route via direct r.mux.Handle (auth-exempt), 3 admin
routes via r.Register + rbacGate(auth.breakglass.admin).
- POST /auth/breakglass/login pinned in AuthExemptRouterRoutes
allowlist with Phase 7.5 justification.
- SpecParityExceptions extended with 4 new entries documenting
the Phase 7.5 deferral of full per-endpoint OpenAPI rows
(handler doc-block at the top of auth_breakglass.go is the
operator-facing reference).
Threat model (encoded in service.go + auth_breakglass.go doc-blocks
+ migration 000038 docstrings, to be promoted to docs/operator/auth-
threat-model.md in Phase 12):
- Break-glass is a deliberate bypass of the SSO security boundary.
An attacker who phishes the password OR finds it in a compromised
password manager bypasses MFA, OIDC, and every group-claim gate.
- Recommendation: keep CERTCTL_BREAKGLASS_ENABLED=false in steady-
state. Enable only during SSO-broken incidents. Disable after
recovery.
- WebAuthn pairing (v3 per Decision 12) is the load-bearing second
factor. Without it, break-glass is best treated as an emergency-
only path.
- Audit trail surfaces every break-glass action under
event_category=auth; the auditor role can monitor for unexpected
break-glass logins.
Verifications: gofmt clean, go vet clean across all touched packages,
go test -short -count=1 green across internal/auth/oidc (3.0s; new
Phase 7 hook tests integrated alongside the 21+ Phase 3 negatives),
internal/auth/breakglass (3.6s; 8 spec-mandated negatives + coverage
batch passing), internal/config + internal/domain/auth + internal/api/
router + internal/api/handler all green, no regressions in Bundle 1
packages.
|
||
|
|
854135dfb7 |
auth-bundle-2 Phase 3: OIDC service (HandleAuthRequest, HandleCallback,
RefreshKeys), hand-rolled group-claim resolver, 21+ negative-test
matrix, token-leak hygiene, IdP downgrade-attack defense
Phase 3 of the bundle ships the business logic that turns the Phase 2
storage primitives into a working OpenID Connect 1.0 + RFC 7636 PKCE
authorization-code flow against any enterprise IdP (Okta / Azure AD /
Google Workspace / Keycloak / Authentik / Auth0).
Service surface:
- Service.HandleAuthRequest(providerID) -> authURL, cookie, preLoginID
Builds the IdP redirect with PKCE-S256 (mandatory; RFC 9700 §2.1.1),
server-generated 32-byte state + nonce, persisted to the pre-login
row keyed by the cookie value.
- Service.HandleCallback(cookie, code, state, ip, ua) -> *CallbackResult
11-step validation: pre-login lookup-and-consume (single-use),
constant-time state compare, code-for-token exchange with PKCE
verifier, ID-token verify (alg pin via go-oidc/v3), service-layer
re-checks of iss / aud / azp (multi-aud requires it; mismatch
rejected) / at_hash (REQUIRED when access_token returned —
Phase 3 lifts the OIDC core "MAY" to a service-level "MUST") /
exp / iat-window / nonce, group-claim resolution with userinfo
fallback, group->role mapping (fail-closed on no match),
user upsert, session mint via SessionMinter port.
- Service.RefreshKeys(providerID) — explicit cache eviction +
re-load. Re-runs the IdP downgrade-attack defense so a provider
that later rotates to advertising HS* / none is caught BEFORE the
next user login attempt.
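Illustration only: a stdlib sketch of what HandleAuthRequest needs
before redirecting, namely a 32-byte random state / nonce and a PKCE
S256 challenge. The message says the service uses
oauth2.GenerateVerifier + S256ChallengeOption; the helpers below
(randomToken, s256Challenge) are hypothetical stand-ins that show the
same RFC 7636 transform without the oauth2 dependency.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/base64"
	"fmt"
)

// randomToken (hypothetical) returns 32 bytes from crypto/rand encoded
// as base64url without padding, the shape described for state + nonce.
func randomToken() (string, error) {
	buf := make([]byte, 32)
	if _, err := rand.Read(buf); err != nil {
		return "", fmt.Errorf("read random bytes: %w", err)
	}
	return base64.RawURLEncoding.EncodeToString(buf), nil
}

// s256Challenge (hypothetical) computes the RFC 7636 S256 challenge:
// BASE64URL-no-pad(SHA256(ASCII(verifier))).
func s256Challenge(verifier string) string {
	sum := sha256.Sum256([]byte(verifier))
	return base64.RawURLEncoding.EncodeToString(sum[:])
}

func main() {
	state, err := randomToken()
	if err != nil {
		panic(err)
	}
	fmt.Println("state length:", len(state))

	// Verifier from RFC 7636 Appendix B.
	verifier := "dBjftJeZ4CVP-mB92K27uhbUJU1p1r_wW1gFWFOEjXk"
	fmt.Println("challenge:", s256Challenge(verifier))
}
```

With the Appendix B verifier, the challenge should come out as
E9Melhoa2OwvFrEMTJguCHaoeK1t8URWbuGJSstw-cM, which makes a convenient
known-answer test.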
Security posture (every fail-closed branch is a sentinel error +
test):
- Algorithm pinning: allow-list {RS256, RS512, ES256, ES384, EdDSA};
deny-list {HS256, HS384, HS512, none}. Belt-and-braces re-check
via isDisallowedAlg after go-oidc.Verify.
- PKCE-S256 mandatory (oauth2.GenerateVerifier + S256ChallengeOption);
`plain` rejection sentinel exists for defense-in-depth.
- State + nonce: 32-byte crypto/rand, base64url-no-pad,
constant-time compare, single-use.
- IdP downgrade-attack defense: at provider creation / RefreshKeys,
reject any IdP whose discovery doc advertises HS* / none in
id_token_signing_alg_values_supported.
- JWKS fail-closed: in-flight login fails 503; existing sessions
untouched. isJWKSFetchError detects the gooidc verify-error
shape; ErrJWKSUnreachable is the wire mapping.
- Token-leak hygiene: ID tokens, access tokens, refresh tokens,
authorization codes, PKCE verifiers, state, nonce, signing key
bytes — NEVER logged at any level. logging_test.go pins the
invariant via a slog buffer + grep-assert across HandleAuthRequest,
HandleCallback, alg rejection, and provider-load paths.
Group-claim resolver (internal/auth/oidc/groupclaim/):
- Hand-rolled per Decision 10 (no JSON-path lib; ~150 LOC).
- URL-shape paths (https:// / http://) treated as a single
literal key — Auth0 namespaced claims like
https://your-namespace/groups work without splitting on the
dots in the URL.
- Dot-separated paths walked through nested map[string]interface{}.
- []interface{} / []string / single-string normalized to []string;
bool / number / object / nil → fail closed.
- 18 unit tests + sentinels (ErrPathEmpty, ErrSegmentMissing,
ErrSegmentNotObject, ErrInvalidValueType).
Test surface:
- service_test.go: 57 test functions including all 21 prompt-mandated
negative cases (wrong aud / wrong iss / expired / unknown alg /
alg=none / HMAC alg / azp missing on multi-aud / azp mismatched /
at_hash missing / at_hash mismatched / iat in future / iat too old /
nonce mismatched / state mismatched / state replayed / PKCE plain
sentinel / pre-login replay / forged cookie / IdP downgrade /
group-claim missing / group-claim unmapped) plus the userinfo
fallback matrix (happy path + endpoint-missing + endpoint-failing +
userinfo-also-empty), HandleAuthRequest entry point + RNG-failure
paths, upsertUser update + create + display-name fallback +
Validate-error paths, decryptClientSecret real-encrypt round-trip
+ bad-passphrase, alg-parser malformed-header matrix.
- logging_test.go: 4 hygiene tests pinning that no token / code /
verifier / state / cookie / client_secret / alg name appears in any
captured log line.
- groupclaim/resolver_test.go: 18 cases covering Okta string-array,
Keycloak realm_access.roles, Auth0 namespaced URL claim,
single-string normalization, deeply-nested 3-segment walks, and
every fail-closed branch.
Coverage:
internal/auth/oidc 92.2% (floor: 90)
internal/auth/oidc/groupclaim 100.0% (floor: 95)
internal/auth/oidc/domain 96.2% (floor: 90)
Coverage gates added at .github/coverage-thresholds.yml so a future
regression in any fail-closed branch fails CI before the commit lands.
Phase 3 of cowork/auth-bundle-2-prompt.md is closed. Next up: Phase 4
(Session service: cookies, revocation, sliding-vs-absolute expiry).
|