fix(oidc/bcl): jti replay-cache + iat freshness check (HIGH-3 closure)

Closes HIGH-3 of the 2026-05-10 audit. Pre-fix the BCL handler
accepted any logout_token whose iat + jti were syntactically present
but never checked (a) that iat fell within a skew window or (b) that
jti hadn't been seen before. A captured logout_token was replayable
indefinitely; once CRIT-2 was fixed, every replay would revoke the
user's current sessions — persistent DoS. RFC 9700 §2.7 + OIDC BCL
1.0 §2.5 require jti replay defense.

- Migration 000040_bcl_replay_cache: oidc_bcl_consumed_jtis table with
  composite PK on (jti, issuer_url) — RFC 7519 §4.1.7 per-issuer
  uniqueness — and an expires_at index for the GC sweep.

- repository.BCLReplayRepository interface + ErrBCLJTIAlreadyConsumed
  sentinel. Postgres impl uses INSERT...ON CONFLICT DO NOTHING
  RETURNING true for atomic single-use semantics in one round-trip.

- handler.DefaultBCLVerifier gains WithMaxAge + nowFn clock seam. iat
  freshness check rejects tokens whose iat is in the future beyond
  max-age OR stale beyond it. Verifier signature extended:
  Verify(ctx, jwt) (iss, sub, sid, jti string, iat int64, err error).

- handler.AuthSessionOIDCHandler gains BCLReplayConsumer (interface)
  + WithBCLReplayConsumer(consumer, maxAge) setter. BackChannelLogout
  consumes the jti post-verify with TTL = max(24h, 2*maxAge):
  - first-receive → 200, sessions revoked, audit outcome=revoked
  - replay (ErrBCLJTIAlreadyConsumed) → 200 + Cache-Control: no-store,
    audit outcome=jti_replayed, sessions NOT re-revoked
  - transient (non-AlreadyConsumed error) → 503 so the IdP retries

- internal/scheduler/scheduler.go: SetBCLReplayGarbageCollector wires
  SweepExpired into the existing session-GC tick (no separate ticker
  for short-lived replay rows).

- cmd/server/main.go: bclMaxAge from cfg.Auth.OIDCBCLMaxAgeSeconds
  (default 60s, env CERTCTL_OIDC_BCL_MAX_AGE_SECONDS); bclReplayRepo
  wired into the verifier + handler + scheduler.

- Three regression tests in internal/api/handler/bcl_replay_test.go:
  TestBackChannelLogout_FirstReceiveConsumesJTI,
  TestBackChannelLogout_ReplayedJTIReturns200WithAudit,
  TestBackChannelLogout_TransientConsumeFailureReturns503.

- internal/api/handler/auth_session_oidc_test.go: stubBCLVerifier
  gains jti + iat fields; existing TestBackChannelLogout_* tests
  rewritten for the new Verify return.

Verification gate green: gofmt clean, go vet clean, go test -short
-count=1 on internal/api/handler / internal/api/router /
internal/scheduler / cmd/server / internal/auth/oidc /
internal/auth/breakglass — all pass.

CRIT-1..CRIT-5 + HIGH-1 + HIGH-2 + HIGH-3 of the 2026-05-10 audit
now closed on this branch. Spec at
cowork/auth-bundles-fixes-2026-05-10/07-high-3-bcl-replay-defense.md.

Refs: cowork/auth-bundles-audit-2026-05-10.md HIGH-3
This commit is contained in:
shankar0123
2026-05-10 20:53:29 +00:00
parent 1697845493
commit 15435ca02b
10 changed files with 418 additions and 21 deletions
+26
View File
@@ -92,6 +92,14 @@ type SessionGarbageCollector interface {
GarbageCollect(ctx context.Context) (int, error)
}
// BCLReplayGarbageCollector sweeps expired rows from the BCL consumed-jti
// table. Audit 2026-05-10 HIGH-3 closure — the scheduler invokes this
// alongside the session-GC tick so a single ticker drives both. Concrete
// impl is repository.BCLReplayRepository.SweepExpired.
type BCLReplayGarbageCollector interface {
SweepExpired(ctx context.Context, now time.Time) (int, error)
}
// JobReaperService defines the interface for job timeout reaping used by the scheduler.
type JobReaperService interface {
ReapTimedOutJobs(ctx context.Context, csrTTL, approvalTTL time.Duration) error
@@ -118,6 +126,7 @@ type Scheduler struct {
crlCacheService CRLCacheServicer
acmeGC ACMEGarbageCollector
sessionGC SessionGarbageCollector
bclReplayGC BCLReplayGarbageCollector
jobReaper JobReaperService
logger *slog.Logger
@@ -336,6 +345,13 @@ func (s *Scheduler) SetSessionGarbageCollector(gc SessionGarbageCollector) {
s.sessionGC = gc
}
// SetBCLReplayGarbageCollector wires the BCL consumed-jti GC. Audit
// 2026-05-10 HIGH-3 closure. The sweep runs on the same ticker as the
// session GC loop (no separate interval; replay rows are short-lived).
func (s *Scheduler) SetBCLReplayGarbageCollector(gc BCLReplayGarbageCollector) {
s.bclReplayGC = gc
}
// SetSessionGCInterval configures the interval at which the session GC
// sweep runs. Default 1h. Wire: CERTCTL_SESSION_GC_INTERVAL. Zero or
// negative values are ignored.
@@ -1214,6 +1230,16 @@ func (s *Scheduler) sessionGCLoop(ctx context.Context) {
if _, err := s.sessionGC.GarbageCollect(opCtx); err != nil {
s.logger.Warn("session gc sweep failed (next tick will retry)", "error", err)
}
// Audit 2026-05-10 HIGH-3 — sweep expired BCL consumed-jti
// rows on the same tick. Best-effort; failure logs at WARN
// (the next tick retries).
if s.bclReplayGC != nil {
if n, err := s.bclReplayGC.SweepExpired(opCtx, time.Now().UTC()); err != nil {
s.logger.Warn("bcl replay gc sweep failed (next tick will retry)", "error", err)
} else if n > 0 {
s.logger.Debug("bcl replay gc swept rows", "rows", n)
}
}
}()
}
}