auth-bundle-2 Phase 6: session middleware + CSRF token plumbing +
chained-auth combinator + AuthInfo OIDC providers extension + 2 CI
guards (Bundle-1-compat + Bundle-1-to-2-upgrade)

Phase 6 wires the Phase 4 session service + Phase 5 OIDC handlers into
the request path. Two middlewares + one combinator land in
internal/auth/session/middleware.go:

  1. SessionMiddleware reads `certctl_session` cookie, validates via
     SessionService.Validate, populates the legacy UserKey/AdminKey
     + Phase 3 RBAC context keys (ActorIDKey/ActorTypeKey/TenantIDKey)
     so downstream RequirePermission + audit-attribution see a
     consistent caller. Best-effort UpdateLastSeen keeps the idle-
     expiry sliding window fresh. CRITICALLY: never 401s on validate
     failure — defers to the next middleware so the chained-auth
     combinator can fall back to Bearer.

  2. CSRFMiddleware gates state-changing methods (POST/PUT/DELETE/
     PATCH) for session-authenticated requests. API-key actors are
     EXEMPT (no session row in context => CSRF doesn't apply; they're
     not browser-driven). Constant-time-compares SHA-256(X-CSRF-Token
     header) against the session row's stored hash via
     SessionService.ValidateCSRF. Mismatch returns 403.

  3. ChainAuthSessionThenBearer is the load-bearing chained-auth
     combinator: tries the session cookie first; on miss/invalid,
     falls back to the API-key Bearer middleware; if neither
     authenticates, 401. The composition uses bearerSkipIfAuthenticated
     so a request with both a valid session AND a valid Bearer uses
     the session (cookie wins per the Bundle 2 contract).
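
The session-then-Bearer behaviour described above can be sketched as follows. This is a minimal illustration, not the code in internal/auth/session/middleware.go: `actorKey`, `sessionStep`, `bearerStep`, `chain`, `probe`, the toy key map and the request path are all hypothetical names chosen for the example, and the real middleware populates several context keys rather than one.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// actorKey is a hypothetical context key standing in for the real
// UserKey/AdminKey and RBAC keys.
type actorKey struct{}

// sessionStep authenticates from the certctl_session cookie when it can.
// On a missing or invalid cookie it writes nothing and defers to next.
func sessionStep(validate func(token string) (string, bool), next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if c, err := r.Cookie("certctl_session"); err == nil {
			if actor, ok := validate(c.Value); ok {
				r = r.WithContext(context.WithValue(r.Context(), actorKey{}, actor))
			}
		}
		next.ServeHTTP(w, r) // never a 401 from this step
	})
}

// bearerStep is a toy API-key check that is skipped when the session
// step already authenticated the request (cookie wins).
func bearerStep(keys map[string]string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if _, ok := r.Context().Value(actorKey{}).(string); ok {
			next.ServeHTTP(w, r)
			return
		}
		h := r.Header.Get("Authorization")
		if strings.HasPrefix(h, "Bearer ") {
			if actor, ok := keys[strings.TrimPrefix(h, "Bearer ")]; ok {
				ctx := context.WithValue(r.Context(), actorKey{}, actor)
				next.ServeHTTP(w, r.WithContext(ctx))
				return
			}
		}
		http.Error(w, "unauthorized", http.StatusUnauthorized)
	})
}

// chain composes the two steps: session first, Bearer as fallback.
func chain(validate func(string) (string, bool), keys map[string]string, next http.Handler) http.Handler {
	return sessionStep(validate, bearerStep(keys, next))
}

// probe sends one request through the chain and returns the status code
// plus the response body (the actor name on success).
func probe(cookie, bearer string) (int, string) {
	validate := func(tok string) (string, bool) { return "user:alice", tok == "good-session" }
	keys := map[string]string{"good-key": "apikey:ci"}
	final := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, r.Context().Value(actorKey{}))
	})
	req := httptest.NewRequest(http.MethodGet, "/api/v1/certificates", nil)
	if cookie != "" {
		req.AddCookie(&http.Cookie{Name: "certctl_session", Value: cookie})
	}
	if bearer != "" {
		req.Header.Set("Authorization", "Bearer "+bearer)
	}
	rec := httptest.NewRecorder()
	chain(validate, keys, final).ServeHTTP(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	for _, c := range [][2]string{{"good-session", "good-key"}, {"expired", "good-key"}, {"tampered", ""}} {
		code, body := probe(c[0], c[1])
		fmt.Println(code, strings.TrimSpace(body))
	}
}
```

The point of the sketch is the asymmetry: only the Bearer step is allowed to reject, so a stale cookie can never block a request that also carries a valid API key.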

Middleware chain order in cmd/server/main.go (per Phase 6 spec):

  RequestID → Logging → Recovery → CORS → RateLimit → AUTH (chained:
  session → Bearer) → CSRF (state-changing only; API-key exempt) →
  Audit → Handler

The chained authMiddleware replaces the bare Bundle-1 bearerMiddleware
at the chain entry point; csrfMiddleware lands immediately after so
session-authenticated requests pass through CSRF before audit. Both
new middlewares are pass-throughs when sessionService is nil
(pre-Phase-4 builds).

AuthInfo extension (Category E): GET /api/v1/auth/info now returns the
list of configured OIDC providers (id + display_name + login_url
where login_url = `/auth/oidc/login?provider=<id>`) so the GUI Login
page renders the correct "Sign in with X" buttons. Endpoint stays
auth-exempt; the providers list is public configuration. Wired via
HealthHandler.OIDCProvidersResolver + a new OIDCProvidersListResolver
projection interface; the cmd/server adapter
oidcProvidersListAdapter projects the postgres OIDCProviderRepository
into the public-safe shape. Resolver lookups are best-effort: failures
fall back to the minimal payload rather than 500-ing the GUI's auth
probe. Nil resolver preserves the pre-Phase-6 minimal shape so test
fixtures + no-db deploys keep compiling.
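
A rough sketch of the best-effort behaviour described above: a nil resolver or a failing lookup both yield the minimal payload. The JSON field names `auth_type` and `oidc_providers`, the `resolver` function type, and the helper names are assumptions made for this example (the commit only names id, display_name and login_url), and the query escaping in `loginURL` is an addition of the sketch.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
)

// providerInfo carries the three public fields named in the commit.
type providerInfo struct {
	ID          string `json:"id"`
	DisplayName string `json:"display_name"`
	LoginURL    string `json:"login_url"`
}

// authInfo is a hypothetical payload shape; only the provider fields
// come from the commit message.
type authInfo struct {
	AuthType      string         `json:"auth_type"`
	OIDCProviders []providerInfo `json:"oidc_providers,omitempty"`
}

// resolver is a simplified stand-in for OIDCProvidersListResolver.
type resolver func() ([]providerInfo, error)

// loginURL builds the login link for a provider id.
func loginURL(id string) string {
	return "/auth/oidc/login?provider=" + url.QueryEscape(id)
}

// authInfoHandler degrades to the minimal payload when the resolver is
// nil or returns an error, instead of answering 500.
func authInfoHandler(resolve resolver) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		out := authInfo{AuthType: "api-key"}
		if resolve != nil {
			if providers, err := resolve(); err == nil {
				out.OIDCProviders = providers
			}
		}
		w.Header().Set("Content-Type", "application/json")
		_ = json.NewEncoder(w).Encode(out)
	}
}

// failing simulates a repository error.
func failing() ([]providerInfo, error) { return nil, errors.New("db unavailable") }

// working simulates one configured provider.
func working() ([]providerInfo, error) {
	return []providerInfo{{ID: "okta", DisplayName: "Okta", LoginURL: loginURL("okta")}}, nil
}

// fetch calls the handler once and returns status and body.
func fetch(resolve resolver) (int, string) {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/api/v1/auth/info", nil)
	authInfoHandler(resolve)(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	for _, r := range []resolver{nil, failing, working} {
		code, body := fetch(r)
		fmt.Print(code, " ", body)
	}
}
```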

Bypass list preserved (Category E): the existing public-route
allowlist in router.AuthExemptRouterRoutes is preserved by virtue of
those routes registering via direct r.mux.Handle (they bypass the
entire chain). The protocol-endpoint allowlist (ACME/SCEP/EST/OCSP/
CRL) bypasses via cmd/server/main.go::buildFinalHandler URL-prefix
dispatch — those routes never reach the auth middleware at all. Both
preservations are pinned by the Bundle-1 compat CI guard below.
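
The URL-prefix dispatch idea can be illustrated with a small sketch. The prefix list, `buildFinal`, and `statusFor` are hypothetical; the actual prefixes and the shape of buildFinalHandler live in cmd/server/main.go and are not shown in this commit message.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// protocolPrefixes is an assumed allowlist for illustration only.
var protocolPrefixes = []string{"/acme/", "/scep", "/.well-known/est/", "/ocsp", "/crl"}

// buildFinal sends protocol traffic straight to protocolHandler and
// everything else through the authenticated API chain.
func buildFinal(protocolHandler, apiChain http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		for _, p := range protocolPrefixes {
			if strings.HasPrefix(r.URL.Path, p) {
				protocolHandler.ServeHTTP(w, r) // auth middleware never runs
				return
			}
		}
		apiChain.ServeHTTP(w, r)
	})
}

// statusFor sends an unauthenticated GET and reports the status code.
func statusFor(path string) int {
	protocol := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	// Stand-in for the auth chain: without credentials it answers 401.
	api := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		http.Error(w, "unauthorized", http.StatusUnauthorized)
	})
	rec := httptest.NewRecorder()
	buildFinal(protocol, api).ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func main() {
	fmt.Println(statusFor("/acme/directory"), statusFor("/api/v1/certificates"))
}
```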

Tests (internal/auth/session/middleware_test.go):

All 7 Phase 6 spec-mandated middleware-chain tests pass:

  1. Session cookie + correct CSRF → 200.
  2. Session cookie + wrong CSRF → 403.
  3. Bearer-only (no session) + no CSRF → 200 (API-key actors are
     CSRF-exempt by design).
  4. No cookie + no Bearer → 401.
  5. Expired cookie + valid Bearer → fall back to Bearer succeeds.
  6. Tampered cookie → 401 (no Bearer to fall back to).
  7. Bypass-list awareness — state-changing method, no auth, no
     session row → uniform 401 (NOT a CSRF 403; the CSRF check is
     gated on session-row presence and never fires for unauth
     requests).

Plus coverage-lift tests covering nil-service pass-through, safe-
methods bypass, SessionFromContext nil + populated, isStateChangingMethod
matrix, clientIPFromRequest variants (RemoteAddr / XFF first-hop /
XFF single / no-port), nil-bearer chain branches.

Coverage on internal/auth/session/middleware.go: 100% per-function
across the 9 entry points (SessionValidator interfaces +
NewSessionMiddleware + NewCSRFMiddleware + ChainAuthSessionThenBearer +
bearerSkipIfAuthenticated + SessionFromContext + isStateChangingMethod
+ clientIPFromRequest + lastIndexByte). Package coverage 94.9%.

Two new CI guards:

  scripts/ci-guards/bundle-1-compat-regression.sh — Bundle-1-only
  compat invariants. Static-source checks that protect the Bundle-1
  path since spinning up docker-compose + running the integration
  test suite is sandbox-infeasible:
    1. SessionMiddleware MUST defer-to-next on missing/invalid cookie.
    2. CSRFMiddleware MUST be pass-through on missing session row.
    3. cmd/server/main.go MUST wire ChainAuthSessionThenBearer.
    4. The 4 public OIDC routes MUST be in AuthExemptRouterRoutes.
    5. AuthInfo MUST guard on OIDCProvidersResolver != nil.

  scripts/ci-guards/bundle-1-to-2-upgrade-regression.sh — Bundle-1 →
  Bundle-2 upgrade invariants:
    1. Migrations 000034..000037 use CREATE TABLE IF NOT EXISTS.
    2. Migrations are wrapped in BEGIN; ... COMMIT;.
    3. NO DROP TABLE / ALTER ... DROP COLUMN against any of the 21
       protected Bundle-1 tables (api_keys, audit_events, certificates,
       certificate_versions, certificate_profiles, issuers, targets,
       agents, jobs, owners, teams, agent_groups, notifications, roles,
       permissions, role_permissions, actor_roles, tenants, approvals,
       intermediate_cas, issuance_approval_requests).
    4. 000037 INSERTs use ON CONFLICT DO NOTHING (idempotent re-apply).
    5. ChainAuthSessionThenBearer is wired (Bundle-1 Bearer keys
       continue to authenticate post-upgrade).
    6. Bootstrap handler is registered (fresh-deployment bootstrap
       still works).

Both guards are sandbox-feasible static analysis. When the operator
gets a Linux VM with docker-in-docker, promote both to real `docker
compose up` integration tests against a v2.1.0 baseline DB dump.

Verifications: gofmt clean, go vet ./internal/auth/... ./internal/api/...
./cmd/server/... clean, go test -short -count=1 -race green across
internal/auth/session (94.9% coverage), internal/api/handler,
internal/api/router, no regressions in Bundle 1 packages, both new
ci-guards green.
shankar0123
2026-05-10 06:22:25 +00:00
parent 9c679a5960
commit 3189f3cd71
6 changed files with 1031 additions and 3 deletions
@@ -0,0 +1,107 @@
#!/usr/bin/env bash
# scripts/ci-guards/bundle-1-compat-regression.sh
#
# Auth Bundle 2 / Phase 6 Bundle-1-only compat regression.
#
# Pre-commit invariant: a deployment with CERTCTL_AUTH_TYPE=api-key,
# zero OIDC providers configured, and zero session cookies on requests
# behaves byte-identically to Bundle 1.
#
# Phase 6 wires session middleware into the chain:
# RequestID -> Logging -> Recovery -> CORS -> RateLimit ->
# Auth (session-then-Bearer fallback) -> CSRF -> Audit -> Handler
#
# The Bundle-1 path MUST be unaffected when:
# - The request has no `certctl_session` cookie (the session
# middleware defers to the next handler).
# - There are no OIDC providers configured (no IdPs to redirect to).
# - The actor authenticated with an API key: the CSRFMiddleware MUST
# be a pass-through (no session row in context => no CSRF check).
#
# This guard checks the static-source invariants that protect the
# Bundle-1 path, since spinning up docker-compose + running the full
# integration test suite is sandbox-infeasible. Concretely:
#
# 1. session.NewSessionMiddleware MUST defer to next on missing OR
# invalid cookie (not 401). If a future refactor changes that to
# a 401, the Bearer fallback path breaks and every API-key request
# fails.
#
# 2. session.NewCSRFMiddleware MUST be a pass-through when the
# session row is absent from context. A future refactor that
# checks CSRF on Bearer requests would break every programmatic
# API client.
#
# 3. session.ChainAuthSessionThenBearer MUST be the entry point
# authMiddleware refers to in cmd/server/main.go. A regression
# that drops the chain and goes straight to bearerMiddleware
# breaks the session login path; a regression that drops the
# bearer middleware entirely breaks every Bundle-1 client.
#
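# 4. The 4 public OIDC routes MUST be in router.AuthExemptRouterRoutes
# (so /auth/oidc/login etc. don't go through the auth chain on a
# Bundle-1-only deployment AND don't 401 a user trying to start
# a login).
#
# 5. The AuthInfo handler MUST guard on OIDCProvidersResolver != nil
# so test fixtures and no-db deploys keep the minimal payload.
#
# Each invariant is pinned by one or two greps that fail the build on
# regression; the direct-401 check under invariant 1 only warns.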
#
# When the sandbox-feasibility constraint changes (operator gets a
# Linux VM with docker-in-docker for the CI runs), promote this to a
# real `docker compose up` integration test that runs the existing
# test suite + asserts zero new 401s vs the v2.1.0 baseline. Until
# then, the static checks below are the load-bearing pin.
set -e

ROOT="$(git rev-parse --show-toplevel 2>/dev/null || pwd)"
cd "$ROOT"

fail=0

# Invariant 1: SessionMiddleware MUST defer-to-next on cookie miss/invalid.
if ! grep -q 'next.ServeHTTP(w, r)' internal/auth/session/middleware.go; then
  echo "::error::SessionMiddleware no longer defers to next on missing cookie"
  fail=1
fi
if grep -q 'http.Error.*StatusUnauthorized' internal/auth/session/middleware.go; then
  echo "::warning::SessionMiddleware appears to write 401 directly; verify Bearer fallback still works"
fi

# Invariant 2: CSRFMiddleware MUST be pass-through on missing session row.
if ! grep -qE 'sessionContextKey\{\}\)\.\(\*sessiondomain\.Session\)' internal/auth/session/middleware.go; then
  echo "::error::CSRFMiddleware no longer reads session row from context"
  fail=1
fi
if ! grep -qE 'if !ok \|\| sess == nil \{$' internal/auth/session/middleware.go; then
  echo "::error::CSRFMiddleware no longer pass-throughs on missing session row (API-key actors must be CSRF-exempt)"
  fail=1
fi

# Invariant 3: chained-auth combinator MUST be the entry point in main.go.
if ! grep -q 'session.ChainAuthSessionThenBearer' cmd/server/main.go; then
  echo "::error::cmd/server/main.go does not wire session.ChainAuthSessionThenBearer"
  fail=1
fi
# [[:space:]] (rather than the GNU-only \s) keeps the pattern portable;
# ':?=' accepts both '=' and ':=' assignments.
if ! grep -qE 'bearerMiddleware[[:space:]]*:?=[[:space:]]*auth\.NewAuthWithKeyStore' cmd/server/main.go; then
  echo "::error::cmd/server/main.go no longer constructs the Bundle-1 Bearer middleware"
  fail=1
fi

# Invariant 4: public OIDC routes are in the auth-exempt allowlist.
for route in 'GET /auth/oidc/login' 'GET /auth/oidc/callback' 'POST /auth/oidc/back-channel-logout' 'POST /auth/logout'; do
  if ! grep -qF "\"$route\"" internal/api/router/router.go; then
    echo "::error::router.AuthExemptRouterRoutes is missing entry: $route"
    fail=1
  fi
done

# Invariant 5: AuthInfo extension MUST gracefully degrade when no
# OIDCProvidersResolver is wired (test-fixture + no-db-deploy paths).
if ! grep -q 'if h.OIDCProvidersResolver != nil' internal/api/handler/health.go; then
  echo "::error::AuthInfo no longer guards on OIDCProvidersResolver != nil"
  fail=1
fi

if [ "$fail" -eq 0 ]; then
  echo "OK: Bundle-1 compat regression invariants hold."
fi
exit "$fail"
@@ -0,0 +1,150 @@
#!/usr/bin/env bash
# scripts/ci-guards/bundle-1-to-2-upgrade-regression.sh
#
# Auth Bundle 2 / Phase 6 Bundle-1 → Bundle-2 upgrade regression.
#
# Pre-commit invariant: an existing v2.1.0 (Bundle-1-shipped) deployment
# upgraded in place to Bundle 2 must:
#
# (a) Have all Bundle-2 migrations apply cleanly. The new migrations
# (000034 oidc_providers, 000035 sessions, 000036 users, 000037
# oidc_pre_login + auth.session.*/auth.oidc.* permissions) MUST
# be additive: no DROP TABLE / ALTER TABLE ... DROP COLUMN that
# would break a Bundle-1 dump.
#
# (b) Bundle 1's CERTCTL_BOOTSTRAP_TOKEN path keeps working for fresh
# deployments without an admin (bootstrap.go invariant; pinned
# by Bundle 1 Phase 6 tests).
#
# (c) Existing minted admin's API key continues to authenticate every
# Bundle 1 endpoint (chained-auth combinator's Bearer fallback).
#
# (d) Existing admin's role grants in actor_roles survive the upgrade
# (additive migrations preserve all rows).
#
# (e) Bundled certctl-agent continues to authenticate against
# agent-demo-1 (Bundle 1 demo path; pinned by demo-compose.yml).
#
# This guard checks the static-source invariants that protect those
# properties since spinning up a v2.1.0 dump + upgrading is sandbox-
# infeasible. Concretely:
#
# 1. Migrations 000034..000037 use `CREATE TABLE IF NOT EXISTS` (not
# `CREATE TABLE`) so re-running against a partially-migrated DB
# doesn't error.
#
# 2. Migrations 000034..000037 are wrapped in `BEGIN; ... COMMIT;`
# so a partial failure rolls back cleanly.
#
# 3. NO migration in the 000034..000037 range runs `DROP TABLE` or
# `ALTER TABLE ... DROP COLUMN` against any Bundle-1 table
# (api_keys, audit_events, certificates, certificate_versions,
# certificate_profiles, issuers, targets, agents, jobs, owners,
# teams, agent_groups, notifications, roles, permissions,
# role_permissions, actor_roles, tenants, etc.). Adding a new
# table or extending an existing one with a NULLable column or
# DEFAULT-valued column is fine.
#
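# 4. INSERT INTO permissions / role_permissions in 000037 use
# `ON CONFLICT (id) DO NOTHING` / equivalent so a Bundle-2 deploy
# whose v2.1.0 baseline already has the rows doesn't duplicate
# them.
#
# 5. cmd/server/main.go wires session.ChainAuthSessionThenBearer and
# still constructs the Bundle-1 Bearer middleware, so v2.1.0-minted
# API keys keep authenticating (property (c)).
#
# 6. cmd/server/main.go still references bootstrapHandler, so the
# fresh-deployment bootstrap path survives (property (b)).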
#
# When the sandbox-feasibility constraint changes, promote this to a
# real `pg_dump` round-trip from a v2.1.0 baseline + apply migrations
# + assert the row counts on the protected Bundle-1 tables match
# pre-upgrade.
set -e

ROOT="$(git rev-parse --show-toplevel 2>/dev/null || pwd)"
cd "$ROOT"

fail=0

PHASE2_RANGE="000034 000035 000036 000037"

# Bundle-1 tables that MUST NOT be DROPPED or have columns DROPPED in
# the Bundle-2 migration range. Adding columns or new tables is fine.
PROTECTED_TABLES=(
  api_keys audit_events certificates certificate_versions
  certificate_profiles issuers targets agents jobs owners teams
  agent_groups notifications roles permissions role_permissions
  actor_roles tenants approvals intermediate_cas
  issuance_approval_requests
)

for num in $PHASE2_RANGE; do
  upfile=$(ls migrations/${num}_*.up.sql 2>/dev/null | head -1)
  if [ -z "$upfile" ]; then
    echo "::warning::no migration ${num}_*.up.sql found; skipping invariants for this number"
    continue
  fi

  # Invariant 1: CREATE TABLE IF NOT EXISTS.
  if grep -E '^CREATE TABLE [^[:space:]]' "$upfile" | grep -v 'IF NOT EXISTS' >/dev/null; then
    echo "::error::$upfile uses 'CREATE TABLE' without 'IF NOT EXISTS': re-running against a partially-migrated DB will fail"
    fail=1
  fi

  # Invariant 2: BEGIN ... COMMIT wrapping.
  if ! grep -q '^BEGIN;' "$upfile"; then
    echo "::error::$upfile is not wrapped in 'BEGIN;'"
    fail=1
  fi
  if ! grep -q '^COMMIT;' "$upfile"; then
    echo "::error::$upfile is not wrapped in 'COMMIT;'"
    fail=1
  fi

  # Invariant 3: no DROP TABLE / ALTER ... DROP COLUMN against
  # protected Bundle-1 tables.
  for tbl in "${PROTECTED_TABLES[@]}"; do
    if grep -qE "DROP TABLE[^[:space:]]*[[:space:]]+(IF EXISTS )?$tbl([[:space:]]|;|$)" "$upfile"; then
      echo "::error::$upfile contains DROP TABLE against protected Bundle-1 table: $tbl"
      fail=1
    fi
    if grep -qE "ALTER TABLE[[:space:]]+$tbl[[:space:]].*DROP COLUMN" "$upfile"; then
      echo "::error::$upfile contains ALTER TABLE ... DROP COLUMN against protected Bundle-1 table: $tbl"
      fail=1
    fi
  done
done

# Invariant 4: 000037 INSERTs use ON CONFLICT DO NOTHING.
upfile37=$(ls migrations/000037_*.up.sql 2>/dev/null | head -1)
if [ -n "$upfile37" ]; then
  if grep -q 'INSERT INTO permissions' "$upfile37"; then
    if ! grep -q 'ON CONFLICT.*DO NOTHING' "$upfile37"; then
      echo "::error::$upfile37 INSERT INTO permissions missing ON CONFLICT DO NOTHING"
      fail=1
    fi
  fi
  if grep -q 'INSERT INTO role_permissions' "$upfile37"; then
    if ! grep -q 'ON CONFLICT.*DO NOTHING' "$upfile37"; then
      echo "::error::$upfile37 INSERT INTO role_permissions missing ON CONFLICT DO NOTHING"
      fail=1
    fi
  fi
fi

# Invariant 5: ChainAuthSessionThenBearer's Bearer fallback MUST be
# wired in cmd/server/main.go so existing v2.1.0-minted API keys
# continue to authenticate.
if ! grep -q 'session.ChainAuthSessionThenBearer' cmd/server/main.go; then
  echo "::error::cmd/server/main.go does not wire the chained-auth combinator (Bundle-1 Bearer keys would stop authenticating)"
  fail=1
fi
if ! grep -q 'auth.NewAuthWithKeyStore(authKeyStore)' cmd/server/main.go; then
  echo "::error::cmd/server/main.go does not construct the Bundle-1 Bearer middleware"
  fail=1
fi

# Invariant 6: bootstrap path is preserved, so the v2.1.0 path still
# works for fresh deployments without an admin.
if ! grep -q 'bootstrapHandler' cmd/server/main.go; then
  echo "::error::cmd/server/main.go does not register the bootstrap handler: fresh-deployment bootstrap broken"
  fail=1
fi

if [ "$fail" -eq 0 ]; then
  echo "OK: Bundle-1 → Bundle-2 upgrade regression invariants hold."
fi
exit "$fail"