auth-bundle-2 Phase 2a: SQL migrations (oidc_providers, sessions, users)

Three new idempotent transactional migrations that materialize the
Phase 1 domain types into Postgres tables. Repository implementations
+ integration tests land as Phase 2b in the next commit.

migrations/000034_oidc_providers.up.sql:
  oidc_providers table with the full OIDCProvider field set
    (issuer_url + client_id + client_secret_encrypted v2 blob +
    redirect_uri + groups_claim_path + groups_claim_format +
    fetch_userinfo + scopes[] + allowed_email_domains[] +
    iat_window_seconds + jwks_cache_ttl_seconds + tenant_id).
  group_role_mappings table linking provider+group_name to role_id.
  Closed-enum CHECK on groups_claim_format ('string-array' or
    'json-path').
  Defense-in-depth bounds CHECKs on iat_window_seconds (1..600) and
    jwks_cache_ttl_seconds (>= 60); app-layer Validate() also
    enforces these.
  ON DELETE CASCADE on group_role_mappings.provider_id so deleting a
    provider cleans up its mappings.
  ON DELETE RESTRICT on group_role_mappings.role_id so an in-use role
    can't be silently dropped.
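
The defense-in-depth pairing above can be sketched as an app-layer
check that mirrors the three CHECK constraints. This is a minimal
illustration, not the project's Validate(): the ProviderLimits type and
validate_provider_limits function are hypothetical names, while the
constraint names and bounds are taken from the migration itself.

```python
from dataclasses import dataclass

# Closed enum mirrored from the groups_claim_format CHECK constraint.
ALLOWED_CLAIM_FORMATS = frozenset({"string-array", "json-path"})


@dataclass(frozen=True)
class ProviderLimits:
    """Hypothetical subset of the provider fields; defaults match the column DEFAULTs."""
    groups_claim_format: str = "string-array"
    iat_window_seconds: int = 300
    jwks_cache_ttl_seconds: int = 3600


def validate_provider_limits(p: ProviderLimits) -> list[str]:
    """Return the names of the CHECK constraints the values would violate."""
    violated = []
    if p.groups_claim_format not in ALLOWED_CLAIM_FORMATS:
        violated.append("oidc_providers_claim_format_check")
    if not (1 <= p.iat_window_seconds <= 600):
        violated.append("oidc_providers_iat_window_bounds")
    if p.jwks_cache_ttl_seconds < 60:
        violated.append("oidc_providers_jwks_ttl_bounds")
    return violated
```

Rejecting bad values before the INSERT gives the caller a named
violation instead of a raw constraint error; the CHECKs still catch
anything that bypasses the app layer.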

migrations/000035_sessions.up.sql:
  session_signing_keys table with key_material_encrypted v2 blob +
    retired_at nullable + the retired-after-created CHECK.
  Partial index on (tenant_id, created_at DESC) WHERE retired_at IS
    NULL backs the GetActive hot path.
  sessions table covers BOTH the post-login row (1h-idle/8h-absolute
    cookie lifecycle) AND the Phase 5 pre-login row (10-minute TTL,
    is_pre_login=true). csrf_token_hash holds the SHA-256 hex digest
    of the CSRF token; the plaintext lives only in a separate
    JS-readable cookie, so a leaked DB read does not expose a token
    that can be replayed.
  Two CHECK constraints pin the expiry order (absolute > idle, idle >
    created); these match the Phase 1 domain Validate() pre-write
    invariants but enforce them at the DB layer too so direct SQL
    inserts can't silently land malformed rows.
  Four indexes cover the hot paths Phase 4's service consumes:
    partial indexes on actor_id (active sessions only), the active
    session lookup, and the pre-login GC sweep (created_at), plus a
    full index on absolute_expires_at for the absolute-expired GC
    sweep.
  ON DELETE RESTRICT on sessions.signing_key_id so a signing key
    referenced by an active session can't be dropped (the retention
    window keeps retired keys valid; full purge waits until every
    session signed under that key has expired).
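
The csrf_token_hash scheme described above can be sketched as follows.
This is an illustration under assumptions, not the service code: the
function names are hypothetical, and the token generator
(secrets.token_urlsafe) is one plausible choice the commit does not
specify. Only the SHA-256 hex storage format comes from the migration.

```python
import hashlib
import hmac
import secrets


def hash_csrf_token(token: str) -> str:
    """SHA-256 hex digest: 64 lowercase hex chars, the stored column format."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def new_csrf_token() -> tuple[str, str]:
    """Return (plaintext for the JS-readable cookie, hash for the sessions row)."""
    token = secrets.token_urlsafe(32)  # assumption: token format is not in the source
    return token, hash_csrf_token(token)


def csrf_token_matches(presented: str, stored_hash: str) -> bool:
    """Hash the presented header value and compare in constant time."""
    return hmac.compare_digest(hash_csrf_token(presented), stored_hash)
```

Only the digest reaches the database, so a row read on its own is not
enough to forge the X-CSRF-Token header.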

migrations/000036_users.up.sql:
  users table for federated-human identity (per-(provider, subject)
    tuple via UNIQUE constraint, not global - identity is per-IdP by
    design).
  webauthn_credentials JSONB DEFAULT '[]' reserved for v3 (Decision
    12); Bundle 2 always stores [].
  Index on (tenant_id, email) for the GUI's "find user by email"
    surface (not unique because the same email can appear under
    multiple providers per the per-IdP identity model).
  ON DELETE RESTRICT on users.oidc_provider_id keeps Phase 3's "delete
    provider only when no users authenticated via it" rule enforced
    at the DB layer; the OIDCProviderRepository.Delete impl will
    translate SQLSTATE 23503 (foreign_key_violation) into a 409
    sentinel.
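
The SQLSTATE-to-sentinel translation could look roughly like the sketch
below. Everything here is hypothetical scaffolding: DriverError stands
in for whatever error type the real database driver raises,
ProviderInUseError stands in for the repository sentinel, and
delete_provider is not the Phase 2b implementation. Only the 23503 code
and the 409 mapping come from the commit.

```python
FOREIGN_KEY_VIOLATION = "23503"  # Postgres SQLSTATE for foreign_key_violation


class DriverError(Exception):
    """Hypothetical stand-in for a driver error that carries a SQLSTATE."""

    def __init__(self, sqlstate: str, message: str = ""):
        super().__init__(message or sqlstate)
        self.sqlstate = sqlstate


class ProviderInUseError(Exception):
    """Hypothetical sentinel: the provider is still referenced by users rows."""


def delete_provider(execute_delete, provider_id: str) -> None:
    """Run the delete; turn an FK RESTRICT failure into the sentinel."""
    try:
        execute_delete(provider_id)
    except DriverError as err:
        if err.sqlstate == FOREIGN_KEY_VIOLATION:
            raise ProviderInUseError(provider_id) from err
        raise  # every other database error propagates unchanged


def http_status_for(err: Exception) -> int:
    """Map the sentinel to 409 Conflict; anything else is a server error."""
    return 409 if isinstance(err, ProviderInUseError) else 500
```

Keeping the translation in the repository layer means handlers never
inspect SQLSTATE codes directly.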

All three migrations:
  Wrapped in BEGIN/COMMIT so partial-fail leaves no half-state.
  IF NOT EXISTS / IF EXISTS / ON CONFLICT DO NOTHING for idempotency
    (the certctl-server boot path applies every migration on every
    start per CLAUDE.md "Idempotent migrations" architecture rule).
  TIMESTAMPTZ for time columns (no TIMESTAMP WITHOUT TIME ZONE).
  TEXT primary keys with prefixes per CLAUDE.md "Architecture
    Decisions" (op- / grm- / sk- / ses- / u-).
  Multi-tenant ready: tenant_id column with DEFAULT 't-default' on
    every row, FK to tenants(id) ON DELETE CASCADE. Bundle 2 ships
    single-tenant; managed-service activation adds tenants without a
    schema migration.
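
The idempotency convention above can be demonstrated with a small
stand-in. This sketch uses an in-memory SQLite database rather than
Postgres, and the example_rows table and the tenants seed row are
hypothetical; it only illustrates that a BEGIN/COMMIT-wrapped script
built from IF NOT EXISTS and ON CONFLICT DO NOTHING can be applied twice
with no change to the schema or data.

```python
import sqlite3

# Hypothetical migration in the same style as the three in this commit.
MIGRATION = """
BEGIN;
CREATE TABLE IF NOT EXISTS tenants (id TEXT PRIMARY KEY);
INSERT INTO tenants (id) VALUES ('t-default') ON CONFLICT DO NOTHING;
CREATE TABLE IF NOT EXISTS example_rows (
    id        TEXT PRIMARY KEY,
    tenant_id TEXT NOT NULL DEFAULT 't-default'
              REFERENCES tenants(id) ON DELETE CASCADE
);
CREATE INDEX IF NOT EXISTS idx_example_rows_tenant
    ON example_rows (tenant_id);
COMMIT;
"""


def apply_migration(conn: sqlite3.Connection) -> None:
    """Apply the whole script; safe to call on every boot."""
    conn.executescript(MIGRATION)


def schema_snapshot(conn: sqlite3.Connection) -> list:
    """List every schema object so two runs can be compared."""
    return conn.execute(
        "SELECT type, name FROM sqlite_master ORDER BY type, name"
    ).fetchall()


conn = sqlite3.connect(":memory:")
apply_migration(conn)
first = schema_snapshot(conn)
apply_migration(conn)  # second boot: nothing to create, nothing fails
second = schema_snapshot(conn)
tenant_count = conn.execute("SELECT COUNT(*) FROM tenants").fetchone()[0]
```

After both runs the two snapshots are identical and tenants still holds
a single row.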

Down migrations exist in lockstep and drop tables in FK-safe order
(group_role_mappings -> oidc_providers; sessions ->
session_signing_keys; users alone). Down migrations are destructive;
each file's header comment calls this out.

Verifications:
  Migration count: ls migrations/*.up.sql | wc -l = 36 (33 from
    Bundle 1 + 3 new).
  BEGIN/COMMIT pair counts: each new migration has exactly one BEGIN
    and one COMMIT.
  No Docker in this sandbox, so the migrations are not applied
    end-to-end here; CI's testcontainers harness runs them via
    postgres.RunMigrations on every push. Phase 2b's repository
    integration tests will exercise the schema against Postgres 16
    Alpine.
shankar0123
2026-05-10 04:08:06 +00:00
parent 795d7725b8
commit aab8b9f13f
6 changed files with 297 additions and 0 deletions
+16
@@ -0,0 +1,16 @@
-- 000034_oidc_providers.down.sql
-- Reverses 000034_oidc_providers.up.sql. Destructive: every configured
-- OIDC provider + every group→role mapping is dropped. Existing OIDC
-- sessions in the `sessions` table (000035) become orphaned but are
-- not auto-revoked here; operators run `certctl-cli auth sessions
-- revoke-all` after a down-migration if they need clean state.
--
-- FK-safe order: group_role_mappings → oidc_providers (mappings ref
-- provider_id, so mappings drop first).
BEGIN;
DROP INDEX IF EXISTS idx_group_role_mappings_provider_id;
DROP TABLE IF EXISTS group_role_mappings;
DROP TABLE IF EXISTS oidc_providers;
COMMIT;
+93
@@ -0,0 +1,93 @@
-- 000034_oidc_providers.up.sql
-- Auth Bundle 2 / Phase 2: OIDC provider configuration + group→role
-- mapping tables. Backs internal/auth/oidc/domain/{OIDCProvider,
-- GroupRoleMapping}. Phase 3 (OIDC service) reads these rows to
-- validate ID tokens against the configured IdP allow-list.
--
-- All operations use IF NOT EXISTS / IF EXISTS / ON CONFLICT DO NOTHING
-- so the migration is idempotent: safe to re-run on every
-- certctl-server boot per the project's "Idempotent migrations"
-- architecture decision. Wrapped in a single transaction so a
-- partial-fail leaves no half-state.
--
-- Schema convention follows CLAUDE.md "Architecture Decisions": TEXT
-- primary keys with prefixes (`op-`, `grm-`), TIMESTAMPTZ for time
-- columns, FK cascade behaviour explicit (group_role_mappings cascades
-- on provider deletion).
--
-- Multi-tenant readiness: every row carries tenant_id with
-- DEFAULT 't-default'. Bundle 2 ships single-tenant; the future
-- managed-service multi-tenant offering activates by inserting
-- additional tenants without a schema migration.
--
-- client_secret_encrypted holds the v2 blob produced by
-- `internal/crypto/encryption.go` (magic byte 0x02 || salt(16) ||
-- nonce(12) || ciphertext+tag). Plaintext NEVER lives in the DB.
BEGIN;
-- OIDC providers: operator-configured IdP records. One row per IdP.
-- N providers supported from day one for the future managed-service
-- offering where a multi-team customer may have multiple IdPs.
CREATE TABLE IF NOT EXISTS oidc_providers (
    id                      TEXT PRIMARY KEY, -- prefix `op-`
    tenant_id               TEXT NOT NULL DEFAULT 't-default'
                            REFERENCES tenants(id) ON DELETE CASCADE,
    name                    TEXT NOT NULL,
    issuer_url              TEXT NOT NULL, -- must be https:// (validated at app layer)
    client_id               TEXT NOT NULL,
    client_secret_encrypted BYTEA NOT NULL, -- v2 blob; never plaintext
    redirect_uri            TEXT NOT NULL, -- must be https:// (validated at app layer)
    groups_claim_path       TEXT NOT NULL DEFAULT 'groups',
    groups_claim_format     TEXT NOT NULL DEFAULT 'string-array',
    fetch_userinfo          BOOLEAN NOT NULL DEFAULT FALSE,
    scopes                  TEXT[] NOT NULL DEFAULT ARRAY['openid','profile','email'],
    allowed_email_domains   TEXT[] NOT NULL DEFAULT ARRAY[]::TEXT[],
    iat_window_seconds      INTEGER NOT NULL DEFAULT 300, -- 1..600; CHECK below, also enforced at app layer
    jwks_cache_ttl_seconds  INTEGER NOT NULL DEFAULT 3600, -- min 60; CHECK below, also enforced at app layer
    created_at              TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at              TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    UNIQUE (tenant_id, name),
    -- Closed enum for groups_claim_format. Phase 3's resolver
    -- dispatches on this column.
    CONSTRAINT oidc_providers_claim_format_check
        CHECK (groups_claim_format IN ('string-array', 'json-path')),
    -- Defense-in-depth: app-layer Validate() also enforces these.
    CONSTRAINT oidc_providers_iat_window_bounds
        CHECK (iat_window_seconds > 0 AND iat_window_seconds <= 600),
    CONSTRAINT oidc_providers_jwks_ttl_bounds
        CHECK (jwks_cache_ttl_seconds >= 60)
);
-- Group→role mappings: one row per (provider, group_name, role) tuple.
-- ON DELETE CASCADE on provider so deleting a provider cleans up its
-- mappings. Name-based per the forward-compat seam: if the IdP renames
-- a group, the operator updates the mapping. We don't depend on
-- IdP-internal identifiers (which differ per IdP and resist
-- documentation).
CREATE TABLE IF NOT EXISTS group_role_mappings (
    id          TEXT PRIMARY KEY, -- prefix `grm-`
    tenant_id   TEXT NOT NULL DEFAULT 't-default'
                REFERENCES tenants(id) ON DELETE CASCADE,
    provider_id TEXT NOT NULL REFERENCES oidc_providers(id) ON DELETE CASCADE,
    group_name  TEXT NOT NULL,
    role_id     TEXT NOT NULL REFERENCES roles(id) ON DELETE RESTRICT,
    created_at  TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    -- One mapping per (provider, group_name, role_id) tuple. An
    -- operator can map one group to multiple roles by inserting
    -- multiple rows with different role_ids; the unique constraint
    -- prevents accidental duplicates.
    UNIQUE (provider_id, group_name, role_id)
);
-- Indexes for the hot paths Phase 3's service consumes:
-- ListByProvider walks all mappings for a given provider; Map(group_names)
-- reads the same rows then filters in-memory.
CREATE INDEX IF NOT EXISTS idx_group_role_mappings_provider_id
ON group_role_mappings (provider_id);
COMMIT;
+19
@@ -0,0 +1,19 @@
-- 000035_sessions.down.sql
-- Reverses 000035_sessions.up.sql. Destructive: every active session
-- + every signing key is dropped. Operators MUST take a backup before
-- running this; sessions cannot be recovered.
--
-- FK-safe order: sessions → session_signing_keys (sessions ref
-- signing_key_id, so sessions drop first).
BEGIN;
DROP INDEX IF EXISTS idx_sessions_absolute_expires_at;
DROP INDEX IF EXISTS idx_sessions_pre_login_gc;
DROP INDEX IF EXISTS idx_sessions_active;
DROP INDEX IF EXISTS idx_sessions_actor_id;
DROP TABLE IF EXISTS sessions;
DROP INDEX IF EXISTS idx_session_signing_keys_active;
DROP TABLE IF EXISTS session_signing_keys;
COMMIT;
+99
@@ -0,0 +1,99 @@
-- 000035_sessions.up.sql
-- Auth Bundle 2 / Phase 2: server-side session management. Two cookie
-- shapes share the `sessions` table:
--
-- 1. Post-login row: minted by SessionService.Create after a
-- successful OIDC callback or break-glass authenticate. Carries
-- the cookie HMAC-signed via the active session_signing_keys row.
-- Idle timeout 1h default, absolute timeout 8h default.
--
-- 2. Pre-login row: minted at /auth/oidc/login to hold OIDC state +
-- nonce + PKCE verifier across the IdP redirect. Same row shape,
-- `is_pre_login = true`, 10-minute absolute TTL, GC'd by the same
-- scheduler sweep as expired post-login sessions.
--
-- session_signing_keys holds the HMAC key material. Phase 4's
-- Service.RotateSigningKey mints new keys and retires old ones; the
-- retention window keeps retired keys valid for verification of
-- cookies signed under them so existing sessions don't immediately
-- fail.
--
-- All operations idempotent. Wrapped in a single transaction.
-- Multi-tenant ready (tenant_id on every row).
BEGIN;
-- Session signing keys. The "active" key is the most recently created
-- non-retired row; Phase 4's Service.GetActive returns it. Retired keys
-- (retired_at IS NOT NULL) stay in the table for the configurable
-- retention window so cookies signed under them still verify.
CREATE TABLE IF NOT EXISTS session_signing_keys (
    id                     TEXT PRIMARY KEY, -- prefix `sk-`
    tenant_id              TEXT NOT NULL DEFAULT 't-default'
                           REFERENCES tenants(id) ON DELETE CASCADE,
    key_material_encrypted BYTEA NOT NULL, -- v2 blob; never plaintext
    created_at             TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    retired_at             TIMESTAMPTZ NULL,
    CONSTRAINT session_signing_keys_retired_after_created
        CHECK (retired_at IS NULL OR retired_at >= created_at)
);
-- Partial index on (tenant_id, created_at DESC) WHERE retired_at IS NULL
-- backs the GetActive query: most-recently-created non-retired key per
-- tenant.
CREATE INDEX IF NOT EXISTS idx_session_signing_keys_active
ON session_signing_keys (tenant_id, created_at DESC)
WHERE retired_at IS NULL;
-- Sessions table. Holds both post-login and pre-login rows; is_pre_login
-- discriminates. csrf_token_hash is the SHA-256 hex digest of the
-- operator-facing CSRF token (the plaintext lives in a separate
-- JS-readable cookie so the GUI can echo it into the X-CSRF-Token
-- header).
CREATE TABLE IF NOT EXISTS sessions (
    id                  TEXT PRIMARY KEY, -- prefix `ses-`
    tenant_id           TEXT NOT NULL DEFAULT 't-default'
                        REFERENCES tenants(id) ON DELETE CASCADE,
    actor_id            TEXT NOT NULL,
    actor_type          TEXT NOT NULL, -- matches domain.ActorType strings
    signing_key_id      TEXT NOT NULL REFERENCES session_signing_keys(id) ON DELETE RESTRICT,
    is_pre_login        BOOLEAN NOT NULL DEFAULT FALSE,
    csrf_token_hash     TEXT NOT NULL DEFAULT '', -- 64 lowercase hex chars when set; '' for pre-login rows
    idle_expires_at     TIMESTAMPTZ NOT NULL,
    absolute_expires_at TIMESTAMPTZ NOT NULL,
    created_at          TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    last_seen_at        TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    ip_address          TEXT NOT NULL DEFAULT '',
    user_agent          TEXT NOT NULL DEFAULT '',
    revoked_at          TIMESTAMPTZ NULL,
    CONSTRAINT sessions_expiry_order
        CHECK (absolute_expires_at > idle_expires_at),
    CONSTRAINT sessions_idle_after_created
        CHECK (idle_expires_at > created_at)
);
-- Index for the "list sessions for me" hot path (Phase 5
-- GET /v1/auth/sessions); actor_id is the WHERE clause.
CREATE INDEX IF NOT EXISTS idx_sessions_actor_id
ON sessions (actor_id, actor_type)
WHERE revoked_at IS NULL AND is_pre_login = FALSE;
-- Index for the active-session lookup (Phase 4 Validate hot path).
-- Partial index (revoked_at IS NULL) keeps it small; revoked sessions
-- are GC'd separately.
CREATE INDEX IF NOT EXISTS idx_sessions_active
ON sessions (id)
WHERE revoked_at IS NULL;
-- Index for the pre-login GC sweep: walk pre-login rows older than
-- the 10-minute TTL.
CREATE INDEX IF NOT EXISTS idx_sessions_pre_login_gc
ON sessions (created_at)
WHERE is_pre_login = TRUE;
-- Index for the absolute-expired GC sweep: walk rows past the absolute
-- expiry window.
CREATE INDEX IF NOT EXISTS idx_sessions_absolute_expires_at
ON sessions (absolute_expires_at);
COMMIT;
+16
@@ -0,0 +1,16 @@
-- 000036_users.down.sql
-- Reverses 000036_users.up.sql. Destructive: every federated-human
-- user record is dropped. Operators MUST take a backup before
-- running this; SSO logins fail until a fresh login re-creates rows.
--
-- The actor_roles table (Bundle 1's RBAC) does NOT cascade-delete
-- here because actor_roles.actor_id is a TEXT column without an FK
-- to users. Down-migrating users orphans actor_roles rows whose
-- actor_id matches a deleted user; those rows become unreachable
-- via the normal UI but are not auto-cleaned.
BEGIN;
DROP INDEX IF EXISTS idx_users_email;
DROP TABLE IF EXISTS users;
COMMIT;
+54
@@ -0,0 +1,54 @@
-- 000036_users.up.sql
-- Auth Bundle 2 / Phase 2: federated-human user identity table.
--
-- Distinction from Bundle 1's `actor_roles`: actor_roles indexes
-- `actor_id` strings (free-form, e.g. API-key names). For federated
-- humans, the user's actor_id IS users.id; so for SSO logins,
-- `actor_roles.actor_id = users.id` and the actor_type column is
-- `'User'` (matches domain.ActorTypeUser).
--
-- Identity is per-(provider, oidc_subject) tuple. A person who
-- authenticates against multiple OIDC providers gets multiple rows by
-- design; identity is per-provider, not global. The future managed
-- offering can collapse identities at the application layer if a
-- customer requires it.
--
-- webauthn_credentials JSONB column reserved for v3 (Decision 12).
-- Bundle 2 always stores `[]`; v3's WebAuthn enrollment populates it.
--
-- All operations idempotent. Wrapped in a single transaction.
BEGIN;
CREATE TABLE IF NOT EXISTS users (
    id                   TEXT PRIMARY KEY, -- prefix `u-`
    tenant_id            TEXT NOT NULL DEFAULT 't-default'
                         REFERENCES tenants(id) ON DELETE CASCADE,
    email                TEXT NOT NULL,
    display_name         TEXT NOT NULL DEFAULT '',
    oidc_subject         TEXT NOT NULL,
    oidc_provider_id     TEXT NOT NULL REFERENCES oidc_providers(id) ON DELETE RESTRICT,
    last_login_at        TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    webauthn_credentials JSONB NOT NULL DEFAULT '[]'::JSONB, -- reserved for v3; always [] in Bundle 2
    created_at           TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at           TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    -- Identity invariant: one row per (provider, oidc_subject) tuple.
    -- Phase 3 HandleCallback uses this to look up an existing user
    -- before deciding to insert.
    UNIQUE (oidc_provider_id, oidc_subject)
);
-- Email lookup (operator GUI 'find user by email' surface). Not
-- unique because the same email can appear in multiple providers
-- (per the per-provider identity model above).
CREATE INDEX IF NOT EXISTS idx_users_email
ON users (tenant_id, email);
-- ON DELETE RESTRICT on oidc_provider_id keeps Phase 3's
-- "delete provider only when no users authenticated via it" rule
-- enforced at the DB layer; the OIDCProviderRepository.Delete
-- implementation translates SQLSTATE 23503 (foreign_key_violation)
-- into a repository.ErrAuthRoleInUse-equivalent sentinel that maps
-- to HTTP 409.
COMMIT;