Files
certctl/migrations/000043_actor_role_scope.up.sql
T
shankar0123 f1aecc6c77 fix(migrations): 000043 idempotency — wrap CHECK + UNIQUE adds in DO blocks
Cold-DB compose smoke ran the migration ladder twice (first cold-boot,
then smoke step 4 force-recreate certctl-server with the bootstrap
token env var). On the second run, 000043 fails with:

  pq: constraint "actor_roles_scope_type_enum" for relation
  "actor_roles" already exists

Server then crashloops trying the same migration every ~10s until the
healthcheck times out and the smoke gives up (5 min wall clock).

Root cause: internal/repository/postgres/db.go::RunMigrations has
no schema_migrations tracker — every *.up.sql runs on every boot.
That makes idempotency mandatory; the CLAUDE.md architecture
decision 'Idempotent migrations. IF NOT EXISTS + ON CONFLICT for
safe repeated execution' is the contract every migration must
honor. Most do; 000043 didn't.

PostgreSQL CHECK constraints don't support IF NOT EXISTS directly,
so each non-idempotent statement gets wrapped in a DO block that
guards against duplication via pg_constraint lookup. The canonical
pattern lives in migrations/000033_approval_kinds.up.sql — mirrored
here exactly. ADD COLUMN already used IF NOT EXISTS; DROP
CONSTRAINT already used IF EXISTS; CREATE INDEX already used IF
NOT EXISTS. Only the two ADD CONSTRAINT CHECK and one ADD
CONSTRAINT UNIQUE needed the DO-block wrap.

Wrapped in BEGIN/COMMIT to match 000033 — keeps all schema
changes inside a single transaction.

Behavior:
  - Fresh DB: every DO block runs the ADD CONSTRAINT (no row in
    pg_constraint yet). Schema lands identically to the
    non-idempotent original.
  - Warm DB (constraints already present): every DO block
    short-circuits via the NOT EXISTS guard. Migration is a no-op.

Same bug class as 2026-05-09 migration 000045 broken INSERT
(commit 6444e13) and the 2026-05-09 migration 000029 PRIMARY KEY
fix. THIRD time the non-idempotent migration pattern slipped past
code review — strongly suggests a CI guard that scans every
*.up.sql for un-guarded ADD CONSTRAINT is the next follow-up.

Audit-Closes: post-v2.1.0-anti-rot/item-6
Audit-Closes: audit-2026-05-10/HIGH-10-followon
2026-05-12 15:31:55 +00:00

78 lines
3.2 KiB
PL/PgSQL

-- =============================================================================
-- 2026-05-10 Audit / HIGH-10 closure
-- =============================================================================
--
-- Per-actor scope override on role grants. Pre-fix, actor_roles had
-- expires_at (already shipped) but no scope_type/scope_id columns, so
-- "give Alice operator over profile X only" wasn't expressible at the
-- grant layer — the only path was creating a scoped role and granting
-- that. This migration adds the per-grant scope tuple so an operator
-- can attach Alice to the standing r-operator role but scope the
-- grant to profile X.
--
-- scope_type defaults to 'global' to preserve existing rows; scope_id
-- stays NULL when scope_type='global'. Authorizer.CheckPermission
-- already understands the tuple shape (role_permissions carries the
-- same columns); the actor-role addition gives operators a second
-- knob without forcing them to fork roles.
--
-- Idempotency note (2026-05-12 cold-DB smoke closure): the
-- certctl-server boot sequence runs every *.up.sql on every start
-- (internal/repository/postgres/db.go::RunMigrations has no
-- schema_migrations tracker — see CLAUDE.md "Idempotent migrations"
-- architecture decision). All non-IF-NOT-EXISTS operations below
-- must be wrapped in DO blocks that skip when the object already
-- exists. The canonical pattern lives in
-- migrations/000033_approval_kinds.up.sql.
-- =============================================================================
BEGIN;
ALTER TABLE actor_roles
ADD COLUMN IF NOT EXISTS scope_type TEXT NOT NULL DEFAULT 'global',
ADD COLUMN IF NOT EXISTS scope_id TEXT;
DO $$
BEGIN
IF NOT EXISTS (SELECT 1 FROM pg_constraint
WHERE conname = 'actor_roles_scope_type_enum') THEN
ALTER TABLE actor_roles
ADD CONSTRAINT actor_roles_scope_type_enum
CHECK (scope_type IN ('global', 'profile', 'issuer'));
END IF;
END$$;
DO $$
BEGIN
IF NOT EXISTS (SELECT 1 FROM pg_constraint
WHERE conname = 'actor_roles_scope_id_required_when_not_global') THEN
ALTER TABLE actor_roles
ADD CONSTRAINT actor_roles_scope_id_required_when_not_global
CHECK (
(scope_type = 'global' AND scope_id IS NULL) OR
(scope_type IN ('profile', 'issuer') AND scope_id IS NOT NULL)
);
END IF;
END$$;
-- The (actor_id, actor_type, role_id, tenant_id) uniqueness must
-- relax: an operator can grant the same role to the same actor at
-- different scopes (e.g. r-operator on profile-A AND on profile-B).
ALTER TABLE actor_roles
DROP CONSTRAINT IF EXISTS actor_roles_actor_id_actor_type_role_id_tenant_id_key;
DO $$
BEGIN
IF NOT EXISTS (SELECT 1 FROM pg_constraint
WHERE conname = 'actor_roles_actor_role_scope_unique') THEN
ALTER TABLE actor_roles
ADD CONSTRAINT actor_roles_actor_role_scope_unique
UNIQUE (actor_id, actor_type, role_id, scope_type, scope_id, tenant_id);
END IF;
END$$;
CREATE INDEX IF NOT EXISTS idx_actor_roles_scope
ON actor_roles(scope_type, scope_id) WHERE scope_id IS NOT NULL;
COMMIT;