Files
certctl/.github
shankar0123 17b30c1f7f auth-bundle-2 Phase 4: session service (cookie minting + signature
validation, idle/absolute expiry, signing-key rotation, CSRF, GC),
15-case negative-test matrix, fail-fatal initial-key bootstrap

Phase 4 of the bundle ships the post-login session lifecycle that backs
every authenticated request once Phase 5 wires the OIDC handlers + the
session middleware. The state machine is the load-bearing primitive for
the Bundle 2 control plane: forge a session cookie and you bypass every
RBAC gate.

Service surface (internal/auth/session/service.go, ~880 LOC):

  - Service.Create(actorID, actorType, ip, ua) -> *CreateResult
    Mints a session row; signs the cookie value with the active signing
    key; returns the cookie payload AND the CSRF token plaintext for
    the handler to set on the response.
  - Service.Validate(ValidateInput) -> *Session
    Parses the cookie, looks up the signing key (incl. retired-but-in-
    retention), recomputes HMAC-SHA256, loads the session row, enforces
    revocation + absolute + idle expiry + optional IP/UA bind. Maps to
    one of 9 sentinel errors; the handler uniformly returns 401 to the
    wire (specific reason in the audit row).
  - Service.ValidateCSRF(headerValue, *Session) error
    Constant-time compares SHA-256(header) against the stored hash on
    the session row.
  - Service.UpdateLastSeen / Revoke / RevokeAllForActor
  - Service.RotateCSRFToken — mints fresh token, persists hash, returns
    plaintext; called on login completion, logout, role-change against
    actor, explicit operator rotate.
  - Service.RotateSigningKey — mints new active key, retires previous;
    retired keys stay valid for cfg.SigningKeyRetention so existing
    cookies don't immediately fail.
  - Service.EnsureInitialSigningKey — idempotent; mints first key on
    fresh deploys; emits auth.session_signing_key_bootstrap audit row
    with event_category=auth. Wired into cmd/server/main.go AFTER
    migrations + RBAC backfill, BEFORE the HTTP listener binds; failure
    is FATAL (logger.Error + os.Exit(1)) per the prompt — server refuses
    to boot rather than serve session-less.
  - Service.GarbageCollect — sweeps expired post-login sessions +
    pre-login rows >10min + retired-past-retention signing keys. Wired
    into the new internal/scheduler/scheduler.go::sessionGCLoop on a
    CERTCTL_SESSION_GC_INTERVAL tick.

Cookie wire format (load-bearing):

  v1.<session_id>.<signing_key_id>.<base64url-no-pad(HMAC-SHA256)>

The HMAC input is LENGTH-PREFIXED to defeat concatenation collisions:

  len(session_id) || ":" || session_id || ":" || len(signing_key_id) || ":" || signing_key_id

where len(...) is the ASCII decimal byte-length. Without the length
prefix, the bare-concatenation form `session_id || signing_key_id`
would let a forger swap one byte across the boundary — `<a, bc>` and
`<ab, c>` produce identical HMAC inputs. The length prefix moves the
boundary into the input itself so the two cases can never collide.

The v1. version prefix is reserved. A future incompatible upgrade
ships as v2. and the parser rejects unknown prefixes (no fallback).

CSRF token model:

  - Plaintext goes in a JS-readable certctl_csrf cookie (HttpOnly=false
    intentional; the GUI must read it to echo into X-CSRF-Token header).
  - SHA-256 hash of the plaintext lives on the session row.
  - Validation: SHA-256(X-CSRF-Token) constant-time-compared.
  - Rotated by Service.RotateCSRFToken on login / logout / role-change /
    explicit admin-trigger.

Optional defense-in-depth (default OFF):

  - CERTCTL_SESSION_BIND_IP — Validate compares client IP to row's
    recorded IP. Mismatch -> 401, audit row, session NOT auto-revoked
    (user may have legitimate IP change). Mobile + corporate-NAT
    environments leave this off.
  - CERTCTL_SESSION_BIND_USER_AGENT — same shape against UA.

Configurable lifetimes (env vars wired in internal/config/config.go):

  CERTCTL_SESSION_IDLE_TIMEOUT             1h
  CERTCTL_SESSION_ABSOLUTE_TIMEOUT         8h
  CERTCTL_SESSION_SIGNING_KEY_RETENTION    24h
  CERTCTL_SESSION_GC_INTERVAL              1h
  CERTCTL_SESSION_SAMESITE                 Lax
  CERTCTL_SESSION_BIND_IP                  false
  CERTCTL_SESSION_BIND_USER_AGENT          false

Test surface (internal/auth/session/service_test.go, ~860 LOC):

  All 15 prompt-mandated negative cases:

    1.  Tampered cookie (HMAC byte flipped near segment start where all
        6 bits are real — base64url-no-pad's last char carries only 2
        bits so a tail-flip is unreliable).
    1b. Tampered SESSION_ID segment (same HMAC-recompute outcome).
    2.  Cookie missing v1. prefix.
    3.  Cookie with unknown version prefix (v99).
    4.  Idle expiry — back-dated last_seen_at + idle_expires_at.
    5.  Absolute expiry — back-dated absolute_expires_at.
    6.  Revoked session.
    7.  Wrong signing key id (no row matches).
    8.  Cookie signed under retired-but-in-retention key SUCCEEDS.
    9.  Cookie signed under retired-past-retention key FAILS.
    10. Concatenation collision — direct evidence that
        computeHMAC("abc","de") != computeHMAC("ab","cde") AND that
        a forged-boundary-slide cookie is rejected.
    11. CSRF token missing.
    12. CSRF token mismatch (constant-time compare).
    13. IP-bind enabled + IP changed -> ErrSessionIPMismatch + audit row.
    14. UA-bind enabled + UA changed -> ErrSessionUAMismatch + audit row.
    15. EnsureInitialSigningKey RNG failure -> ErrInitialSigningKeyMintFailed
        wrap (cmd/server/main.go treats as fatal).

  Plus coverage-lift batch covering: every error wrap on every repo
  collaborator (Create, Get, UpdateLastSeen, UpdateCSRFTokenHash,
  Revoke, RevokeAllForActor, GC), every RNG-failure surface in Create /
  RotateCSRFToken / RotateSigningKey, every alg-pinning helper edge,
  the cookie parser's full negative matrix (empty, wrong segment count,
  missing prefixes, bad base64, wrong HMAC length), and a real-encryption
  round-trip via internal/crypto.EncryptIfKeySet -> DecryptIfKeySet so
  the v3-blob path is exercised end-to-end at the session-cookie level.

Coverage:

  internal/auth/session              94.5%  (floor 90)
  internal/auth/session/domain       96+%   (floor 90, Phase 1)

.github/coverage-thresholds.yml extended with 2 new gate entries
(internal/auth/session and internal/auth/session/domain). The
why: paragraphs explain why each fail-closed branch is load-bearing.

Repository extensions:

  internal/repository/session.go gains UpdateCSRFTokenHash on the
  SessionRepository interface; internal/repository/postgres/session.go
  ships the implementation. RotateCSRFToken consumes it.

Scheduler extensions:

  internal/scheduler/scheduler.go gains SessionGarbageCollector
  interface + sessionGC field + sessionGCInterval +
  SetSessionGarbageCollector + SetSessionGCInterval + sessionGCLoop.
  Pattern matches the existing acmeGCLoop: atomic.Bool guard prevents
  concurrent sweeps, sync.WaitGroup tracks for graceful shutdown,
  per-tick context.WithTimeout(1m) bounds a stuck Postgres.

Server wiring:

  cmd/server/main.go constructs sessionService AFTER the bootstrap
  block (post-RBAC backfill) and BEFORE the policy-service block.
  EnsureInitialSigningKey runs immediately; failure is fatal via
  os.Exit(1). The scheduler section wires SetSessionGarbageCollector
  + SetSessionGCInterval alongside the other interval setters and
  emits an Info log so operators can confirm the loop is enabled.

Phase 4 deviation note: Service.GarbageCollect() returns (int, error)
rather than the prompt's literal `error`. The int is the count of
session rows deleted on this sweep; the scheduler discards it (`_, err
:= ...`) but tests + future operator-facing audit rows can read it.
The wider behavior matches the spec exactly.

Verifications: gofmt clean, go vet ./internal/auth/session/...
./internal/scheduler/... ./internal/config/... ./cmd/server/...
./internal/repository/... clean, go test -short -count=1 -race green
across all 3 session packages, full repository + auth + scheduler +
config test sweeps green, no regressions in Bundle 1 packages.
2026-05-10 05:31:24 +00:00
..