mirror of
https://github.com/shankar0123/certctl.git
synced 2026-06-09 01:39:59 +00:00
1d01c87663
break-glass admin (Argon2id, lockout, default-OFF, surface-invisibility)
Phase 7 — OIDC first-admin bootstrap (Decision 3):
- Optional AdminBootstrapHook closure on *oidc.Service. When wired,
HandleCallback consults the hook AFTER group resolution + user
upsert and BEFORE the empty-mapping fail-closed check. Hook
receives (providerID, groups, userID); returns grantAdmin=true
when the user matches CERTCTL_BOOTSTRAP_ADMIN_GROUPS AND no
admin exists yet in the tenant.
- cmd/server/main.go wires the hook as a closure that:
* Filters by CERTCTL_BOOTSTRAP_OIDC_PROVIDER_ID (if configured).
* Probes AdminExists via authActorRoleRepo (admin-already-exists
silently returns false; bootstrap mode is one-shot per tenant).
* Walks group intersection.
* On match: grants r-admin via authActorRoleRepo.Grant + emits
the bootstrap.oidc_first_admin audit row with
event_category=auth + INFO log.
- Coexists with the Bundle 1 env-var-token bootstrap. Both paths
can be configured; first match wins (admin-existence probe
short-circuits the second).
- HandleCallback's empty-mapping fail-closed check moved AFTER the
hook so a fresh deployment with zero group_role_mappings can
still mint the first admin.
- 5 tests in service_test.go: hook grants admin on match, hook
returns false preserves empty-mapping fail-closed, admin-already-
exists silently falls through to normal mapping, hook-error wraps
+ bubbles, idempotent when admin is already in the mapped role set.
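The hook's grant decision described above can be sketched in isolation. This is a minimal illustration, not the closure wired in cmd/server/main.go: the bootstrapConfig type, parseGroups, shouldGrantAdmin, and the adminExists boolean (standing in for the authActorRoleRepo probe) are hypothetical names, and the audit row and role grant are left out.

```go
package main

import (
	"fmt"
	"strings"
)

// bootstrapConfig mirrors the two env vars named in the commit message.
// The struct and its field names are hypothetical.
type bootstrapConfig struct {
	AdminGroups []string // parsed CERTCTL_BOOTSTRAP_ADMIN_GROUPS
	ProviderID  string   // CERTCTL_BOOTSTRAP_OIDC_PROVIDER_ID; empty means "any provider"
}

// parseGroups splits a comma-separated list and drops empty entries.
func parseGroups(raw string) []string {
	var out []string
	for _, g := range strings.Split(raw, ",") {
		if g = strings.TrimSpace(g); g != "" {
			out = append(out, g)
		}
	}
	return out
}

// shouldGrantAdmin applies the three checks in order: provider filter,
// admin-already-exists probe, then group intersection.
func shouldGrantAdmin(cfg bootstrapConfig, providerID string, groups []string, adminExists bool) bool {
	if cfg.ProviderID != "" && cfg.ProviderID != providerID {
		return false
	}
	if adminExists {
		return false // bootstrap is one-shot per tenant
	}
	allowed := make(map[string]struct{}, len(cfg.AdminGroups))
	for _, g := range cfg.AdminGroups {
		allowed[g] = struct{}{}
	}
	for _, g := range groups {
		if _, ok := allowed[g]; ok {
			return true
		}
	}
	return false
}

func main() {
	cfg := bootstrapConfig{AdminGroups: parseGroups("platform-admins, sre"), ProviderID: "okta"}
	fmt.Println(shouldGrantAdmin(cfg, "okta", []string{"dev", "sre"}, false)) // true
	fmt.Println(shouldGrantAdmin(cfg, "okta", []string{"dev", "sre"}, true))  // false
}
```

Returning false when an admin already exists lets the caller fall through to the normal group_role_mappings path, which matches the "silently falls through" test case listed above.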
Phase 7.5 — Break-glass admin (Decision 4, default-OFF):
Migration 000038 ships:
- breakglass_credentials table — at-most-one-credential-per-actor
(UNIQUE(actor_id)), Argon2id PHC-format password_hash, lockout
state machine (failure_count, locked_until, last_failure_at).
FK CASCADE on users(id) so deleting a user atomically removes
their credential.
- Two new permissions seeded into r-admin only:
auth.breakglass.admin — set/rotate/unlock/remove credentials.
auth.breakglass.login — actor uses break-glass to log in.
CanonicalPermissions extended in lockstep.
internal/auth/breakglass/service.go (~580 LOC):
- Service.Enabled() reflects CERTCTL_BREAKGLASS_ENABLED.
- SetPassword: Argon2id with m=64MiB, t=3, p=4, salt=16 random bytes,
output=32 bytes (labeled OWASP 2024 in the code; the same values are
the second recommended option in RFC 9106); per-password random salt;
PHC-format hash output. Min 12 / max 256 byte input.
- Authenticate: constant-time-compare via subtle.ConstantTimeCompare
on every code path. Identical 401 + identical timing across the
wrong-password / locked-account / non-existent-actor paths so an
attacker cannot probe whether a given actor has break-glass
configured. Non-existent-actor + locked-account paths run a
verifyDummy() Argon2id pass for timing parity. Lockout state
machine: failure_count++ on every wrong attempt; threshold (default
5) trips locked_until = NOW() + duration (default 15m). Successful
Authenticate resets the counter. Reset window: when the last
failure is older than CERTCTL_BREAKGLASS_LOCKOUT_RESET_INTERVAL
(default 1h), the counter resets on the next attempt.
- Unlock + RemoveCredential: admin-only (auth.breakglass.admin
gated at the router via rbacGate). Audit rows on every operation.
- All public methods refuse to act when Enabled()==false (returns
ErrDisabled; the handler maps to HTTP 404 — surface invisibility).
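The lockout state machine in the Authenticate bullet can be modeled without a database. The sketch below is an in-memory approximation under stated assumptions: lockoutState, lockoutPolicy, defaultPolicy, and the at helper are hypothetical names, and tripping the lock when the count reaches the threshold (>=) is an assumption, since the real comparison lives in the Postgres IncrementFailure statement.

```go
package main

import (
	"fmt"
	"time"
)

// lockoutState holds the three columns the migration describes.
type lockoutState struct {
	FailureCount  int
	LockedUntil   time.Time // zero value means "not locked"
	LastFailureAt time.Time // zero value means "no failure recorded"
}

// lockoutPolicy carries the three tunables with the documented defaults.
type lockoutPolicy struct {
	Threshold     int
	Duration      time.Duration
	ResetInterval time.Duration
}

func defaultPolicy() lockoutPolicy {
	return lockoutPolicy{Threshold: 5, Duration: 15 * time.Minute, ResetInterval: time.Hour}
}

var epoch = time.Date(2026, 1, 1, 0, 0, 0, 0, time.UTC)

// at returns a fixed reference time plus sec seconds (demo helper).
func at(sec int) time.Time { return epoch.Add(time.Duration(sec) * time.Second) }

// locked reports whether the lockout window still holds at now.
func (s *lockoutState) locked(now time.Time) bool { return now.Before(s.LockedUntil) }

// recordFailure ages out stale failures, counts this one, and trips the
// lock once the count reaches the threshold (assumed >= comparison).
func (s *lockoutState) recordFailure(now time.Time, p lockoutPolicy) {
	if s.FailureCount > 0 && !s.LastFailureAt.IsZero() && now.Sub(s.LastFailureAt) > p.ResetInterval {
		s.FailureCount = 0
	}
	s.FailureCount++
	s.LastFailureAt = now
	if s.FailureCount >= p.Threshold {
		s.LockedUntil = now.Add(p.Duration)
	}
}

// recordSuccess clears the whole lockout state.
func (s *lockoutState) recordSuccess() { *s = lockoutState{} }

func main() {
	p := defaultPolicy()
	var s lockoutState
	for i := 0; i < p.Threshold; i++ {
		s.recordFailure(at(i), p)
	}
	fmt.Println("locked right after threshold:", s.locked(at(p.Threshold)))
	fmt.Println("locked after the window:", s.locked(at(4+15*60)))
}
```

Keeping the clock as a parameter mirrors the clockNow test seam in the service, so window expiry can be exercised without sleeping.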
internal/repository/postgres/breakglass.go ships the 5-method
postgres impl with atomic single-statement IncrementFailure (so
concurrent racing wrong-password attempts can't observe an
intermediate state and slip past the threshold) and idempotent
ResetFailureCount.
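The point of the single-statement IncrementFailure is that the increment and the threshold check happen as one indivisible step. The Postgres statement itself is not reproduced here; the following is an in-memory analogue that uses a mutex for the same effect. failureCounter, incrementFailure, and hammer are hypothetical names, and the >= comparison is an assumption.

```go
package main

import (
	"fmt"
	"sync"
)

// failureCounter stands in for one breakglass_credentials row.
type failureCounter struct {
	mu     sync.Mutex
	count  int
	locked bool
}

// incrementFailure bumps the counter and evaluates the threshold inside
// one critical section, so no caller can observe the new count without
// the matching lock decision.
func (c *failureCounter) incrementFailure(threshold int) (count int, locked bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.count++
	if c.count >= threshold {
		c.locked = true
	}
	return c.count, c.locked
}

// hammer fires n concurrent wrong-password attempts and returns the
// final counter state.
func hammer(n, threshold int) (int, bool) {
	var c failureCounter
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.incrementFailure(threshold)
		}()
	}
	wg.Wait()
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.count, c.locked
}

func main() {
	count, locked := hammer(50, 5)
	fmt.Println(count, locked)
}
```

A read-then-write pair (load the count, compare, store) would let two racing attempts both read the same pre-increment value; folding both steps into one operation removes that window.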
internal/api/handler/auth_breakglass.go ships the 4-endpoint HTTP
surface:
- POST /auth/breakglass/login (auth-exempt; 5/min rate-limited per
source IP via the existing rate limiter; returns 404 when
disabled). On success sets the post-login session cookie + CSRF
cookie via SessionService.Create + 204. On any failure:
uniform 401 + identical timing (the service has already audited
the specific failure category).
- POST /api/v1/auth/breakglass/credentials (auth.breakglass.admin)
- POST /api/v1/auth/breakglass/credentials/{actor_id}/unlock
(auth.breakglass.admin)
- DELETE /api/v1/auth/breakglass/credentials/{actor_id}
(auth.breakglass.admin)
Admin endpoints share the surface-invisibility property: when
CERTCTL_BREAKGLASS_ENABLED=false, every admin endpoint also returns
404 (not 403) so probing via the admin surface gets the same signal
as probing the login endpoint.
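The status mapping for the login endpoint, as described above, reduces to three outcomes. A minimal sketch, assuming local sentinel errors that stand in for breakglass.ErrDisabled and breakglass.ErrInvalidCredentials; loginStatus and errSessionMint are hypothetical names, and the real handler also sets cookies and applies rate limiting.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// Local stand-ins for the service sentinels (hypothetical copies).
var (
	errDisabled           = errors.New("breakglass: service disabled")
	errInvalidCredentials = errors.New("breakglass: invalid credentials")
	errSessionMint        = errors.New("breakglass: session mint")
)

// loginStatus maps a service result to the login endpoint's status code:
// 204 on success, 404 when the feature is off, 401 for every other failure.
func loginStatus(err error) int {
	switch {
	case err == nil:
		return http.StatusNoContent
	case errors.Is(err, errDisabled):
		return http.StatusNotFound // surface invisibility: never 403
	default:
		return http.StatusUnauthorized // uniform failure shape
	}
}

func main() {
	wrapped := fmt.Errorf("handler: %w", errDisabled)
	fmt.Println(loginStatus(nil), loginStatus(wrapped), loginStatus(errInvalidCredentials))
}
```

Using errors.Is rather than == keeps the 404 mapping intact when the sentinel arrives wrapped by an intermediate layer.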
Tests (internal/auth/breakglass/service_test.go):
All 8 Phase 7.5 spec-mandated negative cases:
1. Service.Enabled()==false → all ops return ErrDisabled.
2. Wrong password → ErrInvalidCredentials, failure_count++,
audit row with event_category=auth.
3. Failure_count exceeds threshold → locked, subsequent attempts
(including with the CORRECT password) return identical-shape
401 while the lockout window holds.
4. Lockout window expires → next attempt with correct password
succeeds + resets the counter.
5. Password < 12 bytes (or > 256 bytes) → ErrWeakPassword.
6. Password leak hygiene — the service has zero slog calls; the
audit-row map literal never includes the password plaintext.
7. Argon2id hash never appears in logs OR API responses — pinned
by `json:"-"` tag on BreakglassCredential.PasswordHash + a
belt-and-braces json.Marshal probe asserting the hash bytes
never appear in the marshaled output.
8. Constant-time-compare verified via timing-statistical test:
wrong-password vs no-credential paths take comparable time
(the test asserts a ratio within 5x, a coarse bound rather than
a proof of indistinguishability). The verifyDummy() hash compute
on the no-credential + locked paths is what keeps timing parity;
absent that, an attacker could side-channel "actor doesn't have
a credential" via timing.
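Negative case 7 (the hash never reaches a JSON response) can be illustrated with a marshal probe. This is a sketch of the idea, not the test in service_test.go: credential, leakyCredential, and marshalContains are hypothetical, and the hash string is a made-up placeholder rather than a real Argon2id output.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// credential hides the hash from encoding/json with the `json:"-"` tag.
type credential struct {
	ActorID      string `json:"actor_id"`
	PasswordHash string `json:"-"`
}

// leakyCredential omits the tag and exists only to show the probe fires.
type leakyCredential struct {
	ActorID      string `json:"actor_id"`
	PasswordHash string
}

// marshalContains marshals v and reports whether secret appears anywhere
// in the encoded bytes. An empty secret is treated as "not present".
func marshalContains(v interface{}, secret string) (bool, error) {
	if secret == "" {
		return false, nil
	}
	out, err := json.Marshal(v)
	if err != nil {
		return false, err
	}
	return bytes.Contains(out, []byte(secret)), nil
}

func main() {
	const hash = "$argon2id$v=19$m=65536,t=3,p=4$c2FsdA$aGFzaA" // placeholder
	safe, _ := marshalContains(credential{ActorID: "u-1", PasswordHash: hash}, hash)
	leaky, _ := marshalContains(leakyCredential{ActorID: "u-1", PasswordHash: hash}, hash)
	fmt.Println(safe, leaky)
}
```

Probing the marshaled bytes, rather than only checking the struct tag, also catches a hand-written MarshalJSON that re-exposes the field.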
Plus coverage-lift batch covering: SetPassword first-time vs rotate,
no-caller-id rejection, no-target-id rejection, RNG failure surface,
Authenticate happy-path mints session, no-credential audit row,
session-mint-failure surface, FailureResetInterval recycle, Unlock
+ RemoveCredential happy paths, hash-format unit tests (round-trip,
mismatch, malformed/wrong-version/bad-base64 formats), nil-audit +
nil-session pass-through.
Coverage on internal/auth/breakglass/ at 91.5% per-statement (above
the Phase 7.5 spec ≥ 90% floor).
cmd/server/main.go wiring:
- Constructs breakglassRepo + breakglassService + breakglassHandler
after the OIDC service block.
- breakglassSessionMinterAdapter shim bridges *session.Service.Create
to the breakglass.SessionMinter port.
- Logs WARN at boot when CERTCTL_BREAKGLASS_ENABLED=true (operator
visibility for the deliberate SSO-bypass).
internal/config/config.go gains:
- AuthConfig.BootstrapAdminGroups + BootstrapOIDCProviderID for
Phase 7 (CERTCTL_BOOTSTRAP_ADMIN_GROUPS comma-list +
CERTCTL_BOOTSTRAP_OIDC_PROVIDER_ID).
- AuthConfig.Breakglass nested struct with 4 env vars
(CERTCTL_BREAKGLASS_ENABLED + LOCKOUT_THRESHOLD + LOCKOUT_DURATION
+ LOCKOUT_RESET_INTERVAL).
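How the four CERTCTL_BREAKGLASS_* variables could overlay the documented defaults is sketched below. The variable names and default values come from the text above; the value syntax (strconv.ParseBool booleans, Go duration strings such as "15m") and the names breakglassConfig and loadBreakglassConfig are assumptions, not the parsing in internal/config/config.go.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"time"
)

// breakglassConfig mirrors the four documented knobs (hypothetical type).
type breakglassConfig struct {
	Enabled              bool
	LockoutThreshold     int
	LockoutDuration      time.Duration
	LockoutResetInterval time.Duration
}

// loadBreakglassConfig starts from the defaults (off, 5, 15m, 1h) and
// overlays any variable that lookup reports as set. lookup has the same
// shape as os.LookupEnv so callers can inject a map in tests.
func loadBreakglassConfig(lookup func(string) (string, bool)) (breakglassConfig, error) {
	cfg := breakglassConfig{
		Enabled:              false,
		LockoutThreshold:     5,
		LockoutDuration:      15 * time.Minute,
		LockoutResetInterval: time.Hour,
	}
	if v, ok := lookup("CERTCTL_BREAKGLASS_ENABLED"); ok {
		b, err := strconv.ParseBool(v)
		if err != nil {
			return cfg, fmt.Errorf("CERTCTL_BREAKGLASS_ENABLED: %w", err)
		}
		cfg.Enabled = b
	}
	if v, ok := lookup("CERTCTL_BREAKGLASS_LOCKOUT_THRESHOLD"); ok {
		n, err := strconv.Atoi(v)
		if err != nil || n < 1 {
			return cfg, fmt.Errorf("CERTCTL_BREAKGLASS_LOCKOUT_THRESHOLD: want a positive integer, got %q", v)
		}
		cfg.LockoutThreshold = n
	}
	durations := []struct {
		key string
		dst *time.Duration
	}{
		{"CERTCTL_BREAKGLASS_LOCKOUT_DURATION", &cfg.LockoutDuration},
		{"CERTCTL_BREAKGLASS_LOCKOUT_RESET_INTERVAL", &cfg.LockoutResetInterval},
	}
	for _, d := range durations {
		if v, ok := lookup(d.key); ok {
			parsed, err := time.ParseDuration(v)
			if err != nil || parsed <= 0 {
				return cfg, fmt.Errorf("%s: want a positive duration, got %q", d.key, v)
			}
			*d.dst = parsed
		}
	}
	return cfg, nil
}

func main() {
	cfg, err := loadBreakglassConfig(os.LookupEnv)
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Printf("%+v\n", cfg)
}
```

Rejecting malformed values instead of silently falling back keeps a typo in the threshold or duration from weakening the lockout unnoticed.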
Router wiring:
- 4 new breakglass routes registered when reg.AuthBreakglass != nil;
public login route via direct r.mux.Handle (auth-exempt), 3 admin
routes via r.Register + rbacGate(auth.breakglass.admin).
- POST /auth/breakglass/login pinned in AuthExemptRouterRoutes
allowlist with Phase 7.5 justification.
- SpecParityExceptions extended with 4 new entries documenting
the Phase 7.5 deferral of full per-endpoint OpenAPI rows
(handler doc-block at the top of auth_breakglass.go is the
operator-facing reference).
Threat model (encoded in service.go + auth_breakglass.go doc-blocks
+ migration 000038 docstrings, to be promoted to
docs/operator/auth-threat-model.md in Phase 12):
- Break-glass is a deliberate bypass of the SSO security boundary.
An attacker who phishes the password OR finds it in a compromised
password manager bypasses MFA, OIDC, and every group-claim gate.
- Recommendation: keep CERTCTL_BREAKGLASS_ENABLED=false in steady-
state. Enable only during SSO-broken incidents. Disable after
recovery.
- WebAuthn pairing (v3 per Decision 12) is the load-bearing second
factor. Without it, break-glass is best treated as an emergency-
only path.
- Audit trail surfaces every break-glass action under
event_category=auth; the auditor role can monitor for unexpected
break-glass logins.
Verifications: gofmt clean, go vet clean across all touched packages,
go test -short -count=1 green across internal/auth/oidc (3.0s; new
Phase 7 hook tests integrated alongside the 21+ Phase 3 negatives),
internal/auth/breakglass (3.6s; 8 spec-mandated negatives + coverage
batch passing), internal/config + internal/domain/auth + internal/api/
router + internal/api/handler all green, no regressions in Bundle 1
packages.
505 lines
20 KiB
Go
// Package breakglass is the Auth Bundle 2 Phase 7.5 break-glass admin service.
//
// Decision 4: operator-toggleable local-password admin for the SSO-broken
// case. No second factor in this bundle (WebAuthn pairs in v3 per
// Decision 12). The path exists so an admin can recover when OIDC is
// down; it is NOT for general human auth.
//
// Threat model (load-bearing):
//
//   - Break-glass is a deliberate bypass of the SSO security boundary.
//     An attacker who phishes the password OR finds it in a compromised
//     password manager bypasses MFA, OIDC, and every group-claim gate.
//   - Operators MUST keep CERTCTL_BREAKGLASS_ENABLED=false in
//     steady-state. Enable only during SSO-broken incidents. Disable
//     after recovery.
//   - WebAuthn pairing (v3 per Decision 12) is the load-bearing second
//     factor. Without it, break-glass is best treated as an
//     emergency-only path.
//   - Audit trail surfaces every break-glass action under
//     event_category=auth; the auditor role can monitor for unexpected
//     break-glass logins.
//
// Defense-in-depth (load-bearing):
//
//   - Argon2id with OWASP-2024 parameters (m=64MiB, t=3, p=4, salt=16
//     bytes, output=32 bytes). Per-password random salt; PHC-format
//     hash for forward-compat parameter rotation.
//   - subtle.ConstantTimeCompare on every password verify. Identical
//     timing + identical error shape across the wrong-password,
//     locked-account, and non-existent-actor paths so an attacker
//     cannot probe whether a given actor has break-glass configured.
//   - Lockout state machine: failure_count increments on every wrong
//     attempt; threshold (default 5) trips locked_until = NOW() +
//     duration (default 15m). Successful Authenticate resets the
//     counter. Admin-initiated Unlock also resets.
//   - Surface invisibility: when Service.Enabled() == false, every
//     handler returns 404 (NOT 403) so the surface is invisible to
//     scanners.
//   - Token-leak hygiene: passwords NEVER appear in any log line at
//     any level. Pinned by logging_test.go's slog buffer + grep-assert.
//   - PasswordHash is `json:"-"` on the domain type so a misconfigured
//     handler cannot wire-leak the hash via JSON marshaling.
package breakglass

import (
	"context"
	"crypto/rand"
	"crypto/subtle"
	"encoding/base64"
	"errors"
	"fmt"
	"strings"
	"time"

	"golang.org/x/crypto/argon2"

	bgdomain "github.com/certctl-io/certctl/internal/auth/breakglass/domain"
	"github.com/certctl-io/certctl/internal/domain"
	authdomain "github.com/certctl-io/certctl/internal/domain/auth"
	"github.com/certctl-io/certctl/internal/repository"
)

// =============================================================================
// Service-layer sentinel errors.
// =============================================================================

var (
	// ErrDisabled: Service.Enabled() returned false. The handler MUST
	// translate to HTTP 404 (NOT 403) so the surface is invisible.
	ErrDisabled = errors.New("breakglass: service disabled")

	// ErrInvalidCredentials: wrong password OR account locked OR
	// no credential exists for the actor. The wire response is
	// uniform 401 + identical timing across all three cases.
	ErrInvalidCredentials = errors.New("breakglass: invalid credentials")

	// ErrWeakPassword: SetPassword rejected the input for being
	// shorter than MinPasswordLengthBytes (12) or longer than
	// MaxPasswordLengthBytes (256).
	ErrWeakPassword = errors.New("breakglass: password fails strength requirements (min 12, max 256 bytes)")

	// ErrUnauthenticated: Service.SetPassword / Unlock / RemoveCredential
	// called without a non-empty caller actor id.
	ErrUnauthenticated = errors.New("breakglass: caller is unauthenticated")
)

// =============================================================================
// Config.
// =============================================================================

// Config bundles the operator-tunable knobs Phase 7.5 exposes via
// CERTCTL_BREAKGLASS_* env vars.
type Config struct {
	// Enabled gates the entire service surface. Default false; operator
	// flips to true via CERTCTL_BREAKGLASS_ENABLED. When false, every
	// public method returns ErrDisabled and every handler 404s.
	Enabled bool

	// LockoutThreshold: failure count that trips locked_until. Default 5.
	// Wire: CERTCTL_BREAKGLASS_LOCKOUT_THRESHOLD.
	LockoutThreshold int

	// LockoutDuration: how long the account stays locked after the
	// threshold trips. Default 15m. Wire: CERTCTL_BREAKGLASS_LOCKOUT_DURATION.
	LockoutDuration time.Duration

	// LockoutResetInterval: idle time after last_failure_at before
	// the failure_count resets to 0 on next attempt. Default 1h.
	// Wire: CERTCTL_BREAKGLASS_LOCKOUT_RESET_INTERVAL.
	LockoutResetInterval time.Duration
}

// DefaultConfig returns the Phase 7.5 defaults. cmd/server/main.go
// merges CERTCTL_BREAKGLASS_* env vars over these.
func DefaultConfig() Config {
	return Config{
		Enabled:              false,
		LockoutThreshold:     5,
		LockoutDuration:      15 * time.Minute,
		LockoutResetInterval: 1 * time.Hour,
	}
}

// Argon2id parameters: OWASP 2024 recommendations, fixed.
const (
	argon2Memory      = 64 * 1024 // KiB → 64 MiB
	argon2Iterations  = 3
	argon2Parallelism = 4
	argon2SaltSize    = 16
	argon2OutputSize  = 32
)

// =============================================================================
// Collaborator interfaces (narrow projections for stub-friendly tests).
// =============================================================================

// AuditRecorder is the slice of *service.AuditService used by the
// break-glass service. Every audit row carries event_category=auth.
type AuditRecorder interface {
	RecordEventWithCategory(ctx context.Context, actor string, actorType domain.ActorType, action, eventCategory, resourceType, resourceID string, details map[string]interface{}) error
}

// SessionMinter is the slice of *session.Service the Authenticate path
// uses to mint a post-login session after a successful break-glass
// password verify.
type SessionMinter interface {
	Create(ctx context.Context, actorID, actorType, ip, userAgent string) (cookieValue, csrfToken string, err error)
}

// =============================================================================
// Service.
// =============================================================================

// Service implements the break-glass admin lifecycle.
type Service struct {
	repo     repository.BreakglassCredentialRepository
	audit    AuditRecorder
	sessions SessionMinter
	cfg      Config
	tenantID string

	// Test seams.
	clockNow func() time.Time
	readRand func([]byte) (int, error)
}

// NewService constructs the break-glass service.
func NewService(
	repo repository.BreakglassCredentialRepository,
	audit AuditRecorder,
	sessions SessionMinter,
	cfg Config,
	tenantID string,
) *Service {
	return &Service{
		repo:     repo,
		audit:    audit,
		sessions: sessions,
		cfg:      cfg,
		tenantID: tenantID,
		clockNow: time.Now,
		readRand: rand.Read,
	}
}

// SetClockForTest replaces the clock used for lockout-window
// calculations. ONLY for tests.
func (s *Service) SetClockForTest(now func() time.Time) { s.clockNow = now }

// SetRandReaderForTest replaces the entropy source used for salts.
// ONLY for tests.
func (s *Service) SetRandReaderForTest(r func([]byte) (int, error)) { s.readRand = r }

// Enabled reflects CERTCTL_BREAKGLASS_ENABLED.
func (s *Service) Enabled() bool { return s.cfg.Enabled }

// =============================================================================
// SetPassword: admin-only; sets / rotates the break-glass password.
// =============================================================================

// SetPasswordResult is the return shape for SetPassword.
type SetPasswordResult struct {
	ActorID   string
	CreatedAt time.Time
}

// SetPassword hashes + persists a fresh break-glass password for the
// target actor. Caller must hold auth.breakglass.admin (gated at the
// router level via rbacGate). Audit row: auth.breakglass_password_set.
//
// callerActorID is the operator performing the rotation (audit
// attribution). targetActorID is the actor whose break-glass cred is
// being set.
func (s *Service) SetPassword(ctx context.Context, callerActorID, targetActorID, plaintext string) (*SetPasswordResult, error) {
	if !s.Enabled() {
		return nil, ErrDisabled
	}
	if strings.TrimSpace(callerActorID) == "" {
		return nil, ErrUnauthenticated
	}
	if strings.TrimSpace(targetActorID) == "" {
		return nil, fmt.Errorf("breakglass: target actor id is required")
	}
	if l := len(plaintext); l < bgdomain.MinPasswordLengthBytes || l > bgdomain.MaxPasswordLengthBytes {
		return nil, ErrWeakPassword
	}

	hash, err := s.hashPassword(plaintext)
	if err != nil {
		return nil, fmt.Errorf("breakglass: hash password: %w", err)
	}

	// Try Update first; fall back to Create when the row doesn't exist.
	if uerr := s.repo.UpdatePasswordHash(ctx, targetActorID, s.tenantID, hash); uerr != nil {
		if !errors.Is(uerr, repository.ErrBreakglassNotFound) {
			return nil, fmt.Errorf("breakglass: update: %w", uerr)
		}
		// First-time set: Create the row.
		newID, idErr := s.newID()
		if idErr != nil {
			return nil, fmt.Errorf("breakglass: id generate: %w", idErr)
		}
		cred := &bgdomain.BreakglassCredential{
			ID:           newID,
			TenantID:     s.tenantID,
			ActorID:      targetActorID,
			PasswordHash: hash,
		}
		if cerr := s.repo.Create(ctx, cred); cerr != nil {
			return nil, fmt.Errorf("breakglass: create: %w", cerr)
		}
	}

	s.recordAudit(ctx, "auth.breakglass_password_set", callerActorID, domain.ActorTypeUser, targetActorID,
		map[string]interface{}{"caller_actor_id": callerActorID, "target_actor_id": targetActorID})

	return &SetPasswordResult{
		ActorID:   targetActorID,
		CreatedAt: s.clockNow().UTC(),
	}, nil
}

// =============================================================================
// Authenticate: auth-bypass; the whole point is to log in WITHOUT
// existing creds. Rate-limited at the handler layer. Identical timing
// + identical 401 across the wrong-password, locked-account, and
// non-existent-actor paths.
// =============================================================================

// AuthenticateResult is the return shape for Authenticate.
type AuthenticateResult struct {
	CookieValue string
	CSRFToken   string
}

// Authenticate verifies the supplied plaintext against the stored
// Argon2id hash. On success it returns an AuthenticateResult carrying
// the session cookie and CSRF token; every credential failure returns
// ErrInvalidCredentials uniformly.
//
// Failure modes:
//   - Service disabled → ErrDisabled (handler maps to 404).
//   - Actor has no credential row (or the lookup fails) →
//     ErrInvalidCredentials.
//   - Account locked → ErrInvalidCredentials.
//   - Wrong password → ErrInvalidCredentials, failure_count++, may
//     trigger lockout.
//   - Session mint fails after a correct password → wrapped error.
//
// On success: failure_count reset, audit row, session minted via
// SessionService.Create.
func (s *Service) Authenticate(ctx context.Context, actorID, plaintext, ip, userAgent string) (*AuthenticateResult, error) {
	if !s.Enabled() {
		return nil, ErrDisabled
	}

	cred, err := s.repo.GetByActor(ctx, actorID, s.tenantID)
	if err != nil {
		// Both not-found AND DB error map to identical-shape error
		// + identical timing path. Audit the attempt.
		s.recordAudit(ctx, "auth.breakglass_login_failed", actorID, domain.ActorTypeUser, actorID,
			map[string]interface{}{
				"actor_id":         actorID,
				"failure_category": "no_credential_or_lookup_error",
				"ip_address":       ip,
			})
		// Run a dummy Argon2id verify to keep timing parity with
		// the wrong-password path (so an attacker can't
		// time-side-channel "actor has no breakglass row").
		_ = s.verifyDummy(plaintext)
		return nil, ErrInvalidCredentials
	}

	now := s.clockNow().UTC()

	// Lockout check.
	if cred.LockedUntil != nil && now.Before(*cred.LockedUntil) {
		s.recordAudit(ctx, "auth.breakglass_login_failed", actorID, domain.ActorTypeUser, actorID,
			map[string]interface{}{
				"actor_id":         actorID,
				"failure_category": "locked",
				"ip_address":       ip,
			})
		// Run dummy verify for timing parity.
		_ = s.verifyDummy(plaintext)
		return nil, ErrInvalidCredentials
	}

	// Reset-window check: if last_failure_at is older than
	// LockoutResetInterval, the failure_count has aged out, so reset
	// it before this attempt counts.
	if cred.LastFailureAt != nil && now.Sub(*cred.LastFailureAt) > s.cfg.LockoutResetInterval && cred.FailureCount > 0 {
		_ = s.repo.ResetFailureCount(ctx, actorID, s.tenantID)
	}

	// Constant-time verify against the stored Argon2id PHC hash.
	ok, verr := verifyPassword(plaintext, cred.PasswordHash)
	if verr != nil || !ok {
		// Wrong password (or hash format corruption). Increment +
		// possibly lock + audit + return ErrInvalidCredentials.
		_, _ = s.repo.IncrementFailure(ctx, actorID, s.tenantID, s.cfg.LockoutThreshold, int(s.cfg.LockoutDuration.Seconds()))
		s.recordAudit(ctx, "auth.breakglass_login_failed", actorID, domain.ActorTypeUser, actorID,
			map[string]interface{}{
				"actor_id":         actorID,
				"failure_category": "wrong_password",
				"ip_address":       ip,
			})
		return nil, ErrInvalidCredentials
	}

	// Success. Reset counter, audit, mint session.
	_ = s.repo.ResetFailureCount(ctx, actorID, s.tenantID)
	s.recordAudit(ctx, "auth.breakglass_login_succeeded", actorID, domain.ActorTypeUser, actorID,
		map[string]interface{}{"actor_id": actorID, "ip_address": ip})

	if s.sessions == nil {
		// Test path / no session minter wired. Return zero result.
		return &AuthenticateResult{}, nil
	}
	cookie, csrf, mintErr := s.sessions.Create(ctx, actorID, string(domain.ActorTypeUser), ip, userAgent)
	if mintErr != nil {
		return nil, fmt.Errorf("breakglass: session mint: %w", mintErr)
	}
	return &AuthenticateResult{
		CookieValue: cookie,
		CSRFToken:   csrf,
	}, nil
}

// =============================================================================
// Unlock: admin-only; resets failure_count + clears locked_until.
// =============================================================================

// Unlock clears the lockout state for the named actor. Caller must
// hold auth.breakglass.admin. Audit row: auth.breakglass_unlocked.
func (s *Service) Unlock(ctx context.Context, callerActorID, targetActorID string) error {
	if !s.Enabled() {
		return ErrDisabled
	}
	if strings.TrimSpace(callerActorID) == "" {
		return ErrUnauthenticated
	}
	if err := s.repo.ResetFailureCount(ctx, targetActorID, s.tenantID); err != nil {
		return fmt.Errorf("breakglass: unlock: %w", err)
	}
	s.recordAudit(ctx, "auth.breakglass_unlocked", callerActorID, domain.ActorTypeUser, targetActorID,
		map[string]interface{}{"caller_actor_id": callerActorID, "target_actor_id": targetActorID})
	return nil
}

// =============================================================================
// RemoveCredential: admin-only.
// =============================================================================

// RemoveCredential deletes the break-glass credential row for the
// named actor. Active sessions for that actor are NOT auto-revoked
// (separate concern; the operator can call SessionService.RevokeAll
// in lockstep). Audit row: auth.breakglass_credential_removed.
func (s *Service) RemoveCredential(ctx context.Context, callerActorID, targetActorID string) error {
	if !s.Enabled() {
		return ErrDisabled
	}
	if strings.TrimSpace(callerActorID) == "" {
		return ErrUnauthenticated
	}
	if err := s.repo.Delete(ctx, targetActorID, s.tenantID); err != nil {
		return fmt.Errorf("breakglass: remove: %w", err)
	}
	s.recordAudit(ctx, "auth.breakglass_credential_removed", callerActorID, domain.ActorTypeUser, targetActorID,
		map[string]interface{}{"caller_actor_id": callerActorID, "target_actor_id": targetActorID})
	return nil
}

// =============================================================================
// Helpers: Argon2id hash + verify, ID generation, audit, dummy verify.
// =============================================================================

// hashPassword runs Argon2id over plaintext + a fresh 16-byte random
// salt; returns the PHC-format string.
func (s *Service) hashPassword(plaintext string) (string, error) {
	salt := make([]byte, argon2SaltSize)
	if _, err := s.readRand(salt); err != nil {
		return "", err
	}
	hash := argon2.IDKey([]byte(plaintext), salt,
		uint32(argon2Iterations), uint32(argon2Memory),
		uint8(argon2Parallelism), uint32(argon2OutputSize))
	return fmt.Sprintf("$argon2id$v=%d$m=%d,t=%d,p=%d$%s$%s",
		argon2.Version,
		argon2Memory, argon2Iterations, argon2Parallelism,
		base64.RawStdEncoding.EncodeToString(salt),
		base64.RawStdEncoding.EncodeToString(hash),
	), nil
}

// verifyPassword parses a PHC-format Argon2id hash, recomputes the hash
// over plaintext + the embedded salt + embedded params, and constant-time
// compares. Returns (true, nil) on match; (false, nil) on mismatch;
// non-nil err only on hash-format-corruption (caller treats as auth fail).
func verifyPassword(plaintext, encoded string) (bool, error) {
	if !strings.HasPrefix(encoded, bgdomain.Argon2idPHCPrefix) {
		return false, fmt.Errorf("not an argon2id hash")
	}
	parts := strings.Split(encoded, "$")
	// Format: $argon2id$v=N$m=M,t=T,p=P$<salt-base64>$<hash-base64>
	// Split by $ → ["", "argon2id", "v=N", "m=M,t=T,p=P", "<salt>", "<hash>"]
	if len(parts) != 6 {
		return false, fmt.Errorf("malformed argon2id hash (parts=%d)", len(parts))
	}
	var version int
	if _, err := fmt.Sscanf(parts[2], "v=%d", &version); err != nil {
		return false, fmt.Errorf("parse version: %w", err)
	}
	if version != argon2.Version {
		return false, fmt.Errorf("incompatible argon2id version: %d (want %d)", version, argon2.Version)
	}
	var memory, iters, parallelism uint32
	if _, err := fmt.Sscanf(parts[3], "m=%d,t=%d,p=%d", &memory, &iters, &parallelism); err != nil {
		return false, fmt.Errorf("parse params: %w", err)
	}
	// argon2.IDKey panics on zero iterations or zero parallelism, and
	// parallelism must fit in a uint8, so reject corrupted params here
	// instead of letting a damaged row crash the login path.
	if iters == 0 || parallelism == 0 || parallelism > 255 {
		return false, fmt.Errorf("invalid argon2id params (t=%d, p=%d)", iters, parallelism)
	}
	salt, err := base64.RawStdEncoding.DecodeString(parts[4])
	if err != nil {
		return false, fmt.Errorf("decode salt: %w", err)
	}
	want, err := base64.RawStdEncoding.DecodeString(parts[5])
	if err != nil {
		return false, fmt.Errorf("decode hash: %w", err)
	}
	got := argon2.IDKey([]byte(plaintext), salt, iters, memory, uint8(parallelism), uint32(len(want)))
	return subtle.ConstantTimeCompare(got, want) == 1, nil
}

// verifyDummy runs a real Argon2id pass against fixed params + a
// throwaway salt so the wrong-password / no-credential / locked-account
// paths take statistically indistinguishable time. The result is
// discarded.
func (s *Service) verifyDummy(plaintext string) bool {
	dummySalt := make([]byte, argon2SaltSize) // all-zeros; fine for timing parity
	_ = argon2.IDKey([]byte(plaintext), dummySalt,
		uint32(argon2Iterations), uint32(argon2Memory),
		uint8(argon2Parallelism), uint32(argon2OutputSize))
	return false
}

// newID returns `bg-<base64url-no-pad-of-16-random-bytes>`.
func (s *Service) newID() (string, error) {
	b := make([]byte, 16)
	if _, err := s.readRand(b); err != nil {
		return "", err
	}
	return "bg-" + base64.RawURLEncoding.EncodeToString(b), nil
}

// recordAudit is a thin wrapper that swallows audit errors (best-effort;
// a failed audit must not block a successful auth operation). Phase 8
// contract: every row event_category=auth.
func (s *Service) recordAudit(ctx context.Context, action, actor string, actorType domain.ActorType, resourceID string, details map[string]interface{}) {
	if s.audit == nil {
		return
	}
	_ = s.audit.RecordEventWithCategory(ctx, actor, actorType, action,
		domain.EventCategoryAuth, "breakglass_credential", resourceID, details)
}

// _ ensures authdomain import is live in case future service code needs
// the canonical permission constants.
var _ = authdomain.RoleIDAdmin
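The PHC string layout that hashPassword emits and verifyPassword parses can be exercised without running Argon2id at all. The following standalone sketch uses only the standard library; phcParams, formatPHC, and parsePHC are hypothetical helpers, and the salt and hash bytes in main are placeholders rather than real Argon2id output.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
	"strings"
)

// phcParams holds the fields carried by an Argon2id PHC string.
type phcParams struct {
	Version     int
	MemoryKiB   uint32
	Iterations  uint32
	Parallelism uint8
	Salt, Hash  []byte
}

// formatPHC renders $argon2id$v=N$m=M,t=T,p=P$<salt>$<hash> with
// unpadded standard base64, the same layout the service writes.
func formatPHC(p phcParams) string {
	return fmt.Sprintf("$argon2id$v=%d$m=%d,t=%d,p=%d$%s$%s",
		p.Version, p.MemoryKiB, p.Iterations, p.Parallelism,
		base64.RawStdEncoding.EncodeToString(p.Salt),
		base64.RawStdEncoding.EncodeToString(p.Hash))
}

// parsePHC splits and validates a PHC string. It rejects zero iterations
// and out-of-range parallelism so callers never pass them to a KDF.
func parsePHC(encoded string) (phcParams, error) {
	var p phcParams
	parts := strings.Split(encoded, "$")
	if len(parts) != 6 || parts[0] != "" || parts[1] != "argon2id" {
		return p, errors.New("not a 6-part argon2id PHC string")
	}
	if _, err := fmt.Sscanf(parts[2], "v=%d", &p.Version); err != nil {
		return p, fmt.Errorf("parse version: %w", err)
	}
	var par uint32
	if _, err := fmt.Sscanf(parts[3], "m=%d,t=%d,p=%d", &p.MemoryKiB, &p.Iterations, &par); err != nil {
		return p, fmt.Errorf("parse params: %w", err)
	}
	if p.Iterations == 0 || par == 0 || par > 255 {
		return p, fmt.Errorf("invalid params (t=%d, p=%d)", p.Iterations, par)
	}
	p.Parallelism = uint8(par)
	var err error
	if p.Salt, err = base64.RawStdEncoding.DecodeString(parts[4]); err != nil {
		return p, fmt.Errorf("decode salt: %w", err)
	}
	if p.Hash, err = base64.RawStdEncoding.DecodeString(parts[5]); err != nil {
		return p, fmt.Errorf("decode hash: %w", err)
	}
	return p, nil
}

func main() {
	enc := formatPHC(phcParams{
		Version: 19, MemoryKiB: 64 * 1024, Iterations: 3, Parallelism: 4,
		Salt: []byte("0123456789abcdef"), Hash: []byte("placeholder-hash"),
	})
	fmt.Println(enc)
	p, err := parsePHC(enc)
	fmt.Println(p.MemoryKiB, p.Iterations, p.Parallelism, err)
}
```

Because the parameters travel inside the string, a later change to the memory or iteration constants only affects newly set passwords; existing rows keep verifying with the values they were hashed under.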