fix(security): close BUNDLE 5 — auth, OIDC, MCP, API + browser security edges

Bundle 5 closure (2026-05-13 acquisition diligence audit). 13-finding security audit pass across the auth / OIDC / MCP / API / browser- security surface. Five real closures shipped in code, two false-as- stated findings annotated with the existing implementation, three operator-decision items documented for v3 follow-up, three doc-only fixes (auth architecture narrative aligned with shipped OIDC). Source findings closed (code): S1 break-glass /auth/breakglass/login lacked the documented 5/min per-source-IP rate limit; handler now owns its own SlidingWindowLimiter wired at startup. Doc claim turns true. R6 OIDC test_discovery JWKS probe ran on http.DefaultClient; now uses an http.Client whose transport wraps validation.SafeHTTPDialContext. JWKS URI can no longer pivot into reserved-address ranges via DNS rebinding. R7 Slack + Teams notifiers built http.Client without the SSRF dial-time guard. Both New() constructors now install validation.SafeHTTPDialContext; webhook URLs (operator- configured via dynamic-config GUI) cannot dial 169.254.x or in-cluster reserved ranges. Test seam: newForTest bypasses the guard for httptest's 127.0.0.1 binds, mirroring the existing internal/connector/notifier/webhook pattern. RT-L2 CERTCTL_ACME_INSECURE=true now emits a prominent logger.Warn at server boot. Pre-Bundle-5 the knob silently disabled ACME directory TLS verification. Source findings closed (doc): finding 1 + HIGH-5 Architecture doc claimed no in-process JWT/ OIDC/mTLS/SAML and pointed everyone at the authenticating-gateway pattern. Auth Bundle 2 (commit dea5053) shipped native OIDC + sessions + break-glass. New §"In-process authentication surface" table (api-key / oidc / none) supersedes the old framing; "Authenticating-gateway pattern (SAML, mTLS-as-auth, LDAP)" section retained for protocols certctl still doesn't ship natively. Source findings verified false (existing implementation): S4 OIDC email-domain allowlist — `email_domain_test.go` already pins the strict-equality semantics (subdomain not auto-accepted, multi-entry no-match path, empty allowlist accepts all by-design per RFC 9700 §4.1.1). SEC-L1 CSP / HSTS / referrer-policy headers — already shipped at internal/api/middleware/securityheaders.go and wired at cmd/server/main.go L2003+L2027+L2115. Operator-decision / deferred (tracked in bundle-5 closure doc): S3 CERTCTL_API_KEYS_NAMED parsing is wired, end-to-end validation is partial. Operator decides: complete the named-key middleware path or deprecate the syntax. S5 Audit-middleware best-effort for read paths; security-critical writes use WithinTx. Operator decides per-path escalation. S8 MCP threat model — the binary is a thin protocol bridge, no privileges of its own; every tool call carries CERTCTL_API_KEY and is auth'd + RBAC-gated server-side. Optional CERTCTL_MCP_READ_ONLY gate tracked as v3. SEC-H1 2026-05-10 audit CRIT-1/2/4 already closed on master; CRIT-3/5 status against the spec folder is operator- workstation-validation-only. Documented for follow-up. SEC-L2 WebAuthn / FIDO2 / step-up — already documented in docs/operator/auth-threat-model.md "Threats Bundle 2 does NOT close". v3 work item per CLAUDE.md decision 12. Full per-finding rationale + receipts at docs/operator/security-bundle-5-audit-closure.md. Verification: gofmt -l # clean go vet ./internal/connector/notifier/slack ./internal/connector/notifier/teams ./internal/auth/oidc ./internal/api/handler ./cmd/server # clean go build ./cmd/server [...] # clean go test -short -count=1 ./internal/connector/notifier/slack ./internal/connector/notifier/teams ./internal/api/handler ./internal/auth/oidc ./internal/config # PASS # (slack 0.028s + teams # 0.023s + handler 11.0s; # newForTest seam keeps # httptest tests green) Audit-Closes: BUNDLE-5 S1 R6 R7 RT-L2 finding-1 HIGH-5 Audit-Verifies-False: S4 SEC-L1 Audit-Defers: S3 S5 S8 SEC-H1 SEC-L2
2026-06-07 14:51:30 +00:00 · 2026-05-13 01:18:45 +00:00
parent 750478a6fe
commit 596e675ec7
9 changed files with 265 additions and 14 deletions
@@ -32,6 +32,7 @@ import (
 	"github.com/certctl-io/certctl/internal/auth/breakglass"
 	bgdomain "github.com/certctl-io/certctl/internal/auth/breakglass/domain"
 	sessiondomain "github.com/certctl-io/certctl/internal/auth/session/domain"
+	"github.com/certctl-io/certctl/internal/ratelimit"
 )

 // =============================================================================
@@ -51,9 +52,30 @@ type BreakglassService interface {
 }

 // AuthBreakglassHandler ships the Phase 7.5 surface.
+//
+// Bundle 5 closure (S1): the docstring at the top of this file claimed
+// the login endpoint was "Rate-limited at 5/minute per source IP via
+// the existing rate limiter middleware" but no per-route limiter was
+// wired — `/auth/breakglass/login` is registered via `r.mux.Handle`
+// in router.go::AuthExemptRouterRoutes and bypasses the global RPS
+// middleware that wraps `r.Register`-mounted routes. The login handler
+// now owns its own SlidingWindowLimiter (5 attempts / minute / source
+// IP, 50 000 key cap) so the documented behavior actually ships.
+//
+// Wired at startup via SetLoginRateLimiter (called from cmd/server/main.go
+// alongside the other per-handler rate limiters that close audit
+// findings H-9 / H-12 / Bundle 3 D7 / etc.). Defense-in-depth: even
+// when the limiter is nil (legacy / test), the service-layer Argon2id
+// lockout state machine still protects against brute force — but a
+// nil limiter is a misconfiguration the integration test catches.
 type AuthBreakglassHandler struct {
 	svc         BreakglassService
 	cookieAttrs SessionCookieAttrs
+	// loginLimiter rate-limits POST /auth/breakglass/login by source IP.
+	// nil-safe: when unset, the handler skips the limiter check and
+	// relies on the service-layer Argon2id lockout. Production deploys
+	// MUST set this via SetLoginRateLimiter.
+	loginLimiter *ratelimit.SlidingWindowLimiter
 }

 // NewAuthBreakglassHandler constructs the handler.
@@ -61,6 +83,13 @@ func NewAuthBreakglassHandler(svc BreakglassService, cookieAttrs SessionCookieAt
 	return &AuthBreakglassHandler{svc: svc, cookieAttrs: cookieAttrs}
 }

+// SetLoginRateLimiter wires the per-source-IP rate limiter the Login
+// handler enforces. Bundle 5 closure (S1) — see the AuthBreakglassHandler
+// type docstring for the full rationale.
+func (h *AuthBreakglassHandler) SetLoginRateLimiter(l *ratelimit.SlidingWindowLimiter) {
+	h.loginLimiter = l
+}
+
 // =============================================================================
 // 1. Public login endpoint.
 // =============================================================================
@@ -98,6 +127,22 @@ func (h *AuthBreakglassHandler) Login(w http.ResponseWriter, r *http.Request) {
 	}

 	ip := clientIPFromRequest(r)
+
+	// Bundle 5 closure (S1): per-source-IP rate limit. 5 attempts /
+	// minute / IP (default; configurable via the constructor at
+	// cmd/server/main.go). Returns 429 with no body so the response
+	// shape matches the rest of the auth surface (scanner-unfriendly).
+	// Audited by the service layer on the next attempt — we don't
+	// audit the rate-limit hit itself here because that would let an
+	// attacker flood the audit table with rate-limit rows from a
+	// single IP.
+	if h.loginLimiter != nil {
+		if err := h.loginLimiter.Allow(ip, time.Now()); err != nil {
+			Error(w, http.StatusTooManyRequests, "too many requests")
+			return
+		}
+	}
+
 	res, err := h.svc.Authenticate(r.Context(), req.ActorID, req.Password, ip, r.UserAgent())
 	if err != nil {
 		// All authenticate errors map to the SAME 401 + same body.