Files
certctl/internal/scep/intune/rate_limit.go
T
shankar0123 aa139ee0d9 EST RFC 7030 hardening master bundle Phases 2-4: end-to-end mTLS sibling
route + RFC 9266 channel binding + HTTP Basic enrollment-password +
per-source-IP failed-auth limit + per-(CN, sourceIP) sliding-window cap.

Two new shared packages so EST + Intune share infrastructure:
- internal/cms/ — RFC 9266 tls-exporter extractor (ExtractTLSExporter
  with stdlib-panic recovery for synthetic ConnectionStates) +
  CSR-side channel-binding parser via raw TBSCertificationRequestInfo
  walk (the stdlib's csr.Attributes can't represent the OCTET STRING
  binding value), VerifyChannelBinding composite, EmbedChannel-
  BindingAttribute fixture helper, typed sentinel errors for missing
  / mismatch / not-TLS-1.3 mapped to HTTP 400 / 409 / 426 in handler.
- internal/trustanchor/ — extracted from scep/intune/trust_anchor*.go
  so the EST mTLS sibling route + Intune dispatcher share the same
  SIGHUP-reloadable PEM bundle primitive. intune.TrustAnchorHolder
  is now `= trustanchor.Holder` (type alias) + NewTrustAnchorHolder =
  trustanchor.New (function alias) — every existing call site compiles
  unchanged. Intune's LoadTrustAnchor is a thin wrapper over
  trustanchor.LoadBundle. White-box tests moved to the new package.
- internal/ratelimit/ — extracted from scep/intune/rate_limit.go (this
  was Phase 4.1, in the same bundle). intune.PerDeviceRateLimiter
  is now a thin wrapper preserving the (subject, issuer)→key
  composition; EST handler reaches for SlidingWindowLimiter directly.

ESTHandler grew six optional fields wired by per-profile setters
(SetMTLSTrust / SetChannelBindingRequired / SetEnrollmentPassword /
SetSourceIPRateLimiter / SetPerPrincipalRateLimiter / SetLabelForLog)
plus four new mTLS-route methods (CACertsMTLS / SimpleEnrollMTLS /
SimpleReEnrollMTLS / CSRAttrsMTLS); shared internal pipeline
handleEnrollOrReEnroll(reEnroll, viaMTLS) keeps the auth/binding/
rate-limit gates DRY. New router method RegisterESTMTLSHandlers
registers /.well-known/est-mtls/<PathID>/{cacerts,simpleenroll,
simplereenroll,csrattrs}; AuthExemptDispatchPrefixes extends the
no-auth chain to /.well-known/est-mtls.

cmd/server/main.go's EST loop wires per-profile mTLS holder +
channel-binding policy + per-principal limiter + (when EnrollmentPassword
non-empty) Basic + source-IP limiter; new preflightESTMTLSClientCATrust-
Bundle returns *trustanchor.Holder so SIGHUP rotates the EST mTLS
bundle live without restart. SCEP + EST mTLS profiles now share a
single union mtlsUnionPoolForTLS passed to buildServerTLSConfigWithMTLS
(replaces the protocol-specific scepMTLSUnionPoolForTLS); per-handler
re-verify enforces "cert must chain to THIS profile's bundle" so
cross-protocol bleed is blocked at the application layer even though
the TLS layer trusts certs from either pool's union.

Phase 3.3 source-IP failed-Basic limiter defaults: 10 attempts / 1h
/ 50k tracked IPs (no env var; tunable in a follow-up). Phase 4.2
per-principal limiter cap from CERTCTL_EST_PROFILE_<NAME>_RATE_
LIMIT_PER_PRINCIPAL_24H (existing field, Phase 1 shipped).

New tests:
- internal/cms/channelbinding_test.go: extractor + CSR-side parser +
  composite + TLS-1.3 round-trip end-to-end + EmbedChannelBinding-
  Attribute round-trip
- internal/trustanchor/holder_test.go: parseBundlePEM white-box +
  LoadBundle + Holder Get/Pool/SetLabelForLog/Reload-happy/
  Reload-keeps-old-on-failure/Reload-keeps-old-on-expired/
  WatchSIGHUP-reloads-pool/WatchSIGHUP-stop-clean
- internal/api/handler/est_hardening_test.go: 16 named cases covering
  mTLS no-trust-pool 500 + no-cert 401 + cross-profile cert 401 +
  happy-path 200 + CACertsMTLS auth gate + CSRAttrsMTLS auth gate +
  channel-binding required-absent-rejected + not-required-absent-
  allowed + writeChannelBindingError mapping + Basic no-header 401
  + Basic wrong-password 401 + Basic correct-200 + Basic-no-password
  no-gate + per-IP failed-attempt lockout 429 + per-principal
  blocks-after-cap + different-principals-independent + no-limiter-
  unbounded.

Pre-commit verification (sandbox): gofmt clean, go vet clean
(excluding repository/postgres which the sandbox can't build —
disk-space testcontainers download), staticcheck clean for
cms/trustanchor/api/handler/api/router/scep/intune/ratelimit/
cmd/server, go test -short -count=1 green for cms/trustanchor/
api/handler/api/router/scep/intune/ratelimit/service. G-3
docs-drift guard reproduced locally clean (Phase 1 already
documented every new env var; Phases 2-4 added zero new env vars).
2026-04-29 23:15:35 +00:00

88 lines
3.9 KiB
Go

package intune
import (
"time"
"github.com/shankar0123/certctl/internal/ratelimit"
)
// SCEP RFC 8894 + Intune master bundle Phase 8.6.
//
// PerDeviceRateLimiter is the second line of defense behind the replay
// cache from Phase 7. The replay cache catches the same challenge being
// submitted twice (within the challenge TTL); this rate limiter catches a
// compromised Connector signing key (or a stolen key+cert pair) issuing
// many DIFFERENT valid challenges for the same device subject in a short
// window.
//
// Threat model:
//
// - Replay cache (Phase 7): nonce-keyed; catches duplicate submission.
// - This limiter: (Subject, Issuer)-keyed; catches enrollment-flooding.
//
// EST RFC 7030 hardening master bundle Phase 4.1: the implementation that
// used to live in this file was extracted to internal/ratelimit (where it
// can be shared with EST per-principal + EST HTTP-Basic source-IP rate
// limiters). PerDeviceRateLimiter is now a thin wrapper around
// ratelimit.SlidingWindowLimiter that preserves the original
// (subject, issuer) → key composition in the Allow signature so existing
// SCEP/Intune callers don't have to change.
//
// New callers SHOULD use ratelimit.SlidingWindowLimiter directly. The
// EST RFC 7030 Phase 4.2 EST per-principal cap uses the shared package.
// ErrRateLimited is the typed error returned when the per-device rate
// limit fires. Aliased to ratelimit.ErrRateLimited so errors.Is matches
// against either name (the SCEP audit closure already pinned the
// "rate_limited" metric label against this sentinel; the alias preserves
// sentinel identity across the package boundary).
var ErrRateLimited = ratelimit.ErrRateLimited
// PerDeviceRateLimiter wraps ratelimit.SlidingWindowLimiter with the
// (subject, issuer)-composed-key Allow signature the Intune dispatcher
// uses. Concurrency-safe (the underlying limiter holds the mutex).
type PerDeviceRateLimiter struct {
inner *ratelimit.SlidingWindowLimiter
}
// NewPerDeviceRateLimiter returns a limiter with the given per-key cap +
// window. maxN ≤ 0 disables the limiter (all Allow calls return nil);
// this is operator opt-out for the rare case where the per-device cap is
// undesirable (e.g. test harnesses, sketchpad deploys).
//
// Window defaults to 24h when zero. Map cap defaults to 100,000 when zero
// (matches the replay cache cap; see internal/scep/intune/replay.go).
func NewPerDeviceRateLimiter(maxN int, window time.Duration, mapCap int) *PerDeviceRateLimiter {
return &PerDeviceRateLimiter{inner: ratelimit.NewSlidingWindowLimiter(maxN, window, mapCap)}
}
// Allow checks whether an enrollment for the given (subject, issuer)
// tuple is permitted right now. Returns nil when allowed (and records
// the timestamp in the bucket) or ErrRateLimited when the bucket is at
// maxN.
//
// Empty subject is treated as "skip the limiter" — the caller's claim
// validation should have rejected an empty-subject claim already; this
// is belt-and-suspenders to prevent a single empty-subject bucket from
// becoming a fleet-wide chokepoint.
func (l *PerDeviceRateLimiter) Allow(subject, issuer string, now time.Time) error {
if subject == "" {
// Empty-subject early return preserved from the pre-Phase-4.1
// behavior: ratelimit.SlidingWindowLimiter also short-circuits
// on empty key, but the explicit check here documents the
// (subject, issuer) → empty-key contract and saves one call
// frame in the hot path.
return nil
}
key := subject + "|" + issuer
return l.inner.Allow(key, now)
}
// Len returns the approximate number of distinct (subject, issuer) keys
// currently tracked. For observability + tests.
func (l *PerDeviceRateLimiter) Len() int { return l.inner.Len() }
// Disabled reports whether the limiter is in opt-out mode (maxN ≤ 0).
// Useful for handler-side gating + admin-endpoint observability.
func (l *PerDeviceRateLimiter) Disabled() bool { return l.inner.Disabled() }