fix(middleware): SEC-006 — TTL-evict idle token-bucket rate-limiter entries

Sprint 2 unified-master-audit closure. Pre-fix, the keyed rate
limiter's bucket map had no eviction. The package-level comment
explicitly noted the leak: high-cardinality unauthenticated traffic
(CGNAT churn, Tor exit lists, botnets, infinite-cardinality scanners)
grew process memory without bound. Production deploys with millions of
unique IPs would eventually OOM.

Fix:
  - RateLimitConfig.BucketTTL (env CERTCTL_RATE_LIMIT_BUCKET_TTL,
    default 1h, clamp-floor 1m). 1h was chosen to sit well above
    realistic operator IP churn windows (returning clients keep their
    bucket) while still bounding how long an idle bucket stays in
    the map, which the pre-fix code never did.
  - tokenBucket gains a lastAccess field that touch() updates on
    every allow() call; lastAccessTime() reads it under the bucket's
    own mutex.
  - keyedRateLimiter.sweepLoop runs in a single goroutine per
    limiter (production wires 2: default + no-auth fallback), waking
    every BucketTTL/4. sweep() removes any bucket whose lastAccess
    is older than the cutoff and bumps evictedTotal atomically.
  - Both NewRateLimiter call sites in cmd/server/main.go (default
    stack and no-auth fallback) now thread cfg.RateLimit.BucketTTL.
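
A minimal sketch of the TTL resolution and sweep cadence described in
the list above, not the code in this commit. The env var name, the 1h
default, the 1m floor, and the BucketTTL/4 interval come from this
message; resolveBucketTTL and sweepInterval are hypothetical helper
names, and falling back to the default on an unparsable value is an
assumption (the real loader may reject bad input instead).

```go
package main

import (
	"fmt"
	"os"
	"time"
)

const (
	defaultBucketTTL = time.Hour   // default stated in the commit message
	minBucketTTL     = time.Minute // clamp floor stated in the commit message
)

// resolveBucketTTL is a hypothetical helper. It parses the raw env value,
// falls back to the default when the value is empty or unparsable (an
// assumption), and raises anything below the floor up to the floor.
func resolveBucketTTL(raw string) time.Duration {
	if raw == "" {
		return defaultBucketTTL
	}
	ttl, err := time.ParseDuration(raw)
	if err != nil {
		return defaultBucketTTL
	}
	if ttl < minBucketTTL {
		return minBucketTTL
	}
	return ttl
}

// sweepInterval mirrors the "wake every BucketTTL/4" cadence.
func sweepInterval(ttl time.Duration) time.Duration {
	return ttl / 4
}

func main() {
	ttl := resolveBucketTTL(os.Getenv("CERTCTL_RATE_LIMIT_BUCKET_TTL"))
	fmt.Printf("bucket TTL %s, sweep every %s\n", ttl, sweepInterval(ttl))
}
```

Under these assumptions, and provided sweeps run on schedule, an idle
bucket is removed no later than roughly 1.25 x BucketTTL after its last
access: it becomes evictable at BucketTTL and the next sweep is at most
BucketTTL/4 away.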

Regression coverage:
  - TestKeyedRateLimiter_SweepEvictsIdleBuckets: 1000 synthetic IP
    keys populate the map, advance past TTL, call sweep() directly,
    assert map drained to 0 + evictedTotal=1000 + fresh key creates
    new bucket (map not poisoned).
  - TestKeyedRateLimiter_SweepKeepsActiveBuckets: inverse — a bucket
    touched within the TTL window survives the sweep. Catches a
    future regression that inverts the cutoff comparison.
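
The cutoff comparison these two tests pin down can be sketched as
follows. This is an illustration under stated assumptions, not the
package's code: the bucket and limiter types, access(), and runDemo()
are hypothetical, the clock is passed in explicitly so nothing has to
sleep, and the cutoff is assumed to be "now minus TTL". Both test
cases are folded into one run for brevity.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// bucket tracks only the last access time; token accounting is omitted.
type bucket struct {
	mu         sync.Mutex
	lastAccess time.Time
}

func (b *bucket) touch(now time.Time) {
	b.mu.Lock()
	b.lastAccess = now
	b.mu.Unlock()
}

func (b *bucket) lastAccessTime() time.Time {
	b.mu.Lock()
	defer b.mu.Unlock()
	return b.lastAccess
}

// limiter is a stand-in for the keyed limiter (hypothetical shape).
type limiter struct {
	mu           sync.Mutex
	buckets      map[string]*bucket
	ttl          time.Duration
	evictedTotal atomic.Int64
}

func newLimiter(ttl time.Duration) *limiter {
	return &limiter{buckets: make(map[string]*bucket), ttl: ttl}
}

// access creates the bucket on first use and records the access time.
func (l *limiter) access(key string, now time.Time) {
	l.mu.Lock()
	b, ok := l.buckets[key]
	if !ok {
		// Seed lastAccess so a sweep cannot evict a brand-new bucket.
		b = &bucket{lastAccess: now}
		l.buckets[key] = b
	}
	l.mu.Unlock()
	b.touch(now)
}

// sweep deletes buckets whose last access is strictly before now-ttl.
// Inverting Before to After would evict active buckets instead.
func (l *limiter) sweep(now time.Time) int {
	cutoff := now.Add(-l.ttl)
	l.mu.Lock()
	defer l.mu.Unlock()
	evicted := 0
	for key, b := range l.buckets {
		if b.lastAccessTime().Before(cutoff) {
			delete(l.buckets, key)
			evicted++
		}
	}
	l.evictedTotal.Add(int64(evicted))
	return evicted
}

// runDemo fills 1000 idle keys plus one recently touched key, then sweeps.
func runDemo() (evicted, remaining int, total int64) {
	ttl := time.Hour
	start := time.Now()
	l := newLimiter(ttl)
	for i := 0; i < 1000; i++ {
		l.access(fmt.Sprintf("10.0.%d.%d", i/256, i%256), start)
	}
	l.access("active", start.Add(ttl)) // touched inside the TTL window
	evicted = l.sweep(start.Add(ttl + time.Second))
	return evicted, len(l.buckets), l.evictedTotal.Load()
}

func main() {
	evicted, remaining, total := runDemo()
	fmt.Printf("evicted=%d remaining=%d evictedTotal=%d\n", evicted, remaining, total)
}
```

Passing the clock in keeps the sketch deterministic; how the real tests
"advance past TTL" is not shown in this commit view.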

Closes SEC-006.
shankar0123
2026-05-16 04:01:18 +00:00
parent 037876fa0f
commit 8f2e5771db
5 changed files with 246 additions and 11 deletions
cmd/server/main.go (+9)
@@ -2080,6 +2080,11 @@ func main() {
 			BurstSize: cfg.RateLimit.BurstSize,
 			PerUserRPS: cfg.RateLimit.PerUserRPS,
 			PerUserBurstSize: cfg.RateLimit.PerUserBurstSize,
+			// SEC-006 (Sprint 2): bounded bucket TTL so a long-running
+			// server with high-cardinality unauthenticated traffic
+			// (CGNAT churn, Tor exits, scanners) doesn't grow the map
+			// indefinitely.
+			BucketTTL: cfg.RateLimit.BucketTTL,
 		})
 		// SEC-003 closure (Sprint 1, 2026-05-16). Pre-fix the
 		// rate-limit-enabled stack was rebuilt without
@@ -2166,6 +2171,10 @@ func main() {
 		noAuthRateLimiter := middleware.NewRateLimiter(middleware.RateLimitConfig{
 			RPS: cfg.RateLimit.RPS,
 			BurstSize: cfg.RateLimit.BurstSize,
+			// SEC-006 closure (Sprint 2): same bucket-TTL eviction for the
+			// no-auth limiter — this one's the higher exposure since every
+			// unauthenticated probe gets a fresh IP-keyed bucket.
+			BucketTTL: cfg.RateLimit.BucketTTL,
 		})
 		noAuthMiddleware = append(noAuthMiddleware, noAuthRateLimiter)
 	}