mirror of
https://github.com/shankar0123/certctl.git
synced 2026-06-07 16:21:30 +00:00
750478a6fe
Bundle 4 closure (2026-05-13 acquisition diligence audit). Closes the
"what happens under multi-replica" question cluster: migration runner
had no concurrency control + no applied-version ledger, 15 scheduler
loops had per-process idempotency but no cross-replica documentation,
rate limits were process-local without an operator-facing scope
statement, load-test scope explicitly omitted four hot paths without
linking them to a roadmap.
Source findings closed:
HIGH-1 + D4 + finding 4 (migration tracking)
D8 (scheduler loop ownership)
MED-1 + MED-2 (rate-limit scope)
T9 + LOW-7 + finding 7 (load-test receipt scope)
Closures by source ID:
HIGH-1 + D4 + finding 4 — Migration tracking + advisory lock.
internal/repository/postgres/db.go::RunMigrations now wraps every
migration execution in:
1. A dedicated *sql.Conn pinned to one connection for the entire
scan + apply lifecycle (pg_advisory_lock is connection-scoped).
2. pg_advisory_lock(migrationAdvisoryLockID) — fixed int64 key
derived from "certctl-migrations" so the same constant resolves
across deployments without colliding with operator advisory
locks. Blocks the second replica until the first finishes.
3. CREATE TABLE IF NOT EXISTS schema_migrations(version TEXT PK,
applied_at TIMESTAMPTZ DEFAULT NOW()) — audit ledger.
4. Skip-applied loop: SELECT version FROM schema_migrations →
map[string]struct{} → skip every .up.sql whose filename is in
the map. INSERT after successful execute, ON CONFLICT
(version) DO NOTHING for defense in depth.
Pre-Bundle-4 every server boot re-ran all 45 .up.sql files. The
"idempotency via IF NOT EXISTS / ON CONFLICT" contract in CLAUDE.md
held per-migration but offered no protection when two Helm replicas
raced on schema DDL. Post-Bundle-4 single-replica deploys see zero
behavior change beyond the audit-table population; multi-replica
deploys get HA-safe schema bootstrap.
D8 — Scheduler HA semantics documented.
New docs/operator/scheduler-ha.md with per-loop inventory of all 15
loops in internal/scheduler/scheduler.go. Classification:
- HA-safe (jobProcessorLoop, jobRetryLoop) — FOR UPDATE SKIP
LOCKED via ClaimPendingJobs (Bundle 1 H-6 closure, 3e78ecb).
- HA-safe-ish (jobTimeoutLoop) — atomic UPDATE-WHERE-status.
- Idempotent under N>1 replicas (renewalCheckLoop,
agentHealthCheckLoop, shortLivedExpiryCheckLoop, networkScanLoop,
healthCheckLoop, acmeGCLoop, sessionGCLoop) — duplicate ticks
produce idempotent side effects.
- Side-effect-duplicating under N>1 replicas
(notificationProcessLoop, notificationRetryLoop, digestLoop,
cloudDiscoveryLoop, crlGenerationLoop) — duplicate
webhook/email/AWS-API/CRL-signing operations. Operators
running multi-replica accept N× side effects or pin to
server.replicas: 1.
Leader-election work tracked in WORKSPACE-ROADMAP.md as v3.
MED-1 + MED-2 — Rate-limit scope.
New docs/operator/rate-limit-scope.md states the contract verbatim:
process-local sync.Mutex-guarded sliding-window log, effective
cluster-wide cap = configured-per-replica × server.replicas,
restart-safe (no persistent state, no shared store), bounded
(50k/100k key cap with eviction). Five call sites documented:
ocspLimiter (1m/IP), exportLimiter (1h/actor), EST per-principal
(24h/CN), EST failed-auth (1h/IP), Intune dispatcher
(24h/Subject+Issuer), plus the HTTP middleware token-bucket
(RPS+Burst per replica). Cluster-wide shared limits via Redis or
Postgres-backed bucket are tracked in WORKSPACE-ROADMAP.md as v3.
T9 + LOW-7 + finding 7 — Load-test receipt scope.
The existing harness at deploy/test/loadtest/ already
self-documents the gap ("What it explicitly does NOT measure"). No
code change needed for this finding; Bundle 4 cross-references
scheduler-ha.md and rate-limit-scope.md from those gap callouts so
the four deferred coverage classes (issuer connector, scheduler
throughput, agent fleet, DB p99) land in the same place an
acquirer reads about HA semantics and rate limits.
Tests:
internal/repository/postgres/migrations_test.go (new, 4 tests):
- TestRunMigrations_PopulatesSchemaMigrations: audit table
exists and is non-empty after the first migration run.
- TestRunMigrations_SkipsAppliedOnSecondCall: second call is
observable no-op on row count.
- TestRunMigrations_ConcurrentCallsSerialized: two goroutines
racing the migrator both return without error; row count
unchanged; no duplicate versions.
- TestRunMigrations_FreshDatabaseHappyPath: ≥ 30 migrations
land on a fresh schema.
Gated by testcontainers via the existing repo_test.go getTestDB
pattern; skipped under -short. The integration lane runs them.
Verification:
gofmt -l # clean
go vet ./internal/repository/postgres ./cmd/server # clean
go build ./cmd/server ./internal/repository/postgres # clean
go test -short -count=1 ./internal/repository/postgres
./internal/ratelimit # PASS
Operator follow-up: full integration run on workstation:
go test -count=1 ./internal/repository/postgres -run TestRunMigrations_
Receipts (paths for the audit packet):
Migration runner evidence: internal/repository/postgres/db.go
L135-340 (advisory-lock + ledger + skip-applied loop) +
internal/repository/postgres/migrations_test.go (4 tests).
Scheduler loop inventory: docs/operator/scheduler-ha.md (15-loop
table with HA classification per loop).
Rate-limit storage matrix: docs/operator/rate-limit-scope.md.
Load-test baseline: deploy/test/loadtest/README.md (already
self-documenting), cross-linked from scheduler-ha.md.
Remaining operator warnings (deferred, tracked in WORKSPACE-ROADMAP.md):
- Leader election for the four duplicate-side-effect loops
(notificationProcessLoop, notificationRetryLoop, digestLoop,
cloudDiscoveryLoop, crlGenerationLoop). v3 work item.
- Shared rate-limits across replicas (Redis / Postgres token
bucket). v3 work item.
- Issuer-connector + scheduler-throughput + agent-fleet + DB-p99
load-test coverage. Tracked separately; per-issuer Prometheus
histograms already capture issuer round-trip latency in
production runs.
Audit-Closes: BUNDLE-4 HIGH-1 D4 D8 MED-1 MED-2 T9 LOW-7 finding-4 finding-7
55 lines
5.3 KiB
Markdown
55 lines
5.3 KiB
Markdown
# Rate-Limit Scope
|
||
|
||
> Last reviewed: 2026-05-13
|
||
|
||
How certctl's rate limiters behave under multi-replica and restart, and where the boundaries are. Closes Bundle 4 audit findings **MED-1** (process-local rate limits) and **MED-2** (rate-limit semantics across replicas).
|
||
|
||
## TL;DR
|
||
|
||
Every rate limiter in certctl is **process-local**: in-memory `sync.Mutex`-guarded maps in the `internal/ratelimit` package. The effective rate-limit across an N-replica deployment is **N × the configured per-replica limit**. Limiter state is lost on restart — no persistent ledger, no shared store. This is intentional for v2.1.0 and documented; shared rate limits across replicas (via Redis or a Postgres-backed token bucket) is a v3 work item tracked in `WORKSPACE-ROADMAP.md`.
|
||
|
||
## Limiter inventory
|
||
|
||
The shared primitive lives at `internal/ratelimit/sliding_window.go::SlidingWindowLimiter`. It's a sliding-window log algorithm (each key holds timestamps within the configured window; on `Allow` the bucket prunes timestamps older than `now - window` and admits or rejects based on the post-prune count).
|
||
|
||
Five call sites exercise it across `cmd/server/main.go`:
|
||
|
||
| Call site | Key | Window | Cap | What it protects |
|
||
|---|---|---|---|---|
|
||
| `ocspLimiter` | source IP | 1 minute | `CERTCTL_OCSP_RATE_LIMIT_PER_IP_MIN` (default 1000) | RFC 6960 OCSP responder against scan amplification. |
|
||
| `exportLimiter` | actor ID | 1 hour | `CERTCTL_CERT_EXPORT_RATE_LIMIT_PER_ACTOR_HR` (default 50) | `/api/v1/certificates/{id}/export` bulk-cert-pull abuse. |
|
||
| EST per-principal | CN | 24 hours | per-profile `RateLimitPerPrincipal24h` | EST RFC 7030 device enrollment abuse. |
|
||
| EST failed-auth | source IP | 1 hour | 10 attempts | EST HTTP-Basic brute force. |
|
||
| Intune dispatcher | (Subject, Issuer) | 24 hours | per-profile `PerDeviceRateLimit24h` | SCEP + Intune duplicate-enrollment cap. |
|
||
|
||
The HTTP middleware rate-limiter (`internal/api/middleware/middleware.go::RateLimitConfig`, knobs `CERTCTL_RATE_LIMIT_RPS` + `CERTCTL_RATE_LIMIT_BURST`) is a separate token-bucket implementation but follows the same process-local scope.
|
||
|
||
## What is in scope
|
||
|
||
- **Per-process abuse mitigation**. A scanner hitting one replica's OCSP responder at 5000 rps gets dropped to 1000 rps by `ocspLimiter`. A compromised API key trying to bulk-export 1000 certs in an hour against one replica gets dropped to 50.
|
||
- **Bounded memory footprint**. Each limiter caps its key-tracking map at the size passed to `NewSlidingWindowLimiter` (50 000 for OCSP/export, 100 000 for EST/Intune per-device). The at-cap eviction janitor drops the oldest entry by newest-timestamp so a key explosion (random source IPs from a botnet, random CNs from a misconfigured fleet) never bloats memory.
|
||
- **Restart safety**. The sliding-window state is per-process in-memory. On a server restart the limiter state resets — any attacker who'd burned through their window cap before the restart gets a fresh window after. Conversely, a legitimate caller that had hit their cap right before a restart gets immediately unblocked. Both directions are intentional: we don't ship a persistent state store, so the post-restart state is "fresh start".
|
||
|
||
## What is NOT in scope
|
||
|
||
- **Shared limits across replicas**. With `server.replicas: N`, an attacker hitting all replicas in parallel gets N × the per-replica cap. For the default OCSP knob (1000 rps per IP per minute) at N=3, that's 3000 rps per IP per minute before any single replica drops the traffic. The chart's default is N=1; operators running multi-replica should multiply published per-replica caps by the replica count to get the cluster-wide effective cap.
|
||
- **Cross-restart persistence**. See "Restart safety" above — this is by-design but operators need to know.
|
||
- **First-party (`Authorization: Bearer ...`) request rate-limit cohesion across replicas**. The middleware-level RPS/burst rate-limit (`CERTCTL_RATE_LIMIT_RPS`) is also process-local. At N=3 replicas with default `RPS=50, Burst=100`, the effective cluster-wide rate is 150 rps with 300 burst.
|
||
|
||
## Operator guidance
|
||
|
||
**Single-replica deployments** (Helm chart default `server.replicas: 1`): published caps are the effective caps. No mental math.
|
||
|
||
**Multi-replica deployments**: multiply every published cap by `server.replicas` to get the effective per-key cap. If your threat model needs strict cluster-wide rate limits (e.g., SOC 2 control mapping that quotes "≤ 1000 OCSP requests per IP per minute"), one of:
|
||
|
||
1. **Pin to single replica** and scale vertically. The default v2.1.0 posture; works for substantial fleets.
|
||
2. **Front the cluster with a rate-limited gateway** (NGINX `limit_req_zone`, Envoy global rate-limit service, Cloudflare WAF Bypass rules) and treat the cluster-internal limiter as defense-in-depth.
|
||
3. **Wait for the v3 shared-rate-limit work** (tracked in WORKSPACE-ROADMAP.md). Likely Redis or Postgres-backed token-bucket plumbed through the same `internal/ratelimit` package so call sites stay unchanged.
|
||
|
||
## Source-of-truth references
|
||
|
||
- Shared primitive: `internal/ratelimit/sliding_window.go` (the package comment at the top is the canonical algorithm + concurrency reference).
|
||
- Middleware rate limiter: `internal/api/middleware/middleware.go::RateLimitConfig`.
|
||
- Call sites: `grep -n "ratelimit.NewSlidingWindowLimiter\|RateLimitConfig" cmd/server/main.go`.
|
||
- Configuration knobs: `docs/reference/configuration.md` (search "rate limit").
|