mirror of
https://github.com/shankar0123/certctl.git
synced 2026-06-07 20:31:30 +00:00
feat(scep-intune): per-profile dispatcher + SIGHUP reload + per-device rate limit + compliance hook seam
Phase 8 of the SCEP RFC 8894 + Intune master bundle. Wires the internal/scep/intune validator from Phase 7 into the SCEPService dispatch path, with a SIGHUP-reloadable trust anchor holder, a per-(Subject, Issuer) sliding-window rate limiter, and a nil-default ComplianceCheck seam for V3-Pro. Operator-visible surface (per-profile, all default to off): CERTCTL_SCEP_PROFILE_<NAME>_INTUNE_ENABLED=true CERTCTL_SCEP_PROFILE_<NAME>_INTUNE_CONNECTOR_CERT_PATH=/etc/certctl/intune.pem CERTCTL_SCEP_PROFILE_<NAME>_INTUNE_AUDIENCE=https://certctl.example.com/scep/corp CERTCTL_SCEP_PROFILE_<NAME>_INTUNE_CHALLENGE_VALIDITY=60m CERTCTL_SCEP_PROFILE_<NAME>_INTUNE_PER_DEVICE_RATE_LIMIT_24H=3 Per-profile dispatch (Phase 8.8): an operator running corp-laptops through Intune AND IoT devices through static challenge configures INTUNE_ENABLED=true on the corp profile only — the IoT profile's PKCSReq path skips the dispatcher entirely. Mirrors the per-profile shape established by Phase 1.5. Wire-in surfaces: * config.go (Phase 8.1): SCEPProfileConfig.Intune sub-config of type SCEPIntuneProfileConfig (Enabled/ConnectorCertPath/Audience/ ChallengeValidity/PerDeviceRateLimit24h). Loaded from the indexed CERTCTL_SCEP_PROFILE_<NAME>_INTUNE_* env-var family. Per-profile Validate gate refuses INTUNE_ENABLED=true with empty ConnectorCertPath OR negative PerDeviceRateLimit24h. * cmd/server/main.go (Phase 8.2 + wire-in): preflightSCEPIntuneTrustAnchor helper mirrors preflightSCEPRACertKey/preflightSCEPMTLSTrustBundle shape — fail-loud at boot when the trust anchor file is missing / unreadable / empty / contains an expired cert. The per-profile loop builds the holder + replay cache + rate limiter, calls SetIntuneIntegration on the SCEPService, and starts the SIGHUP watcher. A deferred sweep stops every watcher at shutdown. * internal/scep/intune/trust_anchor_holder.go (Phase 8.5): TrustAnchorHolder mirrors cmd/server/tls.go::certHolder. RWMutex- guarded pool + Reload that swaps a fresh slice on success + WatchSIGHUP goroutine that responds to the same SIGHUP the existing TLS-cert watcher uses. A bad reload (parse error, expired cert) keeps the OLD pool in place so a half-rotation doesn't take Intune enrollment down — same fail-safe pattern. Operators rotate via the on-disk file then 'kill -HUP <certctl-pid>'. * internal/scep/intune/rate_limit.go (Phase 8.6): hand-rolled sliding-window-log limiter keyed by (Subject, Issuer). 100k-entry map cap (matches replay cache); at-cap drops the bucket whose newest timestamp is the oldest. Default 3 enrollments per 24h covers legitimate first-cert + recovery + post-wipe re-enrollment but blocks bulk enumeration from a compromised Connector signing key. maxN <= 0 disables the limiter for tests + the rare operator who wants no per-device cap. Empty subject short-circuits to allow (defense-in-depth: caller's claim validation rejects empty-subject upstream; no shared bucket on ''). Why hand-rolled instead of golang.org/x/time/rate: the rate package is in go.sum as an indirect transitive but not a direct dep. ~30 LoC of stdlib avoids creating a new direct dep. * internal/service/scep.go (Phase 8.3 + 8.4 + 8.7): - SCEPService gains intuneEnabled / intuneTrust / intuneAudience / intuneValidity / intuneReplayCache / intuneRateLimiter / complianceCheck fields. - SetIntuneIntegration() constructor-time injection wires the per-profile state. Profiles with INTUNE_ENABLED=false never call this method, so they pay zero overhead. - SetComplianceCheck() installs the V3-Pro plug-in (see Phase 8.7). - looksIntuneShaped(): JWT-shape pre-check (length > 200 + exactly two dots). Allowed to false-positive (validator catches malformed → ErrChallengeMalformed); MUST NOT false-negative on real Intune challenges. - dispatchIntuneChallenge(): the load-bearing core. Runs ValidateChallenge → CSR-binding via DeviceMatchesCSR → replay cache CheckAndInsert → per-device Allow → optional ComplianceCheck. Each failure leg increments a typed metric label and emits an audit-friendly Warn log line. - PKCSReq + PKCSReqWithEnvelope + RenewalReqWithEnvelope all call dispatchIntuneChallenge first; on outcome.decided=true they either short-circuit (with a typed-error → SCEPFailInfo mapping) or call processEnrollment with action='scep_pkcsreq_intune' (so audit greps can count Intune-vs-static enrollments). - mapIntuneErrorToFailInfo(): typed-error → SCEPFailInfo per RFC 8894 §3.2.1.4.5 (signature/replay/expired → BadMessageCheck; claim-mismatch → BadRequest; default → BadRequest). - intuneFailReason(): typed-error → metric label ('signature_invalid' / 'expired' / 'rate_limited' / etc.). Default 'malformed' so a previously-unseen error category still surfaces in the metric for follow-up. - ComplianceCheck (Phase 8.7): nil-default no-op gate. V3-Pro plugs in via SetComplianceCheck to call Microsoft Graph's compliance API. Returns (compliant, reason, err). nil-err + compliant=false → CertRep FAILURE + 'compliance' reason in audit. err != nil → fail-safe deny (V3-Pro module is responsible for any 'permit on API failure' policy). * internal/service/scep.go also gains parseCSRForIntune() — small private wrapper around encoding/pem + x509 used by the dispatcher for the claim ↔ CSR binding check (separated from the broader processEnrollment because we want to bind BEFORE consuming the replay-cache slot). Tests (gates: ≥85% coverage on intune package, ≥70% on service): * scep_intune_test.go (in internal/service): 14 dispatcher tests covering happy-path Intune enrollment + static-challenge fallback + tampered-challenge reject + claim-mismatch reject + replay detected + rate-limited + compliance-hook nil-default + compliance- hook denies non-compliant + compliance-hook error fails closed + IntuneEnabled accessor + 'no IntuneEnabled = static path unchanged' regression pin + intuneFailReason mapping for every typed error + looksIntuneShaped boundary cases. * trust_anchor_holder_test.go (in internal/scep/intune): NewLoadsBundle, NewRequiresLogger, NewSurfacesLoadError, ReloadHappyPath, ReloadKeepsOldOnFailure, ReloadKeepsOldOnExpired (the fail-safe semantics that make the SIGHUP path operator-friendly), WatchSIGHUPReloadsPool (real SIGHUP to self with poll-for-swap pattern mirroring cmd/server/tls_test.go), WatchSIGHUPStopIsClean (does NOT fire SIGHUP after stop — same caveat as the TLS test: the Go runtime would otherwise terminate the test runner on the next SIGHUP since signal.Stop has removed the handler). * rate_limit_test.go (in internal/scep/intune): AllowsUpToCap, DistinctKeysIndependent, WindowExpiry, DisabledBypass (maxN=0), NegativeCapDisabled, EmptySubjectShortCircuits (defense-in-depth against an empty-subject DoS chokepoint), DefaultCapsHonored, MapCapEvictsOldest (at-cap eviction branch), ConcurrentRaceFree (50 goroutines × 200 inserts), pruneOlderThan + the no-op case. Verification: * gofmt -l on all touched files: clean * go vet ./... : clean * staticcheck on intune/service/config/cmd-server: clean * go test -count=1 -cover ./internal/scep/intune/...: 94.8% (target ≥85%) * go test -short across intune+service+config+handler+cmd-server: all green * G-3 docs-drift CI guard reproduced locally: docs-only filtered= empty, config-only=empty. The new env vars match the existing CERTCTL_SCEP_ allowlist prefix. Refs: cowork/scep-rfc8894-intune-master-prompt.md::Phase 8 cowork/scep-rfc8894-intune/progress.md Constitutional rule: 'Always take the complete path, not the easy path' (cowork/CLAUDE.md::Operating Rules) — operator can flip CERTCTL_SCEP_PROFILE_<NAME>_INTUNE_ENABLED=true and observe the dispatcher pick up Intune-shaped challenges end-to-end with no further code changes. Foundation + plumbing ship together.
This commit is contained in:
@@ -656,6 +656,11 @@ SCEP uses a single URL (`/scep?operation=...`). The handler extracts PKCS#10 CSR
|
||||
| `CERTCTL_SCEP_PROFILE_<NAME>_RA_KEY_PATH` | (none) | Per-profile RA private key PEM path (mode `0600`). Same semantics as `CERTCTL_SCEP_RA_KEY_PATH` but scoped to one profile. **Required for every profile.** |
|
||||
| `CERTCTL_SCEP_PROFILE_<NAME>_MTLS_ENABLED` | `false` | **Phase 6.5 (opt-in).** When true, certctl exposes a sibling `/scep-mtls/<pathID>` route alongside the standard `/scep/<pathID>` route. The sibling route requires the SCEP client to present an mTLS client cert that chains to `_MTLS_CLIENT_CA_TRUST_BUNDLE_PATH`. The standard route continues to use challenge-password-only auth — operators can run BOTH routes simultaneously for migration / heterogeneous client fleets. mTLS is additive (not a replacement for the challenge password). Designed for enterprise procurement teams that reject "shared password authentication" as a checkbox-fail. Same model Apple's MDM and Cisco's BRSKI use. |
|
||||
| `CERTCTL_SCEP_PROFILE_<NAME>_MTLS_CLIENT_CA_TRUST_BUNDLE_PATH` | (none) | PEM bundle of CA certs that sign the client (device-bootstrap) certs the operator allows to enroll on this profile's `/scep-mtls/<pathID>` route. **Required when `_MTLS_ENABLED=true`.** Operators with multiple bootstrap CAs concatenate them. The startup preflight (`cmd/server/main.go::preflightSCEPMTLSTrustBundle`) validates: file exists, parses as PEM, contains ≥1 cert, none expired. |
|
||||
| `CERTCTL_SCEP_PROFILE_<NAME>_INTUNE_ENABLED` | `false` | **Phase 8 (opt-in).** When true, this profile routes Intune-shaped challenge passwords (length > 200 + exactly two dots) to the Microsoft Intune Certificate Connector signed-challenge validator. Static challenge passwords still work as a fallback for non-Intune devices in mixed-fleet deployments. Per-profile flag so an operator running corp-laptops via Intune AND IoT devices via static challenge can opt-in on the corp profile only. |
|
||||
| `CERTCTL_SCEP_PROFILE_<NAME>_INTUNE_CONNECTOR_CERT_PATH` | (none) | Filesystem path to a PEM bundle of one or more Microsoft Intune Certificate Connector signing certs. **Required when `_INTUNE_ENABLED=true`.** Reloaded on `SIGHUP` (mirrors the server TLS-cert reload pattern). Startup preflight + reload both refuse empty bundles + expired certs and surface the offending subject CN in the error message. Operators who rotate the Connector signing cert update the file on disk then `kill -HUP <certctl-pid>` to apply (no restart required). |
|
||||
| `CERTCTL_SCEP_PROFILE_<NAME>_INTUNE_AUDIENCE` | (empty, audience check disabled) | Expected `aud` claim in the Intune challenge — typically the public SCEP endpoint URL the Connector is configured to call (e.g. `https://certctl.example.com/scep/corp`). Empty disables the check, useful for proxy / load-balancer scenarios where the URL the Connector saw differs from the URL we see. Operators who pin a public URL gain defense-in-depth against challenge re-use across endpoints. |
|
||||
| `CERTCTL_SCEP_PROFILE_<NAME>_INTUNE_CHALLENGE_VALIDITY` | `60m` | Maximum age of an Intune challenge, on top of the challenge's own `iat`/`exp` claims. Defense-in-depth: even if the Connector mints a 24h-valid challenge, this caps the window during which a leaked challenge can be replayed. Default matches Microsoft's published Connector defaults. Zero disables the cap (relies entirely on the challenge's `exp`). |
|
||||
| `CERTCTL_SCEP_PROFILE_<NAME>_INTUNE_PER_DEVICE_RATE_LIMIT_24H` | `3` | Maximum enrollments per `(claim.Subject, claim.Issuer)` pair in any rolling 24-hour window. Catches a compromised Connector signing key issuing many DIFFERENT valid challenges for the same device. Default 3 covers legitimate first-cert + recovery + post-wipe re-enrollment. Zero disables the limiter (not recommended for production). |
|
||||
|
||||
---
|
||||
|
||||
|
||||
+73
-4
@@ -420,19 +420,88 @@ challenge+mTLS:
|
||||
the password requirement doesn't go away — the password is still
|
||||
the application-layer auth boundary).
|
||||
|
||||
### Microsoft Intune dynamic-challenge dispatcher (Phase 8, opt-in)
|
||||
|
||||
When SCEP sits behind the Microsoft Intune Certificate Connector, devices
|
||||
present an Intune-issued signed challenge (a JWT-like blob over a JSON
|
||||
claim payload) instead of the static `_CHALLENGE_PASSWORD`. Phase 8 wires
|
||||
a per-profile dispatcher that validates these signed challenges against
|
||||
the Connector's signing-cert trust anchor and binds the asserted device
|
||||
identity to the inbound CSR. Static challenge passwords still work as a
|
||||
fallback so heterogeneous fleets (some Intune-enrolled, some not) keep
|
||||
working.
|
||||
|
||||
**Per-profile env vars** (all default to off; legacy/static-only profiles
|
||||
need no changes):
|
||||
|
||||
```
|
||||
CERTCTL_SCEP_PROFILE_<NAME>_INTUNE_ENABLED=true
|
||||
CERTCTL_SCEP_PROFILE_<NAME>_INTUNE_CONNECTOR_CERT_PATH=/etc/certctl/intune-corp.pem
|
||||
CERTCTL_SCEP_PROFILE_<NAME>_INTUNE_AUDIENCE=https://certctl.example.com/scep/corp
|
||||
CERTCTL_SCEP_PROFILE_<NAME>_INTUNE_CHALLENGE_VALIDITY=60m
|
||||
CERTCTL_SCEP_PROFILE_<NAME>_INTUNE_PER_DEVICE_RATE_LIMIT_24H=3
|
||||
```
|
||||
|
||||
**Trust-anchor extraction:** the operator extracts the Connector
|
||||
installation's signing cert (from the Connector's certificate store on
|
||||
the Windows host running the Connector — Microsoft does not publish a
|
||||
direct download) and writes a PEM bundle to the configured path.
|
||||
Multiple Connectors in HA = concatenate their certs.
|
||||
|
||||
**Trust-anchor reload:** the holder re-reads the bundle on `SIGHUP` (the
|
||||
same signal that rotates the server's TLS cert). A bad reload (parse
|
||||
error, expired cert) keeps the OLD pool in place — operators get a
|
||||
recoverable failure window rather than a service-down. Rotate the file
|
||||
on disk, then `kill -HUP <certctl-pid>` to apply with no restart.
|
||||
|
||||
**Replay protection:** in-memory cache of seen challenge nonces with TTL
|
||||
= `_CHALLENGE_VALIDITY` (default 60m). Sized for 100k entries, which
|
||||
covers a ~25 RPS Intune fleet's steady-state. The same challenge
|
||||
submitted twice within the TTL is rejected with `ErrChallengeReplay`.
|
||||
|
||||
**Per-device rate limit:** sliding-window-log limiter keyed by
|
||||
`(claim.Subject, claim.Issuer)`. Default 3 enrollments per 24h covers
|
||||
legitimate first-cert + recovery + post-wipe re-enrollment but blocks a
|
||||
compromised Connector signing key from issuing many DIFFERENT valid
|
||||
challenges for the same device. Set the var to `0` to disable.
|
||||
|
||||
**Audit + observability:** Intune enrollments emit
|
||||
`audit_event.action="scep_pkcsreq_intune"` (or
|
||||
`"scep_renewalreq_intune"`) so operators can grep the audit log to count
|
||||
Intune-vs-static enrollments. Per-failure-mode reason flows into the log
|
||||
line; the metric label set is `success / signature_invalid / expired /
|
||||
not_yet_valid / wrong_audience / replay / rate_limited / claim_mismatch
|
||||
/ unknown_version / malformed`.
|
||||
|
||||
**Compliance-state hook (V3-Pro plug-in seam):** a nil-default
|
||||
`ComplianceCheck` field on `SCEPService` lets a future Pro module plug
|
||||
in a Microsoft Graph compliance API call between challenge validation
|
||||
and certificate issuance. V2 ships the seam (one struct field + one
|
||||
setter + one nil-guarded call site) so Pro is plug-in code, not a
|
||||
dispatcher refactor.
|
||||
|
||||
**Mixed-mode (recommended):** keep `_CHALLENGE_PASSWORD` set even when
|
||||
Intune is enabled. Devices that don't go through Intune (manual
|
||||
enrollment, on-prem MDM bridges) continue to enroll via the static path;
|
||||
the dispatcher routes Intune-shaped challenges (length > 200 + exactly
|
||||
two dots) to the validator and falls through to the static compare
|
||||
otherwise.
|
||||
|
||||
### Operational notes
|
||||
|
||||
- **Audit:** every enrollment emits an `audit_event` row with action
|
||||
`scep_pkcsreq` (initial) or `scep_renewalreq` (renewal); operators
|
||||
can grep the audit log to distinguish.
|
||||
can grep the audit log to distinguish. Intune-dispatched enrollments
|
||||
use `scep_pkcsreq_intune` and `scep_renewalreq_intune` respectively.
|
||||
- **Body-size cap:** `http.MaxBytesReader` middleware caps request
|
||||
bodies at `CERTCTL_MAX_BODY_SIZE` (default 1MB); SCEP PKIMessages are
|
||||
typically <50KB so the default cap is generous.
|
||||
- **HTTPS-only:** the SCEP endpoint inherits the TLS-1.3-pinned control
|
||||
plane; there is no plaintext fallback.
|
||||
- **Forward reference:** for Microsoft Intune deployments specifically,
|
||||
see [`scep-intune.md`](scep-intune.md) (the doc Phase 11 of the
|
||||
master bundle ships).
|
||||
- **Forward reference:** for the deeper Intune integration writeup
|
||||
(architecture, migration playbook, troubleshooting,
|
||||
Microsoft-support-statement), see [`scep-intune.md`](scep-intune.md)
|
||||
(Phase 11 of the master bundle).
|
||||
|
||||
## Related docs
|
||||
|
||||
|
||||
Reference in New Issue
Block a user