fix(scep-intune): close 11 audit gaps from 2026-04-29 pre-tag review

Closes the eleven gaps identified in the pre-v2.1.0 audit of the SCEP
RFC 8894 + Intune master bundle (cowork/scep-bundle-gap-closure-prompt.md).
Constitutional rule from cowork/CLAUDE.md::Operating Rules — 'Always
take the complete path, not the easy path' — drove this closure: each
gap was a load-bearing wire that crossed multiple layers (config →
validator → service wire-up → tests → docs) and shipping the bundle
without them would have produced lying-field footguns where operator-
visible config options stored values without affecting behavior.

WHAT LANDS:

Phase A — Clock-skew tolerance (master prompt §15 hazard closure)
  internal/scep/intune/challenge.go: ValidateChallenge migrated from
  positional args to ValidateOptions{} struct; new ClockSkewTolerance
  field with default 0 (strict). 24 call sites updated mechanically.
  Applied at both bounds: now+tolerance >= iat AND now-tolerance < exp.
  internal/config/config.go: SCEPIntuneProfileConfig.ClockSkewTolerance
  default 60s + Validate() refusal when >= ChallengeValidity.
  cmd/server/main.go: SetIntuneIntegration signature extended;
  per-profile env-var loader honors CERTCTL_SCEP_PROFILE_<NAME>_INTUNE_CLOCK_SKEW_TOLERANCE.
  internal/service/scep.go: intuneClockSkew field + IntuneStatsSnapshot
  surfaces clock_skew_tolerance_ns. web/src/api/types.ts mirrors.
  4 new tests in challenge_test.go covering accept-within-tolerance,
  reject-beyond-tolerance, accept-expired-within-tolerance,
  negative-treated-as-zero defensive normalization.
  docs/scep-intune.md updated with the new env var + time-bounds rule.
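
  The Validate() refusal above can be sketched as follows. This is a minimal
  illustration only: intuneProfile and validate are hypothetical stand-ins,
  and the real SCEPIntuneProfileConfig fields and error text may differ.

```go
package main

import (
	"fmt"
	"time"
)

// intuneProfile is a hypothetical stand-in for SCEPIntuneProfileConfig;
// only the two fields relevant to the skew rule are modelled.
type intuneProfile struct {
	ChallengeValidity  time.Duration
	ClockSkewTolerance time.Duration
}

// validate mirrors the rule stated in this commit: refuse a tolerance
// that is greater than or equal to the challenge validity.
func (p intuneProfile) validate() error {
	if p.ClockSkewTolerance >= p.ChallengeValidity {
		return fmt.Errorf("clock skew tolerance %s must be smaller than challenge validity %s",
			p.ClockSkewTolerance, p.ChallengeValidity)
	}
	return nil
}

func main() {
	ok := intuneProfile{ChallengeValidity: time.Hour, ClockSkewTolerance: 60 * time.Second}
	bad := intuneProfile{ChallengeValidity: time.Minute, ClockSkewTolerance: 2 * time.Minute}
	fmt.Println("60s vs 1h:", ok.validate())
	fmt.Println("2m vs 1m:", bad.validate())
}
```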

Phase B — unknown-version-rejected golden test
  internal/scep/intune/golden_helper_test.go: goldenUnknownVersionPayload
  helper + signGoldenChallengeAny generic signer.
  challenge_golden_test.go: TestGoldenChallenge_UnknownVersionRejected
  uses an in-process ECDSA fixture (the on-disk PEM was generated under
  a Go stdlib version whose ecdsa.GenerateKey output differs from what
  the current toolchain produces). TestRegenerateGoldenFixtures emits the new
  unknown_version fixture file too.
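
  A rough sketch of the in-process fixture idea: the key is generated and
  used within the same run, so nothing depends on bytes stored on disk.
  signAndVerify is a hypothetical helper, not the repository's
  signGoldenChallengeAny, and it uses ASN.1 signatures for brevity where a
  JWT ES256 signer would use the raw r||s encoding.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/sha256"
	"fmt"
)

// signAndVerify generates a throwaway P-256 key in-process, signs msg,
// and verifies the signature against the matching public key.
func signAndVerify(msg []byte) (bool, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return false, err
	}
	digest := sha256.Sum256(msg)
	sig, err := ecdsa.SignASN1(rand.Reader, key, digest[:])
	if err != nil {
		return false, err
	}
	return ecdsa.VerifyASN1(&key.PublicKey, digest[:], sig), nil
}

func main() {
	ok, err := signAndVerify([]byte(`{"version":"v99"}`))
	fmt.Println(ok, err)
}
```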

Phase C — Two named Intune e2e tests
  internal/api/handler/scep_intune_e2e_test.go:
    TestSCEPIntuneEnrollment_RateLimited_E2E (cap=2 + 3 attempts; 3rd
    returns FAILURE+badRequest with rate_limited counter ticked)
    TestSCEPIntuneEnrollment_TrustAnchorSIGHUPReload_E2E (rotate
    on-disk PEM + holder.Reload(); old-key challenge fails with
    badMessageCheck; signature_invalid counter ticked)
  intuneE2EFixture struct extended with trustHolder + trustPath fields
  so tests can rotate.
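
  The cap=2 / 3-attempt shape of the rate-limit test can be illustrated
  with a toy limiter. attemptCap is hypothetical and deliberately has no
  time window; the real limiter's keying and windowing are not shown in
  this commit.

```go
package main

import (
	"fmt"
	"sync"
)

// attemptCap is a hypothetical per-key attempt cap with a counter for
// rejected attempts, loosely modelling the rate_limited counter.
type attemptCap struct {
	mu          sync.Mutex
	limit       int
	attempts    map[string]int
	rateLimited uint64
}

func newAttemptCap(limit int) *attemptCap {
	return &attemptCap{limit: limit, attempts: make(map[string]int)}
}

// allow records one attempt for key and reports whether it is under the
// cap. Rejected attempts tick rateLimited and are not recorded.
func (a *attemptCap) allow(key string) bool {
	a.mu.Lock()
	defer a.mu.Unlock()
	if a.attempts[key] >= a.limit {
		a.rateLimited++
		return false
	}
	a.attempts[key]++
	return true
}

func main() {
	limiter := newAttemptCap(2)
	for i := 1; i <= 3; i++ {
		fmt.Printf("attempt %d allowed=%v\n", i, limiter.allow("device-1"))
	}
	fmt.Println("rate_limited =", limiter.rateLimited)
}
```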

Phase D — Four new ChromeOS hermetic tests (10 total now)
  internal/api/handler/scep_chromeos_test.go:
    _RAKeyMismatch — PKIMessage encrypted to wrong RA cert; handler
      rejects without reaching service.
    _3DESBackwardCompat — RFC 8894 §3.5.2 legacy fallback verified.
    _RSACSR + _ECDSACSR — explicit matrix-pair pinning.
  buildTestECDSACSR helper for ECDSA P-256 CSR construction;
  tripleDESCBCEncrypt mirrors aesCBCEncrypt for 3DES-CBC;
  assertChromeOSPositiveCertRep shared assertion.
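
  For orientation, a 3DES-CBC test helper along these lines can be built
  from the standard library alone. The signature and body below are
  assumptions (the commit only names tripleDESCBCEncrypt), and the
  decrypt counterpart exists here purely to show a round trip.

```go
package main

import (
	"bytes"
	"crypto/cipher"
	"crypto/des"
	"errors"
	"fmt"
)

// pkcs7Pad returns a copy of data padded to a multiple of blockSize.
func pkcs7Pad(data []byte, blockSize int) []byte {
	n := blockSize - len(data)%blockSize
	out := append([]byte{}, data...)
	return append(out, bytes.Repeat([]byte{byte(n)}, n)...)
}

// tripleDESCBCEncrypt encrypts plaintext with 3DES-CBC using a 24-byte
// key and an 8-byte IV (assumed signature, for illustration only).
func tripleDESCBCEncrypt(key, iv, plaintext []byte) ([]byte, error) {
	block, err := des.NewTripleDESCipher(key)
	if err != nil {
		return nil, err
	}
	if len(iv) != block.BlockSize() {
		return nil, errors.New("iv must be exactly one block long")
	}
	padded := pkcs7Pad(plaintext, block.BlockSize())
	out := make([]byte, len(padded))
	cipher.NewCBCEncrypter(block, iv).CryptBlocks(out, padded)
	return out, nil
}

// tripleDESCBCDecrypt reverses tripleDESCBCEncrypt and strips the padding.
func tripleDESCBCDecrypt(key, iv, ciphertext []byte) ([]byte, error) {
	block, err := des.NewTripleDESCipher(key)
	if err != nil {
		return nil, err
	}
	bs := block.BlockSize()
	if len(iv) != bs || len(ciphertext) == 0 || len(ciphertext)%bs != 0 {
		return nil, errors.New("bad iv or ciphertext length")
	}
	out := make([]byte, len(ciphertext))
	cipher.NewCBCDecrypter(block, iv).CryptBlocks(out, ciphertext)
	n := int(out[len(out)-1])
	if n == 0 || n > bs {
		return nil, errors.New("bad padding")
	}
	return out[:len(out)-n], nil
}

func main() {
	key := []byte("0123456789abcdefghijklmn") // 24 bytes, test-only key
	iv := []byte("12345678")                  // 8 bytes, test-only IV
	ct, err := tripleDESCBCEncrypt(key, iv, []byte("hello scep"))
	if err != nil {
		panic(err)
	}
	pt, err := tripleDESCBCDecrypt(key, iv, ct)
	fmt.Println(len(ct), string(pt), err)
}
```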

Phase E — Per-profile counter isolation test
  internal/api/handler/scep_profile_counter_isolation_test.go:
    TestSCEPHandler_PerProfileIntuneCountersIsolated wires two
    SCEPService instances + drives distinct PKIMessages + asserts
    counter isolation. Guards against a future cmd/server/main.go
    refactor that shares a *intuneCounterTab across profiles.
  buildPerProfileIntuneFixture parameterized helper.
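
  The property under test can be shown in miniature. counterTab and
  profile are hypothetical stand-ins for the real intuneCounterTab and
  SCEPService wiring; the point is only that each profile owns its own
  table rather than sharing one pointer.

```go
package main

import (
	"fmt"
	"sync"
)

// counterTab is a hypothetical labelled counter table.
type counterTab struct {
	mu     sync.Mutex
	counts map[string]uint64
}

func newCounterTab() *counterTab {
	return &counterTab{counts: make(map[string]uint64)}
}

func (c *counterTab) inc(label string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[label]++
}

func (c *counterTab) get(label string) uint64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[label]
}

// profile models one SCEP profile with its own counter table.
type profile struct {
	name     string
	counters *counterTab
}

// newProfile allocates a fresh table per profile; passing one shared
// *counterTab to several profiles is the regression being guarded against.
func newProfile(name string) *profile {
	return &profile{name: name, counters: newCounterTab()}
}

func main() {
	corp, byod := newProfile("corp"), newProfile("byod")
	corp.counters.inc("signature_invalid")
	fmt.Println(corp.counters.get("signature_invalid"), byod.counters.get("signature_invalid"))
}
```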

Phase F — Server-boot regression tests
  cmd/server/preflight_scep_intune_test.go: 3 named tests covering
  disabled-backward-compat, broken-config-with-PathID, expired-cert
  refusal. preflightSCEPIntuneTrustAnchor signature extended with
  pathID arg so error messages carry PathID= for operator log-grep.

Phase G — docs/connectors.md
  Four new subsections under §EST/SCEP Integration: multi-profile
  dispatch + mTLS sibling route + Intune Connector dispatcher + SCEP
  probe in network scanner. Each has a one-paragraph operator
  explanation + an env-var or endpoint table.

Phase H — Coverage uplift
  internal/service/scep_probe_persist_test.go: 5 unit tests on
  persistProbeResult (nil-safe + nil-repo-safe + repo-error swallow +
  nil-logger guard) + ListRecentSCEPProbes (empty-slice-not-nil + repo
  pass-through) + describeCertAlgorithm (RSA/ECDSA/QF1008-nil-curve
  defensive branch/Ed25519/DSA/empty). CI gates (service ≥70, handler
  ≥75) PASS at 70.9% / 79.3%.
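
  On the empty-slice-not-nil case: one common reason to pin it (an
  assumption here, the commit does not state the motive) is that a nil
  slice serializes to JSON null while an empty slice serializes to [].
  probe and listRecent below are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// probe is a hypothetical stand-in for a persisted SCEP probe result.
type probe struct {
	Target string `json:"target"`
}

// listRecent normalizes a nil result to an empty, non-nil slice.
func listRecent(rows []probe) []probe {
	if rows == nil {
		return []probe{}
	}
	return rows
}

// encode marshals v to a JSON string, returning "" on error.
func encode(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	var none []probe
	fmt.Println(encode(none))             // nil slice
	fmt.Println(encode(listRecent(none))) // normalized slice
}
```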

Phase I — deploy/test integration variant
  deploy/test/scep_intune_e2e_test.go (//go:build integration):
    TestSCEPIntuneEnrollment_Integration + _RateLimited_Integration
    against the live docker-compose certctl container. Skip-when-
    stack-missing semantics so sandbox + CI both work.
  deploy/docker-compose.test.yml: new e2eintune SCEP profile env
  vars + bind-mount of deploy/test/fixtures/.
  deploy/test/fixtures/README.md: documents the deterministic trust
  anchor regeneration recipe.

VERIFICATION (sandbox):
  gofmt -d        — clean for all changed files
  staticcheck     — clean for intune + handler + config + service +
                    cmd/server packages
  go vet          — clean for the same packages
  go test -short  — green for intune (95.3% cov), service (70.9%),
                    handler (79.3%), config (94.0%), cmd/server (boot
                    path; my preflight tests cover the directly-
                    testable function), pkcs7 (80.5% informational)

DEFERRED (per closure prompt §7 out-of-scope):
  - V3-Pro Conditional Access gating + Microsoft Graph integration
  - Standalone certctl-scan CLI binary
  - OCSP rate-limiting, OCSP stapling, delta CRLs

Spec preserved at cowork/scep-bundle-gap-closure-prompt.md;
journal at cowork/scep-rfc8894-intune/progress.md (audit-closure
section appended).
Shankar
2026-04-29 20:28:53 +00:00
parent 9fcea95708
commit 444942eab8
20 changed files with 2143 additions and 74 deletions
+78 -19
@@ -166,6 +166,56 @@ func unmarshalChallengeV1(payload []byte) (*ChallengeClaim, error) {
return c, nil
}
// ValidateOptions parameterizes ValidateChallenge. Introduced in the
// 2026-04-29 SCEP RFC 8894 + Intune master-prompt §15 hazard closure
// to add a configurable clock-skew tolerance without continuing to
// pile positional arguments onto the validator. Future per-validation
// knobs (e.g. an explicit version allow-list, a custom sig-alg policy)
// land here without churning every call site.
//
// Field defaults via the zero value MUST preserve the strict pre-§15
// behavior — i.e. a caller that passes ValidateOptions{Trust: ..., Now: ...}
// with no other fields gets exactly the iat/exp/audience semantics that
// shipped before the tolerance was introduced. This is a load-bearing
// contract for the existing test suite and any out-of-tree caller that
// hasn't migrated to opt-in tolerance.
type ValidateOptions struct {
// Trust is the pool of operator-supplied Connector signing-cert public
// keys to verify the challenge signature against. Required (an empty
// pool returns ErrChallengeSignature with a "no trust anchors
// configured" message so the operator boot-time misconfig is
// distinguishable from an in-the-wild signature mismatch).
Trust []*x509.Certificate
// ExpectedAudience is the SCEP endpoint URL the challenge's "aud"
// claim is expected to match. Empty disables the audience check
// (proxy / load-balancer scenarios where the URL the Connector saw
// differs from the URL we see, plus test convenience).
ExpectedAudience string
// Now is the wall-clock time used for the iat/exp comparisons.
// Injected (rather than read from time.Now() inside the function) so
// tests are deterministic and the per-profile dispatcher can pin a
// single "request started at" timestamp across the validate + replay
// + rate-limit triplet.
Now time.Time
// ClockSkewTolerance widens the iat/exp window by the tolerance at both
// ends to absorb modest clock drift between the Microsoft Intune
// Certificate Connector and the certctl host. The zero value preserves
// strict pre-§15 behaviour. Operators wire this from the per-profile env
// var CERTCTL_SCEP_PROFILE_<NAME>_INTUNE_CLOCK_SKEW_TOLERANCE, whose
// config-level default is 60s (see internal/config/config.go).
//
// Application at the two bounds: an iat in the future is accepted when
// `now + tolerance >= iat` (so a Connector clock 30s ahead of certctl
// passes with tolerance=60s). An exp in the past is accepted when
// `now - tolerance < exp` (so a Connector clock 30s behind certctl
// passes too). Negative tolerance is treated as zero (a defensive
// no-op rather than a footgun that tightens the window).
ClockSkewTolerance time.Duration
}
// ValidateChallenge runs the full Intune-challenge validation pipeline:
//
// 1. ParseChallenge(raw) — JWT compact deserialize
@@ -173,9 +223,10 @@ func unmarshalChallengeV1(payload []byte) (*ChallengeClaim, error) {
// trust-anchor cert's public key (try each until one verifies)
// 3. Extract version claim via the lightweight versioned-prelude
// 4. Dispatch to the per-version unmarshaler (v1 today)
// 5. Time bounds: now ≥ iat AND now < exp (with stdlib RFC 3339 grace)
// 6. Audience: claim.Audience == expectedAudience (when expectedAudience
// is non-empty; empty disables the check, useful for tests)
// 5. Time bounds: now+tolerance ≥ iat AND now-tolerance < exp
// (tolerance defaults to zero — strict — and widens via opts)
// 6. Audience: claim.Audience == opts.ExpectedAudience (skipped when
// ExpectedAudience is empty or the claim carries no audience)
//
// Returns *ChallengeClaim on success, typed error on failure (caller can
// errors.Is the specific dimension).
@@ -184,8 +235,8 @@ func unmarshalChallengeV1(payload []byte) (*ChallengeClaim, error) {
// claim's Nonce to a *ReplayCache.CheckAndInsert. We deliberately don't
// own the cache here so the validator stays stateless + testable; the
// handler glues parser + cache together.
func ValidateChallenge(raw string, trust []*x509.Certificate, expectedAudience string, now time.Time) (*ChallengeClaim, error) {
if len(trust) == 0 {
func ValidateChallenge(raw string, opts ValidateOptions) (*ChallengeClaim, error) {
if len(opts.Trust) == 0 {
return nil, fmt.Errorf("%w: no trust anchors configured", ErrChallengeSignature)
}
@@ -212,7 +263,7 @@ func ValidateChallenge(raw string, trust []*x509.Certificate, expectedAudience s
return nil, fmt.Errorf("%w: header JSON: %v", ErrChallengeMalformed, err)
}
if err := verifyChallengeSignature(hdr.Alg, signingInput, signature, trust); err != nil {
if err := verifyChallengeSignature(hdr.Alg, signingInput, signature, opts.Trust); err != nil {
return nil, err
}
@@ -230,26 +281,34 @@ func ValidateChallenge(raw string, trust []*x509.Certificate, expectedAudience s
return nil, err
}
// Time bounds. The Connector's signed iat/exp ARE authoritative;
// we don't impose a separate validity cap here (the operator can
// add one in the handler if defense-in-depth is wanted, e.g. via
// SCEPProfileConfig.IntuneChallengeValidity in Phase 8).
if !claim.IssuedAt.IsZero() && now.Before(claim.IssuedAt) {
return nil, fmt.Errorf("%w: iat=%s now=%s", ErrChallengeNotYetValid,
claim.IssuedAt.Format(time.RFC3339), now.Format(time.RFC3339))
// Time bounds. Tolerance defaults to zero (strict). A negative value is
// clamped to zero, as documented on ValidateOptions, so a misconfigured
// value is a defensive no-op that neither tightens nor widens the window.
tolerance := opts.ClockSkewTolerance
if tolerance < 0 {
tolerance = 0
}
if !claim.ExpiresAt.IsZero() && !now.Before(claim.ExpiresAt) {
return nil, fmt.Errorf("%w: exp=%s now=%s", ErrChallengeExpired,
claim.ExpiresAt.Format(time.RFC3339), now.Format(time.RFC3339))
now := opts.Now
// iat check: a future iat is accepted when (now + tolerance) >= iat.
// Equivalent to: reject when (now + tolerance) < iat.
if !claim.IssuedAt.IsZero() && now.Add(tolerance).Before(claim.IssuedAt) {
return nil, fmt.Errorf("%w: iat=%s now=%s tolerance=%s", ErrChallengeNotYetValid,
claim.IssuedAt.Format(time.RFC3339), now.Format(time.RFC3339), tolerance)
}
// exp check: a past exp is accepted when (now - tolerance) < exp.
// Equivalent to: reject when (now - tolerance) >= exp.
if !claim.ExpiresAt.IsZero() && !now.Add(-tolerance).Before(claim.ExpiresAt) {
return nil, fmt.Errorf("%w: exp=%s now=%s tolerance=%s", ErrChallengeExpired,
claim.ExpiresAt.Format(time.RFC3339), now.Format(time.RFC3339), tolerance)
}
// Audience binds the challenge to a specific SCEP endpoint URL. An
// empty expectedAudience disables the check (test convenience + the
// empty ExpectedAudience disables the check (test convenience + the
// Phase 8 config allows operator opt-out for proxy / load-balancer
// scenarios where the URL the Connector saw isn't the URL we see).
if expectedAudience != "" && claim.Audience != "" && claim.Audience != expectedAudience {
if opts.ExpectedAudience != "" && claim.Audience != "" && claim.Audience != opts.ExpectedAudience {
return nil, fmt.Errorf("%w: claim=%q expected=%q", ErrChallengeWrongAudience,
claim.Audience, expectedAudience)
claim.Audience, opts.ExpectedAudience)
}
return claim, nil
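
The tolerance-widened iat/exp rule can be exercised in isolation with a sketch like the one below. checkTimeBounds, checkWithConnectorOffset, and the two sentinel errors are hypothetical names introduced for illustration. The negative-tolerance handling follows the "treated as zero" wording in the ValidateOptions doc comment.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

var (
	errNotYetValid = errors.New("challenge not yet valid")
	errExpired     = errors.New("challenge expired")
)

// checkTimeBounds accepts when now+tolerance >= iat and now-tolerance < exp.
// A zero iat or exp skips that bound; a negative tolerance is clamped to zero.
func checkTimeBounds(now, iat, exp time.Time, tolerance time.Duration) error {
	if tolerance < 0 {
		tolerance = 0
	}
	if !iat.IsZero() && now.Add(tolerance).Before(iat) {
		return fmt.Errorf("%w: iat=%s now=%s tolerance=%s", errNotYetValid,
			iat.Format(time.RFC3339), now.Format(time.RFC3339), tolerance)
	}
	if !exp.IsZero() && !now.Add(-tolerance).Before(exp) {
		return fmt.Errorf("%w: exp=%s now=%s tolerance=%s", errExpired,
			exp.Format(time.RFC3339), now.Format(time.RFC3339), tolerance)
	}
	return nil
}

// checkWithConnectorOffset simulates a Connector whose clock differs from
// the host by offset (positive means ahead) and which stamps a one-hour
// challenge with its own notion of "now" as iat.
func checkWithConnectorOffset(offset, tolerance time.Duration) error {
	now := time.Date(2026, 4, 29, 12, 0, 0, 0, time.UTC)
	iat := now.Add(offset)
	return checkTimeBounds(now, iat, iat.Add(time.Hour), tolerance)
}

func main() {
	fmt.Println("30s ahead, strict:       ", checkWithConnectorOffset(30*time.Second, 0))
	fmt.Println("30s ahead, 60s tolerance:", checkWithConnectorOffset(30*time.Second, 60*time.Second))
	fmt.Println("30s ahead, -60s clamped: ", checkWithConnectorOffset(30*time.Second, -60*time.Second))
}
```

With strict settings the first call reports a not-yet-valid error, the second passes, and the third behaves like the first because the negative tolerance collapses to zero.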