Commit Graph

7 Commits

Author SHA1 Message Date
shankar0123 2b4d0069d9 security(scep-intune): annotate verifyES256/RS256 SHA-256 as RFC-mandated (CodeQL #21 false positive)
CodeQL alert #21 (go/weak-sensitive-data-hashing, severity: High)
flagged the sha256.Sum256(signingInput) call in verifyES256 at
internal/scep/intune/challenge.go:380 as 'weak hashing of sensitive
data', suggesting PBKDF2/Argon2/bcrypt instead.

This is a CodeQL false positive. The CodeQL query triggers when
SHA-256 is used near *x509.Certificate (the trust pool) and infers
'this might be password hashing.' But the actual context is JWS
signature verification:

  - verifyRS256 implements RFC 7518 §3.3 — 'RSASSA-PKCS1-v1_5
    using SHA-256'. SHA-256 is spec-mandated.
  - verifyES256 implements RFC 7518 §3.4 — 'ECDSA using P-256
    and SHA-256'. SHA-256 is spec-mandated.
  - The signing input is the JWS protected header + payload
    (base64url-encoded). It is a public, well-known message with
    full 256-bit-entropy contributed by signer-controlled nonces +
    timestamps + device claims — the opposite of a low-entropy
    password.
  - The output is verified against an asymmetric signature
    (rsa.VerifyPKCS1v15 / ecdsa.Verify), not compared to a
    pre-computed hash digest. This is signature verification,
    not password hashing.
  - Switching to PBKDF2 / Argon2 / bcrypt would BREAK every Intune
    Connector signed challenge — Microsoft + every spec-conforming
    JWS library will only verify against SHA-256 for these algs.

Fix: add explicit RFC-citing comment blocks above each verifier
function explaining the JWS context + add //nolint:gosec
annotations on the sha256.Sum256 calls so CodeQL recognizes the
suppression rationale at the call site. The annotation cites the
specific RFC clause (7518 §3.3 / §3.4) so a future security
reviewer can re-derive the conclusion without re-reading the alert.

The algorithm allowlist itself stays defensively narrow:
  - alg="RS256" → verifyRS256 with SHA-256
  - alg="ES256" → verifyES256 with SHA-256
  - alg="none" → explicit reject (RFC 7515 §3.6 attack vector)
  - any other alg → reject as unsupported

Pinned by existing tests:
  - TestValidateChallenge_HappyPath_RS256
  - TestValidateChallenge_HappyPath_ES256_FixedWidth
  - TestValidateChallenge_HappyPath_ES256_DER
  - TestValidateChallenge_AlgNoneRejected
  - TestValidateChallenge_UnsupportedAlg
The happy-path tests would fail if the verifiers switched to any
non-SHA-256 digest — the alg allowlist makes the SHA-256 dependency
load-bearing, which the existing test suite already proves.

Verified locally:
  gofmt: clean.
  go vet ./internal/scep/intune/...: exit 0.
  go test -short -count=1 ./internal/scep/intune/...: PASS
    (every existing challenge_test.go subtest still green).

Reference: https://github.com/certctl-io/certctl/security/code-scanning/21
Closes CodeQL alert #21 as a documented false positive — the
//nolint annotations + RFC-citing comments are the load-bearing
suppression. Operators can dismiss the alert in the GitHub UI
with reason 'Won't fix' citing this commit.
2026-05-04 05:08:02 +00:00
shankar0123 8b75e0311b chore: rename Go module path to github.com/certctl-io/certctl
Mechanical sed across the main go.mod's module declaration, the f5-mock-icontrol
sub-module's go.mod, every Go file's import path (361 files), and a rebuild of
the checked-in f5-mock-icontrol binary so its embedded build-info reflects the
new module path. No behavior change.

Choice B from cowork/transfer-certctl-to-org.md, executed 2026-05-04. Choice A
(keep module path declared as github.com/shankar0123/certctl regardless of
repo URL) shipped on the day of the org transfer (2026-05-03) since we had no
external Go consumers; this commit closes that deferral.

Backward-compat: GitHub HTTP redirects continue to forward
github.com/shankar0123/certctl → github.com/certctl-io/certctl at the URL
level, but Go's module proxy uses the path declared in go.mod as the
canonical name. Pre-fix, anyone trying `go get github.com/certctl-io/certctl/...`
hit a "module path mismatch" error because go.mod said
github.com/shankar0123/certctl and the URL they fetched it from said
certctl-io/certctl. Post-fix, the canonical name and the URL agree, so
go get / go install / external Go consumers / Go-tooling integrations
work cleanly via either the new path (preferred) or the old path (which
redirects and Go follows the redirect for source fetch).

Anyone still importing the old path inside their own code keeps working
provided they update their go.mod's `require` line to match — the module
path declared in their consumer's go.sum / go.mod is the authoritative
import name, so a mass sed across their import statements is the migration
on the consumer side. No external consumers exist today.

Diff shape:
  361 *.go files  — import path replacement only
    2 go.mod     — module declaration replacement only
    1 binary     — deploy/test/f5-mock-icontrol/f5-mock-icontrol rebuilt
                   so embedded build-info reflects the new path (8618965 vs
                   8618933 bytes; 32-byte diff is the build-info change)

  Total: 364 files, 730 insertions / 730 deletions, net-zero size, pure
  mechanical substitution.

Verification:
  gofmt: 17 files needed re-alignment after sed (the new path is one char
    shorter than the old, so column-aligned import groups drifted). Applied
    `gofmt -w` to fix.
  go mod tidy: clean exit on both modules.
  go vet ./...: clean exit.
  go build ./...: clean exit.
  go test -short -count=1 on representative packages: all green
    (internal/domain, internal/validation, internal/crypto, internal/crypto/signer,
    cmd/agent). Test output now reads `ok github.com/certctl-io/certctl/...`
    confirming the module path resolves correctly.
  binary: f5-mock-icontrol rebuilt; `strings | grep shankar0123` returns
    nothing; `strings | grep certctl-io/certctl` shows the new module path
    embedded in build-info.

Files intentionally NOT touched in this commit:
  README.md / CHANGELOG.md / docs/ / etc. — already swept to certctl-io
    URLs in commit 0729ee4 (the post-transfer URL refresh). This commit is
    purely the Go-tooling layer.
  Scarf pixels (`shankar0123.docker.scarf.sh/...`) — Scarf-account
    namespace, not a Go import or GitHub repo URL. Stays.

This is a non-blocking, non-customer-impacting change. Operators pulling
container images, running `make verify`, hitting the API, or installing the
agent see no functional difference. Only Go-tooling consumers (none today)
are affected, and they're enabled — not broken — by this commit.
2026-05-04 00:30:29 +00:00
shankar0123 aa139ee0d9 EST RFC 7030 hardening master bundle Phases 2-4: end-to-end mTLS sibling
route + RFC 9266 channel binding + HTTP Basic enrollment-password +
per-source-IP failed-auth limit + per-(CN, sourceIP) sliding-window cap.

Two new shared packages so EST + Intune share infrastructure:
- internal/cms/ — RFC 9266 tls-exporter extractor (ExtractTLSExporter
  with stdlib-panic recovery for synthetic ConnectionStates) +
  CSR-side channel-binding parser via raw TBSCertificationRequestInfo
  walk (the stdlib's csr.Attributes can't represent the OCTET STRING
  binding value), VerifyChannelBinding composite, EmbedChannel-
  BindingAttribute fixture helper, typed sentinel errors for missing
  / mismatch / not-TLS-1.3 mapped to HTTP 400 / 409 / 426 in handler.
- internal/trustanchor/ — extracted from scep/intune/trust_anchor*.go
  so the EST mTLS sibling route + Intune dispatcher share the same
  SIGHUP-reloadable PEM bundle primitive. intune.TrustAnchorHolder
  is now `= trustanchor.Holder` (type alias) + NewTrustAnchorHolder =
  trustanchor.New (function alias) — every existing call site compiles
  unchanged. Intune's LoadTrustAnchor is a thin wrapper over
  trustanchor.LoadBundle. White-box tests moved to the new package.
- internal/ratelimit/ — extracted from scep/intune/rate_limit.go (this
  was Phase 4.1, in the same bundle). intune.PerDeviceRateLimiter
  is now a thin wrapper preserving the (subject, issuer)→key
  composition; EST handler reaches for SlidingWindowLimiter directly.

ESTHandler grew six optional fields wired by per-profile setters
(SetMTLSTrust / SetChannelBindingRequired / SetEnrollmentPassword /
SetSourceIPRateLimiter / SetPerPrincipalRateLimiter / SetLabelForLog)
plus four new mTLS-route methods (CACertsMTLS / SimpleEnrollMTLS /
SimpleReEnrollMTLS / CSRAttrsMTLS); shared internal pipeline
handleEnrollOrReEnroll(reEnroll, viaMTLS) keeps the auth/binding/
rate-limit gates DRY. New router method RegisterESTMTLSHandlers
registers /.well-known/est-mtls/<PathID>/{cacerts,simpleenroll,
simplereenroll,csrattrs}; AuthExemptDispatchPrefixes extends the
no-auth chain to /.well-known/est-mtls.

cmd/server/main.go's EST loop wires per-profile mTLS holder +
channel-binding policy + per-principal limiter + (when EnrollmentPassword
non-empty) Basic + source-IP limiter; new preflightESTMTLSClientCATrust-
Bundle returns *trustanchor.Holder so SIGHUP rotates the EST mTLS
bundle live without restart. SCEP + EST mTLS profiles now share a
single union mtlsUnionPoolForTLS passed to buildServerTLSConfigWithMTLS
(replaces the protocol-specific scepMTLSUnionPoolForTLS); per-handler
re-verify enforces "cert must chain to THIS profile's bundle" so
cross-protocol bleed is blocked at the application layer even though
the TLS layer trusts certs from either pool's union.

Phase 3.3 source-IP failed-Basic limiter defaults: 10 attempts / 1h
/ 50k tracked IPs (no env var; tunable in a follow-up). Phase 4.2
per-principal limiter cap from CERTCTL_EST_PROFILE_<NAME>_RATE_
LIMIT_PER_PRINCIPAL_24H (existing field, Phase 1 shipped).

New tests:
- internal/cms/channelbinding_test.go: extractor + CSR-side parser +
  composite + TLS-1.3 round-trip end-to-end + EmbedChannelBinding-
  Attribute round-trip
- internal/trustanchor/holder_test.go: parseBundlePEM white-box +
  LoadBundle + Holder Get/Pool/SetLabelForLog/Reload-happy/
  Reload-keeps-old-on-failure/Reload-keeps-old-on-expired/
  WatchSIGHUP-reloads-pool/WatchSIGHUP-stop-clean
- internal/api/handler/est_hardening_test.go: 16 named cases covering
  mTLS no-trust-pool 500 + no-cert 401 + cross-profile cert 401 +
  happy-path 200 + CACertsMTLS auth gate + CSRAttrsMTLS auth gate +
  channel-binding required-absent-rejected + not-required-absent-
  allowed + writeChannelBindingError mapping + Basic no-header 401
  + Basic wrong-password 401 + Basic correct-200 + Basic-no-password
  no-gate + per-IP failed-attempt lockout 429 + per-principal
  blocks-after-cap + different-principals-independent + no-limiter-
  unbounded.

Pre-commit verification (sandbox): gofmt clean, go vet clean
(excluding repository/postgres which the sandbox can't build —
disk-space testcontainers download), staticcheck clean for
cms/trustanchor/api/handler/api/router/scep/intune/ratelimit/
cmd/server, go test -short -count=1 green for cms/trustanchor/
api/handler/api/router/scep/intune/ratelimit/service. G-3
docs-drift guard reproduced locally clean (Phase 1 already
documented every new env var; Phases 2-4 added zero new env vars).
2026-04-29 23:15:35 +00:00
shankar0123 530593507b fix(scep-intune): close 11 audit gaps from 2026-04-29 pre-tag review
Closes the eleven gaps identified in the pre-v2.1.0 audit of the SCEP
RFC 8894 + Intune master bundle (cowork/scep-bundle-gap-closure-prompt.md).
Constitutional rule from cowork/CLAUDE.md::Operating Rules — 'Always
take the complete path, not the easy path' — drove this closure: each
gap was a load-bearing wire that crossed multiple layers (config →
validator → service wire-up → tests → docs) and shipping the bundle
without them would have produced lying-field footguns where operator-
visible config options stored values without affecting behavior.

WHAT LANDS:

Phase A — Clock-skew tolerance (master prompt §15 hazard closure)
  internal/scep/intune/challenge.go: ValidateChallenge migrated from
  positional args to ValidateOptions{} struct; new ClockSkewTolerance
  field with default 0 (strict). 24 call sites updated mechanically.
  Asymmetric application: now+tolerance >= iat AND now-tolerance < exp.
  internal/config/config.go: SCEPIntuneProfileConfig.ClockSkewTolerance
  default 60s + Validate() refusal when >= ChallengeValidity.
  cmd/server/main.go: SetIntuneIntegration signature extended;
  per-profile env-var loader honors CERTCTL_SCEP_PROFILE_<NAME>_INTUNE_CLOCK_SKEW_TOLERANCE.
  internal/service/scep.go: intuneClockSkew field + IntuneStatsSnapshot
  surfaces clock_skew_tolerance_ns. web/src/api/types.ts mirrors.
  4 new tests in challenge_test.go covering accept-within-tolerance,
  reject-beyond-tolerance, accept-expired-within-tolerance,
  negative-treated-as-zero defensive normalization.
  docs/scep-intune.md updated with the new env var + time-bounds rule.

Phase B — unknown-version-rejected golden test
  internal/scep/intune/golden_helper_test.go: goldenUnknownVersionPayload
  helper + signGoldenChallengeAny generic signer.
  challenge_golden_test.go: TestGoldenChallenge_UnknownVersionRejected
  uses an in-process ECDSA fixture (the on-disk PEM was generated with
  a Go-stdlib version that produces different ecdsa.GenerateKey bytes
  from the current call). TestRegenerateGoldenFixtures emits the new
  unknown_version fixture file too.

Phase C — Two named Intune e2e tests
  internal/api/handler/scep_intune_e2e_test.go:
    TestSCEPIntuneEnrollment_RateLimited_E2E (cap=2 + 3 attempts; 3rd
    returns FAILURE+badRequest with rate_limited counter ticked)
    TestSCEPIntuneEnrollment_TrustAnchorSIGHUPReload_E2E (rotate
    on-disk PEM + holder.Reload(); old-key challenge fails with
    badMessageCheck; signature_invalid counter ticked)
  intuneE2EFixture struct extended with trustHolder + trustPath fields
  so tests can rotate.

Phase D — Four new ChromeOS hermetic tests (10 total now)
  internal/api/handler/scep_chromeos_test.go:
    _RAKeyMismatch — PKIMessage encrypted to wrong RA cert; handler
      rejects without reaching service.
    _3DESBackwardCompat — RFC 8894 §3.5.2 legacy fallback verified.
    _RSACSR + _ECDSACSR — explicit matrix-pair pinning.
  buildTestECDSACSR helper for ECDSA P-256 CSR construction;
  tripleDESCBCEncrypt mirrors aesCBCEncrypt for 3DES-CBC;
  assertChromeOSPositiveCertRep shared assertion.

Phase E — Per-profile counter isolation test
  internal/api/handler/scep_profile_counter_isolation_test.go:
    TestSCEPHandler_PerProfileIntuneCountersIsolated wires two
    SCEPService instances + drives distinct PKIMessages + asserts
    counter isolation. Guards against a future cmd/server/main.go
    refactor that shares a *intuneCounterTab across profiles.
  buildPerProfileIntuneFixture parameterized helper.

Phase F — Server-boot regression tests
  cmd/server/preflight_scep_intune_test.go: 3 named tests covering
  disabled-backward-compat, broken-config-with-PathID, expired-cert
  refusal. preflightSCEPIntuneTrustAnchor signature extended with
  pathID arg so error messages carry PathID= for operator log-grep.

Phase G — docs/connectors.md
  Four new subsections under §EST/SCEP Integration: multi-profile
  dispatch + mTLS sibling route + Intune Connector dispatcher + SCEP
  probe in network scanner. Each has a one-paragraph operator
  explanation + an env-var or endpoint table.

Phase H — Coverage uplift
  internal/service/scep_probe_persist_test.go: 5 unit tests on
  persistProbeResult (nil-safe + nil-repo-safe + repo-error swallow +
  nil-logger guard) + ListRecentSCEPProbes (empty-slice-not-nil + repo
  pass-through) + describeCertAlgorithm (RSA/ECDSA/QF1008-nil-curve
  defensive branch/Ed25519/DSA/empty). CI gates (service ≥70, handler
  ≥75) PASS at 70.9% / 79.3%.

Phase I — deploy/test integration variant
  deploy/test/scep_intune_e2e_test.go (//go:build integration):
    TestSCEPIntuneEnrollment_Integration + _RateLimited_Integration
    against the live docker-compose certctl container. Skip-when-
    stack-missing semantics so sandbox + CI both work.
  deploy/docker-compose.test.yml: new e2eintune SCEP profile env
  vars + bind-mount of deploy/test/fixtures/.
  deploy/test/fixtures/README.md: documents the deterministic trust
  anchor regeneration recipe.

VERIFICATION (sandbox):
  gofmt -d        — clean for all changed files
  staticcheck     — clean for intune + handler + config + service +
                    cmd/server packages
  go vet          — clean for the same packages
  go test -short  — green for intune (95.3% cov), service (70.9%),
                    handler (79.3%), config (94.0%), cmd/server (boot
                    path; my preflight tests cover the directly-
                    testable function), pkcs7 (80.5% informational)

DEFERRED (per closure prompt §7 out-of-scope):
  - V3-Pro Conditional Access gating + Microsoft Graph integration
  - Standalone certctl-scan CLI binary
  - OCSP rate-limiting, OCSP stapling, delta CRLs

Spec preserved at cowork/scep-bundle-gap-closure-prompt.md;
journal at cowork/scep-rfc8894-intune/progress.md (audit-closure
section appended).
2026-04-29 20:28:53 +00:00
shankar0123 e0d00717c7 feat(scep-intune): golden-file tests + e2e harness against fixture trust anchor
Phase 10 of the SCEP RFC 8894 + Intune master bundle. Adds reproducible
testdata fixtures + a hermetic end-to-end test that exercises the full
handler → service → dispatcher → CertRep wire path.

Phase 10.1 — Golden-file tests (internal/scep/intune/):

  * testdata/intune_trust_anchor.pem — deterministic ECDSA P-256 cert
    seeded from a constant byte string (sha256-derived PRNG); regenerates
    byte-identical PEM bytes across runs.
  * testdata/intune_challenge_golden_success.txt — valid challenge,
    iat/exp window covers goldenChallengeNow.
  * testdata/intune_challenge_golden_expired.txt — same trust anchor +
    payload shape but iat/exp shifted into the past.
  * testdata/intune_challenge_golden_tampered_sig.txt — payload bytes
    intact, last sig byte flipped.

  challenge_golden_test.go reads each fixture and asserts:
    - Success → ValidateChallenge returns a populated claim
      (DeviceName / Subject / SANDNS pinned to the documented values).
    - Expired → errors.Is(err, ErrChallengeExpired).
    - Tampered → errors.Is(err, ErrChallengeSignature).
    - Plus two defensive permutations: WrongAudienceReuse pins the
      audience-check ordering after a successful sig verify;
      RotatedTrustAnchorRejects pins the holder-rotation failure mode
      using a freshly-generated unrelated trust cert.

  golden_helper_test.go contains the deterministic-PRNG, ES256 signer,
  fixture-load helpers, and the regeneration target. Operators flip
  fixtures via:
    go test -run='^TestRegenerateGoldenFixtures$'             ./internal/scep/intune/... -args -update-golden

  Why ECDSA + a deterministic seed: a hand-pasted base64 blob would
  break on every Go stdlib bump (json.Marshal field ordering, ASN.1
  encoding edge cases). Generating from a pinned seed gives
  reproducible PEM bytes; only the ECDSA signature suffix varies
  across regenerations (Go's stdlib doesn't expose RFC 6979
  deterministic-k cleanly), and ValidateChallenge re-verifies the
  signature on every read so it doesn't matter.

  intune package coverage: 95.2% (was 94.8%).

Phase 10.2 — Hermetic end-to-end test (internal/api/handler/scep_intune_e2e_test.go):

  Departs from the spec's deploy/test/ location because the handler
  package already has the chromeOS-shape PKIMessage builders (buildTestCSR
  / buildEnvelopedDataForTest / buildSignedDataForTest / aesCBCEncrypt /
  postPKIOperation). Putting the e2e test in the handler package lets it
  reuse those helpers AND run in the default 'go test ./...' sweep —
  every CI run exercises the full Intune dispatcher chain. The
  deploy/test/ location is reserved for a future docker-compose-driven
  variant that would mount a fixture trust anchor into the running
  container; this hermetic version proves the wire works without that
  dependency.

  intuneE2EFixture stands up:
    - A real Intune Connector signing keypair (ECDSA P-256) + cert
      written to a temp PEM file the TrustAnchorHolder loads at startup.
    - A real RA pair the SCEPHandler decrypts EnvelopedData with.
    - A fixture issuer connector (intuneE2EIssuerConnector) that
      records every IssueCertificate call + returns a deterministic
      child cert chained to a fixture CA. Implements the full
      IssuerConnector interface (IssueCertificate / RenewCertificate /
      RevokeCertificate / GenerateCRL / SignOCSPResponse / GetRenewalInfo)
      with the non-issuance methods stubbed.
    - A capturing AuditRepository that records every Create call so
      the test can assert action='scep_pkcsreq_intune' was emitted.
    - A real SCEPService with SetIntuneIntegration wired to a real
      ReplayCache + PerDeviceRateLimiter.

  Three test scenarios:

    1. TestSCEPIntuneEnrollment_E2E — the documented happy path. Forge
       a valid Intune-shaped challenge (ES256 signed, length > 200, two
       dots — satisfies looksIntuneShaped), build a CSR with CN matching
       the claim's device_name, POST through HandleSCEP, decode the
       CertRep, assert pkiStatus=SUCCESS + issuer.issued has one entry
       + audit log carries 'scep_pkcsreq_intune' + IntuneStats.counters[
       'success']==1.

    2. TestSCEPIntuneEnrollment_ClaimMismatchRejected_E2E — same setup
       but CSR CN is 'attacker-host.example.com'. Dispatcher must
       reject with CertRep FAILURE+BadRequest (mapIntuneErrorToFailInfo:
       ErrClaimCNMismatch → BadRequest), no issuance, IntuneStats
       counters['claim_mismatch']==1.

    3. TestSCEPIntuneEnrollment_TamperedSignature_E2E — flip a byte in
       the JWT signature segment of the Intune challenge before
       wrapping it in the PKIMessage. Dispatcher rejects with
       FAILURE+BadMessageCheck (signature errors → BadMessageCheck per
       the same mapping table).

  Important sanity learning during construction: the buildTestCSR
  helper from scep_chromeos_test.go does NOT populate DNSNames on the
  CSR. The success claim therefore omits san_dns to avoid tripping
  ErrClaimSANDNSMismatch (claim says ['x'], CSR has nothing). The
  claim_mismatch sibling test exercises the SAN-dimension via the
  CN mismatch path; coverage of explicit SANDNS mismatches stays in
  the unit tests in claim_test.go where the helper builds CSRs with
  full SANs.

Verification:
  * gofmt clean on touched files
  * go vet ./internal/scep/intune/... ./internal/api/handler/...: clean
  * staticcheck: clean
  * go test -count=1 -cover ./internal/scep/intune/...: 95.2%
  * 5 golden tests + 3 e2e tests all pass
  * No new env vars (G-3 docs guard not triggered)
  * No new HTTP routes (openapi-parity guard not triggered)
  * Sibling test packages (service + router) still green

Refs: cowork/scep-rfc8894-intune-master-prompt.md::Phase 10
      cowork/scep-rfc8894-intune/progress.md
2026-04-29 16:55:52 +00:00
shankar0123 7612da783a feat(scep-intune): per-profile dispatcher + SIGHUP reload + per-device rate limit + compliance hook seam
Phase 8 of the SCEP RFC 8894 + Intune master bundle. Wires the
internal/scep/intune validator from Phase 7 into the SCEPService
dispatch path, with a SIGHUP-reloadable trust anchor holder, a
per-(Subject, Issuer) sliding-window rate limiter, and a nil-default
ComplianceCheck seam for V3-Pro.

Operator-visible surface (per-profile, all default to off):

  CERTCTL_SCEP_PROFILE_<NAME>_INTUNE_ENABLED=true
  CERTCTL_SCEP_PROFILE_<NAME>_INTUNE_CONNECTOR_CERT_PATH=/etc/certctl/intune.pem
  CERTCTL_SCEP_PROFILE_<NAME>_INTUNE_AUDIENCE=https://certctl.example.com/scep/corp
  CERTCTL_SCEP_PROFILE_<NAME>_INTUNE_CHALLENGE_VALIDITY=60m
  CERTCTL_SCEP_PROFILE_<NAME>_INTUNE_PER_DEVICE_RATE_LIMIT_24H=3

Per-profile dispatch (Phase 8.8): an operator running corp-laptops
through Intune AND IoT devices through static challenge configures
INTUNE_ENABLED=true on the corp profile only — the IoT profile's
PKCSReq path skips the dispatcher entirely. Mirrors the per-profile
shape established by Phase 1.5.

Wire-in surfaces:

  * config.go (Phase 8.1): SCEPProfileConfig.Intune sub-config of
    type SCEPIntuneProfileConfig (Enabled/ConnectorCertPath/Audience/
    ChallengeValidity/PerDeviceRateLimit24h). Loaded from the indexed
    CERTCTL_SCEP_PROFILE_<NAME>_INTUNE_* env-var family. Per-profile
    Validate gate refuses INTUNE_ENABLED=true with empty ConnectorCertPath
    OR negative PerDeviceRateLimit24h.

  * cmd/server/main.go (Phase 8.2 + wire-in): preflightSCEPIntuneTrustAnchor
    helper mirrors preflightSCEPRACertKey/preflightSCEPMTLSTrustBundle
    shape — fail-loud at boot when the trust anchor file is missing /
    unreadable / empty / contains an expired cert. The per-profile loop
    builds the holder + replay cache + rate limiter, calls
    SetIntuneIntegration on the SCEPService, and starts the SIGHUP
    watcher. A deferred sweep stops every watcher at shutdown.

  * internal/scep/intune/trust_anchor_holder.go (Phase 8.5):
    TrustAnchorHolder mirrors cmd/server/tls.go::certHolder. RWMutex-
    guarded pool + Reload that swaps a fresh slice on success +
    WatchSIGHUP goroutine that responds to the same SIGHUP the existing
    TLS-cert watcher uses. A bad reload (parse error, expired cert)
    keeps the OLD pool in place so a half-rotation doesn't take Intune
    enrollment down — same fail-safe pattern. Operators rotate via the
    on-disk file then 'kill -HUP <certctl-pid>'.

  * internal/scep/intune/rate_limit.go (Phase 8.6): hand-rolled
    sliding-window-log limiter keyed by (Subject, Issuer). 100k-entry
    map cap (matches replay cache); at-cap drops the bucket whose
    newest timestamp is the oldest. Default 3 enrollments per 24h
    covers legitimate first-cert + recovery + post-wipe re-enrollment
    but blocks bulk enumeration from a compromised Connector signing
    key. maxN <= 0 disables the limiter for tests + the rare operator
    who wants no per-device cap. Empty subject short-circuits to allow
    (defense-in-depth: caller's claim validation rejects empty-subject
    upstream; no shared bucket on '').

    Why hand-rolled instead of golang.org/x/time/rate: the rate
    package is in go.sum as an indirect transitive but not a direct
    dep. ~30 LoC of stdlib avoids creating a new direct dep.

  * internal/service/scep.go (Phase 8.3 + 8.4 + 8.7):
    - SCEPService gains intuneEnabled / intuneTrust / intuneAudience /
      intuneValidity / intuneReplayCache / intuneRateLimiter /
      complianceCheck fields.
    - SetIntuneIntegration() constructor-time injection wires the
      per-profile state. Profiles with INTUNE_ENABLED=false never
      call this method, so they pay zero overhead.
    - SetComplianceCheck() installs the V3-Pro plug-in (see Phase 8.7).
    - looksIntuneShaped(): JWT-shape pre-check (length > 200 + exactly
      two dots). Allowed to false-positive (validator catches malformed
      → ErrChallengeMalformed); MUST NOT false-negative on real Intune
      challenges.
    - dispatchIntuneChallenge(): the load-bearing core. Runs
      ValidateChallenge → CSR-binding via DeviceMatchesCSR → replay
      cache CheckAndInsert → per-device Allow → optional ComplianceCheck.
      Each failure leg increments a typed metric label and emits an
      audit-friendly Warn log line.
    - PKCSReq + PKCSReqWithEnvelope + RenewalReqWithEnvelope all call
      dispatchIntuneChallenge first; on outcome.decided=true they
      either short-circuit (with a typed-error → SCEPFailInfo mapping)
      or call processEnrollment with action='scep_pkcsreq_intune'
      (so audit greps can count Intune-vs-static enrollments).
    - mapIntuneErrorToFailInfo(): typed-error → SCEPFailInfo per
      RFC 8894 §3.2.1.4.5 (signature/replay/expired → BadMessageCheck;
      claim-mismatch → BadRequest; default → BadRequest).
    - intuneFailReason(): typed-error → metric label
      ('signature_invalid' / 'expired' / 'rate_limited' / etc.). Default
      'malformed' so a previously-unseen error category still surfaces
      in the metric for follow-up.
    - ComplianceCheck (Phase 8.7): nil-default no-op gate. V3-Pro plugs
      in via SetComplianceCheck to call Microsoft Graph's compliance
      API. Returns (compliant, reason, err). nil-err + compliant=false
      → CertRep FAILURE + 'compliance' reason in audit. err != nil →
      fail-safe deny (V3-Pro module is responsible for any 'permit on
      API failure' policy).

  * internal/service/scep.go also gains parseCSRForIntune() — small
    private wrapper around encoding/pem + x509 used by the dispatcher
    for the claim ↔ CSR binding check (separated from the broader
    processEnrollment because we want to bind BEFORE consuming the
    replay-cache slot).

Tests (gates: ≥85% coverage on intune package, ≥70% on service):

  * scep_intune_test.go (in internal/service): 14 dispatcher tests
    covering happy-path Intune enrollment + static-challenge fallback
    + tampered-challenge reject + claim-mismatch reject + replay
    detected + rate-limited + compliance-hook nil-default + compliance-
    hook denies non-compliant + compliance-hook error fails closed +
    IntuneEnabled accessor + 'no IntuneEnabled = static path
    unchanged' regression pin + intuneFailReason mapping for every
    typed error + looksIntuneShaped boundary cases.

  * trust_anchor_holder_test.go (in internal/scep/intune): NewLoadsBundle,
    NewRequiresLogger, NewSurfacesLoadError, ReloadHappyPath,
    ReloadKeepsOldOnFailure, ReloadKeepsOldOnExpired (the fail-safe
    semantics that make the SIGHUP path operator-friendly),
    WatchSIGHUPReloadsPool (real SIGHUP to self with poll-for-swap
    pattern mirroring cmd/server/tls_test.go), WatchSIGHUPStopIsClean
    (does NOT fire SIGHUP after stop — same caveat as the TLS test:
    the Go runtime would otherwise terminate the test runner on the
    next SIGHUP since signal.Stop has removed the handler).

  * rate_limit_test.go (in internal/scep/intune): AllowsUpToCap,
    DistinctKeysIndependent, WindowExpiry, DisabledBypass (maxN=0),
    NegativeCapDisabled, EmptySubjectShortCircuits (defense-in-depth
    against an empty-subject DoS chokepoint), DefaultCapsHonored,
    MapCapEvictsOldest (at-cap eviction branch), ConcurrentRaceFree
    (50 goroutines × 200 inserts), pruneOlderThan + the no-op case.

Verification:

  * gofmt -l on all touched files: clean
  * go vet ./... : clean
  * staticcheck on intune/service/config/cmd-server: clean
  * go test -count=1 -cover ./internal/scep/intune/...: 94.8%
    (target ≥85%)
  * go test -short across intune+service+config+handler+cmd-server:
    all green
  * G-3 docs-drift CI guard reproduced locally: docs-only filtered=
    empty, config-only=empty. The new env vars match the existing
    CERTCTL_SCEP_ allowlist prefix.

Refs: cowork/scep-rfc8894-intune-master-prompt.md::Phase 8
      cowork/scep-rfc8894-intune/progress.md
      Constitutional rule: 'Always take the complete path, not the
      easy path' (cowork/CLAUDE.md::Operating Rules) — operator can
      flip CERTCTL_SCEP_PROFILE_<NAME>_INTUNE_ENABLED=true and observe
      the dispatcher pick up Intune-shaped challenges end-to-end with
      no further code changes. Foundation + plumbing ship together.
2026-04-29 15:34:19 +00:00
shankar0123 7e4d423561 feat(scep-intune): parser + validator for Microsoft Intune Connector challenge format
Phase 7 of the SCEP RFC 8894 + Intune master bundle. Adds the
internal/scep/intune package that validates Microsoft Intune Certificate
Connector signed challenges embedded in SCEP CSR challengePassword
attributes. This is the parsing/validation foundation; Phase 8 wires it
into the SCEP service dispatcher.

What's included:

  * doc.go — package architecture (Intune cloud → Connector → certctl
    SCEP server) + 'what this package is NOT' guard rails. We do NOT
    implement full JOSE: no JKU / kid / x5c trust, no JWKS fetch.
    Trust anchor is operator-supplied at startup and pinned. The
    package does NOT call Microsoft's API directly — the Connector
    already did that; we validate its signed attestation.

  * trust_anchor.go — LoadTrustAnchor(path) reads a PEM bundle of
    Intune Connector signing certs. Skips non-CERTIFICATE PEM blocks
    (operators sometimes paste chains with the priv key by mistake).
    Rejects empty bundles + expired certs at startup with an
    operator-actionable message including the cert subject. SIGHUP
    reload lands in Phase 8.5; today it's load-once-at-boot.

  * claim.go — ChallengeClaim struct + DeviceMatchesCSR helper.
    Set-equality semantics for SAN-DNS/SAN-RFC822/SAN-UPN: the CSR
    must carry EXACTLY the claim's elements, no extras and no missing.
    Empty claim slice = no constraint on that dimension.
    Per-dimension typed errors (ErrClaimCNMismatch /
    ErrClaimSANDNSMismatch / ErrClaimSANRFC822Mismatch /
    ErrClaimSANUPNMismatch) so audit logs surface the failure
    dimension without string-matching. extractUPNSans is stubbed to
    return nil with documented fail-closed behavior — non-empty UPN
    claims fail the equalSets check (correct behavior; the rare deploy
    that pins UPN SANs hot-fixes the ASN.1 walker per the inline
    comment).

  * replay.go — ReplayCache: bounded in-memory cache of seen nonces
    with TTL. Sized for 100,000 entries (60-min Connector validity ×
    25 RPS Intune fleet steady-state ≈ 90,000 challenges/hour with
    headroom). sync.Map for concurrent read/write; janitor goroutine
    wakes every TTL/4 to evict expired entries; at-cap O(N)
    oldest-eviction (rarely fires; janitor keeps the cache below
    cap). Redis-backed variant deferred to V3-Pro.

  * challenge.go — the load-bearing piece:

    - ParseChallenge(raw) splits the JWT-like compact serialization
      into header/payload/signature and base64url-decodes each.
      Tolerates both padded + unpadded encodings (some Connector
      builds emit padded; RFC 7515 §2 says unpadded; we accept both).
      Validates the header parses as JSON before returning so the
      malformed-signal lands earlier in the pipeline.

    - ValidateChallenge(raw, trust, expectedAudience, now):
        1. ParseChallenge
        2. JWS signature verify over (segment0 || '.' || segment1)
           — re-derived from the raw on-wire bytes, NOT
           re-base64-encoded, per RFC 7515 §3.1 (re-encoding could
           produce a byte-different input than what was signed)
        3. Signature alg dispatch:
             RS256: rsa.VerifyPKCS1v15(SHA-256)
             ES256: tries fixed-width r||s (JOSE-canonical) first,
                    falls back to ASN.1 DER (older Connectors)
             alg=none: explicit reject with audit-log-friendly
                       message (RFC 7515 §3.6 attack vector)
             HS*/PS*: rejected as 'unsupported alg' (no shared
                      secret in our threat model)
        4. Version-detection prelude (versionedChallenge struct +
           versionUnmarshalers map). Today's format is v1 (no
           explicit version field; absence IS the v1 signal). Adding
           v2 = adding a parser + a registration line; v1 path stays
           untouched. Defends against the inevitable Microsoft format
           change at ~30 LoC + 2 tests cost vs. a P0 incident.
        5. Time bounds (iat / exp); audience pin (skipped when
           expectedAudience == "").

      Replay protection is the CALLER's job (handler glues parser +
      cache; validator stays stateless + testable).

  * Typed errors: ErrChallengeMalformed / ErrChallengeSignature /
    ErrChallengeExpired / ErrChallengeNotYetValid /
    ErrChallengeWrongAudience / ErrChallengeReplay /
    ErrChallengeUnknownVersion. errors.Is-friendly so the handler
    can audit failure dimension.

Tests (94.8% coverage):

  * challenge_test.go (18 tests): happy-path RS256 + ES256
    fixed-width + ES256 DER; TamperedSignature; TamperedPayload;
    Expired; NotYetValid; WrongAudience; EmptyExpectedAudience
    disables check; RotatedTrustAnchor; EmptyTrustBundle;
    AlgNoneRejected; UnsupportedAlg (HS256); MissingAlg;
    VersionV1ExplicitOK; VersionUnknownRejected;
    MixedTrustBundle iter (skip key-type mismatches without
    surfacing as Signature err); NonJSONPayloadButValidSignature;
    Malformed cases (empty, missing dots, bad base64, non-JSON
    header — 9 sub-cases); PaddedBase64Tolerated.

  * claim_test.go (13 tests): per-dimension matching across CN +
    SAN-DNS + SAN-RFC822 + SAN-UPN; nil guards; case-insensitive DNS
    (RFC 4343); dedupe set-equality; empty claim = no constraint;
    UPN stub canary; normaliseSet edge cases; equalSets length
    mismatch.

  * replay_test.go (11 tests): first-fresh; duplicate-rejected;
    past-TTL-fresh; Sweep-evicts-expired; empty-nonce
    short-circuits; at-cap LRU eviction; default-cap=100k;
    Close-idempotent; TTL=0 disables janitor; concurrent-race-free
    (50 goroutines × 200 inserts); empty-nonce twice is fresh both
    times (we don't cache empties).

  * trust_anchor_test.go: HappyPath single + multi cert; SkipsNonCertBlocks
    (priv key + cert mix); EmptyBundleRejected; OnlyKeyBlocksRejected;
    ExpiredCertRejected (with subject CN in error); MalformedCertRejected;
    LoadTrustAnchor disk + EmptyPath + MissingFile.

  * fuzz_test.go: FuzzParseChallenge with seed corpus covering both
    the well-formed and the obvious-malformed shapes. Survived 187k
    execs in 21s without panic on the local burst; CI runs 5 min.

Verification:

  * gofmt -l ./internal/scep/intune: clean
  * go vet ./internal/scep/intune/...: clean
  * staticcheck ./internal/scep/intune/...: clean
  * go test -count=1 -cover ./internal/scep/intune/...: 94.8%
    (target was ≥85%)
  * go vet ./internal/... ./cmd/...: clean (no rest-of-repo regressions)
  * No new CERTCTL_* env vars (those land in Phase 8 with the
    config gate); G-3 docs-drift CI guard not triggered.
  * No new HTTP routes; openapi-parity guard not triggered.

Phase 8 will:
  - Add SCEPProfileConfig.Intune* env vars + preflight gate
  - Wire the validator into the SCEP service dispatcher
    (Intune-shaped challenges → validator; static → existing path)
  - Trust-anchor SIGHUP reload mirroring cmd/server/tls.go::watchSIGHUP
  - Per-claim rate limit + audit metrics

Refs: cowork/scep-rfc8894-intune-master-prompt.md::Phase 7
      cowork/scep-rfc8894-intune/progress.md
2026-04-29 14:38:35 +00:00