Commit Graph

8 Commits

Author SHA1 Message Date
shankar0123 a41fc2d75c feat(ratelimit): Phase 13 Sprint 13.3 — wire backend selector + scheduler janitor + docs + helm (ARCH-M1 closure complete)
Phase 13 Sprint 13.3 — the completion half of the ARCH-M1
substantive close. Sprint 13.2 shipped the Postgres-backed
sliding-window limiter + multi-replica integration test; Sprint 13.3
wires the 6 call sites in cmd/server/main.go through the operator-
chosen backend selector, adds the rate_limit_buckets scheduler
janitor sweep, rewrites the observability doc, exposes the env-var
in the helm chart, and promotes the multi-replica integration test
to a required CI status check.

Signature ground-truth (sprint 13.2 + 13.3)
===========================================
Prompt-template signatures: `Allow(key string) error` and "5 call
sites." Actual repo: `Allow(key string, now time.Time) error` and 6
NewSlidingWindowLimiter call sites in cmd/server/main.go (the prompt
miscounted the second EST per-principal arm). Per CLAUDE.md "the repo
is truth," matched the live shape.

What changed
============

internal/config/server.go (+40 LOC):
  - Added `SlidingWindowBackend string` + `SlidingWindowJanitorInterval
    time.Duration` to RateLimitConfig with full operator-facing
    documentation of the two valid values (memory|postgres) +
    when-to-use-which decision tree.

internal/config/config.go (+27 LOC):
  - Load() reads CERTCTL_RATE_LIMIT_BACKEND (default "memory") +
    CERTCTL_RATE_LIMIT_JANITOR_INTERVAL (default 5m).
  - Validate() rejects anything other than ""/"memory"/"postgres"
    (empty = memory equivalence for test-built Configs that bypass
    Load()). Janitor interval must be ≥ 1 minute when set.
  - Failure modes return clear ::error:: with the env-var name + the
    valid values, so an operator typo ("postgress" → memory in a
    3-replica cluster) fails fast at startup.

internal/ratelimit/factory.go (NEW, 67 LOC):
  - NewLimiter(backend, db, maxN, window, mapCap) Limiter — single
    factory the 6 cmd/server/main.go call sites route through.
  - Drop-in signature: same maxN/window/mapCap as
    NewSlidingWindowLimiter (mapCap accepted + ignored for postgres
    — the rate_limit_buckets table grows until the janitor sweeps).
  - Defensive panic on unknown backend (config.Validate is SoT;
    this is belt-and-suspenders).

internal/ratelimit/postgres_gc.go (NEW, 73 LOC):
  - PostgresGC struct + NewPostgresGC + GarbageCollect.
  - Single-statement DELETE FROM rate_limit_buckets WHERE
    updated_at < NOW() - maxWindow. Idempotent.
  - maxWindow <= 0 is a no-op (operator opt-out).

internal/scheduler/scheduler.go (+90 LOC):
  - New RateLimitGarbageCollector interface (mirrors the
    ACMEGarbageCollector / SessionGarbageCollector contracts).
  - rateLimitGC field + rateLimitGCInterval + rateLimitGCRunning
    on Scheduler.
  - SetRateLimitGarbageCollector(gc) + SetRateLimitGCInterval(d)
    Setters following the existing acmeGC/sessionGC pattern.
  - rateLimitGCLoop() — JitteredTicker + atomic.Bool guard +
    per-tick context.WithTimeout(1m). Logs row count at Debug.
  - Loop counted in the Start() WaitGroup only when the GC is
    non-nil; cmd/server/main.go skips SetRateLimitGarbageCollector
    when backend=memory so the loop never launches for that case.

cmd/server/main.go (35 LOC diff):
  - All 6 ratelimit.NewSlidingWindowLimiter call sites now route
    through ratelimit.NewLimiter(cfg.RateLimit.SlidingWindowBackend,
    db, ...). Grep verification post-fix returns ZERO hits.
  - Six sites: breakglass loginLimiter (580), ocspLimiter (1003),
    exportLimiter (1068), EST failed-basic (1535), EST per-principal
    SCEP-mTLS arm (1591), EST per-principal SCEP arm (1613). The
    intune.NewPerDeviceRateLimiter site at line 1823 stays unmoved
    — its inner type-alias wrapper is the prompt's
    out-of-scope (cmd/server/*.go only).
  - Conditionally constructs PostgresGC + wires the scheduler janitor
    when backend=postgres; logs the wiring decision either way so
    operators see "rate-limit GC sweep enabled (postgres backend)"
    or "in-memory backend self-prunes" in the boot log.

internal/api/handler/{est,export,certificates,auth_breakglass}.go:
  - Replaced 5 *ratelimit.SlidingWindowLimiter field/Setter types
    with ratelimit.Limiter (the interface). Allow() satisfies the
    same call shape on both backends; the in-memory tests that
    construct *SlidingWindowLimiter still compile because the
    concrete type satisfies the interface (compile-time check in
    internal/ratelimit/limiter.go pins this).

docs/operator/observability.md (176 LOC diff):
  - Replaced the "per-process, in-memory, reset-on-restart, not
    shared across replicas" paragraph with the new
    configurable-backend section: operator decision tree,
    backend internals (memory vs postgres), janitor description,
    falsifiable closure proof (the Sprint 13.2 integration test
    name + invocation), helm chart wiring example.
  - Updated inventory to reflect the actual handler file paths +
    actual cap configurations (the prior doc said "60s window" for
    several limiters that actually use 60m / 24h windows).
  - Doc smoke confirmed: grep -c 'per-process, in-memory,
    reset-on-restart' docs/operator/observability.md = 0.

deploy/helm/certctl/values.yaml + templates/server-configmap.yaml +
templates/server-deployment.yaml:
  - Exposed server.rateLimiting.backend (default "memory") +
    server.rateLimiting.janitorInterval (default "5m") under the
    existing rateLimiting block.
  - ConfigMap renders both as rate-limit-backend +
    rate-limit-janitor-interval keys.
  - Deployment wires CERTCTL_RATE_LIMIT_BACKEND +
    CERTCTL_RATE_LIMIT_JANITOR_INTERVAL env vars from the configmap.
  - Helm render: `helm template deploy/helm/certctl --set
    server.rateLimiting.backend=postgres` shows the env-var on the
    server-deployment.yaml output.

.github/workflows/ci.yml (+12 LOC):
  - Added a new step in the Go Build & Test job that runs the
    Sprint 13.2 multi-replica integration test
    (TestRateLimit_PostgresBackend_CapEnforcedAcrossReplicas) with
    -tags=integration -race -timeout=300s. Fails the CI status check
    if the cross-replica row lock ever stops arbitrating across
    replicas — the ARCH-M1 closure regression gate.

Verification (all green locally; postgres integration via CI)
============================================================

  $ grep -nE 'NewSlidingWindowLimiter' cmd/server/*.go
    (zero hits — Sprint 13.3 receipt)

  $ go test -short -count=1 \
      ./internal/config/... ./internal/ratelimit/... \
      ./internal/scheduler/... ./internal/api/handler/... \
      ./cmd/server/...
    ok  internal/config       1.177s
    ok  internal/ratelimit    0.007s
    ok  internal/scheduler    9.165s
    ok  internal/api/handler  6.245s
    ok  cmd/server            0.390s

  $ staticcheck ./internal/ratelimit/... ./internal/scheduler/... \
      ./internal/config/... ./internal/api/handler/... ./cmd/server/...
    (clean)

  $ gofmt -l internal/ cmd/server/
    (clean)

  $ grep -c 'per-process, in-memory, reset-on-restart' \
      docs/operator/observability.md
    0   (doc smoke — the audit's verbatim phrasing is gone)

  $ bash scripts/ci-guards/G-3-env-docs-drift.sh
    G-3 env-docs-drift: clean.

  $ bash scripts/ci-guards/complete-path-config-coverage.sh
    OK — every CERTCTL_* env var (197) has at least one non-config-
    package consumer.

Selector contract verified — config.Validate() rejects any value
other than ""/memory/postgres at startup with a clear error message.

Sprint 13.4 next (ARCH-H1 OpenAPI authoring batch 1) is on a
different axis; ARCH-M1 closure is complete with this commit
modulo the Sprint 13.7 audit-HTML flip + zero-floor pin.

Closes: ARCH-M1 substantive remediation. The cross-replica rate-
limit-cap-enforcement gap that the audit recommended deferring to
v3 is closed; operators with server.replicas > 1 flip
CERTCTL_RATE_LIMIT_BACKEND=postgres and get exactly-cap enforcement
across the cluster (proved by the multi-replica integration test now
gating CI).
2026-05-14 11:52:13 +00:00
shankar0123 21aeed4f4e legal: addlicense headers + normalize legacy variants (Phase 0 RED-4)
Phase 0 closure (Path B2, post-rewrite):

addlicense sweep — adds the canonical certctl LLC copyright + BUSL-1.1
SPDX header to every production Go file. Template:

  // Copyright 2026 certctl LLC. All rights reserved.
  // SPDX-License-Identifier: BUSL-1.1

Coverage: 338 / 338 production Go files (cmd/ + internal/, excluding
*_test.go and **/testdata/**). Pre-sweep coverage was 22 / 338 (6.5%);
post-sweep is 338 / 338 (100%).

Normalized 22 pre-existing legacy headers (`// Copyright (c) certctl`
+ `// SPDX-License-Identifier: BSL-1.1`) and 1 file using a
`Certctl Contributors` attribution. The legacy SPDX ID `BSL-1.1`
is non-standard; the official SPDX identifier for Business Source
License 1.1 is `BUSL-1.1` (capital U). All 338 files now share the
canonical form.

Generated via:
  addlicense -c "certctl LLC" -y 2026 \
    -f cowork/legal/copyright-header.tpl \
    -ignore '**/testdata/**' -ignore '**/*_test.go' \
    cmd/ internal/

Verification:
  find cmd internal -name '*.go' -not -name '*_test.go' \
    -not -path '*/testdata/*' \
    -exec grep -L '^// Copyright 2026 certctl LLC' {} \; | wc -l

  Returns: 0

gofmt clean. Header additions are comments only, no compile impact.

Closes: cowork/certctl-architecture-diligence-audit.html#fix-RED-4
2026-05-13 21:23:35 +00:00
shankar0123 8b75e0311b chore: rename Go module path to github.com/certctl-io/certctl
Mechanical sed across the main go.mod's module declaration, the f5-mock-icontrol
sub-module's go.mod, every Go file's import path (361 files), and a rebuild of
the checked-in f5-mock-icontrol binary so its embedded build-info reflects the
new module path. No behavior change.

Choice B from cowork/transfer-certctl-to-org.md, executed 2026-05-04. Choice A
(keep module path declared as github.com/shankar0123/certctl regardless of
repo URL) shipped on the day of the org transfer (2026-05-03) since we had no
external Go consumers; this commit closes that deferral.

Backward-compat: GitHub HTTP redirects continue to forward
github.com/shankar0123/certctl → github.com/certctl-io/certctl at the URL
level, but Go's module proxy uses the path declared in go.mod as the
canonical name. Pre-fix, anyone trying `go get github.com/certctl-io/certctl/...`
hit a "module path mismatch" error because go.mod said
github.com/shankar0123/certctl and the URL they fetched it from said
certctl-io/certctl. Post-fix, the canonical name and the URL agree, so
go get / go install / external Go consumers / Go-tooling integrations
work cleanly via either the new path (preferred) or the old path (which
redirects and Go follows the redirect for source fetch).

Anyone still importing the old path inside their own code keeps working
provided they update their go.mod's `require` line to match — the module
path declared in their consumer's go.sum / go.mod is the authoritative
import name, so a mass sed across their import statements is the migration
on the consumer side. No external consumers exist today.

Diff shape:
  361 *.go files  — import path replacement only
    2 go.mod     — module declaration replacement only
    1 binary     — deploy/test/f5-mock-icontrol/f5-mock-icontrol rebuilt
                   so embedded build-info reflects the new path (8618965 vs
                   8618933 bytes; 32-byte diff is the build-info change)

  Total: 364 files, 730 insertions / 730 deletions, net-zero size, pure
  mechanical substitution.

Verification:
  gofmt: 17 files needed re-alignment after sed (the new path is one char
    shorter than the old, so column-aligned import groups drifted). Applied
    `gofmt -w` to fix.
  go mod tidy: clean exit on both modules.
  go vet ./...: clean exit.
  go build ./...: clean exit.
  go test -short -count=1 on representative packages: all green
    (internal/domain, internal/validation, internal/crypto, internal/crypto/signer,
    cmd/agent). Test output now reads `ok github.com/certctl-io/certctl/...`
    confirming the module path resolves correctly.
  binary: f5-mock-icontrol rebuilt; `strings | grep shankar0123` returns
    nothing; `strings | grep certctl-io/certctl` shows the new module path
    embedded in build-info.

Files intentionally NOT touched in this commit:
  README.md / CHANGELOG.md / docs/ / etc. — already swept to certctl-io
    URLs in commit 0729ee4 (the post-transfer URL refresh). This commit is
    purely the Go-tooling layer.
  Scarf pixels (`shankar0123.docker.scarf.sh/...`) — Scarf-account
    namespace, not a Go import or GitHub repo URL. Stays.

This is a non-blocking, non-customer-impacting change. Operators pulling
container images, running `make verify`, hitting the API, or installing the
agent see no functional difference. Only Go-tooling consumers (none today)
are affected, and they're enabled — not broken — by this commit.
2026-05-04 00:30:29 +00:00
shankar0123 43075a1b5c EST RFC 7030 hardening master bundle Phases 5-7: end-to-end serverkeygen
+ profile-driven csrattrs + admin observability with per-status
counters + reload-trust endpoint.

Phase 5 — RFC 7030 §4.4 server-driven key generation:
- internal/pkcs7/envelopeddata_builder.go is the inverse of the
  existing parser/decryptor: AES-256-CBC content cipher + RSA PKCS#1
  v1.5 keyTrans + per-call random IV. Round-trip pinned in test
  (BuildEnvelopedData → ParseEnvelopedData → Decrypt returns the
  original plaintext byte-for-byte).
- ESTService.SimpleServerKeygen runs the full §4.4 flow: parse client
  CSR → require RSA pubkey for keyTrans → resolve per-profile
  algorithm (RSA-2048 default; honors AllowedKeyAlgorithms) → in-
  memory keygen → re-build CSR with server pubkey → run existing
  issuer pipeline → marshal PKCS#8 → CMS-EnvelopedData wrap to a
  synthetic recipient cert wrapping the device's CSR-supplied pubkey
  → zeroize plaintext + PKCS#8 bytes → return CertPEM + ChainPEM
  + EncryptedKey. Typed sentinels ErrServerKeygenRequiresKey-
  Encipherment / ErrServerKeygenUnsupportedAlgorithm /
  ErrServerKeygenDisabled.
- ESTHandler.ServerKeygen + ServerKeygenMTLS emit RFC 7030 §4.4.2
  multipart/mixed with random per-response boundary; per-profile
  SetServerKeygenEnabled gate returns 404 when off (defense in depth
  even if the route was registered).
- New routes POST /.well-known/est/[<PathID>/]serverkeygen +
  /.well-known/est-mtls/<PathID>/serverkeygen; openapi.yaml +
  openapi-parity guard updated.

Phase 6 — Real csrattrs implementation:
- New CertificateProfile.RequiredCSRAttributes []string + migration
  000022_certificate_profiles_csrattrs.up.sql. The migration also
  lands the previously-unwired must_staple column (closes the 5.6
  follow-up loop where the field shipped at the domain + service
  layer but the postgres scan/insert/update never persisted it).
- domain.EKUStringToOID + AttributeStringToOID lookup tables: id-kp-*
  EKUs (RFC 5280 §4.2.1.12) + RFC 5280 DN attributes + RFC 2985
  PKCS#10 attributes + Microsoft Intune device-serial OID.
- ESTService.GetCSRAttrs replaces the v2.0.x nil/204 stub with a
  profile-derived SEQUENCE OF OID ASN.1 marshal. Unknown EKU /
  attribute strings dropped + warning-logged so a typo doesn't take
  down the entire endpoint.

Phase 7 — Admin observability + counters + reload-trust:
- internal/service/est_counters.go: estCounterTab (sync/atomic; 12
  named labels) + ESTStatsSnapshot per-profile shape +
  ESTService.Stats(now) zero-allocation accessor + ReloadTrust()
  SIGHUP-equivalent + SetESTAdminMetadata setter.
- Counter ticks wired into processEnrollment + SimpleServerKeygen at
  every success/failure leg.
- internal/api/handler/admin_est.go mirrors AdminSCEPIntune verbatim:
  Profiles + ReloadTrust handlers + AdminESTServiceImpl. Both
  endpoints admin-gated (M-008 triplet pinned + admin_est.go added
  to AdminGatedHandlers).
- New routes GET /api/v1/admin/est/profiles + POST /api/v1/admin/
  est/reload-trust; openapi.yaml documented; openapi-parity guard
  reproduced clean.
- cmd/server/main.go grows estServices map populated by the per-
  profile EST loop + handed to AdminEST. New MTLSTrust() +
  HasMTLSTrust() accessors on ESTHandler so main.go can pull the
  trust holder for the admin-metadata wire-up.
- Per-profile counter isolation regression test
  (internal/service/est_profile_counter_isolation_test.go) proves
  a future shared-counter refactor would fail at compile-time
  pointer-identity check.

Pre-commit verification (sandbox): gofmt clean, go vet clean
(excluding repository/postgres which the sandbox can't build —
disk-space testcontainers download), staticcheck clean across
cms/trustanchor/api/handler/api/router/scep/intune/ratelimit/
service/pkcs7/domain/cmd/server, go test -short -count=1 green
for every non-postgres package. G-3 docs-drift guard reproduced
locally clean (Phases 5-7 added zero new env vars; Phase 1
already documented per-profile SERVER_KEYGEN_ENABLED).

Spec preserved at cowork/est-rfc7030-hardening-prompt.md. Phases
8-13 (GUI ESTAdminPage / CLI+MCP / libest e2e / bulk revocation /
docs/est.md / release prep) remain — post-2.1.0 work.
2026-04-29 23:57:45 +00:00
shankar0123 aa139ee0d9 EST RFC 7030 hardening master bundle Phases 2-4: end-to-end mTLS sibling
route + RFC 9266 channel binding + HTTP Basic enrollment-password +
per-source-IP failed-auth limit + per-(CN, sourceIP) sliding-window cap.

Two new shared packages so EST + Intune share infrastructure:
- internal/cms/ — RFC 9266 tls-exporter extractor (ExtractTLSExporter
  with stdlib-panic recovery for synthetic ConnectionStates) +
  CSR-side channel-binding parser via raw TBSCertificationRequestInfo
  walk (the stdlib's csr.Attributes can't represent the OCTET STRING
  binding value), VerifyChannelBinding composite, EmbedChannel-
  BindingAttribute fixture helper, typed sentinel errors for missing
  / mismatch / not-TLS-1.3 mapped to HTTP 400 / 409 / 426 in handler.
- internal/trustanchor/ — extracted from scep/intune/trust_anchor*.go
  so the EST mTLS sibling route + Intune dispatcher share the same
  SIGHUP-reloadable PEM bundle primitive. intune.TrustAnchorHolder
  is now `= trustanchor.Holder` (type alias) + NewTrustAnchorHolder =
  trustanchor.New (function alias) — every existing call site compiles
  unchanged. Intune's LoadTrustAnchor is a thin wrapper over
  trustanchor.LoadBundle. White-box tests moved to the new package.
- internal/ratelimit/ — extracted from scep/intune/rate_limit.go (this
  was Phase 4.1, in the same bundle). intune.PerDeviceRateLimiter
  is now a thin wrapper preserving the (subject, issuer)→key
  composition; EST handler reaches for SlidingWindowLimiter directly.

ESTHandler grew six optional fields wired by per-profile setters
(SetMTLSTrust / SetChannelBindingRequired / SetEnrollmentPassword /
SetSourceIPRateLimiter / SetPerPrincipalRateLimiter / SetLabelForLog)
plus four new mTLS-route methods (CACertsMTLS / SimpleEnrollMTLS /
SimpleReEnrollMTLS / CSRAttrsMTLS); shared internal pipeline
handleEnrollOrReEnroll(reEnroll, viaMTLS) keeps the auth/binding/
rate-limit gates DRY. New router method RegisterESTMTLSHandlers
registers /.well-known/est-mtls/<PathID>/{cacerts,simpleenroll,
simplereenroll,csrattrs}; AuthExemptDispatchPrefixes extends the
no-auth chain to /.well-known/est-mtls.

cmd/server/main.go's EST loop wires per-profile mTLS holder +
channel-binding policy + per-principal limiter + (when EnrollmentPassword
non-empty) Basic + source-IP limiter; new preflightESTMTLSClientCATrust-
Bundle returns *trustanchor.Holder so SIGHUP rotates the EST mTLS
bundle live without restart. SCEP + EST mTLS profiles now share a
single union mtlsUnionPoolForTLS passed to buildServerTLSConfigWithMTLS
(replaces the protocol-specific scepMTLSUnionPoolForTLS); per-handler
re-verify enforces "cert must chain to THIS profile's bundle" so
cross-protocol bleed is blocked at the application layer even though
the TLS layer trusts certs from either pool's union.

Phase 3.3 source-IP failed-Basic limiter defaults: 10 attempts / 1h
/ 50k tracked IPs (no env var; tunable in a follow-up). Phase 4.2
per-principal limiter cap from CERTCTL_EST_PROFILE_<NAME>_RATE_
LIMIT_PER_PRINCIPAL_24H (existing field, Phase 1 shipped).

New tests:
- internal/cms/channelbinding_test.go: extractor + CSR-side parser +
  composite + TLS-1.3 round-trip end-to-end + EmbedChannelBinding-
  Attribute round-trip
- internal/trustanchor/holder_test.go: parseBundlePEM white-box +
  LoadBundle + Holder Get/Pool/SetLabelForLog/Reload-happy/
  Reload-keeps-old-on-failure/Reload-keeps-old-on-expired/
  WatchSIGHUP-reloads-pool/WatchSIGHUP-stop-clean
- internal/api/handler/est_hardening_test.go: 16 named cases covering
  mTLS no-trust-pool 500 + no-cert 401 + cross-profile cert 401 +
  happy-path 200 + CACertsMTLS auth gate + CSRAttrsMTLS auth gate +
  channel-binding required-absent-rejected + not-required-absent-
  allowed + writeChannelBindingError mapping + Basic no-header 401
  + Basic wrong-password 401 + Basic correct-200 + Basic-no-password
  no-gate + per-IP failed-attempt lockout 429 + per-principal
  blocks-after-cap + different-principals-independent + no-limiter-
  unbounded.

Pre-commit verification (sandbox): gofmt clean, go vet clean
(excluding repository/postgres which the sandbox can't build —
disk-space testcontainers download), staticcheck clean for
cms/trustanchor/api/handler/api/router/scep/intune/ratelimit/
cmd/server, go test -short -count=1 green for cms/trustanchor/
api/handler/api/router/scep/intune/ratelimit/service. G-3
docs-drift guard reproduced locally clean (Phase 1 already
documented every new env var; Phases 2-4 added zero new env vars).
2026-04-29 23:15:35 +00:00
shankar0123 1c099071d1 fix(bundle-4): EST/SCEP Attack Surface Hardening — 3 audit findings closed
Closes 3 findings (1 High + 1 Medium + 1 Low) from
/Users/shankar/Desktop/cowork/comprehensive-audit-2026-04-25/.

Bundle 4 hardens the only attack surface reachable by an anonymous network
attacker in certctl: the unauthenticated EST + SCEP enrollment endpoints.

Findings closed:

  - H-004 (High): Hand-rolled ASN.1 parser had no fuzz target.
    The audit's original framing pointed at internal/pkcs7/, but recon
    confirmed that package is an ASN.1 ENCODER (BuildCertsOnlyPKCS7,
    ASN1Wrap*, ASN1EncodeLength) — not a parser. The actual hand-rolled
    PKCS#7 PARSING reachable via anonymous network is in
    internal/api/handler/scep.go::extractCSRFromPKCS7 +
    parseSignedDataForCSR. Added native go fuzz targets:
      * internal/api/handler/scep_fuzz_test.go::FuzzExtractCSRFromPKCS7
      * internal/api/handler/scep_fuzz_test.go::FuzzParseSignedDataForCSR
      * internal/pkcs7/pkcs7_fuzz_test.go::FuzzPEMToDERChain (defense-in-depth)
      * internal/pkcs7/pkcs7_fuzz_test.go::FuzzASN1EncodeLength (defense-in-depth)
    Local 15s fuzz session: 150k execs on FuzzExtractCSRFromPKCS7,
    937k on FuzzPEMToDERChain, 925k on FuzzASN1EncodeLength — zero panics.

  - M-021 (Medium): EST TLS-Unique channel binding (RFC 7030 §3.2.3).
    Added internal/api/handler/est.go::verifyESTTransport — defense-in-depth
    TLS pre-conditions (r.TLS != nil; HandshakeComplete; TLS ≥ 1.2).
    The full §3.2.3 channel binding only applies when EST mTLS is in use;
    certctl does not currently support EST mTLS, so the §3.2.3 requirement
    is moot today. RFC 9266 (TLS 1.3 tls-exporter) and EST mTLS are
    documented as deferred follow-ups in the verifyESTTransport doc comment.

  - L-005 (Low): EST/SCEP issuer-binding fail-loud at startup.
    Pre-Bundle-4 cmd/server/main.go validated that CERTCTL_EST_ISSUER_ID and
    CERTCTL_SCEP_ISSUER_ID existed in the registry but did NOT validate the
    issuer TYPE could emit a CA cert. An operator binding EST to an ACME
    issuer (whose GetCACertPEM returns explicit error) booted successfully
    and only failed at first /est/cacerts request. Post-Bundle-4: new
    preflightEnrollmentIssuer helper calls GetCACertPEM(ctx) at startup
    with a 10s timeout. Failure logs the connector error + the candidate
    issuer types and os.Exit(1).

Tests added/modified:
  - internal/api/handler/est_transport_test.go (new) — 5 verifyESTTransport
    table cases covering plaintext-rejected, incomplete-handshake-rejected,
    TLS 1.0 rejected, TLS 1.2/1.3 accepted
  - cmd/server/preflight_test.go (new) — TestPreflightEnrollmentIssuer
    covering nil-connector, error-from-issuer, empty-PEM, valid cases
  - internal/api/handler/est_handler_test.go (modified) — 7 POST sites
    now stamp r.TLS to satisfy the new transport pre-condition
  - internal/integration/negative_test.go (modified) — setupTestServer
    wraps the test handler with a fake-TLS-state injector so the EST
    handler receives r.TLS != nil; production paths still rely on the
    real TLS listener

Threat model reference: TB-11 (EST/SCEP client ↔ Server) per
cowork/comprehensive-audit-2026-04-25/threat-model.md.
Standards: RFC 7030 §3.2.3, RFC 8894 §3, RFC 5652, RFC 9266 (deferred).
2026-04-25 21:14:41 +00:00
shankar0123 bcefb11e65 feat(M51): add SCEP server (RFC 8894) for MDM and network device enrollment
Implements Simple Certificate Enrollment Protocol with single-endpoint
operation-based dispatch (GetCACaps, GetCACert, PKIOperation), PKCS#7
SignedData CSR extraction with fallback for raw/base64 CSR, challenge
password authentication via CSR attributes, and shared internal/pkcs7
package extracted from EST handler to eliminate code duplication.

24 new tests (11 service + 13 handler) plus 5 shared pkcs7 package tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-15 16:47:18 -04:00
shankar0123 7d14635a72 feat: add EST server (RFC 7030) for device certificate enrollment (M23)
Implement Enrollment over Secure Transport protocol with 4 endpoints under
/.well-known/est/ — cacerts (CA chain distribution), simpleenroll (initial
enrollment), simplereenroll (certificate renewal), and csrattrs (CSR
attributes). PKCS#7 certs-only wire format with hand-rolled ASN.1, accepts
both PEM and base64-encoded DER CSRs, configurable issuer and profile
binding, full audit trail. 28 new tests (18 handler + 10 service).

Also includes:
- GetCACertPEM added to issuer connector interface (all 4 issuers updated)
- EST integration tests wired into e2e test suite (13 test cases)
- QA testing guide Part 26 (15 manual EST test cases)
- All docs updated: README, features, architecture, concepts, connectors,
  quickstart, demo-advanced (endpoint counts, MCP wording, agent IDs,
  issuer interface, resource lists, OpenSSL status)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-25 15:31:06 -04:00