certctl/cmd/server/tls.go
shankar0123 aa139ee0d9 EST RFC 7030 hardening master bundle Phases 2-4: end-to-end mTLS sibling
route + RFC 9266 channel binding + HTTP Basic enrollment-password +
per-source-IP failed-auth limit + per-(CN, sourceIP) sliding-window cap.

Two new shared packages so EST + Intune share infrastructure:
- internal/cms/ — RFC 9266 tls-exporter extractor (ExtractTLSExporter
  with stdlib-panic recovery for synthetic ConnectionStates) +
  CSR-side channel-binding parser via raw TBSCertificationRequestInfo
  walk (the stdlib's csr.Attributes can't represent the OCTET STRING
  binding value), VerifyChannelBinding composite,
  EmbedChannelBindingAttribute fixture helper, typed sentinel errors
  for missing / mismatch / not-TLS-1.3 mapped to HTTP 400 / 409 / 426
  in handler.
- internal/trustanchor/ — extracted from scep/intune/trust_anchor*.go
  so the EST mTLS sibling route + Intune dispatcher share the same
  SIGHUP-reloadable PEM bundle primitive. intune.TrustAnchorHolder
  is now `= trustanchor.Holder` (type alias) + NewTrustAnchorHolder =
  trustanchor.New (function alias) — every existing call site compiles
  unchanged. Intune's LoadTrustAnchor is a thin wrapper over
  trustanchor.LoadBundle. White-box tests moved to the new package.
- internal/ratelimit/ — extracted from scep/intune/rate_limit.go (this
  was Phase 4.1, in the same bundle). intune.PerDeviceRateLimiter
  is now a thin wrapper preserving the (subject, issuer)→key
  composition; EST handler reaches for SlidingWindowLimiter directly.

ESTHandler grew six optional fields wired by per-profile setters
(SetMTLSTrust / SetChannelBindingRequired / SetEnrollmentPassword /
SetSourceIPRateLimiter / SetPerPrincipalRateLimiter / SetLabelForLog)
plus four new mTLS-route methods (CACertsMTLS / SimpleEnrollMTLS /
SimpleReEnrollMTLS / CSRAttrsMTLS); shared internal pipeline
handleEnrollOrReEnroll(reEnroll, viaMTLS) keeps the auth/binding/
rate-limit gates DRY. New router method RegisterESTMTLSHandlers
registers /.well-known/est-mtls/<PathID>/{cacerts,simpleenroll,
simplereenroll,csrattrs}; AuthExemptDispatchPrefixes extends the
no-auth chain to /.well-known/est-mtls.

cmd/server/main.go's EST loop wires per-profile mTLS holder +
channel-binding policy + per-principal limiter + (when EnrollmentPassword
non-empty) Basic + source-IP limiter; new
preflightESTMTLSClientCATrustBundle returns *trustanchor.Holder so
SIGHUP rotates the EST mTLS
bundle live without restart. SCEP + EST mTLS profiles now share a
single union mtlsUnionPoolForTLS passed to buildServerTLSConfigWithMTLS
(replaces the protocol-specific scepMTLSUnionPoolForTLS); per-handler
re-verify enforces "cert must chain to THIS profile's bundle" so
cross-protocol bleed is blocked at the application layer even though
the TLS layer trusts certs from either pool's union.
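
A minimal sketch of the per-handler re-verify idea described above, under stated assumptions: the helper names (issue, chainsToProfile, demo) and the two throwaway CAs are hypothetical fixtures, not certctl code. It only illustrates why a cert accepted by a union pool at the TLS layer can still be rejected when checked against a single profile's pool.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// issue creates a throwaway certificate (hypothetical fixture). With a nil
// parent it self-signs a CA; otherwise it issues a client-auth leaf.
func issue(cn string, parent *x509.Certificate, parentKey *ecdsa.PrivateKey) (*x509.Certificate, *ecdsa.PrivateKey, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: cn},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(time.Hour),
	}
	if parent == nil { // self-signed CA
		tmpl.IsCA, tmpl.BasicConstraintsValid = true, true
		tmpl.KeyUsage = x509.KeyUsageCertSign
		parent, parentKey = tmpl, key
	} else { // client leaf
		tmpl.KeyUsage = x509.KeyUsageDigitalSignature
		tmpl.ExtKeyUsage = []x509.ExtKeyUsage{x509.ExtKeyUsageClientAuth}
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, parent, &key.PublicKey, parentKey)
	if err != nil {
		return nil, nil, err
	}
	cert, err := x509.ParseCertificate(der)
	return cert, key, err
}

// chainsToProfile reports whether leaf verifies against one profile's pool
// only, independent of whatever union pool the TLS layer trusted.
func chainsToProfile(leaf *x509.Certificate, profilePool *x509.CertPool) bool {
	_, err := leaf.Verify(x509.VerifyOptions{
		Roots:     profilePool,
		KeyUsages: []x509.ExtKeyUsage{x509.ExtKeyUsageClientAuth},
	})
	return err == nil
}

// demo issues a leaf under the "EST" CA and checks it against both pools.
func demo() (own, cross bool, err error) {
	estCA, estKey, err := issue("est-profile-ca", nil, nil)
	if err != nil {
		return false, false, err
	}
	scepCA, _, err := issue("scep-profile-ca", nil, nil)
	if err != nil {
		return false, false, err
	}
	leaf, _, err := issue("device-1", estCA, estKey)
	if err != nil {
		return false, false, err
	}
	estPool, scepPool := x509.NewCertPool(), x509.NewCertPool()
	estPool.AddCert(estCA)
	scepPool.AddCert(scepCA)
	return chainsToProfile(leaf, estPool), chainsToProfile(leaf, scepPool), nil
}

func main() {
	own, cross, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("chains to own profile:", own)
	fmt.Println("chains to other profile:", cross)
}
```

In a real handler the leaf would presumably come from the request's TLS connection state (r.TLS.PeerCertificates[0]) rather than a fixture.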

Phase 3.3 source-IP failed-Basic limiter defaults: 10 attempts / 1h
/ 50k tracked IPs (no env var; tunable in a follow-up). Phase 4.2
per-principal limiter cap comes from
CERTCTL_EST_PROFILE_<NAME>_RATE_LIMIT_PER_PRINCIPAL_24H (existing
field, shipped in Phase 1).
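
A rough sketch of a per-key sliding-window cap like the ones described above. This is not the internal/ratelimit SlidingWindowLimiter (its API is not shown here); the slidingWindow type, its Allow method, and the Unix-second timestamps are assumptions made for illustration. The bound on tracked keys (50k IPs in the commit) is left out for brevity.

```go
package main

import (
	"fmt"
	"sync"
)

// slidingWindow is a hypothetical per-key limiter: at most max events per
// key within any window of windowSec seconds.
type slidingWindow struct {
	mu        sync.Mutex
	max       int
	windowSec int64
	hits      map[string][]int64 // key -> event timestamps (Unix seconds)
}

func newSlidingWindow(max int, windowSec int64) *slidingWindow {
	return &slidingWindow{max: max, windowSec: windowSec, hits: make(map[string][]int64)}
}

// Allow records an event for key at time now and reports whether it fits
// under the cap. Denied events are not recorded.
func (l *slidingWindow) Allow(key string, now int64) bool {
	l.mu.Lock()
	defer l.mu.Unlock()

	// Drop timestamps that have slid out of the window.
	cutoff := now - l.windowSec
	old := l.hits[key]
	kept := old[:0]
	for _, t := range old {
		if t > cutoff {
			kept = append(kept, t)
		}
	}
	if len(kept) >= l.max {
		l.hits[key] = kept
		return false
	}
	l.hits[key] = append(kept, now)
	return true
}

func main() {
	// Cap of 2 events per 60 seconds, keyed by source IP.
	l := newSlidingWindow(2, 60)
	fmt.Println(l.Allow("10.0.0.1", 0))  // first event
	fmt.Println(l.Allow("10.0.0.1", 10)) // second event, still under the cap
	fmt.Println(l.Allow("10.0.0.1", 20)) // third event inside the window
	fmt.Println(l.Allow("10.0.0.2", 20)) // different key, independent budget
	fmt.Println(l.Allow("10.0.0.1", 61)) // the event at t=0 has expired
}
```

For the failed-Basic limiter the recorded events would be failed attempts keyed by source IP; for the per-principal cap they would be enrollments keyed by (CN, sourceIP). How the real limiter composes its keys and evicts entries is not covered by this sketch.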

New tests:
- internal/cms/channelbinding_test.go: extractor + CSR-side parser +
  composite + TLS-1.3 round-trip end-to-end +
  EmbedChannelBindingAttribute round-trip
- internal/trustanchor/holder_test.go: parseBundlePEM white-box +
  LoadBundle + Holder Get/Pool/SetLabelForLog/Reload-happy/
  Reload-keeps-old-on-failure/Reload-keeps-old-on-expired/
  WatchSIGHUP-reloads-pool/WatchSIGHUP-stop-clean
- internal/api/handler/est_hardening_test.go: 16 named cases covering
  mTLS no-trust-pool 500 + no-cert 401 + cross-profile cert 401 +
  happy-path 200 + CACertsMTLS auth gate + CSRAttrsMTLS auth gate +
  channel-binding required-absent-rejected + not-required-absent-
  allowed + writeChannelBindingError mapping + Basic no-header 401
  + Basic wrong-password 401 + Basic correct-200 + Basic-no-password
  no-gate + per-IP failed-attempt lockout 429 + per-principal
  blocks-after-cap + different-principals-independent + no-limiter-
  unbounded.

Pre-commit verification (sandbox): gofmt clean, go vet clean
(excluding repository/postgres which the sandbox can't build —
disk-space testcontainers download), staticcheck clean for
cms/trustanchor/api/handler/api/router/scep/intune/ratelimit/
cmd/server, go test -short -count=1 green for cms/trustanchor/
api/handler/api/router/scep/intune/ratelimit/service. G-3
docs-drift guard reproduced locally clean (Phase 1 already
documented every new env var; Phases 2-4 added zero new env vars).
2026-04-29 23:15:35 +00:00


package main

import (
	"crypto/tls"
	"crypto/x509"
	"fmt"
	"log/slog"
	"os"
	"os/signal"
	"sync"
	"syscall"
)

// certHolder stores the server's TLS certificate under a mutex so it can be
// swapped atomically by a SIGHUP handler without restarting the server. A
// *tls.Config that wires GetCertificate → (*certHolder).GetCertificate reads
// through the holder on every ClientHello, so a successful reload takes
// effect on the next new connection immediately and without dropping
// in-flight requests.
//
// Concurrency: GetCertificate is invoked from crypto/tls handshake goroutines
// on every new inbound connection; Reload is invoked from the SIGHUP watcher
// goroutine. sync.Mutex is sufficient — TLS handshakes are not an inner-loop
// hot path and the critical section is a single pointer read.
type certHolder struct {
	mu       sync.Mutex
	cert     *tls.Certificate
	certPath string
	keyPath  string
}

// newCertHolder loads the initial cert+key pair from disk and returns a
// holder ready to serve handshakes. Returns a non-nil error if either file
// is missing, unreadable, or the pair does not round-trip through
// tls.LoadX509KeyPair (for example, the private key does not match the
// cert's public key). The caller is expected to treat a non-nil error as a
// fail-loud startup gate and os.Exit(1): the HTTPS-everywhere milestone
// (§3 locked decisions) prohibits plaintext HTTP fallback.
func newCertHolder(certPath, keyPath string) (*certHolder, error) {
	cert, err := tls.LoadX509KeyPair(certPath, keyPath)
	if err != nil {
		return nil, fmt.Errorf("load TLS cert/key (cert=%q key=%q): %w", certPath, keyPath, err)
	}
	return &certHolder{
		cert:     &cert,
		certPath: certPath,
		keyPath:  keyPath,
	}, nil
}

// GetCertificate is the tls.Config.GetCertificate hook. Returns the current
// cert under the holder's mutex. ClientHelloInfo is ignored — the control
// plane does not multiplex by SNI.
func (h *certHolder) GetCertificate(_ *tls.ClientHelloInfo) (*tls.Certificate, error) {
	h.mu.Lock()
	defer h.mu.Unlock()
	return h.cert, nil
}

// Reload re-reads the cert+key pair from disk and swaps the holder
// atomically on success. On failure the holder retains its previous cert
// and the error is propagated to the caller — the SIGHUP watcher logs and
// keeps serving the previous cert rather than crashing on a bad reload.
// This is deliberately "fail-safe on reload, fail-loud on startup": an
// operator rotating certs wants a recoverable error, not a restart loop.
func (h *certHolder) Reload() error {
	cert, err := tls.LoadX509KeyPair(h.certPath, h.keyPath)
	if err != nil {
		return fmt.Errorf("reload TLS cert/key (cert=%q key=%q): %w", h.certPath, h.keyPath, err)
	}
	h.mu.Lock()
	h.cert = &cert
	h.mu.Unlock()
	return nil
}

// watchSIGHUP installs a signal handler that calls Reload() on each SIGHUP.
// The returned stop function closes the internal done channel and stops
// signal delivery so the goroutine can exit cleanly during shutdown. Errors
// from Reload are logged but do not terminate the watcher — the operator
// can fix the files and send another SIGHUP.
//
// Defensive design note: this deliberately does NOT panic on Reload error
// even though HTTPS is mission-critical. A rotation that writes half-files
// (operator overwrites cert.pem then key.pem as two separate copies) would
// otherwise crash the server mid-rotation. Logging + retaining the old
// cert gives the operator a bounded window to fix and re-SIGHUP.
func (h *certHolder) watchSIGHUP(logger *slog.Logger) (stop func()) {
	ch := make(chan os.Signal, 1)
	signal.Notify(ch, syscall.SIGHUP)
	done := make(chan struct{})
	go func() {
		for {
			select {
			case <-ch:
				if err := h.Reload(); err != nil {
					logger.Error("TLS cert reload failed; continuing with previous cert",
						"error", err,
						"cert_path", h.certPath,
						"key_path", h.keyPath)
					continue
				}
				logger.Info("TLS cert reloaded via SIGHUP",
					"cert_path", h.certPath,
					"key_path", h.keyPath)
			case <-done:
				signal.Stop(ch)
				return
			}
		}
	}()
	return func() { close(done) }
}

// buildServerTLSConfig returns the TLS 1.3-only *tls.Config for the HTTPS
// server. Pinned per HTTPS-everywhere milestone §2.1 + §3 locked decisions:
//
//   - MinVersion: TLS 1.3 (no TLS 1.2 escape hatch). Go 1.25's crypto/tls
//     automatically rejects older versions.
//   - CurvePreferences: explicit [X25519, P-256]. Explicit ordering keeps
//     the handshake deterministic and documents the accepted curves.
//   - No CipherSuites field: Go's crypto/tls does not allow configuring
//     TLS 1.3 cipher suites. Its three TLS 1.3 suites (AES-128-GCM-SHA256,
//     AES-256-GCM-SHA384, CHACHA20-POLY1305-SHA256) are always enabled,
//     and the CipherSuites field only affects TLS 1.2 and earlier.
//   - GetCertificate: reads through the holder so SIGHUP rotations take
//     effect on the next new connection without a restart. Setting
//     tls.Config.Certificates directly would pin the first-loaded cert
//     and defeat SIGHUP reload.
func buildServerTLSConfig(holder *certHolder) *tls.Config {
	return &tls.Config{
		MinVersion:       tls.VersionTLS13,
		CurvePreferences: []tls.CurveID{tls.X25519, tls.CurveP256},
		GetCertificate:   holder.GetCertificate,
	}
}

// buildServerTLSConfigWithMTLS extends buildServerTLSConfig with a client-cert
// trust pool for the SCEP/EST mTLS sibling routes.
//
// SCEP RFC 8894 + Intune master bundle Phase 6.5 introduced this for the
// /scep-mtls/<pathID> route; EST RFC 7030 hardening master bundle Phase 2
// extended it so the same TLS listener also serves /.well-known/est-mtls/
// <pathID>. Both protocols' mTLS profiles contribute their trust bundles
// to a UNION pool that the caller (cmd/server/main.go) builds by walking
// every enabled mTLS profile's bundle bytes once. The per-protocol
// handlers re-verify against just THIS profile's bundle (so an EST-mTLS
// bootstrap cert can't enroll against a SCEP-mTLS profile and vice versa).
//
// ClientAuth: VerifyClientCertIfGiven — request a cert during handshake; if
// the client presents one, verify it against the union pool; if absent, the
// request still reaches the handler and the per-route handler decides
// whether to accept. Critical that we do NOT use RequireAndVerifyClientCert
// here — that would break the standard /scep + /.well-known/est routes
// (challenge-password-only / unauth-or-Basic, no client cert expected).
//
// Pass clientCAs == nil to disable mTLS (no profile opted in across either
// protocol). The function then returns the same shape as
// buildServerTLSConfig.
func buildServerTLSConfigWithMTLS(holder *certHolder, clientCAs *x509.CertPool) *tls.Config {
	cfg := buildServerTLSConfig(holder)
	if clientCAs != nil {
		cfg.ClientCAs = clientCAs
		cfg.ClientAuth = tls.VerifyClientCertIfGiven
	}
	return cfg
}

// preflightServerTLS is the fail-loud startup gate for HTTPS. Returns a
// non-nil error when the TLS configuration is missing or the cert+key pair
// cannot be parsed, so the caller refuses to start the control plane
// (HTTPS-everywhere §3 locked decisions: no plaintext HTTP fallback).
//
// Duplicates the emptiness + stat + parse checks in config.Validate() for
// defense in depth, mirroring the pattern established by
// preflightSCEPChallengePassword (which itself duplicates
// config.Validate()'s SCEP check for CWE-306). Extracted into a separate
// function so the gate is unit-testable without booting the full server.
func preflightServerTLS(certPath, keyPath string) error {
	if certPath == "" {
		return fmt.Errorf("CERTCTL_SERVER_TLS_CERT_PATH is empty: HTTPS-only control plane refuses to start (see docs/tls.md)")
	}
	if keyPath == "" {
		return fmt.Errorf("CERTCTL_SERVER_TLS_KEY_PATH is empty: HTTPS-only control plane refuses to start (see docs/tls.md)")
	}
	if _, err := os.Stat(certPath); err != nil {
		return fmt.Errorf("TLS cert file %q unreadable: %w (see docs/tls.md)", certPath, err)
	}
	if _, err := os.Stat(keyPath); err != nil {
		return fmt.Errorf("TLS key file %q unreadable: %w (see docs/tls.md)", keyPath, err)
	}
	if _, err := tls.LoadX509KeyPair(certPath, keyPath); err != nil {
		return fmt.Errorf("TLS cert/key pair invalid (cert=%q key=%q): %w (see docs/tls.md)", certPath, keyPath, err)
	}
	return nil
}
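
The "fail-safe on reload" contract documented on Reload can be exercised with a small standalone sketch. Everything here is an assumption made for illustration: holder is a trimmed, single-goroutine stand-in for certHolder (no mutex), and writePair and demoReload are hypothetical helpers that write throwaway self-signed pairs to a temp directory. It is not the repository's test suite.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"fmt"
	"math/big"
	"os"
	"path/filepath"
	"time"
)

// holder is a trimmed stand-in for certHolder so this sketch compiles alone.
type holder struct {
	cert              *tls.Certificate
	certPath, keyPath string
}

// reload swaps the cert on success and keeps the previous one on error.
func (h *holder) reload() error {
	c, err := tls.LoadX509KeyPair(h.certPath, h.keyPath)
	if err != nil {
		return err
	}
	h.cert = &c
	return nil
}

// serial returns the serial number of the certificate currently held.
func (h *holder) serial() string {
	leaf, err := x509.ParseCertificate(h.cert.Certificate[0])
	if err != nil {
		return ""
	}
	return leaf.SerialNumber.String()
}

// writePair writes a throwaway self-signed cert and key with the given serial.
func writePair(certPath, keyPath string, serial int64) error {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(serial),
		Subject:      pkix.Name{CommonName: "localhost"},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(time.Hour),
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return err
	}
	keyDER, err := x509.MarshalECPrivateKey(key)
	if err != nil {
		return err
	}
	certPEM := pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der})
	keyPEM := pem.EncodeToMemory(&pem.Block{Type: "EC PRIVATE KEY", Bytes: keyDER})
	if err := os.WriteFile(certPath, certPEM, 0o600); err != nil {
		return err
	}
	return os.WriteFile(keyPath, keyPEM, 0o600)
}

// demoReload rotates once successfully, then simulates a half-written rotation.
func demoReload() (afterGood, afterBad string, badFailed bool, err error) {
	dir, err := os.MkdirTemp("", "tls-reload")
	if err != nil {
		return
	}
	defer os.RemoveAll(dir)
	h := &holder{certPath: filepath.Join(dir, "cert.pem"), keyPath: filepath.Join(dir, "key.pem")}
	for _, serial := range []int64{1, 2} { // initial load, then a good rotation
		if err = writePair(h.certPath, h.keyPath, serial); err != nil {
			return
		}
		if err = h.reload(); err != nil {
			return
		}
	}
	afterGood = h.serial()
	// Simulate a half-written rotation: the cert file no longer holds PEM.
	if err = os.WriteFile(h.certPath, []byte("partial"), 0o600); err != nil {
		return
	}
	badFailed = h.reload() != nil
	afterBad = h.serial()
	return
}

func main() {
	fmt.Println(demoReload())
}
```

The point of the sketch is the last three statements of demoReload: the failed reload returns an error, yet the holder still reports the serial from the last good rotation.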