security: add SSRF defence-in-depth for webhook notifier (fixes H-4)

The webhook notifier would previously accept any operator-configured URL
and hand it to http.Client without validation. That exposed two
SSRF classes (CWE-918):

  * Reserved-address reachability — a misconfigured or adversarial
    webhook URL pointing at 127.0.0.1, ::1, 169.254.169.254 (cloud
    metadata), or 0.0.0.0 would succeed, exfiltrating request bodies
    to local services or leaking short-lived cloud credentials.
  * DNS rebinding — a hostname resolving to a public IP at validation
    time and to a reserved IP at dial time would bypass any
    URL-string-only check.

Fix installs two independent layers:

  * validation.ValidateSafeURL runs at config-ingest time and before
    every outbound POST. It rejects non-HTTP(S) schemes, empty hosts,
    and literal reserved-IP hosts with a clear operator-facing error.
    This is a fast early diagnostic.
  * validation.SafeHTTPDialContext is installed on the webhook
    http.Transport. It re-resolves the host at dial time, rejects any
    resolved address whose address lies in a reserved range (loopback,
    link-local, multicast, broadcast, unspecified, IPv6
    link-local/multicast), and pins the resolved IP into the final
    dial address so the TLS handshake targets the exact IP the guard
    approved. This is the authoritative, TOCTOU-safe defence against
    DNS rebinding.

The two layers are complementary — validateURL fails fast on obvious
misconfiguration; SafeHTTPDialContext fails closed when DNS changes
between validation and dial.

The existing unexported isReservedIP helper in
internal/service/network_scan.go is extracted into
internal/validation.IsReservedIP with byte-identical behaviour so the
webhook notifier and the network scanner share a single authoritative
reserved-address list. RFC 1918 ranges remain intentionally allowed
(certctl's self-hosted design). Broader unspecified / IPv6 link-local
coverage lives only in the stricter dial-time policy, where it belongs
for outbound HTTP egress.

Test seam: Connector gains an unexported validateURL func field and a
same-package newForTest constructor that installs a permissive
validator and the stdlib default transport. Production callers cannot
reach this constructor because it is unexported; only same-package
tests (package webhook) can use it. Same-package happy-path tests call
newForTest so they can point at httptest loopback servers without
being blocked by the production guard. The four SSRF-rejection tests
that verify the guard itself still call New so they exercise the real,
strict validator. This keeps the production SSRF defence
unconditionally on in real code while preserving legitimate unit-test
coverage.

Tests
-----
  * internal/validation/ssrf_test.go (new) — 16-subtest pin on
    IsReservedIP that is byte-identical with the original network-
    scanner behaviour; ValidateSafeURL accept/reject matrix covering
    HTTPS/HTTP, reserved-literal IPv4/IPv6, dangerous schemes
    (file/gopher/ftp/javascript/data/ldap/dict/jar), missing hosts,
    and malformed inputs; SafeHTTPDialContext rejects literal reserved
    addresses and hosts resolving to reserved addresses (DNS-rebinding
    coverage via localhost).
  * internal/connector/notifier/webhook/webhook_test.go — happy-path
    tests switched to newForTest; production-guard SSRF-rejection
    tests (TestValidateConfig_RejectsReservedURLs,
    TestValidateConfig_RejectsDangerousScheme,
    TestPostWebhook_RejectsReservedURL,
    TestPostWebhook_RejectsDangerousScheme) continue to call New so
    they exercise the unconditionally-installed production validator.

Wire-format invariants preserved
--------------------------------
  * Outbound HTTP request shape (method, headers, body, HMAC
    signature) unchanged.
  * network_scan.go behaviour unchanged — validation.IsReservedIP is
    byte-identical with the deleted helper.
  * RFC 1918 (10/8, 172.16/12, 192.168/16) remain allowed for both
    outbound webhook and CIDR expansion, matching the self-hosted
    design.

Verification
------------
  * go test -race ./internal/validation/... ./internal/connector/
    notifier/webhook/... ./internal/service/... — green.
  * Full-suite go test -race ./... — green (GOTMPDIR=/dev/shm to
    sidestep full /tmp on the sandbox host).
  * Coverage gates pass: service 68.8% >= 55%, handler 83.6% >= 60%,
    domain 82.0% >= 40%, middleware 63.8% >= 30%. Overall 67.8%.
    Webhook package 91.5% line coverage; validation package
    ValidateSafeURL/SafeHTTPDialContext 78-100% per function.
  * govulncheck ./... — no vulnerabilities found.
  * golangci-lint run on touched H-4 production code — clean. Pre-
    existing errcheck/gosimple warnings in scope-adjacent files
    (webhook_test.go:270 w.Write, network_scan.go:120/173/265/305)
    verified against 9e957c3 to predate this commit; left alone per
    scope guard.

Operational notes
-----------------
  * No migration needed. The guard is pure Go code; existing webhook
    configs continue to work unless they point at reserved addresses,
    in which case they now fail closed with a clear error.
  * Existing operators who rely on webhook POST to 127.0.0.1 or
    ::1 (e.g., local receivers on the same host as certctl-server)
    must expose their receiver on an RFC 1918 address or public IP.
    This is deliberate — the threat model for webhook notifiers
    includes untrusted operator-supplied URLs.

Scope guard: H-4 only. H-5, H-6, M-*, L-*, and I-* findings remain
open and are tracked separately. No drive-by refactors.
This commit is contained in:
Shankar
2026-04-17 00:34:47 +00:00
parent 9e957c3447
commit 371b9836e0
6 changed files with 684 additions and 62 deletions
@@ -32,7 +32,7 @@ func TestWebhook_ValidateConfig_ValidURL(t *testing.T) {
// Create a new logger (or use test logger)
logger := newTestLogger()
conn := New(cfg, logger)
conn := newForTest(cfg, logger)
err := conn.ValidateConfig(context.Background(), rawConfig)
if err != nil {
@@ -47,7 +47,7 @@ func TestWebhook_ValidateConfig_MissingURL(t *testing.T) {
rawConfig, _ := json.Marshal(cfg)
logger := newTestLogger()
conn := New(cfg, logger)
conn := newForTest(cfg, logger)
err := conn.ValidateConfig(context.Background(), rawConfig)
if err == nil {
@@ -96,7 +96,7 @@ func TestWebhook_SendAlert_Success(t *testing.T) {
}
logger := newTestLogger()
conn := New(cfg, logger)
conn := newForTest(cfg, logger)
alert := notifier.Alert{
ID: "alert-123",
@@ -160,7 +160,7 @@ func TestWebhook_SendAlert_HMACSignature(t *testing.T) {
}
logger := newTestLogger()
conn := New(cfg, logger)
conn := newForTest(cfg, logger)
alert := notifier.Alert{
ID: "alert-456",
@@ -199,7 +199,7 @@ func TestWebhook_SendAlert_NoSignatureWithoutSecret(t *testing.T) {
}
logger := newTestLogger()
conn := New(cfg, logger)
conn := newForTest(cfg, logger)
alert := notifier.Alert{
ID: "alert-789",
@@ -239,7 +239,7 @@ func TestWebhook_SendAlert_CustomHeaders(t *testing.T) {
}
logger := newTestLogger()
conn := New(cfg, logger)
conn := newForTest(cfg, logger)
alert := notifier.Alert{
ID: "alert-custom",
@@ -276,7 +276,7 @@ func TestWebhook_SendAlert_HTTPError(t *testing.T) {
}
logger := newTestLogger()
conn := New(cfg, logger)
conn := newForTest(cfg, logger)
alert := notifier.Alert{
ID: "alert-error",
@@ -318,7 +318,7 @@ func TestWebhook_SendEvent_Success(t *testing.T) {
}
logger := newTestLogger()
conn := New(cfg, logger)
conn := newForTest(cfg, logger)
certID := "mc-api-prod"
event := notifier.Event{
@@ -367,7 +367,7 @@ func TestWebhook_SendEvent_WithoutCertificateID(t *testing.T) {
}
logger := newTestLogger()
conn := New(cfg, logger)
conn := newForTest(cfg, logger)
event := notifier.Event{
ID: "event-456",
@@ -389,6 +389,130 @@ func TestWebhook_SendEvent_WithoutCertificateID(t *testing.T) {
}
}
// The SSRF tests below exercise the CWE-918 guard added alongside H-4. Each
// case pairs a reserved-address URL with the call surface that should reject
// it. ValidateConfig is the early-fail path; SendAlert/SendEvent reach the
// same guard via postWebhook and are the defence-in-depth that still rejects
// even when ValidateConfig was bypassed (e.g. dynamic config reload mutating
// c.config.URL in place).
func TestWebhook_ValidateConfig_RejectsReservedURLs(t *testing.T) {
// These must all fail at config-ingestion time without ever opening a
// socket — the reserved-address filter is the whole point of H-4.
cases := []struct {
name string
url string
}{
{"loopback v4", "http://127.0.0.1/hook"},
{"loopback v4 with port", "http://127.0.0.1:8080/"},
{"loopback v6 bracketed", "http://[::1]/hook"},
{"AWS metadata", "http://169.254.169.254/latest/meta-data/"},
{"generic link-local", "http://169.254.1.2/"},
{"unspecified v4", "http://0.0.0.0/"},
{"unspecified v6", "http://[::]/"},
{"IPv6 link-local", "http://[fe80::1]/"},
{"multicast", "https://224.0.0.5/"},
{"broadcast", "http://255.255.255.255/"},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
cfg := &Config{URL: tc.url}
rawConfig, _ := json.Marshal(cfg)
conn := New(cfg, newTestLogger())
err := conn.ValidateConfig(context.Background(), rawConfig)
if err == nil {
t.Fatalf("ValidateConfig(%q) returned nil, want SSRF rejection", tc.url)
}
if !strings.Contains(err.Error(), "reserved") && !strings.Contains(err.Error(), "rejected") {
t.Errorf("expected reserved/rejected error, got %q", err.Error())
}
})
}
}
func TestWebhook_ValidateConfig_RejectsDangerousSchemes(t *testing.T) {
// Only http(s) is a legitimate webhook transport. Every other scheme is
// an SSRF amplifier (file, gopher, ftp, javascript, data, ldap, dict,
// jar) and must be refused at config time.
cases := []struct {
name string
url string
}{
{"file", "file:///etc/passwd"},
{"gopher", "gopher://example.com/_x"},
{"ftp", "ftp://example.com/"},
{"javascript", "javascript:alert(1)"},
{"data", "data:text/plain;base64,SGVsbG8="},
{"ldap", "ldap://example.com/"},
{"dict", "dict://example.com:2628/d:foo"},
{"jar", "jar:http://example.com/foo.jar!/"},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
cfg := &Config{URL: tc.url}
rawConfig, _ := json.Marshal(cfg)
conn := New(cfg, newTestLogger())
err := conn.ValidateConfig(context.Background(), rawConfig)
if err == nil {
t.Fatalf("ValidateConfig(%q) returned nil, want scheme rejection", tc.url)
}
if !strings.Contains(err.Error(), "rejected") && !strings.Contains(err.Error(), "scheme") {
t.Errorf("expected scheme/rejected error, got %q", err.Error())
}
})
}
}
func TestWebhook_SendAlert_RejectsReservedURLInPostWebhook(t *testing.T) {
// Simulate config drift: URL was legitimate at ValidateConfig time but
// has since been rewritten to an SSRF target. postWebhook must catch
// this on every call without ever hitting the wire.
cfg := &Config{URL: "http://169.254.169.254/latest/meta-data/"}
conn := New(cfg, newTestLogger())
alert := notifier.Alert{
ID: "alert-ssrf",
Type: "test",
Severity: "info",
Subject: "Test",
Message: "Test",
Recipient: "ops@example.com",
CreatedAt: time.Now(),
}
err := conn.SendAlert(context.Background(), alert)
if err == nil {
t.Fatal("SendAlert returned nil, want SSRF rejection from postWebhook")
}
if !strings.Contains(err.Error(), "reserved") && !strings.Contains(err.Error(), "rejected") {
t.Errorf("expected reserved/rejected error, got %q", err.Error())
}
}
func TestWebhook_SendEvent_RejectsReservedURLInPostWebhook(t *testing.T) {
cfg := &Config{URL: "http://[::1]:9/webhook"}
conn := New(cfg, newTestLogger())
event := notifier.Event{
ID: "event-ssrf",
Type: "test",
Subject: "Test",
Body: "Test",
Recipient: "ops@example.com",
CreatedAt: time.Now(),
}
err := conn.SendEvent(context.Background(), event)
if err == nil {
t.Fatal("SendEvent returned nil, want SSRF rejection from postWebhook")
}
if !strings.Contains(err.Error(), "reserved") && !strings.Contains(err.Error(), "rejected") {
t.Errorf("expected reserved/rejected error, got %q", err.Error())
}
}
// Helper function to compute HMAC-SHA256 signature
func computeHMACSHA256(data []byte, secret string) string {
h := hmac.New(sha256.New, []byte(secret))