security(email): sanitize body fields against content injection (CodeQL #11, CWE-640)

CodeQL alert #11 (go/email-injection, CWE-640 / OWASP Content Spoofing)
flagged the wc.Write(message) sink at
internal/connector/notifier/email/email.go:208 because
attacker-controllable fields flow into the email body unchecked.

Threat model:

Headers (From, To, Subject) were already protected by
validation.ValidateHeaderValue (CWE-113 SMTP header injection, closed in
commit 3853b74). The remaining gap was the body. An attacker controls
multiple fields that surface to the body of alert/event notifications:

  - alert.Subject, alert.Message
  - event.Subject, event.Body, *event.CertificateID
  - alert.Metadata + event.Metadata key/value pairs

These can carry CR/LF (forged 'Reply-To: attacker@evil.com' inside the
body that recipients skim), NUL bytes (RFC 5321 4.5.2 violation that some
MTAs truncate at), bidi-override Unicode (visually-spoofable URLs),
zero-width / invisible Unicode (phishing), or malformed UTF-8 (Go emits
U+FFFD which becomes a glyph in mail clients).

The HTML email path (digest service) already uses html/template upstream
and is safe via contextual auto-escape. This commit closes the plaintext
path.

Fix:

internal/validation/headers.go gains SanitizeEmailBodyValue, a sanitizer
that NEVER errors (the right contract for body content; over-eager
rejection drops operator notifications) and scrubs:

  - NUL bytes (stripped entirely)
  - bare CR / LF (replaced with space; single fields should never carry
    their own line breaks, and the surrounding template handles
    legitimate CRLFs)
  - C0 control chars < 0x20 except TAB
  - DEL (0x7F) + C1 control chars (0x80-0x9F)
  - U+FFFD (defense in depth: malformed UTF-8 -> Go emits this; strip so
    attacker-planted invalid bytes don't survive as an arbitrary glyph)
  - Bidi-override Unicode (U+202A..U+202E, U+2066..U+2069)
  - Zero-width / invisible Unicode (U+200B..U+200D, U+2060..U+2063,
    U+FEFF, U+180E)
  - Catch-all unicode.IsControl for anything not enumerated above

The codepoint table uses numeric ranges rather than rune-literal switch
cases: Go source rejects literal invisible characters (BOM U+FEFF)
mid-file, so the table compares against numeric values.

internal/connector/notifier/email/email.go applies the sanitizer at every
interpolation site:

  - formatAlertBody: alert.ID/Type/Severity/Subject/Message
    (CreatedAt is time.Time -> RFC3339, structural, not sanitized)
  - formatEventBody: event.ID/Type/Subject/Body, *CertificateID
    (CreatedAt structural, not sanitized)
  - formatMetadata: both keys and values

The sendEmail / formatEmailMessage call sites continue to validate
headers (From / To / Subject) via the existing ValidateHeaderValue
fail-closed gate; the new sanitizer is body-side only.

Tests (internal/validation/headers_test.go):

  TestSanitizeEmailBodyValue_PreservesSafeInput
    Pin: ordinary ASCII, UTF-8 multibyte (résumé / 日本語 / مرحبا),
    tabs, common cert DNs, URLs all flow through unchanged.

  TestSanitizeEmailBodyValue_StripsControlChars
    Table-driven across NUL, bare LF/CR, CRLF, BEL, backspace, DEL,
    C1 (U+0080 / U+009F), U+FFFD, TAB-preserve.

  TestSanitizeEmailBodyValue_StripsBidiOverride
    7 attacker payloads (RLO, LRO, LRI, zero-width space, ZWNJ, BOM,
    MVS); each must produce a non-identity output.

  TestSanitizeEmailBodyValue_ContentSpoofingScenario
    The CodeQL example case:
    'alert\r\nReply-To: attacker@evil.com\r\n Click https://evil.example.com/reset'
    Verify NO CR/LF survives.

Verified locally:

  gofmt: clean.
  go vet ./...: exit 0.
  go test -short -count=1 ./internal/validation/...: ok 0.374s
  go test -short -count=1 ./internal/connector/notifier/email/...: ok 0.186s

Reference: https://github.com/certctl-io/certctl/security/code-scanning/11

Closes CodeQL alert #11 (go/email-injection).
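
For readers without internal/validation/headers.go at hand, the scrubbing rules listed in the commit message can be sketched roughly as follows. This is an illustrative reconstruction, not the committed implementation: the invisibleRanges table, the isInvisible helper, and the use of strings.Map are assumptions, and only the function name SanitizeEmailBodyValue and the codepoint ranges come from the message itself.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// invisibleRanges is a hypothetical table of the bidi-override and
// zero-width codepoints named in the commit message. It uses numeric
// values rather than rune literals, as the message describes.
var invisibleRanges = [][2]rune{
	{0x180E, 0x180E}, // Mongolian vowel separator
	{0x200B, 0x200D}, // zero-width space, ZWNJ, ZWJ
	{0x202A, 0x202E}, // bidi embeddings and overrides
	{0x2060, 0x2063}, // word joiner and invisible operators
	{0x2066, 0x2069}, // bidi isolates
	{0xFEFF, 0xFEFF}, // BOM / zero-width no-break space
}

// isInvisible reports whether r falls in one of the ranges above.
func isInvisible(r rune) bool {
	for _, rg := range invisibleRanges {
		if r >= rg[0] && r <= rg[1] {
			return true
		}
	}
	return false
}

// SanitizeEmailBodyValue is a sketch of a never-failing body sanitizer.
// It returns a cleaned string and has no error path.
func SanitizeEmailBodyValue(s string) string {
	return strings.Map(func(r rune) rune {
		switch {
		case r == '\r' || r == '\n':
			return ' ' // a single field must not carry its own line breaks
		case r == '\t':
			return r // TAB is the one C0 control that is preserved
		case r == 0xFFFD:
			return -1 // invalid UTF-8 bytes decode to U+FFFD; drop them
		case isInvisible(r):
			return -1
		case unicode.IsControl(r):
			return -1 // NUL, remaining C0, DEL, and C1 controls
		}
		return r
	}, s)
}

func main() {
	// Illustrative payload in the spirit of the content-spoofing test.
	payload := "alert\r\nReply-To: attacker@evil.com\r\nClick https://evil.example.com/reset"
	fmt.Printf("%q\n", SanitizeEmailBodyValue(payload))
}
```

In this sketch a CRLF pair becomes two spaces, one per character; whether the committed sanitizer collapses the pair into a single space is not stated in the message.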
2026-06-07 15:01:32 +00:00 · 2026-05-04 04:56:13 +00:00
parent e50ba168ac
commit 23c593089d
3 changed files with 297 additions and 4 deletions
@@ -356,6 +356,21 @@ func (c *Connector) formatHTMLEmailMessage(from, to, subject, htmlBody string) (
 }

 // formatAlertBody formats an alert notification as email body text.
+//
+// CodeQL go/email-injection (CWE-640 / OWASP Content Spoofing) defense:
+// every field interpolated into the body that may carry attacker-
+// controlled content (alert.Subject, alert.Message, alert.Metadata
+// keys and values, alert.ID / Type / Severity which originate from the
+// API surface) is routed through validation.SanitizeEmailBodyValue before
+// formatting. The sanitizer strips NUL bytes (RFC 5321 §4.5.2 violation),
+// bare CR/LF within a single field (forged header-boundary attempts),
+// bidi-override Unicode (visually-spoofable URLs), zero-width / invisible
+// codepoints, and C0/C1 control chars. CreatedAt is a time.Time —
+// formatted via RFC3339; not user-controllable so unsanitized.
+//
+// Header values (From, To, Subject) are protected separately by
+// validation.ValidateHeaderValue at sendEmail entry (CWE-113 SMTP header
+// injection — see commit 9e957c3).
 func (c *Connector) formatAlertBody(alert notifier.Alert) string {
 	body := fmt.Sprintf(`
 Certificate Alert Notification
@@ -372,16 +387,29 @@ Message:
 %s

 %s
-`, alert.ID, alert.Type, alert.Severity, alert.CreatedAt.Format(time.RFC3339), alert.Subject, alert.Message, c.formatMetadata(alert.Metadata))
+`,
+		validation.SanitizeEmailBodyValue(alert.ID),
+		validation.SanitizeEmailBodyValue(alert.Type),
+		validation.SanitizeEmailBodyValue(alert.Severity),
+		alert.CreatedAt.Format(time.RFC3339),
+		validation.SanitizeEmailBodyValue(alert.Subject),
+		validation.SanitizeEmailBodyValue(alert.Message),
+		c.formatMetadata(alert.Metadata),
+	)

 	return body
 }

 // formatEventBody formats an event notification as email body text.
+//
+// Same CodeQL go/email-injection mitigation as formatAlertBody — every
+// user-controllable interpolated field routes through
+// validation.SanitizeEmailBodyValue. CreatedAt is unsanitized (time.Time
+// → RFC3339 is structural, not user-controllable).
 func (c *Connector) formatEventBody(event notifier.Event) string {
 	certInfo := ""
 	if event.CertificateID != nil {
-		certInfo = fmt.Sprintf("Certificate ID: %s\n", *event.CertificateID)
+		certInfo = fmt.Sprintf("Certificate ID: %s\n", validation.SanitizeEmailBodyValue(*event.CertificateID))
 	}

 	body := fmt.Sprintf(`
@@ -398,12 +426,27 @@ Body:
 %s

 %s
-`, event.ID, event.Type, event.CreatedAt.Format(time.RFC3339), certInfo, event.Subject, event.Body, c.formatMetadata(event.Metadata))
+`,
+		validation.SanitizeEmailBodyValue(event.ID),
+		validation.SanitizeEmailBodyValue(event.Type),
+		event.CreatedAt.Format(time.RFC3339),
+		certInfo,
+		validation.SanitizeEmailBodyValue(event.Subject),
+		validation.SanitizeEmailBodyValue(event.Body),
+		c.formatMetadata(event.Metadata),
+	)

 	return body
 }

 // formatMetadata formats metadata as a readable string.
+//
+// Both keys and values can carry attacker-controlled content (cert
+// subject DN fragments, discovered cert metadata, owner/team labels —
+// all originate from API surfaces an attacker may influence). Both are
+// routed through validation.SanitizeEmailBodyValue. Closes the
+// CodeQL go/email-injection finding alongside formatAlertBody +
+// formatEventBody.
 func (c *Connector) formatMetadata(metadata map[string]string) string {
 	if len(metadata) == 0 {
 		return ""
@@ -411,7 +454,10 @@ func (c *Connector) formatMetadata(metadata map[string]string) string {

 	metadataStr := "\nMetadata:\n"
 	for key, value := range metadata {
-		metadataStr += fmt.Sprintf("  %s: %s\n", key, value)
+		metadataStr += fmt.Sprintf("  %s: %s\n",
+			validation.SanitizeEmailBodyValue(key),
+			validation.SanitizeEmailBodyValue(value),
+		)
 	}

 	return metadataStr
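
The call-site pattern in this diff (sanitize each value immediately before it is interpolated) can be tried in isolation with the sketch below. The scrub function is a deliberately simplified, hypothetical stand-in for validation.SanitizeEmailBodyValue that covers only CR/LF and control characters, and formatMetadata here is a free function using strings.Builder rather than the Connector method shown above.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// scrub is a simplified stand-in for validation.SanitizeEmailBodyValue
// (an assumption for this example): CR/LF become spaces, TAB is kept,
// and every other control character is dropped.
func scrub(s string) string {
	return strings.Map(func(r rune) rune {
		switch {
		case r == '\r' || r == '\n':
			return ' '
		case r == '\t':
			return r
		case unicode.IsControl(r):
			return -1
		}
		return r
	}, s)
}

// formatMetadata mirrors the shape of the patched function: both the key
// and the value are sanitized at the point of interpolation, so the only
// line breaks in the output are the ones the template itself writes.
func formatMetadata(metadata map[string]string) string {
	if len(metadata) == 0 {
		return ""
	}
	var b strings.Builder
	b.WriteString("\nMetadata:\n")
	for key, value := range metadata {
		fmt.Fprintf(&b, "  %s: %s\n", scrub(key), scrub(value))
	}
	return b.String()
}

func main() {
	// The injected CRLF cannot start a new line in the rendered body.
	fmt.Print(formatMetadata(map[string]string{
		"owner": "team-a\r\nReply-To: attacker@evil.com",
	}))
}
```

Sanitizing at the interpolation site, rather than once over the finished body, keeps the template's own newlines intact while still flattening any line breaks an attacker plants inside a field.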