ci(codeql): teach analyzer about ValidateSafeURL SSRF barrier

Closes CodeQL alert #23 (go/request-forgery, Critical) at the structural level — by telling CodeQL what the runtime code already does — rather than via per-line `// codeql[...]` suppressions. Background. internal/service/scep_probe.go:232 calls client.Do(req) where the request URL is built from operator-supplied input. The runtime defense is two-layer: 1. validation.ValidateSafeURL(rawURL) at scep_probe.go:86 rejects non-http(s) schemes, empty hosts, literal-IP hosts in reserved ranges (loopback, link-local incl. cloud metadata 169.254.169.254, multicast, broadcast, unspecified, IPv6 link-local), and DNS names whose A/AAAA resolution returns any reserved IP. RFC 1918 is intentionally NOT blocked — see internal/validation/ssrf.go:17-21 for the design rationale. 2. validation.SafeHTTPDialContext on the http.Transport (line 254) re-resolves at dial time, applies the same reserved-IP set, and pins the dial to a literal non-reserved IP — defeating DNS rebinding between validate and dial. CodeQL's go/request-forgery query is a syntactic taint-tracking rule with no built-in knowledge of either validator, so it reports the finding even though the runtime is correctly defended. The fix. Add a Models-as-Data (MaD) extension at .github/codeql/ declaring ValidateSafeURL as a request-forgery barrier. The barrier applies to Argument[0] (the URL parameter), which means the analyzer treats every URL flowing through ValidateSafeURL as sanitized for the request-forgery taint set. After this lands: - Alert #23 dismisses at scep_probe.go:232. - The same model applies to the second site of this exact shape — webhook notifier's outbound client.Do (internal/connector/ notifier/webhook/webhook.go) — without per-line annotations. - Future code that flows operator URLs through ValidateSafeURL inherits the barrier automatically. This is the structural fix, not a band-aid: - Band-aid (rejected): `// codeql[go/request-forgery]` suppression on line 232. Suppresses one alert; doesn't teach the analyzer. Webhook notifier would need the same comment when its sibling rule landing fires. - Structural (this change): teach CodeQL via models-as-data, in config checked into the repo, that lives next to the workflow that uses it. The validators ARE sanitizers in the runtime — this PR makes the analyzer's model match reality. Files: - .github/codeql/qlpack.yml — local model pack manifest, declares extensionTargets: codeql/go-all: '*' - .github/codeql/models/request-forgery-sanitizers.model.yml — barrierModel row for validation.ValidateSafeURL Argument[0] / request-forgery taint kind / manual provenance - .github/codeql/codeql-config.yml — references the local pack + keeps security-and-quality query suite scope - .github/workflows/codeql.yml — Initialize CodeQL step picks up config-file: ./.github/codeql/codeql-config.yml. The existing `queries: security-and-quality` line stays so even if the config file fails to load, the suite scope is preserved. - docs/architecture.md::Input Validation and SSRF Protection — extended to name the egress validators (ValidateSafeURL + SafeHTTPDialContext) and the call sites (SCEP probe + webhook notifier). Closes the docs gap surfaced during the audit; the egress threat-model previously lived only in source comments. Requires CodeQL CLI ≥ 2.25.2 for the barrierModel extensible predicate (Go MaD support added 2026-04-21). github/codeql-action@v3 ships a recent enough CLI by default; if a future analysis fails with "unknown extensible predicate barrierModel", the action's CLI has regressed below 2.25.2 — pin a newer action version rather than reverting this pack. Documented inline in qlpack.yml. References: - https://codeql.github.com/docs/codeql-language-guides/customizing-library-models-for-go/ - https://github.blog/changelog/2026-04-21-codeql-now-supports-sanitizers-and-validators-in-models-as-data/
2026-06-07 22:01:36 +00:00 · 2026-05-01 00:28:26 +00:00
parent 69266c8ed2
commit d8026d5f67
5 changed files with 146 additions and 0 deletions
@@ -0,0 +1,36 @@
+# CodeQL analysis config for certctl.
+#
+# Loaded by .github/workflows/codeql.yml's `Initialize CodeQL` step via
+# `config-file:`. Two responsibilities:
+#
+#  1. Re-declare the query suite the workflow runs (security-and-quality)
+#     so that disabling the action's default suite via `disable-default-
+#     queries: true` doesn't accidentally drop coverage.
+#
+#  2. Load the local model pack at .github/codeql/, which adds project-
+#     specific Models-as-Data extensions (sanitizers, sinks, summaries)
+#     for the standard Go queries. See ./qlpack.yml for the full motivation.
+#
+# Path-ignore is intentionally empty — every path that ships with the
+# repo is in scope. Test files are NOT excluded; if a vulnerability
+# regresses in a test fixture and is later promoted to production, we
+# want CodeQL to catch it on first appearance.
+
+name: certctl-codeql
+
+# Run the same query suite the workflow has been running pre-config-file:
+# security-and-quality (security findings + maintainability/correctness).
+# Listing it here ensures the suite stays in scope even if the action's
+# default behavior shifts.
+queries:
+  - uses: security-and-quality
+
+# Load the local model pack. This is what makes the SSRF sanitizer
+# barrier rows in models/request-forgery-sanitizers.model.yml apply to
+# the standard go/request-forgery query.
+#
+# `${{ }}` is not used here — the path is relative to the config file's
+# directory, not to the repo root, per CodeQL action docs.
+packs:
+  go:
+    - ./
@@ -0,0 +1,57 @@
+# Models-as-Data sanitizer rows for the go/request-forgery query.
+#
+# Each row in `data` is a 9-tuple matching the `barrierModel` extensible
+# predicate signature for Go:
+#
+#   (package, type, subtypes, name, signature, ext, output, kind, provenance)
+#
+# Where:
+#   - package    — Go import path of the sanitizer
+#   - type       — receiver type ("" for package-level functions)
+#   - subtypes   — false for non-method functions; true to apply to subtypes
+#   - name       — function/method name
+#   - signature  — empty for Go (the column is not used)
+#   - ext        — empty (reserved)
+#   - output     — access path that becomes the barrier; "Argument[N]" means
+#                  the Nth argument is sanitized after the call
+#   - kind       — taint kind the barrier applies to ("request-forgery" for
+#                  the go/request-forgery query)
+#   - provenance — origin tag for the model row ("manual" — hand-authored)
+#
+# References:
+#   - https://codeql.github.com/docs/codeql-language-guides/customizing-library-models-for-go/
+#   - https://github.blog/changelog/2026-04-21-codeql-now-supports-sanitizers-and-validators-in-models-as-data/
+
+extensions:
+  # ---------------------------------------------------------------------------
+  # validation.ValidateSafeURL — primary egress URL validator.
+  # ---------------------------------------------------------------------------
+  # Signature: func ValidateSafeURL(rawURL string) error
+  #
+  # Rejects (returns non-nil error for):
+  #   - non-http/https schemes (file://, gopher://, ftp://, data:, etc.)
+  #   - URLs missing a host
+  #   - literal-IP hosts in: 127/8 + ::1 (loopback), 169.254.0.0/16 (link-
+  #     local incl. AWS/Azure/GCP cloud metadata at 169.254.169.254),
+  #     224.0.0.0/4 + ff00::/8 (multicast), 255.255.255.255 (broadcast),
+  #     0.0.0.0 + :: (unspecified), fe80::/10 (IPv6 link-local)
+  #   - DNS names whose A/AAAA resolution returns any IP in the set above
+  #
+  # Source of truth: internal/validation/ssrf.go (ValidateSafeURL +
+  # IsReservedIP + isReservedIPForDial). RFC 1918 (10/8, 172.16/12,
+  # 192.168/16) is intentionally NOT blocked — see the comment block at
+  # ssrf.go:17-21 for the design rationale.
+  #
+  # The companion runtime defense is SafeHTTPDialContext (installed on the
+  # http.Transport via http.Transport.DialContext) which re-resolves the
+  # host at dial time and pins the dial to a literal non-reserved IP,
+  # defeating DNS rebinding. SafeHTTPDialContext returns a closure rather
+  # than acting as a direct sanitizer in dataflow terms, so it isn't
+  # modeled here — but ValidateSafeURL alone is sufficient to dismiss the
+  # request-forgery alerts at the call sites that use it (scep_probe.go,
+  # webhook.go).
+  - addsTo:
+      pack: codeql/go-all
+      extensible: barrierModel
+    data:
+      - ["github.com/shankar0123/certctl/internal/validation", "", false, "ValidateSafeURL", "", "", "Argument[0]", "request-forgery", "manual"]
@@ -0,0 +1,42 @@
+# certctl CodeQL model pack — extends the standard Go queries with project-
+# specific data-flow knowledge (sanitizers, sinks, summaries).
+#
+# Why this exists: CodeQL's standard `go/request-forgery` query is a syntactic
+# taint-tracking rule. It traces operator-supplied URLs into HTTP egress sinks
+# (`http.Client.Do`) and reports — but it has no built-in knowledge of
+# certctl's `internal/validation.ValidateSafeURL` SSRF guard. The validator
+# IS a sanitizer (rejects loopback, link-local incl. cloud metadata
+# 169.254.169.254, multicast, broadcast, unspecified, IPv6 link-local;
+# rejects DNS names whose A/AAAA records resolve into any of those ranges)
+# but CodeQL doesn't know that, so the analyzer reports a finding the
+# runtime defense already mitigates.
+#
+# This pack uses Models-as-Data (MaD) extensions to declare the validator as
+# a barrier for the request-forgery query. After this pack is loaded:
+#   - The alert at internal/service/scep_probe.go:232 (CodeQL #23) is
+#     dismissed at source, not via per-line `// codeql[...]` suppression.
+#   - The same model applies to the second site of this shape — webhook
+#     notifier's outbound `client.Do` (internal/connector/notifier/webhook/
+#     webhook.go) — without per-line annotations.
+#   - Future code that flows operator URLs through ValidateSafeURL gets the
+#     same treatment automatically.
+#
+# Loading: codeql-config.yml's `packs:` field references this pack by its
+# `name` below. The `extensionTargets:` map declares which upstream pack the
+# extension data plugs into (codeql/go-all is the Go standard library pack).
+# The `dataExtensions:` glob matches the .model.yml files in models/.
+#
+# MaD `barrierModel` extension was added for Go in CodeQL 2.25.2 (2026-04-21).
+# `github/codeql-action@v3` (pinned in .github/workflows/codeql.yml) pulls a
+# CLI version >= 2.25.2 by default. If a future analysis fails with
+# "unknown extensible predicate barrierModel", the action's CLI version has
+# regressed below 2.25.2 — pin a newer action version rather than reverting
+# this pack.
+
+name: shankar0123/certctl-codeql-models
+version: 0.0.1
+library: true
+extensionTargets:
+  codeql/go-all: '*'
+dataExtensions:
+  - models/*.model.yml
@@ -70,6 +70,15 @@ jobs:
          # maintainability/correctness issues that the smaller security-extended
          # suite skips. Comparable scope to what Aikido / SonarCloud run.
          queries: security-and-quality
+          # Custom config + model pack at .github/codeql/. The pack declares
+          # internal/validation.ValidateSafeURL as a request-forgery barrier
+          # via Models-as-Data, dismissing the alert that fires at every site
+          # using the validator (scep_probe.go, webhook.go) without per-line
+          # `// codeql[...]` suppressions. See .github/codeql/qlpack.yml for
+          # the full motivation. Requires CodeQL CLI ≥ 2.25.2 for the
+          # barrierModel extension; codeql-action@v3 ships a recent enough
+          # CLI by default.
+          config-file: ./.github/codeql/codeql-config.yml

      - name: Autobuild
        uses: github/codeql-action/autobuild@v3