Revert CodeQL custom config + sanitizer model — leave alert #23 open

Reverts:
  482e952 ci(codeql): rewire local model pack discovery — fix 1122f5a silent no-op
  1122f5a ci(codeql): teach analyzer about ValidateSafeURL SSRF barrier

Net: drops .github/codeql/ entirely; restores the codeql.yml workflow
and the docs/architecture.md::Input Validation and SSRF Protection
section to their pre-1122f5a state. Alert #23 (go/request-forgery,
Critical) at internal/service/scep_probe.go:232 stays OPEN to be
resolved later.

Why this revert exists. The original Option A (model pack barrier
declaration) was the right idea on paper — teach the analyzer that
internal/validation.ValidateSafeURL sanitizes the URL argument so
the request-forgery taint trace stops there. Two iterations in
(1122f5a + 482e952), the pack still wasn't loading:

  - 1122f5a used `packs: { go: ['./'] }` in codeql-config.yml. That
    field expects pack names, not paths; the local pack silently
    never registered. CodeQL ran clean but emitted the same alert.

  - 482e952 restructured into .github/codeql/certctl-models/ + named
    the pack + added `additional-packs: .github/codeql` to the action
    init step. Surface looked correct against the pattern I'd
    researched (vscode-codeql, CodeQL docs). But:

      Warning: Unexpected input(s) 'additional-packs',
      valid inputs are [..., packs, ...]

      A fatal error occurred:
      'shankar0123/certctl-models' not found in the registry
      'https://ghcr.io/v2/'.

    `additional-packs` is not a valid input on github/codeql-
    action/init@v3 (verified directly against init/action.yml on
    that branch). Without a valid path-resolver input, the CLI
    fell back to the public registry, where the pack obviously
    isn't published. CodeQL run #56 fatal-errored.

The next iteration would have been: codeql-workspace.yml at the
repo root, OR convert to a query pack referenced via `queries:
./path`, OR publish to GHCR, OR drop MaD and write custom QL.
Each is its own incremental commit with its own failure modes I
can't pre-validate without a CI push, against a `barrierModel`
feature for Go that's too new (added 2026-04-21) to have shipped
public examples to copy from.

Honest cost-benefit. The runtime at scep_probe.go:232 is correct
on day one — `ValidateSafeURL` rejects reserved-IP targets at the
service entry; `SafeHTTPDialContext` re-resolves at dial time and
pins to a literal non-reserved IP, defeating DNS rebinding.
CodeQL is reporting a known-class false positive on a known-good
sanitizer pattern. The cost of teaching CodeQL about a 2-site
validator (this + webhook notifier's client.Do) — multiple
iterations of pack-discovery infrastructure, a `.github/codeql/`
tree to maintain, version-tracking against codeql-action and
CodeQL-CLI updates — exceeds the benefit of silencing those 2
alerts.

The right path forward, when capacity exists: either land a
short justified `// codeql[go/request-forgery]` annotation at
each of the 2 sites with a comment block citing ValidateSafeURL
+ SafeHTTPDialContext, OR dismiss alert #23 in the GitHub
Security UI as "won't fix — false positive" with the same
justification in the dismissal comment. Both are real fixes for
the underlying problem (analyzer's model differs from runtime
reality at known-safe call sites). Neither requires new CI
infrastructure.

Until then, the alert stays open. The Security tab is a public
signal — anyone reviewing the certctl repo sees that we've left
this finding visible rather than hidden it via config. That's
itself a security-posture statement.

Specific files restored:
  - .github/workflows/codeql.yml: drops `config-file:` and
    `additional-packs:` from Initialize CodeQL step. Workflow is
    byte-equivalent to its pre-1122f5a state (verified).
  - .github/codeql/: directory removed (3 files: qlpack.yml,
    codeql-config.yml, certctl-models/models/*.model.yml).
  - docs/architecture.md::Input Validation and SSRF Protection:
    drops the "Outbound HTTP egress" paragraph that was added in
    1122f5a. The original section's coverage of shell input
    validators + network-scanner reserved-IP filter remains
    intact — that's what was there before.

Other commits between 1122f5a and now (c4157fd — encryption-key
fix + H-1 regression guard) are PRESERVED. They're unrelated to
CodeQL and remain valid.
This commit is contained in:
shankar0123
2026-05-01 01:28:54 +00:00
parent 482e952dde
commit e06447b763
5 changed files with 0 additions and 178 deletions
-2
View File
@@ -993,8 +993,6 @@ Audit recording is async (via goroutine) so it never blocks the HTTP response. I
All shell-facing inputs (connector scripts, domain names, ACME tokens) are validated through `internal/validation/command.go` before reaching shell execution. `ValidateShellCommand()` denies all shell metacharacters. `ValidateDomainName()` enforces RFC 1123. `ValidateACMEToken()` restricts to base64url characters. The network scanner filters reserved IP ranges (loopback, link-local including cloud metadata 169.254.169.254, multicast, broadcast) to prevent SSRF, while preserving RFC 1918 private ranges for legitimate internal scanning.
**Outbound HTTP egress** is defended in two layers, both implemented in `internal/validation/ssrf.go`. Layer one is `ValidateSafeURL(rawURL)`, called at the surface of every outbound HTTP path that accepts an operator-supplied URL. It rejects non-`http(s)` schemes, empty hosts, literal-IP hosts in reserved ranges (loopback `127/8` + `::1`, IPv4 link-local `169.254/16` including AWS/Azure/GCP cloud metadata at `169.254.169.254`, IPv4 multicast `224/4`, IPv4 broadcast `255.255.255.255`, unspecified `0.0.0.0` + `::`, and IPv6 link-local `fe80::/10` + IPv6 multicast `ff00::/8`), and DNS names whose `A`/`AAAA` resolution returns any IP in that set. Layer two is `SafeHTTPDialContext(timeout)`, installed on the `http.Transport.DialContext` of every long-lived outbound HTTP client — it re-resolves the target host at dial time, applies the same reserved-IP set, and pins the TCP dial to a literal non-reserved IP so a racing DNS rebind cannot redirect the connection between validation and dial. RFC 1918 ranges (`10/8`, `172.16/12`, `192.168/16`) are intentionally NOT treated as reserved — certctl is designed to manage certificates inside private networks, and filtering them would break the primary use case. Current call sites: the SCEP probe in the network scanner (`internal/service/scep_probe.go`) and the webhook notifier (`internal/connector/notifier/webhook/`); both wire `ValidateSafeURL` at config + request entry and `SafeHTTPDialContext` on their dedicated `http.Client`. The egress validators are also declared as request-forgery barriers in the CodeQL Models-as-Data extension at `.github/codeql/models/`, so the static analyzer's view of the dataflow matches the runtime defense.
### Request Body Size Limits
All incoming HTTP request bodies are capped by `http.MaxBytesReader` middleware (default 1MB, configurable via `CERTCTL_MAX_BODY_SIZE`). Requests exceeding the limit receive a 413 Request Entity Too Large response. The middleware is positioned before authentication in the chain so oversized payloads are rejected early, before any auth processing or database work occurs. Requests without bodies (GET, HEAD, nil body) skip the limit check.