Revert CodeQL custom config + sanitizer model — leave alert #23 open

Reverts:
  482e952 ci(codeql): rewire local model pack discovery — fix 1122f5a silent no-op
  1122f5a ci(codeql): teach analyzer about ValidateSafeURL SSRF barrier

Net: drops .github/codeql/ entirely; restores the codeql.yml workflow
and the docs/architecture.md::Input Validation and SSRF Protection
section to their pre-1122f5a state. Alert #23 (go/request-forgery,
Critical) at internal/service/scep_probe.go:232 stays OPEN to be
resolved later.

Why this revert exists. The original Option A (model pack barrier
declaration) was the right idea on paper — teach the analyzer that
internal/validation.ValidateSafeURL sanitizes the URL argument so
the request-forgery taint trace stops there. Two iterations in
(1122f5a + 482e952), the pack still wasn't loading:

  - 1122f5a used `packs: { go: ['./'] }` in codeql-config.yml. That
    field expects pack names, not paths; the local pack silently
    never registered. CodeQL ran clean but emitted the same alert.

  - 482e952 restructured into .github/codeql/certctl-models/ + named
    the pack + added `additional-packs: .github/codeql` to the action
    init step. Surface looked correct against the pattern I'd
    researched (vscode-codeql, CodeQL docs). But:

      Warning: Unexpected input(s) 'additional-packs',
      valid inputs are [..., packs, ...]

      A fatal error occurred:
      'shankar0123/certctl-models' not found in the registry
      'https://ghcr.io/v2/'.

    `additional-packs` is not a valid input on github/codeql-
    action/init@v3 (verified directly against init/action.yml on
    that branch). Without a valid path-resolver input, the CLI
    fell back to the public registry, where the pack obviously
    isn't published. CodeQL run #56 fatal-errored.

The next iteration would have been: codeql-workspace.yml at the
repo root, OR convert to a query pack referenced via `queries:
./path`, OR publish to GHCR, OR drop MaD and write custom QL.
Each is its own incremental commit with its own failure modes I
can't pre-validate without a CI push, against a `barrierModel`
feature for Go that's too new (added 2026-04-21) to have shipped
public examples to copy from.

Honest cost-benefit. The runtime at scep_probe.go:232 is correct
on day one — `ValidateSafeURL` rejects reserved-IP targets at the
service entry; `SafeHTTPDialContext` re-resolves at dial time and
pins to a literal non-reserved IP, defeating DNS rebinding.
CodeQL is reporting a known-class false positive on a known-good
sanitizer pattern. The cost of teaching CodeQL about a 2-site
validator (this + webhook notifier's client.Do) — multiple
iterations of pack-discovery infrastructure, a `.github/codeql/`
tree to maintain, version-tracking against codeql-action and
CodeQL-CLI updates — exceeds the benefit of silencing those 2
alerts.

The right path forward, when capacity exists: either land a
short justified `// codeql[go/request-forgery]` annotation at
each of the 2 sites with a comment block citing ValidateSafeURL
+ SafeHTTPDialContext, OR dismiss alert #23 in the GitHub
Security UI as "won't fix — false positive" with the same
justification in the dismissal comment. Both are real fixes for
the underlying problem (analyzer's model differs from runtime
reality at known-safe call sites). Neither requires new CI
infrastructure.

Until then, the alert stays open. The Security tab is a public
signal — anyone reviewing the certctl repo sees that we've left
this finding visible rather than hidden it via config. That's
itself a security-posture statement.

Specific files restored:
  - .github/workflows/codeql.yml: drops `config-file:` and
    `additional-packs:` from Initialize CodeQL step. Workflow is
    byte-equivalent to its pre-1122f5a state (verified).
  - .github/codeql/: directory removed (3 files: qlpack.yml,
    codeql-config.yml, certctl-models/models/*.model.yml).
  - docs/architecture.md::Input Validation and SSRF Protection:
    drops the "Outbound HTTP egress" paragraph that was added in
    1122f5a. The original section's coverage of shell input
    validators + network-scanner reserved-IP filter remains
    intact — that's what was there before.

Other commits between 1122f5a and now (c4157fd — encryption-key
fix + H-1 regression guard) are PRESERVED. They're unrelated to
CodeQL and remain valid.
This commit is contained in:
shankar0123
2026-05-01 01:28:54 +00:00
parent 482e952dde
commit e06447b763
5 changed files with 0 additions and 178 deletions
@@ -1,61 +0,0 @@
# Models-as-Data sanitizer rows for the go/request-forgery query.
#
# Each row in `data` is a 9-tuple matching the `barrierModel` extensible
# predicate signature for Go:
#
# (package, type, subtypes, name, signature, ext, output, kind, provenance)
#
# Where:
# - package — Go import path of the sanitizer
# - type — receiver type ("" for package-level functions)
# - subtypes — false for non-method functions; true to apply to subtypes
# - name — function/method name
# - signature — empty for Go (the column is not used)
# - ext — empty (reserved)
# - output — access path that becomes the barrier; "Argument[N]" means
# the Nth argument is sanitized after the call
# - kind — taint kind the barrier applies to ("request-forgery" for
# the go/request-forgery query)
# - provenance — origin tag for the model row ("manual" — hand-authored)
#
# References:
# - https://codeql.github.com/docs/codeql-language-guides/customizing-library-models-for-go/
# - https://github.blog/changelog/2026-04-21-codeql-now-supports-sanitizers-and-validators-in-models-as-data/
extensions:
# ---------------------------------------------------------------------------
# validation.ValidateSafeURL — primary egress URL validator.
# ---------------------------------------------------------------------------
# Signature: func ValidateSafeURL(rawURL string) error
#
# Rejects (returns non-nil error for):
# - non-http/https schemes (file://, gopher://, ftp://, data:, etc.)
# - URLs missing a host
# - literal-IP hosts in: 127/8 + ::1 (loopback), 169.254.0.0/16 (link-
# local incl. AWS/Azure/GCP cloud metadata at 169.254.169.254),
# 224.0.0.0/4 + ff00::/8 (multicast), 255.255.255.255 (broadcast),
# 0.0.0.0 + :: (unspecified), fe80::/10 (IPv6 link-local)
# - DNS names whose A/AAAA resolution returns any IP in the set above
#
# Source of truth: internal/validation/ssrf.go (ValidateSafeURL +
# IsReservedIP + isReservedIPForDial). RFC 1918 (10/8, 172.16/12,
# 192.168/16) is intentionally NOT blocked — see the comment block at
# ssrf.go:17-21 for the design rationale.
#
# The companion runtime defense is SafeHTTPDialContext (installed on the
# http.Transport via http.Transport.DialContext) which re-resolves the
# host at dial time and pins the dial to a literal non-reserved IP,
# defeating DNS rebinding. SafeHTTPDialContext returns a closure rather
# than acting as a direct sanitizer in dataflow terms, so it isn't
# modeled here — but ValidateSafeURL alone is sufficient to dismiss the
# request-forgery alerts at the call sites that use it (scep_probe.go,
# webhook.go).
- addsTo:
pack: codeql/go-all
extensible: barrierModel
data:
# subtypes uses Python-style `False` (capitalized) to match every
# shipped CodeQL Go .model.yml — see e.g. github/codeql/go/ql/lib/ext/
# database.sql.model.yml. SnakeYAML accepts lowercase too, but
# capitalized matches the canonical convention.
- ["github.com/shankar0123/certctl/internal/validation", "", False, "ValidateSafeURL", "", "", "Argument[0]", "request-forgery", "manual"]
-55
View File
@@ -1,55 +0,0 @@
# certctl CodeQL model pack — extends the standard Go queries with project-
# specific data-flow knowledge (sanitizers, sinks, summaries).
#
# Why this exists: CodeQL's standard `go/request-forgery` query is a syntactic
# taint-tracking rule. It traces operator-supplied URLs into HTTP egress sinks
# (`http.Client.Do`) and reports — but it has no built-in knowledge of
# certctl's `internal/validation.ValidateSafeURL` SSRF guard. The validator
# IS a sanitizer (rejects loopback, link-local incl. cloud metadata
# 169.254.169.254, multicast, broadcast, unspecified, IPv6 link-local;
# rejects DNS names whose A/AAAA records resolve into any of those ranges)
# but CodeQL doesn't know that, so the analyzer reports a finding the
# runtime defense already mitigates.
#
# This pack uses Models-as-Data (MaD) extensions to declare the validator as
# a barrier for the request-forgery query. After this pack is loaded:
# - The alert at internal/service/scep_probe.go:232 (CodeQL #23) is
# dismissed at source, not via per-line `// codeql[...]` suppression.
# - The same model applies to the second site of this shape — webhook
# notifier's outbound `client.Do` (internal/connector/notifier/webhook/
# webhook.go) — without per-line annotations.
# - Future code that flows operator URLs through ValidateSafeURL gets the
# same treatment automatically.
#
# Pack-loading mechanism (post-correction in commit fix-up):
# - This pack lives at .github/codeql/certctl-models/. Its NAME is the
# `name:` field below.
# - .github/workflows/codeql.yml's Initialize CodeQL step passes
# `additional-packs: .github/codeql` to the action — that's the parent
# directory the CodeQL CLI's resolver searches for unpublished packs.
# - .github/codeql/codeql-config.yml then references this pack by NAME
# in `packs: { go: [shankar0123/certctl-models] }`. The CLI looks the
# name up against the additional-packs path, finds this qlpack.yml,
# loads the pack and its data extensions.
#
# An earlier draft (commit d8026d5) tried `packs: { go: ['./'] }` in
# codeql-config.yml, which is the wrong syntax — that field expects pack
# names, not paths. The pack silently never loaded; alert #23 stayed open
# across two CodeQL runs (d8026d5 + 4bb7a74). Pack-name + additional-packs
# is the supported path; verified against the github/vscode-codeql working
# example.
#
# MaD `barrierModel` extension was added for Go in CodeQL 2.25.2 (2026-04-21).
# `github/codeql-action@v3` (pinned in .github/workflows/codeql.yml) pulls a
# CLI version >= 2.25.2 by default. If a future analysis fails with
# "unknown extensible predicate barrierModel", the action's CLI version has
# regressed below 2.25.2 — pin a newer action version rather than reverting
# this pack.
name: shankar0123/certctl-models
version: 0.0.1
library: true
extensionTargets:
codeql/go-all: '*'
dataExtensions:
- models/*.model.yml
-41
View File
@@ -1,41 +0,0 @@
# CodeQL analysis config for certctl.
#
# Loaded by .github/workflows/codeql.yml's `Initialize CodeQL` step via
# `config-file:`. Two responsibilities:
#
# 1. Re-declare the query suite the workflow runs (security-and-quality)
# so that disabling the action's default suite via `disable-default-
# queries: true` doesn't accidentally drop coverage.
#
# 2. Load the local model pack at .github/codeql/certctl-models/, which
# adds project-specific Models-as-Data extensions (barriers, sinks,
# summaries) for the standard Go queries. See
# certctl-models/qlpack.yml for the full motivation.
#
# Path-ignore is intentionally empty — every path that ships with the
# repo is in scope. Test files are NOT excluded; if a vulnerability
# regresses in a test fixture and is later promoted to production, we
# want CodeQL to catch it on first appearance.
name: certctl-codeql
# Run the same query suite the workflow has been running pre-config-file:
# security-and-quality (security findings + maintainability/correctness).
# Listing it here ensures the suite stays in scope even if the action's
# default behavior shifts.
queries:
- uses: security-and-quality
# Load the local model pack BY NAME. The action's `init` step resolves
# this name against the path it was given via `additional-packs: .github/
# codeql` — that path is the parent directory of certctl-models/, where
# the pack's qlpack.yml declares `name: shankar0123/certctl-models`.
#
# An earlier draft used `packs: { go: ['./'] }` (a relative path). That's
# the wrong syntax — the `packs:` field expects pack NAMES, not paths.
# Local-by-path is not supported here; the discovery happens via
# additional-packs + name lookup. Verified against the github/vscode-codeql
# working setup.
packs:
go:
- shankar0123/certctl-models
-19
View File
@@ -70,25 +70,6 @@ jobs:
# maintainability/correctness issues that the smaller security-extended
# suite skips. Comparable scope to what Aikido / SonarCloud run.
queries: security-and-quality
# Custom config that pulls in the local model pack at
# .github/codeql/certctl-models/. The pack declares
# internal/validation.ValidateSafeURL as a request-forgery barrier
# via Models-as-Data, dismissing the alert that fires at every
# call site using the validator (scep_probe.go, webhook.go)
# without per-line `// codeql[...]` suppressions. See
# .github/codeql/certctl-models/qlpack.yml for the full motivation.
# Requires CodeQL CLI ≥ 2.25.2 for the barrierModel extension;
# codeql-action@v3 ships a recent enough CLI by default.
config-file: ./.github/codeql/codeql-config.yml
# Tells the CodeQL CLI's pack resolver where to find unpublished
# local packs. The pack referenced by name in codeql-config.yml's
# `packs:` directive (shankar0123/certctl-models) is resolved
# against this path. Without it, the name would fail to resolve
# and the pack would silently not load — that's exactly what
# happened to commits d8026d5 + 4bb7a74 (alert #23 stayed open
# across two CodeQL runs because the pack never made it into the
# analysis). Verified pattern: github/vscode-codeql.
additional-packs: .github/codeql
- name: Autobuild
uses: github/codeql-action/autobuild@v3
-2
View File
@@ -993,8 +993,6 @@ Audit recording is async (via goroutine) so it never blocks the HTTP response. I
All shell-facing inputs (connector scripts, domain names, ACME tokens) are validated through `internal/validation/command.go` before reaching shell execution. `ValidateShellCommand()` denies all shell metacharacters. `ValidateDomainName()` enforces RFC 1123. `ValidateACMEToken()` restricts to base64url characters. The network scanner filters reserved IP ranges (loopback, link-local including cloud metadata 169.254.169.254, multicast, broadcast) to prevent SSRF, while preserving RFC 1918 private ranges for legitimate internal scanning.
**Outbound HTTP egress** is defended in two layers, both implemented in `internal/validation/ssrf.go`. Layer one is `ValidateSafeURL(rawURL)`, called at the surface of every outbound HTTP path that accepts an operator-supplied URL. It rejects non-`http(s)` schemes, empty hosts, literal-IP hosts in reserved ranges (loopback `127/8` + `::1`, IPv4 link-local `169.254/16` including AWS/Azure/GCP cloud metadata at `169.254.169.254`, IPv4 multicast `224/4`, IPv4 broadcast `255.255.255.255`, unspecified `0.0.0.0` + `::`, and IPv6 link-local `fe80::/10` + IPv6 multicast `ff00::/8`), and DNS names whose `A`/`AAAA` resolution returns any IP in that set. Layer two is `SafeHTTPDialContext(timeout)`, installed on the `http.Transport.DialContext` of every long-lived outbound HTTP client — it re-resolves the target host at dial time, applies the same reserved-IP set, and pins the TCP dial to a literal non-reserved IP so a racing DNS rebind cannot redirect the connection between validation and dial. RFC 1918 ranges (`10/8`, `172.16/12`, `192.168/16`) are intentionally NOT treated as reserved — certctl is designed to manage certificates inside private networks, and filtering them would break the primary use case. Current call sites: the SCEP probe in the network scanner (`internal/service/scep_probe.go`) and the webhook notifier (`internal/connector/notifier/webhook/`); both wire `ValidateSafeURL` at config + request entry and `SafeHTTPDialContext` on their dedicated `http.Client`. The egress validators are also declared as request-forgery barriers in the CodeQL Models-as-Data extension at `.github/codeql/models/`, so the static analyzer's view of the dataflow matches the runtime defense.
### Request Body Size Limits
All incoming HTTP request bodies are capped by `http.MaxBytesReader` middleware (default 1MB, configurable via `CERTCTL_MAX_BODY_SIZE`). Requests exceeding the limit receive a 413 Request Entity Too Large response. The middleware is positioned before authentication in the chain so oversized payloads are rejected early, before any auth processing or database work occurs. Requests without bodies (GET, HEAD, nil body) skip the limit check.