feat(security): Sprint 5 ACQ — RED-003 deny-empty flip + SEC-009/RED-005 RFC1918 opt-in

Acquisition-audit Sprint 5 ACQ closure (2026-05-16). Two
independent findings ship together because they share Load() /
main.go wiring; the closure comments tie each line to its finding.

PART A — RED-003 (agent-bootstrap deny-empty cutover)
=====================================================

Phase 2 SEC-H1 closure (2026-05-13) introduced the
CERTCTL_AGENT_BOOTSTRAP_TOKEN_DENY_EMPTY staged feature flag with
default `false` so v2.1.x operators wouldn't get a surprise
fail-closed on upgrade. This commit flips the default to `true`
(per the staged plan in the existing CHANGELOG "Breaking changes
(scheduled for v2.2.0)" block). Operators who haven't generated a
real bootstrap token yet keep the v2.1.x warn-mode pass-through
for one upgrade window by setting
CERTCTL_AGENT_BOOTSTRAP_TOKEN_DENY_EMPTY=false explicitly.

Demo-mode escape hatch: CERTCTL_DEMO_MODE_ACK=true skips the
fail-closed gate so the screenshot/demo path stays one-command-up.
The accompanying boot-banner WARN at cmd/server/main.go:124-126
keeps demo mode visible in every log scraper, so this override
cannot silently re-enable warn-mode in production.

internal/config/config.go
  - Load() default for AgentBootstrapTokenDenyEmpty flipped to true
  - Validate() gate now also checks !c.Auth.DemoModeAck so the demo
    override line up with the boot-banner WARN
  - Closure comment block updated to cross-reference Sprint 5 ACQ
    and the CHANGELOG v2.2.0 entry

cmd/server/main.go
  - Updated boot-time WARN message to reflect the new default
    (deny-empty=true) — the warn now fires only in the two
    explicit override scenarios (warn-mode opt-back or demo mode),
    and explains the operator action either way
  - Info-line on configured-token path unchanged

PART B — SEC-009 + RED-005 (opt-in RFC1918 outbound block)
==========================================================

internal/validation/ssrf.go::IsReservedIP has always intentionally
left RFC 1918 ranges (10/8, 172.16/12, 192.168/16) NOT-reserved
because certctl is designed to manage certificates inside private
networks. For operators on hosted IaaS where RFC1918 IS internal
trust (kubeadm-default 10.96.0.0/12 service CIDR exposes the
Kubernetes API on 10.96.0.1; cloud-provider internal monitoring;
hosted-bastion subnets), this default is a real exposure path.

Add a package-level atomic.Bool toggle in internal/validation/ssrf.go
that, when on, extends IsReservedIP to ALSO return true for the
three RFC1918 ranges. Every IsReservedIP-derived path
(SafeHTTPDialContext, ValidateSafeURL, the network scanner, the
webhook + OIDC + ACME callers) picks up the new policy
transitively without per-call-site changes.

internal/validation/ssrf.go
  - blockRFC1918Outbound atomic.Bool + SetBlockRFC1918Outbound /
    BlockRFC1918OutboundEnabled accessor pair
  - rfc1918Nets pre-parsed at package init (panic on parse failure
    surfaces a misconfigured ssrf package immediately, not via a
    silently disabled toggle)
  - IsReservedIP checks the toggle after the existing reserved-IP
    checks
  - Header comment rewritten to document the toggle + the
    transitive coverage

internal/config/config.go
  - New NetworkConfig sub-config; Config gains a Network field
  - Load() reads CERTCTL_BLOCK_RFC1918_OUTBOUND env var (default
    false; preserves the existing self-hosted threat model)
  - NetworkConfig docstring lists the operator-trap (enabling this
    also blocks RFC1918 from the network scanner) so an operator
    cert-discovering their own RFC1918 space doesn't get a
    silently-empty scan result

cmd/server/main.go
  - Wires validation.SetBlockRFC1918Outbound after config.Load and
    near the demo-mode banner / agent-bootstrap-token block; emits
    a one-shot INFO line when the toggle is enabled so the policy
    is visible in journals

Tests
=====

internal/config/config_test.go
  - TestLoad_AgentBootstrapTokenDenyEmpty_DefaultIsTrue — pins the
    default flip at the boot path (Load returns the flipped value)
  - TestValidate_DenyEmptyDefault_RefusesWithoutToken — pins the
    fail-closed behavior under the new default
  - TestValidate_DenyEmptyExplicitFalse_AllowsEmpty — pins the
    v2.1.x back-compat escape hatch
  - TestValidate_DenyEmpty_DemoModeAckOverride_AllowsEmpty — pins
    the demo-mode override

internal/validation/ssrf_test.go
  - TestIsReservedIP_RFC1918_OptIn — pins toggle-off / toggle-on
    behavior across all three RFC1918 ranges, edge cases
    immediately outside the ranges, and the toggle-back-off path
  - TestSafeHTTPDialContext_RFC1918_OptIn — pins that the toggle
    reaches the dial-time SSRF check transitively (not just
    IsReservedIP in isolation)

Test-helper updates (Sprint-5-induced churn):
  - internal/config/config_test.go::setMinimalValidEnv now sets
    CERTCTL_AGENT_BOOTSTRAP_TOKEN to a placeholder so Load()-based
    tests that don't specifically exercise the empty-token gate
    keep passing under the new fail-closed default. Tests that DO
    exercise the empty-token path explicitly override back to "".
  - internal/config/config_est_profiles_test.go +
    internal/config/config_scep_profiles_test.go: same placeholder
    fix for the four Load()-based EST/SCEP profile tests.
  - cmd/server/main_test.go::TestMain_ServerConfigFromEnvironment +
    TestMain_AuthTypeConfiguration: same fix at the main.go test
    layer with prior-value restore.

Verified locally: gofmt -l clean; go vet clean; staticcheck clean
across internal/config, internal/validation, cmd/server; short
tests green on all three packages; targeted -v run of all six new
test names confirms PASS.
This commit is contained in:
shankar0123
2026-05-16 19:13:52 +00:00
parent 374ec574c5
commit 5ea45a19b9
8 changed files with 403 additions and 32 deletions
+29 -8
View File
@@ -48,6 +48,7 @@ import (
"github.com/certctl-io/certctl/internal/service"
authsvc "github.com/certctl-io/certctl/internal/service/auth"
"github.com/certctl-io/certctl/internal/trustanchor"
"github.com/certctl-io/certctl/internal/validation"
)
func main() {
@@ -124,19 +125,39 @@ func main() {
logger.Warn("⚠ DEMO MODE ACTIVE — CERTCTL_DEMO_MODE_ACK=true is set; every request is served as the synthetic admin actor `actor-demo-anon` (no authentication enforced). This deployment MUST NOT hold production keys, certificates, or audit history. To promote to production: (1) unset CERTCTL_DEMO_MODE_ACK; (2) set CERTCTL_AUTH_TYPE=api-key or oidc; (3) set CERTCTL_AUTH_SECRET to a fresh `openssl rand -base64 32`; (4) set CERTCTL_KEYGEN_MODE=agent; (5) rotate CERTCTL_CONFIG_ENCRYPTION_KEY to a fresh `openssl rand -base64 32` (≥ 32 bytes, not the change-me placeholder); (6) restart the server. See docs/operator/security.md for the full posture.")
}
// Bundle-5 / Audit H-007: deprecation WARN when the agent bootstrap
// token is unset. Pre-Bundle-5 there was no token at all; the v2.0.x
// default keeps the warn-mode pass-through so existing demo deploys
// keep working, but operators must set CERTCTL_AGENT_BOOTSTRAP_TOKEN
// before v2.2.0 lands. This is a one-shot startup line — the
// per-request path stays silent so a busy registration endpoint
// doesn't flood the log.
// Bundle-5 / Audit H-007 + acquisition-audit RED-003 closure
// (Sprint 5 ACQ, 2026-05-16): deny-empty default for the agent
// bootstrap token. v2.2.0 flipped CERTCTL_AGENT_BOOTSTRAP_TOKEN_DENY_EMPTY
// from false → true; Validate() now refuses to start with an
// empty token UNLESS the operator either (a) explicitly opts back
// into v2.1.x warn-mode with CERTCTL_AGENT_BOOTSTRAP_TOKEN_DENY_EMPTY=false
// or (b) is running a demo deploy (CERTCTL_DEMO_MODE_ACK=true).
//
// The remaining code path here only fires in those two override
// scenarios — in both cases the operator has accepted the
// posture, but a one-shot startup line keeps the warn-mode case
// visible in journals.
if cfg.Auth.AgentBootstrapToken == "" {
logger.Warn("agent bootstrap token unset (CERTCTL_AGENT_BOOTSTRAP_TOKEN) — agents may self-register without authentication; this default will become deny-by-default in v2.2.0; generate one with: openssl rand -hex 32")
logger.Warn("agent bootstrap token unset (CERTCTL_AGENT_BOOTSTRAP_TOKEN) — agents may self-register without authentication; running in v2.1.x-compat warn-mode (DENY_EMPTY=false) or demo mode (DEMO_MODE_ACK=true). Production deploys MUST set the token; generate with: openssl rand -base64 32")
} else {
logger.Info("agent bootstrap token configured (length redacted; constant-time compare on POST /api/v1/agents)")
}
// Acquisition-audit SEC-009 + RED-005 closure (Sprint 5 ACQ,
// 2026-05-16). Opt-in RFC1918 outbound block for hosted-IaaS
// operators where private-IP space carries internal trust
// (Kubernetes API on 10.96.0.1 in default kubeadm clusters,
// cloud-provider monitoring endpoints, etc.). The toggle wires
// into the package-level state in internal/validation/ssrf.go;
// from there every IsReservedIP-derived path (SafeHTTPDialContext,
// ValidateSafeURL, the network scanner, the webhook + OIDC + ACME
// callers) picks up the policy transitively. Default false
// preserves the existing self-hosted threat model.
validation.SetBlockRFC1918Outbound(cfg.Network.BlockRFC1918Outbound)
if cfg.Network.BlockRFC1918Outbound {
logger.Info("RFC1918 outbound block ENABLED (CERTCTL_BLOCK_RFC1918_OUTBOUND=true) — 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16 are reserved for outbound HTTP egress AND for the network scanner")
}
// Phase 6 SCALE-M3 closure (2026-05-14): operator-overridable
// package-level default for the asyncpoll MaxWait fallback.
// Per-connector overrides (CERTCTL_DIGICERT_POLL_MAX_WAIT_SECONDS,