fix(bundle-5): Operational Liveness + Bootstrap — 4 audit findings closed

Closes Audit-2026-04-25 H-006 (High), H-007 (High), M-011 (Medium),
L-006 (Low — verified-already-closed via C-1 master closure in v2.0.54).
Hardens the orchestrator-facing surface — k8s probes, agent enrollment,
shutdown audit drain, scheduler config plumbing.

What changed
- internal/api/handler/health.go — split contract:
    * /health stays shallow 200 (k8s liveness — process alive)
    * /ready accepts *sql.DB; runs db.PingContext(2s); 503 on failure
    * Nil DB path returns 200 + db=not_configured (test fixtures)
- internal/api/handler/agent_bootstrap.go (NEW) — verifyBootstrapToken:
    * empty expected = warn-mode pass-through
    * non-empty = `Authorization: Bearer <token>` required
    * crypto/subtle.ConstantTimeCompare; length-mismatch path runs dummy
      compare to keep timing uniform
    * ErrBootstrapTokenInvalid sentinel
- internal/api/handler/agents.go — RegisterAgent calls verifyBootstrapToken
  BEFORE body parse so unauth probes don't even allocate a JSON decoder
- internal/config/config.go — two new env vars:
    * CERTCTL_AGENT_BOOTSTRAP_TOKEN  (Auth.AgentBootstrapToken)
    * CERTCTL_AUDIT_FLUSH_TIMEOUT_SECONDS (Server.AuditFlushTimeoutSeconds)
- cmd/server/main.go — 3 changes:
    * pass *sql.DB into NewHealthHandler (H-006)
    * pass cfg.Auth.AgentBootstrapToken into NewAgentHandler (H-007)
    * configurable shutdown audit-flush timeout (M-011)
    * one-shot startup WARN when bootstrap token unset (deprecation)
- new tests: agent_bootstrap_test.go (full deny/accept/warn-mode coverage,
  constant-time compare path, length-mismatch); health_test.go extended
  with /ready DB-probe failure (503), nil-DB pass-through, /health-shallow

L-006 verified
- cmd/server/main.go:557 already calls
  sched.SetShortLivedExpiryCheckInterval(cfg.Scheduler.ShortLivedExpiryCheckInterval)
  per the C-1 master closure in v2.0.54. Bundle 5 confirms; no code change.

Threat model: TB-1 (operator/orchestrator), TB-2 (Agent↔Server).
- CWE-754 (Improper Check for Unusual or Exceptional Conditions) for H-006
- CWE-306 + CWE-288 (Missing Authentication for Critical Function) for H-007

Verification
- go vet ./...                               → clean
- go build ./...                             → clean
- go test -short -count=1 ./...              → all packages pass
- targeted Bundle-5 regressions               → all pass
- npx tsc --noEmit (web)                     → clean
- npx vitest run (web)                       → in-flight (sandbox 45s
  ceiling exceeded; no failure markers in dot stream; no frontend
  changes in this bundle so no regression risk)
- python3 yaml.safe_load(api/openapi.yaml)   → 89 paths

Backward compatibility
- Bootstrap token defaults to empty (warn-mode) — existing demo
  deployments unaffected. Server logs deprecation WARN; v2.2.0 will
  require it.
- Audit flush timeout default 30s preserves prior behaviour.
- Helm chart already routes readiness probe to /ready (no chart change
  needed); now /ready actually probes the DB.

Bundle 5 of the 2026-04-25 comprehensive audit.
This commit is contained in:
shankar0123
2026-04-25 23:54:18 +00:00
parent 018b705b91
commit 85e60b24ec
12 changed files with 586 additions and 63 deletions
+25 -3
View File
@@ -40,13 +40,22 @@ type AgentService interface {
}
// AgentHandler handles HTTP requests for agent operations.
//
// Bundle-5 / Audit H-007: BootstrapToken is the pre-shared secret enforced
// on RegisterAgent. Empty = warn-mode pass-through; non-empty triggers the
// constant-time compare in verifyBootstrapToken. See agent_bootstrap.go.
type AgentHandler struct {
svc AgentService
svc AgentService
BootstrapToken string
}
// NewAgentHandler creates a new AgentHandler with a service dependency.
func NewAgentHandler(svc AgentService) AgentHandler {
return AgentHandler{svc: svc}
//
// Bundle-5 / Audit H-007: bootstrapToken (may be empty for warn-mode) gates
// the registration endpoint. main.go reads cfg.Auth.AgentBootstrapToken and
// passes it here.
func NewAgentHandler(svc AgentService, bootstrapToken string) AgentHandler {
return AgentHandler{svc: svc, BootstrapToken: bootstrapToken}
}
// ListAgents lists all registered agents.
@@ -118,6 +127,12 @@ func (h AgentHandler) GetAgent(w http.ResponseWriter, r *http.Request) {
// RegisterAgent registers a new agent.
// POST /api/v1/agents
//
// Bundle-5 / Audit H-007 / CWE-306 + CWE-288: bootstrap-token gate runs
// BEFORE body parse so an unauthenticated probe can't even cause a JSON
// allocation. When CERTCTL_AGENT_BOOTSTRAP_TOKEN is set on the server,
// callers must include `Authorization: Bearer <token>`. See
// agent_bootstrap.go for the verification helper.
func (h AgentHandler) RegisterAgent(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodPost {
Error(w, http.StatusMethodNotAllowed, "Method not allowed")
@@ -126,6 +141,13 @@ func (h AgentHandler) RegisterAgent(w http.ResponseWriter, r *http.Request) {
requestID := middleware.GetRequestID(r.Context())
// Bundle-5 / H-007: bootstrap-token gate. Returns 401 with a fixed
// error string on miss so a token spray can't infer credential shape.
if err := verifyBootstrapToken(r, h.BootstrapToken); err != nil {
ErrorWithRequestID(w, http.StatusUnauthorized, "invalid_or_missing_bootstrap_token", requestID)
return
}
var agent domain.Agent
if err := json.NewDecoder(r.Body).Decode(&agent); err != nil {
ErrorWithRequestID(w, http.StatusBadRequest, "Invalid request body", requestID)