mirror of
https://github.com/shankar0123/certctl.git
synced 2026-06-09 16:18:55 +00:00
Close I-004 (agent hard-delete cascades targets) coverage-gap finding
Operator decision answered as full soft-delete with optional forced
cascade — hard-delete is not reachable from any public surface. Prior
to this commit, DELETE /agents/{id} ran a plain `DELETE FROM agents`
whose schema-level `ON DELETE CASCADE` on deployment_targets.agent_id
silently wiped every target, orphaning certs and aborting in-flight
jobs. The finding closure reshapes the agent-removal contract around
soft retirement with explicit preflight counts, an opt-in cascade
gated by a mandatory reason, and unconditional protection for the
four reserved sentinel agents used by discovery sources.
Schema — migration 000015:
migrations/000015_agent_retire.up.sql flips
deployment_targets_agent_id_fkey from ON DELETE CASCADE to ON DELETE
RESTRICT, so a stray `DELETE FROM agents` now errors at the DB
boundary instead of quietly destroying targets. Both `agents` and
`deployment_targets` grow a retired_at TIMESTAMPTZ + retired_reason
TEXT pair (TEXT not VARCHAR so operator comments are never
truncated), indexed via partial indexes WHERE retired_at IS NOT
NULL. The migration is self-healing (ADD COLUMN IF NOT EXISTS, DROP
CONSTRAINT IF EXISTS then ADD CONSTRAINT, CREATE INDEX IF NOT
EXISTS) so repeated runs against partially-migrated databases
converge. migrations/000015_agent_retire.down.sql restores CASCADE
and drops the new columns for clean rollback. A dedicated
repository-layer testcontainers test
(internal/repository/postgres/migration_000015_test.go) asserts the
before/after FK action, column presence, index presence, and
round-trip idempotency under up→down→up.
Domain — sentinel guard + dependency counts:
internal/domain/connector.go gains IsRetired() on Agent, the
exported SentinelAgentIDs slice listing server-scanner,
cloud-aws-sm, cloud-azure-kv, cloud-gcp-sm verbatim (matching the
four reserved IDs documented in CLAUDE.md and created at startup in
cmd/server/main.go), IsSentinelAgent(id string) predicate,
AgentDependencyCounts{ActiveTargets, ActiveCertificates,
PendingJobs} with a HasDependencies() method, and ActorTypeAgent /
ActorTypeSystem enum values used by audit emission downstream.
Coverage locked down by internal/domain/connector_test.go.
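As a rough illustration of the domain helpers described above, the sketch below shows one way the sentinel predicate and the dependency-count check could look. The four IDs come from this message; the field types and the "any bucket non-zero" rule for HasDependencies() are assumptions, not a copy of internal/domain/connector.go.

```go
package main

import "fmt"

// SentinelAgentIDs lists the four reserved IDs named in the commit message.
var SentinelAgentIDs = []string{
	"server-scanner",
	"cloud-aws-sm",
	"cloud-azure-kv",
	"cloud-gcp-sm",
}

// IsSentinelAgent reports whether id is one of the reserved sentinel agents.
func IsSentinelAgent(id string) bool {
	for _, s := range SentinelAgentIDs {
		if s == id {
			return true
		}
	}
	return false
}

// AgentDependencyCounts carries the three preflight buckets.
// Assumption: plain int fields; the real struct may differ.
type AgentDependencyCounts struct {
	ActiveTargets      int
	ActiveCertificates int
	PendingJobs        int
}

// HasDependencies is true when any bucket is non-zero (assumed semantics).
func (c AgentDependencyCounts) HasDependencies() bool {
	return c.ActiveTargets > 0 || c.ActiveCertificates > 0 || c.PendingJobs > 0
}

func main() {
	fmt.Println(IsSentinelAgent("cloud-aws-sm"))
	fmt.Println(IsSentinelAgent("ag-1"))
	counts := AgentDependencyCounts{PendingJobs: 1}
	fmt.Println(counts.HasDependencies())
}
```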
Service — 8-step ordered contract:
internal/service/agent_retire.go:RetireAgent(ctx, id, actor,
opts{Force, Reason}) enforces a fixed execution order:
(1) sentinel guard: IsSentinelAgent(id) returns ErrAgentIsSentinel
    unconditionally; force=true does NOT bypass it.
(2) fetch: ErrAgentNotFound on miss.
(3) idempotency: if IsRetired() already, return
    AgentRetirementResult{AlreadyRetired: true} with no new audit
    event and no state change (safe to replay from flaky clients).
(4) preflight counts: collectAgentDependencyCounts runs
    ActiveTargets, ActiveCertificates, PendingJobs sequentially
    (not in parallel; keeps the per-query timeout predictable and
    matches the repo's existing call-chain shape).
(5) force-reason guard: opts.Force=true with empty Reason returns
    ErrForceReasonRequired (wired into the 400 status surface).
(6) dependency guard: HasDependencies() with opts.Force=false
    returns BlockedByDependenciesError{Counts} (wired into the 409
    body with per-bucket counts).
(7) mutation: single pinned retiredAt := time.Now(); agent
    retirement first, then cascade target retirement if opts.Force,
    all under the repo's single transaction so the two retired_at
    stamps match to the second.
(8) best-effort audit: agent_retired always;
    agent_retirement_cascaded additionally on the force path. Actor
    is whatever the handler resolves from the request; actor type is
    mapped by resolveActorType (system/agent-prefix→Agent/else→User).
    Audit emission failures are logged via slog.Error but do not
    abort the retirement (matches the house convention used by every
    other scheduler-emitted event).
BlockedByDependenciesError implements Error() as
"active_targets=%d, active_certificates=%d, pending_jobs=%d" and
Unwrap() → ErrBlockedByDependencies. The single struct satisfies
errors.Is via Unwrap (used by scheduler-level tests) and errors.As
via the concrete type (used by the handler to fish out Counts for
the 409 body). ListRetiredAgents(page, perPage) adds a separate
paginated accessor with page<1→1 and perPage<1→50 normalization so
retired rows are queryable without polluting the default agent
listing.
The sentinel guard is deliberately asymmetric with the dependency
guard: all four reserved IDs are protected, and force=true, which
overrides the dependency guard, cannot override it. Regression tests
in internal/service/agent_retire_test.go assert each of the eight
steps in order, plus sentinel bypass attempts and idempotency
replay.
Handler + router — status-code surface:
internal/api/handler/agents.go:RetireAgent exposes seven status
codes on DELETE /agents/{id}:
  200 on a fresh retirement (body echoes AgentRetirementResult).
  204 on idempotent replay (AlreadyRetired=true; no new audit).
  400 on ErrForceReasonRequired.
  403 on ErrAgentIsSentinel.
  404 on ErrAgentNotFound.
  409 on BlockedByDependenciesError, with a custom body shape
      {error, counts{active_targets, active_certificates,
      pending_jobs}} that bypasses the default ErrorWithRequestID
      envelope so callers get the per-bucket numbers directly.
  500 on any other error.
The heartbeat handler (HandleHeartbeat) returns 410 Gone when the
agent is retired (ErrAgentRetired), signalling the agent to shut down.
Query params `force=true` and `reason=<text>` drive the cascade
path; both are forwarded as url.Values through the new MCP
transport.
internal/api/router/router.go registers GET /api/v1/agents/retired
literal-path BEFORE /api/v1/agents/{id} — Go 1.22 ServeMux's
literal-beats-pattern-var precedence routes "retired" to the
paginated retired-agents listing instead of fetching a hypothetical
agent named "retired".
Agent binary — clean shutdown on 410:
cmd/agent/main.go gains the ErrAgentRetired sentinel, a
retiredOnce sync.Once, and a retiredSignal chan struct{}. A
markRetired(source, statusCode, body) helper closes the channel
exactly once; the Run() select loop observes the close and returns
ErrAgentRetired; main() matches via errors.Is(err, ErrAgentRetired)
and exits cleanly instead of spinning in the heartbeat retry loop.
The 410 Gone surface is therefore terminal for the agent process.
MCP transport:
internal/mcp/client.go adds Client.DeleteWithQuery(path, query),
a new additive transport method. Client.Delete is path-only; without
this method the retire tool would silently drop `force` and `reason`,
turning every cascade retire into a default soft-retire. The new
method shares do()'s 204 normalization and 4xx/5xx error
propagation so tool authors get one contract.
internal/mcp/tools.go + internal/mcp/types.go expose the
retire_agent tool with Force+Reason inputs wired through
DeleteWithQuery.
CLI:
cmd/cli/main.go + internal/cli/client.go add two CLI surfaces:
`agents list --retired` (client-side strip of --retired then
delegation to ListRetiredAgents, sharing --page/--per-page parsing
with the default listing) and `agents retire <id> [--force --reason
"…"]` (mirrors ErrForceReasonRequired — force without reason is
rejected client-side before the request is sent). JSON + table
output modes both honor the new columns.
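The `agents retire <id> [--force --reason …]` shape depends on one behavior of Go's flag package: parsing stops at the first non-flag argument. The sketch below demonstrates it with a hypothetical parseFlags helper; only the flag names are taken from this commit.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// parseFlags parses args with a fresh FlagSet and returns the parsed
// values plus whatever the flag package left unparsed.
func parseFlags(args []string) (force bool, reason string, rest []string, err error) {
	fs := flag.NewFlagSet("agents retire", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep usage text out of the demo output
	f := fs.Bool("force", false, "cascade")
	r := fs.String("reason", "", "reason")
	if err = fs.Parse(args); err != nil {
		return false, "", nil, err
	}
	return *f, *r, fs.Args(), nil
}

func main() {
	args := []string{"ag-1", "--force", "--reason", "x"}

	// Parsing the full slice stops at "ag-1", so both flags are ignored.
	force, reason, rest, _ := parseFlags(args)
	fmt.Println(force, reason, len(rest))

	// Peeling off the positional ID first lets the flags be recognized.
	force, reason, rest, _ = parseFlags(args[1:])
	fmt.Println(force, reason, len(rest))
}
```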
Frontend:
web/src/pages/AgentsPage.tsx surfaces retired/retire affordances.
web/src/api/client.ts + web/src/api/types.ts expose the retire
endpoint and the retired-listing. 4 new Vitest regression cases.
OpenAPI:
api/openapi.yaml documents DELETE /agents/{id} with all seven
status codes, 410 on heartbeat, and the 409 per-bucket body shape.
Regression coverage (six new test files, all green):
internal/service/agent_retire_test.go — 8-step contract + sentinel guards
internal/api/handler/agent_retire_handler_test.go — 7-status-code surface + 410 heartbeat
internal/mcp/retire_agent_test.go — DeleteWithQuery wire-through
internal/cli/agent_retire_test.go — --retired listing + --force/--reason pairing
internal/repository/postgres/migration_000015_test.go — FK flip + columns + indexes + up↔down
internal/domain/connector_test.go — IsRetired, IsSentinelAgent, SentinelAgentIDs, HasDependencies
Files:
api/openapi.yaml — DELETE + 410 + 409 body shape
cmd/agent/main.go — ErrAgentRetired, markRetired, retiredSignal
cmd/cli/main.go — handleAgents list/get/retire dispatch
docs/architecture.md, docs/concepts.md,
docs/testing-guide.md — retirement contract narrative
internal/api/handler/agents.go — RetireAgent, status surface, 410 on heartbeat
internal/api/handler/agent_handler_test.go — extended coverage
internal/api/handler/agent_retire_handler_test.go — new
internal/api/router/router.go — /agents/retired before /agents/{id}
internal/cli/agent_retire_test.go — new
internal/cli/client.go — ListRetiredAgents + RetireAgent
internal/domain/connector.go — IsRetired, SentinelAgentIDs,
IsSentinelAgent, AgentDependencyCounts,
ActorTypeAgent/System
internal/domain/connector_test.go — new
internal/integration/lifecycle_test.go — retirement fixture
internal/mcp/client.go — DeleteWithQuery additive transport
internal/mcp/retire_agent_test.go — new
internal/mcp/tools.go, internal/mcp/types.go — retire_agent tool + Force/Reason inputs
internal/repository/interfaces.go — AgentRepository retirement methods
internal/repository/postgres/agent.go — retire + cascade target retire + counts
internal/repository/postgres/migration_000015_test.go — new
internal/service/agent.go — wire into AgentService surface
internal/service/agent_retire.go — new 8-step contract
internal/service/agent_retire_test.go — new
internal/service/deployment.go — skip retired agents
internal/service/target.go — skip retired agents
internal/service/testutil_test.go — shared mocks extended
migrations/000015_agent_retire.up.sql — new
migrations/000015_agent_retire.down.sql — new
web/src/api/client.ts, types.ts + tests — retire endpoint wiring
web/src/pages/AgentsPage.tsx — retire UI
@@ -0,0 +1,228 @@
package cli

import (
	"encoding/json"
	"net/http"
	"net/http/httptest"
	"testing"
)

// TestClient_RetireAgent_Success pins the I-004 CLI happy path: the operator
// runs `certctl-cli agents retire <id>` and the client issues a DELETE to
// /api/v1/agents/{id}, parses the 200 JSON body (retired_at, already_retired,
// cascade, counts), and reports success. The handler test already covers the
// server-side contract; this test covers the client-side wire formatting so a
// refactor of the server's 200 body shape can't silently break the CLI.
func TestClient_RetireAgent_Success(t *testing.T) {
	var (
		sawMethod string
		sawPath   string
		sawForce  string
		sawReason string
	)

	server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		sawMethod = r.Method
		sawPath = r.URL.Path
		sawForce = r.URL.Query().Get("force")
		sawReason = r.URL.Query().Get("reason")

		if r.Method != "DELETE" || r.URL.Path != "/api/v1/agents/ag-1" {
			w.WriteHeader(http.StatusNotFound)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		w.WriteHeader(http.StatusOK)
		_ = json.NewEncoder(w).Encode(map[string]interface{}{
			"retired_at":      "2026-04-18T12:00:00Z",
			"already_retired": false,
			"cascade":         false,
			"counts": map[string]interface{}{
				"active_targets":      0,
				"active_certificates": 0,
				"pending_jobs":        0,
			},
		})
	}))
	defer server.Close()

	client := NewClient(server.URL, "", "table")
	// Positional arg: the agent ID. No --force, no --reason: the default
	// soft-retire path. Compile-fail until client.RetireAgent exists.
	if err := client.RetireAgent([]string{"ag-1"}); err != nil {
		t.Fatalf("RetireAgent(ag-1) err=%v want nil", err)
	}

	if sawMethod != "DELETE" {
		t.Errorf("method=%q want DELETE", sawMethod)
	}
	if sawPath != "/api/v1/agents/ag-1" {
		t.Errorf("path=%q want /api/v1/agents/ag-1", sawPath)
	}
	if sawForce != "" {
		t.Errorf("force query=%q want empty (default path sends no force)", sawForce)
	}
	if sawReason != "" {
		t.Errorf("reason query=%q want empty (default path sends no reason)", sawReason)
	}
}

// TestClient_RetireAgent_Force_WithReason_Success pins the ?force=true&reason=...
// escape hatch wiring. Operators who supply --force + --reason get their values
// propagated as URL query parameters exactly once, so the server sees the same
// contract the handler test expects. Also verifies the cascade=true response
// body parses cleanly.
func TestClient_RetireAgent_Force_WithReason_Success(t *testing.T) {
	var (
		sawForce  string
		sawReason string
	)

	server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		sawForce = r.URL.Query().Get("force")
		sawReason = r.URL.Query().Get("reason")

		if r.Method != "DELETE" || r.URL.Path != "/api/v1/agents/ag-1" {
			w.WriteHeader(http.StatusNotFound)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		w.WriteHeader(http.StatusOK)
		_ = json.NewEncoder(w).Encode(map[string]interface{}{
			"retired_at":      "2026-04-18T12:00:00Z",
			"already_retired": false,
			"cascade":         true,
			"counts": map[string]interface{}{
				"active_targets":      2,
				"active_certificates": 5,
				"pending_jobs":        1,
			},
		})
	}))
	defer server.Close()

	client := NewClient(server.URL, "", "table")
	if err := client.RetireAgent([]string{"ag-1", "--force", "--reason", "decommissioning rack 7"}); err != nil {
		t.Fatalf("RetireAgent(force+reason) err=%v want nil", err)
	}
	if sawForce != "true" {
		t.Errorf("force query=%q want \"true\"", sawForce)
	}
	if sawReason != "decommissioning rack 7" {
		t.Errorf("reason query=%q want %q", sawReason, "decommissioning rack 7")
	}
}

// TestClient_RetireAgent_Force_RequiresReason pins the client-side guard: using
// --force without --reason must fail BEFORE any HTTP request is made. Without
// this, the client would bounce off the server's 400 ErrForceReasonRequired
// only after a round trip: slow feedback, wasted audit-trail noise, and a
// worse operator experience. requestCount=0 enforces that no HTTP call happens.
func TestClient_RetireAgent_Force_RequiresReason(t *testing.T) {
	var requestCount int
	server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		requestCount++
		w.WriteHeader(http.StatusOK)
	}))
	defer server.Close()

	client := NewClient(server.URL, "", "table")
	err := client.RetireAgent([]string{"ag-1", "--force"})
	if err == nil {
		t.Fatalf("RetireAgent(force, no reason) err=nil want client-side error")
	}
	if !containsStr(err.Error(), "reason") {
		t.Errorf("err=%q should mention --reason to guide operator", err.Error())
	}
	if requestCount != 0 {
		t.Fatalf("requestCount=%d want 0; client must short-circuit before HTTP call", requestCount)
	}
}

// TestClient_RetireAgent_MissingID covers the other common operator mistake:
// invoking `certctl-cli agents retire` with no agent ID. Must be caught by the
// client with a clear error, not a malformed DELETE to /api/v1/agents/.
func TestClient_RetireAgent_MissingID(t *testing.T) {
	var requestCount int
	server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		requestCount++
		w.WriteHeader(http.StatusOK)
	}))
	defer server.Close()

	client := NewClient(server.URL, "", "table")
	err := client.RetireAgent([]string{})
	if err == nil {
		t.Fatalf("RetireAgent([]) err=nil want missing-id error")
	}
	if requestCount != 0 {
		t.Fatalf("requestCount=%d want 0; client must reject missing-id before HTTP", requestCount)
	}
}

// TestClient_ListRetiredAgents_Success pins the audit/forensics CLI surface:
// `certctl-cli agents list --retired` must GET /api/v1/agents/retired and render
// the paged response. The server returns a PagedResponse; the client is
// responsible for printing it in table or JSON format, same as ListAgents.
func TestClient_ListRetiredAgents_Success(t *testing.T) {
	var (
		sawMethod string
		sawPath   string
	)

	server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		sawMethod = r.Method
		sawPath = r.URL.Path

		if r.Method != "GET" || r.URL.Path != "/api/v1/agents/retired" {
			w.WriteHeader(http.StatusNotFound)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		_ = json.NewEncoder(w).Encode(map[string]interface{}{
			"data": []map[string]interface{}{
				{
					"id":             "ag-old-01",
					"name":           "decom-01",
					"hostname":       "server-old",
					"status":         "Offline",
					"registered_at":  "2024-01-01T00:00:00Z",
					"retired_at":     "2026-01-01T00:00:00Z",
					"retired_reason": "old hardware",
				},
			},
			"total":    1,
			"page":     1,
			"per_page": 50,
		})
	}))
	defer server.Close()

	client := NewClient(server.URL, "", "table")
	if err := client.ListRetiredAgents([]string{}); err != nil {
		t.Fatalf("ListRetiredAgents err=%v want nil", err)
	}
	if sawMethod != "GET" {
		t.Errorf("method=%q want GET", sawMethod)
	}
	if sawPath != "/api/v1/agents/retired" {
		t.Errorf("path=%q want /api/v1/agents/retired", sawPath)
	}
}

// TestClient_ListRetiredAgents_ServerError covers the non-happy path: server
// returns 5xx → client surfaces the error rather than silently printing an
// empty list. Without this, operators running the command as part of a
// compliance audit could miss a backend outage.
func TestClient_ListRetiredAgents_ServerError(t *testing.T) {
	server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		http.Error(w, "db unreachable", http.StatusInternalServerError)
	}))
	defer server.Close()

	client := NewClient(server.URL, "", "table")
	err := client.ListRetiredAgents([]string{})
	if err == nil {
		t.Fatalf("ListRetiredAgents(500) err=nil want propagated error")
	}
}
@@ -293,6 +293,194 @@ func (c *Client) ListAgents(args []string) error {
	return c.outputAgentsTable(result.Data, result.Total)
}

// ListRetiredAgents lists soft-retired agents from the dedicated endpoint.
//
// I-004: hits GET /api/v1/agents/retired which is a separate route from the
// default listing (the default hides retired rows). Supports --page and
// --per-page just like the active list. Output format mirrors ListAgents
// but appends RETIRED AT and REASON columns so the operator can
// forensic-grep the output.
func (c *Client) ListRetiredAgents(args []string) error {
	fs := flag.NewFlagSet("agents list --retired", flag.ContinueOnError)
	page := fs.Int("page", 1, "Page number")
	perPage := fs.Int("per-page", 50, "Items per page")
	// With flag.ContinueOnError, Parse returns the error instead of exiting,
	// so surface it rather than querying with silently defaulted paging.
	if err := fs.Parse(args); err != nil {
		return err
	}

	query := url.Values{}
	query.Set("page", fmt.Sprintf("%d", *page))
	query.Set("per_page", fmt.Sprintf("%d", *perPage))

	resp, err := c.do("GET", "/api/v1/agents/retired", query, nil)
	if err != nil {
		return err
	}

	var result struct {
		Data  []map[string]interface{} `json:"data"`
		Total int                      `json:"total"`
	}
	if err := json.Unmarshal(resp, &result); err != nil {
		return fmt.Errorf("parsing response: %w", err)
	}

	if c.format == "json" {
		return c.outputJSON(result)
	}

	return c.outputRetiredAgentsTable(result.Data, result.Total)
}

// RetireAgent soft-retires an agent via DELETE /api/v1/agents/{id}.
//
// I-004: wraps the full status-code matrix pinned by the handler's
// agent_retire_handler_test.go:
//
//	200 clean retire - body: retired_at, already_retired=false, cascade=false, counts=0
//	200 force-cascade retire - body: cascade=true, counts=pre-cascade snapshot
//	204 idempotent retire - agent was already retired, NO body
//	400 force without reason - ErrForceReasonRequired
//	403 sentinel - reserved agent (server-scanner / cloud-*), ErrAgentIsSentinel
//	404 not found - agent doesn't exist
//	409 blocked_by_dependencies - body: error, message, counts
//
// The default (force=false) flow refuses to retire agents with active
// downstream dependencies; the operator must re-run with --force and an
// explicit --reason to cascade. The handler rejects --force without
// --reason with a 400; we mirror that contract client-side so the
// operator gets a clear error before the round trip.
func (c *Client) RetireAgent(args []string) error {
	// Convention: `agents retire <id> [--force] [--reason <reason>]`. The ID
	// is a positional arg that precedes the flags. Go's flag package stops
	// parsing at the first non-flag token, so we pull args[0] as the ID and
	// hand args[1:] to the flag parser. Without this split, `agents retire
	// ag-1 --force --reason "x"` would parse with force=false and reason=""
	// because the flags land in fs.Args() instead of being recognized.
	if len(args) == 0 {
		return fmt.Errorf("agent ID is required: agents retire <id> [--force] [--reason <reason>]")
	}
	id := args[0]

	fs := flag.NewFlagSet("agents retire", flag.ContinueOnError)
	force := fs.Bool("force", false, "Cascade-retire downstream targets, certs, and jobs")
	reason := fs.String("reason", "", "Human-readable reason (required with --force)")
	if err := fs.Parse(args[1:]); err != nil {
		return err
	}

	// Mirror the handler's ErrForceReasonRequired contract client-side so
	// the operator gets a clear error before the round trip.
	if *force && strings.TrimSpace(*reason) == "" {
		return fmt.Errorf("--reason is required when --force is set")
	}

	// Build query string. Skip ?force=false; skip ?reason= when empty.
	query := url.Values{}
	if *force {
		query.Set("force", "true")
	}
	if *reason != "" {
		query.Set("reason", *reason)
	}

	u, err := url.JoinPath(c.baseURL, fmt.Sprintf("/api/v1/agents/%s", id))
	if err != nil {
		return fmt.Errorf("invalid URL: %w", err)
	}
	if len(query) > 0 {
		u = u + "?" + query.Encode()
	}

	req, err := http.NewRequest("DELETE", u, nil)
	if err != nil {
		return fmt.Errorf("creating request: %w", err)
	}
	req.Header.Set("Accept", "application/json")
	if c.apiKey != "" {
		req.Header.Set("Authorization", "Bearer "+c.apiKey)
	}

	resp, err := c.httpClient.Do(req)
	if err != nil {
		return fmt.Errorf("request failed: %w", err)
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return fmt.Errorf("reading response: %w", err)
	}

	switch resp.StatusCode {
	case http.StatusNoContent:
		// 204 idempotent: the agent was already retired. No body.
		if c.format == "json" {
			return c.outputJSON(map[string]interface{}{
				"agent_id":        id,
				"already_retired": true,
			})
		}
		fmt.Printf("Agent %s was already retired (idempotent)\n", id)
		return nil

	case http.StatusOK:
		var result struct {
			RetiredAt      string `json:"retired_at"`
			AlreadyRetired bool   `json:"already_retired"`
			Cascade        bool   `json:"cascade"`
			Counts         struct {
				ActiveTargets      int `json:"active_targets"`
				ActiveCertificates int `json:"active_certificates"`
				PendingJobs        int `json:"pending_jobs"`
			} `json:"counts"`
		}
		if err := json.Unmarshal(body, &result); err != nil {
			return fmt.Errorf("parsing 200 response: %w", err)
		}

		if c.format == "json" {
			return c.outputJSON(json.RawMessage(body))
		}

		if result.Cascade {
			fmt.Printf("Agent %s retired (cascade). Retired at: %s\n", id, result.RetiredAt)
			fmt.Printf("  Cascaded: %d targets, %d certificates, %d jobs\n",
				result.Counts.ActiveTargets, result.Counts.ActiveCertificates, result.Counts.PendingJobs)
		} else {
			fmt.Printf("Agent %s retired. Retired at: %s\n", id, result.RetiredAt)
		}
		return nil

	case http.StatusConflict:
		// 409 blocked_by_dependencies. Parse the body so we can show the
		// operator which dependency counts are holding up the retire.
		var blocked struct {
			Error   string `json:"error"`
			Message string `json:"message"`
			Counts  struct {
				ActiveTargets      int `json:"active_targets"`
				ActiveCertificates int `json:"active_certificates"`
				PendingJobs        int `json:"pending_jobs"`
			} `json:"counts"`
		}
		if err := json.Unmarshal(body, &blocked); err != nil {
			return fmt.Errorf("agent has active dependencies (HTTP 409); raw body: %s", string(body))
		}
		return fmt.Errorf("blocked_by_dependencies: %s (targets=%d certificates=%d jobs=%d); re-run with --force --reason \"<reason>\" to cascade",
			blocked.Message, blocked.Counts.ActiveTargets, blocked.Counts.ActiveCertificates, blocked.Counts.PendingJobs)

	case http.StatusForbidden:
		return fmt.Errorf("agent %s is a reserved sentinel and cannot be retired (HTTP 403)", id)

	case http.StatusNotFound:
		return fmt.Errorf("agent %s not found (HTTP 404)", id)

	case http.StatusBadRequest:
		return fmt.Errorf("bad request (HTTP 400): %s", string(body))

	default:
		return fmt.Errorf("unexpected HTTP %d: %s", resp.StatusCode, string(body))
	}
}

// GetAgent retrieves a single agent by ID.
func (c *Client) GetAgent(id string) error {
	resp, err := c.do("GET", fmt.Sprintf("/api/v1/agents/%s", id), nil, nil)
@@ -613,6 +801,35 @@ func (c *Client) outputAgentsTable(agents []map[string]interface{}, total int) e
	return nil
}

// outputRetiredAgentsTable is the tab-writer view for the retired listing.
// I-004: adds RETIRED_AT + REASON columns so operators can forensic-grep.
func (c *Client) outputRetiredAgentsTable(agents []map[string]interface{}, total int) error {
	w := tabwriter.NewWriter(os.Stdout, 0, 0, 2, ' ', 0)
	fmt.Fprintln(w, "ID\tHOSTNAME\tOS\tARCHITECTURE\tRETIRED AT\tREASON")

	for _, agent := range agents {
		id := getString(agent, "id")
		hostname := getString(agent, "hostname")
		osName := getString(agent, "os")
		arch := getString(agent, "architecture")
		retiredAt := ""
		if raw, ok := agent["retired_at"].(string); ok && raw != "" {
			if t, err := time.Parse(time.RFC3339, raw); err == nil {
				retiredAt = t.Format("2006-01-02 15:04:05")
			} else {
				retiredAt = raw
			}
		}
		reason := getString(agent, "retired_reason")

		fmt.Fprintf(w, "%s\t%s\t%s\t%s\t%s\t%s\n", id, hostname, osName, arch, retiredAt, reason)
	}

	w.Flush()
	fmt.Printf("\nTotal retired: %d\n", total)
	return nil
}

func (c *Client) outputAgentDetail(agent map[string]interface{}) error {
	w := tabwriter.NewWriter(os.Stdout, 0, 0, 2, ' ', 0)