mirror of
https://github.com/shankar0123/certctl.git
synced 2026-06-07 13:51:36 +00:00
Close I-004 (agent hard-delete cascades targets) coverage-gap finding
Operator decision answered as full soft-delete with optional forced
cascade — hard-delete is not reachable from any public surface. Prior
to this commit, DELETE /agents/{id} ran a plain `DELETE FROM agents`
whose schema-level `ON DELETE CASCADE` on deployment_targets.agent_id
silently wiped every target, orphaning certs and aborting in-flight
jobs. The finding closure reshapes the agent-removal contract around
soft retirement with explicit preflight counts, an opt-in cascade
gated by a mandatory reason, and unconditional protection for the
four reserved sentinel agents used by discovery sources.
Schema — migration 000015:
migrations/000015_agent_retire.up.sql flips
deployment_targets_agent_id_fkey from ON DELETE CASCADE to ON DELETE
RESTRICT, so a stray `DELETE FROM agents` now errors at the DB
boundary instead of quietly destroying targets. Both `agents` and
`deployment_targets` grow a retired_at TIMESTAMPTZ + retired_reason
TEXT pair (TEXT not VARCHAR so operator comments are never
truncated), indexed via partial indexes WHERE retired_at IS NOT
NULL. The migration is self-healing (ADD COLUMN IF NOT EXISTS, DROP
CONSTRAINT IF EXISTS then ADD CONSTRAINT, CREATE INDEX IF NOT
EXISTS) so repeated runs against partially-migrated databases
converge. migrations/000015_agent_retire.down.sql restores CASCADE
and drops the new columns for clean rollback. A dedicated
repository-layer testcontainers test
(internal/repository/postgres/migration_000015_test.go) asserts the
before/after FK action, column presence, index presence, and
round-trip idempotency under up→down→up.
Domain — sentinel guard + dependency counts:
internal/domain/connector.go gains IsRetired() on Agent, the
exported SentinelAgentIDs slice listing server-scanner,
cloud-aws-sm, cloud-azure-kv, cloud-gcp-sm verbatim (matching the
four reserved IDs documented in CLAUDE.md and created at startup in
cmd/server/main.go), IsSentinelAgent(id string) predicate,
AgentDependencyCounts{ActiveTargets, ActiveCertificates,
PendingJobs} with a HasDependencies() method, and ActorTypeAgent /
ActorTypeSystem enum values used by audit emission downstream.
Coverage locked down by internal/domain/connector_test.go.
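
A minimal sketch of the two domain predicates described above, assuming integer count fields and that HasDependencies() means "any bucket is non-zero" (neither detail is confirmed by this message; the real definitions live in internal/domain/connector.go):

```go
package main

import "fmt"

// SentinelAgentIDs lists the four reserved IDs named in the commit message.
var SentinelAgentIDs = []string{"server-scanner", "cloud-aws-sm", "cloud-azure-kv", "cloud-gcp-sm"}

// IsSentinelAgent reports whether id is one of the reserved sentinel agents.
func IsSentinelAgent(id string) bool {
	for _, s := range SentinelAgentIDs {
		if s == id {
			return true
		}
	}
	return false
}

// AgentDependencyCounts mirrors the three preflight buckets.
// The int field types are an assumption for this sketch.
type AgentDependencyCounts struct {
	ActiveTargets      int
	ActiveCertificates int
	PendingJobs        int
}

// HasDependencies is assumed to be true when any bucket is non-zero.
func (c AgentDependencyCounts) HasDependencies() bool {
	return c.ActiveTargets > 0 || c.ActiveCertificates > 0 || c.PendingJobs > 0
}

func main() {
	fmt.Println(IsSentinelAgent("cloud-aws-sm"))
	fmt.Println(IsSentinelAgent("ag-1"))
	fmt.Println(AgentDependencyCounts{PendingJobs: 2}.HasDependencies())
}
```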
Service — 8-step ordered contract:
internal/service/agent_retire.go:RetireAgent(ctx, id, actor,
opts{Force, Reason}) enforces a fixed execution order:
  (1) sentinel guard: IsSentinelAgent(id) returns ErrAgentIsSentinel
      unconditionally; force=true does NOT bypass it.
  (2) fetch: ErrAgentNotFound on miss.
  (3) idempotency: if IsRetired() already, return
      AgentRetirementResult{AlreadyRetired: true} with no new audit
      event and no state change (safe to replay from flaky clients).
  (4) preflight counts: collectAgentDependencyCounts runs
      ActiveTargets, ActiveCertificates, PendingJobs sequentially
      (not in parallel; keeps the per-query timeout predictable and
      matches the repo's existing call-chain shape).
  (5) force-reason guard: opts.Force=true with empty Reason returns
      ErrForceReasonRequired (wired into the 400 status surface).
  (6) dependency guard: HasDependencies() with opts.Force=false
      returns BlockedByDependenciesError{Counts} (wired into the 409
      body with per-bucket counts).
  (7) mutation: single pinned retiredAt := time.Now(); agent
      retirement first, then cascade target retirement if opts.Force,
      all under the repo's single transaction so the two retired_at
      stamps match to the second.
  (8) best-effort audit: agent_retired always;
      agent_retirement_cascaded additionally on the force path. Actor
      is whatever the handler resolves from the request; actor type is
      mapped by resolveActorType (system/agent-prefix→Agent/else→User).
      Audit emission failures are logged via slog.Error but do not
      abort the retirement (matches the house convention used by every
      other scheduler-emitted event).
BlockedByDependenciesError implements Error() as
"active_targets=%d, active_certificates=%d, pending_jobs=%d" and
Unwrap() → ErrBlockedByDependencies. The single struct satisfies
errors.Is via Unwrap (used by scheduler-level tests) and errors.As
via the concrete type (used by the handler to fish out Counts for
the 409 body). ListRetiredAgents(page, perPage) adds a separate
paginated accessor with page<1→1 and perPage<1→50 normalization so
retired rows are queryable without polluting the default agent
listing.
Sentinel guard coverage is asymmetric by design: force=true overrides
the dependency guard for ordinary agents but never the sentinel guard,
and all four reserved IDs are protected. Regression tests
in internal/service/agent_retire_test.go assert each of the eight
steps in order, plus sentinel bypass attempts and idempotency
replay.
Handler + router — status-code surface:
internal/api/handler/agents.go:RetireAgent exposes seven status
codes on DELETE /agents/{id}:
200 on a fresh retirement (body echoes AgentRetirementResult).
204 on idempotent replay (AlreadyRetired=true; no new audit).
400 on ErrForceReasonRequired.
403 on ErrAgentIsSentinel.
404 on ErrAgentNotFound.
409 on BlockedByDependenciesError, with a custom body shape
{error, counts{active_targets, active_certificates,
pending_jobs}} that bypasses the default ErrorWithRequestID
envelope so callers get the per-bucket numbers directly.
500 on any other error.
The heartbeat handler (HandleHeartbeat) returns 410 Gone when the
agent is retired (ErrAgentRetired), signalling the agent to shut down.
The query params `force=true` and `reason=<text>` drive the cascade
path; both are forwarded as url.Values through the new MCP
transport.
internal/api/router/router.go registers GET /api/v1/agents/retired
literal-path BEFORE /api/v1/agents/{id} — Go 1.22 ServeMux's
literal-beats-pattern-var precedence routes "retired" to the
paginated retired-agents listing instead of fetching a hypothetical
agent named "retired".
Agent binary — clean shutdown on 410:
cmd/agent/main.go gains the ErrAgentRetired sentinel, a
retiredOnce sync.Once, and a retiredSignal chan struct{}. A
markRetired(source, statusCode, body) helper closes the channel
exactly once; the Run() select loop observes the close and returns
ErrAgentRetired; main() matches via errors.Is(err, ErrAgentRetired)
and exits cleanly instead of spinning in the heartbeat retry loop.
The 410 Gone surface is therefore terminal for the agent process.
MCP transport:
internal/mcp/client.go adds Client.DeleteWithQuery(path, query),
a new additive transport method. Client.Delete is path-only; without
this method the retire tool would silently drop `force` and `reason`,
turning every cascade retire into a default soft-retire. The new
method shares do()'s 204 normalization and 4xx/5xx error
propagation so tool authors get one contract.
internal/mcp/tools.go + internal/mcp/types.go expose the
retire_agent tool with Force+Reason inputs wired through
DeleteWithQuery.
CLI:
cmd/cli/main.go + internal/cli/client.go add two CLI surfaces:
`agents list --retired` (client-side strip of --retired then
delegation to ListRetiredAgents, sharing --page/--per-page parsing
with the default listing) and `agents retire <id> [--force --reason
"…"]` (mirrors ErrForceReasonRequired — force without reason is
rejected client-side before the request is sent). JSON + table
output modes both honor the new columns.
Frontend:
web/src/pages/AgentsPage.tsx surfaces retired/retire affordances.
web/src/api/client.ts + web/src/api/types.ts expose the retire
endpoint and the retired-listing. 4 new Vitest regression cases.
OpenAPI:
api/openapi.yaml documents DELETE /agents/{id} with all seven
status codes, 410 on heartbeat, and the 409 per-bucket body shape.
Regression coverage (six new test files, all green):
internal/service/agent_retire_test.go — 8-step contract + sentinel guards
internal/api/handler/agent_retire_handler_test.go — 7-status-code surface + 410 heartbeat
internal/mcp/retire_agent_test.go — DeleteWithQuery wire-through
internal/cli/agent_retire_test.go — --retired listing + --force/--reason pairing
internal/repository/postgres/migration_000015_test.go — FK flip + columns + indexes + up↔down
internal/domain/connector_test.go — IsRetired, IsSentinelAgent, SentinelAgentIDs, HasDependencies
Files:
api/openapi.yaml — DELETE + 410 + 409 body shape
cmd/agent/main.go — ErrAgentRetired, markRetired, retiredSignal
cmd/cli/main.go — handleAgents list/get/retire dispatch
docs/architecture.md, docs/concepts.md,
docs/testing-guide.md — retirement contract narrative
internal/api/handler/agents.go — RetireAgent, status surface, 410 on heartbeat
internal/api/handler/agent_handler_test.go — extended coverage
internal/api/handler/agent_retire_handler_test.go — new
internal/api/router/router.go — /agents/retired before /agents/{id}
internal/cli/agent_retire_test.go — new
internal/cli/client.go — ListRetiredAgents + RetireAgent
internal/domain/connector.go — IsRetired, SentinelAgentIDs,
IsSentinelAgent, AgentDependencyCounts,
ActorTypeAgent/System
internal/domain/connector_test.go — new
internal/integration/lifecycle_test.go — retirement fixture
internal/mcp/client.go — DeleteWithQuery additive transport
internal/mcp/retire_agent_test.go — new
internal/mcp/tools.go, internal/mcp/types.go — retire_agent tool + Force/Reason inputs
internal/repository/interfaces.go — AgentRepository retirement methods
internal/repository/postgres/agent.go — retire + cascade target retire + counts
internal/repository/postgres/migration_000015_test.go — new
internal/service/agent.go — wire into AgentService surface
internal/service/agent_retire.go — new 8-step contract
internal/service/agent_retire_test.go — new
internal/service/deployment.go — skip retired agents
internal/service/target.go — skip retired agents
internal/service/testutil_test.go — shared mocks extended
migrations/000015_agent_retire.up.sql — new
migrations/000015_agent_retire.down.sql — new
web/src/api/client.ts, types.ts + tests — retire endpoint wiring
web/src/pages/AgentsPage.tsx — retire UI
@@ -19,6 +19,8 @@ import {
   getAgents,
   getAgent,
   registerAgent,
+  retireAgent,
+  listRetiredAgents,
   getJobs,
   cancelJob,
   approveRenewal,
@@ -399,6 +401,113 @@ describe('API Client', () => {
     });
   });
 
+  // ─── Agent Retirement (I-004) ───────────────────────
+  //
+  // These tests pin the GUI's retirement contract against what the backend
+  // will add in Phase 2b: soft-retire via DELETE, force-cascade via
+  // ?force=true&reason=..., idempotent 204 on already-retired, 409 blocked
+  // payload with counts, and a GET /agents/retired listing surface.
+  //
+  // All compile-fail until client.ts exports retireAgent + listRetiredAgents;
+  // the shape of those exports is pinned here rather than assumed.
+  describe('Agent Retirement (I-004)', () => {
+    it('retireAgent sends DELETE without query when no force/reason', async () => {
+      mockFetch.mockReturnValueOnce(
+        mockJsonResponse({
+          retired_at: '2026-04-18T12:00:00Z',
+          already_retired: false,
+          cascade: false,
+        }),
+      );
+      await retireAgent('ag-1');
+      const [url, init] = mockFetch.mock.calls[0];
+      // Default soft-retire: bare path, no stray ? suffix.
+      expect(url).toBe('/api/v1/agents/ag-1');
+      expect(init.method).toBe('DELETE');
+    });
+
+    it('retireAgent propagates force+reason as URL query parameters', async () => {
+      mockFetch.mockReturnValueOnce(
+        mockJsonResponse({
+          retired_at: '2026-04-18T12:00:00Z',
+          already_retired: false,
+          cascade: true,
+          counts: { active_targets: 3, active_certificates: 7, pending_jobs: 2 },
+        }),
+      );
+      await retireAgent('ag-1', { force: true, reason: 'decommissioning rack 7' });
+      const [url, init] = mockFetch.mock.calls[0];
+      // URLSearchParams encodes space as "+"; "decommissioning rack 7" → "decommissioning+rack+7"
+      expect(url).toBe(
+        '/api/v1/agents/ag-1?force=true&reason=decommissioning+rack+7',
+      );
+      expect(init.method).toBe('DELETE');
+    });
+
+    it('retireAgent omits force=false even when reason is supplied', async () => {
+      // Client-side guard: the server's 400 ErrForceReasonRequired is the
+      // fallback; the GUI should never silently promote reason-without-force
+      // into a force call. Pins that reason-only still hits the soft path.
+      mockFetch.mockReturnValueOnce(
+        mockJsonResponse({
+          retired_at: '2026-04-18T12:00:00Z',
+          already_retired: false,
+          cascade: false,
+        }),
+      );
+      await retireAgent('ag-1', { reason: 'routine decommission' });
+      const [url] = mockFetch.mock.calls[0];
+      // force defaults to false → query carries reason only.
+      expect(url).toBe('/api/v1/agents/ag-1?reason=routine+decommission');
+    });
+
+    it('retireAgent surfaces the 409 dependency error message to the caller', async () => {
+      mockFetch.mockReturnValueOnce(
+        mockErrorResponse(409, {
+          message: 'agent has 3 active targets, 7 active certificates, 2 pending jobs',
+        }),
+      );
+      await expect(retireAgent('ag-1')).rejects.toThrow(
+        /active targets|active certificates|pending jobs/,
+      );
+    });
+
+    it('retireAgent treats 204 (already-retired) as success with empty body', async () => {
+      mockFetch.mockReturnValueOnce(
+        Promise.resolve({
+          ok: true,
+          status: 204,
+          json: () => Promise.reject(new Error('204 has no body')),
+          statusText: 'No Content',
+        } as Response),
+      );
+      // retireAgent synthesizes a response for 204 without reading the body,
+      // so the caller must not crash.
+      const result = await retireAgent('ag-1');
+      expect(result).toBeDefined();
+    });
+
+    it('listRetiredAgents sends GET /agents/retired with default pagination', async () => {
+      mockFetch.mockReturnValueOnce(
+        mockJsonResponse({ data: [], total: 0, page: 1, per_page: 50 }),
+      );
+      await listRetiredAgents();
+      const [url, init] = mockFetch.mock.calls[0];
+      expect(url).toBe('/api/v1/agents/retired?page=1&per_page=50');
+      // Default is GET: no explicit method means fetchJSON falls through.
+      expect(init.method ?? 'GET').toBe('GET');
+    });
+
+    it('listRetiredAgents forwards page/per_page overrides', async () => {
+      mockFetch.mockReturnValueOnce(
+        mockJsonResponse({ data: [], total: 0, page: 2, per_page: 100 }),
+      );
+      await listRetiredAgents({ page: '2', per_page: '100' });
+      const [url] = mockFetch.mock.calls[0];
+      expect(url).toContain('page=2');
+      expect(url).toContain('per_page=100');
+    });
+  });
+
   // ─── Jobs ───────────────────────────────────────────
 
   describe('Jobs', () => {
+93
-1
@@ -1,4 +1,4 @@
-import type { Certificate, CertificateVersion, Agent, Job, Notification, AuditEvent, PolicyRule, PolicyViolation, Issuer, Target, CertificateProfile, Owner, Team, AgentGroup, PaginatedResponse, DashboardSummary, CertificateStatusCount, ExpirationBucket, JobTrendDataPoint, IssuanceRateDataPoint, MetricsResponse, DiscoveredCertificate, DiscoveryScan, DiscoverySummary, NetworkScanTarget, EndpointHealthCheck, HealthHistoryEntry, HealthCheckSummary } from './types';
+import type { Certificate, CertificateVersion, Agent, Job, Notification, AuditEvent, PolicyRule, PolicyViolation, Issuer, Target, CertificateProfile, Owner, Team, AgentGroup, PaginatedResponse, DashboardSummary, CertificateStatusCount, ExpirationBucket, JobTrendDataPoint, IssuanceRateDataPoint, MetricsResponse, DiscoveredCertificate, DiscoveryScan, DiscoverySummary, NetworkScanTarget, EndpointHealthCheck, HealthHistoryEntry, HealthCheckSummary, AgentDependencyCounts, RetireAgentResponse, BlockedByDependenciesResponse } from './types';
 
 const BASE = '/api/v1';
 
@@ -188,6 +188,98 @@ export const getAgent = (id: string) =>
 export const registerAgent = (data: Partial<Agent>) =>
   fetchJSON<Agent>(`${BASE}/agents`, { method: 'POST', body: JSON.stringify(data) });
 
+// I-004: typed error thrown by retireAgent when the server returns HTTP 409 with
+// {error: "blocked_by_dependencies", ...}. Callers that want to show the
+// dependency-counts dialog should `catch (e)` and check `e instanceof
+// BlockedByDependenciesError`; the counts field is the same shape the
+// backend handler returns from its inline struct in
+// internal/api/handler/agents.go. Generic network / 5xx failures still throw
+// plain Error so existing error-boundary code is unaffected.
+export class BlockedByDependenciesError extends Error {
+  readonly counts: AgentDependencyCounts;
+  constructor(message: string, counts: AgentDependencyCounts) {
+    super(message);
+    this.name = 'BlockedByDependenciesError';
+    this.counts = counts;
+  }
+}
+
+// I-004: retire an agent via DELETE /api/v1/agents/{id}. Three distinct
+// outcomes the UI needs to distinguish:
+//   * 200: fresh retire; body has retired_at, already_retired=false, cascade
+//       flag, counts of what was cascaded.
+//   * 204: idempotent re-retire; the row was already retired. No body. We
+//       synthesize a RetireAgentResponse with already_retired=true and zero
+//       counts so the caller can keep a single return type.
+//   * 409: blocked_by_dependencies; thrown as BlockedByDependenciesError so
+//       the caller can surface the active_targets/active_certificates/pending_jobs
+//       counts in a confirmation dialog and offer force=true.
+// Anything else bubbles up via the standard fetchJSON error path.
+export const retireAgent = async (
+  id: string,
+  opts: { force?: boolean; reason?: string } = {},
+): Promise<RetireAgentResponse> => {
+  const qs = new URLSearchParams();
+  if (opts.force) qs.set('force', 'true');
+  if (opts.reason) qs.set('reason', opts.reason);
+  const url = qs.toString()
+    ? `${BASE}/agents/${id}?${qs.toString()}`
+    : `${BASE}/agents/${id}`;
+
+  const res = await fetch(url, {
+    method: 'DELETE',
+    headers: authHeaders(),
+  });
+
+  if (res.status === 401) {
+    window.dispatchEvent(new CustomEvent('certctl:auth-required'));
+    throw new Error('Authentication required');
+  }
+
+  // 204 No Content: idempotent re-retire. Synthesize a response so callers
+  // get a uniform shape; already_retired=true tells them the agent was
+  // already in the retired state before this call.
+  if (res.status === 204) {
+    return {
+      retired_at: '',
+      already_retired: true,
+      cascade: false,
+      counts: { active_targets: 0, active_certificates: 0, pending_jobs: 0 },
+    };
+  }
+
+  if (res.status === 409) {
+    // Body is always JSON for 409 per the handler contract.
+    const body = (await res.json()) as BlockedByDependenciesResponse;
+    throw new BlockedByDependenciesError(
+      body.message || 'agent has active dependencies',
+      body.counts,
+    );
+  }
+
+  if (!res.ok) {
+    let errorMsg = res.statusText;
+    try {
+      const body = await res.json();
+      errorMsg = body.message || body.error || errorMsg;
+    } catch {
+      // not JSON
+    }
+    throw new Error(errorMsg || `HTTP ${res.status}`);
+  }
+
+  return (await res.json()) as RetireAgentResponse;
+};
+
+// I-004: list retired agents via GET /api/v1/agents/retired. Kept separate
+// from getAgents (which hits the default active-only listing) so the retired
+// tab on AgentsPage can page independently. per_page is capped server-side at
+// 500 (see handler ListRetiredAgents).
+export const listRetiredAgents = (params: Record<string, string> = {}) => {
+  const qs = new URLSearchParams({ page: '1', per_page: '50', ...params }).toString();
+  return fetchJSON<PaginatedResponse<Agent>>(`${BASE}/agents/retired?${qs}`);
+};
+
 // Jobs
 export const getJobs = (params: Record<string, string> = {}) => {
   const qs = new URLSearchParams({ page: '1', per_page: '50', ...params }).toString();
@@ -1,5 +1,6 @@
 import { describe, it, expect } from 'vitest';
 import { POLICY_TYPES, POLICY_SEVERITIES } from './types';
+import type { Agent } from './types';
 
 /**
  * Regression tests for the policy enum tuples.
@@ -58,3 +59,76 @@ describe('POLICY_SEVERITIES', () => {
     expect(POLICY_SEVERITIES as readonly string[]).not.toContain('medium');
   });
 });
+
+/**
+ * Regression test for the Agent interface's I-004 soft-retirement shape.
+ *
+ * Backend (migration 000015, Phase 2b) adds two nullable timestamps/strings to
+ * the agents table, `retired_at` and `retired_reason`, mirroring the existing
+ * Certificate.revoked_at / Certificate.revocation_reason pair. The GUI needs
+ * these fields on the Agent interface so the Retired tab, retire modal, and
+ * retirement banner can render the agent's retired state without resorting to
+ * `(agent as any).retired_at` escapes.
+ *
+ * Both fields are optional (types.ts interface) because the server omits them
+ * from the response for active agents. A compile-time shape check here pins
+ * that Phase 2b does not drift the field names (e.g. to retiredAt camelCase)
+ * or accidentally promote them to required.
+ *
+ * Compile-fail until Phase 2b adds:
+ *   retired_at?: string;
+ *   retired_reason?: string;
+ * to the Agent interface in types.ts.
+ */
+describe('Agent interface (I-004 retirement)', () => {
+  it('accepts retired_at and retired_reason as optional string fields', () => {
+    // Construct an Agent with the retirement fields set. If Phase 2b names
+    // them anything other than retired_at / retired_reason, this fails to
+    // compile, which is exactly what the Red stage wants.
+    const retired: Agent = {
+      id: 'ag-1',
+      name: 'decom-01',
+      hostname: 'server-old',
+      ip_address: '10.0.0.1',
+      os: 'linux',
+      architecture: 'amd64',
+      status: 'Offline',
+      version: '2.1.0',
+      last_heartbeat: '2026-01-01T00:00:00Z',
+      last_heartbeat_at: '2026-01-01T00:00:00Z',
+      capabilities: [],
+      tags: {},
+      registered_at: '2024-01-01T00:00:00Z',
+      created_at: '2024-01-01T00:00:00Z',
+      updated_at: '2026-01-01T00:00:00Z',
+      retired_at: '2026-01-01T00:00:00Z',
+      retired_reason: 'old hardware',
+    };
+    expect(retired.retired_at).toBe('2026-01-01T00:00:00Z');
+    expect(retired.retired_reason).toBe('old hardware');
+  });
+
+  it('accepts an Agent without retired_at / retired_reason (optional fields)', () => {
+    // Active agents should not carry retirement metadata. If Phase 2b makes
+    // the fields required, this block fails to compile.
+    const active: Agent = {
+      id: 'ag-2',
+      name: 'web01',
+      hostname: 'web01.prod',
+      ip_address: '10.0.0.2',
+      os: 'linux',
+      architecture: 'amd64',
+      status: 'Online',
+      version: '2.1.0',
+      last_heartbeat: '2026-04-18T12:00:00Z',
+      last_heartbeat_at: '2026-04-18T12:00:00Z',
+      capabilities: ['deploy', 'scan'],
+      tags: {},
+      registered_at: '2024-06-01T00:00:00Z',
+      created_at: '2024-06-01T00:00:00Z',
+      updated_at: '2026-04-18T12:00:00Z',
+    };
+    expect(active.retired_at).toBeUndefined();
+    expect(active.retired_reason).toBeUndefined();
+  });
+});
@@ -67,6 +67,43 @@ export interface Agent {
   registered_at: string;
   created_at: string;
   updated_at: string;
+  // I-004: soft-retirement fields. When retired_at is non-null, the agent is
+  // tombstoned: it will never heartbeat again and cascaded targets have been
+  // retired alongside it. The retired tab on AgentsPage uses these to show the
+  // when/why. The server filters retired rows from the default /api/v1/agents
+  // listing; they appear only via GET /api/v1/agents/retired.
+  retired_at?: string | null;
+  retired_reason?: string | null;
 }
 
+// I-004: dependency counts returned by the retire handler in both the 200
+// success-with-cascade body and the 409 blocked_by_dependencies body. The
+// operator UI uses these to show "this agent has N targets, M certs, K jobs
+// depending on it" in the confirm-retire dialog.
+export interface AgentDependencyCounts {
+  active_targets: number;
+  active_certificates: number;
+  pending_jobs: number;
+}
+
+// I-004: success shape for DELETE /api/v1/agents/{id}. already_retired is
+// always false for 200 responses; 204 responses carry no body (the retire was
+// idempotent, i.e. the agent was already retired). The frontend distinguishes
+// by HTTP status, not by this field.
+export interface RetireAgentResponse {
+  retired_at: string;
+  already_retired: boolean;
+  cascade: boolean;
+  counts: AgentDependencyCounts;
+}
+
+// I-004: shape returned with HTTP 409 when a retire is blocked by active
+// downstream dependencies. Keep in lockstep with the handler's inline struct
+// in internal/api/handler/agents.go (search "blocked_by_dependencies").
+export interface BlockedByDependenciesResponse {
+  error: 'blocked_by_dependencies';
+  message: string;
+  counts: AgentDependencyCounts;
+}
+
 export interface Job {
+379
-13
@@ -1,13 +1,19 @@
 import { useState } from 'react';
 import { useNavigate } from 'react-router-dom';
-import { useQuery } from '@tanstack/react-query';
-import { getAgents } from '../api/client';
+import { useQuery, useMutation, useQueryClient } from '@tanstack/react-query';
+import {
+  getAgents,
+  listRetiredAgents,
+  retireAgent,
+  BlockedByDependenciesError,
+} from '../api/client';
 import PageHeader from '../components/PageHeader';
 import DataTable from '../components/DataTable';
 import type { Column } from '../components/DataTable';
 import StatusBadge from '../components/StatusBadge';
 import ErrorState from '../components/ErrorState';
 import { timeAgo } from '../api/utils';
-import type { Agent } from '../api/types';
+import type { Agent, AgentDependencyCounts } from '../api/types';
 
 function heartbeatStatus(lastHeartbeat: string): string {
   if (!lastHeartbeat) return 'Offline';
@@ -17,15 +23,84 @@ function heartbeatStatus(lastHeartbeat: string): string {
   return 'Offline';
 }
 
+type TabKey = 'active' | 'retired';
+
+// I-004: retire-modal state machine.
+//   confirm: operator clicked Retire, shown plain confirm + optional reason.
+//   blocked: soft retire returned 409; switch to a force-retire dialog that
+//            shows the dependency counts and requires a reason before the
+//            operator can opt into ?force=true.
+//   error:   any other failure (network, 500, unexpected 4xx). Reused by both
+//            the initial attempt and the force retry.
+type ModalMode =
+  | { kind: 'closed' }
+  | { kind: 'confirm'; agent: Agent; reason: string }
+  | { kind: 'blocked'; agent: Agent; reason: string; counts: AgentDependencyCounts }
+  | { kind: 'error'; agent: Agent; message: string };
+
 export default function AgentsPage() {
   const navigate = useNavigate();
-  const { data, isLoading, error, refetch } = useQuery({
+  const qc = useQueryClient();
+  const [tab, setTab] = useState<TabKey>('active');
+  const [modal, setModal] = useState<ModalMode>({ kind: 'closed' });
+
+  const active = useQuery({
     queryKey: ['agents'],
     queryFn: () => getAgents(),
     refetchInterval: 15000,
+    enabled: tab === 'active',
   });
 
-  const columns: Column<Agent>[] = [
+  const retired = useQuery({
+    queryKey: ['agents', 'retired'],
+    queryFn: () => listRetiredAgents(),
+    refetchInterval: 30000,
+    enabled: tab === 'retired',
+  });
+
+  // retireAgent mutation wrapping both paths. The caller supplies force/reason,
+  // and we invalidate both queries on success so the retired tab refreshes and
+  // the active tab drops the row. 409s are converted into modal.kind=blocked so
+  // the operator can escalate to force; everything else becomes modal.kind=error.
+  const mutation = useMutation({
+    mutationFn: (input: { agent: Agent; force?: boolean; reason?: string }) =>
+      retireAgent(input.agent.id, { force: input.force, reason: input.reason }),
+    onSuccess: () => {
+      qc.invalidateQueries({ queryKey: ['agents'] });
+      qc.invalidateQueries({ queryKey: ['agents', 'retired'] });
+      setModal({ kind: 'closed' });
+    },
+  });
+
+  // Shared submit handler: when we know the current modal.agent + modal.reason,
+  // decide whether this is a soft retire or force retire based on modal.kind.
+  const submitRetire = (force: boolean) => {
+    if (modal.kind !== 'confirm' && modal.kind !== 'blocked') return;
+    const { agent, reason } = modal;
+    mutation.mutate(
+      { agent, force, reason: reason || undefined },
+      {
+        onError: (err) => {
+          if (err instanceof BlockedByDependenciesError) {
+            setModal({
+              kind: 'blocked',
+              agent,
+              reason,
+              counts: err.counts ?? { active_targets: 0, active_certificates: 0, pending_jobs: 0 },
+            });
+            return;
+          }
+          setModal({
+            kind: 'error',
+            agent,
+            message: err instanceof Error ? err.message : String(err),
+          });
+        },
+      },
+    );
+  };
+
+  const activeColumns: Column<Agent>[] = [
     {
       key: 'name',
       label: 'Agent',
@@ -41,27 +116,318 @@ export default function AgentsPage() {
|
||||
label: 'Health',
|
||||
render: (a) => <StatusBadge status={a.status || heartbeatStatus(a.last_heartbeat_at)} />,
|
||||
},
|
||||
{ key: 'hostname', label: 'Hostname', render: (a) => <span className="text-ink-muted font-mono text-xs">{a.hostname || '—'}</span> },
|
||||
{ key: 'os', label: 'OS / Arch', render: (a) => <span className="text-ink-muted text-xs">{a.os && a.architecture ? `${a.os}/${a.architecture}` : a.os || '—'}</span> },
|
||||
{ key: 'ip', label: 'IP Address', render: (a) => <span className="text-ink-muted font-mono text-xs">{a.ip_address || '—'}</span> },
|
||||
{ key: 'version', label: 'Version', render: (a) => <span className="text-ink-muted text-xs">{a.version || '—'}</span> },
|
||||
{
|
||||
key: 'hostname',
|
||||
label: 'Hostname',
|
||||
render: (a) => <span className="text-ink-muted font-mono text-xs">{a.hostname || '—'}</span>,
|
||||
},
|
||||
{
|
||||
key: 'os',
|
||||
label: 'OS / Arch',
|
||||
render: (a) => (
|
||||
<span className="text-ink-muted text-xs">
|
||||
{a.os && a.architecture ? `${a.os}/${a.architecture}` : a.os || '—'}
|
||||
</span>
|
||||
),
|
||||
},
|
||||
{
|
||||
key: 'ip',
|
||||
label: 'IP Address',
|
||||
render: (a) => <span className="text-ink-muted font-mono text-xs">{a.ip_address || '—'}</span>,
|
||||
},
|
||||
{
|
||||
key: 'version',
|
||||
label: 'Version',
|
||||
render: (a) => <span className="text-ink-muted text-xs">{a.version || '—'}</span>,
|
||||
},
|
||||
    {
      key: 'heartbeat',
      label: 'Last Heartbeat',
      render: (a) => <span className="text-ink-muted text-xs">{timeAgo(a.last_heartbeat_at)}</span>,
    },
    {
      key: 'actions',
      label: '',
      render: (a) => (
        <button
          type="button"
          onClick={(e) => {
            // Table rows are navigable via onRowClick. The retire button must
            // not trigger the row-click handler or the modal will race the
            // navigation and unmount mid-render.
            e.stopPropagation();
            setModal({ kind: 'confirm', agent: a, reason: '' });
          }}
          className="px-3 py-1 text-xs font-medium text-danger border border-danger/30 rounded hover:bg-danger/10"
        >
          Retire
        </button>
      ),
    },
  ];

  const retiredColumns: Column<Agent>[] = [
    {
      key: 'name',
      label: 'Agent',
      render: (a) => (
        <div>
          <div className="font-medium text-ink">{a.name}</div>
          <div className="text-xs text-ink-faint">{a.id}</div>
        </div>
      ),
    },
    {
      key: 'hostname',
      label: 'Hostname',
      render: (a) => <span className="text-ink-muted font-mono text-xs">{a.hostname || '—'}</span>,
    },
    {
      key: 'os',
      label: 'OS / Arch',
      render: (a) => (
        <span className="text-ink-muted text-xs">
          {a.os && a.architecture ? `${a.os}/${a.architecture}` : a.os || '—'}
        </span>
      ),
    },
    {
      key: 'retired_at',
      label: 'Retired',
      render: (a) => <span className="text-ink-muted text-xs">{timeAgo(a.retired_at || '')}</span>,
    },
    {
      key: 'retired_reason',
      label: 'Reason',
      render: (a) => (
        <span className="text-ink-muted text-xs">{a.retired_reason || <em>—</em>}</span>
      ),
    },
  ];

  const currentQuery = tab === 'active' ? active : retired;
  const currentColumns = tab === 'active' ? activeColumns : retiredColumns;
  const emptyMessage = tab === 'active' ? 'No agents registered' : 'No retired agents';

  return (
    <>
      <PageHeader title="Agents" subtitle={data ? `${data.total} agents` : undefined} />
      <PageHeader
        title="Agents"
        subtitle={
          tab === 'active' && active.data
            ? `${active.data.total} active`
            : tab === 'retired' && retired.data
              ? `${retired.data.total} retired`
              : undefined
        }
      />

      <div className="px-6 pt-2">
        <div className="flex gap-2 border-b border-border">
          <TabButton active={tab === 'active'} onClick={() => setTab('active')}>
            Active
          </TabButton>
          <TabButton active={tab === 'retired'} onClick={() => setTab('retired')}>
            Retired
          </TabButton>
        </div>
      </div>

      <div className="flex-1 overflow-y-auto">
        {error ? (
          <ErrorState error={error as Error} onRetry={() => refetch()} />
        {currentQuery.error ? (
          <ErrorState error={currentQuery.error as Error} onRetry={() => currentQuery.refetch()} />
        ) : (
          <DataTable columns={columns} data={data?.data || []} isLoading={isLoading} emptyMessage="No agents registered" onRowClick={(a) => navigate(`/agents/${a.id}`)} />
          <DataTable
            columns={currentColumns}
            data={currentQuery.data?.data || []}
            isLoading={currentQuery.isLoading}
            emptyMessage={emptyMessage}
            onRowClick={(a) => navigate(`/agents/${a.id}`)}
          />
        )}
      </div>

      {modal.kind !== 'closed' && (
        <RetireModal
          mode={modal}
          pending={mutation.isPending}
          onClose={() => setModal({ kind: 'closed' })}
          onReasonChange={(reason) => {
            if (modal.kind === 'confirm') setModal({ ...modal, reason });
            if (modal.kind === 'blocked') setModal({ ...modal, reason });
          }}
          onSoftRetire={() => submitRetire(false)}
          onForceRetire={() => submitRetire(true)}
        />
      )}
    </>
  );
}

function TabButton({
  active,
  onClick,
  children,
}: {
  active: boolean;
  onClick: () => void;
  children: React.ReactNode;
}) {
  return (
    <button
      type="button"
      onClick={onClick}
      className={
        active
          ? 'px-4 py-2 text-sm font-medium text-ink border-b-2 border-accent -mb-px'
          : 'px-4 py-2 text-sm text-ink-muted hover:text-ink'
      }
    >
      {children}
    </button>
  );
}

function RetireModal({
  mode,
  pending,
  onClose,
  onReasonChange,
  onSoftRetire,
  onForceRetire,
}: {
  mode: ModalMode;
  pending: boolean;
  onClose: () => void;
  onReasonChange: (reason: string) => void;
  onSoftRetire: () => void;
  onForceRetire: () => void;
}) {
  if (mode.kind === 'closed') return null;

  return (
    <div
      role="dialog"
      aria-modal="true"
      className="fixed inset-0 z-40 flex items-center justify-center bg-black/40"
      onClick={onClose}
    >
      <div
        className="w-full max-w-lg rounded-lg bg-surface p-6 shadow-lg border border-border"
        onClick={(e) => e.stopPropagation()}
      >
        {mode.kind === 'confirm' && (
          <>
            <h2 className="text-lg font-semibold text-ink">Retire agent</h2>
            <p className="mt-2 text-sm text-ink-muted">
              <span className="font-mono">{mode.agent.name}</span> ({mode.agent.id}) will be
              soft-retired. The agent will stop receiving heartbeats and be removed from active
              listings. This is reversible only by direct database intervention.
            </p>
            <label className="mt-4 block text-xs font-medium text-ink-muted">
              Reason (optional)
              <input
                type="text"
                value={mode.reason}
                onChange={(e) => onReasonChange(e.target.value)}
                placeholder="e.g. decommissioning rack 7"
                className="mt-1 w-full rounded border border-border bg-surface-alt px-2 py-1 text-sm"
              />
            </label>
            <div className="mt-6 flex justify-end gap-2">
              <button
                type="button"
                onClick={onClose}
                className="px-4 py-2 text-sm text-ink-muted hover:text-ink"
                disabled={pending}
              >
                Cancel
              </button>
              <button
                type="button"
                onClick={onSoftRetire}
                disabled={pending}
                className="px-4 py-2 text-sm font-medium text-white bg-danger rounded hover:bg-danger/90 disabled:opacity-50"
              >
                {pending ? 'Retiring…' : 'Retire'}
              </button>
            </div>
          </>
        )}

        {mode.kind === 'blocked' && (
          <>
            <h2 className="text-lg font-semibold text-ink">Cannot retire — active dependencies</h2>
            <p className="mt-2 text-sm text-ink-muted">
              The agent <span className="font-mono">{mode.agent.name}</span> still has downstream
              work tied to it. Force-retiring will cascade-retire all active targets and fail any
              pending jobs.
            </p>
            <dl className="mt-4 grid grid-cols-3 gap-3 text-center">
              <div className="rounded border border-border bg-surface-alt p-3">
                <dt className="text-xs text-ink-muted">Active targets</dt>
                <dd className="mt-1 text-xl font-semibold text-ink">{mode.counts.active_targets}</dd>
              </div>
              <div className="rounded border border-border bg-surface-alt p-3">
                <dt className="text-xs text-ink-muted">Active certs</dt>
                <dd className="mt-1 text-xl font-semibold text-ink">
                  {mode.counts.active_certificates}
                </dd>
              </div>
              <div className="rounded border border-border bg-surface-alt p-3">
                <dt className="text-xs text-ink-muted">Pending jobs</dt>
                <dd className="mt-1 text-xl font-semibold text-ink">{mode.counts.pending_jobs}</dd>
              </div>
            </dl>
            <label className="mt-4 block text-xs font-medium text-ink-muted">
              Reason <span className="text-danger">(required for force retire)</span>
              <input
                type="text"
                value={mode.reason}
                onChange={(e) => onReasonChange(e.target.value)}
                placeholder="e.g. rack 7 decommission, cascade retire"
                className="mt-1 w-full rounded border border-border bg-surface-alt px-2 py-1 text-sm"
              />
            </label>
            <div className="mt-6 flex justify-end gap-2">
              <button
                type="button"
                onClick={onClose}
                className="px-4 py-2 text-sm text-ink-muted hover:text-ink"
                disabled={pending}
              >
                Cancel
              </button>
              <button
                type="button"
                onClick={onForceRetire}
                // Backend enforces reason on force; keep the GUI in lockstep
                // rather than letting a 400 bounce back.
                disabled={pending || !mode.reason.trim()}
                className="px-4 py-2 text-sm font-medium text-white bg-danger rounded hover:bg-danger/90 disabled:opacity-50"
              >
                {pending ? 'Force-retiring…' : 'Force retire'}
              </button>
            </div>
          </>
        )}

        {mode.kind === 'error' && (
          <>
            <h2 className="text-lg font-semibold text-ink">Retire failed</h2>
            <p className="mt-2 text-sm text-danger">{mode.message}</p>
            <div className="mt-6 flex justify-end">
              <button
                type="button"
                onClick={onClose}
                className="px-4 py-2 text-sm text-ink-muted hover:text-ink"
              >
                Close
              </button>
            </div>
          </>
        )}
      </div>
    </div>
  );
}
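
The modal above is driven by a `ModalMode` value whose definition falls outside this hunk. As a rough sketch only, the union below is inferred from how `RetireModal` reads `mode.kind`, `mode.reason`, `mode.counts`, and `mode.message`; the `AgentRef` and `DependencyCounts` names and the `withReason` / `canForceRetire` helpers are hypothetical and may not match the real file.

```typescript
// Hypothetical shapes inferred from usage in the diff; the real definitions
// live earlier in the file and may differ.
interface AgentRef {
  id: string;
  name: string;
}

interface DependencyCounts {
  active_targets: number;
  active_certificates: number;
  pending_jobs: number;
}

// One variant per modal state rendered by RetireModal.
type ModalMode =
  | { kind: 'closed' }
  | { kind: 'confirm'; agent: AgentRef; reason: string }
  | { kind: 'blocked'; agent: AgentRef; reason: string; counts: DependencyCounts }
  | { kind: 'error'; agent: AgentRef; message: string };

// Mirrors the onReasonChange handler: only the two states that carry a
// reason field accept edits; every other state is returned unchanged.
function withReason(mode: ModalMode, reason: string): ModalMode {
  if (mode.kind === 'confirm') return { ...mode, reason };
  if (mode.kind === 'blocked') return { ...mode, reason };
  return mode;
}

// Mirrors the disabled check on the force-retire button: the button is
// usable only in the blocked state, with no request in flight and a
// non-blank reason.
function canForceRetire(mode: ModalMode, pending: boolean): boolean {
  return mode.kind === 'blocked' && !pending && mode.reason.trim().length > 0;
}
```

Keeping the reason check in a pure function like this makes the "reason is mandatory for force retire" rule testable without rendering the component.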