Files
certctl/internal/mcp/tools_admin.go
T
shankar0123 fbe053aa0c refactor(mcp): split tools.go by tool domain — Option B sibling-files (Phase 9, 10 of N)
Phase 9 ARCH-M2 closure Sprint 10. Splits internal/mcp/tools.go
(was 1867 LOC, the second-largest backend hotspot after the
service/acme.go cuts in Sprints 9 + 9b) via the Option B sibling-
file pattern — new files stay in `package mcp` so every external
caller of `mcp.RegisterTools(...)` resolves the same way. Pure
mechanical relocation; no signature, no behavior, no import-graph
change.

Why this is naturally suited to Option B
========================================
The mcp package already follows the sibling-file convention:
tools_audit_fix.go (registerAuditFixTools), tools_auth.go
(registerAuthTools), tools_auth_bundle2.go (registerAuthBundle2Tools),
and tools_est.go (registerESTTools) each carry a single
register-function each, all in the same `mcp` package. Sprint 10
extends that pattern to the 22 register-functions still inside
tools.go.

The structure of tools.go is unusually clean for a refactor: every
domain has its own `// ── DomainName ──` banner above its
register-function, and every register-function ends with a `}` +
blank line before the next domain's banner. The RegisterTools
dispatcher stayed in tools.go and still invokes each
registerXxxTools(...) in the same order — calls cross a file
boundary but stay in `package mcp`, so same-package resolution
makes them zero-cost.

What moved
==========

New `internal/mcp/tools_certificates.go` (404 LOC) — certificate-
lifecycle domain:
  - registerCertificateTools (cert CRUD + revocation)
  - registerCRLOCSPTools
  - registerRenewalPolicyTools (Phase C P1-1..P1-5)
  - registerVerificationTools (Phase G P1-32/P1-34/P1-35)

New `internal/mcp/tools_agents.go` (266 LOC) — agent-management
domain:
  - registerAgentTools (per-agent CRUD + lifecycle)
  - registerAgentGroupTools

New `internal/mcp/tools_resources.go` (565 LOC) — resource-
management / configuration surface:
  - registerIssuerTools, registerTargetTools
  - registerPolicyTools, registerProfileTools
  - registerTeamTools, registerOwnerTools
  - registerNotificationTools
  - registerIntermediateCATools (Phase F P1-6..P1-9)

New `internal/mcp/tools_jobs.go` (170 LOC) — workflow domain:
  - registerJobTools
  - registerApprovalTools + approvalDecisionPayload struct
    (Phase A P1-28..P1-31)

New `internal/mcp/tools_discovery.go` (169 LOC) — discovery domain:
  - registerNetworkScanTools (Phase D P1-14..P1-19)
  - registerDiscoveryReadTools (Phase E P1-10..P1-13)

New `internal/mcp/tools_admin.go` (369 LOC) — observability / admin
domain:
  - registerAuditTools, registerStatsTools, registerDigestTools,
    registerMetricsTools, registerHealthTools
  - registerHealthCheckTools (Phase B P1-20..P1-27)

What stays in tools.go (109 LOC, down from 1867)
================================================
  - The RegisterTools dispatcher (still owns the canonical
    registration order; calls cross-file but stay in-package).
  - The three Bundle-3 wrappers + helper that every register
    function consumes: textResult (the json.RawMessage success-path
    fence), errorResult (the failure-path fence), paginationQuery
    (the URL helper).

The unused `context` import is dropped from tools.go as a clean
side effect — none of the four surviving functions take a
context.Context. Per-import audit on every new file:
  - tools_certificates.go: context, fmt, gomcp
  - tools_agents.go: context, fmt, net/url, gomcp
  - tools_resources.go: context, gomcp
  - tools_jobs.go: context, gomcp
  - tools_discovery.go: context, gomcp
  - tools_admin.go: context, net/url, strconv, gomcp
None of the moved code touched encoding/json directly — that import
stays inside tools.go for textResult's json.RawMessage param.

Bundle-3 fence guardrail update
===============================
The existing TestFenceGuardrail_NoBareCallToolResult guardrail in
fence_guardrail_test.go fails any file that constructs
gomcp.CallToolResult{...} literals outside the tools.go allowlist.
registerCRLOCSPTools — which moved to tools_certificates.go — has
two pre-existing literal CallToolResult constructions: each returns
a server-built status string of the form "DER CRL retrieved (%d
bytes, content-type: %s)" or "OCSP response retrieved (...)". The
byte count is `len(raw)` (server-controlled) and the content-type
comes from the HTTP header on the upstream PKI endpoint
(server-controlled in self-hosted deployments). Both predate
Bundle-3 fencing.

Two options to keep CI green:
  (a) Route through textResult — but that changes behavior (adds
      the UNTRUSTED MCP_RESPONSE fence around the response), which
      breaks the "mechanical relocation, no behavior change" rule
      Sprint 10 commits to.
  (b) Add tools_certificates.go to the allowlist with a comment
      explaining the carve-out is pre-existing and Sprint 10
      preserves byte-exact behavior.

This commit takes option (b). The allowlist comment in
fence_guardrail_test.go documents the carve-out, points at the
specific tools (CRL + OCSP binary-pass-through with server-built
status descriptions), and flags tightening these two sites through
textResult as a follow-up concern (open question: does the format
break MCP consumers that parse the description text).

Net effect
==========
tools.go: 1867 → 109 LOC (-1758 = -94.2%). Six new sibling files at
1943 LOC total (109 LOC of header + Phase 9 doc-comment overhead
per file = ~185 LOC of added documentation; the rest is moved
code). The biggest pre-Sprint-10 hotspot in the mcp package is now
smaller than tools_test.go (435 LOC).

Cumulative Phase 9 progress
===========================
  config.go        3403 → 1342 (-60.6%, Sprints 1-7)
  cmd/server/main.go 2966 → 2260 (-23.8%, Sprints 8 + 8b)
  service/acme.go  1965 → 1162 (-40.9%, Sprints 9 + 9b)
  mcp/tools.go     1867 →  109 (-94.2%, Sprint 10)
  TOTAL across 4 files: 10,201 → 4,873 LOC = -5,328 (-52.2%)

Behavior preservation contract
==============================
1. gofmt -l clean across all 8 affected files.
2. go vet ./internal/mcp/... — no findings.
3. staticcheck ./internal/mcp/... ./cmd/mcp-server/... — no findings.
4. go test -short -count=1 ./internal/mcp/... — green (includes the
   TestFenceGuardrail_NoBareCallToolResult guardrail post-allowlist-
   update, the tools_per_tool_test.go suite that exercises every
   moved register function, and the injection_regression_test.go
   suite that pins Bundle-3 fencing behavior on the wrapper layer).
5. Broader-importer build green: go build ./... .
6. Broader-importer tests green: go test -short ./cmd/mcp-server/...
   ./internal/api/handler/... ./cmd/server/... .

Same-package resolution means the RegisterTools dispatcher's
13-line call list in tools.go reaches each registerXxxTools across
six new sibling files via compile-time-resolved package-level
names; the public mcp.RegisterTools entry point + its (s, client)
signature is unchanged.

What remains for Phase 9
========================
Two sibling-file splits queued:
  - Sprint 11: internal/api/handler/auth_session_oidc.go (1577 LOC)
    split per handler verb (login / callback / refresh / logout /
    backchannel).
  - Sprint 12: cmd/agent/main.go (1489 LOC) mirroring the cmd/server
    pattern from Sprints 8 + 8b.

Refs: ARCH-M2 (god-files), Phase 9 audit. Sprint 10 closes the MCP
hotspot from the audit's top-6 list.
2026-05-14 10:15:21 +00:00

369 lines
16 KiB
Go

// Copyright 2026 certctl LLC. All rights reserved.
// SPDX-License-Identifier: BUSL-1.1
package mcp
import (
"context"
"net/url"
"strconv"
gomcp "github.com/modelcontextprotocol/go-sdk/mcp"
)
// Phase 9 ARCH-M2 closure Sprint 10 (2026-05-14): extracted from
// internal/mcp/tools.go via the Option B sibling-file pattern.
//
// This file groups the observability / admin MCP tool domain — the
// read-mostly surface an LLM consumer uses to assess fleet state:
//
// - registerAuditTools — audit-log read.
// - registerStatsTools — aggregated counters (certs by
// status / source / issuer; agents by state; jobs by status).
// - registerDigestTools — point-in-time fleet digest snapshot.
// - registerMetricsTools — raw Prometheus exposition pass-through.
// - registerHealthTools — service health probes + a handful of
// historical-placement claim/dismiss subtools (see
// tools_discovery.go for the duplicate-by-design comment).
// - registerHealthCheckTools — Phase B P1-20..P1-27 — health-check
// CRUD + the certificate-health-monitor surface.
//
// paginationQuery (in tools.go) is consumed by some of these
// register functions via net/url + strconv (Itoa); the imports
// stay local to this file.
// ── Audit ───────────────────────────────────────────────────────────
func registerAuditTools(s *gomcp.Server, c *Client) {
gomcp.AddTool(s, &gomcp.Tool{
Name: "certctl_list_audit_events",
Description: "List immutable audit trail events. Shows actor, action, resource, and timestamp for all lifecycle operations.",
}, func(ctx context.Context, req *gomcp.CallToolRequest, input ListParams) (*gomcp.CallToolResult, any, error) {
data, err := c.Get("/api/v1/audit", paginationQuery(input.Page, input.PerPage))
if err != nil {
return errorResult(err)
}
return textResult(data)
})
gomcp.AddTool(s, &gomcp.Tool{
Name: "certctl_get_audit_event",
Description: "Get a specific audit event by ID.",
}, func(ctx context.Context, req *gomcp.CallToolRequest, input GetByIDInput) (*gomcp.CallToolResult, any, error) {
data, err := c.Get("/api/v1/audit/"+input.ID, nil)
if err != nil {
return errorResult(err)
}
return textResult(data)
})
}
// ── Stats ───────────────────────────────────────────────────────────
func registerStatsTools(s *gomcp.Server, c *Client) {
gomcp.AddTool(s, &gomcp.Tool{
Name: "certctl_dashboard_summary",
Description: "Get high-level dashboard metrics: total/expiring/expired/revoked certs, active/offline agents, pending/failed/completed jobs.",
}, func(ctx context.Context, req *gomcp.CallToolRequest, input EmptyInput) (*gomcp.CallToolResult, any, error) {
data, err := c.Get("/api/v1/stats/summary", nil)
if err != nil {
return errorResult(err)
}
return textResult(data)
})
gomcp.AddTool(s, &gomcp.Tool{
Name: "certctl_certificates_by_status",
Description: "Get certificate counts grouped by status (Active, Expiring, Expired, Revoked, etc.).",
}, func(ctx context.Context, req *gomcp.CallToolRequest, input EmptyInput) (*gomcp.CallToolResult, any, error) {
data, err := c.Get("/api/v1/stats/certificates-by-status", nil)
if err != nil {
return errorResult(err)
}
return textResult(data)
})
gomcp.AddTool(s, &gomcp.Tool{
Name: "certctl_expiration_timeline",
Description: "Get certificates expiring per day for the next N days (default 30, max 365).",
}, func(ctx context.Context, req *gomcp.CallToolRequest, input TimelineInput) (*gomcp.CallToolResult, any, error) {
q := url.Values{}
if input.Days > 0 {
q.Set("days", strconv.Itoa(input.Days))
}
data, err := c.Get("/api/v1/stats/expiration-timeline", q)
if err != nil {
return errorResult(err)
}
return textResult(data)
})
gomcp.AddTool(s, &gomcp.Tool{
Name: "certctl_job_trends",
Description: "Get job success/failure trends per day for the past N days (default 30, max 365).",
}, func(ctx context.Context, req *gomcp.CallToolRequest, input TimelineInput) (*gomcp.CallToolResult, any, error) {
q := url.Values{}
if input.Days > 0 {
q.Set("days", strconv.Itoa(input.Days))
}
data, err := c.Get("/api/v1/stats/job-trends", q)
if err != nil {
return errorResult(err)
}
return textResult(data)
})
gomcp.AddTool(s, &gomcp.Tool{
Name: "certctl_issuance_rate",
Description: "Get new certificate issuance count per day for the past N days (default 30, max 365).",
}, func(ctx context.Context, req *gomcp.CallToolRequest, input TimelineInput) (*gomcp.CallToolResult, any, error) {
q := url.Values{}
if input.Days > 0 {
q.Set("days", strconv.Itoa(input.Days))
}
data, err := c.Get("/api/v1/stats/issuance-rate", q)
if err != nil {
return errorResult(err)
}
return textResult(data)
})
}
// ── Digest ──────────────────────────────────────────────────────────
func registerDigestTools(s *gomcp.Server, c *Client) {
gomcp.AddTool(s, &gomcp.Tool{
Name: "certctl_preview_digest",
Description: "Preview the scheduled certificate digest email in HTML format. Shows summary of certificate status, pending jobs, and expiring certificates.",
}, func(ctx context.Context, req *gomcp.CallToolRequest, input EmptyInput) (*gomcp.CallToolResult, any, error) {
data, err := c.Get("/api/v1/digest/preview", nil)
if err != nil {
return errorResult(err)
}
return textResult(data)
})
gomcp.AddTool(s, &gomcp.Tool{
Name: "certctl_send_digest",
Description: "Trigger immediate sending of the certificate digest email to configured recipients. If no explicit recipients are configured, sends to certificate owners.",
}, func(ctx context.Context, req *gomcp.CallToolRequest, input EmptyInput) (*gomcp.CallToolResult, any, error) {
data, err := c.Post("/api/v1/digest/send", nil)
if err != nil {
return errorResult(err)
}
return textResult(data)
})
}
// ── Metrics ─────────────────────────────────────────────────────────
func registerMetricsTools(s *gomcp.Server, c *Client) {
gomcp.AddTool(s, &gomcp.Tool{
Name: "certctl_metrics",
Description: "Get system metrics snapshot: gauge metrics (cert/agent/job counts), counters (completed/failed totals), and server uptime.",
}, func(ctx context.Context, req *gomcp.CallToolRequest, input EmptyInput) (*gomcp.CallToolResult, any, error) {
data, err := c.Get("/api/v1/metrics", nil)
if err != nil {
return errorResult(err)
}
return textResult(data)
})
}
// ── Health ──────────────────────────────────────────────────────────
func registerHealthTools(s *gomcp.Server, c *Client) {
gomcp.AddTool(s, &gomcp.Tool{
Name: "certctl_health",
Description: "Check certctl server health status.",
}, func(ctx context.Context, req *gomcp.CallToolRequest, input EmptyInput) (*gomcp.CallToolResult, any, error) {
data, err := c.Get("/health", nil)
if err != nil {
return errorResult(err)
}
return textResult(data)
})
gomcp.AddTool(s, &gomcp.Tool{
Name: "certctl_ready",
Description: "Check certctl server readiness (database connectivity, etc.).",
}, func(ctx context.Context, req *gomcp.CallToolRequest, input EmptyInput) (*gomcp.CallToolResult, any, error) {
data, err := c.Get("/ready", nil)
if err != nil {
return errorResult(err)
}
return textResult(data)
})
gomcp.AddTool(s, &gomcp.Tool{
Name: "certctl_auth_info",
Description: "Get auth configuration (auth type and whether auth is required).",
}, func(ctx context.Context, req *gomcp.CallToolRequest, input EmptyInput) (*gomcp.CallToolResult, any, error) {
data, err := c.Get("/api/v1/auth/info", nil)
if err != nil {
return errorResult(err)
}
return textResult(data)
})
gomcp.AddTool(s, &gomcp.Tool{
Name: "certctl_auth_check",
Description: "Validate that the configured API key is accepted by the server.",
}, func(ctx context.Context, req *gomcp.CallToolRequest, input EmptyInput) (*gomcp.CallToolResult, any, error) {
data, err := c.Get("/api/v1/auth/check", nil)
if err != nil {
return errorResult(err)
}
return textResult(data)
})
// I-2 closure (cat-i-b0924b6675f8): pre-I-2 the README claimed "all
// API endpoints are exposed via MCP" but the discovered-certificate
// lifecycle (claim + dismiss) was never wrapped — operators using
// MCP clients had no path to bring an
// out-of-band cert under management or to mark a benign discovery
// as not-of-interest without dropping to the REST API directly.
// These two tools wrap the existing HTTP handlers
// (DiscoveryHandler.ClaimDiscovered + DismissDiscovered).
gomcp.AddTool(s, &gomcp.Tool{
Name: "certctl_claim_discovered_certificate",
Description: "Link a discovered certificate (dc-*) to an existing managed certificate (mc-*) via POST /api/v1/discovered-certificates/{id}/claim. Use this to bring an out-of-band cert (e.g. one found by an agent filesystem scan or a network scan) under certctl management without re-issuing — the discovered row is marked Managed and its managed_certificate_id is set so subsequent renewals/revocations on the managed cert update both rows.",
}, func(ctx context.Context, req *gomcp.CallToolRequest, input ClaimDiscoveredCertificateInput) (*gomcp.CallToolResult, any, error) {
body := map[string]string{"managed_certificate_id": input.ManagedCertificateID}
data, err := c.Post("/api/v1/discovered-certificates/"+input.ID+"/claim", body)
if err != nil {
return errorResult(err)
}
return textResult(data)
})
gomcp.AddTool(s, &gomcp.Tool{
Name: "certctl_dismiss_discovered_certificate",
Description: "Dismiss a discovered certificate (POST /api/v1/discovered-certificates/{id}/dismiss). Use this to mark a discovery as not-of-interest (e.g. expired self-signed test certs found by a network scan) — the row stops appearing in the unmanaged-list view but is preserved in the DB for audit history.",
}, func(ctx context.Context, req *gomcp.CallToolRequest, input DismissDiscoveredCertificateInput) (*gomcp.CallToolResult, any, error) {
data, err := c.Post("/api/v1/discovered-certificates/"+input.ID+"/dismiss", nil)
if err != nil {
return errorResult(err)
}
return textResult(data)
})
}
// ── Health Checks (Phase B — P1-20..P1-27) ──────────────────────────
//
// 2026-05-05 CLI/API/MCP↔GUI parity audit closure. AI-assistant queries like
// "are any health checks failing?" / "ack the prod nginx incident" had no
// MCP path — operators had to drop to curl. Mirrors the existing target
// resource shape (CRUD + history + summary + acknowledge).
func registerHealthCheckTools(s *gomcp.Server, c *Client) {
gomcp.AddTool(s, &gomcp.Tool{
Name: "certctl_list_health_checks",
Description: "List monitored TLS endpoint health checks (GET /api/v1/health-checks). Optional filters: status, certificate_id, network_scan_target_id, enabled.",
}, func(ctx context.Context, req *gomcp.CallToolRequest, input ListHealthChecksInput) (*gomcp.CallToolResult, any, error) {
q := paginationQuery(input.Page, input.PerPage)
if input.Status != "" {
q.Set("status", input.Status)
}
if input.CertificateID != "" {
q.Set("certificate_id", input.CertificateID)
}
if input.NetworkScanTargetID != "" {
q.Set("network_scan_target_id", input.NetworkScanTargetID)
}
if input.Enabled != "" {
q.Set("enabled", input.Enabled)
}
data, err := c.Get("/api/v1/health-checks", q)
if err != nil {
return errorResult(err)
}
return textResult(data)
})
gomcp.AddTool(s, &gomcp.Tool{
Name: "certctl_health_check_summary",
Description: "Return aggregate counts of TLS health-check states (GET /api/v1/health-checks/summary). Useful for dashboard-style queries about endpoint posture.",
}, func(ctx context.Context, req *gomcp.CallToolRequest, input EmptyInput) (*gomcp.CallToolResult, any, error) {
data, err := c.Get("/api/v1/health-checks/summary", nil)
if err != nil {
return errorResult(err)
}
return textResult(data)
})
gomcp.AddTool(s, &gomcp.Tool{
Name: "certctl_get_health_check",
Description: "Get a single TLS endpoint health check (GET /api/v1/health-checks/{id}).",
}, func(ctx context.Context, req *gomcp.CallToolRequest, input GetByIDInput) (*gomcp.CallToolResult, any, error) {
data, err := c.Get("/api/v1/health-checks/"+input.ID, nil)
if err != nil {
return errorResult(err)
}
return textResult(data)
})
gomcp.AddTool(s, &gomcp.Tool{
Name: "certctl_create_health_check",
Description: "Create a TLS endpoint health check (POST /api/v1/health-checks). Required: endpoint (host:port). Server-side defaults: check_interval_seconds=300, degraded_threshold=2, down_threshold=5.",
}, func(ctx context.Context, req *gomcp.CallToolRequest, input CreateHealthCheckInput) (*gomcp.CallToolResult, any, error) {
data, err := c.Post("/api/v1/health-checks", input)
if err != nil {
return errorResult(err)
}
return textResult(data)
})
gomcp.AddTool(s, &gomcp.Tool{
Name: "certctl_update_health_check",
Description: "Update a TLS endpoint health check (PUT /api/v1/health-checks/{id}). The handler performs a merge update: non-zero numeric fields and non-empty strings overwrite, zero values preserve existing.",
}, func(ctx context.Context, req *gomcp.CallToolRequest, input UpdateHealthCheckInput) (*gomcp.CallToolResult, any, error) {
data, err := c.Put("/api/v1/health-checks/"+input.ID, input)
if err != nil {
return errorResult(err)
}
return textResult(data)
})
gomcp.AddTool(s, &gomcp.Tool{
Name: "certctl_delete_health_check",
Description: "Delete a TLS endpoint health check (DELETE /api/v1/health-checks/{id}).",
}, func(ctx context.Context, req *gomcp.CallToolRequest, input GetByIDInput) (*gomcp.CallToolResult, any, error) {
data, err := c.Delete("/api/v1/health-checks/" + input.ID)
if err != nil {
return errorResult(err)
}
return textResult(data)
})
gomcp.AddTool(s, &gomcp.Tool{
Name: "certctl_health_check_history",
Description: "Get probe history for a TLS endpoint health check (GET /api/v1/health-checks/{id}/history). Default limit 100; max 1000 (clamped server-side).",
}, func(ctx context.Context, req *gomcp.CallToolRequest, input HealthCheckHistoryInput) (*gomcp.CallToolResult, any, error) {
q := url.Values{}
if input.Limit > 0 {
q.Set("limit", strconv.Itoa(input.Limit))
}
data, err := c.Get("/api/v1/health-checks/"+input.ID+"/history", q)
if err != nil {
return errorResult(err)
}
return textResult(data)
})
gomcp.AddTool(s, &gomcp.Tool{
Name: "certctl_acknowledge_health_check",
Description: "Acknowledge a TLS health-check incident (POST /api/v1/health-checks/{id}/acknowledge). Marks the check Acknowledged=true; the handler records the actor (defaults to 'unknown' if absent) for the audit trail.",
}, func(ctx context.Context, req *gomcp.CallToolRequest, input AcknowledgeHealthCheckInput) (*gomcp.CallToolResult, any, error) {
body := struct {
Actor string `json:"actor,omitempty"`
}{Actor: input.Actor}
data, err := c.Post("/api/v1/health-checks/"+input.ID+"/acknowledge", body)
if err != nil {
return errorResult(err)
}
return textResult(data)
})
}