mirror of
https://github.com/shankar0123/certctl.git
synced 2026-06-07 17:51:29 +00:00
87213128cc
Pre-G-2 internal/domain/connector.go::Agent::APIKeyHash was tagged
`json:"api_key_hash"` and shipped on every wire surface that returned
domain.Agent — GET /api/v1/agents (PagedResponse{Data: agents}),
GET /api/v1/agents/{id}, GET /api/v1/agents/retired, and the
POST /api/v1/agents registration response. Every authenticated client
(browser, CLI --json, MCP tool calls) received the SHA-256-of-the-API-key
string. The browser silently dropped it because web/src/api/types.ts
omits the field, but CLI and MCP consumers print full JSON so the hash
was visible there. Even though the value is a hash and not the plaintext
key, shipping it gives an attacker an offline brute-force target if the
API-key entropy is low (certctl doesn't enforce a minimum on operator-
supplied keys), and there's no business reason for any client to ever
receive it — the value is server-internal, used only for the lookup at
internal/repository/postgres/agent.go::GetByAPIKey. (Audit:
cat-s5-apikey_leak in coverage-gap-audit-2026-04-24-v5/unified-audit.md.)
We chose the audit's recommended fix (json:"-") plus a defense-in-depth
MarshalJSON plus a CI guardrail. Three layers because struct-tag
redaction alone is one rebase away from being silently reverted, the
custom MarshalJSON catches the case where a parent struct embeds Agent
under a different tag, and the CI grep blocks reintroduction at the spec
or frontend boundary even without a code review catching it.
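A minimal sketch of why the layers are kept separate, assuming a trimmed stand-in type (revertedAgent is hypothetical and is not the real domain.Agent). It simulates the tag being reverted to the leaky form while MarshalJSON stays in place: the hash value is still redacted, but the api_key_hash key comes back with an empty value, which is the shape the substring assertions and the CI grep exist to catch.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// revertedAgent is a hypothetical stand-in for domain.Agent after an
// accidental tag revert: the leaky `json:"api_key_hash"` tag is back.
type revertedAgent struct {
	ID         string `json:"id"`
	APIKeyHash string `json:"api_key_hash"`
}

// MarshalJSON zeroes the hash on the by-value copy before encoding.
func (a revertedAgent) MarshalJSON() ([]byte, error) {
	type plain revertedAgent // defined type with no methods, so no recursion
	a.APIKeyHash = ""
	return json.Marshal(plain(a))
}

// encode returns the JSON wire form of v as a string.
func encode(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	a := revertedAgent{ID: "agent-demo", APIKeyHash: "sha256:SENTINEL"}
	fmt.Println(encode(a)) // {"id":"agent-demo","api_key_hash":""}
}
```

The value never leaves the process, but the key name does, so MarshalJSON alone would not keep the wire shape identical after a revert.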
Files changed:
Phase 1 — Domain redaction:
- internal/domain/connector.go: APIKeyHash tag flipped from
`json:"api_key_hash"` to `json:"-"`. New Agent.MarshalJSON
with value receiver + type-alias-recursion-break that explicitly
zeroes APIKeyHash on the marshal-time copy. Long-form docblock
explaining the G-2 closure rationale + cross-references to
service.RegisterAgent (populator), repository.AgentRepository::
GetByAPIKey (consumer), docs/architecture.md (DB-shape vs
API-shape distinction), and the audit finding.
Phase 2 — Domain tests (5 test functions):
- internal/domain/connector_test.go: TestAgent_MarshalJSON_RedactsAPIKeyHash
pins the marshal-boundary contract on a value receiver. ...RedactsViaPointer
pins the *Agent path. ...RedactsInSlice pins the []Agent path that the
ListAgents handler actually emits via PagedResponse. ...DoesNotMutateReceiver
pins the by-value-receiver contract so a future refactor that switches
to pointer-receiver gets caught. ...RoundTrip pins the wire-shape
guarantee that APIKeyHash is dropped on encode and cannot reappear on
decode. Single sentinel value ("sha256:LEAKED-CREDENTIAL-DERIVATIVE-
SENTINEL") flows through every fixture for grep-ability on regression.
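The five contracts above can be sketched in one self-contained program. This is an illustration under assumptions, not the contents of connector_test.go: the agent type is a trimmed copy, and the helper names leaks and decodedHash are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

const sentinel = "sha256:LEAKED-CREDENTIAL-DERIVATIVE-SENTINEL"

// agent is a trimmed, hypothetical copy of domain.Agent.
type agent struct {
	ID         string `json:"id"`
	APIKeyHash string `json:"-"`
}

// MarshalJSON mirrors the value-receiver redaction described above.
func (a agent) MarshalJSON() ([]byte, error) {
	type plain agent
	a.APIKeyHash = ""
	return json.Marshal(plain(a))
}

// leaks reports whether the JSON encoding of v exposes the hash key or value.
func leaks(v any) bool {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	s := string(b)
	return strings.Contains(s, "api_key_hash") || strings.Contains(s, sentinel)
}

// decodedHash decodes payload into an agent and returns its APIKeyHash.
func decodedHash(payload string) string {
	var a agent
	if err := json.Unmarshal([]byte(payload), &a); err != nil {
		panic(err)
	}
	return a.APIKeyHash
}

func main() {
	a := agent{ID: "agent-demo", APIKeyHash: sentinel}
	fmt.Println("value leaks:", leaks(a))
	fmt.Println("pointer leaks:", leaks(&a))
	fmt.Println("slice leaks:", leaks([]agent{a}))
	fmt.Println("receiver intact:", a.APIKeyHash == sentinel)
	fmt.Println("decode drops hash:", decodedHash(`{"id":"x","api_key_hash":"injected"}`) == "")
}
```

The last check relies on encoding/json ignoring both unknown keys and fields tagged "-" on decode, so a payload that carries api_key_hash cannot repopulate the field.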
Phase 3 — Handler tests (4 test functions):
- internal/api/handler/agent_handler_test.go: TestListAgents_DoesNotLeakAPIKeyHash,
TestGetAgent_DoesNotLeakAPIKeyHash, TestRegisterAgent_DoesNotLeakAPIKeyHash,
TestListRetiredAgents_DoesNotLeakAPIKeyHash. Each asserts (a) the
literal substring "api_key_hash" is absent from the httptest-captured
body, (b) the leak sentinel value is absent, (c) the non-leaked fields
ARE present (sanity that the handler is serving real data, not just
empty payloads). Shared sentinel "sha256:LEAKED-CREDENTIAL-DERIVATIVE-
HANDLER-SENTINEL" so a single grep over a failing test's output
identifies the leak surface immediately.
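A rough sketch of the three handler-level assertions, assuming a stand-in handler (getAgent, capture, and checkBody are hypothetical names; the real tests exercise the actual handlers with their own fixtures):

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

const handlerSentinel = "sha256:LEAKED-CREDENTIAL-DERIVATIVE-HANDLER-SENTINEL"

// agent is a trimmed, hypothetical copy of domain.Agent.
type agent struct {
	ID         string `json:"id"`
	Hostname   string `json:"hostname"`
	APIKeyHash string `json:"-"`
}

// getAgent stands in for the real GET /api/v1/agents/{id} handler.
func getAgent(w http.ResponseWriter, r *http.Request) {
	a := agent{ID: "agent-demo", Hostname: "demo.host", APIKeyHash: handlerSentinel}
	w.Header().Set("Content-Type", "application/json")
	if err := json.NewEncoder(w).Encode(a); err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
	}
}

// capture runs h under httptest and returns the response body.
func capture(h http.HandlerFunc) string {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/api/v1/agents/agent-demo", nil)
	h(rec, req)
	return rec.Body.String()
}

// checkBody applies assertions (a), (b), and (c) and returns the first
// failure as text, or "" when the body is clean.
func checkBody(body string) string {
	switch {
	case strings.Contains(body, "api_key_hash"):
		return "api_key_hash key present"
	case strings.Contains(body, handlerSentinel):
		return "sentinel value present"
	case !strings.Contains(body, `"id":"agent-demo"`):
		return "expected fields missing"
	}
	return ""
}

func main() {
	if msg := checkBody(capture(getAgent)); msg != "" {
		fmt.Println("FAIL:", msg)
		return
	}
	fmt.Println("ok")
}
```

Assertion (c) matters because an empty or error body would trivially pass (a) and (b).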
Phase 4 — Spec / docs:
- api/openapi.yaml: api_key_hash property REMOVED from Agent schema
(was at line 3690). Inline G-2 comment naming the closure + the
database-vs-API-shape distinction so a future spec edit doesn't
silently re-introduce the field.
- docs/architecture.md: ER-diagram block already documents the agents
table including api_key_hash (DB shape — correct). Added a sibling
note paragraph immediately below the diagram explaining that several
columns are intentionally server-internal (api_key_hash redaction
+ issuers.config / deployment_targets.config encrypted shadow), with
cross-references to the redaction enforcement site, the OpenAPI
schema, the frontend interface, and the CI guardrail.
- web/src/api/types.ts: Agent interface unchanged in shape (already
omitted the field) but added a leading comment block explaining
WHY the omission is intentional — stops a future frontend dev from
"completing" the interface from the OpenAPI spec or the Go struct.
Phase 5 — CI guardrail:
- .github/workflows/ci.yml: new "Forbidden api_key_hash JSON-shape
regression guard (G-2)" step. Scoped patterns catch the actual
regression shapes — Go struct tag (json:"api_key_hash"), frontend
interface declaration, OpenAPI schema property, YAML enum/array
membership. Repository / migration / seed / service / integration /
unit-test / comment lines exempt. Verified locally on the real tree
(passes) and against 4 synthetic regression patterns (each fires
the guardrail). Mirrors the G-1 pattern from .github/workflows/
ci.yml lines 47-108.
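The guardrail itself is a CI step in ci.yml and its exact patterns are not reproduced here. As an illustration only, the Go struct-tag shape could be checked like this; the regular expression and the flagged helper are assumptions for the sketch, and the other three regression shapes are omitted.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// leakyTag matches only the Go struct-tag regression shape, with or
// without options such as ",omitempty". Hypothetical pattern.
var leakyTag = regexp.MustCompile(`json:"api_key_hash[",]`)

// flagged reports whether a source line would trip this sketch of the
// guard. Whole-line comments are exempt, mirroring the exemption above.
func flagged(line string) bool {
	if strings.HasPrefix(strings.TrimSpace(line), "//") {
		return false
	}
	return leakyTag.MatchString(line)
}

func main() {
	lines := []string{
		"APIKeyHash string `json:\"api_key_hash\"`",
		"APIKeyHash string `json:\"api_key_hash,omitempty\"`",
		"APIKeyHash string `json:\"-\"`",
		"// pre-G-2 the field was tagged `json:\"api_key_hash\"`",
	}
	for _, l := range lines {
		fmt.Printf("%-5v %s\n", flagged(l), l)
	}
}
```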
Phase 5b — Sweep verification (no changes, results documented for the
next reader):
- internal/api/middleware/audit.go: doesn't serialize Agent struct;
records request body only. No leak.
- service.RegisterAgent audit-event payload: `map[string]interface{}{
"name": name, "hostname": hostname}` — name + hostname only,
no APIKeyHash. No leak.
- All 9 slog sites that mention agent: scalar attrs only ("agent_id",
"error", "agent_hostname"), never the full struct. No leak.
- internal/mcp, internal/cli, cmd/cli, cmd/mcp-server: zero matches
for APIKeyHash / api_key_hash. Both pass server JSON verbatim, so
the wire-side fix transitively closes them.
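Why the slog sweep matters even with the marshal-boundary fix in place: Go's slog text handler formats struct values with fmt rather than encoding/json, so neither the json:"-" tag nor MarshalJSON applies on that path. The sketch below assumes a trimmed agent type and a hypothetical leaksToLog helper; it does not claim which handler certctl configures.

```go
package main

import (
	"bytes"
	"fmt"
	"log/slog"
	"strings"
)

const sentinel = "sha256:SENTINEL"

// agent is a trimmed, hypothetical copy of domain.Agent.
type agent struct {
	ID         string `json:"id"`
	APIKeyHash string `json:"-"`
}

// leaksToLog writes one record through a text-format slog logger and
// reports whether the sentinel hash appears in the output.
func leaksToLog(args ...any) bool {
	var buf bytes.Buffer
	slog.New(slog.NewTextHandler(&buf, nil)).Info("agent registered", args...)
	return strings.Contains(buf.String(), sentinel)
}

func main() {
	a := agent{ID: "agent-demo", APIKeyHash: sentinel}
	fmt.Println("full struct leaks:", leaksToLog("agent", a))
	fmt.Println("scalar attr leaks:", leaksToLog("agent_id", a.ID))
}
```

Logging scalar attributes, as the nine audited sites do, keeps the hash out of log output regardless of handler format.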
Verification (all gates pass):
- go build ./...
- go vet ./...
- go test -short ./... — every package green
- go test -short -race ./internal/domain/... ./internal/api/handler/... — clean
- govulncheck ./... — no vulnerabilities in our code
- helm lint deploy/helm/certctl/ — clean
- helm template smoke render — succeeds
- python3 yaml.safe_load on api/openapi.yaml — parses
- OpenAPI Agent schema scan: no api_key_hash property
- CI guardrail mirror: clean on real tree, fires on all 4 synthetic
regression patterns
- Domain pkg coverage: Agent.MarshalJSON 100%, connector.go total 87.5%
- Handler pkg coverage: 79.2%
Sample response body (httptest captured during verification, GET
/api/v1/agents/{id} via the new handler test):
{"id":"agent-demo","name":"demo-agent","hostname":"demo.host",
"status":"Online","last_heartbeat_at":"2026-04-24T11:59:30Z",
"registered_at":"2026-04-24T12:00:00Z","os":"linux",
"architecture":"amd64","ip_address":"10.0.0.42",
"version":"v2.0.49"}
Note the absence of any api_key_hash key, even though the in-memory
struct passed to the handler had APIKeyHash set to a sentinel.
Out of scope (intentionally untouched):
- internal/repository/postgres/agent.go SELECT/INSERT/UPDATE/scan
paths and GetByAPIKey lookup — DB column stays, repo still
populates the struct, auth lookup still works. The redaction is a
marshal-boundary concern.
- migrations/000001_initial_schema.up.sql + migrations/seed_*.sql —
DB schema and seed data unchanged.
- internal/service/agent.go::RegisterAgent — service-side hashing
and persistence unchanged.
- Other domain types with potential credential-derivative fields
(Issuer.Config, DeploymentTarget.Config, notifier configs). Not
flagged by the audit; some are already protected (e.g.,
DeploymentTarget.EncryptedConfig []byte `json:"-"`). File a
separate audit pass if recon surfaces additional leaks.
- Per-resource DTO layer across every handler. Single audit
finding, single domain type.
- A separate possible follow-up: the v2 RegisterAgent endpoint
doesn't return the plaintext API key to the agent, which may
mean self-bootstrap via POST /api/v1/agents is broken. Verified
during recon; out of scope for G-2; should be its own ticket.
Refs: coverage-gap-audit-2026-04-24-v5/unified-audit.md
§2 P1 cluster, cat-s5-apikey_leak
Audit recommendation: 'json:"-" or API-response DTO
excluding APIKeyHash' — went with the json:"-" + MarshalJSON
defense-in-depth pair plus CI guardrail and structural docs.
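For comparison, the DTO alternative named in the audit recommendation might look roughly like the sketch below. It was not adopted in this change; agentResponse and toResponse are hypothetical names, and the field set is trimmed.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// agent is a trimmed, hypothetical copy of the domain type.
type agent struct {
	ID         string
	Hostname   string
	APIKeyHash string
}

// agentResponse is a hypothetical API-shape DTO. It has no hash field,
// so there is nothing to redact at marshal time.
type agentResponse struct {
	ID       string `json:"id"`
	Hostname string `json:"hostname"`
}

// toResponse copies only the client-visible fields.
func toResponse(a agent) agentResponse {
	return agentResponse{ID: a.ID, Hostname: a.Hostname}
}

// encode returns the JSON wire form of v as a string.
func encode(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	a := agent{ID: "agent-demo", Hostname: "demo.host", APIKeyHash: "sha256:SENTINEL"}
	fmt.Println(encode(toResponse(a))) // {"id":"agent-demo","hostname":"demo.host"}
}
```

The trade-off is that every handler returning an agent has to route through the mapping function, which is the per-resource DTO layer listed as out of scope above.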
216 lines
9.8 KiB
Go
package domain

import (
	"encoding/json"
	"time"
)

// Issuer represents a certificate authority or ACME provider.
type Issuer struct {
	ID              string          `json:"id"`
	Name            string          `json:"name"`
	Type            IssuerType      `json:"type"`
	Config          json.RawMessage `json:"config"`
	EncryptedConfig []byte          `json:"-"` // AES-GCM encrypted full config (never exposed via API)
	Enabled         bool            `json:"enabled"`
	LastTestedAt    *time.Time      `json:"last_tested_at,omitempty"`
	TestStatus      string          `json:"test_status,omitempty"`
	Source          string          `json:"source,omitempty"`
	CreatedAt       time.Time       `json:"created_at"`
	UpdatedAt       time.Time       `json:"updated_at"`
}

// DeploymentTarget represents a target system where certificates are deployed.
type DeploymentTarget struct {
	ID              string          `json:"id"`
	Name            string          `json:"name"`
	Type            TargetType      `json:"type"`
	AgentID         string          `json:"agent_id"`
	Config          json.RawMessage `json:"config"`
	EncryptedConfig []byte          `json:"-"` // AES-GCM encrypted full config (never exposed via API)
	Enabled         bool            `json:"enabled"`
	LastTestedAt    *time.Time      `json:"last_tested_at,omitempty"`
	TestStatus      string          `json:"test_status,omitempty"`
	Source          string          `json:"source,omitempty"`
	RetiredAt       *time.Time      `json:"retired_at,omitempty"`     // I-004: soft-retirement timestamp (nil = active)
	RetiredReason   *string         `json:"retired_reason,omitempty"` // I-004: reason captured at cascade retirement
	CreatedAt       time.Time       `json:"created_at"`
	UpdatedAt       time.Time       `json:"updated_at"`
}

// Agent represents an agent running on a target system.
type Agent struct {
	ID              string      `json:"id"`
	Name            string      `json:"name"`
	Hostname        string      `json:"hostname"`
	Status          AgentStatus `json:"status"`
	LastHeartbeatAt *time.Time  `json:"last_heartbeat_at,omitempty"`
	RegisteredAt    time.Time   `json:"registered_at"`
	// APIKeyHash is the SHA-256 of the agent's plaintext API key,
	// populated by service.RegisterAgent (`hashAPIKey(apiKey)`) and
	// consumed by repository.AgentRepository::GetByAPIKey at auth time.
	// It is server-internal: never serialized to clients, never echoed
	// via CLI / MCP / agent registration response, never logged.
	//
	// G-2 (P1): pre-G-2 the field was tagged `json:"api_key_hash"` and
	// shipped on every /api/v1/agents response (cat-s5-apikey_leak). Even
	// SHA-256 should not be shipped to clients: it gives an offline
	// brute-force target if API-key entropy is low (certctl doesn't enforce
	// a minimum on operator-supplied keys), and there is no business reason
	// for any client to ever receive it. Post-G-2 the JSON tag is "-" and
	// Agent.MarshalJSON below zeroes the field on a copy before delegating
	// to the default marshal, as defense in depth so a future tag-revert by
	// refactor cannot reopen the leak. The DB column, repo SELECT/INSERT/
	// UPDATE paths, and service-side hashing are unchanged. See
	// docs/architecture.md ER diagram (which documents DB shape, not API
	// shape) and coverage-gap-audit-2026-04-24-v5/unified-audit.md
	// cat-s5-apikey_leak for the full closure rationale.
	APIKeyHash   string `json:"-"`
	OS           string `json:"os"`
	Architecture string `json:"architecture"`
	IPAddress    string `json:"ip_address"`
	Version      string `json:"version"`
	// I-004: soft-retirement fields. An agent with RetiredAt != nil is the
	// canonical "retired" state. The Status column remains as before (Online
	// / Offline / Degraded) and is preserved at retirement time as the
	// last-seen operational status; RetiredAt is the source of truth for
	// "should we filter this row from active listings?".
	RetiredAt     *time.Time `json:"retired_at,omitempty"`
	RetiredReason *string    `json:"retired_reason,omitempty"`
}

// MarshalJSON implements json.Marshaler. It explicitly zeroes APIKeyHash
// before serialization, as defense in depth behind the `json:"-"` tag above.
//
// G-2 (P1): pre-G-2 the field was tagged `json:"api_key_hash"` and
// shipped on every /api/v1/agents response (cat-s5-apikey_leak). Post-G-2
// the tag is "-" and this method enforces redaction of the value even if
// the tag is reverted by a future refactor. The receiver is by-value, so
// the APIKeyHash = "" assignment mutates only the marshal-time copy, never
// the caller's original. The local defined type (`type alias Agent`) has
// Agent's fields but none of its methods, which breaks the recursive
// MarshalJSON call that would otherwise overflow the stack. (A true alias,
// `type alias = Agent`, would keep the method and recurse.) Both Agent
// and *Agent values route through here because a method declared on a
// value receiver is in the method set of both the type and its pointer.
//
// Auditor's note for the next reader: do NOT remove this method even if
// the json:"-" tag stays. The CI guardrail at .github/workflows/ci.yml
// also blocks reintroduction at the tag site, but this method is the
// last line of defense for serialization paths that bypass struct tags
// (e.g., a future MarshalJSON on a parent struct that embeds Agent).
func (a Agent) MarshalJSON() ([]byte, error) {
	type alias Agent // breaks recursion: alias has no MarshalJSON method
	a.APIKeyHash = ""
	return json.Marshal(alias(a))
}

// IsRetired returns true when this agent has been soft-retired.
// I-004: callers that iterate active agents (stats dashboard, stale-offline
// sweeper, handler-facing list) must skip retired rows by default.
func (a *Agent) IsRetired() bool { return a != nil && a.RetiredAt != nil }

// AgentDependencyCounts captures the active downstream rows that would be
// affected by retiring an agent. Returned by the preflight pass on
// DELETE /api/v1/agents/{id}. Zero counts mean a clean soft-retire is safe;
// any non-zero count blocks a default retire with HTTP 409 and requires an
// explicit ?force=true&reason=... escape hatch from the operator.
type AgentDependencyCounts struct {
	ActiveTargets      int `json:"active_targets"`      // deployment_targets.agent_id=id AND retired_at IS NULL
	ActiveCertificates int `json:"active_certificates"` // certificates currently deployed via one of this agent's active targets
	PendingJobs        int `json:"pending_jobs"`        // jobs.agent_id=id AND status IN (Pending, AwaitingCSR, AwaitingApproval, Running)
}

// HasDependencies reports whether any preflight counter is non-zero.
func (d AgentDependencyCounts) HasDependencies() bool {
	return d.ActiveTargets > 0 || d.ActiveCertificates > 0 || d.PendingJobs > 0
}

// SentinelAgentIDs enumerates the four reserved agent identities that back
// non-agent discovery subsystems. These rows are created by cmd/server on
// startup and retiring them would orphan their subsystem: the network
// scanner and the three cloud secret-manager sources all key writes to
// these IDs via service.SentinelAgentID / service.SentinelAWSSecretsMgr /
// service.SentinelAzureKeyVault / service.SentinelGCPSecretMgr. The four
// literal IDs below MUST stay in lockstep with those service-package
// constants (see internal/service/network_scan.go line 23 and
// internal/service/cloud_discovery.go lines 14-16).
//
// The retirement service refuses them unconditionally, even with
// ?force=true, via ErrAgentIsSentinel. Living here (and not in the
// service package) lets handler, repository, and scheduler code filter
// them without importing service and creating a cycle.
var SentinelAgentIDs = []string{
	"server-scanner",
	"cloud-aws-sm",
	"cloud-azure-kv",
	"cloud-gcp-sm",
}

// IsSentinelAgent reports whether id matches one of the four reserved
// sentinel agent IDs. A linear scan is fine: the slice is length 4 and
// the check is rare (only on retirement attempts and sweeper filters).
func IsSentinelAgent(id string) bool {
	for _, s := range SentinelAgentIDs {
		if s == id {
			return true
		}
	}
	return false
}

// AgentMetadata contains runtime metadata reported by agents via heartbeat.
type AgentMetadata struct {
	OS           string `json:"os"`
	Architecture string `json:"architecture"`
	Hostname     string `json:"hostname"`
	IPAddress    string `json:"ip_address"`
	Version      string `json:"version"`
}

// AgentStatus represents the operational status of an agent.
type AgentStatus string

const (
	AgentStatusOnline   AgentStatus = "Online"
	AgentStatusOffline  AgentStatus = "Offline"
	AgentStatusDegraded AgentStatus = "Degraded"
)

// IssuerType represents the type of certificate authority.
type IssuerType string

const (
	IssuerTypeACME       IssuerType = "ACME"
	IssuerTypeGenericCA  IssuerType = "GenericCA"
	IssuerTypeStepCA     IssuerType = "StepCA"
	IssuerTypeOpenSSL    IssuerType = "OpenSSL"
	IssuerTypeVault      IssuerType = "VaultPKI"
	IssuerTypeDigiCert   IssuerType = "DigiCert"
	IssuerTypeSectigo    IssuerType = "Sectigo"
	IssuerTypeGoogleCAS  IssuerType = "GoogleCAS"
	IssuerTypeAWSACMPCA  IssuerType = "AWSACMPCA"
	IssuerTypeEntrust    IssuerType = "Entrust"
	IssuerTypeGlobalSign IssuerType = "GlobalSign"
	IssuerTypeEJBCA      IssuerType = "EJBCA"
)

// TargetType represents the type of deployment target.
type TargetType string

const (
	TargetTypeNGINX             TargetType = "NGINX"
	TargetTypeApache            TargetType = "Apache"
	TargetTypeHAProxy           TargetType = "HAProxy"
	TargetTypeF5                TargetType = "F5"
	TargetTypeIIS               TargetType = "IIS"
	TargetTypeTraefik           TargetType = "Traefik"
	TargetTypeCaddy             TargetType = "Caddy"
	TargetTypeEnvoy             TargetType = "Envoy"
	TargetTypePostfix           TargetType = "Postfix"
	TargetTypeDovecot           TargetType = "Dovecot"
	TargetTypeSSH               TargetType = "SSH"
	TargetTypeWinCertStore      TargetType = "WinCertStore"
	TargetTypeJavaKeystore      TargetType = "JavaKeystore"
	TargetTypeKubernetesSecrets TargetType = "KubernetesSecrets"
)
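The retirement rules documented on AgentDependencyCounts and SentinelAgentIDs can be combined into a single preflight decision. The sketch below is an assumption about how a caller might order the checks, not the actual retirement service: preflight, depCounts, and both error values are hypothetical stand-ins, and the reason parameter that accompanies ?force=true is left out.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins for the service-level errors.
var (
	errAgentIsSentinel = errors.New("agent is a reserved sentinel")
	errHasDependencies = errors.New("agent has active dependencies") // would map to HTTP 409
)

// depCounts mirrors the three counters of AgentDependencyCounts.
type depCounts struct {
	ActiveTargets, ActiveCertificates, PendingJobs int
}

func (d depCounts) hasDependencies() bool {
	return d.ActiveTargets > 0 || d.ActiveCertificates > 0 || d.PendingJobs > 0
}

var sentinelIDs = []string{"server-scanner", "cloud-aws-sm", "cloud-azure-kv", "cloud-gcp-sm"}

func isSentinel(id string) bool {
	for _, s := range sentinelIDs {
		if s == id {
			return true
		}
	}
	return false
}

// preflight refuses sentinels unconditionally, then blocks a retire that
// has dependencies unless the operator forced it.
func preflight(id string, d depCounts, force bool) error {
	if isSentinel(id) {
		return errAgentIsSentinel
	}
	if d.hasDependencies() && !force {
		return errHasDependencies
	}
	return nil
}

func main() {
	fmt.Println(preflight("cloud-aws-sm", depCounts{}, true))
	fmt.Println(preflight("agent-demo", depCounts{PendingJobs: 2}, false))
	fmt.Println(preflight("agent-demo", depCounts{PendingJobs: 2}, true))
}
```

Checking the sentinel list first is what makes the refusal independent of the force flag.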