mirror of
https://github.com/shankar0123/certctl.git
synced 2026-06-09 11:58:53 +00:00
ceca3647eb
Closes Top-10 fix #5 of the 2026-05-03 issuer-coverage audit (see cowork/issuer-coverage-audit-2026-05-03/RESULTS.md). Pre-fix, the VaultPKI adapter authenticated with a static token and never called renew-self. Long-lived deploys hit token expiry; the first operator-visible signal was failed cert renewals on production targets. This commit: 1. Connector.Start(ctx) spawns a goroutine that calls POST /v1/auth/token/renew-self at TTL/2 cadence (computed from a one-shot lookup-self at startup). Honours ctx.Done() for graceful shutdown via a per-loop done channel + Stop(). 2. On `renewable: false` response (initial lookup OR any subsequent renewal), the loop emits a WARN, increments the not_renewable counter, and exits. The operator must rotate the token before Vault's Max TTL elapses. 3. New Prometheus counter certctl_vault_token_renewals_total with labels result={success,failure,not_renewable}. Registered alongside existing certctl_issuance_* counters in internal/api/handler/metrics.go. 4. ERROR-level logging on renewal failure with operator-actionable substring ("vault token renewal failed; rotate the token before TTL expires") so journalctl + grep find it. Loop keeps ticking after a failure — transient blips don't kill it. New optional issuer.Lifecycle interface: type Lifecycle interface { Start(ctx context.Context) error Stop() } Connectors that hold no background goroutines (almost all of them) do not implement this — IssuerRegistry.StartLifecycles / StopLifecycles feature-detect via type assertion. New lifecycle-bearing connectors plug in by implementing the interface; no further registry plumbing required. Wiring (cmd/server/main.go): - service.NewVaultRenewalMetrics() instance is shared between issuerRegistry.SetVaultRenewalMetrics (so Vault connectors built by Rebuild get a recorder) and metricsHandler.SetVaultRenewals (so the Prometheus exposer emits the new series). - issuerRegistry.StartLifecycles(ctx) is called after issuerService.BuildRegistry; defer issuerRegistry.StopLifecycles is paired so goroutines exit cleanly on signal. - IssuerConnectorAdapter.Underlying() exposes the wrapped issuer.Connector so registry-level machinery can reach the concrete connector behind the adapter without duplicating the wiring at every call site. Tests (internal/connector/issuer/vault/vault_renew_test.go): - TestVault_RenewLoop_TickAtHalfTTL — three ticks → three renewals, all "success". - TestVault_RenewLoop_StopsOnNotRenewable — second renewal returns renewable=false, loop exits, third tick fires no HTTP call. - TestVault_RenewLoop_FailureSurfacesViaMetric — first renewal 403 bumps "failure", second renewal succeeds → loop kept ticking. - TestVault_RenewLoop_CtxCancellation_StopsCleanly — Stop returns within 200ms after ctx cancel. - TestVault_RenewLoop_StartsNothingWhenNotRenewable — token already non-renewable at boot ⇒ no goroutine, "not_renewable" metric increments at startup so operators see it in Grafana. - TestVault_ComputeInterval — 4 cases pinning TTL/2 + minRenewInterval floor. - TestVault_RenewSelf_ParseFailure_NamesActionableInError — surfaced error contains "vault token renewal failed" + "rotate the token". Cadence is dynamic — every successful renewal re-derives TTL/2 from the renewed lease's lease_duration, so a short bootstrap token that gets renewed up to a longer Max TTL shifts to the longer cadence automatically (defends against degenerate fast ticking on a token whose Max TTL is far longer than its initial TTL). Documentation: - docs/connectors.md Vault PKI section gains "Token TTL + automatic renewal" subsection (operator-facing: cadence, metric, renewable=false rotation playbook). Out of scope (intentional, flagged in the audit follow-up): - AppRole / Kubernetes / AWS IAM auth methods (different renewal semantics). - Hot-reload of rotated token from disk (operator restarts today; future: GUI/MCP issuer-update path triggers Rebuild which Stops the old connector and Starts the new one). - Auto-re-auth after token death (operator playbook owns it). CHANGELOG.md is intentionally not hand-edited (per CHANGELOG.md itself: "no longer maintains a hand-edited per-version changelog; per-release notes are auto-generated from commit messages between consecutive tags"). Verified locally: - gofmt clean. - go vet ./internal/service/... ./internal/api/handler/... ./internal/connector/issuer/vault/... ./cmd/server/... clean. - go test -short -count=1 ./internal/connector/issuer/vault/... ./internal/service/... ./internal/api/handler/... green. - go test -race -count=10 -run 'TestVault_RenewLoop|TestVault_ComputeInterval' ./internal/connector/issuer/vault/... green. Audit reference: cowork/issuer-coverage-audit-2026-05-03/RESULTS.md Top-10 fix #5.
192 lines
7.1 KiB
Go
192 lines
7.1 KiB
Go
package service
|
|
|
|
import (
|
|
"context"
|
|
"time"
|
|
|
|
"github.com/shankar0123/certctl/internal/connector/issuer"
|
|
)
|
|
|
|
// IssuerConnectorAdapter bridges the connector-layer issuer.Connector interface with the
|
|
// service-layer IssuerConnector interface. This maintains dependency inversion: the service
|
|
// layer defines the interface it needs, and this adapter wraps the concrete connector.
|
|
//
|
|
// Metrics: when issuerType + metrics are set via SetMetrics, the adapter
|
|
// records every IssueCertificate / RenewCertificate call into the
|
|
// IssuanceMetrics tables (audit fix #4). Untyped or unmetricked
|
|
// adapters (test path) skip the recording — nil-guard everywhere.
|
|
type IssuerConnectorAdapter struct {
|
|
connector issuer.Connector
|
|
issuerType string
|
|
metrics *IssuanceMetrics
|
|
}
|
|
|
|
// NewIssuerConnectorAdapter wraps an issuer.Connector to implement
|
|
// service.IssuerConnector. Existing call sites (28+) keep this
|
|
// signature; metrics are wired via SetMetrics post-construction by
|
|
// the production code path (issuer_registry.go) so test sites that
|
|
// don't care about metrics stay one-arg.
|
|
func NewIssuerConnectorAdapter(c issuer.Connector) IssuerConnector {
|
|
return &IssuerConnectorAdapter{connector: c}
|
|
}
|
|
|
|
// SetMetrics wires per-issuer-type issuance metrics. issuerType is the
|
|
// factory key (e.g. "local", "acme", "digicert") — must match one of
|
|
// the closed-enum values the metrics doc references. metrics may be
|
|
// nil to disable recording. Closes the #4 audit-readiness blocker
|
|
// (per-issuer-type metrics).
|
|
func (a *IssuerConnectorAdapter) SetMetrics(issuerType string, metrics *IssuanceMetrics) {
|
|
a.issuerType = issuerType
|
|
a.metrics = metrics
|
|
}
|
|
|
|
// Underlying returns the wrapped issuer.Connector so registry-level
|
|
// machinery (StartLifecycles / StopLifecycles, Bundle G audit-row
|
|
// pairing, future feature-detect interfaces) can reach the concrete
|
|
// connector behind the adapter without duplicating the wiring at
|
|
// every call site. Returns interface{} rather than issuer.Connector
|
|
// so callers do their own type assertion against optional extension
|
|
// interfaces (issuer.Lifecycle, etc.) without an import dependency
|
|
// fan-out from this package.
|
|
func (a *IssuerConnectorAdapter) Underlying() interface{} {
|
|
return a.connector
|
|
}
|
|
|
|
// recordIssuance is the metrics-recording side effect at the adapter
|
|
// boundary. Bumps the issuance counter (success/failure) and the
|
|
// duration histogram; on failure also bumps the failure-by-error-class
|
|
// counter via ClassifyError.
|
|
//
|
|
// nil-guarded: when metrics or issuerType are unset, it's a no-op.
|
|
func (a *IssuerConnectorAdapter) recordIssuance(start time.Time, err error) {
|
|
if a.metrics == nil || a.issuerType == "" {
|
|
return
|
|
}
|
|
duration := time.Since(start)
|
|
if err != nil {
|
|
a.metrics.RecordIssuance(a.issuerType, "failure", duration)
|
|
a.metrics.RecordFailure(a.issuerType, ClassifyError(err))
|
|
} else {
|
|
a.metrics.RecordIssuance(a.issuerType, "success", duration)
|
|
}
|
|
}
|
|
|
|
// IssueCertificate delegates to the underlying connector's IssueCertificate method,
|
|
// translating between service-layer and connector-layer types.
|
|
//
|
|
// SCEP RFC 8894 + Intune master bundle Phase 5.6 follow-up: mustStaple flows
|
|
// through to the IssuanceRequest.MustStaple field. Only the local issuer
|
|
// honors it (RFC 7633 id-pe-tlsfeature extension); upstream connectors
|
|
// silently ignore the field.
|
|
func (a *IssuerConnectorAdapter) IssueCertificate(ctx context.Context, commonName string, sans []string, csrPEM string, ekus []string, maxTTLSeconds int, mustStaple bool) (*IssuanceResult, error) {
|
|
start := time.Now()
|
|
result, err := a.connector.IssueCertificate(ctx, issuer.IssuanceRequest{
|
|
CommonName: commonName,
|
|
SANs: sans,
|
|
CSRPEM: csrPEM,
|
|
EKUs: ekus,
|
|
MaxTTLSeconds: maxTTLSeconds,
|
|
MustStaple: mustStaple,
|
|
})
|
|
a.recordIssuance(start, err)
|
|
if err != nil {
|
|
return nil, err
|
|
}
|
|
return &IssuanceResult{
|
|
CertPEM: result.CertPEM,
|
|
ChainPEM: result.ChainPEM,
|
|
Serial: result.Serial,
|
|
NotBefore: result.NotBefore,
|
|
NotAfter: result.NotAfter,
|
|
}, nil
|
|
}
|
|
|
|
// RenewCertificate delegates to the underlying connector's RenewCertificate method,
|
|
// translating between service-layer and connector-layer types. Metrics:
|
|
// renewal is recorded into the same certctl_issuance_* series as
|
|
// initial issuance — operationally, renewal IS issuance from the
|
|
// connector's perspective.
|
|
func (a *IssuerConnectorAdapter) RenewCertificate(ctx context.Context, commonName string, sans []string, csrPEM string, ekus []string, maxTTLSeconds int, mustStaple bool) (*IssuanceResult, error) {
|
|
start := time.Now()
|
|
result, err := a.connector.RenewCertificate(ctx, issuer.RenewalRequest{
|
|
CommonName: commonName,
|
|
SANs: sans,
|
|
CSRPEM: csrPEM,
|
|
EKUs: ekus,
|
|
MaxTTLSeconds: maxTTLSeconds,
|
|
MustStaple: mustStaple,
|
|
})
|
|
a.recordIssuance(start, err)
|
|
if err != nil {
|
|
return nil, err
|
|
}
|
|
return &IssuanceResult{
|
|
CertPEM: result.CertPEM,
|
|
ChainPEM: result.ChainPEM,
|
|
Serial: result.Serial,
|
|
NotBefore: result.NotBefore,
|
|
NotAfter: result.NotAfter,
|
|
}, nil
|
|
}
|
|
|
|
// RevokeCertificate delegates to the underlying connector's RevokeCertificate method.
|
|
func (a *IssuerConnectorAdapter) RevokeCertificate(ctx context.Context, serial string, reason string) error {
|
|
var reasonPtr *string
|
|
if reason != "" {
|
|
reasonPtr = &reason
|
|
}
|
|
return a.connector.RevokeCertificate(ctx, issuer.RevocationRequest{
|
|
Serial: serial,
|
|
Reason: reasonPtr,
|
|
})
|
|
}
|
|
|
|
// GenerateCRL delegates to the underlying connector.
|
|
func (a *IssuerConnectorAdapter) GenerateCRL(ctx context.Context, entries []CRLEntry) ([]byte, error) {
|
|
// Convert service-layer CRLEntry to connector-layer RevokedCertEntry
|
|
connEntries := make([]issuer.RevokedCertEntry, len(entries))
|
|
for i, e := range entries {
|
|
connEntries[i] = issuer.RevokedCertEntry{
|
|
SerialNumber: e.SerialNumber,
|
|
RevokedAt: e.RevokedAt,
|
|
ReasonCode: e.ReasonCode,
|
|
}
|
|
}
|
|
return a.connector.GenerateCRL(ctx, connEntries)
|
|
}
|
|
|
|
// SignOCSPResponse delegates to the underlying connector.
|
|
func (a *IssuerConnectorAdapter) SignOCSPResponse(ctx context.Context, req OCSPSignRequest) ([]byte, error) {
|
|
return a.connector.SignOCSPResponse(ctx, issuer.OCSPSignRequest{
|
|
CertSerial: req.CertSerial,
|
|
CertStatus: req.CertStatus,
|
|
RevokedAt: req.RevokedAt,
|
|
RevocationReason: req.RevocationReason,
|
|
ThisUpdate: req.ThisUpdate,
|
|
NextUpdate: req.NextUpdate,
|
|
Nonce: req.Nonce, // RFC 6960 §4.4.1 echo (production hardening II Phase 1)
|
|
})
|
|
}
|
|
|
|
// GetCACertPEM delegates to the underlying connector.
|
|
func (a *IssuerConnectorAdapter) GetCACertPEM(ctx context.Context) (string, error) {
|
|
return a.connector.GetCACertPEM(ctx)
|
|
}
|
|
|
|
// GetRenewalInfo delegates to the underlying connector, translating between service-layer and connector-layer types.
|
|
func (a *IssuerConnectorAdapter) GetRenewalInfo(ctx context.Context, certPEM string) (*RenewalInfoResult, error) {
|
|
result, err := a.connector.GetRenewalInfo(ctx, certPEM)
|
|
if err != nil {
|
|
return nil, err
|
|
}
|
|
if result == nil {
|
|
return nil, nil
|
|
}
|
|
return &RenewalInfoResult{
|
|
SuggestedWindowStart: result.SuggestedWindowStart,
|
|
SuggestedWindowEnd: result.SuggestedWindowEnd,
|
|
RetryAfter: result.RetryAfter,
|
|
ExplanationURL: result.ExplanationURL,
|
|
}, nil
|
|
}
|