mirror of
https://github.com/shankar0123/certctl.git
synced 2026-06-07 20:51:30 +00:00
dfa9faa426
D-008 was a three-part drift in the policy engine that made the
D-005/D-006 remediation cosmetic below the DB layer:
(a) migrations/seed.sql INSERTed rules with pre-D-005 lowercase
types ('ownership', 'environment', 'lifetime', 'renewal_window')
that the handler validator rejects on Create/Update but that
raw SQL INSERTs bypassed entirely. At runtime evaluateRule's
switch fell through to the default "unknown policy rule type"
error branch on every demo rule × every cert × every cycle,
flooding logs while emitting zero violations.
(b) migrations/seed_demo.sql persisted lowercase severity values
('critical', 'error', 'warning') on policy_violations rows.
INSERT succeeded because that column had no CHECK, but any
frontend comparing against the canonical PolicySeverity enum
mis-categorized every seeded violation.
(c) evaluateRule hardcoded Severity: PolicySeverityWarning on
every emitted violation and ignored rule.Config entirely —
so the D-006 per-rule severity column (000013) and every
per-arm Config JSON ({allowed_issuer_ids, allowed_domains,
required_keys, allowed, lead_time_days, max_days}) was dead
data below the evaluation layer.
This commit lands (a)+(b)+(c) atomically. Shipping any subset
leaves the feature half-working.
## Changes
Domain (internal/domain/policy.go):
* Add PolicyTypeCertificateLifetime as the 6th TitleCase canonical.
Pre-D-008 the seeded "max-certificate-lifetime" rule had no engine
arm — routing it through RenewalLeadTime would conflate "how
close to expiry before we renew" with "how long can the cert
possibly be", two distinct semantics. The new type accepts
config {"max_days": int} and flags certs whose
NotAfter - NotBefore exceeds the cap.
Handler validator (internal/api/handler/validation.go):
* ValidatePolicyType allowlist grown to 6 canonicals
(AllowedIssuers, AllowedDomains, RequiredMetadata,
AllowedEnvironments, RenewalLeadTime, CertificateLifetime).
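
A case-sensitive allowlist of this kind might look like the sketch below. The six type names come from the list above; the map, the function name `validatePolicyType`, and the error wording are assumptions for illustration and are not copied from internal/api/handler/validation.go.

```go
package main

import "fmt"

// canonicalPolicyTypes holds the six TitleCase canonicals listed in the
// commit message. The variable name is hypothetical.
var canonicalPolicyTypes = map[string]struct{}{
	"AllowedIssuers":      {},
	"AllowedDomains":      {},
	"RequiredMetadata":    {},
	"AllowedEnvironments": {},
	"RenewalLeadTime":     {},
	"CertificateLifetime": {},
}

// validatePolicyType rejects anything that is not an exact, case-sensitive
// match, so pre-D-005 lowercase values such as "lifetime" fail validation.
func validatePolicyType(t string) error {
	if _, ok := canonicalPolicyTypes[t]; !ok {
		return fmt.Errorf("unknown policy rule type %q", t)
	}
	return nil
}
```

An exact-match lookup is what makes casing drift visible at the API boundary; it does nothing for raw SQL writes, which is why the seed files also had to change.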
OpenAPI (api/openapi.yaml):
* PolicyType enum grown to match domain.
Frontend (web/src/api/types.ts, types.test.ts):
* POLICY_TYPES tuple gains CertificateLifetime; pin test asserts
all 6 canonicals and rejects casing drift.
Migration 000014 (policy_violations severity CHECK):
* Named CHECK constraint (policy_violations_severity_check)
mirroring 000013's allowlist, defense-in-depth at the DB layer
against future drift from bypassed writes (migrations, psql
sessions, future callers). Symmetric down migration drops by
name.
Seed data:
* migrations/seed.sql rewritten to emit TitleCase canonicals with
per-arm config JSON that actually exercises the config-consuming
paths (not the missing-field backstops):
- pr-require-owner → RequiredMetadata {"required_keys":["owner"]} Warning
- pr-allowed-environments → AllowedEnvironments {"allowed":["production","staging","development"]} Error
- pr-max-certificate-lifetime → CertificateLifetime {"max_days":90} Critical
- pr-min-renewal-window → RenewalLeadTime {"lead_time_days":14} Warning
Severities are now differentiated per rule (D-006 intent).
* migrations/seed_demo.sql violation rows flipped to TitleCase
severity ('Critical', 'Error', 'Warning') so migration 000014
applies cleanly on upgrade paths.
Engine rewrite (internal/service/policy.go):
* evaluateRule rewritten. All six arms now:
1. Parse rule.Config into the per-arm typed struct.
2. Bad JSON → log at ValidateCertificate boundary and skip
this rule, so one malformed rule does not poison the other
rules in the same batch.
3. Empty/null Config → emit the pre-D-008 missing-field
violation (backwards compat invariant — operators who
haven't reconfigured still see the same output).
4. Violations emitted carry rule.Severity (no more hardcoded
Warning); D-006 column is now load-bearing.
* CertificateLifetime arm reads NotBefore/NotAfter from the
certificate's latest version via CertRepo. Injected via
PolicyService.SetCertRepo() setter — avoids churning ~36
NewPolicyService call sites while keeping the lifetime arm
optional (degrades to a log+skip if the setter is not wired).
Server wiring (cmd/server/main.go):
* policyService.SetCertRepo(certRepo) wired after construction.
Tests (internal/service/policy_test.go):
* 25 new subtests across 5 groups:
- TestEvaluateRule_SeverityPassThrough (6): every rule type
emits violations carrying rule.Severity, not hardcoded.
- TestEvaluateRule_ConfigConsumed (12): every per-arm Config
path exercised positive + negative.
- TestEvaluateRule_EmptyConfig_BackCompat (3): empty/null
Config still emits pre-D-008 missing-field violations.
- TestEvaluateRule_BadConfig_SkipsRule: malformed JSON logs
and skips cleanly without poisoning neighbors.
- TestEvaluateRule_CertificateLifetime_RepoScenarios (3):
ok when repo wired, log+skip when not, handles missing
NotBefore/NotAfter edges.
Provenance: D-008 surfaced during D-005/D-006 remediation review
in 7a0ea35. That commit added persistence and CI pins for the
severity field but did not re-verify the evaluation layer
consumed it; this finding and fix close the audit-process gap.
767 lines
31 KiB
Go
package main

import (
	"context"
	"fmt"
	"log/slog"
	"net"
	"net/http"
	"os"
	"os/signal"
	"strconv"
	"syscall"
	"time"

	"github.com/shankar0123/certctl/internal/api/handler"
	"github.com/shankar0123/certctl/internal/api/middleware"
	"github.com/shankar0123/certctl/internal/api/router"
	"github.com/shankar0123/certctl/internal/config"
	"github.com/shankar0123/certctl/internal/domain"
	discoveryawssm "github.com/shankar0123/certctl/internal/connector/discovery/awssm"
	discoveryazurekv "github.com/shankar0123/certctl/internal/connector/discovery/azurekv"
	discoverygcpsm "github.com/shankar0123/certctl/internal/connector/discovery/gcpsm"
	notifyemail "github.com/shankar0123/certctl/internal/connector/notifier/email"
	notifyopsgenie "github.com/shankar0123/certctl/internal/connector/notifier/opsgenie"
	notifypagerduty "github.com/shankar0123/certctl/internal/connector/notifier/pagerduty"
	notifyslack "github.com/shankar0123/certctl/internal/connector/notifier/slack"
	notifyteams "github.com/shankar0123/certctl/internal/connector/notifier/teams"
	"github.com/shankar0123/certctl/internal/repository/postgres"
	"github.com/shankar0123/certctl/internal/scheduler"
	"github.com/shankar0123/certctl/internal/service"
)

func main() {
	// Load configuration
	cfg, err := config.Load()
	if err != nil {
		fmt.Fprintf(os.Stderr, "Failed to load configuration: %v\n", err)
		os.Exit(1)
	}

	// Set up structured logging
	logger := slog.New(slog.NewJSONHandler(os.Stdout, &slog.HandlerOptions{
		Level: cfg.GetLogLevel(),
	}))

	logger.Info("certctl server starting",
		"version", "2.0.9",
		"server_host", cfg.Server.Host,
		"server_port", cfg.Server.Port)

	// Initialize database connection pool
	db, err := postgres.NewDB(cfg.Database.URL)
	if err != nil {
		logger.Error("failed to connect to database", "error", err)
		os.Exit(1)
	}
	defer db.Close()
	logger.Info("connected to database")

	// Run migrations
	logger.Info("running migrations", "path", cfg.Database.MigrationsPath)
	if err := postgres.RunMigrations(db, cfg.Database.MigrationsPath); err != nil {
		logger.Error("failed to run migrations", "error", err)
		os.Exit(1)
	}
	logger.Info("migrations completed")

	// Initialize repositories with real PostgreSQL connection
	auditRepo := postgres.NewAuditRepository(db)
	certificateRepo := postgres.NewCertificateRepository(db)
	issuerRepo := postgres.NewIssuerRepository(db)
	targetRepo := postgres.NewTargetRepository(db)
	agentRepo := postgres.NewAgentRepository(db)
	jobRepo := postgres.NewJobRepository(db)
	policyRepo := postgres.NewPolicyRepository(db)
	notificationRepo := postgres.NewNotificationRepository(db)
	renewalPolicyRepo := postgres.NewRenewalPolicyRepository(db)
	profileRepo := postgres.NewProfileRepository(db)
	teamRepo := postgres.NewTeamRepository(db)
	ownerRepo := postgres.NewOwnerRepository(db)
	logger.Info("initialized all repositories")

	// Initialize dynamic issuer registry.
	// Issuers are loaded from the database (with AES-256-GCM encrypted config).
	// On first boot with an empty database, env var issuers are seeded automatically.
	//
	// M-8 (CWE-916 / CWE-329): the encryption passphrase is passed as a raw
	// string into IssuerService / TargetService / IssuerRegistry. Each call to
	// crypto.EncryptIfKeySet generates a fresh 16-byte PBKDF2 salt and emits a
	// v2 blob (magic 0x02 || salt || nonce || sealed). Decryption auto-detects
	// v1 legacy blobs (no magic) and falls back to the fixed v1 salt for
	// backward compatibility; v1 blobs transparently upgrade to v2 on next
	// write. DO NOT pre-derive the key here with crypto.DeriveKey — that was
	// the v1 fixed-salt behaviour that M-8 removes.
	encryptionKey := cfg.Encryption.ConfigEncryptionKey
	if encryptionKey != "" {
		logger.Info("config encryption enabled (AES-256-GCM, per-ciphertext PBKDF2 salt)")
	} else {
		// C-2 fix: fail closed at startup when database-sourced issuer or target
		// rows exist without a configured encryption key. Previously the server
		// would emit a one-line warning and silently persist new GUI-created
		// configs as plaintext (CWE-311). Refuse to start instead: the operator
		// must either configure CERTCTL_CONFIG_ENCRYPTION_KEY or remove the
		// vulnerable rows before the control plane can boot.
		ctx := context.Background()
		dbIssuers, ierr := issuerRepo.List(ctx)
		if ierr != nil {
			logger.Error("startup check: failed to list issuers", "error", ierr)
			os.Exit(1)
		}
		dbTargets, terr := targetRepo.List(ctx)
		if terr != nil {
			logger.Error("startup check: failed to list targets", "error", terr)
			os.Exit(1)
		}
		var dbIssuerCount, dbTargetCount int
		for _, iss := range dbIssuers {
			if iss != nil && iss.Source == "database" {
				dbIssuerCount++
			}
		}
		for _, tgt := range dbTargets {
			if tgt != nil && tgt.Source == "database" {
				dbTargetCount++
			}
		}
		if dbIssuerCount > 0 || dbTargetCount > 0 {
			logger.Error(
				"startup refused: CERTCTL_CONFIG_ENCRYPTION_KEY is not set but database-sourced configs exist "+
					"(would expose sensitive fields as plaintext, CWE-311). "+
					"Set the encryption key or remove the affected rows before restarting.",
				"database_sourced_issuers", dbIssuerCount,
				"database_sourced_targets", dbTargetCount,
			)
			os.Exit(1)
		}
		logger.Warn("CERTCTL_CONFIG_ENCRYPTION_KEY not set — env-seeded issuers will be stored in plaintext; GUI-created issuers and targets will be rejected until a key is configured")
	}

	issuerRegistry := service.NewIssuerRegistry(logger)

	// Initialize revocation repository
	revocationRepo := postgres.NewRevocationRepository(db)

	// Initialize services (following the dependency graph)
	auditService := service.NewAuditService(auditRepo)
	policyService := service.NewPolicyService(policyRepo, auditService)
	policyService.SetCertRepo(certificateRepo) // D-008: CertificateLifetime arm needs CertificateVersion.NotBefore/NotAfter
	certificateService := service.NewCertificateService(certificateRepo, policyService, auditService)
	notifierRegistry := make(map[string]service.Notifier)

	// Wire notifier connectors from config
	if cfg.Notifiers.SlackWebhookURL != "" {
		slackNotifier := notifyslack.New(notifyslack.Config{
			WebhookURL:      cfg.Notifiers.SlackWebhookURL,
			ChannelOverride: cfg.Notifiers.SlackChannel,
			Username:        cfg.Notifiers.SlackUsername,
		})
		notifierRegistry["Slack"] = slackNotifier
		logger.Info("Slack notifier enabled")
	}
	if cfg.Notifiers.TeamsWebhookURL != "" {
		teamsNotifier := notifyteams.New(notifyteams.Config{
			WebhookURL: cfg.Notifiers.TeamsWebhookURL,
		})
		notifierRegistry["Teams"] = teamsNotifier
		logger.Info("Teams notifier enabled")
	}
	if cfg.Notifiers.PagerDutyRoutingKey != "" {
		pdNotifier := notifypagerduty.New(notifypagerduty.Config{
			RoutingKey: cfg.Notifiers.PagerDutyRoutingKey,
			Severity:   cfg.Notifiers.PagerDutySeverity,
		})
		notifierRegistry["PagerDuty"] = pdNotifier
		logger.Info("PagerDuty notifier enabled")
	}
	if cfg.Notifiers.OpsGenieAPIKey != "" {
		ogNotifier := notifyopsgenie.New(notifyopsgenie.Config{
			APIKey:   cfg.Notifiers.OpsGenieAPIKey,
			Priority: cfg.Notifiers.OpsGeniePriority,
		})
		notifierRegistry["OpsGenie"] = ogNotifier
		logger.Info("OpsGenie notifier enabled")
	}

	// Wire email notifier if SMTP is configured
	var emailAdapter *notifyemail.NotifierAdapter
	if cfg.Notifiers.SMTPHost != "" && cfg.Notifiers.SMTPFromAddress != "" {
		emailConnector := notifyemail.New(&notifyemail.Config{
			SMTPHost:    cfg.Notifiers.SMTPHost,
			SMTPPort:    cfg.Notifiers.SMTPPort,
			Username:    cfg.Notifiers.SMTPUsername,
			Password:    cfg.Notifiers.SMTPPassword,
			FromAddress: cfg.Notifiers.SMTPFromAddress,
			UseTLS:      cfg.Notifiers.SMTPUseTLS,
		}, logger)
		emailAdapter = notifyemail.NewNotifierAdapter(emailConnector)
		notifierRegistry["Email"] = emailAdapter
		logger.Info("Email notifier enabled",
			"smtp_host", cfg.Notifiers.SMTPHost,
			"smtp_port", cfg.Notifiers.SMTPPort,
			"from", cfg.Notifiers.SMTPFromAddress)
	}

	notificationService := service.NewNotificationService(notificationRepo, notifierRegistry)
	notificationService.SetOwnerRepo(ownerRepo)

	// Create RevocationSvc with its dependencies
	revocationSvc := service.NewRevocationSvc(certificateRepo, revocationRepo, auditService)
	revocationSvc.SetIssuerRegistry(issuerRegistry)
	revocationSvc.SetNotificationService(notificationService)

	// Create CAOperationsSvc with its dependencies
	caOperationsSvc := service.NewCAOperationsSvc(revocationRepo, certificateRepo, profileRepo)
	caOperationsSvc.SetIssuerRegistry(issuerRegistry)

	// Wire sub-services into CertificateService
	certificateService.SetRevocationSvc(revocationSvc)
	certificateService.SetCAOperationsSvc(caOperationsSvc)
	certificateService.SetTargetRepo(targetRepo)
	certificateService.SetJobRepo(jobRepo)
	certificateService.SetKeygenMode(cfg.Keygen.Mode)
	renewalService := service.NewRenewalService(certificateRepo, jobRepo, renewalPolicyRepo, profileRepo, auditService, notificationService, issuerRegistry, cfg.Keygen.Mode)
	renewalService.SetTargetRepo(targetRepo)
	deploymentService := service.NewDeploymentService(jobRepo, targetRepo, agentRepo, certificateRepo, auditService, notificationService)
	jobService := service.NewJobService(jobRepo, renewalService, deploymentService, logger)
	agentService := service.NewAgentService(agentRepo, certificateRepo, jobRepo, targetRepo, auditService, issuerRegistry, renewalService)
	agentService.SetProfileRepo(profileRepo)
	issuerService := service.NewIssuerService(issuerRepo, auditService, issuerRegistry, encryptionKey, logger)

	// Seed issuers from env vars on first boot (empty database only), then build registry
	issuerService.SeedFromEnvVars(context.Background(), cfg)
	if err := issuerService.BuildRegistry(context.Background()); err != nil {
		logger.Error("failed to build issuer registry from database", "error", err)
	}
	logger.Info("issuer registry loaded", "issuers", issuerRegistry.Len())
	targetService := service.NewTargetService(targetRepo, auditService, agentRepo, encryptionKey, logger)
	profileService := service.NewProfileService(profileRepo, auditService)
	teamService := service.NewTeamService(teamRepo, auditService)
	ownerService := service.NewOwnerService(ownerRepo, auditService)
	agentGroupRepo := postgres.NewAgentGroupRepository(db)
	agentGroupService := service.NewAgentGroupService(agentGroupRepo, auditService)
	discoveryRepo := postgres.NewDiscoveryRepository(db)
	discoveryService := service.NewDiscoveryService(discoveryRepo, certificateRepo, auditService)
	networkScanRepo := postgres.NewNetworkScanRepository(db)
	networkScanService := service.NewNetworkScanService(networkScanRepo, discoveryService, auditService, logger)
	logger.Info("initialized network scan service")

	// Ensure the sentinel "server-scanner" agent exists for network discovery dedup.
	// This agent ID is used as the agent_id in discovered_certificates for network-scanned certs.
	if cfg.NetworkScan.Enabled {
		sentinelAgent := &domain.Agent{
			ID:     service.SentinelAgentID,
			Name:   "Network Scanner (Server-Side)",
			Status: domain.AgentStatusOnline,
		}
		// M-6: use CreateIfNotExists so duplicate rows on restart/upgrade are
		// idempotent without swallowing unrelated DB failures (CWE-662).
		created, err := agentRepo.CreateIfNotExists(context.Background(), sentinelAgent)
		if err != nil {
			logger.Error("sentinel agent creation failed", "id", service.SentinelAgentID, "error", err)
		} else if created {
			logger.Info("sentinel agent created", "id", service.SentinelAgentID)
		} else {
			logger.Debug("sentinel agent already exists", "id", service.SentinelAgentID)
		}
	}

	// Initialize cloud discovery sources (M50)
	var cloudDiscoveryService *service.CloudDiscoveryService
	if cfg.CloudDiscovery.Enabled {
		cloudDiscoveryService = service.NewCloudDiscoveryService(discoveryService, logger)

		// AWS Secrets Manager
		if cfg.CloudDiscovery.AWSSM.Enabled {
			awsSource := discoveryawssm.New(&cfg.CloudDiscovery.AWSSM, logger)
			cloudDiscoveryService.RegisterSource(awsSource)
			// Create sentinel agent for AWS SM
			sentinelAWS := &domain.Agent{
				ID:     service.SentinelAWSSecretsMgr,
				Name:   "AWS Secrets Manager Discovery",
				Status: domain.AgentStatusOnline,
			}
			// M-6: idempotent create (CWE-662).
			created, err := agentRepo.CreateIfNotExists(context.Background(), sentinelAWS)
			if err != nil {
				logger.Error("sentinel agent creation failed", "id", service.SentinelAWSSecretsMgr, "error", err)
			} else if created {
				logger.Info("sentinel agent created", "id", service.SentinelAWSSecretsMgr)
			} else {
				logger.Debug("sentinel agent already exists", "id", service.SentinelAWSSecretsMgr)
			}
		}

		// Azure Key Vault
		if cfg.CloudDiscovery.AzureKV.Enabled {
			azureSource := discoveryazurekv.New(discoveryazurekv.Config{
				VaultURL:     cfg.CloudDiscovery.AzureKV.VaultURL,
				TenantID:     cfg.CloudDiscovery.AzureKV.TenantID,
				ClientID:     cfg.CloudDiscovery.AzureKV.ClientID,
				ClientSecret: cfg.CloudDiscovery.AzureKV.ClientSecret,
			}, logger)
			cloudDiscoveryService.RegisterSource(azureSource)
			sentinelAzure := &domain.Agent{
				ID:     service.SentinelAzureKeyVault,
				Name:   "Azure Key Vault Discovery",
				Status: domain.AgentStatusOnline,
			}
			// M-6: idempotent create (CWE-662).
			created, err := agentRepo.CreateIfNotExists(context.Background(), sentinelAzure)
			if err != nil {
				logger.Error("sentinel agent creation failed", "id", service.SentinelAzureKeyVault, "error", err)
			} else if created {
				logger.Info("sentinel agent created", "id", service.SentinelAzureKeyVault)
			} else {
				logger.Debug("sentinel agent already exists", "id", service.SentinelAzureKeyVault)
			}
		}

		// GCP Secret Manager
		if cfg.CloudDiscovery.GCPSM.Enabled {
			gcpSource := discoverygcpsm.New(&cfg.CloudDiscovery.GCPSM, logger)
			cloudDiscoveryService.RegisterSource(gcpSource)
			sentinelGCP := &domain.Agent{
				ID:     service.SentinelGCPSecretMgr,
				Name:   "GCP Secret Manager Discovery",
				Status: domain.AgentStatusOnline,
			}
			// M-6: idempotent create (CWE-662).
			created, err := agentRepo.CreateIfNotExists(context.Background(), sentinelGCP)
			if err != nil {
				logger.Error("sentinel agent creation failed", "id", service.SentinelGCPSecretMgr, "error", err)
			} else if created {
				logger.Info("sentinel agent created", "id", service.SentinelGCPSecretMgr)
			} else {
				logger.Debug("sentinel agent already exists", "id", service.SentinelGCPSecretMgr)
			}
		}

		logger.Info("cloud discovery enabled",
			"sources", cloudDiscoveryService.SourceCount(),
			"interval", cfg.CloudDiscovery.Interval.String())
	}

	logger.Info("initialized all services")

	// Initialize bulk revocation service
	bulkRevocationService := service.NewBulkRevocationService(revocationSvc, certificateRepo, auditService, logger)

	// Initialize stats and metrics services
	statsService := service.NewStatsService(certificateRepo, jobRepo, agentRepo)
	logger.Info("initialized stats service")

	// Initialize API handlers
	certificateHandler := handler.NewCertificateHandler(certificateService)
	issuerHandler := handler.NewIssuerHandler(issuerService)
	targetHandler := handler.NewTargetHandler(targetService)
	agentHandler := handler.NewAgentHandler(agentService)
	jobHandler := handler.NewJobHandler(jobService)
	policyHandler := handler.NewPolicyHandler(policyService)
	profileHandler := handler.NewProfileHandler(profileService)
	teamHandler := handler.NewTeamHandler(teamService)
	ownerHandler := handler.NewOwnerHandler(ownerService)
	agentGroupHandler := handler.NewAgentGroupHandler(agentGroupService)
	auditHandler := handler.NewAuditHandler(auditService)
	notificationHandler := handler.NewNotificationHandler(notificationService)
	statsHandler := handler.NewStatsHandler(statsService)
	metricsHandler := handler.NewMetricsHandler(statsService, time.Now())
	healthHandler := handler.NewHealthHandler(cfg.Auth.Type)
	discoveryHandler := handler.NewDiscoveryHandler(discoveryService)
	networkScanHandler := handler.NewNetworkScanHandler(networkScanService)
	verificationService := service.NewVerificationService(jobRepo, auditService, logger)
	verificationHandler := handler.NewVerificationHandler(verificationService)
	exportService := service.NewExportService(certificateRepo, auditService)
	exportHandler := handler.NewExportHandler(exportService)

	bulkRevocationHandler := handler.NewBulkRevocationHandler(bulkRevocationService)

	// Initialize digest service (requires email notifier)
	var digestService *service.DigestService
	var digestHandler *handler.DigestHandler
	if cfg.Digest.Enabled && emailAdapter != nil {
		digestService = service.NewDigestService(
			statsService, certificateRepo, ownerRepo, emailAdapter, cfg.Digest.Recipients, logger,
		)
		digestHandler = handler.NewDigestHandler(digestService)
		logger.Info("digest service enabled",
			"interval", cfg.Digest.Interval.String(),
			"recipients", len(cfg.Digest.Recipients))
	} else {
		// Create a no-op digest handler for route registration
		digestHandler = handler.NewDigestHandler(nil)
		if cfg.Digest.Enabled && emailAdapter == nil {
			logger.Warn("digest enabled but SMTP not configured — digest emails will not be sent")
		}
	}

	// Initialize health check service (M48)
	var healthCheckService *service.HealthCheckService
	var healthCheckHandler *handler.HealthCheckHandler
	if cfg.HealthCheck.Enabled {
		healthCheckRepo := postgres.NewHealthCheckRepository(db)
		healthCheckService = service.NewHealthCheckService(
			healthCheckRepo,
			auditService,
			logger,
			cfg.HealthCheck.MaxConcurrent,
			time.Duration(cfg.HealthCheck.DefaultTimeout)*time.Millisecond,
			cfg.HealthCheck.HistoryRetention,
			cfg.HealthCheck.AutoCreate,
		)
		healthCheckHandler = handler.NewHealthCheckHandler(healthCheckService)
		logger.Info("health check service enabled",
			"interval", cfg.HealthCheck.CheckInterval.String(),
			"max_concurrent", cfg.HealthCheck.MaxConcurrent)
	} else {
		// Create a no-op health check handler for route registration
		healthCheckHandler = handler.NewHealthCheckHandler(nil)
	}

	logger.Info("initialized all handlers")

	// Create context with cancellation
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	// Initialize scheduler
	sched := scheduler.NewScheduler(
		renewalService,
		jobService,
		agentService,
		notificationService,
		networkScanService,
		logger,
	)

	// Configure scheduler intervals from config
	sched.SetRenewalCheckInterval(cfg.Scheduler.RenewalCheckInterval)
	sched.SetJobProcessorInterval(cfg.Scheduler.JobProcessorInterval)
	sched.SetAgentHealthCheckInterval(cfg.Scheduler.AgentHealthCheckInterval)
	sched.SetNotificationProcessInterval(cfg.Scheduler.NotificationProcessInterval)
	if cfg.NetworkScan.Enabled {
		sched.SetNetworkScanInterval(cfg.NetworkScan.ScanInterval)
		logger.Info("network scanning enabled", "interval", cfg.NetworkScan.ScanInterval.String())
	}
	if digestService != nil {
		sched.SetDigestService(digestService)
		sched.SetDigestInterval(cfg.Digest.Interval)
		logger.Info("digest scheduler enabled", "interval", cfg.Digest.Interval.String())
	}
	if healthCheckService != nil {
		sched.SetHealthCheckService(healthCheckService)
		sched.SetHealthCheckInterval(cfg.HealthCheck.CheckInterval)
		logger.Info("health check scheduler enabled", "interval", cfg.HealthCheck.CheckInterval.String())
	}
	if cloudDiscoveryService != nil && cloudDiscoveryService.SourceCount() > 0 {
		sched.SetCloudDiscoveryService(cloudDiscoveryService)
		sched.SetCloudDiscoveryInterval(cfg.CloudDiscovery.Interval)
		logger.Info("cloud discovery scheduler enabled",
			"interval", cfg.CloudDiscovery.Interval.String(),
			"sources", cloudDiscoveryService.SourceCount())
	}

	// Start scheduler
	logger.Info("starting scheduler")
	startedChan := sched.Start(ctx)
	<-startedChan
	logger.Info("scheduler started")

	// Build the API router with all handlers
	apiRouter := router.New()
	apiRouter.RegisterHandlers(router.HandlerRegistry{
		Certificates:   certificateHandler,
		Issuers:        issuerHandler,
		Targets:        targetHandler,
		Agents:         agentHandler,
		Jobs:           jobHandler,
		Policies:       policyHandler,
		Profiles:       profileHandler,
		Teams:          teamHandler,
		Owners:         ownerHandler,
		AgentGroups:    agentGroupHandler,
		Audit:          auditHandler,
		Notifications:  notificationHandler,
		Stats:          statsHandler,
		Metrics:        metricsHandler,
		Health:         healthHandler,
		Discovery:      discoveryHandler,
		NetworkScan:    networkScanHandler,
		Verification:   verificationHandler,
		Export:         exportHandler,
		Digest:         *digestHandler,
		HealthChecks:   healthCheckHandler,
		BulkRevocation: bulkRevocationHandler,
	})
	// Register EST (RFC 7030) handlers if enabled
	if cfg.EST.Enabled {
		issuerConn, ok := issuerRegistry.Get(cfg.EST.IssuerID)
		if !ok {
			logger.Error("EST issuer not found in registry", "issuer_id", cfg.EST.IssuerID)
			os.Exit(1)
		}
		estService := service.NewESTService(cfg.EST.IssuerID, issuerConn, auditService, logger)
		estService.SetProfileRepo(profileRepo)
		if cfg.EST.ProfileID != "" {
			estService.SetProfileID(cfg.EST.ProfileID)
		}
		estHandler := handler.NewESTHandler(estService)
		apiRouter.RegisterESTHandlers(estHandler)
		logger.Info("EST server enabled",
			"issuer_id", cfg.EST.IssuerID,
			"profile_id", cfg.EST.ProfileID,
			"endpoints", "/.well-known/est/{cacerts,simpleenroll,simplereenroll,csrattrs}")
	}

	// Register SCEP (RFC 8894) handlers if enabled
	if cfg.SCEP.Enabled {
		// H-2 fix: fail closed at startup when SCEP is enabled without a
		// challenge password configured. Previously the service-layer guard
		// at internal/service/scep.go:72-79 skipped the password check when
		// s.challengePassword == "", meaning any client that could reach the
		// /scep endpoint could enroll an arbitrary CSR against the configured
		// issuer (CWE-306, missing authentication for a critical function).
		// Refuse to start instead: the operator must set
		// CERTCTL_SCEP_CHALLENGE_PASSWORD (or disable SCEP) before the control
		// plane can boot.
		if err := preflightSCEPChallengePassword(cfg.SCEP.Enabled, cfg.SCEP.ChallengePassword); err != nil {
			logger.Error(
				"startup refused: SCEP is enabled but CERTCTL_SCEP_CHALLENGE_PASSWORD is not set "+
					"(would allow unauthenticated certificate enrollment, CWE-306). "+
					"Set a non-empty challenge password or disable SCEP before restarting.",
				"error", err,
			)
			os.Exit(1)
		}
		issuerConn, ok := issuerRegistry.Get(cfg.SCEP.IssuerID)
		if !ok {
			logger.Error("SCEP issuer not found in registry", "issuer_id", cfg.SCEP.IssuerID)
			os.Exit(1)
		}
		scepService := service.NewSCEPService(cfg.SCEP.IssuerID, issuerConn, auditService, logger, cfg.SCEP.ChallengePassword)
		scepService.SetProfileRepo(profileRepo)
		if cfg.SCEP.ProfileID != "" {
			scepService.SetProfileID(cfg.SCEP.ProfileID)
		}
		scepHandler := handler.NewSCEPHandler(scepService)
		apiRouter.RegisterSCEPHandlers(scepHandler)
		logger.Info("SCEP server enabled",
			"issuer_id", cfg.SCEP.IssuerID,
			"profile_id", cfg.SCEP.ProfileID,
			"challenge_password_set", cfg.SCEP.ChallengePassword != "",
			"endpoints", "/scep?operation={GetCACaps,GetCACert,PKIOperation}")
	}

	logger.Info("registered all API handlers")

	// Build middleware stack
	authMiddleware := middleware.NewAuth(middleware.AuthConfig{
		Type:   cfg.Auth.Type,
		Secret: cfg.Auth.Secret,
	})
	corsMiddleware := middleware.NewCORS(middleware.CORSConfig{
		AllowedOrigins: cfg.CORS.AllowedOrigins,
	})

	structuredLogger := middleware.NewLogging(logger)

	// Request body size limit middleware — prevents memory exhaustion attacks (CWE-400)
	bodyLimitMiddleware := middleware.NewBodyLimit(middleware.BodyLimitConfig{
		MaxBytes: cfg.Server.MaxBodySize,
	})
	logger.Info("request body size limit enabled", "max_bytes", cfg.Server.MaxBodySize)

	// API audit log middleware — records every API call to the audit trail
	auditAdapter := middleware.NewAuditServiceAdapter(
		func(ctx context.Context, actor string, actorType string, action string, resourceType string, resourceID string, details map[string]interface{}) error {
			return auditService.RecordEvent(ctx, actor, domain.ActorType(actorType), action, resourceType, resourceID, details)
		},
	)
	auditMiddleware := middleware.NewAuditLog(auditAdapter, middleware.AuditConfig{
		ExcludePaths: []string{"/health", "/ready"},
		Logger:       logger,
	})
	logger.Info("API audit logging enabled (excluding /health, /ready)")

	middlewareStack := []func(http.Handler) http.Handler{
		middleware.RequestID,
		structuredLogger,
		middleware.Recovery,
		bodyLimitMiddleware,
		corsMiddleware,
		authMiddleware,
		auditMiddleware.Middleware,
	}

	// Add rate limiter if enabled
	if cfg.RateLimit.Enabled {
		rateLimiter := middleware.NewRateLimiter(middleware.RateLimitConfig{
			RPS:       cfg.RateLimit.RPS,
			BurstSize: cfg.RateLimit.BurstSize,
		})
		middlewareStack = []func(http.Handler) http.Handler{
			middleware.RequestID,
			structuredLogger,
			middleware.Recovery,
			bodyLimitMiddleware,
			rateLimiter,
			corsMiddleware,
			authMiddleware,
			auditMiddleware.Middleware,
		}
		logger.Info("rate limiting enabled", "rps", cfg.RateLimit.RPS, "burst", cfg.RateLimit.BurstSize)
	}

	if cfg.Auth.Type == "none" {
		logger.Warn("authentication disabled (CERTCTL_AUTH_TYPE=none) — not suitable for production")
	} else {
		logger.Info("authentication enabled", "type", cfg.Auth.Type)
	}

	if cfg.Keygen.Mode == "server" {
		logger.Warn("server-side key generation enabled (CERTCTL_KEYGEN_MODE=server) — private keys touch control plane, demo only")
	} else {
		logger.Info("agent-side key generation enabled — private keys never leave agent infrastructure")
	}

// Apply middleware to API router
|
|
	apiHandler := middleware.Chain(apiRouter, middlewareStack...)

	// Wrap with dashboard static file serving
	// Vite builds to web/dist/; fall back to web/ for legacy single-file SPA
	var finalHandler http.Handler
	webDir := "./web/dist"
	if _, err := os.Stat(webDir + "/index.html"); err != nil {
		webDir = "./web"
	}
	// Health/ready routes bypass the full middleware stack (no auth required).
	// These are registered on the inner router without auth, but the outer
	// middleware chain wraps everything. Route them directly to the inner router.
	noAuthHandler := middleware.Chain(apiRouter,
		middleware.RequestID,
		structuredLogger,
		middleware.Recovery,
	)

	if _, err := os.Stat(webDir + "/index.html"); err == nil {
		fileServer := http.FileServer(http.Dir(webDir))
		finalHandler = http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			path := r.URL.Path
			// Health/ready and auth/info bypass auth middleware.
			// Health/ready: Docker/K8s health probes don't carry Bearer tokens.
			// auth/info: React app calls this before login to detect auth mode.
			if path == "/health" || path == "/ready" || path == "/api/v1/auth/info" {
				noAuthHandler.ServeHTTP(w, r)
				return
			}
			// All other API and EST routes go through the full middleware stack (with auth)
			if (len(path) >= 8 && path[:8] == "/api/v1/") ||
				(len(path) >= 16 && path[:16] == "/.well-known/est") {
				apiHandler.ServeHTTP(w, r)
				return
			}
			// Try to serve static files (JS, CSS, assets)
			if len(path) > 8 && path[:8] == "/assets/" {
				fileServer.ServeHTTP(w, r)
				return
			}
			// SPA fallback: serve index.html for all other routes
			http.ServeFile(w, r, webDir+"/index.html")
		})
		logger.Info("dashboard available at /", "web_dir", webDir)
	} else {
		// No dashboard: route health/auth-info without auth, everything else through full stack
		finalHandler = http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			path := r.URL.Path
			if path == "/health" || path == "/ready" || path == "/api/v1/auth/info" {
				noAuthHandler.ServeHTTP(w, r)
				return
			}
			apiHandler.ServeHTTP(w, r)
		})
		logger.Info("dashboard directory not found, serving API only")
	}

	// Server configuration
	addr := net.JoinHostPort(cfg.Server.Host, strconv.Itoa(cfg.Server.Port))
	httpServer := &http.Server{
		Addr:              addr,
		Handler:           finalHandler,
		ReadTimeout:       30 * time.Second,
		ReadHeaderTimeout: 5 * time.Second,
		WriteTimeout:      120 * time.Second, // Must accommodate ACME issuance (order + challenge + finalize)
		IdleTimeout:       60 * time.Second,
	}

	// Start HTTP server in background
	logger.Info("starting HTTP server", "address", addr)
	go func() {
		if err := httpServer.ListenAndServe(); err != nil && err != http.ErrServerClosed {
			logger.Error("HTTP server error", "error", err)
		}
	}()

	// Wait for shutdown signal
	sigChan := make(chan os.Signal, 1)
	signal.Notify(sigChan, syscall.SIGINT, syscall.SIGTERM)

	sig := <-sigChan
	logger.Info("received shutdown signal", "signal", sig.String())

	// Graceful shutdown
	shutdownCtx, shutdownCancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer shutdownCancel()

	cancel() // Stop scheduler

	// Wait for in-flight scheduler work to complete (up to 30 seconds)
	logger.Info("waiting for scheduler to complete in-flight work")
	if err := sched.WaitForCompletion(30 * time.Second); err != nil {
		logger.Warn("scheduler work did not complete in time", "error", err)
	}

	logger.Info("shutting down HTTP server")
	if err := httpServer.Shutdown(shutdownCtx); err != nil {
		logger.Error("HTTP server shutdown error", "error", err)
	}

	// Drain in-flight audit-recording goroutines before closing the DB pool.
	// The audit middleware spawns one goroutine per non-excluded request; those
	// goroutines run detached from the request context and write to the
	// audit_events table via the same *sql.DB. Without this drain, SIGTERM
	// would close the DB pool while recordings were mid-flight, silently
	// dropping audit events (M-1, CWE-662 / CWE-400).
	logger.Info("flushing audit middleware in-flight recordings")
	if err := auditMiddleware.Flush(shutdownCtx); err != nil {
		logger.Warn("audit middleware flush did not complete in time", "error", err)
	}

	// Close database connection
	if err := db.Close(); err != nil {
		logger.Error("error closing database connection", "error", err)
	}

	logger.Info("certctl server stopped")
}

// preflightSCEPChallengePassword enforces the H-2 fix: if SCEP is enabled, a
// non-empty challenge password MUST be configured. Returns a non-nil error
// otherwise so the caller can refuse to start the control plane (CWE-306,
// missing authentication for a critical function).
//
// This helper is extracted so the check can be unit tested without booting
// the full server. The caller (main) is responsible for translating the
// returned error into a structured log line and os.Exit(1).
func preflightSCEPChallengePassword(enabled bool, challengePassword string) error {
	if !enabled {
		return nil
	}
	if challengePassword == "" {
		return fmt.Errorf("SCEP enabled but CERTCTL_SCEP_CHALLENGE_PASSWORD is empty: " +
			"SCEP enrollment would accept any client (CWE-306); " +
			"configure a non-empty shared secret or set CERTCTL_SCEP_ENABLED=false")
	}
	return nil
}