mirror of
https://github.com/shankar0123/certctl.git
synced 2026-06-14 09:48:58 +00:00
repo,service: introduce WithinTx and atomic audit rows for issue/renew/revoke
Closes the #3 acquisition-readiness blocker from the 2026-05-01 issuer coverage audit (Part 1.5 finding #1: audit row not transactional with issuance). AuditRepository.Create previously ran on the package-level *sql.DB while the certificate insert / version insert / revocation insert ran on independent connections — a failed audit INSERT after a successful operation INSERT was silently lost. SOX §404 over IT general controls, PCI-DSS §10 audit logging, HIPAA §164.312(b) audit controls, and CA/B Forum Baseline Requirements §5.4.1 audit log records all presume audit-with-operation atomicity. Design — Option A (Querier abstraction). The chosen pattern: a shared repository.Querier interface (subset of *sql.DB and *sql.Tx) plus a postgres.WithinTx helper that begins a tx, runs fn, commits on nil error, rolls back on error or panic, and returns the wrapped result. Repository methods that participate in a service-layer transaction expose a *WithTx variant taking repository.Querier; the bare methods remain for stand-alone use. A repository.Transactor abstracts the "begin tx, run fn, commit/rollback" lifecycle so service-layer code runs multi-write operations atomically without holding *sql.DB directly. Option B (UnitOfWork) was considered but adds boilerplate without behavioral benefit for the current scope. Option C (context-carried tx) was explicitly rejected — it hides the transactional boundary from the type system, reproducing the class of bug we're fixing. This commit: - Adds internal/repository/querier.go with the Querier interface (compile-time guards that *sql.DB and *sql.Tx satisfy it) and the Transactor interface for service-layer use. - Adds internal/repository/postgres/tx.go with the WithinTx helper (begin/fn/commit/rollback with panic recovery) and a transactor type that satisfies repository.Transactor. - Adds CreateWithTx variants on AuditRepository, CertificateRepository (Create + Update + CreateVersion), and RevocationRepository. Existing bare methods now delegate to the *WithTx variant using the package-level *sql.DB so existing call sites are behavior-preserving. - Updates repository/interfaces.go: AuditRepository, CertificateRepository, and RevocationRepository declare the new *WithTx methods. Adds an atomicity contract doc-comment on AuditRepository pointing at WithinTx + the audit blocker. - Adds AuditService.RecordEventWithTx, mirroring RecordEvent but routing through CreateWithTx so the audit row is part of the caller's transaction. Same redaction + marshalling contract. - Refactors three audit-emitting service paths to use Transactor.WithinTx when SetTransactor was wired, with a legacy fallback for backward compat: * CertificateService.Create — cert insert + audit row in one tx. * RevocationSvc.RevokeCertificateWithActor — cert status update + revocation row + audit row in one tx. The OCSP cache invalidate remains best-effort (out of scope per the prompt). * RenewalService CompleteServerRenewal — cert version insert + cert update + audit row in one tx. Job status update stays outside the audit-atomicity scope (job state lives outside the operator-facing audit trail). - Adds SetTransactor on CertificateService, RevocationSvc, and RenewalService. cmd/server/main.go wires a single Transactor instance shared across all three so all audit-emitting paths run their writes in transactions backed by the same *sql.DB handle. - Updates 5 mock implementations to satisfy the new interface methods: mockCertRepo (testutil_test.go), mockCertRepoWithGetError (shortlived_test.go), fakeRevocationRepo (crl_cache_test.go), intuneE2EAuditRepo (scep_intune_e2e_test.go), and the integration- test mocks (lifecycle_test.go: mockCertificateRepository, mockAuditRepository, mockRevocationRepository). All *WithTx mocks ignore the Querier and delegate to the bare method (mocks have no DB; in-memory state is shared regardless of "tx"). - Adds a service-layer test mockTransactor with BeginTxErr and CommitErr knobs so the atomic-audit tests can assert error propagation through the transactional boundary. - Adds internal/repository/postgres/tx_test.go: unit-level test that WithinTx surfaces "begin tx" wrap when BeginTx fails, and that Transactor.WithinTx delegates correctly. Real-Postgres rollback semantics are covered by the testcontainers tests in the postgres package — sandbox disk pressure prevented adding a sqlmock dep for the in-fn / commit-failure unit test, so those scenarios are exercised through atomic_audit_test.go using the mockTransactor's CommitErr / BeginTxErr fields. - Adds internal/service/atomic_audit_test.go: * TestCertificateService_Create_AtomicWithTx — asserts audit insert failure inside the tx surfaces as the operation's error (closes the blocker contract). * TestCertificateService_Create_LegacyPathLogs — pins the backward-compat behavior when SetTransactor isn't wired: audit failure is logged-not-failed, matching pre-fix. * TestCertificateService_Create_TransactorBeginFailure — BeginTx error path: operation fails, no cert insert, no audit insert. * TestCertificateService_Create_TransactorCommitFailure — Commit error after successful in-fn writes surfaces as the operation's error. Real Postgres can fail Commit on serialization conflicts; the service must report this. Out of scope (separate follow-up commits, same shape): - Issuer CRUD audit atomicity. - Target CRUD audit atomicity. - Agent retire (already transactional via RetireAgentWithCascade; verified, not changed). - Renewal-policy CRUD audit atomicity. - Owner/team/agent-group CRUD audit atomicity. - Discovery / health-check audit atomicity. Verified locally: - gofmt -l . clean - go vet ./... clean - staticcheck ./... clean - golangci-lint run --timeout 5m ./... → 0 issues - go test -short -count=1 ./internal/service/ green - go test -short -count=1 ./internal/api/handler/ green - go test -short -count=1 ./internal/integration/ green - go test -short -count=1 ./internal/repository/postgres/ green - go build ./... success Audit reference: cowork/issuer-coverage-audit-2026-05-01/RESULTS.md Top-10 fix #3 (Part 3, narrative section).
This commit is contained in:
+56
-21
@@ -31,6 +31,10 @@ type RenewalService struct {
|
||||
notificationSvc *NotificationService
|
||||
issuerRegistry *IssuerRegistry
|
||||
keygenMode string // "agent" (default) or "server" (demo only)
|
||||
// tx — when set, wraps the cert version insert + cert update + audit
|
||||
// row in a single transaction. Closes the #3 audit-readiness blocker
|
||||
// for the renewal path. Optional via SetTransactor.
|
||||
tx repository.Transactor
|
||||
}
|
||||
|
||||
// SetTargetRepo sets the target repository for resolving agent_id on deployment jobs.
|
||||
@@ -38,6 +42,14 @@ func (s *RenewalService) SetTargetRepo(repo repository.TargetRepository) {
|
||||
s.targetRepo = repo
|
||||
}
|
||||
|
||||
// SetTransactor wires a Transactor for atomic renewal completion (cert
|
||||
// version insert + cert update + audit row in a single transaction).
|
||||
// Closes the #3 audit-readiness blocker for the renewal path. Optional
|
||||
// — nil reverts to legacy non-transactional behavior.
|
||||
func (s *RenewalService) SetTransactor(tx repository.Transactor) {
|
||||
s.tx = tx
|
||||
}
|
||||
|
||||
// IssuerConnector defines the service-layer interface for interacting with certificate issuers.
|
||||
// This is distinct from the connector-layer issuer.Connector interface to maintain dependency
|
||||
// inversion. Use IssuerConnectorAdapter to bridge between the two.
|
||||
@@ -508,23 +520,58 @@ func (s *RenewalService) processRenewalServerKeygen(ctx context.Context, job *do
|
||||
CreatedAt: time.Now(),
|
||||
}
|
||||
|
||||
if err := s.certRepo.CreateVersion(ctx, version); err != nil {
|
||||
s.failJob(ctx, job, fmt.Sprintf("version creation failed: %v", err))
|
||||
return fmt.Errorf("failed to create certificate version: %w", err)
|
||||
}
|
||||
|
||||
// Update certificate status and expiry
|
||||
cert.Status = domain.CertificateStatusActive
|
||||
cert.ExpiresAt = result.NotAfter
|
||||
now := time.Now()
|
||||
cert.LastRenewalAt = &now
|
||||
cert.UpdatedAt = now
|
||||
if err := s.certRepo.Update(ctx, cert); err != nil {
|
||||
s.failJob(ctx, job, fmt.Sprintf("cert update failed: %v", err))
|
||||
return fmt.Errorf("failed to update certificate: %w", err)
|
||||
|
||||
auditDetails := map[string]interface{}{
|
||||
"job_id": job.ID,
|
||||
"serial": result.Serial,
|
||||
"not_after": result.NotAfter,
|
||||
"keygen_mode": "server",
|
||||
}
|
||||
|
||||
// Mark renewal job as completed
|
||||
// Atomic three-write path (when SetTransactor was wired): version
|
||||
// insert + cert update + audit row in a single transaction. Closes
|
||||
// the #3 audit-readiness blocker for the renewal path.
|
||||
if s.tx != nil {
|
||||
if err := s.tx.WithinTx(ctx, func(q repository.Querier) error {
|
||||
if err := s.certRepo.CreateVersionWithTx(ctx, q, version); err != nil {
|
||||
return fmt.Errorf("failed to create certificate version: %w", err)
|
||||
}
|
||||
if err := s.certRepo.UpdateWithTx(ctx, q, cert); err != nil {
|
||||
return fmt.Errorf("failed to update certificate: %w", err)
|
||||
}
|
||||
if err := s.auditService.RecordEventWithTx(ctx, q, "system", domain.ActorTypeSystem,
|
||||
"renewal_job_completed", "certificate", job.CertificateID, auditDetails); err != nil {
|
||||
return fmt.Errorf("failed to record audit event: %w", err)
|
||||
}
|
||||
return nil
|
||||
}); err != nil {
|
||||
s.failJob(ctx, job, err.Error())
|
||||
return err
|
||||
}
|
||||
} else {
|
||||
// Legacy non-transactional path — pre-fix behavior.
|
||||
if err := s.certRepo.CreateVersion(ctx, version); err != nil {
|
||||
s.failJob(ctx, job, fmt.Sprintf("version creation failed: %v", err))
|
||||
return fmt.Errorf("failed to create certificate version: %w", err)
|
||||
}
|
||||
if err := s.certRepo.Update(ctx, cert); err != nil {
|
||||
s.failJob(ctx, job, fmt.Sprintf("cert update failed: %v", err))
|
||||
return fmt.Errorf("failed to update certificate: %w", err)
|
||||
}
|
||||
if auditErr := s.auditService.RecordEvent(ctx, "system", domain.ActorTypeSystem,
|
||||
"renewal_job_completed", "certificate", job.CertificateID, auditDetails); auditErr != nil {
|
||||
slog.Error("failed to record audit event", "error", auditErr)
|
||||
}
|
||||
}
|
||||
|
||||
// Mark renewal job as completed (independent of the cert/audit
|
||||
// transaction — job state lives outside the audit-atomicity scope).
|
||||
if err := s.jobRepo.UpdateStatus(ctx, job.ID, domain.JobStatusCompleted, ""); err != nil {
|
||||
return fmt.Errorf("failed to update job status: %w", err)
|
||||
}
|
||||
@@ -537,18 +584,6 @@ func (s *RenewalService) processRenewalServerKeygen(ctx context.Context, job *do
|
||||
slog.Error("failed to send renewal notification", "error", err)
|
||||
}
|
||||
|
||||
// Record audit event
|
||||
if auditErr := s.auditService.RecordEvent(ctx, "system", domain.ActorTypeSystem,
|
||||
"renewal_job_completed", "certificate", job.CertificateID,
|
||||
map[string]interface{}{
|
||||
"job_id": job.ID,
|
||||
"serial": result.Serial,
|
||||
"not_after": result.NotAfter,
|
||||
"keygen_mode": "server",
|
||||
}); auditErr != nil {
|
||||
slog.Error("failed to record audit event", "error", auditErr)
|
||||
}
|
||||
|
||||
return nil
|
||||
}
|
||||
|
||||
|
||||
Reference in New Issue
Block a user