mirror of
https://github.com/shankar0123/certctl.git
synced 2026-06-12 18:48:51 +00:00
I-003: job timeout reaper closes AwaitingCSR/AwaitingApproval gap
Add 11th always-on scheduler loop that transitions jobs stuck in
AwaitingCSR (default 24h TTL) or AwaitingApproval (default 168h TTL)
to Failed. I-001's retry loop then auto-promotes eligible Failed jobs
back to Pending. No new status enum, no schema migration.
- JobRepository.ListTimedOutAwaitingJobs with per-status cutoff WHERE
- JobService.ReapTimedOutJobs mirrors RetryFailedJobs structure
- Scheduler jobTimeoutLoop with atomic.Bool idempotency guard, 2m
per-tick context, WaitGroup shutdown drain
- Config: CERTCTL_JOB_TIMEOUT_INTERVAL (10m), CERTCTL_JOB_AWAITING_CSR_TIMEOUT
(24h), CERTCTL_JOB_AWAITING_APPROVAL_TIMEOUT (168h)
- Audit event per transition: actor=system, actorType=System,
action=job_timeout, details={old_status, new_status, timeout_reason,
age_hours}
- 14 new tests: 3 config, 7 service, 4 scheduler
This commit is contained in:
@@ -151,6 +151,11 @@ type JobRepository interface {
|
||||
// to Running) and locks AwaitingCSR jobs against concurrent observers (leaving state intact,
|
||||
// since the CSR-submission path drives the next transition). H-6 (CWE-362) race remediation.
|
||||
ClaimPendingByAgentID(ctx context.Context, agentID string) ([]*domain.Job, error)
|
||||
// ListTimedOutAwaitingJobs returns jobs stuck in AwaitingCSR (created before csrCutoff) or
|
||||
// AwaitingApproval (created before approvalCutoff). The reaper loop transitions them to
|
||||
// Failed; I-001's retry loop then auto-promotes eligible Failed jobs back to Pending.
|
||||
// I-003 coverage-gap closure.
|
||||
ListTimedOutAwaitingJobs(ctx context.Context, csrCutoff, approvalCutoff time.Time) ([]*domain.Job, error)
|
||||
}
|
||||
|
||||
// RenewalPolicyRepository defines operations for managing renewal policies.
|
||||
|
||||
Reference in New Issue
Block a user