asyncpoll: refactor Sectigo / Entrust / GlobalSign to bounded polling (Phase 2)

Phase 2 of the #5 acquisition-readiness fix from the 2026-05-01 issuer
coverage audit. Phase 1 (commit 711265b) shipped the shared asyncpoll
package and refactored DigiCert as the reference. This commit applies
the same pattern to the remaining three async-CA connectors and adds
the operator-facing docs.

Per-connector refactors:

- Sectigo (sectigo.go): GetOrderStatus now wraps pollEnrollmentOnce in
  asyncpoll.Poll. The collectNotReady sentinel (cert approved by SCM
  but not yet retrievable from the collect endpoint) maps to
  StillPending and rides the backoff schedule rather than the prior
  "return pending immediately" branch. Added isPermanentStatusError
  helper to distinguish transient HTTP errors (5xx / 429 / network)
  from permanent ones (400 / 401 / 403 / 404 / parse failure); the
  wrapped checkStatus errors get triaged inside pollEnrollmentOnce,
  which the poll closure calls.

- Entrust (entrust.go): GetOrderStatus wraps pollEnrollmentOnce. The
  AWAITING_APPROVAL status maps to StillPending; operators whose
  enrollments wait on human approval should raise
  CERTCTL_ENTRUST_POLL_MAX_WAIT_SECONDS to 86400 (24h) so a
  single scheduler tick can wait through the approval window. The
  default 10-minute deadline matches the other three connectors.

- GlobalSign (globalsign.go): GetOrderStatus wraps pollCertificateOnce.
  GlobalSign tracks orders by serial number rather than order ID, but
  the polling shape is identical to the other three. Status-code
  triage matches DigiCert: 4xx (not 429) is permanent, 5xx / 429 /
  network is transient.
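
The bounded-polling shape shared by the connectors can be sketched as
below. This is a minimal illustration, not the asyncpoll package itself:
the poll function, the pollConfig fields other than a MaxWait-style cap,
the jitter formula, and the delay values are all assumptions made for
the example.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"math/rand"
	"time"
)

// Result mirrors the three outcomes named in the commit message.
type Result int

const (
	StillPending Result = iota // keep polling, or the deadline expired
	Done                       // terminal result: stop polling
	Failed                     // permanent error: stop polling
)

// pollConfig is hypothetical; only a MaxWait-style cap appears in the source.
type pollConfig struct {
	MaxWait      time.Duration // total budget for one call
	InitialDelay time.Duration // assumed first sleep, must be > 0
	MaxDelay     time.Duration // assumed ceiling for a single sleep
}

// poll calls fn until it reports Done or Failed, or until MaxWait elapses.
func poll(ctx context.Context, cfg pollConfig, fn func(context.Context) (Result, error)) (Result, error) {
	ctx, cancel := context.WithTimeout(ctx, cfg.MaxWait)
	defer cancel()

	delay := cfg.InitialDelay
	var lastErr error
	for {
		res, err := fn(ctx)
		if res == Done || res == Failed {
			return res, err
		}
		lastErr = err // transient error: remember it and keep going

		// Jitter: sleep a random duration in [delay/2, delay].
		sleep := delay/2 + time.Duration(rand.Int63n(int64(delay/2)+1))
		timer := time.NewTimer(sleep)
		select {
		case <-ctx.Done():
			timer.Stop()
			return StillPending, lastErr
		case <-timer.C:
		}

		// Exponential backoff, capped at MaxDelay.
		delay *= 2
		if delay > cfg.MaxDelay {
			delay = cfg.MaxDelay
		}
	}
}

// runDemo polls a fake operation that fails transiently, stays pending,
// then completes on the third attempt.
func runDemo() (Result, int) {
	attempts := 0
	cfg := pollConfig{MaxWait: 2 * time.Second, InitialDelay: 5 * time.Millisecond, MaxDelay: 40 * time.Millisecond}
	res, _ := poll(context.Background(), cfg, func(context.Context) (Result, error) {
		attempts++
		switch attempts {
		case 1:
			return StillPending, errors.New("503 from upstream") // transient
		case 2:
			return StillPending, nil
		default:
			return Done, nil
		}
	})
	return res, attempts
}

// runTimeout shows the deadline path: fn never completes.
func runTimeout() Result {
	cfg := pollConfig{MaxWait: 30 * time.Millisecond, InitialDelay: 5 * time.Millisecond, MaxDelay: 10 * time.Millisecond}
	res, _ := poll(context.Background(), cfg, func(context.Context) (Result, error) {
		return StillPending, nil
	})
	return res
}

func main() {
	res, attempts := runDemo()
	fmt.Println(res == Done, attempts)
	fmt.Println(runTimeout() == StillPending)
}
```

In this sketch a transient error and a "not ready yet" answer are
handled the same way: both return StillPending and wait for the next
backoff step, while Done and Failed end the call immediately.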

Per-connector Config field added:
- DigiCert.PollMaxWaitSeconds (env CERTCTL_DIGICERT_POLL_MAX_WAIT_SECONDS)
- Sectigo.PollMaxWaitSeconds (env CERTCTL_SECTIGO_POLL_MAX_WAIT_SECONDS)
- Entrust.PollMaxWaitSeconds (env CERTCTL_ENTRUST_POLL_MAX_WAIT_SECONDS)
- GlobalSign.PollMaxWaitSeconds (env CERTCTL_GLOBALSIGN_POLL_MAX_WAIT_SECONDS)

internal/config/config.go env-var loaders updated for all four. Default
is 600 seconds (10 minutes); zero or a negative value falls back to the
asyncpoll package default (asyncpoll.DefaultMaxWait).
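
How an env value could turn into an effective deadline is sketched
below. The helper names are hypothetical, the real loader in
internal/config/config.go is not shown in this commit view, and the
10-minute stand-in for asyncpoll.DefaultMaxWait as well as the handling
of unparsable values are assumptions.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"time"
)

// defaultPollMaxWaitSeconds is the 600-second default from the commit message.
const defaultPollMaxWaitSeconds = 600

// packageDefaultMaxWait stands in for asyncpoll.DefaultMaxWait. Its real
// value is not shown in the source; 10 minutes is an assumption.
const packageDefaultMaxWait = 10 * time.Minute

// parsePollMaxWaitSeconds is a hypothetical loader helper. set reports
// whether the env var was present at all.
func parsePollMaxWaitSeconds(raw string, set bool) int {
	if !set {
		return defaultPollMaxWaitSeconds
	}
	n, err := strconv.Atoi(raw)
	if err != nil {
		// Assumption: an unparsable value falls back to the default.
		return defaultPollMaxWaitSeconds
	}
	return n
}

// effectiveMaxWait mirrors the pollMaxWait helper in the Sectigo diff:
// non-positive values fall back to the package default.
func effectiveMaxWait(seconds int) time.Duration {
	if seconds <= 0 {
		return packageDefaultMaxWait
	}
	return time.Duration(seconds) * time.Second
}

func main() {
	raw, set := os.LookupEnv("CERTCTL_ENTRUST_POLL_MAX_WAIT_SECONDS")
	secs := parsePollMaxWaitSeconds(raw, set)
	fmt.Println("effective max wait:", effectiveMaxWait(secs))
}
```

With CERTCTL_ENTRUST_POLL_MAX_WAIT_SECONDS=86400 this sketch yields a
24-hour budget, which is the approval-window setting suggested for
Entrust above.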

Test-helper updates: every existing test that exercises the pending
branch (collectNotReady, AWAITING_APPROVAL, status="pending", etc.)
now sets PollMaxWaitSeconds=1 in its Config so the test doesn't block
on the production-default 10-minute deadline. Tests that exercise
permanent-error branches (404, 401, malformed JSON, etc.) continue
to return immediately.

Test sites updated:
- buildSectigoConnector helper + GetOrderStatus_CollectNotReady test
- buildEntrustConnector helper + GetOrderStatus_Pending test
- buildGlobalsignConnector helper + GetOrderStatus_Pending test +
  the GetHTTPClient_NoMTLSCertPaths test (network failure now rides
  the backoff schedule rather than returning immediately)

Documentation:
- docs/async-polling.md: new operator reference covering the backoff
  schedule, status-code triage, the four env vars, failure modes, and
  where the implementation lives. Audit blocker citation included.
- docs/connectors.md: per-issuer sections for DigiCert, Sectigo,
  Entrust, GlobalSign each gain the PollMaxWaitSeconds env var row
  and a cross-link to async-polling.md.

Lint cleanup: simplified the isPermanentStatusError branch to satisfy
staticcheck S1008 (single-line return for a final boolean check).
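
The triage rule and the S1008 shape can be illustrated together. Note
that the Sectigo helper in this commit matches on error message text;
the typed statusError and the errParse sentinel below are a hypothetical
alternative used only to keep the example self-contained, not the
connector's actual error types.

```go
package main

import (
	"errors"
	"fmt"
)

// statusError is a hypothetical typed error carrying the HTTP status code.
type statusError struct{ Code int }

func (e *statusError) Error() string { return fmt.Sprintf("status returned %d", e.Code) }

// errParse is a hypothetical sentinel for a body-parse failure.
var errParse = errors.New("parse status response")

// isPermanent reports whether polling should stop: parse failures and
// 4xx other than 429 are permanent; everything else is transient.
func isPermanent(err error) bool {
	if errors.Is(err, errParse) {
		return true
	}
	var se *statusError
	// Single-expression return, the shape staticcheck S1008 asks for in
	// place of `if cond { return true }; return false`.
	return errors.As(err, &se) && se.Code >= 400 && se.Code < 500 && se.Code != 429
}

// wrapStatus simulates a caller wrapping the typed error with context.
func wrapStatus(code int) error {
	return fmt.Errorf("check status: %w", &statusError{Code: code})
}

func main() {
	for _, code := range []int{404, 401, 429, 503} {
		fmt.Println(code, isPermanent(wrapStatus(code)))
	}
	fmt.Println("parse", isPermanent(fmt.Errorf("decode: %w", errParse)))
	fmt.Println("network", isPermanent(errors.New("connection reset")))
}
```

Because errors.As and errors.Is look through %w wrapping, this variant
keeps working if the wrap message changes, which string matching on
"returned 404" does not.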

Verified locally:
- gofmt -l . clean
- go vet ./... clean
- staticcheck ./... clean
- golangci-lint run --timeout 5m ./... → 0 issues
- go test -short -count=1 across all 4 connector packages + config + asyncpoll: green

Audit reference: cowork/issuer-coverage-audit-2026-05-01/RESULTS.md
Top-10 fix #5 — Phase 2.
shankar0123
2026-05-02 02:41:36 +00:00
parent 633a10aa4e
commit 0509790325
12 changed files with 523 additions and 122 deletions
+121 -19
@@ -37,6 +37,7 @@ import (
"time"
"github.com/shankar0123/certctl/internal/connector/issuer"
+"github.com/shankar0123/certctl/internal/connector/issuer/asyncpoll"
)
// Config represents the Sectigo Certificate Manager issuer connector configuration.
@@ -69,6 +70,25 @@ type Config struct {
// Default: "https://cert-manager.com/api".
// Set via CERTCTL_SECTIGO_BASE_URL environment variable.
BaseURL string `json:"base_url"`
+// PollMaxWaitSeconds caps how long GetOrderStatus blocks doing
+// internal exponential-backoff polling before returning
+// StillPending. Default 600 (10 minutes). Sectigo's
+// collectNotReady sentinel maps to StillPending so recently-
+// issued certs that aren't yet retrievable get the backoff
+// schedule rather than tight-loop polling.
+//
+// Set via CERTCTL_SECTIGO_POLL_MAX_WAIT_SECONDS. Audit fix #5.
+PollMaxWaitSeconds int `json:"poll_max_wait_seconds,omitempty"`
}
+// pollMaxWait returns the configured PollMaxWait as a time.Duration,
+// or the asyncpoll package default if unset.
+func (c *Config) pollMaxWait() time.Duration {
+if c.PollMaxWaitSeconds <= 0 {
+return asyncpoll.DefaultMaxWait
+}
+return time.Duration(c.PollMaxWaitSeconds) * time.Second
+}
// Connector implements the issuer.Connector interface for Sectigo Certificate Manager.
@@ -355,30 +375,94 @@ func (c *Connector) RevokeCertificate(ctx context.Context, request issuer.Revoca
return nil
}
-// GetOrderStatus checks the status of a Sectigo certificate enrollment.
-// If the enrollment is "Issued", downloads the certificate and returns it.
-// If still pending, returns pending status for continued polling.
+// GetOrderStatus checks the status of a Sectigo certificate enrollment
+// using bounded internal polling (asyncpoll.Poll). One call blocks for
+// up to PollMaxWait (default 10m) doing exponential backoff with
+// jitter; returns Done with the cert, Failed with the rejection
+// reason, or StillPending if the deadline expires (caller can
+// re-invoke).
+//
+// Audit fix #5 Phase 2: previously each scheduler tick made one HTTP
+// call against an unready order. Sectigo's collectNotReady sentinel
+// (cert approved but not yet generated) now maps to StillPending and
+// rides the backoff schedule rather than tight-loop polling.
func (c *Connector) GetOrderStatus(ctx context.Context, orderID string) (*issuer.OrderStatus, error) {
c.logger.Debug("checking Sectigo enrollment status", "ssl_id", orderID)
-// Parse sslId from string
+// Parse sslId from string once at entry — invalid ID is a
+// permanent error, no point polling.
var sslId int
if _, err := fmt.Sscanf(orderID, "%d", &sslId); err != nil {
return nil, fmt.Errorf("invalid Sectigo ssl_id: %s", orderID)
}
+var done *issuer.OrderStatus
+var lastPendingMsg string
+cfg := asyncpoll.Config{MaxWait: c.config.pollMaxWait()}
+res, err := asyncpoll.Poll(ctx, cfg, func(ctx context.Context) (asyncpoll.Result, error) {
+status, result, pollErr := c.pollEnrollmentOnce(ctx, sslId, orderID)
+if status != nil {
+switch result {
+case asyncpoll.Done:
+done = status
+case asyncpoll.StillPending:
+if status.Message != nil {
+lastPendingMsg = *status.Message
+}
+}
+}
+return result, pollErr
+})
+now := time.Now()
+switch res {
+case asyncpoll.Done:
+return done, nil
+case asyncpoll.Failed:
+return nil, err
+default:
+msg := lastPendingMsg
+if msg == "" {
+msg = fmt.Sprintf("enrollment %s still pending after PollMaxWait", orderID)
+}
+return &issuer.OrderStatus{
+OrderID: orderID,
+Status: "pending",
+Message: &msg,
+UpdatedAt: now,
+}, nil
+}
+}
+// pollEnrollmentOnce makes one HTTP GET against the Sectigo SCM
+// status endpoint and translates the response into an asyncpoll.Result
+// plus (when applicable) a populated OrderStatus.
+//
+// collectNotReady is the load-bearing Sectigo sentinel: even when
+// the SCM status endpoint reports "Issued", the cert may not yet be
+// retrievable from the collect endpoint. We treat this as
+// StillPending so the backoff schedule applies.
+func (c *Connector) pollEnrollmentOnce(ctx context.Context, sslId int, orderID string) (*issuer.OrderStatus, asyncpoll.Result, error) {
status, err := c.checkStatus(ctx, sslId)
if err != nil {
-return nil, err
+// Triage by examining the wrapped status code: 4xx (not 429)
+// is permanent (404 = enrollment doesn't exist, 400 = bad
+// request, 401/403 = auth). Parse failures are also
+// permanent — the upstream's response shape is broken.
+// 5xx / 429 / network errors are transient and ride the
+// backoff schedule.
+if isPermanentStatusError(err) {
+return nil, asyncpoll.Failed, err
+}
+return nil, asyncpoll.StillPending, err
}
now := time.Now()
switch status.Status {
case "Issued":
certPEM, chainPEM, serial, notBefore, notAfter, collectErr := c.collectCertificate(ctx, sslId)
if collectErr != nil {
// Cert approved but not yet generated — treat as pending
if isCollectNotReady(collectErr) {
msg := fmt.Sprintf("enrollment %s is issued but certificate not yet generated", orderID)
return &issuer.OrderStatus{
@@ -386,15 +470,11 @@ func (c *Connector) GetOrderStatus(ctx context.Context, orderID string) (*issuer
Status: "pending",
Message: &msg,
UpdatedAt: now,
-}, nil
+}, asyncpoll.StillPending, nil
}
-return nil, fmt.Errorf("failed to collect certificate: %w", collectErr)
+return nil, asyncpoll.Failed, fmt.Errorf("failed to collect certificate: %w", collectErr)
}
-c.logger.Info("Sectigo enrollment completed",
-"ssl_id", orderID,
-"serial", serial)
+c.logger.Info("Sectigo enrollment completed", "ssl_id", orderID, "serial", serial)
return &issuer.OrderStatus{
OrderID: orderID,
Status: "completed",
@@ -404,7 +484,7 @@ func (c *Connector) GetOrderStatus(ctx context.Context, orderID string) (*issuer
NotBefore: &notBefore,
NotAfter: &notAfter,
UpdatedAt: now,
-}, nil
+}, asyncpoll.Done, nil
case "Applied", "Pending":
msg := fmt.Sprintf("enrollment %s is %s", orderID, status.Status)
@@ -413,7 +493,7 @@ func (c *Connector) GetOrderStatus(ctx context.Context, orderID string) (*issuer
Status: "pending",
Message: &msg,
UpdatedAt: now,
-}, nil
+}, asyncpoll.StillPending, nil
case "Rejected":
msg := fmt.Sprintf("enrollment %s was rejected", orderID)
@@ -422,7 +502,7 @@ func (c *Connector) GetOrderStatus(ctx context.Context, orderID string) (*issuer
Status: "failed",
Message: &msg,
UpdatedAt: now,
-}, nil
+}, asyncpoll.Done, nil
case "Revoked", "Expired", "Not Enrolled":
msg := fmt.Sprintf("enrollment %s has status: %s", orderID, status.Status)
@@ -431,7 +511,7 @@ func (c *Connector) GetOrderStatus(ctx context.Context, orderID string) (*issuer
Status: "failed",
Message: &msg,
UpdatedAt: now,
-}, nil
+}, asyncpoll.Done, nil
default:
msg := fmt.Sprintf("unknown enrollment status: %s", status.Status)
@@ -440,10 +520,32 @@ func (c *Connector) GetOrderStatus(ctx context.Context, orderID string) (*issuer
Status: "pending",
Message: &msg,
UpdatedAt: now,
-}, nil
+}, asyncpoll.StillPending, nil
}
}
+// isPermanentStatusError reports whether an error returned from
+// checkStatus represents a permanent client-side failure (4xx other
+// than 429, or a body-parse failure). Used by pollEnrollmentOnce to
+// distinguish "stop polling" from "transient; keep polling".
+//
+// Heuristic-based on the error wrap shape: checkStatus formats HTTP
+// status errors as "Sectigo status returned %d:" so we can grep for
+// known permanent codes. Parse-failure errors contain "parse status
+// response".
+func isPermanentStatusError(err error) bool {
+if err == nil {
+return false
+}
+msg := err.Error()
+for _, code := range []string{"returned 400", "returned 401", "returned 403", "returned 404"} {
+if strings.Contains(msg, code) {
+return true
+}
+}
+return strings.Contains(msg, "parse status response")
+}
// checkStatus retrieves the enrollment status from Sectigo.
func (c *Connector) checkStatus(ctx context.Context, sslId int) (*statusResponse, error) {
statusURL := fmt.Sprintf("%s/ssl/v1/%d", c.config.BaseURL, sslId)
@@ -20,13 +20,14 @@ func buildSectigoConnector(t *testing.T, baseURL string) *sectigo.Connector {
t.Helper()
c := sectigo.New(nil, slog.Default())
cfg := sectigo.Config{
-BaseURL: baseURL,
-CustomerURI: "tcust",
-Login: "user",
-Password: "pw",
-CertType: 1,
-OrgID: 2,
-Term: 365,
+BaseURL: baseURL,
+CustomerURI: "tcust",
+Login: "user",
+Password: "pw",
+CertType: 1,
+OrgID: 2,
+Term: 365,
+PollMaxWaitSeconds: 1, // keep async-pending tests fast
}
raw, _ := json.Marshal(cfg)
if err := c.ValidateConfig(context.Background(), raw); err != nil {
@@ -381,11 +381,12 @@ func TestSectigoConnector(t *testing.T) {
defer srv.Close()
config := &sectigo.Config{
-CustomerURI: "test-org",
-Login: "api-user",
-Password: "api-pass",
-OrgID: 12345,
-BaseURL: srv.URL,
+CustomerURI: "test-org",
+Login: "api-user",
+Password: "api-pass",
+OrgID: 12345,
+BaseURL: srv.URL,
+PollMaxWaitSeconds: 1, // keep pending tests fast
}
connector := sectigo.New(config, logger)
@@ -449,11 +450,12 @@ func TestSectigoConnector(t *testing.T) {
defer srv.Close()
config := &sectigo.Config{
-CustomerURI: "test-org",
-Login: "api-user",
-Password: "api-pass",
-OrgID: 12345,
-BaseURL: srv.URL,
+CustomerURI: "test-org",
+Login: "api-user",
+Password: "api-pass",
+OrgID: 12345,
+BaseURL: srv.URL,
+PollMaxWaitSeconds: 1, // keep collect-not-ready tests fast
}
connector := sectigo.New(config, logger)