docs: factuality sweep — fix 3 broken links + 12 count claims (audit findings 2026-05-05)

Per the cowork/docs-audit-2026-05-05/ end-to-end factuality audit (20 confirmed findings across 76 docs, 7 parallel subagents + audit-of-the-audit). Hot + Warm tier fixes ship here; STALE findings (qa-test-suite.md test-count snapshot) need 'make qa-stats' which is operator-side. BROKEN links repaired (3): - docs/reference/api.md L195: [Quick Start](quickstart.md) → ../getting-started/quickstart.md (404 pre-fix) - docs/reference/api.md L196: [Connector Guide](connectors.md) → connectors/index.md (Phase 4 rename, was 404 pre-fix) - docs/reference/protocols/scep-intune.md L377: [legacy-est-scep.md](legacy-est-scep.md) → scep-server.md (file was deleted in Phase 7 commit e9b1510) INCORRECT count claims repaired (12): - api.md L5 + L18-19 + L155: '78 API operations' / '# 78' / 'all 78 documented operations' → re-derive via grep -cE '^\s+operationId:' (actual at HEAD: 144) - architecture.md L66 (Mermaid label) + L502 + L1047 + L1253: '8 always-on + 4 optional loops' / '12-loop topology' → 9 always-on + 5 opt-in loops (14 total). Always-on/opt-in breakdown derived from cmd/server/main.go startup wiring: always-on are agentHealthCheck, crlGeneration, jobProcessor, jobRetry, jobTimeout, notificationProcess, notificationRetry, renewalCheck, shortLivedExpiryCheck (9); opt-in are networkScan, digest, healthCheck, cloudDiscovery, acmeGC (5). Re-derive count via grep -cE '^func \(s \*Scheduler\) [a-zA-Z]+Loop' internal/scheduler/scheduler.go. - configuration.md L31: '12 loops, 8 always-on + 4 opt-in' → '14 loops, 9 always-on + 5 opt-in'. Self-introduced regression from commit 3275f9f (2026-05-05). - mcp.md L11 + L65: 'all 78 API endpoints' / '78 available tools' → re-derive via grep -cE 'mcp\.AddTool\(' (actual at HEAD: 87 MCP tools, 144 API operations). - connectors/index.md L111: '9 built-in' issuer connectors → '12 built-in', extending the inline enumeration to include Entrust, GlobalSign, EJBCA (which had been added since the L111 prose was written). Local-CA framing extended to mention tree mode + ADCS sub-CA mode-doc. - connectors/index.md L112: '14 built-in' target connectors → '15 built-in', adding AWS ACM target + Azure Key Vault target (which had been added since the L112 prose was written). - why-certctl.md L37 + the inline list: 'Nine issuer connectors ship today' → 'Twelve issuer connectors', adding AWS ACM PCA, Entrust, GlobalSign, EJBCA to the list and removing the misleading 'EST enrollment' bullet (EST is a protocol surface, not an issuer; clarified in trailing note). - why-certctl.md L66: '13 deployment targets' → '15', adding Kubernetes Secrets, AWS ACM, and Azure KV to the inline list. - why-certctl.md L92: 'supports 9 issuer types' → '12 issuer types'. - quickstart.md L135: '35 demo certificates across 5 issuers' → re-derive cert count via 'grep -oE "mc-[a-z0-9_-]+" migrations/seed_demo.sql | sort -u | wc -l' (actual: 32, matches README L86; quickstart was off-by-3). - quickstart.md L452 (Demo Data Reference table): Certificates '35' → '32' (matches the cert count from seed_demo.sql). Verification: - grep confirms no remaining stale refs across the touched files (8 files, 31 insertions / 28 deletions). - All 24 ci-guards/*.sh pass locally. - The audit's STALE findings (S-1, S-2 qa-test-suite.md Bundle-P snapshot) are operator-side: run 'make qa-stats' to refresh the Test Suite Health table. Companion: cowork/docs-audit-2026-05-05/RESULTS.md captures the full audit with subagent false positives and missed findings called out.
2026-06-07 12:41:30 +00:00 · 2026-05-05 06:15:35 +00:00
parent d809874fa1
commit 622cd29f20
8 changed files with 31 additions and 28 deletions
@@ -63,7 +63,7 @@ flowchart TB
        API["REST API\n(Go net/http, :8443)"]
        SVC["Service Layer"]
        REPO["Repository Layer\n(database/sql + lib/pq)"]
-        SCHED["Background Scheduler\n8 always-on + 4 optional loops"]
+        SCHED["Background Scheduler\n9 always-on + 5 opt-in loops"]
        DASH["Web Dashboard\n(React SPA)"]
    end

@@ -499,7 +499,7 @@ For incident-response events requiring fleet-wide revocation (key compromise, CA

 ### 4. Automatic Renewal

-The control plane runs a scheduler with 8 always-on loops plus up to 4 optional loops (enabled by configuration). `internal/scheduler/scheduler.go:262-265` is the authoritative count.
+The control plane runs a scheduler with 9 always-on loops plus up to 5 opt-in loops (enabled by configuration). Re-derive the count via `grep -cE '^func \(s \*Scheduler\) [a-zA-Z]+Loop' internal/scheduler/scheduler.go`; the opt-in gating lives in `cmd/server/main.go` startup wiring (`cfg.NetworkScan.Enabled`, `digestService != nil`, `healthCheckService != nil`, `cloudDiscoveryService != nil`, `cfg.ACMEServer.Enabled && cfg.ACMEServer.GCInterval > 0`).

 ```mermaid
 flowchart LR
@@ -1044,7 +1044,7 @@ For deployments that need JWT/OIDC/mTLS, the standard pattern is to put an authe

 ### Concurrency Safety

-The background scheduler uses `sync/atomic.Bool` idempotency guards on every loop (8 always-on plus up to 4 optional) — if a tick fires while the previous iteration is still running, it skips. A `sync.WaitGroup` tracks all in-flight goroutines. `WaitForCompletion(timeout)` blocks during shutdown until all work finishes or the timeout expires, preventing state corruption from mid-flight database operations during process exit.
+The background scheduler uses `sync/atomic.Bool` idempotency guards on every loop (9 always-on plus up to 5 opt-in) — if a tick fires while the previous iteration is still running, it skips. A `sync.WaitGroup` tracks all in-flight goroutines. `WaitForCompletion(timeout)` blocks during shutdown until all work finishes or the timeout expires, preventing state corruption from mid-flight database operations during process exit.

 The job-processor tick fans the per-job work out across up to `CERTCTL_RENEWAL_CONCURRENCY` goroutines (default 25), gated by `golang.org/x/sync/semaphore.Weighted`. The cap is the operator's lever for "how many concurrent CA calls per scheduler tick" — operators with permissive upstream limits and large fleets (>10k certs) can bump to 100; operators with strict limits or async-CA-heavy fleets should stay at 25 or lower. Values ≤ 0 normalise to 1 (sequential). The Acquire is ctx-aware so a shutdown-driven ctx cancel interrupts the dispatch loop promptly; in-flight goroutines drain via Wait before the tick returns. Closes the #9 acquisition-readiness blocker from the 2026-05-01 issuer coverage audit (pre-fix the fan-out had no cap, so a 5,000-cert sweep tripped DigiCert / Entrust / Sectigo rate limits and the next tick re-fanned-out the same calls).

@@ -1250,7 +1250,7 @@ flowchart TB

 1. **Pluggable sources** — Each cloud provider implements the `DiscoverySource` interface (Name, Type, Discover, ValidateConfig). Three built-in sources: AWS Secrets Manager, Azure Key Vault, GCP Secret Manager
 2. **CloudDiscoveryService orchestrator** — Iterates registered sources, calls `Discover()` on each, feeds reports into `ProcessDiscoveryReport()`. Errors from one source don't prevent other sources from running
-3. **Scheduler integration** — opt-in cloud discovery scheduler loop (6h default; see `docs/architecture.md` 12-loop topology), runs immediately on startup, `atomic.Bool` idempotency guard
+3. **Scheduler integration** — opt-in cloud discovery scheduler loop (6h default; one of the 14 loops in the scheduler topology — see the Background Scheduler section above), runs immediately on startup, `atomic.Bool` idempotency guard
 4. **Sentinel agents** — Each source uses its own sentinel agent ID (`cloud-aws-sm`, `cloud-azure-kv`, `cloud-gcp-sm`) for dedup and triage filtering
 5. **Source path format** — `aws-sm://{region}/{secret}`, `azure-kv://{cert-name}/{version}`, `gcp-sm://{project}/{secret}`
 6. **No new schema** — Reuses existing `discovered_certificates` and `discovery_scans` tables. Sentinel agent IDs leverage existing `(fingerprint_sha256, agent_id, source_path)` dedup constraint