mirror of
https://github.com/shankar0123/certctl.git
synced 2026-06-07 15:21:35 +00:00
feat: M18b Filesystem Certificate Discovery — agent scanning, server dedup, triage API
Agent-side:
- Filesystem scanner walks configured directories (CERTCTL_DISCOVERY_DIRS)
- Parses PEM (.pem, .crt, .cer, .cert) and DER (.der) certificate files
- Extracts CN, SANs, serial, issuer/subject DN, validity, key info, SHA-256 fingerprint
- Reports discoveries to control plane on startup + every 6 hours
- Skips files >1MB and private key files
Server-side:
- Migration 000006: discovered_certificates + discovery_scans tables
- Domain model: DiscoveredCertificate, DiscoveryScan, DiscoveryReport
- Three triage states: Unmanaged, Managed (claimed), Dismissed
- Repository with upsert dedup (fingerprint + agent + path)
- Service layer: process reports, claim, dismiss, list, summary
- 7 new API endpoints (84 total):
POST /agents/{id}/discoveries, GET /discovered-certificates,
GET /discovered-certificates/{id}, POST .../claim, POST .../dismiss,
GET /discovery-scans, GET /discovery-summary
- Audit trail: scan_completed, cert_claimed, cert_dismissed events
Tests: 28 new test functions (domain, handler, service layers)
Docs: README, quickstart, demo-guide, demo-advanced, architecture,
concepts, connectors, features.md all updated
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
@@ -703,6 +703,61 @@ flowchart TB
|
||||
|
||||
For production, you would also add an ingress controller, TLS termination for the certctl API itself, and external PostgreSQL (RDS, Cloud SQL, etc.).
|
||||
|
||||
## Discovery Data Flow (M18b)
|
||||
|
||||
Certificate discovery enables operators to build a complete inventory of existing certificates before managing them with certctl. Here's how data flows through the system:
|
||||
|
||||
```mermaid
|
||||
flowchart TB
|
||||
AGENT["certctl-agent\n(on infrastructure)"]
|
||||
SCAN["Filesystem Scanner\n(CERTCTL_DISCOVERY_DIRS)"]
|
||||
EXTRACT["Extract Metadata\n(CN, SANs, serial, issuer, expiry, fingerprint)"]
|
||||
REPORT["POST /api/v1/agents/{id}/discoveries\n(submit scan results)"]
|
||||
HANDLER["Discovery Handler\n(parse request)"]
|
||||
SERVICE["Discovery Service\n(ProcessDiscoveryReport)"]
|
||||
REPO["Discovery Repository\n(upsert with fingerprint dedup)"]
|
||||
DB["PostgreSQL\ndiscovered_certificates\ndiscovery_scans tables"]
|
||||
AUDIT["Audit Service\n(RecordDiscoveryScanCompleted)"]
|
||||
API_LIST["GET /api/v1/discovered-certificates\n(list for triage)"]
|
||||
API_CLAIM["POST /discovered-certificates/{id}/claim\n(operator claims cert)"]
|
||||
API_DISMISS["POST /discovered-certificates/{id}/dismiss\n(operator dismisses)"]
|
||||
UPDATE_STATUS["Update Status\n(Unmanaged → Managed/Dismissed)"]
|
||||
|
||||
AGENT -->|"Scan loop\n(startup + 6h)"| SCAN
|
||||
SCAN --> EXTRACT
|
||||
EXTRACT --> REPORT
|
||||
REPORT --> HANDLER
|
||||
HANDLER --> SERVICE
|
||||
SERVICE --> REPO
|
||||
REPO -->|"Dedup by fingerprint\n+ agent + path"| DB
|
||||
SERVICE --> AUDIT
|
||||
AUDIT -->|"discovery_scan_completed"| DB
|
||||
DB -->|"query unmanaged"| API_LIST
|
||||
API_LIST -->|"operator reviews"| API_CLAIM
|
||||
API_LIST -->|"operator reviews"| API_DISMISS
|
||||
API_CLAIM --> UPDATE_STATUS
|
||||
API_DISMISS --> UPDATE_STATUS
|
||||
UPDATE_STATUS -->|"RecordDiscoveryCertClaimed\nRecordDiscoveryCertDismissed"| AUDIT
|
||||
AUDIT --> DB
|
||||
```
|
||||
|
||||
**Key steps:**
|
||||
|
||||
1. **Agent-side discovery** — Agent scans `CERTCTL_DISCOVERY_DIRS` on startup and every 6 hours, walking directories recursively and parsing PEM/DER files
|
||||
2. **Metadata extraction** — For each certificate found, extract: common name, SANs, serial number, issuer DN, subject DN, expiration date, key algorithm, key size, is_ca flag, SHA-256 fingerprint (used as dedup key)
|
||||
3. **Server submission** — Agent POSTs scan results as `DiscoveryReport` to `POST /api/v1/agents/{id}/discoveries`
|
||||
4. **Deduplication** — Server uses fingerprint + agent ID + filesystem path as unique key; prevents duplicate records of the same cert on the same agent
|
||||
5. **Storage** — Records stored in `discovered_certificates` table with status = "Unmanaged"
|
||||
6. **Audit** — `discovery_scan_completed` event logged with agent ID, cert count, scan timestamp
|
||||
7. **Operator triage** — Operator queries `GET /api/v1/discovered-certificates?status=Unmanaged` to see new findings
|
||||
8. **Claim or dismiss** — For each unmanaged cert, operator either:
|
||||
- **Claims it** via `POST /discovered-certificates/{id}/claim` — links to existing managed cert or creates new enrollment
|
||||
- **Dismisses it** via `POST /discovered-certificates/{id}/dismiss` — removes from triage, marked as "Dismissed"
|
||||
9. **Status tracking** — `discovery_cert_claimed` and `discovery_cert_dismissed` events audit the operator's decision
|
||||
10. **Summary** — `GET /api/v1/discovery-summary` returns count of Unmanaged, Managed, and Dismissed certs (useful for compliance reporting)
|
||||
|
||||
This data flow is pull-based and non-blocking. Agents discover at their own pace; the server stores results for later review. There's no pressure to claim or dismiss; operators can leave certificates in "Unmanaged" status indefinitely.
|
||||
|
||||
## Testing Strategy
|
||||
|
||||
certctl uses a layered testing approach aligned with the handler → service → repository architecture, with 860+ tests across five layers (service, handler, integration, connector, and frontend). The goal is high-confidence regression prevention at the service and handler layers, where the most complex business logic lives, combined with integration tests that exercise the full request path from HTTP to database.
|
||||
|
||||
Reference in New Issue
Block a user