Complete M1, M1.1, M2: end-to-end lifecycle, agent deployment, ACME v2

- Wire issuer connector end-to-end with IssuerConnectorAdapter (dependency inversion)
- Renewal/issuance job processor: RSA key + CSR generation, Local CA signing, cert version storage
- Agent work API (GET /agents/{id}/work) and job status API (POST /agents/{id}/jobs/{job_id}/status)
- Agent-side deployment: WorkItem enrichment with target type/config, NGINX/F5/IIS connector invocation
- Full ACME v2 implementation: HTTP-01 challenge solving, account registration, order lifecycle
- Update all docs (README, architecture, connectors, demo-advanced, quickstart) for M1-M2
- Fix go vet warning in deployment.go (non-constant format string)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
shankar0123
2026-03-14 23:49:45 -04:00
parent 73d5d848d5
commit ae67b10708
16 changed files with 985 additions and 201 deletions
+62 -12
@@ -8,7 +8,7 @@ New to certificates? Read the [Concepts Guide](concepts.md) first.
### Design Principles
1. **Zero Private Key Exposure** — Private keys are generated and managed only on agents, never sent to the control plane
1. **Private Key Isolation (V2+ goal)** — In V1, the Local CA generates server-side keys for simplicity. V2+ moves key generation to agents so private keys never touch the control plane
2. **Decoupled Operations** — Agents operate autonomously; the control plane coordinates but doesn't block agent function
3. **Audit-First** — Complete traceability of all issuance, deployment, and rotation events
4. **Connector Architecture** — Pluggable issuers, targets, and notifiers for extensibility
@@ -73,9 +73,9 @@ The server exposes a REST API under `/api/v1/` and optionally serves the web das
### Agents
Lightweight Go processes that run on or near your infrastructure. An agent generates private keys locally, creates CSRs, receives signed certificates from the control plane, deploys them to target systems, and reports status back. Agents communicate with the control plane via HTTP and authenticate with API keys.
Lightweight Go processes that run on or near your infrastructure. Agents poll the control plane for pending deployment jobs, fetch signed certificates, deploy them to target systems, and report job status back. In V2+, agents will also generate private keys locally and create CSRs. Agents communicate with the control plane via HTTP and authenticate with API keys.
The agent runs two background loops: a heartbeat (every 60 seconds) to signal it's alive, and a work poll (every 30 seconds) to check for pending jobs.
The agent runs two background loops: a heartbeat (every 60 seconds) to signal it's alive, and a work poll (every 30 seconds) to check for pending deployment jobs via `GET /api/v1/agents/{id}/work`. When a job is found, the agent fetches the certificate, executes the deployment, and reports status via `POST /api/v1/agents/{id}/jobs/{job_id}/status`.
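The two-loop structure described above can be sketched with `time.Ticker` and a shared `context`. This is only an illustration, not the agent's actual code: `runLoop`, `runAgent`, and `demoTicks` are hypothetical names, and the demo uses millisecond intervals in place of the 60-second and 30-second values so that it finishes quickly.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// runLoop calls fn once per interval until ctx is cancelled.
func runLoop(ctx context.Context, interval time.Duration, fn func()) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			fn()
		}
	}
}

// runAgent starts the heartbeat and work-poll loops and blocks until ctx ends.
func runAgent(ctx context.Context, heartbeatEvery, pollEvery time.Duration, heartbeat, poll func()) {
	var wg sync.WaitGroup
	wg.Add(2)
	go func() { defer wg.Done(); runLoop(ctx, heartbeatEvery, heartbeat) }()
	go func() { defer wg.Done(); runLoop(ctx, pollEvery, poll) }()
	wg.Wait()
}

// demoTicks runs both loops briefly with millisecond intervals (standing in
// for 60s and 30s) and returns how often each callback fired.
func demoTicks() (heartbeats, polls int64) {
	ctx, cancel := context.WithTimeout(context.Background(), 300*time.Millisecond)
	defer cancel()
	runAgent(ctx, 40*time.Millisecond, 20*time.Millisecond,
		func() { atomic.AddInt64(&heartbeats, 1) },
		func() { atomic.AddInt64(&polls, 1) },
	)
	return heartbeats, polls
}

func main() {
	hb, polls := demoTicks()
	fmt.Printf("heartbeats=%d polls=%d\n", hb, polls)
}
```

Keeping the loops independent means a slow deployment in the poll callback does not delay heartbeats, which is one plausible reason to run them separately.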
### Web Dashboard
@@ -223,7 +223,38 @@ sequenceDiagram
API-->>U: 201 Created + JSON body
```
### 2. Agent Requests Certificate (CSR → Issuance)
### 2. Certificate Issuance
#### V1: Server-Side Key Generation (Local CA)
In V1, the control plane generates keys and CSRs server-side for the Local CA. This simplifies the initial implementation — the full agent-side key generation flow is planned for V2+.
```mermaid
sequenceDiagram
participant U as User / Scheduler
participant API as Control Plane API
participant SVC as RenewalService
participant ISS as IssuerConnector
participant DB as PostgreSQL
U->>API: POST /api/v1/certificates/{id}/renew
API->>SVC: ProcessRenewalJob(job)
SVC->>SVC: Generate RSA-2048 key pair (server-side)
SVC->>SVC: Create CSR with CN + SANs
SVC->>ISS: IssueCertificate(commonName, sans, csrPEM)
ISS-->>SVC: IssuanceResult{cert_pem, chain_pem, serial, not_after}
SVC->>SVC: Compute SHA-256 fingerprint
SVC->>DB: INSERT INTO certificate_versions (PEM chain + CSR)
SVC->>DB: UPDATE managed_certificates SET status='Active', expires_at
SVC->>DB: INSERT INTO audit_events
SVC->>DB: CREATE deployment jobs for all mapped targets
Note over SVC: Deployment jobs picked up by agents<br/>via GET /api/v1/agents/{id}/work
```
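The key, CSR, and fingerprint steps in the V1 diagram can be sketched with the Go standard library. This is a minimal illustration under stated assumptions, not the code in `RenewalService.ProcessRenewalJob`: the function names are hypothetical, and the fingerprint helper is applied here to the CSR's DER bytes only to show the hashing step (the diagram computes it for the issued certificate).

```go
package main

import (
	"crypto/rand"
	"crypto/rsa"
	"crypto/sha256"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/hex"
	"encoding/pem"
	"errors"
	"fmt"
)

// generateKeyAndCSR creates an RSA-2048 key pair and a PEM-encoded CSR
// carrying the common name and DNS SANs.
func generateKeyAndCSR(commonName string, sans []string) (*rsa.PrivateKey, []byte, error) {
	key, err := rsa.GenerateKey(rand.Reader, 2048)
	if err != nil {
		return nil, nil, fmt.Errorf("generate key: %w", err)
	}
	tmpl := &x509.CertificateRequest{
		Subject:            pkix.Name{CommonName: commonName},
		DNSNames:           sans,
		SignatureAlgorithm: x509.SHA256WithRSA,
	}
	der, err := x509.CreateCertificateRequest(rand.Reader, tmpl, key)
	if err != nil {
		return nil, nil, fmt.Errorf("create CSR: %w", err)
	}
	csrPEM := pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE REQUEST", Bytes: der})
	return key, csrPEM, nil
}

// parseCSR decodes a PEM CSR and verifies its self-signature.
func parseCSR(csrPEM []byte) (*x509.CertificateRequest, error) {
	block, _ := pem.Decode(csrPEM)
	if block == nil || block.Type != "CERTIFICATE REQUEST" {
		return nil, errors.New("not a PEM certificate request")
	}
	csr, err := x509.ParseCertificateRequest(block.Bytes)
	if err != nil {
		return nil, fmt.Errorf("parse CSR: %w", err)
	}
	if err := csr.CheckSignature(); err != nil {
		return nil, fmt.Errorf("CSR signature: %w", err)
	}
	return csr, nil
}

// fingerprintSHA256 returns the lowercase hex SHA-256 digest of DER bytes.
func fingerprintSHA256(der []byte) string {
	sum := sha256.Sum256(der)
	return hex.EncodeToString(sum[:])
}

func main() {
	name := "demo-api.internal.example.com"
	_, csrPEM, err := generateKeyAndCSR(name, []string{name})
	if err != nil {
		panic(err)
	}
	csr, err := parseCSR(csrPEM)
	if err != nil {
		panic(err)
	}
	fmt.Println("CN:", csr.Subject.CommonName)
	fmt.Println("SANs:", csr.DNSNames)
	fmt.Println("sha256:", fingerprintSHA256(csr.Raw))
}
```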
#### V2+ (Planned): Agent-Side Key Generation
```mermaid
sequenceDiagram
@@ -232,22 +263,19 @@ sequenceDiagram
participant ISS as Issuer Connector
participant DB as PostgreSQL
A->>A: Generate RSA-2048 key pair
A->>A: Generate RSA-2048 key pair locally
A->>A: Create CSR (CN + SANs, public key only)
A->>API: POST /api/v1/agents/{id}/csr<br/>{csr_pem: "-----BEGIN..."}
A->>API: POST /api/v1/agents/{id}/csr<br/>{csr_pem, certificate_id}
API->>API: Validate CSR format
API->>ISS: IssueCertificate(IssuanceRequest{CSR})
ISS-->>API: IssuanceResult{cert_pem, chain_pem, serial, not_after}
API->>DB: INSERT INTO certificate_versions
API->>DB: UPDATE managed_certificates SET status='Active'
API->>DB: INSERT INTO audit_events
API-->>A: {certificate_pem, chain_pem}<br/>(NO private key in response)
A->>A: Store cert.pem + chain.pem locally
Note over A: key.pem stays on agent<br/>Never transmitted anywhere
A->>A: Store cert + chain locally (key never leaves agent)
A->>A: Deploy to target system
```
@@ -320,6 +348,26 @@ flowchart TB
end
```
### IssuerConnectorAdapter (Dependency Inversion)
The service layer defines its own `IssuerConnector` interface (`internal/service/renewal.go`) while the connector layer has its own `issuer.Connector` interface (`internal/connector/issuer/interface.go`). The `IssuerConnectorAdapter` (`internal/service/issuer_adapter.go`) bridges the two, translating between their request/response types. This maintains clean dependency inversion — the service package never imports the connector package directly.
```mermaid
flowchart LR
SVC["Service Layer<br/>service.IssuerConnector"] --> ADAPT["IssuerConnectorAdapter<br/>(bridges interfaces)"]
ADAPT --> CONN["Connector Layer<br/>issuer.Connector"]
CONN --> LC["Local CA"]
CONN --> ACME["ACME v2"]
```
Registration happens in `cmd/server/main.go`:
```go
localCA := local.New(nil, logger)
issuerRegistry := map[string]service.IssuerConnector{
"iss-local": service.NewIssuerConnectorAdapter(localCA),
}
```
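To make the adapter idea concrete, here is a simplified, single-file sketch. The type and field names (`ConnectorRequest`, `ConnectorResult`, `stubCA`, `demoIssue`) are assumptions for illustration, and both layers are collapsed into one package for brevity; the real interfaces in `internal/service` and `internal/connector/issuer` may differ.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Connector-layer shapes (hypothetical stand-ins for the issuer package).
type ConnectorRequest struct {
	CommonName string
	SANs       []string
	CSRPEM     string
}

type ConnectorResult struct {
	CertPEM, ChainPEM, Serial string
	NotAfter                  time.Time
}

type Connector interface {
	IssueCertificate(ctx context.Context, req ConnectorRequest) (*ConnectorResult, error)
}

// Service-layer shapes (hypothetical stand-ins for the service package).
type IssuanceResult struct {
	CertPEM, ChainPEM, Serial string
	NotAfter                  time.Time
}

type IssuerConnector interface {
	IssueCertificate(ctx context.Context, commonName string, sans []string, csrPEM string) (*IssuanceResult, error)
}

// IssuerConnectorAdapter wraps a connector-layer Connector so that it
// satisfies the service-layer IssuerConnector interface.
type IssuerConnectorAdapter struct{ inner Connector }

func NewIssuerConnectorAdapter(c Connector) *IssuerConnectorAdapter {
	return &IssuerConnectorAdapter{inner: c}
}

// IssueCertificate translates service-style arguments into a connector
// request, then maps the connector result back to the service type.
func (a *IssuerConnectorAdapter) IssueCertificate(ctx context.Context, commonName string, sans []string, csrPEM string) (*IssuanceResult, error) {
	res, err := a.inner.IssueCertificate(ctx, ConnectorRequest{CommonName: commonName, SANs: sans, CSRPEM: csrPEM})
	if err != nil {
		return nil, fmt.Errorf("issuer connector: %w", err)
	}
	return &IssuanceResult{CertPEM: res.CertPEM, ChainPEM: res.ChainPEM, Serial: res.Serial, NotAfter: res.NotAfter}, nil
}

// stubCA is a fake connector used only to exercise the adapter.
type stubCA struct{}

func (stubCA) IssueCertificate(_ context.Context, req ConnectorRequest) (*ConnectorResult, error) {
	return &ConnectorResult{
		CertPEM:  "cert-for-" + req.CommonName,
		Serial:   "01",
		NotAfter: time.Now().Add(90 * 24 * time.Hour),
	}, nil
}

// demoIssue registers the adapted stub under "iss-local" and issues through it.
func demoIssue(commonName string) (*IssuanceResult, error) {
	registry := map[string]IssuerConnector{
		"iss-local": NewIssuerConnectorAdapter(stubCA{}),
	}
	return registry["iss-local"].IssueCertificate(context.Background(), commonName, []string{commonName}, "csr-pem-placeholder")
}

func main() {
	res, err := demoIssue("demo-api.internal.example.com")
	if err != nil {
		panic(err)
	}
	fmt.Println(res.CertPEM, res.Serial)
}
```

The service code only ever sees `IssuerConnector`, so a new CA integration needs just a `Connector` implementation and one registry entry.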
### Issuer Connector
Handles certificate issuance from CAs.
@@ -394,14 +442,16 @@ flowchart LR
style ROT fill:#efe,stroke:#3c3
```
Private keys follow a strict lifecycle:
**V1 (Current):** The control plane generates RSA-2048 keys and CSRs server-side for the Local CA issuer, within `RenewalService.ProcessRenewalJob`. Private key material is stored alongside the CSR in the `certificate_versions` table. This is a pragmatic V1 trade-off to get the end-to-end lifecycle working.
**V2+ (Target Architecture):** Private keys follow a strict lifecycle on agents:
1. **Generated on the agent** — never sent to the control plane
2. **Stored on the agent** — file permissions 0600, owned by the agent process user
3. **Used by the agent** — for deployment to targets and CSR generation
4. **Rotated by the agent** — old keys deleted after successful renewal
The control plane only ever handles public material: certificates, chains, and CSRs. This is a deliberate architectural decision — even if the control plane database is compromised, no private keys are exposed.
The V2+ architecture ensures the control plane only handles public material: certificates, chains, and CSRs.
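As a rough sketch of the "stored on the agent" step in the V2+ lifecycle, the snippet below writes a PEM-encoded key with mode 0600 on a Unix-like system. The function names and the PKCS#8 encoding are assumptions; the planned agent code may choose a different format or location.

```go
package main

import (
	"crypto/rand"
	"crypto/rsa"
	"crypto/x509"
	"encoding/pem"
	"fmt"
	"os"
	"path/filepath"
)

// writePrivateKey stores key as PKCS#8 PEM at path, readable and writable by
// the owner only. O_EXCL makes the call fail instead of overwriting an
// existing key, so rotation has to remove or rename the old file explicitly.
func writePrivateKey(path string, key *rsa.PrivateKey) error {
	der, err := x509.MarshalPKCS8PrivateKey(key)
	if err != nil {
		return fmt.Errorf("marshal key: %w", err)
	}
	block := pem.EncodeToMemory(&pem.Block{Type: "PRIVATE KEY", Bytes: der})
	f, err := os.OpenFile(path, os.O_WRONLY|os.O_CREATE|os.O_EXCL, 0o600)
	if err != nil {
		return fmt.Errorf("create key file: %w", err)
	}
	if _, err := f.Write(block); err != nil {
		f.Close()
		return fmt.Errorf("write key: %w", err)
	}
	return f.Close()
}

// demoKeyMode writes a fresh key into a temporary directory and returns the
// permission bits of the resulting file.
func demoKeyMode() (os.FileMode, error) {
	dir, err := os.MkdirTemp("", "agent-keys")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)

	key, err := rsa.GenerateKey(rand.Reader, 2048)
	if err != nil {
		return 0, err
	}
	path := filepath.Join(dir, "key.pem")
	if err := writePrivateKey(path, key); err != nil {
		return 0, err
	}
	info, err := os.Stat(path)
	if err != nil {
		return 0, err
	}
	return info.Mode().Perm(), nil
}

func main() {
	mode, err := demoKeyMode()
	if err != nil {
		panic(err)
	}
	fmt.Printf("key file mode: %o\n", mode)
}
```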
### Authentication
+50 -7
@@ -95,6 +95,29 @@ Configuration:
Location: `internal/connector/issuer/local/local.go`
### Built-in: ACME v2 (Let's Encrypt, Sectigo, ZeroSSL)
The ACME connector implements the full ACME v2 protocol using Go's `golang.org/x/crypto/acme` package. It supports HTTP-01 challenge solving via a built-in temporary HTTP server that starts on demand during certificate issuance.
Configuration:
```json
{
"directory_url": "https://acme-staging-v02.api.letsencrypt.org/directory",
"email": "admin@example.com",
"http_port": 80
}
```
For HTTP-01 to work, the domain being validated must resolve to the machine running the connector, and the configured HTTP port must be reachable from the internet. The connector automatically registers an ACME account, creates orders, solves challenges, finalizes with the CSR, and downloads the issued certificate chain.
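The on-demand challenge server can be pictured roughly as follows. This standard-library sketch is not the connector's implementation: the helper names are hypothetical, the key authorization is passed in as an opaque string (the connector would obtain it from the ACME client), and the demo binds to a random loopback port instead of port 80. The `/.well-known/acme-challenge/` path prefix comes from RFC 8555.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net"
	"net/http"
	"time"
)

const challengePrefix = "/.well-known/acme-challenge/"

// challengeHandler serves keyAuth for exactly one token; other paths get 404.
func challengeHandler(token, keyAuth string) http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc(challengePrefix+token, func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/octet-stream")
		io.WriteString(w, keyAuth)
	})
	return mux
}

// startChallengeServer listens on addr and returns the bound address plus a
// stop function that shuts the temporary server down.
func startChallengeServer(addr, token, keyAuth string) (string, func() error, error) {
	ln, err := net.Listen("tcp", addr)
	if err != nil {
		return "", nil, fmt.Errorf("listen: %w", err)
	}
	srv := &http.Server{
		Handler:           challengeHandler(token, keyAuth),
		ReadHeaderTimeout: 5 * time.Second,
	}
	go srv.Serve(ln) // returns http.ErrServerClosed once stop is called
	stop := func() error {
		ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
		defer cancel()
		return srv.Shutdown(ctx)
	}
	return ln.Addr().String(), stop, nil
}

// fetchChallenge plays the role of the CA's validation request.
func fetchChallenge(addr, token string) (int, string, error) {
	client := &http.Client{Timeout: 5 * time.Second}
	resp, err := client.Get("http://" + addr + challengePrefix + token)
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return resp.StatusCode, string(body), err
}

// demoChallenge starts the server, fetches a valid and an unknown token, and
// stops the server again.
func demoChallenge() (okStatus int, body string, missStatus int, err error) {
	addr, stop, err := startChallengeServer("127.0.0.1:0", "tok123", "tok123.thumbprint")
	if err != nil {
		return 0, "", 0, err
	}
	defer stop()
	if okStatus, body, err = fetchChallenge(addr, "tok123"); err != nil {
		return 0, "", 0, err
	}
	missStatus, _, err = fetchChallenge(addr, "other")
	return okStatus, body, missStatus, err
}

func main() {
	ok, body, miss, err := demoChallenge()
	if err != nil {
		panic(err)
	}
	fmt.Println(ok, body, miss)
}
```

Starting the listener only for the duration of an order keeps port 80 closed the rest of the time, which matches the "starts on demand" behavior described above.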
Environment variables for the default ACME connector:
- `CERTCTL_ACME_DIRECTORY_URL` — ACME directory URL
- `CERTCTL_ACME_EMAIL` — Contact email for account registration
The connector is registered in the issuer registry under `iss-acme-staging` and `iss-acme-prod`. Use `iss-acme-staging` for Let's Encrypt staging (rate-limit-friendly testing) and `iss-acme-prod` for production certificates.
Location: `internal/connector/issuer/acme/acme.go`
### Building a Custom Issuer
Here's the structure for a HashiCorp Vault PKI issuer:
@@ -293,16 +316,36 @@ To add a new connector:
2. Implement the interface (all methods required)
3. Register it in the service layer during server initialization in `cmd/server/main.go`:
3. Register it in the service layer during server initialization in `cmd/server/main.go`.
### IssuerConnectorAdapter
Issuer connectors use an adapter pattern to bridge the connector-layer `issuer.Connector` interface with the service-layer `service.IssuerConnector` interface. This maintains dependency inversion — the service package never imports the connector package directly.
The adapter (`internal/service/issuer_adapter.go`) translates between the two interface types:
```go
// For issuers
issuerRegistry := map[string]service.IssuerConnector{
"local": localCAConnector,
"acme": acmeConnector,
"vault": vaultConnector, // your new issuer
}
// Wrap your connector implementation with the adapter
import "github.com/shankar0123/certctl/internal/service"
myIssuer := myissuer.New(config)
adapted := service.NewIssuerConnectorAdapter(myIssuer)
```
Register adapted connectors keyed by the issuer ID from the database:
```go
// In cmd/server/main.go
localCA := local.New(nil, logger)
issuerRegistry := map[string]service.IssuerConnector{
"iss-local": service.NewIssuerConnectorAdapter(localCA),
"iss-vault": service.NewIssuerConnectorAdapter(vaultIssuer), // your new issuer
}
```
### Notifier Registration
```go
// For notifiers
notifierRegistry := map[string]service.Notifier{
"Email": emailNotifier,
+37 -12
@@ -215,13 +215,13 @@ Expected response:
The `202 Accepted` status code is deliberate. Certificate issuance can take seconds (Local CA) to minutes (ACME challenge validation). The API doesn't block the caller; it creates a job and returns immediately. The job processor loop (which runs every 30 seconds) picks up pending jobs and executes them.
**What happens during a real renewal (production flow):**
**What happens during renewal (V1 flow with Local CA):**
```mermaid
sequenceDiagram
participant S as Scheduler
participant DB as PostgreSQL
participant SVC as CertificateService
participant SVC as RenewalService
participant ISS as IssuerConnector
participant A as Agent
@@ -233,23 +233,25 @@ sequenceDiagram
S->>DB: SELECT pending jobs
DB-->>S: [job-123: Renewal for mc-demo-api]
S->>A: Notify: generate CSR for demo-api.internal.example.com
A->>A: Generate RSA-2048 key pair locally
A->>A: Create CSR with CN + SANs
A->>SVC: POST /api/v1/agents/{id}/csr {csr_pem: "..."}
SVC->>ISS: IssueCertificate(CSR)
SVC->>SVC: Generate RSA-2048 key + CSR (server-side in V1)
SVC->>ISS: IssueCertificate(commonName, sans, csrPEM)
ISS-->>SVC: {cert_pem, chain_pem, serial, not_after}
SVC->>DB: INSERT certificate_version
SVC->>DB: INSERT certificate_version (PEM chain + fingerprint)
SVC->>DB: UPDATE managed_certificates SET status='Active'
SVC->>DB: INSERT audit_event (certificate_renewed)
SVC->>DB: CREATE deployment jobs for all targets
SVC-->>A: {certificate_pem, chain_pem}
A->>A: Store cert + chain locally (key never leaves)
Note over A: Agent polls GET /agents/{id}/work
A->>SVC: GET /api/v1/agents/{id}/work
SVC-->>A: [deployment job for mc-demo-api]
A->>SVC: GET /api/v1/agents/{id}/certificates/{certId}
SVC-->>A: {certificate PEM chain}
A->>A: Deploy to target system
A->>SVC: POST /api/v1/agents/{id}/jobs/{jobId}/status {Completed}
```
The critical security property: the private key is generated by the agent in step 3 and never transmitted. The CSR contains only the public key. The control plane forwards the CSR to the issuer and returns the signed certificate — it never has access to the private key material.
**V1 note:** With the Local CA, key generation happens server-side in `RenewalService.ProcessRenewalJob`. In V2+, agents will generate keys locally and submit CSRs, so private keys never touch the control plane.
Check the jobs list:
@@ -322,6 +324,29 @@ Check for deployment jobs:
curl -s "$API/api/v1/jobs" | jq '.data[] | select(.certificate_id == "mc-demo-api")'
```
### Agent Work Polling & Status Reporting
In production, agents poll for work and report results. You can simulate this manually:
```bash
# Poll for pending deployment work (as an agent)
curl -s "$API/api/v1/agents/agent-nginx-prod/work" | jq .
```
This returns pending deployment jobs assigned to the agent. The agent would then fetch the certificate, deploy it, and report back:
```bash
# Report job completion (replace JOB_ID with an actual job ID from the work response)
curl -s -X POST "$API/api/v1/agents/agent-nginx-prod/jobs/JOB_ID/status" \
-H "Content-Type: application/json" \
-d '{
"status": "Completed",
"error": ""
}' | jq .
```
**How it works:** The `GET /api/v1/agents/{id}/work` endpoint returns the pending deployment jobs assigned to that agent. The agent processes each one, then calls `POST /api/v1/agents/{id}/jobs/{job_id}/status` with a status of either `"Completed"` or `"Failed"` (the latter with an error message). The control plane updates the job record and logs an audit event.
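The same poll-and-report cycle can be sketched in Go. The endpoint paths and the `status`/`error` fields come from the curl examples above; the shape of the work response (a JSON array of objects with an `id` field) and all function names are assumptions, and an `httptest` server stands in for the control plane.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
	"sync"
)

type workItem struct {
	ID string `json:"id"` // assumed field name for the job ID
}

type statusReport struct {
	Status string `json:"status"`
	Error  string `json:"error"`
}

// pollWork fetches the pending deployment jobs for one agent.
func pollWork(c *http.Client, base, agentID string) ([]workItem, error) {
	resp, err := c.Get(fmt.Sprintf("%s/api/v1/agents/%s/work", base, agentID))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("poll work: unexpected status %d", resp.StatusCode)
	}
	var items []workItem
	if err := json.NewDecoder(resp.Body).Decode(&items); err != nil {
		return nil, fmt.Errorf("decode work: %w", err)
	}
	return items, nil
}

// reportStatus posts "Completed", or "Failed" plus the error text.
func reportStatus(c *http.Client, base, agentID, jobID string, deployErr error) error {
	report := statusReport{Status: "Completed"}
	if deployErr != nil {
		report = statusReport{Status: "Failed", Error: deployErr.Error()}
	}
	body, err := json.Marshal(report)
	if err != nil {
		return err
	}
	url := fmt.Sprintf("%s/api/v1/agents/%s/jobs/%s/status", base, agentID, jobID)
	resp, err := c.Post(url, "application/json", bytes.NewReader(body))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode >= 300 {
		return fmt.Errorf("report status: unexpected status %d", resp.StatusCode)
	}
	return nil
}

// demoRoundTrip runs one poll-and-report cycle against a fake control plane
// and returns the job count plus the status the server recorded for job-123.
func demoRoundTrip() (int, string, error) {
	const agentID = "agent-nginx-prod"
	var mu sync.Mutex
	received := map[string]statusReport{}

	mux := http.NewServeMux()
	mux.HandleFunc("/api/v1/agents/"+agentID+"/work", func(w http.ResponseWriter, r *http.Request) {
		json.NewEncoder(w).Encode([]workItem{{ID: "job-123"}})
	})
	jobsPrefix := "/api/v1/agents/" + agentID + "/jobs/"
	mux.HandleFunc(jobsPrefix, func(w http.ResponseWriter, r *http.Request) {
		jobID := strings.TrimSuffix(strings.TrimPrefix(r.URL.Path, jobsPrefix), "/status")
		var rep statusReport
		json.NewDecoder(r.Body).Decode(&rep)
		mu.Lock()
		received[jobID] = rep
		mu.Unlock()
	})
	srv := httptest.NewServer(mux)
	defer srv.Close()

	jobs, err := pollWork(srv.Client(), srv.URL, agentID)
	if err != nil {
		return 0, "", err
	}
	for _, job := range jobs {
		// A real agent would fetch the certificate and deploy it here.
		if err := reportStatus(srv.Client(), srv.URL, agentID, job.ID, nil); err != nil {
			return 0, "", err
		}
	}
	mu.Lock()
	defer mu.Unlock()
	return len(jobs), received["job-123"].Status, nil
}

func main() {
	n, status, err := demoRoundTrip()
	if err != nil {
		panic(err)
	}
	fmt.Println(n, status)
}
```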
---
## Part 6: View the Audit Trail
+7
@@ -111,6 +111,13 @@ curl -s http://localhost:8443/api/v1/certificates/mc-api-prod | jq .
curl -s http://localhost:8443/api/v1/agents | jq .
```
### Check agent pending work
```bash
# Replace with an actual agent ID from the list above
curl -s http://localhost:8443/api/v1/agents/agent-nginx-prod/work | jq .
```
### View audit trail
```bash