mirror of
https://github.com/shankar0123/certctl.git
synced 2026-06-07 13:41:30 +00:00
docs: CRL/OCSP user guide + architecture cross-reference — Phase 6

Audit of cowork/crl-ocsp-responder-prompt.md against repo HEAD found
two prompt deliverables still missing after the Phase 5 + Phase 6 code
landed: the docs/crl-ocsp.md operator+relying-party guide (Phase 6.2)
and the docs/architecture.md cross-reference. This commit closes both.

docs/crl-ocsp.md (329 lines) covers:

* Conceptual overview — why both CRL and OCSP, why a separate
  responder cert (RFC 6960 §2.6 / §4.2.2.2.1) keeps the CA key cold
* Endpoints — GET CRL, GET + POST OCSP, admin observability endpoint
  (M-008 admin-gated) with full request/response shape examples
* Configuration — every CERTCTL_CRL_* / CERTCTL_OCSP_RESPONDER_*
  env var with default + meaning + 'MUST set in prod' callout for
  OCSP_RESPONDER_KEY_DIR
* OCSP responder cert lifecycle — first-request bootstrap, disk
  self-healing when keydir is pruned out from under the DB row,
  rotation grace, ExtraExtensions wiring for id-pkix-ocsp-nocheck
* Consumer integration recipes — cert-manager (AIA/CDP automatic),
  Firefox (about:preferences quirk), OpenSSL (ocsp + s_client -status),
  Intune (CRL pull cadence)
* V3-Pro deferred (delta CRLs, OCSP rate-limiting, OCSP stapling)
* Troubleshooting (404 on issuer that doesn't support CRL, hex
  serial format, admin-gated 403, scheduler not running)

docs/architecture.md: extended the existing 'Certificate revocation'
paragraph to explicitly call out the new pipeline (crl_cache table,
OCSP responder cert per RFC 6960 §2.6, POST + GET OCSP endpoints,
auto-rotation grace) and added the 'See docs/crl-ocsp.md for the
operator + relying-party guide' link so future readers can find the
deep dive.

Closes the prompt's Phase 6.2 + 6.3 exit criteria. Combined with
the Phase 5 GUI panel (0594631) + Phase 6 e2e helpers (fc3c7ad) +
Phase 5 admin endpoint (a4df1f8), this completes V2 for the bundle.
V3-Pro polish (delta CRLs, OCSP rate-limiting, OCSP stapling) remains
explicitly out of scope per the prompt's 'What this prompt is NOT'
section.
@@ -981,7 +981,7 @@ Jobs support additional action endpoints: `POST /api/v1/jobs/{id}/cancel`, `POST
 - **Additional filters**: `?agent_id=`, `?profile_id=` (in addition to existing status, environment, owner_id, team_id, issuer_id).
 - **Deployments**: `GET /api/v1/certificates/{id}/deployments` returns deployment targets for a certificate.

-Certificate revocation: `POST /api/v1/certificates/{id}/revoke` with optional `{"reason": "keyCompromise"}`. Supports RFC 5280 reason codes (unspecified, keyCompromise, caCompromise, affiliationChanged, superseded, cessationOfOperation, certificateHold, privilegeWithdrawn). Returns the updated certificate status. Best-effort issuer notification — the revocation succeeds even if the issuer connector is unavailable. The DER-encoded X.509 CRL signed by the issuing CA is served unauthenticated at `GET /.well-known/pki/crl/{issuer_id}` (RFC 5280 §5 + RFC 8615, `Content-Type: application/pkix-crl`). The embedded OCSP responder serves signed responses unauthenticated at `GET /.well-known/pki/ocsp/{issuer_id}/{serial}` (RFC 6960, `Content-Type: application/ocsp-response`). Both endpoints are accessible to relying parties with no certctl API credentials, as RFC-compliant PKI consumers expect. Short-lived certificates (profile TTL < 1 hour) are exempt from CRL/OCSP — expiry is sufficient revocation.
+Certificate revocation: `POST /api/v1/certificates/{id}/revoke` with optional `{"reason": "keyCompromise"}`. Supports RFC 5280 reason codes (unspecified, keyCompromise, caCompromise, affiliationChanged, superseded, cessationOfOperation, certificateHold, privilegeWithdrawn). Returns the updated certificate status. Best-effort issuer notification — the revocation succeeds even if the issuer connector is unavailable. The DER-encoded X.509 CRL signed by the issuing CA is served unauthenticated at `GET /.well-known/pki/crl/{issuer_id}` (RFC 5280 §5 + RFC 8615, `Content-Type: application/pkix-crl`); the CRL is pre-generated by the scheduler-driven `crlGenerationLoop` and persisted in the `crl_cache` table (migration 000019) so HTTP fetches do not rebuild per request. The embedded OCSP responder serves signed responses unauthenticated at both `GET /.well-known/pki/ocsp/{issuer_id}/{serial}` and `POST /.well-known/pki/ocsp/{issuer_id}` (RFC 6960 §A.1.1, `Content-Type: application/ocsp-response`); responses are signed by a per-issuer dedicated OCSP responder cert (RFC 6960 §2.6, migration 000020) carrying the `id-pkix-ocsp-nocheck` extension (RFC 6960 §4.2.2.2.1) — the CA private key is never used directly for OCSP signing, which keeps it cold for the future PKCS#11/HSM driver path. The responder cert auto-rotates within `CERTCTL_OCSP_RESPONDER_ROTATION_GRACE` (default 7d) of expiry. Both endpoints are accessible to relying parties with no certctl API credentials, as RFC-compliant PKI consumers expect. Short-lived certificates (profile TTL < 1 hour) are exempt from CRL/OCSP — expiry is sufficient revocation. See [`crl-ocsp.md`](crl-ocsp.md) for the operator + relying-party guide (endpoint URLs, configuration knobs, responder cert lifecycle, cert-manager / Firefox / OpenSSL / Intune integration recipes, troubleshooting).

 Certificate export (M27): `GET /api/v1/certificates/{id}/export/pem` returns PEM-encoded certificate and chain, and `POST /api/v1/certificates/{id}/export/pkcs12` returns a PKCS#12 bundle (binary). Private keys are never exported — they remain on agents. All exports are audited with actor, timestamp, and format.

@@ -0,0 +1,329 @@
# CRL & OCSP — Revocation Status for Relying Parties

This guide is the operator + relying-party reference for certctl's revocation
status surfaces. It covers the wire format, endpoint URLs, configuration knobs,
the OCSP responder cert lifecycle, and how to point common consumers
(cert-manager, Firefox, OpenSSL, Intune) at the endpoints.

If you're looking for the higher-level architecture, see
[`architecture.md` § Security Model](architecture.md#security-model). If you're
looking for the revocation policy / reason codes the API accepts, see
[`api/openapi.yaml` § /certificates/{id}/revoke](../api/openapi.yaml).

---

## Conceptual overview

**Why two formats.** RFC 5280 §5 defines a Certificate Revocation List (CRL)
— a periodically-published, signed list of every revoked certificate for an
issuer. RFC 6960 defines the Online Certificate Status Protocol (OCSP) — a
request/response protocol that returns the status of a single certificate by
serial number. CRLs are batch-friendly and cacheable; OCSP is point-query and
fresh. Production PKI deployments serve both because different relying parties
prefer different trade-offs:

- Browsers (Firefox / Safari) prefer OCSP for freshness; some pin OCSP
  stapling.
- cert-manager and most Linux TLS clients fall back to CRL when OCSP is
  unreachable.
- Microsoft Intune / corporate device-state validators do periodic CRL pulls.
- OpenSSL `s_client -status` exercises OCSP via the `Certificate Status
  Request` extension during the handshake.

certctl's local issuer publishes both, with a pre-generation cache so a busy
CA does not DoS itself by rebuilding the CRL on every fetch.

**Why a separate OCSP responder cert.** RFC 6960 §2.6 + §4.2.2.2 strongly
recommend that OCSP responses be signed by a delegated "OCSP responder cert"
issued by the CA, NOT by the CA private key directly. The responder cert
carries the `id-pkix-ocsp-nocheck` extension (RFC 6960 §4.2.2.2.1) so OCSP
clients do not recursively check the responder cert's revocation status. This
keeps the CA private key cold (an HSM operation per OCSP request would be
prohibitive at scale) and lets the responder key live on disk, on a separate
HSM partition, or rotate frequently while the CA key stays untouched.

---

## Endpoints

All revocation endpoints live under `/.well-known/pki/` per RFC 8615 and run
**unauthenticated** — relying parties without certctl API credentials must be
able to validate revocation status. The HTTPS-only TLS 1.3 control plane
applies; there is no plaintext fallback.

### CRL — Certificate Revocation List

```
GET https://<host>/.well-known/pki/crl/{issuer_id}
```

| Field | Value |
| --- | --- |
| Method | `GET` |
| Auth | None (unauthenticated, RFC 5280 §5 distribution semantics) |
| Response Content-Type | `application/pkix-crl` |
| Response body | DER-encoded X.509 CRL signed by the issuer's CA |
| Cache | Pre-generated by the scheduler; configurable interval |

Example:

```bash
curl --cacert ca.crt \
     -o crl.der \
     https://localhost:8443/.well-known/pki/crl/iss-local

openssl crl -inform DER -in crl.der -text -noout
```

### OCSP — Online Certificate Status Protocol

certctl serves both the GET form (RFC 6960 §A.1.1, simple URL-path lookup)
and the POST form (RFC 6960 §A.1.1, binary OCSPRequest body). Most
production OCSP clients (Firefox, OpenSSL `s_client -status`, cert-manager,
Intune) use POST. The GET form is preserved for ops curl-debugging.

#### GET form

```
GET https://<host>/.well-known/pki/ocsp/{issuer_id}/{serial_hex}
```

| Field | Value |
| --- | --- |
| Method | `GET` |
| Auth | None |
| Response Content-Type | `application/ocsp-response` |
| Response body | DER-encoded OCSPResponse signed by the **OCSP responder cert** (NOT the CA cert) |

Example:

```bash
curl --cacert ca.crt \
     -o response.der \
     https://localhost:8443/.well-known/pki/ocsp/iss-local/a1b2c3d4

openssl ocsp -respin response.der -text -CAfile ca.crt
```

#### POST form (the standard one)

```
POST https://<host>/.well-known/pki/ocsp/{issuer_id}
Content-Type: application/ocsp-request
Body: <DER-encoded OCSPRequest>
```

| Field | Value |
| --- | --- |
| Method | `POST` |
| Auth | None |
| Request Content-Type | `application/ocsp-request` |
| Response Content-Type | `application/ocsp-response` |

Example with OpenSSL building the request:

```bash
openssl ocsp -issuer ca.crt -cert leaf.crt -reqout request.der

curl --cacert ca.crt \
     -X POST \
     -H "Content-Type: application/ocsp-request" \
     --data-binary @request.der \
     -o response.der \
     https://localhost:8443/.well-known/pki/ocsp/iss-local

openssl ocsp -respin response.der -text -CAfile ca.crt
```

The body-size limit applies (`http.MaxBytesReader` from middleware,
default 1MB, configurable via `CERTCTL_MAX_BODY_SIZE`); a typical OCSPRequest
is ~200 bytes so this is a generous cap.

### Admin observability endpoint

```
GET https://<host>/api/v1/admin/crl/cache
Authorization: Bearer <token-with-admin-flag>
```

Returns the per-issuer cache state — for ops dashboards, GUI badges, or
"is the scheduler keeping up?" diagnostics. Admin-gated (M-008 admin-gated
handler allowlist; non-admin Bearer callers receive HTTP 403). Response shape:

```json
{
  "cache_rows": [
    {
      "issuer_id": "iss-local",
      "cache_present": true,
      "crl_number": 42,
      "this_update": "2026-04-29T10:00:00Z",
      "next_update": "2026-04-29T11:00:00Z",
      "generated_at": "2026-04-29T10:00:00Z",
      "generation_duration_ms": 87,
      "revoked_count": 13,
      "is_stale": false,
      "recent_events": [
        {
          "started_at": "2026-04-29T10:00:00Z",
          "duration_ms": 87,
          "succeeded": true,
          "crl_number": 42,
          "revoked_count": 13
        }
      ]
    }
  ],
  "row_count": 1,
  "generated_at": "2026-04-29T10:30:00Z"
}
```

Issuers that have not yet had a CRL generated appear with `cache_present:
false` so the GUI can render a "Not yet generated" pill rather than 404.
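
A dashboard or alerting job could decode this response along the following lines. The struct and function names are hypothetical, the field tags simply follow the JSON shape shown above, and the second sample row (`iss-pending`) is invented for illustration. Time fields are pointers on the assumption that they may be null or absent when `cache_present` is false.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// cacheRow mirrors one entry of "cache_rows". recent_events is omitted here;
// encoding/json ignores fields that have no matching struct member.
type cacheRow struct {
	IssuerID             string     `json:"issuer_id"`
	CachePresent         bool       `json:"cache_present"`
	CRLNumber            int64      `json:"crl_number"`
	ThisUpdate           *time.Time `json:"this_update"`
	NextUpdate           *time.Time `json:"next_update"`
	GeneratedAt          *time.Time `json:"generated_at"`
	GenerationDurationMS int64      `json:"generation_duration_ms"`
	RevokedCount         int        `json:"revoked_count"`
	IsStale              bool       `json:"is_stale"`
}

type cacheResponse struct {
	CacheRows   []cacheRow `json:"cache_rows"`
	RowCount    int        `json:"row_count"`
	GeneratedAt time.Time  `json:"generated_at"`
}

// needsAttention returns the issuer IDs whose CRL is missing or stale.
func needsAttention(body []byte) ([]string, error) {
	var resp cacheResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, fmt.Errorf("decode cache response: %w", err)
	}
	var ids []string
	for _, row := range resp.CacheRows {
		if !row.CachePresent || row.IsStale {
			ids = append(ids, row.IssuerID)
		}
	}
	return ids, nil
}

// sample is illustrative data: one healthy issuer and one without a CRL yet.
const sample = `{
  "cache_rows": [
    {"issuer_id": "iss-local", "cache_present": true, "crl_number": 42,
     "this_update": "2026-04-29T10:00:00Z", "next_update": "2026-04-29T11:00:00Z",
     "generated_at": "2026-04-29T10:00:00Z", "generation_duration_ms": 87,
     "revoked_count": 13, "is_stale": false},
    {"issuer_id": "iss-pending", "cache_present": false}
  ],
  "row_count": 2,
  "generated_at": "2026-04-29T10:30:00Z"
}`

func main() {
	ids, err := needsAttention([]byte(sample))
	if err != nil {
		panic(err)
	}
	fmt.Println("issuers needing attention:", ids)
}
```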

---

## Configuration

| Env var | Default | Meaning |
| --- | --- | --- |
| `CERTCTL_CRL_GENERATION_INTERVAL` | `1h` | How often the scheduler walks every CRL-supporting issuer and rebuilds. The HTTP handler reads from the cache, not from a per-request rebuild. |
| `CERTCTL_OCSP_RESPONDER_KEY_DIR` | unset | **Operator MUST set in production.** Directory where the FileDriver persists each issuer's OCSP responder key (`ocsp-responder-<issuer_id>.key`). When unset, the responder service uses a temporary directory that does NOT survive restarts — fine for dev, NEVER for prod. |
| `CERTCTL_OCSP_RESPONDER_ROTATION_GRACE` | `7d` | When the responder cert's `NotAfter` falls within this window, `EnsureResponder` rotates to a fresh cert+key on the next OCSP request or scheduler tick. |
| `CERTCTL_OCSP_RESPONDER_VALIDITY` | `30d` | How long each newly-issued responder cert is valid for. Short by design — relying parties cache OCSP responses, not the responder cert chain, and `id-pkix-ocsp-nocheck` blocks recursive revocation checking on the responder itself. |

The issuer-level CRL `nextUpdate` is derived from the generation timestamp +
the configured CRL validity (currently a build-time constant in the
`CRLCacheService`; configurable knob deferred until an operator asks).

---

## OCSP responder cert lifecycle

1. **First OCSP request for an issuer (or scheduler tick).** The local
   issuer's `SignOCSPResponse` calls into `OCSPResponderService.EnsureResponder`.
2. **Cache lookup.** `EnsureResponder` queries the `ocsp_responders` table for
   a row keyed by `issuer_id`.
3. **Disk lookup.** If a row exists, the FileDriver reads the persisted key
   from `<keydir>/ocsp-responder-<issuer_id>.key`. **Self-healing:** if the
   row exists but the file is missing (operator pruned the keydir without
   pruning the DB), the service treats this as "rotate now" rather than
   crashing.
4. **Rotation check.** If `cert.NotAfter < now + RotationGrace`, the service
   generates a fresh ECDSA-P256 key, builds a `*x509.CertificateRequest`,
   and asks the local issuer's existing `IssueCertificate` flow to sign it.
   The signing template carries:
   - `KeyUsage: x509.KeyUsageDigitalSignature` (signing OCSP responses)
   - `ExtKeyUsage: x509.ExtKeyUsageOCSPSigning` (RFC 6960 §4.2.2.2)
   - The `id-pkix-ocsp-nocheck` extension (OID `1.3.6.1.5.5.7.48.1.5`,
     DER value `NULL`, RFC 6960 §4.2.2.2.1) wired through
     `Certificate.ExtraExtensions`.
5. **Persistence.** The new cert + key path are written to `ocsp_responders`
   via an idempotent `INSERT … ON CONFLICT DO UPDATE`.
6. **Response signing.** `ocsp.CreateResponse(caCert, responderCert,
   template, responderSigner)` produces the response bytes; the responder
   cert is included in the response chain so relying parties can validate
   without a separate fetch.

The race between scheduler-driven cache refresh and on-demand cache miss is
collapsed by the `CRLCacheService`'s in-tree singleflight (a `sync.Map` of
`*flightEntry` keyed by `issuer_id`). Concurrent generation requests for the
same issuer wait on the in-flight result rather than each rebuilding from
scratch.
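
The singleflight idea can be sketched as follows. Only the `sync.Map` of `*flightEntry` keyed by issuer ID comes from the description above; the field layout, the `flightGroup` type, the `Do` method, and `runDemo` are assumptions made for illustration and may differ from the real `CRLCacheService`.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// flightEntry holds the shared result of one in-flight CRL generation.
type flightEntry struct {
	done chan struct{} // closed when the build has finished
	crl  []byte
	err  error
}

// flightGroup collapses concurrent builds for the same issuer ID.
type flightGroup struct {
	flights sync.Map // issuer_id -> *flightEntry
}

// Do runs build at most once per issuer at a time. Callers that arrive while
// a build is in flight block and receive that build's result.
func (g *flightGroup) Do(issuerID string, build func() ([]byte, error)) ([]byte, error) {
	entry := &flightEntry{done: make(chan struct{})}
	if existing, loaded := g.flights.LoadOrStore(issuerID, entry); loaded {
		inFlight := existing.(*flightEntry)
		<-inFlight.done // wait for the leader to finish
		return inFlight.crl, inFlight.err
	}
	// This caller won the race and becomes the leader.
	defer func() {
		g.flights.Delete(issuerID)
		close(entry.done) // publish the result to waiters
	}()
	entry.crl, entry.err = build()
	return entry.crl, entry.err
}

// runDemo fires concurrent requests at one issuer and returns how many times
// the (deliberately slow) build function actually ran.
func runDemo(callers int) int64 {
	var builds atomic.Int64
	g := &flightGroup{}
	build := func() ([]byte, error) {
		builds.Add(1)
		time.Sleep(200 * time.Millisecond) // simulate a slow CRL rebuild
		return []byte("crl-bytes"), nil
	}
	var wg sync.WaitGroup
	for i := 0; i < callers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			g.Do("iss-local", build)
		}()
	}
	wg.Wait()
	return builds.Load()
}

func main() {
	fmt.Println("builds for 8 concurrent callers:", runDemo(8))
}
```

Deleting the entry before closing `done` means a caller arriving after completion starts a fresh build instead of reusing a finished one; whether certctl orders these steps the same way is not stated in this guide.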

---

## Pointing common consumers at the endpoints

### cert-manager (Kubernetes)

cert-manager's certificate-validation logic checks both the AIA OCSP URI
embedded in the leaf and the CDP CRL URI. Both are populated automatically
by the local issuer's certificate template — relying parties should NOT
need any additional configuration. To verify:

```bash
openssl x509 -in leaf.crt -text -noout | grep -A1 "Authority Information Access"
openssl x509 -in leaf.crt -text -noout | grep -A2 "CRL Distribution Points"
```
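
The same verification can be done in Go, where `crypto/x509` exposes the AIA OCSP URIs as `OCSPServer` and the CDP URIs as `CRLDistributionPoints`. The sketch below is illustrative only: `revocationPointers` and `demoLeafPEM` are hypothetical helpers, and the self-signed certificate is generated in memory as a stand-in for a real `leaf.crt`.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"errors"
	"fmt"
	"math/big"
	"time"
)

// revocationPointers parses a PEM certificate and returns the OCSP responder
// URIs (from AIA) and the CRL distribution point URIs embedded in it.
func revocationPointers(certPEM []byte) (ocspURIs, crlURIs []string, err error) {
	block, _ := pem.Decode(certPEM)
	if block == nil || block.Type != "CERTIFICATE" {
		return nil, nil, errors.New("no PEM certificate found")
	}
	cert, err := x509.ParseCertificate(block.Bytes)
	if err != nil {
		return nil, nil, err
	}
	return cert.OCSPServer, cert.CRLDistributionPoints, nil
}

// demoLeafPEM creates a self-signed stand-in for leaf.crt with the given
// OCSP and CRL URLs embedded.
func demoLeafPEM(ocspURL, crlURL string) ([]byte, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(2),
		Subject:               pkix.Name{CommonName: "demo leaf"},
		NotBefore:             time.Now().Add(-time.Hour),
		NotAfter:              time.Now().Add(24 * time.Hour),
		OCSPServer:            []string{ocspURL},
		CRLDistributionPoints: []string{crlURL},
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der}), nil
}

func main() {
	leaf, err := demoLeafPEM(
		"https://localhost:8443/.well-known/pki/ocsp/iss-local",
		"https://localhost:8443/.well-known/pki/crl/iss-local",
	)
	if err != nil {
		panic(err)
	}
	ocspURIs, crlURIs, err := revocationPointers(leaf)
	if err != nil {
		panic(err)
	}
	fmt.Println("AIA OCSP:", ocspURIs)
	fmt.Println("CDP CRL: ", crlURIs)
}
```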

If your cert-manager pods cannot reach `https://<certctl-host>:8443/.well-known/pki/`,
add a NetworkPolicy egress rule or expose the certctl service via the
appropriate ingress class.

### Firefox

Firefox honors the AIA OCSP URI by default. To force-refresh the local
revocation cache after revoking a cert in dev:

```
about:preferences#privacy → Certificates → Query OCSP responder servers
```

If Firefox reports `SEC_ERROR_OCSP_INVALID_SIGNING_CERT`, verify that the
responder cert chain is reachable from the system trust store —
`id-pkix-ocsp-nocheck` is a Firefox-strict extension and is set automatically
on every responder cert certctl issues.

### OpenSSL

```bash
# OCSP via stand-alone request
openssl ocsp -issuer ca.crt -cert leaf.crt -url https://localhost:8443/.well-known/pki/ocsp/iss-local -CAfile ca.crt -text

# OCSP via TLS Certificate Status Request extension
openssl s_client -connect example.com:443 -status -CAfile ca.crt
```

### Intune (corporate device state)

Intune device-compliance validators pull the CRL on a schedule (configured in
the Intune admin console, default 24h). Configure the CRL distribution point
to `https://<certctl-host>:8443/.well-known/pki/crl/<issuer_id>` and Intune
will pull on its own cadence.

---

## What this release does NOT include (V3-Pro)

The following are explicitly out of scope for the V2 (free) bundle and are
tracked for the certctl Pro release:

- **Delta CRLs (RFC 5280 §5.2.4).** Useful for very large CRLs (10k+
  revoked certs); the data model already accommodates the Base CRL Number
  reference but the pipeline only emits Base CRLs in V2.
- **OCSP rate-limiting per relying party.** Per-IP token bucket on the OCSP
  endpoint — V3-Pro because it justifies per-seat pricing for high-traffic
  responders.
- **OCSP stapling.** Server-side: cache pre-fetched OCSP responses + serve
  in TLS handshake. Client-side: a "stapling fetcher" agent for non-stapling
  origins.

The MaxBytesReader cap is the only request-level guard in V2; the
unauthenticated-by-design relying-party endpoints are intentionally not
rate-limited per IP.

---

## Troubleshooting

**`pki/crl/<issuer_id>` returns 404.** The issuer either does not support
CRL signing (Vault, EJBCA, DigiCert serve their own CRL infrastructure;
certctl's connectors return `nil` from `GenerateCRL` for these) or the
issuer ID is wrong. Verify with `GET /api/v1/issuers`.

**`pki/ocsp/<issuer_id>/<serial>` returns 200 but `openssl ocsp -text`
shows "unauthorized".** Check that the serial in the URL is lowercase hex
with no `0x` prefix and with any leading zeros kept. Mismatched serials
return an OCSP response with status `unauthorized` per RFC 6960 §2.3.

**Admin cache endpoint returns 403.** The Bearer key does not carry the
admin flag. M-008 gates this endpoint server-side; the GUI also gates the
fetch on `useAuth().admin`. Either escalate the key (`certctl admin
keys promote <key-id>`) or use a different identity.

**Cache shows `is_stale: true` repeatedly.** The scheduler is not running
(or not getting scheduled often enough). Check `CERTCTL_CRL_GENERATION_INTERVAL`
and confirm the scheduler started: `grep crlGenerationLoop` in the server
logs at startup.