mirror of
https://github.com/shankar0123/certctl.git
synced 2026-06-07 15:51:30 +00:00
b4334edda1
Audit of cowork/crl-ocsp-responder-prompt.md against repo HEAD found
two prompt deliverables still missing after the Phase 5 + Phase 6 code
landed: the docs/crl-ocsp.md operator+relying-party guide (Phase 6.2)
and the docs/architecture.md cross-reference. This commit closes both.
docs/crl-ocsp.md (329 lines) covers:
* Conceptual overview — why both CRL and OCSP, why a separate
responder cert (RFC 6960 §2.6 / §4.2.2.2.1) keeps the CA key cold
* Endpoints — GET CRL, GET + POST OCSP, admin observability endpoint
(M-008 admin-gated) with full request/response shape examples
* Configuration — every CERTCTL_CRL_* / CERTCTL_OCSP_RESPONDER_*
env var with default + meaning + 'MUST set in prod' callout for
OCSP_RESPONDER_KEY_DIR
* OCSP responder cert lifecycle — first-request bootstrap, disk
self-healing when keydir is pruned out from under the DB row,
rotation grace, ExtraExtensions wiring for id-pkix-ocsp-nocheck
* Consumer integration recipes — cert-manager (AIA/CDP automatic),
Firefox (about:preferences quirk), OpenSSL (ocsp + s_client -status),
Intune (CRL pull cadence)
* V3-Pro deferred (delta CRLs, OCSP rate-limiting, OCSP stapling)
* Troubleshooting (404 on issuer that doesn't support CRL, hex
serial format, admin-gated 403, scheduler not running)
docs/architecture.md: extended the existing 'Certificate revocation'
paragraph to explicitly call out the new pipeline (crl_cache table,
OCSP responder cert per RFC 6960 §2.6, POST + GET OCSP endpoints,
auto-rotation grace) and added the 'See docs/crl-ocsp.md for the
operator + relying-party guide' link so future readers can find the
deep dive.
Closes the prompt's Phase 6.2 + 6.3 exit criteria. Combined with
the Phase 5 GUI panel (0594631) + Phase 6 e2e helpers (fc3c7ad) +
Phase 5 admin endpoint (a4df1f8), this completes V2 for the bundle.
V3-Pro polish (delta CRLs, OCSP rate-limiting, OCSP stapling) remains
explicitly out of scope per the prompt's 'What this prompt is NOT'
section.
330 lines
13 KiB
Markdown
# CRL & OCSP: Revocation Status for Relying Parties

This guide is the operator + relying-party reference for certctl's revocation
status surfaces. It covers the wire format, endpoint URLs, configuration knobs,
the OCSP responder cert lifecycle, and how to point common consumers
(cert-manager, Firefox, OpenSSL, Intune) at the endpoints.

If you're looking for the higher-level architecture, see
[`architecture.md` § Security Model](architecture.md#security-model). If you're
looking for the revocation policy / reason codes the API accepts, see
[`api/openapi.yaml` § /certificates/{id}/revoke](../api/openapi.yaml).

---

## Conceptual overview

**Why two formats.** RFC 5280 §5 defines a Certificate Revocation List (CRL):
a periodically published, signed list of every revoked certificate for an
issuer. RFC 6960 defines the Online Certificate Status Protocol (OCSP): a
request/response protocol that returns the status of a single certificate by
serial number. CRLs are batch-friendly and cacheable; OCSP is point-query and
fresh. Production PKI deployments serve both because different relying parties
prefer different trade-offs:

- Browsers (Firefox / Safari) prefer OCSP for freshness; some pin OCSP
  stapling.
- cert-manager and most Linux TLS clients fall back to CRL when OCSP is
  unreachable.
- Microsoft Intune / corporate device-state validators do periodic CRL pulls.
- OpenSSL `s_client -status` exercises OCSP via the `Certificate Status Request`
  extension during the handshake.

certctl's local issuer publishes both, with a pre-generation cache so a busy
CA does not DoS itself rebuilding the CRL on every fetch.

**Why a separate OCSP responder cert.** RFC 6960 §2.6 + §4.2.2.2 strongly
recommend that OCSP responses be signed by a delegated "OCSP responder cert"
issued by the CA, NOT by the CA private key directly. The responder cert
carries the `id-pkix-ocsp-nocheck` extension (RFC 6960 §4.2.2.2.1) so OCSP
clients do not recursively check the responder cert's revocation status. This
keeps the CA private key cold (an HSM operation per OCSP request would be
prohibitive at scale) and lets the responder key live on disk, on a separate
HSM partition, or rotate frequently while the CA key stays untouched.
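
To make the `id-pkix-ocsp-nocheck` extension concrete, the sketch below hand-encodes it in DER using only the Python standard library. The OID (`1.3.6.1.5.5.7.48.1.5`) and the `NULL` value come from RFC 6960 §4.2.2.2.1; the helper names (`encode_oid`, `nocheck_extension`) are illustrative and are not part of certctl, which builds the extension in Go.

```python
def _base128(value: int) -> bytes:
    """Encode one OID arc as big-endian base-128 with continuation bits."""
    out = [value & 0x7F]
    value >>= 7
    while value:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    return bytes(reversed(out))


def encode_oid(dotted: str) -> bytes:
    """DER-encode a dotted OID (short-form length only, enough for short OIDs)."""
    arcs = [int(part) for part in dotted.split(".")]
    body = _base128(arcs[0] * 40 + arcs[1])  # first two arcs share one byte group
    for arc in arcs[2:]:
        body += _base128(arc)
    return bytes([0x06, len(body)]) + body


def nocheck_extension() -> bytes:
    """Extension ::= SEQUENCE { extnID OID, extnValue OCTET STRING }.

    The critical flag is omitted because its DEFAULT (FALSE) is never encoded in DER.
    """
    oid = encode_oid("1.3.6.1.5.5.7.48.1.5")
    der_null = bytes([0x05, 0x00])
    value = bytes([0x04, len(der_null)]) + der_null
    inner = oid + value
    return bytes([0x30, len(inner)]) + inner


print(nocheck_extension().hex())  # 300f06092b060105050730010504020500
```

The trailing `0500` is the DER `NULL`; spotting those bytes in `openssl asn1parse` output is a quick way to confirm a responder cert carries the extension.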

---

## Endpoints

All revocation endpoints live under `/.well-known/pki/` per RFC 8615 and run
**unauthenticated**: relying parties without certctl API credentials must be
able to validate revocation status. The HTTPS-only TLS 1.3 control plane
applies; there is no plaintext fallback.

### CRL: Certificate Revocation List

```
GET https://<host>/.well-known/pki/crl/{issuer_id}
```

| Field | Value |
| --- | --- |
| Method | `GET` |
| Auth | None (unauthenticated, RFC 5280 §5 distribution semantics) |
| Response Content-Type | `application/pkix-crl` |
| Response body | DER-encoded X.509 CRL signed by the issuer's CA |
| Cache | Pre-generated by the scheduler; configurable interval |

Example:

```bash
curl --cacert ca.crt \
     -o crl.der \
     https://localhost:8443/.well-known/pki/crl/iss-local

openssl crl -inform DER -in crl.der -text -noout
```

### OCSP: Online Certificate Status Protocol

certctl serves both a GET form (a simple URL-path lookup by issuer ID and hex
serial) and the POST form (RFC 6960 Appendix A.1, binary OCSPRequest body). Most
production OCSP clients (Firefox, OpenSSL `s_client -status`, cert-manager,
Intune) use POST. The GET form is preserved for ops curl-debugging; note that it
takes the serial directly in the path rather than the base64-encoded
OCSPRequest that RFC 6960 Appendix A.1 defines for GET.

#### GET form

```
GET https://<host>/.well-known/pki/ocsp/{issuer_id}/{serial_hex}
```

| Field | Value |
| --- | --- |
| Method | `GET` |
| Auth | None |
| Response Content-Type | `application/ocsp-response` |
| Response body | DER-encoded OCSPResponse signed by the **OCSP responder cert** (NOT the CA cert) |

Example:

```bash
curl --cacert ca.crt \
     -o response.der \
     https://localhost:8443/.well-known/pki/ocsp/iss-local/a1b2c3d4

openssl ocsp -respin response.der -text -CAfile ca.crt
```

#### POST form (the standard one)

```
POST https://<host>/.well-known/pki/ocsp/{issuer_id}
Content-Type: application/ocsp-request
Body: <DER-encoded OCSPRequest>
```

| Field | Value |
| --- | --- |
| Method | `POST` |
| Auth | None |
| Request Content-Type | `application/ocsp-request` |
| Response Content-Type | `application/ocsp-response` |

Example with OpenSSL building the request:

```bash
openssl ocsp -issuer ca.crt -cert leaf.crt -reqout request.der

curl --cacert ca.crt \
     -X POST \
     -H "Content-Type: application/ocsp-request" \
     --data-binary @request.der \
     -o response.der \
     https://localhost:8443/.well-known/pki/ocsp/iss-local

openssl ocsp -respin response.der -text -CAfile ca.crt
```

The body-size limit applies (`http.MaxBytesReader` from middleware,
default 1MB, configurable via `CERTCTL_MAX_BODY_SIZE`); a typical OCSPRequest
is ~200 bytes so this is a generous cap.
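
For relying parties that script the POST form rather than shelling out to `curl`, a minimal standard-library sketch might look like the following. The function name `build_ocsp_post` and the `MAX_BODY_BYTES` constant are hypothetical, and reading the documented "1MB" default as 1,048,576 bytes is an assumption; the request bytes themselves are expected to come from a tool such as `openssl ocsp -reqout`.

```python
import urllib.request

# Assumption: the documented 1MB default is interpreted as 1 MiB.
MAX_BODY_BYTES = 1024 * 1024


def build_ocsp_post(base_url: str, issuer_id: str, der_request: bytes) -> urllib.request.Request:
    """Build (but do not send) a POST carrying a DER-encoded OCSPRequest."""
    if len(der_request) > MAX_BODY_BYTES:
        raise ValueError("OCSPRequest exceeds the server's body-size cap")
    url = f"{base_url.rstrip('/')}/.well-known/pki/ocsp/{issuer_id}"
    return urllib.request.Request(
        url,
        data=der_request,
        method="POST",
        headers={"Content-Type": "application/ocsp-request"},
    )
```

To send it, pass the request to `urllib.request.urlopen` together with an `ssl.create_default_context(cafile="ca.crt")` context and write the response body to `response.der` for inspection with `openssl ocsp -respin`.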

### Admin observability endpoint

```
GET https://<host>/api/v1/admin/crl/cache
Authorization: Bearer <token-with-admin-flag>
```

Returns the per-issuer cache state, for ops dashboards, GUI badges, or
"is the scheduler keeping up?" diagnostics. The endpoint is admin-gated (M-008
handler allowlist); non-admin Bearer callers receive HTTP 403. Response shape:

```json
{
  "cache_rows": [
    {
      "issuer_id": "iss-local",
      "cache_present": true,
      "crl_number": 42,
      "this_update": "2026-04-29T10:00:00Z",
      "next_update": "2026-04-29T11:00:00Z",
      "generated_at": "2026-04-29T10:00:00Z",
      "generation_duration_ms": 87,
      "revoked_count": 13,
      "is_stale": false,
      "recent_events": [
        {
          "started_at": "2026-04-29T10:00:00Z",
          "duration_ms": 87,
          "succeeded": true,
          "crl_number": 42,
          "revoked_count": 13
        }
      ]
    }
  ],
  "row_count": 1,
  "generated_at": "2026-04-29T10:30:00Z"
}
```

Issuers that have not yet had a CRL generated appear with `cache_present: false`
so the GUI can render a "Not yet generated" pill rather than 404.
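
A dashboard or cron check could triage that response along these lines. This is a sketch only: the function name `issuers_needing_attention` is made up for illustration, and it assumes that rows with `cache_present: false` may omit the other fields, which the guide does not specify.

```python
def issuers_needing_attention(payload: dict) -> dict:
    """Split issuers into 'not yet generated' and 'stale' buckets."""
    not_generated, stale = [], []
    for row in payload.get("cache_rows", []):
        if not row.get("cache_present", False):
            # No CRL has been generated for this issuer yet.
            not_generated.append(row["issuer_id"])
        elif row.get("is_stale", False):
            # A CRL exists but the scheduler has not refreshed it in time.
            stale.append(row["issuer_id"])
    return {"not_generated": not_generated, "stale": stale}


sample = {
    "cache_rows": [
        {"issuer_id": "iss-local", "cache_present": True, "is_stale": False},
        {"issuer_id": "iss-new", "cache_present": False},
        {"issuer_id": "iss-lagging", "cache_present": True, "is_stale": True},
    ],
    "row_count": 3,
}
print(issuers_needing_attention(sample))
```

The `iss-new` and `iss-lagging` issuer IDs are invented sample data; only `iss-local` appears in the guide.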

---

## Configuration

| Env var | Default | Meaning |
| --- | --- | --- |
| `CERTCTL_CRL_GENERATION_INTERVAL` | `1h` | How often the scheduler walks every CRL-supporting issuer and rebuilds. The HTTP handler reads from the cache, not from a per-request rebuild. |
| `CERTCTL_OCSP_RESPONDER_KEY_DIR` | unset | **Operator MUST set in production.** Directory where the FileDriver persists each issuer's OCSP responder key (`ocsp-responder-<issuer_id>.key`). When unset, the responder service uses a temporary directory that does NOT survive restarts: fine for dev, NEVER for prod. |
| `CERTCTL_OCSP_RESPONDER_ROTATION_GRACE` | `7d` | When the responder cert's `NotAfter` falls within this window, `EnsureResponder` rotates to a fresh cert+key on the next OCSP request or scheduler tick. |
| `CERTCTL_OCSP_RESPONDER_VALIDITY` | `30d` | How long each newly-issued responder cert is valid for. Short by design: relying parties cache OCSP responses, not the responder cert chain, and `id-pkix-ocsp-nocheck` blocks recursive revocation checking on the responder itself. |

The issuer-level CRL `nextUpdate` is derived from the generation timestamp +
the configured CRL validity (currently a build-time constant in the
`CRLCacheService`; configurable knob deferred until an operator asks).
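
A deployment pre-flight script can apply the same defaults and flag a missing key directory before certctl starts. The sketch below is an assumption-heavy illustration: `parse_duration` and `load_revocation_config` are hypothetical helpers, and the `<integer><s|m|h|d>` duration grammar is inferred from the defaults in the table, not from certctl's actual parser.

```python
import re
from datetime import timedelta

_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}

# Defaults copied from the configuration table.
DURATION_DEFAULTS = {
    "CERTCTL_CRL_GENERATION_INTERVAL": "1h",
    "CERTCTL_OCSP_RESPONDER_ROTATION_GRACE": "7d",
    "CERTCTL_OCSP_RESPONDER_VALIDITY": "30d",
}


def parse_duration(text: str) -> timedelta:
    """Parse values such as '1h' or '7d' (assumed grammar, see lead-in)."""
    match = re.fullmatch(r"(\d+)([smhd])", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return timedelta(**{_UNITS[match.group(2)]: int(match.group(1))})


def load_revocation_config(env: dict) -> dict:
    """Resolve the revocation settings from an environment mapping."""
    config = {
        name: parse_duration(env.get(name, default))
        for name, default in DURATION_DEFAULTS.items()
    }
    key_dir = env.get("CERTCTL_OCSP_RESPONDER_KEY_DIR", "")
    config["CERTCTL_OCSP_RESPONDER_KEY_DIR"] = key_dir or None
    # Unset key dir means responder keys do not survive restarts.
    config["key_dir_is_ephemeral"] = not key_dir
    return config
```

Calling `load_revocation_config(dict(os.environ))` in a production readiness check and failing when `key_dir_is_ephemeral` is true turns the "MUST set in production" callout into an enforced gate.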

---

## OCSP responder cert lifecycle

1. **First OCSP request for an issuer (or scheduler tick).** The local
   issuer's `SignOCSPResponse` calls into `OCSPResponderService.EnsureResponder`.
2. **Cache lookup.** `EnsureResponder` queries the `ocsp_responders` table for
   a row keyed by `issuer_id`.
3. **Disk lookup.** If a row exists, the FileDriver reads the persisted key
   from `<keydir>/ocsp-responder-<issuer_id>.key`. **Self-healing:** if the
   row exists but the file is missing (operator pruned the keydir without
   pruning the DB), the service treats this as "rotate now" rather than
   crashing.
4. **Rotation check.** If `cert.NotAfter < now + RotationGrace`, the service
   generates a fresh ECDSA-P256 key, builds a `*x509.CertificateRequest`,
   and asks the local issuer's existing `IssueCertificate` flow to sign it.
   The signing template carries:
   - `KeyUsage: x509.KeyUsageDigitalSignature` (signing OCSP responses)
   - `ExtKeyUsage: x509.ExtKeyUsageOCSPSigning` (RFC 6960 §4.2.2.2)
   - The `id-pkix-ocsp-nocheck` extension (OID `1.3.6.1.5.5.7.48.1.5`,
     DER value `NULL`, RFC 6960 §4.2.2.2.1) wired through
     `Certificate.ExtraExtensions`.
5. **Persistence.** The new cert + key path are written to `ocsp_responders`
   via an idempotent `INSERT … ON CONFLICT DO UPDATE`.
6. **Response signing.**
   `ocsp.CreateResponse(caCert, responderCert, template, responderSigner)`
   produces the response bytes; the responder cert is included in the response
   chain so relying parties can validate without a separate fetch.
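
The decision made across steps 2 to 4 can be condensed into a small pure function. This is a simplified model for reasoning about the rules, not certctl's Go implementation; the name `responder_action` and its return labels are invented here.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def responder_action(
    row_not_after: Optional[datetime],
    key_file_exists: bool,
    now: datetime,
    rotation_grace: timedelta,
) -> str:
    """Return 'bootstrap', 'rotate', or 'reuse' for one issuer."""
    if row_not_after is None:
        return "bootstrap"  # step 2: no ocsp_responders row yet
    if not key_file_exists:
        return "rotate"  # step 3: self-healing when the key file was pruned
    if row_not_after < now + rotation_grace:
        return "rotate"  # step 4: NotAfter falls inside the grace window
    return "reuse"


issued = datetime(2026, 4, 1, tzinfo=timezone.utc)
not_after = issued + timedelta(days=30)
grace = timedelta(days=7)
print(responder_action(not_after, True, issued + timedelta(days=22), grace))  # reuse
print(responder_action(not_after, True, issued + timedelta(days=24), grace))  # rotate
```

With the documented defaults (30d validity, 7d grace), and assuming `NotAfter` is set to issuance time plus the validity, this rule starts rotating once a responder cert is more than roughly 23 days old.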

The race between scheduler-driven cache refresh and on-demand cache miss is
collapsed by the `CRLCacheService`'s in-tree singleflight (a `sync.Map` of
`*flightEntry` keyed by `issuer_id`). Concurrent generation requests for the
same issuer wait on the in-flight result rather than each rebuilding from
scratch.
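
The singleflight idea is easy to model outside Go. The following is a generic illustration of the technique using Python threads, not a port of certctl's `flightEntry` code; the class names `SingleFlight` and `_Flight` are hypothetical.

```python
import threading


class _Flight:
    """State shared between the leader and the callers waiting on it."""

    def __init__(self):
        self.done = threading.Event()
        self.result = None
        self.error = None


class SingleFlight:
    """Collapse concurrent calls for the same key into one execution."""

    def __init__(self):
        self._lock = threading.Lock()
        self._flights = {}  # key (e.g. issuer_id) -> _Flight

    def do(self, key, fn):
        with self._lock:
            flight = self._flights.get(key)
            leader = flight is None
            if leader:
                flight = _Flight()
                self._flights[key] = flight
        if leader:
            try:
                flight.result = fn()  # only the leader runs the expensive build
            except BaseException as exc:
                flight.error = exc
            finally:
                with self._lock:
                    del self._flights[key]
                flight.done.set()  # wake every waiting follower
        else:
            flight.done.wait()
        if flight.error is not None:
            raise flight.error
        return flight.result
```

A caller would wrap the CRL rebuild as `sf.do(issuer_id, rebuild)`; requests that arrive while a rebuild is in flight receive the leader's result, and a request that arrives after it finishes starts a fresh flight.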

---

## Pointing common consumers at the endpoints

### cert-manager (Kubernetes)

cert-manager's certificate-validation logic checks both the AIA OCSP URI
embedded in the leaf and the CDP CRL URI. Both are populated automatically
by the local issuer's certificate template, so relying parties should NOT
need any additional configuration. To verify:

```bash
openssl x509 -in leaf.crt -text -noout | grep -A1 "Authority Information Access"
openssl x509 -in leaf.crt -text -noout | grep -A2 "CRL Distribution Points"
```

If your cert-manager pods cannot reach `https://<certctl-host>:8443/.well-known/pki/`,
add a NetworkPolicy egress rule or expose the certctl service via the
appropriate ingress class.

### Firefox

Firefox honors the AIA OCSP URI by default. To force-refresh the local
revocation cache after revoking a cert in dev:

```
about:preferences#privacy → Certificates → Query OCSP responder servers
```

If Firefox reports `SEC_ERROR_OCSP_INVALID_SIGNING_CERT`, verify that the
responder cert chain is reachable from the system trust store. Firefox is
strict about `id-pkix-ocsp-nocheck`; certctl sets the extension automatically
on every responder cert it issues.

### OpenSSL

```bash
# OCSP via stand-alone request
openssl ocsp -issuer ca.crt -cert leaf.crt -url https://localhost:8443/.well-known/pki/ocsp/iss-local -CAfile ca.crt -text

# OCSP via TLS Certificate Status Request extension
openssl s_client -connect example.com:443 -status -CAfile ca.crt
```

### Intune (corporate device state)

Intune device-compliance validators pull the CRL on a schedule (configured in
the Intune admin console, default 24h). Configure the CRL distribution point
to `https://<certctl-host>:8443/.well-known/pki/crl/<issuer_id>` and Intune
will pull on its own cadence.

---

## What this release does NOT include (V3-Pro)

The following are explicitly out of scope for the V2 (free) bundle and are
tracked for the certctl Pro release:

- **Delta CRLs (RFC 5280 §5.2.4).** Useful for very large CRLs (10k+
  revoked certs); the data model already accommodates the Base CRL Number
  reference but the pipeline only emits Base CRLs in V2.
- **OCSP rate-limiting per relying party.** Per-IP token bucket on the OCSP
  endpoint; V3-Pro because it justifies per-seat pricing for high-traffic
  responders.
- **OCSP stapling.** Server-side: cache pre-fetched OCSP responses + serve
  in TLS handshake. Client-side: a "stapling fetcher" agent for non-stapling
  origins.

The MaxBytesReader cap is the only request-level guard in V2; the
unauthenticated-by-design relying-party endpoints are intentionally not
rate-limited per IP.

---

## Troubleshooting

**`pki/crl/<issuer_id>` returns 404.** The issuer either does not support
CRL signing (Vault, EJBCA, DigiCert serve their own CRL infrastructure;
certctl's connectors return `nil` from `GenerateCRL` for these) or the
issuer ID is wrong. Verify with `GET /api/v1/issuers`.

**`pki/ocsp/<issuer_id>/<serial>` returns 200 but `openssl ocsp -text`
shows "unauthorized".** Check that the serial in the URL is hex-encoded
(lowercase, no `0x` prefix, leading zeros preserved). Mismatched serials
return an OCSP response with status `unauthorized` per RFC 6960 §2.3.
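
Serial strings copied from tooling rarely arrive in that shape. A small normaliser, sketched here with a hypothetical name (`normalize_serial_hex`), converts the common variants (the `serial=` prefix printed by `openssl x509 -noout -serial`, colon-separated bytes, a `0x` prefix, uppercase) into the form described above. Whether the server tolerates other variants is not stated in this guide, so the function is deliberately strict.

```python
def normalize_serial_hex(text: str) -> str:
    """Return a lowercase hex serial with no prefix and leading zeros kept."""
    cleaned = text.strip()
    if cleaned.lower().startswith("serial="):
        cleaned = cleaned[len("serial="):]
    cleaned = cleaned.replace(":", "").replace(" ", "")
    if cleaned[:2].lower() == "0x":
        cleaned = cleaned[2:]
    cleaned = cleaned.lower()
    if not cleaned or any(ch not in "0123456789abcdef" for ch in cleaned):
        raise ValueError(f"not a hex serial: {text!r}")
    return cleaned


print(normalize_serial_hex("serial=00A1B2C3D4"))  # 00a1b2c3d4
print(normalize_serial_hex("0A:1B:2C"))           # 0a1b2c
```

The result can be appended directly to `/.well-known/pki/ocsp/<issuer_id>/` for the GET form.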

**Admin cache endpoint returns 403.** The Bearer key does not carry the
admin flag. M-008 gates this endpoint server-side; the GUI also gates the
fetch on `useAuth().admin`. Either escalate the key
(`certctl admin keys promote <key-id>`) or use a different identity.

**Cache shows `is_stale: true` repeatedly.** The scheduler is not running
(or not getting scheduled often enough). Check `CERTCTL_CRL_GENERATION_INTERVAL`
and confirm the scheduler started: `grep crlGenerationLoop` in the server
logs at startup.