Files
shankar0123 19c8fafe84 docs: Phase 14 — Last reviewed line sweep across docs/
Per Phase 1 audit at cowork/docs-overhaul-phase-1-audit-2026-05-04/.
Adds a `> Last reviewed: 2026-05-05` line right after the H1 heading
of every doc that didn't already have one (41 files).

This dates the freshness clock for the future Phase 4 per-doc review.
The discipline going forward: when a doc's content gets a meaningful
edit, bump the date. When the date gets old (e.g., >6 months), the
doc earns a freshness-review pass.

Mechanical insertion via awk one-liner, applied to every docs/*.md
that didn't already match `grep -q 'Last reviewed:'`. Files that
already carried the line from earlier Phase 2 work (the navigation
index, the new connector docs, the new SCEP server / legacy-clients-
TLS-1.2 / release-verification docs, and the 5 per-connector deep
dives) were skipped to avoid duplicate insertion.

Net: every doc in docs/ now has a Last reviewed line.
2026-05-05 03:26:46 +00:00

414 lines
16 KiB
Markdown

# CRL & OCSP — Revocation Status for Relying Parties
> Last reviewed: 2026-05-05
This guide is the operator + relying-party reference for certctl's revocation
status surfaces. It covers the wire format, endpoint URLs, configuration knobs,
the OCSP responder cert lifecycle, and how to point common consumers
(cert-manager, Firefox, OpenSSL) at the endpoints.
If you're looking for the higher-level architecture, see
[`architecture.md` § Security Model](architecture.md#security-model). If you're
looking for the revocation policy / reason codes the API accepts, see
[`api/openapi.yaml` § /certificates/{id}/revoke](../api/openapi.yaml).
---
## Conceptual overview
**Why two formats.** RFC 5280 §5 defines a Certificate Revocation List (CRL)
— a periodically-published, signed list of every revoked certificate for an
issuer. RFC 6960 defines the Online Certificate Status Protocol (OCSP) — a
request/response protocol that returns the status of a single certificate by
serial number. CRLs are batch-friendly and cacheable; OCSP is point-query and
fresh. Production PKI deployments serve both because different relying parties
prefer different trade-offs:
- Browsers (Firefox / Safari) prefer OCSP for freshness; some pin OCSP
stapling.
- cert-manager and most Linux TLS clients fall back to CRL when OCSP is
unreachable.
- Microsoft Intune / corporate device-state validators do periodic CRL pulls.
- OpenSSL `s_client -status` exercises OCSP via the `Certificate Status
Request` extension during the handshake.
certctl's local issuer publishes both, with a pre-generation cache so a busy
CA does not DOS itself rebuilding the CRL on every fetch.
**Why a separate OCSP responder cert.** RFC 6960 §2.6 + §4.2.2.2 strongly
recommend that OCSP responses be signed by a delegated "OCSP responder cert"
issued by the CA, NOT by the CA private key directly. The responder cert
carries the `id-pkix-ocsp-nocheck` extension (RFC 6960 §4.2.2.2.1) so OCSP
clients do not recursively check the responder cert's revocation status. This
keeps the CA private key cold (an HSM operation per OCSP request would be
prohibitive at scale) and lets the responder key live on disk, on a separate
HSM partition, or rotate frequently while the CA key stays untouched.
---
## Endpoints
All revocation endpoints live under `/.well-known/pki/` per RFC 8615 and run
**unauthenticated** — relying parties without certctl API credentials must be
able to validate revocation status. The HTTPS-only TLS 1.3 control plane
applies; there is no plaintext fallback.
### CRL — Certificate Revocation List
```
GET https://<host>/.well-known/pki/crl/{issuer_id}
```
| Field | Value |
| --- | --- |
| Method | `GET` |
| Auth | None (unauthenticated, RFC 5280 §5 distribution semantics) |
| Response Content-Type | `application/pkix-crl` |
| Response body | DER-encoded X.509 CRL signed by the issuer's CA |
| Cache | Pre-generated by the scheduler; configurable interval |
Example:
```bash
curl --cacert ca.crt \
-o crl.der \
https://localhost:8443/.well-known/pki/crl/iss-local
openssl crl -inform DER -in crl.der -text -noout
```
### OCSP — Online Certificate Status Protocol
certctl serves both the GET form (RFC 6960 §A.1.1, simple URL-path lookup)
and the POST form (RFC 6960 §A.1.1, binary OCSPRequest body). Most
production OCSP clients (Firefox, OpenSSL `s_client -status`, cert-manager,
Intune) use POST. The GET form is preserved for ops curl-debugging.
#### GET form
```
GET https://<host>/.well-known/pki/ocsp/{issuer_id}/{serial_hex}
```
| Field | Value |
| --- | --- |
| Method | `GET` |
| Auth | None |
| Response Content-Type | `application/ocsp-response` |
| Response body | DER-encoded OCSPResponse signed by the **OCSP responder cert** (NOT the CA cert) |
Example:
```bash
curl --cacert ca.crt \
-o response.der \
https://localhost:8443/.well-known/pki/ocsp/iss-local/a1b2c3d4
openssl ocsp -respin response.der -text -CAfile ca.crt
```
#### POST form (the standard one)
```
POST https://<host>/.well-known/pki/ocsp/{issuer_id}
Content-Type: application/ocsp-request
Body: <DER-encoded OCSPRequest>
```
| Field | Value |
| --- | --- |
| Method | `POST` |
| Auth | None |
| Request Content-Type | `application/ocsp-request` |
| Response Content-Type | `application/ocsp-response` |
Example with OpenSSL building the request:
```bash
openssl ocsp -issuer ca.crt -cert leaf.crt -reqout request.der
curl --cacert ca.crt \
-X POST \
-H "Content-Type: application/ocsp-request" \
--data-binary @request.der \
-o response.der \
https://localhost:8443/.well-known/pki/ocsp/iss-local
openssl ocsp -respin response.der -text -CAfile ca.crt
```
The body-size limit applies (`http.MaxBytesReader` from middleware,
default 1MB, configurable via `CERTCTL_MAX_BODY_SIZE`); a typical OCSPRequest
is ~200 bytes so this is a generous cap.
### Admin observability endpoint
```
GET https://<host>/api/v1/admin/crl/cache
Authorization: Bearer <token-with-admin-flag>
```
Returns the per-issuer cache state — for ops dashboards, GUI badges, or
"is the scheduler keeping up?" diagnostics. Admin-gated (M-008 admin-gated
handler allowlist; non-admin Bearer callers receive HTTP 403). Response shape:
```json
{
"cache_rows": [
{
"issuer_id": "iss-local",
"cache_present": true,
"crl_number": 42,
"this_update": "2026-04-29T10:00:00Z",
"next_update": "2026-04-29T11:00:00Z",
"generated_at": "2026-04-29T10:00:00Z",
"generation_duration_ms": 87,
"revoked_count": 13,
"is_stale": false,
"recent_events": [
{
"started_at": "2026-04-29T10:00:00Z",
"duration_ms": 87,
"succeeded": true,
"crl_number": 42,
"revoked_count": 13
}
]
}
],
"row_count": 1,
"generated_at": "2026-04-29T10:30:00Z"
}
```
Issuers that have not yet had a CRL generated appear with `cache_present:
false` so the GUI can render a "Not yet generated" pill rather than 404.
---
## Configuration
| Env var | Default | Meaning |
| --- | --- | --- |
| `CERTCTL_CRL_GENERATION_INTERVAL` | `1h` | How often the scheduler walks every CRL-supporting issuer and rebuilds. The HTTP handler reads from the cache, not from a per-request rebuild. |
| `CERTCTL_OCSP_RESPONDER_KEY_DIR` | unset | **Operator MUST set in production.** Directory where the FileDriver persists each issuer's OCSP responder key (`ocsp-responder-<issuer_id>.key`). When unset, the responder service uses a temporary directory that does NOT survive restarts — fine for dev, NEVER for prod. |
| `CERTCTL_OCSP_RESPONDER_ROTATION_GRACE` | `7d` | When the responder cert's `NotAfter` falls within this window, `EnsureResponder` rotates to a fresh cert+key on the next OCSP request or scheduler tick. |
| `CERTCTL_OCSP_RESPONDER_VALIDITY` | `30d` | How long each newly-issued responder cert is valid for. Short by design — relying parties cache OCSP responses, not the responder cert chain, and `id-pkix-ocsp-nocheck` blocks recursive revocation checking on the responder itself. |
The issuer-level CRL `nextUpdate` is derived from the generation timestamp +
the configured CRL validity (currently a build-time constant in the
`CRLCacheService`; configurable knob deferred until an operator asks).
---
## OCSP responder cert lifecycle
1. **First OCSP request for an issuer (or scheduler tick).** The local
issuer's `SignOCSPResponse` calls into `OCSPResponderService.EnsureResponder`.
2. **Cache lookup.** `EnsureResponder` queries the `ocsp_responders` table for
a row keyed by `issuer_id`.
3. **Disk lookup.** If a row exists, the FileDriver reads the persisted key
from `<keydir>/ocsp-responder-<issuer_id>.key`. **Self-healing:** if the
row exists but the file is missing (operator pruned the keydir without
pruning the DB), the service treats this as "rotate now" rather than
crashing.
4. **Rotation check.** If `cert.NotAfter < now + RotationGrace`, the service
generates a fresh ECDSA-P256 key, builds a `*x509.CertificateRequest`,
and asks the local issuer's existing `IssueCertificate` flow to sign it.
The signing template carries:
- `KeyUsage: x509.KeyUsageDigitalSignature` (signing OCSP responses)
- `ExtKeyUsage: x509.ExtKeyUsageOCSPSigning` (RFC 6960 §4.2.2.2)
- The `id-pkix-ocsp-nocheck` extension (OID `1.3.6.1.5.5.7.48.1.5`,
DER value `NULL`, RFC 6960 §4.2.2.2.1) wired through
`Certificate.ExtraExtensions`.
5. **Persistence.** The new cert + key path are written to `ocsp_responders`
via an idempotent `INSERT … ON CONFLICT DO UPDATE`.
6. **Response signing.** `ocsp.CreateResponse(caCert, responderCert,
template, responderSigner)` produces the response bytes; the responder
cert is included in the response chain so relying parties can validate
without a separate fetch.
The race between scheduler-driven cache refresh and on-demand cache miss is
collapsed by the `CRLCacheService`'s in-tree singleflight (a `sync.Map` of
`*flightEntry` keyed by `issuer_id`). Concurrent generation requests for the
same issuer wait on the in-flight result rather than each rebuilding from
scratch.
---
## Pointing common consumers at the endpoints
### cert-manager (Kubernetes)
cert-manager's certificate-validation logic checks both the AIA OCSP URI
embedded in the leaf and the CDP CRL URI. Both are populated automatically
by the local issuer's certificate template — relying parties should NOT
need any additional configuration. To verify:
```bash
openssl x509 -in leaf.crt -text -noout | grep -A1 "Authority Information Access"
openssl x509 -in leaf.crt -text -noout | grep -A2 "CRL Distribution Points"
```
If your cert-manager pods cannot reach `https://<certctl-host>:8443/.well-known/pki/`,
add a NetworkPolicy egress rule or expose the certctl service via the
appropriate ingress class.
### Firefox
Firefox honors the AIA OCSP URI by default. To force-refresh the local
revocation cache after revoking a cert in dev:
```
about:preferences#privacy → Certificates → Query OCSP responder servers
```
If Firefox reports `SEC_ERROR_OCSP_INVALID_SIGNING_CERT`, verify that the
responder cert chain is reachable from the system trust store —
`id-pkix-ocsp-nocheck` is a Firefox-strict extension and is set automatically
on every responder cert certctl issues.
### OpenSSL
```bash
# OCSP via stand-alone request
openssl ocsp -issuer ca.crt -cert leaf.crt -url https://localhost:8443/.well-known/pki/ocsp/iss-local -CAfile ca.crt -text
# OCSP via TLS Certificate Status Request extension
openssl s_client -connect example.com:443 -status -CAfile ca.crt
```
### Intune (corporate device state)
Intune device-compliance validators pull the CRL on a schedule (configured in
the Intune admin console, default 24h). Configure the CRL distribution point
to `https://<certctl-host>:8443/.well-known/pki/crl/<issuer_id>` and Intune
will pull on its own cadence.
---
## Production hardening II additions (post-2026-04-30)
The following capabilities were folded into V2 (free) by the production
hardening II bundle. Each closes a real procurement-team checklist gap
without requiring a paid tier.
### OCSP nonce extension (RFC 6960 §4.4.1)
The POST OCSP handler echoes the request's nonce extension (OID
`1.3.6.1.5.5.7.48.1.2`) in the response. Defends against replay attacks
where a relying party's cached response is replayed against a now-revoked
cert. Always-on; no operator opt-out.
Failure modes:
- **No nonce in request** — back-compat; response omits the extension.
- **Well-formed nonce ≤ 32 bytes** — response echoes it; tracked in
`certctl_ocsp_counter_total{label="nonce_echoed"}`.
- **Empty or oversized nonce (> 32 bytes per CA/B Forum BR §4.10.2)** —
responder returns the canonical "unauthorized" status (RFC 6960 §2.3
status 6); tracked in `certctl_ocsp_counter_total{label="nonce_malformed"}`.
### OCSP pre-signed response cache
Mirrors the existing CRL cache. Per-(issuer, serial) entries pre-signed
and stored in `ocsp_response_cache`; the read-through facade in
`CAOperationsSvc.GetOCSPResponseWithNonce` consults the cache for
nil-nonce requests and falls through to live signing on miss + writes
the result back. Nonce-bearing requests always live-sign because the
cache stores nil-nonce blobs.
**Load-bearing security wire:** `RevocationSvc.RevokeCertificateWithActor`
calls `InvalidateOnRevoke` after a successful revocation so the next
OCSP fetch returns the revoked status. There is no stale-good window
after revoke.
### Per-source-IP OCSP rate limit + per-actor cert-export rate limit
Defaults: 1000 req/min/IP for OCSP; 50 exports/hr/operator for the
cert-export endpoints. Configurable via
`CERTCTL_OCSP_RATE_LIMIT_PER_IP_MIN` and
`CERTCTL_CERT_EXPORT_RATE_LIMIT_PER_ACTOR_HR`; zero disables.
OCSP rate-limit trip: canonical "unauthorized" OCSP blob plus
`Retry-After: 60`. Cert-export trip: HTTP 429 + JSON
`{"error":"rate_limit_exceeded","retry_after_seconds":3600}`.
The OCSP limiter does NOT honor `X-Forwarded-For` because OCSP is
publicly reachable and untrusted intermediaries could spoof the header
to bypass the cap.
### CRL HTTP caching headers (RFC 7232)
`GET /.well-known/pki/crl/{issuer_id}` now returns weak-form ETag,
`Cache-Control: public, max-age=3600, must-revalidate`, and respects
`If-None-Match` for HTTP 304 short-circuits. Lets CDNs and reverse
proxies serve repeated fetches from edge cache.
### CRL DistributionPoint auto-injection
Local issuer config field `CRLDistributionPointURLs []string`; when
non-empty, every issued cert carries the RFC 5280 §4.2.1.13
`id-ce-cRLDistributionPoints` extension pointing at certctl's CRL
endpoint. Refusing to silently inject an empty CDP is deliberate —
silent-empty fails relying-party validation worse than no CDP.
### Cert-export typed audit codes + Prometheus per-area metrics
Audit emission now carries typed action constants
(`cert_export_pem`, `cert_export_pkcs12`, `cert_export_failed`)
alongside legacy bare codes. Detail map enriched with
`has_private_key` (always false in V2) and `cipher`
(`AES-256-CBC-PBE2-SHA256` — pinned).
`GET /api/v1/metrics/prometheus` surfaces the new per-area counters
under the `certctl_<area>_counter_total{label=...}` family. OCSP
shipped in this bundle; alert recommendations:
- `{label="rate_limited"}` rate > 0 sustained > 5m → notify (limiter
is doing its job; investigate source IP).
- `{label="nonce_malformed"}` > 0 → notify (legitimate clients don't
send malformed nonces).
- `{label="signing_failed"}` > 0 → page on-call (issuer connector
failing).
## What this release does NOT include (V3-Pro)
Still out of scope for V2; tracked for V3-Pro:
- **Delta CRLs (RFC 5280 §5.2.4).** Useful for very large CRLs (10k+
revoked certs); the data model accommodates the Base CRL Number
reference but the pipeline only emits Base CRLs in V2.
- **OCSP stapling at SCEP/EST CertRep response time.** Server-side
pre-staple into the TLS handshake context.
- **OCSP request signature verification (RFC 6960 §4.1.1).** Optional
per-spec; certctl currently ignores the signature.
- **OCSP responder HA / multi-region replication.** Active-active
OCSP cache with Postgres logical replication.
- **CRL Issuing Distribution Point (IDP) extension** (RFC 5280
§5.2.5) — for sharded CRL deployments.
---
## Troubleshooting
**`pki/crl/<issuer_id>` returns 404.** The issuer either does not support
CRL signing (Vault, EJBCA, DigiCert serve their own CRL infrastructure;
certctl's connectors return `nil` from `GenerateCRL` for these) or the
issuer ID is wrong. Verify with `GET /api/v1/issuers`.
**`pki/ocsp/<issuer_id>/<serial>` returns 200 but `openssl ocsp -text`
shows "unauthorized".** Check that the serial in the URL is hex-encoded (no
`0x` prefix, no leading zeros stripped, lowercase). Mismatched serials
return an OCSP response with status `unauthorized` per RFC 6960 §2.3.
**Admin cache endpoint returns 403.** The Bearer key does not carry the
admin flag. M-008 gates this endpoint server-side; the GUI also gates the
fetch on `useAuth().admin`. Either escalate the key (`certctl admin
keys promote <key-id>`) or use a different identity.
**Cache shows `is_stale: true` repeatedly.** The scheduler is not running
(or not getting scheduled often enough). Check `CERTCTL_CRL_GENERATION_INTERVAL`
and confirm the scheduler started: `grep crlGenerationLoop` in the server
logs at startup.