mirror of
https://github.com/shankar0123/certctl.git
synced 2026-06-07 15:11:29 +00:00
d809874fa1
Per operator decision the framework-mapping docs are gone. They
were aspirational (no audit, no certification, no validated
mapping); keeping them around was misleading.
Files deleted (1,883 lines):
- docs/compliance/index.md
- docs/compliance/soc2.md
- docs/compliance/pci-dss.md
- docs/compliance/nist-sp-800-57.md
Hyperlinks removed:
- README.md: 'Auditor / compliance' row in the doc table; the
'(compliance mapping included)' parenthetical in the
positioning paragraph
- docs/README.md: the '## Compliance' section table; the
'Auditor / compliance team' reading-order-by-role row
Prose name-drops swept across 24 files:
- README.md: 'FedRAMP boundary CAs / financial-services policy
CAs' → '4-level boundary CAs / 3-level policy CAs';
'Compliance-grade for PCI-DSS Level 1, FedRAMP Moderate / High,
SOC 2 Type II, HIPAA' → cut entirely
- getting-started/{quickstart,concepts,examples,why-certctl,
advanced-demo}.md: 'compliance' → 'audit' / 'policy';
'PCI-DSS / SOC 2 / NIST SP 800-57' framework lists cut;
''pci': 'true'' tag example → ''environment': 'production''
- migration/cert-manager-coexistence.md: 'compliance rules' →
'policy rules'
- operator/approval-workflow.md: 'Compliance customers (PCI-DSS
Level 1, FedRAMP Moderate / High, SOC 2 Type II, HIPAA)' →
'Operators'; entire 'Compliance control mapping' table
(PCI-DSS §6.4.5 / NIST SP 800-53 SA-15 / SOC 2 Type II CC6.1
/ HIPAA §164.308(a)(4)) deleted; 'compliance contract' →
'two-person-integrity contract'; 'compliance auditors' →
'reviewers'
- operator/legacy-clients-tls-1.2.md: 'PCI-DSS v4.0 Req 4 §2.2.5'
audit-reference → CWE-326 (kept); 'PCI-DSS Req 4 §2.2.5
attestation' section retitled to 'TLS posture summary' and
rewritten without framework framing; 'PCI-DSS, NIST, and
major browsers will eventually deprecate TLS 1.2' →
'Major browsers and OS vendors will eventually deprecate
TLS 1.2'
- operator/database-tls.md: PCI-DSS Req 4 §2.2.5 audit-ref →
CWE-319 only; 'PCI-DSS scope' → 'sensitive data'; PCI-DSS
Req 4 v4.0 prose footing → cut
- operator/runbooks/disaster-recovery.md: 'SOC 2 / PCI
procurement-team deliverable' → 'on-call deliverable';
'compliance auditors' → 'reviewers'
- reference/connectors/{acme,aws-acm,azure-kv,globalsign,
local-ca,openssl,ssh,index}.md: 'compliance reporting
(PCI-DSS §3.6, HIPAA §164.312)' → 'audit reporting';
'Compliance environments (PCI-DSS Level 1, FedRAMP High,
HIPAA)' → 'Regulated environments'; 'compliance audits' →
'audit'; 'FedRAMP boundary CA' pattern names →
'4-level boundary CA' (technically descriptive)
- reference/protocols/est.md: 'compliance-hook seam' →
'device-state hook seam'; 'compliance gating' → 'device-state
gating'; 'est_compliance_failed' → 'est_device_state_failed'
- reference/protocols/scep-intune.md: 'Optional compliance
check' → 'Optional device-state check'; failure-counter
'compliance_failed' → 'device_state_failed'; 'Conditional
Access compliance gating' → 'Conditional Access
device-state gating'
- reference/intermediate-ca-hierarchy.md: 'FedRAMP boundary-CA
deployments where the regulator requires...' →
'Boundary-CA deployments where you want separation of policy
and issuing authorities'; pattern A retitled '4-level FedRAMP
boundary CA' → '4-level boundary CA'
- reference/architecture.md: broken Related-docs link to
compliance.md removed; the rest of that block had stale
pre-Phase-2 paths (quickstart.md, demo-advanced.md,
connectors.md, openapi.md, testing-guide.md, test-env.md) —
retargeted to current locations
- reference/deployment-model.md: 'SOC 2 evidence-report
generator' → 'Audit-evidence report generator'
- reference/vendor-matrix.md: 'SOC 2 / PCI auditors paste this
into evidence packs' → 'reviewers paste this into
vendor-evaluation packs'
- contributor/qa-test-suite.md: 'compliance exist' coverage
description cut; 'Compliance (PCI / SOC2 / HIPAA-relevant)'
risk-class label → 'Audit-relevant'
What was kept:
- CWE references (legitimate technical pointers)
- Microsoft API/feature names that happen to use 'compliance'
literally ('Microsoft Graph compliance API',
'device-compliance validators' — these are MS product names,
not framework name-drops)
- 'NIST PQC' on the landing page (Post-Quantum Cryptography is
the actual NIST standard family, not a compliance framework)
Verified: zero hyperlinks into docs/compliance/ remain. All 24
ci-guards/*.sh pass locally. qa-doc-seed-count.sh clean.
Net diff: 26 files / -1,883 deletions in compliance/ + -32 net
across the prose sweep.
Companion edits in cowork/ (CLAUDE.md doc-tree summary +
WORKSPACE-CHANGELOG.md retirement note) land separately.
234 lines
9.3 KiB
Markdown
# Intermediate CA hierarchy — operator runbook

> Last reviewed: 2026-05-05

Rank 8 of the 2026-05-03 deep-research deliverable. This page is the
canonical reference for operators running certctl as a multi-level
internal PKI.

The default `single`-mode flow (one operator-supplied sub-CA loaded
from disk at boot) is unchanged and will keep working byte-for-byte
forever. This page is for operators who need a real CA tree:

- Boundary-CA deployments where you want separation of policy and
  issuing authorities.
- Policy-CA deployments (one root, one policy CA per business unit,
  one issuing CA per environment).
- OT / industrial control networks where the air-gapped root signs
  online sub-CAs that go in and out of service on a rotation.

## Concepts

`Issuer.HierarchyMode` is a per-issuer column on the `issuers` table.
Two values are valid. The database default is `"single"`, so
unmigrated rows keep the byte-identical pre-Rank-8 behavior:

- `single` — pre-Rank-8 historical flow. The local connector loads a
  pre-signed CA cert+key from disk via `local.Config.CACertPath` /
  `local.Config.CAKeyPath`. Existing operators upgrade with no
  behavior change.
- `tree` — the issuer's CAs are managed via the `intermediate_cas`
  table. Chain assembly walks the `parent_ca_id` foreign key from the
  issuing leaf CA up to the root and attaches the assembled chain to
  every `IssuanceResult`.

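The `parent_ca_id` walk used in `tree` mode can be illustrated with a minimal sketch. It is not certctl's implementation: the `CA` struct, the in-memory `rows` map, `assembleChain`, and `ErrBrokenChain` are hypothetical names chosen for illustration, and the real service reads rows from the `intermediate_cas` table.

```go
package main

import (
	"errors"
	"fmt"
)

// CA is a hypothetical, trimmed-down view of one intermediate_cas row.
type CA struct {
	ID       string
	ParentID string // empty for a root, mirroring an empty parent_ca_id
	CertPEM  string
}

// ErrBrokenChain is an illustrative error, not a certctl identifier.
var ErrBrokenChain = errors.New("chain walk hit a missing or cyclic parent")

// assembleChain walks parent links from the issuing CA up to the root and
// returns the certs ordered from the issuing CA to the root.
func assembleChain(rows map[string]CA, issuingID string) ([]string, error) {
	if issuingID == "" {
		return nil, fmt.Errorf("%w: empty issuing CA id", ErrBrokenChain)
	}
	var chain []string
	seen := make(map[string]bool)
	for id := issuingID; id != ""; {
		ca, ok := rows[id]
		if !ok || seen[id] {
			return nil, fmt.Errorf("%w: %q", ErrBrokenChain, id)
		}
		seen[id] = true
		chain = append(chain, ca.CertPEM)
		id = ca.ParentID
	}
	return chain, nil
}

func main() {
	rows := map[string]CA{
		"root":   {ID: "root", CertPEM: "ROOT"},
		"policy": {ID: "policy", ParentID: "root", CertPEM: "POLICY"},
		"iss":    {ID: "iss", ParentID: "policy", CertPEM: "ISSUING"},
	}
	chain, err := assembleChain(rows, "iss")
	fmt.Println(chain, err)
}
```

The `seen` set guards against a cyclic `parent_ca_id` reference, which a foreign key alone would not rule out.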
Each row in `intermediate_cas` is one CA cert (root, policy, issuing).
The lifecycle is `created` → `active` → `retiring` → `retired`. The
state column is a closed enum that is validated at the service layer; a
Postgres CHECK constraint enforces it at the database layer too.

A CA's private key bytes are NEVER persisted on the row. The
`key_driver_id` column is a reference (filesystem path / KMS key ID /
HSM slot) that the `signer.Driver` resolves at sign time. A SQL
injection or a row-leak surface MUST NEVER expose key bytes; only the
reference can leak.

## Lifecycle states

```mermaid
stateDiagram-v2
    [*] --> created : CreateRoot / CreateChild
    created --> active : registration completes
    active --> retiring : Retire(confirm=false)
    retiring --> retired : Retire(confirm=true)
    retired --> [*]

    note right of retiring
        Drain start. CA stops issuing
        NEW children; existing children
        keep issuing until they retire.
    end note

    note right of retired
        Terminal. Refused if active children
        remain (ErrCAStillHasActiveChildren
        → HTTP 409). OCSP keeps responding
        for already-issued leaves until expiry.
    end note
```

Drain-first semantics: a CA in `retiring` state cannot transition to
`retired` while it still has active children. The service layer
returns `ErrCAStillHasActiveChildren`; the API surfaces HTTP 409. Drain
the children first.

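The two-phase retirement and its drain-first guard can be sketched as a small state function. This is an assumption-laden illustration, not the service code: `retire`, the `State` type, and `ErrInvalidTransition` are hypothetical, and treating every other state/flag combination as invalid is a guess about behavior the page does not specify. Only `ErrCAStillHasActiveChildren` and the `confirm` flag come from the text above.

```go
package main

import (
	"errors"
	"fmt"
)

type State string

const (
	Created  State = "created"
	Active   State = "active"
	Retiring State = "retiring"
	Retired  State = "retired"
)

var (
	// Named on this page; the message text is illustrative.
	ErrCAStillHasActiveChildren = errors.New("CA still has active children")
	// Hypothetical error for transitions the diagram does not show.
	ErrInvalidTransition = errors.New("invalid lifecycle transition")
)

// retire applies one retirement step and returns the resulting state.
// confirm=false starts the drain; confirm=true finishes it, but only once
// no active children remain.
func retire(current State, confirm bool, activeChildren int) (State, error) {
	switch {
	case current == Active && !confirm:
		return Retiring, nil
	case current == Retiring && confirm:
		if activeChildren > 0 {
			return current, ErrCAStillHasActiveChildren
		}
		return Retired, nil
	default:
		return current, ErrInvalidTransition
	}
}

func main() {
	s, err := retire(Active, false, 2)
	fmt.Println(s, err)
	s, err = retire(s, true, 2) // refused: children not drained yet
	fmt.Println(s, err)
	s, err = retire(s, true, 0)
	fmt.Println(s, err)
}
```

On a refusal the function returns the unchanged state, so a caller can retry the confirm step after the children are drained.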
## Common deployment patterns

### Pattern A — 4-level boundary CA

```mermaid
flowchart TD
    Root["Acme Root CA<br/>path_len=3<br/>offline air-gapped"]
    Policy["Acme Policy CA<br/>path_len=2<br/>boundary"]
    IssA["Acme Issuing A<br/>path_len=1<br/>prod workload leaves"]
    IssB["Acme Issuing B<br/>path_len=0<br/>ephemeral pod identity"]
    Root --> Policy --> IssA --> IssB
```

Operator workflow:

1. Mint the root cert+key on the offline workstation. Move the cert
   PEM (no key) to the online operator workstation.
2. `POST /api/v1/issuers/{id}/intermediates` with an empty
   `parent_ca_id` and with `root_cert_pem` + `key_driver_id` populated
   (the operator pre-positions the root key file at the path the
   `key_driver_id` points to). The service validates RFC 5280 §3.2
   self-signed semantics and cross-checks that the operator-supplied
   key matches the cert (mismatched bundles are rejected at
   registration time with `ErrCAKeyMismatch`).
3. `POST /api/v1/issuers/{id}/intermediates` with `parent_ca_id`
   pointing at the root to create the Policy CA. The service generates
   the child key via `signer.Driver.Generate`, signs the child cert via
   the parent's signer (loaded from the parent's `key_driver_id`),
   and persists the new row with the next `path_len` value (the
   parent's value minus 1 if unset). Repeat for each lower level.
4. Set `Issuer.HierarchyMode = "tree"` on the issuer row and set the
   `treeIssuingCAID` connector field to point at the deepest CA
   (Acme Issuing B in the example above). Issued leaves then chain via
   `AssembleChain` from B up to the root.

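The registration checks in step 2 (self-signed validation plus the cert/key cross-check) could look roughly like the sketch below, built only on Go's standard `crypto/x509` package. `validateRoot` and `newSelfSignedRoot` are hypothetical helpers written for this example; the error names match the ones on this page, but how certctl actually wires them is an assumption.

```go
package main

import (
	"bytes"
	"crypto"
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"errors"
	"fmt"
	"math/big"
	"time"
)

var (
	ErrCANotSelfSigned = errors.New("root cert is not self-signed")
	ErrCAKeyMismatch   = errors.New("key does not match cert")
)

// validateRoot checks that cert is self-signed (subject == issuer and the
// signature verifies under its own key) and that key is the cert's key.
func validateRoot(cert *x509.Certificate, key crypto.Signer) error {
	if !bytes.Equal(cert.RawSubject, cert.RawIssuer) {
		return ErrCANotSelfSigned
	}
	if err := cert.CheckSignatureFrom(cert); err != nil {
		return fmt.Errorf("%w: %v", ErrCANotSelfSigned, err)
	}
	pub, ok := key.Public().(interface{ Equal(crypto.PublicKey) bool })
	if !ok || !pub.Equal(cert.PublicKey) {
		return ErrCAKeyMismatch
	}
	return nil
}

// newSelfSignedRoot mints a throwaway ECDSA root, for demonstration only.
func newSelfSignedRoot(cn string) (*x509.Certificate, *ecdsa.PrivateKey, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: cn},
		NotBefore:             time.Now().Add(-time.Minute),
		NotAfter:              time.Now().Add(24 * time.Hour),
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign | x509.KeyUsageCRLSign,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, nil, err
	}
	cert, err := x509.ParseCertificate(der)
	return cert, key, err
}

func main() {
	cert, key, err := newSelfSignedRoot("Acme Root CA")
	if err != nil {
		panic(err)
	}
	fmt.Println("matching key:", validateRoot(cert, key))

	_, otherKey, _ := newSelfSignedRoot("Unrelated Root")
	fmt.Println("wrong key:", validateRoot(cert, otherKey))
}
```

Comparing public keys rather than private material means the check works the same way when the key lives behind a driver that only exposes a `crypto.Signer`.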
### Pattern B — 3-level financial-services policy CA

```mermaid
flowchart TD
    Root["FinCo Root CA<br/>path_len=2"]
    Pol["FinCo Trading Policy CA<br/>path_len=1<br/>permitted DNS = trading.finco.example"]
    Iss["FinCo Trading Issuing CA<br/>path_len=0"]
    Root --> Pol --> Iss
```

Per business-unit name constraints: each policy CA carries a
`PermittedDNSDomains` list scoped to the business unit (RFC 5280
§4.2.1.10). The service enforces subset semantics — a child policy CA
cannot widen the parent's permitted set, and cannot remove an
excluded subtree. Operators submit `name_constraints` on the
`POST /api/v1/issuers/{id}/intermediates` body.

### Pattern C — 2-level internal PKI

```mermaid
flowchart TD
    Root["Internal Root CA<br/>path_len=1"]
    Iss["Internal Issuing CA<br/>path_len=0<br/>issues leaves directly"]
    Root --> Iss
```

The simplest tree-mode deployment. Roughly equivalent to single mode
in terms of operator overhead, but provides one extra layer of
indirection so the root key can stay offline while only the issuing
CA's key sits on the certctl host.

## RFC 5280 enforcement

All enforcement happens at the service layer. The local connector
trusts the service's contract; the API layer translates errors to
HTTP codes.

- §3.2 self-signed root validation: `cert.CheckSignatureFrom(cert)` +
  subject == issuer DN. Rejected with `ErrCANotSelfSigned` →
  HTTP 400.
- §4.2.1.9 path-length tightening: child's `PathLenConstraint` must
  be strictly less than parent's. Defaults to `parent - 1` when unset.
  Rejected with `ErrPathLenExceeded` → HTTP 400.
- §4.2.1.10 NameConstraints subset: child's `Permitted` set must be a
  subset of parent's; child's `Excluded` set must be a superset of
  parent's. Rejected with `ErrNameConstraintExceeded` → HTTP 400.
- §4.1.2.5 validity capping: child's `notAfter` is capped to parent's
  `notAfter` automatically (the chain breaks at the parent's expiry
  regardless).

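The path-length, name-constraint, and validity rules above can be expressed as three small pure functions. The sketch below is a simplification under stated assumptions: all function names are hypothetical, DNS constraints are compared by exact match or subdomain suffix, and an empty parent `Permitted` list is treated as "unconstrained". The real service may differ on each of these points.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"time"
)

var (
	ErrPathLenExceeded        = errors.New("path length not tighter than parent's")
	ErrNameConstraintExceeded = errors.New("name constraints wider than parent's")
)

// childPathLen returns the child's path length. A nil request means "unset"
// and defaults to parent-1; an explicit value must be strictly below parent.
func childPathLen(parent int, requested *int) (int, error) {
	if parent < 1 {
		return 0, ErrPathLenExceeded // a path_len=0 CA cannot sign another CA
	}
	if requested == nil {
		return parent - 1, nil
	}
	if *requested < 0 || *requested >= parent {
		return 0, ErrPathLenExceeded
	}
	return *requested, nil
}

// within reports whether name equals base or is a subdomain of it.
func within(name, base string) bool {
	return name == base || strings.HasSuffix(name, "."+base)
}

// coveredBy reports whether every name falls within at least one base.
func coveredBy(names, bases []string) bool {
	for _, n := range names {
		found := false
		for _, b := range bases {
			if within(n, b) {
				found = true
				break
			}
		}
		if !found {
			return false
		}
	}
	return true
}

// checkNameConstraints rejects a child that widens the permitted set or
// drops one of the parent's exclusions.
func checkNameConstraints(parentPermitted, parentExcluded, childPermitted, childExcluded []string) error {
	if len(parentPermitted) > 0 &&
		(len(childPermitted) == 0 || !coveredBy(childPermitted, parentPermitted)) {
		return ErrNameConstraintExceeded
	}
	if !coveredBy(parentExcluded, childExcluded) {
		return ErrNameConstraintExceeded
	}
	return nil
}

// capNotAfter clamps the child's expiry to the parent's expiry.
func capNotAfter(child, parent time.Time) time.Time {
	if child.After(parent) {
		return parent
	}
	return child
}

func main() {
	pl, err := childPathLen(2, nil)
	fmt.Println(pl, err)

	err = checkNameConstraints(
		[]string{"finco.example"}, nil,
		[]string{"trading.finco.example"}, nil)
	fmt.Println("narrowed permitted set:", err)

	parentEnd := time.Date(2030, 1, 1, 0, 0, 0, 0, time.UTC)
	fmt.Println(capNotAfter(parentEnd.AddDate(1, 0, 0), parentEnd))
}
```

Applied to Pattern B, a root with `path_len=2` yields a policy CA at 1 and an issuing CA at 0 when no explicit values are supplied.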
## Migrating a single-mode issuer to tree mode

Pre-flight: the load-bearing pin
`TestLocal_HierarchyMode_SingleVsTree_ByteIdentical` guarantees that
a 1-level tree wired around the same on-disk root cert+key produces
byte-identical issuance bundles to single mode. Migration is therefore
a no-downtime operation if done carefully:

1. Register the existing single-mode CA cert as an `intermediate_cas`
   row via `CreateRoot` (with the existing on-disk key referenced as
   `key_driver_id`).
2. Update the issuer row's `hierarchy_mode` to `"tree"` and point the
   connector at the new row's ID via `SetTreeIssuingCAID`. Restart the
   server (no new code path activates until the connector reads the
   updated mode at boot).
3. Issue a test cert. The byte-equivalence pin guarantees the wire
   bytes match the pre-migration output for a 1-level tree.
4. Build out the child CAs via `CreateChild` calls. Update
   `treeIssuingCAID` to the new leaf CA. Test, then ramp.

If the pin breaks during migration, abort: roll back the
`hierarchy_mode` flip and investigate. The byte-equivalence pin is
the canary: if it goes red, deeper bugs lurk.

## API reference

All endpoints under `/api/v1/issuers/{id}/intermediates` and
`/api/v1/intermediates/{id}` are admin-gated. Non-admin Bearer callers
get HTTP 403.

| Method | Path | Purpose |
|--------|------|---------|
| POST | `/api/v1/issuers/{id}/intermediates` | Register root OR sign child (body discriminator) |
| GET | `/api/v1/issuers/{id}/intermediates` | List flat hierarchy for issuer |
| GET | `/api/v1/intermediates/{id}` | Single-row detail |
| POST | `/api/v1/intermediates/{id}/retire` | Two-phase retirement |

See `api/openapi.yaml` for full request/response schemas.

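The error-to-status translation documented on this page (400 for the RFC 5280 rejections, 409 for a retirement blocked by active children) might be sketched as a single mapping function. `statusFor` is a hypothetical name, the 200 and 500 fallbacks are placeholders rather than documented behavior, and `ErrCAKeyMismatch` is left out because this page does not state its HTTP code.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// Sentinel errors named on this page; the message texts are illustrative.
var (
	ErrCANotSelfSigned          = errors.New("root cert is not self-signed")
	ErrPathLenExceeded          = errors.New("path length exceeded")
	ErrNameConstraintExceeded   = errors.New("name constraint exceeded")
	ErrCAStillHasActiveChildren = errors.New("CA still has active children")
)

// statusFor maps a service-layer error to an HTTP status code.
// errors.Is is used so that wrapped errors still match their sentinel.
func statusFor(err error) int {
	switch {
	case err == nil:
		return http.StatusOK // placeholder; see api/openapi.yaml for real success codes
	case errors.Is(err, ErrCAStillHasActiveChildren):
		return http.StatusConflict
	case errors.Is(err, ErrCANotSelfSigned),
		errors.Is(err, ErrPathLenExceeded),
		errors.Is(err, ErrNameConstraintExceeded):
		return http.StatusBadRequest
	default:
		return http.StatusInternalServerError // assumed fallback
	}
}

func main() {
	wrapped := fmt.Errorf("retire ca-123: %w", ErrCAStillHasActiveChildren)
	fmt.Println(statusFor(wrapped))
	fmt.Println(statusFor(ErrPathLenExceeded))
}
```

Keeping the mapping in one function at the API layer matches the split described earlier, where the service layer enforces the rules and the API layer only translates the outcome.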
## Observability

`IntermediateCAMetrics` ships counters dimensioned by
`(issuer_id, kind)`:

- `create_root` — successful CreateRoot calls.
- `create_child` — successful CreateChild calls.
- `retire_retiring` — `active → retiring` transitions.
- `retire_retired` — `retiring → retired` transitions.

The Prometheus exposer reads the snapshot via
`SnapshotIntermediateCA()` from a single instance constructed in
`cmd/server/main.go` (the snapshotter is the single source of truth
between the service-side recording path and the metrics-side exposing
path).

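A counter set keyed by `(issuer_id, kind)` with a copy-on-read snapshot could be sketched as below. This is not the actual `IntermediateCAMetrics` type: `caMetrics`, `counterKey`, `inc`, and `snapshot` are hypothetical names, and the real `SnapshotIntermediateCA()` signature is not shown on this page.

```go
package main

import (
	"fmt"
	"sync"
)

// counterKey is the (issuer_id, kind) dimension pair.
type counterKey struct {
	IssuerID string
	Kind     string
}

// caMetrics is a hypothetical stand-in for the shared metrics instance.
type caMetrics struct {
	mu     sync.Mutex
	counts map[counterKey]uint64
}

func newCAMetrics() *caMetrics {
	return &caMetrics{counts: make(map[counterKey]uint64)}
}

// inc records one event; the service-side recording path would call this.
func (m *caMetrics) inc(issuerID, kind string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.counts[counterKey{IssuerID: issuerID, Kind: kind}]++
}

// snapshot returns a copy, so the exposing path never shares the live map.
func (m *caMetrics) snapshot() map[counterKey]uint64 {
	m.mu.Lock()
	defer m.mu.Unlock()
	out := make(map[counterKey]uint64, len(m.counts))
	for k, v := range m.counts {
		out[k] = v
	}
	return out
}

func main() {
	m := newCAMetrics()
	m.inc("iss-1", "create_root")
	m.inc("iss-1", "create_child")
	m.inc("iss-1", "create_child")
	for k, v := range m.snapshot() {
		fmt.Printf("issuer=%s kind=%s count=%d\n", k.IssuerID, k.Kind, v)
	}
}
```

Sharing one instance between the recorder and the exposer is what makes the snapshot a single source of truth; constructing two instances would silently split the counts.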
The audit table receives one row per CreateRoot / CreateChild /
Retire transition, scoped to the actor extracted from the API
request's auth context.

## Known limitations

The following are tracked in `WORKSPACE-ROADMAP.md` as Rank-8 follow-on
work — none are required for the v2.1.0 acquisition gate:

- HSM-backed roots beyond `signer.FileDriver` (PKCS#11 / cloud KMS
  drivers).
- Automated rotation: scheduled re-issuance of sub-CAs ahead of
  expiry with parallel-validity windows.
- Intra-hierarchy CRL chaining: each non-leaf CA publishes a CRL
  covering its direct children's revocations.
- NameConstraints policy templates: declarative templates an operator
  can pick from instead of hand-rolling the JSON.
- D3 dendrogram visualization on the GUI page (today's render is a
  recursive `<ul>` nested list).