mirror of
https://github.com/shankar0123/certctl.git
synced 2026-06-07 15:01:32 +00:00
docs: Phase 4 (structural) — move connectors.md + 5 deep dives into reference/connectors/
Per Phase 1 audit at cowork/docs-overhaul-phase-1-audit-2026-05-04/.
Phase 4 in the audit recommended a full split of connectors.md (2055
lines) into an index + 27 per-connector pages (12 issuer + 15 target).
This commit lands the structural half of that work; full per-target
page extraction is deferred to follow-up commits.
Renames (all blame-preserving):
docs/connectors.md → docs/reference/connectors/index.md
docs/connector-apache.md → docs/reference/connectors/apache.md
docs/connector-f5.md → docs/reference/connectors/f5.md
docs/connector-iis.md → docs/reference/connectors/iis.md
docs/connector-k8s.md → docs/reference/connectors/k8s.md
docs/connector-nginx.md → docs/reference/connectors/nginx.md
Edits:
- docs/reference/connectors/index.md gets a top-of-doc note
explaining the per-connector deep-dive sibling pattern + a forward
list of the 5 per-target pages.
- The 5 per-connector deep-dive pages each get a `Last reviewed:
2026-05-05` header + a back-link to the index.
Deferred to future commits (Phase 4b/c follow-on):
- Extracting the 12 issuer sections from index.md into per-issuer
pages at reference/connectors/{acme,awsacmpca,digicert,ejbca,
entrust,globalsign,googlecas,local,openssl,sectigo,stepca,vault}.md
- Extracting the 10 remaining target sections from index.md into
per-target pages at reference/connectors/{caddy,traefik,envoy,
haproxy,postfix-dovecot,ssh,javakeystore,wincertstore,awsacm,
azurekv}.md
The pragmatic split makes this Phase 4 work incrementally landable —
each per-connector extraction is a small follow-up commit that doesn't
change the docs/ tree shape further. Cross-references from README.md
and other docs to docs/connectors.md still need fixing in Phase 11.
This commit is contained in:
@@ -0,0 +1,106 @@
|
||||
# Apache httpd Connector — Operator Deep-Dive
|
||||
|
||||
> Last reviewed: 2026-05-05
|
||||
>
|
||||
> Per Phase 14 of the deploy-hardening II master bundle. For the
|
||||
> connector-development context (interface contract, registry, atomic
|
||||
> deploy primitive shared across all targets), see the
|
||||
> [connector index](index.md).
|
||||
|
||||
## Overview
|
||||
|
||||
The Apache connector (`internal/connector/target/apache/`) deploys
|
||||
TLS certs to Apache 2.4 LTS via separate cert/chain/key files +
|
||||
`apachectl configtest` validate + `apachectl graceful` reload.
|
||||
Mirrors the canonical NGINX template (Bundle I Phase 5).
|
||||
|
||||
## Vendor versions tested
|
||||
|
||||
- **Apache httpd 2.4 LTS** (only LTS branch; 2.6 is dev branch)
|
||||
|
||||
## Per-quirk operator guidance
|
||||
|
||||
### Multi-vhost cert-by-vhost
|
||||
|
||||
`TestVendorEdge_Apache_MultiVhostCertByVhost_DeployIsolated_E2E`
|
||||
|
||||
When Apache has multiple `<VirtualHost>` blocks each with its own
|
||||
`SSLCertificateFile`, connector deploys to the matching vhost
|
||||
only. Other vhosts unchanged.
|
||||
|
||||
### `apachectl graceful-stop` drains cleanly
|
||||
|
||||
`TestVendorEdge_Apache_ApachectlGracefulStop_DrainsCleanly_E2E`
|
||||
|
||||
`apachectl graceful` (the connector default) preserves in-flight
|
||||
TLS connections. `apachectl restart` drops them.
|
||||
|
||||
### `mod_ssl` absent
|
||||
|
||||
`TestVendorEdge_Apache_ModSSLAbsent_DeployFailsWithActionableError_E2E`
|
||||
|
||||
If `mod_ssl` isn't loaded, `apachectl configtest` fails with
|
||||
"Invalid command 'SSLCertificateFile'". Connector surfaces this
|
||||
verbatim — operator action: `LoadModule ssl_module modules/mod_ssl.so`.
|
||||
|
||||
### `.htaccess` interactions
|
||||
|
||||
`TestVendorEdge_Apache_HtaccessRequireSSL_NotImpactedByDeploy_E2E`
|
||||
|
||||
`.htaccess` rules requiring SSL are not impacted by cert rotation.
|
||||
The `Require` directive evaluates per-request against the
|
||||
connection's TLS state, not the cert file.
|
||||
|
||||
### Apache 2.4 LTS reload semantics pinned
|
||||
|
||||
`TestVendorEdge_Apache_Apache24LTSReloadSemanticsPinned_E2E`
|
||||
|
||||
`apachectl graceful` semantics stable across 2.4.x patch versions.
|
||||
No per-version branch needed.
|
||||
|
||||
### Syntax error rollback
|
||||
|
||||
`TestVendorEdge_Apache_SyntaxErrorRollback_E2E`
|
||||
|
||||
`apachectl configtest` failure aborts before atomic rename. Live
|
||||
cert untouched.
|
||||
|
||||
### Per-vhost key ownership
|
||||
|
||||
`TestVendorEdge_Apache_PerVhostKeyOwnership_E2E`
|
||||
|
||||
When multiple vhosts share the same key file, ownership is
|
||||
preserved across rotation. When each vhost has its own key,
|
||||
per-file ownership is preserved per Bundle I Phase 5.
|
||||
|
||||
### Reload preserves connections
|
||||
|
||||
`TestVendorEdge_Apache_ReloadVsRestart_PreservesConnections_E2E`
|
||||
|
||||
In-flight TLS sessions survive `apachectl graceful` worker
|
||||
swap. Documented in `docs/deployment-atomicity.md`.
|
||||
|
||||
### SNI server_name binding
|
||||
|
||||
`TestVendorEdge_Apache_SNIServerNameDeployBindsCorrect_E2E`
|
||||
|
||||
When deploy specifies `server_name` metadata, connector targets
|
||||
the matching `<VirtualHost>` block.
|
||||
|
||||
### Cert chain ordering
|
||||
|
||||
`TestVendorEdge_Apache_ChainOrderingNormalized_E2E`
|
||||
|
||||
Apache requires leaf cert FIRST in `SSLCertificateFile` (or
|
||||
chain in `SSLCertificateChainFile`). Connector preserves operator-
|
||||
supplied ordering across rotation.
|
||||
|
||||
## V3-Pro deferrals
|
||||
|
||||
- Apache 2.6 (when it ships LTS).
|
||||
- mod_md (Apache's built-in ACME) interop.
|
||||
|
||||
## Related docs
|
||||
|
||||
- [Atomic deploy + post-verify + rollback](deployment-atomicity.md)
|
||||
- [Vendor compatibility matrix](deployment-vendor-matrix.md)
|
||||
@@ -0,0 +1,171 @@
|
||||
# F5 BIG-IP Connector — Operator Deep-Dive
|
||||
|
||||
> Last reviewed: 2026-05-05
|
||||
>
|
||||
> Per Phase 14 of the deploy-hardening II master bundle. For the
|
||||
> connector-development context (interface contract, registry, atomic
|
||||
> deploy primitive shared across all targets), see the
|
||||
> [connector index](index.md).
|
||||
|
||||
## Overview
|
||||
|
||||
The F5 connector (`internal/connector/target/f5/`) deploys TLS
|
||||
certs to F5 BIG-IP load balancers via the iControl REST API.
|
||||
F5's transactional API gives certctl atomic-update semantics for
|
||||
free at the API level — the Bundle I rollback wire layers
|
||||
on-failure cleanup of orphaned crypto objects.
|
||||
|
||||
## Vendor versions tested
|
||||
|
||||
- **F5 v15.1 LTS**
|
||||
- **F5 v17.0 LTS**
|
||||
- **F5 v17.5**
|
||||
|
||||
## Two-tier validation strategy (frozen decision 0.3)
|
||||
|
||||
1. **CI tier**: `f5-mock-icontrol` sidecar — in-tree Go server at
|
||||
`deploy/test/f5-mock-icontrol/` implementing the iControl REST
|
||||
surface this bundle exercises (auth, file upload, transactions,
|
||||
SSL profile CRUD). All `TestVendorEdge_F5_*_E2E` tests run
|
||||
against this in CI.
|
||||
2. **Customer-grade tier**: operator-supplied real F5 vagrant box.
|
||||
Documented setup recipe below. Manual smoke required for
|
||||
"verified" status in `docs/deployment-vendor-matrix.md`.
|
||||
|
||||
The mock implements a SUBSET of iControl REST. A real F5 may
|
||||
diverge on quirks the mock doesn't model. Customer-grade
|
||||
validation against the vagrant box is the validation tier above
|
||||
the mock.
|
||||
|
||||
## Setting up the operator-supplied real F5
|
||||
|
||||
```bash
|
||||
# F5 Networks publishes BIG-IP VE (Virtual Edition) on:
|
||||
# https://downloads.f5.com → BIG-IP VE → 17.5.0 → Vagrant
|
||||
# Download the .box file (requires F5 account; free tier ok).
|
||||
vagrant box add f5/big-ip-17.5.0 ~/Downloads/BIGIP-17.5.0.0.0.box
|
||||
vagrant init f5/big-ip-17.5.0
|
||||
vagrant up
|
||||
|
||||
# Then point certctl at vagrant's mapped management interface:
|
||||
# https://localhost:8443 with admin/<vagrant-default-password>
|
||||
# Per-target Config:
|
||||
# Host: "localhost"
|
||||
# Port: 8443
|
||||
# Username: "admin"
|
||||
# Password: "<from vagrant>"
|
||||
```
|
||||
|
||||
Run the F5 vendor-edge tests against the real F5 by setting:
|
||||
|
||||
```
|
||||
F5_REAL_HOST=localhost:8443 \
|
||||
F5_REAL_USER=admin \
|
||||
F5_REAL_PASS=<vagrant-pass> \
|
||||
INTEGRATION=1 go test -tags integration \
|
||||
-run 'TestVendorEdge_F5' ./deploy/test/...
|
||||
```
|
||||
|
||||
(Test bodies opt into the real-F5 path when these env vars are
|
||||
set; otherwise default to the mock sidecar.)
|
||||
|
||||
## Per-quirk operator guidance
|
||||
|
||||
### SSL profile reference counting
|
||||
|
||||
`TestVendorEdge_F5_SSLProfileReferenceCounting_TransactionWithNVS_AtomicCommit_E2E`
|
||||
|
||||
When a transaction binds the new SSL profile to N virtual
|
||||
servers, F5 commits all N atomically. Failure aborts all N.
|
||||
|
||||
### Client SSL vs server SSL profile
|
||||
|
||||
`TestVendorEdge_F5_ClientSSLProfileVsServerSSLProfile_DeployUpdatesCorrect_E2E`
|
||||
|
||||
F5 has separate `client-ssl` profiles (terminating TLS from clients)
|
||||
and `server-ssl` profiles (originating TLS to backends). Connector
|
||||
targets the operator-named profile only.
|
||||
|
||||
### Partition handling
|
||||
|
||||
`TestVendorEdge_F5_PartitionCommonVsCustom_DeployRespectsPartition_E2E`
|
||||
|
||||
F5 partitions namespace objects (Common, custom-tenant). Connector
|
||||
respects the operator-supplied `Partition`.
|
||||
|
||||
### v15 vs v17 API stability
|
||||
|
||||
`TestVendorEdge_F5_F5v15_vs_v17_TransactionAPIShapeStable_E2E`
|
||||
|
||||
`mgmt/tm/transaction` API shape stable across v15.1 LTS and v17.x.
|
||||
No per-version branch needed.
|
||||
|
||||
### Large cert chain (>4 links)
|
||||
|
||||
`TestVendorEdge_F5_LargeCertChainHandling_E2E`
|
||||
|
||||
v15.x had a known issue with cert chains >4 links (silent
|
||||
truncation of the deep links). v17.x lifted this limit.
|
||||
|
||||
**Operator action:** if on v15.x, keep chains ≤4 links OR upgrade
|
||||
to v17.x. Documented loud in this doc.
|
||||
|
||||
### Auth token expiry
|
||||
|
||||
`TestVendorEdge_F5_AuthTokenExpiryRefresh_E2E`
|
||||
|
||||
F5 auth tokens expire (default 1200s). Connector re-authenticates
|
||||
on 401 transparently.
|
||||
|
||||
### Transaction timeout cleanup
|
||||
|
||||
`TestVendorEdge_F5_TransactionTimeoutCleanup_E2E`
|
||||
|
||||
Open transactions timeout after 120s. Bundle I rollback wire
|
||||
catches orphaned crypto objects (uploaded files not committed via
|
||||
transaction).
|
||||
|
||||
### Same-VS update
|
||||
|
||||
`TestVendorEdge_F5_VirtualServerBindingOnSameVS_E2E`
|
||||
|
||||
Re-binding an SSL profile on the same Virtual Server is atomic
|
||||
at the F5 API level. No listener disruption.
|
||||
|
||||
### SSL options preservation
|
||||
|
||||
`TestVendorEdge_F5_SSLOptionsPreservedAcrossRotation_E2E`
|
||||
|
||||
Operator-supplied `cipher-list`, `no-tls-v1`, `secure-renegotiate`
|
||||
options on the SSL profile preserved across cert rotation.
|
||||
|
||||
### iControl REST rate limit
|
||||
|
||||
`TestVendorEdge_F5_iControlRESTRateLimit_E2E`
|
||||
|
||||
F5 iControl REST defaults to 100 req/s. Connector backs off on
|
||||
429 with exponential retry.
|
||||
|
||||
## Troubleshooting matrix
|
||||
|
||||
| Symptom | Test name | Operator action |
|
||||
|---|---|---|
|
||||
| Cert deploys but only 4 chain links served | `LargeCertChainHandling_E2E` | upgrade to v17.x or shorten chain |
|
||||
| Frequent 401 retries | `AuthTokenExpiryRefresh_E2E` | benign; tune token lifetime if needed |
|
||||
| Orphaned `/Common/cert-<timestamp>` objects | `TransactionTimeoutCleanup_E2E` | run cleanup script; check for hung deploys |
|
||||
| Wrong partition deployed to | `PartitionCommonVsCustom_E2E` | verify `Partition` in connector config |
|
||||
| Cipher list reset post-rotate | `SSLOptionsPreservedAcrossRotation_E2E` | bug — file an issue |
|
||||
|
||||
## V3-Pro deferrals
|
||||
|
||||
- F5 GTM (DNS-load-balancer cert deploys).
|
||||
- F5 NGINX Plus cert deploy via the F5 API (when F5 ships the
|
||||
unified API).
|
||||
- AS3 declarative deploy (operator-friendly JSON declaration vs
|
||||
the imperative iControl REST flow).
|
||||
|
||||
## Related docs
|
||||
|
||||
- [Atomic deploy + post-verify + rollback](deployment-atomicity.md)
|
||||
- [Vendor compatibility matrix](deployment-vendor-matrix.md)
|
||||
- F5 official iControl REST docs: <https://clouddocs.f5.com/api/icontrol-rest/>
|
||||
@@ -0,0 +1,200 @@
|
||||
# Microsoft IIS Connector — Operator Deep-Dive
|
||||
|
||||
> Last reviewed: 2026-05-05
|
||||
>
|
||||
> Per Phase 14 of the deploy-hardening II master bundle. For the
|
||||
> connector-development context (interface contract, registry, atomic
|
||||
> deploy primitive shared across all targets), see the
|
||||
> [connector index](index.md).
|
||||
|
||||
## Overview
|
||||
|
||||
The IIS connector (`internal/connector/target/iis/`) deploys TLS
|
||||
certs to Windows IIS servers via PowerShell (`Import-PfxCertificate`
|
||||
+ `New-WebBinding` + SNI binding). Pre-deploy snapshot of the
|
||||
existing thumbprint allows rollback if the new binding fails.
|
||||
|
||||
## Vendor versions tested
|
||||
|
||||
- **Windows Server 2019** with IIS 10
|
||||
- **Windows Server 2022** with IIS 10
|
||||
|
||||
## CI runner constraint
|
||||
|
||||
Per frozen decision 0.4: Windows containers run only on Windows
|
||||
hosts. Linux CI runners CAN'T run the IIS sidecar. IIS e2e tests
|
||||
run on a separate `windows-vendor-e2e` GitHub Actions matrix job
|
||||
on `windows-latest` runners. Operators on Linux-only CI use
|
||||
`//go:build integration && !no_iis` to skip.
|
||||
|
||||
## Per-quirk operator guidance
|
||||
|
||||
### App-pool recycle (opt-in)
|
||||
|
||||
`TestVendorEdge_IIS_AppPoolRecycle_OptInForCertChange_E2E`
|
||||
|
||||
By default, IIS picks up new SSL bindings without app-pool
|
||||
recycle (the binding-edit path is hot). Some sites need recycle
|
||||
to fully reload (e.g., apps that cache cert handles).
|
||||
|
||||
**Operator action:** set `AppPoolRecycle: true` per-target. The
|
||||
connector then runs `Restart-WebAppPool <pool>` after binding update.
|
||||
|
||||
### SNI multi-binding per site
|
||||
|
||||
`TestVendorEdge_IIS_SNIMultiBindingPerSite_DeployUpdatesCorrectBinding_E2E`
|
||||
|
||||
When a site has multiple SNI bindings (different hostnames on
|
||||
the same site), connector targets the binding matching the
|
||||
operator-supplied hostname. Other bindings unchanged.
|
||||
|
||||
### CCS (Centralized Certificate Store)
|
||||
|
||||
`TestVendorEdge_IIS_CCSCentralizedCertStoreVariant_DeployToSharedStore_E2E`
|
||||
|
||||
CCS is the file-based variant where multiple IIS servers share
|
||||
a UNC path of cert files. Connector writes to the shared path;
|
||||
all IIS servers pick it up automatically.
|
||||
|
||||
### WinRM remote vs local PowerShell
|
||||
|
||||
`TestVendorEdge_IIS_WinRMRemotePath_vs_LocalPowerShellPath_BothWork_E2E`
|
||||
|
||||
Two code paths produce equivalent cert installs:
|
||||
- `WinRMHost: ""` → local PowerShell (agent runs on the IIS server)
|
||||
- `WinRMHost: "iis.example"` → remote PowerShell via WinRM
|
||||
|
||||
Both rotate the same way. WinRM path requires network reachability
|
||||
to port 5985/5986.
|
||||
|
||||
### Server 2019 vs 2022 PowerShell compat
|
||||
|
||||
`TestVendorEdge_IIS_WindowsServer2019_vs_2022_PowerShellCompat_E2E`
|
||||
|
||||
`Import-PfxCertificate` + `New-WebBinding` semantics are stable
|
||||
across server versions. PowerShell 5.1 (2019) + PowerShell 7.x
|
||||
(2022) both work.
|
||||
|
||||
### Friendly name
|
||||
|
||||
`TestVendorEdge_IIS_FriendlyNameUpdatedOnRotation_E2E`
|
||||
|
||||
Connector preserves operator-supplied `FriendlyName` on the cert
|
||||
across rotation. Useful for IIS GUI identification.
|
||||
|
||||
### HTTP/2 + ALPN
|
||||
|
||||
`TestVendorEdge_IIS_HTTP2ALPNPreserved_E2E`
|
||||
|
||||
IIS h2 negotiation preserved across cert rotation. The
|
||||
`netsh http show sslcert` ALPN attribute survives the binding swap.
|
||||
|
||||
### Binding-type validation
|
||||
|
||||
`TestVendorEdge_IIS_BindingTypeHttpsValidated_E2E`
|
||||
|
||||
Connector refuses to deploy to non-`https` bindings (e.g., `http`,
|
||||
`net.tcp`). Surfaces actionable error.
|
||||
|
||||
### ARR reverse-proxy
|
||||
|
||||
`TestVendorEdge_IIS_ARRReverseProxyCertRotation_E2E`
|
||||
|
||||
Sites using Application Request Routing as reverse proxy: cert
|
||||
rotation does not invalidate ARR routes. The cert-binding edit
|
||||
is independent of the ARR config.
|
||||
|
||||
### Atomic SNI binding swap
|
||||
|
||||
`TestVendorEdge_IIS_RemovePreviousBindingOnRotate_E2E`
|
||||
|
||||
Connector removes the previous SNI binding BEFORE inserting the
|
||||
new one (atomicity at the IIS API level). Prevents brief
|
||||
window where two bindings serve different certs for the same
|
||||
hostname.
|
||||
|
||||
## Troubleshooting matrix
|
||||
|
||||
| Symptom | Test name | Operator action |
|
||||
|---|---|---|
|
||||
| Cert installed but app pool serving old cert | `AppPoolRecycle_OptInForCertChange_E2E` | set `AppPoolRecycle: true` |
|
||||
| Wrong SNI binding updated | `SNIMultiBindingPerSite_E2E` | verify hostname selector |
|
||||
| Permission denied on cert install | n/a | agent must run as administrator |
|
||||
| WinRM connection failed | `WinRMRemotePath_vs_LocalPowerShellPath_E2E` | check WinRM port 5985/5986 reachability |
|
||||
| h2 negotiation broken post-rotate | `HTTP2ALPNPreserved_E2E` | re-run `netsh http add sslcert` with `appid + clientcertnegotiation=enable` |
|
||||
|
||||
## V3-Pro deferrals
|
||||
|
||||
- IIS Application Initialization module integration (warm cert
|
||||
cache after rotation).
|
||||
- Azure Key Vault + IIS integration (operator opt-in).
|
||||
|
||||
## Related docs
|
||||
|
||||
- [Atomic deploy + post-verify + rollback](deployment-atomicity.md)
|
||||
- [Vendor compatibility matrix](deployment-vendor-matrix.md)
|
||||
|
||||
## Operator validation playbook (Windows host)
|
||||
|
||||
CI no longer runs the IIS + WinCertStore vendor-e2e tests on every
|
||||
push. Per ci-pipeline-cleanup bundle frozen decision 0.5 (which
|
||||
revises Bundle II decision 0.4), the Windows matrix was deleted
|
||||
because (a) it couldn't physically work on `windows-latest` GitHub
|
||||
runners (Docker not started in Windows-containers mode by default;
|
||||
`bridge` network driver doesn't exist on Windows Docker — uses
|
||||
`nat`), and (b) all IIS + WinCertStore vendor-edge tests are
|
||||
`t.Log` placeholder stubs that exercise no IIS-specific behavior.
|
||||
|
||||
The real IIS connector validation lives in:
|
||||
|
||||
1. `internal/connector/target/iis/` unit tests (run on Linux in the
|
||||
regular Go Build & Test job — already green on every push).
|
||||
2. This playbook — operator manual smoke against a real Windows host
|
||||
pre-release.
|
||||
|
||||
### Prerequisites
|
||||
|
||||
- Windows Server 2019 or 2022 host (or Windows 10/11 Pro with Hyper-V)
|
||||
- Docker Desktop in Windows containers mode
|
||||
(Settings → "Switch to Windows containers")
|
||||
- Go 1.25.9 + git
|
||||
|
||||
### Procedure
|
||||
|
||||
```powershell
|
||||
# Clone + checkout
|
||||
git clone https://github.com/certctl-io/certctl.git
|
||||
cd certctl
|
||||
git fetch --tags
|
||||
git checkout v2.X.0 # whichever release is being validated
|
||||
|
||||
# Bring up the Windows IIS sidecar
|
||||
docker compose --profile deploy-e2e-windows `
|
||||
-f deploy/docker-compose.test.yml `
|
||||
up -d windows-iis-test
|
||||
Start-Sleep -Seconds 30
|
||||
|
||||
# Run IIS + WinCertStore vendor-edge tests
|
||||
$env:INTEGRATION = "1"
|
||||
go test -tags integration -race -count=1 `
|
||||
-run 'VendorEdge_(IIS|WinCertStore)' `
|
||||
./deploy/test/... | Tee-Object -FilePath iis-validation.log
|
||||
|
||||
# Tear down
|
||||
docker compose --profile deploy-e2e-windows `
|
||||
-f deploy/docker-compose.test.yml `
|
||||
down -v
|
||||
```
|
||||
|
||||
### Acceptance
|
||||
|
||||
Per Bundle II frozen decision 0.14, the IIS / WinCertStore cells in
|
||||
`docs/deployment-vendor-matrix.md` flip from "CI" / "pending" → "✓"
|
||||
only when ALL of the following are true:
|
||||
|
||||
- ≥1 happy-path e2e passes against the real Windows IIS sidecar
|
||||
- ≥1 specific-quirk test for that Windows Server version passes
|
||||
- This playbook's full procedure ran clean once on a real Windows host
|
||||
|
||||
Operator records the validation date + Windows Server version in
|
||||
`cowork/<bundle>/iis-validation-receipts.md` for audit trail.
|
||||
File diff suppressed because it is too large
Load Diff
@@ -0,0 +1,122 @@
|
||||
# Kubernetes Secrets Connector — Operator Deep-Dive
|
||||
|
||||
> Last reviewed: 2026-05-05
|
||||
>
|
||||
> Per Phase 14 of the deploy-hardening II master bundle. For the
|
||||
> connector-development context (interface contract, registry, atomic
|
||||
> deploy primitive shared across all targets), see the
|
||||
> [connector index](index.md).
|
||||
|
||||
## Overview
|
||||
|
||||
The K8s connector (`internal/connector/target/k8ssecret/`) deploys
|
||||
TLS certs into `kubernetes.io/tls` Secrets. Atomic at the API
|
||||
server level (Update is transactional); the post-deploy verify
|
||||
SHA-256-compares the returned Secret data against deployed bytes
|
||||
(defends against admission webhooks that modify cert data).
|
||||
|
||||
## Vendor versions tested
|
||||
|
||||
- **Kubernetes 1.28 LTS**
|
||||
- **Kubernetes 1.30**
|
||||
- **Kubernetes 1.31** (current stable)
|
||||
|
||||
## Per-quirk operator guidance
|
||||
|
||||
### Kubelet sync wait contract
|
||||
|
||||
`TestVendorEdge_K8s_KubeletSyncWaitContract_DefaultTimeout60s_E2E`
|
||||
|
||||
After Secret update, kubelet projects new cert bytes into
|
||||
pod-mounted volumes. Default sync interval ~60s. The connector
|
||||
waits up to `CERTCTL_K8S_DEPLOY_KUBELET_SYNC_TIMEOUT` (default
|
||||
60s).
|
||||
|
||||
**Operator action:** for slow clusters (large pod count, slow
|
||||
node DNS), tune the env var upward. For fast clusters, the
|
||||
default is fine.
|
||||
|
||||
### Admission webhook mutation
|
||||
|
||||
`TestVendorEdge_K8s_AdmissionWebhookModifiesSecretData_DeployDetectsViaSHA256Compare_E2E`
|
||||
|
||||
Some admission webhooks (Vault Agent Injector, OPA Gatekeeper)
|
||||
mutate Secret data on Update. The connector pulls the Secret
|
||||
back after Update and SHA-256-compares against deployed bytes.
|
||||
Mismatch surfaces as deploy failure.
|
||||
|
||||
### Multi-version API stability
|
||||
|
||||
`TestVendorEdge_K8s_K8s128LTS_vs_130_vs_131_SecretAPIContractStable_E2E`
|
||||
|
||||
`kubernetes.io/tls` Secret schema (data.tls.crt + data.tls.key)
|
||||
is stable across 1.28-1.31. No per-version branch needed.
|
||||
|
||||
### Typed vs Opaque Secret
|
||||
|
||||
`TestVendorEdge_K8s_TypedKubernetesIOTLSVsUntypedOpaque_DeployRespectsType_E2E`
|
||||
|
||||
Connector preserves operator-supplied Secret type. Typed
|
||||
`kubernetes.io/tls` is the canonical form; untyped `Opaque` is
|
||||
preserved for operators with legacy automation that expects it.
|
||||
|
||||
### Cert-manager interop
|
||||
|
||||
`TestVendorEdge_K8s_CertManagerInterop_RawSecretVsCertificateCRD_E2E`
|
||||
|
||||
Connector targets raw Secrets, NOT cert-manager `Certificate` CRs.
|
||||
Operators using cert-manager should NOT also point certctl at the
|
||||
same Secret name (cert-manager will overwrite). Documented
|
||||
coexistence: certctl handles non-cert-manager Secrets;
|
||||
cert-manager handles its own.
|
||||
|
||||
### Multi-namespace
|
||||
|
||||
`TestVendorEdge_K8s_MultiNamespaceDeploy_DeployUpdatesCorrectNamespace_E2E`
|
||||
|
||||
Connector targets the configured `Namespace` only. Cross-namespace
|
||||
deploys require multiple connector entries.
|
||||
|
||||
### RBAC errors
|
||||
|
||||
`TestVendorEdge_K8s_RBACInsufficientPermissions_DeployFailsWithActionableError_E2E`
|
||||
|
||||
Connector surfaces the K8s API's `forbidden: secrets is restricted`
|
||||
error verbatim. Operator action: bind a Role with
|
||||
`secrets: get,update,create` verbs to the agent's ServiceAccount.
|
||||
|
||||
### Labels + annotations preservation
|
||||
|
||||
`TestVendorEdge_K8s_LabelsAnnotationsPreserved_E2E`
|
||||
|
||||
Connector merges (not replaces) operator-supplied metadata. Custom
|
||||
labels/annotations on the Secret survive cert rotation.
|
||||
|
||||
### Pod-mounted Secret rollover
|
||||
|
||||
`TestVendorEdge_K8s_PodMountedSecretRollover_E2E`
|
||||
|
||||
When a pod mounts the Secret as a volume, kubelet projects new
|
||||
cert bytes into the pod's filesystem after sync. Pods watching
|
||||
the file (via inotify or polling) pick up the new cert without
|
||||
restart.
|
||||
|
||||
### Immutable Secret flag
|
||||
|
||||
`TestVendorEdge_K8s_ImmutableSecretFlag_E2E`
|
||||
|
||||
K8s Secrets can be marked `immutable: true` for performance.
|
||||
Update fails with actionable error; operator must drop the flag,
|
||||
update, then re-apply if desired.
|
||||
|
||||
## V3-Pro deferrals
|
||||
|
||||
- cert-manager `Certificate` CR interop as first-class deploy
|
||||
target (V3-Pro: certctl as cert-manager external issuer).
|
||||
- Multi-cluster federation (deploy a single cert across N
|
||||
clusters with single connector entry).
|
||||
|
||||
## Related docs
|
||||
|
||||
- [Atomic deploy + post-verify + rollback](deployment-atomicity.md)
|
||||
- [Vendor compatibility matrix](deployment-vendor-matrix.md)
|
||||
@@ -0,0 +1,164 @@
|
||||
# NGINX Connector — Operator Deep-Dive
|
||||
|
||||
> Last reviewed: 2026-05-05
|
||||
>
|
||||
> Per Phase 14 of the deploy-hardening II master bundle. Operator-grade
|
||||
> documentation for the NGINX target connector. For the
|
||||
> connector-development context (interface contract, registry, atomic
|
||||
> deploy primitive shared across all targets), see the
|
||||
> [connector index](index.md).
|
||||
|
||||
## Overview
|
||||
|
||||
The NGINX connector (`internal/connector/target/nginx/`) is the
|
||||
canonical implementation of the deploy-hardening I atomic + verify
|
||||
+ rollback contract (Bundle I Phase 4). Every other file-based
|
||||
connector models on this one.
|
||||
|
||||
## Vendor versions tested
|
||||
|
||||
- **NGINX 1.25 LTS** (current LTS branch)
|
||||
- **NGINX 1.27 stable** (current stable branch)
|
||||
|
||||
Older versions (1.18 EOL'd 2021, 1.20 EOL'd 2022) are explicitly
|
||||
out of scope per frozen decision 0.1.
|
||||
|
||||
## Deploy contract
|
||||
|
||||
Every cert deploy follows the Bundle I `deploy.Apply(ctx, plan)`
|
||||
flow:
|
||||
|
||||
1. **Idempotency check** — SHA-256 over cert+chain+key bytes; skip
|
||||
if all match destination.
|
||||
2. **Pre-deploy backup** — copy existing files to
|
||||
`<path>.certctl-bak.<unix-nanos>`.
|
||||
3. **Atomic write** — temp-file + chown + atomic rename per
|
||||
destination.
|
||||
4. **PreCommit (validate)** — runs `nginx -t` per the operator's
|
||||
`validate_command`. Failure aborts; no live cert touched.
|
||||
5. **Atomic rename** — temp → final for every File entry.
|
||||
6. **PostCommit (reload)** — runs `nginx -s reload` per the
|
||||
operator's `reload_command`.
|
||||
7. **Post-deploy TLS verify** — dials the configured endpoint;
|
||||
pulls leaf cert SHA-256; compares against deployed bytes.
|
||||
Mismatch triggers automatic rollback.
|
||||
|
||||
## Per-quirk operator guidance
|
||||
|
||||
### SSL session cache holds old cert
|
||||
|
||||
`TestVendorEdge_NGINX_SSLSessionCacheHoldsOldCert_E2E`
|
||||
|
||||
NGINX's `ssl_session_cache` (default `shared:SSL:10m`) keeps TLS
|
||||
session IDs valid for `ssl_session_timeout` (default 5min). Clients
|
||||
that resume via session ID see the OLD cert until their session
|
||||
expires.
|
||||
|
||||
**Operator action:** this is documented behavior, not a bug.
|
||||
Tune via `ssl_session_timeout 5m;` (default) or shorter if your
|
||||
cert rotation cadence demands. Post-deploy verify in certctl will
|
||||
return the NEW cert from a fresh handshake (no session resumption);
|
||||
warm clients see the OLD cert until session-cache eviction.
|
||||
|
||||
### SNI multi-server-name binding
|
||||
|
||||
`TestVendorEdge_NGINX_SNIMultiServerName_DeployBindsCorrectVhost_E2E`
|
||||
|
||||
When NGINX has multiple `server { server_name a.example b.example; }`
|
||||
blocks, the operator deploys with metadata pointing at the
|
||||
specific vhost. Connector binds to that vhost only; other vhosts
|
||||
remain unchanged.
|
||||
|
||||
### IPv6 dual-stack
|
||||
|
||||
`TestVendorEdge_NGINX_IPv6DualStackBindsBoth_E2E`
|
||||
|
||||
NGINX listening on `0.0.0.0:443` + `[::]:443` serves the new cert
|
||||
on both stacks after a single deploy.
|
||||
|
||||
**Operator action:** if your post-deploy verify endpoint resolves
|
||||
to IPv6 only on some networks but IPv4 only on others, configure
|
||||
`PostDeployVerifyAttempts: 5` to cover both paths.
|
||||
|
||||
### Reload vs restart
|
||||
|
||||
`TestVendorEdge_NGINX_ReloadVsRestart_NoConnectionDrop_E2E`
|
||||
|
||||
`nginx -s reload` (graceful) preserves in-flight TLS connections
|
||||
via worker handoff. `nginx -s stop && nginx` drops them.
|
||||
|
||||
**Operator action:** never use restart for cert rotation. The
|
||||
connector's default `reload_command: nginx -s reload` is correct.
|
||||
|
||||
### Binary upgrade
|
||||
|
||||
`TestVendorEdge_NGINX_UpgradeBinaryHotReload_E2E`
|
||||
|
||||
`nginx -s upgrade` rolls out a new binary without dropping
|
||||
connections. Not commonly used; documented for ops teams that do
|
||||
rolling NGINX binary upgrades.
|
||||
|
||||
### Config syntax error → rollback
|
||||
|
||||
`TestVendorEdge_NGINX_ConfigSyntaxError_RollbackRestoresPreviousCert_E2E`
|
||||
|
||||
If `nginx -t` rejects the staged config, the deploy package's
|
||||
PreCommit gate fires before the atomic rename — no live file is
|
||||
touched. The cert directory is exactly as it was.
|
||||
|
||||
### Missing intermediate
|
||||
|
||||
`TestVendorEdge_NGINX_MissingIntermediate_DeployedButValidationCatchesAtPostVerify_E2E`
|
||||
|
||||
If the operator deploys a leaf-only cert (no intermediate), NGINX
|
||||
will start serving it but downstream clients fail chain validation.
|
||||
The connector's post-deploy TLS verify catches this via cert chain
|
||||
walk; rollback fires automatically.
|
||||
|
||||
### Access log privacy
|
||||
|
||||
`TestVendorEdge_NGINX_AccessLogPrivacy_NoCertBytesLeakInLogs_E2E`
|
||||
|
||||
NGINX's default `access_log` and `error_log` formats do NOT include
|
||||
SSL key bytes. The connector does not modify NGINX's logging config.
|
||||
|
||||
**Operator action:** if you've customized `log_format` to include
|
||||
`$ssl_*` variables, audit the format string for sensitive fields.
|
||||
|
||||
### Per-version reload-command compat
|
||||
|
||||
`TestVendorEdge_NGINX_NGINX125_vs_127_ReloadCommandCompatible_E2E`
|
||||
|
||||
`nginx -s reload` semantics are identical between 1.25 LTS and
|
||||
1.27 stable. No per-version branch needed in operator config.
|
||||
|
||||
### High-concurrency deploy under load
|
||||
|
||||
`TestVendorEdge_NGINX_HighConcurrencyDeployUnderLoad_E2E`
|
||||
|
||||
NGINX's worker handoff during reload is graceful; concurrent TLS
|
||||
handshakes during a deploy succeed without 5xx errors.
|
||||
|
||||
## Troubleshooting matrix
|
||||
|
||||
| Symptom | Test name | Root cause | Operator action |
|
||||
|---|---|---|---|
|
||||
| Old cert returned 5min after deploy | `SSLSessionCacheHoldsOldCert_E2E` | session cache TTL | tune `ssl_session_timeout` |
|
||||
| Wrong vhost serves new cert | `SNIMultiServerName_E2E` | misconfigured server_name selector | verify vhost metadata |
|
||||
| Post-verify fails on IPv6 | `IPv6DualStackBindsBoth_E2E` | flaky DNS resolution | `PostDeployVerifyAttempts: 5` |
|
||||
| Connection drops on cert change | n/a | using restart instead of reload | use `nginx -s reload` |
|
||||
| Deploy aborts with `nginx -t` error | `ConfigSyntaxError_RollbackRestoresPreviousCert_E2E` | bad config (not deploy's fault) | fix config; redeploy |
|
||||
| Chain-validation failure post-deploy | `MissingIntermediate_E2E` | leaf-only cert | include full chain in deploy |
|
||||
|
||||
## V3-Pro deferrals
|
||||
|
||||
- Pin NGINX `ssl_session_ticket_key` rotation interaction with cert
|
||||
rotation (rare; documented but not tested).
|
||||
- NGINX Plus `dyn_pem` API integration (commercial; not V2 scope).
|
||||
|
||||
## Related docs
|
||||
|
||||
- [Atomic deploy + post-verify + rollback](deployment-atomicity.md)
|
||||
— the Bundle I primitive every connector consumes.
|
||||
- [Vendor compatibility matrix](deployment-vendor-matrix.md)
|
||||
- [Connectors reference](connectors.md)
|
||||
Reference in New Issue
Block a user