Files
certctl/docs/connector-nginx.md
T
shankar0123 0834bc1ad5 docs: deployment vendor matrix + per-connector deep-dive docs (NGINX + K8s + IIS + Apache + F5)
Phase 14 of the deploy-hardening II master bundle. The procurement-
team headline doc + per-connector operator guides for the top 5
most-deployed connectors.

NEW docs/deployment-vendor-matrix.md (~30 rows):
- Per (connector × vendor-version) status: ✓ / CI / mock / pending / n/a
- Known issues + workarounds + e2e test name reference
- LTS + current-stable scope per frozen decision 0.1
- Quarterly re-pin cadence guidance for sidecar digests
- "How to add a new vendor version" recipe

Per frozen decision 0.14: a (connector × vendor-version) cell is
"verified" only when ALL apply: ≥1 happy-path e2e green; ≥1
specific-quirk test green for that version; operator manual smoke
completed at least once. Cells lacking the third criterion show
"CI" status (auto-tests green but pending operator validation).

Status snapshot at bundle close:
- NGINX 1.25 + 1.27: CI
- Apache 2.4: CI
- HAProxy 2.6 + 2.8 + 3.0: CI
- Traefik 2.x + 3.x: CI
- Caddy 2.x: CI
- Envoy 1.30 + 1.32: CI (file-mode SDS only; gRPC SDS V3-Pro)
- Postfix 3.6 + 3.8: CI
- Dovecot 2.3: CI
- IIS 10 (2019, 2022): pending (Windows-host-only CI)
- F5 v15.1 + v17.0 + v17.5: mock (real-F5 vagrant box documented)
- SSH OpenSSH 8.x + 9.x: CI
- WinCertStore (2019, 2022): pending (Windows-host-only)
- JavaKeystore JDK 11 + 17 + 21: pending
- K8s 1.28 + 1.30 + 1.31: CI

NEW per-connector deep-dive docs:
- docs/connector-nginx.md (~150 lines, 10 quirks documented)
- docs/connector-k8s.md (~110 lines, 10 quirks)
- docs/connector-iis.md (~120 lines, 10 quirks; Windows-host-only
  CI constraint loud)
- docs/connector-apache.md (~80 lines, 10 quirks)
- docs/connector-f5.md (~190 lines, 10 quirks; two-tier validation
  recipe for operator-supplied real-F5 vagrant box)

Each doc follows the same structure:
- Overview
- Vendor versions tested
- Per-quirk operator guidance (one section per
  TestVendorEdge_<vendor>_<edge>_E2E)
- Troubleshooting matrix
- V3-Pro deferrals
- Related docs cross-refs

Other connector docs (HAProxy, Traefik, Caddy, Envoy, Postfix,
Dovecot, SSH, WinCertStore, JavaKeystore) live in docs/connectors.md
+ are referenced from the matrix.

Phase 15 next: per-vendor CI matrix job in
.github/workflows/ci.yml.
2026-04-30 16:16:48 +00:00

160 lines
5.9 KiB
Markdown

# NGINX Connector — Operator Deep-Dive
> Per Phase 14 of the deploy-hardening II master bundle. Operator-
> grade documentation for the NGINX target connector.
## Overview
The NGINX connector (`internal/connector/target/nginx/`) is the
canonical implementation of the deploy-hardening I atomic + verify
+ rollback contract (Bundle I Phase 4). Every other file-based
connector models on this one.
## Vendor versions tested
- **NGINX 1.25 LTS** (current LTS branch)
- **NGINX 1.27 stable** (current stable branch)
Older versions (1.18 EOL'd 2021, 1.20 EOL'd 2022) are explicitly
out of scope per frozen decision 0.1.
## Deploy contract
Every cert deploy follows the Bundle I `deploy.Apply(ctx, plan)`
flow:
1. **Idempotency check** — SHA-256 over cert+chain+key bytes; skip
if all match destination.
2. **Pre-deploy backup** — copy existing files to
`<path>.certctl-bak.<unix-nanos>`.
3. **Atomic write** — temp-file + chown + atomic rename per
destination.
4. **PreCommit (validate)** — runs `nginx -t` per the operator's
`validate_command`. Failure aborts; no live cert touched.
5. **Atomic rename** — temp → final for every File entry.
6. **PostCommit (reload)** — runs `nginx -s reload` per the
operator's `reload_command`.
7. **Post-deploy TLS verify** — dials the configured endpoint;
pulls leaf cert SHA-256; compares against deployed bytes.
Mismatch triggers automatic rollback.
## Per-quirk operator guidance
### SSL session cache holds old cert
`TestVendorEdge_NGINX_SSLSessionCacheHoldsOldCert_E2E`
NGINX's `ssl_session_cache` (default `shared:SSL:10m`) keeps TLS
session IDs valid for `ssl_session_timeout` (default 5min). Clients
that resume via session ID see the OLD cert until their session
expires.
**Operator action:** this is documented behavior, not a bug.
Tune via `ssl_session_timeout 5m;` (default) or shorter if your
cert rotation cadence demands. Post-deploy verify in certctl will
return the NEW cert from a fresh handshake (no session resumption);
warm clients see the OLD cert until session-cache eviction.
### SNI multi-server-name binding
`TestVendorEdge_NGINX_SNIMultiServerName_DeployBindsCorrectVhost_E2E`
When NGINX has multiple `server { server_name a.example b.example; }`
blocks, the operator deploys with metadata pointing at the
specific vhost. Connector binds to that vhost only; other vhosts
remain unchanged.
### IPv6 dual-stack
`TestVendorEdge_NGINX_IPv6DualStackBindsBoth_E2E`
NGINX listening on `0.0.0.0:443` + `[::]:443` serves the new cert
on both stacks after a single deploy.
**Operator action:** if your post-deploy verify endpoint resolves
to IPv6 only on some networks but IPv4 only on others, configure
`PostDeployVerifyAttempts: 5` to cover both paths.
### Reload vs restart
`TestVendorEdge_NGINX_ReloadVsRestart_NoConnectionDrop_E2E`
`nginx -s reload` (graceful) preserves in-flight TLS connections
via worker handoff. `nginx -s stop && nginx` drops them.
**Operator action:** never use restart for cert rotation. The
connector's default `reload_command: nginx -s reload` is correct.
### Binary upgrade
`TestVendorEdge_NGINX_UpgradeBinaryHotReload_E2E`
`nginx -s upgrade` rolls out a new binary without dropping
connections. Not commonly used; documented for ops teams that do
rolling NGINX binary upgrades.
### Config syntax error → rollback
`TestVendorEdge_NGINX_ConfigSyntaxError_RollbackRestoresPreviousCert_E2E`
If `nginx -t` rejects the staged config, the deploy package's
PreCommit gate fires before the atomic rename — no live file is
touched. The cert directory is exactly as it was.
### Missing intermediate
`TestVendorEdge_NGINX_MissingIntermediate_DeployedButValidationCatchesAtPostVerify_E2E`
If the operator deploys a leaf-only cert (no intermediate), NGINX
will start serving it but downstream clients fail chain validation.
The connector's post-deploy TLS verify catches this via cert chain
walk; rollback fires automatically.
### Access log privacy
`TestVendorEdge_NGINX_AccessLogPrivacy_NoCertBytesLeakInLogs_E2E`
NGINX's default `access_log` and `error_log` formats do NOT include
SSL key bytes. The connector does not modify NGINX's logging config.
**Operator action:** if you've customized `log_format` to include
`$ssl_*` variables, audit the format string for sensitive fields.
### Per-version reload-command compat
`TestVendorEdge_NGINX_NGINX125_vs_127_ReloadCommandCompatible_E2E`
`nginx -s reload` semantics are identical between 1.25 LTS and
1.27 stable. No per-version branch needed in operator config.
### High-concurrency deploy under load
`TestVendorEdge_NGINX_HighConcurrencyDeployUnderLoad_E2E`
NGINX's worker handoff during reload is graceful; concurrent TLS
handshakes during a deploy succeed without 5xx errors.
## Troubleshooting matrix
| Symptom | Test name | Root cause | Operator action |
|---|---|---|---|
| Old cert returned 5min after deploy | `SSLSessionCacheHoldsOldCert_E2E` | session cache TTL | tune `ssl_session_timeout` |
| Wrong vhost serves new cert | `SNIMultiServerName_E2E` | misconfigured server_name selector | verify vhost metadata |
| Post-verify fails on IPv6 | `IPv6DualStackBindsBoth_E2E` | flaky DNS resolution | `PostDeployVerifyAttempts: 5` |
| Connection drops on cert change | n/a | using restart instead of reload | use `nginx -s reload` |
| Deploy aborts with `nginx -t` error | `ConfigSyntaxError_RollbackRestoresPreviousCert_E2E` | bad config (not deploy's fault) | fix config; redeploy |
| Chain-validation failure post-deploy | `MissingIntermediate_E2E` | leaf-only cert | include full chain in deploy |
## V3-Pro deferrals
- Pin NGINX `ssl_session_ticket_key` rotation interaction with cert
rotation (rare; documented but not tested).
- NGINX Plus `dyn_pem` API integration (commercial; not V2 scope).
## Related docs
- [Atomic deploy + post-verify + rollback](deployment-atomicity.md)
— the Bundle I primitive every connector consumes.
- [Vendor compatibility matrix](deployment-vendor-matrix.md)
- [Connectors reference](connectors.md)