fix(helm): DEPL-004 — ServiceMonitor TLS default flipped to fail-closed

Acquisition-audit DEPL-004 closure (Sprint 6 ACQ, 2026-05-16).

Before 2026-05-16, monitoring.serviceMonitor.tlsConfig in values.yaml
was empty by default, and the ServiceMonitor template fell through
to an implicit `insecureSkipVerify: true` else-branch. Operators
opting into the ServiceMonitor (monitoring.serviceMonitor.enabled=true)
therefore got no Prometheus TLS verification by default. In-cluster
scrapes tolerate this; out-of-cluster scrapes silently skip the
chain check.

The template now fails closed: a `{{ required ... }}` call aborts
`helm template` / `helm upgrade` unless tlsConfig supplies either
real verification settings (caFile + serverName) or an explicit
`insecureSkipVerify: true` opt-back. The error string lists both
escape hatches and the docs cross-link, so the operator sees the
fix in the same message as the failure.

Operators with monitoring.serviceMonitor.enabled=false (the chart
default): no action required — the template short-circuits before
the tlsConfig block. Operators who had ServiceMonitor on with no
tlsConfig set: helm upgrade will fail until they supply either
{ caFile: ..., serverName: ... } (production-shaped) or
{ insecureSkipVerify: true } (operator-acknowledged opt-back).

Files
=====
- deploy/helm/certctl/templates/servicemonitor.yaml: replace the
  else-branch insecureSkipVerify default with a {{ required ... }}
  Helm builtin that fails the render with a clear remediation
  message pointing at both escape hatches and docs/operator/
  helm-deployment.md
- deploy/helm/certctl/values.yaml: rewrite the tlsConfig comment
  block to document the new fail-closed posture + both upgrade
  paths (production verify vs operator-acknowledged opt-back)
- docs/operator/helm-deployment.md: new "2026-05-16 — ServiceMonitor
  TLS default flipped (DEPL-004)" subsection in the existing
  Upgrade section with the two operator-action recipes
shankar0123
2026-05-16 19:44:48 +00:00
parent 5ea45a19b9
commit d7546aedca
3 changed files with 60 additions and 11 deletions
@@ -94,6 +94,31 @@ helm upgrade certctl deploy/helm/certctl/ \
Postgres state survives the upgrade (the PVC is retained). The server / agent images bump per the chart's `image.tag`. See [`docs/archive/upgrades/`](../archive/upgrades/) for version-specific upgrade guidance.
### 2026-05-16 — ServiceMonitor TLS default flipped (DEPL-004)

Acquisition-audit DEPL-004 closure. `monitoring.serviceMonitor.tlsConfig` was previously empty by default, and the chart template fell through to `insecureSkipVerify: true`. As of 2026-05-16, the template instead fails the render via `{{ required ... }}` at `helm template` / `helm upgrade` time unless `tlsConfig` supplies either real verification settings (`caFile` and `serverName`) or an explicit `insecureSkipVerify: true` opt-back.

Operators with `monitoring.serviceMonitor.enabled: true` MUST set one of:
```yaml
# A. Real TLS verify against the chart's CA (production-shaped).
monitoring:
  serviceMonitor:
    enabled: true
    tlsConfig:
      caFile: /etc/prometheus/secrets/certctl-ca/ca.crt
      serverName: certctl-server

# B. Demo / dev-cluster: operator-acknowledged opt-back to the pre-flip default.
monitoring:
  serviceMonitor:
    enabled: true
    tlsConfig:
      insecureSkipVerify: true
```
Operators with `monitoring.serviceMonitor.enabled: false` (the chart default) need no action — the template short-circuits before the `tlsConfig` block.
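
For readers who want to see the mechanism, the following is a minimal sketch of how a fail-closed branch of this kind can be written in a Helm template. It is not the chart's actual `servicemonitor.yaml`: the resource name, selector labels, port name, and error wording are all assumptions made for illustration. Only the `monitoring.serviceMonitor.enabled` and `monitoring.serviceMonitor.tlsConfig` value paths come from this document.

```yaml
{{- /* Hypothetical sketch only; names, labels, and message text are assumed. */ -}}
{{- if .Values.monitoring.serviceMonitor.enabled }}
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: {{ .Release.Name }}-server            # assumed name
spec:
  selector:
    matchLabels:
      app.kubernetes.io/instance: {{ .Release.Name }}   # assumed label
  endpoints:
    - port: https                              # assumed port name
      scheme: https
      {{- with .Values.monitoring.serviceMonitor.tlsConfig }}
      # Non-empty tlsConfig: pass the operator's settings through unchanged.
      tlsConfig:
        {{- toYaml . | nindent 8 }}
      {{- else }}
      {{- /* Empty tlsConfig: abort the render instead of defaulting to insecureSkipVerify. */}}
      {{- required "monitoring.serviceMonitor.tlsConfig must set caFile and serverName, or insecureSkipVerify: true (see docs/operator/helm-deployment.md)" nil }}
      {{- end }}
{{- end }}
```

In this sketch an empty `tlsConfig` map is falsy, so `with` falls through to the `else` branch, where `required` receives `nil` and stops the render with the given message. With `enabled: false` the outer `if` skips the whole block, which matches the no-action case described above.
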
## Configuration reference
Every value is documented at `deploy/helm/certctl/values.yaml`. Common tweaks: