mirror of
https://github.com/shankar0123/certctl.git
synced 2026-06-12 07:49:02 +00:00
fix(helm): DEPL-004 — ServiceMonitor TLS default flipped to fail-closed
Acquisition-audit DEPL-004 closure (Sprint 6 ACQ, 2026-05-16).
Pre-2026-05-16, monitoring.serviceMonitor.tlsConfig in values.yaml
was empty by default, and the ServiceMonitor template fell through
to an implicit `insecureSkipVerify: true` else-branch. Operators
opting into the ServiceMonitor (monitoring.serviceMonitor.enabled=true)
got no Prometheus TLS verification by default: in-cluster scrapes
tolerate this, but out-of-cluster scrapes silently skip the chain check.
The template now emits a fail-closed `{{ required ... }}` message
at `helm template` / `helm upgrade` time if neither a real verify
nor an explicit opt-back is supplied. The error string lists both
escape hatches and the docs cross-link, so the operator sees the
fix in the same line they hit the error.
Operators with monitoring.serviceMonitor.enabled=false (the chart
default): no action required — the template short-circuits before
the tlsConfig block. Operators who had ServiceMonitor on with no
tlsConfig set: helm upgrade will fail until they supply either
{ caFile: ..., serverName: ... } (production-shaped) or
{ insecureSkipVerify: true } (operator-acknowledged opt-back).
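
A third shape may also satisfy the guard. Because the template passes
tlsConfig through `toYaml` unchanged, the `ca` form mentioned in the
values.yaml comments should work as well. The sketch below is a values
override, not something shipped in this commit: the file name, Secret
name and key are assumptions, and the Secret is expected to live in the
same namespace as the ServiceMonitor.

```yaml
# values-override.yaml (hypothetical file name)
monitoring:
  serviceMonitor:
    enabled: true
    tlsConfig:
      # Prometheus Operator resolves the CA from a Secret reference
      # instead of a file path inside the Prometheus pod.
      ca:
        secret:
          name: certctl-ca   # assumed Secret name
          key: ca.crt        # assumed key holding the CA bundle
      serverName: certctl-server
```

Any non-empty tlsConfig map takes the `if` branch of the template, so
this form avoids the render-time error in the same way the caFile form
does.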
Files
=====
- deploy/helm/certctl/templates/servicemonitor.yaml: replace the
  else-branch insecureSkipVerify default with a {{ required ... }}
  Helm builtin that fails the render with a clear remediation
  message pointing at both escape hatches and
  docs/operator/helm-deployment.md
- deploy/helm/certctl/values.yaml: rewrite the tlsConfig comment
  block to document the new fail-closed posture + both upgrade
  paths (production verify vs operator-acknowledged opt-back)
- docs/operator/helm-deployment.md: new "2026-05-16 — ServiceMonitor
  TLS default flipped (DEPL-004)" subsection in the existing
  Upgrade section with the two operator-action recipes
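
The caFile recipe assumes the CA Secret is already mounted into the
Prometheus pod. With Prometheus Operator, one way to get the
/etc/prometheus/secrets/<name>/ path is the `secrets` list on the
Prometheus resource. The fragment below is a sketch only: the resource
name and namespace are hypothetical, `certctl-ca` is taken from the
example path in values.yaml, and the Secret must exist in the same
namespace as the Prometheus object.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: example-prometheus   # hypothetical name
  namespace: monitoring      # hypothetical namespace
spec:
  # Each Secret listed here is mounted at
  # /etc/prometheus/secrets/<secret-name>/ in the Prometheus pods,
  # so ca.crt lands at /etc/prometheus/secrets/certctl-ca/ca.crt.
  secrets:
    - certctl-ca
```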
--- a/deploy/helm/certctl/templates/servicemonitor.yaml
+++ b/deploy/helm/certctl/templates/servicemonitor.yaml
@@ -42,15 +42,25 @@ spec:
       interval: {{ .Values.monitoring.serviceMonitor.interval | default "30s" }}
       scrapeTimeout: {{ .Values.monitoring.serviceMonitor.scrapeTimeout | default "10s" }}
       tlsConfig:
-        # The certctl server uses self-signed bootstrap TLS or operator-
-        # provided cert-manager TLS — the ServiceMonitor consumes the
-        # same CA bundle the server presents. When server.tls.existingSecret
-        # is set, operators usually want to pull the matching ca.crt key
-        # out of that Secret. Adjust if your CA chain lives elsewhere.
+        # Acquisition-audit DEPL-004 closure (Sprint 6 ACQ, 2026-05-16).
+        # The default flipped from `insecureSkipVerify: true` to a
+        # fail-closed posture: operators MUST either supply a
+        # `monitoring.serviceMonitor.tlsConfig` block (caFile / ca /
+        # serverName for a real TLS verify) or opt back in explicitly
+        # with `tlsConfig: { insecureSkipVerify: true }`. The `required`
+        # check below renders an error at `helm template` / `helm upgrade`
+        # time if neither is supplied, surfacing the misconfiguration
+        # before the ServiceMonitor lands in-cluster.
+        #
+        # In-cluster scrapes from a Prometheus pod that already trusts the
+        # certctl CA (via existingSecret + cert-manager) keep working with
+        # zero operator action — they just point at the right caFile.
+        # Out-of-cluster Prometheus deployments now require the operator
+        # to surface the trust decision explicitly.
         {{- if .Values.monitoring.serviceMonitor.tlsConfig }}
         {{- toYaml .Values.monitoring.serviceMonitor.tlsConfig | nindent 8 }}
         {{- else }}
-        insecureSkipVerify: true
+        {{- required "monitoring.serviceMonitor.tlsConfig is required when monitoring.serviceMonitor.enabled=true (Sprint 6 ACQ DEPL-004 closure, 2026-05-16). Supply { caFile: \"/etc/prometheus/secrets/.../ca.crt\", serverName: \"certctl-server\" } to verify against your CA, or { insecureSkipVerify: true } to opt back into the pre-2026-05-16 default. See docs/operator/helm-deployment.md for the upgrade-path note." nil }}
         {{- end }}
       {{- with .Values.monitoring.serviceMonitor.bearerTokenSecret }}
       bearerTokenSecret:
--- a/deploy/helm/certctl/values.yaml
+++ b/deploy/helm/certctl/values.yaml
@@ -680,14 +680,28 @@ monitoring:
     #   name: certctl-prometheus-key
     #   key: api-key
     # bearerTokenSecret: {}
-    # TLS config for the scrape endpoint. The certctl server presents
-    # the same TLS cert the rest of the chart uses; insecureSkipVerify
-    # defaults to true so demos work out of the box. Production deploys
-    # should pin the CA via caFile or ca.secret.
+    # TLS config for the scrape endpoint. Acquisition-audit DEPL-004
+    # closure (Sprint 6 ACQ, 2026-05-16): the default flipped from
+    # `insecureSkipVerify: true` to fail-closed. Operators MUST supply
+    # tlsConfig — either a real verify (caFile / ca / serverName) for
+    # production, or explicit `{ insecureSkipVerify: true }` to opt
+    # back into the pre-2026-05-16 default. The ServiceMonitor template
+    # `{{ required ... }}` guard surfaces missing tlsConfig at chart-
+    # render time before it lands in-cluster.
+    #
+    # In-cluster Prometheus that already trusts the certctl CA via
+    # the chart's existingSecret / cert-manager-emitted bundle: point
+    # caFile at that path (typically /etc/prometheus/secrets/<name>/ca.crt
+    # once you mount the Secret into the Prometheus pod).
+    #
+    # Production-shaped example (verify against the chart's CA):
     # tlsConfig:
     #   caFile: /etc/prometheus/secrets/certctl-ca/ca.crt
     #   serverName: certctl-server
-    # tlsConfig: {}
+    #
+    # Demo / dev-cluster escape hatch (operator-acknowledged):
+    # tlsConfig:
+    #   insecureSkipVerify: true
     # Optional relabeling for the scrape job.
     # relabelings: []
 
--- a/docs/operator/helm-deployment.md
+++ b/docs/operator/helm-deployment.md
@@ -94,6 +94,31 @@ helm upgrade certctl deploy/helm/certctl/ \
 
 Postgres state survives the upgrade (the PVC is retained). The server / agent images bump per the chart's `image.tag`. See [`docs/archive/upgrades/`](../archive/upgrades/) for version-specific upgrade guidance.
 
+### 2026-05-16 — ServiceMonitor TLS default flipped (DEPL-004)
+
+Acquisition-audit DEPL-004 closure. `monitoring.serviceMonitor.tlsConfig` was previously empty by default and the chart template fell through to `insecureSkipVerify: true`. Post-2026-05-16, the template emits a `{{ required ... }}` fail-closed message at `helm template` / `helm upgrade` time if neither a real verify nor an explicit opt-back is supplied.
+
+Operators with `monitoring.serviceMonitor.enabled: true` MUST set one of:
+
+```yaml
+# A. Real TLS verify against the chart's CA (production-shaped).
+monitoring:
+  serviceMonitor:
+    enabled: true
+    tlsConfig:
+      caFile: /etc/prometheus/secrets/certctl-ca/ca.crt
+      serverName: certctl-server
+
+# B. Demo / dev-cluster — operator-acknowledged opt-back to pre-flip default.
+monitoring:
+  serviceMonitor:
+    enabled: true
+    tlsConfig:
+      insecureSkipVerify: true
+```
+
+Operators with `monitoring.serviceMonitor.enabled: false` (the chart default) need no action — the template short-circuits before the `tlsConfig` block.
+
 ## Configuration reference
 
 Every value is documented at `deploy/helm/certctl/values.yaml`. Common tweaks: