mirror of
https://github.com/shankar0123/certctl.git
synced 2026-06-10 04:48:52 +00:00
d6f4d5c5e8
Phase 4 of the certctl architecture diligence remediation closure.
Seven findings, all in deploy/helm/certctl/.
DEPL-H2 (High) — ship deploy/helm/certctl/templates/backup-cronjob.yaml
Operator opt-in via backup.enabled=true. Default OFF. CronJob runs
pg_dump --format=custom --no-owner --no-acl --dbname=certctl
matching the canonical shape in
docs/operator/runbooks/postgres-backup.md (so manual and
automated dumps use the same format and flags). Sink: PVC (default) or S3
via aws-cli. Documented as in-cluster-Postgres only — managed DB
deployments rely on their provider's PITR.
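For orientation, a rendered backup CronJob with the PVC sink might look roughly like the sketch below. Only the pg_dump flags come from this commit message. The resource names, schedule, image tag, output path, and PVC name are assumptions for illustration, and the Secret keys are borrowed from the Postgres StatefulSet template; this is not the chart's actual backup-cronjob.yaml.

```yaml
# Hypothetical sketch of a rendered backup CronJob (PVC sink).
# Only the pg_dump flags are taken from the commit message; names,
# schedule, image, and paths are illustrative assumptions.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: certctl-backup                 # assumed name
spec:
  schedule: "0 2 * * *"                # assumed: daily at 02:00
  concurrencyPolicy: Forbid            # never overlap two dumps
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: pg-dump
              image: postgres:16       # assumed; should match the server major version
              command:
                - pg_dump
                - --format=custom
                - --no-owner
                - --no-acl
                - --dbname=certctl
                # Fixed file name keeps the sketch simple; a real job would
                # likely add a timestamp so older dumps are not overwritten.
                - --file=/backups/certctl.dump
              env:
                - name: PGHOST
                  value: certctl-postgres        # assumed Service name
                - name: PGUSER
                  valueFrom:
                    secretKeyRef:
                      name: certctl-postgres     # assumed Secret name
                      key: username
                - name: PGPASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: certctl-postgres
                      key: password
              volumeMounts:
                - name: backups
                  mountPath: /backups
          volumes:
            - name: backups
              persistentVolumeClaim:
                claimName: certctl-backups       # assumed PVC name
```

A dump written this way would be restored with pg_restore, since the custom format is not plain SQL.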
DEPL-M1 (Med) — Helm pre-install/pre-upgrade migration hook
deploy/helm/certctl/templates/migration-job.yaml — runs
`certctl-server --migrate-only` before the server Deployment
rolls. The --migrate-only flag (new in cmd/server/main.go) is a
hermetic schema-mutation pass: load config, open DB pool, run
RunMigrations + RunSeed, exit 0. No HTTP listener, no scheduler,
no signing setup.
Server's boot-time RunMigrations call is now gated on
CERTCTL_MIGRATIONS_VIA_HOOK — when set true, the server skips
the boot path (the hook owns the work). Default still runs at
boot, so Compose / VM / bare-metal deploys are unchanged.
migrations.viaHook: false in values.yaml (off by default).
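The hook wiring described above could be sketched as follows. The `helm.sh/hook` annotations are standard Helm, and the `--migrate-only` flag and `CERTCTL_MIGRATIONS_VIA_HOOK` variable come from this commit. The Job name, image reference, hook weight, and delete policy are assumptions, and the database connection settings the Job needs are omitted because the commit message does not describe them.

```yaml
# Hypothetical sketch of the migration hook Job (not the chart's actual
# migration-job.yaml). Database connection env vars are omitted.
apiVersion: batch/v1
kind: Job
metadata:
  name: certctl-migrate                       # assumed name
  annotations:
    "helm.sh/hook": pre-install,pre-upgrade   # run before the release's resources roll
    "helm.sh/hook-weight": "0"                # assumed ordering weight
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded  # assumed policy
spec:
  backoffLimit: 1                             # assumed; fail the release quickly
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrate
          image: certctl-server:latest        # assumed image reference
          command: ["certctl-server", "--migrate-only"]
---
# Fragment of the server container spec when migrations.viaHook=true:
# the server skips its boot-time RunMigrations call.
env:
  - name: CERTCTL_MIGRATIONS_VIA_HOOK
    value: "true"
```

If the hook Job fails, Helm aborts the install or upgrade before the server Deployment rolls, which is the point of moving the schema work out of the boot path.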
DEPL-M4 (Med) — explicit Postgres StatefulSet strategy fields
deploy/helm/certctl/templates/postgres-statefulset.yaml adds:
spec.updateStrategy.type: OnDelete
spec.podManagementPolicy: OrderedReady
Operator-controlled Postgres upgrades (the OnDelete strategy
means a chart template tweak no longer triggers an immediate
Postgres restart). OrderedReady aligns with the standard
Postgres-on-Kubernetes pattern for any future HA work.
DEPL-M5 (Med) — per-fleet-size resource ladder documentation
deploy/helm/certctl/values.yaml — extended comments next to
server.resources + agent.resources documenting:
"≤ 500 certs / 100 agents" → defaults are validated
"5K certs / 1K agents" → starter suggestions, TBD Phase 8
"50K certs / 10K agents" → starter suggestions, TBD Phase 8
Numbers for the small-fleet case derive from the measured
baselines in docs/operator/performance-baselines.md
(50ms p50, < 3s for 1000-cert inventory walk, etc.). Larger
fleet numbers explicitly marked TBD pending Phase 8 load-test
runs — operators tune empirically until then.
DEPL-L1 (Low) — Helm rollback runbook
docs/operator/runbooks/rollback.md — covers helm rollback
mechanics, the schema-migration manual-cleanup path (when
*.down.sql files apply vs. when full restore is the only safe
path), and the per-migration-class safe-to-rollback table.
DEPL-L2 (Low) — Prometheus AlertManager rules
deploy/helm/certctl/templates/prometheusrules.yaml — opt-in via
monitoring.prometheusRules.enabled=true. Default OFF. Four
starter rules using verified metric names from
internal/api/handler/metrics.go:
CertctlCertificateExpiringSoon (certctl_certificate_expiring_soon)
CertctlAgentOffline ((agent_total - agent_online) > 0 for 1h)
CertctlJobFailureRateHigh (failure rate over 5% for 15m)
CertctlIssuanceFailures (any failures over 15m window)
All thresholds operator-tunable via
monitoring.prometheusRules.thresholds.* in values.
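As an illustration of what one rendered rule might look like, the sketch below uses the Prometheus Operator `PrometheusRule` resource with the one metric name spelled out in full above. The resource name, group name, comparison threshold, `for` duration, and labels are assumptions; the chart's real values come from monitoring.prometheusRules.thresholds.*.

```yaml
# Hypothetical sketch of a single rendered alerting rule.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: certctl-alerts                  # assumed name
spec:
  groups:
    - name: certctl.rules               # assumed group name
      rules:
        - alert: CertctlCertificateExpiringSoon
          # Metric name from the commit message; "> 0" is an assumed threshold.
          expr: certctl_certificate_expiring_soon > 0
          for: 1h                       # assumed hold time
          labels:
            severity: warning           # assumed severity
          annotations:
            summary: "One or more certificates are close to expiry"
```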
DEPL-L3 (Low) — Prometheus bearer-token setup runbook
docs/operator/runbooks/prometheus-bearer-token.md — documents
the API-key + Secret + values wiring for the RBAC-gated
/api/v1/metrics/prometheus scrape endpoint. End-to-end
procedure with troubleshooting steps + rotation guide.
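For a Prometheus that is configured directly rather than through the chart, the bearer-token scrape could be sketched like this. The metrics path comes from this commit and the `authorization` block is standard Prometheus scrape configuration; the job name, scheme, target address, and token file path are assumptions.

```yaml
# Hypothetical scrape job for the RBAC-gated metrics endpoint.
scrape_configs:
  - job_name: certctl                              # assumed job name
    scheme: https                                  # assumed
    metrics_path: /api/v1/metrics/prometheus
    authorization:
      type: Bearer
      # File holding the certctl API key, e.g. mounted from a Secret (assumed path).
      credentials_file: /etc/prometheus/secrets/certctl-api-key/token
    static_configs:
      - targets: ["certctl-server.certctl.svc:8443"]   # assumed host and port
```

Reading the key from a file means rotation only requires updating the mounted Secret, not editing the Prometheus configuration.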
CI guard: scripts/ci-guards/helm-templates-lint.sh
Six-combo matrix: defaults / backup PVC / backup S3 /
prometheusRules / migrations.viaHook / all-on. Each runs helm
template + checks render success. helm lint also gated.
Wired into the auto-pickup loop in .github/workflows/ci.yml;
azure/setup-helm@b9e51907 (v4.3.0, SHA-pinned per Phase 1
RED-2) installs helm v3.16.0 on the runner.
Verification (all pass):
ls deploy/helm/certctl/templates/{backup-cronjob,migration-job,prometheusrules}.yaml
grep -E 'updateStrategy|podManagementPolicy' deploy/helm/certctl/templates/postgres-statefulset.yaml # 2 matches
helm template deploy/helm/certctl/ --set backup.enabled=true \
  --set monitoring.prometheusRules.enabled=true --set migrations.viaHook=true \
  | grep -E "kind: (CronJob|PrometheusRule|Job)" # 3 matches
helm lint deploy/helm/certctl/ # 0 failed
ls docs/operator/runbooks/{rollback,prometheus-bearer-token}.md
bash scripts/ci-guards/helm-templates-lint.sh # 6/6 matrix combinations pass
Go build clean (cmd/server compiles, migrate-only path verified by
the build target). YAML validated.
Closes: cowork/certctl-architecture-diligence-audit.html#fix-DEPL-H2
cowork/certctl-architecture-diligence-audit.html#fix-DEPL-M1
cowork/certctl-architecture-diligence-audit.html#fix-DEPL-M4
cowork/certctl-architecture-diligence-audit.html#fix-DEPL-M5
cowork/certctl-architecture-diligence-audit.html#fix-DEPL-L1
cowork/certctl-architecture-diligence-audit.html#fix-DEPL-L2
cowork/certctl-architecture-diligence-audit.html#fix-DEPL-L3
95 lines
3.4 KiB
YAML
{{- if .Values.postgresql.enabled }}
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: {{ include "certctl.fullname" . }}-postgres
  labels:
    {{- include "certctl.labels" . | nindent 4 }}
    app.kubernetes.io/component: postgres
spec:
  serviceName: {{ include "certctl.fullname" . }}-postgres
  replicas: 1
  # Phase 4 DEPL-M4 closure (2026-05-14): explicit StatefulSet update +
  # pod-management strategies. Defaults make Postgres upgrades
  # operator-controlled rather than automatic:
  #   updateStrategy.type: OnDelete - Postgres pods do NOT roll
  #     automatically when the StatefulSet spec changes. Operator
  #     deletes the pod explicitly after taking a backup + reviewing
  #     the change. Prevents an accidental Helm-template tweak from
  #     triggering a database restart at an awkward time.
  #   podManagementPolicy: OrderedReady - when scaling Postgres to
  #     a replica >1 (future HA work), pods come up one at a time
  #     and must reach Ready before the next pod is created. Aligns
  #     with the standard Postgres-on-Kubernetes pattern.
  updateStrategy:
    type: OnDelete
  podManagementPolicy: OrderedReady
  selector:
    matchLabels:
      {{- include "certctl.postgresSelectorLabels" . | nindent 6 }}
  template:
    metadata:
      labels:
        {{- include "certctl.postgresSelectorLabels" . | nindent 8 }}
    spec:
      securityContext:
        {{- toYaml .Values.postgresql.securityContext | nindent 8 }}
      {{- with .Values.imagePullSecrets }}
      imagePullSecrets:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      containers:
        - name: postgres
          image: {{ include "certctl.postgresImage" . }}
          imagePullPolicy: {{ .Values.postgresql.image.pullPolicy }}
          ports:
            - name: postgres
              containerPort: 5432
              protocol: TCP
          env:
            - name: POSTGRES_DB
              valueFrom:
                secretKeyRef:
                  name: {{ include "certctl.fullname" . }}-postgres
                  key: database
            - name: POSTGRES_USER
              valueFrom:
                secretKeyRef:
                  name: {{ include "certctl.fullname" . }}-postgres
                  key: username
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: {{ include "certctl.fullname" . }}-postgres
                  key: password
            - name: POSTGRES_INITDB_ARGS
              value: "--encoding=UTF8"
          livenessProbe:
            {{- toYaml .Values.postgresql.livenessProbe | nindent 12 }}
          readinessProbe:
            {{- toYaml .Values.postgresql.readinessProbe | nindent 12 }}
          resources:
            {{- toYaml .Values.postgresql.resources | nindent 12 }}
          volumeMounts:
            - name: postgres-data
              mountPath: /var/lib/postgresql/data
              subPath: postgres
            - name: postgres-init
              mountPath: /docker-entrypoint-initdb.d
      volumes:
        - name: postgres-init
          emptyDir: {}
  volumeClaimTemplates:
    - metadata:
        name: postgres-data
      spec:
        accessModes:
          - ReadWriteOnce
        {{- if .Values.postgresql.storage.storageClass }}
        storageClassName: {{ .Values.postgresql.storage.storageClass }}
        {{- end }}
        resources:
          requests:
            storage: {{ .Values.postgresql.storage.size }}
{{- end }}
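The template above reads a handful of keys under `postgresql` in values.yaml. A minimal values fragment that satisfies those lookups might look like the sketch below. The key names are the ones the template references; every value shown (sizes, probe commands, the user name passed to pg_isready, the fsGroup) is an assumption for illustration, not the chart's shipped default.

```yaml
# Hypothetical values.yaml fragment covering the keys the template reads.
postgresql:
  enabled: true              # gates the whole StatefulSet
  image:
    pullPolicy: IfNotPresent
  storage:
    size: 10Gi               # assumed size
    storageClass: ""         # empty: storageClassName is omitted, cluster default applies
  securityContext:
    fsGroup: 999             # assumed; depends on the Postgres image's runtime user
  livenessProbe:
    exec:
      command: ["pg_isready", "-U", "certctl"]   # assumed user name
    initialDelaySeconds: 30
    periodSeconds: 10
  readinessProbe:
    exec:
      command: ["pg_isready", "-U", "certctl"]
    initialDelaySeconds: 5
    periodSeconds: 5
  resources:
    requests:
      cpu: 250m
      memory: 256Mi
    limits:
      memory: 512Mi
```

Because updateStrategy.type is OnDelete, changing any of these values and running helm upgrade updates the StatefulSet spec but leaves the running pod untouched until an operator deletes it.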