mirror of
https://github.com/shankar0123/certctl.git
synced 2026-06-07 13:51:36 +00:00
docs(observability): DEPL-006 follow-up — document CERTCTL_OTEL_ENABLED (G-3 ci-guard)
Sprint 6 ACQ DEPL-006 closure follow-up. The G-3-env-docs-drift
ci-guard scans `internal/` + `cmd/` for every CERTCTL_*
env-var reference and cross-checks against README + docs/ +
deploy/helm/ + deploy/ENVIRONMENTS.md. The OTel-seed commit
(35277c0) introduced `CERTCTL_OTEL_ENABLED` in
`internal/config/config.go` + `cmd/server/main.go` but didn't
add the matching doc entry, so the guard caught the drift on
the next CI run with:
    G-3 regression: env var(s) defined in Go source but never documented:
      CERTCTL_OTEL_ENABLED
Replaces the existing "Tracing — explicitly not yet shipped"
subsection in docs/operator/observability.md with an honest
"Tracing — OTLP surface available, instrumentation pending"
section that:
- Documents the env var + the standard OTEL_* env vars the SDK
honors (OTEL_EXPORTER_OTLP_ENDPOINT, OTEL_SERVICE_NAME, etc.).
- Explains the OTLP/HTTP transport choice (vs gRPC) per the
rationale in internal/observability/otel.go's header.
- Pins what the current release DOES (surface + lazy connect +
graceful shutdown) vs DOES NOT (per-handler / per-DB /
per-connector spans).
- Notes the no-op-shutdown contract so operators can defer
unconditionally.
- Cross-references the existing request_id correlation + per-
issuer Prometheus histogram as the interim correlation surface.
- Repoints the "future work" tracker from the old "v3 item"
framing to WORKSPACE-ROADMAP.md §2 (Phase 4 in the path-b
build plan).
Verified locally: `bash scripts/ci-guards/G-3-env-docs-drift.sh`
exits 0 ("G-3 env-docs-drift: clean").
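For readers unfamiliar with the guard, the drift check it performs can be approximated in a few lines of Go. This is only a sketch of the idea described in this message, not a port of `scripts/ci-guards/G-3-env-docs-drift.sh`: the regular expression, the function name `undocumentedEnvVars`, and the sample inputs (including `CERTCTL_LISTEN_ADDR`) are assumptions made for illustration, and the real script reads files from disk rather than in-memory strings.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strings"
)

// envVarPattern matches identifiers such as CERTCTL_OTEL_ENABLED.
var envVarPattern = regexp.MustCompile(`\bCERTCTL_[A-Z0-9_]+\b`)

// undocumentedEnvVars returns, sorted and de-duplicated, every CERTCTL_*
// name that occurs in some source text but in none of the doc texts.
func undocumentedEnvVars(sources, docs []string) []string {
	documented := make(map[string]bool)
	for _, d := range docs {
		for _, name := range envVarPattern.FindAllString(d, -1) {
			documented[name] = true
		}
	}
	seen := make(map[string]bool)
	var missing []string
	for _, s := range sources {
		for _, name := range envVarPattern.FindAllString(s, -1) {
			if !documented[name] && !seen[name] {
				seen[name] = true
				missing = append(missing, name)
			}
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	// In-memory stand-ins for Go sources and documentation files.
	sources := []string{
		`enabled := os.Getenv("CERTCTL_OTEL_ENABLED")`,
		`addr := os.Getenv("CERTCTL_LISTEN_ADDR")`, // hypothetical variable
	}
	docs := []string{"Set `CERTCTL_LISTEN_ADDR` to choose the bind address."}

	if missing := undocumentedEnvVars(sources, docs); len(missing) > 0 {
		fmt.Println("env var(s) defined in Go source but never documented:")
		fmt.Println(strings.Join(missing, "\n"))
	}
}
```

With the sample inputs, only `CERTCTL_OTEL_ENABLED` is reported, which mirrors the regression this commit closes.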
--- a/docs/operator/observability.md
+++ b/docs/operator/observability.md
@@ -74,22 +74,55 @@ metric surface meet our SLO needs today" — not "is the right library
 under the hood." If the answer to the first question is yes, the
 second is a refactor, not a feature gap.
 
-## Tracing — explicitly not yet shipped
+## Tracing — OTLP surface available, instrumentation pending
 
-certctl does **not** ship distributed tracing instrumentation today:
+Sprint 6 ACQ DEPL-006 closure (2026-05-16) stood up the OTel tracer-
+provider surface. Operators with an OTel collector can opt in via:
 
-- No OpenTelemetry SDK setup in `cmd/server/main.go`.
-- No OTLP exporter wired into outbound calls (issuer connectors,
-agent enrollment, etc.).
-- The `go.opentelemetry.io/otel` packages that appear in
-[`go.mod`](../../go.mod) are indirect-only — they're transitive
-dependencies of `coreos/go-oidc` and similar.
+```
+CERTCTL_OTEL_ENABLED=true
+OTEL_EXPORTER_OTLP_ENDPOINT=https://otel-collector.example.com:4318
+```
 
-This is honest: there is no in-process tracing surface to monitor,
-correlate, or sample. If your environment requires end-to-end traces
-across the certctl control plane + agents + issuer backends, this is
-a gap you would close on the certctl side as part of a v3 work item.
-Until then:
+When `CERTCTL_OTEL_ENABLED` is true, `cmd/server/main.go` calls
+`internal/observability.Init` which:
+
+- Constructs an OTLP/HTTP exporter (chosen over OTLP/gRPC to keep
+the dependency surface narrow — see `internal/observability/otel.go`
+header for the transport-choice rationale).
+- Registers a real `sdktrace.TracerProvider` as the otel global.
+- Honors the standard OTel env vars (`OTEL_EXPORTER_OTLP_ENDPOINT`,
+`OTEL_EXPORTER_OTLP_HEADERS`, `OTEL_EXPORTER_OTLP_INSECURE`,
+`OTEL_SERVICE_NAME` overrides the default `certctl-server`, etc.).
+- Defers a graceful shutdown that flushes the in-flight batcher.
+
+What this **does not** ship yet:
+
+- No per-handler / per-DB / per-connector span instrumentation in
+the certctl code base. The OTel SDK emits the spans it generates
+internally (process resource attributes, eventual stdlib HTTP
+spans), but certctl-domain spans (issuance, renewal, deployment,
+agent enrollment) are a v2.3 roadmap follow-up.
+- No tracing-correlated metric exemplars in the Prometheus
+histograms above. Those still ship the per-issuer latency signal
+without per-request fan-out.
+- No backwards-compat shim — operators who never set
+`CERTCTL_OTEL_ENABLED` (the default) see zero behavior change.
+The init returns a no-op shutdown so the deferred call is safe
+to invoke unconditionally.
+
+When this matters today:
+
+- Operators wiring up a v3 instrumentation effort have the OTel
+surface in place; they only need to add `tracer.Start(ctx, "…")`
+call sites in the handler/service code.
+- Operators evaluating certctl for acquisition / due-diligence see
+an opt-in OTel surface in the current release rather than a "v3
+roadmap item" — a useful signal for buyer credibility per the
+acquisition-thesis framing in `WORKSPACE-ROADMAP.md` §3.
+
+Existing correlation surfaces stay in place until span coverage
+ships:
 
 - Structured logs include a `request_id` you can correlate across
 the server log stream. See
@@ -99,8 +132,9 @@ Until then:
 same per-issuer latency signal a trace span would, just without
 the per-request fan-out.
 
-OpenTelemetry instrumentation is tracked in
-[WORKSPACE-ROADMAP.md](../../WORKSPACE-ROADMAP.md) as a v3 item.
+Per-handler / per-query / per-connector span instrumentation is
+tracked in [WORKSPACE-ROADMAP.md](../../WORKSPACE-ROADMAP.md) under
+§2 (NHI / Agent Identity, Phase 4 in the path-b build plan).
 
 ## Logging
 