mirror of
https://github.com/shankar0123/certctl.git
synced 2026-06-07 15:11:29 +00:00
docs: Phase 5 — testing-guide.md prune (8268 → 0 lines, content dispersed)
Per Phase 1 audit at cowork/docs-overhaul-phase-1-audit-2026-05-04/
and the section-by-section plan in testing-guide-tumor.md.
testing-guide.md was 30% of all docs/ content (8268 lines) but was
integration test code written in markdown, not operator documentation.
The audit's tumor analysis disposed of every Part:
- ~65% DELETE (test cases that already exist in code)
- ~22% MOVE to inline test code
- ~8% KEEP-COMPRESSED into focused operator-runbook docs
- Title + contents + release sign-off ~5% KEEP
This commit ships the KEEP-COMPRESSED dispersal:
docs/contributor/qa-prerequisites.md (NEW, ~120 lines):
From testing-guide.md "Prerequisites" section. Stack boot procedure,
demo data baseline, reference IDs operators reuse across QA docs.
docs/contributor/gui-qa-checklist.md (NEW, ~105 lines):
From testing-guide.md "Part 35: GUI Testing". Manual GUI verification
pass for release sign-off. 25-row table covering every dashboard page.
docs/contributor/release-sign-off.md (NEW, ~130 lines):
From testing-guide.md "Release Sign-Off" section (originally 1009
lines of per-test detail tables). Compressed to a release-day
checklist organized by gate category: code state, automated gates,
manual QA passes, release artefact verification, branch protection,
post-release.
docs/operator/performance-baselines.md (NEW, ~100 lines):
From testing-guide.md "Part 39: Performance Spot Checks". Four
operator-runnable benchmarks (API request handling, inventory list
pagination, scheduler tick, bulk revoke) with baseline numbers and
when-to-re-baseline guidance.
docs/operator/helm-deployment.md (NEW, ~120 lines):
From testing-guide.md "Part 52: Helm Chart Deployment". Operator
runbook for the bundled deploy/helm/certctl/ chart: prereqs,
install, four cert-source patterns, verify, upgrade, troubleshooting.
docs/reference/cli.md (NEW, ~120 lines):
From testing-guide.md "Part 28: CLI Tool". certctl-cli command
reference with command-group breakdown, common workflows
(list/filter, renew, revoke, bulk import, EST enrollment, status),
output formats, CI/CD integration patterns.
docs/README.md navigation index updated to include the 6 new docs:
Reference section gains: cli.md, release-verification.md (was added
in Phase 13)
Operator section gains: helm-deployment.md, performance-baselines.md
Contributor section gains: qa-prerequisites.md, gui-qa-checklist.md,
release-sign-off.md
docs/testing-guide.md deleted. Git history preserves the 8268 lines —
if any specific test case is found missing from inline test code or
the destination docs during future work, lift from `git show
HEAD~1:docs/testing-guide.md`.
Net: docs/ total line count drops by ~7,600 lines (29%), from 26,369
to 18,742. testing-guide.md was the single largest doc; pruning it is
the biggest content-edit win of the entire restructure.
Phase 5 is the last major content phase. Remaining: Phase 4 follow-on
(per-connector page extractions from reference/connectors/index.md),
Phase 15 (WHAT/HOW/WHY remediation), Phase 16 (final acceptance gate).
@@ -28,7 +28,9 @@ You're operating certctl in production or building integrations and need authori
|---|---|
| [Architecture](reference/architecture.md) | System design, data flow, security model, deployment topologies |
| [API](reference/api.md) | OpenAPI 3.1 spec, integration patterns, client SDK generation |
| [CLI](reference/cli.md) | certctl-cli command reference and CI/CD integration patterns |
| [MCP server](reference/mcp.md) | Model Context Protocol integration for AI assistants |
| [Release verification](reference/release-verification.md) | Cosign / SLSA / SBOM verification procedure |
| [Intermediate CA hierarchy](reference/intermediate-ca-hierarchy.md) | Multi-level CA tree management — RFC 5280 §3.2/§4.2.1.9/§4.2.1.10 enforcement |
| [Deployment model](reference/deployment-model.md) | Atomic write, post-deploy verify, rollback semantics across all targets |
| [Vendor matrix](reference/vendor-matrix.md) | Tested vendor versions per target connector |
@@ -66,6 +68,8 @@ You're running certctl in production and need operational guidance.
| [Control plane TLS](operator/tls.md) | Self-signed bootstrap, operator-supplied Secret, cert-manager Certificate CR |
| [Database TLS](operator/database-tls.md) | PostgreSQL transport encryption |
| [Approval workflow](operator/approval-workflow.md) | Two-person integrity gate for high-stakes issuance |
| [Helm deployment](operator/helm-deployment.md) | Kubernetes installation via the bundled chart |
| [Performance baselines](operator/performance-baselines.md) | Operator-runnable benchmarks for regression spot checks |
| [Legacy clients (TLS 1.2)](operator/legacy-clients-tls-1.2.md) | Reverse-proxy runbook for embedded EST/SCEP clients on TLS 1.2 |

### Runbooks
@@ -108,7 +112,10 @@ You're contributing to certctl, running tests locally, or trying to understand t
|---|---|
| [Testing strategy](contributor/testing-strategy.md) | What we test and why; per-PR fast gates vs daily deep-scan |
| [Test environment](contributor/test-environment.md) | Local environment with real CAs (Pebble, step-ca, etc.) |
| [QA prerequisites](contributor/qa-prerequisites.md) | Before running QA: stack boot, demo data baseline, env vars |
| [QA test suite](contributor/qa-test-suite.md) | qa_test.go reference for release QA |
| [GUI QA checklist](contributor/gui-qa-checklist.md) | Manual GUI verification pass for release |
| [Release sign-off](contributor/release-sign-off.md) | Release-day checklist — code state, automated gates, manual QA, artefact verification |
| [CI pipeline](contributor/ci-pipeline.md) | CI shape, regression guards, adding new checks |

## Archive
@@ -0,0 +1,68 @@
# GUI QA Checklist

> Last reviewed: 2026-05-05

Manual GUI verification pass for release sign-off. Vitest covers component-level behavior; this checklist covers end-to-end flows that only land correctly when the React SPA, the REST API, and the database are all wired together.

## Prereqs

The full stack must be running and healthy per [`qa-prerequisites.md`](qa-prerequisites.md). Open `https://localhost:8443` in a fresh browser session (Incognito / Private mode is fine — avoids cached state from previous QA passes).

## Pages to verify

For each page, the verification is "open it, confirm it renders without console errors, exercise the documented action, confirm the action lands as expected."

| Page | Action to verify | Expected result |
|---|---|---|
| `/dashboard` | Page loads, all 4 stat cards populate | Total / Active / Expiring / Expired counts match `GET /api/v1/stats/summary` |
| `/certificates` | Inventory list paginates | "Next page" button works; URL updates with cursor; row count consistent |
| `/certificates/<id>` | Detail page opens for any cert | Cert chain renders, deployment status shows, audit timeline visible |
| `/issuers` | Catalog renders all configured issuers | Each issuer card shows last-used / status; clicking opens detail |
| `/issuers/<id>` | Issuer config form | Edit + Save round-trips through `PATCH /api/v1/issuers/<id>` |
| `/issuers/hierarchy` | CA tree view | Multi-level hierarchy renders; admin-gated CRUD buttons present for admins only |
| `/agents` | Fleet view | Online/offline status accurate; OS/arch grouping correct |
| `/agents/<id>` | Agent detail | Last heartbeat, registered date, deployment job history |
| `/agents/groups` | Agent groups CRUD | Create + edit + delete a test group; verify dynamic membership matching |
| `/jobs` | Job queue | Filter by status / type works; click into a job opens detail |
| `/jobs/<id>` | Job detail | Status, retries, logs, owner attribution |
| `/policies` | Renewal policies CRUD | Edit AlertChannels matrix, save, verify backend reflects change |
| `/profiles` | Certificate profiles | EKU constraints + max TTL editable; profile binding works |
| `/notifications` | Notifier config | Test connection button against each configured notifier |
| `/discovery` | Discovery triage | Claim / Dismiss buttons round-trip to backend |
| `/network-scans` | Scan target CRUD | Create scan target, trigger immediate scan, results appear |
| `/audit` | Audit trail | Filter by actor / action / time range; CSV export works |
| `/short-lived` | Short-lived credential dashboard | Live TTL countdown updates; auto-refresh every 10s |
| `/observability` | Observability dashboard | Charts render: expiration heatmap, renewal trends, issuance rate |
| `/health` | Health monitor | TLS endpoint health: healthy / degraded / down states accurate |
| `/digest` | Digest preview | Email preview renders; "Send digest" button dispatches |
| `/owners` | Owners CRUD | Create owner with team, edit, delete (after reassigning certs) |
| `/teams` | Teams CRUD | Create + delete; verify cascade removes orphan owners |
| `/scep` | SCEP admin tabs | Profiles / Intune Monitoring / Recent Activity all populate |
| `/est` | EST admin tabs | Profiles / Recent Activity / Trust Bundle all populate |
| `/login` | Login flow | API key entry persists for the session; bad key rejected |

## Console hygiene

Open browser DevTools and confirm:

- No uncaught exceptions on any page
- No 404 / 500 responses in the Network tab from API calls
- No CORS errors
- No CSP violations

## Mobile / narrow-viewport

The dashboard is desktop-first but should not break catastrophically on narrow viewports. Resize the browser to 380px width; confirm:

- Sidebar collapses to a hamburger menu
- Tables either scroll horizontally or stack on mobile
- Forms remain usable

## Accessibility spot-check

- Tab through any single page using only the keyboard. Every interactive element must be reachable, and the focus indicator must be visible.
- Lighthouse accessibility audit on `/dashboard`: target ≥ 90.

## Sign-off

Document any deviations in the release sign-off matrix at [`release-sign-off.md`](release-sign-off.md).
@@ -0,0 +1,99 @@
# QA Prerequisites

> Last reviewed: 2026-05-05

Operational prereqs for running release QA against certctl. Before any of the contributor-facing testing surfaces (test-environment.md, gui-qa-checklist.md, release-sign-off.md) are useful, the local stack needs to be in a known-good state.

## Why manual QA on top of automated tests?

Automated tests mock dependencies and run in isolation. Manual QA validates the full integrated stack: real PostgreSQL, real HTTP, real agent binary, real file I/O, real scheduler timing. It catches issues that unit tests can't: migration ordering, Docker networking, env var parsing, browser rendering, and timing-dependent scheduler behavior.

## Environment setup

**Step 1: Start the full stack.**

```bash
cd deploy && docker compose -f docker-compose.demo.yml up --build -d
```

This builds three containers (postgres, certctl-server, certctl-agent) and runs them on a bridge network. The `--build` flag ensures you're testing the current code, not a stale image. The `demo` overlay seeds the database with realistic fixtures.

**Step 2: Wait for healthy state.**

```bash
for i in $(seq 1 30); do
  # One "name: health" line per container that reports a health status
  STATUS=$(docker compose ps --format json 2>/dev/null | jq -r 'select(.Health != null and .Health != "") | "\(.Name): \(.Health)"' 2>/dev/null)
  echo "$STATUS"
  # Done once something is reported and nothing is unhealthy or still starting
  if [ -n "$STATUS" ] && ! echo "$STATUS" | grep -q "unhealthy\|starting"; then break; fi
  sleep 2
done
```

Why: Docker Compose starts containers in dependency order (postgres → server → agent), but "started" doesn't mean "ready." Health checks confirm postgres accepts connections, the server responds on `/health`, and the agent process is running.
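
The loop above gives up silently after 30 tries. A minimal sketch of a variant that reports a timeout through its exit status, so a QA script can stop early. The function names `all_healthy` and `wait_healthy` are hypothetical and not part of certctl; the status command is whatever prints one `name: health` line per container.

```bash
# all_healthy: reads "name: health" lines on stdin.
# Succeeds only if at least one line is present and none ends in
# "starting" or "unhealthy".
all_healthy() {
  awk '
    NF { seen = 1 }
    /: (starting|unhealthy)$/ { bad = 1 }
    END { exit (seen && !bad) ? 0 : 1 }
  '
}

# wait_healthy STATUS_CMD [TRIES] [DELAY_SECONDS]
# Re-runs STATUS_CMD until all_healthy passes; returns 1 on timeout.
wait_healthy() {
  status_cmd=$1 tries=${2:-30} delay=${3:-2}
  while [ "$tries" -gt 0 ]; do
    if $status_cmd | all_healthy; then
      return 0
    fi
    tries=$((tries - 1))
    sleep "$delay"
  done
  echo "stack not healthy before timeout" >&2
  return 1
}
```

To use it, wrap the `docker compose ps ... | jq ...` pipeline from Step 2 in a function of your own and pass that function's name as `STATUS_CMD`.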

**Step 3: Set shell variables used throughout the QA flow.**

```bash
export SERVER=https://localhost:8443
export API_KEY="change-me-in-production"
export AUTH="Authorization: Bearer $API_KEY"
export CT="Content-Type: application/json"
export CACERT="--cacert ./deploy/test/certs/ca.crt"
```

Every curl command in QA docs uses these variables. Setting them once avoids typos and keeps the docs copy-pasteable. `$CACERT` holds a flag plus a path and is expanded unquoted, so the commands rely on word splitting: run them in bash or sh (zsh does not split unquoted variables by default), and from the repository root so the relative CA path resolves.

> **Note:** The default Docker Compose sets `CERTCTL_AUTH_TYPE: none` for the demo overlay, meaning auth is disabled. Tests that exercise auth require flipping this to `api-key`; instructions are in the relevant test docs.
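
A minimal sketch of a wrapper that avoids the word-splitting dependency by keeping the CA path in its own variable. `api_get` and `CA_FILE` are hypothetical names introduced for illustration; the defaults repeat the values from Step 3.

```bash
# Assumption: CA_FILE holds only the path, unlike $CACERT above,
# which also embeds the --cacert flag.
SERVER=${SERVER:-https://localhost:8443}
API_KEY=${API_KEY:-change-me-in-production}
CA_FILE=${CA_FILE:-./deploy/test/certs/ca.crt}

# api_get PATH: GET $SERVER$PATH with the bearer token and the pinned CA.
# Every argument is quoted, so it behaves the same in bash, sh, and zsh.
api_get() {
  curl -sS --fail --cacert "$CA_FILE" \
    -H "Authorization: Bearer $API_KEY" \
    "$SERVER$1"
}
```

For example, `api_get /api/v1/stats/summary | jq .` fetches the summary used in the demo data check; `--fail` makes curl exit non-zero on HTTP errors instead of printing the error body.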

**Step 4: Build CLI and MCP server binaries on the host.**

```bash
go build -o certctl-cli ./cmd/cli/...
go build -o certctl-mcp ./cmd/mcp-server/...
```

The CLI and MCP server are separate binaries that talk to the server over HTTP. Building them verifies the code compiles and produces the executables you'll test later. Run both commands from the repository root (Step 1 changed into `deploy/`).

## Demo data baseline

The seed data (`migrations/seed.sql` + `migrations/seed_demo.sql`) pre-populates the database with realistic fixtures. Confirm it loaded:

```bash
curl -s $CACERT -H "$AUTH" $SERVER/api/v1/stats/summary | jq .
```

**Expected shape:**

```json
{
  "total_certificates": 15,
  "active_certificates": ...,
  "expiring_certificates": ...,
  "expired_certificates": ...,
  "pending_renewals": ...
}
```

**Reference IDs in the demo data** (used across QA docs):

| Resource | IDs | Count |
|---|---|---|
| Teams | `t-platform`, `t-security`, `t-payments`, `t-frontend`, `t-data` | 5 |
| Owners | `o-alice`, `o-bob`, `o-carol`, `o-dave`, `o-eve` | 5 |
| Policies | `rp-standard`, `rp-urgent`, `rp-manual` | 3 |
| Issuers | `iss-local`, `iss-acme-le`, `iss-stepca`, `iss-digicert` | 4 |
| Agents | `ag-web-prod`, `ag-web-staging`, `ag-lb-prod`, `ag-iis-prod`, `ag-data-prod` | 5 |
| Targets | `tgt-nginx-prod`, `tgt-nginx-staging`, `tgt-f5-prod`, `tgt-iis-prod`, `tgt-nginx-data` | 5 |
| Profiles | `prof-standard-tls`, `prof-internal-mtls`, `prof-short-lived`, `prof-high-security` | 4 |
| Certificates | `mc-api-prod`, `mc-web-prod`, `mc-pay-prod`, etc. | 15 |
| Agent Groups | `ag-linux-prod`, `ag-linux-amd64`, `ag-windows`, `ag-datacenter-a`, `ag-manual` | 5 |
| Network Scan Targets | `nst-dc1-web`, `nst-dc2-apps`, `nst-dmz` | 3 |

## Once these are green

Move to the appropriate downstream surface:

- [`test-environment.md`](test-environment.md) — full local environment tutorial with real CAs (Pebble, step-ca, etc.)
- [`gui-qa-checklist.md`](gui-qa-checklist.md) — manual GUI test pass
- [`release-sign-off.md`](release-sign-off.md) — release-day checklist
- [`testing-strategy.md`](testing-strategy.md) — what we test in CI vs daily deep-scan vs manual QA
@@ -0,0 +1,93 @@
# Release Sign-Off

> Last reviewed: 2026-05-05

Release-day checklist for tagging a new certctl release. Walks through the gates that must be green before pushing the tag, in the order they should be verified.

## Pre-release: code state

| Gate | How to check | Pass |
|---|---|---|
| `master` is at the commit you intend to tag | `git log -1 --format='%H %s'` | ☐ |
| Working tree clean | `git status -sb` | ☐ |
| Local matches GitHub | `curl -sS https://api.github.com/repos/certctl-io/certctl/commits/master \| grep -oE '"sha": "[a-f0-9]+"' \| head -1` matches local | ☐ |
| `WORKSPACE-CHANGELOG.md` updated with the release's milestones | manual review | ☐ |
| `certctl/CHANGELOG.md` updated (release-facing) | manual review | ☐ |
| Migration ladder ends cleanly | `ls migrations/*.up.sql \| sort \| tail -3` shows the right last migration | ☐ |
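
The "Local matches GitHub" gate can also be checked without the GitHub API. A minimal sketch, assuming the remote named `origin` points at the GitHub repository; `local_matches_remote` is a hypothetical helper, not an existing script in the repo.

```bash
# local_matches_remote [REMOTE]: succeeds when HEAD equals the tip of
# the remote's master branch. REMOTE defaults to origin.
local_matches_remote() {
  local_sha=$(git rev-parse HEAD) || return 2
  # ls-remote prints "<sha><TAB>refs/heads/master"; keep the sha only
  remote_sha=$(git ls-remote "${1:-origin}" refs/heads/master | cut -f1)
  [ -n "$remote_sha" ] && [ "$local_sha" = "$remote_sha" ]
}
```

Run it as `local_matches_remote && echo "in sync"` from the checkout you intend to tag.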

## Pre-release: automated gates (CI)

| Gate | How to check | Pass |
|---|---|---|
| CI pipeline green on the tag-target commit | GitHub Actions web UI | ☐ |
| `make verify` clean locally | run from repo root | ☐ |
| `go test -race -count=1 ./...` clean | full race check | ☐ |
| `golangci-lint run ./...` clean | local lint | ☐ |
| `govulncheck ./...` clean | vulnerability scan | ☐ |
| Coverage thresholds met (service ≥55%, handler ≥60%, domain ≥40%, middleware ≥30%) | `go test -coverprofile=cover.out ./... && go tool cover -func=cover.out` | ☐ |
| Frontend type-check + Vitest + Vite build clean | `cd web && npm run typecheck && npm run test && npm run build` | ☐ |
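
The coverage gate lists per-layer thresholds, while `go tool cover -func` reports per function plus one total. A minimal sketch that checks the per-package summary lines printed by `go test -cover ./...` instead. Mapping layers to the path fragments `/service`, `/handler`, `/domain`, and `/middleware` is an assumption about the package layout, and `check_coverage` is a hypothetical helper.

```bash
# check_coverage: reads `go test -cover ./...` output on stdin and fails
# if any package whose path contains a known layer fragment is below
# that layer's threshold (thresholds taken from the table above).
check_coverage() {
  awk '
    BEGIN {
      min["/service"] = 55; min["/handler"] = 60
      min["/domain"] = 40;  min["/middleware"] = 30
    }
    $1 == "ok" && /coverage: [0-9.]+% of statements/ {
      pct = $0
      sub(/.*coverage: /, "", pct)   # drop everything before the number
      sub(/%.*/, "", pct)            # drop the percent sign and the rest
      for (frag in min) {
        if (index($2, frag) && pct + 0 < min[frag]) {
          printf "FAIL %s %.1f%% < %d%%\n", $2, pct, min[frag]
          failed = 1
        }
      }
    }
    END { exit failed ? 1 : 0 }
  '
}
```

Usage: `go test -cover ./... | check_coverage`. Packages that match no fragment are ignored.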

## Pre-release: manual QA passes

| Surface | Checklist | Pass |
|---|---|---|
| Local stack boots clean from scratch | `qa-prerequisites.md` Steps 1-4 green | ☐ |
| GUI QA checklist | `gui-qa-checklist.md` end to end | ☐ |
| End-to-end test environment | `test-environment.md` Steps 1-14 green | ☐ |
| Performance baselines | `performance-baselines.md` four spot checks within bounds | ☐ |
| Helm chart deploys clean | `helm-deployment.md` install + verify | ☐ |
| ACME server interop (cert-manager) | `make acme-cert-manager-test` green | ☐ |
| ACME server RFC conformance (lego) | `make acme-rfc-conformance-test` green | ☐ |

## Release artefact verification

After the release workflow runs (triggered by tag push), verify the published artefacts:

| Artefact | How to verify | Pass |
|---|---|---|
| Cosign keyless OIDC signature on `checksums.txt` | per `docs/reference/release-verification.md` step 2 | ☐ |
| SLSA Level 3 provenance on each binary | per `docs/reference/release-verification.md` step 3 | ☐ |
| Container image signature + SBOM + provenance | per `docs/reference/release-verification.md` step 4 | ☐ |
| Release notes published on GitHub Releases page | manual review | ☐ |
| ghcr.io images at `ghcr.io/certctl-io/certctl-{server,agent}:<tag>` pullable | `docker pull` round-trips | ☐ |

## Branch protection + tag push

| Gate | How to check | Pass |
|---|---|---|
| `master` branch protection rule allows the tag push | Repository Settings → Branches | ☐ |
| Tag pushed | `git tag -s v<version> -m 'Release v<version>' && git push origin v<version>` | ☐ |
| Release workflow kicked off in GitHub Actions | watch the Actions tab | ☐ |

## Post-release

| Gate | How to check | Pass |
|---|---|---|
| Release workflow completed without errors | GitHub Actions | ☐ |
| Sample binary downloaded and Cosign-verified by an operator who is not the release author | another team member | ☐ |
| `WORKSPACE-CHANGELOG.md` notes the tag commit SHA | manual edit | ☐ |
| `cowork/CLAUDE.md` "Active Focus" → "Current tag" updated | manual edit | ☐ |
| `certctl.io/index.html` star count + `data-gh-version` rendering picks up the new tag | open the landing page in 6+ hours (cache TTL) | ☐ |
| Reddit / Hacker News / LinkedIn announcement drafted (if a major release) | per the operator's promotion playbook | ☐ |

## If a gate fails

Revert the tag push immediately:

```bash
git push --delete origin v<version>
git tag -d v<version>
```

Investigate, fix, re-tag.

## Related docs

- [`docs/contributor/qa-prerequisites.md`](qa-prerequisites.md) — local stack prereqs
- [`docs/contributor/test-environment.md`](test-environment.md) — full local environment tutorial
- [`docs/contributor/gui-qa-checklist.md`](gui-qa-checklist.md) — GUI manual QA pass
- [`docs/contributor/testing-strategy.md`](testing-strategy.md) — what we test in CI vs deep-scan vs manual QA
- [`docs/contributor/ci-pipeline.md`](ci-pipeline.md) — CI shape and regression guards
- [`docs/operator/performance-baselines.md`](../operator/performance-baselines.md) — performance regression spot checks
- [`docs/operator/helm-deployment.md`](../operator/helm-deployment.md) — Helm install + verify
- [`docs/reference/release-verification.md`](../reference/release-verification.md) — Cosign / SLSA / SBOM verification procedure
@@ -0,0 +1,120 @@
# Helm Deployment

> Last reviewed: 2026-05-05

Operator runbook for deploying certctl on Kubernetes via the bundled Helm chart at `deploy/helm/certctl/`.

## Prereqs

- Kubernetes cluster, v1.27+
- `kubectl` configured and authenticated
- `helm` v3.13+
- Storage class for the PostgreSQL StatefulSet PVC
- TLS cert source: either an operator-supplied `kubernetes.io/tls` Secret OR a cert-manager `ClusterIssuer` / `Issuer`. The chart refuses to render without one. See [`tls.md`](tls.md) for the four cert provisioning patterns.

## Install

```bash
helm install certctl deploy/helm/certctl/ \
  --namespace certctl \
  --create-namespace \
  --set server.apiKey=$(openssl rand -hex 32) \
  --set postgres.password=$(openssl rand -hex 32) \
  --set server.tls.existingSecret=certctl-server-tls
```

`server.apiKey` and `postgres.password` should be high-entropy values. The example above generates them inline; production deployments use a secrets manager (Vault, External Secrets Operator, AWS Secrets Manager) instead.
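
For a non-production install, a minimal sketch of keeping the generated secrets in a values file rather than on the command line, so the same values can be passed again later. `write_secret_values` and the file name are hypothetical; the YAML keys mirror the `--set` flags above.

```bash
# write_secret_values FILE: writes random 64-hex-character values for
# server.apiKey and postgres.password to FILE. The umask applies only
# inside the subshell, so a newly created FILE is readable by its owner only.
write_secret_values() {
  ( umask 077
    api_key=$(od -An -N32 -tx1 /dev/urandom | tr -d ' \n')
    pg_pass=$(od -An -N32 -tx1 /dev/urandom | tr -d ' \n')
    printf 'server:\n  apiKey: "%s"\npostgres:\n  password: "%s"\n' \
      "$api_key" "$pg_pass" > "$1" )
}
```

Usage: `write_secret_values ./certctl-secrets.yaml`, then replace the two `--set` secret flags in the install command with `-f ./certctl-secrets.yaml`. Keep the file out of version control.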

## What you get

- **Server Deployment** with a configurable replica count (default 1; HA needs sticky sessions on the ACME server's nonce path)
- **PostgreSQL StatefulSet** with PVC-backed persistence
- **Agent DaemonSet** with one agent per node (set `agent.daemonset.enabled=false` if you don't want the in-cluster agent)
- Health probes (`/health` liveness + `/ready` readiness)
- Security contexts: non-root, read-only root filesystem
- Optional Ingress (off by default; opt in via `ingress.enabled=true`)

## Cert source patterns

### Pattern 1 — operator-supplied Secret (recommended for non-cert-manager shops)

```bash
kubectl create secret tls certctl-server-tls \
  --cert=server.crt --key=server.key \
  --namespace certctl

helm install certctl deploy/helm/certctl/ \
  --namespace certctl \
  --set server.tls.existingSecret=certctl-server-tls
```

### Pattern 2 — cert-manager Certificate CR (recommended for cert-manager shops)

```bash
helm install certctl deploy/helm/certctl/ \
  --namespace certctl \
  --set server.tls.certManager.enabled=true \
  --set server.tls.certManager.issuerRef.name=my-cluster-issuer \
  --set server.tls.certManager.issuerRef.kind=ClusterIssuer
```

### Refuses to render without one of the above

```bash
helm install certctl deploy/helm/certctl/ --namespace certctl
# Error: server.tls.existingSecret OR server.tls.certManager.enabled must be set
```

The render-time guard catches the missing config at `helm install` time, not at pod-crash-loop time.

## Verify the install

```bash
kubectl wait --for=condition=Ready --timeout=3m \
  -n certctl pod -l app.kubernetes.io/name=certctl-server

kubectl port-forward -n certctl svc/certctl-server 8443:8443 &

# Extract the CA bundle from the Secret so curl can verify the server cert
kubectl get secret -n certctl certctl-server-tls -o jsonpath='{.data.ca\.crt}' \
  | base64 -d > /tmp/certctl-ca.crt
curl --cacert /tmp/certctl-ca.crt https://localhost:8443/health
# {"status":"healthy"}
```

If the Secret has no `ca.crt` key (operator-supplied Secrets often don't), use `tls.crt` as the bundle. For a self-signed cert the two files are identical; for a chained cert distribute the root CA bundle separately via ConfigMap.
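
The `ca.crt` to `tls.crt` fallback described above can be scripted. A minimal sketch; `fetch_ca_bundle` is a hypothetical helper, and it only uses the `kubectl get secret ... -o jsonpath` form already shown in the verify block.

```bash
# fetch_ca_bundle NAMESPACE SECRET OUTFILE
# Writes ca.crt from the Secret to OUTFILE, falling back to tls.crt
# when the Secret has no ca.crt key (jsonpath then prints nothing).
fetch_ca_bundle() {
  data=$(kubectl get secret -n "$1" "$2" -o jsonpath='{.data.ca\.crt}')
  if [ -z "$data" ]; then
    data=$(kubectl get secret -n "$1" "$2" -o jsonpath='{.data.tls\.crt}')
  fi
  [ -n "$data" ] || { echo "no ca.crt or tls.crt in secret $2" >&2; return 1; }
  printf '%s' "$data" | base64 -d > "$3"
}
```

Usage: `fetch_ca_bundle certctl certctl-server-tls /tmp/certctl-ca.crt`, followed by the `curl --cacert` health check. As noted above, the `tls.crt` fallback is only a usable trust anchor for a self-signed cert.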

## Upgrade

```bash
helm upgrade certctl deploy/helm/certctl/ \
  --namespace certctl \
  --reuse-values
```

Postgres state survives the upgrade (the PVC is retained). The server / agent images bump per the chart's `image.tag`. See [`docs/archive/upgrades/`](../archive/upgrades/) for version-specific upgrade guidance.

## Configuration reference

Every value is documented at `deploy/helm/certctl/values.yaml`. Common tweaks:

- `server.replicaCount` — replica count (default 1)
- `server.resources.{requests,limits}` — pod resource bounds
- `agent.daemonset.enabled` — toggle the in-cluster agent (default true)
- `postgres.storageSize` — PVC size (default 10Gi)
- `ingress.enabled` + `ingress.host` — opt into Ingress

## Troubleshooting

**Pod crash-loops with TLS error.** Cert + key in the Secret don't pair. For an RSA key, verify with `openssl x509 -modulus -in server.crt -noout | md5` against `openssl rsa -modulus -in server.key -noout | md5` (`md5sum` on Linux); the outputs must match.

**Agent DaemonSet pods can't reach the server.** Service DNS / NetworkPolicy issue. Confirm the agent's `CERTCTL_SERVER_URL` env points at the in-cluster service name (`https://certctl-server.certctl.svc.cluster.local:8443`).

**Postgres won't start.** PVC permissions. Check `kubectl describe pvc -n certctl certctl-postgres` and confirm the storage class supports `fsGroup`.
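
The modulus comparison in the TLS crash-loop entry above only works for RSA keys. A minimal sketch of a check that also covers EC keys by comparing public keys, assuming the `openssl` CLI is on the path; `cert_matches_key` is a hypothetical helper.

```bash
# cert_matches_key CERT KEY: succeeds when the certificate's public key
# equals the public key derived from the private key.
# Returns 2 if either file cannot be parsed.
cert_matches_key() {
  cert_pub=$(openssl x509 -in "$1" -noout -pubkey) || return 2
  key_pub=$(openssl pkey -in "$2" -pubout) || return 2
  [ "$cert_pub" = "$key_pub" ]
}
```

Usage: `cert_matches_key server.crt server.key || echo "cert and key do not pair"`.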

## Related docs

- [`tls.md`](tls.md) — cert provisioning patterns + SIGHUP rotation
- [`security.md`](security.md) — production security posture
- [`runbooks/disaster-recovery.md`](runbooks/disaster-recovery.md) — Postgres restore + recovery procedures
- [`docs/archive/upgrades/`](../archive/upgrades/) — version-specific upgrade procedures
@@ -0,0 +1,106 @@
# Performance Baselines

> Last reviewed: 2026-05-05

Operator-runnable benchmarks for spot-checking certctl performance against published baselines. Useful as a regression detector after upgrades or infra changes.

## Why these specific spots?

certctl's hot paths are dominated by three workloads:

1. **API request handling** — auth, rate-limit decision, route dispatch, DB read
2. **Renewal scheduler** — periodic scan + dispatch
3. **Certificate inventory queries** — large list returns with sparse fields

Baselines #1 to #3 below cover those three; baseline #4 (bulk revoke) adds a bulk write path.

## Baseline #1: API request handling (single endpoint)

Hit a hot read endpoint with a tight loop and compare against the baseline.

```bash
SERVER=https://localhost:8443
CACERT="--cacert ./deploy/test/certs/ca.crt"
AUTH="Authorization: Bearer change-me-in-production"

# Warm the connection pool (5 requests, discard timing)
for i in $(seq 1 5); do
  curl -s $CACERT -H "$AUTH" $SERVER/api/v1/stats/summary > /dev/null
done

# Measured run: 100 requests; mean latency = real time / 100
time (for i in $(seq 1 100); do
  curl -s $CACERT -H "$AUTH" $SERVER/api/v1/stats/summary > /dev/null
done)
```

**Baseline (M3 MacBook Pro, Docker Desktop):** real time under 5 seconds for 100 sequential requests, i.e. a mean of about 50 ms per request.

If you're seeing > 100ms mean, something is wrong. Likely causes: PostgreSQL connection pool exhaustion, an agent flooding the work-poll endpoint, or a mis-tuned rate limiter.
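
The arithmetic behind the baseline, as a minimal sketch: divide the `real` time reported by `time` by the request count, then compare against the 100 ms investigation threshold. Both function names are hypothetical. Because every iteration starts a new curl process, the figure includes process start-up and a fresh TLS handshake per request.

```bash
# mean_latency_ms TOTAL_SECONDS REQUESTS: prints the mean per-request
# latency in whole milliseconds.
mean_latency_ms() {
  awk -v total="$1" -v n="$2" 'BEGIN { printf "%.0f\n", total * 1000 / n }'
}

# latency_verdict MEAN_MS: "ok" up to 100 ms, "investigate" above it.
latency_verdict() {
  if [ "$1" -le 100 ]; then echo ok; else echo investigate; fi
}
```

For the baseline run, `mean_latency_ms 5 100` gives the 50 ms figure quoted above.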

## Baseline #2: Inventory list with cursor pagination

```bash
# Cursor-paginated full inventory walk
NEXT=""
PAGES=0
START=$(date +%s)
while true; do
  RESP=$(curl -s $CACERT -H "$AUTH" "$SERVER/api/v1/certificates?limit=100&cursor=$NEXT")
  NEXT=$(echo "$RESP" | jq -r '.next_cursor // empty')
  PAGES=$((PAGES + 1))
  [ -z "$NEXT" ] && break
done
END=$(date +%s)
echo "Walked $PAGES pages in $((END - START))s"
```

**Baseline:** for the demo dataset (15 certificates, 1 page), under 1 second total. For a 1000-cert inventory (10 pages of 100), under 3 seconds total = ~300ms per page. The script times with `date +%s`, so totals are reported in whole seconds.

If you're seeing > 1s per page on a 1000-cert inventory, the cursor index on `managed_certificates(created_at, id)` is missing or the planner is not using it.

## Baseline #3: Scheduler tick (renewal scan)

The renewal scheduler runs every hour by default. Force a tick and observe the time-to-completion in the logs:

```bash
# Trigger an immediate renewal scan via the admin endpoint
curl -s $CACERT -H "$AUTH" -X POST $SERVER/api/v1/admin/scheduler/run-now/renewal | jq .

# Tail the log and look for the matching `renewal scan complete` line
docker compose logs -f certctl-server | grep 'renewal'
```

**Baseline (15-cert demo dataset):** "renewal scan complete" within 100ms of the trigger.

For a 1000-cert inventory: under 5 seconds. The dominant cost is resolving the profile, policy, and alert channels for each cert, plus the threshold comparison. If you're seeing > 10 seconds, profile resolution is likely doing N+1 queries.

## Baseline #4: Bulk revoke

```bash
# Bulk-revoke all certs from a (test) issuer
TIME=$(date +%s)
curl -s $CACERT -H "$AUTH" -H "$CT" -X POST $SERVER/api/v1/certificates/bulk-revoke \
  -d '{"filter":{"issuer_id":"iss-test"},"reason":"superseded"}' | jq .
echo "Bulk revoke: $(($(date +%s) - TIME))s"
```

**Baseline:** linear in cert count. For 100 certs from one issuer: under 5 seconds. For 1000 certs: under 30 seconds (dominated by per-cert audit row + per-cert CRL refresh).

## When to re-baseline

After any of:

- Postgres major-version upgrade
- Go major-version upgrade
- Significant migration (add a column to `managed_certificates`, add an index)
- Connection pool config change
- Changing the renewal scheduler interval

Capture timing in `cowork/loadtest-baselines/<date>.md` so future regressions surface against a real baseline rather than the operator's gut feeling.

## Related docs

- [`docs/contributor/ci-pipeline.md`](../contributor/ci-pipeline.md) — CI guard for performance regression
- [`docs/operator/security.md`](security.md) — rate limit tuning
- [`docs/reference/architecture.md`](../reference/architecture.md) — request path through handler → service → repository
@@ -0,0 +1,148 @@
# certctl CLI

> Last reviewed: 2026-05-05

`certctl-cli` is the command-line interface to certctl. It wraps the REST API as terminal commands so operators and CI/CD pipelines can drive certctl without writing curl invocations.

## Install

```bash
go install github.com/certctl-io/certctl/cmd/cli@latest
```

The binary lands at `$GOBIN/cli` (or `$GOPATH/bin/cli` if `GOBIN` is unset; `GOPATH` defaults to `$HOME/go`). Rename it to `certctl-cli` if you prefer; the examples below assume that name.

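If you script the rename, the install directory can be resolved the same way `go install` does. A minimal sketch that reads the environment variables only; `go_install_dir` is an illustrative name, and `go env GOBIN` / `go env GOPATH` remain the authoritative source if you have set values with `go env -w`.

```bash
# Print the directory `go install` writes binaries to:
# $GOBIN if set, else bin/ under the first GOPATH entry, else $HOME/go/bin.
go_install_dir() {
  if [ -n "${GOBIN:-}" ]; then
    echo "$GOBIN"
  else
    gopath=${GOPATH:-$HOME/go}
    echo "${gopath%%:*}/bin"   # GOPATH may be a colon-separated list
  fi
}

# Example (not run here): rename the installed binary.
# mv "$(go_install_dir)/cli" "$(go_install_dir)/certctl-cli"
```
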
## Configure

The CLI reads three environment variables:

```bash
export CERTCTL_SERVER_URL=https://localhost:8443
export CERTCTL_API_KEY=your-api-key
export CERTCTL_SERVER_CA_BUNDLE_PATH=/path/to/ca.crt
```

Or pass them per-invocation:

```bash
certctl-cli --server https://localhost:8443 --api-key your-key --ca-bundle ca.crt certs list
```

For local development against a self-signed bootstrap cert, `--insecure` skips TLS verification. **Never set this in production.**

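In a CI job it can be useful to fail fast when the environment is incomplete, before the first CLI call. A minimal sketch: `certctl_env_check` is a hypothetical helper, and treating all three variables as mandatory is an assumption (a deployment with a publicly trusted server cert may not need the CA bundle).

```bash
# Hypothetical preflight: fail early if the CLI's environment is incomplete.
certctl_env_check() {
  rc=0
  for name in CERTCTL_SERVER_URL CERTCTL_API_KEY CERTCTL_SERVER_CA_BUNDLE_PATH; do
    eval "value=\${$name:-}"          # indirect lookup of the variable named in $name
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      rc=1
    fi
  done
  # If a CA bundle path is set, it must point at a readable file.
  if [ -n "${CERTCTL_SERVER_CA_BUNDLE_PATH:-}" ] && [ ! -r "$CERTCTL_SERVER_CA_BUNDLE_PATH" ]; then
    echo "unreadable CA bundle: $CERTCTL_SERVER_CA_BUNDLE_PATH" >&2
    rc=1
  fi
  return "$rc"
}

# Example (not run here):
# certctl_env_check || exit 1
```
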
## Command groups

The CLI is organized by resource:

```
certctl-cli certs    [list|get|renew|revoke]
certctl-cli agents   [list|get]
certctl-cli jobs     [list|get|cancel]
certctl-cli import   # bulk PEM import
certctl-cli est      [enroll|reenroll]
certctl-cli status   # server health + summary stats
certctl-cli version  # CLI + server version
```

## Common workflows

### List + filter certificates

```bash
# All certs
certctl-cli certs list

# Filter by environment
certctl-cli certs list --env production

# JSON output (default is table)
certctl-cli certs list --format json

# Sort + paginate
certctl-cli certs list --sort -expires_at --limit 50

# Time-range filter (RFC 3339)
certctl-cli certs list --expires-before 2026-06-01T00:00:00Z

# Sparse fields: only return the columns you need
certctl-cli certs list --fields id,common_name,expires_at,status
```

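The time-range filter takes an absolute RFC 3339 timestamp, so a relative window such as "expiring in the next 30 days" has to be computed first. A minimal sketch, assuming GNU `date` with a fallback to the BSD/macOS syntax; `rfc3339_in_days` is an illustrative helper name.

```bash
# Print an RFC 3339 UTC timestamp N days from now.
# Tries GNU date (-d) first, then BSD/macOS date (-v).
rfc3339_in_days() {
  days=$1
  date -u -d "+${days} days" +%Y-%m-%dT%H:%M:%SZ 2>/dev/null ||
    date -u -v "+${days}d" +%Y-%m-%dT%H:%M:%SZ
}

# Example (not run here): certificates expiring within the next 30 days.
# certctl-cli certs list --expires-before "$(rfc3339_in_days 30)"
```
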
### Trigger renewal

```bash
certctl-cli certs renew mc-api-prod
# Returns the job id; track with: certctl-cli jobs get <job-id>
```

### Revoke

```bash
# Single revoke
certctl-cli certs revoke mc-api-prod --reason keyCompromise

# Bulk revoke by filter
certctl-cli certs revoke --profile prof-deprecated --reason superseded
certctl-cli certs revoke --team t-payments --reason cessationOfOperation
certctl-cli certs revoke --issuer iss-old-vault --reason cACompromise
```

Reason codes are the canonical RFC 5280 §5.3.1 set: `unspecified`, `keyCompromise`, `cACompromise`, `affiliationChanged`, `superseded`, `cessationOfOperation`, `certificateHold`, `removeFromCRL`, `privilegeWithdrawn`, `aACompromise`. Anything else returns an error.

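A wrapper script can check the reason code locally before a bulk revoke ever reaches the server. A minimal sketch over the list above; `is_valid_revocation_reason` is an illustrative name, and the exact-case match is an assumption about how strictly the CLI compares values.

```bash
# Return 0 if the argument is one of the reason codes listed above, else 1.
is_valid_revocation_reason() {
  case "$1" in
    unspecified|keyCompromise|cACompromise|affiliationChanged|superseded)
      return 0 ;;
    cessationOfOperation|certificateHold|removeFromCRL|privilegeWithdrawn|aACompromise)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# Example (not run here):
# is_valid_revocation_reason "$REASON" || { echo "bad reason: $REASON" >&2; exit 1; }
```
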
### Bulk import

```bash
# Import a directory of PEMs
certctl-cli import /etc/letsencrypt/live/

# Import a single concatenated bundle
certctl-cli import certs.pem
```

Each cert lands in the inventory as `Unmanaged` (per the discovery model). Triage from the dashboard or via `certctl-cli certs claim <id>` once you've decided to actively manage it.

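Before importing a concatenated bundle, it can be handy to know how many certificates it holds so you can compare against what shows up in the inventory afterwards. A minimal sketch using only `grep`; `count_pem_certs` is an illustrative helper, not a CLI feature.

```bash
# Count certificate blocks in a PEM file (a bundle may hold several).
count_pem_certs() {
  grep -c -e '-----BEGIN CERTIFICATE-----' "$1"
}

# Example (not run here):
# count_pem_certs certs.pem
```
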
### EST enrollment

```bash
# Enroll a new device cert via EST simpleenroll
certctl-cli est enroll --csr device.csr --output device.crt

# Re-enroll (renew) an existing device cert
certctl-cli est reenroll --csr device.csr --client-cert device.crt --client-key device.key
```

### Server status

```bash
certctl-cli status
# Health: ok
# Total certificates: 145
# Expiring (30d): 12
# Active jobs: 3
# Pending renewals: 8
```

## Output formats

- `--format table` (default): human-readable terminal output
- `--format json`: JSON for piping into `jq`, scripts, dashboards

The CLI is built with Go's standard library only, with no external dependencies. The binary is small (~10MB) and statically linked.

## Wiring into CI/CD

Common pattern: a CI step that renews a cert (reissue from your internal CA plus deploy via certctl), waits for the job to finish, and verifies the result:

```bash
# Trigger the renewal once and capture the job id
JOB_ID=$(certctl-cli certs renew mc-api-prod --format json | jq -r '.job_id')
# Block until the job reaches a terminal state
certctl-cli jobs get "$JOB_ID" --wait
# Confirm the new expiry
certctl-cli certs get mc-api-prod --format json | jq -r '.expires_at'
```

The `--wait` flag blocks until the job reaches a terminal state (Completed / Failed / Cancelled), which is what CI scripts actually need. It is also accepted on `certs renew` itself (`certctl-cli certs renew mc-api-prod --wait`), which collapses the first two steps into one command.

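If a pipeline should gate on the job's final state explicitly, the three terminal states above map naturally onto exit codes. A minimal sketch: `job_state_exit_code` is a hypothetical helper, and how the state string is extracted from the CLI's JSON output is left out because the field name is not documented here.

```bash
# Map a job state to a CI exit code.
#   0 = Completed, 1 = Failed or Cancelled, 2 = anything else (not terminal or unknown)
job_state_exit_code() {
  case "$1" in
    Completed)        return 0 ;;
    Failed|Cancelled) return 1 ;;
    *)                return 2 ;;
  esac
}

# Example (not run here), with STATE obtained from the job's JSON output:
# job_state_exit_code "$STATE" || exit $?
```
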
## Related docs

- [`docs/reference/api.md`](api.md): the OpenAPI 3.1 spec the CLI wraps
- [`docs/reference/mcp.md`](mcp.md): the MCP server that exposes the same surface to AI assistants
- [`docs/contributor/qa-prerequisites.md`](../contributor/qa-prerequisites.md): local environment setup before the CLI can talk to a server