certctl

mirror of https://github.com/shankar0123/certctl.git synced 2026-06-07 14:11:31 +00:00

Author	SHA1	Message	Date
shankar0123	1daae5d709	docs(readme): fix demo path command — point at deploy/demo-up.sh wrapper Operator reproduction (verbatim log captured 2026-05-14): $ docker compose -f deploy/docker-compose.yml -f deploy/docker-compose.demo.yml up -d --build ... build succeeds, containers come up ... dependency failed to start: container certctl-server is unhealthy $ docker compose ... logs certctl-server \| tail -1 certctl-server \| Failed to load configuration: phase-2 SEC-H3 fail-closed guard (missing TS): CERTCTL_DEMO_MODE_ACK=true requires CERTCTL_DEMO_MODE_ACK_TS=<unix-epoch> set within the last 24h — refuse to start. Root cause ========== README.md L95 documented a bare `docker compose ... up` command that ignores the Phase 2 SEC-H3 fail-closed guard added in internal/config/config.go::Validate (commit 2026-05-13). The guard pairs CERTCTL_DEMO_MODE_ACK=true with a required CERTCTL_DEMO_MODE_ACK_TS=<unix-epoch> that must be within the last 24h, so a forgotten demo deploy doesn't accidentally end up serving production traffic with auth-type=none. The demo overlay (deploy/docker-compose.demo.yml) passes the timestamp through from the shell via `CERTCTL_DEMO_MODE_ACK_TS: "${CERTCTL_DEMO_MODE_ACK_TS:-}"`. The README command never exported it, so the server saw an empty value, the guard refused to boot, the healthcheck never passed, and the dependent certctl-agent container refused to start. The deploy/demo-up.sh wrapper (which already exists; it's used by CI cold-DB smoke and was added in the same SEC-H3 commit chain) mints `CERTCTL_DEMO_MODE_ACK_TS="$(date +%s)"` before exec'ing `docker compose` with the same -f flags. Drop-in replacement for the bare compose invocation. Fix === README.md "Demo path" code block now points at the wrapper script: ./deploy/demo-up.sh -d --build Plus a one-paragraph explanation of why the wrapper is the supported entry point and what the SEC-H3 timestamp gate is defending against. The bare `docker compose ... 
up` form is documented as failing-closed so a future operator who tries it understands the error message they see. Affected paths ============== - README.md (the Quick Start "Demo path" block; lines 92-100 before, 93-103 after this change) Out of scope (tracked separately if needed) ============================================ - The `WARN[0000] ... defaulting to a blank string` lines on docker compose stdout (POSTGRES_PASSWORD, CERTCTL_API_KEY, etc.) are red herrings — they fire on the BASE compose's env interpolation but the demo overlay immediately overrides those with hardcoded demo-safe values. They're noise; not a footgun. Leaving them alone — silencing the WARN would require either an .env shim or setting empty defaults at the base layer, both of which are worse than the current warn-but-correct behaviour. - The bare `docker compose -f base.yml up` production path (README L108) is unchanged. That path requires a real .env and will fail closed on placeholders — which is the correct behaviour. The README already documents .env setup for that path.	2026-05-14 15:01:38 +00:00
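The SEC-H3 gate described in the commit above can be sketched as follows. This is a minimal illustration, not the code in internal/config/config.go::Validate: the function name `checkDemoAck`, its signature, and the rejection of future timestamps are assumptions made for the example. Only the pairing of CERTCTL_DEMO_MODE_ACK with a unix-epoch CERTCTL_DEMO_MODE_ACK_TS inside a 24h window comes from the commit message.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// maxAckAgeSeconds encodes the "within the last 24h" window from the commit message.
const maxAckAgeSeconds = 24 * 60 * 60

// checkDemoAck is a hypothetical stand-in for the guard described above.
// ack models CERTCTL_DEMO_MODE_ACK=true, ts models CERTCTL_DEMO_MODE_ACK_TS,
// and nowUnix is the current time as a unix epoch (injected so it can be tested).
func checkDemoAck(ack bool, ts string, nowUnix int64) error {
	if !ack {
		return nil // demo mode not acknowledged, so this guard does not apply
	}
	if ts == "" {
		return errors.New("CERTCTL_DEMO_MODE_ACK_TS missing: refuse to start")
	}
	sec, err := strconv.ParseInt(ts, 10, 64)
	if err != nil {
		return fmt.Errorf("CERTCTL_DEMO_MODE_ACK_TS is not a unix epoch: %w", err)
	}
	age := nowUnix - sec
	// Assumption: timestamps from the future are refused as well as stale ones.
	if age < 0 || age > maxAckAgeSeconds {
		return errors.New("CERTCTL_DEMO_MODE_ACK_TS outside the 24h window: refuse to start")
	}
	return nil
}

func main() {
	const now = int64(1778770898) // arbitrary fixed "current time" for the demo
	fmt.Println(checkDemoAck(true, "", now))           // empty value, as the bare compose command produced
	fmt.Println(checkDemoAck(true, "1778770000", now)) // minted a few minutes earlier
	fmt.Println(checkDemoAck(true, "1778000000", now)) // minted days earlier
}
```

Under these assumptions the first and third calls return an error and the second returns nil, which matches the behaviour the wrapper relies on when it mints the timestamp immediately before starting the stack.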
shankar0123	de8fac24a3	docs(readme): fix quickstart $EDITOR portability bug The production-path quickstart at README.md:103-108 used `$EDITOR deploy/.env` literally, which assumes the operator has $EDITOR exported in their shell. On a fresh macOS / zsh session (default install, nothing in .zshrc), $EDITOR is unset, so the unquoted expansion vanishes and the command line collapses to `deploy/.env`, which zsh tries to execute as a binary: shankar@macbookpro certctl % $EDITOR deploy/.env zsh: permission denied: deploy/.env The escalation reflex makes it worse: `sudo $EDITOR deploy/.env` expands to `sudo deploy/.env` (the calling shell expands the unset variable before sudo runs), which sudo then tries to run as a command: sudo: deploy/.env: command not found Net: a new-user quickstart that fails on the second command of the production path with two opaque errors back-to-back. Replace with the POSIX-portable default-fallback form: "${EDITOR:-nano}" deploy/.env `nano` is pre-installed on macOS (BSD nano) and every mainstream Linux distro, so the fallback always resolves. The user's preferred editor (vim/emacs/code) is still honored if they have $EDITOR set. Added a parenthetical reminder so the operator who has a strong editor preference knows they can substitute. Verified no other phantom-EDITOR sites in README / docs/getting-started / docs/operator via: grep -nE '\$EDITOR\b' README.md docs/getting-started/*.md docs/operator/*.md	2026-05-13 04:09:39 +00:00
shankar0123	0161bb201c	docs: remove internal engineering docs; docs must be tool- or story-relevant Operator policy: docs in the public repo must help (a) a user deploying certctl or (b) the product story. Internal engineering process documentation belongs in cowork/ scratchpads or in git commit history, not docs/. Removed (docs/contributor/, 8 files, 2,323 lines): - release-sign-off.md — internal release-day checklist - ci-pipeline.md — what runs in CI (internal) - ci-guards.md — what the guards are (internal) - testing-strategy.md — internal testing strategy - qa-test-suite.md — internal QA reference (445 lines) - qa-prerequisites.md — internal QA setup - gui-qa-checklist.md — manual GUI QA checklist - test-environment.md — 1,103-line redundant with docs/getting-started/quickstart.md + docs/getting-started/advanced-demo.md Removed supporting script: - scripts/qa-doc-seed-count.sh — CI guard for the deleted qa-test-suite.md seed-data table Cross-reference cleanup: - README.md: dropped the Contributor audience row + footer pointer to docs/contributor/. - Makefile: dropped `verify-docs` target + qa-stats comment refs. - .github/workflows/ci.yml: dropped the QA-doc seed-count drift CI step + dead comment refs. - docs/reference/cli.md: repointed qa-prerequisites.md → quickstart.md. - docs/operator/performance-baselines.md: dropped ci-pipeline.md cross-ref. - scripts/ci-guards/README.md: dropped the 'Guards explicitly NOT here' section that referenced the deleted QA-doc guards. G-3 env-docs-drift guard improvements (a real consequence: deleting the contributor docs surfaced that some env vars only had a home there). Refit the guard to the new doc topology: - Defined-scan widened from `config.go + cmd/` to all of `cmd/ + internal/` (production code), excluding `_test.go` — catches service-layer env vars like CERTCTL_STEPCA_ROOT_CERT and CERTCTL_ZEROSSL_EAB_URL that were previously invisible to the guard. 
- Docs-scan widened to include deploy/ENVIRONMENTS.md (the canonical env-var inventory table, which should have been in scope from day one). Kept narrow to README + docs/ + deploy/helm/ + ENVIRONMENTS.md to avoid pulling in compose/test fixtures. - ALLOWED filter now applies to both DOCS_ONLY and CONFIG_ONLY directions, so dynamic per-profile dispatch surfaces (CERTCTL_SCEP_PROFILE_<NAME>_*, CERTCTL_EST_PROFILE_<NAME>_*, CERTCTL_QA_*) don't need static doc entries. - Added CERTCTL_SCEP_PROFILE_[A-Z_]+ and CERTCTL_EST_PROFILE_[A-Z_]+ to ALLOWED for the same reason. deploy/ENVIRONMENTS.md: added CERTCTL_ZEROSSL_EAB_URL row, a real operator override (overrides the ZeroSSL EAB-credentials endpoint; read at internal/connector/issuer/acme/acme.go:372) that was defined in Go source but never documented. G-3 caught it after the defined-scan widened. scripts/ci-guards/S-1-hardcoded-source-counts.sh: removed dead WORKSPACE-CHANGELOG.md allowlist entry (the file was deleted in the prior workspace cleanup). Verified: All 35 scripts/ci-guards/*.sh green (FAIL=0). No remaining references to docs/contributor/ or qa-doc-seed-count in tracked files.	2026-05-13 02:44:27 +00:00
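The two-directional drift check described in the commit above can be illustrated with a small set difference. This is a sketch only: the real G-3 guard is a shell script whose internals are not shown here, and the helper names `isAllowed` and `onlyIn`, the anchoring of the patterns, and the sample variables CERTCTL_SCEP_PROFILE_CORP_URL and CERTCTL_EST_PROFILE_LAB_URL are hypothetical. The two ALLOWED patterns and the DOCS_ONLY / CONFIG_ONLY direction names are taken from the commit message.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

// allowed holds the two dynamic per-profile patterns quoted in the commit message.
// Anchoring them with ^...$ is an assumption of this sketch.
var allowed = []*regexp.Regexp{
	regexp.MustCompile(`^CERTCTL_SCEP_PROFILE_[A-Z_]+$`),
	regexp.MustCompile(`^CERTCTL_EST_PROFILE_[A-Z_]+$`),
}

// isAllowed reports whether a variable name matches any allow-listed pattern.
func isAllowed(name string) bool {
	for _, re := range allowed {
		if re.MatchString(name) {
			return true
		}
	}
	return false
}

// onlyIn returns the sorted names present in a but absent from b,
// skipping allow-listed names. Calling it both ways gives both drift directions.
func onlyIn(a, b []string) []string {
	seen := make(map[string]bool, len(b))
	for _, n := range b {
		seen[n] = true
	}
	var out []string
	for _, n := range a {
		if !seen[n] && !isAllowed(n) {
			out = append(out, n)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	// Hypothetical scan results: names found in code vs. names found in docs.
	defined := []string{"CERTCTL_DATABASE_MAX_CONNS", "CERTCTL_ZEROSSL_EAB_URL", "CERTCTL_SCEP_PROFILE_CORP_URL"}
	documented := []string{"CERTCTL_DATABASE_MAX_CONNS", "CERTCTL_EST_PROFILE_LAB_URL"}

	fmt.Println("CONFIG_ONLY:", onlyIn(defined, documented))
	fmt.Println("DOCS_ONLY:", onlyIn(documented, defined))
}
```

With this sample input only CERTCTL_ZEROSSL_EAB_URL is reported, in the CONFIG_ONLY direction, which mirrors the undocumented variable the widened scan surfaced. Applying the allow-list inside `onlyIn` is what makes the filter symmetric.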
shankar0123	47da13e7a1	fix(helm): close BUNDLE 3 — Helm chart hardening + enterprise deploy Bundle 3 closure (2026-05-12 acquisition diligence audit). Closes the "chart claims production-ready but lying-fields silently break it" hazard cluster: README install command had wrong key, required secrets weren't fail-fast, external Postgres rendered the bundled StatefulSet hostname, container-only security hardening fields landed at pod scope (silently dropped by K8s API), and three advertised template surfaces (ServiceMonitor, PodDisruptionBudget, NetworkPolicy) didn't render at all even when their values.yaml toggles were on. Source findings closed: C2 C3 D1 D2 D3 D5 D7 D11 D12 (repo audit) OPS-L1 OPS-L2 (cowork audit) Source findings explicitly deferred (tracked in WORKSPACE-ROADMAP.md): D6 OPS-H1 (backup automation — operator must choose target storage) D10 (digest pinning of latest `:latest` tags) OPS-M1 (prometheus/client_golang migration) OPS-M2 (distributed tracing instrumentation) Chart truth table (rendered with helm 3.16.3): -f values.yaml + tls.existingSecret + auth.apiKey + pg.auth.password → 12 resources (default mode, no monitoring/PDB/networkpolicy) + postgresql.enabled=false + externalDatabase.url=… → NO StatefulSet, NO postgres-secret, NO postgres-service (D2) + server.tls.certManager.enabled=true → +1 Certificate (cert-manager mode) + replicas=3 + monitoring.enabled=true + serviceMonitor.enabled=true + podDisruptionBudget.enabled=true + networkPolicy.enabled=true → +1 ServiceMonitor + 1 PodDisruptionBudget + 1 NetworkPolicy (D5+D11) tls.existingSecret AND tls.certManager.enabled both set → REFUSED with "EXACTLY ONE TLS ownership path" error (D7) Missing required secrets (apiKey / pg password / external URL) → REFUSED at template time with operator-actionable guidance (D1) Closures by source ID: C2 — README Helm install example fixed. Was `--set postgresql.password=…` (does not exist); now `--set postgresql.auth.password=…` matching the chart key. 
README install block also wires TLS, mentions fail-fast at template time, and links the external-Postgres example. C3 — Kubernetes Secrets connector annotated PREVIEW in values.yaml. The chart still exposes `kubernetesSecrets.enabled` for the RBAC preview wiring, but the values block now states clearly that the production K8s client at internal/connector/target/k8ssecret/ k8ssecret.go::realK8sClient is a stub (verified — go.mod imports zero k8s.io/client-go packages). Production landing tracked in WORKSPACE-ROADMAP.md. D1 — `certctl.requiredSecrets` template helper. Fail-fasts at render time when (a) server.auth.type=api-key + apiKey empty, (b) postgresql.enabled=true + pg.auth.password empty, (c) postgresql.enabled=false + externalDatabase.url + legacy env CERTCTL_DATABASE_URL all empty. Each branch emits an operator-actionable diagnostic with the openssl rand command or values override needed. postgres-secret template additionally uses Helm's `required` builtin so it can't render with the empty fallback that pre-Bundle-3 produced ("changeme" literal). D2 — externalDatabase.url first-class. New top-level values block. certctl.databaseURL helper now branches on postgresql.enabled: bundled path uses the helper-emitted in-cluster URL; external path uses externalDatabase.url verbatim. postgres-secret, postgres-statefulset, and postgres-service ALL gate on postgresql.enabled — external mode renders ZERO postgres-* resources. POSTGRES_PASSWORD env in server-deployment also gates. D3 — Container-vs-pod security context split. K8s API silently drops readOnlyRootFilesystem / allowPrivilegeEscalation / capabilities / privileged when they land at pod scope (`spec.securityContext`); they only work at container scope (`spec.containers[].securityContext`). Pre-Bundle-3 all fields sat at pod scope so the chart's documented "read-only rootfs + drop-all caps" hardening was effectively unenforced. 
New certctl.podSecurityContext + containerSecurityContext helpers split the operator-facing securityContext map by field-name whitelist so existing values keep working byte-for-byte while fields render at the K8s-valid scope. Applied to both server-deployment.yaml and agent-daemonset.yaml (DaemonSet + Deployment branches). D5 — Prometheus ServiceMonitor template. New templates/servicemonitor.yaml. Renders when monitoring.enabled AND monitoring.serviceMonitor.enabled. Scrapes /api/v1/metrics/prometheus (rbac-gated on metrics.read — needs bearerTokenSecret with an API key holding that perm). values.yaml block extended with bearerTokenSecret, tlsConfig, and relabelings knobs and the operator-facing comment documenting the auth requirement. D7 — TLS both-set rejection. certctl.tls.required helper extended. Pre-Bundle-3 only the NEITHER-set case was caught; setting BOTH rendered a dangling cert-manager Certificate alongside an existing-Secret mount, two conflicting TLS sources of truth. Now refuses with "EXACTLY ONE TLS ownership path" + remediation steps for both possible operator intents. D11 — PodDisruptionBudget + NetworkPolicy templates. New templates/pdb.yaml (renders when podDisruptionBudget.enabled + server.replicas > 1) + templates/networkpolicy.yaml (renders when networkPolicy.enabled). PDB uses minAvailable / maxUnavailable exclusivity per K8s spec. NetworkPolicy default-allows in-namespace agent → server traffic, kube-DNS egress, and bundled-postgres egress (when postgresql.enabled), with operator-extensible extraIngress / extraEgress for CA / OIDC / SMTP egress. Both default off so existing deploys don't lose network reach unannounced. D12 — Database max-conn config wired. Pre-Bundle-3 internal/repository/postgres/db.go::NewDB hard-coded SetMaxOpenConns(25). config.go loaded CERTCTL_DATABASE_MAX_CONNS, Validate() enforced the >= 1 floor, values.yaml documented it, and docs/reference/configuration.md surfaced it — but the pool ignored every operator setting. 
New NewDBWithMaxConns threads the operator value into the pool with maxIdle = maxOpen / 5 (≥ 1) so the historical ratio carries forward. cmd/server/main.go calls the new constructor; NewDB stays for compat at the default 25. OPS-L1 — Chart version 0.1.0 → 1.0.0. Chart has shipped through 8 audit closures since 2026-02 (M-018, U-1, U-2, U-3, H-1, G-1, B1, B2); pre-1.0 version was implying instability the chart no longer has. OPS-L2 — External-Postgres path is now properly documented in values.yaml (externalDatabase block with mode-2 example), README install command links the existing examples/values-external-db.yaml, and the chart truth table above proves the external mode renders cleanly. Receipts: helm lint deploy/helm/certctl/ # clean helm template c deploy/helm/certctl/ \ --set server.tls.existingSecret=ci \ --set postgresql.auth.password=p \ --set server.auth.apiKey=k # 12 kinds, default helm template c deploy/helm/certctl/ \ --set server.tls.existingSecret=ci \ --set postgresql.enabled=false \ --set externalDatabase.url='postgres://u:p@h:5432/db?sslmode=require' \ --set server.auth.apiKey=k # 9 kinds, no postgres-* helm template c deploy/helm/certctl/ \ --set server.tls.certManager.enabled=true \ --set server.tls.certManager.issuerRef.name=letsencrypt \ --set postgresql.auth.password=p --set server.auth.apiKey=k # +1 Certificate (cert-manager) helm template c deploy/helm/certctl/ \ --set server.tls.existingSecret=ci \ --set postgresql.auth.password=p --set server.auth.apiKey=k \ --set server.replicas=3 \ --set monitoring.enabled=true \ --set monitoring.serviceMonitor.enabled=true \ --set podDisruptionBudget.enabled=true \ --set networkPolicy.enabled=true # +ServiceMonitor +PDB +NetworkPolicy (TLS both-set + missing apiKey + missing pg password + missing extDb URL all REFUSED.) 
gofmt -l # clean go vet ./internal/repository/postgres ./cmd/server # clean go build ./cmd/server # clean bash scripts/ci-guards/B3-helm-chart-coherence.sh # clean Remaining operator warnings (deferred, tracked in WORKSPACE-ROADMAP.md): - Backup CronJob + restore script (D6 + OPS-H1): operator chooses target (S3, GCS, Azure Blob, NFS). Sample CronJob yaml may ship in deploy/helm/examples/ once an operator workstation has run one full backup-restore cycle. - Distributed tracing (OPS-M2): otel/* are go.mod indirect deps, not actively instrumented. Adding spans is a v3 work item. - Prometheus client_golang migration (OPS-M1): the hand-rolled /metrics/prometheus exposition format works today; client_golang migration unlocks histograms + exemplars + native label sets. Audit-Closes: BUNDLE-3 C2 C3 D1 D2 D3 D5 D7 D11 D12 OPS-L1 OPS-L2 Audit-Defers: D6 D10 OPS-H1 OPS-M1 OPS-M2	2026-05-13 00:40:42 +00:00
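The pool sizing rule from finding D12 (maxIdle = maxOpen / 5, floored at 1) is small enough to show in isolation. The helper name `idleConnsFor` is hypothetical and this is not the body of NewDBWithMaxConns; it only illustrates the arithmetic the commit message describes.

```go
package main

import "fmt"

// idleConnsFor derives the idle-connection cap from the operator-supplied
// open-connection cap using the ratio quoted in the commit: maxOpen / 5, at least 1.
// In a real pool the two values would typically be passed to
// (*sql.DB).SetMaxOpenConns and (*sql.DB).SetMaxIdleConns from database/sql.
func idleConnsFor(maxOpen int) int {
	idle := maxOpen / 5
	if idle < 1 {
		idle = 1
	}
	return idle
}

func main() {
	for _, maxOpen := range []int{1, 4, 25, 100} {
		fmt.Printf("maxOpen=%d maxIdle=%d\n", maxOpen, idleConnsFor(maxOpen))
	}
	// Prints:
	// maxOpen=1 maxIdle=1
	// maxOpen=4 maxIdle=1
	// maxOpen=25 maxIdle=5
	// maxOpen=100 maxIdle=20
}
```

The historical default of 25 open connections therefore keeps its 5 idle connections, while very small operator values still leave one idle connection available.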
shankar0123	a849c8b8cf	fix(security): close BUNDLE 2 - safe first run, demo mode, agent bootstrap Bundle 2 closure (2026-05-12 acquisition diligence audit). Closes the "docker compose up == accidental production" hazard: pre-Bundle-2 the base deploy/docker-compose.yml WAS the demo path (AUTH_TYPE=none + DEMO_MODE_ACK=true + KEYGEN_MODE=server + DEMO_SEED=true + literal change-me-... placeholder creds), the README claimed "drop the demo overlay for a clean install", and the ENVIRONMENTS.md table documented the auth-type default as api-key: three contradictory stories layered on the same compose file. Source findings closed: R2 R3 C1 D9 finding-2 S9 (repo audit) SEC-H2 SEC-M1 SEC-M3 OPS-M3 LOW-5 HIGH-6 (cowork audit) Compose split (deploy/docker-compose.yml + deploy/docker-compose.demo.yml): The base now ships production-shaped: no AUTH_TYPE override, no KEYGEN_MODE override, no DEMO_MODE_ACK, no DEMO_SEED, no literal placeholder fallbacks. POSTGRES_PASSWORD / CERTCTL_AUTH_SECRET / CERTCTL_CONFIG_ENCRYPTION_KEY / CERTCTL_API_KEY / CERTCTL_AGENT_ID must come from deploy/.env (sample template in deploy/.env.example + root .env.example). The demo overlay carries the full demo posture (every env var + every placeholder credential) so the `-f docker-compose.demo.yml` one-flag flip remains a zero-config populated-dashboard path. Fail-closed startup guards (internal/config/config.go::Validate): Three new gates layered on the existing HIGH-12 demo-mode listen-bind guard. All three exempt CERTCTL_DEMO_MODE_ACK=true so the demo overlay keeps working: • HIGH-6: AUTH_SECRET = "change-me-in-production" → refuse • HIGH-6: CONFIG_ENCRYPTION_KEY = "change-me-32-char..." → refuse • LOW-5: CORS_ORIGINS contains "*" (CWE-942 + CWE-352) → refuse Visible DEMO MODE banner (cmd/server/main.go): every boot under DEMO_MODE_ACK=true now emits a prominent WARN line with a 6-step production-promotion checklist. 
The 2026-04-19 incident (a screenshot run that kept running for three days) drove this; the per-startup banner makes the posture unmissable in any log scraper. Agent enrollment doc alignment: • docs/reference/configuration.md L83: corrected the non-existent URL `POST /api/v1/agents/register` to the real route `POST /api/v1/agents`; added the bootstrap-token note and the install-agent.sh handoff sequence. • docs/reference/architecture.md L154: replaced "agents register themselves at first heartbeat" (false: cmd/agent/main.go fail-fasts when CERTCTL_AGENT_ID is unset) with the actual two-step operator-driven flow (REST or GUI registration first, returned ID fed to install-agent.sh second). Tests + CI guard: • 9 new TestValidate_Bundle2_* cases in internal/config/config_test.go covering: placeholder-secret refused + demo-ack exempt; placeholder encryption-key refused + demo-ack exempt; real key not mistaken for placeholder; wildcard CORS refused + demo-ack exempt; wildcard mixed into a concrete allowlist still refused; concrete allowlist accepted. • scripts/ci-guards/B2-compose-base-no-demo-env.sh: greps the base compose for any of the demo-mode env vars + placeholder credentials. Comments stripped before checking so the narrative header in the base file can still reference the overlay's posture in prose. Cold-DB CI smoke (.github/workflows/ci.yml::cold-db-compose-smoke): Switched to layering -f docker-compose.demo.yml on top of the base, because the new production base requires real env vars the smoke doesn't have, and the smoke's purpose (catch migration-on-cold-DB regressions + the bootstrap-token mint path) is orthogonal to which auth posture the boot lands in. 
Receipts: • Current first-run truth table compose flag → posture -f docker-compose.yml (production) → requires .env; fail-fasts on missing AUTH_SECRET / CONFIG_ENCRYPTION_KEY / POSTGRES_PASSWORD; agent fail-fasts on missing AGENT_ID -f docker-compose.yml -f docker-compose.demo.yml (demo) → zero-config; AUTH_TYPE=none + DEMO_MODE_ACK=true + KEYGEN=server + DEMO_SEED=true; boot banner WARN -f docker-compose.yml -f docker-compose.dev.yml (dev) → base + PgAdmin + debug logging -f docker-compose.test.yml (test, standalone) → production-shape posture, real CA backends • Verification (PATH=/tmp/go/bin export GO* paths to /tmp): gofmt -l # clean (no diffs) go vet ./internal/config ./cmd/server # clean go test -short -count=1 ./internal/config/... # PASS (cumulative + all 9 new Bundle 2 cases green) go test -short -count=1 ./internal/connector/target/configcheck # PASS (no regression in the Bundle 1 closure tests) go build ./cmd/server ./cmd/agent ./cmd/cli ./cmd/mcp-server # clean bash scripts/ci-guards/B2-compose-base-no-demo-env.sh # clean bash scripts/ci-guards/H-1-encryption-key-min-length.sh # clean bash scripts/ci-guards/G-3-env-docs-drift.sh # clean Remaining operator warnings (not blocking; tracked in CLAUDE.md "Open decisions"): • The first `docker compose -f docker-compose.yml up -d` against a pre-Bundle-2 .env (placeholder values still in place) will now fail-fast. This is the intended posture but operators upgrading from v2.0.x via .env-from-old-master need to rotate before upgrading. The CHANGELOG note for the v2.1.0 release should call this out alongside Auth Bundle 2's other breaking changes. Audit-Closes: BUNDLE-2 R2 R3 C1 D9 S9 SEC-H2 SEC-M1 SEC-M3 OPS-M3 LOW-5 HIGH-6	2026-05-13 00:14:59 +00:00
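The fail-closed gates listed in the Bundle 2 commit can be sketched as a single validation function. This is an illustration under stated assumptions, not the real Validate: the `settings` struct, its field names, and the error wording are hypothetical. The placeholder literal for AUTH_SECRET, the wildcard CORS refusal, and the demo-ack exemption come from the commit message; the encryption-key gate is left out because its placeholder value is elided in the source.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// placeholderAuthSecret is the literal quoted in the commit message for HIGH-6.
const placeholderAuthSecret = "change-me-in-production"

// settings is a hypothetical subset of the server configuration.
type settings struct {
	DemoModeAck bool     // models CERTCTL_DEMO_MODE_ACK=true
	AuthSecret  string   // models AUTH_SECRET
	CORSOrigins []string // models CORS_ORIGINS, already split into entries
}

// validate refuses placeholder secrets and wildcard CORS unless demo mode is acknowledged.
func validate(s settings) error {
	if s.DemoModeAck {
		return nil // the commit states that the new gates exempt the demo overlay
	}
	if s.AuthSecret == placeholderAuthSecret {
		return errors.New("AUTH_SECRET is still the placeholder value: refuse to start")
	}
	for _, origin := range s.CORSOrigins {
		// A wildcard mixed into an otherwise concrete allowlist is still refused.
		if strings.TrimSpace(origin) == "*" {
			return errors.New("CORS_ORIGINS contains a wildcard: refuse to start")
		}
	}
	return nil
}

func main() {
	cases := []settings{
		{AuthSecret: placeholderAuthSecret},
		{AuthSecret: "a-real-secret", CORSOrigins: []string{"https://example.test", "*"}},
		{AuthSecret: "a-real-secret", CORSOrigins: []string{"https://example.test"}},
		{DemoModeAck: true, AuthSecret: placeholderAuthSecret, CORSOrigins: []string{"*"}},
	}
	for _, c := range cases {
		fmt.Println(validate(c))
	}
}
```

The four sample cases correspond to scenarios the commit's test list names: placeholder refused, wildcard mixed into a concrete allowlist refused, concrete allowlist accepted, and demo-ack exempt.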
shankar0123	d60a0ac297	fix(security): close BUNDLE 1 — server+agent connector config validation chain Bundle 1 closure (2026-05-12 acquisition diligence audit). Closes the acquisition-blocker chain: target.edit (default r-operator grant per migrations/000029_rbac.up.sql:196) → arbitrary reload_command stored without validation → agent createTargetConnector json.Unmarshal-only → sh -c on agent host. README's 'shell injection prevention on all connector scripts' claim is now true at the chain level. Server-side: new internal/connector/target/configcheck package + a configcheck.Validate call in target.go::Create + ::Update + ::CreateTarget + ::UpdateTarget (all 4 entry points). Rejects shell metacharacters in reload_command / validate_command / restart_command for nginx, apache, haproxy, postfix/dovecot, javakeystore, ssh. Sentinel errors.Is(err, service.ErrInvalidConnectorConfig) available for handler 400 mapping. Non-shell connector types (F5, IIS, Caddy, Traefik, Envoy, cloud targets, K8s) are no-ops by design. Agent-side: defense-in-depth connector.ValidateConfig(ctx, configJSON) call in cmd/agent/main.go inserted between createTargetConnector and DeployCertificate. This catches (a) configs pre-dating the server gate, (b) encrypted-blob tampering, (c) per-connector filesystem invariants that the server can't check. F5 (S2 finding): proven docs-vs-code drift, not a security bug. The applyDefaults function never set Insecure=true; runtime default has always been Go zero-value (false → TLS verified). Three lying 'default true' comments in f5/f5.go (lines 30, 45-47, 126) rewritten to match actual code behavior. Docs (C4 + C9): README L12 + L68 narrowed — 'any CA / any server' → 'Twelve native CA connectors plus an OpenSSL adapter; fifteen native deployment-target connectors plus a proxy-agent pattern.' 'Every deploy goes through atomic-write + ...' narrowed to file-based connectors with inline link to per-target guarantee matrix. 
New deployment-model.md §1.6 ships a 15-target × 8-property guarantee table covering atomic write / owner-perms / SHA-256 idempotency / pre-deploy snapshot / on-failure rollback / post-deploy TLS verify / Prometheus counters / shell-injection validation — including the K8s preview honesty marker (CLAIM-H4). Tests: internal/connector/target/configcheck/configcheck_test.go covers 14 shell-injection payloads (semicolon, pipe, backtick, dollar-paren, redirect, and-chain, newline, double-quote, escape, dollar-var) × 7 shell-using connectors + benign-command acceptance + non-shell no-op behavior + empty config + malformed JSON. All pass. Verification (run from /sessions/gifted-blissful-pasteur/mnt/cowork/certctl): go fmt ./... # clean (no diffs) go vet ./... # clean (no findings) go test -short -count=1 ./internal/... ./cmd/... # 60+ packages all ok, zero FAIL Audit-Closes: BUNDLE-1 RT-C1 SEC-M4 CLAIM-M2 CLAIM-L3 Audit-Verifies-False: S2 (F5 'default insecure' was a comment lie, code was always secure)	2026-05-12 23:48:08 +00:00
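The server-side command validation from Bundle 1 can be approximated with a character-class check. This is a minimal sketch and not the configcheck package itself: the function `checkCommand`, the exact metacharacter set, and the locally defined sentinel are assumptions made for illustration. The payload classes (semicolon, pipe, backtick, dollar forms, redirect, and-chain, newline, double-quote, escape) and the sentinel name ErrInvalidConnectorConfig come from the commit message.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// ErrInvalidConnectorConfig stands in for the sentinel the commit names
// (service.ErrInvalidConnectorConfig); it is redefined here to keep the sketch self-contained.
var ErrInvalidConnectorConfig = errors.New("invalid connector config")

// shellMeta lists the characters this sketch treats as shell metacharacters:
// ; | & ` $ < > " \ and newline. The real package may use a different set.
const shellMeta = ";|&`$<>\"\\\n"

// checkCommand rejects a command string that contains any shell metacharacter.
// field is the config key being checked, for example "reload_command".
func checkCommand(field, cmd string) error {
	if i := strings.IndexAny(cmd, shellMeta); i >= 0 {
		return fmt.Errorf("%w: %s contains shell metacharacter %q",
			ErrInvalidConnectorConfig, field, cmd[i])
	}
	return nil
}

func main() {
	commands := []string{
		"nginx -s reload",
		"nginx -s reload; curl http://attacker.test",
		"systemctl reload haproxy && id",
		"echo $(whoami)",
	}
	for _, cmd := range commands {
		err := checkCommand("reload_command", cmd)
		// errors.Is lets a handler map the sentinel to an HTTP 400.
		fmt.Printf("%q rejected=%v\n", cmd, errors.Is(err, ErrInvalidConnectorConfig))
	}
}
```

A deny-list like this is deliberately strict: it also refuses legitimate quoting, so a benign command has to be expressible without shell syntax. Running the same check on the agent before deployment gives the defense-in-depth layer the commit describes for configs that predate the server gate.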
shankar0123	7b3a57dfdf	docs(readme): revert Status block to 4-paragraph form (over-split was too choppy)	2026-05-11 22:18:38 +00:00
shankar0123	a103ccfe5c	docs(readme): one sentence per blockquote in Status block — full breathing room	2026-05-11 22:17:44 +00:00
shankar0123	c029875196	docs(readme): Status block rewrite — design-partner CTA, paragraph cadence Earlier versions were either link-soup or so tight they read as boilerplate. This pass aims for CMO-grade copy: - Paragraph 1: lede that combines the early-access label with the design-partner ask — sets the tone in one line. - Paragraph 2: what's production-quality today, with the RBAC + OIDC doc links inline (no bold, no link-soup). Names the v2.1.0 layer on top. - Paragraph 3: the ask — production deployments wanted, framed explicitly as 'we can't manufacture this exposure in CI'. Honest about the federated-identity surface being where the new exposure lives. Mutual-value framing. - Paragraph 4: the actionable bit — file issues liberally, with the why ('how the platform earns the right to drop early-access'). Three inline doc links (RBAC, OIDC runbook index, file-issues). Same factual content, warmer voice, paragraph cadence with breathing room between.	2026-05-11 22:16:32 +00:00
shankar0123	ed833e80f6	docs(readme): space out the Status block — three separate blockquotes	2026-05-11 22:14:50 +00:00
shankar0123	0eb3d0310c	docs(readme): tighten Status block; add RBAC + OIDC runbook links Quieter version of the Status block — single blockquote, three short sentences, three inline links (RBAC, OIDC, file-issues). Drops: - The Local-CA / ACME / agent-deployment / CRUD / audit feature pile (those live in the doc table immediately below) - The 6-IdP enumeration (Keycloak / Authentik / Okta / Auth0 / Entra ID / Google Workspace) — operators find that in the OIDC runbook index, now linked inline - The double 'in early-access' phrasing - 'HMAC-signed server-side sessions with __Host- cookies and CSRF rotation; OIDC Back-Channel Logout; Argon2id break-glass admin' — the spec details belong in the auth-threat-model + security docs, not the front-page status Same early-access framing, same issue-link CTA, far more readable.	2026-05-11 22:13:34 +00:00
shankar0123	46769fc7fa	docs(readme): audit pass — fix 7 stale/inaccurate claims Each claim ground-truthed against the live repo, not memory. Numeric drift (claims rotted since they were written): - Screenshot caption 'Catalog with 10 CA types' → 12 (matches internal/connector/issuerfactory/factory.go enumeration). - '33-permission canonical catalogue' → dropped the number. 33 was the base in migration 000029; across all 45 migrations 82 unique perms are seeded (+5 admin / +7 OIDC / +2 break-glass / +33 audit-CRIT-1 / +2 user). 'Fine-grained permission catalogue' is monotonic prose. - 'PostgreSQL 16 backend (35+ tables, idempotent migrations)' → '…backend with idempotent migrations'. Actual table count is 49 across 45 migrations; bare 'idempotent migrations' is drift-proof. - Demo overlay seeds '32 certificates across 10 issuers, 8 agents, 180 days' → '180 days of realistic history across 13 issuers, 8 agents, managed + discovered certs, jobs, deploys, audit, and notification events'. seed_demo.sql actually seeds 14 managed certs + 16 cert versions + 12 discovered, 13 issuers (not 10), 8 agents ✓, 23 INTERVAL '180 days' refs ✓. - 'golangci-lint (11 linters)' → '(govet + staticcheck + contextcheck + unused)'. .golangci.yml lists exactly 4 active linters; 6 others are commented-out 'temporarily disabled' so neither 4 nor 10 explains 11. Broken Helm one-liner (silently no-ops because --set against a nonexistent path doesn't error): - '--set server.apiKey=…' → 'server.auth.apiKey' (deploy/helm/certctl/values.yaml:147 + templates/server- secret.yaml:16). - '--set postgres.password=…' → 'postgresql.password' (top-level key is 'postgresql', not 'postgres'; password sits at postgresql.password per values.yaml:315). Verified accurate (no change): - 12 issuers / 15 targets / 6 notifiers (factory + dir listings). - 7 default roles seeded in migration 000029. 
- Coverage thresholds (service 70 / handler 75 / crypto 88 / auth packages 85-95) against .github/coverage-thresholds.yml. - All 6 OIDC runbooks present (auth0 / authentik / azure-ad / google-workspace / keycloak / okta). - 4 referenced screenshots all exist on disk. - 8 agents in demo seed, 180 days of history. - RFC 9700 §4.7.1 / 9207 / 8555 / 9773 / 8894 / 9266 / 5280 / 6960 citations match source. - ChromeOS in SCEP description matches source. - install-agent.sh uses uname for OS / arch detection + systemd (Linux) / launchd (macOS).	2026-05-11 17:29:18 +00:00
shankar0123	12705efe36	docs(readme): split Status block into two blockquotes for breathing room	2026-05-11 17:09:20 +00:00
shankar0123	de53847f51	docs(readme): quiet the Status block The previous version crammed 5 bold-emphasized inline links plus inline code into a single paragraph — visually loud and hard to scan. Rewrite as two short paragraphs: - First paragraph: what's production-quality + what's still maturing. No links, em-dash cadence for breathing room. - Second paragraph: v2.1.0 OIDC + sessions + break-glass slice with a single issue-link tail. Drops the bold-link sandwich in favor of plain prose; the doc-nav table directly below handles per-doc routing. Same content, same early-access framing, far less visual noise.	2026-05-11 17:08:21 +00:00
shankar0123	56e2ea1ad7	docs: v2.1.0 release polish — strip internal bundle/phase tags, update status for OIDC ship README: - Rewrite Status block: drop the stale 'federated identity not yet shipped' line; flag v2.1.0 OIDC + sessions + back-channel logout + break-glass as early-access; encourage GitHub issues for IdP rough edges. (A1 framing — keep early-access umbrella, no SAML/WebAuthn/JIT roadmap teaser.) - Add OIDC SSO bullet to 'What it does' covering per-IdP runbooks, group-claim → role mapping, AES-256-GCM client_secret encryption, JWKS auto-refresh, PKCE-S256, RFC 9700 §4.7.1 pre-login binding, RFC 9207 iss check, __Host- cookies, CSRF rotation, idle+absolute expiry, BCL, break-glass admin. - Update Security paragraph: three auth paths (API keys / OIDC / break-glass), HMAC-signed sessions, CSRF rotation, RFC OIDC BCL. - Correct CI coverage thresholds against .github/coverage-thresholds.yml (service 70%, handler 75%, crypto 88%, auth packages 85-95%); 'static analysis' replaces the inflated '11 linters' claim (actual count is 4 active). Docs B3 sweep — strip operator-facing 'Bundle N' / 'Phase N' tags: - docs/operator/auth-threat-model.md — rewrite intro; rename 5 H2 sections (API-key + RBAC defenses / OIDC + sessions + break-glass defenses / OIDC + sessions threat catalogue / Closed federated- identity threats / Future-work threats); clean ~12 H3/prose hits. - docs/operator/rbac.md — strip Bundle 1 framing from intro, scope_id deferral note, MCP tools section, day-0 bootstrap, and 'Where to look next'. - docs/operator/auth-benchmarks.md — drop 'Phase 14' framing from title intro, hardware floor caption, result table caption, methodology, and pre-merge audit section. - docs/operator/security.md — already cleaned earlier this session (RBAC / day-0 / approval-bypass / OIDC federation / sessions / OIDC first-admin / break-glass H3s). 
- docs/operator/oidc-runbooks/{index,keycloak,authentik,okta,azure-ad}.md — strip Auth Bundle 2 framing + Phase 10/3/4 references; replace with feature-name prose. - docs/operator/legacy-clients-tls-1.2.md — drop Bundle F / M-023 audit-reference framing; keep CWE-326. - docs/operator/database-tls.md — drop Bundle B / M-018 framing from intro + Helm section. - docs/operator/runbooks/disaster-recovery.md — drop 'Production hardening II Phase 10' status callout. - docs/migration/oidc-enable.md — retitle 'Enable OIDC SSO'; strip Bundle 1/2 framing from prereqs, troubleshooting, related docs; update __Host- cookie callout from 'audit MED-14' to v2.1.0-BREAKING. - docs/migration/api-keys-to-rbac.md — strip Bundle 1 framing from intro, migration table, IsAdmin section, and cross-references. - docs/migration/acme-from-cert-manager.md — strip residual 'Phase 5' tags from cert-manager integration test references. - docs/reference/configuration.md — retitle Auth section. - docs/reference/profiles.md — strip Bundle 1 Phase 9 framing from RequiresApproval section + Related list. - docs/reference/auth-standards-implemented.md — rewrite intro (API-key + RBAC + OIDC + sessions + back-channel logout + break-glass); rename 'Bundle 1 (RBAC) standards covered separately' H2; clean per-row Phase references. - docs/README.md — rewrite nav-table entries to drop Bundle 1/2 parentheticals; retitle 'Enable OIDC SSO' migration entry. No code or test changes; pure operator-facing prose polish for the v2.1.0 tag.	2026-05-11 16:54:07 +00:00
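The PKCE-S256 item named in the OIDC bullet above can be illustrated with a short Go sketch. It shows only the challenge derivation defined by RFC 7636 (unpadded base64url of the SHA-256 of the verifier); the function name `codeChallengeS256` is hypothetical and is not taken from the certctl source.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"fmt"
)

// codeChallengeS256 derives a PKCE code_challenge for the S256 method
// (RFC 7636 section 4.2): BASE64URL(SHA-256(ASCII(code_verifier))), unpadded.
// The name is illustrative, not certctl's API.
func codeChallengeS256(verifier string) string {
	sum := sha256.Sum256([]byte(verifier))
	return base64.RawURLEncoding.EncodeToString(sum[:])
}

func main() {
	// Verifier from the worked example in RFC 7636 Appendix B.
	verifier := "dBjftJeZ4CVP-mB92K27uhbUJU1p1r_wW1gFWFOEjXk"
	fmt.Println(codeChallengeS256(verifier))
}
```

The client sends the challenge on the authorization request and the raw verifier on the token request, so an intercepted authorization code cannot be redeemed without the verifier.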
shankar0123	977cdbdf44	docs(README): surface Bundle 1 RBAC + signal Bundle 2 federation as roadmap Pre-fix the README said nothing about role-based access control, the auditor role, the day-0 bootstrap path, or the four-eyes approval workflow — all shipped in Bundle 1 (commit `22c4971` + follow-ons). A prospective adopter landing on the README would read "API key auth enforced by default" and walk away thinking certctl had no authz primitive at all. The only OIDC reference was the cosign-keyless line at the artefact-signing section, unrelated to authentication. Three surgical edits: 1. Status block: extend the "production-quality core" enumeration with role-based authz, auditor split, day-0 bootstrap, four-eyes approval. Add a one-line callout that federated identity (OIDC, SAML, WebAuthn, server-side sessions, break-glass, JIT elevation) is roadmap-not-shipped — preempts the natural-but-wrong assumption that "RBAC means OIDC works". The two terms are linked inline: - "role-based authz" -> docs/operator/rbac.md (operator how-to: role table, permission catalogue, scope semantics, GUI/CLI/HTTP/MCP grant flows, day-0 bootstrap). - "Federated identity" -> docs/operator/auth-threat-model.md #threats-bundle-1-does-not-close (canonical place where deferred Bundle-2 work is enumerated). Keeps the roadmap promise honest: a skeptic can click through to the explicit deferred-work list rather than taking prose at face value. 2. "What it does" feature list: insert a new bullet right after the approval-workflow bullet covering the 7 default roles, the 33-permission canonical catalogue, scope semantics, the auditor read-only invariant, the bootstrap path, and the privilege-escalation guard. Cross-links to docs/operator/rbac.md, the threat model, and the v2.0.x → v2.1.0 migration guide. 3. 
Security paragraph: replace "API key auth enforced by default with SHA-256 hashing and constant-time comparison" with the Bundle-1 reality — auth + RBAC + auditor + bootstrap + privilege-escalation guard — keeping the rest of the paragraph (CORS, SSRF, encryption-at-rest, TLS-1.3, audit trail, CI gates) unchanged. Verified: Both link targets exist on disk (docs/operator/rbac.md, docs/operator/auth-threat-model.md). Threat-model anchor heading "## Threats Bundle 1 does NOT close" is intact (line 138). All 24 ci-guards pass locally including S-1 (no hardcoded source counts re-introduced) and G-3 (no env-var docs drift). Updates the README to match Bundle 1's actually-shipped surface and to set honest expectations about Bundle 2 (federated identity) being the next slice, not yet landed.	2026-05-10 02:21:39 +00:00
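The "SHA-256 hashing and constant-time comparison" phrase quoted in the commit above describes a common pattern for API-key checks. The following is a minimal sketch of that pattern, assuming the server stores only a digest of each key; `hashKey` and `keyMatches` are hypothetical names and do not reflect how certctl actually structures its auth code.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"fmt"
)

// hashKey returns the SHA-256 digest of an API key. In this sketch only the
// digest is stored, never the key itself (an assumption for illustration).
func hashKey(key string) [32]byte {
	return sha256.Sum256([]byte(key))
}

// keyMatches hashes the presented key and compares it to the stored digest
// in constant time, so the comparison does not leak how many bytes matched.
func keyMatches(presented string, stored [32]byte) bool {
	got := hashKey(presented)
	return subtle.ConstantTimeCompare(got[:], stored[:]) == 1
}

func main() {
	stored := hashKey("example-api-key") // hypothetical key value
	fmt.Println(keyMatches("example-api-key", stored))
	fmt.Println(keyMatches("wrong-key", stored))
}
```

Because both inputs to `subtle.ConstantTimeCompare` are fixed-length digests, the length check inside that function never short-circuits on attacker-controlled input.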
shankar0123	ff6bf8f203	docs(README): add Status: Early-access disclosure block Reddit posts and operator-facing copy describe certctl as alpha for production, but the README's marketing-paragraph framing implied a more polished maturity. Dual-positioning erodes credibility because evaluators read both surfaces. Adds a dedicated "Status: Early-access" blockquote between the SC-081v3 paragraph and the existing "Actively maintained, shipping weekly" callout. Calls out the production-quality core (Local CA, ACME, agent deployment, CRUD, audit) versus the still-maturing broader surface (intermediate CA hierarchy, ACME/SCEP/EST servers, network appliances). Encourages lab/dev deployments and welcomes production deployments with the customer-scale caveat. The two consecutive blockquotes (Status + Actively maintained) read as paired signals: the project is early-access AND actively shipping, which is the honest joint position.	2026-05-06 07:45:55 +00:00
shankar0123	1720e11109	docs: fix broken single-file demo invocation in README + qa-prerequisites + ENVIRONMENTS The README's Quick Start, the qa-prerequisites contributor doc, and the landing page (separate repo, separate commit) all shipped a copy-paste command that produces: service "certctl-server" has neither an image nor a build context specified: invalid compose project The bug landed silently with commit `a3d8b9c` (the U-3 master). Pre-U-3, docker-compose.demo.yml was self-contained and could be invoked with a single -f flag. U-3 deliberately reduced it to a 27-line overlay — its only payload today is `CERTCTL_DEMO_SEED=true` on the certctl-server service — because the demo seed now applies at boot via postgres.RunDemoSeed, not via /docker-entrypoint-initdb.d/. The overlay no longer carries an image: or build: of its own, so it MUST be passed alongside the base file. The README/qa-doc/landing-page never picked up the rename of the contract. Every operator who copy-pasted the Quick Start since U-3 has hit the "invalid compose project" error and bounced. The operator caught it running the demo locally today. This commit fixes the three certctl-repo sites: README.md (Quick Start) docker compose -f deploy/docker-compose.demo.yml up -d --build → docker compose -f deploy/docker-compose.yml -f deploy/docker-compose.demo.yml up -d --build Plus the "drop the -f flag for clean install" prose now spells out the correct fallback (`-f deploy/docker-compose.yml` alone). docs/contributor/qa-prerequisites.md (Step 1) Same single-file → two-file fix, plus an inline note explaining why the override-only file requires the base (so the next person who reads it understands the contract instead of re-discovering it). 
deploy/ENVIRONMENTS.md (Demo Overlay → What it adds) Replaced the stale "One line: mounts seed_demo.sql into PostgreSQL's init directory" claim — that hasn't been true since U-3 — with the accurate "One env var: CERTCTL_DEMO_SEED=true; server applies seed_demo.sql at boot via postgres.RunDemoSeed" description, plus the historical context for why the overlay can't stand alone. The certctl.io landing page hits the same bug (line 759); fix shipping in a separate commit in that repo. Acceptance gate (manual): - copy/paste the new README Quick Start command end-to-end against a fresh clone — succeeds, dashboard at https://localhost:8443 shows the seeded demo data within ~30s. - clean-install fallback (`docker compose -f deploy/docker-compose.yml up -d --build`) starts a working stack with no demo data.	2026-05-05 20:55:26 +00:00
shankar0123	e0aaa967c9	docs(README): add MCP server bullet to capabilities list The README's 'What it does' section enumerated 11 capability bullets (issuers / targets / ACME server / SCEP server / EST server / hierarchy / approvals / discovery / revocation / alerts) but had zero mention of the MCP server. The 2026-05-05 CLI/API/MCP ↔ GUI parity audit confirmed 93 MCP tools shipped today (87 in internal/mcp/tools.go + 6 in internal/mcp/tools_est.go) covering the full API surface. That's a real differentiator hidden from anyone landing on the README. Adds a 12th bullet positioning the MCP server with concrete example queries operators can ask their AI client (expiring certs, revoke with key-compromise reason, agent offline check). Frames the architectural facts: separate binary at cmd/mcp-server/, stateless stdio transport, no extra auth surface beyond the existing API key, no extra attack surface. Links to docs/reference/mcp.md for setup details.	2026-05-05 19:10:27 +00:00
shankar0123	7c5cc57d75		2026-05-05 15:39:08 +00:00
shankar0123	d809874fa1	docs: retire compliance subtree + sweep framework name-drops from prose Per operator decision the framework-mapping docs are gone. They were aspirational (no audit, no certification, no validated mapping); keeping them around was misleading. Files deleted (1,883 lines): - docs/compliance/index.md - docs/compliance/soc2.md - docs/compliance/pci-dss.md - docs/compliance/nist-sp-800-57.md Hyperlinks removed: - README.md: 'Auditor / compliance' row in the doc table; the '(compliance mapping included)' parenthetical in the positioning paragraph - docs/README.md: the '## Compliance' section table; the 'Auditor / compliance team' reading-order-by-role row Prose name-drops swept across 24 files: - README.md: 'FedRAMP boundary CAs / financial-services policy CAs' → '4-level boundary CAs / 3-level policy CAs'; 'Compliance-grade for PCI-DSS Level 1, FedRAMP Moderate / High, SOC 2 Type II, HIPAA' → cut entirely - getting-started/{quickstart,concepts,examples,why-certctl, advanced-demo}.md: 'compliance' → 'audit' / 'policy'; 'PCI-DSS / SOC 2 / NIST SP 800-57' framework lists cut; ''pci': 'true'' tag example → ''environment': 'production'' - migration/cert-manager-coexistence.md: 'compliance rules' → 'policy rules' - operator/approval-workflow.md: 'Compliance customers (PCI-DSS Level 1, FedRAMP Moderate / High, SOC 2 Type II, HIPAA)' → 'Operators'; entire 'Compliance control mapping' table (PCI-DSS §6.4.5 / NIST SP 800-53 SA-15 / SOC 2 Type II CC6.1 / HIPAA §164.308(a)(4)) deleted; 'compliance contract' → 'two-person-integrity contract'; 'compliance auditors' → 'reviewers' - operator/legacy-clients-tls-1.2.md: 'PCI-DSS v4.0 Req 4 §2.2.5' audit-reference → CWE-326 (kept); 'PCI-DSS Req 4 §2.2.5 attestation' section retitled to 'TLS posture summary' and rewritten without framework framing; 'PCI-DSS, NIST, and major browsers will eventually deprecate TLS 1.2' → 'Major browsers and OS vendors will eventually deprecate TLS 1.2' - operator/database-tls.md: 
PCI-DSS Req 4 §2.2.5 audit-ref → CWE-319 only; 'PCI-DSS scope' → 'sensitive data'; PCI-DSS Req 4 v4.0 prose footing → cut - operator/runbooks/disaster-recovery.md: 'SOC 2 / PCI procurement-team deliverable' → 'on-call deliverable'; 'compliance auditors' → 'reviewers' - reference/connectors/{acme,aws-acm,azure-kv,globalsign, local-ca,openssl,ssh,index}.md: 'compliance reporting (PCI-DSS §3.6, HIPAA §164.312)' → 'audit reporting'; 'Compliance environments (PCI-DSS Level 1, FedRAMP High, HIPAA)' → 'Regulated environments'; 'compliance audits' → 'audit'; 'FedRAMP boundary CA' pattern names → '4-level boundary CA' (technically descriptive) - reference/protocols/est.md: 'compliance-hook seam' → 'device-state hook seam'; 'compliance gating' → 'device-state gating'; 'est_compliance_failed' → 'est_device_state_failed' - reference/protocols/scep-intune.md: 'Optional compliance check' → 'Optional device-state check'; failure-counter 'compliance_failed' → 'device_state_failed'; 'Conditional Access compliance gating' → 'Conditional Access device-state gating' - reference/intermediate-ca-hierarchy.md: 'FedRAMP boundary-CA deployments where the regulator requires...' 
→ 'Boundary-CA deployments where you want separation of policy and issuing authorities'; pattern A retitled '4-level FedRAMP boundary CA' → '4-level boundary CA' - reference/architecture.md: broken Related-docs link to compliance.md removed; the rest of that block had stale pre-Phase-2 paths (quickstart.md, demo-advanced.md, connectors.md, openapi.md, testing-guide.md, test-env.md) — retargeted to current locations - reference/deployment-model.md: 'SOC 2 evidence-report generator' → 'Audit-evidence report generator' - reference/vendor-matrix.md: 'SOC 2 / PCI auditors paste this into evidence packs' → 'reviewers paste this into vendor-evaluation packs' - contributor/qa-test-suite.md: 'compliance exist' coverage description cut; 'Compliance (PCI / SOC2 / HIPAA-relevant)' risk-class label → 'Audit-relevant' What was kept: - CWE references (legitimate technical pointers) - Microsoft API/feature names that happen to use 'compliance' literally ('Microsoft Graph compliance API', 'device-compliance validators' — these are MS product names, not framework name-drops) - 'NIST PQC' on the landing page (Post-Quantum Cryptography is the actual NIST standard family, not a compliance framework) Verified: zero hyperlinks into docs/compliance/ remain. All 24 ci-guards/*.sh pass locally. qa-doc-seed-count.sh clean. Net diff: 26 files / -1,883 deletions in compliance/ + -32 net across the prose sweep. Companion edits in cowork/ (CLAUDE.md doc-tree summary + WORKSPACE-CHANGELOG.md retirement note) land separately.	2026-05-05 05:26:44 +00:00
shankar0123	426760d737	docs: Phase 13 — README rewrite to 250-line target Per Phase 1 audit at cowork/docs-overhaul-phase-1-audit-2026-05-04/. README went from 457 lines to a target of 250 (operator decision in Phase 1 conversation). Focus shifts from feature-catalog + landing-page duplicate to "developer cloning the repo needs orientation + quickstart + entry points to docs." What stayed: - Logo + title + badges (~15 lines) - Elevator paragraph + 47-day cliff context (3 paragraphs, compressed) - Active-maintenance callout - Documentation table — restructured from 22 entries linking to flat docs/ to ~6 audience-organized rows linking through the new docs/README.md navigation index - Screenshots grid (4 tiles) - "What it does" — compressed from 33 lines of prose to 8 capability bullets, each linking to the canonical doc - Architecture paragraph — compressed to one paragraph linking to docs/reference/architecture.md - Quick Start (Docker Compose, Agent install, Helm, container images) - Examples table (5 turnkey scenarios) - Development commands - License paragraph - Dependencies block - Footer CTA What got moved out: - Cosign verification / SLSA / SBOM section (67 lines) → docs/reference/release-verification.md (NEW). README links to it in a 3-line "Verifying a release" section. What got removed entirely: - "Why certctl" + "Architecture" + "Security-first" + "Key design decisions" prose walls — duplicated landing page + architecture.md + security.md content. README no longer wades through 11 dense paragraphs. - "Supported Integrations" 4 sub-tables (Issuers / Targets / Protocols / Standards / Notifiers, ~80 lines of dense per-row marketing copy) — content lives at docs/reference/connectors/index.md and docs/reference/protocols/. README mentions counts ("12 issuers, 15 targets, 6 notifiers") with a single link. 
- "Roadmap" section entirely — V1 + V2 history rotted fastest of any section; replaced with implicit "see Releases + Issues for active work" via the existing footer CTA. - "What It Does" 10-subsection wall (33 lines) — replaced with the 8-bullet capability list, each linking to its canonical doc. - CLI section (20 lines of inline command examples) — links to the contributor docs. - MCP Server section (30 lines of setup) — links to docs/reference/mcp.md. New surface added: - docs/reference/release-verification.md — moved cosign/SLSA/SBOM procedure with one expanded "Why this matters" paragraph explaining the keyless OIDC trust anchor. Every docs/ link in the new README verified to resolve to an existing file. Cross-references from other docs / certctl.io to the deleted sections (if any) need follow-up Phase 11 sweeps.	2026-05-05 03:26:05 +00:00
shankar0123	dca1900815	docs: Phase 11 (partial) — fix cross-references after Phase 2 moves Per Phase 1 audit at cowork/docs-overhaul-phase-1-audit-2026-05-04/. Sweeps the highest-impact link surfaces affected by the Phase 2-7 mechanical moves and renames. Covers README.md (49 docs/ links) and the most-trafficked docs/ files (compliance, getting-started, archive). README.md fixes (49 link updates): - All single-doc references mapped from old to new paths: docs/quickstart.md → docs/getting-started/quickstart.md docs/architecture.md → docs/reference/architecture.md docs/connectors.md → docs/reference/connectors/index.md docs/acme-server.md → docs/reference/protocols/acme-server.md docs/{soc2,pci-dss,nist}.md → docs/compliance/{soc2,pci-dss,nist-sp-800-57}.md ... (full mapping in the sed pipeline) - 3 references to deleted features.md replaced with pointers to architecture.md + connectors/index.md. docs/compliance/index.md (3 sibling renames): compliance-soc2.md → soc2.md compliance-pci-dss.md → pci-dss.md compliance-nist.md → nist-sp-800-57.md docs/compliance/pci-dss.md (3 external refs need ../): architecture.md → ../reference/architecture.md connectors.md → ../reference/connectors/index.md quickstart.md → ../getting-started/quickstart.md docs/getting-started/concepts.md (4 external refs): crl-ocsp.md → ../reference/protocols/crl-ocsp.md architecture.md → ../reference/architecture.md mcp.md → ../reference/mcp.md openapi.md → ../reference/api.md docs/getting-started/quickstart.md (4 external refs + 1 sibling): tls.md → ../operator/tls.md upgrade-to-tls.md → ../archive/upgrades/to-tls-v2.2.md architecture.md → ../reference/architecture.md demo-advanced.md → advanced-demo.md (sibling rename) docs/getting-started/examples.md (4 external refs): migrate-from-certbot.md → ../migration/from-certbot.md migrate-from-acmesh.md → ../migration/from-acmesh.md certctl-for-cert-manager-users.md → ../migration/cert-manager-coexistence.md connectors.md → 
../reference/connectors/index.md docs/archive/upgrades/to-tls-v2.2.md (3 external refs need ../../): tls.md → ../../operator/tls.md quickstart.md → ../../getting-started/quickstart.md test-env.md → ../../contributor/test-environment.md docs/archive/upgrades/to-v2-jwt-removal.md (2 external refs need ../../): architecture.md → ../../reference/architecture.md tls.md → ../../operator/tls.md Verified all README.md docs/ links resolve to existing files. The only remaining top-level link is testing-guide.md which still exists at the top of docs/ (Phase 5 will prune it later). Inter-doc broken links in deeper subdirectories (docs/reference/, docs/operator/, docs/contributor/*) that don't appear in README's direct surface area still need fixing in follow-up Phase 11 commits. This commit handles the operator-facing entry points.	2026-05-05 03:19:21 +00:00
shankar0123	e50ba168ac	docs(README): strategic refresh — surface Rank 4/5/7/8 + ACME server + cloud targets README audit found six classes of drift between the README and the shipped repo. Every claim below is grounded against the live repo (commands rerun in this session, not from memory). Stale numeric claims fixed: '111 routes' → '180+ routes' (live: grep -cE 'r\.Register' router.go = 184) '80 tools' → '85+ tools' (live: grep -cE 'mcp\.AddTool' tools.go = 87) '12 commands' → command-group list (certs / agents / jobs / import / est / status / version) (the '12' was unverifiable as written) '26-page GUI' → '30+ page GUI' (live: ls web/src/pages/*.tsx | grep -v test = 31) '21 tables' → '35+ tables' (live: distinct CREATE TABLE in migrations = 35) Connectors added to tables (these shipped commits ago without README mentions): Deployment Targets: AWS Certificate Manager (AWSACM) — commit `edf6bee`, Rank 5 Azure Key Vault (AzureKeyVault) — commit `8a56a78`, Rank 5 Enrollment Protocols: ACME v2 server (drop-in for cert-manager / Caddy / Traefik) — Phases 1a-6, ~10 commits ending 340b937. Full surface enumerated: directory / new-nonce / new-account / new-order / finalize / key-change §7.3.5 / revoke-cert §7.6 / renewal-info RFC 9773 ARI + HTTP-01 / DNS-01 / TLS-ALPN-01 + per-account rate limiting + scheduler-driven nonce/authz/order GC. Existing rows updated: Local CA: now mentions tree-mode N-level hierarchy (Rank 8) Vault: now mentions auto-token-renewal at TTL/2 (commit `0792271`) EJBCA: now mentions mTLS auto-reload via mtlscache (commit `81f6321`) Major shipped features added to 'What It Does' prose (4 new named blocks): - 'Two-person integrity for issuance (compliance-grade).' — Rank 7 approval workflow primitive: requires_approval=true profile gate, JobStatusAwaitingApproval scheduler skip, same-actor RBAC reject (ErrApproveBySameActor → HTTP 403), auditable bypass mode. Procurement-checklist closer for PCI-DSS Level 1 / FedRAMP / SOC 2 / HIPAA. 
- 'Multi-level CA hierarchy management.' — Rank 8 first-class CA hierarchy: intermediate_cas table, RFC 5280 §3.2 / §4.2.1.9 / §4.2.1.10 service-layer enforcement, drain-first retire, FedRAMP / financial-services / internal-PKI patterns, byte-equivalence pin for unmigrated deployments. - 'Run certctl as your ACME server.' — Beyond consuming public ACME CAs, certctl now serves RFC 8555. Three client walkthroughs (cert-manager, Caddy, Traefik) cited. - 'Cloud-managed targets.' — AWS ACM + Azure Key Vault SDK-driven import + atomic rollback. - 'Notifications + per-policy multi-channel routing.' — Rank 4: AlertChannels matrix + AlertSeverityMap + fault-isolating per-channel dispatch + Prometheus counter. V2 paragraph rewritten: Pre-edit: a single 800-word wall-of-text bullet that listed everything. Buried Rank 4-8 features in the middle. Post-edit: 12 named feature blocks, each one to two sentences. Scannable. Cloud targets, ACME server, approval workflow, CA hierarchy, multi-channel alerts each get their own headline + one-line story + doc link. Documentation table extended with 5 newly-linked operator runbooks (all of which existed but were never reachable from the README): - docs/acme-server.md - docs/approval-workflow.md - docs/intermediate-ca-hierarchy.md - docs/runbook-cloud-targets.md - docs/runbook-expiry-alerts.md Plus 4 deeper cross-links inside the Enrollment Protocols + 'What It Does' prose: - docs/acme-cert-manager-walkthrough.md - docs/acme-caddy-walkthrough.md - docs/acme-traefik-walkthrough.md - docs/acme-server-threat-model.md Verified locally: All 9 previously-orphaned docs now reachable from README.md. No stale numeric claim remains: grep -nE '\b(111 routes|80 tools|12 commands|26.page|21 tables)' README.md → no matches. README size: 426 → 457 lines (+31). Net addition is 4 prose blocks + 2 table rows + 5 doc-table rows + 1 V2 paragraph rewrite (15 → 12 lines but each line denser). 
Strategic framing (CMO hat): - ACME server is the cert-manager adoption-funnel headline; gets its own table row + dedicated 'What It Does' block. - CA hierarchy is the Venafi / EJBCA replacement story for FedRAMP / financial-services / internal-PKI procurement; explicit market positioning. - Approval workflow framed as procurement-checklist closer (PCI-DSS L1 / FedRAMP / SOC 2 / HIPAA explicitly named). - Cloud-managed targets framed as 'we deploy to your cloud secret store' story. Doc-only commit. No code, no test changes.	2026-05-04 03:58:21 +00:00
shankar0123	cabe1aee45	docs(README): drop V3 Pro + V4 sections — everything ships free under BSL Strategic pivot. We are NOT building a V3 Pro paid tier or a V4 cloud / scale tier. Every certctl feature — current and future — ships free under the same BSL 1.1 source-available license. No gated features, no paid edition, no enterprise tier. Future revenue path is a managed-service hosting offering: operator runs the certctl-server control plane as a hosted service; customers self-install only the certctl-agent in their infrastructure. The self-hosted code stays free forever; the managed service sells operational convenience (no PostgreSQL to run, no upgrades, no backups, no SSO setup). BSL 1.1 was already structured around exactly this — the license expressly prevents competitors from running their own commercial certctl-as-a-service against the same source while leaving self-hosting unrestricted. Removed the old roadmap sections: - "### V3: certctl Pro" — Enterprise capabilities for larger deployments are available in the commercial tier. - "### V4+: Cloud & Scale" — Kubernetes cert-manager external issuer, cloud infrastructure targets, extended CA support, and platform-scale features. Replaced with a single "Forward-looking work — all free, all self-hostable" section that names the real engineering tracks (OIDC / SSO / RBAC, NATS / real-time, search / risk scoring, HSM / TPM / FIPS, deeper Vault auth, cloud-managed-target deep integrations, adapter hardening, credential lifecycle expansion) and points at the workspace-level WORKSPACE-ROADMAP.md for the unshipped backlog. The full feature surface lands in V2 over time — V3 / V4 are not real version targets, they were positioning artifacts. Diff: 2 insertions / 5 deletions. README's License section (BSL 1.1 licensing-inquiries footer) is unchanged.	2026-05-04 00:00:23 +00:00
shankar0123	0729ee46e0	chore: sweep github.com/shankar0123/certctl URL refs to certctl-io/certctl Post-transfer cosmetic + release-critical URL refresh after moving the repo from github.com/shankar0123/certctl to github.com/certctl-io/certctl (2026-05-03). GitHub HTTP redirects continue to forward old URLs forever, so existing operators are not broken — but this sweep aligns the canonical references with the new owner so: - procurement engineers / contributors browsing the docs see the right URL on first read - operators copying the agent install one-liner hit the new path directly without going through a redirect - the Helm chart's default image repository points at the canonical org registry path - the OnboardingWizard rendered to first-run UI users shows the new URL in the install snippets and doc anchor links - the GitHub Actions release workflow pushes container images to ghcr.io/certctl-io/certctl-{server,agent} (was: shankar0123) - the release-notes Markdown body in release.yml — which gets stamped into every future release page — references the post-transfer cert-identity (cosign keyless signing now uses the certctl-io workflow URL) and the post-transfer SLSA provenance source-uri. Without this, every cosign verify / slsa-verifier command on a v2.1.0+ release would fail because the cert-identity-regexp would not match the signing identity GitHub Actions OIDC issues post-transfer. Old releases (v2.0.67 and earlier) keep their immutable release-notes pointing at the shankar0123 path and remain verifiable via their own published instructions. Customer impact: - Operators on ghcr.io/shankar0123/certctl-{server,agent}:latest silently freeze on whatever tag was current at transfer time. They get no errors; they just stop receiving updates. The next release notes need a one-line callout (Phase 3.1 of cowork/transfer-certctl-to-org.md) telling them to update their image path to ghcr.io/certctl-io/certctl-{server,agent}. 
- All other URLs (git clone, install one-liner, raw.githubusercontent URLs, browser links, GitHub API) continue to resolve via permanent HTTP redirects. The sweep is cosmetic for those. Files swept (30 total): .github/workflows/release.yml — IMAGE_NAMESPACE, source-uri, cosign cert-identity-regexp, IMAGE= snippet (5 refs total). CHANGELOG.md, README.md — anchor links, badges, install one-liner, cosign verify snippets in operator-facing sections. api/openapi.yaml — info / externalDocs URLs. install-agent.sh — GITHUB_REPO const + systemd unit Documentation= field. deploy/ENVIRONMENTS.md, deploy/helm/{CHART_SUMMARY,INDEX,INSTALLATION,README}.md, deploy/helm/certctl/{Chart.yaml,README.md,values.yaml}, deploy/helm/examples/values-.yaml — chart docs + image repository defaults across dev / prod-ha overrides. docs/{certctl-for-cert-manager-users,connector-iis,connectors, migrate-from-acmesh,migrate-from-certbot,quickstart,test-env, why-certctl}.md — operator-facing doc URLs. examples/{acme-nginx,acme-wildcard-dns01,multi-issuer,private-ca-traefik,step-ca-haproxy}/docker-compose.yml + examples/step-ca-haproxy/step-ca-haproxy.md — example image: paths and accompanying narrative. web/src/pages/OnboardingWizard.tsx — first-run-UI URL refs (curl install one-liners, agent docker image path, doc anchor links). Files intentionally NOT swept (Choice A from cowork/transfer-certctl-to-org.md): go.mod, go.sum — module declaration stays github.com/shankar0123/certctl. Existing imports compile because Go uses the path declared in go.mod, not the URL it was fetched from. Internal-only project; no external Go consumers; rename will land as a mechanical sed when one materializes. ~250 .go files — every import remains github.com/shankar0123/certctl/internal/... deploy/test/f5-mock-icontrol/go.mod — separate test sub-module; same Choice A logic; module path stays. Files intentionally NOT swept (other reasons): README.md lines 244-245 — Scarf-pixel docker-pull commands. 
shankar0123.docker.scarf.sh/... is a Scarf-account hostname (per-user, not per-repo) and the pixel keeps tracking pulls against the operator's personal Scarf account. Migrating to a certctl-io Scarf account is a separate decision (create org Scarf account → re-create package → update README). deploy/test/f5-mock-icontrol/f5-mock-icontrol — checked-in compiled binary with shankar0123/certctl baked into Go build info via the sub-module path. Out of scope for a URL sweep; will refresh on the next `make test-integration` rebuild. Verification: gofmt: clean (no .go files touched). go vet ./...: clean (verified at this SHA in 1.3 of the transfer checklist; no .go changes since). go build ./...: clean (same). go test -short on representative packages: green (same). Diff shape: 30 files, 74 insertions / 74 deletions, net-zero size, pure URL substitution.	2026-05-03 23:39:50 +00:00
shankar0123	b3aad02232	chore(README): remove the second Scarf pixel — analytics consolidated to certctl.io The README has carried two Scarf pixels for some time: - 89db181e-76e0-45cc-b9c0-790c3dfdfc73 (kept earlier as 'GitHub traffic complement to GitHub Insights') - b9379aff-9e5c-4d01-8f2d-9e4ffa09d126 (moved to the certctl.io landing page in commit `6a5cfb3`) Re-evaluating: GitHub Insights → Traffic already provides repo views, uniques, clones, and referring sites with click counts at higher granularity than a Scarf pixel can extract from the README (Scarf can only see 'github.com' as the referrer; GitHub Insights knows the actual external referrer that landed the visitor on the README). The 89db181e pixel was duplicative-and-worse. Removing it. All certctl analytics now consolidate to: - GitHub Insights → Traffic (built-in, more granular than Scarf on the README surface) - certctl.io's b9379aff pixel (referrer-attribution for landing-page traffic, where Scarf actually adds value) - Scarf Docker Gateway via shankar0123.docker.scarf.sh/* (when the Helm chart + docker-compose.yml are routed through it — follow-up work) The Docker-pull example block at line 246 stays (it documents how operators install certctl via the Scarf gateway). Only the in-README tracking <img> is removed.	2026-05-01 20:59:22 +00:00
shankar0123	6a5cfb3d01	chore(README): remove duplicative Scarf pixel — moved to certctl.io The README had two Scarf pixels (89db181e and b9379aff). For README visit tracking, GitHub's built-in Insights → Traffic dashboard already provides views, uniques, clones, AND referring sites with click counts (Reddit, HN, Twitter, search, etc.) at higher granularity than a Scarf pixel can extract — Scarf can only see 'github.com' as the referrer because that's where the README HTML is served from, while GitHub Insights knows the actual external referrer that landed the visitor on the README. Removing pixel b9379aff-9e5c-4d01-8f2d-9e4ffa09d126 from the README and reusing it on the certctl.io landing page (sibling commit on certctl-io/certctl.io), where Scarf is the only analytics source and the referrer header actually carries useful attribution. Pixel 89db181e-76e0-45cc-b9c0-790c3dfdfc73 stays in the README as a backup signal alongside GitHub Insights — keeps continuity for the longer-running Scarf project counter. No data loss: GitHub Insights covers what 89db181e was double-counting, and b9379aff now serves a distinct surface (certctl.io) where it actually adds new attribution data.	2026-05-01 06:02:23 +00:00
shankar0123	b95a548f65	docs: deploy-hardening I — atomic deploy + post-verify operator guide + connectors / README updates Phase 12 of the deploy-hardening I master bundle. NEW docs/deployment-atomicity.md (12 sections, ~280 lines): 1. Overview — the three procurement-checklist gaps closed 2. The atomic-write primitive (Plan / File / Apply algorithm) 3. Per-connector atomic contract table (all 13 connectors) 4. Post-deploy TLS verification (handshake + SHA-256 + retries) 5. Rollback semantics (3 triggers + escalation path) 6. ValidateOnly dry-run mode (per-connector matrix) 7. File ownership + mode preservation (precedence + per-distro defaults) 8. Per-target deploy mutex (Phase 2) 9. Idempotency via SHA-256 (defends against retry storms) 10. Troubleshooting matrix (one row per failure mode) 11. V3-Pro deferrals (multi-region, pin manifests, SOC 2 export) 12. Per-connector quick reference (paste-able config snippets) UPDATE README.md::Deployment Targets — every connector row now notes the atomic + verify + rollback semantics that landed in deploy-hardening I. Added a closing paragraph linking to the new docs/deployment-atomicity.md. UPDATE docs/features.md — two new env-var rows: - CERTCTL_DEPLOY_BACKUP_RETENTION (default 3, -1 disables) - CERTCTL_K8S_DEPLOY_KUBELET_SYNC_TIMEOUT (default 60s) The G-3 docs-drift CI guard is satisfied: every new CERTCTL_DEPLOY_* env var documented here also appears in source (internal/deploy/types.go for BACKUP_RETENTION, k8ssecret config for KUBELET_SYNC_TIMEOUT). S-1 stale-counts guard: no literal-number current-state counts in the new doc — the per-connector tests are referenced via the file:line pattern (internal/connector/target/<name>/<name>_atomic_test.go) so the operator can grep for the actual count. Phase 13 next: pre-commit verification (full matrix + CI guard reproductions).	2026-04-30 15:30:45 +00:00
shankar0123	db4a9b7e69	docs(README): expand Standards & Revocation table with production hardening II surfaces Surfaces the eight items shipped in the post-2026-04-30 production hardening II bundle on the README's Supported Integrations → Standards & Revocation table so procurement teams comparing checklists see them without diving into docs/. Updates to the existing rows: - DER-encoded X.509 CRL: now also calls out RFC 7232 caching headers (ETag + If-None-Match 304 short-circuit) - Embedded OCSP responder: now also calls out RFC 6960 §4.4.1 nonce echo + the empty/oversized rejection - S/MIME: spelled out the adaptive KeyUsage delta vs TLS default - Certificate export: spelled out the cipher (AES-256-CBC PBE2 SHA-256 KDF) + V2 cert-only design rationale NEW rows: - CRL DistributionPoints auto-injection (RFC 5280 §4.2.1.13) - OCSP pre-signed response cache (with the load-bearing InvalidateOnRevoke wire called out) - Per-endpoint rate limits (OCSP + cert-export) - Cert-export typed audit (with cipher pin) - Prometheus per-area metrics (certctl_ocsp_counter_total) - Disaster-recovery runbook (docs/disaster-recovery.md, the SOC 2 / PCI procurement deliverable) G-3 docs-drift CI guard reproduced clean (every CERTCTL_* env var mention maps back to internal/config/config.go). S-1 stale-counts prose guard clean (no literal-number prose for current-state counts; the rate-limit defaults are config-default values, not source-derived counts that drift).	2026-04-30 06:00:41 +00:00
shankar0123	c98d83f596	fix(README): drop hardcoded source-counts from EST row to satisfy S-1 guard CI's 'Forbidden hardcoded source-count prose regression guard (S-1)' fired on the new EST row in README.md:109. The trip was on the literal '6 MCP tools' phrase — that matches the regex pattern \b[0-9]+\s+MCP tools\b which the S-1 guard rejects per the CLAUDE.md rule 'Numeric claims about current state rot.' Same rule covers the '13 typed audit-action codes' literal earlier on the same line — the regex doesn't catch that one specifically (no 'audit-action codes' alternation in the guard pattern), but the spirit of the rule applies, so I removed it preemptively to avoid the next operator-reads-the-doc-then-edits-the-code-then-the-count-is-wrong drift cycle. Replacements: '13 typed audit-action codes (...)' → 'Typed audit-action codes per failure dimension (... — full set in internal/service/est_audit_actions.go)' 'CLI + 6 MCP tools' → 'CLI + matching MCP tool family (rebuild count via grep -cE '"est_' internal/mcp/tools_est.go)' The rebuild-command form follows the convention CLAUDE.md::Current-state commands established + the existing docs/features.md row 'MCP tools \| rebuild via grep -cE 'gomcp\.AddTool\(' ...' Verified locally with the exact CI guard regex against README.md + docs/ — 'S-1 stale-counts guardrail: clean.' The 'All six RFC 7030 endpoints' phrasing earlier on the same line is NOT a current-state count — six is fixed by RFC 7030 (cacerts + simpleenroll + simplereenroll + csrattrs + serverkeygen + fullcmc), not derived from source. The S-1 regex requires \b[0-9]+ literal digits, so 'six' as a word doesn't match anyway.	2026-04-30 03:12:25 +00:00
shankar0123	6622883989	docs(est): EST RFC 7030 operator guide + WiFi/802.1X recipe + IoT bootstrap recipe + FreeRADIUS integration + architecture + README EST RFC 7030 hardening master bundle Phase 12 — comprehensive operator-facing documentation for the Phases 1-11 backend work that shipped on 2026-04-29. NEW docs/est.md (19 sections, ~810 lines): Concepts (host vs user enrollment, profile-driven policy, multi-profile dispatch); 5-minute single-profile Quick start with curl + openssl recipes; Multi-profile dispatch (CERTCTL_EST_PROFILES=corp,iot,wifi setup with PathID rules enforced at boot); Authentication modes (mTLS / Basic / both / empty with cross-check semantics); RFC 9266 channel binding (failure-mode HTTP mapping table — ErrChannelBindingMissing/Mismatch/NotTLS13 → 400/409/426); WiFi/802.1X recipe with end-to-end FreeRADIUS integration (EAP-TLS supplicant config, mods-available/eap tls-common block, CRL distribution endpoint cross-ref, troubleshooting playbook); IoT bootstrap recipe (factory provisioning, first boot, steady-state renewal, compromise/decommission via bulk-revoke, recommended cert lifetimes per master prompt §7.7); serverkeygen for resource-constrained devices (CMS EnvelopedData wrap, RSA-only at this revision, zeroize discipline, Phase-1 cross-check refusing _SERVERKEYGEN_ENABLED=true with empty _PROFILE_ID); HSM-backed CA signing for EST cross-ref (signer interface seam); Operator GUI tabbed surface tour (/est: Profiles / Recent Activity / Trust Bundle); CLI + 6 MCP tools; Renewal device-driven model (RFC 7030 §4.2.2 mandate, renewal-trigger ratios for laptops/IoT, operator-push via webhook); Troubleshooting matrix (one row per typed audit-action constant in internal/service/est_audit_actions.go); TLS 1.2 reverse-proxy runbook cross-ref (channel-binding caveat explained); Threat model (load-bearing properties: trust-anchor reload fail-safety, per-profile counter isolation, mTLS cross-profile bleed defense, source-IP limiter 
process-locality, server-keygen heap residency, HTTP Basic in-process-only, legacy-anonymous-default back-compat carve-out); V3-Pro deferrals; Appendix A (libest sidecar reproducer + 5 integration test names); Appendix B (Cisco IOS 15.x + 16.x + Apple MDM + OpenWRT + libest <v3.0 wire-format quirks tested in internal/api/handler/cisco_ios_quirks_test.go). UPDATED docs/architecture.md: new "EST Server (RFC 7030) — Production Deployment" section under the existing baseline EST section. Mermaid diagram of multi-profile dispatch + mTLS sibling route + per-profile gate ordering + audit + GUI + SIGHUP-equivalent reload. Existing authentication paragraph updated with forward-ref to the hardening section. Audit paragraph updated to enumerate the 13 typed est_* action codes operators grep on. Trust-anchor reload semantics + libest interop tested in CI both called out. UPDATED README.md::Enrollment Protocols: replaced the one-line EST row with the full production-grade surface description matching the SCEP analog. Cross-references docs/est.md. UPDATED docs/connectors.md::EST/SCEP Integration: extended the EST-or-SCEP shared paragraph to point at the per-profile env-var form for both protocols + linked the new architecture.md section. NEW "Multi-profile EST dispatch + production hardening" subsection mirrors the SCEP equivalent: 9-row env-var table, cross-ref to docs/est.md. G-3 docs-drift CI guard reproduced locally clean — every CERTCTL_EST_* mention in docs maps back to internal/config/config.go, and every defined env var is documented. The `<NAME>` placeholder convention matches the SCEP idiom so the docs grep doesn't extract per-deploy profile names as phantom env vars. No new env vars introduced — this is a pure docs commit.	2026-04-30 02:20:30 +00:00
shankar0123	0be889ff1d	refactor(scep-gui): rebrand SCEP admin surface to per-profile tabbed interface (Profiles + Intune + Recent Activity) Phase 9 follow-up to the SCEP RFC 8894 + Intune master bundle. The Phase 9.4 GUI shipped 'SCEP Intune Monitoring' at /scep/intune, which made the per-profile observability surface look Intune-only — operators running EJBCA + Jamf would never click that nav link expecting per-profile RA cert + mTLS observability. The page is per-profile keyed under the hood; this commit rebrands + restructures so the surface matches what operators actually need. Spec: cowork/scep-gui-restructure-prompt.md. User-visible change: - Nav link renamed: 'SCEP Intune' → 'SCEP Admin'. - Route: /scep is the new canonical path; /scep/intune kept as a backward-compat alias that lands directly on the Intune tab. - Page header: 'SCEP Administration'. - Three tabs: * Profiles (default) — per-profile lean cards with RA cert expiry countdown, mTLS sibling-route status badge, Intune enabled/disabled badge, challenge-password-set indicator. 'View Intune details →' link on Intune-enabled cards deep-links into the Intune tab. * Intune Monitoring — the existing Phase 9.4 deep-dive (per-status counters, trust anchor expiry, recent failures table, reload-trust button + confirmation modal). * Recent Activity — full SCEP audit log filter merging all four action codes (scep_pkcsreq + scep_renewalreq + scep_pkcsreq_intune + scep_renewalreq_intune); chip filters for All / Initial / Renewal / Intune / Static. Backend: * internal/service/scep.go — new SCEPProfileStatsSnapshot type + IntuneSection sub-block + ProfileStats(now) accessor. Adds raCertSubject/raCertNotBefore/raCertNotAfter + mtlsEnabled + mtlsTrustBundlePath fields with SetRACert + SetMTLSConfig setters. 
Existing IntuneStatsSnapshot + IntuneStats(now) preserved UNCHANGED for /admin/scep/intune/stats backward compat (the JSON shape stays byte-stable for external consumers — the aliasing approach the prompt initially suggested doesn't work because the new shape nests Intune while the old one is flat). ChallengePasswordSet is derived from challengePassword != '' (the secret value itself is never surfaced). * internal/api/handler/admin_scep_intune.go — new Profiles handler method on AdminSCEPIntuneHandler with the same M-008 admin gate. AdminSCEPIntuneServiceImpl extended (in place; same map[string]service.SCEPService) to satisfy the new AdminSCEPProfileService interface. Single handler file gets the third method so the M-008 pin entry count stays steady (no new file, no new triplet of admin-gate test files — just three new Profiles tests inside the existing test file). internal/api/router/router.go — one new route 'GET /api/v1/admin/scep/profiles' registered to reg.AdminSCEPIntune.Profiles. HandlerRegistry unchanged. * api/openapi.yaml — new operation 'listSCEPProfiles' documenting the request body / response shape / error mapping. Existing Intune entries unchanged. * cmd/server/main.go — per-profile loop now calls scepService.SetMTLSConfig(profile.MTLSEnabled, profile.MTLSClientCATrustBundlePath) right after SetPathID, and scepService.SetRACert(raCert) right after loadSCEPRAPair returns the leaf cert. Both setters are nil-safe. * internal/api/handler/m008_admin_gate_test.go — extended the existing admin_scep_intune.go entry's justification to mention the third endpoint. No new map entry needed (file already listed). Backend tests (8 new): * TestAdminSCEPProfiles_NonAdmin_Returns403 * TestAdminSCEPProfiles_AdminExplicitFalse_Returns403 * TestAdminSCEPProfiles_AdminPermitted_ForwardsActor — also pins that Intune-enabled profiles emit an 'intune' sub-block while Intune-disabled profiles OMIT it. 
* TestAdminSCEPProfiles_RejectsNonGetMethod * TestAdminSCEPProfiles_PropagatesServiceError * TestAdminSCEPProfilesServiceImpl_NilMapReturnsEmpty * (existing 16 Phase 9 admin tests still pass — backward-compat preserved) Frontend: * web/src/api/types.ts — new SCEPProfileStatsSnapshot + IntuneSection + SCEPProfilesResponse types. Existing IntuneStatsSnapshot et al unchanged. * web/src/api/client.ts — new getAdminSCEPProfiles helper. * web/src/pages/SCEPAdminPage.tsx — full rewrite as the tabbed surface. Reuses the existing ConfirmReloadModal and Intune deep-dive card components verbatim; adds ProfileSummaryCard (lean card for the Profiles tab) and ActivityTab. URL state sync via useSearchParams so deep links survive reloads + browser back/forward. The legacy /scep/intune route alias defaults the activeTab to 'intune' on mount. * web/src/main.tsx — new <Route path='scep' /> + preserved <Route path='scep/intune' /> alias. Both render SCEPAdminPage. * web/src/components/Layout.tsx — nav link rebranded: label 'SCEP Intune' → 'SCEP Admin', to '/scep/intune' → '/scep'. 
Frontend tests (20 — full rebuild): * Admin gate (non-admin sees gated banner + zero admin API calls) * Profiles tab default + Intune tab tabswitch + ?tab=intune deep link + legacy /scep/intune alias all land on Intune * Profiles tab status badges (Intune + mTLS + challenge-set) reflect each profile's flags * RA cert expiry tone bands (good ≥30d / warn 7-30d / bad <7d / EXPIRED) verified across three fixture profiles * 'View Intune details →' only renders for Intune-enabled profiles AND switches tabs on click * Empty-state banner when no profiles configured * Intune tab counters render with the existing Phase 9 deep-dive shape; reload modal Open/Confirm/Cancel/Error paths all pinned * Recent Activity tab merges all four SCEP audit actions across four parallel useQuery calls; filter chips (all/initial/renewal/intune/static) narrow correctly * Error path surfaces ErrorState on the active tab Docs: * docs/scep-intune.md — Operational monitoring section heading expanded to '(SCEP Administration → Intune Monitoring tab)'. Page-surface description rewritten for the tabbed shape; admin-endpoints list extended with the new /admin/scep/profiles entry. * docs/architecture.md — Microsoft Intune Connector trust anchor subsection updated to reference the Intune Monitoring tab inside the SCEP Administration page + lists all three admin endpoints. * docs/legacy-est-scep.md — forward-ref expanded with a parallel sentence for the per-profile observability surface (independent of Intune). * README.md — Enrollment Protocols bullet for Intune updated to 'admin GUI SCEP Administration page at /scep' with the three tabs called out. Verification: * gofmt clean on touched files * go vet ./... 
clean * staticcheck on intune+service+handler+router+cmd-server clean * go test -short across intune+service+handler+router+cmd-server: all green (existing Phase 9 tests + new Profiles tests) * Frontend tsc --noEmit clean * Vitest: 20/20 SCEPAdminPage tests + 3/3 sibling AuditPage tests pass * G-3 docs-drift CI guard reproduced locally: clean (no new env vars; existing CERTCTL_SCEP_ allowlist prefix covers everything) * M-009 hard-zero useMutation guard reproduced locally: clean (the existing reload mutation already used useTrackedMutation from the Phase 9 follow-up commit `28e277a`) * openapi-parity test green (new GET /api/v1/admin/scep/profiles operation documented) * M-008 admin-gate scanner green (existing admin_scep_intune.go entry covers all three handler methods; the test scanner enforces the triplet by file, not by endpoint, and the new Profiles triplet was added to the existing test file) Backward compat preserved: * /api/v1/admin/scep/intune/stats unchanged — same JSON shape, same error codes, same M-008 gate * /api/v1/admin/scep/intune/reload-trust unchanged * /scep/intune route still works (alias to /scep with activeTab=intune) * IntuneStatsSnapshot Go type unchanged * IntuneStats(now) accessor unchanged Refs: cowork/scep-gui-restructure-prompt.md cowork/scep-rfc8894-intune-master-prompt.md::Phase 9 Phase 11.5 (SCEP probe in scanner — opt-in) and Phase 12 (release prep + tag) of the master bundle resume after this.	2026-04-29 17:46:42 +00:00
shankar0123	5d080c86fd	docs(scep-intune): deployment guide + troubleshooting + Microsoft support statement Phase 11 of the SCEP RFC 8894 + Intune master bundle. Phase 11.1 — docs/scep-intune.md (new, ~340 lines): * TL;DR — drop-in NDES replacement framing; what an operator gets over NDES (per-profile endpoints, audit-log forensics, SIGHUP reload, GUI monitoring, per-device rate limit). * Architecture diagram — Intune cloud → Connector → certctl SCEP → issuer connector. Explicit 'certctl replaces NDES, NOT the Connector' framing; nine-gate dispatcher walk (shape pre-check, JWS sig, version dispatch, time bounds, audience pin, CSR binding, replay, per-device rate limit, optional compliance). * Migration playbook (NDES + EJBCA / NDES + ADCS) — 9-step run-book: install alongside, configure per-profile endpoint, extract trust anchor, configure CONNECTOR_CERT_PATH + AUDIENCE, configure issuer connector, migrate one profile, verify enrollment, roll out fleet, decommission NDES. * Intune SCEP profile field mapping table — every Intune admin center field mapped to certctl's behavior (cert type, subject name format, SAN, validity, key storage provider, key usage, EKU, hash algorithm, SCEP server URL). * Trust anchor extraction recipe — step-by-step certlm.msc export of the 'CN=Microsoft Intune Certificate Connector' cert, PEM rename, env-var configuration, HA Connector concatenation, SIGHUP rotation flow. * Troubleshooting matrix — 10 failure modes mapped to root causes and operator actions: signature_invalid (trust anchor stale), claim_mismatch (Intune profile SAN config), expired (clock skew / Connector cert past NotAfter), not_yet_valid (reverse skew), wrong_audience (URL mismatch), replay (retry-window collision), rate_limited (limiter doing its job), unknown_version (Microsoft shipped new format), malformed (proxy mangling body), compliance_failed (V3-Pro hook returned non-compliant). 
* Operational monitoring — admin GUI surface description, expiry badge tone bands (≥30d green / 7-30d amber / <7d red / EXPIRED), per-status counter polling cadence, audit log filter, recommended Prometheus alert thresholds. * Limitations — explicit V3-Pro deferrals: native Microsoft Graph integration, Conditional Access compliance gating, per-tenant trust anchors (MSP scoping), OCSP stapling at SCEP-response time, auto-discovery of Connector signing cert. * Microsoft support statement — three Microsoft Learn URLs (verified live with HTTP 200): Connector overview, SCEP profile setup, Connector install validation. Microsoft documents the Connector as RFC-8894-compliant and supports its use against any RFC 8894 SCEP server. Phase 11.2 — Cross-references: * docs/legacy-est-scep.md — the previous forward-ref pointed at 'the Phase 11 doc this bundle ships'; updated to a richer pointer that lists what scep-intune.md covers (architecture, migration, profile mapping, extraction, troubleshooting, monitoring, limitations, Microsoft support). * README.md — new bullet under Enrollment Protocols table: 'Microsoft Intune SCEP fleet (drop-in NDES replacement)' with the per-profile dispatcher feature list + link to scep-intune.md. Procurement teams scanning the README see the Intune story alongside ChromeOS / Jamf in the same table row. * docs/architecture.md — new 'Microsoft Intune Connector trust anchor (per-profile, opt-in)' subsection in the Security Model section. ASCII diagram showing the dispatcher walk; calls out the SIGHUP reload + admin-gated GUI surface; forward-link to scep-intune.md. Verification: * All linked anchors inside scep-intune.md resolve to existing headings: #limitations, #microsoft-support-statement, #operational-monitoring, #trust-anchor-extraction. * All linked doc paths resolve: legacy-est-scep.md, architecture.md, features.md, tls.md. * All three Microsoft Learn URLs return HTTP 200 (verified via curl). 
* G-3 docs-drift CI guard reproduced locally and clean — the migration playbook uses the <NAME> placeholder convention consistently (matching features.md style) so the docs scanner doesn't extract literal env-var names that aren't in config.go. * Backend tests across intune+handler+service+router still green. Refs: cowork/scep-rfc8894-intune-master-prompt.md::Phase 11 cowork/scep-rfc8894-intune/progress.md	2026-04-29 17:03:56 +00:00
shankar0123	23603f5174	docs(scep): RFC 8894 hardening — README + architecture + connectors SCEP RFC 8894 + Intune master bundle — Phase 6 of 14. Closes Half 1 of the bundle (Phases 0-6). The certctl SCEP server now ships full RFC 8894 wire format (EnvelopedData decrypt + signerInfo POPO verify + CertRep PKIMessage builder), tested against ChromeOS-shape hermetic E2E requests, with multi-profile dispatch and must-staple per-profile policy. Half 2 (Phases 7-12) adds the Microsoft Intune dynamic-challenge layer; Phase 6.5 (mTLS sibling route) is independently shippable as an opt-in enterprise-procurement feature. README.md * Standards & Revocation table SCEP row updated to mention full RFC 8894 wire format (EnvelopedData decryption, signerInfo POPO verification, CertRep PKIMessage builder), PKCSReq + RenewalReq + GetCertInitial messageType dispatch, multi-profile dispatch (/scep/<pathID>), per-profile RA cert + key, MVP fall-through for lightweight clients. * Enrollment protocols paragraph extended with the same scope, plus a link to docs/legacy-est-scep.md for the operator + device-integration guide. docs/architecture.md * SCEP wire format paragraph rewritten to describe the two paths (RFC 8894 first, MVP fall-through), the messageType dispatch table, the EnvelopedData decrypt (constant-time PKCS#7 unpad closing the padding-oracle leg), the SET-OF Attribute re-serialisation quirk per RFC 5652 §5.4, and the CertRep PKIMessage shape (cert chain encrypted to req.SignerCert, NOT the RA cert). * SCEP service interface updated to show the three new WithEnvelope variants alongside the legacy PKCSReq method. Added 'Capabilities advertised', 'Multi-profile dispatch', and 'Must-staple per profile' subsections covering the RFC 7633 extension policy. docs/connectors.md * EST/SCEP Integration section extended with the per-profile issuer-binding env-var form (CERTCTL_SCEP_PROFILE_<NAME>_ISSUER_ID). 
* New SCEP RA cert + key paragraph pointing operators at the legacy-est-scep.md openssl recipe + ChromeOS Admin Console pointer + must-staple per-profile policy. cowork/CLAUDE.md::Active Focus * 2026-04-29 SCEP RFC 8894 + Intune master bundle status updated to 'HALF 1 COMPLETE (Phases 0-5 of 14 SHIPPED)' with the full chain of commit SHAs (`105c307` → `fdd424b` → `a546a1b` → `b540d44` + `7b40361` → `b33b843`). * Unreleased-on-master bullet extended to enumerate the SCEP bundle deliverables alongside the CRL/OCSP work, plus the new SCEP env vars (CERTCTL_SCEP_RA_*_PATH, CERTCTL_SCEP_PROFILES, CERTCTL_SCEP_PROFILE_<NAME>_*). cowork/CLAUDE.md::Architecture Decisions * Added a new bullet for 'SCEP RFC 8894 native implementation (post-2026-04-29)' covering the load-bearing design decisions: EnvelopedData decrypt with constant-time padding strip, the SET-OF re-serialisation quirk, the dispatch-on-messageType pattern, multi-profile dispatch, the MVP fall-through contract, capability advertisement, ChromeOS-shape E2E test, must-staple per-profile. Smoke test against fresh make docker-up SKIPPED in this commit — the sandbox doesn't have Docker available. The full smoke recipe is in the Phase 6.3 prompt; CI runs the full integration suite via the standard docker-compose.test.yml workflow on the next push. Verification (sandbox): * gofmt + go vet + staticcheck clean for all touched paths. * go test -short -count=1 green across api/handler / api/router / service / pkcs7 / connector/issuer/local / domain / cmd/server. * Coverage held: handler 79.0% / service 73.2% / pkcs7 80.5% / config 96.0% / domain 88.6% / router 100%. Phase 6 of 14 in SCEP RFC 8894 + Intune master bundle. Half 1 COMPLETE. Half 2 (Phases 7-12, Microsoft Intune dynamic-challenge layer) ready to begin.	2026-04-29 13:21:50 +00:00
shankar0123	2519da85f0	docs: README + concepts + features reflect CRL/OCSP responder bundle Audit pass against cowork/crl-ocsp-responder-prompt.md found three operator-facing docs still describing the pre-bundle CRL/OCSP surface (GET-only OCSP, CA-key-direct signing, no scheduler-driven cache). Each claim updated below was ground-truthed against repo HEAD before edit. README.md * Standards & Revocation table — CRL row now mentions scheduler-pre-generated cache (CERTCTL_CRL_GENERATION_INTERVAL, crl_cache table); OCSP row mentions GET + POST forms, dedicated responder cert per RFC 6960 §2.6, id-pkix-ocsp-nocheck per §4.2.2.2.1, 7d auto-rotation grace. * Revocation paragraph — corrected the 'Embedded OCSP responder' one-liner to call out the dedicated-responder-cert design (the CA private key is never used directly for OCSP signing, which is the load-bearing security property for the future PKCS#11/HSM driver path) and added the link to the relying-party guide. docs/concepts.md * CRL paragraph — added the scheduler pre-generation + singleflight coalescing detail. Kept the existing 24h validity claim (verified against internal/connector/issuer/local/local.go:956 — 'NextUpdate: now.Add(24 * time.Hour)'). * OCSP paragraph — corrected the description so it covers both GET and POST forms (POST per RFC 6960 §A.1.1 is what production clients use: Firefox, OpenSSL s_client -status, cert-manager, Intune); added the dedicated-responder-cert + nocheck-extension + auto-rotation explanation; cross-link to docs/crl-ocsp.md. docs/features.md * Revocation Infrastructure section — CRL Endpoint, OCSP Responder, new Admin Cache Observability subsection, new GUI Revocation Endpoints Panel subsection. Corrected the previously-wrong 'Signs with the issuing CA key' OCSP claim — the bundle's load-bearing security improvement is exactly that the CA key is NOT used directly. Cross-link to crl-ocsp.md. 
* Local CA env vars table — added all four new CERTCTL_CRL_GENERATION_INTERVAL / CERTCTL_OCSP_RESPONDER_KEY_DIR (with the prod 'MUST set' callout) / _ROTATION_GRACE / _VALIDITY rows. Closes the G-3 'env var defined in Go but never documented' drift that broke CI on commit `fc3c7ad`. * Migrations table — added 000019_crl_cache and 000020_ocsp_responder rows so the table reflects the bundle's persisted surface area; also clarified the table is illustrative + pointed readers at 'ls migrations/*.up.sql' for the full sequence (the table had drifted behind reality at 000010 even before this bundle). docs/architecture.md was already updated in commit `b4334ed` with the same content scope, so no further architecture edits. Verification: * Local G-3 set difference: empty (Go-defined ∖ docs-mentioned for CRL/OCSP env vars). * 24h CRL validity claim verified against local.go:956 NextUpdate. * Migration numbers verified against 'ls migrations/000019* 000020*'. * id-pkix-ocsp-nocheck OID verified against internal/connector/issuer/local/ocsp_responder.go:60.	2026-04-29 03:20:44 +00:00
shankar0123	e720474fb7	Bundle D: Documentation & transparency sweep — 8 findings closed Closes H-009 + L-001 + L-007 + L-008 + L-016 + L-017 + L-018 + M-027 from comprehensive-audit-2026-04-25. H-009 — README JWT verified-already-clean README has zero JWT mentions at audit time. docs/architecture.md correctly documents JWT/OIDC integration via authenticating-gateway pattern (line 905-912). .github/workflows/ci.yml: new step 'Forbidden README JWT advertising regression guard (H-009)' greps README for JWT-as-supported phrasing; passes verbatim (gateway / pre-G-1) but fails build on net-new advertising. L-001 (CWE-295) — InsecureSkipVerify per-site justification Audit count was 8; recon found 13 production sites. docs/tls.md: new 'InsecureSkipVerify justifications' table enumerates each site by file:line with per-site rationale. cmd/agent/verify.go:78, internal/tlsprobe/probe.go:54, internal/service/network_scan.go:460: each previously-bare InsecureSkipVerify: true now carries //nolint:gosec. .github/workflows/ci.yml: new step 'Forbidden bare InsecureSkipVerify regression guard (L-001)' fails build if any net-new ISV lands in non-test .go without nolint:gosec on the same or preceding line. L-007 — README dependency-audit commands README.md: new Dependencies section with go list -m all \| wc -l, go mod why, govulncheck ./.... Honors operating-rules invariant. L-008 — Release-time govulncheck gate .github/workflows/release.yml: new 'Install govulncheck' + 'Run govulncheck (release gate)' steps in the matrix job. Pinned to same install path as ci.yml. Default exit code semantics (fail on called-vuln only, deferred-call advisories tracked on master via L-021) keeps the gate appropriate. 
L-016 — architecture.md drift fixes docs/architecture.md: system-components diagram's '21 tables' annotation removed (current 23; replaced with TEXT-keys descriptor); connector-architecture '9 connectors' prose replaced with grep ref + current 12-issuer list (added Entrust/GlobalSign/EJBCA which were missing); API-design '97 operations / 107 total' replaced with grep commands. Connector subgraphs verified-current at 12/13/6. L-017 — workspace CLAUDE.md verified-already-clean Bundle B's pre-commit-gate refactor already converted current-state numeric claims to grep commands. Phase 0 recon confirmed zero remaining hardcoded counts. L-018 — Defect age table cowork/comprehensive-audit-2026-04-25/defect-age.md (NEW): Tabulates all 9 High findings with first-mentioned commit, closing bundle, days-open. Methodology snippet for re-running. Key finding: 8 of 9 closed within 24h of audit publication. M-027 — OpenAPI parity verified-already-clean Audit's 'router 121 vs OpenAPI 125 — 4-op gap' was wrong methodology. The 4-op 'gap' was exactly the 4 routes registered via r.mux.Handle (auth-exempt allowlist) instead of r.Register. When you count both dispatch shapes the totals match exactly. internal/api/router/openapi_parity_test.go (NEW): TestRouter_OpenAPIParity AST-walks router.go for both Register and mux.Handle calls + walks api/openapi.yaml's path/method nesting + asserts the sets match. Adding a route without updating the spec fails CI permanently. Audit deliverables: audit-report.md: score 38/55 -> 46/55 closed (High 7/9 -> 8/9; Medium 20/27 -> 21/27; Low 8/19 -> 14/19) findings.yaml: 8 status flips open -> closed defect-age.md: new file certctl/CHANGELOG.md: Bundle D section Verification: TestRouter_OpenAPIParity PASS L-001 grep guard self-test (after //nolint:gosec adds) PASS H-009 grep guard self-test PASS go test -count=1 -short on changed packages green	2026-04-27 00:47:15 +00:00
shankar0123	45ba27693b	Update LICENSE metadata	2026-04-26 23:29:59 +00:00
shankar0123	52248be717	v2.0.47: HTTPS Everywhere — TLS-only control plane, agents/CLI/MCP Breaking change release. Plaintext HTTP listener removed. The certctl control plane now terminates TLS 1.3 on :8443 via http.Server.ListenAndServeTLS. No CERTCTL_TLS_ENABLED=false escape hatch. No dual-listener mode. One-step cutover per docs/upgrade-to-tls.md. Server - cmd/server/tls.go: certHolder with SIGHUP hot-reload + atomic cert swap, buildServerTLSConfig (TLS 1.3 min, GetCertificate callback), preflightServerTLS validation - cmd/server/main.go: ListenAndServeTLS in place of ListenAndServe, watchSIGHUP wiring, cert/key path config threading - tls_test.go: 418-line regression coverage of reload, preflight, callback behavior, SAN validation Config - CERTCTL_TLS_CERT_PATH / CERTCTL_TLS_KEY_PATH (required) - Plaintext rejection: agents/CLI/MCP pre-flight-fail on http:// URLs with a pointer to docs/upgrade-to-tls.md Agents, CLI, MCP - All three pre-flight-reject http:// URLs with fail-loud diagnostic - CERTCTL_SERVER_CA_BUNDLE_PATH for private-CA trust - CERTCTL_SERVER_TLS_INSECURE_SKIP_VERIFY for dev-only bypass (loud warning on startup) - install-agent.sh emits both vars as commented template lines docker-compose - certctl-tls-init sidecar generates SAN-valid self-signed cert into deploy/test/certs/ on first boot - All demo-stack curls pin against ca.crt with --cacert Helm chart - Three TLS provisioning modes, exactly one required: - server.tls.existingSecret (operator-supplied) - server.tls.certManager.enabled (cert-manager integration) - server.tls.selfSigned.enabled (eval only — not for production) - server-certificate.yaml template for cert-manager mode - helm install without a TLS source fails at template render with a pointer to docs/tls.md CI - .github/workflows/ci.yml Helm Chart Validation step renders the chart in both existingSecret and cert-manager modes, plus an inverse guard-regression test that asserts helm template MUST refuse to render when no TLS source is 
configured. Previously the single `helm template` invocation hit the certctl.tls.required fail-loud guard and exit-1'd CI. Four invocations now: lint (existingSecret), template (existingSecret), template (cert-manager), template (no args — must fail). Integration tests - deploy/test/integration_test.go stands up the Compose stack over HTTPS, extracts the CA bundle, and exercises every certctl API over https://localhost:8443 - All 34 integration subtests green (per Phase 8 local CI-parity) Documentation - New: docs/tls.md (provisioning patterns, rotation, SIGHUP reload) - New: docs/upgrade-to-tls.md (one-step cutover, no-downgrade warnings, fleet-roll sequencing) - CHANGELOG.md: v2.2.0 "HTTPS Everywhere — The Irony" entry (file heading unchanged; release tag is v2.0.47) - All curls in docs/, examples/, deploy/helm/ guides use https://localhost:8443 --cacert Verification - grep -rn "ListenAndServe[^T]" cmd/ internal/ → 0 hits - grep -rn "\"http://" cmd/ internal/ → 2 benign hits (Caddy admin API default, SSRF doc comment) — zero certctl endpoints - Tasks #197–#206 (Phases 0–8) all closed in the tracker Files: 65 changed, 3489 insertions, 372 deletions (pre-CI-fix).	2026-04-20 03:43:10 +00:00
shankar0123	cb308bb4c7	ci(release): migrate cosign sign-blob to --bundle (cosign v3.0) Cosign v3.0 (shipped by default with sigstore/cosign-installer@cad07c2e, release v3.0.5) removed --output-signature and --output-certificate from the sign-blob subcommand. The replacement is a single --bundle flag that emits a unified Sigstore bundle (.sigstore.json) containing the signature, certificate chain, and Rekor inclusion proof in one file. This change migrates both sign-blob invocations in .github/workflows/ release.yml (per-binary matrix signing and aggregate checksums.txt signing), updates the artefact upload paths, the artefact aggregation case filter, the GitHub Release asset list, and the release-notes body verify-blob example. The README cosign verification snippet and sidecar description are also updated to the --bundle / .sigstore.json shape. No cosign version pinning. No legacy fallback. OCI image signing (cosign sign on image digest) is unchanged — only sign-blob flags changed in v3.0. See M-11 in certctl-audit-report.md. Verification gates: - YAML parse: OK - go vet ./...: exit 0 - go build ./...: exit 0 - grep 'cosign sign-blob' release.yml: 2 (expected: 2) - grep '.sigstore.json' release.yml: 9 (expected: >=5) - grep '.sig/.pem' release.yml non-comment: 0 (expected: 0) - README legacy cosign refs: 0 (expected: 0) - docs/ legacy cosign refs: 0 (expected: 0) Coverage: unchanged (CI workflow edit + README — zero Go code touched).	2026-04-18 09:29:20 +00:00
shankar0123	b1df6dab27	ci(release): add CLI/MCP binaries, checksums, SBOM, Cosign, SLSA provenance (M-3)	2026-04-17 04:04:55 +00:00
shankar0123	e9947dc0fe	docs: redact V3 feature specifics from README (fixes H-7) Problem ------- H-7 (CWE-200 / information disclosure, strategic-policy class): the public README's V3 section enumerated the paid-tier feature set -- "Role-based access control with profile-gating", "Event-driven architecture with real-time operational views", "Advanced search", "compliance scoring", "HSM/TPM integration" -- violating the CLAUDE.md directive "Keep V3+ deliberately vague -- one-liner descriptions only. Don't telegraph the paid feature set." The prior wording also carried factual drift: `compliance scoring` was pulled forward to V2.2 per the V2.2 Roadmap, so pairing it with V3 in the README misrepresented the open-core line. Fix --- Replace the two-sentence enumeration at README.md:322-323 with a single deliberately-vague sentence: Enterprise capabilities for larger deployments are available in the commercial tier. No named features. No SKU enumeration. Matches the policy one-liner shape used in neighboring V1 / V2 / V4+ sections. Net -1 line of prose. Files ----- README.md 1 -, 1 + Wire-format invariants preserved -------------------------------- This is a docs-only change. All protocol surfaces are byte-identical: - RFC 7030 EST handler (internal/api/handler/est.go) -- untouched - RFC 8894 SCEP handler (internal/api/handler/scep.go) -- untouched - Shared internal/pkcs7/ package -- untouched - H-1 revocation composite key (migration 000012) -- untouched - H-2 SCEP challenge-password preflight + PKCSReq guard -- untouched - C-2 AES-256-GCM config encryption contract -- untouched - CRL DER bytes, OCSP response bytes -- untouched Verification ------------ git diff `387fb55` HEAD -- internal/ cmd/ migrations/ api/ deploy/ -> 0 code changes (only README.md modified after H-1) Operational note ---------------- No behavioral change. Product positioning only. 
The V3 feature set itself remains documented in the gitignored roadmap.md / strategy.md, which are the intended sources of truth for the paid tier. Audit report: see /Users/shankar/Desktop/cowork/certctl-audit-report.md	2026-04-16 23:46:37 +00:00
shankar0123	1c7d085f16	docs: move maintenance notice and quick start link above Documentation section Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-16 11:05:47 -04:00
shankar0123	13cd4d98ba	feat(V2.2): bulk revocation — filter-based fleet-wide certificate revocation Add POST /api/v1/certificates/bulk-revoke with filter criteria (profile_id, owner_id, agent_id, issuer_id, team_id, certificate_ids), partial-failure tolerance, and audit trail. Includes MCP tool, CLI command (certs bulk-revoke), server-side bulk modal in GUI replacing client-side sequential loop, OpenAPI spec, compliance mapping updates, and 21 new tests (12 service, 7 handler, 1 CLI, 1 frontend). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-16 00:06:34 -04:00
shankar0123	e1bcde4cf1	feat(M50): cloud secret manager discovery — AWS SM, Azure KV, GCP SM Extend certificate discovery from filesystem + network to cloud secret managers. Three pluggable DiscoverySource connectors feed into the existing discovery pipeline via sentinel agent pattern, with a 9th scheduler loop for periodic cloud scanning. - AWS Secrets Manager: aws-sdk-go-v2, tag/prefix filtering, 10 tests - Azure Key Vault: stdlib HTTP + OAuth2, base64 DER/PEM, 16 tests - GCP Secret Manager: stdlib HTTP + JWT OAuth2, label filter, 14 tests - CloudDiscoveryService orchestrator with 9 tests - 9th scheduler loop (6h default, atomic.Bool idempotency) - Discovery page: color-coded source type badges - 14 new env vars across CloudDiscoveryConfig structs - Docs: connectors.md, architecture.md, features.md, README updated 49 new tests. All CI checks pass (go vet, race, lint, coverage). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-15 23:01:00 -04:00
shankar0123	3f619bcaac	feat(M49): Entrust, GlobalSign & EJBCA issuer connectors Add three new issuer connectors completing commercial and open-source CA coverage. Entrust uses mTLS client certificate auth with sync/async issuance. GlobalSign Atlas uses mTLS + API key/secret dual auth with serial-based tracking. EJBCA supports dual auth (mTLS or OAuth2) for self-hosted Keyfactor CAs. Each connector implements the full issuer.Connector interface (9 methods), includes httptest-based unit tests (~14 each), and follows established patterns (injectable HTTP clients, RFC 5280 revocation reason mapping, CRL/OCSP delegated to CA). Also includes: issuer factory cases, env var seeding, config structs, domain types, seed data (3 rows, all disabled), OpenAPI enum updates, frontend issuer catalog entries with config fields, and full docs (connectors.md, architecture.md, features.md, README). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-15 22:24:12 -04:00
shankar0123	596d86a206	feat(M48): continuous TLS health monitoring — endpoint state machine, shared tlsprobe, 8 API endpoints, GUI Adds continuous TLS endpoint health monitoring that closes the deploy→verify→monitor loop. After M25 verifies a deployment succeeded once, M48 continuously confirms it stays healthy. Key components: - Shared `internal/tlsprobe/` package extracted from network scanner for reuse - Health status state machine: healthy → degraded (2 failures) → down (5 failures), plus cert_mismatch when served fingerprint differs from expected - 8th scheduler loop (60s tick, per-endpoint configurable intervals) - PostgreSQL migration 000011: endpoint_health_checks + endpoint_health_history tables - 8 REST API endpoints (CRUD, history, acknowledge, summary) - Health Monitor GUI page with summary bar, status table, create modal, auto-refresh - 38 new tests (5 tlsprobe + 11 domain + 10 service + 8 handler + 4 frontend) - All coverage thresholds maintained (service 68%, handler 83%, domain 87%, middleware 63%) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-15 21:45:45 -04:00
shankar0123	f2e60b93a3	feat(M11c): crypto policy enforcement — CSR validation, MaxTTL caps, key metadata Enforce certificate profile crypto constraints across all 5 issuance paths (renewal, agent CSR, EST, SCEP). ValidateCSRAgainstProfile() rejects CSRs with key algorithm/size that don't match profile rules. MaxTTL enforcement caps certificate validity per issuer connector (Local CA, Vault, step-ca enforce directly; ACME/DigiCert/Sectigo pass through). Key algorithm and size are now persisted in certificate_versions for audit compliance. 16 new tests (12 service-layer + 4 Local CA connector). Removes hardcoded version number from GUI sidebar. Documentation updated across architecture, features, connectors, and README. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-15 21:05:14 -04:00
shankar0123	f16a9c767a	docs: consolidate README — merge architecture, security, design decisions into Why certctl Fold Architecture, Key Design Decisions, and Security sections into the Why certctl section as bold-header paragraphs. Removes three standalone sections, tightening the README structure: Documentation → Integrations → Why certctl (with architecture, security, design decisions) → What It Does → Quick Start → Examples → CLI → MCP → Development → Roadmap → License. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-15 17:06:43 -04:00
shankar0123	3a27c87b3f	docs: move Supported Integrations under Documentation links in README Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-15 17:03:11 -04:00
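
The M48 entry above (596d86a206) describes the endpoint health state machine only in one line: healthy, then degraded after 2 failures, then down after 5, plus cert_mismatch when the served fingerprint differs from the expected one. A minimal Go sketch of those transitions follows. It is not the repository's code. The type and function names are hypothetical, and it assumes that the failure counts are consecutive and that a successful probe resets the counter, neither of which the commit message states.

```go
package main

// Status is a hypothetical name for the health states listed in the M48 entry.
type Status string

const (
	Healthy      Status = "healthy"
	Degraded     Status = "degraded"
	Down         Status = "down"
	CertMismatch Status = "cert_mismatch"
)

// Thresholds from the commit message: degraded at 2 failures, down at 5.
const (
	degradedAfter = 2
	downAfter     = 5
)

// Endpoint holds the state this sketch needs. Field names are assumptions.
type Endpoint struct {
	ExpectedFingerprint string
	Failures            int // consecutive failed probes (assumed semantics)
	Status              Status
}

// ProbeResult is a hypothetical summary of a single TLS probe.
type ProbeResult struct {
	OK          bool   // the TLS handshake completed
	Fingerprint string // fingerprint of the certificate that was served
}

// NewEndpoint returns an endpoint that starts out healthy.
func NewEndpoint(expectedFingerprint string) *Endpoint {
	return &Endpoint{ExpectedFingerprint: expectedFingerprint, Status: Healthy}
}

// Observe applies one probe result and returns the resulting status.
func (e *Endpoint) Observe(r ProbeResult) Status {
	switch {
	case !r.OK:
		e.Failures++
		switch {
		case e.Failures >= downAfter:
			e.Status = Down
		case e.Failures >= degradedAfter:
			e.Status = Degraded
		}
		// Below the first threshold the previous status is kept.
	case e.ExpectedFingerprint != "" && r.Fingerprint != e.ExpectedFingerprint:
		// Reachable, but serving a different certificate than expected.
		e.Failures = 0
		e.Status = CertMismatch
	default:
		e.Failures = 0
		e.Status = Healthy
	}
	return e.Status
}
```

A scheduler loop would call `Observe` once per probe and persist the returned status. Keeping the transition logic in one pure method makes the thresholds easy to test without opening a network connection.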

171 Commits
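
The v2.0.47 entry (52248be717) names a `certHolder` with SIGHUP hot-reload and an atomic certificate swap, wired into a TLS 1.3 config through a `GetCertificate` callback. The sketch below shows one way such a pattern can be built with the Go standard library. The identifiers `certHolder`, `buildServerTLSConfig`, and `watchSIGHUP` are borrowed from the commit message, but the bodies, fields, and error handling are assumptions and not the contents of cmd/server/tls.go. It needs Go 1.19 or later for `atomic.Pointer` and a Unix-like platform for SIGHUP.

```go
package main

import (
	"crypto/tls"
	"errors"
	"log"
	"os"
	"os/signal"
	"sync/atomic"
	"syscall"
)

// certHolder keeps the currently served certificate behind an atomic pointer
// so that handshakes never observe a half-updated value. Fields are assumed.
type certHolder struct {
	certPath, keyPath string
	current           atomic.Pointer[tls.Certificate]
}

// Reload parses the key pair from disk and swaps it in atomically.
// If parsing fails, the previously loaded certificate stays in place.
func (h *certHolder) Reload() error {
	cert, err := tls.LoadX509KeyPair(h.certPath, h.keyPath)
	if err != nil {
		return err
	}
	h.current.Store(&cert)
	return nil
}

// GetCertificate matches the tls.Config.GetCertificate callback signature.
func (h *certHolder) GetCertificate(*tls.ClientHelloInfo) (*tls.Certificate, error) {
	if c := h.current.Load(); c != nil {
		return c, nil
	}
	return nil, errors.New("no certificate loaded")
}

// buildServerTLSConfig pins the minimum version to TLS 1.3 and defers
// certificate selection to the holder on every handshake.
func buildServerTLSConfig(h *certHolder) *tls.Config {
	return &tls.Config{
		MinVersion:     tls.VersionTLS13,
		GetCertificate: h.GetCertificate,
	}
}

// watchSIGHUP reloads the certificate each time the process receives SIGHUP.
func watchSIGHUP(h *certHolder) {
	ch := make(chan os.Signal, 1)
	signal.Notify(ch, syscall.SIGHUP)
	go func() {
		for range ch {
			if err := h.Reload(); err != nil {
				log.Printf("certificate reload failed, keeping previous: %v", err)
			}
		}
	}()
}
```

With a config built this way, an `http.Server` can be started with `ListenAndServeTLS("", "")`, because the standard library accepts empty file names when `TLSConfig.GetCertificate` is set. Calling `Reload` once before listening gives a place to fail fast on a missing or mismatched key pair.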