Commit Graph

156 Commits

Author SHA1 Message Date
shankar0123 18814854b4 docs(README): surface Bundle 1 RBAC + signal Bundle 2 federation as roadmap
Pre-fix the README said nothing about role-based access control,
the auditor role, the day-0 bootstrap path, or the four-eyes
approval workflow — all shipped in Bundle 1 (commit 9a35c63 +
follow-ons). A prospective adopter landing on the README would
read "API key auth enforced by default" and walk away thinking
certctl had no authz primitive at all. The only OIDC reference
was the cosign-keyless line at the artefact-signing section,
unrelated to authentication.

Three surgical edits:

1. Status block: extend the "production-quality core" enumeration
   with role-based authz, auditor split, day-0 bootstrap, four-eyes
   approval. Add a one-line callout that federated identity (OIDC,
   SAML, WebAuthn, server-side sessions, break-glass, JIT
   elevation) is roadmap-not-shipped — preempts the natural-but-
   wrong assumption that "RBAC means OIDC works".
   The two terms are linked inline:
     - "role-based authz" -> docs/operator/rbac.md (operator how-to:
       role table, permission catalogue, scope semantics, GUI/CLI/
       HTTP/MCP grant flows, day-0 bootstrap).
     - "Federated identity" -> docs/operator/auth-threat-model.md
       #threats-bundle-1-does-not-close (canonical place where
       deferred Bundle-2 work is enumerated).
   Keeps the roadmap promise honest: a skeptic can click through
   to the explicit deferred-work list rather than taking prose at
   face value.

2. "What it does" feature list: insert a new bullet right after the
   approval-workflow bullet covering the 7 default roles, the 33-
   permission canonical catalogue, scope semantics, the auditor
   read-only invariant, the bootstrap path, and the
   privilege-escalation guard. Cross-links to docs/operator/rbac.md,
   the threat model, and the v2.0.x → v2.1.0 migration guide.

3. Security paragraph: replace "API key auth enforced by default
   with SHA-256 hashing and constant-time comparison" with the
   Bundle-1 reality — auth + RBAC + auditor + bootstrap + privilege-
   escalation guard — keeping the rest of the paragraph (CORS,
   SSRF, encryption-at-rest, TLS-1.3, audit trail, CI gates)
   unchanged.

Verified:
  Both link targets exist on disk
    (docs/operator/rbac.md, docs/operator/auth-threat-model.md).
  Threat-model anchor heading "## Threats Bundle 1 does NOT close"
    is intact (line 138).
  All 24 ci-guards pass locally including S-1 (no hardcoded source
    counts re-introduced) and G-3 (no env-var docs drift).

Updates the README to match Bundle 1's actually-shipped surface
and to set honest expectations about Bundle 2 (federated identity)
being the next slice, not yet landed.
2026-05-10 02:21:39 +00:00
shankar0123 531b2a29e7 docs(README): add Status: Early-access disclosure block
Reddit posts and operator-facing copy describe certctl as alpha for
production, but the README's marketing-paragraph framing implied a
more polished maturity. Dual-positioning erodes credibility because
evaluators read both surfaces.

Adds a dedicated "Status: Early-access" blockquote between the
SC-081v3 paragraph and the existing "Actively maintained, shipping
weekly" callout. Calls out the production-quality core (Local CA,
ACME, agent deployment, CRUD, audit) versus the still-maturing
broader surface (intermediate CA hierarchy, ACME/SCEP/EST servers,
network appliances). Encourages lab/dev deployments and welcomes
production deployments with the customer-scale caveat.

The two consecutive blockquotes (Status + Actively maintained) read
as paired signals: the project is early-access AND actively
shipping, which is the honest joint position.
2026-05-06 07:45:55 +00:00
shankar0123 ce7a3a306e docs: fix broken single-file demo invocation in README + qa-prerequisites + ENVIRONMENTS
The README's Quick Start, the qa-prerequisites contributor doc, and the
landing page (separate repo, separate commit) all shipped a copy-paste
command that produces:

  service "certctl-server" has neither an image nor a build context
  specified: invalid compose project

The bug landed silently with commit df4b567 (the U-3 master). Pre-U-3,
docker-compose.demo.yml was self-contained and could be invoked with a
single -f flag. U-3 deliberately reduced it to a 27-line overlay — its
only payload today is `CERTCTL_DEMO_SEED=true` on the certctl-server
service — because the demo seed now applies at boot via
postgres.RunDemoSeed, not via /docker-entrypoint-initdb.d/. The overlay
no longer carries an image: or build: of its own, so it MUST be passed
alongside the base file.

The README/qa-doc/landing-page never picked up the rename of the contract.
Every operator who copy-pasted the Quick Start since U-3 has hit the
"invalid compose project" error and bounced. The operator caught it
running the demo locally today.

This commit fixes the three certctl-repo sites:

  README.md (Quick Start)
    docker compose -f deploy/docker-compose.demo.yml up -d --build
    →
    docker compose -f deploy/docker-compose.yml -f deploy/docker-compose.demo.yml up -d --build

    Plus the "drop the -f flag for clean install" prose now spells out
    the correct fallback (`-f deploy/docker-compose.yml` alone).

  docs/contributor/qa-prerequisites.md (Step 1)
    Same single-file → two-file fix, plus an inline note explaining
    why the override-only file requires the base (so the next person
    who reads it understands the contract instead of re-discovering it).

  deploy/ENVIRONMENTS.md (Demo Overlay → What it adds)
    Replaced the stale "One line: mounts seed_demo.sql into PostgreSQL's
    init directory" claim — that hasn't been true since U-3 — with the
    accurate "One env var: CERTCTL_DEMO_SEED=true; server applies
    seed_demo.sql at boot via postgres.RunDemoSeed" description, plus
    the historical context for why the overlay can't stand alone.

The certctl.io landing page hits the same bug (line 759); fix shipping
in a separate commit in that repo.

Acceptance gate (manual):
  - copy/paste the new README Quick Start command end-to-end against
    a fresh clone — succeeds, dashboard at https://localhost:8443
    shows the seeded demo data within ~30s.
  - clean-install fallback (`docker compose -f deploy/docker-compose.yml
    up -d --build`) starts a working stack with no demo data.
2026-05-05 20:55:26 +00:00
shankar0123 acf4ccd6d8 docs(README): add MCP server bullet to capabilities list
The README's 'What it does' section enumerated 11 capability bullets
(issuers / targets / ACME server / SCEP server / EST server /
hierarchy / approvals / discovery / revocation / alerts) but had
zero mention of the MCP server. The 2026-05-05 CLI/API/MCP ↔ GUI
parity audit confirmed 93 MCP tools shipped today (87 in
internal/mcp/tools.go + 6 in internal/mcp/tools_est.go) covering the
full API surface. That's a real differentiator hidden from anyone
landing on the README.

Adds a 12th bullet positioning the MCP server with concrete example
queries operators can ask their AI client (expiring certs, revoke
with key-compromise reason, agent offline check). Frames the
architectural facts: separate binary at cmd/mcp-server/, stateless
stdio transport, no extra auth surface beyond the existing API key,
no extra attack surface.

Links to docs/reference/mcp.md for setup details.
2026-05-05 19:10:27 +00:00
shankar0123 7888547495 2026-05-05 15:39:08 +00:00
shankar0123 7c134d0575 docs: retire compliance subtree + sweep framework name-drops from prose
Per operator decision the framework-mapping docs are gone. They
were aspirational (no audit, no certification, no validated
mapping); keeping them around was misleading.

Files deleted (1,883 lines):
- docs/compliance/index.md
- docs/compliance/soc2.md
- docs/compliance/pci-dss.md
- docs/compliance/nist-sp-800-57.md

Hyperlinks removed:
- README.md: 'Auditor / compliance' row in the doc table; the
  '(compliance mapping included)' parenthetical in the
  positioning paragraph
- docs/README.md: the '## Compliance' section table; the
  'Auditor / compliance team' reading-order-by-role row

Prose name-drops swept across 24 files:
- README.md: 'FedRAMP boundary CAs / financial-services policy
  CAs' → '4-level boundary CAs / 3-level policy CAs';
  'Compliance-grade for PCI-DSS Level 1, FedRAMP Moderate / High,
  SOC 2 Type II, HIPAA' → cut entirely
- getting-started/{quickstart,concepts,examples,why-certctl,
  advanced-demo}.md: 'compliance' → 'audit' / 'policy';
  'PCI-DSS / SOC 2 / NIST SP 800-57' framework lists cut;
  ''pci': 'true'' tag example → ''environment': 'production''
- migration/cert-manager-coexistence.md: 'compliance rules' →
  'policy rules'
- operator/approval-workflow.md: 'Compliance customers (PCI-DSS
  Level 1, FedRAMP Moderate / High, SOC 2 Type II, HIPAA)' →
  'Operators'; entire 'Compliance control mapping' table
  (PCI-DSS §6.4.5 / NIST SP 800-53 SA-15 / SOC 2 Type II CC6.1
  / HIPAA §164.308(a)(4)) deleted; 'compliance contract' →
  'two-person-integrity contract'; 'compliance auditors' →
  'reviewers'
- operator/legacy-clients-tls-1.2.md: 'PCI-DSS v4.0 Req 4 §2.2.5'
  audit-reference → CWE-326 (kept); 'PCI-DSS Req 4 §2.2.5
  attestation' section retitled to 'TLS posture summary' and
  rewritten without framework framing; 'PCI-DSS, NIST, and
  major browsers will eventually deprecate TLS 1.2' →
  'Major browsers and OS vendors will eventually deprecate
  TLS 1.2'
- operator/database-tls.md: PCI-DSS Req 4 §2.2.5 audit-ref →
  CWE-319 only; 'PCI-DSS scope' → 'sensitive data'; PCI-DSS
  Req 4 v4.0 prose footing → cut
- operator/runbooks/disaster-recovery.md: 'SOC 2 / PCI
  procurement-team deliverable' → 'on-call deliverable';
  'compliance auditors' → 'reviewers'
- reference/connectors/{acme,aws-acm,azure-kv,globalsign,
  local-ca,openssl,ssh,index}.md: 'compliance reporting
  (PCI-DSS §3.6, HIPAA §164.312)' → 'audit reporting';
  'Compliance environments (PCI-DSS Level 1, FedRAMP High,
  HIPAA)' → 'Regulated environments'; 'compliance audits' →
  'audit'; 'FedRAMP boundary CA' pattern names →
  '4-level boundary CA' (technically descriptive)
- reference/protocols/est.md: 'compliance-hook seam' →
  'device-state hook seam'; 'compliance gating' → 'device-state
  gating'; 'est_compliance_failed' → 'est_device_state_failed'
- reference/protocols/scep-intune.md: 'Optional compliance
  check' → 'Optional device-state check'; failure-counter
  'compliance_failed' → 'device_state_failed'; 'Conditional
  Access compliance gating' → 'Conditional Access
  device-state gating'
- reference/intermediate-ca-hierarchy.md: 'FedRAMP boundary-CA
  deployments where the regulator requires...' →
  'Boundary-CA deployments where you want separation of policy
  and issuing authorities'; pattern A retitled '4-level FedRAMP
  boundary CA' → '4-level boundary CA'
- reference/architecture.md: broken Related-docs link to
  compliance.md removed; the rest of that block had stale
  pre-Phase-2 paths (quickstart.md, demo-advanced.md,
  connectors.md, openapi.md, testing-guide.md, test-env.md) —
  retargeted to current locations
- reference/deployment-model.md: 'SOC 2 evidence-report
  generator' → 'Audit-evidence report generator'
- reference/vendor-matrix.md: 'SOC 2 / PCI auditors paste this
  into evidence packs' → 'reviewers paste this into
  vendor-evaluation packs'
- contributor/qa-test-suite.md: 'compliance exist' coverage
  description cut; 'Compliance (PCI / SOC2 / HIPAA-relevant)'
  risk-class label → 'Audit-relevant'

What was kept:
- CWE references (legitimate technical pointers)
- Microsoft API/feature names that happen to use 'compliance'
  literally ('Microsoft Graph compliance API',
  'device-compliance validators' — these are MS product names,
  not framework name-drops)
- 'NIST PQC' on the landing page (Post-Quantum Cryptography is
  the actual NIST standard family, not a compliance framework)

Verified: zero hyperlinks into docs/compliance/ remain. All 24
ci-guards/*.sh pass locally. qa-doc-seed-count.sh clean.
Net diff: 26 files / -1,883 deletions in compliance/ + -32 net
across the prose sweep.

Companion edits in cowork/ (CLAUDE.md doc-tree summary +
WORKSPACE-CHANGELOG.md retirement note) land separately.
2026-05-05 05:26:44 +00:00
shankar0123 2362ca303d docs: Phase 13 — README rewrite to 250-line target
Per Phase 1 audit at cowork/docs-overhaul-phase-1-audit-2026-05-04/.
README went from 457 lines to a target of 250 (operator decision in
Phase 1 conversation). Focus shifts from feature-catalog + landing-page
duplicate to "developer cloning the repo needs orientation + quickstart
+ entry points to docs."

What stayed:
  - Logo + title + badges (~15 lines)
  - Elevator paragraph + 47-day cliff context (3 paragraphs, compressed)
  - Active-maintenance callout
  - Documentation table — restructured from 22 entries linking to flat
    docs/ to ~6 audience-organized rows linking through the new
    docs/README.md navigation index
  - Screenshots grid (4 tiles)
  - "What it does" — compressed from 33 lines of prose to 8 capability
    bullets, each linking to the canonical doc
  - Architecture paragraph — compressed to one paragraph linking to
    docs/reference/architecture.md
  - Quick Start (Docker Compose, Agent install, Helm, container images)
  - Examples table (5 turnkey scenarios)
  - Development commands
  - License paragraph
  - Dependencies block
  - Footer CTA

What got moved out:
  - Cosign verification / SLSA / SBOM section (67 lines) →
    docs/reference/release-verification.md (NEW). README links to it
    in a 3-line "Verifying a release" section.

What got removed entirely:
  - "Why certctl" + "Architecture" + "Security-first" + "Key design
    decisions" prose walls — duplicated landing page + architecture.md +
    security.md content. README no longer wades through 11 dense
    paragraphs.
  - "Supported Integrations" 4 sub-tables (Issuers / Targets / Protocols
    / Standards / Notifiers, ~80 lines of dense per-row marketing
    copy) — content lives at docs/reference/connectors/index.md and
    docs/reference/protocols/. README mentions counts ("12 issuers, 15
    targets, 6 notifiers") with a single link.
  - "Roadmap" section entirely — V1 + V2 history rotted fastest of any
    section; replaced with implicit "see Releases + Issues for active
    work" via the existing footer CTA.
  - "What It Does" 10-subsection wall (33 lines) — replaced with the
    8-bullet capability list, each linking to its canonical doc.
  - CLI section (20 lines of inline command examples) — links to the
    contributor docs.
  - MCP Server section (30 lines of setup) — links to docs/reference/mcp.md.

New surface added:
  - docs/reference/release-verification.md — moved cosign/SLSA/SBOM
    procedure with one expanded "Why this matters" paragraph
    explaining the keyless OIDC trust anchor.

Every docs/ link in the new README verified to resolve to an existing
file. Cross-references from other docs / certctl.io to the deleted
sections (if any) need follow-up Phase 11 sweeps.
2026-05-05 03:26:05 +00:00
shankar0123 a7b36c4f00 docs: Phase 11 (partial) — fix cross-references after Phase 2 moves
Per Phase 1 audit at cowork/docs-overhaul-phase-1-audit-2026-05-04/.
Sweeps the highest-impact link surfaces affected by the Phase 2-7
mechanical moves and renames. Covers README.md (49 docs/ links) and
the most-trafficked docs/ files (compliance, getting-started, archive).

README.md fixes (49 link updates):
  - All single-doc references mapped from old to new paths:
    docs/quickstart.md → docs/getting-started/quickstart.md
    docs/architecture.md → docs/reference/architecture.md
    docs/connectors.md → docs/reference/connectors/index.md
    docs/acme-server.md → docs/reference/protocols/acme-server.md
    docs/{soc2,pci-dss,nist}.md → docs/compliance/{soc2,pci-dss,nist-sp-800-57}.md
    ... (full mapping in the sed pipeline)
  - 3 references to deleted features.md replaced with pointers to
    architecture.md + connectors/index.md.

docs/compliance/index.md (3 sibling renames):
  compliance-soc2.md     → soc2.md
  compliance-pci-dss.md  → pci-dss.md
  compliance-nist.md     → nist-sp-800-57.md

docs/compliance/pci-dss.md (3 external refs need ../):
  architecture.md  → ../reference/architecture.md
  connectors.md    → ../reference/connectors/index.md
  quickstart.md    → ../getting-started/quickstart.md

docs/getting-started/concepts.md (4 external refs):
  crl-ocsp.md      → ../reference/protocols/crl-ocsp.md
  architecture.md  → ../reference/architecture.md
  mcp.md           → ../reference/mcp.md
  openapi.md       → ../reference/api.md

docs/getting-started/quickstart.md (4 external refs + 1 sibling):
  tls.md           → ../operator/tls.md
  upgrade-to-tls.md → ../archive/upgrades/to-tls-v2.2.md
  architecture.md  → ../reference/architecture.md
  demo-advanced.md → advanced-demo.md (sibling rename)

docs/getting-started/examples.md (4 external refs):
  migrate-from-certbot.md         → ../migration/from-certbot.md
  migrate-from-acmesh.md          → ../migration/from-acmesh.md
  certctl-for-cert-manager-users.md → ../migration/cert-manager-coexistence.md
  connectors.md                   → ../reference/connectors/index.md

docs/archive/upgrades/to-tls-v2.2.md (3 external refs need ../../):
  tls.md           → ../../operator/tls.md
  quickstart.md    → ../../getting-started/quickstart.md
  test-env.md      → ../../contributor/test-environment.md

docs/archive/upgrades/to-v2-jwt-removal.md (2 external refs need ../../):
  architecture.md  → ../../reference/architecture.md
  tls.md           → ../../operator/tls.md

Verified all README.md docs/ links resolve to existing files. The only
remaining top-level link is testing-guide.md which still exists at the
top of docs/ (Phase 5 will prune it later).

Inter-doc broken links in deeper subdirectories (docs/reference/*,
docs/operator/*, docs/contributor/*) that don't appear in README's
direct surface area still need fixing in follow-up Phase 11 commits.
This commit handles the operator-facing entry points.
2026-05-05 03:19:21 +00:00
shankar0123 ad440562b7 docs(README): strategic refresh — surface Rank 4/5/7/8 + ACME server + cloud targets
README audit found six classes of drift between the README and the
shipped repo. Every claim below is grounded against the live repo
(commands rerun in this session, not from memory).

Stale numeric claims fixed:
  '111 routes'   → '180+ routes'
                   (live: grep -cE 'r\.Register' router.go = 184)
  '80 tools'     → '85+ tools'
                   (live: grep -cE 'mcp\.AddTool' tools.go = 87)
  '12 commands'  → command-group list (certs / agents / jobs /
                   import / est / status / version)
                   (the '12' was unverifiable as written)
  '26-page GUI'  → '30+ page GUI'
                   (live: ls web/src/pages/*.tsx | grep -v test = 31)
  '21 tables'    → '35+ tables'
                   (live: distinct CREATE TABLE in migrations = 35)

Connectors added to tables (these shipped commits ago without
README mentions):
  Deployment Targets:
    AWS Certificate Manager (AWSACM)   — commit 54033aa, Rank 5
    Azure Key Vault (AzureKeyVault)    — commit 14fcc82, Rank 5

  Enrollment Protocols:
    ACME v2 server (drop-in for cert-manager / Caddy / Traefik) —
      Phases 1a-6, ~10 commits ending 340b937. Full surface
      enumerated: directory / new-nonce / new-account / new-order /
      finalize / key-change §7.3.5 / revoke-cert §7.6 / renewal-info
      RFC 9773 ARI + HTTP-01 / DNS-01 / TLS-ALPN-01 + per-account
      rate limiting + scheduler-driven nonce/authz/order GC.

  Existing rows updated:
    Local CA: now mentions tree-mode N-level hierarchy (Rank 8)
    Vault: now mentions auto-token-renewal at TTL/2 (commit ceca364)
    EJBCA: now mentions mTLS auto-reload via mtlscache (commit 88e7d0c)

Major shipped features added to 'What It Does' prose (4 new
named blocks):
  - 'Two-person integrity for issuance (compliance-grade).'
    — Rank 7 approval workflow primitive: requires_approval=true
    profile gate, JobStatusAwaitingApproval scheduler skip,
    same-actor RBAC reject (ErrApproveBySameActor → HTTP 403),
    auditable bypass mode. Procurement-checklist closer for PCI-DSS
    Level 1 / FedRAMP / SOC 2 / HIPAA.

  - 'Multi-level CA hierarchy management.'
    — Rank 8 first-class CA hierarchy: intermediate_cas table,
    RFC 5280 §3.2 / §4.2.1.9 / §4.2.1.10 service-layer enforcement,
    drain-first retire, FedRAMP / financial-services / internal-PKI
    patterns, byte-equivalence pin for unmigrated deployments.

  - 'Run certctl as your ACME server.'
    — Beyond consuming public ACME CAs, certctl now serves RFC 8555.
    Three client walkthroughs (cert-manager, Caddy, Traefik) cited.

  - 'Cloud-managed targets.'
    — AWS ACM + Azure Key Vault SDK-driven import + atomic rollback.

  - 'Notifications + per-policy multi-channel routing.'
    — Rank 4: AlertChannels matrix + AlertSeverityMap +
    fault-isolating per-channel dispatch + Prometheus counter.

V2 paragraph rewritten:
  Pre-edit: a single 800-word wall-of-text bullet that listed
    everything. Buried Rank 4-8 features in the middle.
  Post-edit: 12 named feature blocks, each one to two sentences.
    Scannable. Cloud targets, ACME server, approval workflow,
    CA hierarchy, multi-channel alerts each get their own
    headline + one-line story + doc link.

Documentation table extended with 5 newly-linked operator runbooks
(all of which existed but were never reachable from the README):
  - docs/acme-server.md
  - docs/approval-workflow.md
  - docs/intermediate-ca-hierarchy.md
  - docs/runbook-cloud-targets.md
  - docs/runbook-expiry-alerts.md

Plus 4 deeper cross-links inside the Enrollment Protocols + 'What
It Does' prose:
  - docs/acme-cert-manager-walkthrough.md
  - docs/acme-caddy-walkthrough.md
  - docs/acme-traefik-walkthrough.md
  - docs/acme-server-threat-model.md

Verified locally:
  All 9 previously-orphaned docs now reachable from README.md.
  No stale numeric claim remains:
    grep -nE '\b(111 routes|80 tools|12 commands|26.page|21 tables)' README.md
    → no matches.
  README size: 426 → 457 lines (+31). Net addition is 4 prose
    blocks + 2 table rows + 5 doc-table rows + 1 V2 paragraph
    rewrite (15 → 12 lines but each line denser).

Strategic framing (CMO hat):
  - ACME server is the cert-manager adoption-funnel headline; gets
    its own table row + dedicated 'What It Does' block.
  - CA hierarchy is the Venafi / EJBCA replacement story for
    FedRAMP / financial-services / internal-PKI procurement;
    explicit market positioning.
  - Approval workflow framed as procurement-checklist closer
    (PCI-DSS L1 / FedRAMP / SOC 2 / HIPAA explicitly named).
  - Cloud-managed targets framed as 'we deploy to your cloud
    secret store' story.

Doc-only commit. No code, no test changes.
2026-05-04 03:58:21 +00:00
shankar0123 3be1c0c973 docs(README): drop V3 Pro + V4 sections — everything ships free under BSL
Strategic pivot. We are NOT building a V3 Pro paid tier or a V4 cloud /
scale tier. Every certctl feature — current and future — ships free under
the same BSL 1.1 source-available license. No gated features, no paid
edition, no enterprise tier. Future revenue path is a managed-service
hosting offering: operator runs the certctl-server control plane as a
hosted service; customers self-install only the certctl-agent in their
infrastructure. The self-hosted code stays free forever; the managed
service sells operational convenience (no PostgreSQL to run, no upgrades,
no backups, no SSO setup). BSL 1.1 was already structured around exactly
this — the license expressly prevents competitors from running their own
commercial certctl-as-a-service against the same source while leaving
self-hosting unrestricted.

Removed the old roadmap sections:
- "### V3: certctl Pro" — Enterprise capabilities for larger deployments
  are available in the commercial tier.
- "### V4+: Cloud & Scale" — Kubernetes cert-manager external issuer,
  cloud infrastructure targets, extended CA support, and platform-scale
  features.

Replaced with a single "Forward-looking work — all free, all self-hostable"
section that names the real engineering tracks (OIDC / SSO / RBAC,
NATS / real-time, search / risk scoring, HSM / TPM / FIPS, deeper Vault
auth, cloud-managed-target deep integrations, adapter hardening, credential
lifecycle expansion) and points at the workspace-level WORKSPACE-ROADMAP.md
for the unshipped backlog. The full feature surface lands in V2 over time
— V3 / V4 are not real version targets, they were positioning artifacts.

Diff: 2 insertions / 5 deletions. README's License section (BSL 1.1
licensing-inquiries footer) is unchanged.
2026-05-04 00:00:23 +00:00
shankar0123 bc6039a79e chore: sweep github.com/shankar0123/certctl URL refs to certctl-io/certctl
Post-transfer cosmetic + release-critical URL refresh after moving the
repo from github.com/shankar0123/certctl to github.com/certctl-io/certctl
(2026-05-03). GitHub HTTP redirects continue to forward old URLs forever,
so existing operators are not broken — but aligns the canonical
references with the new owner so:

- procurement engineers / contributors browsing the docs see the right
  URL on first read
- operators copying the agent install one-liner hit the new path
  directly without going through a redirect
- the Helm chart's default image repository points at the canonical org
  registry path
- the OnboardingWizard rendered to first-run UI users shows the new
  URL in the install snippets and doc anchor links
- the GitHub Actions release workflow pushes container images to
  ghcr.io/certctl-io/certctl-{server,agent} (was: shankar0123)
- the release-notes Markdown body in release.yml — which gets stamped
  into every future release page — references the post-transfer
  cert-identity (cosign keyless signing now uses the certctl-io
  workflow URL) and the post-transfer SLSA provenance source-uri.
  Without this, every cosign verify / slsa-verifier command on a
  v2.1.0+ release would fail because the cert-identity-regexp would
  not match the signing identity GitHub Actions OIDC issues post-
  transfer. Old releases (v2.0.67 and earlier) keep their immutable
  release-notes pointing at the shankar0123 path and remain
  verifiable via their own published instructions.

Customer impact:
- Operators on ghcr.io/shankar0123/certctl-{server,agent}:latest
  silently freeze on whatever tag was current at transfer time. They
  get no errors; they just stop receiving updates. The next release
  notes need a one-line callout (Phase 3.1 of cowork/transfer-
  certctl-to-org.md) telling them to update their image path to
  ghcr.io/certctl-io/certctl-{server,agent}.
- All other URLs (git clone, install one-liner, raw.githubusercontent
  URLs, browser links, GitHub API) continue to resolve via permanent
  HTTP redirects. The sweep is cosmetic for those.

Files swept (30 total):
  .github/workflows/release.yml — IMAGE_NAMESPACE, source-uri,
    cosign cert-identity-regexp, IMAGE= snippet (5 refs total).
  CHANGELOG.md, README.md — anchor links, badges, install one-liner,
    cosign verify snippets in operator-facing sections.
  api/openapi.yaml — info / externalDocs URLs.
  install-agent.sh — GITHUB_REPO const + systemd unit Documentation=
    field.
  deploy/ENVIRONMENTS.md, deploy/helm/{CHART_SUMMARY,INDEX,
    INSTALLATION,README}.md, deploy/helm/certctl/{Chart.yaml,
    README.md,values.yaml}, deploy/helm/examples/values-*.yaml —
    chart docs + image repository defaults across dev / prod-ha
    overrides.
  docs/{certctl-for-cert-manager-users,connector-iis,connectors,
    migrate-from-acmesh,migrate-from-certbot,quickstart,test-env,
    why-certctl}.md — operator-facing doc URLs.
  examples/{acme-nginx,acme-wildcard-dns01,multi-issuer,
    private-ca-traefik,step-ca-haproxy}/docker-compose.yml +
    examples/step-ca-haproxy/step-ca-haproxy.md — example image:
    paths and accompanying narrative.
  web/src/pages/OnboardingWizard.tsx — first-run-UI URL refs (curl
    install one-liners, agent docker image path, doc anchor links).

Files intentionally NOT swept (Choice A from cowork/transfer-certctl-
to-org.md):
  go.mod, go.sum — module declaration stays github.com/shankar0123/
    certctl. Existing imports compile because Go uses the path
    declared in go.mod, not the URL it was fetched from. Internal-
    only project; no external Go consumers; rename will land as a
    mechanical sed when one materializes.
  ~250 *.go files — every import remains github.com/shankar0123/
    certctl/internal/...
  deploy/test/f5-mock-icontrol/go.mod — separate test sub-module;
    same Choice A logic; module path stays.

Files intentionally NOT swept (other reasons):
  README.md lines 244-245 — Scarf-pixel docker-pull commands.
    shankar0123.docker.scarf.sh/... is a Scarf-account hostname
    (per-user, not per-repo) and the pixel keeps tracking pulls
    against the operator's personal Scarf account. Migrating to a
    certctl-io Scarf account is a separate decision (create org
    Scarf account → re-create package → update README).
  deploy/test/f5-mock-icontrol/f5-mock-icontrol — checked-in
    compiled binary with shankar0123/certctl baked into Go build
    info via the sub-module path. Out of scope for a URL sweep;
    will refresh on the next `make test-integration` rebuild.

Verification:
  gofmt: clean (no .go files touched).
  go vet ./...: clean (verified at this SHA in 1.3 of the transfer
    checklist; no .go changes since).
  go build ./...: clean (same).
  go test -short on representative packages: green (same).
  Diff shape: 30 files, 74 insertions / 74 deletions, net-zero size,
    pure URL substitution.
2026-05-03 23:39:50 +00:00
shankar0123 355da058ed chore(README): remove the second Scarf pixel — analytics consolidated to certctl.io
The README has carried two Scarf pixels for some time:
  - 89db181e-76e0-45cc-b9c0-790c3dfdfc73 (kept earlier as 'GitHub
    traffic complement to GitHub Insights')
  - b9379aff-9e5c-4d01-8f2d-9e4ffa09d126 (moved to the certctl.io
    landing page in commit ef7da23)

Re-evaluating: GitHub Insights → Traffic already provides repo
views, uniques, clones, and referring sites with click counts at
higher granularity than a Scarf pixel can extract from the README
(Scarf can only see 'github.com' as the referrer; GitHub Insights
knows the actual external referrer that landed the visitor on the
README). The 89db181e pixel was duplicative-and-worse.

Removing it. All certctl analytics now consolidate to:
  - GitHub Insights → Traffic (built-in, more granular than Scarf
    on the README surface)
  - certctl.io's b9379aff pixel (referrer-attribution for landing-
    page traffic, where Scarf actually adds value)
  - Scarf Docker Gateway via shankar0123.docker.scarf.sh/* (when
    the Helm chart + docker-compose.yml are routed through it —
    follow-up work)

The Docker-pull example block at line 246 stays (it documents
how operators install certctl via the Scarf gateway). Only the
in-README tracking <img> is removed.
2026-05-01 20:59:22 +00:00
shankar0123 ef7da23018 chore(README): remove duplicative Scarf pixel — moved to certctl.io
The README had two Scarf pixels (89db181e and b9379aff). For README
visit tracking, GitHub's built-in Insights → Traffic dashboard
already provides views, uniques, clones, AND referring sites with
click counts (Reddit, HN, Twitter, search, etc.) at higher
granularity than a Scarf pixel can extract — Scarf can only see
'github.com' as the referrer because that's where the README HTML
is served from, while GitHub Insights knows the actual external
referrer that landed the visitor on the README.

Removing pixel b9379aff-9e5c-4d01-8f2d-9e4ffa09d126 from the README
and reusing it on the certctl.io landing page (sibling commit on
certctl-io/certctl.io), where Scarf is the only analytics source
and the referrer header actually carries useful attribution.

Pixel 89db181e-76e0-45cc-b9c0-790c3dfdfc73 stays in the README as
a backup signal alongside GitHub Insights — keeps continuity for
the longer-running Scarf project counter.

No data loss: GitHub Insights covers what 89db181e was double-
counting, and b9379aff now serves a distinct surface (certctl.io)
where it actually adds new attribution data.
2026-05-01 06:02:23 +00:00
claude 2eb608f40f docs: deploy-hardening I — atomic deploy + post-verify operator guide + connectors / README updates
Phase 12 of the deploy-hardening I master bundle.

NEW docs/deployment-atomicity.md (12 sections, ~280 lines):
1. Overview — the three procurement-checklist gaps closed
2. The atomic-write primitive (Plan / File / Apply algorithm)
3. Per-connector atomic contract table (all 13 connectors)
4. Post-deploy TLS verification (handshake + SHA-256 + retries)
5. Rollback semantics (3 triggers + escalation path)
6. ValidateOnly dry-run mode (per-connector matrix)
7. File ownership + mode preservation (precedence + per-distro defaults)
8. Per-target deploy mutex (Phase 2)
9. Idempotency via SHA-256 (defends against retry storms)
10. Troubleshooting matrix (one row per failure mode)
11. V3-Pro deferrals (multi-region, pin manifests, SOC 2 export)
12. Per-connector quick reference (paste-able config snippets)

UPDATE README.md::Deployment Targets — every connector row now
notes the atomic + verify + rollback semantics that landed in
deploy-hardening I. Added a closing paragraph linking to the new
docs/deployment-atomicity.md.

UPDATE docs/features.md — two new env-var rows:
- CERTCTL_DEPLOY_BACKUP_RETENTION (default 3, -1 disables)
- CERTCTL_K8S_DEPLOY_KUBELET_SYNC_TIMEOUT (default 60s)

The G-3 docs-drift CI guard is satisfied: every new
CERTCTL_DEPLOY_* env var documented here also appears in source
(internal/deploy/types.go for BACKUP_RETENTION, k8ssecret config
for KUBELET_SYNC_TIMEOUT).

S-1 stale-counts guard: no literal-number current-state counts in
the new doc — the per-connector tests are referenced via the
file:line pattern (internal/connector/target/<name>/<name>_atomic_test.go)
so the operator can grep for the actual count.

Phase 13 next: pre-commit verification (full matrix + CI guard
reproductions).
2026-04-30 15:30:45 +00:00
shankar0123 e25b4842f5 docs(README): expand Standards & Revocation table with production hardening II surfaces
Surfaces the eight items shipped in the post-2026-04-30 production
hardening II bundle on the README's Supported Integrations →
Standards & Revocation table so procurement teams comparing
checklists see them without diving into docs/.

Updates to the existing rows:
  - DER-encoded X.509 CRL: now also calls out RFC 7232 caching
    headers (ETag + If-None-Match 304 short-circuit)
  - Embedded OCSP responder: now also calls out RFC 6960 §4.4.1
    nonce echo + the empty/oversized rejection
  - S/MIME: spelled out the adaptive KeyUsage delta vs TLS default
  - Certificate export: spelled out the cipher (AES-256-CBC PBE2
    SHA-256 KDF) + V2 cert-only design rationale

NEW rows:
  - CRL DistributionPoints auto-injection (RFC 5280 §4.2.1.13)
  - OCSP pre-signed response cache (with the load-bearing
    InvalidateOnRevoke wire called out)
  - Per-endpoint rate limits (OCSP + cert-export)
  - Cert-export typed audit (with cipher pin)
  - Prometheus per-area metrics (certctl_ocsp_counter_total)
  - Disaster-recovery runbook (docs/disaster-recovery.md, the SOC 2
    / PCI procurement deliverable)

G-3 docs-drift CI guard reproduced clean (every CERTCTL_* env var
mention maps back to internal/config/config.go). S-1 stale-counts
prose guard clean (no literal-number prose for current-state
counts; the rate-limit defaults are config-default values, not
source-derived counts that drift).
2026-04-30 06:00:41 +00:00
shankar0123 2c1097a321 fix(README): drop hardcoded source-counts from EST row to satisfy S-1 guard
CI's 'Forbidden hardcoded source-count prose regression guard (S-1)'
fired on the new EST row in README.md:109. The trip was on the literal
'6 MCP tools' phrase — that matches the regex pattern
\b[0-9]+\s+MCP tools\b which the S-1 guard rejects per the CLAUDE.md
rule 'Numeric claims about current state rot.'

Same rule covers the '13 typed audit-action codes' literal earlier on
the same line — the regex doesn't catch that one specifically (no
'audit-action codes' alternation in the guard pattern), but the spirit
of the rule applies, so I removed it preemptively to avoid the next
operator-reads-the-doc-then-edits-the-code-then-the-count-is-wrong
drift cycle.

Replacements:
  '13 typed audit-action codes (...)' →
    'Typed audit-action codes per failure dimension (... — full set in
     internal/service/est_audit_actions.go)'

  'CLI + 6 MCP tools' →
    'CLI + matching MCP tool family (rebuild count via
     grep -cE '"est_' internal/mcp/tools_est.go)'

The rebuild-command form follows the convention CLAUDE.md::Current-state
commands established + the existing docs/features.md row
'MCP tools | rebuild via grep -cE 'gomcp\.AddTool\(' ...'

Verified locally with the exact CI guard regex against README.md +
docs/ — 'S-1 stale-counts guardrail: clean.'

The 'All six RFC 7030 endpoints' phrasing earlier on the same line
is NOT a current-state count — six is fixed by RFC 7030 (cacerts +
simpleenroll + simplereenroll + csrattrs + serverkeygen + fullcmc),
not derived from source. The S-1 regex requires \b[0-9]+ literal
digits, so 'six' as a word doesn't match anyway.
2026-04-30 03:12:25 +00:00
shankar0123 9a0430bd87 docs(est): EST RFC 7030 operator guide + WiFi/802.1X recipe + IoT bootstrap recipe + FreeRADIUS integration + architecture + README
EST RFC 7030 hardening master bundle Phase 12 — comprehensive operator-
facing documentation for the Phases 1-11 backend work that shipped on
2026-04-29.

NEW docs/est.md (19 sections, ~810 lines): Concepts (host vs user
enrollment, profile-driven policy, multi-profile dispatch); 5-minute
single-profile Quick start with curl + openssl recipes; Multi-profile
dispatch (CERTCTL_EST_PROFILES=corp,iot,wifi setup with PathID rules
enforced at boot); Authentication modes (mTLS / Basic / both / empty
with cross-check semantics); RFC 9266 channel binding (failure-mode
HTTP mapping table — ErrChannelBindingMissing/Mismatch/NotTLS13 →
400/409/426); WiFi/802.1X recipe with end-to-end FreeRADIUS integration
(EAP-TLS supplicant config, mods-available/eap tls-common block, CRL
distribution endpoint cross-ref, troubleshooting playbook); IoT bootstrap
recipe (factory provisioning, first boot, steady-state renewal,
compromise/decommission via bulk-revoke, recommended cert lifetimes
per master prompt §7.7); serverkeygen for resource-constrained devices
(CMS EnvelopedData wrap, RSA-only at this revision, zeroize discipline,
Phase-1 cross-check refusing _SERVERKEYGEN_ENABLED=true with empty
_PROFILE_ID); HSM-backed CA signing for EST cross-ref (signer interface
seam); Operator GUI tabbed surface tour (/est: Profiles / Recent
Activity / Trust Bundle); CLI + 6 MCP tools; Renewal device-driven
model (RFC 7030 §4.2.2 mandate, renewal-trigger ratios for laptops/IoT,
operator-push via webhook); Troubleshooting matrix (one row per typed
audit-action constant in internal/service/est_audit_actions.go);
TLS 1.2 reverse-proxy runbook cross-ref (channel-binding caveat
explained); Threat model (load-bearing properties: trust-anchor reload
fail-safety, per-profile counter isolation, mTLS cross-profile bleed
defense, source-IP limiter process-locality, server-keygen heap
residency, HTTP Basic in-process-only, legacy-anonymous-default
back-compat carve-out); V3-Pro deferrals; Appendix A (libest sidecar
reproducer + 5 integration test names); Appendix B (Cisco IOS 15.x +
16.x + Apple MDM + OpenWRT + libest <v3.0 wire-format quirks tested
in internal/api/handler/cisco_ios_quirks_test.go).

UPDATED docs/architecture.md: new "EST Server (RFC 7030) — Production
Deployment" section under the existing baseline EST section. Mermaid
diagram of multi-profile dispatch + mTLS sibling route + per-profile
gate ordering + audit + GUI + SIGHUP-equivalent reload. Existing
authentication paragraph updated with forward-ref to the hardening
section. Audit paragraph updated to enumerate the 13 typed est_*
action codes operators grep on. Trust-anchor reload semantics +
libest interop tested in CI both called out.

UPDATED README.md::Enrollment Protocols: replaced the one-line EST
row with the full production-grade surface description matching the
SCEP analog. Cross-references docs/est.md.

UPDATED docs/connectors.md::EST/SCEP Integration: extended the
EST-or-SCEP shared paragraph to point at the per-profile env-var
form for both protocols + linked the new architecture.md section.
NEW "Multi-profile EST dispatch + production hardening" subsection
mirrors the SCEP equivalent: 9-row env-var table, cross-ref to
docs/est.md.

G-3 docs-drift CI guard reproduced locally clean — every CERTCTL_EST_*
mention in docs maps back to internal/config/config.go, and every
defined env var is documented. The `<NAME>` placeholder convention
matches the SCEP idiom so the docs grep doesn't extract per-deploy
profile names as phantom env vars. No new env vars introduced —
this is a pure docs commit.
2026-04-30 02:20:30 +00:00
Shankar 5b67ff3944 refactor(scep-gui): rebrand SCEP admin surface to per-profile tabbed interface (Profiles + Intune + Recent Activity)
Phase 9 follow-up to the SCEP RFC 8894 + Intune master bundle. The
Phase 9.4 GUI shipped 'SCEP Intune Monitoring' at /scep/intune, which
made the per-profile observability surface look Intune-only — operators
running EJBCA + Jamf would never click that nav link expecting per-
profile RA cert + mTLS observability. The page is per-profile keyed
under the hood; this commit rebrands + restructures so the surface
matches what operators actually need.

Spec: cowork/scep-gui-restructure-prompt.md.

User-visible change:

  - Nav link renamed: 'SCEP Intune' → 'SCEP Admin'.
  - Route: /scep is the new canonical path; /scep/intune kept as a
    backward-compat alias that lands directly on the Intune tab.
  - Page header: 'SCEP Administration'.
  - Three tabs:
      * Profiles (default) — per-profile lean cards with RA cert
        expiry countdown, mTLS sibling-route status badge, Intune
        enabled/disabled badge, challenge-password-set indicator.
        'View Intune details →' link on Intune-enabled cards
        deep-links into the Intune tab.
      * Intune Monitoring — the existing Phase 9.4 deep-dive
        (per-status counters, trust anchor expiry, recent failures
        table, reload-trust button + confirmation modal).
      * Recent Activity — full SCEP audit log filter merging all
        four action codes (scep_pkcsreq + scep_renewalreq +
        scep_pkcsreq_intune + scep_renewalreq_intune); chip filters
        for All / Initial / Renewal / Intune / Static.

Backend:

  * internal/service/scep.go — new SCEPProfileStatsSnapshot type +
    IntuneSection sub-block + ProfileStats(now) accessor. Adds
    raCertSubject/raCertNotBefore/raCertNotAfter + mtlsEnabled +
    mtlsTrustBundlePath fields with SetRACert + SetMTLSConfig setters.
    Existing IntuneStatsSnapshot + IntuneStats(now) preserved
    UNCHANGED for /admin/scep/intune/stats backward compat (the
    JSON shape stays byte-stable for external consumers — the
    aliasing approach the prompt initially suggested doesn't work
    because the new shape nests Intune while the old one is flat).
    ChallengePasswordSet is derived from challengePassword != ''
    (the secret value itself is never surfaced).

  * internal/api/handler/admin_scep_intune.go — new Profiles handler
    method on AdminSCEPIntuneHandler with the same M-008 admin gate.
    AdminSCEPIntuneServiceImpl extended (in place; same
    map[string]*service.SCEPService) to satisfy the new
    AdminSCEPProfileService interface. Single handler file gets the
    third method so the M-008 pin entry count stays steady (no new
    file, no new triplet of admin-gate test files — just three new
    Profiles tests inside the existing test file).

  * internal/api/router/router.go — one new route
    'GET /api/v1/admin/scep/profiles' registered to
    reg.AdminSCEPIntune.Profiles. HandlerRegistry unchanged.

  * api/openapi.yaml — new operation 'listSCEPProfiles' documenting
    the request body / response shape / error mapping. Existing
    Intune entries unchanged.

  * cmd/server/main.go — per-profile loop now calls
    scepService.SetMTLSConfig(profile.MTLSEnabled,
    profile.MTLSClientCATrustBundlePath) right after SetPathID, and
    scepService.SetRACert(raCert) right after loadSCEPRAPair returns
    the leaf cert. Both setters are nil-safe.

  * internal/api/handler/m008_admin_gate_test.go — extended the
    existing admin_scep_intune.go entry's justification to mention
    the third endpoint. No new map entry needed (file already
    listed).

Backend tests (8 new):

  * TestAdminSCEPProfiles_NonAdmin_Returns403
  * TestAdminSCEPProfiles_AdminExplicitFalse_Returns403
  * TestAdminSCEPProfiles_AdminPermitted_ForwardsActor — also pins
    that Intune-enabled profiles emit an 'intune' sub-block while
    Intune-disabled profiles OMIT it.
  * TestAdminSCEPProfiles_RejectsNonGetMethod
  * TestAdminSCEPProfiles_PropagatesServiceError
  * TestAdminSCEPProfilesServiceImpl_NilMapReturnsEmpty
  * (existing 16 Phase 9 admin tests still pass — backward-compat
    preserved)

Frontend:

  * web/src/api/types.ts — new SCEPProfileStatsSnapshot +
    IntuneSection + SCEPProfilesResponse types. Existing
    IntuneStatsSnapshot et al unchanged.
  * web/src/api/client.ts — new getAdminSCEPProfiles helper.
  * web/src/pages/SCEPAdminPage.tsx — full rewrite as the tabbed
    surface. Reuses the existing ConfirmReloadModal and Intune
    deep-dive card components verbatim; adds ProfileSummaryCard
    (lean card for the Profiles tab) and ActivityTab. URL state
    sync via useSearchParams so deep links survive reloads + browser
    back/forward. The legacy /scep/intune route alias defaults the
    activeTab to 'intune' on mount.
  * web/src/main.tsx — new <Route path='scep' /> + preserved
    <Route path='scep/intune' /> alias. Both render SCEPAdminPage.
  * web/src/components/Layout.tsx — nav link rebranded:
    label 'SCEP Intune' → 'SCEP Admin', to '/scep/intune' → '/scep'.

Frontend tests (20 — full rebuild):

  * Admin gate (non-admin sees gated banner + zero admin API calls)
  * Profiles tab default + Intune tab tabswitch + ?tab=intune deep
    link + legacy /scep/intune alias all land on Intune
  * Profiles tab status badges (Intune + mTLS + challenge-set)
    reflect each profile's flags
  * RA cert expiry tone bands (good ≥30d / warn 7-30d / bad <7d /
    EXPIRED) verified across three fixture profiles
  * 'View Intune details →' only renders for Intune-enabled
    profiles AND switches tabs on click
  * Empty-state banner when no profiles configured
  * Intune tab counters render with the existing Phase 9 deep-dive
    shape; reload modal Open/Confirm/Cancel/Error paths all pinned
  * Recent Activity tab merges all four SCEP audit actions across
    four parallel useQuery calls; filter chips
    (all/initial/renewal/intune/static) narrow correctly
  * Error path surfaces ErrorState on the active tab

Docs:

  * docs/scep-intune.md — Operational monitoring section heading
    expanded to '(SCEP Administration → Intune Monitoring tab)'.
    Page-surface description rewritten for the tabbed shape;
    admin-endpoints list extended with the new /admin/scep/profiles
    entry.
  * docs/architecture.md — Microsoft Intune Connector trust anchor
    subsection updated to reference the Intune Monitoring tab inside
    the SCEP Administration page + lists all three admin endpoints.
  * docs/legacy-est-scep.md — forward-ref expanded with a parallel
    sentence for the per-profile observability surface (independent
    of Intune).
  * README.md — Enrollment Protocols bullet for Intune updated to
    'admin GUI SCEP Administration page at /scep' with the three
    tabs called out.

Verification:
  * gofmt clean on touched files
  * go vet ./... clean
  * staticcheck on intune+service+handler+router+cmd-server clean
  * go test -short across intune+service+handler+router+cmd-server:
    all green (existing Phase 9 tests + new Profiles tests)
  * Frontend tsc --noEmit clean
  * Vitest: 20/20 SCEPAdminPage tests + 3/3 sibling AuditPage tests
    pass
  * G-3 docs-drift CI guard reproduced locally: clean (no new env
    vars; existing CERTCTL_SCEP_ allowlist prefix covers everything)
  * M-009 hard-zero useMutation guard reproduced locally: clean
    (the existing reload mutation already used useTrackedMutation
    from the Phase 9 follow-up commit 96e81b6)
  * openapi-parity test green (new GET /api/v1/admin/scep/profiles
    operation documented)
  * M-008 admin-gate scanner green (existing admin_scep_intune.go
    entry covers all three handler methods; the test scanner
    enforces the triplet by file, not by endpoint, and the new
    Profiles triplet was added to the existing test file)

Backward compat preserved:
  * /api/v1/admin/scep/intune/stats unchanged — same JSON shape,
    same error codes, same M-008 gate
  * /api/v1/admin/scep/intune/reload-trust unchanged
  * /scep/intune route still works (alias to /scep with activeTab=intune)
  * IntuneStatsSnapshot Go type unchanged
  * IntuneStats(now) accessor unchanged

Refs: cowork/scep-gui-restructure-prompt.md
      cowork/scep-rfc8894-intune-master-prompt.md::Phase 9
      Phase 11.5 (SCEP probe in scanner — opt-in) and Phase 12
      (release prep + tag) of the master bundle resume after this.
2026-04-29 17:46:42 +00:00
Shankar 33c79482db docs(scep-intune): deployment guide + troubleshooting + Microsoft support statement
Phase 11 of the SCEP RFC 8894 + Intune master bundle.

Phase 11.1 — docs/scep-intune.md (new, ~340 lines):

  * TL;DR — drop-in NDES replacement framing; what an operator gets
    over NDES (per-profile endpoints, audit-log forensics, SIGHUP
    reload, GUI monitoring, per-device rate limit).
  * Architecture diagram — Intune cloud → Connector → certctl SCEP
    → issuer connector. Explicit 'certctl replaces NDES, NOT the
    Connector' framing; nine-gate dispatcher walk (shape pre-check,
    JWS sig, version dispatch, time bounds, audience pin, CSR binding,
    replay, per-device rate limit, optional compliance).
  * Migration playbook (NDES + EJBCA / NDES + ADCS) — 9-step run-book:
    install alongside, configure per-profile endpoint, extract trust
    anchor, configure CONNECTOR_CERT_PATH + AUDIENCE, configure
    issuer connector, migrate one profile, verify enrollment, roll
    out fleet, decommission NDES.
  * Intune SCEP profile field mapping table — every Intune admin
    center field mapped to certctl's behavior (cert type, subject
    name format, SAN, validity, key storage provider, key usage,
    EKU, hash algorithm, SCEP server URL).
  * Trust anchor extraction recipe — step-by-step certlm.msc export
    of the 'CN=Microsoft Intune Certificate Connector' cert, PEM
    rename, env-var configuration, HA Connector concatenation, SIGHUP
    rotation flow.
  * Troubleshooting matrix — 10 failure modes mapped to root causes
    and operator actions: signature_invalid (trust anchor stale),
    claim_mismatch (Intune profile SAN config), expired (clock skew /
    Connector cert past NotAfter), not_yet_valid (reverse skew),
    wrong_audience (URL mismatch), replay (retry-window collision),
    rate_limited (limiter doing its job), unknown_version (Microsoft
    shipped new format), malformed (proxy mangling body),
    compliance_failed (V3-Pro hook returned non-compliant).
  * Operational monitoring — admin GUI surface description, expiry
    badge tone bands (≥30d green / 7-30d amber / <7d red / EXPIRED),
    per-status counter polling cadence, audit log filter, recommended
    Prometheus alert thresholds.
  * Limitations — explicit V3-Pro deferrals: native Microsoft Graph
    integration, Conditional Access compliance gating, per-tenant
    trust anchors (MSP scoping), OCSP stapling at SCEP-response time,
    auto-discovery of Connector signing cert.
  * Microsoft support statement — three Microsoft Learn URLs (verified
    live with HTTP 200): Connector overview, SCEP profile setup,
    Connector install validation. Microsoft documents the Connector
    as RFC-8894-compliant and supports its use against any RFC 8894
    SCEP server.

Phase 11.2 — Cross-references:

  * docs/legacy-est-scep.md — the previous forward-ref pointed at
    'the Phase 11 doc this bundle ships'; updated to a richer pointer
    that lists what scep-intune.md covers (architecture, migration,
    profile mapping, extraction, troubleshooting, monitoring,
    limitations, Microsoft support).
  * README.md — new bullet under Enrollment Protocols table:
    'Microsoft Intune SCEP fleet (drop-in NDES replacement)' with
    the per-profile dispatcher feature list + link to scep-intune.md.
    Procurement teams scanning the README see the Intune story
    alongside ChromeOS / Jamf in the same table row.
  * docs/architecture.md — new 'Microsoft Intune Connector trust
    anchor (per-profile, opt-in)' subsection in the Security Model
    section. ASCII diagram showing the dispatcher walk; calls out
    the SIGHUP reload + admin-gated GUI surface; forward-link to
    scep-intune.md.

Verification:
  * All linked anchors inside scep-intune.md resolve to existing
    headings: #limitations, #microsoft-support-statement,
    #operational-monitoring, #trust-anchor-extraction.
  * All linked doc paths resolve: legacy-est-scep.md, architecture.md,
    features.md, tls.md.
  * All three Microsoft Learn URLs return HTTP 200 (verified via curl).
  * G-3 docs-drift CI guard reproduced locally and clean — the
    migration playbook uses the <NAME> placeholder convention
    consistently (matching features.md style) so the docs scanner
    doesn't extract literal env-var names that aren't in config.go.
  * Backend tests across intune+handler+service+router still green.

Refs: cowork/scep-rfc8894-intune-master-prompt.md::Phase 11
      cowork/scep-rfc8894-intune/progress.md
2026-04-29 17:03:56 +00:00
certctl-copilot 929bc7a899 docs(scep): RFC 8894 hardening — README + architecture + connectors
SCEP RFC 8894 + Intune master bundle — Phase 6 of 14.

Closes Half 1 of the bundle (Phases 0-6). The certctl SCEP server now
ships full RFC 8894 wire format (EnvelopedData decrypt + signerInfo POPO
verify + CertRep PKIMessage builder), tested against ChromeOS-shape
hermetic E2E requests, with multi-profile dispatch and must-staple
per-profile policy. Half 2 (Phases 7-12) adds the Microsoft Intune
dynamic-challenge layer; Phase 6.5 (mTLS sibling route) is independently
shippable as an opt-in enterprise-procurement feature.

README.md
  * Standards & Revocation table SCEP row updated to mention full RFC
    8894 wire format (EnvelopedData decryption, signerInfo POPO
    verification, CertRep PKIMessage builder), PKCSReq + RenewalReq +
    GetCertInitial messageType dispatch, multi-profile dispatch
    (/scep/<pathID>), per-profile RA cert + key, MVP fall-through for
    lightweight clients.
  * Enrollment protocols paragraph extended with the same scope, plus
    a link to docs/legacy-est-scep.md for the operator + device-
    integration guide.

docs/architecture.md
  * SCEP wire format paragraph rewritten to describe the two paths
    (RFC 8894 first, MVP fall-through), the messageType dispatch
    table, the EnvelopedData decrypt (constant-time PKCS#7 unpad
    closing the padding-oracle leg), the SET-OF Attribute
    re-serialisation quirk per RFC 5652 §5.4, and the CertRep
    PKIMessage shape (cert chain encrypted to req.SignerCert, NOT
    the RA cert).
  * SCEP service interface updated to show the three new
    *WithEnvelope variants alongside the legacy PKCSReq method.
  * Added 'Capabilities advertised', 'Multi-profile dispatch', and
    'Must-staple per profile' subsections covering the RFC 7633
    extension policy.

docs/connectors.md
  * EST/SCEP Integration section extended with the per-profile
    issuer-binding env-var form (CERTCTL_SCEP_PROFILE_<NAME>_ISSUER_ID).
  * New SCEP RA cert + key paragraph pointing operators at the
    legacy-est-scep.md openssl recipe + ChromeOS Admin Console
    pointer + must-staple per-profile policy.

cowork/CLAUDE.md::Active Focus
  * 2026-04-29 SCEP RFC 8894 + Intune master bundle status updated
    to 'HALF 1 COMPLETE (Phases 0-5 of 14 SHIPPED)' with the full
    chain of commit SHAs (770ddd56d30493f5a20a6df0a4dd +
    302314435fcfa7).
  * Unreleased-on-master bullet extended to enumerate the SCEP
    bundle deliverables alongside the CRL/OCSP work, plus the new
    SCEP env vars (CERTCTL_SCEP_RA_*_PATH, CERTCTL_SCEP_PROFILES,
    CERTCTL_SCEP_PROFILE_<NAME>_*).

cowork/CLAUDE.md::Architecture Decisions
  * Added a new bullet for 'SCEP RFC 8894 native implementation
    (post-2026-04-29)' covering the load-bearing design decisions:
    EnvelopedData decrypt with constant-time padding strip, the
    SET-OF re-serialisation quirk, the dispatch-on-messageType
    pattern, multi-profile dispatch, the MVP fall-through contract,
    capability advertisement, ChromeOS-shape E2E test, must-staple
    per-profile.

Smoke test against fresh make docker-up SKIPPED in this commit — the
sandbox doesn't have Docker available. The full smoke recipe is in
the Phase 6.3 prompt; CI runs the full integration suite via the
standard docker-compose.test.yml workflow on the next push.

Verification (sandbox):
  * gofmt + go vet + staticcheck clean for all touched paths.
  * go test -short -count=1 green across api/handler / api/router /
    service / pkcs7 / connector/issuer/local / domain / cmd/server.
  * Coverage held: handler 79.0% / service 73.2% / pkcs7 80.5% /
    config 96.0% / domain 88.6% / router 100%.

Phase 6 of 14 in SCEP RFC 8894 + Intune master bundle.
Half 1 COMPLETE. Half 2 (Phases 7-12, Microsoft Intune dynamic-
challenge layer) ready to begin.
2026-04-29 13:21:50 +00:00
certctl-copilot 63d6ed256d docs: README + concepts + features reflect CRL/OCSP responder bundle
Audit pass against cowork/crl-ocsp-responder-prompt.md found three
operator-facing docs still describing the pre-bundle CRL/OCSP surface
(GET-only OCSP, CA-key-direct signing, no scheduler-driven cache). Each
claim updated below was ground-truthed against repo HEAD before edit.

README.md
  * Standards & Revocation table — CRL row now mentions
    scheduler-pre-generated cache (CERTCTL_CRL_GENERATION_INTERVAL,
    crl_cache table); OCSP row mentions GET + POST forms, dedicated
    responder cert per RFC 6960 §2.6, id-pkix-ocsp-nocheck per
    §4.2.2.2.1, 7d auto-rotation grace.
  * Revocation paragraph — corrected the 'Embedded OCSP responder'
    one-liner to call out the dedicated-responder-cert design (the CA
    private key is never used directly for OCSP signing, which is the
    load-bearing security property for the future PKCS#11/HSM driver
    path) and added the link to the relying-party guide.

docs/concepts.md
  * CRL paragraph — added the scheduler pre-generation + singleflight
    coalescing detail. Kept the existing 24h validity claim (verified
    against internal/connector/issuer/local/local.go:956 — 'NextUpdate:
    now.Add(24 * time.Hour)').
  * OCSP paragraph — corrected the description so it covers both GET
    and POST forms (POST per RFC 6960 §A.1.1 is what production
    clients use: Firefox, OpenSSL s_client -status, cert-manager,
    Intune); added the dedicated-responder-cert + nocheck-extension +
    auto-rotation explanation; cross-link to docs/crl-ocsp.md.

docs/features.md
  * Revocation Infrastructure section — CRL Endpoint, OCSP Responder,
    new Admin Cache Observability subsection, new GUI Revocation
    Endpoints Panel subsection. Corrected the previously-wrong 'Signs
    with the issuing CA key' OCSP claim — the bundle's load-bearing
    security improvement is exactly that the CA key is NOT used
    directly. Cross-link to crl-ocsp.md.
  * Local CA env vars table — added all four new
    CERTCTL_CRL_GENERATION_INTERVAL / CERTCTL_OCSP_RESPONDER_KEY_DIR
    (with the prod 'MUST set' callout) / _ROTATION_GRACE / _VALIDITY
    rows. Closes the G-3 'env var defined in Go but never documented'
    drift that broke CI on commit 0ffd9ea.
  * Migrations table — added 000019_crl_cache and 000020_ocsp_responder
    rows so the table reflects the bundle's persisted surface area;
    also clarified the table is illustrative + pointed readers at
    'ls migrations/*.up.sql' for the full sequence (the table had
    drifted behind reality at 000010 even before this bundle).

docs/architecture.md was already updated in commit 0a548aa with the
same content scope, so no further architecture edits.

Verification:
  * Local G-3 set difference: empty (Go-defined ∖ docs-mentioned for
    CRL/OCSP env vars).
  * 24h CRL validity claim verified against local.go:956 NextUpdate.
  * Migration numbers verified against 'ls migrations/000019* 000020*'.
  * id-pkix-ocsp-nocheck OID verified against
    internal/connector/issuer/local/ocsp_responder.go:60.
2026-04-29 03:20:44 +00:00
Shankar 7990b6fab7 Bundle D: Documentation & transparency sweep — 8 findings closed
Closes H-009 + L-001 + L-007 + L-008 + L-016 + L-017 + L-018 + M-027
from comprehensive-audit-2026-04-25.

H-009 — README JWT verified-already-clean
  README has zero JWT mentions at audit time. docs/architecture.md
  correctly documents JWT/OIDC integration via authenticating-gateway
  pattern (line 905-912).
  .github/workflows/ci.yml: new step
    'Forbidden README JWT advertising regression guard (H-009)'
    greps README for JWT-as-supported phrasing; passes verbatim
    (gateway / pre-G-1) but fails build on net-new advertising.

L-001 (CWE-295) — InsecureSkipVerify per-site justification
  Audit count was 8; recon found 13 production sites.
  docs/tls.md: new 'InsecureSkipVerify justifications' table
    enumerates each site by file:line with per-site rationale.
  cmd/agent/verify.go:78, internal/tlsprobe/probe.go:54,
  internal/service/network_scan.go:460: each previously-bare
    InsecureSkipVerify: true now carries //nolint:gosec.
  .github/workflows/ci.yml: new step
    'Forbidden bare InsecureSkipVerify regression guard (L-001)'
    fails build if any net-new ISV lands in non-test .go without
    nolint:gosec on the same or preceding line.

L-007 — README dependency-audit commands
  README.md: new Dependencies section with go list -m all | wc -l,
    go mod why, govulncheck ./.... Honors operating-rules invariant.

L-008 — Release-time govulncheck gate
  .github/workflows/release.yml: new 'Install govulncheck' +
    'Run govulncheck (release gate)' steps in the matrix job.
    Pinned to same install path as ci.yml. Default exit code
    semantics (fail on called-vuln only, deferred-call advisories
    tracked on master via L-021) keeps the gate appropriate.

L-016 — architecture.md drift fixes
  docs/architecture.md: system-components diagram's '21 tables'
    annotation removed (current 23; replaced with TEXT-keys
    descriptor); connector-architecture '9 connectors' prose
    replaced with grep ref + current 12-issuer list (added
    Entrust/GlobalSign/EJBCA which were missing); API-design
    '97 operations / 107 total' replaced with grep commands.
  Connector subgraphs verified-current at 12/13/6.

L-017 — workspace CLAUDE.md verified-already-clean
  Bundle B's pre-commit-gate refactor already converted current-
  state numeric claims to grep commands. Phase 0 recon confirmed
  zero remaining hardcoded counts.

L-018 — Defect age table
  cowork/comprehensive-audit-2026-04-25/defect-age.md (NEW):
    Tabulates all 9 High findings with first-mentioned commit,
    closing bundle, days-open. Methodology snippet for re-running.
    Key finding: 8 of 9 closed within 24h of audit publication.

M-027 — OpenAPI parity verified-already-clean
  Audit's 'router 121 vs OpenAPI 125 — 4-op gap' was wrong
  methodology. The 4-op 'gap' was exactly the 4 routes registered
  via r.mux.Handle (auth-exempt allowlist) instead of r.Register.
  When you count both dispatch shapes the totals match exactly.
  internal/api/router/openapi_parity_test.go (NEW):
    TestRouter_OpenAPIParity AST-walks router.go for both
    Register and mux.Handle calls + walks api/openapi.yaml's
    path/method nesting + asserts the sets match. Adding a route
    without updating the spec fails CI permanently.

Audit deliverables:
  audit-report.md: score 38/55 -> 46/55 closed
    (High 7/9 -> 8/9; Medium 20/27 -> 21/27; Low 8/19 -> 14/19)
  findings.yaml: 8 status flips open -> closed
  defect-age.md: new file
  certctl/CHANGELOG.md: Bundle D section

Verification:
  TestRouter_OpenAPIParity                                   PASS
  L-001 grep guard self-test (after //nolint:gosec adds)     PASS
  H-009 grep guard self-test                                 PASS
  go test -count=1 -short on changed packages                green
2026-04-27 00:47:15 +00:00
Shankar 4f3c8136e5 Update LICENSE metadata 2026-04-26 23:29:59 +00:00
Shankar 3155b9475f v2.0.47: HTTPS Everywhere — TLS-only control plane, agents/CLI/MCP
Breaking change release. Plaintext HTTP listener removed. The certctl
control plane now terminates TLS 1.3 on :8443 via
http.Server.ListenAndServeTLS. No CERTCTL_TLS_ENABLED=false escape
hatch. No dual-listener mode. One-step cutover per docs/upgrade-to-tls.md.

Server
- cmd/server/tls.go: certHolder with SIGHUP hot-reload + atomic cert
  swap, buildServerTLSConfig (TLS 1.3 min, GetCertificate callback),
  preflightServerTLS validation
- cmd/server/main.go: ListenAndServeTLS in place of ListenAndServe,
  watchSIGHUP wiring, cert/key path config threading
- tls_test.go: 418-line regression coverage of reload, preflight,
  callback behavior, SAN validation

Config
- CERTCTL_TLS_CERT_PATH / CERTCTL_TLS_KEY_PATH (required)
- Plaintext rejection: agents/CLI/MCP pre-flight-fail on http://
  URLs with a pointer to docs/upgrade-to-tls.md

Agents, CLI, MCP
- All three pre-flight-reject http:// URLs with fail-loud diagnostic
- CERTCTL_SERVER_CA_BUNDLE_PATH for private-CA trust
- CERTCTL_SERVER_TLS_INSECURE_SKIP_VERIFY for dev-only bypass
  (loud warning on startup)
- install-agent.sh emits both vars as commented template lines

docker-compose
- certctl-tls-init sidecar generates SAN-valid self-signed cert into
  deploy/test/certs/ on first boot
- All demo-stack curls pin against ca.crt with --cacert

Helm chart
- Three TLS provisioning modes, exactly one required:
  - server.tls.existingSecret (operator-supplied)
  - server.tls.certManager.enabled (cert-manager integration)
  - server.tls.selfSigned.enabled (eval only — not for production)
- server-certificate.yaml template for cert-manager mode
- helm install without a TLS source fails at template render with
  a pointer to docs/tls.md

CI
- .github/workflows/ci.yml Helm Chart Validation step renders the
  chart in both existingSecret and cert-manager modes, plus an
  inverse guard-regression test that asserts helm template MUST
  refuse to render when no TLS source is configured. Previously
  the single `helm template` invocation hit the certctl.tls.required
  fail-loud guard and exit-1'd CI. Four invocations now: lint
  (existingSecret), template (existingSecret), template
  (cert-manager), template (no args — must fail).

Integration tests
- deploy/test/integration_test.go stands up the Compose stack over
  HTTPS, extracts the CA bundle, and exercises every certctl API
  over https://localhost:8443
- All 34 integration subtests green (per Phase 8 local CI-parity)

Documentation
- New: docs/tls.md (provisioning patterns, rotation, SIGHUP reload)
- New: docs/upgrade-to-tls.md (one-step cutover, no-downgrade
  warnings, fleet-roll sequencing)
- CHANGELOG.md: v2.2.0 "HTTPS Everywhere — The Irony" entry
  (file heading unchanged; release tag is v2.0.47)
- All curls in docs/, examples/, deploy/helm/ guides use
  https://localhost:8443 --cacert

Verification
- grep -rn "ListenAndServe[^T]" cmd/ internal/ → 0 hits
- grep -rn "\"http://" cmd/ internal/ → 2 benign hits (Caddy admin
  API default, SSRF doc comment) — zero certctl endpoints
- Tasks #197–#206 (Phases 0–8) all closed in the tracker

Files: 65 changed, 3489 insertions, 372 deletions (pre-CI-fix).
2026-04-20 03:43:10 +00:00
Shankar a4db52ab5a ci(release): migrate cosign sign-blob to --bundle (cosign v3.0)
Cosign v3.0 (shipped by default with sigstore/cosign-installer@cad07c2e,
release v3.0.5) removed --output-signature and --output-certificate from
the sign-blob subcommand. The replacement is a single --bundle flag that
emits a unified Sigstore bundle (.sigstore.json) containing the
signature, certificate chain, and Rekor inclusion proof in one file.

This change migrates both sign-blob invocations in .github/workflows/
release.yml (per-binary matrix signing and aggregate checksums.txt
signing), updates the artefact upload paths, the artefact aggregation
case filter, the GitHub Release asset list, and the release-notes body
verify-blob example. The README cosign verification snippet and sidecar
description are also updated to the --bundle / .sigstore.json shape.

No cosign version pinning. No legacy fallback. OCI image signing
(cosign sign on image digest) is unchanged — only sign-blob flags
changed in v3.0. See M-11 in certctl-audit-report.md.

Verification gates:
- YAML parse: OK
- go vet ./...: exit 0
- go build ./...: exit 0
- grep 'cosign sign-blob' release.yml: 2 (expected: 2)
- grep '.sigstore.json' release.yml: 9 (expected: >=5)
- grep '.sig/.pem' release.yml non-comment: 0 (expected: 0)
- README legacy cosign refs: 0 (expected: 0)
- docs/ legacy cosign refs: 0 (expected: 0)

Coverage: unchanged (CI workflow edit + README — zero Go code touched).
2026-04-18 09:29:20 +00:00
shankar0123 81fc6b26b9 ci(release): add CLI/MCP binaries, checksums, SBOM, Cosign, SLSA provenance (M-3) 2026-04-17 04:04:55 +00:00
Shankar 0750c5f8cc docs: redact V3 feature specifics from README (fixes H-7)
Problem
-------
H-7 (CWE-200 / information disclosure, strategic-policy class): the
public README's V3 section enumerated the paid-tier feature set --
"Role-based access control with profile-gating", "Event-driven
architecture with real-time operational views", "Advanced search",
"compliance scoring", "HSM/TPM integration" -- violating the
CLAUDE.md directive "Keep V3+ deliberately vague -- one-liner
descriptions only. Don't telegraph the paid feature set." The prior
wording also carried factual drift: `compliance scoring` was pulled
forward to V2.2 per the V2.2 Roadmap, so pairing it with V3 in the
README misrepresented the open-core line.

Fix
---
Replace the two-sentence enumeration at README.md:322-323 with a
single deliberately-vague sentence:

  Enterprise capabilities for larger deployments are available in
  the commercial tier.

No named features. No SKU enumeration. Matches the policy one-liner
shape used in neighboring V1 / V2 / V4+ sections. Net -1 line of
prose.

Files
-----
  README.md                          1 -, 1 +

Wire-format invariants preserved
--------------------------------
This is a docs-only change. All protocol surfaces are byte-identical:
  - RFC 7030 EST handler (internal/api/handler/est.go) -- untouched
  - RFC 8894 SCEP handler (internal/api/handler/scep.go) -- untouched
  - Shared internal/pkcs7/ package -- untouched
  - H-1 revocation composite key (migration 000012) -- untouched
  - H-2 SCEP challenge-password preflight + PKCSReq guard -- untouched
  - C-2 AES-256-GCM config encryption contract -- untouched
  - CRL DER bytes, OCSP response bytes -- untouched

Verification
------------
  git diff 844a05c HEAD -- internal/ cmd/ migrations/ api/ deploy/
    -> 0 code changes (only README.md modified after H-1)

Operational note
----------------
No behavioral change. Product positioning only. The V3 feature set
itself remains documented in the gitignored roadmap.md / strategy.md,
which are the intended sources of truth for the paid tier.

Audit report: see /Users/shankar/Desktop/cowork/certctl-audit-report.md
2026-04-16 23:46:37 +00:00
Shankar 0bcbad9e93 docs: move maintenance notice and quick start link above Documentation section
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 11:05:47 -04:00
Shankar 4e3927e8b4 feat(V2.2): bulk revocation — filter-based fleet-wide certificate revocation
Add POST /api/v1/certificates/bulk-revoke with filter criteria (profile_id,
owner_id, agent_id, issuer_id, team_id, certificate_ids), partial-failure
tolerance, and audit trail. Includes MCP tool, CLI command (certs bulk-revoke),
server-side bulk modal in GUI replacing client-side sequential loop, OpenAPI
spec, compliance mapping updates, and 21 new tests (12 service, 7 handler,
1 CLI, 1 frontend).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 00:06:34 -04:00
Shankar e1630bcb44 feat(M50): cloud secret manager discovery — AWS SM, Azure KV, GCP SM
Extend certificate discovery from filesystem + network to cloud secret
managers. Three pluggable DiscoverySource connectors feed into the
existing discovery pipeline via sentinel agent pattern, with a 9th
scheduler loop for periodic cloud scanning.

- AWS Secrets Manager: aws-sdk-go-v2, tag/prefix filtering, 10 tests
- Azure Key Vault: stdlib HTTP + OAuth2, base64 DER/PEM, 16 tests
- GCP Secret Manager: stdlib HTTP + JWT OAuth2, label filter, 14 tests
- CloudDiscoveryService orchestrator with 9 tests
- 9th scheduler loop (6h default, atomic.Bool idempotency)
- Discovery page: color-coded source type badges
- 14 new env vars across CloudDiscoveryConfig structs
- Docs: connectors.md, architecture.md, features.md, README updated

49 new tests. All CI checks pass (go vet, race, lint, coverage).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-15 23:01:00 -04:00
Shankar dd79096b70 feat(M49): Entrust, GlobalSign & EJBCA issuer connectors
Add three new issuer connectors completing commercial and open-source CA
coverage. Entrust uses mTLS client certificate auth with sync/async
issuance. GlobalSign Atlas uses mTLS + API key/secret dual auth with
serial-based tracking. EJBCA supports dual auth (mTLS or OAuth2) for
self-hosted Keyfactor CAs.

Each connector implements the full issuer.Connector interface (9 methods),
includes httptest-based unit tests (~14 each), and follows established
patterns (injectable HTTP clients, RFC 5280 revocation reason mapping,
CRL/OCSP delegated to CA).

Also includes: issuer factory cases, env var seeding, config structs,
domain types, seed data (3 rows, all disabled), OpenAPI enum updates,
frontend issuer catalog entries with config fields, and full docs
(connectors.md, architecture.md, features.md, README).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-15 22:24:12 -04:00
Shankar de82de953b feat(M48): continuous TLS health monitoring — endpoint state machine, shared tlsprobe, 8 API endpoints, GUI
Adds continuous TLS endpoint health monitoring that closes the deploy→verify→monitor loop.
After M25 verifies a deployment succeeded once, M48 continuously confirms it stays healthy.

Key components:
- Shared `internal/tlsprobe/` package extracted from network scanner for reuse
- Health status state machine: healthy → degraded (2 failures) → down (5 failures),
  plus cert_mismatch when served fingerprint differs from expected
- 8th scheduler loop (60s tick, per-endpoint configurable intervals)
- PostgreSQL migration 000011: endpoint_health_checks + endpoint_health_history tables
- 8 REST API endpoints (CRUD, history, acknowledge, summary)
- Health Monitor GUI page with summary bar, status table, create modal, auto-refresh
- 38 new tests (5 tlsprobe + 11 domain + 10 service + 8 handler + 4 frontend)
- All coverage thresholds maintained (service 68%, handler 83%, domain 87%, middleware 63%)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-15 21:45:45 -04:00
Shankar ff223e2586 feat(M11c): crypto policy enforcement — CSR validation, MaxTTL caps, key metadata
Enforce certificate profile crypto constraints across all 5 issuance paths
(renewal, agent CSR, EST, SCEP). ValidateCSRAgainstProfile() rejects CSRs
with key algorithm/size that don't match profile rules. MaxTTL enforcement
caps certificate validity per issuer connector (Local CA, Vault, step-ca
enforce directly; ACME/DigiCert/Sectigo pass through). Key algorithm and
size are now persisted in certificate_versions for audit compliance.

16 new tests (12 service-layer + 4 Local CA connector). Removes hardcoded
version number from GUI sidebar. Documentation updated across architecture,
features, connectors, and README.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-15 21:05:14 -04:00
Shankar e8ddd3c327 docs: consolidate README — merge architecture, security, design decisions into Why certctl
Fold Architecture, Key Design Decisions, and Security sections into the
Why certctl section as bold-header paragraphs. Removes three standalone
sections, tightening the README structure: Documentation → Integrations →
Why certctl (with architecture, security, design decisions) → What It Does →
Quick Start → Examples → CLI → MCP → Development → Roadmap → License.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-15 17:06:43 -04:00
Shankar f1670de4a0 docs: move Supported Integrations under Documentation links in README
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-15 17:03:11 -04:00
Shankar 741db039b8 docs: rewrite README to highlight all adoption-driving features
Move documentation table to top (below Gantt chart). Condense screenshots
to 4 key images with "see all" link. Add Enrollment Protocols and
Standards & Revocation tables. Surface previously buried features:
dynamic GUI config, onboarding wizard, approval workflows, agent groups,
TLS verification, certificate export, SCEP, revocation infrastructure.
Fix stale numbers (26 pages, 111 routes) verified against repo source.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-15 17:00:09 -04:00
Shankar 98bb57e6b4 feat(M51): add SCEP server (RFC 8894) for MDM and network device enrollment
Implements Simple Certificate Enrollment Protocol with single-endpoint
operation-based dispatch (GetCACaps, GetCACert, PKIOperation), PKCS#7
SignedData CSR extraction with fallback for raw/base64 CSR, challenge
password authentication via CSR attributes, and shared internal/pkcs7
package extracted from EST handler to eliminate code duplication.

24 new tests (11 service + 13 handler) plus 5 shared pkcs7 package tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-15 16:47:18 -04:00
Shankar 8a27557773 tighten BSL license scope, fix documentation underselling shipped features
Broadened BSL Additional Use Grant from "hosted or managed service" to cover
any commercial offering (embedded, bundled, integrated). Updated README to
promote all shipped connectors from Beta to Implemented, added EST/ARI/S/MIME
highlight, Helm quickstart, and corrected license description. Fixed
connectors.md stale claims (AWS ACM PCA listed as planned, K8s Secrets
listed as coming soon) and updated overview with exact connector counts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-15 15:54:03 -04:00
Shankar 28d8771d8c docs: rewrite features.md, audit README + architecture against repo
Rewrote docs/features.md from scratch as authoritative feature inventory
(1255 lines, every claim verified against source files).

Audited README.md and architecture.md against repo — fixed 19 stale
references: K8s Secrets status, issuer counts, dashboard page counts,
CI thresholds, missing connectors in Mermaid diagrams, OpenAPI operation
count, GetCACertPEM behavior, and V2/V4 roadmap accuracy.

Also includes related fixes discovered during audit:
- Scheduler skips expired/failed/revoked certs from auto-renewal
- Seed demo expiry dates moved outside 31-day scheduler query window
- Agent pages use correct last_heartbeat_at field name

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-15 00:22:57 -04:00
Shankar cabe8d33eb fix: correct K8s Secrets status to 'Coming in 2.1', increase audit trail page size to 200
The Kubernetes Secrets target connector has config validation, tests, UI,
and Helm RBAC implemented but the realK8sClient is a stub — runtime
deployment will fail. Update README and connectors.md to reflect actual
status instead of misleading 'Beta' label.

Also increase the audit trail GUI default from 50 to 200 events per page
(backend already permits up to 500).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-14 12:11:01 -04:00
Shankar e72f06f35b feat(M47): add Kubernetes Secrets target + AWS ACM PCA issuer connectors
Implement both M47 connectors with full cross-layer wiring:

Kubernetes Secrets target: DNS-1123 validation, kubernetes.io/tls Secret
create-or-update, chain concatenation, serial number validation, Helm
RBAC gating. 18 tests.

AWS ACM Private CA issuer: synchronous issuance (like Vault), ARN regex
validation, RFC 5280 revocation reason mapping, CA cert retrieval,
factory + env var seeding. 23 tests.

Cross-cutting: domain types, service validation, config, factory, agent
dispatch, frontend (TargetsPage, issuerTypes), OpenAPI, seed data, Helm
chart, connectors docs, README. Testing docs (testing-guide, qa-test-guide,
qa_test.go) with Parts thematically integrated near related connectors.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-07 20:21:09 -04:00
Shankar 8fe53b7d2d docs: add Docker Compose environments guide and fix compose files
- New deploy/ENVIRONMENTS.md: comprehensive walkthrough of all 4 compose
  files with service-by-service explanations, beginner-friendly Docker
  concepts, and expert-level networking/config details
- Fix docker-compose.dev.yml: agent LOG_LEVEL → CERTCTL_LOG_LEVEL (was
  silently ignored without the CERTCTL_ prefix)
- Add CERTCTL_CONFIG_ENCRYPTION_KEY to base and test compose (enables
  M34/M35 dynamic issuer/target config encryption)
- Add CERTCTL_DISCOVERY_DIRS to base compose agent (enables filesystem
  certificate discovery in default deployment)
- Cross-link ENVIRONMENTS.md from README doc table and quickstart.md

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-05 21:57:17 -04:00
Shankar 1c0457525c docs: add quick-start jump link near top of README
Adds a one-line "Ready to try it?" link right after the maintainer
callout, before the longer prose sections. Gives scanners an immediate
exit to install instructions without rearranging the README's
explain → show → install flow.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-05 21:38:34 -04:00
Shankar c91067b021 docs: expand README documentation table and fix orphaned doc links
- README: Add 7 missing docs to documentation table (MCP server, OpenAPI
  guide, migration guides for certbot/acme.sh/cert-manager, test
  environment, testing guide). Fix connector reference description to
  remove stale counts. Link OpenAPI guide instead of raw YAML.
- architecture.md: Add cross-references to testing-guide.md and
  test-env.md from testing strategy section and What's Next links.
  These were the only two orphaned docs with zero inbound references.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-05 21:37:47 -04:00
Shankar 6efdb40607 docs: comprehensive documentation audit — fix stale counts, V2/V3 matrix, connector status
- features.md: Fix Feature Matrix to correctly show all V2 Free features
  (F5/IIS/WinCertStore/JavaKeystore as Implemented, not Stub; Vault/DigiCert/
  Sectigo/GoogleCAS as V2 Free, not V3 Paid). Add missing shipped features
  (EST, verification, export, S/MIME, ARI, digest, Helm, onboarding). Update
  issuer count to 9, target count to 13.
- architecture.md: Fix F5/IIS from "interface only, implementation planned"
  to implemented. Add all 13 target connectors to built-in targets list.
- why-certctl.md: Add Sectigo and Google CAS to issuer list (7→9). Fix
  target count (10→13). Remove hardcoded endpoint/operation counts.
- connectors.md: Fix F5 BIG-IP TOC entry from "Interface Only" to
  "Implemented". Remove dead "Planned Issuers" TOC link.
- README.md: Remove competitor product names (CertKit, KeyTalk). Remove
  hardcoded dashboard page count. Remove hardcoded endpoint counts. Fix V4
  roadmap to remove already-shipped issuers (Sectigo, Google CAS).
- Remove hardcoded MCP tool counts (78/80) across 8 files (mcp.md,
  architecture.md, features.md, testing-guide.md, concepts.md, quickstart.md,
  demo-advanced.md, why-certctl.md). Replace with "REST API exposed via MCP"
  to avoid future drift.
- quickstart.md: Docker Compose environments table (from previous session).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-05 21:33:12 -04:00
Shankar 104ded63ca feat(M45): ACME certificate profile selection, ARI RFC 9773 renumber, 45-day renewal positioning
Three related ACME ecosystem changes shipped as a single milestone:

1. ACME Certificate Profile Selection: Custom JWS-signed newOrder POST with
   `profile` field (e.g., `tlsserver`, `shortlived` for 6-day certs) bypassing
   acme.Client.AuthorizeOrder() since golang.org/x/crypto lacks profile support.
   ES256 JWS signing with kid mode, nonce management, directory discovery.
   Empty profile delegates to standard library path (zero behavior change).
   Configurable via CERTCTL_ACME_PROFILE env var. GUI: profile dropdown on
   ACME issuer config.

2. ARI RFC 9702 → 9773 Renumber: All 25+ references updated across Go source,
   docs, README, and examples. Zero remaining occurrences of RFC 9702.

3. 45-Day / Short-Lived Certificate Positioning: 5 domain tests validating
   renewal thresholds against SC-081v3 validity reduction timeline (200→100→47
   days) and Let's Encrypt 45-day/6-day profiles. ARI (RFC 9773) is the
   expected renewal path for 6-day shortlived certs.

New tests: 13 profile + 5 domain threshold + 1 frontend = 19 new tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-05 13:52:13 -04:00
Shankar 3cf75ffb73 feat(M38): SSH target connector for agentless deployment via SSH/SFTP
Adds a new target connector enabling certificate deployment to any
Linux/Unix server without installing the certctl agent binary. Uses the
proxy agent pattern — a single agent in the same network zone deploys
certs to remote servers over SSH/SFTP.

Key additions:
- SSH/SFTP connector with key auth (file/inline) + password auth
- Injectable SSHClient interface for cross-platform testing (25 tests)
- Shell injection prevention via validation.ValidateShellCommand()
- Configurable cert/key/chain paths with octal permissions
- GUI: 11 SSH config fields in target create wizard

Also fixes pre-existing frontend bug where all target type strings
(nginx, apache, etc.) were sent as lowercase but the backend expects
proper-case (NGINX, Apache, etc.), breaking GUI-created targets.
Adds missing TargetTypeSSH to validTargetTypes service map.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-05 12:36:01 -04:00
Shankar cacde5a0ce docs: remove hardcoded test counts from public-facing docs
Replace brittle test count numbers (1,554+, 1,088+, 211, etc.) with
descriptions of testing approach and CI-enforced coverage gates.
Counts go stale every milestone — coverage thresholds are machine-
verified and never drift.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-04 00:20:22 -04:00
Shankar 2a0285f498 feat(M40): F5 BIG-IP target connector via iControl REST
Replace 190-line stub with full iControl REST implementation (~580 lines).
Token auth with 401 auto-retry, file upload + crypto object install,
transaction-based atomic SSL profile updates, cleanup on failure.
Injectable F5Client interface for cross-platform testing. 32 tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-03 22:26:58 -04:00
Shankar 4cc91d0f28 feat(M44): Google CAS issuer connector
Google Cloud Certificate Authority Service integration via REST API
with OAuth2 service account auth (JWT→access token). Synchronous
issuance model, CA pool selection, mutex-guarded token caching,
revocation with RFC 5280 reason mapping. No Google SDK dependency —
all stdlib. 19 tests with httptest mock OAuth2 + CAS API.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-03 21:25:34 -04:00