certctl

mirror of https://github.com/shankar0123/certctl.git synced 2026-06-07 21:41:39 +00:00

Author	SHA1	Message	Date
shankar0123	0157510d48	ci-pipeline-cleanup Phase 5+6: collapse vendor matrix; delete Windows matrix Bundle: ci-pipeline-cleanup, Phases 5+6 / frozen decisions 0.4 + 0.5 + 0.6. Revises Bundle II decisions 0.4 (Windows matrix) and 0.9 (per- vendor granularity). PHASE 5 — Linux vendor matrix collapsed (12 jobs → 1): The previous per-vendor matrix produced 12 status-check rows for ~1 real assertion (115/116 vendor-edge tests are t.Log placeholders per Bundle II Phase 2-13 design). Granularity was fake signal. Single-job version: brings up all 11 sidecars at once via docker compose --profile deploy-e2e up -d, runs go test -run 'VendorEdge_' once, tears down once. Critical caveat: requireSidecar() in deploy/test/vendor_e2e_helpers.go uses t.Skipf() when a sidecar isn't reachable — silent test skip, not CI failure. The new Skip-count enforcement step (scripts/ci-guards/vendor-e2e-skip-check.sh) counts SKIP lines and fails the build if it exceeds the allowlist at scripts/ci-guards/vendor-e2e-skip-allowlist.txt (15 windows-iis- requiring tests legitimately skip on Linux per Phase 6). PHASE 6 — Windows matrix deleted entirely: The deploy-vendor-e2e-windows job removed. Two reasons: 1. Can't physically work on windows-latest today (Docker not started in Windows-containers mode by default; bridge network driver missing on Windows Docker — see CI run 25183374742 failure logs). 2. Even fixed, validates nothing — all 16 IIS + WinCertStore tests are t.Log placeholders that exercise no IIS-specific behavior. Per Bundle II frozen decision 0.14, the third criterion for "verified" status in the vendor matrix is operator manual smoke against a real instance. IIS + WinCertStore now satisfy that via the playbook (Phase 6 follow-up adds docs/connector-iis.md:: Operator validation playbook). The windows-iis-test sidecar STAYS in deploy/docker-compose.test.yml under profiles: [deploy-e2e-windows] for operator local use. Linux CI never activates this profile. Operator-required action before merge: RAM headroom verification on prototype branch (per frozen decision 0.14). If peak RSS > 12 GB on ubuntu-latest with all 11 sidecars up, fall back to bucketed matrix per cowork/ci-pipeline-cleanup/decisions-revised.md. ci.yml: 417 → 383 lines (-34 net; -1105 cumulative since baseline 1488). Status checks per push: 19 → 7 (collapse 12 vendor + 2 windows = -14; add image-and-supply-chain in Phase 7-9 = +1; net 19-12-2+1 = ~7). Operator action for Phase 13: update GitHub branch protection rules (required-checks list 19 → 7 entries). Documented in cowork/ ci-pipeline-cleanup/decisions-revised.md.	2026-04-30 20:46:05 +00:00
shankar0123	86d92efd2b	ci-pipeline-cleanup Phase 2: coverage thresholds → YAML manifest Bundle: ci-pipeline-cleanup, Phase 2 / frozen decision 0.3. Move 9 hardcoded coverage thresholds from inline bash to a YAML manifest at .github/coverage-thresholds.yml. The load-bearing per-package context (Bundle reference, HEAD measurement, gap rationale) survives in the YAML's `why:` field instead of in inline bash comments. Adding a new gated package: one YAML entry instead of ~30 lines of bash + 50 lines of comment. Coverage check logic extracted to scripts/check-coverage-thresholds.sh so the operator can run the same check locally: bash scripts/check-coverage-thresholds.sh ci.yml dropped 557 → 417 lines (-140, total Phase 1+2: -1071, -72% from baseline 1488). Same 9 floors, same fail-on-miss semantics — pure relocation: internal/service: 70 (was: 70) internal/api/handler: 75 (was: 75) internal/domain: 40 (was: 40) internal/api/middleware: 30 (was: 30) internal/crypto: 88 (was: 88) internal/connector/issuer/local: 86 (was: 86) internal/connector/issuer/acme: 80 (was: 80) internal/connector/issuer/stepca: 80 (was: 80) internal/mcp: 85 (was: 85) Sandbox verification: - ci.yml YAML-parses cleanly - coverage-thresholds.yml YAML-parses cleanly with all 9 entries - scripts/check-coverage-thresholds.sh extracts the (pkg, floor) table correctly from the YAML	2026-04-30 20:39:30 +00:00
shankar0123	1caedd5fd3	ci-pipeline-cleanup Phase 1: extract 20 regression guards to scripts/ci-guards/ Bundle: ci-pipeline-cleanup, Phase 1. Pure relocation — no behavior change. Each guard's bash logic is byte-identical to the prior inline version; the only changes are: (a) the guard becomes a sibling script under scripts/ci-guards/<id>.sh, (b) ci.yml's per-guard step is replaced by a single loop step that iterates all scripts. 20 scripts extracted (alphabetized): B-1-orphan-crud.sh, D-1-D-2-statusbadge-phantom.sh, G-1-jwt-auth-literal.sh, G-2-api-key-hash-json.sh, G-3-env-docs-drift.sh, H-001-bare-from.sh, H-009-readme-jwt.sh, L-001-insecure-skip-verify.sh, L-1-bulk-action-loop.sh, M-012-no-root-user.sh, P-1-documented-orphan-fns.sh, S-1-hardcoded-source-counts.sh, S-2-strings-contains-err.sh, T-1-frontend-page-coverage.sh, U-2-plaintext-healthcheck.sh, U-3-migration-mount.sh, bundle-8-L-015-target-blank-rel-noopener.sh, bundle-8-L-019-dangerously-set-inner-html.sh, bundle-8-M-009-bare-usemutation.sh, test-naming-convention.sh Plus scripts/ci-guards/README.md documenting the contract: - Each script must exit 0 on clean repo, non-zero with ::error:: prefix on regression - Runnable from repo root via 'bash scripts/ci-guards/<id>.sh' - Adding a new guard: drop a new <id>.sh; CI auto-picks it up ci.yml dropped 1488 → 557 lines (-931, -63%). Single CI loop step now collects ALL guard failures before failing the build instead of fail-fast — UX win for regressions that hit two guards at once. Two guards (QA-doc Part-count + seed-count, ci.yml lines 868-917) deliberately NOT extracted — they move to 'make verify-docs' in Phase 11 because they protect docs-the-operator-reads, not the product itself. Verification (sandbox): - All 20 scripts pass against HEAD (chmod +x; for g in scripts/ci-guards/*.sh; do bash $g; done) - New ci.yml YAML-parses cleanly - Job boundaries preserved: go-build-and-test, frontend-build, helm-lint, deploy-vendor-e2e, deploy-vendor-e2e-windows - Loop step appears twice (once at end of go-build-and-test, once at end of frontend-build) so both jobs continue running their set of guards	2026-04-30 20:36:26 +00:00
shankar0123	e11cdda135	fix(bundle-7): Verification & Tool Suite Execution — wire mandatory scans + first-run evidence Closes Audit-2026-04-25 D-001..D-002 + D-006 (partial) + H-005 (partial). Opens new tracker IDs H-010, M-028, L-020, L-021 (see closure document in cowork/comprehensive-audit-2026-04-25/tool-output/_BUNDLE-7-CLOSURE.md). What changed - scripts/install-security-tools.sh (NEW) — idempotent installer for the Go-based subset (govulncheck, staticcheck, errcheck, ineffassign, gosec, osv-scanner). Used locally + by both CI workflows. - .github/workflows/security-deep-scan.yml (NEW) — daily + workflow_dispatch scans for tools that need docker/network: trivy image, syft SBOM, ZAP baseline, schemathesis, nuclei, testssl.sh, gosec, osv-scanner, full-suite race detector at -count=10. Every step continue-on-error; artefacts uploaded for triage. - .github/workflows/ci.yml — staticcheck added as a soft (continue-on-error) gate alongside the existing govulncheck hard gate. Soft until M-028 closes the 6 remaining SA1019 deprecated-API sites; flip to fail-on- non-zero then. Per-package coverage gates extended: pkcs7 hard ≥85% (currently 100%), local-issuer soft ≥65% transitional floor (H-010 raises to 85%). - staticcheck.conf (NEW) — suppresses 4 style-only rules (ST1005, ST1000, ST1003, S1009, S1011, SA9003) with documented justifications. Real defects (SA1019) NOT suppressed. - .govulnignore (NEW) — empty placeholder with the suppression contract (one OSV ID + justification + review-by date per line). Bundle-7's 5 deferred-call advisories don't need entries because govulncheck's default exit code already passes. Local tool-run evidence (cowork/comprehensive-audit-2026-04-25/tool-output/2026-04-26/): - govulncheck.txt + govulncheck-verbose.txt — clean (0 affected; 5 deferred-call) - staticcheck.txt + staticcheck-after-suppressions.txt — 6 SA1019 → M-028 - errcheck.txt — 1294 sites, all defer-Close / response-write convention → triaged - ineffassign.txt — 15 unique sites → L-020 - helm-lint.txt — clean (1 INFO-level icon recommendation) - go-test-race.txt — clean across scheduler/middleware/mcp at -count=3 (CI runs -count=10 against the full suite) - go-test-cover.txt — crypto 86.7% ✓, pkcs7 100% ✓, local-issuer 68.3% ✗ → H-010 Closures in this bundle - D-001 partial — 4 of 6 Go-based tools ran locally; remainder wired in CI - D-002 closed — race detector clean - D-006 partial — helm lint passes; kube-score / kubesec deferred to CI - D-007 deferred — semgrep p/react-security wired in CI (needs docker) - D-003 / D-004 / D-005 deferred — wired in security-deep-scan.yml - H-005 partial — crypto + pkcs7 meet 85%; local-issuer at 68.3% → H-010 New tracker IDs opened (next-bundle scope) - H-010 — local-issuer coverage gap (68.3% vs 85% target). 2-3 days. - M-028 — 6 deprecated-API sites (SA1019). Migration coordinated. - L-020 — ineffassign cleanup sweep, 15 mechanical sites. - L-021 — 5 transitive Go-module CVEs (deferred-call). Monitor + bump. NOT addressed in this bundle (deferred to a future Bundle 7-bis) - M-007 bulk-operation partial-failure tests - M-008 admin-gated role-gate tests - L-010 mock.Anything overuse audit - L-018 defect age analysis on remaining High findings Verification - go vet ./... → clean - go build ./... → clean - go test -short -count=1 ./... → all packages pass - go test -race -count=3 ./scheduler/middleware/mcp → clean - go test -cover ./crypto/pkcs7/local-issuer → see go-test-cover.txt - govulncheck ./... → clean - staticcheck ./... → 6 SA1019 (tracked as M-028) - helm lint → clean - yaml lint .github/workflows/*.yml → clean - python3 yaml.safe_load(api/openapi.yaml) → 89 paths Bundle 7 of the 2026-04-25 comprehensive audit. Tool-output evidence preserved at cowork/comprehensive-audit-2026-04-25/tool-output/2026-04-26/.	2026-04-26 14:37:28 +00:00
shankar0123	52248be717	v2.0.47: HTTPS Everywhere — TLS-only control plane, agents/CLI/MCP Breaking change release. Plaintext HTTP listener removed. The certctl control plane now terminates TLS 1.3 on :8443 via http.Server.ListenAndServeTLS. No CERTCTL_TLS_ENABLED=false escape hatch. No dual-listener mode. One-step cutover per docs/upgrade-to-tls.md. Server - cmd/server/tls.go: certHolder with SIGHUP hot-reload + atomic cert swap, buildServerTLSConfig (TLS 1.3 min, GetCertificate callback), preflightServerTLS validation - cmd/server/main.go: ListenAndServeTLS in place of ListenAndServe, watchSIGHUP wiring, cert/key path config threading - tls_test.go: 418-line regression coverage of reload, preflight, callback behavior, SAN validation Config - CERTCTL_TLS_CERT_PATH / CERTCTL_TLS_KEY_PATH (required) - Plaintext rejection: agents/CLI/MCP pre-flight-fail on http:// URLs with a pointer to docs/upgrade-to-tls.md Agents, CLI, MCP - All three pre-flight-reject http:// URLs with fail-loud diagnostic - CERTCTL_SERVER_CA_BUNDLE_PATH for private-CA trust - CERTCTL_SERVER_TLS_INSECURE_SKIP_VERIFY for dev-only bypass (loud warning on startup) - install-agent.sh emits both vars as commented template lines docker-compose - certctl-tls-init sidecar generates SAN-valid self-signed cert into deploy/test/certs/ on first boot - All demo-stack curls pin against ca.crt with --cacert Helm chart - Three TLS provisioning modes, exactly one required: - server.tls.existingSecret (operator-supplied) - server.tls.certManager.enabled (cert-manager integration) - server.tls.selfSigned.enabled (eval only — not for production) - server-certificate.yaml template for cert-manager mode - helm install without a TLS source fails at template render with a pointer to docs/tls.md CI - .github/workflows/ci.yml Helm Chart Validation step renders the chart in both existingSecret and cert-manager modes, plus an inverse guard-regression test that asserts helm template MUST refuse to render when no TLS source is configured. Previously the single `helm template` invocation hit the certctl.tls.required fail-loud guard and exit-1'd CI. Four invocations now: lint (existingSecret), template (existingSecret), template (cert-manager), template (no args — must fail). Integration tests - deploy/test/integration_test.go stands up the Compose stack over HTTPS, extracts the CA bundle, and exercises every certctl API over https://localhost:8443 - All 34 integration subtests green (per Phase 8 local CI-parity) Documentation - New: docs/tls.md (provisioning patterns, rotation, SIGHUP reload) - New: docs/upgrade-to-tls.md (one-step cutover, no-downgrade warnings, fleet-roll sequencing) - CHANGELOG.md: v2.2.0 "HTTPS Everywhere — The Irony" entry (file heading unchanged; release tag is v2.0.47) - All curls in docs/, examples/, deploy/helm/ guides use https://localhost:8443 --cacert Verification - grep -rn "ListenAndServe[^T]" cmd/ internal/ → 0 hits - grep -rn "\"http://" cmd/ internal/ → 2 benign hits (Caddy admin API default, SSRF doc comment) — zero certctl endpoints - Tasks #197–#206 (Phases 0–8) all closed in the tracker Files: 65 changed, 3489 insertions, 372 deletions (pre-CI-fix).	2026-04-20 03:43:10 +00:00
shankar0123	d395776a95	Initial scaffold: certificate control plane v0.1.0	2026-03-14 08:22:17 -04:00

6 Commits