mirror of
https://github.com/shankar0123/certctl.git
synced 2026-06-08 15:08:51 +00:00
ceca3647eb5bc522cb88bc2101980ea7aa47ea2e
12 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
f0d91fde39 |
docs(ssh): operator playbook for InsecureIgnoreHostKey design choice
Closes Top-10 fix #7 of the 2026-05-02 deployment-target audit re-run
(see cowork/deployment-target-audit-2026-05-02-rerun/RESULTS.md).
Pre-fix, the SSH connector's ssh.InsecureIgnoreHostKey() at
internal/connector/target/ssh/ssh.go (realSSHClient.Connect) had only
an inline comment justifying the design choice. An acquirer's
diligence engineer reading the connector cold pattern-matches "MITM
hazard" without seeing the comment.
This commit lands a doc-side operator playbook in docs/connectors.md
SSH section covering:
1. Why the connector accepts any host key (operator-configured
target infrastructure; mirrors network scanner's
InsecureSkipVerify and F5's Insecure flag).
2. Threat model the choice accepts (passive eavesdropper on
operator-controlled network; layered SSH-key auth limits blast
radius).
3. Threat model the choice does NOT accept (public-internet
ephemeral hosts, multi-tenant networks, strict MITM-resistance
regulatory requirements).
4. Mitigations operators can layer (custom SSHClient via
NewWithClient + golang.org/x/crypto/ssh/knownhosts; SSH
certificate authentication via @cert-authority pinning; network
segmentation; per-target key rotation).
5. When to NOT use the SSH connector (regulatory environments,
dynamic IPs, multi-tenant networks).
6. V3-Pro forward path (built-in known_hosts management, tracked in
WORKSPACE-ROADMAP.md).
Inline comment in ssh.go realSSHClient.Connect updated to
forward-reference the new doc subsection (no logic change; same
HostKeyCallback: ssh.InsecureIgnoreHostKey() call).
Same shape Bundle 8 used for "Operator playbook: keytool argv
password exposure" in docs/connectors.md JavaKeystore section.
No code-behavior changes. No test changes.
Verified locally:
- gofmt / go vet clean.
- go test -short ./internal/connector/target/ssh/... green.
Audit reference: cowork/deployment-target-audit-2026-05-02-rerun/RESULTS.md
Top-10 fix #7. |
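The host-key mitigation in item 4 comes down to comparing the key a server presents against a value recorded in advance. The block below is a minimal, stdlib-only sketch of that idea using SHA-256 fingerprint pinning. It is not the connector's code and it is not the knownhosts package: `checkPinned`, `ErrHostKeyMismatch`, and the `pins` map are hypothetical names, and the key blob is an arbitrary byte slice standing in for a marshalled public key. A real deployment would more likely wire a `knownhosts` callback into a custom SSHClient, as the commit message suggests.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"errors"
	"fmt"
)

// ErrHostKeyMismatch is a hypothetical sentinel for this sketch.
var ErrHostKeyMismatch = errors.New("host key fingerprint mismatch")

// fingerprintSHA256 formats a key blob in the OpenSSH style:
// "SHA256:" followed by the unpadded base64 of the SHA-256 digest.
func fingerprintSHA256(keyBlob []byte) string {
	sum := sha256.Sum256(keyBlob)
	return "SHA256:" + base64.RawStdEncoding.EncodeToString(sum[:])
}

// checkPinned rejects any host that has no pin or presents a different key.
func checkPinned(pins map[string]string, host string, keyBlob []byte) error {
	want, ok := pins[host]
	if !ok {
		return fmt.Errorf("%w: no pin recorded for %s", ErrHostKeyMismatch, host)
	}
	if got := fingerprintSHA256(keyBlob); got != want {
		return fmt.Errorf("%w: %s presented %s", ErrHostKeyMismatch, host, got)
	}
	return nil
}

func main() {
	trusted := []byte("example-host-key-blob") // placeholder, not a real key
	pins := map[string]string{"web-01:22": fingerprintSHA256(trusted)}

	fmt.Println(checkPinned(pins, "web-01:22", trusted))
	fmt.Println(checkPinned(pins, "web-01:22", []byte("unexpected-key")))
}
```

Failing closed on an unknown host is a deliberate choice in this sketch; it is the opposite of the accept-anything behaviour the playbook documents.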
||
|
|
534c27d451 |
ssh: fix staticcheck ST1008 — error is last return from restoreFromBackups
CI's golangci-lint run on commit
|
||
|
|
eb39059490 |
ssh: pre-deploy snapshot + reload-failure rollback
Closes Bundle 6 of the 2026-05-02 deployment-target coverage audit
(see cowork/deployment-target-audit-2026-05-02/RESULTS.md). Pre-fix,
DeployCertificate at ssh.go:201-316 wrote new cert/key/chain via
SFTP then ran the operator's reload command. If reload failed, the
new files stayed on the remote — partial-success state with no
rollback path. docs/deployment-atomicity.md L92 promised "Pre-deploy
SCP backup of remote files"; the code didn't deliver.
This commit:
1. Pre-deploy snapshot. Before any WriteFile, iterate the deploy's
target paths (cert, key, optional chain). For each path:
- StatFile to detect existence. errors.Is(err, os.ErrNotExist)
means first-time deploy (rollback = Remove). Other stat
errors bail out before any write happens.
- ReadFile into an in-memory backups map[string][]byte keyed
by remote path. Original mode captured into a parallel
modes map for restore fidelity.
2. SSHClient interface evolution — three changes:
- StatFile(path) (os.FileInfo, error) — was (int64, error).
FileInfo carries Mode() needed for accurate restore. Existing
fixture tests updated to call info.Size() instead of the
bare size value.
- ReadFile(path) ([]byte, error) — new method; SFTP Open + read
via io.ReadAll. realSSHClient implements via sftpClient.Open.
- Remove(path) error — new method; SFTP Remove. Used by the
rollback path to clean up first-time-deploy partial state.
3. On-reload-failure rollback. Replace the bare error-return at
L282-295 with restoreFromBackups + retry-reload escalation:
- For paths in the snapshot map, WriteFile the original bytes
with the original mode (0600 fallback if mode capture was
incomplete).
- For paths that didn't exist pre-deploy, Remove the new file.
- Re-run the reload command (best-effort second attempt). If
it succeeds, the target is back to pre-deploy state. If it
fails, the remote is in pre-deploy file state but the daemon
may be stuck — surface as wrapped error so the operator
knows where to look.
4. DeploymentResult.Metadata gains backup_status_{cert,key,chain}
so operators can see per-path snapshot state on both success
("snapshotted" / "no_pre_existing" / "n/a") and failure
("restored" / "removed" / "restore_failed" / "remove_failed").
buildMetadataWithBackup helper centralises the metadata
shape so success and failure paths emit a consistent set
of keys.
5. Helper extraction. restoreFromBackups(ctx, paths, backups,
modes) is a private method on Connector; returns the first
error + per-key restore status map for clean test seams.
DeploymentResult shape on failure:
- rollback OK + retry-reload OK → Success=false, "reload command
failed; rolled back to pre-deploy state" (clean recoverable
failure; remote fully restored, daemon serving original cert).
- rollback OK + retry-reload FAIL → wrapped error noting "rolled
back files; retry-reload also failed; daemon may need manual
restart". Metadata flags daemon_state_unknown=true.
- rollback FAIL → operator-actionable wrapped error containing
BOTH the reload error AND the rollback error; metadata flags
manual_action_required=true.
Tests added to ssh_test.go (4 new tests, ~330 LOC):
- TestSSH_ReloadFails_FilesRestored — happy rollback path with
pre-existing remote bytes for cert/key/chain. Asserts every
path's last WriteFile call contains the captured backup bytes
verbatim, no Remove calls fired (all paths had snapshots), and
metadata reports backup_status=restored for each path.
- TestSSH_NoExistingCert_ReloadFails_NewCertRemoved — first-time
deploy variant. StatFile returns os.ErrNotExist for every path;
rollback Removes each written file but performs no WriteFile
during restore (no backup to restore from). Asserts exactly 3
WriteFile calls (deploy only) and 3 Remove calls (rollback).
- TestSSH_ReloadFails_RollbackAlsoFails_OperatorActionable —
uses a writeOrderTrackingMock to fail the SECOND WriteFile to
the cert path (i.e. the restore call, not the initial deploy).
Asserts wrapped error contains both the reload error and the
rollback error, and metadata flags manual_action_required=true.
- TestSSH_ReloadFails_RestoreThenSecondReloadFails — partial-
recovery escalation. Rollback succeeds but the post-restore
retry-reload fails. Asserts wrapped error mentions "rolled back
files; retry-reload also failed" and metadata flags
daemon_state_unknown=true.
Existing tests preserved by extending mockSSHClient with backward-
compatible per-path response maps (statByPath / readByPath /
writeFileErrByPath / executeErrSequence). Legacy global fields
(statFileSize / statFileErr / writeFileErr / executeErr) still
work when no per-path override matches, so TestValidateConfig_*
and TestDeployCertificate_Success_* don't need changes.
docs/deployment-atomicity.md L92 unchanged from today's text —
Bundle 1 doc-realignment hasn't shipped, so the "Pre-deploy SCP
backup of remote files" line was never softened. Post-Bundle-6
the claim is honest (was aspirational pre-fix).
Verified locally (sandbox lacks staticcheck install due to disk
pressure; CI runs the full lint gate):
- gofmt -l ./internal/connector/target/ssh/ clean
- go vet ./internal/connector/target/ssh/ clean
- go build ./internal/connector/target/ssh/... clean
- go build ./cmd/agent/... clean
- go test -race -count=1 ./internal/connector/target/ssh/ green
Audit reference: cowork/deployment-target-audit-2026-05-02/RESULTS.md
Bundle 6.
|
||
|
|
482c7e8047 |
chore(fmt): repo-wide gofmt -w sweep — close drift surfaced by ci-pipeline-cleanup Phase 4
Mechanical reformat. The new 'gofmt drift' CI step (added in
ci-pipeline-cleanup Phase 4, commit
|
||
|
|
188a41774a |
chore: gofmt fixes across deploy-hardening I new files
Phase 13 verification surfaced gofmt-formatting drift in 6 files
across the bundle's new code:
- internal/api/handler/metrics.go (struct field alignment)
- internal/connector/target/k8ssecret/validate_only_test.go (alignment)
- internal/connector/target/nginx/nginx.go (alignment)
- internal/connector/target/postfix/postfix.go (alignment)
- internal/connector/target/ssh/validate_only_test.go (alignment)
- internal/service/deploy_counters.go (alignment)
Pure mechanical gofmt -w fixes; no behavior changes. CI's make verify
gate (which runs `go fmt ./...`) didn't catch these because go fmt is
more lenient than gofmt -l, but golangci-lint v2.11.4 + the explicit
gofmt step in Phase 13 verification did.
Phase 13 full-matrix verification all green:
- gofmt -l: empty across all bundle-touched files
- go vet ./internal/deploy/... ./internal/connector/target/...
./internal/service/ ./internal/api/handler/ ./cmd/agent/: clean
- golangci-lint v2.11.4 (the version CI runs): 0 issues
- go test -race -count=1 across deploy + nginx + apache + haproxy +
agent + service: all green
- INTEGRATION=1 go test -tags integration -run Deploy
./deploy/test/...: 4/4 e2e tests green
Phase 14 next: release prep (Active Focus update, release notes,
Reddit-beat draft, final tag handoff to operator). |
||
|
|
975d1850eb |
feat(ssh,wincertstore,javakeystore,k8ssecret): explicit ValidateOnly + leverage existing connectors
Phase 9 of the deploy-hardening I master bundle. The four
non-file-server connectors get real ValidateOnly probes that
operators use to preview a deploy without touching the live cert.
Existing DeployCertificate paths already have explicit backup +
rollback semantics (SCP backup / WinCertStore Get-ChildItem snapshot /
keytool snapshot / K8s atomic API).
SSH (validate_only.go):
- Probes via SSHClient.Connect. Confirms agent reachability +
credentials. Cheap (no remote command runs); released cleanly via
defer Close.
- A true SCP dry-run requires a no-commit upload (SCP doesn't have
one). V2 ships the auth probe as the load-bearing check.
- 3 new tests in validate_only_test.go.
WinCertStore (validate_only.go):
- Probes via PowerShell `Get-ChildItem -Path Cert:\<loc>\<store>`
using the configured StoreLocation + StoreName (defaults
LocalMachine\My).
- Confirms agent has Windows + the IIS module + the right ACLs.
- 4 new tests including default-store-path verification.
JavaKeystore (validate_only.go):
- Probes via `keytool -list -keystore <path> -storepass <pass>`
using the configured KeystorePath / KeystorePassword and
KeytoolPath (default "keytool").
- Confirms keystore exists, password is correct, JRE is on PATH.
- 4 new tests covering succeeds / fails / no-path-sentinel /
nil-executor-sentinel.
K8s Secret (validate_only.go):
- Probes via K8sClient.GetSecret on the configured Namespace +
SecretName. Returns nil on success or "not found" (the
CreateSecret path on Deploy will handle it). Other errors
(forbidden/unreachable) surface as wrapped.
- 4 new tests covering succeeds / RBAC-error wrapped /
no-config-sentinel / nil-client-sentinel.
Smoke test connectorsAtPhase3 list shrunk from 7 to 3 entries (ssh +
wincertstore + javakeystore + k8ssecret removed). Only caddy
(file-mode) + envoy + traefik remain; those three genuinely have no
validate-with-target command available.
Race detector clean across all 13 connectors. golangci-lint v2.11.4
clean.
Phase 10 next: DeployCounters + Prometheus exposer mirroring the production-hardening-II OCSP counter pattern. |
||
|
|
720e773766 |
feat(target): ValidateOnly dry-run method on Connector interface (default returns ErrValidateOnlyNotSupported)
Phase 3 of the deploy-hardening I master bundle. Extends the
target.Connector interface with the dry-run method that operators
will use to preview a deploy before committing — but ships only the
default-stub for all 13 connectors. Phases 4-9 replace each stub
with the real validate-with-the-target implementation.
interface.go:
- Add ErrValidateOnlyNotSupported sentinel (frozen decision 0.6 —
connectors that cannot dry-run, like K8s, return this rather than
nil so operator triage can errors.Is for "not supported" vs
"validated successfully").
- Add ValidateOnly(ctx, request DeploymentRequest) error to
Connector interface.
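The point of returning a sentinel rather than nil is that callers can tell "not supported" apart from "validated successfully" with errors.Is, even through wrapping. The sketch below illustrates that under simplifying assumptions: `DeploymentRequest` is an empty placeholder, the sentinel's message text is invented, and `stubConnector`, `wrappingConnector`, and `triage` are hypothetical names rather than code from the repository.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// ErrValidateOnlyNotSupported mirrors the sentinel named in the commit;
// the message text here is an assumption.
var ErrValidateOnlyNotSupported = errors.New("validate-only not supported")

// DeploymentRequest is an empty placeholder for the real request type.
type DeploymentRequest struct{}

type Connector interface {
	ValidateOnly(ctx context.Context, request DeploymentRequest) error
}

// stubConnector is the default: it cannot dry-run, so it says so.
type stubConnector struct{}

func (stubConnector) ValidateOnly(context.Context, DeploymentRequest) error {
	return ErrValidateOnlyNotSupported
}

// wrappingConnector adds context with %w, which keeps errors.Is working.
type wrappingConnector struct{ inner Connector }

func (w wrappingConnector) ValidateOnly(ctx context.Context, r DeploymentRequest) error {
	if err := w.inner.ValidateOnly(ctx, r); err != nil {
		return fmt.Errorf("target example: %w", err)
	}
	return nil
}

// okConnector stands in for a connector with a real, passing probe.
type okConnector struct{}

func (okConnector) ValidateOnly(context.Context, DeploymentRequest) error { return nil }

// triage shows the three outcomes an operator-facing caller can distinguish.
func triage(c Connector) string {
	err := c.ValidateOnly(context.Background(), DeploymentRequest{})
	switch {
	case err == nil:
		return "validated"
	case errors.Is(err, ErrValidateOnlyNotSupported):
		return "not supported"
	default:
		return "failed: " + err.Error()
	}
}

func main() {
	fmt.Println(triage(okConnector{}))
	fmt.Println(triage(stubConnector{}))
	fmt.Println(triage(wrappingConnector{inner: stubConnector{}}))
}
```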
13 new validate_only.go files (one per connector at
internal/connector/target/<name>/validate_only.go):
- apache, caddy, envoy, f5, haproxy, iis, javakeystore, k8ssecret,
nginx, postfix, ssh, traefik, wincertstore.
- Each file is identical except for the package declaration: a
one-method default stub returning target.ErrValidateOnlyNotSupported.
- Per-connector files (rather than a single embed-method approach)
let Phases 4-9 replace each connector's stub independently
without churning a shared base.
Tests:
- internal/connector/target/validate_only_test.go pins the sentinel
contract (errors.Is identity, Error() string, %w wrap propagation).
- internal/connector/target/validate_only_smoke_test.go (external
test package) constructs a zero-value &<pkg>.Connector{} for each
of the 13 connectors and asserts ValidateOnly returns
ErrValidateOnlyNotSupported. The test's
connectorsAtPhase3 list is the load-bearing CI guard:
- A 14th connector added without wiring ValidateOnly fails the
`len(connectorsAtPhase3) != 13` invariant.
- A connector whose real ValidateOnly lands (Phase 4 NGINX, Phase
5 Apache, etc.) MUST be removed from this list or the smoke test
fails (real impl no longer returns the sentinel). That removal
IS the bookkeeping that the operator-visible bit + behavior
change are wired together end-to-end.
Compile + go vet + golangci-lint v2.11.4 + go test all 0 issues.
Phase 4 next: NGINX canonical real-impl — replace the stub with
nginx -t -c <temp>; same time replace the existing os.WriteFile
flow in DeployCertificate with deploy.Apply(...).
|
||
|
|
4f83175310 |
Bundle M.SSH-extended (Coverage Audit Extension): SSH connector 71.6% -> 90.2% — H-002 closed
internal/connector/target/ssh/ssh_server_fixture_test.go (~580 LoC,
14 tests) pins realSSHClient.Connect / Execute / WriteFile /
StatFile / Close end-to-end via an embedded golang.org/x/crypto/ssh
ServerConn + pkg/sftp.NewServer, bound to net.Listen('tcp',
'127.0.0.1:0'). Same hand-rolled in-process protocol-server pattern
as the M.Email SMTP fixture.
Coverage delta (per-function):
Connect 0.0% -> ~95% (ed25519 host key + password/key auth +
handshake + sftp open)
Execute 25.0% -> ~95% (success path + exit-code-1 + not-conn)
WriteFile 15.4% -> ~95% (round-trip + chmod + not-conn)
StatFile 33.3% -> ~95% (size assertion + not-conn + not-exist)
Close 42.9% -> ~95% (idempotent + never-connected)
Package overall: 71.6% -> 90.2% (+18.6pp; +5.2 above 85% gate).
Test infrastructure
- fakeSSHServer (~150 LoC): net.Listen + ed25519 host key +
PasswordCallback + PublicKeyCallback. Optional toggles for
rejectAuth / dropOnHandshake / failExec / failSFTP failure
modes.
- encodePEMBlock + base64Encode helpers (~50 LoC) for OpenSSH
private-key serialization. Avoids encoding/pem dep churn in
test header.
- t.Cleanup wires server shutdown + WaitGroup-drain of in-flight
connection handlers (no goroutine leaks).
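The in-process fixture pattern (ephemeral loopback listener, one goroutine per connection, WaitGroup drain on shutdown) can be sketched with the standard library alone. The block below is not the fakeSSHServer from the commit: it speaks a trivial one-line banner protocol instead of SSH, and `startFixture` and `readBanner` are hypothetical names. In a test, the returned `stop` function is what would be handed to t.Cleanup.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"strings"
	"sync"
)

// startFixture listens on an ephemeral loopback port and sends banner to
// every client. stop closes the listener and waits for in-flight handlers.
func startFixture(banner string) (string, func(), error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0") // port 0: the OS picks a free port
	if err != nil {
		return "", nil, err
	}
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		for {
			conn, err := ln.Accept()
			if err != nil {
				return // listener closed by stop
			}
			wg.Add(1)
			go func() {
				defer wg.Done()
				defer conn.Close()
				fmt.Fprintln(conn, banner)
			}()
		}
	}()
	stop := func() {
		ln.Close()
		wg.Wait() // drain handlers so no goroutine outlives the test
	}
	return ln.Addr().String(), stop, nil
}

// readBanner dials the fixture and returns the first line it sends.
func readBanner(addr string) (string, error) {
	conn, err := net.Dial("tcp", addr)
	if err != nil {
		return "", err
	}
	defer conn.Close()
	line, err := bufio.NewReader(conn).ReadString('\n')
	return strings.TrimRight(line, "\n"), err
}

func main() {
	addr, stop, err := startFixture("fixture-ready")
	if err != nil {
		panic(err)
	}
	defer stop()
	fmt.Println(readBanner(addr))
}
```

Binding to port 0 avoids collisions between parallel test runs, which is presumably why the commit's fixture does the same.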
Test groups
- Connect: password success / wrong-password / auth-rejected-all /
handshake-dropped / TCP-refused / key-auth success
- Execute: success / not-connected / exit-code-1
- WriteFile + StatFile: round-trip with size + chmod 0640
verification / not-connected / not-exist
- Close: idempotent / never-connected
Verification
- go test -short -count=1 ./internal/connector/target/ssh/...: PASS
- 20ms wall time
- go vet clean
Audit deliverables
- findings.yaml H-002 status partial_closed -> closed
(will update in extension-progress.md sweep)
- extension-progress.md: M.SSH-extended marked DONE
Closes: H-002 (SSH Connect / Execute / WriteFile branches)
Bundle: M.SSH-extended (Coverage Audit Extension)
|
||
|
|
f4385047b6 |
Bundle M (Coverage Audit Closure): connector failure-mode round — 3 of 4 sub-batches
M.F5 closes H-001; M.Email closes H-003; M.SSH partial-closes H-002; M.Cloud (H-004) deferred.
M.F5 (~430 LoC f5_realclient_test.go):
Coverage: 44.6% -> 90.1% (+45.5pp; +5.1 above 85% target)
Bypasses existing F5Client-interface mock; exercises every realF5Client
HTTP method end-to-end against httptest.Server with canned iControl REST
responses. 401-retry path verified. Per-fn ALL previously-0% lifted to
88-100%. Plus context-cancel test.
M.SSH (~150 LoC ssh_realclient_test.go) PARTIAL-CLOSED:
Coverage: 55.2% -> 71.6% (+16.4pp; below 85% target)
Covers buildAuthMethods all branches + WriteFile/Execute/StatFile
not-connected guards + Close idempotency.
Connect() ~50 LoC needs embedded golang.org/x/crypto/ssh server fixture
(~1000 LoC test infrastructure). Tracked as Bundle M.SSH-extended.
M.Email (~340 LoC email_failure_test.go):
Coverage: 39.7% -> 70.5% (+30.8pp; +0.5 above 70% target)
Hand-rolled minimal SMTP server (responds to EHLO/AUTH/MAIL/RCPT/DATA/
QUIT with canned 2xx/3xx/5xx responses based on per-test failOn map).
Tests:
- Header-injection (CWE-113): CR/LF/NUL in From/To/Subject reject
before any SMTP I/O (6 tests across sendEmail + sendHTMLEmail)
- Connection-refused for both sendEmail and sendHTMLEmail
- SendAlert / SendEvent full SMTP transactions (happy path)
- Server-side failures: RCPT 550, DATA 554
- AUTH PLAIN happy + 535-failure
M.Cloud (H-004) DEFERRED:
AzureKV 41.2% / GCP-SM 43.1%. Same M.F5 approach (httptest.Server +
OAuth2 token endpoint mock) is straightforward but ~600 LoC tests +
~200 LoC mock infrastructure exceeds session budget. Tracked as
Bundle M.Cloud-extended.
Verification:
go vet ./internal/connector/{target/f5,target/ssh,notifier/email}/... clean
gofmt -l clean
staticcheck -checks all clean
go test -short -count=1 PASS
F5 90.1% Email 70.5% SSH 71.6%
Audit deliverables:
findings.yaml: -0008 (F5) + -0010 (Email) -> closed; -0009 (SSH) ->
partial_closed; -0011 (Cloud) retained as deferred
gap-backlog.md: strikethroughs + Bundle M closure-log entry covering all 4 sub-batches
coverage-matrix.md: 3 new rows for F5/SSH/Email at post-Bundle-M coverage
closure-plan.md: Bundle M [~] with per-sub-batch status breakdown
CHANGELOG.md: [unreleased] Bundle M entry
|
||
|
|
c1a9bd60ef |
fix: resolve 8 staticcheck lint errors in test files
- SA1029: use typed context key instead of string in main_test.go
- S1039: remove unnecessary fmt.Sprintf in validation_test.go
- SA4023: fix unreachable nil check on concrete error type
- SA4006: fix unused variable assignments in stepca_test.go
(4 occurrences)
- SA4000: fix duplicate expression in ssh_test.go (BEGIN vs END
CERTIFICATE)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |
||
|
|
43e1c89623 |
test: comprehensive test gap closure across 24 packages
Close coverage gaps identified by dual-audit (qualitative +
quantitative).
New test files for config (0%→98%), router (0%→100%), handler
validation, health, audit, response helpers, webhook notifier
(0%→88%), email notifier, middleware (recovery, rate limiter), domain
profile, service nil-safety, config helpers, issuer bootstrap, and
server bootstrap wiring.
Expanded existing tests for ACME (34%→42%), step-ca (42%→52%), F5,
SSH, agent (43%→63%), scheduler (88%→99%), renewal service, and
issuerfactory.
All tests pass: go test -short, go vet, go test -race clean.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |
||
|
|
3cf75ffb73 |
feat(M38): SSH target connector for agentless deployment via SSH/SFTP
Adds a new target connector enabling certificate deployment to any
Linux/Unix server without installing the certctl agent binary. Uses
the proxy agent pattern: a single agent in the same network zone
deploys certs to remote servers over SSH/SFTP.
Key additions:
- SSH/SFTP connector with key auth (file/inline) + password auth
- Injectable SSHClient interface for cross-platform testing (25 tests)
- Shell injection prevention via validation.ValidateShellCommand()
- Configurable cert/key/chain paths with octal permissions
- GUI: 11 SSH config fields in target create wizard
Also fixes pre-existing frontend bug where all target type strings
(nginx, apache, etc.) were sent as lowercase but the backend expects
proper-case (NGINX, Apache, etc.), breaking GUI-created targets. Adds
missing TargetTypeSSH to validTargetTypes service map.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |