mirror of
https://github.com/shankar0123/certctl.git
synced 2026-06-07 14:11:31 +00:00
7e2481b22524ebdc8d03f44c62a6930cab1d1ab2
6 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
a849c8b8cf |
fix(security): close BUNDLE 2 — safe first run, demo mode, agent bootstrap
Bundle 2 closure (2026-05-12 acquisition diligence audit). Closes the
"docker compose up == accidental production" hazard: pre-Bundle-2 the
base deploy/docker-compose.yml WAS the demo path (AUTH_TYPE=none +
DEMO_MODE_ACK=true + KEYGEN_MODE=server + DEMO_SEED=true + literal
change-me-... placeholder creds), the README claimed "drop the demo
overlay for a clean install", and ENVIRONMENTS.md table documented
auth-type default as api-key — three contradictory stories layered on
the same compose file.
Source findings closed:
R2 R3 C1 D9 finding-2 S9 (repo audit)
SEC-H2 SEC-M1 SEC-M3 OPS-M3 LOW-5 HIGH-6 (cowork audit)
Compose split (deploy/docker-compose.yml + deploy/docker-compose.demo.yml):
The base now ships production-shaped — no AUTH_TYPE override, no
KEYGEN_MODE override, no DEMO_MODE_ACK, no DEMO_SEED, no literal
placeholder fallbacks. POSTGRES_PASSWORD / CERTCTL_AUTH_SECRET /
CERTCTL_CONFIG_ENCRYPTION_KEY / CERTCTL_API_KEY / CERTCTL_AGENT_ID
must come from deploy/.env (sample template in deploy/.env.example +
root .env.example). The demo overlay carries the full demo posture
(every env var + every placeholder credential) so the
`-f docker-compose.demo.yml` one-flag flip remains a zero-config
populated-dashboard path.
Fail-closed startup guards (internal/config/config.go::Validate):
Three new gates layered on the existing HIGH-12 demo-mode listen-bind
guard. All three exempt CERTCTL_DEMO_MODE_ACK=true so the demo overlay
keeps working:
• HIGH-6: AUTH_SECRET = "change-me-in-production" → refuse
• HIGH-6: CONFIG_ENCRYPTION_KEY = "change-me-32-char..." → refuse
• LOW-5: CORS_ORIGINS contains "*" (CWE-942 + CWE-352) → refuse
Visible DEMO MODE banner (cmd/server/main.go): every boot under
DEMO_MODE_ACK=true now emits a prominent WARN line with a 6-step
production-promotion checklist. The 2026-04-19 incident (a screenshot
run that kept running for three days) drove this; the per-startup
banner makes the posture unmissable in any log scraper.
Agent enrollment doc alignment:
• docs/reference/configuration.md L83: corrected the non-existent
URL `POST /api/v1/agents/register` to the real route
`POST /api/v1/agents`; added the bootstrap-token note and the
install-agent.sh handoff sequence.
• docs/reference/architecture.md L154: replaced "agents register
themselves at first heartbeat" (false — cmd/agent/main.go fail-
fasts when CERTCTL_AGENT_ID is unset) with the actual two-step
operator-driven flow (REST or GUI registration first, returned ID
fed to install-agent.sh second).
Tests + CI guard:
• 9 new TestValidate_Bundle2_* cases in internal/config/config_test.go
covering: placeholder-secret refused + demo-ack exempt; placeholder
encryption-key refused + demo-ack exempt; real key not mistaken for
placeholder; wildcard CORS refused + demo-ack exempt; wildcard mixed
into a concrete allowlist still refused; concrete allowlist accepted.
• scripts/ci-guards/B2-compose-base-no-demo-env.sh: greps the base
compose for any of the demo-mode env vars + placeholder credentials.
Comments stripped before checking so the narrative header in the
base file can still reference the overlay's posture in prose.
Cold-DB CI smoke (.github/workflows/ci.yml::cold-db-compose-smoke):
Switched to layering -f docker-compose.demo.yml on top of the base —
the new production base requires real env vars the smoke doesn't have,
and the smoke's purpose (catch migration-on-cold-DB regressions + the
bootstrap-token mint path) is orthogonal to which auth posture the
boot lands in.
Receipts:
• Current first-run truth table
compose flag → posture
-f docker-compose.yml (production)
→ requires .env;
fail-fasts on
missing AUTH_SECRET
/ CONFIG_ENCRYPTION
_KEY / POSTGRES
_PASSWORD; agent
fail-fasts on
missing AGENT_ID
-f docker-compose.yml -f docker-compose.demo.yml (demo)
→ zero-config;
AUTH_TYPE=none +
DEMO_MODE_ACK=true
+ KEYGEN=server +
DEMO_SEED=true;
boot banner WARN
-f docker-compose.yml -f docker-compose.dev.yml (dev)
→ base + PgAdmin
+ debug logging
-f docker-compose.test.yml (test, standalone)
→ production-shape
posture, real CA
backends
• Verification (PATH=/tmp/go/bin export GO* paths to /tmp):
gofmt -l # clean (no diffs)
go vet ./internal/config ./cmd/server # clean
go test -short -count=1 ./internal/config/... # PASS (cumulative +
all 9 new Bundle 2
cases green)
go test -short -count=1 # PASS (no regression
./internal/connector/target/configcheck in the Bundle 1 -
closure tests)
go build ./cmd/server ./cmd/agent # clean
./cmd/cli ./cmd/mcp-server
bash scripts/ci-guards/B2-compose-base-no-demo-env.sh # clean
bash scripts/ci-guards/H-1-encryption-key-min-length.sh # clean
bash scripts/ci-guards/G-3-env-docs-drift.sh # clean
Remaining operator warnings (not blocking; tracked in CLAUDE.md
"Open decisions"):
• The first `docker compose -f docker-compose.yml up -d` against a
pre-Bundle-2 .env (placeholder values still in place) will now
fail-fast. This is the intended posture but operators upgrading
from v2.0.x via .env-from-old-master need to rotate before
upgrading. The CHANGELOG note for the v2.1.0 release should
call this out alongside Auth Bundle 2's other breaking changes.
Audit-Closes: BUNDLE-2 R2 R3 C1 D9 S9 SEC-H2 SEC-M1 SEC-M3 OPS-M3 LOW-5 HIGH-6
|
||
|
|
2d9110b0c4 |
auth-bundle-2 Phase 0: dependency-add + oidc auth-type literal + runtime guard
Bundle 2 Phase 0 stages the dependencies + auth-type discriminator
literal that later phases consume. No handler chain wired yet; an
operator who sets CERTCTL_AUTH_TYPE=oidc on this commit gets a clear
refuse-to-start error rather than a silent fallback to api-key (the
G-1 failure mode that drove "jwt" out of the allowed set).
Deliverables:
* go.mod: github.com/coreos/go-oidc/v3 v3.18.0 added as a direct
require. Per the pre-bundle dependency audit (Apache-2.0, zero CVEs
ever per OSV.dev, 2,400+ stars, used by Hashicorp Vault + Dex +
Hydra + Authentik + every Kubernetes OIDC integration), this is the
ecosystem-standard Go OIDC client. Pinned to a specific minor
(v3.18.0) per the prompt's "no bare latest" rule.
* go.mod: golang.org/x/oauth2 promoted from // indirect to direct,
bumped from v0.34.0 to v0.36.0 by go mod tidy. Both versions are
OSV-clean. Maintained by the Go team.
* No JSON-path library added (forbidden by the dependency audit; the
group-claim resolver is hand-rolled in Phase 3).
* internal/config/config.go: AuthTypeOIDC constant added with a
load-bearing comment explaining (a) this is the AUTH-TYPE literal,
not a JWT alg literal, so the G-1 closure invariant is preserved
("jwt" stays out of ValidAuthTypes forever); (b) the runtime guard
in cmd/server/main.go intentionally refuses-to-start when oidc is
set pre-Phase-6 to avoid the silent-downgrade failure mode.
ValidAuthTypes() now returns {api-key, none, oidc}.
* internal/config/config_test.go: TestValidAuthTypesIsExactly_APIKey_None
renamed to TestValidAuthTypesIsExactly_APIKey_None_OIDC and now pins
the 3-entry set. TestValidAuthTypesDoesNotContainJWT (G-1 closure
test) still passes because "jwt" is never added back.
TestValidate_GenericInvalidAuthType's bad-types list updated:
"oidc" removed (now valid), "saml" added (correctly rejected per
Decision 5's SAML deferral).
* cmd/server/main.go: defense-in-depth runtime auth-type guard now
has an explicit AuthTypeOIDC case that exit(1)s with an actionable
message: "the OIDC auth chain is not yet wired in this build (Auth
Bundle 2 Phase 6 ships the session middleware that consumes this
auth-type literal)." This closes the lying-field gap the literal
would otherwise create. Phase 6 of Bundle 2 relaxes this case to
fall through alongside api-key + none.
* api/openapi.yaml: /v1/auth/info auth_type enum extended from
[api-key, none] to [api-key, none, oidc] with an in-line comment
explaining the Phase-0-vs-Phase-6 timing so an OpenAPI consumer
isn't surprised by "oidc" appearing here pre-Bundle-2-merge.
* deploy/helm/certctl/templates/_helpers.tpl::certctl.validateAuthType:
valid set extended to include "oidc". Chart-time validation now
passes for type=oidc; the binary's runtime guard takes over to
refuse the start. Once Bundle 2 ships, the runtime guard relaxes
and OIDC works end-to-end with no further chart edits.
* .env.example: CERTCTL_AUTH_TYPE comment block updated to document
the three valid values + the Phase-0-vs-Phase-6 timing.
* internal/auth/oidc/doc.go: new package directory with package doc
+ transitional blank imports for coreos/go-oidc/v3 + x/oauth2 so
go mod tidy keeps both deps as direct requires until Phase 3's
service.go replaces the blanks with real symbol use. Doc explains
the package layout (oidc/ + oidc/domain/ + oidc/groupclaim/ +
oidc/testfixtures/) so the post-Bundle-2 reader can navigate.
Verifications:
* gofmt clean on every changed file.
* go vet clean on internal/config + cmd/server + internal/auth/oidc.
* go test -short -count=1 green on internal/config (including the
G-1 closure + new validation tests), cmd/server, internal/auth (all
Bundle 1 packages), internal/service/auth.
* govulncheck ./... clean (M-024 hard CI gate).
* All 24 ci-guards pass locally.
Phase 0 exit criteria from cowork/auth-bundle-2-prompt.md:
* go.mod shows coreos/go-oidc/v3 as direct: yes.
* golang.org/x/oauth2 is direct (not indirect): yes.
* govulncheck ./... clean: yes.
* No JSON-path library in go.mod / go.sum deltas: confirmed (only
v3 of go-oidc + the x/oauth2 bump landed).
* make verify green: gofmt + vet + go test pass; full make verify
(which would invoke golangci-lint) deferred to CI since the
sandbox doesn't have golangci-lint installed; the operator runs
make verify locally before pushing per CLAUDE.md operating rule.
|
||
|
|
9c1d446e40 |
fix(security,config): remove unimplemented JWT auth-type, close silent downgrade (G-1)
The pre-G-1 config validator accepted CERTCTL_AUTH_TYPE=jwt and the
startup log faithfully echoed 'authentication enabled type=jwt'.
Reasonable people read that and concluded JWT auth was on. It wasn't.
The auth-middleware wiring at cmd/server/main.go unconditionally routed
every request through the api-key bearer middleware regardless of
cfg.Auth.Type. So CERTCTL_AUTH_TYPE=jwt quietly compared the incoming
'Authorization: Bearer <token>' against whatever string the operator put
in CERTCTL_AUTH_SECRET — real JWT clients got 401, and operators who
treated CERTCTL_AUTH_SECRET as a *signing* secret (because they thought
they were configuring JWT) had effectively handed an attacker an api-key.
A security finding masquerading as a config option.
We chose the audit-recommended structural fix: remove the option, fail
fast at startup, and add the gateway-fronting pattern as the documented
forward path. Implementing JWT middleware would have meant jwks vs
static-secret rotation, claim mapping, expiry enforcement, audience and
issuer validation, key rollover semantics, and regression coverage at the
same depth as the existing api-key path — a feature, not a fix. Operators
who genuinely need JWT/OIDC front certctl with an authenticating gateway
(oauth2-proxy / Envoy ext_authz / Traefik ForwardAuth / Pomerium /
Authelia) and run the upstream certctl with CERTCTL_AUTH_TYPE=none. Same
shape works on docker-compose and Helm.
The change is comprehensive across 7 phases — every surface that
mentioned 'jwt' as a certctl-auth-type is updated, plus structural
backstops (typed enum, runtime guard, helm template validation, CI grep
guard) so the lie can't reappear.
Files changed:
Phase 1 — production code (typed enum + jwt removal):
- internal/config/config.go: AuthType typed alias + AuthTypeAPIKey /
AuthTypeNone constants + ValidAuthTypes() helper. Validate() routes
literal 'jwt' through a dedicated multi-line diagnostic naming the
authenticating-gateway pattern, then cross-checks against
ValidAuthTypes(). Secret-required branch simplified to api-key-only.
Field comment on AuthConfig.Type rewritten to drop jwt and point at
the gateway pattern.
- internal/api/middleware/middleware.go: AuthConfig.Type field comment
references the typed config.AuthType constants.
- internal/api/handler/health.go: same treatment for HealthHandler.AuthType.
- cmd/server/main.go: defense-in-depth runtime switch immediately after
config.Load() — exits 1 on any unsupported auth-type that bypassed the
validator. Auth-disabled startup log explicitly names the
authenticating-gateway pattern.
Phase 2 — tests (Red→Green, contract pinning):
- internal/config/config_test.go: TestValidate_JWTAuth_RejectedDedicated
(two table rows pinning the dedicated G-1 error fires regardless of
whether Secret is set), TestValidAuthTypesDoesNotContainJWT (property
guard against future re-introduction),
TestValidAuthTypesIsExactly_APIKey_None (allowed-set contract),
TestValidate_GenericInvalidAuthType (pins non-jwt invalid values still
hit the generic invalid-auth-type error). Removed the prior
TestValidate_JWTAuth_MissingSecret happy-path since its premise is
inverted post-G-1.
- internal/api/handler/health_test.go: removed
TestAuthInfo_ReturnsAuthType_JWT (which baked the silent-downgrade lie
into the regression suite). Pre-existing _APIKey test continues to
cover the api-key happy path.
Phase 3 — spec, docs, env templates:
- api/openapi.yaml: auth_type enum dropped to [api-key, none] with
inline comment naming the G-1 closure.
- .env.example (root): CERTCTL_AUTH_TYPE comment block rewritten to drop
jwt and point at the gateway pattern; secret-required conditional
simplified to api-key-only.
- docs/architecture.md: middleware-stack bullet rewritten to drop the
JWT mention; new H3 'Authenticating-gateway pattern (JWT, OIDC, mTLS)'
section explaining the design rationale and listing oauth2-proxy /
Envoy ext_authz / Traefik ForwardAuth / Pomerium / Authelia / Caddy
forward_auth / Apache mod_auth_openidc / nginx auth_request as the
standard fronting options.
- docs/upgrade-to-v2-jwt-removal.md (new ~125 lines): migration guide
with preconditions, what-changes, both recovery paths, complete
docker-compose oauth2-proxy walkthrough, Traefik ForwardAuth and Envoy
ext_authz patterns, rollback posture.
Phase 4 — Helm chart (template validation + docs):
- deploy/helm/certctl/templates/_helpers.tpl: new certctl.validateAuthType
helper mirroring the existing certctl.tls.required pattern. Fails
template render on any server.auth.type outside {api-key, none} with
a multi-line diagnostic.
- deploy/helm/certctl/templates/server-deployment.yaml,
server-configmap.yaml, server-secret.yaml: invoke the helper at the
top of each template that depends on .Values.server.auth.type.
- deploy/helm/certctl/values.yaml: auth: block comment expanded with the
G-1 rationale and gateway-pattern cross-reference.
- deploy/helm/CHART_SUMMARY.md: server.auth.type table row now surfaces
the allowed set and points at the upgrade doc.
- deploy/helm/certctl/README.md: new 'JWT / OIDC via authenticating
gateway' section with a Kubernetes-flavored oauth2-proxy + certctl
walkthrough.
Phase 5 — release surface:
- CHANGELOG.md: new [unreleased] top entry with Breaking / Removed /
Added / Changed sections; explicit pointer at
docs/upgrade-to-v2-jwt-removal.md from the Breaking subsection.
Phase 6 — CI guardrail:
- .github/workflows/ci.yml: new 'Forbidden auth-type literal regression
guard (G-1)' step. Scoped patterns catch the actual regression shapes
(map literal, slice literal, switch case, OpenAPI enum, env-file
default, AuthType('jwt') cast). Comments and the dedicated rejection
branch are intentionally exempt; connector-package JWT references
(Google OAuth2 / step-ca) are exempt as out-of-scope external
protocols. Verified locally: the guard passes on the actual tree and
fires on all 4 synthetic regression patterns.
Out of scope (explicitly untouched):
- internal/connector/discovery/gcpsm/gcpsm.go — Google OAuth2 service-
account JWT (external protocol).
- internal/connector/issuer/googlecas/googlecas.go — same.
- internal/connector/issuer/stepca/stepca.go — step-ca's provisioner
one-time-token JWT for /sign API.
- docs/test-env.md, docs/connectors.md, docs/features.md — describe
external CAs' use of JWT, not certctl's auth shape.
- Implementing actual JWT middleware. Feature, not a fix.
Verification (all gates pass):
- go build ./... — clean
- go vet ./... — clean
- go test -short ./... — every package green
- go test -short -race ./internal/config/... ./internal/api/... — clean
- govulncheck ./... — no vulnerabilities in our code
- helm lint deploy/helm/certctl/ — clean
- helm template with auth.type=api-key — renders OK
- helm template with auth.type=none — renders OK
- helm template with auth.type=jwt — fails with validateAuthType
diagnostic (exit 1)
- python3 yaml.safe_load on api/openapi.yaml — parses
- CI guardrail mirror — clean on real tree, fires on all 4 synthetic
regression patterns
- Smoke test: 'CERTCTL_AUTH_TYPE=jwt ./certctl-server' exits non-zero
with: 'Failed to load configuration: CERTCTL_AUTH_TYPE=jwt is no
longer accepted (G-1 silent auth downgrade): no JWT middleware ships
with certctl. To use JWT/OIDC, run an authenticating gateway
(oauth2-proxy / Envoy ext_authz / Traefik ForwardAuth / Pomerium) in
front of certctl and set CERTCTL_AUTH_TYPE=none on the upstream.
See docs/architecture.md "Authenticating-gateway pattern" and
docs/upgrade-to-v2-jwt-removal.md for the migration walkthrough'
config pkg coverage: ValidAuthTypes 100%, Validate 94.7%, total 75.5%.
Refs: coverage-gap-audit-2026-04-24-v5/unified-audit.md
§2 P1 cluster, cat-g-jwt_silent_auth_downgrade
Audit recommendation followed verbatim: 'Remove jwt from
validAuthTypes until middleware ships'.
|
||
|
|
af47d19ae2 |
fix(deploy,examples,env): close U-1 trap end-to-end across Helm, examples, and root env
Follow-up to |
||
|
|
c153361bbc |
Rewrite README and .env.example to match actual implementation
README.md:
- Replace ASCII architecture diagram with Mermaid
- Fix all database table names (managed_certificates, audit_events, etc.)
- Fix env var names to use CERTCTL_ prefix matching config.go
- Fix API endpoint paths ({id} not :id, /audit not /audit/logs)
- Add all missing endpoints (renew, deploy, CSR, heartbeat, policies, notifications)
- Add dashboard as primary feature (was completely missing)
- Link to all new docs (concepts, advanced demo, architecture, connectors)
- Fix integration status (Local CA implemented, ACME in progress)
- Fix security section (API key auth, not mTLS)
- Remove broken links to non-existent docs (api.md, k8s-deployment.md, scaling.md)
- Remove placeholder Support & Community section
.env.example:
- Change all var names to CERTCTL_ prefix (CERTCTL_DATABASE_URL, etc.)
- Remove vars that don't exist in config.go (ACME_*, SMTP_*, feature flags)
- Add scheduler tuning vars as commented examples
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
||
|
|
d395776a95 | Initial scaffold: certificate control plane v0.1.0 |