mirror of
https://github.com/shankar0123/certctl.git
synced 2026-06-07 16:51:31 +00:00
dfdba5b260545f2895e3a79b3327e2e5fd26363a
94 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
dfdba5b260 |
test(gui): Vitest coverage for the 2026-05-10/11 GUI batch (Fix 12)
Audit 2026-05-11 Fix 12 closure. The original GUI-batch commit
|
||
|
|
ad69158405 |
Merge Fix 07 (HIGH A-7): editable Advanced form on OIDCProviderDetailPage (MED-4)
# Conflicts: # CHANGELOG.md # web/src/pages/auth/OIDCProviderDetailPage.test.tsx # web/src/pages/auth/OIDCProviderDetailPage.tsx |
||
|
|
4e31568d3d |
Merge Fix 05 (HIGH A-5): approval payload preview with profile-edit diff + cert-issuance preview
# Conflicts: # CHANGELOG.md |
||
|
|
df53b80cb6 | Merge Fix 03 (CRIT A-3): expose AllowedEmailDomains on create + edit forms | ||
|
|
9af5dad2b0 |
feat(gui/oidc): editable Advanced form on OIDCProviderDetailPage (A-7 / MED-4)
The 2026-05-10 audit tagged MED-4 as DEFERRED to v3 with the rationale
"backend already accepts the five fields." The 2026-05-11 adversarial
review verified the deferral framing was inaccurate — the read-only
`<dl>` rendered scopes / groups_claim_path / groups_claim_format /
iat_window_seconds (and persisted but invisible jwks_cache_ttl_seconds),
which gave operators the impression those fields were editable.
Switching to edit mode revealed no inputs but the saveEdit handler at
OIDCProviderDetailPage.tsx:107-134 silently passed `provider.scopes` /
`provider.groups_claim_path` / etc. through to the PUT body unchanged
from the loaded provider object.
Result: a "lying UX" anti-pattern. The page collected updates to other
fields (display name, issuer URL, client secret, redirect URI,
fetch_userinfo), the PUT succeeded with HTTP 204, and no error fired —
but the displayed Advanced values were whatever the create form
persisted or curl last set. A second operator bumping `iat_window_seconds`
from 60 to 300 had to drop to curl. The "DEFERRED to v3" framing hid
the gap from acquisition reviewers who only inspect the GUI.
Closure (frontend-only — backend already accepts all 5 fields on
`PUT /api/v1/auth/oidc/providers/{id}`):
OIDCProviderDetailPage.tsx
- New `<details data-testid="oidc-provider-edit-advanced">` section
collapsed by default inside the edit form. Most edits don't
touch these fields, so they shouldn't clutter the primary form.
- Five new inputs wired through component state:
* `editScopesInput` — text input rendered as space-separated
string per OIDC convention (every IdP docs page shows scopes
that way). Submit splits on whitespace + filters empty strings.
* `editGroupsClaimPath` — text input with `groups` default.
* `editGroupsClaimFormat` — select with the actual backend enum
`string-array` | `json-path` (NOT `string_array` /
`space_separated` / `comma_separated` as the spec mistakenly
proposed — those values don't exist in
`internal/auth/oidc/domain/types.go::GroupsClaimFormat*`).
* `editIATWindow` — number input with `min=1, max=600` matching
`MaxIATWindowSeconds=600` from the domain validator.
* `editJWKSCacheTTL` — number input with `min=60` matching
`MinJWKSCacheTTLSeconds=60`.
- `startEdit` pre-populates all five from the live provider so
operators see current values when expanding the section.
- `saveEdit` validates client-side mirroring the backend
`Validate` rules (empty scopes / empty path / invalid format /
IAT out of (0, 600] / JWKS < 60) → inline error + does NOT
POST. Server is still source-of-truth; any 400 surfaces via
the existing error UI.
- Read-only `<dl>` gained the previously-invisible
`jwks_cache_ttl_seconds` row so all five values are visible
without entering edit mode.
Each input carries a help paragraph linking the operator mental
model to the backend semantic (e.g. Keycloak's
`realm_access.roles`, Auth0's namespaced claims; RFC 7519 §4.1.6
for IAT; MED-6 auto-refresh-on-cache-miss for the JWKS TTL).
Tests (9 new + 5 pre-existing, all passing under vitest):
A-7 Advanced details section is collapsed by default and visible
in edit mode — pin <details> has no `open` attribute initially.
A-7 Advanced fields pre-populate from the live provider — start
edit with a non-default provider (Keycloak shape: realm_access.roles,
json-path, IAT=120, JWKS TTL=600); assert each input carries the
live value.
A-7 all five Advanced fields round-trip into the PUT body — change
every field, submit, assert the PUT body carries the parsed shapes
(whitespace-normalized scopes array, trimmed groups_claim_path,
enum value, numeric values).
A-7 IAT window above 600 rejects with inline error and does NOT POST
— operator types 601, save handler rejects before reaching
updateOIDCProvider.
A-7 IAT window <= 0 rejects with inline error.
A-7 JWKS cache TTL below 60 rejects with inline error.
A-7 empty scopes input rejects — guards against operator
accidentally wiping the array via whitespace.
A-7 empty groups-claim-path rejects.
A-7 unchanged Advanced fields still round-trip as the existing
values — pin that a name-only edit still carries the live
advanced config (no regression to the pass-through behavior;
operators don't lose their config when editing other fields).
Verify gate green: tsc --noEmit clean; vitest passes all 14 tests
in OIDCProviderDetailPage.test.tsx (5 pre-existing + 9 new A-7
cases).
Spec at cowork/auth-bundles-fixes-2026-05-11/07-high-oidc-provider-advanced-form.md.
Audit doc: MED-4 section in cowork/auth-bundles-audit-2026-05-10.md
appended with the A-7 follow-up closure annotation correcting the
"DEFERRED to v3" framing and explaining the lying-UX pattern;
status table row updated from "CLOSED" (incorrectly tagged on the
pass-through behavior) to "CLOSED 2026-05-11 (A-7)" with the
5-field enumeration. Operator-visible CHANGELOG.md entry under
Security retires the lying-UX caveat.
|
||
|
|
f502da306f |
feat(gui/approvals): payload preview with profile-edit diff + cert-issuance preview (A-5)
The MED-10 closure claim in `cowork/auth-bundles-audit-2026-05-10.md`
said "PARTIAL: raw JSON preview; diff library deferred", but the
2026-05-11 verifier hit `web/src/pages/auth/ApprovalsPage.tsx` and
found ZERO payload rendering — only a doc-comment mention. Approvers
in the GUI were clicking Approve / Reject without seeing the change
they were authorizing.
That defeats the entire two-person-approval primitive. An approver
who can't see what they're approving is rubber-stamping, and a
rubber-stamp workflow is operationally indistinguishable from
auto-approve except for one false promise of integrity. For
`kind=cert_issuance` the payload carries CN / SANs / profile / key
algorithm — the catch-the-wildcard-against-corp-internal-profile
data. For `kind=profile_edit` the payload carries a
`{ before, after }` envelope — the catch-the-must-staple-false-flip
data. Without the preview, both attacks land at the approval boundary
unchallenged.
Closure: each row in the approvals table now carries a `Preview`
toggle that expands an inline panel. Dispatch by `kind`:
- profile_edit → ProfileEditDiff. Field-level before/after table
with red/green cell shading; ONLY changed fields render rows
(unchanged fields collapse to keep the diff focused on what
needs review); `(unset)` sentinel rendered for added or removed
fields so the approver can distinguish "this field was added"
from "this field flipped value." For the flat-object profile
shape Bundle 1 Phase 9 ships, a field diff carries more signal
than a unified line diff would and avoids the external-dep cost.
- cert_issuance → IssuanceRequestPreview. Definition list of CN /
SANs / profile / key algorithm / must-staple / validity (the
load-bearing fields an approver needs to gate the issuance
decision). Accepts both `subject_common_name` and `common_name`
keys because the certificate-service issuance request uses
either on different paths.
- any other kind → generic <pre> JSON dump. Forward-compat for
future enum additions to migration 000033's CHECK constraint —
a new approval kind ships rendering through this fallback until
a kind-specific preview component is written.
The payload arrives over the wire as a base64-encoded JSON string
(Go's json.Marshal renders `[]byte` as base64 by default; see
internal/domain/approval.go:41 where `Payload []byte`). The new
exported `decodePayload(payload)` helper atob()s + JSON.parse()s,
returning null on any failure. Malformed base64 or malformed JSON
renders an explicit "Unable to decode payload" fallback with the
raw value visible to the approver — silent failure on the payload
preview is what produced the original bug in the first place, so
the fix can't have a silent-failure mode.
Component dispatch and base64 decode are also exposed for testing:
decodePayload(undefined) → null
decodePayload('') → null
decodePayload(btoa(JSON.stringify(x))) → x
decodePayload('!!!not-base64!!!') → null (atob throws)
decodePayload(btoa('not a json document')) → null (JSON.parse throws)
Each interactive element carries a data-testid so future E2E
coverage can exercise the contract without brittle CSS selectors —
same pattern as Bundle 1's RolesPage.
Tests (13 total, all passing under vitest):
Page-level (8):
A-5 Preview button toggles the payload panel
A-5 ProfileEdit kind renders field diff with changed-only rows
A-5 ProfileEdit before/after values are visible in the diff cells
A-5 ProfileEdit with no changes renders empty-state
A-5 CertIssuance renders definition list with SANs + profile + key algo
A-5 Unknown kind falls back to generic JSON pre block
A-5 Empty payload renders the "No payload attached" sentinel
A-5 Malformed base64 payload renders the decode-error fallback
decodePayload pure-function suite (5):
returns null for undefined input
returns null for empty string
round-trips base64-encoded JSON
returns null on malformed base64
returns null on valid base64 of non-JSON content
Verify gate green: tsc --noEmit clean; vitest passes all 17 tests
in ApprovalsPage.test.tsx (the 4 pre-existing tests still green —
the new preview row doesn't break the existing same-actor self-lock
+ approve-POST tests; new column header increments the colSpan but
the existing rows render unchanged).
Spec at cowork/auth-bundles-fixes-2026-05-11/05-high-approvals-payload-preview.md.
Audit doc: MED-10 row in `cowork/auth-bundles-audit-2026-05-10.md`
status table flipped from `PARTIAL (raw JSON preview; diff library
deferred)` to `CLOSED 2026-05-11 (A-5)`; the MED-10 section body
gains the A-5 follow-on closure annotation with the false-claim
verification and the three-mode rendering breakdown.
Operator-visible CHANGELOG.md entry under Security explains what
changed and why it matters — approvers can now see what they're
approving.
|
||
|
|
cc8024932b |
feat(gui/oidc): expose AllowedEmailDomains on create + edit forms (A-3)
The CRIT-5 closure (2026-05-10) made `OIDCProvider.AllowedEmailDomains`
load-bearing on the OIDC login path: a token whose email domain isn't in
the configured allowlist gets ErrEmailDomainNotAllowed. But the GUI never
exposed the field — `web/src/pages/auth/OIDCProvidersPage.tsx`'s create
form had zero inputs for it, and `OIDCProviderDetailPage.tsx` neither
rendered nor edited the value.
For multi-tenant IdPs (Auth0, Azure AD common endpoint, Google Workspace)
this is the single most important provider knob — the difference between
"anyone in any tenant of this IdP can log in" and "only @acme.com can log
in." Operators driving certctl from the GUI had no way to know the field
exists, let alone set it. Same shape as CRIT-5's pre-closure state: the
control was claimed, persisted, accepted via API, but invisible at the
surface 90% of operators actually use.
Closure across both GUI pages:
web/src/pages/auth/OIDCProvidersPage.tsx
- Create modal gains a chip-style multi-input below fetch_userinfo.
- New exported `validateEmailDomain(s)` mirrors the backend validator
(CRIT-5 closure rules: no @ / no whitespace / no wildcards /
lowercase only / must be FQDN). Returns "" on accept, a
non-empty error string on reject. Server is still the source of
truth — server-returned 400s render via the existing error UI.
- Inline "addEmailDomain" handler: trim → lowercase → validate →
dedupe → push onto form.allowed_email_domains. Enter key in the
input adds the entry without requiring a click on Add.
- Each chip carries a × remove button + data-testid plumbing for
E2E coverage.
web/src/pages/auth/OIDCProviderDetailPage.tsx
- Read-only view's <dl> renders a new row "Allowed email domains"
with an explicit "any (no gate configured)" sentinel when the
list is empty. Operators can tell the difference between "not
configured" and "field exists but the GUI doesn't show it" — the
whole class of lying-field this fix exists to retire.
- Edit form mirrors the create-modal chip control + pre-populates
from provider.allowed_email_domains at startEdit time (defensive
clone so chip mutations don't reach through into the cached
TanStack Query data).
- Save round-trips the trimmed list as `allowed_email_domains` in
the PUT body alongside the other editable fields.
- "Clear all" affordance with a confirm() dialog that warns about
removing the tenant gate (cross-tenant logins permitted after
save) — for operators who want to test enforcement-off then turn
back on without retyping the full domain list.
- Imports `validateEmailDomain` from OIDCProvidersPage for parity.
web/src/api/client.ts
- No changes — `allowed_email_domains?: string[]` was already in
both OIDCProvider and OIDCProviderRequest types. The CRIT-5
backend closure had already shipped the type but no GUI consumer
ever used it.
Regression coverage (Vitest, all passing):
OIDCProvidersPage.test.tsx (7 new):
AllowedEmailDomains — Add persists a chip and is included in submit body
AllowedEmailDomains — rejects entries containing @
AllowedEmailDomains — rejects wildcard entries
AllowedEmailDomains — normalizes mixed-case input to lowercase
AllowedEmailDomains — Enter key adds the entry without clicking Add
AllowedEmailDomains — chip × button removes the entry
AllowedEmailDomains — duplicate entry is rejected
validateEmailDomain unit suite (7 new):
accepts a plain lowercase FQDN (with multi-label TLDs)
rejects entries containing @ (with leading-@ variant)
rejects entries with whitespace (with tab variant)
rejects wildcards (with both *.x and x.* variants)
rejects mixed-case
rejects bare hostnames (no dot)
rejects empty strings
OIDCProviderDetailPage.test.tsx (5 new):
AllowedEmailDomains — read-only view shows configured entries
AllowedEmailDomains — read-only view shows "any" sentinel when empty
AllowedEmailDomains — edit form pre-populates + PUT round-trips
AllowedEmailDomains — removing a chip and saving submits the trimmed list
AllowedEmailDomains — Add validates against backend rules
Verify gate green: `tsc --noEmit` clean across the web/ tree;
OIDCProvidersPage + OIDCProviderDetailPage suites pass all 29 tests
(19 + 10) — 13 of those are new A-3 cases, 16 were existing CRIT-5 /
Bundle 2 Phase 8 coverage. Three pre-existing test failures in
AuthSettingsPage.test.tsx + KeysPage.test.tsx confirmed unrelated
(reproduce on the base commit `191384c` without any of this fix's
changes applied; not in scope for this CRIT fix).
Spec at cowork/auth-bundles-fixes-2026-05-11/03-crit-allowed-email-domains-gui.md
Closure annotation appended to CRIT-5 row of cowork/auth-bundles-audit-2026-05-10.md;
Lying-fields cross-reference table row #1 marked closed across both
the backend (CRIT-5, 2026-05-10) and GUI (A-3, 2026-05-11) legs.
Operator advisory in CHANGELOG.md v2.1.0 release notes — operators
who provisioned OIDC providers through the GUI between v2.1.0 and
this fix should verify allowed_email_domains matches their tenant
policy (the field was configurable only via API / MCP / direct SQL
during that window).
|
||
|
|
78485f7429 |
fix(auth/users): close MED-11 lying field — DeactivatedAt loaded + enforced on login (A-2)
The MED-11 closure shipped users.deactivated_at + DELETE /api/v1/auth/users/{id}
+ cascade-revoke, but the federated-user soft-delete was reversible: the next
OIDC login under the same (provider, subject) tuple re-minted a session and
re-elevated the user.
Three legs of the chain were severed (each independently CRIT-shaped):
Leg A — postgres/user.go::userColumns omitted `deactivated_at`, so scanUser
never populated User.DeactivatedAt. Every Get / GetByOIDCSubject /
ListAll returned DeactivatedAt = nil regardless of the column value.
Leg B — postgres/user.go::Update SQL omitted `deactivated_at = $X`, so the
handler's `u.DeactivatedAt = now()` mutation was a no-op write at
the SQL level. Even with leg A closed, no row ever flipped.
Leg C — oidc/service.go::upsertUser did not inspect DeactivatedAt on the
existing-user path. Even with legs A + B closed, the OIDC login
would still proceed normally.
The cascade-session-revoke half of the original closure remained correct, but
only for the duration of the user's current cookie. SOC 2 CC6.3 + ISO 27001
A.9.2.6 "user access removal" controls require both immediate revoke AND
persistent block — this fix restores the persistent-block leg.
Closure across layers:
internal/repository/postgres/user.go
- userColumns adds `deactivated_at`
- scanUser reads via sql.NullTime intermediate (column is nullable)
- Create writes deactivated_at explicitly (NULL for new active users;
forward-compat for future seed-data flows that pre-populate the column)
- Update writes deactivated_at on every call; nil DeactivatedAt → NULL
(supports reactivation)
internal/auth/oidc/service.go
- New sentinel ErrUserDeactivated
- upsertUser checks existing.DeactivatedAt != nil BEFORE mutating email /
display_name / last_login_at — preserves last_login_at forensics on
rejected login attempts (defense-in-depth pin against future
"performance optimization" that reorders the gate)
internal/api/handler/auth_session_oidc.go
- classifyOIDCFailure adds typed errors.Is dispatch for ErrUserDeactivated
→ audit category "user_deactivated" (SOC/SIEM observability surface)
internal/api/handler/auth_users.go
- Self-deactivate guard on Deactivate: HTTP 409 + audit row
auth.user_deactivate_self_rejected when caller targets own User row.
Prevents an admin from one-way-door locking themselves out via the
standard handler; break-glass remains the recovery path.
- New Reactivate handler: inverse of Deactivate. Clears DeactivatedAt
via Update; emits auth.user_reactivated audit row. Idempotent on
already-active rows. Sessions revoked at deactivation stay revoked
(cascade irreversible by design — user must complete fresh OIDC
login).
internal/api/router/router.go
- POST /api/v1/auth/users/{id}/reactivate wired with auth.user.deactivate
gate (reactivation is the inverse op, not a separate privilege)
web/src/api/client.ts + web/src/pages/auth/UsersPage.tsx
- authReactivateUser() client function
- Reactivate button on deactivated rows in UsersPage
Regression coverage:
Postgres (testcontainers, skipped under -short):
TestUserRepository_DeactivatedAt_RoundTrip — Create → set DeactivatedAt
→ Update → Get / GetByOIDCSubject / ListAll round-trip the value
TestUserRepository_DeactivatedAt_CreateWritesNullForActive — new active
user reads back DeactivatedAt = nil
TestUserRepository_DeactivatedAt_CreatePersistsPreDeactivated — Create
with non-nil DeactivatedAt round-trips (forward-compat path)
OIDC service:
TestService_HandleCallback_RejectsDeactivatedUser — errors.Is
ErrUserDeactivated; CallbackResult nil; persisted email / last_login_at
/ deactivated_at NOT mutated by the rejected attempt
TestService_HandleCallback_AllowsReactivatedUser — DeactivatedAt = nil
→ happy path resumes
TestService_HandleCallback_DeactivatedUserPreservesForensics —
defense-in-depth pin against future regressions that reorder the
gate-vs-mutation sequence
Classifier:
TestClassifyOIDCFailure extended — typed dispatch + wrapped variant
round-trip through errors.Is
Handler:
TestAuthUsers_Deactivate_RejectsSelfDeactivate — HTTP 409 + audit
row + cascade-revoke NOT fired + row stays active
TestAuthUsers_Deactivate_OtherUser_HappyPath — HTTP 204 + cascade
fires + row soft-deleted
TestAuthUsers_Reactivate_HappyPath / _IdempotentOnActiveUser /
_UnknownID / _MissingID / _UpdateError
Phase 6 verify gate green on the targeted packages: gofmt clean, go vet
clean, go test -short pass across internal/auth/oidc, internal/api/handler,
internal/api/router, internal/repository/postgres, internal/auth/...,
internal/service/..., internal/tlsprobe/..., internal/trustanchor/...,
internal/validation/...
Spec at cowork/auth-bundles-fixes-2026-05-11/02-crit-deactivated-at-enforcement.md
Closure annotation at cowork/auth-bundles-audit-2026-05-10.md MED-11 row.
Operator advisory in CHANGELOG.md v2.1.0 release notes.
|
||
|
|
191384c1d2 |
feat(gui): auth GUI batch — MED-4/7/8/10/11/12 + LOW-1/11/12 + HIGH-10 GUI half
Audit 2026-05-10 GUI batch closure. WHAT. Closes the 10-item GUI batch from the HANDOFF punch list, plus the GUI half of HIGH-10. Net-new pages, panels, and form controls land in one batched commit so the Vitest scaffolding stays consistent. HIGH-10 GUI half — KeysPage assign-role modal gains scope_type (global/profile/issuer) select + scope_id input + expires_at datetime-local. Validates scope_id required when type != global. Threads through the api/client.ts AssignKeyRoleOptions extension that was prepared on the backend side in |
||
|
|
0f340beb14 |
fix(auth/ux): cause-aware OIDC + session error surfacing (HIGH-7 + HIGH-8 closure)
Server (HIGH-7): the OIDC callback failure path now 302-redirects to /login?error=oidc_failed&reason=<category> instead of emitting a blank 400. `category` is the existing audit `failure_category` value; classifyOIDCFailure was extended with three new sentinel paths (email_domain_not_allowed, email_missing_but_required, pkce_invalid) so CRIT-5 + PKCE failures get distinguishable GUI rendering. Audit-log observability is unchanged — the same failure_category is written to the auth.oidc_login_failed audit row; the 302 is purely a UX leg layered on top. Server (HIGH-8): SessionMiddleware now stashes a cause classification on the request context when Validate returns an error, mapping the sentinels via classifySessionError (errors.Is-based, so wrapped sentinels still classify) to the stable wire-strings idle_timeout / absolute_timeout / back_channel_revoked / invalid_token. The 401 emit point in bearerSkipIfAuthenticated reads the stashed cause and emits WWW-Authenticate: Bearer realm="certctl", error="invalid_token", error_description=<cause> per RFC 6750 §3. GUI (HIGH-7): LoginPage reads ?error= + ?reason= from the URL via react-router useSearchParams and renders an operator-friendly amber-bordered banner above the form; OIDC_FAILURE_REASON_TEXT maps all 16 known categories with a defensive 'unspecified' fallback for forward-compat with future server-side categories. GUI (HIGH-8): api/client fetchJSON parses the WWW-Authenticate cause via parseWWWAuthenticateCause and attaches it to the 'certctl:auth-required' CustomEvent detail; AuthProvider redirects to /login?session_expired=<cause> on cause-aware 401s; LoginPage renders a blue-bordered session-cause banner. invalid_token stays on the current page (no hard redirect for opaque failures). Misc cleanup: ErrorState now accepts the title/message/data-testid form added by CRIT-4 BreakglassPage (was erroring tsc on master). Regression matrix: - internal/api/handler/oidc_redirect_categories_test.go pins all 16 failure categories to the 302 + reason= location + audit-row leg - internal/auth/session/www_authenticate_test.go pins the 4 stable cause categories on classifySessionError (incl. errors.Is wrapped sentinels) + the WWW-Authenticate emission across all 4 categories + the no-session-context fallback case - internal/api/handler/auth_session_oidc_test.go: 4 pre-existing TestLoginCallback_*Returns400 tests updated to assert 302 + reason= location (the wire shape changed from 400 to 302, but the audit observability and behaviour-equivalent failure-classification are preserved) - web/src/pages/LoginPage.test.tsx: 6 new cases pinning the failure banner, session-cause banner, unknown-reason fallback, and forward-compat 'unspecified' category Spec: cowork/auth-bundles-fixes-2026-05-10/08-high-7-8-error-surfacing.md Closes: HIGH-7, HIGH-8 of cowork/auth-bundles-audit-2026-05-10.md |
||
|
|
f1d97710e1 |
feat(gui+auth): break-glass admin GUI surface (CRIT-4 closure)
Closes CRIT-4 of the 2026-05-10 audit. Bundle 2 Phase 7.5 shipped the
break-glass backend (Argon2id + lockout + 4 endpoints) but no GUI
surface. Operators recovering during an SSO outage had to hand-craft
curl commands — operationally hostile and the opposite of what
docs/operator/security.md advertised. This commit closes the gap.
Three GUI surfaces:
1. LoginPage.tsx — inline "Use break-glass account (SSO outage
recovery)" toggle below the API-key form. Clicking reveals an
amber-bordered inline form (actor-id + password, autocomplete=off).
Calls breakglassLogin(actor_id, password); on success navigates
to "/" where AuthProvider re-validates via the session-cookie path.
Intentionally low-visibility (text-amber-600 small text) — this is
the deliberate-bypass path, not the everyday-login path.
2. web/src/pages/auth/BreakglassPage.tsx — admin page at /auth/breakglass
(permission-gated by auth.breakglass.admin). Three sections:
- Sticky security banner ("every action audited; use only during
incidents").
- Set/rotate-password form (≥12-char + confirm-match).
- Credentialed-actor table with rotate / unlock (disabled when
not locked) / remove per row. Remove requires type-the-actor-id
confirmation.
3. Layout.tsx nav — "Break-glass" entry under the auth section. Visible
to all callers; the page itself permission-gates (server-side 403 is
the load-bearing defense). Cosmetic hide-when-no-perm is deferred
to fix 14's LOW bundle.
Backend support (new endpoint required to enumerate credentialed actors):
- internal/repository/breakglass.go — BreakglassCredentialRepository
gains List(ctx, tenantID) method.
- internal/repository/postgres/breakglass.go — postgres impl; reuses
the existing breakglassColumns / scanBreakglass helpers.
- internal/auth/breakglass/service.go — Service.List(ctx) method;
returns ErrDisabled when CERTCTL_BREAKGLASS_ENABLED=false (handler
maps to 404 for surface invisibility).
- internal/api/handler/auth_breakglass.go — ListCredentials handler;
password_hash field NEVER serialized to the wire (response shape
is intentionally limited to actor_id + timestamps + failure_count +
locked_until).
- internal/api/router/router.go — registers GET
/api/v1/auth/breakglass/credentials gated by auth.breakglass.admin.
- internal/api/router/openapi_parity_test.go — SpecParityExceptions
entry for the new endpoint (full OpenAPI row rides along with the
next OpenAPI sweep).
GUI api/client.ts gains breakglassListCredentials() + the
BreakglassCredentialRow type matching the wire shape.
Six Vitest cases in BreakglassPage.test.tsx pin the contract:
permission gate (forbidden state when caller lacks the perm; admin
surface when they have it), set-password mismatch rejection, set-
password below-threshold-length rejection, unlock-disabled-when-not-
locked, remove-modal type-confirm.
Verification gate green:
- gofmt -l clean on all touched files
- go vet clean
- go test -short -count=1 on internal/api/router (TestRouter_OpenAPIParity
+ TestRouterRBACGateCoverage + TestRouter_AuthExemptAllowlist),
internal/api/handler (all BCL tests + ListCredentials),
internal/auth/breakglass (Service.List + stubRepo.List),
internal/repository/postgres, internal/domain/auth (auditor pin)
— all pass.
CRIT-1 + CRIT-2 + CRIT-3 from the same audit are already closed on
this branch (commits
|
||
|
|
9143003e95 |
auth-bundle-2 Phase 8: GUI auth surface (OIDC providers + group mappings + sessions + LoginPage IdP buttons + AuthState refactor + logout wiring)
Closes Phase 8 of cowork/auth-bundle-2-prompt.md. Every Bundle 2 endpoint
now has a permission-gated, data-testid-instrumented React surface.
Frontend changes
================
api/client.ts (Category H — AuthState refactor):
* fetchJSON now sends `credentials: 'include'` on every request so the
HttpOnly session cookie + the JS-readable CSRF cookie ride along with
Bearer-mode requests transparently. Mode is determined per call by
what cookies are present, NOT by a state-machine — the same client
works for Bearer-only deploys, session-only deploys, and the mixed
upgrade path described in cowork/auth-bundles-index.md Category H.
* readCSRFCookie() + isStateChangingMethod() helpers auto-attach
`X-CSRF-Token` to POST/PUT/PATCH/DELETE when the CSRF cookie exists.
Bearer-only callers ride through unchanged (no CSRF cookie → no
header → backend's CSRF middleware skips).
* AuthInfoResponse extended with optional `oidc_providers?:
AuthInfoOIDCProvider[]` matching the Phase 6 server extension.
* New API helpers (1:1 with Phase 5 / 7.5 endpoints):
- listOIDCProviders / createOIDCProvider / updateOIDCProvider /
deleteOIDCProvider / refreshOIDCProvider
- listGroupMappings / addGroupMapping / removeGroupMapping
- listSessions(actorID?, actorType?) / revokeSession / logout
- breakglassLogin / breakglassSetPassword / breakglassUnlock /
breakglassRemove
Permission gates fire server-side; the GUI predicates are UX only.
pages/auth/OIDCProvidersPage.tsx (NEW):
* Lists configured OIDC providers, gated on `auth.oidc.list`.
* Empty state + error state + loading state.
* Embedded Configure-Provider modal with form fields for name,
issuer_url, client_id, client_secret, redirect_uri,
groups_claim_path/format, fetch_userinfo, scopes. Modal hidden
unless caller has `auth.oidc.create`.
* Unsaved-changes confirmation on cancel.
pages/auth/OIDCProviderDetailPage.tsx (NEW):
* Provider config dl + edit/delete/refresh action buttons.
* Edit and refresh require `auth.oidc.edit`. Delete requires
`auth.oidc.delete`.
* Type-confirm-name delete dialog. Surfaces server's 409 Conflict
("ErrOIDCProviderInUse") inline so the operator knows to revoke
the provider's active sessions first.
* Refresh discovery cache button → POST .../refresh → server re-runs
RefreshKeys with the IdP-downgrade-attack defense from Phase 3.
* Group→role mappings link.
pages/auth/GroupMappingsPage.tsx (NEW):
* Per-provider group-claim → role-id mapping CRUD.
* Empty state explains the fail-closed semantics from Phase 3
(no mappings ⇒ no users authenticate via this provider).
* Inline add form (group_name input + role_id select populated from
`authListRoles`); add/remove gated on `auth.oidc.edit`.
pages/auth/SessionsPage.tsx (NEW):
* Default "My sessions" view available to anyone holding
`auth.session.list`.
* "All actors (admin)" toggle exposed only when caller holds
`auth.session.list.all`; renders an actor_id filter input that
threads ?actor_id= through the GET.
* Self-pill marker on the caller's own rows.
* Revoke button is shown when (a) the row is the caller's own session
(handler-side own-bypass) OR (b) caller holds `auth.session.revoke`.
* Confirms via window.confirm; surfaces revocation errors inline.
pages/LoginPage.tsx (MODIFIED):
* Fetches /v1/auth/info on mount; if `oidc_providers[]` is non-empty,
renders one "Sign in with X" button per provider linking to the
provider's `login_url` (the server-side handler in Phase 5 builds
this URL with state + nonce + PKCE verifier sealed in the pre-login
cookie; the GUI never touches those values).
* The API-key form remains as a fallback for Bearer-mode deploys and
the Phase 7.5 break-glass path.
* All interactive elements carry data-testid:
login-oidc-providers / login-oidc-button-{id} / login-api-key-form /
login-api-key-input / login-api-key-submit.
components/AuthProvider.tsx (MODIFIED):
* logout() now also fires POST /auth/logout via the api/client helper
before clearing local state. The endpoint is auth-exempt; the
catch-and-swallow keeps the local logout flow working even if the
cookie is already invalid (idempotent server-side as well).
components/Layout.tsx (MODIFIED):
* Two new nav entries under the Auth section: "OIDC Providers" + "Sessions".
main.tsx (MODIFIED):
* Four new routes:
- /auth/oidc/providers
- /auth/oidc/providers/:id
- /auth/oidc/providers/:id/mappings
- /auth/sessions
Vitest coverage
===============
Five new test files, 28 new test cases. Pattern matches Bundle 1
Phase 10's Vitest scaffold (vi.mock api/client, render with
QueryClient + MemoryRouter, authMe-driven permission shaping,
data-testid selectors).
* OIDCProvidersPage.test.tsx (5 tests): ErrorState w/o auth.oidc.list,
empty state, list + create button render, hide-create-button
without auth.oidc.create, submit-creates-via-API.
* OIDCProviderDetailPage.test.tsx (5 tests): ErrorState w/o list,
full-perms render, hide edit/refresh/delete with only list,
refresh button calls API, delete confirm-button stays disabled
until typed text matches provider name.
* GroupMappingsPage.test.tsx (5 tests): ErrorState w/o list, empty
fail-closed warning, mapping rows render, hide-form without
auth.oidc.edit, submit-add-form-calls-API.
* SessionsPage.test.tsx (6 tests): ErrorState w/o list, own sessions
+ self-pill, hide All-actors toggle without list.all, show
toggle with list.all, hide revoke on other-actor sessions without
auth.session.revoke, click-revoke calls API after window.confirm.
* LoginPage.test.tsx (extended +2 tests): renders OIDC buttons when
/auth/info reports providers; omits the OIDC block when none.
Verification
============
* `npx tsc --noEmit` — 0 errors.
* Vitest run across api/components/hooks/utils/auth/pages = 475 tests,
all green.
* `npm run build` — green (980 KB bundle, no surprises vs Phase 7).
* No backend (Go) changes in this commit; Phase 5-7.5 surfaces
consumed unchanged.
Not in this commit (deferred)
=============================
* "Test login flow" button on the provider detail page (prompt §Phase 8
optional row). Requires a server-side test=true flag on the OIDC
login handler — out of scope for the GUI commit.
* `web/src/__tests__/e2e/` Keycloak-via-testcontainers harness for the
15 comprehensive flow checks. Tracked under Phase 10 of
cowork/auth-bundle-2-prompt.md.
|
||
|
|
cfe76ad381 |
auth-bundle-1 Phase 10 follow-up: approvals queue GUI + transparent E2E deferral
Self-audit caught the missing GUI surface for Phase 9's flow #6 (profile edit gated → second admin approves → edit lands). The backend path is fully wired + tested in 69a508d; this commit adds the operator-facing UI so an approver can act without curl. # ApprovalsPage Lists every ApprovalRequest in the chosen state filter (default 'pending', toggleable to approved / rejected / expired). Renders both kinds: - cert_issuance — Rank-7 row with cert + job populated. - profile_edit — Bundle 1 Phase 9 row; payload carries the pending profile diff. Pill-rendered amber so an approver can distinguish at a glance. Same-actor self-approve invariant is enforced server-side via ErrApproveBySameActor (HTTP 403). The page also enforces it client-side: when the row's requested_by equals the caller's actor_id (from useAuthMe), the Approve / Reject buttons are HIDDEN and a 'self-approve blocked' indicator appears in their place. The operator literally cannot click the wrong button. Approve + Reject prompt for an optional note via window.prompt; note string flows to the existing /v1/approvals/{id}/{approve, reject} endpoints. Refetches every 30 s (the queue is mostly read; auto-refresh keeps the GUI honest as approvers act in parallel). # Wiring * /auth/approvals route in main.tsx. * Layout nav entry between API Keys and Auth Settings. * api/client.ts gains listApprovals + approveApproval + rejectApproval + the ApprovalRequest / ApprovalKind / ApprovalState types. # Tests ApprovalsPage.test.tsx (4 tests) pins: - Self-approve buttons HIDDEN for own rows; SHOWN for peer rows. - profile_edit kind renders with the amber pill. - Approve POSTs the right URL with the note. - Empty state. Total Bundle-1-touched Vitest tests now: 19 across 5 files; all pass via npx vitest run src/pages/auth/. # Transparent deferrals (called out for the record) The prompt's 9-flow Playwright E2E suite remains DEFERRED. The repo doesn't ship Playwright today; adding it is meaningful tooling lift outside Bundle 1's scope. Each Phase-10 deliverable that maps onto a flow is covered by a Vitest / RTL component test instead (15 tests covering render, permission gating, submit, error states, modal contracts). Full E2E coverage and the ≥75% src/pages/auth/ coverage metric are tracked as Phase 12 work; @vitest/coverage-v8 will land in the same commit that wires the coverage gate. # Verifications * npx tsc --noEmit clean. * npm run build green. * 19 Vitest tests pass. |
||
|
|
69a508dfcf |
auth-bundle-1 Phase 9 + 10: approval-bypass closure + RBAC GUI
# Phase 9 — approval-bypass closure (Decision 9, option a)
* Migration 000033_approval_kinds.up.sql: ALTER TABLE
issuance_approval_requests ADD COLUMN approval_kind +
payload JSONB; relax certificate_id + job_id to nullable;
CHECK (approval_kind IN ('cert_issuance','profile_edit'))
+ CHECK (per-kind nullability invariant) + index on
approval_kind. Idempotent throughout via DO blocks.
* domain.ApprovalKind enum (cert_issuance / profile_edit) +
IsValidApprovalKind. ApprovalRequest gains Kind +
Payload []byte for the pending profile diff.
* postgres.ApprovalRepository.Create + scanApprovalRow extended
to round-trip the new columns; certificate_id + job_id
switched to sql.NullString so profile_edit rows persist
cleanly. Default Kind=cert_issuance preserves back-compat
for every Phase-7-2026-05-03 caller.
* ApprovalService.RequestProfileEditApproval: new entry point
that creates a pending profile-edit row carrying the
serialized profile diff. Bypass mode (CERTCTL_APPROVAL_BYPASS)
short-circuits the same way it does for cert_issuance.
* ApprovalService.SetProfileEditApply hook: cmd/server/main.go
registers a closure that deserializes req.Payload + persists
via profileRepo.Update + emits a profile.edit_applied audit
row with category=auth. The hook avoids the Approval ↔
Profile import cycle.
* ProfileService.UpdateProfile: gates when (a) the live
profile carries RequiresApproval=true, OR (b) the proposed
edit would set it true. Returns ErrProfileEditPendingApproval
with the new approval ID; ProfileHandler maps to HTTP 202
Accepted + {pending_approval_id}. Both arms close the
flip-flop loophole because every transition through an
approval-tier profile fires the gate.
* TestProfileEdit_RequiresApprovalLoopholeClosed pins all 3
bypass attempts (flip-off / kept-on / flip-on) gated; nil-
approval-service preserves pre-Phase-9 direct-apply for
test fixtures.
* Approval service tests gain 4 profile_edit rows: pending row
shape; same-actor self-approve rejected with
ErrApproveBySameActor (load-bearing two-person integrity);
approve fails-closed when apply callback unwired;
apply callback invoked on approve.
* docs/reference/profiles.md (new) explains the gate +
edit response shape (202) + same-actor invariant + bypass
+ audit hooks.
# Phase 10 — RBAC management GUI
* useAuthMe hook (web/src/hooks/useAuthMe.ts): TanStack Query
fetches /api/v1/auth/me on app boot, caches for 60s, exposes
hasPerm(p) + hasAnyPerm + isAdmin predicates. Every Phase-10
page consumes this on mount + gates affordances against the
cached effective_permissions slice. Server-side enforcement
is the load-bearing gate; client-side hide/disable is UX.
* New routes:
- /auth/roles — list (auth.role.list); create-role modal
(auth.role.create) hidden when missing.
- /auth/roles/:id — detail + permissions; edit
(auth.role.edit), delete (auth.role.delete), add/remove
permission affordances each gated.
- /auth/keys — list of every actor with role grants; assign
+ revoke modals (auth.role.assign). actor-demo-anon
flagged system-managed; mutation buttons hidden for it.
- /auth/settings — stub showing /v1/auth/me identity +
bootstrap-endpoint availability via /v1/auth/bootstrap.
* AuditPage extended with category filter ('All categories'
+ the 3 enum values from migration 000032). Selection flows
to the API call params + the URL-driven query state.
* Layout: 3 new nav entries (Roles / API Keys / Auth Settings).
* api/client.ts: 12 new exported functions for the RBAC
surface (authMe, list/get/create/update/delete role,
list/add/remove role permissions, list keys, assign/revoke
key role, bootstrap-availability probe).
* data-testid attributes on every interactive element so a
future Playwright suite can assert behavior without brittle
CSS selectors.
* Empty state, error state, and unsaved-changes warnings on
every form per the prompt's implementation rules.
# Frontend tests
* RolesPage.test.tsx (6 tests): list render, empty state,
error state, hide-create-button-without-perm,
show-create-button-with-perm, submit-create-modal.
* KeysPage.test.tsx (3 tests): demo-anon flagged
system-managed (no buttons), permission-gated affordance
hide for auditor caller, assign-modal-POST contract.
* AuthSettingsPage.test.tsx (2 tests): identity surface,
bootstrap-OPEN-status surface.
* AuditPage.test.tsx (+1): category-filter select renders
with the 4 documented options.
15 frontend tests total in src/pages/auth/ + the audit
category-filter test; all pass via npx vitest run.
# Verifications
* go vet ./... clean.
* staticcheck across internal/auth + handler + router + cli +
service + repository + cmd + domain: clean.
* gofmt -l clean repo-wide.
* go test -short -count=1 green across internal/service,
internal/api/handler, internal/api/router, internal/auth,
internal/auth/bootstrap, internal/service/auth,
internal/domain/auth, cmd/server, cmd/cli, internal/cli.
* npx tsc --noEmit clean.
* npm run build green (vite build produces dist/index.html
+ 946KB JS bundle; chunk-size warning is pre-existing).
* npx vitest run src/pages/auth/ src/pages/AuditPage.test.tsx
green (15 tests, 4 files).
|
||
|
|
f40e975439 |
gui(certificates): surface profile contract in create-cert form (closes P3-3, P3-4, P3-5)
Closes findings P3-3, P3-4, P3-5 from the 2026-05-05 CLI/API/MCP↔GUI
parity audit (cowork/cli-gui-parity-audit-2026-05-05/RESULTS.md). The
audit flagged three "hidden defaults" in the create-certificate form:
environment='production', shortLived=false, selectedEkus=['serverAuth'].
Re-grounding against the live source:
P3-3 was a false positive. The form already exposes an environment
selector with three options (Production / Staging / Development) and
defaults to Production. No change needed — covered by new test pin.
P3-4 + P3-5 misread the architecture. allow_short_lived and
allowed_ekus are NOT per-cert form-state fields; they are properties
of the CertificateProfile that the operator binds via the existing
Profile dropdown. Adding form-level toggles for them would contradict
the profile-as-primitive design (the profile carries the policy
contract — TTL, EKUs, key-algo allow-list, short-lived eligibility —
so the cert can inherit a coherent set rather than letting operators
hand-mix invalid combinations).
The genuine UX gap was opacity: operators picked a profile without
seeing what allow_short_lived / allowed_ekus the profile carried.
This commit closes the spirit of the finding by surfacing the selected
profile's load-bearing properties in a read-only "Profile contract"
panel that appears below the Profile dropdown once a profile is
selected. The panel shows:
- allowed_ekus list (so operators see whether a profile is
serverAuth, emailProtection, codeSigning, or a mix)
- allow_short_lived flag (highlighted when true so operators know
they're picking a profile that allows TTL < 1h CRL/OCSP-exempt
certs per the M15b regime)
- explanatory text that EKUs and short-lived eligibility are
profile-level (not per-cert), guiding operators to edit the
profile or pick a different one
Test pins (web/src/pages/CertificatesPage.test.tsx):
- environment selector renders with 3 options, defaults to production
- environment selector toggles to staging / development on change
- Profile contract panel is hidden until a profile is selected
- Profile contract panel surfaces allowed_ekus when a TLS-server
profile is picked
- Profile contract panel surfaces emailProtection EKU when an S/MIME
profile is picked (closes the "S/MIME flows can't be initiated
from the GUI" sub-finding — they can, by picking an emailProtection
profile)
- Profile contract panel flags allow_short_lived=true when an IoT
short-lived profile is picked (closes the "operators can't issue
short-lived certs through the GUI" sub-finding — they can, by
picking an allow_short_lived profile)
Implementation notes:
- data-testid='cert-form-environment' + 'cert-form-profile' +
'cert-form-profile-detail' added to make the test selectors stable
across DOM-restructuring refactors. No production behaviour change
from the test IDs.
- No new dependencies; no form-library introduction (per the prompt's
out-of-scope list); uses the existing bare React state pattern.
- No API changes — Certificate.allowed_ekus / allow_short_lived
already exist on the CertificateProfile type in web/src/api/types.ts.
Acceptance gate (verified):
- npm test on src/pages/CertificatesPage.test.tsx: 12/12 pass
(6 pre-existing T-1 tests + 6 new P3-3..P3-5 pins).
- All sibling page tests (AuditPage, TargetDetailPage, ShortLivedPage,
etc.) still pass.
|
||
|
|
75097909e9 | |||
|
|
ff6ffcda1b |
refactor(web): drop 5 unused imports across 4 pages (CodeQL #6, #7, #8, #9)
Four CodeQL js/unused-local-variable alerts in one sweep — all Note severity, all pure dead-import cleanup verified by grep (each removed symbol had exactly 1 occurrence in its file: the import line itself). Alert #6 — web/src/pages/AgentFleetPage.tsx:3: Drop Legend from recharts named-import list. The fleet pie chart renders without a legend (the slice colors are labeled inline via Tooltip). Alert #7 — web/src/pages/DashboardPage.tsx:9: Drop getAgents + getNotifications from the api/client named- import list. The dashboard summary card now uses getDashboardSummary (single endpoint) instead of fanning out to per-resource list calls; the agents + notifications full list is reachable via dedicated pages. Alert #8 — web/src/pages/CertificatesPage.tsx:6: Drop revokeCertificate from the api/client named-import list. The page uses bulkRevokeCertificates for the multi-cert UX; single-cert revoke happens on CertificateDetailPage which imports revokeCertificate independently. Alert #9 — web/src/pages/DiscoveryPage.tsx:15: Drop the StatusBadge default-import line. Discovered-cert status renders inline (text label colored via the row's state-class) without the StatusBadge component. Verified locally: Each flagged symbol: 0 occurrences in its file post-edit. tsc --noEmit: exit 0. No behavioral change — pure import-list cleanup. References: https://github.com/certctl-io/certctl/security/code-scanning/6 https://github.com/certctl-io/certctl/security/code-scanning/7 https://github.com/certctl-io/certctl/security/code-scanning/8 https://github.com/certctl-io/certctl/security/code-scanning/9 Closes all four alerts. |
||
|
|
b6a5278df1 |
refactor(web): drop unused imports (CodeQL #5 + #10)
Two CodeQL js/unused-local-variable alerts in one sweep — both Note severity, both pure dead-import cleanup. Alert #10 (web/src/pages/NotificationsPage.tsx:8): formatDateTime imported but only timeAgo used. Verified via repo-wide grep — formatDateTime appears on the import line only. Drop from the import statement; leave timeAgo in place. Alert #5 (web/src/api/client.test.ts:2): Five unused imports in the test file's import block (the test file imports nearly the full API client surface): - acknowledgeHealthCheck - createPolicy - deleteHealthCheck - getHealthCheckHistory - updateHealthCheck Each appears only on the import line — verified via grep -c. Removing them doesn't change test coverage (the corresponding client functions are exported and exercised in their own tests elsewhere, but the integration covered by client.test.ts doesn't reach them yet). Verified locally: tsc --noEmit: exit 0. grep -c on each removed symbol in its file: 0 occurrences. No behavioral change — pure import-list cleanup. References: https://github.com/certctl-io/certctl/security/code-scanning/10 https://github.com/certctl-io/certctl/security/code-scanning/5 Closes both alerts. |
||
|
|
439905e546 |
refactor(scep-gui): remove unused pickTabFromQuery (CodeQL #22)
CodeQL alert #22 (js/unused-local-variable, severity: Note) flagged pickTabFromQuery at web/src/pages/SCEPAdminPage.tsx:584 as dead code. Audit: this function is a leftover from an incomplete refactor. The SCEP admin page picks its initial tab via pickInitialTab (line 594 post-edit), which subsumes the same query-string check that pickTabFromQuery did: pickInitialTab honors three signals (precedence high → low): 1. ?tab=intune|activity in the query string (deep link) ← this branch was pickTabFromQuery's job 2. Pathname ending in /scep/intune (legacy alias from Phase 9.4) 3. Default to 'profiles' pickTabFromQuery only handled signal (1); pickInitialTab inlined the same logic on its first branch and added (2) + (3). Nothing references pickTabFromQuery (verified via repo-wide grep). Pure dead code. Fix: delete the function. No behavioral change — pickInitialTab already does the work. Verified locally: tsc --noEmit: exit 0. grep -nE 'pickTabFromQuery' web/src/: zero references. Reference: https://github.com/certctl-io/certctl/security/code-scanning/22 Closes CodeQL alert #22 (js/unused-local-variable). |
||
|
|
8908c8ff5c |
web, docs: IssuerHierarchyPage + sysadmin runbook + connectors row (Rank 8 commit 5)
Final commit of the 5-commit Rank 8 chain. Operator-facing surface
on top of the service + handler layers shipped in commits 1-4.
Frontend (web/src):
- api/client.ts: 3 new functions + IntermediateCA interface
(listIntermediateCAs, getIntermediateCA, retireIntermediateCA).
- pages/IssuerHierarchyPage.tsx: recursive nested <ul> render of
the hierarchy tree at /issuers/:id/hierarchy. buildHierarchyTree
is a pure helper that walks the flat list and groups children
on parent_ca_id; the dendrogram view is parking-lot work tracked
in WORKSPACE-ROADMAP. Two-phase retire UX surfaces 'Retire…'
then 'Confirm retire (terminal)' when the row is in retiring
state. Admin gate is enforced at the API; the page renders the
backend's 403 as ErrorState for non-admin callers.
- main.tsx: register the new /issuers/:id/hierarchy route.
CI guard update:
- scripts/ci-guards/T-1-frontend-page-coverage.sh: add
IssuerHierarchyPage to the deferred-test allowlist with the
standard 'why deferred' comment. Admin-gate + recursive build
semantics are already pinned at the backend layer
(intermediate_ca_test.go service tests + intermediate_ca_test.go
handler triplet). Vitest test deferred until next feature
change touches the page.
Docs:
- docs/intermediate-ca-hierarchy.md: new operator runbook
covering:
Concepts (HierarchyMode 'single' vs 'tree', defense-in-depth
on key bytes never persisting on rows).
Lifecycle states + drain-first semantics
(active → retiring → retired with active-children gate).
Three deployment patterns: 4-level FedRAMP boundary CA,
3-level financial-services policy CA, 2-level internal
PKI.
RFC 5280 enforcement (§3.2 self-signed, §4.2.1.9 path-length
tightening, §4.2.1.10 NameConstraints subset).
Migration from single → tree using the load-bearing
TestLocal_HierarchyMode_SingleVsTree_ByteIdentical pin as
the canary.
API reference + observability (IntermediateCAMetrics
Prometheus exposure).
Known limitations + Rank-8 follow-on roadmap.
- docs/connectors.md: extend the Built-in Local CA section with
a 'Tree mode (Rank 8)' paragraph describing the new chain
assembly path + cross-link to docs/intermediate-ca-hierarchy.md.
Roadmap:
- WORKSPACE-ROADMAP.md: 5 follow-on items under a new
'Intermediate CA hierarchy extensions (Rank 8 V2 follow-ons)'
bullet block:
HSM-backed roots (PKCS#11 / cloud KMS drivers via existing
signer.Driver interface — no service-layer change needed).
Automated CA rotation (parallel-validity windows ahead of
expiry).
Intra-hierarchy CRL chaining (per-CA CRL endpoints stitched
at issue time).
NameConstraints policy templates (FedRAMP / financial /
internal PKI declarative templates instead of hand-rolled
JSON).
D3 dendrogram visualization (separate page so the existing
list view stays the default + the dep stays opt-in).
Verified locally:
gofmt: clean.
go vet ./...: exit 0.
tsc --noEmit (web/): exit 0 (no TypeScript errors).
go test -short -count=1 ./internal/api/handler/... + service +
local: ok across all three packages, 4-5s each.
All 24 CI guards: clean
(T-1 frontend-page-coverage with the new
IssuerHierarchyPage allowlist entry; openapi-handler-parity,
M-008 admin-gate, every other guard untouched).
Rank 8 chain complete:
|
||
|
|
0729ee46e0 |
chore: sweep github.com/shankar0123/certctl URL refs to certctl-io/certctl
Post-transfer cosmetic + release-critical URL refresh after moving the
repo from github.com/shankar0123/certctl to github.com/certctl-io/certctl
(2026-05-03). GitHub HTTP redirects continue to forward old URLs forever,
so existing operators are not broken — but aligns the canonical
references with the new owner so:
- procurement engineers / contributors browsing the docs see the right
URL on first read
- operators copying the agent install one-liner hit the new path
directly without going through a redirect
- the Helm chart's default image repository points at the canonical org
registry path
- the OnboardingWizard rendered to first-run UI users shows the new
URL in the install snippets and doc anchor links
- the GitHub Actions release workflow pushes container images to
ghcr.io/certctl-io/certctl-{server,agent} (was: shankar0123)
- the release-notes Markdown body in release.yml — which gets stamped
into every future release page — references the post-transfer
cert-identity (cosign keyless signing now uses the certctl-io
workflow URL) and the post-transfer SLSA provenance source-uri.
Without this, every cosign verify / slsa-verifier command on a
v2.1.0+ release would fail because the cert-identity-regexp would
not match the signing identity GitHub Actions OIDC issues post-
transfer. Old releases (v2.0.67 and earlier) keep their immutable
release-notes pointing at the shankar0123 path and remain
verifiable via their own published instructions.
Customer impact:
- Operators on ghcr.io/shankar0123/certctl-{server,agent}:latest
silently freeze on whatever tag was current at transfer time. They
get no errors; they just stop receiving updates. The next release
notes need a one-line callout (Phase 3.1 of cowork/transfer-
certctl-to-org.md) telling them to update their image path to
ghcr.io/certctl-io/certctl-{server,agent}.
- All other URLs (git clone, install one-liner, raw.githubusercontent
URLs, browser links, GitHub API) continue to resolve via permanent
HTTP redirects. The sweep is cosmetic for those.
Files swept (30 total):
.github/workflows/release.yml — IMAGE_NAMESPACE, source-uri,
cosign cert-identity-regexp, IMAGE= snippet (5 refs total).
CHANGELOG.md, README.md — anchor links, badges, install one-liner,
cosign verify snippets in operator-facing sections.
api/openapi.yaml — info / externalDocs URLs.
install-agent.sh — GITHUB_REPO const + systemd unit Documentation=
field.
deploy/ENVIRONMENTS.md, deploy/helm/{CHART_SUMMARY,INDEX,
INSTALLATION,README}.md, deploy/helm/certctl/{Chart.yaml,
README.md,values.yaml}, deploy/helm/examples/values-*.yaml —
chart docs + image repository defaults across dev / prod-ha
overrides.
docs/{certctl-for-cert-manager-users,connector-iis,connectors,
migrate-from-acmesh,migrate-from-certbot,quickstart,test-env,
why-certctl}.md — operator-facing doc URLs.
examples/{acme-nginx,acme-wildcard-dns01,multi-issuer,
private-ca-traefik,step-ca-haproxy}/docker-compose.yml +
examples/step-ca-haproxy/step-ca-haproxy.md — example image:
paths and accompanying narrative.
web/src/pages/OnboardingWizard.tsx — first-run-UI URL refs (curl
install one-liners, agent docker image path, doc anchor links).
Files intentionally NOT swept (Choice A from cowork/transfer-certctl-
to-org.md):
go.mod, go.sum — module declaration stays github.com/shankar0123/
certctl. Existing imports compile because Go uses the path
declared in go.mod, not the URL it was fetched from. Internal-
only project; no external Go consumers; rename will land as a
mechanical sed when one materializes.
~250 *.go files — every import remains github.com/shankar0123/
certctl/internal/...
deploy/test/f5-mock-icontrol/go.mod — separate test sub-module;
same Choice A logic; module path stays.
Files intentionally NOT swept (other reasons):
README.md lines 244-245 — Scarf-pixel docker-pull commands.
shankar0123.docker.scarf.sh/... is a Scarf-account hostname
(per-user, not per-repo) and the pixel keeps tracking pulls
against the operator's personal Scarf account. Migrating to a
certctl-io Scarf account is a separate decision (create org
Scarf account → re-create package → update README).
deploy/test/f5-mock-icontrol/f5-mock-icontrol — checked-in
compiled binary with shankar0123/certctl baked into Go build
info via the sub-module path. Out of scope for a URL sweep;
will refresh on the next `make test-integration` rebuild.
Verification:
gofmt: clean (no .go files touched).
go vet ./...: clean (verified at this SHA in 1.3 of the transfer
checklist; no .go changes since).
go build ./...: clean (same).
go test -short on representative packages: green (same).
Diff shape: 30 files, 74 insertions / 74 deletions, net-zero size,
pure URL substitution.
|
||
|
|
36885da2da |
EST RFC 7030 hardening master bundle Phases 8-9: GUI ESTAdminPage
(Profiles + Recent Activity + Trust Bundle tabs) + CLI subcommand
family `certctl-cli est {cacerts,csrattrs,enroll,reenroll,
serverkeygen,test}` + 6 MCP tools.
Phase 8 — ESTAdminPage tabbed GUI:
- web/src/pages/ESTAdminPage.tsx mirrors SCEPAdminPage's three-tab
surface. Profiles tab renders per-profile cards with auth-mode
badges (mTLS / Basic / ServerKeygen), mTLS trust-anchor expiry
countdown (good ≥30d / warn 7-30d / bad <7d / EXPIRED), 12-cell
counter grid (success_simpleenroll/.../internal_error), and the
admin-gated "Reload trust anchor" action. Recent Activity tab
merges the four EST audit actions (est_simple_enroll +
est_simple_reenroll + est_server_keygen + est_auth_failed) across
four parallel useQuery calls with chip filters for All/Enrollment/
Re-enrollment/ServerKeygen/AuthFailure. Trust Bundle tab renders
per-mTLS-profile cert subjects + expiries.
- M-009 useTrackedMutation guard: every mutation routes through
the tracked hook so audit/progress hooks fire.
- Page-level admin gate renders "Admin access required" banner for
non-admin callers + skips underlying API requests so the server
never sees a 403-prone request. Server-side enforcement is the
M-008 admin gate; this is a UX hint.
- Wired into web/src/main.tsx at /est; nav link added to Layout.tsx.
- New web/src/api/types.ts types ESTStatsSnapshot +
ESTTrustAnchorInfo + ESTProfilesResponse + ESTReloadTrustResponse
mirror service.ESTStatsSnapshot 1:1.
- New web/src/api/client.ts helpers getAdminESTProfiles +
reloadAdminESTTrust.
- 14 Vitest cases (admin gate non-admin / non-auth-required deploy /
default tab / tab switch / deep-link tab / per-profile card render
+ counter cells / reload-button mTLS-only / trust-expiry badge
band / reload modal Confirm-Cancel-Error paths / Trust Bundle
empty-state / Activity filter chip toggle).
Phase 9.1 — CLI subcommands:
- internal/cli/est.go adds 6 subcommands: cacerts / csrattrs /
enroll / reenroll / serverkeygen / test. CSR input via --csr
with file-path or '-' for stdin; multipart serverkeygen response
is parsed by stdlib mime/multipart and split into <prefix>.cert.pem
+ <prefix>.key.enveloped so the operator can decrypt the key with
openssl smime. EST `test` smoke-tests cacerts + csrattrs + emits
one-line OK/FAIL diagnostics.
- cmd/cli/main.go grows the `est` dispatch + Usage entries.
Phase 9.2 — MCP tools:
- internal/mcp/tools_est.go adds 6 tools mapped to the EST endpoints
+ admin observability: est_list_profiles + est_admin_stats (alias)
+ est_get_cacerts + est_get_csrattrs + est_enroll + est_reenroll.
Tool count grew from 87 → 93 (verified via the registered-vs-
covered guard in tools_per_tool_test.go); the per-tool happy/error-
path table grew with 6 matching entries so the future-tool-no-test
CI guard stays green.
- internal/mcp/client.go grows PostRaw — non-JSON POST helper that
the EST enroll/reenroll tools use to ship raw application/pkcs10
CSR bytes through the MCP fence-wrapped response.
- estRawResultJSON wraps the raw response body in a JSON envelope
the MCP consumer can structurally consume (content_type +
body_base64 + body_size_bytes). Mirrors the CRL/OCSP MCP tools'
binary-DER envelope.
Phase 9.3 — Tests:
- internal/cli/est_test.go: 8 cases pinning the wire-shape contract
on the CLI side without dragging the full ESTHandler into the
test build.
- internal/mcp/tools_est_test.go: path-builder + JSON-envelope unit
tests + end-to-end tool exercise that pins all 5 captured request
paths through a fake API.
Pre-commit verification (sandbox): gofmt clean, go vet clean
(excluding repository/postgres which the sandbox can't build —
pre-existing testcontainers limit), staticcheck clean across
cli/mcp/cmd/cli, go test -short -count=1 green for every non-
postgres Go package, Vitest green for ESTAdminPage (14) +
SCEPAdminPage (20) — 34 page tests total. G-3 docs-drift guard
reproduced locally clean (Phases 8-9 added zero new env vars).
Spec preserved at cowork/est-rfc7030-hardening-prompt.md. Phases
10-13 (libest sidecar e2e / bulk revocation + audit codes /
docs/est.md / release prep + tag) remain — post-2.1.0 work.
|
||
|
|
506cff137d |
feat(scep): SCEP probe in network scanner for fleet-readiness assessment
Phase 11.5 of the SCEP RFC 8894 + Intune master bundle. Adds an
operator-facing SCEP probe that issues GetCACaps + GetCACert against
an arbitrary SCEP server URL and returns a structured posture snapshot
(reachable + advertised caps + RFC 8894 / AES / POST / Renewal /
SHA-256 / SHA-512 support flags + CA cert subject + issuer + NotBefore
+ NotAfter + days-to-expiry + algorithm + chain length).
Two operator use cases per the master prompt:
1. Pre-migration assessment — probe an existing EJBCA / NDES SCEP
server before switching to certctl to see what capabilities it
advertises and what the CA cert looks like.
2. Compliance posture audits — periodic ad-hoc probes against the
operator's own SCEP servers to flag drift.
Capability-only — does NOT POST a CSR per the spec (would consume slot
allocations on the target server + create audit noise). Standalone CLI
binary explicitly out of scope (per the master prompt §11.5.6 and the
operator's confirmation): the probe code lands inside certctl; a
future thin Cobra wrapper is a separate decision.
Backend (six new + one extended file):
* internal/domain/network_scan.go — new SCEPProbeResult struct with
every probe field documented for the GUI's display layer.
* migrations/000021_scep_probe_results.up.sql + .down.sql — new
scep_probe_results table with TEXT id, target_url, all probe
flags, CA cert metadata, probed_at, probe_duration_ms, error.
Two indexes: idx_scep_probe_results_probed_at (DESC) for the
'recent probes' GUI query, idx_scep_probe_results_target_url
(target_url, probed_at DESC) for the future per-URL history view.
* internal/repository/interfaces.go — new SCEPProbeResultRepository
interface (Insert + ListRecent).
* internal/repository/postgres/scep_probe_results.go — Postgres
implementation. ListRecent clamps limit to [1, 200]; on read
re-derives ca_cert_days_to_expiry against the query-time wall
clock so 'X days remaining' stays fresh.
* internal/service/scep_probe.go — ProbeSCEP(ctx, url) on
NetworkScanService. Validation order:
1. Up-front URL validation via validation.ValidateSafeURL
(defaults to validation.ValidateSafeURL but injectable for
tests via the new scepValidateURL field on the service).
2. Dial-time SSRF re-check via SafeHTTPDialContext on the
http.Transport (defends against DNS rebinding).
3. GET ?operation=GetCACaps + GET ?operation=GetCACert.
GetCACert handles three response shapes: PKCS#7 SignedData
certs-only envelope (multi-cert), raw DER (single-cert),
and PEM-wrapped DER (non-conforming servers).
Times out at 30s; uses a 1MB body cap for DoS defense; wraps
the result + persists via the repo (nil-safe) before returning.
describeCertAlgorithm helper returns 'RSA-N' / 'ECDSA-curve' /
'Ed25519' / 'DSA' for the GUI's algorithm column.
* internal/service/network_scan.go — added scepProbeRepo +
scepHTTPClient + scepValidateURL + scepIDFn + nowFn fields;
SetSCEPProbeRepo wires the repo at startup.
* internal/api/handler/network_scan.go — extended NetworkScanService
interface with ProbeSCEP + ListRecentSCEPProbes; added two new
HTTP handlers:
POST /api/v1/network-scan/scep-probe (body {url})
GET /api/v1/network-scan/scep-probes (recent history)
Synchronous probe; HTTP 200 with the result body for both success
and reachable-but-failed cases (so the GUI can render the failure
tone with the operator-actionable error message).
* internal/api/router/router.go — registered the two routes inline
after the existing network-scan target endpoints.
* api/openapi.yaml — documented both endpoints (operationId
probeSCEP + listSCEPProbes) with full schema + response codes.
* cmd/server/main.go — wires the new SCEPProbeResultRepository
onto the network scan service via SetSCEPProbeRepo right after
the existing NewNetworkScanService construction.
Backend tests (6 new — exit-criteria-named per the master prompt):
* TestProbeSCEP_AdvertisesAllCaps — happy path, full RFC 8894
capability set, ECDSA P-256 CA cert, 365-day expiry.
* TestProbeSCEP_MissingSCEPStandard — pre-RFC-8894 server (only
POSTPKIOperation + SHA-1 + DES3); SupportsRFC8894 = false.
* TestProbeSCEP_GetCACertExpired — CA cert NotAfter 30d in the
past; CACertExpired = true.
* TestProbeSCEP_Unreachable — connect to TCP port 1; probe
returns Reachable=false + non-empty Error.
* TestProbeSCEP_RejectsReservedIP — http://169.254.169.254/scep
(EC2 metadata literal) rejected by the up-front
validation.ValidateSafeURL gate; result captures the error
without ever issuing the HTTP call.
* TestProbeSCEP_PEMWrappedCert — server returns PEM instead of
raw DER for GetCACert; the fallback parse path handles it.
Frontend (one extended file + types/client):
* web/src/api/types.ts — SCEPProbeResult + SCEPProbesResponse.
* web/src/api/client.ts — probeSCEPServer + listSCEPProbes
helpers.
* web/src/pages/NetworkScanPage.tsx — new SCEPProbeSection
component + ProbeResultPanel (with capability badges + CA cert
details panel + raw caps line) + SCEPProbeHistoryTable. Form
rejects empty URL with inline error before calling the API.
Reload mutation goes through useTrackedMutation with explicit
invalidates: [['scep-probes']] (M-009 contract).
Frontend tests (5 new + 0 regressions):
* Scep probe section header + form renders.
* Empty URL is rejected with inline error and never calls the
probe endpoint.
* Successful probe renders capability badges + CA cert subject
+ days-remaining inline panel.
* Probe-level errors are surfaced in the inline panel (no result
panel rendered).
* Recent-probes history table renders one row per probe.
* (Existing 2 NetworkScanPage XSS-hardening tests stub the new
listSCEPProbes endpoint to an empty list so they still pass.)
Verification:
* gofmt clean on touched files
* go vet ./... clean
* staticcheck on service+handler+router+repository+cmd-server clean
* go test -short across service+handler+router+repository+cmd-server
+ integration: all green (existing + 6 new probe tests pass)
* Frontend tsc --noEmit clean
* Vitest: 7/7 NetworkScanPage tests pass (2 existing XSS + 5 new
probe section)
* G-3 docs-drift CI guard reproduced locally clean (no new env vars)
* M-009 hard-zero useMutation guard clean (probe mutation goes
through useTrackedMutation)
* openapi-parity guard satisfied (both new routes documented)
* The mockNetworkScanService in handler + integration packages
extended with stub Probe methods; targeted coverage stays in
scep_probe_test.go.
Out of scope (per master prompt §11.5.6 + operator confirmation):
* Standalone certctl-scan CLI binary — separate decision, ~1d of
follow-up work when/if shipped.
Refs: cowork/scep-rfc8894-intune-master-prompt.md::Phase 11.5
cowork/scep-rfc8894-intune/progress.md
|
||
|
|
0be889ff1d |
refactor(scep-gui): rebrand SCEP admin surface to per-profile tabbed interface (Profiles + Intune + Recent Activity)
Phase 9 follow-up to the SCEP RFC 8894 + Intune master bundle. The
Phase 9.4 GUI shipped 'SCEP Intune Monitoring' at /scep/intune, which
made the per-profile observability surface look Intune-only — operators
running EJBCA + Jamf would never click that nav link expecting per-
profile RA cert + mTLS observability. The page is per-profile keyed
under the hood; this commit rebrands + restructures so the surface
matches what operators actually need.
Spec: cowork/scep-gui-restructure-prompt.md.
User-visible change:
- Nav link renamed: 'SCEP Intune' → 'SCEP Admin'.
- Route: /scep is the new canonical path; /scep/intune kept as a
backward-compat alias that lands directly on the Intune tab.
- Page header: 'SCEP Administration'.
- Three tabs:
* Profiles (default) — per-profile lean cards with RA cert
expiry countdown, mTLS sibling-route status badge, Intune
enabled/disabled badge, challenge-password-set indicator.
'View Intune details →' link on Intune-enabled cards
deep-links into the Intune tab.
* Intune Monitoring — the existing Phase 9.4 deep-dive
(per-status counters, trust anchor expiry, recent failures
table, reload-trust button + confirmation modal).
* Recent Activity — full SCEP audit log filter merging all
four action codes (scep_pkcsreq + scep_renewalreq +
scep_pkcsreq_intune + scep_renewalreq_intune); chip filters
for All / Initial / Renewal / Intune / Static.
Backend:
* internal/service/scep.go — new SCEPProfileStatsSnapshot type +
IntuneSection sub-block + ProfileStats(now) accessor. Adds
raCertSubject/raCertNotBefore/raCertNotAfter + mtlsEnabled +
mtlsTrustBundlePath fields with SetRACert + SetMTLSConfig setters.
Existing IntuneStatsSnapshot + IntuneStats(now) preserved
UNCHANGED for /admin/scep/intune/stats backward compat (the
JSON shape stays byte-stable for external consumers — the
aliasing approach the prompt initially suggested doesn't work
because the new shape nests Intune while the old one is flat).
ChallengePasswordSet is derived from challengePassword != ''
(the secret value itself is never surfaced).
* internal/api/handler/admin_scep_intune.go — new Profiles handler
method on AdminSCEPIntuneHandler with the same M-008 admin gate.
AdminSCEPIntuneServiceImpl extended (in place; same
map[string]*service.SCEPService) to satisfy the new
AdminSCEPProfileService interface. Single handler file gets the
third method so the M-008 pin entry count stays steady (no new
file, no new triplet of admin-gate test files — just three new
Profiles tests inside the existing test file).
* internal/api/router/router.go — one new route
'GET /api/v1/admin/scep/profiles' registered to
reg.AdminSCEPIntune.Profiles. HandlerRegistry unchanged.
* api/openapi.yaml — new operation 'listSCEPProfiles' documenting
the request body / response shape / error mapping. Existing
Intune entries unchanged.
* cmd/server/main.go — per-profile loop now calls
scepService.SetMTLSConfig(profile.MTLSEnabled,
profile.MTLSClientCATrustBundlePath) right after SetPathID, and
scepService.SetRACert(raCert) right after loadSCEPRAPair returns
the leaf cert. Both setters are nil-safe.
* internal/api/handler/m008_admin_gate_test.go — extended the
existing admin_scep_intune.go entry's justification to mention
the third endpoint. No new map entry needed (file already
listed).
Backend tests (8 new):
* TestAdminSCEPProfiles_NonAdmin_Returns403
* TestAdminSCEPProfiles_AdminExplicitFalse_Returns403
* TestAdminSCEPProfiles_AdminPermitted_ForwardsActor — also pins
that Intune-enabled profiles emit an 'intune' sub-block while
Intune-disabled profiles OMIT it.
* TestAdminSCEPProfiles_RejectsNonGetMethod
* TestAdminSCEPProfiles_PropagatesServiceError
* TestAdminSCEPProfilesServiceImpl_NilMapReturnsEmpty
* (existing 16 Phase 9 admin tests still pass — backward-compat
preserved)
Frontend:
* web/src/api/types.ts — new SCEPProfileStatsSnapshot +
IntuneSection + SCEPProfilesResponse types. Existing
IntuneStatsSnapshot et al unchanged.
* web/src/api/client.ts — new getAdminSCEPProfiles helper.
* web/src/pages/SCEPAdminPage.tsx — full rewrite as the tabbed
surface. Reuses the existing ConfirmReloadModal and Intune
deep-dive card components verbatim; adds ProfileSummaryCard
(lean card for the Profiles tab) and ActivityTab. URL state
sync via useSearchParams so deep links survive reloads + browser
back/forward. The legacy /scep/intune route alias defaults the
activeTab to 'intune' on mount.
* web/src/main.tsx — new <Route path='scep' /> + preserved
<Route path='scep/intune' /> alias. Both render SCEPAdminPage.
* web/src/components/Layout.tsx — nav link rebranded:
label 'SCEP Intune' → 'SCEP Admin', to '/scep/intune' → '/scep'.
Frontend tests (20 — full rebuild):
* Admin gate (non-admin sees gated banner + zero admin API calls)
* Profiles tab default + Intune tab tabswitch + ?tab=intune deep
link + legacy /scep/intune alias all land on Intune
* Profiles tab status badges (Intune + mTLS + challenge-set)
reflect each profile's flags
* RA cert expiry tone bands (good ≥30d / warn 7-30d / bad <7d /
EXPIRED) verified across three fixture profiles
* 'View Intune details →' only renders for Intune-enabled
profiles AND switches tabs on click
* Empty-state banner when no profiles configured
* Intune tab counters render with the existing Phase 9 deep-dive
shape; reload modal Open/Confirm/Cancel/Error paths all pinned
* Recent Activity tab merges all four SCEP audit actions across
four parallel useQuery calls; filter chips
(all/initial/renewal/intune/static) narrow correctly
* Error path surfaces ErrorState on the active tab
Docs:
* docs/scep-intune.md — Operational monitoring section heading
expanded to '(SCEP Administration → Intune Monitoring tab)'.
Page-surface description rewritten for the tabbed shape;
admin-endpoints list extended with the new /admin/scep/profiles
entry.
* docs/architecture.md — Microsoft Intune Connector trust anchor
subsection updated to reference the Intune Monitoring tab inside
the SCEP Administration page + lists all three admin endpoints.
* docs/legacy-est-scep.md — forward-ref expanded with a parallel
sentence for the per-profile observability surface (independent
of Intune).
* README.md — Enrollment Protocols bullet for Intune updated to
'admin GUI SCEP Administration page at /scep' with the three
tabs called out.
Verification:
* gofmt clean on touched files
* go vet ./... clean
* staticcheck on intune+service+handler+router+cmd-server clean
* go test -short across intune+service+handler+router+cmd-server:
all green (existing Phase 9 tests + new Profiles tests)
* Frontend tsc --noEmit clean
* Vitest: 20/20 SCEPAdminPage tests + 3/3 sibling AuditPage tests
pass
* G-3 docs-drift CI guard reproduced locally: clean (no new env
vars; existing CERTCTL_SCEP_ allowlist prefix covers everything)
* M-009 hard-zero useMutation guard reproduced locally: clean
(the existing reload mutation already used useTrackedMutation
from the Phase 9 follow-up commit
|
||
|
|
28e277a88e |
fix(scep-intune): use useTrackedMutation for trust-anchor reload (M-009)
Phase 9 follow-up — the M-009 hard-zero regression guard in
.github/workflows/ci.yml flagged the SCEPAdminPage's reload mutation as
a bare useMutation() call. The repo's invalidation contract requires
every mutation to go through useTrackedMutation with explicit
invalidates: QueryKey[] | 'noop' so cached data never goes stale after
a write.
Swap the bare useMutation for useTrackedMutation with
invalidates: [['admin', 'scep', 'intune', 'stats']] — the trust-anchor
reload changes the per-profile trust pool reflected in IntuneStats, so
the stats query MUST refetch on success. The audit-log queries stay on
their own 60s timer (a SIGHUP-equivalent reload doesn't backfill new
audit rows; nothing to invalidate there).
Verification:
* tsc --noEmit clean
* vitest SCEPAdminPage.test.tsx: 13/13 still pass (the wrapper's
onSuccess fires AFTER invalidation, so the modal-close + state
reset assertions hold)
* M-009 grep guard reproduced locally — bare useMutation sites = 0
|
||
|
|
77e0281a0e |
feat(scep-intune): GUI monitoring tab + admin endpoints
Phase 9 of the SCEP RFC 8894 + Intune master bundle. Lands the operator-
facing Intune Monitoring tab plus the two admin-gated endpoints it reads
from. Per the constitutional 'complete path' rule: counters tick on
every typed dispatcher branch, the GUI poll is live (30s for stats,
60s for the audit log filter), and the SIGHUP-equivalent reload action
is one click + a confirmation modal — no follow-up plumbing required.
Backend (Phase 9.1 + 9.2 + 9.3):
* internal/service/scep.go gains:
- intuneCounterTab — atomic per-status counters keyed by the same
labels intuneFailReason() emits (success / signature_invalid /
expired / not_yet_valid / wrong_audience / replay / rate_limited /
claim_mismatch / compliance_failed / malformed / unknown_version).
Lock-free on the dispatcher hot path; snapshot() returns a
zero-allocation map for the admin endpoint.
- dispatchIntuneChallenge wires intuneCounters.inc(...) on every
typed return path INCLUDING the success leg (credited before
processEnrollment so a downstream issuer-connector failure
doesn't double-count).
- SetPathID + PathID accessors (so admin rows surface the SCEP
profile path ID per row).
- IntuneStatsSnapshot + IntuneTrustAnchorInfo public types, plus
IntuneStats(now) accessor that walks the trust holder pool and
packages a per-profile snapshot. ReloadIntuneTrust() is the
typed wrapper around TrustAnchorHolder.Reload that returns
ErrSCEPProfileIntuneDisabled when called on a profile where
Intune isn't enabled (admin endpoint maps that to HTTP 409).
* internal/api/handler/admin_scep_intune.go:
- AdminSCEPIntuneService narrow interface (Stats + ReloadTrust)
so the handler depends on a small surface; AdminSCEPIntuneServiceImpl
is the production walker over the per-profile SCEPService map.
- AdminSCEPIntuneHandler.Stats handles GET /api/v1/admin/scep/intune/stats
with the M-008 admin gate (non-admin → 403 + service never
invoked); returns {profiles, profile_count, generated_at}.
- AdminSCEPIntuneHandler.ReloadTrust handles POST
/api/v1/admin/scep/intune/reload-trust. Body is {path_id: '<id>'};
empty body targets the legacy /scep root profile. Returns 200 on
success / 404 on unknown PathID / 409 when the profile is Intune-
disabled / 500 on a parse error from intune.LoadTrustAnchor (the
holder retains its previous pool — fail-safe). 400 on malformed
JSON.
- ErrAdminSCEPProfileNotFound typed error so the handler can
distinguish 'wrong profile' from 'broken file'.
* internal/api/router/router.go: HandlerRegistry gains
AdminSCEPIntune; both routes registered as bearer-auth-required
(the admin-gate is at the handler layer per the M-008 pattern).
* cmd/server/main.go: declares scepServices map[string]*service.SCEPService
BEFORE HandlerRegistry construction so the same map can be referenced
from both the admin handler (constructed early) and the SCEP startup
loop (which populates it later by reference). The per-profile loop
now calls scepService.SetPathID(profile.PathID) and stores the service
pointer into the shared map. AdminSCEPIntune handler is constructed
at the same time as AdminCRLCache.
* internal/api/handler/m008_admin_gate_test.go: AdminGatedHandlers
map gains 'admin_scep_intune.go' with a one-line justification —
the regression scanner enforces the per-handler test triplet
(TestAdminSCEPIntune_NonAdmin_Returns403 + _AdminExplicitFalse_Returns403
+ _AdminPermitted_ForwardsActor) plus their POST siblings for
ReloadTrust.
* api/openapi.yaml: documents both endpoints with request body /
response shape / error mapping; openapi-parity-test now matches
the registered routes.
Frontend (Phase 9.4):
* web/src/pages/SCEPAdminPage.tsx — single-page Intune Monitoring
surface:
- Per-profile cards (one card per SCEP profile). Enabled profiles
get the full counter grid + trust-anchor-expiry badge tone
(good ≥30d / warn 7-30d / bad <7d / EXPIRED). Disabled profiles
get an off-state pill with the env-var hint to opt in.
- Counters polled every 30s via TanStack Query against
GET /admin/scep/intune/stats.
- Recent failures table (last 50) populated from the audit log
filtered to action=scep_pkcsreq_intune AND scep_renewalreq_intune;
merged + sorted by timestamp descending. Polled every 60s.
- Reload trust anchor button per profile + confirmation modal that
explains the SIGHUP equivalence and the fail-safe behavior.
onConfirm runs a TanStack mutation, refetches the stats query
on success, surfaces the underlying error (eg 'trust anchor
cert expired') in the modal on failure (modal stays open so
operator can retry).
- Admin gate: when authRequired && !admin the page renders an
'Admin access required' banner and the underlying admin API
requests are never issued (React Query enabled flag gated on
auth.admin) — server-side enforcement is M-008.
* web/src/api/types.ts: IntuneStatsSnapshot + IntuneTrustAnchorInfo +
IntuneStatsResponse + IntuneReloadTrustResponse.
* web/src/api/client.ts: getAdminSCEPIntuneStats +
reloadAdminSCEPIntuneTrust(pathID).
* web/src/main.tsx: new route /scep/intune. The route is unconditional;
the gating is at the page level so deep-links land cleanly.
* web/src/components/Layout.tsx: 'SCEP Intune' nav link between
Observability and Audit Trail with the appropriate sidebar icon.
Tests (Phase 9.5):
* internal/api/handler/admin_scep_intune_test.go (16 tests):
- M-008 admin-gate triplet for both Stats (GET) and ReloadTrust
(POST): NonAdmin / AdminExplicitFalse / AdminPermitted.
- Method-gate tests (Stats rejects POST, ReloadTrust rejects GET).
- Stats propagates service errors as 500.
- ReloadTrust maps ErrAdminSCEPProfileNotFound→404,
ErrSCEPProfileIntuneDisabled→409, generic err→500.
- Empty body targets legacy root PathID.
- Malformed JSON→400.
- AdminSCEPIntuneServiceImpl handles nil map + unknown PathID.
* web/src/pages/SCEPAdminPage.test.tsx (13 tests):
- Admin gate (non-admin sees gated banner + zero admin API calls;
admin sees the page; no-auth dev mode also passes).
- Profile rendering (counters with correct labels, expiry badge
tone for ≥30d / EXPIRED states, off-state pill for disabled
profiles, empty-state banner when no profiles configured).
- Reload modal (opens on click, calls mutation on Confirm,
keeps modal open + shows error on failure, Cancel skips
mutation).
- Error path renders ErrorState with retry.
- Audit log filter merges PKCSReq + RenewalReq events and sorts
descending.
Verification:
* gofmt clean on touched files
* go vet ./... clean
* staticcheck on intune/service/api/cmd-server clean
* go test -short across api+service+intune+cmd-server: all green
* web tsc --noEmit clean
* Vitest: SCEPAdminPage.test.tsx 13/13 + sibling page suites all
pass
* G-3 docs-drift CI guard: Phase 9 adds no new CERTCTL_* env vars
so the guard does not fire
* openapi-parity-test green (both new admin endpoints documented)
* M-008 regression scanner enforces the per-handler test triplet —
pin updated, all triplets present
Refs: cowork/scep-rfc8894-intune-master-prompt.md::Phase 9
cowork/scep-rfc8894-intune/progress.md
|
||
|
|
0594631e6a |
gui/cert-detail: revocation endpoints panel (CRL/OCSP) — Phase 5
CertificateDetailPage now surfaces a Revocation Endpoints card showing
the standards-compliant /.well-known/pki/crl/{issuer_id} CRL distribution
point (RFC 5280 §4.2.1.13) and /.well-known/pki/ocsp/{issuer_id} OCSP
responder URL (RFC 6960 §A.1) for relying parties that don't already know
certctl's well-known scheme.
Two action buttons exercise the same network path the issued leaves'
AIA/CDP extensions advertise, so an operator can confirm 'did the
backend Phases 1-4 actually wire end-to-end?' without curl:
* 'Test CRL fetch' — fetchCRL(issuer_id) helper, surfaces byte count
* 'Check OCSP status' — getOCSPStatus(issuer_id, serial_hex) helper
Admin-only cache-age badge: when useAuth().admin is true the panel pulls
GET /api/v1/admin/crl/cache (M-008 admin-gated handler) and shows
'Cache fresh · 2m ago' / 'Cache stale' / 'Not yet generated' next to
the heading. Non-admin callers don't trigger the fetch (gated client-side
on enabled flag, server-side on middleware.IsAdmin) so the badge cannot
leak generation cadence.
Test coverage in CertificateDetailPage.test.tsx pins:
1. CRL + OCSP URLs render with issuer_id substituted
2. Test CRL fetch button calls fetchCRL with the issuer_id and renders
the byte-count success message
3. Check OCSP status button calls getOCSPStatus with (issuer_id, serial)
and renders the DER byte-count
4. Admin badge stays HIDDEN (and getAdminCRLCache is NEVER called) when
useAuth().admin is false — pins the no-info-leak invariant
P-1 closure docblock + CI guardrail (.github/workflows/ci.yml) updated
to remove getOCSPStatus from the documented-orphan list since it now
has a real consumer.
types.ts: CRLCacheRow / CRLCacheEvent / CRLCacheResponse mirrors of the
backend admin handler payload (admin_crl_cache.go).
client.ts: fetchCRL + getAdminCRLCache helpers; getOCSPStatus already
existed and is now an active consumer.
Tests: 6/6 in CertificateDetailPage.test.tsx, 150/150 across api+page
suite. tsc --noEmit clean.
|
||
|
|
12adc97381 |
Bundle H follow-up #2: end-to-end fix for Pass 3 CI multi-match failures
Second CI run surfaced 8 real failures across 7 detail/list pages and 1
mock-shape error. Root causes:
1. Multi-match disambiguation. screen.getByText(...) matched both the
PageHeader <h2> AND duplicated text in InfoRow / detail-row spans
within the same page (e.g., issuer name appears as page title AND
in the Issuer Details panel; cert.common_name appears as page title
AND in the Common Name InfoRow). The regex variants (getByText(/X/i))
were even worse — matched any element containing the substring.
2. NetworkScanPage mock-shape. xssScanTarget.ports was '443,8443'
(string), but NetworkScanPage.tsx:180 calls t.ports?.join() which
requires a number[] per src/api/types.ts:506. Page errored before
rendering the DataTable, so the XSS test's body.textContent
assertion saw an empty string.
Fixes:
- Every page-title assertion in the 14 Pass 3 test files now uses
screen.getByRole('heading', { level: 2, name: ... }), which matches
ONLY the PageHeader <h2> (PageHeader.tsx:11 renders an actual <h2>).
Detail-row spans / InfoRow text / column-header text in lower-level
headings (h3) is excluded by the level filter.
- NetworkScanPage xssScanTarget.ports changed from '443,8443' (string)
to [443, 8443] (number[]) per the NetworkScanTarget TS type.
Pages with assertion fixes (8 tests across 7 files):
- AgentFleetPage /Agent/i -> 'Agent Fleet Overview' (h2)
- AuditPage /Audit/ -> 'Audit Trail' (h2)
- CertificateDetailPage 'plain.example.com' (text) -> heading h2
- HealthMonitorPage /Health/i -> 'Health Monitor' (h2)
- IssuerDetailPage 'Plain Name' (text) -> heading h2
- JobDetailPage /j-xss-001/ (text) -> heading h2
- JobsPage /Jobs/i -> 'Jobs' (h2)
- ProfilesPage /Profile/i -> 'Certificate Profiles' (h2)
- TargetDetailPage 'Plain Name' (text) -> heading h2
Plus 4 already-correct pages updated for consistency:
- DigestPage text 'Certificate Digest' -> heading h2
- ObservabilityPage text 'Observability' -> heading h2
- NetworkScanPage /Network/i -> 'Network Scanning' (h2)
- ShortLivedPage text 'Short-Lived...' -> heading h2
Mock-shape fix:
- NetworkScanPage.test.tsx ports: '443,8443' -> [443, 8443]
End-to-end audit:
Every Pass 3 test now anchors on the unambiguous PageHeader <h2>;
no remaining getByText() with regex or substring that could spuriously
multi-match. Mock data shapes verified against src/api/types.ts
interfaces (NetworkScanTarget, MetricsResponse, ManagedCertificate).
|
||
|
|
52a9e4977c |
Bundle H follow-up: fix Pass 3 test mock shape mismatches caught by CI
CI surfaced two real failures in the Pass 3 tests: 1. ObservabilityPage.test.tsx — tests 2 + 3 mocked getMetrics with only the uptime field, but ObservabilityPage.tsx:85 reads metrics.gauge .certificate_total. Test 2 silently 'passed' because the page error bailed out before any rendering took place — its assertions (no live <script>, __xss_pwned__ undefined) became vacuous; test 3 surfaced the actual TypeError. Fix: every getMetrics mock now returns the full MetricsResponse shape (gauge / counter / uptime) per src/api/types.ts :517 — sanity-checked against the actual TS interface. 2. CertificateDetailPage.test.tsx — the xssCert mock was missing updated_at, which CertificateDetailPage.tsx:605 reads through formatDateTime. formatDateTime tolerates undefined per utils.ts:6, so the page didn't throw, but the cert mock should mirror the real ManagedCertificate shape — added updated_at. Both fixes are mock-shape corrections; no production code changes. |
||
|
|
a5c4f42ec9 |
M-029 Pass 3 batch C (FINAL): T-1 tests for 5 list pages — Pass 3 complete
Closes M-029 Pass 3 fully. Every src/pages/*.tsx now has a *.test.tsx peer.
Audit recon: 'comm -23 <pages> <test-peers>' returns zero (all 14 T-1-deferred
pages now covered).
Test files added (each ships render-coverage + an XSS-hardening contract):
- HealthMonitorPage.test.tsx endpoint URL + last_error payloads
- JobsPage.test.tsx type / certificate_id / agent_id /
error_message payloads
- NetworkScanPage.test.tsx network_range / agent_id / last_scan_message
payloads
- ProfilesPage.test.tsx profile name / description / EKUs payloads
- AgentFleetPage.test.tsx agent name / hostname / OS / arch / IP
payloads (mirrors the M-003 MCP fence shape)
Pass 3 totals across batches A + B + C: 14 new test files, 14/14 T-1-deferred
pages closed. Each test pins three invariants:
1. The page renders against mock data without crashing.
2. No live <script data-xss='...'> attaches to the DOM.
3. The literal payload appears as escaped text content (proving the page
surfaces the data without rendering it as HTML).
M-029 status after Pass 3:
Pass 1 — useMutation -> useTrackedMutation COMPLETE (6 batches, 56 -> 0)
Pass 2 — useState pagination -> useListParams COMPLETE (CertificatesPage)
Pass 3 — XSS-hardening test suites COMPLETE (14/14 pages)
M-029 IS NOW READY TO CLOSE.
|
||
|
|
00168e009e |
M-029 Pass 3 batch B: T-1 tests for 4 detail pages — XSS hardening
Continues Pass 3. Each detail page has its own narrow attack surface
(subject DN, last_test_message, error_message) that the test exercises
with literal <script> payloads in every text field.
Test files added:
- CertificateDetailPage.test.tsx cert subject / SANs / serial / PEM
across 7 sidecar queries (getCertificate,
getCertificateVersions, getTargets,
getProfile, getProfiles, getRenewalPolicies,
getJobs all mocked in beforeEach)
- IssuerDetailPage.test.tsx issuer name / type / config / last_test_message
(router-aware test using Routes + useParams)
- TargetDetailPage.test.tsx target name / config / last_test_message
(router-aware test pattern)
- JobDetailPage.test.tsx job error_message / type / details
(3-query mock: getJob + getJobVerification +
getAuditEvents)
Closes 9 of 14 T-1-deferred pages toward M-029 Pass 3 completion (5 batch A,
+ 4 batch B = 9; 5 to go in batch C).
|
||
|
|
b676888242 |
M-029 Pass 3 batch A: T-1 page tests for 5 simpler pages — XSS hardening
Pass 3 of M-029 ships per-page render + XSS-hardening test suites for the
14 T-1-deferred pages. Each test:
- Renders the page with mock data containing <script> payloads in every
text-rendering field.
- Asserts no live <script data-xss='...'> element attached to the DOM.
- Asserts no global side-effect from the script body executed (window
__xss_pwned__ stays undefined).
- Asserts the literal payload text appears as escaped content (proving
the page surfaces the data without rendering it as HTML).
Batch A: 5 simpler pages (display-only / single-mutation / login).
Test files added:
- DigestPage.test.tsx preview HTML payload + render coverage
- LoginPage.test.tsx useAuth.error payload + form invariants
(mocked AuthProvider via Layout.test pattern)
- ShortLivedPage.test.tsx cert subject DN / SAN / id / environment
payloads through the DataTable rendering
- AuditPage.test.tsx audit-event action / actor / resource_*
payloads through the DataTable rendering
- ObservabilityPage.test.tsx health.status + Prometheus text payloads
through the <pre> rendering surface
Closes 5 of 14 T-1-deferred pages toward M-029 Pass 3 completion.
|
||
|
|
876f6bd48d |
M-029 Pass 2: migrate CertificatesPage to useListParams (Pass 2 complete)
M-029 Pass 2 surface turned out to be much smaller than the audit estimated:
the only page with real UI-driven pagination + filter state stored in
useState was CertificatesPage. Most other pages either fetch filter-dropdown
data with hardcoded per_page (sidecars, not pagination) or use
useSearchParams directly already. So Pass 2 is a single-page migration.
What changed:
- 9 useState hooks (statusFilter, envFilter, issuerFilter, ownerFilter,
profileFilter, teamFilter, expiresBefore, sortBy, page, perPage) collapse
into a single useListParams({ pageSize: 50 }) call.
- All filter onChange handlers now call setFilter('<key>', value).
- setFilter automatically resets page to 1 on every filter / sort change,
so the manual setPage(1) calls at three sites (team / expires_before /
sort) are no longer needed — the F-1 contract is now enforced by the
hook, not by hand-rolled setPage calls scattered through onChange.
- Pagination handler simplified: onPerPageChange: setPageSize (the hook
drops the page param from the URL when pageSize changes).
Behavior preserved:
- The 8 filter keys (status / environment / issuer_id / owner_id /
profile_id / team_id / expires_before / sort) still flow through
getCertificates with the same param names — pinned by the existing
CertificatesPage.test.tsx F-1 contract tests.
- Default pageSize stays at 50 (matches the F-1 baseline; the hook's
global default is 25 but the per-page override takes precedence).
- Page reset on filter / per_page change preserved (now hook-enforced).
Side benefit: filter / sort / pagination state is now URL-resident (browser
deep-link + back-button correct). Sharing a filtered list view is now a
URL copy, not a 'recreate this filter combo by hand' message.
Verification:
legacy useMutation count still 0 (Pass 1 invariant intact)
CertificatesPage useListParams 0 -> 1 site
CertificatesPage local pagination removed
|
||
|
|
213b464d95 |
M-029 Pass 1 batch 6 (FINAL): migrate 2 five-mutation pages — Pass 1 complete
Drains the last 10 useMutation sites (10 -> 0). Pass 1 is now COMPLETE:
every legacy useMutation site in src/pages and src/components has been
migrated to useTrackedMutation with explicit invalidates contract. The only
remaining useMutation reference in the codebase is inside useTrackedMutation.ts
itself (the wrapper).
Pages migrated:
- CertificateDetailPage.tsx 5 mutations across 2 components:
InlinePolicyEditor.saveMutation invalidates
[['certificate', certId]];
main page renew/deploy/archive/revoke invalidate
various combinations of [['certificate', id]]
and [['certificates']].
(queryClient + useQueryClient dropped from both)
- OnboardingWizard.tsx 5 mutations across 4 components:
Issuer step create/test invalidates [['issuers']]
(test refreshes last_tested_at server-side);
CreateTeamModalInline.create invalidates [['teams']];
CreateOwnerModalInline.create invalidates [['owners']];
CertificateStep.create invalidates
[['certificates'], ['dashboard-summary']].
(queryClient + useQueryClient dropped from all 4)
Verification:
legacy useMutation calls 10 -> 0 (-10) — Pass 1 COMPLETE
useTrackedMutation count 46 -> 61 (+15; some 5-mutation pages collapse
two invalidate-pairs into one array literal,
hence net is greater than the +10 removal)
Pass 1 totals: 56 useMutation sites -> 0; 0 useTrackedMutation -> 61.
Total work in Pass 1: 6 batches across 21 page files merged --no-ff to master.
|
||
|
|
190a27e824 |
M-029 Pass 1 batch 5: migrate 2 four-mutation pages to useTrackedMutation
Drains 8 more useMutation sites (18 -> 10). NetworkScanPage hoists the
shared invalidation array into scanTargetInvalidates const.
Pages migrated:
- IssuersPage.tsx test/delete/create/update all invalidate [['issuers']]
(testIssuerConnection updates last_tested_at
server-side, so the list refreshes; local
setTestResult banner still surfaces immediate result)
(queryClient + useQueryClient dropped)
- NetworkScanPage.tsx create/delete/toggle/scan all invalidate
[['network-scan-targets']] (hoisted to shared const)
(queryClient + useQueryClient dropped)
Verification:
legacy useMutation count 18 -> 10 (-8)
useTrackedMutation count 38 -> 46 (+8)
Closes 46 of 56 sites toward M-029 Pass 1 completion (82%).
|
||
|
|
ec3772d4e3 |
M-029 Pass 1 batch 4: migrate 5 more 3-mutation pages to useTrackedMutation
Drains 15 more useMutation sites (33 -> 18). All five pages follow the same
create/update/delete CRUD shape — invalidates the page's primary list query.
Pages migrated:
- OwnersPage.tsx CRUD invalidates [['owners']]
(queryClient kept — modal onSuccess props use it)
- PoliciesPage.tsx toggle/delete/create invalidates [['policies']]
(queryClient kept — modal onSuccess prop uses it)
- ProfilesPage.tsx CRUD invalidates [['profiles']]
(queryClient kept — modal onSuccess prop uses it)
- RenewalPoliciesPage.tsx CRUD invalidates [['renewal-policies']]
(queryClient + useQueryClient dropped)
- TeamsPage.tsx CRUD invalidates [['teams']]
(queryClient kept — modal onSuccess props use it)
Verification:
legacy useMutation count 33 -> 18 (-15)
useTrackedMutation count 23 -> 38 (+15)
Closes 38 of 56 sites toward M-029 Pass 1 completion (68%).
|
||
|
|
ee25f00207 |
M-029 Pass 1 batch 3: migrate 3 three-mutation pages to useTrackedMutation
Drains 9 more useMutation sites (42 -> 33). HealthMonitorPage hoists the
shared invalidation pair into a healthCheckInvalidates const so the three
mutations don't repeat the array literal.
Pages migrated:
- HealthMonitorPage.tsx create + delete + acknowledge all invalidate
[['health-checks'], ['health-checks-summary']]
(hoisted to a shared const)
- AgentGroupsPage.tsx delete + create + update all invalidate [['agent-groups']]
(queryClient kept — modal onSuccess props still use it)
- JobsPage.tsx cancel + approve + reject all invalidate [['jobs']]
Verification:
legacy useMutation count 42 -> 33 (-9)
useTrackedMutation count 14 -> 23 (+9)
Closes 23 of 56 sites toward M-029 Pass 1 completion.
|
||
|
|
e0a3d50f5e |
M-029 Pass 1 batch 2: migrate 5 two-mutation pages to useTrackedMutation
Drains 10 more useMutation sites (52 -> 42). Each migration declares explicit
invalidates per the M-009 contract.
Pages migrated:
- DashboardPage.tsx previewDigest + sendDigest both 'noop' (read-only
preview / fire-and-forget email — no client cache impact)
- DiscoveryPage.tsx claim + dismiss both invalidate
[['discovered-certificates'], ['discovery-summary']]
- NotificationsPage.tsx markRead + requeue both invalidate [['notifications']]
- TargetDetailPage.tsx update + testConnection both invalidate [['target', id]]
- TargetsPage.tsx createTarget + deleteTarget both invalidate [['targets']]
Verification:
legacy useMutation count 52 -> 42 (-10)
useTrackedMutation count 4 -> 14 (+10)
Closes 14 of 56 sites toward M-029 Pass 1 completion.
|
||
|
|
2057e76706 |
M-029 Pass 1 batch 1: migrate 4 single-mutation pages to useTrackedMutation
Drains the Bundle 8 useMutation backlog (56 -> 52). Each migration declares
explicit invalidates per the M-009 contract; the wrapper invalidates BEFORE
calling the caller's onSuccess so user code drops the redundant qc.invalidateQueries.
Pages migrated:
- AgentsPage.tsx invalidates: [['agents'], ['agents', 'retired']]
- CertificatesPage.tsx invalidates: [['certificates']]
- DigestPage.tsx invalidates: 'noop' (sendDigest is a server-side email
dispatch — no client query reflects digest-send state)
- IssuerDetailPage.tsx invalidates: [['issuer', id]] (testIssuerConnection
updates last_tested_at server-side)
Verification:
legacy useMutation count 56 -> 52 (-4 sites)
useTrackedMutation count 0 -> 4 (+4 sites)
invalidation surface 82 -> 84 (+2; DigestPage is noop, AgentsPage
collapses 2 invalidates into 1 array, others +1)
Closes 4 of 56 sites toward M-029 Pass 1 completion.
|
||
|
|
7013227a34 |
test(web): Vitest coverage for 8 high-leverage pages (T-1 master)
Closes T-1 (cat-s2-c24a548076c6) — frontend page-level Vitest coverage was
3 of 28 pages pre-T-1. T-1 lifts that to 11 of 28 (39%) by writing focused
behavior tests for the 8 highest-leverage pages.
Tests added:
- CertificatesPage.test.tsx (6 cases) — F-1 filter+pagination contract:
team_id / expires_before / sort param wiring, page=1 reset on filter
change, page+per_page always present in getCertificates params.
- PoliciesPage.test.tsx (4 cases) — D-006/D-008 TitleCase contract:
list render, severity badge, toggle-enabled inversion, delete confirm.
- IssuersPage.test.tsx (3 cases) — D-2 phantom-trim + B-1 EditIssuer:
list render, StatusBadge derives from enabled, Test fires
testIssuerConnection.
- TargetsPage.test.tsx (3 cases) — D-2 phantom-trim:
list render, Status derives from enabled, Delete fires deleteTarget.
- AgentsPage.test.tsx (3 cases) — D-2 phantom-trim + heartbeatStatus:
list render, undefined last_heartbeat_at -> Offline,
listRetiredAgents lazy-loaded.
- AgentDetailPage.test.tsx (3 cases) — D-2 phantom-trim:
fetches by URL :id, Registered row reads registered_at,
Capabilities + Tags sections absent.
- OwnersPage.test.tsx (3 cases) — B-1 EditOwnerModal closure:
list render, Edit opens modal, Save fires updateOwner.
- TeamsPage.test.tsx (2 cases) — B-1 EditTeamModal closure.
- AgentGroupsPage.test.tsx (2 cases) — B-1 EditAgentGroupModal closure.
- RenewalPoliciesPage.test.tsx (3 cases) — B-1 brand-new-page closure:
list + alert_thresholds_days display, Create modal, Edit modal.
- DiscoveryPage.test.tsx (3 cases) — I-2 claim/dismiss closure:
list render, status filter wiring, Dismiss fires dismissDiscoveredCertificate.
CI guardrail: .github/workflows/ci.yml step "Frontend page-coverage
regression guard (T-1)" blocks new pages from landing without sibling
.test.tsx unless added to a 14-name deferred allowlist with one-line
"why deferred" justifications.
Net coverage: 13 page-level vitest cases -> ~35 page-level vitest cases
across 14 files (was 3); total project tests 302 -> 337.
See coverage-gap-audit-2026-04-24-v5/unified-audit.md
cat-s2-c24a548076c6 for closure rationale.
|
||
|
|
94ca69554b |
feat(web): expand CertificatesPage filters + reusable DataTable pagination (F-1 master)
Closes two 2026-04-24 audit findings (P2):
- cat-e-610251c8f72d: CertificatesPage exposed only 5 of the
backend handler's 17 supported query filters. Audit recommended
minimum-add: team_id (already first-class elsewhere),
expires_before (drives the "expiring in N days" workflow), and
sort (sort by notAfter for the most common operator triage).
Fix: 3 new useState hooks + 3 new filter UIs in the toolbar +
3 new param wires. Remaining filters (agent_id, expires_after,
created_after, updated_after, cursor, fields, sort_desc) deferred
until a consumer use case demands them — over-stuffing the
toolbar is its own UX cost.
- cat-k-e85d1099b2d7: CertificatesPage rendered the first 50
certs returned by the backend with no way to advance. Backend
response carries {data, total, page, per_page} — a pure render
gap. Fix: lifted pagination into the reusable DataTable
component as an opt-in `pagination?` prop. CertificatesPage is
the first consumer; TargetsPage / IssuersPage / OwnersPage /
others can adopt by passing the same prop.
DataTable changes:
- New `PaginationProps` interface (page, perPage, total,
onPageChange, onPerPageChange?, perPageOptions?).
- New optional `pagination?` prop on DataTable.
- New `PaginationControls` subcomponent rendered in the table
footer when `pagination` is set and `total > 0`. Renders
"Showing X–Y of Z" + per-page selector + page counter +
Prev/Next buttons. Disabling logic guards both boundaries.
CertificatesPage changes:
- 3 new filter useState hooks: teamFilter, expiresBefore, sortBy.
- 2 new pagination useState hooks: page (1), perPage (50).
- Added 4th cohort hook: getTeams via useQuery (mirrors the
existing issuers/owners/profiles filter-data pattern).
- params object gains team_id, expires_before, sort, page, per_page.
- 3 new filter UIs in the toolbar (team select, expires_before
date picker, sort select).
- DataTable gets the new pagination prop.
- Filter changes reset page=1 to keep results visible.
Verification:
- tsc --noEmit — clean
- vitest run — 9 files, 302 tests passing (no regression)
- golangci-lint v2.11.4 run ./... — 0 issues
- All sibling guardrails (S-1, G-3, D-1+D-2, B-1, L-1, H-1, C-1) pass
Audit findings closed:
- cat-e-610251c8f72d (P2)
- cat-k-e85d1099b2d7 (P2)
Deferred follow-ups:
- 8 backend filters (agent_id, expires_after, created_after,
updated_after, cursor, fields, sort_desc, plus secondary sort
fields) deferred until consumer demand justifies UI weight.
- TargetsPage / IssuersPage / OwnersPage / etc. opt-in to the
pagination prop incrementally — DataTable now supports it; per-
page adoption is a follow-up commit each.
- CertificatesPage Vitest coverage of the new filter+pagination
paths deferred to the per-page test campaign (cat-s2-c24a548076c6).
|
||
|
|
55eb7135be |
fix(web,ci): close TS↔Go type drift across 5 entities (D-2 master)
Closes five 2026-04-24 audit findings (all P2, all category cat-f /
diff-05x06-*) by reconciling the TypeScript interfaces in
web/src/api/types.ts with the on-wire JSON shape Go's
internal/domain/*.go structs actually emit. D-1 closed the same pattern
for one entity (Certificate / ManagedCertificate); D-2 covers the
remaining five.
Per-entity verdicts (audit's "stricter side is the contract"):
Agent — TRIM 5 phantoms (last_heartbeat, capabilities, tags,
created_at, updated_at). Go emits last_heartbeat_at only.
Target — ADD 2 (retired_at?, retired_reason?) — I-004 fields.
DiscCert — ADD pem_data? — real field, real Go emit, omitempty.
Issuer — TRIM phantom status. Go has Enabled bool only.
Notif — TRIM phantom subject. Go has Message string only.
Certificate — verify-only; D-1 closure confirmed clean at recon.
Consumer fixes (same commit as the trim):
- AgentDetailPage.tsx — remove dead Capabilities + Tags sections (always
rendered empty); replace agent.created_at/updated_at row with the
Go-emitted registered_at; widen heartbeatStatus() to accept undefined.
- AgentsPage.tsx — same heartbeatStatus widening.
- IssuersPage.tsx + IssuerDetailPage.tsx — issuerStatus() now derives
from `enabled` exclusively; the dead `issuer.status || 'Unknown'`
fallback is gone.
- NotificationsPage.tsx — drop dead `|| n.subject` fallback.
- NotificationsPage.test.tsx — drop dead `subject:` from mocks.
- api/utils.ts::timeAgo widened to accept string | undefined | null.
- api/types.test.ts — Agent (I-004) fixture trimmed of the 5 phantoms.
Tests (Vitest):
- 5 new describe blocks in web/src/api/types.test.ts:
- Agent interface (D-2 phantom-fields trim) — 2 it blocks
- Target interface (D-2 retirement fields) — 2 it blocks
- DiscoveredCertificate interface (D-2 pem_data ADD) — 2 it blocks
- Issuer interface (D-2 status phantom trim) — 1 it block
- Notification interface (D-2 subject phantom trim) — 1 it block
- Each block uses the literal-construction pattern from D-1; trimmed
fields are pinned via excess-property comments that compile-fail when
uncommented if a phantom is reintroduced.
CI regression guardrail:
- .github/workflows/ci.yml — existing D-1 step renamed to "Forbidden
StatusBadge dead-key + TS phantom-field regression guard (D-1 + D-2)".
Three new awk-windowed greps over Agent / Issuer / Notification
interfaces in types.ts. The Agent grep includes a `grep -v
'last_heartbeat_at'` filter to avoid false positives on the
legitimate Go-emitted heartbeat field.
Documentation:
- CHANGELOG.md — new D-2 section above B-1 under [unreleased] with full
Added/Removed/Audit findings closed/Known follow-ups breakdown.
- docs/architecture.md — Web Dashboard section gains a new "TS ↔ Go
type contract rule (D-1 + D-2 closure)" paragraph capturing the
stricter-side-wins rule and the CI guardrail it's anchored by.
- coverage-gap-audit-2026-04-24-v5/unified-audit.md — Live Tracker score
20/47 → 25/47 (P2: 6/27 → 11/27). Per-finding ✅ RESOLVED Status
blocks added to all 5 diff-05x06-* entries plus the verify-only
Certificate entry. Closed-bundle index gets D-2 row.
Verification (all gates green):
- cd web && tsc --noEmit → clean
- cd web && vitest run --reporter=dot → 9 files, 302 tests passing
(was 294 → +8 D-2 cases)
- cd web && vite build → clean
- go vet ./internal/... ./cmd/... → clean (no Go touched)
- golangci-lint v2.11.4 run ./... → 0 issues
- D-2 Agent guardrail dry-run → empty (good)
- D-2 Issuer guardrail dry-run → empty (good)
- D-2 Notification guardrail dry-run → empty (good)
- D-2 Target ADD-shape sanity → 2 retirement fields present
- D-2 DiscCert ADD-shape sanity → pem_data present
- D-1 Certificate guardrail still clean → empty (good)
- OpenAPI YAML parses → 89 paths
Audit findings closed:
- diff-05x06-7cdf4e78ae24 (P2, Agent TS↔Go drift)
- diff-05x06-2044a46f4dd0 (P2, Target TS↔DeploymentTarget Go drift)
- diff-05x06-85ab6b98a2f7 (P2, DiscoveredCertificate TS↔Go drift)
- diff-05x06-97fab8783a5c (P2, Issuer TS↔Go drift)
- diff-05x06-caba9eb3620e (P2, Notification TS↔NotificationEvent drift)
- diff-05x06-af18a8d7ef41 (P2) — verified clean since D-1; no edit
Deferred follow-ups:
- Issuer richer status view (enabled × test_status) — UX scope, not drift.
- Real Agent metadata (capabilities, tags) — backend feature, not drift.
- DiscoveredCertificate pem_data list-response perf — separate backend change.
|
||
|
|
097995e503 |
fix(web,ci): close orphan-CRUD GUI gaps + dead exportCertificatePEM (B-1 master)
Closes four 2026-04-24 audit findings via per-page Edit modals on five
existing pages, a brand-new RenewalPoliciesPage for the rp-* CRUD surface,
and removal of one dead duplicate so the public client surface stops
growing without consumers. Anchored by a CI grep guardrail that fails
the build if any of the eight previously-orphan client functions loses
its non-test page consumer or if exportCertificatePEM is resurrected.
Per-page Edit modals (mirroring existing CreateXModal scaffolding):
- web/src/pages/OwnersPage.tsx — EditOwnerModal (name/email/team_id)
- web/src/pages/TeamsPage.tsx — EditTeamModal (name/description)
- web/src/pages/AgentGroupsPage.tsx — EditAgentGroupModal (full match-rule
set: name/description/match_os/match_architecture/match_ip_cidr/
match_version/enabled)
- web/src/pages/IssuersPage.tsx — EditIssuerModal (rename-only; type
locked, config blob preserved untouched, footer note about delete+
recreate for credential rotation)
- web/src/pages/ProfilesPage.tsx — EditProfileModal (rename + description
only; policy fields preserved untouched, footer note about deferred
policy editing)
New page (closes cat-b-4631ca092bee — RenewalPolicy CRUD orphan):
- web/src/pages/RenewalPoliciesPage.tsx — full CRUD page with shared
PolicyFormModal for Create + Edit (form shape identical), 7-column
DataTable (Policy/RenewalWindow/Auto/Retries/AlertThresholds/Created/
Actions), comma-separated alert_thresholds_days input parser, and
alert() surfacing of repository.ErrRenewalPolicyInUse (409) on Delete
so operators can re-target dependent certs before deletion.
- web/src/main.tsx — adds /renewal-policies route.
- web/src/components/Layout.tsx — adds sidebar nav item slotted between
Policies and Profiles.
Removed (closes cat-b-9b97ffb35ef7 — dead duplicate):
- web/src/api/client.ts::exportCertificatePEM — zero consumers across
web/, MCP, CLI, tests; downloadCertificatePEM is the actual call site
in CertificateDetailPage. Test references in client.test.ts and
client.error.test.ts also removed.
CI regression guardrail:
- .github/workflows/ci.yml — adds 'Forbidden orphan-CRUD client function
regression guard (B-1)' step. Greps for all eight previously-orphan
fns (updateOwner/updateTeam/updateAgentGroup/updateIssuer/updateProfile
+ createRenewalPolicy/updateRenewalPolicy/deleteRenewalPolicy) under
web/src/pages/ and fails the build if any has zero non-test consumers.
Also blocks resurrection of exportCertificatePEM. Verified locally
(all 8 fns have ≥2 consumers; exportCertificatePEM is gone) and
against synthetic regressions.
Documentation:
- CHANGELOG.md — new B-1 section above L-1 under [unreleased].
- docs/architecture.md — Web Dashboard section gains a new paragraph
capturing the 'every backend CRUD must have a GUI consumer' rule
with reference to the CI guardrail.
- coverage-gap-audit-2026-04-24-v5/unified-audit.md — flips four
findings to ✅ RESOLVED with detailed Status blocks; bumps Live
Tracker score 16/47 → 20/47 (P1: 9→12, P3: 1→2); adds B-1 row to
closed-bundle index.
Verification:
- cd web && tsc --noEmit — clean
- cd web && vitest run — 9 test files, 294 tests, all passing
- cd web && vite build — clean (no new warnings)
- B-1 guardrail dry-run — all 8 client fns have ≥2 page consumers,
exportCertificatePEM removed (good), FAIL=0
Audit findings closed:
- cat-b-31ceb6aaa9f1 (P1, updateOwner/updateTeam/updateAgentGroup orphan)
- cat-b-7a34f893a8f9 (P1, updateIssuer/updateProfile orphan, rename-only)
- cat-b-4631ca092bee (P1, RenewalPolicy CRUD orphan)
- cat-b-9b97ffb35ef7 (P3, exportCertificatePEM dead duplicate)
Deferred follow-ups:
- Fuller EditIssuerModal with credential-rotation flow (needs threat
model: rotation reuse window, in-flight CSR cancellation, audit-trail
granularity).
- Fuller EditProfileModal with policy-field editing (max-TTL, allowed
EKUs, allowed key algorithms — affect already-issued cert evaluation).
- Per-page Vitest coverage for the new Edit modals (CI grep guardrail
catches the same regression vector at lower cost).
|
||
|
|
f0865bb051 |
fix(api,web,mcp): add bulk-renew + bulk-reassign endpoints, drop client-side N×HTTP loops (L-1 master)
Two audit findings, both category cat-l, both rooted in
web/src/pages/CertificatesPage.tsx. Pre-L-1 the GUI looped per-cert
HTTP calls — 100 selected certs = 100 sequential round-trips × ~50–200
ms each = a 5–20-second wedge during which the operator stared at a
progress bar. Post-L-1 each workflow is a single POST.
cat-l-fa0c1ac07ab5 [P1, primary] — bulk renew loop
handleBulkRenewal: for/await triggerRenewal(id)
cat-l-8a1fb258a38a [P2] — bulk reassign loop
handleReassign: for/await updateCertificate(id, {owner_id})
The bulk-revoke endpoint (POST /api/v1/certificates/bulk-revoke +
BulkRevocationCriteria/Result) already existed as the canonical shape
in v2.0.x — L-1 ports that pattern to renew + reassign with per-action
twists.
Backend (Go)
- internal/domain/bulk_renewal.go: BulkRenewalCriteria mirrors
BulkRevocationCriteria (criteria + IDs modes); BulkRenewalResult
envelope adds EnqueuedJobs[] for per-cert {certificate_id, job_id};
shared BulkOperationError type for all bulk paths.
- internal/domain/bulk_reassignment.go: narrower shape — IDs-only,
owner_id required, team_id optional.
- internal/service/bulk_renewal.go::BulkRenewalService.BulkRenew:
resolves criteria → status filter (Archived/Revoked/Expired/
RenewalInProgress all silent-skip) → per-cert status flip + job
create. Keygen-mode-aware so jobs land in the same initial status
as single-cert TriggerRenewal. Single bulk audit event per call,
not N.
- internal/service/bulk_reassignment.go::BulkReassignmentService.
BulkReassign: validates owner_id upfront via the
ErrBulkReassignOwnerNotFound typed sentinel — non-existent owner
returns 400 before any cert is touched. Already-owned-by-target
is silent-skip. Single bulk audit event.
- internal/api/handler/{bulk_renewal,bulk_reassignment}.go: HTTP
shape mirrors bulk_revocation.go. NOT admin-gated (renew is non-
destructive; reassign is a common-case workflow). Sentinel-error
→ 400 mapping for OwnerNotFound.
- internal/api/router/router.go: three bulk-* routes registered as a
block before the {id} routes. HandlerRegistry gains BulkRenewal +
BulkReassignment fields.
- cmd/server/main.go: NewBulkRenewalService threads cfg.Keygen.Mode
so bulk-renew jobs land in same initial state as single-cert path.
Frontend
- web/src/api/client.ts: bulkRenewCertificates(criteria) +
bulkReassignCertificates(request) functions with full TS types.
- web/src/pages/CertificatesPage.tsx: handleBulkRenewal + handleReassign
rewritten from N-call loops to single calls. Result envelope drives
progress UI; first-error message surfaced when total_failed > 0.
Stale triggerRenewal + updateCertificate imports removed.
MCP
- internal/mcp/types.go: BulkRenewCertificatesInput +
BulkReassignCertificatesInput.
- internal/mcp/tools.go: certctl_bulk_renew_certificates +
certctl_bulk_reassign_certificates tools mirroring the existing
certctl_bulk_revoke_certificates pattern.
OpenAPI
- api/openapi.yaml: two new operations (bulkRenewCertificates,
bulkReassignCertificates) under Certificates tag. Four new schemas
(BulkRenewRequest, BulkRenewResult, BulkEnqueuedJob,
BulkReassignRequest, BulkReassignResult).
Tests
- Domain: BulkRenewalCriteria.IsEmpty + BulkReassignmentRequest.IsEmpty
IsEmpty contracts; JSON round-trip shape pinning.
- Service: 7 BulkRenew tests (happy/criteria-mode/skips-RenewalInProgress/
skips-revoked-archived/empty-criteria-error/partial-failure/
audit-event-emitted) + 8 BulkReassign tests (happy/skips-already-
owned/owner-required/empty-IDs/owner-not-found-sentinel/team-id-
optional/team-id-provided/partial-failure/audit-event-emitted).
- Handler: 5 BulkRenew handler tests (happy/empty-body-400/wrong-
method-405/actor-attribution/service-error-500) + 6 BulkReassign
handler tests (happy/empty-IDs-400/missing-owner-400/owner-not-
found-400-via-sentinel/wrong-method-405/generic-error-500).
CI guardrail
- .github/workflows/ci.yml: 'Forbidden client-side bulk-action loop
regression guard (L-1)'. Greps web/src/pages/CertificatesPage.tsx
for 'for(...) await triggerRenewal(...)' and 'for(...) await
updateCertificate(...)' patterns; comment lines exempt; test files
exempt. Verified locally (passes against post-fix tree, fires
against synthetic regression).
Counts (deltas)
- Routes: 119 → 121 (+2)
- OpenAPI operations: 123 → 125 (+2)
- MCP tools: 83 → 85 (+2)
Performance
- 100-cert bulk-renew: ~10s of sequential HTTP → ~100ms (99% latency
reduction on the canonical operator workflow).
- Audit event volume: 1 + N per operation → 1.
Out of scope (deferred follow-ups)
- cat-b-31ceb6aaa9f1: updateOwner/updateTeam/updateAgentGroup orphan
(different shape — wire existing PUT to GUI, not new bulk endpoint).
- cat-k-e85d1099b2d7: CertificatesPage no pagination UI.
- cat-i-b0924b6675f8: MCP missing claim/dismiss/acknowledge (L-1 added
2 new tools but does not close that finding).
Verification
- go build / vet / test -short / test -short -race all clean.
- web tsc --noEmit + vitest run all clean (296 tests passing).
- OpenAPI YAML parses (89 paths, 125 ops).
- L-1 CI guardrail passes against post-fix tree, fires against
synthetic regression.
No push.
|
||
|
|
9dc0742e77 |
fix(web): close StatusBadge enum drift + Certificate TS phantom fields (D-1 master)
Five audit findings, all category cat-d or cat-f, all rooted in two
frontend files. The dashboard silently lied:
cat-d-359e92c20cbf [P1, primary] — Agent: 'Stale' dead key + 'Degraded'
neutral fallthrough
cat-d-9f4c8e4a91f1 [P2] — Notification: 'dead' missing
cat-d-1447e04732e7 [P3] — Cert: 'PendingIssuance' dead key
cat-f-cert_detail_page_key_render_fallback [P2] — render-site reads
cert.key_algorithm directly
cat-f-ae0d06b6588f [P2] — Certificate TS phantom fields (root cause)
Pre-D-1, agents in the only Go AgentStatus that means 'needs operator
attention' (Degraded) rendered as default neutral grey because StatusBadge
mapped 'Stale' (a key Go has never emitted) to yellow. Dead-letter
notifications visually equated with 'read' (operator-acknowledged). The
Certificate badge map carried a 'PendingIssuance' key no Go enum emits.
CertificateDetailPage's Key Algorithm and Key Size rows always rendered
'—' even when the data was a single fetch away — the lookup went through
cert.key_algorithm / cert.key_size directly, both phantom Certificate TS
fields. Trim the TS type so the missing-data case is explicit; fix the
render site to use latestVersion?.field; pin the contract with a 38-case
Vitest property test that walks every Go enum.
StatusBadge (web/src/components/StatusBadge.tsx)
- Drop 'Stale' (Agent dead key) + 'PendingIssuance' (Cert dead key).
- Add 'Degraded' (Agent → badge-warning) + 'dead' (Notification → badge-danger).
- Add leading docblock naming Go-side source-of-truth file for every
status family and pointing at the property test as regression vector.
Property test (web/src/components/StatusBadge.test.tsx — 38 cases)
- Iterates every Go-emitted enum value (AgentStatus, CertificateStatus,
JobStatus, NotificationStatus, DiscoveryStatus, HealthStatus) plus the
two frontend-synthesized Enabled/Disabled labels, asserts every value
gets a non-default class (or an explicit 'badge badge-neutral' for the
five intentionally-neutral terminal values: Archived, Cancelled,
Dismissed, read, unknown).
- Negative assertions: 'Stale' and 'PendingIssuance' must fall through
to the dictionary default — re-adding either key surfaces here.
- Specific UX-correctness assertions: 'dead' → badge-danger,
'Degraded' → badge-warning.
- Unknown-status fallthrough preserves label text.
Certificate TS trim (web/src/api/types.ts)
- Drop serial_number?, fingerprint_sha256?, key_algorithm?, key_size?,
issued_at? from Certificate. Go's ManagedCertificate has never carried
these — they live on CertificateVersion. Post-trim a cert.X access for
any of the five fields is a TS compile error.
- Leading docblock cross-references the closure rationale and the
latestVersion fallback pattern.
Render-site fix (web/src/pages/CertificateDetailPage.tsx)
- Key Algorithm / Key Size rows now read latestVersion?.key_algorithm /
latestVersion?.key_size, mirroring the existing latestVersion fallback
used a few lines above for serial_number / fingerprint_sha256.
- The same edit also tightened the serial / fingerprint / issued_at
derivations to drop the now-impossible 'cert.X || latestVersion?.X'
cert-side leg (cert.serial_number is a TS error post-trim).
Type-test regression (web/src/api/types.test.ts)
- Certificate literal construction pinned post-trim — adding any of the
five fields back makes the literal an excess-property TS error.
- Sibling CertificateVersion literal pinning the trimmed fields still
live on the version envelope (so the CertificateDetailPage fallback
path can't break).
OpenAPI (api/openapi.yaml)
- ManagedCertificate schema unchanged — was already correct (no phantom
fields). Added a leading comment cross-referencing the D-5 closure for
future readers.
CI guardrail (.github/workflows/ci.yml)
- 'Forbidden StatusBadge dead-key + Certificate phantom-field regression
guard (D-1)'. Two grep blocks: catches Stale/PendingIssuance map
literals in StatusBadge.tsx; uses an awk-scoped window over the
'export interface Certificate {' block in types.ts to catch the five
phantom fields reappearing while explicitly excluding CertificateVersion
(which legitimately carries them). Comments + test files exempt.
Verification
- Backend build/vet/test -short -race all clean across handler/router/
middleware packages.
- Frontend tsc --noEmit clean.
- Vitest 256 → 296 tests (+40: 38 from new StatusBadge test, 2 from D-5
Certificate trim regression in types.test.ts).
- OpenAPI YAML parses (87 paths).
- Both CI guardrail patterns clear on the post-fix tree; both fire
against synthetic regression patterns (re-add Stale → fires; re-add
serial_number? to Certificate → fires).
Out of scope (deferred)
- diff-05x06-* type drifts for Agent/DeploymentTarget/Notification/
DiscoveredCertificate/Issuer TS interfaces. Per-type field-by-field
Go ↔ TS diff is codegen-shaped, not edit-shaped — warrants its own
D-2 master prompt. Noted in CHANGELOG follow-ups section.
|
||
|
|
9834b4e4a4 |
G-1: renewal-policies API + frontend FK-drift fix
Three frontend call sites (OnboardingWizard.tsx:603, CertificatesPage.tsx:52,
CertificateDetailPage.tsx:169) populated the renewal_policy_id dropdown from
getPolicies() — the compliance-rule endpoint returning pol-* IDs — which
violated the FK managed_certificates.renewal_policy_id REFERENCES
renewal_policies(id) ON DELETE RESTRICT. Create would fail pg 23503 at insert.
Backend (new):
- RenewalPolicyRepository CRUD + ListAll/ExistsByID (pg 23503 → ErrRenewalPolicyInUse
→ HTTP 409; pg 23505 → ErrRenewalPolicyDuplicateName → HTTP 409)
- RenewalPolicyService with repo-only constructor. Service sentinels
var-alias the repo sentinels so errors.Is walks across layers.
- RenewalPolicyHandler with validation bounds: name 1–255;
renewal_window_days [1,365] default 30; max_retries [0,10] not defaulted;
retry_interval_seconds [60,86400] default 3600; alert_thresholds_days
[0,365] default [30,14,7,0]. Auto-generated IDs rp-<slug(name)>.
- Router registers 5 routes under /api/v1/renewal-policies[/{id}].
Frontend:
- CertificatesPage/CertificateDetailPage/OnboardingWizard now call
getRenewalPolicies() and render rp-* IDs.
- client.ts adds getRenewalPolicies/createRenewalPolicy/updateRenewalPolicy/
deleteRenewalPolicy. types.ts adds the RenewalPolicy shape.
OpenAPI: RenewalPolicies tag + 5 operations + 3 schemas (RenewalPolicy,
RenewalPolicyCreateRequest, RenewalPolicyUpdateRequest). 409 responses
on create/update duplicate-name and delete FK-in-use.
No migration — renewal_policies table already exists from the initial
schema (000001).
Tests:
- internal/service/renewal_policy_test.go: CRUD + validation + sentinel
error wrapping.
- internal/api/handler/renewal_policy_handler_test.go: handler endpoint
contracts including 400/404/409.
- web/src/api/client.test.ts: 4 subtests covering the 4 new API functions.
Phase 3 gates all green: go vet, build, short tests, race tests (service/
handler/router/scheduler), staticcheck (G-1 packages), govulncheck (0
reachable), coverage (service 69.7%, handler 79.0%, domain 86.9%,
middleware 80.6% — all above thresholds), tsc, vitest (256 passed),
vite build, OpenAPI structural validation.
|
||
|
|
675b87ba63 |
I-005: notification retry loop + dead-letter queue
Critical alerts can no longer be silently dropped by a transient
notifier failure. Failed notification attempts now ride an exponential
backoff retry loop, with a 5-attempt budget before promotion to the
dead-letter queue for operator intervention.
Schema (migration 000016, idempotent):
- retry_count INTEGER NOT NULL DEFAULT 0
- next_retry_at TIMESTAMPTZ
- last_error TEXT
- idx_notification_events_retry_sweep partial index
(next_retry_at) WHERE status='failed' AND next_retry_at IS NOT NULL
Dead rows clear next_retry_at so the index stops matching them.
Service contract:
- NotificationService.RetryFailedNotifications drives 2^n-minute
exponential backoff capped at 1h (notifRetryBackoffCap) with
5-attempt budget (notifRetryMaxAttempts).
- Exhaustion (RetryCount >= notifRetryMaxAttempts-1) promotes to
status='dead' via MarkAsDead.
- Non-terminal failures record via RecordFailedAttempt.
- Success path promotes to 'sent' without touching retry_count
(audit preserves "delivered on attempt N").
- Missing-notifier branch defensively promotes to 'sent' to avoid
wedging a row on a deleted channel.
- RequeueNotification operator escape hatch atomically resets
retry_count -> 0, next_retry_at -> NULL, last_error -> NULL,
status -> pending via notifRepo.Requeue.
Scheduler:
- New always-on notificationRetryLoop wired into the base loop set at
CERTCTL_NOTIFICATION_RETRY_INTERVAL (default 2m).
- sync/atomic.Bool idempotency guard.
- sync.WaitGroup shutdown drain via WaitForCompletion.
StatsService:
- SetNotifRepo setter pattern preserves 9 pre-existing
NewStatsService call sites (main.go + stats_test.go + 8 digest
tests) without touching the constructor signature.
- DashboardSummary.NotificationsDead populated via
notifRepo.CountByStatus(ctx, "dead") — nil-safe when unwired
(reports zero on systems without a notification repository).
- CountByStatus error is non-fatal (dashboard summary is
best-effort for this field).
- Prometheus certctl_notification_dead_total counter emitted from
the same snapshot.
Handler:
- New POST /api/v1/notifications/{id}/requeue endpoint.
- dead status surfaces to MCP + CLI.
Frontend:
- NotificationsPage gains two-tab toolbar ("All" / "Dead letter")
with queryKey: ['notifications', activeTab] so switching tabs
doesn't serve stale data until the 30s refetch.
- Dead rows surface "Retry {n}/5" + truncated last_error with
full-text title tooltip.
- Requeue mutation wrapped as
mutationFn: (id: string) => requeueNotification(id)
to prevent react-query v5's positional context argument from
leaking into the API client — pinned against future refactors
by strict-match toHaveBeenCalledWith('notif-dead-001') in
NotificationsPage.test.tsx:181.
Closes I-005.
|
||
|
|
707d8de4fb |
UX-001: sidebar re-entry + inline team/owner creation in wizard
Closes UX-001 (OnboardingWizard CertificateStep dead-end): users no
longer have to navigate away from the wizard and lose their in-flight
state when the required Owner/Team dropdowns are empty.
Layout.tsx
- Adds persistent 'Setup guide' button in the left sidebar.
- Clears localStorage 'certctl:onboarding-dismissed' then navigates
to /?onboarding=1 as a re-entry signal that overrides dismissal.
- localStorage.removeItem wrapped in try/catch to tolerate storage
access errors (private browsing, quota, etc.).
DashboardPage.tsx
- Reads ?onboarding=1 via useSearchParams as a forceOnboarding flag.
- forceOnboarding bypasses the latched first-run gate so the wizard
reopens even after dismissal or with certs/issuers already present.
- onDismiss now also strips ?onboarding=1 via setSearchParams(next,
{ replace: true }) so a page refresh does not relaunch the wizard.
OnboardingWizard.tsx
- Adds CreateTeamModalInline and CreateOwnerModalInline inside
CertificateStep. Both wire through React Query: createTeam /
createOwner mutation on success invalidates ['teams'] / ['owners']
and calls onCreated(id) so the parent select auto-selects the new
row as soon as the refetch lands.
- '+ New team' and '+ New owner' buttons placed next to the select
labels; empty-state copy replaced with inline 'create one now'
buttons (no more Link back to /owners /teams).
- CreateOwner coerces empty teamId to undefined before mutation so
the server contract matches OwnersPage.
Tests (12 new, all green; total suite 252 passed / 0 failed):
- Layout.test.tsx (4): Setup guide button renders, clicking it clears
the dismissal key and navigates to /?onboarding=1, tolerates
localStorage.removeItem throwing.
- DashboardPage.test.tsx (4): first-run auto-open, ?onboarding=1
re-entry after dismissal, onDismiss writes localStorage + strips
the query param, dismissed-with-no-param stays closed.
- OnboardingWizard.test.tsx (4): Skip-Skip reaches CertificateStep
with '+ New team' / '+ New owner' buttons visible; '+ New team'
happy path with React Query invalidation + parent-select
auto-select via option-parent traversal (label is a sibling, not
htmlFor-linked); '+ New owner' happy path pins team_id: undefined
coercion; Cancel abort never mutates.
Test infrastructure notes:
- Closure-driven vi.fn().mockImplementation pattern drives the
post-invalidation refetch: the mutation mock mutates a closure
variable that the getTeams/getOwners mock reads, so the parent
select's new <option> exists by the time the refetch lands.
- Anchored regex (/^Create Team$/, /^Create Owner$/) disambiguates
the modal submit from the '+ New team' / '+ New owner' triggers.
Verification gates (all green):
- vitest run: 252 passed / 0 failed (8 files, 13.98s)
- tsc --noEmit: 0 errors
- vite build: clean production bundle (851.77 kB js / 226.81 kB gzip)
No new runtime dependencies. Frontend-only change.
|
||
|
|
0725713e19 |
Close I-004 (agent hard-delete cascades targets) coverage-gap finding
Operator decision answered as full soft-delete with optional forced
cascade — hard-delete is not reachable from any public surface. Prior
to this commit, DELETE /agents/{id} ran a plain `DELETE FROM agents`
whose schema-level `ON DELETE CASCADE` on deployment_targets.agent_id
silently wiped every target, orphaning certs and aborting in-flight
jobs. The finding closure reshapes the agent-removal contract around
soft retirement with explicit preflight counts, an opt-in cascade
gated by a mandatory reason, and unconditional protection for the
four reserved sentinel agents used by discovery sources.
Schema — migration 000015:
migrations/000015_agent_retire.up.sql flips
deployment_targets_agent_id_fkey from ON DELETE CASCADE to ON DELETE
RESTRICT, so a stray `DELETE FROM agents` now errors at the DB
boundary instead of quietly destroying targets. Both `agents` and
`deployment_targets` grow a retired_at TIMESTAMPTZ + retired_reason
TEXT pair (TEXT not VARCHAR so operator comments are never
truncated), indexed via partial indexes WHERE retired_at IS NOT
NULL. The migration is self-healing (ADD COLUMN IF NOT EXISTS, DROP
CONSTRAINT IF EXISTS then ADD CONSTRAINT, CREATE INDEX IF NOT
EXISTS) so repeated runs against partially-migrated databases
converge. migrations/000015_agent_retire.down.sql restores CASCADE
and drops the new columns for clean rollback. A dedicated
repository-layer testcontainers test
(internal/repository/postgres/migration_000015_test.go) asserts the
before/after FK action, column presence, index presence, and
round-trip idempotency under up→down→up.
Domain — sentinel guard + dependency counts:
internal/domain/connector.go gains IsRetired() on Agent, the
exported SentinelAgentIDs slice listing server-scanner,
cloud-aws-sm, cloud-azure-kv, cloud-gcp-sm verbatim (matching the
four reserved IDs documented in CLAUDE.md and created at startup in
cmd/server/main.go), IsSentinelAgent(id string) predicate,
AgentDependencyCounts{ActiveTargets, ActiveCertificates,
PendingJobs} with a HasDependencies() method, and ActorTypeAgent /
ActorTypeSystem enum values used by audit emission downstream.
Coverage locked down by internal/domain/connector_test.go.
Service — 8-step ordered contract:
internal/service/agent_retire.go:RetireAgent(ctx, id, actor,
opts{Force, Reason}) enforces a fixed execution order:
(1) sentinel guard — IsSentinelAgent(id) returns ErrAgentIsSentinel
unconditionally; force=true does NOT bypass it.
(2) fetch — ErrAgentNotFound on miss.
(3) idempotency — if IsRetired() already, return
AgentRetirementResult{AlreadyRetired: true} with no new audit
event and no state change (safe to replay from flaky clients).
(4) preflight counts — collectAgentDependencyCounts runs
ActiveTargets, ActiveCertificates, PendingJobs sequentially
(not in parallel; keeps the per-query timeout predictable and
matches the repo's existing call-chain shape).
(5) force-reason guard — opts.Force=true with empty Reason returns
ErrForceReasonRequired (wired into the 400 status surface).
(6) dependency guard — HasDependencies() with opts.Force=false
returns BlockedByDependenciesError{Counts} (wired into the 409
body with per-bucket counts).
(7) mutation — single pinned retiredAt := time.Now(); agent
retirement first, then cascade target retirement if opts.Force,
all under the repo's single transaction so the two retired_at
stamps match to the second.
(8) best-effort audit — agent_retired always; agent_retirement_
cascaded additionally on the force path. Actor is whatever the
handler resolves from the request; actor type is mapped by
resolveActorType (system/agent-prefix→Agent/else→User). Audit
emission failures are logged via slog.Error but do not abort
the retirement (matches the house convention used by every
other scheduler-emitted event).
BlockedByDependenciesError implements Error() as
"active_targets=%d, active_certificates=%d, pending_jobs=%d" and
Unwrap() → ErrBlockedByDependencies. The single struct satisfies
errors.Is via Unwrap (used by scheduler-level tests) and errors.As
via the concrete type (used by the handler to fish out Counts for
the 409 body). ListRetiredAgents(page, perPage) adds a separate
paginated accessor with page<1→1 and perPage<1→50 normalization so
retired rows are queryable without polluting the default agent
listing.
Sentinel guard coverage is asymmetric by design: all four reserved
IDs are protected, and force=true cannot override. Regression tests
in internal/service/agent_retire_test.go assert each of the eight
steps in order, plus sentinel bypass attempts and idempotency
replay.
Handler + router — status-code surface:
internal/api/handler/agents.go:RetireAgent exposes seven status
codes on DELETE /agents/{id}:
200 on a fresh retirement (body echoes AgentRetirementResult).
204 on idempotent replay (AlreadyRetired=true; no new audit).
400 on ErrForceReasonRequired.
403 on ErrAgentIsSentinel.
404 on ErrAgentNotFound.
409 on BlockedByDependenciesError, with a custom body shape
{error, counts{active_targets, active_certificates,
pending_jobs}} that bypasses the default ErrorWithRequestID
envelope so callers get the per-bucket numbers directly.
500 on any other error.
Heartbeat HandleHeartbeat returns 410 Gone when the agent is
retired (ErrAgentRetired), signalling the agent to shut down.
Query params `force=true` and `reason=<text>` drive the cascade
path; both are forwarded as url.Values through the new MCP
transport.
internal/api/router/router.go registers GET /api/v1/agents/retired
literal-path BEFORE /api/v1/agents/{id} — Go 1.22 ServeMux's
literal-beats-pattern-var precedence routes "retired" to the
paginated retired-agents listing instead of fetching a hypothetical
agent named "retired".
Agent binary — clean shutdown on 410:
cmd/agent/main.go gains the ErrAgentRetired sentinel, a
retiredOnce sync.Once, and a retiredSignal chan struct{}. A
markRetired(source, statusCode, body) helper closes the channel
exactly once; the Run() select loop observes the close and returns
ErrAgentRetired; main() matches via errors.Is(err, ErrAgentRetired)
and exits cleanly instead of spinning in the heartbeat retry loop.
The 410 Gone surface is therefore terminal for the agent process.
MCP transport:
internal/mcp/client.go adds Client.DeleteWithQuery(path, query),
a new additive transport method. Client.Delete is path-only; without
this method the retire tool would silently drop `force` and `reason`,
turning every cascade retire into a default soft-retire. The new
method shares do()'s 204 normalization and 4xx/5xx error
propagation so tool authors get one contract.
internal/mcp/tools.go + internal/mcp/types.go expose the
retire_agent tool with Force+Reason inputs wired through
DeleteWithQuery.
CLI:
cmd/cli/main.go + internal/cli/client.go add two CLI surfaces:
`agents list --retired` (client-side strip of --retired then
delegation to ListRetiredAgents, sharing --page/--per-page parsing
with the default listing) and `agents retire <id> [--force --reason
"…"]` (mirrors ErrForceReasonRequired — force without reason is
rejected client-side before the request is sent). JSON + table
output modes both honor the new columns.
Frontend:
web/src/pages/AgentsPage.tsx surfaces retired/retire affordances.
web/src/api/client.ts + web/src/api/types.ts expose the retire
endpoint and the retired-listing. 4 new Vitest regression cases.
OpenAPI:
api/openapi.yaml documents DELETE /agents/{id} with all seven
status codes, 410 on heartbeat, and the 409 per-bucket body shape.
Regression coverage (six new test files, all green):
internal/service/agent_retire_test.go — 8-step contract + sentinel guards
internal/api/handler/agent_retire_handler_test.go — 7-status-code surface + 410 heartbeat
internal/mcp/retire_agent_test.go — DeleteWithQuery wire-through
internal/cli/agent_retire_test.go — --retired listing + --force/--reason pairing
internal/repository/postgres/migration_000015_test.go — FK flip + columns + indexes + up↔down
internal/domain/connector_test.go — IsRetired, IsSentinelAgent, SentinelAgentIDs, HasDependencies
Files:
api/openapi.yaml — DELETE + 410 + 409 body shape
cmd/agent/main.go — ErrAgentRetired, markRetired, retiredSignal
cmd/cli/main.go — handleAgents list/get/retire dispatch
docs/architecture.md, docs/concepts.md,
docs/testing-guide.md — retirement contract narrative
internal/api/handler/agents.go — RetireAgent, status surface, 410 on heartbeat
internal/api/handler/agent_handler_test.go — extended coverage
internal/api/handler/agent_retire_handler_test.go — new
internal/api/router/router.go — /agents/retired before /agents/{id}
internal/cli/agent_retire_test.go — new
internal/cli/client.go — ListRetiredAgents + RetireAgent
internal/domain/connector.go — IsRetired, SentinelAgentIDs,
IsSentinelAgent, AgentDependencyCounts,
ActorTypeAgent/System
internal/domain/connector_test.go — new
internal/integration/lifecycle_test.go — retirement fixture
internal/mcp/client.go — DeleteWithQuery additive transport
internal/mcp/retire_agent_test.go — new
internal/mcp/tools.go, internal/mcp/types.go — retire_agent tool + Force/Reason inputs
internal/repository/interfaces.go — AgentRepository retirement methods
internal/repository/postgres/agent.go — retire + cascade target retire + counts
internal/repository/postgres/migration_000015_test.go — new
internal/service/agent.go — wire into AgentService surface
internal/service/agent_retire.go — new 8-step contract
internal/service/agent_retire_test.go — new
internal/service/deployment.go — skip retired agents
internal/service/target.go — skip retired agents
internal/service/testutil_test.go — shared mocks extended
migrations/000015_agent_retire.up.sql — new
migrations/000015_agent_retire.down.sql — new
web/src/api/client.ts, types.ts + tests — retire endpoint wiring
web/src/pages/AgentsPage.tsx — retire UI
|
||
|
|
91642e2860 |
C-001 scope expansion: tighten parallel POST /api/v1/certificates call sites to six-field contract
Problem: |