mirror of
https://github.com/shankar0123/certctl.git
synced 2026-06-12 14:18:52 +00:00
7c01f811a173d7e696d7b249a189ee0f6af6b44d
3 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
7c01f811a1 |
feat(frontend): Phase 2 TanStack Query Discipline — close TQ-H1/H2 + TQ-M1/M2/M3 + PERF-H1 + P-H1 + partial TQ-L1
Phase 2 of the frontend-design audit: TanStack Query discipline.
Set the cross-cutting QueryClient defaults + staleTime/gcTime tier
model + visibility-aware polling + 4 optimistic-update mutations
before any further per-page work.
New foundation
==============
web/src/api/queryConstants.ts (new)
STALE_TIME = { REAL_TIME: 15s, REFERENCE: 5m, CONSTANT: 1h }
GC_TIME = { HEAVY: 1m, STANDARD: 5m, REFERENCE: 30m }
Doc-comment explains the tier model so every new useQuery picks
a tier rather than a hardcoded ms integer.
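A minimal sketch of what the tier constants might look like. Only the tier names and durations come from this message; the helper names (SECOND_MS etc.) and the exact file layout are assumptions.

```typescript
// Hypothetical reconstruction of web/src/api/queryConstants.ts.
// Tier names and durations are taken from the commit message; the
// *_MS helpers are illustrative only.
const SECOND_MS = 1_000;
const MINUTE_MS = 60 * SECOND_MS;
const HOUR_MS = 60 * MINUTE_MS;

// How long cached data counts as fresh before a refetch is considered.
const STALE_TIME = {
  REAL_TIME: 15 * SECOND_MS, // operator-facing live data
  REFERENCE: 5 * MINUTE_MS, // slow-changing reference data
  CONSTANT: 1 * HOUR_MS, // effectively immutable data
} as const;

// How long an unused cache entry is kept before it is garbage-collected.
const GC_TIME = {
  HEAVY: 1 * MINUTE_MS,
  STANDARD: 5 * MINUTE_MS,
  REFERENCE: 30 * MINUTE_MS,
} as const;
```

A call site would then pick a tier, e.g. `staleTime: STALE_TIME.REAL_TIME`, instead of a raw millisecond literal.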
web/src/main.tsx
QueryClient defaults rewritten:
pre: staleTime: 10_000 + refetchOnWindowFocus: true (refetch
storm on every tab refocus across 242 query sites)
post: staleTime: STALE_TIME.REFERENCE (5min) +
gcTime: GC_TIME.STANDARD (explicit 5min) +
refetchOnWindowFocus: false (per-query opt-in for live-tile
queries)
retry: 1 unchanged per the audit's DO NOT.
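For illustration, the post-change defaults could be expressed as the plain object below. The QueryDefaults interface is a local stand-in for the corresponding TanStack Query v5 option type, and the variable names are assumptions; only the four values come from this message.

```typescript
// Local stand-in for the subset of TanStack Query v5 query defaults that
// this commit touches; the real type ships with @tanstack/react-query.
interface QueryDefaults {
  staleTime: number;
  gcTime: number;
  refetchOnWindowFocus: boolean;
  retry: number;
}

const MINUTE_MS = 60_000;

// Post-Phase-2 defaults as described above (hypothetical variable name).
const queryDefaults: QueryDefaults = {
  staleTime: 5 * MINUTE_MS, // STALE_TIME.REFERENCE
  gcTime: 5 * MINUTE_MS, // GC_TIME.STANDARD, now stated explicitly
  refetchOnWindowFocus: false, // live-tile queries opt back in per query
  retry: 1, // unchanged
};

// In main.tsx this object would presumably be handed over as:
//   new QueryClient({ defaultOptions: { queries: queryDefaults } })
```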
Findings closed by source ID
============================
TQ-H2 (refetch storm)
main.tsx QueryClient defaults — refetchOnWindowFocus: false root +
per-query opt-in. STALE_TIME.REFERENCE 5min for everything else.
TQ-M1 (no gcTime overrides)
main.tsx now sets gcTime: GC_TIME.STANDARD explicitly — the
contract is documented at the root, not implicit-defaulted by
TanStack.
TQ-M2 (12 inconsistent staleTime values)
All 11 hardcoded numeric staleTime overrides migrated to the
STALE_TIME tier constants. useAuthMe.ts (the 12th) already used
its own constant — left alone. Tier mapping:
- operator-facing live data (KeysPage keys, RoleDetail role,
UsersPage, OIDCJWKSStatusPanel, ApprovalsPage):
STALE_TIME.REAL_TIME (15s)
- slow-changing reference data (KeysPage roles, RolesPage,
AuthSettings bootstrap+runtime-config):
STALE_TIME.REFERENCE (5min)
- effectively immutable (RoleDetail permissions catalogue):
STALE_TIME.CONSTANT (1hr)
TQ-H1 (OnboardingWizard infinite 5s poll)
OnboardingWizard.tsx:288-302 — refetchInterval rewritten to v5
functional form:
refetchInterval: (query) =>
(query.state.data?.data?.length ?? 0) > 0 ? false : 5_000;
As soon as the first agent registers, the interval flips to false
and the poll stops. Also explicit: refetchOnWindowFocus: true +
staleTime: STALE_TIME.REAL_TIME (because this IS a live-tile poll
during the wizard).
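The stop condition can be exercised in isolation. The sketch below lifts the callback into a named function; AgentListQuery and agentPollInterval are hypothetical names, and the response envelope shape is inferred from the `query.state.data?.data` access above.

```typescript
// Minimal structural type for the part of the v5 Query object the callback
// reads. The { data: [...] } envelope is inferred from the snippet above.
interface AgentListQuery {
  state: { data?: { data?: unknown[] } };
}

// Returns the next poll delay in ms, or false to stop polling.
function agentPollInterval(query: AgentListQuery): number | false {
  const registered = query.state.data?.data?.length ?? 0;
  return registered > 0 ? false : 5_000;
}

// No response yet, or an empty agent list: keep polling every 5s.
const beforeFirstAgent = agentPollInterval({ state: {} });
const emptyList = agentPollInterval({ state: { data: { data: [] } } });
// At least one agent registered: polling stops.
const afterFirstAgent = agentPollInterval({ state: { data: { data: [{}] } } });
```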
PERF-H1 (Dashboard polling storm)
DashboardPage.tsx
- jobs poll bumped 10s → 30s (10s granularity isn't needed when
30s is already inside the human-attention window; the
CertificateDetail page is where 10s polling lives)
- visibility-listener pauses ALL Dashboard polls when
document.visibilityState === 'hidden'; on visibility return,
immediately invalidates the 4 live-tile queries (health,
dashboard-summary, jobs, certs-by-status) so the operator
sees fresh data instantly rather than waiting one tick.
- The 4 live-tile queries (health, dashboard-summary, jobs,
certs-by-status) opt into refetchOnWindowFocus: true +
staleTime: STALE_TIME.REAL_TIME explicitly.
- Backend aggregation gap (dashboard-summary + certs-by-status
+ certificates could collapse into 1 endpoint) tracked
separately — Phase 3 backend follow-up.
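The visibility gate reduces to two small decisions, sketched below as pure functions. The function names and the exact query-key shapes are assumptions; the four live-tile names and the pause/invalidate behaviour come from the bullets above.

```typescript
type Visibility = "visible" | "hidden";

// Polling is paused (interval = false) while the tab is hidden.
function gatedInterval(visibility: Visibility, intervalMs: number): number | false {
  return visibility === "hidden" ? false : intervalMs;
}

// Key shapes are illustrative; only the four names come from the commit.
const LIVE_TILE_KEYS: string[][] = [
  ["health"],
  ["dashboard-summary"],
  ["jobs"],
  ["certs-by-status"],
];

// On a hidden -> visible transition, the live tiles are invalidated at once
// so the operator does not wait for the next tick.
function keysToInvalidate(previous: Visibility, current: Visibility): string[][] {
  return previous === "hidden" && current === "visible" ? LIVE_TILE_KEYS : [];
}
```

In the page component, the current visibility would presumably come from a `visibilitychange` listener on `document`, with `gatedInterval` feeding each query's `refetchInterval`.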
P-H1 (CertificatesPage 4 duplicate-key pairs)
Pre-Phase-2, four pairs of distinct cache slots fetched the same data:
['profiles'] vs ['profiles-filter']
['issuers'] vs ['issuers-filter']
['owners', 'form'] vs ['owners-filter']
['teams', 'form'] vs ['teams-filter']
Post-Phase-2 all four pairs collapse to a single parameterized
queryKey shape: `[name, { per_page: 100 }]`. TanStack v5 dedupes
on serialized queryKey — the modal + filter now share one cache
slot per resource. 8 useQuery sites → 4 cache slots; backend
hits halved on first paint of CertificatesPage.
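Why identical key shapes share a slot can be shown with a simplified key hash. This is an approximation written for illustration (JSON with object keys sorted), not the library's own code; resourceKey and hashKey here are local, hypothetical helpers.

```typescript
type QueryKey = readonly unknown[];

function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Simplified stand-in for query-key hashing: serialize to JSON with object
// keys sorted, so property order inside the params object does not matter.
function hashKey(key: QueryKey): string {
  return JSON.stringify(key, (_key, value) =>
    isPlainObject(value)
      ? Object.keys(value)
          .sort()
          .reduce<Record<string, unknown>>((acc, k) => {
            acc[k] = value[k];
            return acc;
          }, {})
      : value,
  );
}

// The shared key shape used by both the modal and the filter.
function resourceKey(name: string): QueryKey {
  return [name, { per_page: 100 }];
}

// Modal and filter now hash to the same slot.
const sameSlot = hashKey(resourceKey("profiles")) === hashKey(resourceKey("profiles"));
// The old key pair hashed to two different slots.
const oldPairDiffers = hashKey(["profiles"]) !== hashKey(["profiles-filter"]);
```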
TQ-M3 (4 of 5 priority optimistic-update mutations)
Wired onMutate / onError-rollback / onSettled-invalidation on:
1. mark-notification-read (NotificationsPage)
— flips row status to 'read' in both ['notifications','all']
+ ['notifications','dead'] cache slots
2. claim-discovered-cert (DiscoveryPage)
— flips status to 'Managed' in ['discovered-certificates']
3. dismiss-discovery (DiscoveryPage)
— flips status to 'Dismissed' in same cache slot
4. archive-certificate (CertificateDetailPage)
— flips status to 'Archived' in ['certificate', id]; on
success navigates to /certificates (optimistic data
doesn't linger); on error restores snapshot + toasts
All four fire the Phase 1 Sonner toast on success/failure.
The 5th priority site (role-assignment toggle in
auth/RoleDetailPage) uses raw async/await handlers rather than
useTrackedMutation — converting it requires a structural
refactor outside Phase 2's TQ-focus; tracked as Phase 2 follow-up.
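The snapshot / optimistic-write / rollback sequence for the mark-notification-read case can be sketched against a plain Map standing in for the query cache. All names below are hypothetical, and the real handlers go through useTrackedMutation and the TanStack cache rather than this Map.

```typescript
type Status = "unread" | "read";
interface Notification {
  id: string;
  status: Status;
}
interface Snapshot {
  key: string;
  rows: Notification[] | undefined;
}

// Stand-in for the query cache: one slot per serialized query key.
const cache = new Map<string, Notification[]>();
const ALL_KEY = JSON.stringify(["notifications", "all"]);
const DEAD_KEY = JSON.stringify(["notifications", "dead"]);

// onMutate step: snapshot every affected slot, then flip the row to 'read'.
function applyOptimisticRead(id: string, keys: string[]): Snapshot[] {
  const snapshots = keys.map((key) => ({ key, rows: cache.get(key) }));
  for (const { key, rows } of snapshots) {
    if (rows) {
      cache.set(
        key,
        rows.map((n): Notification => (n.id === id ? { ...n, status: "read" } : n)),
      );
    }
  }
  return snapshots; // becomes the context handed to onError
}

// onError step: restore each slot to its pre-mutation snapshot.
function rollback(snapshots: Snapshot[]): void {
  for (const { key, rows } of snapshots) {
    if (rows) cache.set(key, rows);
    else cache.delete(key);
  }
}
```

The onSettled step would then invalidate the same keys so the server state replaces the optimistic copy whether the mutation succeeded or failed.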
TQ-L1 (useTrackedMutation extended tests)
useTrackedMutation.test.tsx grew from 3 tests to 8:
+ passes onMutate through and runs it before mutationFn
+ passes onError through with the onMutate context (rollback
path — pins the 3rd-arg snapshot semantics)
+ does NOT invalidate on error (only on success)
+ passes onSettled through (fires after both success + error)
+ parity with raw useMutation when no extra options given
Verification
============
$ grep -E "refetchOnWindowFocus: false" web/src/main.tsx
89: refetchOnWindowFocus: false, // per-query opt-in
$ grep -E "STALE_TIME\.REFERENCE" web/src/main.tsx
86: staleTime: STALE_TIME.REFERENCE, // 5 min
$ grep -cE "useQuery.*\['profiles" web/src/pages/CertificatesPage.tsx
2 (was 6 pre-Phase-2 — '[profiles]' modal + '[profiles-filter]'
+ '[profiles]' top-of-page; now both refer to the same
parameterized key '[profiles, { per_page: 100 }]')
$ grep -rE "onMutate" web/src --include='*.tsx' --exclude='*.test.*' | wc -l
5 (≥ 4 priority sites; the 5th is the optional onMutate in
queryConstants test wiring)
$ grep -rE "STALE_TIME\." web/src --include='*.tsx' --include='*.ts' \
--exclude='*.test.*' | wc -l
18 (queryConstants.ts + main.tsx + 11 migrated callsites
+ OnboardingWizard + DashboardPage)
$ npx tsc --noEmit
(exit 0)
$ npx vitest run [13 affected test files]
Test Files 13 passed (13)
Tests 100 passed (100)
$ npx vite build
✓ built in 2.49s
dist/assets/index-yg3cYtYA.js 1,113 kB
(+3 kB vs Phase 1 — queryConstants + optimistic-update wrappers)
Audit-accuracy callouts
=======================
* The audit claimed 10 useQuery on Dashboard; live count is 9 (one
issuers query has no interval). All 8 polling queries now gated
behind visibility-listener; the 9th (issuers) is non-polling and
not affected.
* TQ-L1 originally specified 4 test extensions; shipped 5
(onMutate ordering, onError-with-context, no-invalidate-on-error,
onSettled pass-through, parity-with-raw-useMutation).
* Optimistic-update 5th-site (role-assignment toggle in
auth/RoleDetailPage) deferred — RoleDetailPage handlers use raw
async/await instead of useTrackedMutation. Refactoring it adds
one more optimistic path but requires a structural change
outside Phase 2's TQ-discipline scope. Tracked as Phase 2
follow-up.
Residual risks
==============
* The Dashboard visibility-listener gate may need per-page opt-in
if a page genuinely needs to keep polling while hidden (e.g.
a background-tab monitor). Not aware of any such case today;
if needed, the gate is a simple `useState`-driven hook
extracted to web/src/hooks/useTabVisibility.ts.
* The Dashboard backend-aggregation collapse
(dashboard-summary + certs-by-status + certificates → one
endpoint) is documented as a Phase-3 backend item.
* The 4 collapsed CertificatesPage pairs now request per_page=100
everywhere. An operator with more than 100 issuers/owners/profiles/teams
will see a truncated dropdown. That is a separate concern that belongs
to the Phase-1 Combobox migration; the right fix when it lands is to
move issuer/owner/profile selectors to Combobox with
server-side typeahead.
* The 12-second total Bundle-1 audit of all useQuery sites
still leaves ~230 queries running with the new 5-min
REFERENCE default. The default is generous; aggressively-
fresh per-page queries that genuinely need 15s freshness
must opt in (the audit page, the agent-fleet live counter,
in-flight scan progress).
|
||
|
|
78485f7429 |
fix(auth/users): close MED-11 lying field — DeactivatedAt loaded + enforced on login (A-2)
The MED-11 closure shipped users.deactivated_at + DELETE /api/v1/auth/users/{id}
+ cascade-revoke, but the federated-user soft-delete was reversible: the next
OIDC login under the same (provider, subject) tuple re-minted a session and
re-elevated the user.
Three legs of the chain were severed (each independently CRIT-shaped):
Leg A — postgres/user.go::userColumns omitted `deactivated_at`, so scanUser
never populated User.DeactivatedAt. Every Get / GetByOIDCSubject /
ListAll returned DeactivatedAt = nil regardless of the column value.
Leg B — postgres/user.go::Update SQL omitted `deactivated_at = $X`, so the
handler's `u.DeactivatedAt = now()` mutation was a no-op write at
the SQL level. Even with leg A closed, no row ever flipped.
Leg C — oidc/service.go::upsertUser did not inspect DeactivatedAt on the
existing-user path. Even with legs A + B closed, the OIDC login
would still proceed normally.
The cascade-session-revoke half of the original closure remained correct, but
only for the duration of the user's current cookie. SOC 2 CC6.3 + ISO 27001
A.9.2.6 "user access removal" controls require both immediate revoke AND
persistent block — this fix restores the persistent-block leg.
Closure across layers:
internal/repository/postgres/user.go
- userColumns adds `deactivated_at`
- scanUser reads via sql.NullTime intermediate (column is nullable)
- Create writes deactivated_at explicitly (NULL for new active users;
forward-compat for future seed-data flows that pre-populate the column)
- Update writes deactivated_at on every call; nil DeactivatedAt → NULL
(supports reactivation)
internal/auth/oidc/service.go
- New sentinel ErrUserDeactivated
- upsertUser checks existing.DeactivatedAt != nil BEFORE mutating email /
display_name / last_login_at — preserves last_login_at forensics on
rejected login attempts (defense-in-depth pin against future
"performance optimization" that reorders the gate)
internal/api/handler/auth_session_oidc.go
- classifyOIDCFailure adds typed errors.Is dispatch for ErrUserDeactivated
→ audit category "user_deactivated" (SOC/SIEM observability surface)
internal/api/handler/auth_users.go
- Self-deactivate guard on Deactivate: HTTP 409 + audit row
auth.user_deactivate_self_rejected when caller targets own User row.
Prevents an admin from one-way-door locking themselves out via the
standard handler; break-glass remains the recovery path.
- New Reactivate handler: inverse of Deactivate. Clears DeactivatedAt
via Update; emits auth.user_reactivated audit row. Idempotent on
already-active rows. Sessions revoked at deactivation stay revoked
(cascade irreversible by design — user must complete fresh OIDC
login).
internal/api/router/router.go
- POST /api/v1/auth/users/{id}/reactivate wired with auth.user.deactivate
gate (reactivation is the inverse op, not a separate privilege)
web/src/api/client.ts + web/src/pages/auth/UsersPage.tsx
- authReactivateUser() client function
- Reactivate button on deactivated rows in UsersPage
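As a rough illustration of the request the new client function issues, the sketch below builds the method and URL only. The route comes from the router entry above; the RequestSpec type, the function name, and the URL-encoding of the ID are assumptions, and the real authReactivateUser() signature is not shown in this message.

```typescript
// Hypothetical request descriptor; the real client presumably wraps fetch.
interface RequestSpec {
  method: "POST";
  url: string;
}

// Builds the reactivate call for a given user ID. The ID is URL-encoded so
// that an unusual ID cannot alter the path (an assumption, not confirmed).
function reactivateUserRequest(userId: string): RequestSpec {
  return {
    method: "POST",
    url: `/api/v1/auth/users/${encodeURIComponent(userId)}/reactivate`,
  };
}

const example = reactivateUserRequest("u-123");
```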
Regression coverage:
Postgres (testcontainers, skipped under -short):
TestUserRepository_DeactivatedAt_RoundTrip — Create → set DeactivatedAt
→ Update → Get / GetByOIDCSubject / ListAll round-trip the value
TestUserRepository_DeactivatedAt_CreateWritesNullForActive — new active
user reads back DeactivatedAt = nil
TestUserRepository_DeactivatedAt_CreatePersistsPreDeactivated — Create
with non-nil DeactivatedAt round-trips (forward-compat path)
OIDC service:
TestService_HandleCallback_RejectsDeactivatedUser — errors.Is
ErrUserDeactivated; CallbackResult nil; persisted email / last_login_at
/ deactivated_at NOT mutated by the rejected attempt
TestService_HandleCallback_AllowsReactivatedUser — DeactivatedAt = nil
→ happy path resumes
TestService_HandleCallback_DeactivatedUserPreservesForensics —
defense-in-depth pin against future regressions that reorder the
gate-vs-mutation sequence
Classifier:
TestClassifyOIDCFailure extended — typed dispatch + wrapped variant
round-trip through errors.Is
Handler:
TestAuthUsers_Deactivate_RejectsSelfDeactivate — HTTP 409 + audit
row + cascade-revoke NOT fired + row stays active
TestAuthUsers_Deactivate_OtherUser_HappyPath — HTTP 204 + cascade
fires + row soft-deleted
TestAuthUsers_Reactivate_HappyPath / _IdempotentOnActiveUser /
_UnknownID / _MissingID / _UpdateError
Phase 6 verify gate green on the targeted packages: gofmt clean, go vet
clean, go test -short pass across internal/auth/oidc, internal/api/handler,
internal/api/router, internal/repository/postgres, internal/auth/...,
internal/service/..., internal/tlsprobe/..., internal/trustanchor/...,
internal/validation/...
Spec at cowork/auth-bundles-fixes-2026-05-11/02-crit-deactivated-at-enforcement.md
Closure annotation at cowork/auth-bundles-audit-2026-05-10.md MED-11 row.
Operator advisory in CHANGELOG.md v2.1.0 release notes.
|
||
|
|
191384c1d2 |
feat(gui): auth GUI batch — MED-4/7/8/10/11/12 + LOW-1/11/12 + HIGH-10 GUI half
Audit 2026-05-10 GUI batch closure.
WHAT. Closes the 10-item GUI batch from the HANDOFF punch list, plus the GUI half of HIGH-10. Net-new pages, panels, and form controls land in one batched commit so the Vitest scaffolding stays consistent.
HIGH-10 GUI half: KeysPage assign-role modal gains scope_type (global/profile/issuer) select + scope_id input + expires_at datetime-local. Validates that scope_id is required when type != global. Threads through the api/client.ts AssignKeyRoleOptions extension that was prepared on the backend side in |