certctl/web/src/pages/NotificationsPage.tsx
shankar0123 7c01f811a1 feat(frontend): Phase 2 TanStack Query Discipline — close TQ-H1/H2 + TQ-M1/M2/M3 + PERF-H1 + P-H1 + partial TQ-L1
Phase 2 of the frontend-design audit: TanStack Query discipline.
Set the cross-cutting QueryClient defaults + staleTime/gcTime tier
model + visibility-aware polling + 4 optimistic-update mutations
before any further per-page work.

New foundation
==============

  web/src/api/queryConstants.ts (new)
    STALE_TIME = { REAL_TIME: 15s, REFERENCE: 5m, CONSTANT: 1h }
    GC_TIME    = { HEAVY: 1m,     STANDARD: 5m,   REFERENCE: 30m }
    Doc-comment explains the tier model so every new useQuery picks
    a tier rather than a hardcoded ms integer.
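
  The tier model above could look roughly like the following. This is a
  sketch rather than the shipped file: the tier names and durations are
  taken from this message, while the SECOND/MINUTE/HOUR helpers and the
  StaleTier type are assumptions added for readability.

```typescript
// Hypothetical reconstruction of web/src/api/queryConstants.ts. Only the
// tier names and their durations come from the commit message.
const SECOND = 1_000;
const MINUTE = 60 * SECOND;
const HOUR = 60 * MINUTE;

// How long fetched data counts as fresh before a refetch is allowed.
export const STALE_TIME = {
  REAL_TIME: 15 * SECOND, // operator-facing live data
  REFERENCE: 5 * MINUTE,  // slow-changing reference data
  CONSTANT: HOUR,         // effectively immutable catalogues
} as const;

// How long an unused cache entry is kept before garbage collection.
export const GC_TIME = {
  HEAVY: MINUTE,
  STANDARD: 5 * MINUTE,
  REFERENCE: 30 * MINUTE,
} as const;

// Assumed helper type: lets a call site name a tier instead of a number.
export type StaleTier = keyof typeof STALE_TIME;
```

  A call site would then write staleTime: STALE_TIME.REAL_TIME instead of
  a literal such as 15000.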

  web/src/main.tsx
    QueryClient defaults rewritten:
      pre:  staleTime: 10_000 + refetchOnWindowFocus: true (refetch
            storm on every tab refocus across 242 query sites)
      post: staleTime: STALE_TIME.REFERENCE (5min) + gcTime: GC_TIME
            .STANDARD (explicit 5min) + refetchOnWindowFocus: false
            (per-query opt-in for live-tile queries)
    retry: 1 unchanged per the audit's DO NOT.
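
  As an illustration of the "post" state, the defaults can be written as
  a plain object. The sketch restates the tier values locally so it runs
  on its own; the variable names are assumptions, and the real main.tsx
  would pass the object to new QueryClient({ defaultOptions }).

```typescript
// Tier values restated locally so the sketch is self-contained.
const STALE_TIME = { REAL_TIME: 15_000, REFERENCE: 300_000, CONSTANT: 3_600_000 } as const;
const GC_TIME = { HEAVY: 60_000, STANDARD: 300_000, REFERENCE: 1_800_000 } as const;

// Mirrors the `queries` section of TanStack Query v5 default options.
const queryDefaults = {
  staleTime: STALE_TIME.REFERENCE, // 5 min instead of the old 10 s
  gcTime: GC_TIME.STANDARD,        // explicit 5 min, no longer implicit
  refetchOnWindowFocus: false,     // live-tile queries opt back in per query
  retry: 1,                        // unchanged
};

// Assumed name: this is the object a QueryClient constructor would receive.
const defaultOptions = { queries: queryDefaults };
```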

Findings closed by source ID
============================

TQ-H2 (refetch storm)
  main.tsx QueryClient defaults — refetchOnWindowFocus: false root +
  per-query opt-in. STALE_TIME.REFERENCE 5min for everything else.

TQ-M1 (no gcTime overrides)
  main.tsx now sets gcTime: GC_TIME.STANDARD explicitly — the
  contract is documented at the root, not implicit-defaulted by
  TanStack.

TQ-M2 (12 inconsistent staleTime values)
  All 11 hardcoded numeric staleTime overrides migrated to the
  STALE_TIME tier constants. useAuthMe.ts (the 12th) already used
  its own constant — left alone. Tier mapping:
    - operator-facing live data (KeysPage keys, RoleDetail role,
      UsersPage, OIDCJWKSStatusPanel, ApprovalsPage):
        STALE_TIME.REAL_TIME (15s)
    - slow-changing reference data (KeysPage roles, RolesPage,
      AuthSettings bootstrap+runtime-config):
        STALE_TIME.REFERENCE (5min)
    - effectively immutable (RoleDetail permissions catalogue):
        STALE_TIME.CONSTANT (1hr)

TQ-H1 (OnboardingWizard infinite 5s poll)
  OnboardingWizard.tsx:288-302 — refetchInterval rewritten to v5
  functional form:
    refetchInterval: (query) =>
      (query.state.data?.data?.length ?? 0) > 0 ? false : 5_000;
  As soon as the first agent registers, the interval flips to false
  and the poll stops. Also explicit: refetchOnWindowFocus: true +
  staleTime: STALE_TIME.REAL_TIME (because this IS a live-tile poll
  during the wizard).
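
  The stop condition can be checked in isolation by pulling the callback
  out as a pure function. The interface names and the function name below
  are assumptions for illustration; only the expression itself comes from
  the snippet above.

```typescript
// Minimal structural stand-ins for the parts of the query object that the
// callback reads. These types are assumptions, not TanStack's own types.
interface AgentPage {
  data?: unknown[];
}
interface QueryLike {
  state: { data?: AgentPage };
}

// Returns the next poll delay in ms, or false to stop polling.
function agentPollInterval(query: QueryLike): number | false {
  const registered = query.state.data?.data?.length ?? 0;
  return registered > 0 ? false : 5_000;
}

// No response yet, or an empty agent list: keep polling every 5 s.
const beforeFirstFetch = agentPollInterval({ state: {} });
const noAgentsYet = agentPollInterval({ state: { data: { data: [] } } });
// First agent registered: polling stops.
const firstAgent = agentPollInterval({ state: { data: { data: [{}] } } });
```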

PERF-H1 (Dashboard polling storm)
  DashboardPage.tsx
    - jobs poll bumped 10s → 30s (10s granularity isn't needed when
      30s is already inside the human-attention window; the
      CertificateDetail page is where 10s polling lives)
    - visibility-listener pauses ALL Dashboard polls when
      document.visibilityState === 'hidden'; on visibility return,
      immediately invalidates the 4 live-tile queries (health,
      dashboard-summary, jobs, certs-by-status) so the operator
      sees fresh data instantly rather than waiting one tick.
    - The 4 live-tile queries (health, dashboard-summary, jobs,
      certs-by-status) opt into refetchOnWindowFocus: true +
      staleTime: STALE_TIME.REAL_TIME explicitly.
    - Backend aggregation gap (dashboard-summary + certs-by-status
      + certificates could collapse into 1 endpoint) tracked
      separately — Phase 3 backend follow-up.
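
  The gating logic described above reduces to two small decisions, shown
  here as pure functions. This is a sketch under assumptions: the function
  names are invented, and the query keys are guessed from the tile names
  listed above rather than copied from DashboardPage.tsx.

```typescript
type Visibility = 'visible' | 'hidden';

// Assumed key shapes for the 4 live-tile queries named in this message.
const LIVE_TILE_KEYS: string[][] = [
  ['health'],
  ['dashboard-summary'],
  ['jobs'],
  ['certs-by-status'],
];

// While the tab is hidden every poll is paused (false disables the
// interval); otherwise the query keeps its base interval.
function gatedInterval(visibility: Visibility, baseMs: number): number | false {
  return visibility === 'hidden' ? false : baseMs;
}

// On a hidden -> visible transition, return the keys to invalidate so the
// operator sees fresh data without waiting for the next tick.
function keysToInvalidate(prev: Visibility, next: Visibility): string[][] {
  return prev === 'hidden' && next === 'visible' ? LIVE_TILE_KEYS : [];
}
```

  In a component, the visibility value would come from a
  visibilitychange listener on document, and each returned key would be
  passed to queryClient.invalidateQueries({ queryKey }).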

P-H1 (CertificatesPage 4 duplicate-key pairs)
  Pre-Phase-2, four pairs of distinct cache slots fetched the same data:
    ['profiles']        vs ['profiles-filter']
    ['issuers']         vs ['issuers-filter']
    ['owners', 'form']  vs ['owners-filter']
    ['teams', 'form']   vs ['teams-filter']
  Post-Phase-2 all four pairs collapse to a single parameterized
  queryKey shape: `[name, { per_page: 100 }]`. TanStack v5 dedupes
  on serialized queryKey — the modal + filter now share one cache
  slot per resource. 8 useQuery sites → 4 cache slots; backend
  hits halved on first paint of CertificatesPage.
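
  Why the collapse works can be seen with a simplified key hash. The
  function below is a stand-in written for this example, not TanStack's
  own code: it serializes a key with object properties sorted, which is
  enough to show that two structurally equal keys land in one slot while
  the old name pairs did not.

```typescript
// Simplified stand-in for query-key hashing: plain-object keys are sorted
// before serialization so property order does not affect the result.
function hashKey(key: readonly unknown[]): string {
  return JSON.stringify(key, (_name, value: unknown) => {
    if (value === null || typeof value !== 'object' || Array.isArray(value)) {
      return value;
    }
    const source = value as Record<string, unknown>;
    const sorted: Record<string, unknown> = {};
    for (const k of Object.keys(source).sort()) sorted[k] = source[k];
    return sorted;
  });
}

// Post-Phase-2: modal and filter build the same key shape independently.
const modalKey = ['profiles', { per_page: 100 }];
const filterKey = ['profiles', { per_page: 100 }];
const sameSlot = hashKey(modalKey) === hashKey(filterKey); // true

// Pre-Phase-2: different names meant different slots for the same data.
const legacySplit = hashKey(['profiles']) !== hashKey(['profiles-filter']); // true
```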

TQ-M3 (4 of 5 priority optimistic-update mutations)
  Wired onMutate / onError-rollback / onSettled-invalidation on:
    1. mark-notification-read (NotificationsPage)
       — flips row status to 'read' in both ['notifications','all']
         + ['notifications','dead'] cache slots
    2. claim-discovered-cert (DiscoveryPage)
       — flips status to 'Managed' in ['discovered-certificates']
    3. dismiss-discovery (DiscoveryPage)
       — flips status to 'Dismissed' in same cache slot
    4. archive-certificate (CertificateDetailPage)
       — flips status to 'Archived' in ['certificate', id]; on
         success navigates to /certificates (optimistic data
         doesn't linger); on error restores snapshot + toasts
  All four fire the Phase 1 Sonner toast on success/failure.
  The 5th priority site (role-assignment toggle in
  auth/RoleDetailPage) uses raw async/await handlers rather than
  useTrackedMutation — converting it requires a structural
  refactor outside Phase 2's TQ-focus; tracked as Phase 2 follow-up.
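
  The snapshot, flip, and rollback steps shared by the four sites can be
  sketched without React or TanStack by using a Map as the cache. All
  names here (Row, Page, flipStatus, the onMutate/onError helpers) are
  assumptions for illustration; the real handlers call
  queryClient.getQueryData and queryClient.setQueryData instead.

```typescript
interface Row {
  id: string;
  status: string;
}
interface Page {
  data: Row[];
  total: number;
}

// Returns a new page with one row's status changed; the input is untouched
// so a previously taken snapshot stays valid for rollback.
function flipStatus(page: Page | undefined, id: string, status: string): Page | undefined {
  return page
    ? { ...page, data: page.data.map((r) => (r.id === id ? { ...r, status } : r)) }
    : page;
}

// onMutate step: snapshot the slot, write the optimistic value, and hand
// the snapshot back as the mutation context.
function onMutate(cache: Map<string, Page>, key: string, id: string, status: string): Page | undefined {
  const snapshot = cache.get(key);
  const next = flipStatus(snapshot, id, status);
  if (next) cache.set(key, next);
  return snapshot;
}

// onError step: restore the snapshot if there was one.
function onError(cache: Map<string, Page>, key: string, snapshot: Page | undefined): void {
  if (snapshot) cache.set(key, snapshot);
}

const cache = new Map<string, Page>([
  ['notifications/all', { data: [{ id: 'n1', status: 'pending' }], total: 1 }],
]);
const snapshot = onMutate(cache, 'notifications/all', 'n1', 'read');
const optimisticStatus = cache.get('notifications/all')?.data[0].status;
onError(cache, 'notifications/all', snapshot);
const rolledBackStatus = cache.get('notifications/all')?.data[0].status;
```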

TQ-L1 (useTrackedMutation extended tests)
  useTrackedMutation.test.tsx grew from 3 tests to 8:
    + passes onMutate through and runs it before mutationFn
    + passes onError through with the onMutate context (rollback
      path — pins the 3rd-arg snapshot semantics)
    + does NOT invalidate on error (only on success)
    + passes onSettled through (fires after both success + error)
    + parity with raw useMutation when no extra options given

Verification
============

  $ grep -E "refetchOnWindowFocus: false" web/src/main.tsx
    89:      refetchOnWindowFocus: false,        // per-query opt-in

  $ grep -E "STALE_TIME\.REFERENCE" web/src/main.tsx
    86:      staleTime: STALE_TIME.REFERENCE,    // 5 min

  $ grep -cE "useQuery.*\['profiles" web/src/pages/CertificatesPage.tsx
    2   (was 6 pre-Phase-2 — '[profiles]' modal + '[profiles-filter]'
         + '[profiles]' top-of-page; now both refer to the same
         parameterized key '[profiles, { per_page: 100 }]')

  $ grep -rE "onMutate" web/src --include='*.tsx' --exclude='*.test.*' | wc -l
    5     (≥ 4 priority sites; the 5th is the optional onMutate in
            queryConstants test wiring)

  $ grep -rE "STALE_TIME\." web/src --include='*.tsx' --include='*.ts' \
       --exclude='*.test.*' | wc -l
    18    (queryConstants.ts + main.tsx + 11 migrated callsites
            + OnboardingWizard + DashboardPage)

  $ npx tsc --noEmit
    (exit 0)

  $ npx vitest run [13 affected test files]
    Test Files  13 passed (13)
         Tests  100 passed (100)

  $ npx vite build
    ✓ built in 2.49s
    dist/assets/index-yg3cYtYA.js  1,113 kB
    (+3 kB vs Phase 1 — queryConstants + optimistic-update wrappers)

Audit-accuracy callouts
=======================

  * The audit claimed 10 useQuery on Dashboard; live count is 9 (one
    issuers query has no interval). All 8 polling queries now gated
    behind visibility-listener; the 9th (issuers) is non-polling and
    not affected.
  * TQ-L1 originally specified 4 test extensions; shipped 5
    (onMutate ordering, onError-with-context, no-invalidate-on-error,
    onSettled pass-through, parity-with-raw-useMutation).
  * Optimistic-update 5th-site (role-assignment toggle in
    auth/RoleDetailPage) deferred — RoleDetailPage handlers use raw
    async/await instead of useTrackedMutation. Refactoring it adds
    one more optimistic path but requires a structural change
    outside Phase 2's TQ-discipline scope. Tracked as Phase 2
    follow-up.

Residual risks
==============

  * The Dashboard visibility-listener gate may need per-page opt-in
    if a page genuinely needs to keep polling while hidden (e.g.
    a background-tab monitor). Not aware of any such case today;
    if needed, the gate is a simple `useState`-driven hook
    extracted to web/src/hooks/useTabVisibility.ts.
  * The Dashboard backend-aggregation collapse
    (dashboard-summary + certs-by-status + certificates → one
    endpoint) is documented as a Phase-3 backend item.
  * The 4 collapsed CertificatesPage pairs now request per_page=100
    everywhere. An operator with more than 100 issuers, owners,
    profiles, or teams will see a truncated dropdown. That is a
    separate concern belonging to the Phase 1 Combobox migration;
    the right fix when it lands is to move the issuer/owner/profile
    selectors to Combobox with server-side typeahead.
  * The 12-second total Bundle-1 audit of all useQuery sites
    still leaves ~230 queries running with the new 5-min
    REFERENCE default. The default is generous; aggressively-
    fresh per-page queries that genuinely need 15s freshness
    must opt in (the audit page, the agent-fleet live counter,
    in-flight scan progress).
2026-05-14 14:51:49 +00:00

321 lines
14 KiB
TypeScript

import { useState, useMemo } from 'react';
import { useQuery, useQueryClient } from '@tanstack/react-query';
import { toast } from 'sonner';
import { useTrackedMutation } from '../hooks/useTrackedMutation';
import { getNotifications, markNotificationRead, requeueNotification } from '../api/client';
import PageHeader from '../components/PageHeader';
import StatusBadge from '../components/StatusBadge';
import ErrorState from '../components/ErrorState';
import { timeAgo } from '../api/utils';
import type { Notification } from '../api/types';
// Phase 2 TQ-M3 closure: optimistic-update context shape. onMutate
// snapshots both ['notifications', 'all'] and ['notifications', 'dead'];
// onError uses the snapshot to roll back. Invalidation is handled by
// useTrackedMutation's `invalidates` option, not by an onSettled here.
interface NotifSnapshot {
  prevAll?: { data: Notification[]; total: number } | undefined;
  prevDead?: { data: Notification[]; total: number } | undefined;
}
type ViewMode = 'list' | 'grouped';
// I-005: the Notifications page now hosts two tabs. "all" is the pre-I-005
// inbox behavior — no server-side status filter, client-side type/status
// dropdowns untouched. "dead" routes the query through the new ?status=dead
// handler branch so operators can triage the dead-letter queue in isolation.
// The tab is intentionally a separate state axis from the status dropdown so
// the two don't fight each other (dropdown filters within the tab's scope).
type ActiveTab = 'all' | 'dead';
export default function NotificationsPage() {
  const [viewMode, setViewMode] = useState<ViewMode>('grouped');
  const [typeFilter, setTypeFilter] = useState('');
  const [statusFilter, setStatusFilter] = useState('');
  const [activeTab, setActiveTab] = useState<ActiveTab>('all');
  const { data, isLoading, error, refetch } = useQuery({
    // I-005: queryKey carries the active tab so TanStack Query treats
    // "all" and "dead" as distinct cache entries. Without this, switching
    // tabs would return stale data until the 30s refetchInterval fires.
    queryKey: ['notifications', activeTab],
    queryFn: () => {
      const params: Record<string, string> = { per_page: '100' };
      if (activeTab === 'dead') {
        // The listNotifications handler's ?status=dead branch hits the
        // NotificationRepository.ListByStatus path instead of plain List,
        // which is both cheaper (DLQ is a small slice of all notifications)
        // and correct (pagination counts DLQ rows, not the full inbox).
        params.status = 'dead';
      }
      return getNotifications(params);
    },
    refetchInterval: 30000,
  });
  // Phase 2 TQ-M3 closure: mark-notification-read with optimistic
  // update. Flip the row's status to 'read' in the cache immediately;
  // on error, restore the snapshot + show the toast. The success
  // toast is omitted (the visual flip from unread → read is its own
  // feedback); errors get a toast because they re-render the row
  // back to unread and the operator needs to know why.
  const queryClient = useQueryClient();
  const markRead = useTrackedMutation<unknown, Error, string, NotifSnapshot>({
    mutationFn: markNotificationRead,
    invalidates: [['notifications']],
    onMutate: async (id: string): Promise<NotifSnapshot> => {
      // Cancel any in-flight refetch so optimistic data doesn't get
      // overwritten by a stale response landing during the mutation.
      await queryClient.cancelQueries({ queryKey: ['notifications'] });
      const snapshot: NotifSnapshot = {
        prevAll: queryClient.getQueryData(['notifications', 'all']) as NotifSnapshot['prevAll'],
        prevDead: queryClient.getQueryData(['notifications', 'dead']) as NotifSnapshot['prevDead'],
      };
      const flipStatus = (page?: { data: Notification[]; total: number }) =>
        page
          ? { ...page, data: page.data.map((n) => (n.id === id ? { ...n, status: 'read' as const } : n)) }
          : page;
      queryClient.setQueryData(['notifications', 'all'], flipStatus(snapshot.prevAll));
      queryClient.setQueryData(['notifications', 'dead'], flipStatus(snapshot.prevDead));
      return snapshot;
    },
    onError: (err, _id, snapshot) => {
      if (snapshot?.prevAll) queryClient.setQueryData(['notifications', 'all'], snapshot.prevAll);
      if (snapshot?.prevDead) queryClient.setQueryData(['notifications', 'dead'], snapshot.prevDead);
      toast.error(`Mark-read failed: ${err.message}`);
    },
  });
  // I-005: requeue a dead notification. Invalidates both tab cache entries
  // because a successful requeue flips the row out of "dead" and potentially
  // into the "all" tab on its next refetch (status becomes 'pending').
  //
  // The mutationFn is wrapped as `(id) => requeueNotification(id)` rather
  // than passed by reference so react-query v5's second positional argument
  // (the mutation context object) never reaches the API client. Without the
  // wrapper, TanStack invokes `requeueNotification(id, { client })`, and the
  // I-005 Phase 1 Red contract's strict `toHaveBeenCalledWith('notif-dead-001')`
  // assertion fails on the extra argument. Keep the arrow even if the context
  // object later becomes structurally empty — the contract pins a single-arg
  // call and the page must not leak mutation machinery into API boundaries.
  const requeue = useTrackedMutation({
    mutationFn: (id: string) => requeueNotification(id),
    invalidates: [['notifications']],
  });
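  // Memoized so the empty-array fallback keeps a stable identity and the
  // useMemo hooks below do not recompute on every render while loading.
  const notifications = useMemo(() => data?.data ?? [], [data]);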
  const filtered = useMemo(() => {
    return notifications.filter((n) => {
      if (typeFilter && n.type !== typeFilter) return false;
      if (statusFilter && n.status !== statusFilter) return false;
      return true;
    });
  }, [notifications, typeFilter, statusFilter]);
  const types = useMemo(() => [...new Set(notifications.map(n => n.type))], [notifications]);
  const statuses = useMemo(() => [...new Set(notifications.map(n => n.status))], [notifications]);
  // Group by certificate_id
  const grouped = useMemo(() => {
    const groups: Record<string, Notification[]> = {};
    for (const n of filtered) {
      const key = n.certificate_id || 'general';
      if (!groups[key]) groups[key] = [];
      groups[key].push(n);
    }
    return Object.entries(groups).sort(([, a], [, b]) => {
      const aTime = new Date(a[0].created_at).getTime();
      const bTime = new Date(b[0].created_at).getTime();
      return bTime - aTime;
    });
  }, [filtered]);
  const unreadCount = filtered.filter(n => n.status === 'Pending' || n.status === 'pending').length;
  if (isLoading) {
    return (
      <>
        <PageHeader title="Notifications" />
        <div className="flex items-center justify-center flex-1 text-ink-muted">Loading...</div>
      </>
    );
  }
  if (error) {
    return (
      <>
        <PageHeader title="Notifications" />
        <ErrorState error={error as Error} onRetry={() => refetch()} />
      </>
    );
  }
  return (
    <>
      <PageHeader
        title="Notifications"
        subtitle={`${filtered.length} notifications${unreadCount ? ` (${unreadCount} unread)` : ''}`}
      />
      <div className="px-4 py-3 flex flex-wrap items-center gap-3 border-b border-surface-border/50">
        {/* I-005: tab switcher between the standard inbox and the DLQ. The
            "Dead letter" label is pinned by NotificationsPage.test.tsx — do
            not rename without updating the Phase 1 Red contract. */}
        <div className="flex rounded overflow-hidden border border-surface-border">
          <button
            onClick={() => setActiveTab('all')}
            className={`px-3 py-1.5 text-xs transition-colors ${activeTab === 'all' ? 'bg-brand-400 text-white' : 'bg-surface text-ink-muted hover:text-ink'}`}
          >
            All
          </button>
          <button
            onClick={() => setActiveTab('dead')}
            className={`px-3 py-1.5 text-xs transition-colors ${activeTab === 'dead' ? 'bg-brand-400 text-white' : 'bg-surface text-ink-muted hover:text-ink'}`}
          >
            Dead letter
          </button>
        </div>
        <div className="flex rounded overflow-hidden border border-surface-border">
          <button
            onClick={() => setViewMode('grouped')}
            className={`px-3 py-1.5 text-xs transition-colors ${viewMode === 'grouped' ? 'bg-brand-400 text-white' : 'bg-surface text-ink-muted hover:text-ink'}`}
          >
            Grouped
          </button>
          <button
            onClick={() => setViewMode('list')}
            className={`px-3 py-1.5 text-xs transition-colors ${viewMode === 'list' ? 'bg-brand-400 text-white' : 'bg-surface text-ink-muted hover:text-ink'}`}
          >
            List
          </button>
        </div>
        <select
          value={typeFilter}
          onChange={(e) => setTypeFilter(e.target.value)}
          className="bg-surface border border-surface-border rounded px-3 py-1.5 text-xs text-ink focus:outline-none focus:border-brand-400"
        >
          <option value="">All types</option>
          {types.map(t => <option key={t} value={t}>{t.replace(/([A-Z])/g, ' $1').trim()}</option>)}
        </select>
        <select
          value={statusFilter}
          onChange={(e) => setStatusFilter(e.target.value)}
          className="bg-surface border border-surface-border rounded px-3 py-1.5 text-xs text-ink focus:outline-none focus:border-brand-400"
        >
          <option value="">All statuses</option>
          {statuses.map(s => <option key={s} value={s}>{s}</option>)}
        </select>
        {(typeFilter || statusFilter) && (
          <button
            onClick={() => { setTypeFilter(''); setStatusFilter(''); }}
            className="text-xs text-ink-muted hover:text-ink transition-colors"
          >
            Clear filters
          </button>
        )}
      </div>
      <div className="flex-1 overflow-y-auto p-4 space-y-3">
        {viewMode === 'grouped' ? (
          grouped.length === 0 ? (
            <div className="text-center py-16 text-ink-faint">No notifications</div>
          ) : (
            grouped.map(([certId, items]) => (
              <div key={certId} className="card p-4">
                <div className="flex items-center justify-between mb-3">
                  <span className="text-xs font-mono text-ink-muted">
                    {certId === 'general' ? 'General' : certId}
                  </span>
                  <span className="text-xs text-ink-faint">{items.length} notification{items.length !== 1 ? 's' : ''}</span>
                </div>
                <div className="space-y-2">
                  {items.map((n) => (
                    <NotificationRow key={n.id} notification={n} onMarkRead={() => markRead.mutate(n.id)} onRequeue={() => requeue.mutate(n.id)} />
                  ))}
                </div>
              </div>
            ))
          )
        ) : (
          filtered.length === 0 ? (
            <div className="text-center py-16 text-ink-faint">No notifications</div>
          ) : (
            <div className="space-y-2">
              {filtered.map((n) => (
                <NotificationRow key={n.id} notification={n} onMarkRead={() => markRead.mutate(n.id)} onRequeue={() => requeue.mutate(n.id)} />
              ))}
            </div>
          )
        )}
      </div>
    </>
  );
}
function NotificationRow({
  notification: n,
  onMarkRead,
  onRequeue,
}: {
  notification: Notification;
  onMarkRead: () => void;
  // I-005: optional so callers who don't care about the DLQ (if any are ever
  // added) aren't forced to thread a no-op through. Every NotificationRow
  // today passes this, so in practice it's always defined.
  onRequeue?: () => void;
}) {
  const isUnread = n.status === 'Pending' || n.status === 'pending';
  // I-005: dead rows get a Requeue button and surface the retry budget + the
  // last transient error so operators triaging the DLQ can see *why* the
  // notification died before deciding whether to requeue.
  const isDead = n.status === 'dead';
  return (
    <div className={`flex items-start justify-between py-2 px-3 rounded transition-colors ${isUnread ? 'bg-surface-muted border-l-2 border-brand-400' : isDead ? 'bg-surface-muted border-l-2 border-danger' : 'hover:bg-surface-muted'}`}>
      <div className="flex-1 min-w-0">
        <div className="flex items-center gap-2 mb-1">
          <span className="text-sm text-ink">{n.type.replace(/([A-Z])/g, ' $1').trim()}</span>
          <StatusBadge status={n.status} />
          <span className="text-xs text-ink-faint">{n.channel}</span>
        </div>
        {/* D-2 (master): pre-D-2 the fallback was `{n.message || n.subject}`,
            but `subject` was a TS phantom the Go struct never emitted
            (`internal/domain/notification.go::NotificationEvent` has only
            `message`). The fallback always fell through to `message`
            because `subject` was always undefined. Post-D-2 the dead
            fallback is dropped along with the phantom field. */}
        <p className="text-xs text-ink-muted truncate">{n.message}</p>
        {isDead && (
          <div className="flex items-center gap-3 mt-1 text-xs">
            <span className="text-ink-faint">
              Retry {n.retry_count ?? 0}/5
            </span>
            {n.last_error && (
              <span className="text-danger truncate" title={n.last_error}>
                {n.last_error}
              </span>
            )}
          </div>
        )}
        <div className="flex items-center gap-3 mt-1">
          <span className="text-xs text-ink-faint">{n.recipient}</span>
          <span className="text-xs text-ink-faint">{timeAgo(n.created_at)}</span>
        </div>
      </div>
      {isUnread && (
        <button
          onClick={(e) => { e.stopPropagation(); onMarkRead(); }}
          className="ml-3 text-xs text-brand-400 hover:text-brand-500 transition-colors whitespace-nowrap"
        >
          Mark read
        </button>
      )}
      {isDead && onRequeue && (
        <button
          onClick={(e) => { e.stopPropagation(); onRequeue(); }}
          className="ml-3 text-xs text-brand-400 hover:text-brand-500 transition-colors whitespace-nowrap"
        >
          Requeue
        </button>
      )}
    </div>
  );
}