feat(scouts): scout fleet control pane in the agents area#2586
feat(scouts): scout fleet control pane in the agents area#2586andrewm4894 wants to merge 38 commits into
Conversation
Adds a Scouts surface under /code/agents/scouts backed by the PostHog Cloud
scout endpoints (/api/projects/{id}/signals/scout/):
- api-client: ScoutConfig/ScoutRun/ScoutEmission/ScoutScratchpadEntry types
and list/get/patch methods for configs, runs, emissions, scratchpad
- core/scouts: pure presentation helpers (status normalization, timeout
inference, stuck-run detection, run filters, fleet/per-scout rollups,
interval formatting) with unit tests
- ui/features/scouts: fleet view with inline config controls (on/off,
live vs dry-run, cadence), per-scout run history with emitted/quiet/failed
filter chips and summary previews, run detail with emissions, memory
written, and an external task-log link
- agents config page gains a Scouts entry card (above Responders);
/code/agents becomes a layout route with nested scout routes
Known API gaps are patched client-side and marked with comments: no
skill_name filter on runs (client-side filter over the recent window),
no failure kind (inferred from ~30m duration), no aggregate stats
(windowed client rollups), canonical-vs-custom via hardcoded skill list,
per-run memory limited to created entries.
SettingsOptionSelect hardcodes w-full on its trigger; as a direct flex child in the scout row the interval select stretched against the row and pushed the enable toggle past the card edge. Fixed trigger widths keep the controls in a tidy column (w-36 fits the longest cadence label).
- Replace the standalone /code/agents/scouts fleet screen and summary card with ScoutsFleetSection, an expandable section on the agents page (collapsed pulse line; expanded list capped at ~10 rows with inner scroll). Detail routes survive; back link now goes to Agents. - Scout run stats now describe a true last-24h window: core fetchScoutRunsWindow walks the backend's 100-row pages on a started_at cursor with run_id dedupe (created_at isn't serialized – api gap 7), flagging truncation honestly. All window copy updated. - Status dots get tooltips + aria-labels explaining each state. - Runs polling now only happens while the section is expanded.
…og cloud - All signal counts now read "N signal(s) emitted" (was "0 signals (quiet)" / bare "N signals") across fleet rows and run rows. - Scout detail header links to the skill in PostHog cloud (/skills/<name>) via new skillUrl helper in posthogLinks.
Each fleet row now shows a strip of small boxes, one per run in the 24h window (oldest left), colored by outcome: iris emitted, green quiet, red failed, amber timed out, blue pulsing running, red pulsing stuck. Hovering shows outcome + duration + relative time; clicking a box opens the run detail (summary, emissions, task log link). - core: ScoutRollup now carries the per-scout runs in timeline order; new deriveRunOutcome/scoutRunOutcomeLabel classifiers with tests. - ui: new ScoutRunBoxes component wired into ScoutRow.
Fleet rows now show only the enable switch plus a gear toggle; mode (live/dry-run) and cadence move into a labeled form that expands under the row. The detail header keeps the full horizontal controls strip. New ScoutEnabledSwitch + ScoutConfigForm exports alongside ScoutConfigControls.
Drop the per-row stats subtitle (last ran / run counts / emitted) – the status dot, run boxes, and their tooltips already carry it. A row is now: dot, name, badges, cadence, run boxes, switch, gear.
Small external-link icon next to the name opening the skill in PostHog cloud (/skills/<name>). Revealed on row hover to keep the single-line rows quiet; sits outside the router Link to avoid a nested anchor.
"N signals emitted" after the cadence chip, only when the scout emitted at least one signal in the 24h window – quiet scouts stay clean.
Extract the fleet row into ScoutRowCard and reuse it as the header card on the scout detail screen (linkToDetail=false there), replacing the old title block + Configuration card so the two surfaces stay identical. Also drop the quiet-run sentence from the Scouts subsection description.
Boxes with a task_url now link straight to the backing task run in
the browser, with the tooltip calling it out ('open task run in
PostHog'). Runs without a task link keep the in-app run detail
fallback.
"41 signals emitted (20%) · last 24h" – emitRate is the share of
window runs that emitted at least one signal; window label loses the
space ("24 h" → "24h") and its parentheses.
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 51cbe5d60b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Five typed events covering the scouts control pane: fleet viewed (incl empty state), detail viewed, config changed (with old/new value and rollback flag), chat started (which CTA, which scout), and a generic scout action for the smaller interactions (run/emission expand, task-run and skill links, run filter, settings gear, hide-disabled toggle).
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c458e526ce
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
# Conflicts: # packages/ui/src/features/inbox/CLAUDE.md
|
React Doctor could not complete this scan.
Reviewed by React Doctor for commit |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: bde0f5c5ff
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
Re the Greptile findings (posted as "Comments Outside Diff", so replying here):
|
…fan-out - Surface configs/runs query failures instead of rendering empty states (fleet section error strip with retry; detail-view config and runs error messages), so a missing scope or API outage is distinguishable from a genuinely quiet project. - Gate the detail config card on configsLoading to stop the false "No config found" flash during the initial fetch. - Scope optimistic-rollback to the failed config and invalidate the configs query when the last in-flight mutation settles, so concurrent edits cannot clobber each other or land stale server responses. - Add a per-run fallback when an emissions query errors or returns empty, keeping the task-run link as the escape hatch. - Cap initially rendered emitted runs at 10 with a "show more" expander (new show_more_emitted_runs scout action) to bound the per-run emissions query fan-out. - Call out window truncation explicitly in the no-runs empty state. - Convert table-shaped scoutPresentation tests to it.each.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5e27aba763
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| const { data: runsWindow } = useScoutRuns(); | ||
| const { updateConfig } = useScoutConfigMutations(); | ||
| const [hideDisabled, setHideDisabled] = useState(false); | ||
| useTrackFleetViewed(configs); | ||
|
|
||
| const runs = runsWindow?.runs; | ||
| const rollups = useMemo(() => computeScoutRollups(runs ?? []), [runs]); |
There was a problem hiding this comment.
Surface fleet run query failures
When the expanded fleet's runs request is still loading or rejects (for example a missing signal_scout:read scope, a 500, or endpoint rollout gap), this hook only reads data, so runsWindow stays undefined and the code computes rollups from []. The fleet then reports “none running now”, “0 signals emitted”, and empty run boxes under “last 24h” instead of indicating that run stats are unavailable; read isLoading/isError from useScoutRuns() and render a loading/error state before deriving zero-valued stats.
Useful? React with 👍 / 👎.
| // Fire the viewed event once per scout, after both queries settle so the | ||
| // config and run-window stats are real rather than loading-state zeros. | ||
| const viewTrackedFor = useRef<string | null>(null); | ||
| useEffect(() => { |
There was a problem hiding this comment.
I remember when LLMs were terrible at this kind of stuff. Today this implementation actually makes sense
De-LLM-ifying.
| if ( | ||
| !RUN_INTERVAL_OPTIONS.some( | ||
| (option) => option.minutes === config.run_interval_minutes, | ||
| ) | ||
| ) { | ||
| options.push({ | ||
| value: String(config.run_interval_minutes), | ||
| label: formatRunInterval(config.run_interval_minutes), | ||
| }); |
There was a problem hiding this comment.
It's odd that the select option is last - feels like it'd be best to not funk around with this, and show them always in the same order
Problem
The signals-scout fleet (scheduled agents that sweep a project and emit findings to the Signals inbox) had no UI in PostHog Code. Configs could only be inspected and tuned via the MCP tools or the REST API directly: no way to see what scouts exist, whether they are healthy, what they found, or to toggle/tune them without leaving the app.
Changes
Adds a scouts control pane to the agents area (
/code/agents), plus a per-scout detail screen.Fleet section (expandable, on the agents config page):
ScoutRowCardper scout with origin/dry-run badges (with explainer tooltips), cadence, per-run outcome boxes, enable switch, and a gear that expands an inline config form (mode + cadence).Chat CTAs (one-click agent tasks, mirroring the inbox discuss flow):
useInboxCloudTaskRunner(report-scoped runs keep theirsignal_reportanalytics; these track asmanual).Scout detail screen (
/code/agents/scouts/$skillName), signals-first:ScoutRowCardheader, 24h rollup line, a Signals section showing emissions from emitted runs, then the run list with All/Emitted/Quiet/Failed filter chips.Plumbing (per the repo's layering rules):
@posthog/api-client: scout config/run/emission methods.@posthog/core:scouts/scoutPresentation.ts(rollups, outcome derivation, filters) andscoutRunsWindow.ts(paginated 24h fleet window assembly), both unit-tested.@posthog/ui:features/scouts/components and hooks, nested TanStack routes.Known backend gaps (per-scout run filter, failure info on runs, aggregate stats) are patched client-side for now and tracked for
scout_harnessfixes in the posthog repo.How did you test this?
scoutPresentation.test.ts,scoutRunsWindow.test.ts); full ui/core test suites, Biome, and typecheck pass (also enforced per-commit by the pre-commit hook).signal_scout:writescope is unconfirmed (reads confirmed working); the chat CTAs end-to-end (task creation path is the same shared runner the inbox discuss/create-PR flows use).Automatic notifications