chore(core): hygiene cleanup from issue #431 by lmorchard · Pull Request #444 · mozilla/pilo

lmorchard · 2026-05-13T20:39:28Z

Summary

Bundle the items from #431 that still apply to current main. Each touches a small, focused area of packages/core.

A. Warn + emit SYSTEM_DEBUG_TOOL_DROP when providers return more than one tool call in a single turn, so dropped extras are observable rather than silently lost.
B. Replace the string-match isSetupError check with PlanningError / NoStartingUrlError subclasses (re-exported from the public API), so setup-error detection survives refactors of the underlying message text.
D. Bump the wait tool's upper bound from 30s to 120s and rewrite its execute to sleep directly with abort-signal polling instead of going through page.waitForTimeout (which is abort-blind). A 120s wait now responds to user aborts within ~500ms.
E. Drop the unused actionLoopSystemPrompt export; refactor the prompts tests to call buildActionLoopSystemPrompt(false, false) directly.
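
For readers skimming item A, here is a minimal sketch of the drop-and-report behaviour. It is not the code in webAgent.ts: `selectToolCall`, the `ToolCall` shape, and the `emit` callback are hypothetical names, and keeping the first call of the turn is an assumption. Only the event name and the `droppedTools` / `keptTool` payload fields come from this PR, and their string types here are a guess.

```typescript
// Hypothetical shapes for illustration; the real payload type lives in
// packages/core/src/events.ts and may carry more than tool names.
interface ToolCall {
  toolName: string;
  input: unknown;
}

interface ToolDropPayload {
  keptTool: string;
  droppedTools: string[];
}

type Emit = (eventType: "SYSTEM_DEBUG_TOOL_DROP", payload: ToolDropPayload) => void;

/**
 * Keep the first tool call of a turn (assumption) and report any extras
 * instead of discarding them silently.
 * Returns undefined when the provider produced no tool calls at all.
 */
function selectToolCall(calls: ToolCall[], emit: Emit): ToolCall | undefined {
  const [kept, ...dropped] = calls;
  if (kept === undefined) return undefined;

  if (dropped.length > 0) {
    const droppedTools = dropped.map((call) => call.toolName);
    console.warn(
      `Provider returned ${calls.length} tool calls; keeping "${kept.toolName}", dropping: ${droppedTools.join(", ")}`,
    );
    emit("SYSTEM_DEBUG_TOOL_DROP", { keptTool: kept.toolName, droppedTools });
  }
  return kept;
}
```

A single-call turn passes through without a warning or an event, so the diagnostic only fires in the case the issue was worried about.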

Item C (harden image-strip detection) was a no-op: the fallback the issue refers to (webAgent.ts:542-552) is not present in the current main lineage. git log -S "stripping images" shows it was introduced on PR #378 but didn't reach current main after the develop → main workflow switchover (PR #337). Nothing to harden.

Item F (persona prompt scroll guidance) is deferred to a follow-up PR that adds a real scroll tool — so the prompt text becomes accurate rather than stripped. Keeping that work separate so it gets its own review/changelog entry rather than being buried under a "hygiene" label.

Test plan

pnpm run check passes (typecheck + format + 1247 tests across core/cli/server/extension)
New tests assert:
- SYSTEM_DEBUG_TOOL_DROP event payload (droppedTools / keptTool)
- Planning failures reject with a PlanningError instance
- wait cap accepts 120s, rejects 121s
- wait aborts within one poll interval when abortSignal fires
Generated schema (schemas/webagent-event.json) regenerated for the new event
gitleaks protect --staged clean
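
The `PlanningError` assertion above relies on class identity rather than message text. A rough sketch of that idea follows, under assumptions: the class names and `isSetupError` come from this PR, but the class bodies, the plain `Error` base class, and the example messages are illustrative and may not match errors.ts.

```typescript
// Assumption: both classes extend Error directly; the real hierarchy in
// packages/core/src/errors.ts may use a shared base class instead.
class PlanningError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "PlanningError";
    // Keeps instanceof working if the code is compiled down to ES5.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

class NoStartingUrlError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "NoStartingUrlError";
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

/**
 * Setup-error detection by type. Rewording an error message no longer
 * changes the result, which is the point of item B.
 */
function isSetupError(error: unknown): boolean {
  return error instanceof PlanningError || error instanceof NoStartingUrlError;
}
```

With this shape, a generic `Error` that happens to carry planning-like wording is not classified as a setup error, while a `PlanningError` with any message is.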

Resolves #431

Copilot

Pull request overview

This PR performs focused core hygiene cleanup around tool-call observability, setup error typing, wait-tool limits, and prompt/export cleanup.

Changes:

Adds typed planning/setup errors and a multi-tool-drop debug event.
Increases the wait tool cap to 120 seconds and updates prompts/tests.
Removes the dead actionLoopSystemPrompt export and refreshes related tests/docs.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| `packages/core/src/webAgent.ts` | Emits multi-tool-drop diagnostics and uses typed setup errors. |
| `packages/core/src/utils/retry.ts` | Adds tool-required diagnostics and retry prompt augmentation. |
| `packages/core/src/tools/webActionTools.ts` | Raises wait tool validation limit to 120 seconds. |
| `packages/core/src/prompts.ts` | Updates wait descriptions and removes scroll guidance/dead export. |
| `packages/core/src/events.ts` | Adds the `SYSTEM_DEBUG_TOOL_DROP` event and payload type. |
| `packages/core/src/errors.ts` | Adds `PlanningError` and `NoStartingUrlError`. |
| `packages/core/src/core.ts` | Re-exports the new error classes. |
| `packages/core/test/webAgent.test.ts` | Adds tests for planning error typing and multi-tool-drop events. |
| `packages/core/test/utils/retry.test.ts` | Adds tests for tool-required retry diagnostics/augmentation. |
| `packages/core/test/tools/webActionTools.test.ts` | Updates wait tool description/schema tests. |
| `packages/core/test/prompts.test.ts` | Refactors prompt tests after removing the exported constant. |
| `packages/core/test/events.test.ts` | Updates expected event type list. |
| `docs/dev-sessions/2026-05-13-1200-hygiene-cleanup-431/notes.md` | Adds implementation notes for issue #431 cleanup. |

Comments suppressed due to low confidence (1)

packages/core/src/tools/webActionTools.ts:197

Raising the wait limit to 120s makes aborts much less responsive because the wait action ultimately calls page.waitForTimeout(seconds * 1000) and neither this tool nor the browser wait path observes context.abortSignal. A user abort during wait({ seconds: 120 }) can be delayed until the full wait finishes; make the wait abort-aware before increasing the cap this far.

      inputSchema: z.object({

lmorchard · 2026-05-13T21:07:25Z

Thanks for the review — addressed all three points:

1. Wait + abort responsiveness (suppressed comment on webActionTools.ts:197): Real concern, fixed. The wait tool no longer routes through browser.performAction → page.waitForTimeout (which is abort-blind). It now does its own 500ms-polling sleep loop that checks context.abortSignal each iteration, so user aborts during a wait({seconds: 120}) are responsive within ~500ms. New test in webActionTools.test.ts exercises the abort path.

2. NoStartingUrlError doc (errors.ts:115): Fixed. The docstring now describes the actual situation (defensive guard against an unexpectedly unset this.url after planTask should have defaulted to about:blank) rather than the user-facing scenario it implied.

3. notes.md:26 stale comment: Fixed. Updated the line-288 entry to match what the diff actually does (rewrites the Best-Practices line so it points at wait() rather than the non-existent scroll action).
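
To make point 1 concrete, here is a hedged sketch of a polling sleep of the kind described. It is not the `execute` body from webActionTools.ts: `abortableSleep` and its `"completed"` / `"aborted"` return values are hypothetical, and the real tool may signal an abort differently (for example by throwing). Only the 500ms poll interval and the abort-signal check come from the reply above.

```typescript
type SleepOutcome = "completed" | "aborted";

/**
 * Sleep for up to totalMs, waking every pollMs to check the abort signal.
 * Worst-case abort latency is one poll interval, not the full wait.
 */
async function abortableSleep(
  totalMs: number,
  signal?: AbortSignal,
  pollMs = 500,
): Promise<SleepOutcome> {
  const deadline = Date.now() + totalMs;
  while (true) {
    if (signal?.aborted) return "aborted";
    const remaining = deadline - Date.now();
    if (remaining <= 0) return "completed";
    // Never sleep past the deadline, so short waits stay accurate.
    await new Promise<void>((resolve) => setTimeout(resolve, Math.min(pollMs, remaining)));
  }
}
```

Under these assumptions, `abortableSleep(120 * 1000, context.abortSignal)` returns within roughly one poll interval of the signal firing, which is where the ~500ms figure comes from. Listening for the signal's `abort` event would react sooner, but polling keeps the loop simple and bounded.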

Bundle the items from #431 that still apply to current main:

- A. Warn + emit SYSTEM_DEBUG_TOOL_DROP when providers return more than one tool call in a single turn, so dropped extras are observable rather than silently lost.
- B. Replace the string-match `isSetupError` check with `PlanningError` and `NoStartingUrlError` subclasses (re-exported from the public API), so setup-error detection survives refactors of the underlying message text.
- D. Bump the `wait` tool's upper bound from 30s to 120s and rewrite its `execute` to sleep directly with abort-signal polling instead of going through `page.waitForTimeout` (which is abort-blind), so a 120s wait can be interrupted by a user abort within ~500ms.
- E. Drop the unused `actionLoopSystemPrompt` export; refactor the prompts tests to call `buildActionLoopSystemPrompt(false, false)` directly.

Item C (harden image-strip detection) was a no-op: the fallback the issue refers to is not present in the current main lineage, so there is nothing to harden.

Item F (persona prompt scroll guidance) is deferred to a follow-up PR that adds a real `scroll` tool, so the prompt text becomes accurate rather than being stripped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

lmorchard requested a review from Copilot May 13, 2026 20:50

Copilot started reviewing on behalf of lmorchard May 13, 2026 20:51

lmorchard force-pushed the worktree-fix-431-hygiene-cleanup branch 2 times, most recently from 3a1ee57 to 2959bee, May 13, 2026 20:57

Copilot AI reviewed May 13, 2026

Comment thread packages/core/src/errors.ts Outdated

Comment thread docs/dev-sessions/2026-05-13-1200-hygiene-cleanup-431/notes.md Outdated

lmorchard force-pushed the worktree-fix-431-hygiene-cleanup branch from 2959bee to 44986e6, May 13, 2026 21:07

lmorchard force-pushed the worktree-fix-431-hygiene-cleanup branch 3 times, most recently from 0eaf00a to 1d16fad, May 13, 2026 21:43

lmorchard force-pushed the worktree-fix-431-hygiene-cleanup branch from 1d16fad to 3d5ba09, May 13, 2026 21:43

lmorchard mentioned this pull request May 13, 2026 in feat(core): add scroll tool #445 (open, 3 tasks)
