Skip to content

fix(computer): make every action invocable (schema field gaps) + element_at/error-mapping fixes#351

Open
1jehuang wants to merge 4 commits into
feat/macos-computer-toolfrom
fix/computer-tool-schema-and-element-at
Open

fix(computer): make every action invocable (schema field gaps) + element_at/error-mapping fixes#351
1jehuang wants to merge 4 commits into
feat/macos-computer-toolfrom
fix/computer-tool-schema-and-element-at

Conversation

@1jehuang

@1jehuang 1jehuang commented Jun 9, 2026

Copy link
Copy Markdown
Owner

Summary

Exhaustively exercised every computer tool action against the live running server and fixed every failure so the tool is production-ready. The handlers were already implemented; the bugs were in the function-calling contract and a few AppleScript/error-mapping/perf details. All fixes are validated live after build + reload.

Fixes #350.

Bugs fixed

1. Schema omitted half the input fields → 8+ actions were impossible to call

ComputerInput deserializes dx, dy, to_x, to_y, w, h, menu_path, window_id, ax_action, contains, timeout_ms, region, level and dispatch() requires them, but parameters_schema() only declared a subset. A model can only emit parameters present in the tool schema, so those fields could never be sent and each dependent action dead-ended in its own requires X guard:
scroll, drag, resize_window, select_menu, window_screenshot, wait_for, set_brightness, perform_action, plus the ocr region variant.

Fix: declare all input fields (action specs stay in discover; fields must be declared). Trimmed descriptions so the always-on schema stays ~880 tokens. Added schema_declares_every_input_field regression test.

2. element_at used the AppleScript reserved word result

result holds the last command's value, so element_at always failed with The variable result is not defined. Renamed to bestHit.

3. element_at walked the entire AX tree → 20s timeout on big apps

Now prunes to only descend subtrees whose frame contains the point (a child's frame ⊆ its parent's), with a depth bound + explicit 12s timeout. Full-tree walk → path-length walk. Live: System Settings used to time out (20s); a comparable large tree (Messages) now returns in ~1.6s.

4. -1719 misclassified as "Accessibility permission required"

-1719 (errAEIllegalIndex) / -1728 (errAENoSuchObject) mean the target reference didn't resolve (app not running / no front window / AX path out of range), not a permission problem. Extracted a pure, unit-tested classify_osa_error: permission text first, then map index errors to an accurate "target not found" message. Use -25211 (errAXAPIDisabled) for the real permission code.

5. wait_for ignored its own timeout_ms

Each poll dumped the AX tree (depth 10) under the 20s default scripting timeout, so a single poll could take ~20s and a timeout_ms=2000 call wouldn't return until ~20s. Now each poll is bounded by the time remaining (capped 3s) and the dump depth is reduced. Live: a 2000ms wait_for now returns in ~2.2s.

Validation

  • cargo test -p jcode-app-core --lib tool::computer: 26 passed, 0 failed (10 new tests across the fixes).
  • Built selfdev, installed, reloaded the running server through 4 iterations, and re-ran the full action matrix live. Every previously-broken action now works:
    scrolldragresize_windowselect_menu ✓ (View > Actual Size in Preview) window_screenshot ✓ (captured an occluded window) wait_for ✓ (honors timeout) set_brightness ✓ (clean message) perform_action ✓ (AXPress) ocr regionelement_at ✓ (fast + accurate not-found errors).

Notes / known limitations

Test plan for reviewers

cargo test -p jcode-app-core --lib tool::computer
# live (mutates desktop; needs GUI + Accessibility/Screen Recording):
cargo test -p jcode-app-core tool::computer::coverage -- --ignored --nocapture --test-threads=1

jeremy added 4 commits June 8, 2026 21:27
…served word

The computer tool implements every action and ComputerInput field, but
parameters_schema() (progressive disclosure) only declared a subset of
properties. Models emit only parameters present in the tool's JSON schema,
so the undeclared fields could never be sent, making 8 actions impossible
to invoke despite being fully implemented:

- scroll            (needs dx/dy)
- drag              (needs to_x/to_y)
- resize_window     (needs w/h)
- select_menu       (needs menu_path)
- window_screenshot (needs window_id)
- wait_for          (needs contains, timeout_ms)
- set_brightness    (needs level)
- perform_action    (needs ax_action)
- ocr region variant(needs region)

Declare all input fields in the schema. Action *specs* stay in `discover`
(progressive disclosure), but the *fields* must be declared or the action
is dead on arrival. Trim field descriptions so the always-on schema stays
small (~880 tokens) and bump the schema_is_compact bound accordingly.

Also fix ax.rs::element_at: the hit-test AppleScript assigned to a local
named `result`, which is reserved in AppleScript (holds the last command's
value), so element_at always failed with "The variable result is not
defined." Rename to `bestHit`.

Add a schema_declares_every_input_field regression test so this class of
bug (handler requires a field the schema never exposes) can't recur.

Fixes #350.
…cation

Two production-quality issues found during exhaustive live testing:

1) element_at walked the ENTIRE AX tree, so on large apps (System Settings)
   it blew past the scripting timeout (~20s) and failed. A child's AX frame
   is contained in its parent's, so a parent that doesn't contain the point
   can't have a matching descendant: prune to only descend into subtrees
   whose frame contains the point, add a depth bound, and give it an explicit
   12s timeout. This turns a full-tree walk into a path-length walk.

2) osascript errors -1719 (errAEIllegalIndex / "Invalid index") were
   misclassified as "Accessibility permission required", sending users to
   grant a permission they already had when the real cause was "app not
   running / no front window / AX path out of range". Extract the mapping
   into a pure `classify_osa_error`, check permission text first, then map
   -1719/-1728/"invalid index" to an accurate "target not found" message.
   Use -25211 (errAXAPIDisabled) for the genuine permission code. Add unit
   tests covering each branch.

Refs #350.
Each wait_for poll dumped the AX tree (depth 10) under run_applescript's
20s default timeout. On a large app a single poll could take ~20s, so a
wait_for with timeout_ms=4000 didn't return until ~20s — it ignored its own
deadline. Bound each poll's scripting timeout to the time remaining (capped
at 3s) and shrink the dump depth to 6 so polls are cheap and the call
honors timeout_ms.

Refs #350.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant