fix(computer): make every action invocable (schema field gaps) + element_at/error-mapping fixes#351
Open
1jehuang wants to merge 4 commits into
Open
Conversation
added 4 commits
June 8, 2026 21:27
…served word The computer tool implements every action and ComputerInput field, but parameters_schema() (progressive disclosure) only declared a subset of properties. Models emit only parameters present in the tool's JSON schema, so the undeclared fields could never be sent, making 8 actions impossible to invoke despite being fully implemented: - scroll (needs dx/dy) - drag (needs to_x/to_y) - resize_window (needs w/h) - select_menu (needs menu_path) - window_screenshot (needs window_id) - wait_for (needs contains, timeout_ms) - set_brightness (needs level) - perform_action (needs ax_action) - ocr region variant(needs region) Declare all input fields in the schema. Action *specs* stay in `discover` (progressive disclosure), but the *fields* must be declared or the action is dead on arrival. Trim field descriptions so the always-on schema stays small (~880 tokens) and bump the schema_is_compact bound accordingly. Also fix ax.rs::element_at: the hit-test AppleScript assigned to a local named `result`, which is reserved in AppleScript (holds the last command's value), so element_at always failed with "The variable result is not defined." Rename to `bestHit`. Add a schema_declares_every_input_field regression test so this class of bug (handler requires a field the schema never exposes) can't recur. Fixes #350.
…cation Two production-quality issues found during exhaustive live testing: 1) element_at walked the ENTIRE AX tree, so on large apps (System Settings) it blew past the scripting timeout (~20s) and failed. A child's AX frame is contained in its parent's, so a parent that doesn't contain the point can't have a matching descendant: prune to only descend into subtrees whose frame contains the point, add a depth bound, and give it an explicit 12s timeout. This turns a full-tree walk into a path-length walk. 2) osascript errors -1719 (errAEIllegalIndex / "Invalid index") were misclassified as "Accessibility permission required", sending users to grant a permission they already had when the real cause was "app not running / no front window / AX path out of range". Extract the mapping into a pure `classify_osa_error`, check permission text first, then map -1719/-1728/"invalid index" to an accurate "target not found" message. Use -25211 (errAXAPIDisabled) for the genuine permission code. Add unit tests covering each branch. Refs #350.
Each wait_for poll dumped the AX tree (depth 10) under run_applescript's 20s default timeout. On a large app a single poll could take ~20s, so a wait_for with timeout_ms=4000 didn't return until ~20s — it ignored its own deadline. Bound each poll's scripting timeout to the time remaining (capped at 3s) and shrink the dump depth to 6 so polls are cheap and the call honors timeout_ms. Refs #350.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Exhaustively exercised every
computertool action against the live running server and fixed every failure so the tool is production-ready. The handlers were already implemented; the bugs were in the function-calling contract and a few AppleScript/error-mapping/perf details. All fixes are validated live after build + reload.Fixes #350.
Bugs fixed
1. Schema omitted half the input fields → 8+ actions were impossible to call
ComputerInputdeserializesdx, dy, to_x, to_y, w, h, menu_path, window_id, ax_action, contains, timeout_ms, region, levelanddispatch()requires them, butparameters_schema()only declared a subset. A model can only emit parameters present in the tool schema, so those fields could never be sent and each dependent action dead-ended in its ownrequires Xguard:scroll, drag, resize_window, select_menu, window_screenshot, wait_for, set_brightness, perform_action, plus theocrregion variant.Fix: declare all input fields (action specs stay in
discover; fields must be declared). Trimmed descriptions so the always-on schema stays ~880 tokens. Addedschema_declares_every_input_fieldregression test.2.
element_atused the AppleScript reserved wordresultresultholds the last command's value, soelement_atalways failed withThe variable result is not defined.Renamed tobestHit.3.
element_atwalked the entire AX tree → 20s timeout on big appsNow prunes to only descend subtrees whose frame contains the point (a child's frame ⊆ its parent's), with a depth bound + explicit 12s timeout. Full-tree walk → path-length walk. Live: System Settings used to time out (20s); a comparable large tree (Messages) now returns in ~1.6s.
4.
-1719misclassified as "Accessibility permission required"-1719(errAEIllegalIndex) /-1728(errAENoSuchObject) mean the target reference didn't resolve (app not running / no front window / AX path out of range), not a permission problem. Extracted a pure, unit-testedclassify_osa_error: permission text first, then map index errors to an accurate "target not found" message. Use-25211(errAXAPIDisabled) for the real permission code.5.
wait_forignored its owntimeout_msEach poll dumped the AX tree (depth 10) under the 20s default scripting timeout, so a single poll could take ~20s and a
timeout_ms=2000call wouldn't return until ~20s. Now each poll is bounded by the time remaining (capped 3s) and the dump depth is reduced. Live: a 2000mswait_fornow returns in ~2.2s.Validation
cargo test -p jcode-app-core --lib tool::computer: 26 passed, 0 failed (10 new tests across the fixes).selfdev, installed, reloaded the running server through 4 iterations, and re-ran the full action matrix live. Every previously-broken action now works:scroll✓drag✓resize_window✓select_menu✓ (View > Actual Size in Preview)window_screenshot✓ (captured an occluded window)wait_for✓ (honors timeout)set_brightness✓ (clean message)perform_action✓ (AXPress)ocr region✓element_at✓ (fast + accurate not-found errors).Notes / known limitations
wait_for/ui/element_atoperate onfront window of frontApp; apps on another Space or with non-AX content (Catalyst/Electron/rich-text) expose limited text. This is an inherent AX limitation, not a tool bug. Follow-up tracked in computer tool: truthful observability animation for background AX actions (highlight when visible, notice when occluded) #348.Test plan for reviewers