feat(llm): support parallel tool calls by robdefeo · Pull Request #40 · robdefeo/yaai

robdefeo · 2026-05-02T00:59:33Z

Summary

Both providers were permanently suppressing parallel tool calls
(disable_parallel_tool_use: true / parallel_tool_calls: false), which
kept the implementation correct but prevented the model from issuing
independent tool calls in a single turn. This removes those flags and wires
up the full stack to handle parallel calls properly.

LlmResponse.tool_calls is now a Vec (empty = final answer, N = calls);
ConversationTurn::AssistantToolCall carries calls: Vec<ToolCall>;
EntryContent::ToolCall stores the whole batch as one memory entry.
The Anthropic client merges consecutive ToolResult turns into a single
user message (required by the API). The agent loop dispatches all calls
concurrently via futures::join_all and stores individual result entries.

Closes #39

- LlmResponse.tool_call (Option) replaced by tool_calls (Vec): empty means final answer, N entries means parallel tool use
- ConversationTurn::AssistantToolCall now carries calls: Vec<ToolCall> instead of a single id/name/arguments triple
- EntryContent::ToolCall stores calls: Vec<MemoryToolCall> so a whole parallel batch is saved as one assistant memory entry
- Anthropic client: disable_parallel_tool_use removed; parse_blocks collects all ToolUse blocks; streaming accumulates per block index via BTreeMap; consecutive ToolResult turns merged into one user message (required by the API for parallel results)
- OpenAI client: parallel_tool_calls field removed (API defaults to enabled); complete() and streaming accumulator both collect all calls
- Agent loop dispatches all tool calls concurrently via join_all, storing one ToolCall batch entry and one ToolResult entry per result
- futures crate added to workspace for join_all
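To make the new `tool_calls` convention concrete, here is a minimal sketch of the response type. The struct layouts, the `content` field, and the `is_final_answer` helper are assumptions for illustration; only the names `LlmResponse`, `ToolCall`, and `tool_calls` come from the PR, and the real `arguments` field appears to hold parsed JSON rather than a string.

```rust
// Hypothetical, simplified stand-ins for the types named in this PR.
#[derive(Debug, Clone, PartialEq)]
struct ToolCall {
    id: String,
    name: String,
    // Assumption: a plain string keeps this sketch std-only.
    arguments: String,
}

#[derive(Debug, Default)]
struct LlmResponse {
    // Assumed field holding the model's text, if any.
    content: Option<String>,
    // Empty = final answer, N entries = N parallel tool calls.
    tool_calls: Vec<ToolCall>,
}

impl LlmResponse {
    // Hypothetical helper: the PR only states the "empty means final" rule.
    fn is_final_answer(&self) -> bool {
        self.tool_calls.is_empty()
    }
}

fn main() {
    let done = LlmResponse {
        content: Some("All finished".to_string()),
        tool_calls: Vec::new(),
    };
    let batch = LlmResponse {
        content: None,
        tool_calls: vec![
            ToolCall { id: "tc1".into(), name: "search".into(), arguments: "{}".into() },
            ToolCall { id: "tc2".into(), name: "fetch".into(), arguments: "{}".into() },
        ],
    };
    println!("{:?} final={}", done, done.is_final_answer());
    println!("{:?} final={}", batch, batch.is_final_answer());
}
```

Using an empty `Vec` instead of `Option<Vec<_>>` leaves a single representation for "no tool calls", so callers only need one check.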

greptile-apps · 2026-05-02T01:01:28Z

Confidence Score: 4/5

Safe to merge with the streaming error-handling fix; the P1 only surfaces on malformed server-sent JSON in the OpenAI streaming path.

One P1 (silent tool-call drop on parse error in OpenAI streaming) caps the score at 4. All other changes are correct and well-tested, including the Anthropic path, memory model, and agent loop.

crates/llm/src/openai.rs — accumulated_response function around line 262

Sequence Diagram

sequenceDiagram
    participant AL as AgentLoop
    participant LLM as LlmClient
    participant MEM as SessionMemory
    participant T1 as Tool[0]
    participant T2 as Tool[1]

    AL->>LLM: complete(turns, tools)
    LLM-->>AL: LlmResponse { tool_calls: [TC1, TC2] }
    AL->>MEM: add_entry(ToolCall { calls: [TC1, TC2], reasoning })
    par Concurrent dispatch
        AL->>T1: dispatch(TC1.name, TC1.args)
        AL->>T2: dispatch(TC2.name, TC2.args)
    end
    T1-->>AL: result1
    T2-->>AL: result2
    AL->>MEM: add_entry(ToolResult { tool_call_id: TC1.id, content: result1 })
    AL->>MEM: add_entry(ToolResult { tool_call_id: TC2.id, content: result2 })
    Note over AL,MEM: On next turn, Anthropic merges consecutive ToolResult turns into one user message
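The memory layout in the diagram (one batch entry, then one result entry per call) can be sketched as follows. This is not the PR's code: the PR uses `futures::join_all` on async tool dispatch, while this std-only sketch swaps in `std::thread::scope` to get the same "run all, then collect results in call order" shape. The `Entry` enum, `dispatch`, and `run_batch` are hypothetical names.

```rust
use std::thread;

#[derive(Debug, Clone, PartialEq)]
struct ToolCall {
    id: String,
    name: String,
    args: String,
}

// Hypothetical, simplified memory entries mirroring the diagram.
#[derive(Debug, PartialEq)]
enum Entry {
    ToolCall { calls: Vec<ToolCall> },
    ToolResult { tool_call_id: String, content: String },
}

// Hypothetical tool dispatcher; a real one would look the tool up by name.
fn dispatch(name: &str, args: &str) -> String {
    format!("{name}({args})")
}

fn run_batch(calls: Vec<ToolCall>, memory: &mut Vec<Entry>) {
    // 1. Store the whole batch as a single entry.
    memory.push(Entry::ToolCall { calls: calls.clone() });

    // 2. Run every call concurrently (threads stand in for join_all here).
    let results: Vec<String> = thread::scope(|s| {
        let handles: Vec<_> = calls
            .iter()
            .map(|c| s.spawn(move || dispatch(&c.name, &c.args)))
            .collect();
        // Joining in spawn order keeps results aligned with `calls`.
        handles
            .into_iter()
            .map(|h| h.join().expect("tool thread panicked"))
            .collect()
    });

    // 3. Store one result entry per call, keyed by the call id.
    for (call, content) in calls.iter().zip(results) {
        memory.push(Entry::ToolResult { tool_call_id: call.id.clone(), content });
    }
}

fn main() {
    let mut memory = Vec::new();
    let calls = vec![
        ToolCall { id: "tc1".into(), name: "search".into(), args: "a".into() },
        ToolCall { id: "tc2".into(), name: "fetch".into(), args: "b".into() },
    ];
    run_batch(calls, &mut memory);
    println!("{memory:#?}");
}
```

Like `join_all`, this returns results in input order regardless of which call finishes first, which is what lets each result be paired with its `tool_call_id` by position.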

Reviews (1): Last reviewed commit: "feat(llm): support parallel tool calls a..."

greptile-apps · 2026-05-02T01:01:31Z

+                let arguments = if args_json.is_empty() {
+                    serde_json::json!({})
+                } else {
+                    serde_json::from_str(&args_json).ok()?


Silent error swallowing on malformed streamed tool-call JSON

.ok()? inside filter_map converts a parse failure to None, causing the tool call to be silently dropped from tool_calls with no error and no log warning. The Anthropic streaming path (lines 435–436 in anthropic.rs) correctly uses .with_context(...)? on a map(...).collect::<Result<_>>()? chain, which propagates the error to the caller. A bad JSON chunk here will produce a response that appears to have fewer tool calls than the model intended, leading to silent data loss.

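The fix is to have the closure yield Result items and collect them as Result<Vec<_>>, mirroring the Anthropic pattern. filter_map can stay so that entries missing an id or name are still skipped, but the parse error has to be carried in the yielded item rather than unwrapped with ?, which does not compile inside a closure that returns Option:

    let tool_calls = accumulator
        .tools
        .into_values()
        .filter_map(|(id, name, args_json)| match (id, name) {
            (Some(id), Some(name)) => {
                // Keep the parse outcome as a Result instead of discarding it.
                let arguments = if args_json.is_empty() {
                    Ok(serde_json::json!({}))
                } else {
                    serde_json::from_str(&args_json)
                        .context("parsing streamed tool call arguments")
                };
                Some(arguments.map(|arguments| ToolCall { id, name, arguments }))
            }
            _ => None,
        })
        // The first Err short-circuits and propagates to the caller.
        .collect::<Result<Vec<_>>>()?;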
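The difference between the two patterns is easy to reproduce in isolation. The sketch below is a std-only analogue, with `str::parse::<i32>` standing in for `serde_json::from_str`; the function names are hypothetical.

```rust
use std::num::ParseIntError;

// Mirrors `.ok()?` inside filter_map: a failed parse just disappears.
fn parse_dropping(raw: &[&str]) -> Vec<i32> {
    raw.iter().filter_map(|s| s.parse::<i32>().ok()).collect()
}

// Mirrors map + collect::<Result<Vec<_>, _>>(): the first failure is returned.
fn parse_propagating(raw: &[&str]) -> Result<Vec<i32>, ParseIntError> {
    raw.iter().map(|s| s.parse::<i32>()).collect()
}

fn main() {
    let raw = ["1", "oops", "3"];
    // The caller cannot tell that one input was lost.
    println!("dropping:    {:?}", parse_dropping(&raw));
    // The caller sees the error and can decide what to do.
    println!("propagating: {:?}", parse_propagating(&raw));
}
```

With the dropping variant the output is shorter than the input and nothing signals why, which is the failure mode described above for streamed tool calls.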

Copilot

Pull request overview

Enables end-to-end support for parallel tool calls across the LLM clients, agent loop, and session memory, removing the provider-side flags that previously forced tool calls to be serialized.

Changes:

- Update core types to represent tool calls as batches (Vec<ToolCall> / Vec<MemoryToolCall>) instead of single calls.
- Teach OpenAI + Anthropic clients to serialize/parse multiple tool calls, including streaming accumulation.
- Dispatch tool calls concurrently in the agent loop and store one ToolCall batch entry plus per-call ToolResult entries (with Anthropic ToolResult message merging support).
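The ToolResult merging step can be sketched as below. The `Turn` and `Message` enums and the `to_messages` function are simplified, hypothetical shapes rather than the PR's actual types; the sketch only shows how consecutive ToolResult turns might fold into a single user message.

```rust
// Hypothetical, simplified conversation turns.
#[derive(Debug, Clone, PartialEq)]
enum Turn {
    User(String),
    AssistantToolCall { call_ids: Vec<String> },
    ToolResult { tool_call_id: String, content: String },
}

// Hypothetical provider-side messages.
#[derive(Debug, PartialEq)]
enum Message {
    User(String),
    Assistant { tool_use_ids: Vec<String> },
    // One user message carrying (tool_call_id, content) result blocks.
    UserToolResults(Vec<(String, String)>),
}

fn to_messages(turns: &[Turn]) -> Vec<Message> {
    let mut out: Vec<Message> = Vec::new();
    for turn in turns {
        match turn {
            Turn::User(text) => out.push(Message::User(text.clone())),
            Turn::AssistantToolCall { call_ids } => {
                out.push(Message::Assistant { tool_use_ids: call_ids.clone() })
            }
            Turn::ToolResult { tool_call_id, content } => {
                let block = (tool_call_id.clone(), content.clone());
                // If the previous message already holds results, append to it
                // instead of starting a second user message.
                if let Some(Message::UserToolResults(blocks)) = out.last_mut() {
                    blocks.push(block);
                } else {
                    out.push(Message::UserToolResults(vec![block]));
                }
            }
        }
    }
    out
}

fn main() {
    let turns = vec![
        Turn::User("compare a and b".into()),
        Turn::AssistantToolCall { call_ids: vec!["tc1".into(), "tc2".into()] },
        Turn::ToolResult { tool_call_id: "tc1".into(), content: "result1".into() },
        Turn::ToolResult { tool_call_id: "tc2".into(), content: "result2".into() },
    ];
    println!("{:#?}", to_messages(&turns));
}
```

Merging at serialization time keeps the stored memory entries one-per-result while still producing the single message the PR says the API requires.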

Reviewed changes

Copilot reviewed 9 out of 10 changed files in this pull request and generated 2 comments.


| File | Description |
| --- | --- |
| crates/memory/src/lib.rs | Store tool calls as a batch in memory via `calls: Vec<MemoryToolCall>`. |
| crates/llm/src/lib.rs | Change `LlmResponse.tool_call` → `tool_calls: Vec<ToolCall>` and update final-answer logic. |
| crates/llm/src/openai.rs | Remove `parallel_tool_calls` suppression; add multi-tool-call (incl. streaming) support. |
| crates/llm/src/anthropic.rs | Remove `disable_parallel_tool_use`; support multiple tool_use blocks and merge consecutive ToolResult turns per API requirements. |
| crates/llm/tests/stub_tests.rs | Update tests for `tool_calls` semantics and serialization expectations. |
| crates/agent-loop/src/lib.rs | Dispatch tool calls concurrently (`join_all`) and store batch ToolCall + individual ToolResults. |
| crates/agent-loop/tests/agent_loop_tests.rs | Add coverage for parallel tool call dispatch + memory storage layout. |
| crates/agent-loop/Cargo.toml | Add `futures` dependency for concurrent dispatch. |
| Cargo.toml | Add workspace dependency on `futures`. |
| Cargo.lock | Lockfile updates for `futures` and related crates. |


            for tool_call in tool_calls {
-                // Track and filter by the first tool call's index so parallel
-                // tool calls don't contaminate tool_arguments with a second
-                // call's JSON, which would produce invalid concatenated JSON.
-                if let Some(idx) = tool_call.index {
-                    match accumulator.tool_block_index {
-                        None => accumulator.tool_block_index = Some(idx),
-                        Some(first) if idx != first => {
-                            warn!(
-                                first_index = first,
-                                skipped_index = idx,
-                                "ignoring parallel tool call delta"
-                            );
-                            continue;
-                        }
-                        _ => {}
-                    }
-                } else if accumulator.tool_block_index.is_some() {
-                    // index absent but we already captured a call — treat as belonging to it
-                }
+                let idx = tool_call.index.unwrap_or(0);
+                let entry = accumulator
+                    .tools
+                    .entry(idx)
+                    .or_insert((None, None, String::new()));
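The per-index accumulation introduced in this hunk can be sketched on its own. The `Delta` struct is a hypothetical shape for one streamed tool-call fragment, and `accumulate` is an illustrative name; only the `BTreeMap` keyed by index and the `(None, None, String::new())` default come from the diff.

```rust
use std::collections::BTreeMap;

// Hypothetical shape of one streamed tool-call delta.
struct Delta {
    index: Option<u32>,
    id: Option<String>,
    name: Option<String>,
    arguments: String, // a fragment of the JSON arguments
}

// (id, name, accumulated argument JSON), as in the diff's tuple.
type Partial = (Option<String>, Option<String>, String);

fn accumulate(deltas: Vec<Delta>) -> BTreeMap<u32, Partial> {
    let mut tools: BTreeMap<u32, Partial> = BTreeMap::new();
    for delta in deltas {
        // A missing index is treated as call 0, as in the diff.
        let idx = delta.index.unwrap_or(0);
        let entry = tools.entry(idx).or_insert((None, None, String::new()));
        if delta.id.is_some() {
            entry.0 = delta.id;
        }
        if delta.name.is_some() {
            entry.1 = delta.name;
        }
        // Fragments for different calls never mix because each index
        // has its own buffer.
        entry.2.push_str(&delta.arguments);
    }
    tools
}

fn main() {
    let deltas = vec![
        Delta { index: Some(0), id: Some("a".into()), name: Some("search".into()), arguments: "{\"q\":".into() },
        Delta { index: Some(1), id: Some("b".into()), name: Some("fetch".into()), arguments: "{}".into() },
        Delta { index: Some(0), id: None, name: None, arguments: "1}".into() },
    ];
    for (idx, (id, name, args)) in accumulate(deltas) {
        println!("{idx}: {id:?} {name:?} {args}");
    }
}
```

A `BTreeMap` iterates in key order, so the finished calls come out in index order even when deltas for different calls arrive interleaved.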


+            (Some(id), Some(name)) => {
+                let arguments = if args_json.is_empty() {
+                    serde_json::json!({})
+                } else {
+                    serde_json::from_str(&args_json).ok()?
+                };
+                Some(ToolCall {
+                    id,
+                    name,
+                    arguments,
+                })
+            }
+            _ => None,
+        })
+        .collect();



greptile-apps Bot reviewed May 2, 2026

robdefeo requested a review from Copilot May 3, 2026 23:46

Copilot started reviewing on behalf of robdefeo May 3, 2026 23:47 View session

Copilot AI reviewed May 3, 2026

robdefeo wants to merge 1 commit into main from worktree-rosy-sprouting-cray
