feat(llm): support parallel tool calls #40

Open
robdefeo wants to merge 1 commit into main from worktree-rosy-sprouting-cray

Conversation

@robdefeo
Owner

@robdefeo robdefeo commented May 2, 2026

Summary

Both providers were permanently suppressing parallel tool calls
(disable_parallel_tool_use: true / parallel_tool_calls: false), which
kept the implementation correct but prevented the model from issuing
independent tool calls in a single turn. This removes those flags and wires
up the full stack to handle parallel calls properly.

LlmResponse.tool_calls is now a Vec (empty = final answer, N entries = N tool calls);
ConversationTurn::AssistantToolCall carries calls: Vec<ToolCall>;
EntryContent::ToolCall stores the whole batch as one memory entry.
The Anthropic client merges consecutive ToolResult turns into a single
user message (required by the API). The agent loop dispatches all calls
concurrently via futures::join_all and stores individual result entries.
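
A minimal sketch of the response shape described above, under stated assumptions: `arguments` is a plain JSON string here so the example needs no external crates (the real field is probably a parsed JSON value), and the `content` field and `is_final_answer` helper are hypothetical names used only for illustration.

```rust
/// One tool invocation requested by the model.
/// Assumption: `arguments` holds raw JSON text to keep the sketch std-only.
#[derive(Debug, Clone, PartialEq)]
pub struct ToolCall {
    pub id: String,
    pub name: String,
    pub arguments: String,
}

/// Model response. An empty `tool_calls` means the model gave a final answer;
/// N entries mean N tool calls requested in the same turn.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct LlmResponse {
    /// Hypothetical field holding any text the model produced.
    pub content: Option<String>,
    pub tool_calls: Vec<ToolCall>,
}

impl LlmResponse {
    /// Hypothetical helper: true when there is nothing to dispatch.
    pub fn is_final_answer(&self) -> bool {
        self.tool_calls.is_empty()
    }
}
```

With a Vec there is no separate "no tool call" variant to match on: callers can iterate `tool_calls` unconditionally and treat the empty case as the final answer.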

Closes #39

- LlmResponse.tool_call (Option) replaced by tool_calls (Vec) — empty
  means final answer, N entries means parallel tool use
- ConversationTurn::AssistantToolCall now carries calls: Vec<ToolCall>
  instead of a single id/name/arguments triple
- EntryContent::ToolCall stores calls: Vec<MemoryToolCall> so a whole
  parallel batch is saved as one assistant memory entry
- Anthropic client: disable_parallel_tool_use removed; parse_blocks
  collects all ToolUse blocks; streaming accumulates per block index via
  BTreeMap; consecutive ToolResult turns merged into one user message
  (required by the API for parallel results)
- OpenAI client: parallel_tool_calls field removed (API defaults to
  enabled); complete() and streaming accumulator both collect all calls
- Agent loop dispatches all tool calls concurrently via join_all,
  storing one ToolCall batch entry and one ToolResult entry per result
- futures crate added to workspace for join_all
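
The merging of consecutive ToolResult turns mentioned for the Anthropic client can be sketched roughly as below. The `Turn` and `Message` enums and the `merge_tool_results` function are hypothetical simplifications for illustration, not the types used in this PR.

```rust
/// Hypothetical, simplified conversation turns.
#[derive(Debug, Clone, PartialEq)]
pub enum Turn {
    User(String),
    Assistant(String),
    ToolResult { tool_call_id: String, content: String },
}

/// Hypothetical, simplified outgoing messages.
#[derive(Debug, Clone, PartialEq)]
pub enum Message {
    User(String),
    Assistant(String),
    /// One user message carrying every (tool_call_id, content) pair of a batch.
    ToolResults(Vec<(String, String)>),
}

/// Folds each run of consecutive `ToolResult` turns into a single message.
pub fn merge_tool_results(turns: Vec<Turn>) -> Vec<Message> {
    let mut out: Vec<Message> = Vec::new();
    for turn in turns {
        match turn {
            Turn::User(text) => out.push(Message::User(text)),
            Turn::Assistant(text) => out.push(Message::Assistant(text)),
            Turn::ToolResult { tool_call_id, content } => match out.last_mut() {
                // The previous message is already a result batch: extend it.
                Some(Message::ToolResults(batch)) => batch.push((tool_call_id, content)),
                // Otherwise start a new batch.
                _ => out.push(Message::ToolResults(vec![(tool_call_id, content)])),
            },
        }
    }
    out
}
```

Because only adjacent results are merged, results separated by an assistant or user turn stay in separate messages.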
@greptile-apps

greptile-apps Bot commented May 2, 2026

Confidence Score: 4/5

Safe to merge once the streaming error-handling fix is applied; the P1 only surfaces on malformed server-sent JSON in the OpenAI streaming path.

One P1 (silent tool-call drop on parse error in OpenAI streaming) caps the score at 4. All other changes are correct and well-tested, including the Anthropic path, memory model, and agent loop.

crates/llm/src/openai.rs — accumulated_response function around line 262

Sequence Diagram

sequenceDiagram
    participant AL as AgentLoop
    participant LLM as LlmClient
    participant MEM as SessionMemory
    participant T1 as Tool[0]
    participant T2 as Tool[1]

    AL->>LLM: complete(turns, tools)
    LLM-->>AL: LlmResponse { tool_calls: [TC1, TC2] }
    AL->>MEM: add_entry(ToolCall { calls: [TC1, TC2], reasoning })
    par Concurrent dispatch
        AL->>T1: dispatch(TC1.name, TC1.args)
        AL->>T2: dispatch(TC2.name, TC2.args)
    end
    T1-->>AL: result1
    T2-->>AL: result2
    AL->>MEM: add_entry(ToolResult { tool_call_id: TC1.id, content: result1 })
    AL->>MEM: add_entry(ToolResult { tool_call_id: TC2.id, content: result2 })
    Note over AL,MEM: On next turn, Anthropic merges consecutive ToolResult turns into one user message
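
The memory layout in the diagram (one batch entry, then one result entry per call) can be sketched as follows. This is an illustration under stated assumptions: the PR dispatches with `futures::join_all`, whereas this sketch swaps in `std::thread::scope` so it stays std-only, and `Entry`, `dispatch`, and `run_batch` are hypothetical names.

```rust
use std::thread;

#[derive(Debug, Clone, PartialEq)]
pub struct ToolCall {
    pub id: String,
    pub name: String,
    /// Assumption: raw JSON text instead of a parsed value.
    pub arguments: String,
}

/// Hypothetical, simplified memory entries.
#[derive(Debug, Clone, PartialEq)]
pub enum Entry {
    ToolCall { calls: Vec<ToolCall> },
    ToolResult { tool_call_id: String, content: String },
}

/// Hypothetical stand-in for a real tool dispatcher.
fn dispatch(name: &str, arguments: &str) -> String {
    format!("{name}({arguments})")
}

/// Stores the whole batch as one entry, runs every call concurrently,
/// then stores one result entry per call in the original call order.
pub fn run_batch(calls: Vec<ToolCall>, memory: &mut Vec<Entry>) {
    memory.push(Entry::ToolCall { calls: calls.clone() });

    let results = thread::scope(|s| {
        // Spawn every call first so they all run at the same time.
        let handles: Vec<_> = calls
            .iter()
            .map(|c| s.spawn(move || dispatch(&c.name, &c.arguments)))
            .collect();
        // Joining in spawn order keeps results aligned with `calls`.
        handles
            .into_iter()
            .map(|h| h.join().expect("tool thread panicked"))
            .collect::<Vec<String>>()
    });

    for (call, content) in calls.into_iter().zip(results) {
        memory.push(Entry::ToolResult { tool_call_id: call.id, content });
    }
}
```

Like `join_all`, joining in spawn order returns results in input order regardless of which call finishes first, so each result can be paired with its `tool_call_id` by position.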

Reviews (1): Last reviewed commit: "feat(llm): support parallel tool calls a..."

Comment thread crates/llm/src/openai.rs
let arguments = if args_json.is_empty() {
    serde_json::json!({})
} else {
    serde_json::from_str(&args_json).ok()?

P1 Silent error swallowing on malformed streamed tool-call JSON

.ok()? inside filter_map converts a parse failure to None, causing the tool call to be silently dropped from tool_calls with no error and no log warning. The Anthropic streaming path (lines 435–436 in anthropic.rs) correctly uses .with_context(...)? on a map(...).collect::<Result<_>>()? chain, which propagates the error to the caller. A bad JSON chunk here will produce a response that appears to have fewer tool calls than the model intended, leading to silent data loss.

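The fix is to have the closure yield a Result for each complete call and collect as Result<Vec<_>>, mirroring the Anthropic pattern. filter_map can stay so that entries missing an id or name are still skipped, but the closure must not apply ? to the parse result (that does not compile in a closure returning Option); it wraps the Result in Some instead:

let tool_calls = accumulator
    .tools
    .into_values()
    .filter_map(|(id, name, args_json)| match (id, name) {
        (Some(id), Some(name)) => {
            // A parse failure becomes an Err item instead of being dropped.
            let arguments = if args_json.is_empty() {
                Ok(serde_json::json!({}))
            } else {
                serde_json::from_str::<serde_json::Value>(&args_json)
                    .context("parsing streamed tool call arguments")
            };
            Some(arguments.map(|arguments| ToolCall { id, name, arguments }))
        }
        // Incomplete entries (missing id or name) are still skipped.
        _ => None,
    })
    .collect::<Result<Vec<_>>>()?;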
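
The difference between the two behaviours can be reproduced in isolation. The sketch below uses integer parsing as a stand-in for `serde_json::from_str`; `parse_lossy` and `parse_strict` are hypothetical names for illustration only.

```rust
use std::num::ParseIntError;

/// Lossy: `.ok()?` turns a parse failure into `None`, and `filter_map`
/// silently drops that item from the output.
pub fn parse_lossy(chunks: &[&str]) -> Vec<i32> {
    chunks
        .iter()
        .filter_map(|s| {
            let n = s.parse::<i32>().ok()?;
            Some(n)
        })
        .collect()
}

/// Strict: collecting into `Result<Vec<_>, _>` stops at the first failure
/// and hands the error back to the caller.
pub fn parse_strict(chunks: &[&str]) -> Result<Vec<i32>, ParseIntError> {
    chunks.iter().map(|s| s.parse::<i32>()).collect()
}
```

Given `["1", "x", "3"]`, the lossy version yields two values with no indication that one was lost, while the strict version returns an error.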

Contributor

Copilot AI left a comment

Pull request overview

Enables end-to-end support for parallel tool calls across the LLM clients, agent loop, and session memory, removing the provider-side flags that previously forced tool calls to be serialized.

Changes:

  • Update core types to represent tool calls as batches (Vec<ToolCall> / Vec<MemoryToolCall>) instead of single calls.
  • Teach OpenAI + Anthropic clients to serialize/parse multiple tool calls, including streaming accumulation.
  • Dispatch tool calls concurrently in the agent loop and store one ToolCall batch entry plus per-call ToolResult entries (with Anthropic ToolResult message merging support).

Reviewed changes

Copilot reviewed 9 out of 10 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| crates/memory/src/lib.rs | Store tool calls as a batch in memory via calls: Vec<MemoryToolCall>. |
| crates/llm/src/lib.rs | Change LlmResponse.tool_call → tool_calls: Vec<ToolCall> and update final-answer logic. |
| crates/llm/src/openai.rs | Remove parallel_tool_calls suppression; add multi-tool-call (incl. streaming) support. |
| crates/llm/src/anthropic.rs | Remove disable_parallel_tool_use; support multiple tool_use blocks and merge consecutive ToolResult turns per API requirements. |
| crates/llm/tests/stub_tests.rs | Update tests for tool_calls semantics and serialization expectations. |
| crates/agent-loop/src/lib.rs | Dispatch tool calls concurrently (join_all) and store batch ToolCall + individual ToolResults. |
| crates/agent-loop/tests/agent_loop_tests.rs | Add coverage for parallel tool call dispatch + memory storage layout. |
| crates/agent-loop/Cargo.toml | Add futures dependency for concurrent dispatch. |
| Cargo.toml | Add workspace dependency on futures. |
| Cargo.lock | Lockfile updates for futures and related crates. |

Comment thread crates/llm/src/openai.rs
Comment on lines 228 to +233
for tool_call in tool_calls {
    // Track and filter by the first tool call's index so parallel
    // tool calls don't contaminate tool_arguments with a second
    // call's JSON, which would produce invalid concatenated JSON.
    if let Some(idx) = tool_call.index {
        match accumulator.tool_block_index {
            None => accumulator.tool_block_index = Some(idx),
            Some(first) if idx != first => {
                warn!(
                    first_index = first,
                    skipped_index = idx,
                    "ignoring parallel tool call delta"
                );
                continue;
            }
            _ => {}
        }
    } else if accumulator.tool_block_index.is_some() {
        // index absent but we already captured a call — treat as belonging to it
    }
    let idx = tool_call.index.unwrap_or(0);
    let entry = accumulator
        .tools
        .entry(idx)
        .or_insert((None, None, String::new()));
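
The per-index accumulation at the end of the hunk above can be sketched in isolation as follows. The `ToolCallDelta` shape and the `push` method are assumptions made for illustration; only the `tools` map keyed by index with an `(id, name, args_json)` tuple mirrors what the diff shows.

```rust
use std::collections::BTreeMap;

/// Hypothetical shape of one streamed tool-call delta.
pub struct ToolCallDelta {
    pub index: Option<u32>,
    pub id: Option<String>,
    pub name: Option<String>,
    pub arguments: Option<String>,
}

/// Accumulates (id, name, args_json) per tool-call index.
/// A BTreeMap keeps the calls ordered by index when they are drained.
#[derive(Default)]
pub struct Accumulator {
    pub tools: BTreeMap<u32, (Option<String>, Option<String>, String)>,
}

impl Accumulator {
    /// Routes each delta to its own slot so argument fragments from
    /// different calls are never concatenated together.
    pub fn push(&mut self, delta: ToolCallDelta) {
        let idx = delta.index.unwrap_or(0);
        let entry = self
            .tools
            .entry(idx)
            .or_insert((None, None, String::new()));
        if let Some(id) = delta.id {
            entry.0 = Some(id);
        }
        if let Some(name) = delta.name {
            entry.1 = Some(name);
        }
        if let Some(fragment) = delta.arguments {
            entry.2.push_str(&fragment);
        }
    }
}
```

Interleaved deltas for indices 0 and 1 end up as two separate argument strings, each of which can then be parsed on its own.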
Comment thread crates/llm/src/openai.rs
Comment on lines +258 to 273
(Some(id), Some(name)) => {
    let arguments = if args_json.is_empty() {
        serde_json::json!({})
    } else {
        serde_json::from_str(&args_json).ok()?
    };
    Some(ToolCall {
        id,
        name,
        arguments,
    })
}
_ => None,
})
.collect();

Development

Successfully merging this pull request may close these issues.

feat(llm): support parallel tool calls instead of always disabling them

2 participants