Python: fix(core): coalesce streamed code_interpreter_tool_call chunks per call_id (fixes #5793) by hanhan761 · Pull Request #6196 · microsoft/agent-framework

hanhan761 · 2026-05-30T07:13:16Z

Summary

Coalesce streamed code_interpreter_tool_call chunks per call_id so that the finalized response contains one content item per logical code interpreter call instead of hundreds of incremental deltas.

Changes

`_types.py`

Added _coalesce_code_interpreter_tool_calls() — groups code_interpreter_tool_call items by call_id, keeps the chunk with the highest sequence_number (or longest text when sequence metadata is absent), and removes duplicates.
Added _code_interpreter_chunk_is_more_complete() — compares two CI call chunks using sequence_number from additional_properties, falling back to input text length.
Called _coalesce_code_interpreter_tool_calls() from _finalize_response() alongside the existing text and text_reasoning coalescing.
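
The selection rule described above can be sketched on plain dictionaries. This is an illustration only, not the PR's code: the chunk layout (`call_id`, `seq`, `code`) and the names `pick_most_complete` and `_is_more_complete` are hypothetical stand-ins for the framework's `Content` objects and helpers.

```python
from typing import Any, Dict, List


def _is_more_complete(a: Dict[str, Any], b: Dict[str, Any]) -> bool:
    """Return True if chunk 'a' should replace chunk 'b' (hypothetical helper)."""
    # Prefer sequence numbers when both chunks carry one.
    if a["seq"] is not None and b["seq"] is not None:
        return a["seq"] > b["seq"]
    # Otherwise fall back to the longer code text.
    return len(a["code"]) > len(b["code"])


def pick_most_complete(chunks: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return one chunk per call_id, ordered by first appearance of each call_id."""
    best: Dict[str, Dict[str, Any]] = {}
    for chunk in chunks:
        cid = chunk["call_id"]
        current = best.get(cid)
        # Reassigning an existing key keeps its original position in the dict.
        if current is None or _is_more_complete(chunk, current):
            best[cid] = chunk
    return list(best.values())


# Assumed stream shape: each later chunk for a call carries more of the code.
stream = [
    {"call_id": "ci_1", "seq": 1, "code": "import"},
    {"call_id": "ci_1", "seq": 2, "code": "import pandas"},
    {"call_id": "ci_2", "seq": None, "code": "x"},
    {"call_id": "ci_2", "seq": None, "code": "x = 1"},
]
result = pick_most_complete(stream)
print([c["code"] for c in result])  # ['import pandas', 'x = 1']
```

Unlike the PR's helper, this sketch returns a new list rather than mutating its argument, which keeps the rule itself easy to read.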

`test_types.py`

test_coalesce_code_interpreter_tool_calls_keeps_most_complete — streaming deltas + done collapse to the done event
test_coalesce_code_interpreter_tool_calls_groups_by_call_id — multiple distinct call_ids each keep their own winning chunk
test_coalesce_code_interpreter_tool_calls_preserves_non_ci_items — non-CI items are preserved
test_coalesce_code_interpreter_tool_calls_no_sequence_number — fallback to longest text
test_coalesce_code_interpreter_tool_calls_single_call_is_noop — single CI call unchanged

Issue

Fixes #5793

Verification

pytest packages/core/tests/core/test_types.py -k "test_coalesce_code_interpreter" -v

5/5 passed. Existing type tests also pass.

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR adds post-processing to coalesce streamed code_interpreter_tool_call content chunks by call_id, keeping the most complete chunk, and introduces tests covering common coalescing scenarios.

Changes:

Add _coalesce_code_interpreter_tool_calls() and a completeness comparator for CI tool-call chunks.
Invoke CI coalescing during response finalization alongside existing text coalescing.
Add unit tests validating coalescing behavior across same/different call_ids, with and without sequence numbers.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| python/packages/core/agent_framework/_types.py | Adds CI tool-call coalescing and applies it during response finalization. |
| python/packages/core/tests/core/test_types.py | Adds tests for CI tool-call coalescing behavior. |

+def _coalesce_code_interpreter_tool_calls(contents: list[Content]) -> None:
+    """Coalesce code_interpreter_tool_call items with the same call_id, keeping the most complete chunk."""
+    best: dict[str, Content] = {}
+    first_pos: dict[str, int] = {}
+    drop_indices: set[int] = set()
+    for i, content in enumerate(contents):
+        if content.type != "code_interpreter_tool_call" or not content.call_id:
+            continue
+        cid = content.call_id
+        if cid in best:
+            if _code_interpreter_chunk_is_more_complete(content, best[cid]):
+                best[cid] = content
+            drop_indices.add(i)
+        else:
+            best[cid] = content
+            first_pos[cid] = i
+    if not drop_indices:
+        return
+    for cid, content in best.items():
+        contents[first_pos[cid]] = content
+    for idx in sorted(drop_indices, reverse=True):
+        contents.pop(idx)


+def _code_interpreter_chunk_is_more_complete(a: Content, b: Content) -> bool:
+    """Return True if 'a' is more complete than 'b'."""
+    seq_a = a.additional_properties.get("sequence_number")
+    seq_b = b.additional_properties.get("sequence_number")
+    if seq_a is not None and seq_b is not None:
+        return seq_a > seq_b
+    len_a = len(a.inputs[0].text) if a.inputs else 0
+    len_b = len(b.inputs[0].text) if b.inputs else 0
+    return len_a > len_b
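
To see how the two functions in this hunk behave, the following harness transcribes them from the diff and runs them against a stand-in `Content` dataclass. The dataclass and the `ci()` helper are hypothetical and exist only for this illustration; the real `Content` type in `agent_framework` has more fields. The last few lines show one edge case that appears to follow from the code as quoted: comparing a string `sequence_number` with an integer one raises `TypeError`.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class Content:
    """Stand-in for the framework's Content type (illustration only)."""
    type: str
    call_id: str | None = None
    text: str | None = None
    inputs: list[Content] = field(default_factory=list)
    additional_properties: dict[str, Any] = field(default_factory=dict)


def _code_interpreter_chunk_is_more_complete(a: Content, b: Content) -> bool:
    """Return True if 'a' is more complete than 'b' (transcribed from the diff)."""
    seq_a = a.additional_properties.get("sequence_number")
    seq_b = b.additional_properties.get("sequence_number")
    if seq_a is not None and seq_b is not None:
        return seq_a > seq_b
    len_a = len(a.inputs[0].text) if a.inputs else 0
    len_b = len(b.inputs[0].text) if b.inputs else 0
    return len_a > len_b


def _coalesce_code_interpreter_tool_calls(contents: list[Content]) -> None:
    """Keep one chunk per call_id, in place (transcribed from the diff)."""
    best: dict[str, Content] = {}
    first_pos: dict[str, int] = {}
    drop_indices: set[int] = set()
    for i, content in enumerate(contents):
        if content.type != "code_interpreter_tool_call" or not content.call_id:
            continue
        cid = content.call_id
        if cid in best:
            if _code_interpreter_chunk_is_more_complete(content, best[cid]):
                best[cid] = content
            drop_indices.add(i)  # every repeat of a call_id is dropped
        else:
            best[cid] = content
            first_pos[cid] = i
    if not drop_indices:
        return
    for cid, content in best.items():
        contents[first_pos[cid]] = content  # winner takes the first slot
    for idx in sorted(drop_indices, reverse=True):
        contents.pop(idx)


def ci(call_id: str, code: str, seq: Any = None) -> Content:
    """Hypothetical helper that builds a code interpreter chunk."""
    props = {} if seq is None else {"sequence_number": seq}
    return Content("code_interpreter_tool_call", call_id=call_id,
                   inputs=[Content("text", text=code)], additional_properties=props)


contents = [
    Content("text", text="before"),
    ci("a", "imp", seq=1),
    ci("b", "b1", seq=5),
    ci("a", "import pandas", seq=3),
    Content("text", text="after"),
]
_coalesce_code_interpreter_tool_calls(contents)
summary = [(c.call_id, c.inputs[0].text if c.inputs else c.text) for c in contents]
print(summary)  # [(None, 'before'), ('a', 'import pandas'), ('b', 'b1'), (None, 'after')]

# Edge case: mixed str/int sequence numbers cannot be ordered with '>'.
mixed_types_raise = False
try:
    _code_interpreter_chunk_is_more_complete(ci("a", "x", seq="2"), ci("a", "y", seq=1))
except TypeError:
    mixed_types_raise = True
```

The winning chunk for `a` lands where the first `a` chunk was, so surrounding non-CI items keep their relative order.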


    assert contents[1].text == "Thinking B1 B2"


+def test_coalesce_code_interpreter_tool_calls_keeps_most_complete():


+    assert contents[0].inputs[0].text == "import pandas"
+
+
+def test_coalesce_code_interpreter_tool_calls_groups_by_call_id():


+    assert contents[1].inputs[0].text == "b1"
+
+
+def test_coalesce_code_interpreter_tool_calls_preserves_non_ci_items():


+    assert contents[2].text == "after"
+
+
+def test_coalesce_code_interpreter_tool_calls_no_sequence_number():


+    assert contents[0].inputs[0].text == "longer_script"
+
+
+def test_coalesce_code_interpreter_tool_calls_single_call_is_noop():


…ll_id (fixes microsoft#5793)

- _coalesce_code_interpreter_tool_calls() groups by call_id, keeps winner at its original position
- _code_interpreter_chunk_is_more_complete() prefers valid sequence_number, coerces to int
- _get_ci_chunk_content_length() sums text across all inputs
- _try_parse_seq() handles string-typed sequence_number safely
- 8 regression tests covering edge cases

Copilot AI review requested due to automatic review settings May 30, 2026 07:13

moonbox3 added the python label May 30, 2026

Copilot AI reviewed May 30, 2026

hanhan761 force-pushed the fix-5793-coalesce-ci-calls branch from abd5f01 to bc6c61e Compare May 30, 2026 07:25

hanhan761 wants to merge 1 commit into microsoft:main from hanhan761:fix-5793-coalesce-ci-calls

3 participants
