
Python: fix CosmosHistoryProvider saves code interpreter tool calls chunk by chunk (#5793) #6192

Open
hanhan761 wants to merge 2 commits into microsoft:main from hanhan761:fix-5793-aggregate-ci-chunks


@hanhan761

Summary

Fix #5793: During streaming, code interpreter tool calls are received as multiple content items with the same `call_id` but different `sequence_number` values. Before persisting to Cosmos DB, consecutive `code_interpreter_tool_call` chunks with matching `call_id` are now merged into a single `Content` item with the complete, aggregated text.
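The merge rule described above can be sketched roughly as follows. This is an illustration only: `Chunk` is a hypothetical stand-in for the framework's `Content` class, `aggregate_consecutive` is not the helper added in this PR, and the flat `text` field is an assumption made to keep the example short.

```python
from dataclasses import dataclass
from typing import List, Optional

CI_TYPE = "code_interpreter_tool_call"


@dataclass
class Chunk:
    """Hypothetical stand-in for a streamed content item (not the real Content class)."""

    type: str
    call_id: Optional[str] = None
    text: str = ""


def aggregate_consecutive(contents: List[Chunk]) -> List[Chunk]:
    """Merge runs of adjacent code interpreter chunks that share a call_id."""
    result: List[Chunk] = []
    for item in contents:
        prev = result[-1] if result else None
        mergeable = (
            prev is not None
            and item.type == CI_TYPE
            and prev.type == CI_TYPE
            and item.call_id is not None
            and prev.call_id == item.call_id
        )
        if mergeable:
            # Replace the tail with a new object so the input chunks stay untouched.
            result[-1] = Chunk(prev.type, prev.call_id, prev.text + item.text)
        else:
            result.append(item)
    return result


stream = [
    Chunk("text", text="Running code."),
    Chunk(CI_TYPE, "call_1", "import math\n"),
    Chunk(CI_TYPE, "call_1", "print(math.pi)"),
    Chunk(CI_TYPE, "call_2", "print(1)"),
]
merged = aggregate_consecutive(stream)
```

In this sketch the two `call_1` chunks collapse into one item, while the `call_2` chunk and the plain text item pass through unchanged.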

Changes

`python/packages/azure-cosmos/agent_framework_azure_cosmos/_history_provider.py`

  • Added `_merge_code_interpreter_chunks()` helper function to merge a list of `code_interpreter_tool_call` chunks into one `Content` item, concatenating the text inputs and merging `additional_properties`.
  • Added `CosmosHistoryProvider._aggregate_code_interpreter_calls()` static method that scans `contents` for consecutive `code_interpreter_tool_call` items with the same `call_id` and merges them.
  • Updated `save_messages()` to run contents through the aggregation before serializing to Cosmos DB.

`python/packages/azure-cosmos/tests/test_cosmos_history_provider.py`

Added `TestCodeInterpreterAggregation` class with tests:

  • `test_merge_code_interpreter_chunks`: verifies chunk text concatenation
  • `test_aggregate_code_interpreter_calls_merges_consecutive_chunks`: verifies full aggregation pipeline with mixed content types
  • `test_aggregate_preserves_independent_tool_calls`: verifies different `call_id` values are not merged
  • `test_save_messages_calls_aggregation`: verifies `save_messages` calls aggregation and saves aggregated content
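As a rough illustration of how the last test in this list could be structured, the sketch below checks that a save path writes aggregated content to a mocked store. Everything here is assumed for the example: `ToyProvider`, its `_aggregate` method, the dict-based message layout, and the `write_item` method on the mock are hypothetical and do not reflect the real `CosmosHistoryProvider` or the Cosmos SDK.

```python
import asyncio
from unittest.mock import AsyncMock

CI_TYPE = "code_interpreter_tool_call"


class ToyProvider:
    """Hypothetical provider; only the shape of save_messages matters here."""

    def __init__(self, container):
        self._container = container

    @staticmethod
    def _aggregate(contents):
        merged = []
        for item in contents:
            same_call = (
                bool(merged)
                and item["type"] == CI_TYPE
                and merged[-1]["type"] == CI_TYPE
                and item.get("call_id") is not None
                and merged[-1].get("call_id") == item.get("call_id")
            )
            if same_call:
                # Build a new dict rather than editing the previous one in place.
                merged[-1] = {**merged[-1], "text": merged[-1]["text"] + item["text"]}
            else:
                merged.append(dict(item))
        return merged

    async def save_messages(self, session_id, messages):
        for message in messages:
            doc = {"session_id": session_id, "contents": self._aggregate(message["contents"])}
            # write_item is a made-up method name on the mocked store.
            await self._container.write_item(doc)


async def check_save_aggregates():
    container = AsyncMock()
    provider = ToyProvider(container)
    message = {
        "contents": [
            {"type": CI_TYPE, "call_id": "c1", "text": "x = 1\n"},
            {"type": CI_TYPE, "call_id": "c1", "text": "print(x)"},
        ]
    }
    await provider.save_messages("s1", [message])
    container.write_item.assert_awaited_once()
    return container.write_item.await_args.args[0], message


saved, original_message = asyncio.run(check_save_aggregates())
```

Under pytest with the pytest-asyncio plugin, the same check would typically be an `async def test_...` function marked with `@pytest.mark.asyncio` instead of being driven by `asyncio.run`.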

Issue

Fixes #5793

…ider (microsoft#5793)

During streaming, code interpreter output arrives as multiple content items with the same call_id but different sequence_number values. Before persisting to Cosmos DB, consecutive code_interpreter_tool_call chunks with matching call_id are now merged into a single Content item with the complete, aggregated text.
Copilot AI review requested due to automatic review settings May 30, 2026 07:08
Contributor

Copilot AI left a comment


Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds aggregation logic to CosmosHistoryProvider so that streamed code_interpreter_tool_call chunks (which arrive as multiple Content items sharing the same call_id) are merged into a single content item before being persisted to Cosmos DB.

Changes:

  • Introduces _merge_code_interpreter_chunks helper and CosmosHistoryProvider._aggregate_code_interpreter_calls static method.
  • Invokes aggregation inside save_messages prior to serializing each message.
  • Adds unit tests covering merging, non-merging of distinct call_ids, and end-to-end behavior through save_messages.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| `python/packages/azure-cosmos/agent_framework_azure_cosmos/_history_provider.py` | Implements chunk merging helper and aggregation hook in save_messages. |
| `python/packages/azure-cosmos/tests/test_cosmos_history_provider.py` | Adds TestCodeInterpreterAggregation test class verifying merge logic. |

Comment on lines +239 to +240

    if message.contents:
        message.contents = self._aggregate_code_interpreter_calls(message.contents)

Comment on lines +31 to +34

    for chunk in chunks:
        for inp in (chunk.inputs or []):
            if inp.type == "text" and inp.text:
                all_text_parts.append(inp.text)

Comment on lines +35 to +36

    if chunk.additional_properties:
        merged_additional_properties.update(chunk.additional_properties)

Comment on lines +487 to +491

    def test_save_messages_calls_aggregation(self) -> None:
        """save_messages aggregates code interpreter chunks before saving."""
        from unittest.mock import AsyncMock, patch, MagicMock
        from agent_framework import Content, Message
        from agent_framework_azure_cosmos._history_provider import CosmosHistoryProvider

…st style

- save_messages: build message dict directly instead of mutating caller's Message
- _merge_code_interpreter_chunks: preserve non-text inputs, drop sequence_number
- Tests: module-level imports, async test methods, add coverage for new behaviors
@hanhan761
Author

All 4 review comments have been addressed in commit 2ae1e66:

  1. `save_messages` no longer mutates caller's Message: it now builds the message dict directly from `message.to_dict()` with aggregated contents, leaving the original `Message` untouched.
  2. Non-text inputs are preserved: `_merge_code_interpreter_chunks` now collects non-text inputs (e.g. images) and appends them to `merged_inputs` alongside the concatenated text.
  3. `sequence_number` is dropped from merged properties: the per-chunk `sequence_number` key is explicitly skipped when merging `additional_properties`, since the aggregated item no longer represents a single chunk.
  4. Tests converted to async style: imports moved to module level, `@pytest.mark.asyncio` used instead of `asyncio.run`, added new test cases for immutability, non-text inputs, and sequence_number behavior.
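The first three points can be illustrated with a small, self-contained sketch. It is not the code from commit 2ae1e66: `merge_chunks` is a hypothetical function, the plain-dict chunk layout and the `container_id` property are assumptions made for the example, and placing the concatenated text ahead of the non-text inputs is one possible ordering.

```python
import copy
from typing import Any, Dict, List

CI_TYPE = "code_interpreter_tool_call"


def merge_chunks(chunks: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Merge chunks of one call into a new dict without touching the inputs."""
    text_parts: List[str] = []
    other_inputs: List[Dict[str, Any]] = []
    props: Dict[str, Any] = {}
    for chunk in chunks:
        for inp in chunk.get("inputs") or []:
            if inp.get("type") == "text":
                if inp.get("text"):
                    text_parts.append(inp["text"])
            else:
                # Keep non-text inputs (e.g. images) instead of dropping them.
                other_inputs.append(copy.deepcopy(inp))
        for key, value in (chunk.get("additional_properties") or {}).items():
            # sequence_number describes a single chunk, so it is skipped.
            if key != "sequence_number":
                props[key] = value
    merged_inputs: List[Dict[str, Any]] = []
    if text_parts:
        merged_inputs.append({"type": "text", "text": "".join(text_parts)})
    merged_inputs.extend(other_inputs)
    return {
        "type": CI_TYPE,
        "call_id": chunks[0].get("call_id"),
        "inputs": merged_inputs,
        "additional_properties": props,
    }


chunks = [
    {
        "type": CI_TYPE,
        "call_id": "c1",
        "inputs": [{"type": "text", "text": "a = 1\n"}],
        "additional_properties": {"sequence_number": 1, "container_id": "cntr"},
    },
    {
        "type": CI_TYPE,
        "call_id": "c1",
        "inputs": [{"type": "text", "text": "print(a)"}, {"type": "image", "name": "plot"}],
        "additional_properties": {"sequence_number": 2},
    },
]
snapshot = copy.deepcopy(chunks)
merged = merge_chunks(chunks)
```

Because the function only reads from `chunks` and deep-copies what it keeps, the caller's data compares equal to `snapshot` after the call, which is the kind of immutability check the new tests are described as covering.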


Development

Successfully merging this pull request may close these issues.

Python: CosmosHistoryProvider Code interpreter tool calls are saved chunk by chunk
