fix(openai): Only wrap types with _iterator for streamed responses #5917

Merged

alexander-alderman-webb merged 8 commits into master from webb/openai/only-wrap-streams on Mar 31, 2026
Conversation

@alexander-alderman-webb (Contributor) commented Mar 30, 2026

Description

Only wrap streaming responses when the _iterator attribute is defined on the returned object.
Create separate functions for wrapping synchronous and asynchronous Completions and Responses APIs.

  • ttft is now a local variable in each wrapper instead of a nonlocal variable.

Tracing LegacyAPIResponse is intentionally left out of scope.
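
Illustratively, a condensed sketch of the new guard (simplified, with assumed names and attribute keys; the actual change splits this into separate sync and async wrappers for the Completions and Responses APIs):

```python
import time

from openai import Stream


def _wrap_streaming_response(span, response, start_time, finish_span=True):
    # Hypothetical condensed sketch; the PR defines separate sync/async
    # variants of this for both the Completions and Responses APIs.
    if not (isinstance(response, Stream) and hasattr(response, "_iterator")):
        # e.g. a LegacyAPIResponse from the raw-response code path:
        # return it untouched instead of wrapping it.
        return response

    old_iterator = response._iterator

    def instrumented():
        # ttft is now a local of each wrapper, not a shared nonlocal.
        ttft = None
        for event in old_iterator:
            if ttft is None:
                ttft = time.time() - start_time
                # Assumed attribute key, for illustration only.
                span.set_data("gen_ai.response.time_to_first_token", ttft)
            yield event
        if finish_span:
            span.__exit__(None, None, None)

    response._iterator = instrumented()
    return response
```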

Issues

Closes #5890

github-actions bot (Contributor) commented Mar 30, 2026

Semver Impact of This PR

🟢 Patch (bug fixes)

📋 Changelog Preview

This is how your changes will appear in the changelog.
Entries from this PR are highlighted with a left border (blockquote style).


New Features ✨

Langchain

  • Set gen_ai.operation.name and gen_ai.pipeline.name on LLM spans by ericapisani in #5849
  • Broaden AI provider detection beyond OpenAI and Anthropic by ericapisani in #5707
  • Update LLM span operation to gen_ai.generate_text by ericapisani in #5796

Bug Fixes 🐛

Ci

  • Update validate-pr action to remove draft enforcement by stephanie-anderson in #5918
  • Use gh CLI to convert PR to draft by stephanie-anderson in #5874
  • Use GitHub App token for draft PR enforcement by stephanie-anderson in #5871

Openai

  • Only wrap types with _iterator for streamed responses by alexander-alderman-webb in #5917
  • Always set gen_ai.response.streaming for Responses by alexander-alderman-webb in #5697
  • Simplify Responses input handling by alexander-alderman-webb in #5695
  • Use max_output_tokens for Responses API by alexander-alderman-webb in #5693
  • Always set gen_ai.response.streaming for Completions by alexander-alderman-webb in #5692
  • Simplify Completions input handling by alexander-alderman-webb in #5690
  • Simplify embeddings input handling by alexander-alderman-webb in #5688

Other

  • (google-genai) Guard response extraction by alexander-alderman-webb in #5869
  • (workflow) Fix permission issue with github app and PR draft graphql endpoint by Jeffreyhung in #5887
  • Add cycle detection to exceptions_from_error by ericapisani in #5880

Documentation 📚

  • Update CONTRIBUTING.md with contribution requirements and TOC by stephanie-anderson in #5896

Internal Changes 🔧

Ai

  • Remove unused GEN_AI_PIPELINE operation constant by ericapisani in #5886
  • Rename generate_text to text_completion by ericapisani in #5885

Langchain

  • Add text completion test by alexander-alderman-webb in #5740
  • Add tool execution test by alexander-alderman-webb in #5739
  • Add basic agent test with Responses call by alexander-alderman-webb in #5726
  • Replace mocks with httpx types by alexander-alderman-webb in #5724
  • Consolidate span origin assertion by alexander-alderman-webb in #5723
  • Consolidate available tools assertion by alexander-alderman-webb in #5721

Openai

  • Replace mocks with httpx types for streaming Responses by alexander-alderman-webb in #5882
  • Replace mocks with httpx types for streaming Completions by alexander-alderman-webb in #5879
  • Move input handling code into API-specific functions by alexander-alderman-webb in #5687

Other

  • (asyncpg) Normalize query whitespace in integration by ericapisani in #5855
  • 🤖 Update test matrix with new releases (03/30) by github-actions in #5912
  • Merge PR validation workflows and add reason-specific labels by stephanie-anderson in #5898
  • Add workflow to close unvetted non-maintainer PRs by stephanie-anderson in #5895
  • Exclude compromised litellm versions by alexander-alderman-webb in #5876
  • Reactivate litellm tests by alexander-alderman-webb in #5853
  • Add note to coordinate with assignee before PR submission by sentrivana in #5868
  • Temporarily stop running litellm tests by alexander-alderman-webb in #5851

Other

  • ci+docs: Add draft PR enforcement by stephanie-anderson in #5867

🤖 This preview updates automatically when you update the PR.

github-actions bot (Contributor) commented Mar 30, 2026

Codecov Results 📊

13 passed | Total: 13 | Pass Rate: 100% | Execution Time: 10.74s

All tests are passing successfully.

❌ Patch coverage is 0.00%. Project has 14693 uncovered lines.

Files with missing lines (1)

File        Patch %    Missing lines
openai.py   4.63% ⚠️   597

Generated by Codecov Action

alexander-alderman-webb marked this pull request as ready for review March 30, 2026 09:30
alexander-alderman-webb requested a review from a team as a code owner March 30, 2026 09:30
alexander-alderman-webb marked this pull request as draft March 30, 2026 09:41
@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: Span never closed for unrecognized streaming response types
    • Added a fallback span.__exit__(None, None, None) in both streaming output handlers when the response is neither Stream nor AsyncStream so manually-entered spans are always closed.

Or push these changes by commenting:

@cursor push 52cc4b9c5e
Preview (52cc4b9c5e)
diff --git a/sentry_sdk/integrations/openai.py b/sentry_sdk/integrations/openai.py
--- a/sentry_sdk/integrations/openai.py
+++ b/sentry_sdk/integrations/openai.py
@@ -880,6 +880,8 @@
             old_iterator=response._iterator,
             finish_span=finish_span,
         )
+    elif finish_span:
+        span.__exit__(None, None, None)
 
 
 def _set_responses_api_output_data(
@@ -937,6 +939,8 @@
             old_iterator=response._iterator,
             finish_span=finish_span,
         )
+    elif finish_span:
+        span.__exit__(None, None, None)
 
 
 def _set_embeddings_output_data(


alexander-alderman-webb marked this pull request as ready for review March 30, 2026 11:40
The inline review comment references this hunk:

-        if is_streaming_response:
-            _set_streaming_completions_api_output_data(
-                span, response, kwargs, integration, start_time, finish_span=True
+        if isinstance(response, Stream):
Contributor

Do I understand it correctly that the problem was that the is_streaming_response check used to be too broad and it'd also wrap responses that did not have an _iterator?

I'm wondering whether it'd make sense to, in addition to checking whether the class is Stream or AsyncStream, actually check whether the response has an _iterator attribute and bail gracefully (i.e., do not wrap) if not, just in case OpenAI's internals change at some point.
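
A minimal sketch of what that defensive guard could look like on the async path (assumed names, not the PR's exact code): even when the type matches, verify the private attribute and bail gracefully while still closing the span.

```python
from openai import AsyncStream


def _wrap_streaming_response_async(span, response, finish_span=True):
    # Hypothetical sketch of the suggested guard: require the private
    # _iterator attribute in addition to the type check, and bail out
    # (still closing the span) if openai-python's internals change.
    if not (isinstance(response, AsyncStream) and hasattr(response, "_iterator")):
        if finish_span:
            span.__exit__(None, None, None)
        return response

    old_iterator = response._iterator

    async def instrumented():
        async for event in old_iterator:
            yield event
        if finish_span:
            span.__exit__(None, None, None)

    response._iterator = instrumented()
    return response
```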

Contributor Author

Yes, when you set the X-Stainless-Raw-Response header in the request there is an early return:

https://github.com/openai/openai-python/blob/58184ad545ee2abd98e171ee09766f259d7f38cd/src/openai/_base_client.py#L1162

My understanding is that the object in the early return path is less processed (and therefore has no _iterator). The issue arose because litellm always sets the header when requesting a streamed response from OpenAI.

I will add the hasattr(response, "_iterator") check (in this case we don't have a good alternative to depending on this private detail).
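
For context, a sketch of how this surfaces without litellm, using openai-python's raw-response client (assuming it takes the same early-return path; the model name is illustrative):

```python
from openai import OpenAI

client = OpenAI()

# with_raw_response sets the X-Stainless-Raw-Response header, so the
# client returns a LegacyAPIResponse (which has no _iterator) instead
# of a fully processed Stream, the same shape litellm triggers.
raw = client.chat.completions.with_raw_response.create(
    model="gpt-4o-mini",  # example model, not from the PR
    messages=[{"role": "user", "content": "Hello"}],
    stream=True,
)

# The integration must leave `raw` unwrapped; the real Stream only
# appears once the response is parsed.
stream = raw.parse()
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")
```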

@sentrivana (Contributor) left a comment

Looks very nice. See one comment.

alexander-alderman-webb merged commit ae28669 into master on Mar 31, 2026
158 checks passed
alexander-alderman-webb deleted the webb/openai/only-wrap-streams branch March 31, 2026 09:21

Development

Successfully merging this pull request may close these issues.

OpenAIIntegration breaks litellm streaming — LegacyAPIResponse wraps Stream objects
