fix(openai): Only wrap types with `_iterator` for streamed responses #5917
alexander-alderman-webb merged 8 commits into master from
Conversation
Semver Impact of This PR
🟢 Patch (bug fixes)

📋 Changelog Preview
This is how your changes will appear in the changelog.
- New Features ✨: Langchain, Other
- Bug Fixes 🐛: Ci, Openai, Other
- Documentation 📚
- Internal Changes 🔧: Ai, Langchain, Openai, Other
- Other

🤖 This preview updates automatically when you update the PR.
Codecov Results 📊
✅ 13 passed | Total: 13 | Pass Rate: 100% | Execution Time: 10.74s
All tests are passing successfully.
❌ Patch coverage is 0.00%. Project has 14693 uncovered lines.
Files with missing lines (1)

Generated by Codecov Action
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Span never closed for unrecognized streaming response types
- Added a fallback `span.__exit__(None, None, None)` in both streaming output handlers when the response is neither `Stream` nor `AsyncStream`, so manually entered spans are always closed.
Or push these changes by commenting:
@cursor push 52cc4b9c5e
Preview (52cc4b9c5e)
diff --git a/sentry_sdk/integrations/openai.py b/sentry_sdk/integrations/openai.py
--- a/sentry_sdk/integrations/openai.py
+++ b/sentry_sdk/integrations/openai.py
@@ -880,6 +880,8 @@
old_iterator=response._iterator,
finish_span=finish_span,
)
+ elif finish_span:
+ span.__exit__(None, None, None)
def _set_responses_api_output_data(
@@ -937,6 +939,8 @@
old_iterator=response._iterator,
finish_span=finish_span,
)
+ elif finish_span:
+ span.__exit__(None, None, None)
 def _set_embeddings_output_data(

This Bugbot Autofix run was free. To enable autofix for future PRs, go to the Cursor dashboard.
sentry_sdk/integrations/openai.py (Outdated)

if is_streaming_response:
    _set_streaming_completions_api_output_data(
        span, response, kwargs, integration, start_time, finish_span=True
if isinstance(response, Stream):
Do I understand it correctly that the problem was that the is_streaming_response check used to be too broad and it'd also wrap responses that did not have an _iterator?
I'm wondering whether it'd make sense to, in addition to checking whether the class is Stream or AsyncStream, actually check whether the response has an _iterator attribute and bail gracefully (i.e., do not wrap) if not, just in case OpenAI's internals change at some point.
Yes, when you set the `X-Stainless-Raw-Response` header in the request there is an early return:
My understanding is that the object in the early return path is less processed (and therefore has no `_iterator`). The issue arose because litellm always sets the header when requesting a streamed response from OpenAI.
I will add the `hasattr` check for `_iterator` (in this case we don't have a good alternative to depending on the private detail).
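The guard discussed in this thread can be sketched as follows. `Stream` and `maybe_wrap` are illustrative stand-ins for the real openai/sentry_sdk code; the point is combining the `isinstance` check with a defensive `hasattr` check so the integration bails gracefully if the private `_iterator` attribute ever disappears:

```python
class Stream:
    """Stand-in for openai.Stream."""
    def __init__(self, items):
        self._iterator = iter(items)

def maybe_wrap(response):
    """Return a wrapped iterator, or None when wrapping is unsafe."""
    if isinstance(response, Stream) and hasattr(response, "_iterator"):
        def wrapped():
            for chunk in response._iterator:
                yield chunk  # instrumentation would record each chunk here
        return wrapped()
    return None  # bail gracefully: leave the response untouched

# Normal streamed response: wrapping works as expected.
print(list(maybe_wrap(Stream([1, 2]))))  # [1, 2]

# A Stream whose private _iterator is missing is left alone.
raw = Stream([1])
del raw._iterator
print(maybe_wrap(raw))  # None
```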
sentrivana left a comment
Looks very nice. See one comment.


Description
- Only wrap streaming responses when the `_iterator` attribute is defined on the returned object.
- Create separate functions for wrapping synchronous and asynchronous Completions and Responses APIs.
- `ttft` is now a local variable in each wrapper instead of a nonlocal variable.

Tracing
- `LegacyAPIResponse` is intentionally left out of scope.

Issues
Closes #5890
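The `ttft` (time to first token) refactor mentioned in the description can be sketched like this. This is an illustrative example, not the actual SDK code; the wrapper function names are hypothetical:

```python
import time

def wrap_sync_stream(iterator):
    """Wrap a chunk iterator, measuring time to first token locally."""
    def wrapped():
        ttft = None  # local to this wrapper; no `nonlocal` declaration needed
        start = time.monotonic()
        for chunk in iterator:
            if ttft is None:
                ttft = time.monotonic() - start  # first chunk arrived
            yield chunk
        # the real integration would record ttft on the span here
    return wrapped()

chunks = list(wrap_sync_stream(iter(["a", "b"])))
print(chunks)  # ['a', 'b']
```

Keeping `ttft` local means each sync/async wrapper owns its own measurement, avoiding shared mutable state between the outer function and the generator.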
Reminders
- Run `tox -e linters`.
- Use a conventional commit prefix (`feat:`, `fix:`, `ref:`, `meta:`).