fix(openai, Anthropic) fixed finish_reasons OTel semconv compliance gaps #3916
max-deygin-traceloop wants to merge 11 commits into main
Conversation
…, Anthropic, LangChain

Aligns all packages with the Bedrock convention: when finish_reason is unknown/None, use "" (empty string) instead of fabricating "stop" (OpenAI) or omitting the key entirely (Anthropic, LangChain).

- OpenAI chat_wrappers: `or "stop"` → `or ""`
- OpenAI completion_wrappers: `or "stop"` → `or ""`
- Anthropic span_utils: conditional `if mapped` → always set with `""` fallback (3 spots)
- LangChain span_utils: conditional `if fr` → always include `finish_reason` with `""` fallback
- Updated tests in all three packages to verify the new behavior
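The whole diff reduces to one fallback rule. A minimal sketch of the convention follows; the function name and mapping table here are illustrative stand-ins, not the exact code in `span_utils.py`:

```python
# Illustrative subset of an Anthropic -> OTel GenAI semconv mapping;
# the real table lives inside the instrumentation packages.
ANTHROPIC_TO_SEMCONV = {
    "end_turn": "stop",
    "max_tokens": "length",
    "tool_use": "tool_call",
}


def map_finish_reason(raw_reason):
    """Bedrock-style convention: unknown/None becomes "" instead of a
    fabricated "stop" (old OpenAI behavior) or a missing key (old
    Anthropic/LangChain behavior)."""
    if not raw_reason:
        return ""
    # Known reasons are mapped; unknown reasons pass through unchanged.
    return ANTHROPIC_TO_SEMCONV.get(raw_reason, raw_reason)


assert map_finish_reason(None) == ""
assert map_finish_reason("end_turn") == "stop"
assert map_finish_reason("brand_new_reason") == "brand_new_reason"
```

The empty-string fallback lets consumers distinguish "provider said stop" from "provider said nothing", which a fabricated `"stop"` value cannot.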
Important: Review skipped. Too many files! This PR contains 164 files, which is 14 over the limit of 150.

⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro

📒 Files selected for processing (164)
📝 Walkthrough

Standardizes finish_reason handling across Anthropic, LangChain, and OpenAI instrumentation: output message objects always include a `finish_reason` field, defaulting to "" when the provider does not supply one.

Changes
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@packages/opentelemetry-instrumentation-anthropic/tests/test_semconv_span_attrs.py`:
- Around line 398-404: The test currently uses "if raw:" which silently passes
when GenAIAttributes.GEN_AI_OUTPUT_MESSAGES is missing; change it to assert the
attribute exists (e.g., assert raw is not None or assert
GenAIAttributes.GEN_AI_OUTPUT_MESSAGES in span.attributes) before loading JSON
so the test fails on missing output messages, then keep the json.loads(raw) and
the existing finish_reason assertion; refer to the variable raw and the constant
GenAIAttributes.GEN_AI_OUTPUT_MESSAGES in test_semconv_span_attrs.py to locate
where to add the presence assertion.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 9ab06105-7ac8-4624-b334-bb0b979285e1
📒 Files selected for processing (7)
- packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py
- packages/opentelemetry-instrumentation-anthropic/tests/test_semconv_span_attrs.py
- packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py
- packages/opentelemetry-instrumentation-langchain/tests/test_finish_reasons.py
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/chat_wrappers.py
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/completion_wrappers.py
- packages/opentelemetry-instrumentation-openai/tests/traces/test_semconv_compliance.py
```python
raw = span.attributes.get(GenAIAttributes.GEN_AI_OUTPUT_MESSAGES)
if raw:
    output = json.loads(raw)
    assert output[0]["finish_reason"] == "", (
        f"Expected '' for missing streaming finish_reason, got '{output[0]['finish_reason']}'"
    )
```
Make the streaming fallback test fail when output messages are missing.
At line 399, `if raw:` makes the test pass even if `GEN_AI_OUTPUT_MESSAGES` is never written. Assert presence first so regressions are caught.
✅ Suggested test hardening

```diff
-raw = span.attributes.get(GenAIAttributes.GEN_AI_OUTPUT_MESSAGES)
-if raw:
-    output = json.loads(raw)
-    assert output[0]["finish_reason"] == "", (
-        f"Expected '' for missing streaming finish_reason, got '{output[0]['finish_reason']}'"
-    )
+raw = span.attributes.get(GenAIAttributes.GEN_AI_OUTPUT_MESSAGES)
+assert raw is not None, "GEN_AI_OUTPUT_MESSAGES must be set for streaming text events"
+output = json.loads(raw)
+assert output[0]["finish_reason"] == "", (
+    f"Expected '' for missing streaming finish_reason, got '{output[0]['finish_reason']}'"
+)
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
raw = span.attributes.get(GenAIAttributes.GEN_AI_OUTPUT_MESSAGES)
assert raw is not None, "GEN_AI_OUTPUT_MESSAGES must be set for streaming text events"
output = json.loads(raw)
assert output[0]["finish_reason"] == "", (
    f"Expected '' for missing streaming finish_reason, got '{output[0]['finish_reason']}'"
)
```
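To see the hazard the reviewer is pointing at in isolation: `span.attributes.get` returns `None` when the attribute was never written, so a guarded body is skipped entirely and no assertion runs. A standalone sketch, with a plain dict standing in for `span.attributes`:

```python
import json

# Stand-in for span.attributes when GEN_AI_OUTPUT_MESSAGES was never written.
attributes = {}

raw = attributes.get("gen_ai.output.messages")

# Guarded version: raw is None, so the whole block is skipped and the
# "test" passes vacuously.
checked = False
if raw:
    output = json.loads(raw)
    checked = output[0]["finish_reason"] == ""
assert checked is False  # nothing was actually verified

# Hardened version: fails loudly the moment the attribute goes missing.
try:
    assert raw is not None, "gen_ai.output.messages must be set"
except AssertionError as exc:
    print(f"hardened check fails loudly: {exc}")
```

The attribute key string here is illustrative; the real tests use the `GenAIAttributes.GEN_AI_OUTPUT_MESSAGES` constant.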
♻️ Duplicate comments (1)
packages/opentelemetry-instrumentation-anthropic/tests/test_semconv_span_attrs.py (1)
398-404: ⚠️ Potential issue | 🟡 Minor: Test silently passes if output messages are missing.

The `if raw:` guard means the test will pass even if `GEN_AI_OUTPUT_MESSAGES` is never written. Assert presence first to catch regressions.

✅ Suggested fix

```diff
 raw = span.attributes.get(GenAIAttributes.GEN_AI_OUTPUT_MESSAGES)
-if raw:
-    output = json.loads(raw)
-    assert output[0]["finish_reason"] == "", (
-        f"Expected '' for missing streaming finish_reason, got '{output[0]['finish_reason']}'"
-    )
+assert raw is not None, "GEN_AI_OUTPUT_MESSAGES must be set for streaming text events"
+output = json.loads(raw)
+assert output[0]["finish_reason"] == "", (
+    f"Expected '' for missing streaming finish_reason, got '{output[0]['finish_reason']}'"
+)
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-anthropic/tests/test_semconv_span_attrs.py` around lines 398-404: the test currently guards with `if raw:` so it silently passes when GenAIAttributes.GEN_AI_OUTPUT_MESSAGES is missing; change it to explicitly assert presence before parsing (assert `GenAIAttributes.GEN_AI_OUTPUT_MESSAGES in span.attributes` or `raw is not None`), then call `json.loads(raw)` and assert `output[0]["finish_reason"] == ""`, ensuring the attribute exists and is parsed only after the assertion.
🧹 Nitpick comments (1)
packages/opentelemetry-instrumentation-anthropic/tests/test_semconv_span_attrs.py (1)
1574-1595: Good coverage of the `_map_finish_reason` helper.

The tests comprehensively cover:
- All falsy inputs returning `""`
- All known Anthropic reason mappings
- Unknown reason passthrough

One minor style note: the class-level import pattern is functional but unconventional. Consider importing at the top of the file alongside other imports from `span_utils` (lines 35-39) for consistency.

♻️ Optional: move import to top of file

At top of file (around line 39):

```diff
 from opentelemetry.instrumentation.anthropic.span_utils import (
     aset_input_attributes,
     set_response_attributes,
     set_streaming_response_attributes,
+    _map_finish_reason,
 )
```

Then simplify the class:

```diff
 class TestMapFinishReason:
-    from opentelemetry.instrumentation.anthropic.span_utils import _map_finish_reason
-    _map_finish_reason = staticmethod(_map_finish_reason)
-
     @pytest.mark.parametrize("falsy_input", [None, "", 0, False])
     def test_returns_empty_string_for_falsy(self, falsy_input):
-        assert self._map_finish_reason(falsy_input) == ""
+        assert _map_finish_reason(falsy_input) == ""

     def test_maps_end_turn_to_stop(self):
-        assert self._map_finish_reason("end_turn") == "stop"
+        assert _map_finish_reason("end_turn") == "stop"

     # ... similar changes for other test methods
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-anthropic/tests/test_semconv_span_attrs.py` around lines 1574-1595: move the class-level import of `_map_finish_reason` out of `TestMapFinishReason` and place a standard import at the top of the test file alongside the other `span_utils` imports, then update `TestMapFinishReason` to reference the top-level symbol (remove the class-level import and `staticmethod` assignment) so the tests use the module-level `_map_finish_reason` directly.
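For context on why the class wraps the import in `staticmethod(...)`: a plain function stored as a class attribute is turned into a bound method on instance access, so `self` gets passed as the first argument. A toy illustration (names invented for the example):

```python
def shout(text):
    # Module-level helper, analogous to _map_finish_reason.
    return text.upper()


class WithoutStatic:
    # Instance access wraps this into a bound method: self becomes `text`'s slot.
    shout = shout


class WithStatic:
    # staticmethod() suppresses the binding; attribute access returns the plain function.
    shout = staticmethod(shout)


assert WithStatic().shout("hi") == "HI"

try:
    WithoutStatic().shout("hi")  # actually calls shout(instance, "hi")
except TypeError:
    pass  # too many positional arguments, as expected
else:
    raise AssertionError("expected TypeError from bound-method wrapping")
```

This is why the reviewed test class needs the `staticmethod` assignment as long as the import stays inside the class body, and why moving the import to module level lets it disappear.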
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: a3b8a16f-3479-4146-99b8-e8e75b0c55a8
📒 Files selected for processing (5)
- packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py
- packages/opentelemetry-instrumentation-anthropic/tests/test_semconv_span_attrs.py
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/chat_wrappers.py
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/completion_wrappers.py
- packages/opentelemetry-instrumentation-openai/tests/traces/test_semconv_compliance.py
✅ Files skipped from review due to trivial changes (1)
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/chat_wrappers.py
🚧 Files skipped from review as they are similar to previous changes (3)
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/completion_wrappers.py
- packages/opentelemetry-instrumentation-openai/tests/traces/test_semconv_compliance.py
- packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py
Actionable comments posted: 3
🧹 Nitpick comments (1)
packages/opentelemetry-instrumentation-openai/tests/traces/test_finish_reasons.py (1)
381-423: Replace placeholder `pass` tests with explicit skip/xfail markers. Silent passes make CI look green without coverage for these behaviors.
Proposed refactor
```diff
 class TestFinishReasonOmission:
     @@
+    @pytest.mark.skip(reason="Needs mocked scenario with all provider finish_reason values = None")
     def test_finish_reasons_omitted_when_empty(
     @@
-        pass
+        ...
 @@
 class TestResponsesAPIFinishReasons:
     @@
+    @pytest.mark.skip(reason="Enable when Responses API fixtures/cassettes are available")
     @pytest.mark.vcr
     def test_responses_api_extracts_finish_reason_from_provider(
     @@
-        pass
+        ...
 @@
+    @pytest.mark.skip(reason="Enable when Responses API fixtures/cassettes are available")
     @pytest.mark.vcr
     def test_responses_api_sets_top_level_finish_reasons_from_response(
     @@
-        pass
+        ...
```

If you want, I can draft concrete mocked fixtures for these three tests so they can be fully asserted instead of skipped.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-openai/tests/traces/test_finish_reasons.py` around lines 381 - 423, Replace the placeholder pass in the three tests so they don't silently succeed: update TestFinishReasonOmission.test_finish_reasons_omitted_when_empty, TestResponsesAPIFinishReasons.test_responses_api_extracts_finish_reason_from_provider, and TestResponsesAPIFinishReasons.test_responses_api_sets_top_level_finish_reasons_from_response to use explicit pytest markers (e.g., `@pytest.mark.skip`(reason="placeholder: Responses API not available / requires mocked provider") or `@pytest.mark.xfail`(reason="placeholder: requires mocked Responses API", strict=False)) instead of pass; add an appropriate import for pytest if missing and attach the marker directly above each test function so CI reports them as skipped/expected-fail rather than passing silently.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@packages/opentelemetry-instrumentation-openai/tests/traces/test_finish_reasons.py`:
- Around line 359-368: The test test_streaming_deduplicates_finish_reasons is
currently invoking a non-streaming call; change the invocation of
openai_client.chat.completions.create in that test to use stream=True and
consume the returned stream (iterator/generator) so the streaming
accumulation/deduplication path is exercised (i.e., call
openai_client.chat.completions.create(..., stream=True) and iterate through the
stream to build/collect the final response), ensuring the assertions remain
against the streamed result.
- Around line 335-353: The test
test_finish_reason_defaults_to_empty_string_when_missing currently only checks
that finish_reason is a string; update it to mock a provider response where
finish_reason is None (when calling openai_client.chat.completions.create) and
then assert the traced output message's finish_reason equals the empty string
""; locate this test and adjust the setup to inject a controlled response with
finish_reason: None, call get_output_messages(span) and replace the loose
isinstance/string assertion with assert output_messages[0]["finish_reason"] ==
"" to confirm the empty-string fallback in the tracing logic.
---
Nitpick comments:
In
`@packages/opentelemetry-instrumentation-openai/tests/traces/test_finish_reasons.py`:
- Around line 381-423: Replace the placeholder pass in the three tests so they
don't silently succeed: update
TestFinishReasonOmission.test_finish_reasons_omitted_when_empty,
TestResponsesAPIFinishReasons.test_responses_api_extracts_finish_reason_from_provider,
and
TestResponsesAPIFinishReasons.test_responses_api_sets_top_level_finish_reasons_from_response
to use explicit pytest markers (e.g., `@pytest.mark.skip`(reason="placeholder:
Responses API not available / requires mocked provider") or
`@pytest.mark.xfail`(reason="placeholder: requires mocked Responses API",
strict=False)) instead of pass; add an appropriate import for pytest if missing
and attach the marker directly above each test function so CI reports them as
skipped/expected-fail rather than passing silently.
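For the mocked-None scenario the review asks for, a stub response object can stand in for a live API call. This is a sketch only; a real test would patch the instrumented OpenAI client rather than build the namespace by hand:

```python
from types import SimpleNamespace

# Stub shaped like an OpenAI chat completion whose provider omitted finish_reason.
response = SimpleNamespace(
    choices=[
        SimpleNamespace(
            finish_reason=None,
            message=SimpleNamespace(content="Hello"),
        )
    ]
)

# The instrumentation-side fallback the hardened test should pin down:
finish_reason = response.choices[0].finish_reason or ""
assert finish_reason == ""  # empty-string fallback, not "stop", not omitted
```

With a stub like this injected, the loose `isinstance(..., str)` assertion can be tightened to `== ""` as the review suggests.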
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 150234c0-d877-4232-af77-e7a650c36543
📒 Files selected for processing (3)
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/chat_wrappers.py
- packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/completion_wrappers.py
- packages/opentelemetry-instrumentation-openai/tests/traces/test_finish_reasons.py
```python
    @pytest.mark.vcr
    def test_completions_sets_top_level_finish_reasons(
        self, instrument_legacy, span_exporter, openai_client
    ):
        """Non-streaming completions must set gen_ai.response.finish_reasons."""
        openai_client.completions.create(
            model="davinci-002",
            prompt="Tell me a joke about opentelemetry",
        )

        spans = span_exporter.get_finished_spans()
        assert len(spans) == 1
        span = spans[0]

        # Verify top-level finish_reasons attribute is set
        finish_reasons = span.attributes.get(
            GenAIAttributes.GEN_AI_RESPONSE_FINISH_REASONS
        )
        assert finish_reasons is not None, "gen_ai.response.finish_reasons must be set"
        assert isinstance(finish_reasons, (list, tuple))
        assert len(finish_reasons) > 0
        # Should contain mapped finish_reason from response
        assert "length" in finish_reasons or "stop" in finish_reasons

    @pytest.mark.vcr
    def test_completions_finish_reason_in_output_messages(
        self, instrument_legacy, span_exporter, openai_client
    ):
        """Output messages must have finish_reason field."""
        openai_client.completions.create(
            model="davinci-002",
            prompt="Tell me a joke about opentelemetry",
        )

        spans = span_exporter.get_finished_spans()
        span = spans[0]
        output_messages = get_output_messages(span)

        assert len(output_messages) > 0
        for msg in output_messages:
            assert "finish_reason" in msg
            # finish_reason should be a string (empty string if missing from provider)
            assert isinstance(msg["finish_reason"], str)

    @pytest.mark.vcr
    def test_completions_streaming_sets_top_level_finish_reasons(
        self, instrument_legacy, span_exporter, mock_openai_client
    ):
        """Streaming completions must accumulate and set finish_reasons."""
        response = mock_openai_client.completions.create(
            model="davinci-002",
            prompt="Tell me a joke about opentelemetry",
            stream=True,
        )

        for _ in response:
            pass

        spans = span_exporter.get_finished_spans()
        span = spans[0]

        # Verify top-level finish_reasons attribute is set
        finish_reasons = span.attributes.get(
            GenAIAttributes.GEN_AI_RESPONSE_FINISH_REASONS
        )
        assert finish_reasons is not None, "Streaming must set finish_reasons"
        assert isinstance(finish_reasons, (list, tuple))
        assert len(finish_reasons) > 0

    @pytest.mark.vcr
    @pytest.mark.asyncio
    async def test_async_completions_sets_top_level_finish_reasons(
        self, instrument_legacy, span_exporter, async_openai_client
    ):
        """Async completions must set gen_ai.response.finish_reasons."""
        await async_openai_client.completions.create(
            model="davinci-002",
            prompt="Tell me a joke about opentelemetry",
        )

        spans = span_exporter.get_finished_spans()
        span = spans[0]

        finish_reasons = span.attributes.get(
            GenAIAttributes.GEN_AI_RESPONSE_FINISH_REASONS
        )
        assert finish_reasons is not None
        assert isinstance(finish_reasons, (list, tuple))
        assert len(finish_reasons) > 0

    @pytest.mark.vcr
    @pytest.mark.asyncio
    async def test_async_completions_streaming_sets_top_level_finish_reasons(
        self, instrument_legacy, span_exporter, async_openai_client
    ):
        """Async streaming completions must set finish_reasons."""
        response = await async_openai_client.completions.create(
            model="davinci-002",
            prompt="Tell me a joke about opentelemetry",
            stream=True,
        )

        async for _ in response:
            pass

        spans = span_exporter.get_finished_spans()
        span = spans[0]

        finish_reasons = span.attributes.get(
            GenAIAttributes.GEN_AI_RESPONSE_FINISH_REASONS
        )
        assert finish_reasons is not None
        assert isinstance(finish_reasons, (list, tuple))
        assert len(finish_reasons) > 0


class TestChatCompletionsFinishReasons:
    """Test finish_reason handling in chat completions API (P1-3)"""

    @pytest.mark.vcr
    def test_chat_completions_sets_top_level_finish_reasons(
        self, instrument_legacy, span_exporter, openai_client
    ):
        """Non-streaming chat must set gen_ai.response.finish_reasons."""
        openai_client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": "Tell me a joke about opentelemetry"}],
        )

        spans = span_exporter.get_finished_spans()
        span = spans[0]

        # Verify top-level finish_reasons attribute is set
        finish_reasons = span.attributes.get(
            GenAIAttributes.GEN_AI_RESPONSE_FINISH_REASONS
        )
        assert finish_reasons is not None, "gen_ai.response.finish_reasons must be set"
        assert isinstance(finish_reasons, (list, tuple))
        assert len(finish_reasons) > 0
        # Should contain "stop" for normal completion
        assert "stop" in finish_reasons

    @pytest.mark.vcr
    def test_chat_completions_finish_reason_in_output_messages(
        self, instrument_legacy, span_exporter, openai_client
    ):
        """Output messages must have finish_reason field."""
        openai_client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": "Tell me a joke about opentelemetry"}],
        )

        spans = span_exporter.get_finished_spans()
        span = spans[0]
        output_messages = get_output_messages(span)

        assert len(output_messages) > 0
        for msg in output_messages:
            assert "finish_reason" in msg
            assert msg["finish_reason"] == "stop"

    @pytest.mark.vcr
    def test_chat_streaming_sets_top_level_finish_reasons(
        self, instrument_legacy, span_exporter, openai_client
    ):
        """Streaming chat must accumulate and set finish_reasons."""
        response = openai_client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": "Tell me a joke about opentelemetry"}],
            stream=True,
        )

        for _ in response:
            pass

        spans = span_exporter.get_finished_spans()
        span = spans[0]

        # Verify top-level finish_reasons attribute is set
        finish_reasons = span.attributes.get(
            GenAIAttributes.GEN_AI_RESPONSE_FINISH_REASONS
        )
        assert finish_reasons is not None, "Streaming must set finish_reasons"
        assert isinstance(finish_reasons, (list, tuple))
        assert len(finish_reasons) > 0

    @pytest.mark.vcr
    @pytest.mark.asyncio
    async def test_async_chat_sets_top_level_finish_reasons(
        self, instrument_legacy, span_exporter, async_openai_client
    ):
        """Async chat must set gen_ai.response.finish_reasons."""
        await async_openai_client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": "Tell me a joke about opentelemetry"}],
        )

        spans = span_exporter.get_finished_spans()
        span = spans[0]

        finish_reasons = span.attributes.get(
            GenAIAttributes.GEN_AI_RESPONSE_FINISH_REASONS
        )
        assert finish_reasons is not None
        assert isinstance(finish_reasons, (list, tuple))
        assert len(finish_reasons) > 0

    @pytest.mark.vcr
    @pytest.mark.asyncio
    async def test_async_chat_streaming_sets_top_level_finish_reasons(
        self, instrument_legacy, span_exporter, async_openai_client
    ):
        """Async streaming chat must set finish_reasons."""
        response = await async_openai_client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": "Tell me a joke about opentelemetry"}],
            stream=True,
        )

        async for _ in response:
            pass

        spans = span_exporter.get_finished_spans()
        span = spans[0]

        finish_reasons = span.attributes.get(
            GenAIAttributes.GEN_AI_RESPONSE_FINISH_REASONS
        )
        assert finish_reasons is not None
        assert isinstance(finish_reasons, (list, tuple))
        assert len(finish_reasons) > 0


class TestFinishReasonMapping:
    """Test finish_reason mapping from provider values"""

    @pytest.mark.vcr
    def test_finish_reason_mapped_from_provider_not_derived(
        self, instrument_legacy, span_exporter, openai_client
    ):
        """finish_reason must come from response, not inferred from parts.

        This test validates that we use the actual finish_reason from the
        provider response (e.g., "length", "content_filter") rather than
        deriving it from the presence of tool_calls or other content.
        """
        # Use a completion that hits max_tokens to get finish_reason="length"
        openai_client.completions.create(
            model="davinci-002",
            prompt="Tell me a very long story about opentelemetry",
            max_tokens=10,  # Force length finish_reason
        )

        spans = span_exporter.get_finished_spans()
        span = spans[0]
        output_messages = get_output_messages(span)

        # Verify finish_reason is "length" from provider, not "stop"
        assert len(output_messages) > 0
        assert output_messages[0]["finish_reason"] == "length"

        # Verify top-level attribute also has "length"
        finish_reasons = span.attributes.get(
            GenAIAttributes.GEN_AI_RESPONSE_FINISH_REASONS
        )
        assert "length" in finish_reasons

    @pytest.mark.vcr
    def test_finish_reason_tool_calls_mapped_correctly(
        self, instrument_legacy, span_exporter, openai_client
    ):
        """finish_reason "tool_calls" must be mapped to "tool_call"."""
        response = openai_client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": "What's the weather in Boston?"}],
            tools=[
                {
                    "type": "function",
                    "function": {
                        "name": "get_weather",
                        "description": "Get the current weather",
                        "parameters": {
                            "type": "object",
                            "properties": {
                                "location": {"type": "string"},
                            },
                        },
                    },
                }
            ],
        )

        spans = span_exporter.get_finished_spans()
        span = spans[0]
        output_messages = get_output_messages(span)

        # OpenAI returns "tool_calls" but we should map to "tool_call"
        if output_messages and output_messages[0].get("parts"):
            has_tool_call = any(
                p.get("type") == "tool_call" for p in output_messages[0]["parts"]
            )
            if has_tool_call:
                # Should be mapped to "tool_call" (singular)
                assert output_messages[0]["finish_reason"] == "tool_call"

        finish_reasons = span.attributes.get(
            GenAIAttributes.GEN_AI_RESPONSE_FINISH_REASONS
        )
        assert "tool_call" in finish_reasons

    @pytest.mark.vcr
    def test_finish_reason_defaults_to_empty_string_when_missing(
        self, instrument_legacy, span_exporter, openai_client
    ):
        """When provider doesn't return finish_reason, default to empty string."""
        # This test may need a mocked response where finish_reason is None
        openai_client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": "Hello"}],
        )

        spans = span_exporter.get_finished_spans()
        span = spans[0]
        output_messages = get_output_messages(span)

        # Even if provider returns None, output message should have ""
        assert len(output_messages) > 0
        assert "finish_reason" in output_messages[0]
        assert isinstance(output_messages[0]["finish_reason"], str)


class TestFinishReasonDeduplication:
    """Test finish_reason deduplication in streaming scenarios"""

    @pytest.mark.vcr
    def test_streaming_deduplicates_finish_reasons(
        self, instrument_legacy, span_exporter, openai_client
    ):
        """Streaming with multiple choices should deduplicate finish_reasons."""
        response = openai_client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": "Tell me a joke"}],
            n=2,  # Request 2 completions
        )

        spans = span_exporter.get_finished_spans()
        span = spans[0]

        finish_reasons = span.attributes.get(
            GenAIAttributes.GEN_AI_RESPONSE_FINISH_REASONS
        )

        if finish_reasons:
            # Should be deduplicated (no duplicate "stop" entries)
            assert len(finish_reasons) == len(set(finish_reasons))


class TestFinishReasonOmission:
    """Test that top-level finish_reasons is omitted when no meaningful values"""

    def test_finish_reasons_omitted_when_empty(
        self, instrument_legacy, span_exporter, openai_client
    ):
        """When no finish_reasons available, attribute should be omitted entirely.

        This test validates that we don't set an empty array when there are
        no meaningful finish_reasons (e.g., all None values filtered out).
        """
        # This would require a mocked scenario where all finish_reasons are None
        # For now, this is a placeholder test that documents the expected behavior
        pass


class TestResponsesAPIFinishReasons:
    """Test finish_reason handling in Responses API (P1-1, P1-2)"""

    @pytest.mark.vcr
    def test_responses_api_extracts_finish_reason_from_provider(
        self, instrument_legacy, span_exporter, openai_client
    ):
        """Responses API must extract finish_reason from provider response.

        This test validates that finish_reason comes from the actual response
        object, not derived from the presence of tool_calls or message types.
        """
        # This test requires the Responses API which may not be in all OpenAI versions
        # Placeholder for when Responses API tests are available
        pass

    @pytest.mark.vcr
    def test_responses_api_sets_top_level_finish_reasons_from_response(
        self, instrument_legacy, span_exporter, openai_client
    ):
        """Responses API must extract top-level finish_reasons from response choices.

        This test validates that finish_reasons are read from response.choices[]
        rather than fabricated from output block types.
        """
        # Placeholder for Responses API test
        pass
```
🧩 Analysis chain
🏁 Script executed:

```shell
#!/bin/bash
set -euo pipefail

# 1) Locate likely cassette files
echo "== Candidate VCR cassette files =="
fd -i 'cassette|cassettes|vcr' .

# 2) Scan YAML/JSON artifacts for likely secrets/PII markers
echo
echo "== Potential secret/PII markers in yaml/json artifacts =="
fd -e yml -e yaml -e json . | rg -n --iglob '*cassette*' \
  -e 'Authorization' \
  -e 'Bearer ' \
  -e 'api[_-]?key' \
  -e 'x-api-key' \
  -e 'sk-[A-Za-z0-9]+' || true

# 3) Show references to filters that scrub headers/body in VCR setup
echo
echo "== VCR scrub/filter configuration references =="
rg -n -C2 'filter_headers|before_record|before_record_request|before_record_response'
```

Repository: traceloop/openllmetry
Length of output: 13550

🏁 Script executed:

```shell
# Check if cassette files exist for test_finish_reasons.py
ls -la ./packages/opentelemetry-instrumentation-openai/tests/traces/cassettes/ | head -20
echo "---"
echo "Count of cassette files:"
find ./packages/opentelemetry-instrumentation-openai/tests/traces/cassettes/ -type f | wc -l
echo "---"
echo "Check for test_finish_reasons cassettes specifically:"
find ./packages/opentelemetry-instrumentation-openai/tests/traces/cassettes/ -name "*finish_reason*" | head -10
```

Repository: traceloop/openllmetry
Length of output: 1122

🏁 Script executed:

```shell
# Verify the conftest.py has proper filter configuration
cat -n ./packages/opentelemetry-instrumentation-openai/tests/traces/conftest.py | head -30
```

Repository: traceloop/openllmetry
Length of output: 535

🏁 Script executed:

```shell
# Double-check: is there a test_finish_reasons directory?
ls -la ./packages/opentelemetry-instrumentation-openai/tests/traces/cassettes/ | grep -i finish
# Also check if cassettes directory exists in the test directory itself
find ./packages/opentelemetry-instrumentation-openai/tests/traces -type d -name "cassettes" -o -type d -name "*finish*"
# Count total test methods in test_finish_reasons.py
grep -c "def test_" ./packages/opentelemetry-instrumentation-openai/tests/traces/test_finish_reasons.py || echo "File not found"
# Check if the test file exists
ls -la ./packages/opentelemetry-instrumentation-openai/tests/traces/test_finish_reasons.py 2>/dev/null || echo "test_finish_reasons.py not found"
```

Repository: traceloop/openllmetry
Length of output: 272

🏁 Script executed:

```shell
# Check the VCR record mode configuration in conftest.py (if set)
grep -n "record_mode\|vcr_config" ./packages/opentelemetry-instrumentation-openai/tests/traces/conftest.py
# Check if there's pytest.ini or pyproject.toml VCR settings
cat ./packages/opentelemetry-instrumentation-openai/tests/traces/pytest.ini 2>/dev/null || echo "No pytest.ini"
```

Repository: traceloop/openllmetry
Length of output: 98

🏁 Script executed:

```shell
# List all cassette subdirectories to see structure
ls -1 ./packages/opentelemetry-instrumentation-openai/tests/traces/cassettes/
echo "---"
# Check if cassettes were recorded for any test in test_finish_reasons
find ./packages/opentelemetry-instrumentation-openai/tests/traces/cassettes/ -type f -name "*.yaml" -o -name "*.json" | xargs grep -l "finish.reason\|finish_reason" | head -5 || echo "No cassettes found with finish_reason"
echo "---"
# Check when cassettes were last modified to see if they're recent
find ./packages/opentelemetry-instrumentation-openai/tests/traces/cassettes/ -type f -printf '%T@ %Tc %p\n' | sort -n | tail -5
```

Repository: traceloop/openllmetry
Length of output: 1811

🏁 Script executed:

```shell
# Check test_finish_reasons.py to see what the test decorator says and cassette strategy
head -50 ./packages/opentelemetry-instrumentation-openai/tests/traces/test_finish_reasons.py
echo "---"
# Check if there's a test_finish_reasons cassette directory at all
find ./packages/opentelemetry-instrumentation-openai/tests/traces -name "*finish_reasons*" -type d
```

Repository: traceloop/openllmetry
Length of output: 1943

🏁 Script executed:

```shell
# Check the cassettes directory - does test_finish_reasons have any cassettes?
test -d ./packages/opentelemetry-instrumentation-openai/tests/traces/cassettes/test_finish_reasons && echo "Directory exists" || echo "test_finish_reasons cassette directory DOES NOT EXIST"
# Count VCR-marked tests in test_finish_reasons.py
grep -c "@pytest.mark.vcr" ./packages/opentelemetry-instrumentation-openai/tests/traces/test_finish_reasons.py
# Check git status to see if cassettes were added
git -C ./packages/opentelemetry-instrumentation-openai status --short tests/traces/cassettes/ 2>/dev/null | head -20 || echo "Git status unavailable"
```

Repository: traceloop/openllmetry
Length of output: 121

🏁 Script executed:

```shell
# Check if there's a record_mode setting anywhere that might affect VCR behavior
grep -r "record_mode\|vcr.*once\|vcr.*new_episodes" ./packages/opentelemetry-instrumentation-openai/tests/ 2>/dev/null | head -10 || echo "No record_mode found"
# Verify the conftest fixtures are being used properly
grep -A5 "def openai_client" ./packages/opentelemetry-instrumentation-openai/tests/conftest.py | head -15
```

Repository: traceloop/openllmetry
Length of output: 149

🏁 Script executed:

```shell
# Verify the conftest filter configuration is complete
cat ./packages/opentelemetry-instrumentation-openai/tests/traces/conftest.py
echo "---"
# Check if cassettes need to be recorded by looking at git diff or status
git -C ./packages/opentelemetry-instrumentation-openai diff --name-only 2>/dev/null | grep -i cassette || echo "No cassette changes in diff"
echo "---"
# Check if test_finish_reasons.py was just added
git -C ./packages/opentelemetry-instrumentation-openai status --short 2>/dev/null | grep finish_reason || echo "File status unclear"
```

Repository: traceloop/openllmetry
Length of output: 479

🏁 Script executed:

```shell
# Confirm what happens when pytest runs - check if there's a record_mode default
python3 -c "
import os
os.chdir('./packages/opentelemetry-instrumentation-openai/tests/traces')
# Check if pytest-vcr is configured anywhere
with open('conftest.py', 'r') as f:
    content = f.read()
if 'record_mode' in content:
    print('record_mode found in conftest')
else:
    print('No explicit record_mode - will use default (usually once)')
" 2>/dev/null || echo "Cannot check record_mode"
```

Repository: traceloop/openllmetry
Length of output: 122
Cassettes for test_finish_reasons.py must be recorded before merging.

The 16 new VCR-marked tests in test_finish_reasons.py have no corresponding cassette files under ./packages/opentelemetry-instrumentation-openai/tests/traces/cassettes/test_finish_reasons/. With the default record_mode=once, these tests cannot play back in CI and will fail.

The VCR filter configuration in conftest.py is properly set up, with filter_headers covering authorization, api-key, openai-organization, openai-project, set-cookie, and x-request-id. The cassettes themselves still need to be recorded: record them locally with a valid OpenAI API key in an environment variable, verify they contain no secrets and are properly scrubbed, and commit them.
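For context, the scrubbing the review found can be sketched as a config dict; the exact fixture wiring (pytest-recording and pytest-vcr both read a `vcr_config` fixture returning kwargs like these) is assumed here, not copied from the repository:

```python
# Sketch of the header scrubbing described above; pytest-recording /
# pytest-vcr would receive this dict from a `vcr_config` fixture.
VCR_CONFIG = {
    "filter_headers": [
        "authorization",
        "api-key",
        "openai-organization",
        "openai-project",
        "set-cookie",
        "x-request-id",
    ],
}

# Any cassette recorded under this config should have these header
# values redacted before being written to disk.
assert "authorization" in VCR_CONFIG["filter_headers"]
```
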
```python
    def test_finish_reason_defaults_to_empty_string_when_missing(
        self, instrument_legacy, span_exporter, openai_client
    ):
        """When provider doesn't return finish_reason, default to empty string."""
        # This test may need a mocked response where finish_reason is None
        openai_client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": "Hello"}],
        )

        spans = span_exporter.get_finished_spans()
        span = spans[0]
        output_messages = get_output_messages(span)

        # Even if provider returns None, output message should have ""
        assert len(output_messages) > 0
        assert "finish_reason" in output_messages[0]
        assert isinstance(output_messages[0]["finish_reason"], str)
```
This test does not assert the empty-string fallback it describes. As written it passes for any string value; it should assert `""` against a controlled response in which the provider's finish_reason is None.

Proposed assertion tightening:

```diff
 assert len(output_messages) > 0
 assert "finish_reason" in output_messages[0]
-assert isinstance(output_messages[0]["finish_reason"], str)
+assert output_messages[0]["finish_reason"] == ""
```
+ assert output_messages[0]["finish_reason"] == ""🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@packages/opentelemetry-instrumentation-openai/tests/traces/test_finish_reasons.py`
around lines 335 - 353, The test
test_finish_reason_defaults_to_empty_string_when_missing currently only checks
that finish_reason is a string; update it to mock a provider response where
finish_reason is None (when calling openai_client.chat.completions.create) and
then assert the traced output message's finish_reason equals the empty string
""; locate this test and adjust the setup to inject a controlled response with
finish_reason: None, call get_output_messages(span) and replace the loose
isinstance/string assertion with assert output_messages[0]["finish_reason"] ==
"" to confirm the empty-string fallback in the tracing logic.
```python
    def test_streaming_deduplicates_finish_reasons(
        self, instrument_legacy, span_exporter, openai_client
    ):
        """Streaming with multiple choices should deduplicate finish_reasons."""
        response = openai_client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": "Tell me a joke"}],
            n=2,  # Request 2 completions
        )
```
test_streaming_deduplicates_finish_reasons is not testing streaming code. The test name and docstring say streaming, but the call at lines 363-367 is non-streaming (`stream=True` is missing), so it never exercises the streaming accumulation path.

Proposed fix:

```diff
 response = openai_client.chat.completions.create(
     model="gpt-3.5-turbo",
     messages=[{"role": "user", "content": "Tell me a joke"}],
     n=2,  # Request 2 completions
+    stream=True,
 )
+for _ in response:
+    pass
+
 spans = span_exporter.get_finished_spans()
```
Verify each finding against the current code and only fix it if needed.
In
`@packages/opentelemetry-instrumentation-openai/tests/traces/test_finish_reasons.py`
around lines 359 - 368, The test test_streaming_deduplicates_finish_reasons is
currently invoking a non-streaming call; change the invocation of
openai_client.chat.completions.create in that test to use stream=True and
consume the returned stream (iterator/generator) so the streaming
accumulation/deduplication path is exercised (i.e., call
openai_client.chat.completions.create(..., stream=True) and iterate through the
stream to build/collect the final response), ensuring the assertions remain
against the streamed result.
Force-pushed: b4198cc to 4f5a25e (Compare)
OpenAI, Anthropic, LangChain

Aligns all packages with the Bedrock convention, following the OTel GenAI semantic conventions: when finish_reason is unknown/None, use "" (empty string) instead of fabricating "stop" (OpenAI) or omitting the key entirely (Anthropic, LangChain).
or "stop"→or ""or "stop"→or ""if mapped→ always set with""fallback (3 spots)if fr→ always includefinish_reasonwith""fallbackSummary by CodeRabbit
Bug Fixes
Tests
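The Anthropic change above ("always set with `""` fallback") can be illustrated with a small mapping sketch. The stop_reason values are Anthropic's documented ones, but the table and `map_stop_reason` helper are hypothetical; the real mapping lives in the package's span_utils:

```python
# Hypothetical stop_reason -> OTel finish_reason table for illustration.
_STOP_REASON_MAP = {
    "end_turn": "stop",
    "stop_sequence": "stop",
    "max_tokens": "length",
    "tool_use": "tool_calls",
}


def map_stop_reason(stop_reason):
    # Per this PR: always set the key, using "" for unknown/None
    # values instead of omitting the key or guessing "stop".
    if not stop_reason:
        return ""
    return _STOP_REASON_MAP.get(stop_reason, "")


assert map_stop_reason("max_tokens") == "length"
assert map_stop_reason(None) == ""
assert map_stop_reason("some_future_reason") == ""
```
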