Add framework-level OpenTelemetry tracing by cpsievert · Pull Request #310 · posit-dev/chatlas

cpsievert · 2026-05-12T20:45:23Z

Summary

Chatlas now emits OpenTelemetry spans that capture the full structure of multi-turn conversations and tool execution, without requiring any provider-specific instrumentor libraries. When a TracerProvider is configured, every chat()/stream() call automatically produces a span hierarchy built from three span types:

invoke_agent                      # wraps the full chat loop
├── chat gpt-4o                   # each model API call
├── execute_tool get_weather      # each tool invocation
├── chat gpt-4o                   # follow-up model call
└── ...

Users opt in with pip install "chatlas[otel]" and a standard TracerProvider setup (console exporter, Logfire, or any OTLP-compatible backend). The approach is consistent with Shiny for Python's OTel story — same [otel] extra pattern, same recommended tools, same config-module pattern.

Spans follow the GenAI semantic conventions and record token usage, response model/ID, and optionally full message content (gated by OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT). These framework spans complement (not replace) provider-specific SDK instrumentors like opentelemetry-instrumentation-openai-v2.
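The content-capture gate described above can be pictured as a small environment check. This is a minimal sketch, not the code in chatlas/_otel.py: the helper names `should_capture_content` and `content_attributes`, the placeholder attribute key, and the rule that only the case-insensitive value `true` enables capture are all assumptions made for illustration.

```python
import os

ENV_VAR = "OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT"


def should_capture_content(environ=None) -> bool:
    """Return True only when the variable is set to 'true' (assumed rule)."""
    env = os.environ if environ is None else environ
    return env.get(ENV_VAR, "").strip().lower() == "true"


def content_attributes(message_text: str, environ=None) -> dict:
    """Hypothetical helper: attach message text only when capture is enabled."""
    if not should_capture_content(environ):
        # Default path: spans carry metadata only, never message text.
        return {}
    # Placeholder key for illustration; not necessarily the key chatlas uses.
    return {"example.message.content": message_text}


print(content_attributes("What's the weather?", {ENV_VAR: "true"}))
print(content_attributes("What's the weather?", {}))
```

Keeping capture off unless the variable is explicitly set means prompts and responses stay out of the trace backend by default.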

New chatlas/_otel.py module with span lifecycle functions
Hooks into all 6 core Chat methods (sync + async for agent/chat/tool spans)
7 tests with VCR cassettes covering span hierarchy, token usage, content capture, tool errors, streaming, and no-op behavior
Updated docs/get-started/monitor.qmd with framework-level tracing docs

Test plan

pytest tests/test_otel.py — 7/7 passing
pyright chatlas/_otel.py — 0 errors
ruff check and ruff format — clean
Manual verification with a real exporter (Logfire or console) and live API key

Adds a new chatlas/_otel.py module that emits OpenTelemetry spans for the chat lifecycle: invoke_agent (top-level), chat (per model call), and execute_tool (per tool invocation). Spans follow the GenAI semantic conventions with attributes like gen_ai.usage.input_tokens, gen_ai.response.model, and optional message content capture controlled by the OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT env var. Also adds an `otel` optional dependency extra (`pip install chatlas[otel]`).
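As a rough illustration of the attributes mentioned above, the sketch below builds a GenAI-style attribute dictionary for a chat span. The function name `response_attributes` and its parameters are hypothetical; only the attribute keys come from the GenAI semantic conventions, and the actual set recorded by chatlas/_otel.py may differ.

```python
from typing import Optional


def response_attributes(
    model: str,
    response_id: Optional[str],
    input_tokens: Optional[int],
    output_tokens: Optional[int],
) -> dict:
    """Build span attributes, skipping values the provider did not report."""
    candidates = {
        "gen_ai.response.model": model,
        "gen_ai.response.id": response_id,
        "gen_ai.usage.input_tokens": input_tokens,
        "gen_ai.usage.output_tokens": output_tokens,
    }
    # Omit missing values instead of recording them as None.
    return {key: value for key, value in candidates.items() if value is not None}


attrs = response_attributes("gpt-4o", None, 12, 34)
print(attrs)
```

Dropping missing values rather than recording placeholders keeps spans clean for providers that do not report every field.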

Wires the _otel span functions into the six core Chat methods: _chat_impl/_chat_impl_async (agent spans), _submit_turns/_submit_turns_async (chat spans), and _invoke_tool/_invoke_tool_async (tool spans). Parent context is passed explicitly via _otel_parent to avoid async context hazards.
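The explicit-parent idea can be shown with a toy model that uses no OpenTelemetry at all. Everything here is an assumption for illustration: `ToySpan` stands in for a real span, and the three coroutines only loosely mirror _chat_impl_async, _submit_turns_async, and _invoke_tool_async. The point is that each child receives its parent as an argument instead of looking it up from ambient context, so the hierarchy does not depend on how the event loop interleaves tasks.

```python
import asyncio
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToySpan:
    """Stand-in for a real span; records only a name and its parent."""
    name: str
    parent: Optional["ToySpan"] = None


async def submit_turns(model: str, otel_parent: ToySpan) -> ToySpan:
    # The parent arrives as an argument, not from ambient context.
    await asyncio.sleep(0)  # simulate awaiting the model API
    return ToySpan(f"chat {model}", parent=otel_parent)


async def invoke_tool(tool_name: str, otel_parent: ToySpan) -> ToySpan:
    await asyncio.sleep(0)  # simulate awaiting the tool
    return ToySpan(f"execute_tool {tool_name}", parent=otel_parent)


async def chat_impl(model: str) -> List[ToySpan]:
    agent = ToySpan("invoke_agent")  # wraps the whole chat loop
    return [
        agent,
        await submit_turns(model, otel_parent=agent),
        await invoke_tool("get_weather", otel_parent=agent),
        await submit_turns(model, otel_parent=agent),
    ]


spans = asyncio.run(chat_impl("gpt-4o"))
for span in spans:
    indent = "    " if span.parent else ""
    print(f"{indent}{span.name}")
```

With the real OpenTelemetry API, the comparable step would typically be building a context with `trace.set_span_in_context(parent)` and passing it as `context=` to `tracer.start_span`; whether chatlas does exactly this is not stated in this description.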

Adds 7 tests covering span hierarchy, token usage, content capture (on/off), tool error recording, streaming lifecycle, and no-op behavior. Includes VCR cassettes for replay without live API keys. Updates docs/get-started/monitor.qmd with a new framework-level tracing section (console quickstart, Logfire production path, config-module pattern) before the existing provider-specific content.

The OTel API is ~212KB with no heavy transitive deps, and its default ProxyTracer already no-ops when no SDK is configured. Making it a hard dep lets us drop the lazy initialization (cache_tracer/initialized/is_tracing guards) and always create spans, relying on the no-op machinery for zero overhead when nobody is collecting. Removes the `chatlas[otel]` extra — the API is now always available. Users still opt in to collection by installing opentelemetry-sdk and configuring a TracerProvider.

cpsievert added 4 commits May 12, 2026 15:39

cpsievert wants to merge 4 commits into main from worktree-feat+otel-framework-spans
