[BOT ISSUE] Agent Framework streaming paths missing token usage metrics #45

@braintrust-bot

Summary

The Agent Framework streaming instrumentation captures no token usage metrics in either the LLM-level (BraintrustChatClientMiddleware) or agent-level (BraintrustAgentMiddleware) streaming paths. The non-streaming paths in both middleware classes correctly capture prompt_tokens, completion_tokens, and tokens via SpanTagHelper.SetTokenMetrics().

What is missing

LLM-level: BraintrustChatClientMiddleware

The non-streaming GetResponseAsync (line 58) captures token usage:

SpanTagHelper.SetTokenMetrics(activity, response.Usage);

The streaming GetStreamingResponseAsync (lines 80–157) never calls SetTokenMetrics. It yields individual ChatResponseUpdate objects but does not accumulate or extract usage data from them.
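
One possible shape for the fix, sketched under assumptions rather than taken from the SDK: wrap the update stream, sum any UsageContent items found in each update's Contents, and report the total once the stream completes. WithUsageAsync and onCompleted are hypothetical names introduced for illustration. UsageContent, UsageDetails.Add, and ChatResponseUpdate.Contents come from Microsoft.Extensions.AI. Whether a given provider emits usage in its streaming updates is provider-dependent, so the callback may never fire.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using Microsoft.Extensions.AI;

public static class StreamingUsageSketch
{
    // Hypothetical helper (not part of the SDK): yields every update unchanged
    // while summing UsageContent items, then hands the total to the caller
    // after the source stream has been fully enumerated.
    public static async IAsyncEnumerable<ChatResponseUpdate> WithUsageAsync(
        IAsyncEnumerable<ChatResponseUpdate> updates,
        Action<UsageDetails> onCompleted,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        UsageDetails? total = null;

        await foreach (var update in updates.WithCancellation(cancellationToken))
        {
            foreach (var content in update.Contents)
            {
                if (content is UsageContent usage)
                {
                    // Providers may report usage in one update or across several.
                    total ??= new UsageDetails();
                    total.Add(usage.Details);
                }
            }

            yield return update;
        }

        // Only reached if the consumer enumerates the stream to the end.
        if (total is not null)
        {
            onCompleted(total);
        }
    }
}
```

Inside GetStreamingResponseAsync the callback could be something like `usage => SpanTagHelper.SetTokenMetrics(activity, usage)`, assuming SetTokenMetrics accepts the same UsageDetails value the non-streaming path passes as response.Usage. Note that a consumer that stops enumerating early never triggers the callback, so a real fix would need to decide whether to record partial usage when the span is closed.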

Agent-level: BraintrustAgentMiddleware

The non-streaming RunCoreAsync (line 59) captures token usage:

SpanTagHelper.SetTokenMetrics(activity, response.Usage);

The streaming RunCoreStreamingAsync (lines 76–130) also never calls SetTokenMetrics. Additionally, this path captures no output at all (no SetOutputMessages call), making it even more sparse than its non-streaming counterpart.

Impact

Users of Agent Framework streaming, a common pattern for interactive applications, see spans without token usage data, while non-streaming calls produce complete metrics. This makes cost tracking and token budgeting unreliable for streaming workloads. The gap is widest for the agent-level middleware, which captures neither output nor token usage for streaming calls.

Braintrust docs status

The Braintrust tracing docs at https://www.braintrust.dev/docs/instrument/trace-llm-calls state that token usage is captured automatically. For Agent Framework streaming in the C# SDK, this is not the case. Status: supported (documented as expected behavior, not implemented for streaming paths).

Upstream sources

  • Microsoft.Extensions.AI: NuGet package Microsoft.Extensions.AI v10.4.1 — ChatResponse.Usage provides UsageDetails with InputTokenCount, OutputTokenCount, TotalTokenCount; streaming implementations may provide aggregated usage
  • M.E.AI IChatClient docs: https://learn.microsoft.com/en-us/dotnet/api/microsoft.extensions.ai.ichatclient
  • Microsoft Agents Framework: NuGet packages Microsoft.Agents.AI v1.0.0 and Microsoft.Agents.AI.Workflows v1.0.0

Local files inspected

  • src/Braintrust.Sdk.AgentFramework/BraintrustChatClientMiddleware.cs — non-streaming GetResponseAsync calls SetTokenMetrics (line 58); streaming GetStreamingResponseAsync (lines 80–157) does not
  • src/Braintrust.Sdk.AgentFramework/BraintrustAgentMiddleware.cs — non-streaming RunCoreAsync calls SetTokenMetrics (line 59); streaming RunCoreStreamingAsync (lines 76–130) does not, and also captures no output
  • src/Braintrust.Sdk.AgentFramework/SpanTagHelper.cs — SetTokenMetrics (lines 88–98) is available but unused by streaming paths
  • tests/Braintrust.Sdk.AgentFramework.Tests/ — no streaming tests verify token metric capture
