Summary
The Agent Framework streaming instrumentation captures no token usage metrics in either the LLM-level (BraintrustChatClientMiddleware) or agent-level (BraintrustAgentMiddleware) streaming paths. The non-streaming paths in both middleware classes correctly capture prompt_tokens, completion_tokens, and tokens via SpanTagHelper.SetTokenMetrics().
What is missing
LLM-level: BraintrustChatClientMiddleware
The non-streaming GetResponseAsync (line 58) captures token usage:
SpanTagHelper.SetTokenMetrics(activity, response.Usage);
The streaming GetStreamingResponseAsync (lines 80–157) never calls SetTokenMetrics. It yields individual ChatResponseUpdate objects but does not accumulate or extract usage data from them.
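One possible way to close this gap is sketched below. It is a minimal sketch, not the SDK's actual code: the class name StreamingUsageSketch, the method name TrackUsageAsync, and the caller-owned accumulator are hypothetical. It assumes that the provider reports usage as UsageContent items inside the streamed ChatResponseUpdate objects, which Microsoft.Extensions.AI allows but does not guarantee for every provider.

```csharp
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using Microsoft.Extensions.AI;

// Hypothetical helper, not part of the Braintrust SDK.
public static class StreamingUsageSketch
{
    // Yields every update unchanged while summing any UsageContent items
    // into the caller-owned `total` accumulator.
    public static async IAsyncEnumerable<ChatResponseUpdate> TrackUsageAsync(
        IAsyncEnumerable<ChatResponseUpdate> updates,
        UsageDetails total,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        await foreach (var update in updates.WithCancellation(cancellationToken))
        {
            foreach (var content in update.Contents)
            {
                if (content is UsageContent usage)
                {
                    // UsageDetails.Add sums the input, output, and total counts.
                    total.Add(usage.Details);
                }
            }

            yield return update;
        }

        // Assumption: at this point the middleware could record the totals,
        // e.g. SpanTagHelper.SetTokenMetrics(activity, total);
        // If the provider emitted no usage, all counts on `total` stay null.
    }
}
```

Under the same assumptions, an alternative would be to buffer the updates and call the ToChatResponseAsync extension, then read Usage from the resulting ChatResponse. That costs more memory for long streams but reuses the library's own aggregation logic.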
Agent-level: BraintrustAgentMiddleware
The non-streaming RunCoreAsync (line 59) captures token usage:
SpanTagHelper.SetTokenMetrics(activity, response.Usage);
The streaming RunCoreStreamingAsync (lines 76–130) also never calls SetTokenMetrics. Additionally, this path captures no output at all (no SetOutputMessages call), making it even more sparse than its non-streaming counterpart.
Impact
Users of Agent Framework streaming, a common pattern for interactive applications, will see spans without token usage data, while non-streaming calls produce complete metrics. This makes cost tracking and token budgeting unreliable for streaming workloads. The gap is most pronounced in the agent-level middleware, which captures neither output nor token usage for streaming calls.
Braintrust docs status
The Braintrust tracing docs at https://www.braintrust.dev/docs/instrument/trace-llm-calls state that token usage is captured automatically. For Agent Framework streaming in the C# SDK, this is not the case. Status: supported (documented as expected behavior, not implemented for streaming paths).
Upstream sources
- Microsoft.Extensions.AI: NuGet package Microsoft.Extensions.AI v10.4.1. ChatResponse.Usage provides UsageDetails with InputTokenCount, OutputTokenCount, and TotalTokenCount; streaming implementations may provide aggregated usage.
- M.E.AI IChatClient docs: https://learn.microsoft.com/en-us/dotnet/api/microsoft.extensions.ai.ichatclient
- Microsoft Agents Framework: NuGet packages Microsoft.Agents.AI v1.0.0 and Microsoft.Agents.AI.Workflows v1.0.0
Local files inspected
- src/Braintrust.Sdk.AgentFramework/BraintrustChatClientMiddleware.cs: non-streaming GetResponseAsync calls SetTokenMetrics (line 58); streaming GetStreamingResponseAsync (lines 80–157) does not
- src/Braintrust.Sdk.AgentFramework/BraintrustAgentMiddleware.cs: non-streaming RunCoreAsync calls SetTokenMetrics (line 59); streaming RunCoreStreamingAsync (lines 76–130) does not, and also captures no output
- src/Braintrust.Sdk.AgentFramework/SpanTagHelper.cs: SetTokenMetrics (lines 88–98) is available but unused by streaming paths
- tests/Braintrust.Sdk.AgentFramework.Tests/: no streaming tests verify token metric capture