feat(loops+otel): nested GenAI-semconv span tree for loop topology (Tangle Intelligence viewer)#78

Merged: drewstone merged 1 commit into main from feat/loop-otel-genai-tracing on May 31, 2026
@tangletools (Contributor)

Makes the driven/dynamic loops render as a real topology tree in any
OpenTelemetry/GenAI viewer (Tangle Intelligence in ADC, Phoenix, Tempo, …) over
the existing OTLP export path — no new channel.

Problem

Loop traces were flat, zero-duration point spans with bespoke loop.*
attributes: ingestible, but rendered as a flat list. The agent-authored
topology (the per-round move and its rationale) was invisible because the
driver and planner never reached the trace layer.

Future-proofing — standard, not bespoke

Spans now follow the current OpenTelemetry GenAI semantic conventions
(registry),
explicitly avoiding the deprecated keys:

  • gen_ai.operation.name (invoke_workflow / invoke_agent), gen_ai.agent.name,
    gen_ai.conversation.id, gen_ai.usage.input_tokens / output_tokens
  • 🚫 not gen_ai.system, gen_ai.usage.prompt_tokens, gen_ai.usage.completion_tokens
  • tangle.loop.* / tangle.cost.usd namespaced extension for what OTel hasn't
    standardized (topology move, verdict, placement, cost) — the OTel-blessed
    vendor-extension escape hatch, so a future standard is a one-namespace remap.
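
As a rough illustration of the attribute split above, the sketch below builds the attribute bag for one iteration span. The attribute keys are the ones listed in this PR; the `IterationInfo` shape and the `iterationAttributes` helper are hypothetical names chosen for the example, not the exported API.

```typescript
// Hypothetical input shape for illustration; the real event fields may differ.
interface IterationInfo {
  agentName: string;
  conversationId: string;
  inputTokens: number;
  outputTokens: number;
  costUsd?: number;
}

type AttrValue = string | number | boolean;

// Standard GenAI semconv keys first, then the tangle.* vendor extension.
function iterationAttributes(info: IterationInfo): Record<string, AttrValue> {
  const attrs: Record<string, AttrValue> = {
    "gen_ai.operation.name": "invoke_agent",
    "gen_ai.agent.name": info.agentName,
    "gen_ai.conversation.id": info.conversationId,
    "gen_ai.usage.input_tokens": info.inputTokens,
    "gen_ai.usage.output_tokens": info.outputTokens,
  };
  // Cost has no standardized key, so it lives under the namespaced extension.
  if (info.costUsd !== undefined) {
    attrs["tangle.cost.usd"] = info.costUsd;
  }
  return attrs;
}
```

Keeping the non-standard keys behind a single prefix is what would make a later migration a rename of one namespace rather than a schema change.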

Shape

loop (invoke_workflow)
  └─ loop.round[k] (invoke_workflow)   tangle.loop.move.{kind,width,rationale}, .decision
       ├─ loop.iteration[i] (invoke_agent)  gen_ai.agent.name, gen_ai.usage.*,
       │     tangle.loop.verdict.{valid,score}, tangle.loop.placement.kind, tangle.cost.usd
       └─ …
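
A viewer recovers this shape purely from parent/child span ids. The sketch below shows that reconstruction on a minimal span record; `SpanLike` and `renderTree` are hypothetical names for illustration, and the sketch assumes root spans carry no `parentSpanId`.

```typescript
// Minimal stand-in for an exported span; real OTLP spans carry more fields.
interface SpanLike {
  spanId: string;
  parentSpanId?: string;
  name: string;
}

// Group spans by parent id, then walk depth-first from the roots.
function renderTree(spans: SpanLike[]): string[] {
  const children = new Map<string | undefined, SpanLike[]>();
  for (const span of spans) {
    const siblings = children.get(span.parentSpanId) ?? [];
    siblings.push(span);
    children.set(span.parentSpanId, siblings);
  }
  const lines: string[] = [];
  const walk = (parent: string | undefined, depth: number): void => {
    for (const span of children.get(parent) ?? []) {
      lines.push(`${"  ".repeat(depth)}${span.name}`);
      walk(span.spanId, depth + 1);
    }
  };
  walk(undefined, 0);
  return lines;
}
```

With flat point spans every record would land at depth 0, which is the flat list described under Problem.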

Changes

  • Kernel: new loop.plan event per round ({roundIndex, plannedCount, moveKind, rationale})
    via an optional Driver.describePlan() (dynamic driver returns its move's kind+rationale;
    refine/fanout infer from count). loop.iteration.ended now carries tokenUsage.
  • otel-export: buildLoopOtelSpans(events, traceId, parentSpanId?) — pure builder of the
    nested, real-duration tree with the attribute schema above. loopEventToOtelSpan (flat)
    kept for back-compat. Scope version 0.23.0 → 0.33.0.
  • trace-propagation: buffers events per runId, flushes the tree on loop.ended so the
    live MCP→OTLP path ships the hierarchy.
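
The buffering step in the trace-propagation change could look roughly like the following. This is a sketch under assumptions: the `LoopEvent` shape, the `RunBuffer` class, and the flush callback are hypothetical, and only the `loop.ended` event name comes from this PR.

```typescript
// Hypothetical event shape; the kernel's real events carry more fields.
interface LoopEvent {
  runId: string;
  type: string;
}

type FlushFn = (runId: string, events: LoopEvent[]) => void;

class RunBuffer {
  private readonly byRun = new Map<string, LoopEvent[]>();
  private readonly flush: FlushFn;

  constructor(flush: FlushFn) {
    this.flush = flush;
  }

  // Hold events per run; hand the whole batch over once the loop ends.
  push(event: LoopEvent): void {
    const events = this.byRun.get(event.runId) ?? [];
    events.push(event);
    this.byRun.set(event.runId, events);
    if (event.type === "loop.ended") {
      this.byRun.delete(event.runId);
      this.flush(event.runId, events);
    }
  }
}
```

Flushing only on `loop.ended` means the tree builder sees every round and iteration at once, at the cost of holding a run's events in memory until it finishes.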

Tests

buildLoopOtelSpans tree shape + real durations + gen_ai/tangle attrs + a
no-deprecated-keys assertion (+4); kernel emits loop.plan with move kind +
rationale and iteration tokenUsage (+1). Full suite 403 green, tsc + biome clean.
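
For reference, a no-deprecated-keys check can be as small as the sketch below. The three key names are the deprecated ones listed in this PR; `findDeprecatedKeys` is a hypothetical helper, not the assertion used in otel-export.test.ts.

```typescript
// Deprecated GenAI attribute keys this PR explicitly avoids.
const DEPRECATED_KEYS = [
  "gen_ai.system",
  "gen_ai.usage.prompt_tokens",
  "gen_ai.usage.completion_tokens",
];

// Returns every deprecated key present on a span's attribute bag.
function findDeprecatedKeys(attrs: Record<string, unknown>): string[] {
  return DEPRECATED_KEYS.filter((key) => key in attrs);
}
```

A test would then assert that the result is empty for every span the builder returns.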

Increment 2 (separate): the planner's own gen_ai span (needs
PlannerContext.traceEmitter). Serves #754 (intelligence ingestor).

…logy

Loop traces were flat, zero-duration point spans with bespoke loop.* attrs —
ingestible but rendering as a flat list, with the agent-authored topology (the
round-by-round move + rationale) invisible. This makes the dynamic loops render
as a real topology tree in any OTel/GenAI viewer (Tangle Intelligence, Phoenix,
Tempo, …) over the existing OTLP export path.

Emission (kernel):
- New loop.plan trace event per plan() round: { roundIndex, plannedCount,
  moveKind, rationale }. moveKind comes from a new OPTIONAL Driver.describePlan()
  (createDynamicDriver returns its chosen move's kind + rationale); refine/
  fanout-vote omit it and the kernel infers kind from the planned-task count.
- loop.iteration.ended now carries tokenUsage → maps to gen_ai.usage.* on the
  branch span.

OTel mapping (otel-export):
- buildLoopOtelSpans(events, traceId, parentSpanId?) reconstructs a nested,
  REAL-DURATION tree: loop (invoke_workflow) → loop.round (move kind/width/
  rationale/decision) → loop.iteration (invoke_agent: gen_ai.agent.name,
  gen_ai.usage.input_tokens/output_tokens, verdict, placement, cost).
- Attributes follow the CURRENT GenAI semconv (gen_ai.operation.name,
  gen_ai.agent.name, gen_ai.conversation.id, gen_ai.usage.input/output_tokens) —
  explicitly NOT the deprecated gen_ai.system / prompt_tokens / completion_tokens
  — plus a namespaced tangle.loop.* / tangle.cost.usd extension for what OTel
  hasn't standardized (topology move, verdict, placement, cost).
- trace-propagation buffers events per runId and flushes the tree on loop.ended,
  so the live MCP→OTLP path ships the hierarchy, not flat point spans.
- Bumped the stale OTLP scope version 0.23.0 → 0.33.0.
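
Real durations come from pairing each span's start and end events instead of emitting a point span per event. The sketch below shows one way to do that pairing; the `loop.iteration.started` event name, the `iterationIndex` and `timestampMs` fields, and the `pairIterationTimings` helper are assumptions, since this commit only names loop.iteration.ended.

```typescript
// Hypothetical event shape for illustration.
interface TimedEvent {
  type: "loop.iteration.started" | "loop.iteration.ended";
  iterationIndex: number;
  timestampMs: number;
}

interface IterationTiming {
  iterationIndex: number;
  startMs: number;
  endMs: number;
}

// Match each end event to its recorded start, in the order the ends arrive.
function pairIterationTimings(events: TimedEvent[]): IterationTiming[] {
  const starts = new Map<number, number>();
  const timings: IterationTiming[] = [];
  for (const event of events) {
    if (event.type === "loop.iteration.started") {
      starts.set(event.iterationIndex, event.timestampMs);
      continue;
    }
    const startMs = starts.get(event.iterationIndex);
    // Skip an end event with no recorded start rather than invent a duration.
    if (startMs === undefined) continue;
    timings.push({ iterationIndex: event.iterationIndex, startMs, endMs: event.timestampMs });
  }
  return timings;
}
```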

loopEventToOtelSpan (flat) is retained for back-compat. Increment 2 (the
planner's own gen_ai span) needs PlannerContext.traceEmitter — deferred.

Tests: buildLoopOtelSpans tree shape + real durations + gen_ai/tangle attrs +
no-deprecated-keys assertion (otel-export.test.ts, +4); kernel emits loop.plan
with move kind + rationale and iteration tokenUsage (dynamic.test.ts, +1). Full
suite 403 green, tsc + biome clean.
@drewstone merged commit 86fc739 into main on May 31, 2026
1 check failed