Skip to content

spanner: Query span ends immediately (<0.1ms) while child RowIterator span runs for full duration (~100ms+) #14198

@Kurichi

Description

@Kurichi

Problem

In cloud.google.com/go/spanner, the Query span created by OpenTelemetry ends almost immediately because query() uses defer endSpan(), which fires on function return. However, the actual Spanner communication hasn't started yet at that point — all I/O happens later inside RowIterator.Next() via ExecuteStreamingSql.

This creates a broken parent-child relationship in traces:

Spanner.Query          duration=~0.01ms   ← parent, ends immediately (zero I/O)
  └── RowIterator      duration=~100ms    ← child, runs long after parent ended
       └── ExecuteStreamingSql  duration=~100ms  ← actual gRPC call

The parent span ends before its child span even starts doing work, which violates the causal relationship that OpenTelemetry spans are meant to represent.

Root Cause

In transaction.go (L706-708):

func (t *txReadOnly) query(ctx context.Context, statement Statement, options QueryOptions) (ri *RowIterator) {
    ctx, _ = startSpan(ctx, "Query", t.otConfig.commonTraceStartOptions...)
    defer func() { endSpan(ctx, ri.err) }()  // ← ends span on function return
    // ...
    return streamWithTransactionCallbacks(...)
}

query() performs no I/O — it only sets up parameters and returns a RowIterator. The defer endSpan() fires immediately.

In read.go (L95):

func streamWithTransactionCallbacks(...) *RowIterator {
    ctx, cancel := context.WithCancel(ctx)
    ctx, _ = startSpan(ctx, "RowIterator")  // child span of Query's context
    return &RowIterator{...}
}

The RowIterator span is created as a child of Query, but Query has already ended by the time RowIterator.Next() is called.

In read.go (L284-290), the RowIterator span only ends when Stop() is called — which is correct, but its parent has long since ended.

The same pattern exists in Read() with startSpan("Read") + defer endSpan.

Observed Trace Data

From observed traces, all Query/RowIterator pairs showed the same pattern consistently:

Span Typical Duration
Spanner.Query < 0.1ms
RowIterator ~100ms
ExecuteStreamingSql ~100ms

The Query span duration is negligible because it performs no I/O, while the RowIterator child span captures all the actual work.

Comparison with Other Languages

Java (google-cloud-spanner-java / ResumableStreamIterator):

  • A single span is created at construction and remains open throughout the entire iteration lifecycle
  • The span only ends in the close() method
  • Uses try (IScope scope = tracer.withSpan(span)) to ensure child gRPC operations are properly nested

Node.js (googleapis/nodejs-spanner):

  • startTrace('Snapshot.run', ...) creates a single span wrapping the entire operation
  • Span ends on stream 'end' or 'error' event

Go (this issue):

  • 2-span split where Query ends immediately and RowIterator does all the work

Java and Node.js correctly use a single span that covers from query initiation through iteration completion. Only Go has this 2-span split with a broken parent-child relationship.

Expected Behavior

A single span should cover the full query lifecycle (from query initiation to iteration completion):

Before: Query (<0.1ms) → RowIterator (~100ms) → ExecuteStreamingSql (~100ms)
After:  Query (~100ms) → ExecuteStreamingSql (~100ms)

Possible Fix

  1. Remove the separate RowIterator span from streamWithTransactionCallbacks() in read.go
  2. Remove defer endSpan() from query() — instead, have RowIterator.Stop() end the Query span (since r.streamd.ctx would carry the Query span context)
  3. On error before RowIterator creation, end the span immediately
  4. Apply the same fix to Read()

This would align with the Java/Node.js approach of a single span covering the full operation.

Environment

  • cloud.google.com/go/spanner v1.88.0
  • Go 1.24

Related Issues

Metadata

Metadata

Assignees

Labels

api: spannerIssues related to the Spanner API.priority: p2Moderately-important priority. Fix may not be included in next release.

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions