feat(runtimed): Execution handle abstraction — decouple "submit work" from "get results" #1048

@rgbkrk

Problem

The current Python API has three execution methods on CellHandle:

  • cell.run() — execute and wait for collected results
  • cell.queue() — fire and forget
  • cell.stream() — execute and yield events as they come

All three conflate "submit work to the runtime" with "how you consume the results." The frontend model is cleaner: queue a cell for execution, then watch the CRDT for output changes as they stream in. The execution queue is a daemon concept, but there's no first-class representation of it in the Python API.

Proposed API

# Submit work — returns immediately
execution = await cell.execute()

# Check status (sync read from local state)
execution.status      # "pending" | "running" | "done" | "error"
execution.cell_id     # the cell being executed

# Wait for completion
result = await execution.result()          # blocks until done
result = await execution.result(timeout=30) # with timeout

# Stream events (alternative to waiting)
async for event in execution:
    print(event.event_type, event.output)

# Cancel
await execution.cancel()

The cell's outputs update independently through the CRDT regardless of whether anyone holds the Execution handle. This matches how the frontend works — it queues execution and observes document changes.
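To make the handle's surface concrete, here is a minimal in-memory sketch. It is not the runtimed implementation: the `Event` shape, the `_push` hook, and the event type names `"started"`, `"output"`, and `"done"` are assumptions made for illustration, and a real handle would read status from synced state rather than from a local queue.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class Event:
    # Hypothetical event shape; field names follow the usage shown in the issue.
    event_type: str
    output: Any = None


class Execution:
    """Toy handle: status is a plain attribute, events arrive on a local queue."""

    def __init__(self, cell_id: str) -> None:
        self.cell_id = cell_id
        self.status = "pending"
        self._outputs: List[Any] = []
        self._events: asyncio.Queue = asyncio.Queue()
        self._finished = asyncio.Event()

    def _push(self, event: Event) -> None:
        # Called by the simulated runtime, never by user code.
        if event.event_type == "started":
            self.status = "running"
        elif event.event_type == "output":
            self._outputs.append(event.output)
        elif event.event_type in ("done", "error"):
            self.status = event.event_type
            self._finished.set()
        self._events.put_nowait(event)

    async def result(self, timeout: Optional[float] = None) -> List[Any]:
        # Wait for a terminal status; raises asyncio.TimeoutError on timeout.
        await asyncio.wait_for(self._finished.wait(), timeout)
        return list(self._outputs)

    def __aiter__(self) -> "Execution":
        return self

    async def __anext__(self) -> Event:
        # Stop once the terminal event has been delivered and the queue is drained.
        if self._finished.is_set() and self._events.empty():
            raise StopAsyncIteration
        return await self._events.get()


async def demo():
    execution = Execution("cell-1")

    async def fake_kernel() -> None:
        execution._push(Event("started"))
        execution._push(Event("output", "hello"))
        execution._push(Event("done"))

    task = asyncio.create_task(fake_kernel())
    seen = [event.event_type async for event in execution]
    await task
    return seen, await execution.result(timeout=1), execution.status
```

Streaming and waiting are independent here: `result()` does not consume the event queue, so a caller can iterate first and still collect the result afterwards.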

cell.run() becomes sugar

# Equivalent to:
execution = await cell.execute()
result = await execution.result(timeout=timeout_secs)

cell.queue() becomes sugar

# Equivalent to:
execution = await cell.execute()
# (discard the handle)
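
As a rough sketch of how both methods reduce to `execute()`, the stub below wires `run()` and `queue()` on top of a single submit path. `StubCell` and `StubExecution` are hypothetical stand-ins, not the real `CellHandle`, and the stub result resolves immediately instead of waiting on the runtime.

```python
import asyncio


class StubExecution:
    """Hypothetical stand-in for the Execution handle."""

    def __init__(self, value):
        self._value = value

    async def result(self, timeout=None):
        # Resolves right away; a real handle would wait for the runtime to finish.
        return await asyncio.wait_for(asyncio.sleep(0, self._value), timeout)


class StubCell:
    """Hypothetical stand-in for CellHandle with a single submit path."""

    def __init__(self, source):
        self.source = source
        self.submitted = 0

    async def execute(self):
        # The only method that submits work.
        self.submitted += 1
        return StubExecution(f"ran: {self.source}")

    async def run(self, timeout_secs=None):
        # Sugar: submit, then wait for the result.
        execution = await self.execute()
        return await execution.result(timeout=timeout_secs)

    async def queue(self):
        # Sugar: submit and intentionally discard the handle.
        await self.execute()
```

With this shape, any future change to submission (for example, returning the `execution_id`) only has to be made in `execute()`.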

Architecture: Execution IDs + CRDT State (no intents)

The spike in #1052 proved that threading a UUID execution_id through the stack works end-to-end. For 2.1, we keep the existing ExecuteCell request/response path (no CRDT-based intents) and focus on making execution a first-class concept in the CRDT documents.

Why no intents

An earlier design proposed writing execution intents into the notebook doc. After review, this was dropped because:

  • Intents are mutable CRDT data but semantically immutable once acknowledged — Automerge doesn't enforce this
  • A client modifying an already-processed intent creates inconsistency the daemon must silently ignore
  • The existing ExecuteCell request/response is clean and ephemeral — no lingering mutable state
  • Attribution and CRDT-based cancel can be revisited later if needed

Document ownership

| Concern | Document | Writer(s) |
| --- | --- | --- |
| Cell source, metadata | Notebook doc | Any client |
| Cell execution_id pointer | Notebook doc | Daemon (on acknowledgment) |
| Kernel status, queue state | Runtime state doc | Daemon only |
| Execution lifecycle (status, timing) | Runtime state doc | Daemon only |

How it works

  1. Client sends ExecuteCell request (same as today)
  2. Daemon generates execution_id (UUID), queues the cell, responds with CellQueued { cell_id, execution_id }
  3. Daemon immediately writes execution_id onto the cell in the notebook doc — before the kernel starts
  4. Daemon creates execution entry in runtime state doc: executions/{execution_id}/ with status, timing fields
  5. As the kernel runs, daemon updates status ("queued" → "running" → "done" / "error") and appends outputs to outputs/{execution_id}/
  6. Frontend reads the cell's execution_id, looks up outputs from the runtime state doc, displays them
  7. Python Execution handle reads status and outputs from the runtime state doc
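
The steps above can be sketched with plain dicts standing in for the two Automerge documents. This is a simulation under stated assumptions, not daemon code: the function names are hypothetical, the notebook doc layout (`cells` keyed by cell id) is invented for the example, the execution entry omits the timing fields, and whether a running execution stays in `queue/order` is a guess.

```python
import uuid


def new_docs():
    # Plain dicts standing in for the notebook doc and the runtime state doc.
    notebook_doc = {"cells": {"cell-1": {"source": "print('hi')", "execution_id": None}}}
    runtime_state = {
        "executions": {},
        "queue": {"executing": None, "order": []},
        "outputs": {},
    }
    return notebook_doc, runtime_state


def handle_execute_cell(notebook_doc, runtime_state, cell_id):
    # Step 2: generate the execution_id and queue the cell.
    execution_id = str(uuid.uuid4())
    runtime_state["queue"]["order"].append(execution_id)
    # Step 3: write the pointer onto the cell before the kernel starts.
    notebook_doc["cells"][cell_id]["execution_id"] = execution_id
    # Step 4: create the execution entry (timing fields omitted here).
    runtime_state["executions"][execution_id] = {"cell_id": cell_id, "status": "queued"}
    runtime_state["outputs"][execution_id] = []
    # The CellQueued response from step 2.
    return {"cell_id": cell_id, "execution_id": execution_id}


def run_next(runtime_state, produced_outputs):
    # Step 5: move the next queued execution through running to done.
    queue = runtime_state["queue"]
    execution_id = queue["order"].pop(0)
    queue["executing"] = execution_id
    entry = runtime_state["executions"][execution_id]
    entry["status"] = "running"
    runtime_state["outputs"][execution_id].extend(produced_outputs)
    entry["status"] = "done"
    queue["executing"] = None
    return execution_id


def outputs_for_cell(notebook_doc, runtime_state, cell_id):
    # Steps 6 and 7: follow the cell's pointer into the runtime state doc.
    execution_id = notebook_doc["cells"][cell_id]["execution_id"]
    if execution_id is None:
        return []
    return runtime_state["outputs"].get(execution_id, [])
```

Readers never need the `CellQueued` response to find outputs; the pointer on the cell is enough, which is what lets the frontend and the Python handle share one lookup path.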

Runtime state doc schema

executions/                 Map
  {execution_id}/           Map
    cell_id: Str
    status: Str             "queued" | "running" | "done" | "error"
    execution_count: Int
    started_at: Str|null
    finished_at: Str|null
    success: Bool|null

queue/                      Map
  executing: Str|null       execution_id currently running
  order: List[Str]          execution_ids in queue order

outputs/                    Map
  {execution_id}/           List[Map]
    output_type: Str
    output_json: Str
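
For Python consumers, the schema above could be mirrored as typed dictionaries. This is only a sketch: the class names and both helper functions are hypothetical, and the real documents are Automerge maps and lists rather than Python dicts.

```python
from typing import Dict, List, Optional, TypedDict

# Status values as listed in the schema.
STATUSES = ("queued", "running", "done", "error")


class ExecutionEntry(TypedDict):
    cell_id: str
    status: str
    execution_count: int
    started_at: Optional[str]
    finished_at: Optional[str]
    success: Optional[bool]


class QueueState(TypedDict):
    executing: Optional[str]  # execution_id currently running
    order: List[str]          # execution_ids in queue order


class Output(TypedDict):
    output_type: str
    output_json: str


class RuntimeState(TypedDict):
    executions: Dict[str, ExecutionEntry]
    queue: QueueState
    outputs: Dict[str, List[Output]]


def empty_runtime_state() -> RuntimeState:
    # Hypothetical helper: a runtime state doc with nothing queued.
    return {"executions": {}, "queue": {"executing": None, "order": []}, "outputs": {}}


def new_execution_entry(cell_id: str, execution_count: int) -> ExecutionEntry:
    # Hypothetical helper: a freshly queued entry with the nullable fields unset.
    return {
        "cell_id": cell_id,
        "status": "queued",
        "execution_count": execution_count,
        "started_at": None,
        "finished_at": None,
        "success": None,
    }
```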

Cell execution_id pointer

Each cell gains an execution_id: Str|null field in the notebook doc. The daemon sets it at acknowledgment time, and the frontend reads it to know which outputs to display. Clearing outputs means setting it to null (a pure client op).

Save semantics

Snapshot both docs, join on execution_id: for each cell, look up its execution_id in the runtime state doc's outputs map, serialize to ipynb.
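
A minimal sketch of that join, using plain dict snapshots in place of the two documents. The function name and the notebook snapshot layout are hypothetical, and it assumes `output_json` holds one JSON-encoded output object per entry, which the schema does not spell out.

```python
import json


def outputs_by_cell(notebook_snapshot, runtime_snapshot):
    """Join cells to their outputs on execution_id, decoding output_json."""
    saved = {}
    for cell_id, cell in notebook_snapshot["cells"].items():
        execution_id = cell.get("execution_id")
        if execution_id is None:
            # Never executed, or outputs were cleared.
            raw_outputs = []
        else:
            raw_outputs = runtime_snapshot["outputs"].get(execution_id, [])
        # Assumption: each output_json string decodes to one output object.
        saved[cell_id] = [json.loads(output["output_json"]) for output in raw_outputs]
    return saved
```

Taking both snapshots before joining matters: reading the pointer from one document and the outputs from a later version of the other could pair a cell with outputs from a newer execution.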

Cancel

Stays as kernel interrupt (same as today). No CRDT-based cancel for now. The Execution.cancel() method sends an interrupt request.

Sub-issues (implementation order)

See linked sub-issues for agent-sized work items.

Context

This came out of the 2.0 API redesign (#983, #1030). The current run/queue/stream methods work and ship with 2.0. This issue tracks the evolution to a cleaner abstraction that mirrors the frontend's queue-and-observe model.


Status (updated 2026-03-31)

Most sub-issues have landed.

The execution_id is threaded end-to-end. The Python Execution handle ships with 2.0. What remains is moving cell outputs from the notebook doc into RuntimeStateDoc.outputs/{execution_id}/ (#1106), which also sets the pattern for widget outputs (#761).

Note: CRDT-based execution intents were evaluated and dropped. The existing ExecuteCell request/response path is kept — it's clean, ephemeral, and doesn't leave mutable state in the CRDT that must be silently ignored after processing.
