[QDP Runtime] Add v1 State-Partitioned Distributed Runtime Draft #1226

Open
400Ping wants to merge 2 commits into apache:dev-distributed-qdp from 400Ping:dev-distributed-qdp

400Ping commented Mar 29, 2026

Summary

This PR introduces a v1 draft of qdp-runtime, a state-partitioned distributed runtime skeleton for Mahout QDP.

The goal of this PR is to establish the core abstractions and control-plane scaffolding needed for multi-GPU and future multi-node execution. This is not intended to be a production-ready distributed runtime yet. Instead, it defines the v1 execution model, task lifecycle, placement policies, gather/reduce semantics, and runtime object tracking needed to iterate safely.

What This PR Adds

New crate

  • qdp-runtime

Core runtime model

  • state-partitioned distributed state metadata
  • local/global qubit layout via PartitionLayout
  • DistributedStateHandle and StatePartitionRef
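
To make the local/global qubit split concrete, here is a minimal sketch of how a layout with contiguous amplitude blocks could map amplitude indices to partitions. `LayoutSketch` and all of its fields and methods are hypothetical stand-ins, not the actual `PartitionLayout` API from this PR. It also assumes equal-sized partitions, a power-of-two partition count, and that the high-order qubits act as the global (partition-selecting) qubits.

```rust
use std::ops::Range;

/// Hypothetical stand-in for `PartitionLayout`; names are assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct LayoutSketch {
    /// Total number of qubits in the logical state (2^n amplitudes).
    pub total_qubits: u32,
    /// Number of high-order qubits that select the partition.
    pub global_qubits: u32,
}

impl LayoutSketch {
    /// Builds a layout; returns None unless the partition count is a
    /// power of two that fits within the state.
    pub fn new(total_qubits: u32, num_partitions: u64) -> Option<Self> {
        if total_qubits >= 64 || !num_partitions.is_power_of_two() {
            return None;
        }
        let global_qubits = num_partitions.trailing_zeros();
        if global_qubits > total_qubits {
            return None;
        }
        Some(Self { total_qubits, global_qubits })
    }

    /// Qubits whose amplitudes live entirely inside one partition.
    pub fn local_qubits(&self) -> u32 {
        self.total_qubits - self.global_qubits
    }

    pub fn num_partitions(&self) -> u64 {
        1u64 << self.global_qubits
    }

    pub fn amplitudes_per_partition(&self) -> u64 {
        1u64 << self.local_qubits()
    }

    /// Contiguous range of global amplitude indices owned by `partition`.
    pub fn partition_range(&self, partition: u64) -> Option<Range<u64>> {
        if partition >= self.num_partitions() {
            return None;
        }
        let len = self.amplitudes_per_partition();
        Some(partition * len..(partition + 1) * len)
    }

    /// Maps a global amplitude index to (partition, local offset).
    pub fn locate(&self, amplitude: u64) -> Option<(u64, u64)> {
        if amplitude >> self.total_qubits != 0 {
            return None;
        }
        let local = self.local_qubits();
        Some((amplitude >> local, amplitude & ((1u64 << local) - 1)))
    }
}
```

Under these assumptions, a 5-qubit state split four ways has 3 local qubits, and amplitude 19 falls in partition 2 at local offset 3.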

Placement and topology

  • RoundRobin, Weighted, and TopologyAware placement policies
  • heterogeneous GPU-aware device capability model
  • NVLink-aware topology metadata
  • ClusterInventory and DeviceTopology
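
As an illustration of how round-robin and weighted placement differ, the sketch below assigns partitions to devices under each policy. `PlacementSketch`, `place`, and the integer `weights` are hypothetical and are not taken from the crate. The weighted rule shown (greedily picking the device with the lowest load-to-weight ratio) is one plausible choice, not necessarily the one this PR uses. Topology-aware placement is left out because it depends on link metadata not described here.

```rust
/// Hypothetical policy enum; the crate's real type may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PlacementSketch {
    RoundRobin,
    Weighted,
}

/// Returns, for each partition, the index of the device it is placed on.
/// `weights[d]` is an assumed relative capacity for device `d`.
pub fn place(
    policy: PlacementSketch,
    num_partitions: usize,
    weights: &[u64],
) -> Option<Vec<usize>> {
    if weights.is_empty() {
        return None;
    }
    match policy {
        // Cycle through devices in order, ignoring weights.
        PlacementSketch::RoundRobin => {
            Some((0..num_partitions).map(|p| p % weights.len()).collect())
        }
        PlacementSketch::Weighted => {
            if weights.iter().any(|&w| w == 0) {
                return None;
            }
            let mut load = vec![0u64; weights.len()];
            let mut out = Vec::with_capacity(num_partitions);
            for _ in 0..num_partitions {
                // Pick the device minimizing (load + 1) / weight.
                // Cross-multiplying avoids floats; ties go to the lowest index.
                let mut best = 0usize;
                for d in 1..weights.len() {
                    let lhs = (load[d] as u128 + 1) * weights[best] as u128;
                    let rhs = (load[best] as u128 + 1) * weights[d] as u128;
                    if lhs < rhs {
                        best = d;
                    }
                }
                load[best] += 1;
                out.push(best);
            }
            Some(out)
        }
    }
}
```

With weights `[3, 1]` and four partitions, this greedy rule places three partitions on device 0 and one on device 1, while round-robin would split them evenly.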

Coordinator / worker scaffolding

  • worker registration
  • in-process worker model
  • coordinator job planning
  • partition task generation
  • partition-to-worker mapping

Task lifecycle

  • Pending / Assigned / Running / Completed / Failed
  • retry policy
  • lease timeout skeleton
  • task result reporting
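
The lifecycle above can be read as a small state machine. The following is a sketch only: the state names come from the list above, but `TaskEvent`, `TaskSketch`, and the retry rule (a failed or lease-expired task returns to Pending until an assumed `max_attempts` budget is used up) are illustrative assumptions and may not match the crate's behavior.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TaskState {
    Pending,
    Assigned,
    Running,
    Completed,
    Failed,
}

/// Hypothetical events that drive the lifecycle.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TaskEvent {
    Assign,
    Start,
    Succeed,
    Fail,
    LeaseExpired,
}

/// Hypothetical task record with a simple attempt budget.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TaskSketch {
    pub state: TaskState,
    pub attempts: u32,
    pub max_attempts: u32,
}

impl TaskSketch {
    pub fn new(max_attempts: u32) -> Self {
        Self { state: TaskState::Pending, attempts: 0, max_attempts }
    }

    /// Applies one event, returning the new state or an error for a
    /// transition this sketch does not allow.
    pub fn apply(&mut self, event: TaskEvent) -> Result<TaskState, String> {
        use TaskEvent::*;
        use TaskState::*;
        let next = match (self.state, event) {
            // Each assignment consumes one attempt.
            (Pending, Assign) => {
                self.attempts += 1;
                Assigned
            }
            (Assigned, Start) => Running,
            (Running, Succeed) => Completed,
            // A failure or an expired lease requeues the task while
            // attempts remain; otherwise the task is terminally Failed.
            (Assigned | Running, Fail | LeaseExpired) => {
                if self.attempts < self.max_attempts {
                    Pending
                } else {
                    Failed
                }
            }
            (s, e) => return Err(format!("invalid transition: {:?} on {:?}", s, e)),
        };
        self.state = next;
        Ok(next)
    }
}
```

Treating a lease expiry like a failure keeps the sketch small; a real coordinator would likely also record which worker held the lease.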

Output handling

  • runtime object/output registry
  • gather planning via GatherPlan
  • metric reduction planning via ReducePlan
  • host-side metric aggregation for Sum / Mean / Min / Max / Concat
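
For reference, host-side aggregation over per-partition metric values could look roughly like the sketch below. `ReduceOp`, `Reduced`, and `reduce_metrics` are hypothetical names, not the `ReducePlan` API itself. The sketch also assumes each partition reports a vector of `f64` samples and that partitions are already in partition order.

```rust
/// Hypothetical reduction selector mirroring the five listed operations.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ReduceOp {
    Sum,
    Mean,
    Min,
    Max,
    Concat,
}

/// Result of a reduction: a single value, or all values kept in order.
#[derive(Debug, Clone, PartialEq)]
pub enum Reduced {
    Scalar(f64),
    Vector(Vec<f64>),
}

/// Reduces per-partition metric samples on the host.
/// Returns None when a scalar reduction is requested over no samples.
pub fn reduce_metrics(op: ReduceOp, parts: &[Vec<f64>]) -> Option<Reduced> {
    // Flatten samples in partition order.
    let all: Vec<f64> = parts.iter().flatten().copied().collect();
    if all.is_empty() && op != ReduceOp::Concat {
        return None;
    }
    Some(match op {
        ReduceOp::Sum => Reduced::Scalar(all.iter().sum::<f64>()),
        // Mean over all samples, not a mean of per-partition means.
        ReduceOp::Mean => Reduced::Scalar(all.iter().sum::<f64>() / all.len() as f64),
        ReduceOp::Min => Reduced::Scalar(all.iter().copied().fold(f64::INFINITY, f64::min)),
        ReduceOp::Max => Reduced::Scalar(all.iter().copied().fold(f64::NEG_INFINITY, f64::max)),
        ReduceOp::Concat => Reduced::Vector(all),
    })
}
```

Computing the mean over all samples matters when partitions report different numbers of values: averaging per-partition means would weight small partitions too heavily.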

Examples and docs

  • qdp-runtime/examples/local_runtime_smoke.rs
  • qdp-runtime/examples/local_runtime_benchmark.rs
  • qdp/docs/runtime/RUNTIME_V1.md

Scope of This PR

This PR focuses on the v1 runtime draft and control-plane design.

It intentionally does not attempt to provide:

  • full multi-node transport
  • persistent GPU object store
  • partition migration
  • dynamic repartitioning
  • production-ready fault tolerance
  • complete benchmark suite integration

Design Notes

The current v1 draft is centered around a state-partitioned execution model.

That means:

  • a logical state may be partitioned across devices
  • partition layout is currently contiguous amplitude blocks
  • placement can be weighted and topology-aware
  • gather and metric reduction are explicit runtime operations
  • runtime outputs are tracked via a coordinator-side object registry
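
Because partitions are contiguous amplitude blocks, an explicit gather reduces to concatenating blocks in partition order, whatever order the workers report them in. The sketch below illustrates that idea. `gather_blocks` is a hypothetical helper, not the `GatherPlan` API, and it works on host-side vectors rather than device buffers.

```rust
/// Reassembles a full state from (partition index, block) pairs that may
/// arrive in any order. Hypothetical helper for illustration only.
pub fn gather_blocks<T: Clone>(
    num_partitions: usize,
    blocks: &[(usize, Vec<T>)],
) -> Result<Vec<T>, String> {
    // One slot per partition, filled as blocks are matched to indices.
    let mut slots: Vec<Option<&Vec<T>>> = vec![None; num_partitions];
    for (idx, block) in blocks {
        let slot = slots
            .get_mut(*idx)
            .ok_or_else(|| format!("partition {} out of range", idx))?;
        if slot.is_some() {
            return Err(format!("duplicate partition {}", idx));
        }
        *slot = Some(block);
    }
    // Concatenate in partition order; every partition must be present.
    let mut out = Vec::new();
    for (idx, slot) in slots.into_iter().enumerate() {
        let block = slot.ok_or_else(|| format!("missing partition {}", idx))?;
        out.extend_from_slice(block);
    }
    Ok(out)
}
```

Rejecting missing and duplicate partitions up front means a partial gather fails with an error instead of silently producing a shorter state.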

This PR also keeps NVTX instrumentation hooks in place so that future runtime bottlenecks can be profiled more easily.

Current Limitations

  • transport is still in-process / metadata-oriented
  • local executor support is minimal
  • examples are smoke/benchmark scaffolding, not full workflow integration
  • topology support is advisory metadata in v1
  • object tracking exists, but a persistent GPU-resident object store is future work

Follow-Up Work

  • wire the runtime into a real single-node local execution path end-to-end
  • add persistent runtime object storage semantics
  • improve retry/reassignment behavior
  • extend topology-aware gather/reduce logic
  • add benchmark integration with existing QDP benchmark workflows
  • add multi-node transport and worker communication

Related Issues

Related to #1210

Checklist

  • Added or updated unit tests for all changes
  • Added or updated documentation for all changes

Signed-off-by: 400Ping <jiekaichang@apache.org>

400Ping commented Mar 29, 2026

cc @viiccwen

Signed-off-by: 400Ping <jiekaichang@apache.org>