# Phase 5: Distributed Optimization

## Overview

Phase 5 introduced high-level optimizations to reduce network latency and enable complex multi-shard query patterns.

## Key Components

### 1. Shard Pruning (`distributed/distributed_executor.cpp`)

Optimized query routing based on the table's sharding key.

- **Predicate Analysis**: Detects filters on the sharding key (e.g., `WHERE id = 100`).
- **Targeted Dispatch**: Routes fragments only to the specific node owning the shard, avoiding cluster-wide broadcasts (a sketch of this routing follows the list).
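
The routing decision can be sketched roughly as follows. This is a minimal illustration, not the actual interface in `distributed_executor.cpp`: `EqualityPredicate`, `ShardMap`, `PruneToSingleNode`, and the hash-partitioning assumption are all hypothetical names introduced here for exposition.

```cpp
#include <cstdint>
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Hypothetical representation of an equality filter extracted from the WHERE clause.
struct EqualityPredicate {
    std::string column;  // column the filter applies to
    int64_t value;       // literal being compared against
};

// Hypothetical shard map: shard index -> owning node id.
struct ShardMap {
    std::string sharding_key;
    std::vector<int> shard_to_node;
};

// Returns the single node that must execute the fragment, or std::nullopt when
// the predicate does not constrain the sharding key and a broadcast is required.
std::optional<int> PruneToSingleNode(const ShardMap& map,
                                     const std::optional<EqualityPredicate>& pred) {
    if (!pred || pred->column != map.sharding_key || map.shard_to_node.empty()) {
        return std::nullopt;  // no usable filter: dispatch to every node
    }
    // Assumes hash partitioning on the key value.
    const std::size_t shard =
        std::hash<int64_t>{}(pred->value) % map.shard_to_node.size();
    return map.shard_to_node[shard];
}
```

A `std::nullopt` result is the fallback case: the fragment is dispatched to every node instead of a single owner.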

### 2. Aggregation Merging

Implemented coordinator-side merging for distributed aggregate queries.

- **Partial Aggregation**: Data nodes compute local counts and sums.
- **Global Merge**: The coordinator identifies aggregate functions in the SELECT list and merges partial results from all shards into a final result set (see the merge sketch below).
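
As an illustration of the global merge step, the following sketch combines per-shard partial COUNT/SUM results at the coordinator. `PartialAgg` and `MergePartials` are assumed names used only for this example.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical partial result produced by one data node for a query group.
struct PartialAgg {
    int64_t count = 0;  // local COUNT(*)
    double sum = 0.0;   // local SUM(col)
};

// Coordinator-side merge: counts and sums are additive across shards, so the
// final COUNT is the sum of counts, and AVG can be derived as sum / count.
PartialAgg MergePartials(const std::vector<PartialAgg>& partials) {
    PartialAgg merged;
    for (const auto& p : partials) {
        merged.count += p.count;
        merged.sum += p.sum;
    }
    return merged;
}
```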

### 3. Broadcast Join Orchestration

Developed a prototype for cross-shard JOINs.

- **Table Fetching**: The coordinator retrieves the full contents of the smaller table from all shards.
- **Broadcasting**: Pushes the gathered rows to the `ShuffleBuffer` of every node in the cluster.
- **Local Execution**: Rewrites the query so each node joins its local shard with the broadcasted buffer data (see the orchestration sketch below).
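
The three steps above can be sketched from the coordinator's point of view. `NodeClient`, `FetchTableShard`, `ExecuteFragment`, and the rewritten fragment syntax are assumptions made for this example; only `PushData` and the `ShuffleBuffer` concept come from the components described in this document.

```cpp
#include <string>
#include <vector>

using Row = std::vector<std::string>;

// Hypothetical per-node RPC client; the member functions stand in for the
// real cluster RPC calls and are stubbed out so the sketch compiles.
struct NodeClient {
    int node_id = 0;
    std::vector<Row> FetchTableShard(const std::string& /*table*/) { return {}; }
    void PushData(const std::string& /*buffer*/, const std::vector<Row>& /*rows*/) {}
    void ExecuteFragment(const std::string& /*sql*/) {}
};

// Coordinator-side orchestration of a broadcast join between a small table
// and a large sharded table.
void BroadcastJoin(std::vector<NodeClient>& nodes,
                   const std::string& small_table,
                   const std::string& large_table) {
    // 1. Table fetching: gather the full small table by pulling every shard of it.
    std::vector<Row> small_rows;
    for (auto& node : nodes) {
        auto shard_rows = node.FetchTableShard(small_table);
        small_rows.insert(small_rows.end(), shard_rows.begin(), shard_rows.end());
    }

    // 2. Broadcasting: push the gathered rows into each node's shuffle buffer.
    const std::string buffer_name = "broadcast_" + small_table;
    for (auto& node : nodes) {
        node.PushData(buffer_name, small_rows);
    }

    // 3. Local execution: rewrite the join so the small-table scan reads from
    //    the buffer (illustrative syntax; the real rewrite substitutes a
    //    buffer scan for the table scan), then run it on every node.
    const std::string fragment = "SELECT * FROM " + large_table +
                                 " JOIN buffer:" + buffer_name + " USING (id)";
    for (auto& node : nodes) {
        node.ExecuteFragment(fragment);
    }
}
```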

### 4. Shuffle Infrastructure

Enabled inter-node data movement.

- **`BufferScanOperator`**: A physical operator that reads from in-memory shuffle buffers instead of heap files.
- **`ClusterManager` Buffering**: Thread-safe staging area for data received via `PushData` RPCs (see the sketch below).
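
A rough sketch of both pieces is shown below, assuming a mutex-guarded map of named buffers and a pull-based `Next()` scan. The class names `ShuffleBuffers` and `BufferScan` are placeholders, not the actual `ClusterManager` or `BufferScanOperator` interfaces.

```cpp
#include <cstddef>
#include <iterator>
#include <mutex>
#include <string>
#include <unordered_map>
#include <vector>

using Row = std::vector<std::string>;

// Thread-safe staging area for rows received via PushData RPCs, roughly in the
// spirit of the ClusterManager buffering described above.
class ShuffleBuffers {
public:
    void Append(const std::string& name, std::vector<Row> rows) {
        std::lock_guard<std::mutex> lock(mu_);
        auto& buf = buffers_[name];
        buf.insert(buf.end(),
                   std::make_move_iterator(rows.begin()),
                   std::make_move_iterator(rows.end()));
    }

    std::vector<Row> Snapshot(const std::string& name) const {
        std::lock_guard<std::mutex> lock(mu_);
        auto it = buffers_.find(name);
        return it == buffers_.end() ? std::vector<Row>{} : it->second;
    }

private:
    mutable std::mutex mu_;
    std::unordered_map<std::string, std::vector<Row>> buffers_;
};

// Minimal stand-in for BufferScanOperator: iterates rows from an in-memory
// shuffle buffer rather than reading pages from a heap file.
class BufferScan {
public:
    BufferScan(const ShuffleBuffers& buffers, const std::string& name)
        : rows_(buffers.Snapshot(name)) {}

    // Returns the next row, or nullptr when the buffer is exhausted.
    const Row* Next() { return pos_ < rows_.size() ? &rows_[pos_++] : nullptr; }

private:
    std::vector<Row> rows_;
    std::size_t pos_ = 0;
};
```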

## Lessons Learned

- Broadcast joins are highly effective for small-to-large table joins but require careful consideration of coordinator memory limits.
- Merging aggregates at the coordinator is a bottleneck for very large clusters; future work could explore tree-based merging.

## Status: 100% Test Pass

All scenarios, including distributed transactions (2PC) and join orchestration, have been verified with automated integration tests.