`docs/performance/SQLITE_COMPARISON.md`

We addressed the gaps via the following optimizations:

Distributed shuffle joins send **all tuples** across the network to partitioned nodes, even when many will never match. This causes unnecessary network traffic and buffer memory usage.
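
To put rough numbers on this (an illustrative calculation, not a measurement from this comparison): if the probe side holds 10 million tuples of about 100 bytes each and only 10% of them have a matching build-side key, an unfiltered shuffle still ships the full ~1 GB, so roughly 90% of the transferred bytes and the buffer space they occupy are spent on tuples that can never join.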
### Solution: Bloom Filter Integration

Implemented bloom filters to filter tuples at the source before network transmission, using a three-phase approach (a minimal sketch in Rust follows this list):

- **Phase 1 (Local Build)**: Each data node scans its local left/build table partition, extracts the join key values, and builds a local bloom filter.
- **Phase 2 (Bit Aggregation)**: The coordinator sends a `BloomFilterBits` RPC to each data node; each node responds with its local bloom bits, and the coordinator OR-aggregates all the bits into a single filter.
- **Phase 3 (Sender-Side Filter)**: The coordinator broadcasts the aggregated filter via the `BloomFilterPush` RPC; before sending right/probe tuples, the `ShuffleFragment` handler checks `might_contain()` and skips tuples that will definitely not match.

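The sketch below illustrates the data structure these phases exchange. It is a minimal illustration under assumptions, not this project's implementation: it uses Rust's standard hasher with double hashing (the actual filter described later is MurmurHash3-based), and the RPC plumbing (`BloomFilterBits`, `BloomFilterPush`, `ShuffleFragment`) is omitted. `insert` corresponds to Phase 1, `merge_or` to the coordinator's Phase 2 OR-aggregation, and `might_contain` to the Phase 3 sender-side check.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Minimal bloom filter sketch (hypothetical; the real implementation is
/// described as MurmurHash3-based). `bits` is the payload a BloomFilterBits
/// response would carry back to the coordinator.
struct BloomFilter {
    bits: Vec<u64>, // bit array packed into 64-bit words
    num_bits: usize,
    num_hashes: usize,
}

impl BloomFilter {
    fn new(num_bits: usize, num_hashes: usize) -> Self {
        BloomFilter { bits: vec![0; (num_bits + 63) / 64], num_bits, num_hashes }
    }

    /// Derive the k probe positions for a key from two base hashes (double hashing).
    fn positions<K: Hash>(&self, key: &K) -> Vec<usize> {
        let mut hasher1 = DefaultHasher::new();
        key.hash(&mut hasher1);
        let h1 = hasher1.finish();
        let mut hasher2 = DefaultHasher::new();
        0x9e37_79b9_u64.hash(&mut hasher2); // salt so the second hash differs
        key.hash(&mut hasher2);
        let h2 = hasher2.finish() | 1;
        (0..self.num_hashes as u64)
            .map(|i| (h1.wrapping_add(i.wrapping_mul(h2)) % self.num_bits as u64) as usize)
            .collect()
    }

    /// Phase 1: each data node inserts its local build-side join keys.
    fn insert<K: Hash>(&mut self, key: &K) {
        for pos in self.positions(key) {
            self.bits[pos / 64] |= 1u64 << (pos % 64);
        }
    }

    /// Phase 2: the coordinator OR-aggregates the bit arrays returned by each node.
    fn merge_or(&mut self, other: &BloomFilter) {
        for (word, other_word) in self.bits.iter_mut().zip(&other.bits) {
            *word |= *other_word;
        }
    }

    /// Phase 3: the shuffle sender skips probe tuples that cannot possibly match.
    fn might_contain<K: Hash>(&self, key: &K) -> bool {
        self.positions(key)
            .into_iter()
            .all(|pos| (self.bits[pos / 64] & (1u64 << (pos % 64))) != 0)
    }
}

fn main() {
    // Two "data nodes" build local filters from their build-side partitions.
    let (mut node_a, mut node_b) = (BloomFilter::new(1024, 7), BloomFilter::new(1024, 7));
    for key in [1_i64, 2, 3] { node_a.insert(&key); }
    for key in [100_i64, 200] { node_b.insert(&key); }

    // The coordinator OR-aggregates the bits into one global filter.
    let mut global = BloomFilter::new(1024, 7);
    global.merge_or(&node_a);
    global.merge_or(&node_b);

    // Sender-side check before shipping probe tuples
    // (false positives are possible, false negatives are not).
    assert!(global.might_contain(&2_i64));
    println!("ship key 999? {}", global.might_contain(&999_i64));
}
```

The property that makes Phase 2 cheap is that two bloom filters built with the same size and hash functions can be combined by a plain bitwise OR of their bit arrays, so the coordinator only ever ships bit vectors, never tuples.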
### Architecture
```
Phase 1: Scan Left          Phase 2: Aggregate Bits        Phase 3: Filter Right
        |                              |                              |
        v                              v                              v
Build local bloom   <--->   BloomFilterBits RPC   <--------   Aggregate & Broadcast
on each data node           (OR-aggregate bits)               via BloomFilterPush
```

Added probabilistic filtering to reduce network traffic in shuffle joins.

- **MurmurHash3-based BloomFilter**: Configurable false-positive rate (default 1%) with optimal bit count and hash function calculation (see the sizing sketch after this list).
- **Distributed Construction**: Each data node builds a local bloom filter from its left/build table partition during the Phase 1 scan.
- **Bit Aggregation**: The coordinator collects the local bloom bits from all data nodes via the `BloomFilterBits` RPC and OR-aggregates them into a single filter.
- **Sender-Side Filtering**: The aggregated filter is broadcast via `BloomFilterPush` before the right/probe shuffle; the `ShuffleFragment` handler applies `might_contain()` before sending `PushData`, skipping tuples that will definitely not match.
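
For reference, the "optimal bit count and hash function calculation" follows the textbook bloom filter sizing formulas; the snippet below is an illustrative sketch of that math, not code taken from this repository. For `n` expected keys and target false-positive rate `p`, the optimal bit count is m = -n·ln(p)/(ln 2)² and the optimal hash count is k = (m/n)·ln 2.

```rust
/// Textbook bloom filter sizing (illustrative, not this repository's code):
///   optimal bit count  m = -n * ln(p) / (ln 2)^2
///   optimal hash count k = (m / n) * ln 2
fn bloom_params(n: usize, p: f64) -> (usize, usize) {
    let ln2 = std::f64::consts::LN_2;
    let m = (-(n as f64) * p.ln() / (ln2 * ln2)).ceil() as usize;
    let k = ((m as f64 / n as f64) * ln2).round().max(1.0) as usize;
    (m, k)
}

fn main() {
    // Default 1% false-positive rate: roughly 9.6 bits per key and 7 hash functions.
    let (m, k) = bloom_params(1_000_000, 0.01);
    println!("bits = {m} (~{:.1} per key), hashes = {k}", m as f64 / 1.0e6);
}
```

Because the OR-aggregation only works when every node's filter uses the same bit count and hash functions, these parameters have to be fixed before the Phase 1 scan begins, presumably from the planner's cardinality estimate for the build side.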
## Lessons Learned
- Shuffle joins significantly reduce network traffic compared to broadcast joins for large-to-large table joins.
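
As a rough illustration of the difference (hypothetical sizes, not results from this comparison): broadcasting a 1 GB build table to 8 data nodes moves about 8 GB over the network, while a hash-partitioned shuffle moves each build and probe tuple at most once, on the order of |R| + |S| bytes, and the bloom filter then trims the probe-side share of that traffic further.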