
Commit 1d3e497

fix(rocksdb): lower L0 trigger and tighten slowdown/stop thresholds to prevent OOM
With pin_l0_filter_and_index_blocks_in_cache enabled, each L0 SST file pins its bloom filter and index blocks in the block cache. Under the old settings (trigger=64, 256 MB write buffers), L0 could accumulate to the stop threshold: at ~9.75 MB pinned per file, this consumed ~3.7 GB across 3 DBs, overflowing the 2 GB block cache, spilling to uncontrolled heap, and triggering OOM during mainnet initial sync around block 750k.

Lower the trigger to 32 and set slowdown=×3 (96 files) and stop=×4 (128 files). With 128 MB write buffers (~4.9 MB filter blocks per file), pinned metadata at the stop threshold is ~1.88 GB across 3 DBs, within the 2 GB cache at all times. Set the write buffer CLI flag accordingly: --db-write-buffer-size-mb=128
1 parent 2e928f9 commit 1d3e497
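The commit message's sizing argument can be sanity-checked with a standalone sketch. `BYTES_PER_ENTRY` is an assumption reverse-engineered from the stated figures (a 256 MB buffer yielding a ~9.75 MB filter at 10 bits/key), and sizes use decimal units (1 MB = 10^6 bytes) to match the message's rounding; neither comes from the codebase.

```rust
// Back-of-the-envelope check of the pinned-metadata arithmetic in the
// commit message. Assumption: ~33 bytes per entry in a memtable, inferred
// from the stated numbers, not taken from the project's code.
const BITS_PER_KEY: u64 = 10;
const BYTES_PER_ENTRY: u64 = 33; // assumed average entry footprint
const MB: u64 = 1_000_000;
const GB: u64 = 1_000_000_000;

/// Total bloom-filter bytes pinned in the block cache for `l0_files`
/// L0 files across `dbs` databases, given `write_buffer_mb` memtables
/// (each flush produces one L0 file of roughly the write-buffer size).
fn pinned_filter_bytes(write_buffer_mb: u64, l0_files: u64, dbs: u64) -> u64 {
    let keys_per_file = write_buffer_mb * MB / BYTES_PER_ENTRY;
    let filter_bytes_per_file = keys_per_file * BITS_PER_KEY / 8;
    filter_bytes_per_file * l0_files * dbs
}

fn main() {
    // Old settings: 256 MB buffers, stop threshold at 128 files, 3 DBs.
    let old = pinned_filter_bytes(256, 128, 3);
    // New settings: 128 MB buffers, same stop threshold of 128 files.
    let new = pinned_filter_bytes(128, 128, 3);
    println!("old peak: {:.2} GB", old as f64 / GB as f64); // ~3.7 GB, overflows the 2 GB cache
    println!("new peak: {:.2} GB", new as f64 / GB as f64); // ~1.86 GB, fits
}
```

Halving the write buffer halves the filter block per file, so the stop threshold can stay at 128 files while the pinned peak drops below the cache capacity.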

1 file changed: src/new_index/db.rs (12 additions & 8 deletions)
@@ -104,20 +104,24 @@ impl DB {
         // default trigger of 4, while keeping the file count — and therefore
         // bloom-filter memory and lookup cost — bounded.
         //
-        // With bloom filters at 10 bits/key and a 512 MB write buffer, each L0
-        // file has ~7.8 M keys, so its filter block is ~9.75 MB. At 64 files
-        // that is ~625 MB of pinned filter blocks — well within an 8 GB cache.
-        // Each lookup checks 64 bloom filters (fast, in-memory) and reads from
-        // only ~0.64 files on average (1 % false-positive rate × 64 files).
+        // With bloom filters at 10 bits/key and a 128 MB write buffer, each L0
+        // file has ~3.9 M keys, so its filter block is ~4.9 MB. At the slowdown
+        // threshold (96 files) that is ~470 MB of pinned filter blocks per DB,
+        // ~1.41 GB across 3 DBs — within a 2 GB cache. At the stop threshold
+        // (128 files) it is ~628 MB per DB / ~1.88 GB total, still within bounds.
+        // Previously trigger=64 with 256 MB buffers caused pinned metadata to
+        // overflow the 2 GB cache at L0=128 (~3.7 GB), spilling to uncontrolled
+        // heap and triggering OOM. Trigger=32 + slowdown=96 keeps the peak safe
+        // while allowing enough L0 accumulation for good bulk-load throughput.
         //
         // Set slowdown/stop triggers well above the compaction trigger so writes
         // are never stalled while background compaction catches up.
         // Disable the pending-compaction-bytes stall so the large backlog that
         // builds up during the bulk load does not block writes.
-        const L0_BULK_TRIGGER: i32 = 64;
+        const L0_BULK_TRIGGER: i32 = 32;
         db_opts.set_level_zero_file_num_compaction_trigger(L0_BULK_TRIGGER);
-        db_opts.set_level_zero_slowdown_writes_trigger(L0_BULK_TRIGGER * 4);
-        db_opts.set_level_zero_stop_writes_trigger(L0_BULK_TRIGGER * 8);
+        db_opts.set_level_zero_slowdown_writes_trigger(L0_BULK_TRIGGER * 3);
+        db_opts.set_level_zero_stop_writes_trigger(L0_BULK_TRIGGER * 4);
         db_opts.set_hard_pending_compaction_bytes_limit(0);
         db_opts.set_soft_pending_compaction_bytes_limit(0);
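The diff touches only the L0 triggers; the pinning behavior the commit message relies on comes from the block-based table options. A minimal sketch of that surrounding configuration with the rust-rocksdb crate — the 2 GB cache and 128 MB write buffer are the figures from the commit message, and everything else is illustrative, not the project's actual db.rs:

```rust
// Sketch only: the cache/pinning setup implied by the commit message,
// not a copy of the project's code. Requires the `rocksdb` crate.
use rocksdb::{BlockBasedOptions, Cache, Options};

fn bulk_load_opts() -> Options {
    // 2 GB shared block cache (figure from the commit message).
    let cache = Cache::new_lru_cache(2 * 1024 * 1024 * 1024);

    let mut block_opts = BlockBasedOptions::default();
    block_opts.set_block_cache(&cache);
    // Route filter/index blocks through the block cache so their memory
    // is accounted against the cache capacity...
    block_opts.set_cache_index_and_filter_blocks(true);
    // ...and pin L0 filters/indexes so lookups never evict them. This is
    // what makes the L0 file count translate directly into cache usage.
    block_opts.set_pin_l0_filter_and_index_blocks_in_cache(true);

    let mut db_opts = Options::default();
    db_opts.create_if_missing(true);
    db_opts.set_block_based_table_factory(&block_opts);
    db_opts.set_write_buffer_size(128 * 1024 * 1024); // --db-write-buffer-size-mb=128

    // The thresholds from this commit: trigger=32, slowdown=96, stop=128.
    const L0_BULK_TRIGGER: i32 = 32;
    db_opts.set_level_zero_file_num_compaction_trigger(L0_BULK_TRIGGER);
    db_opts.set_level_zero_slowdown_writes_trigger(L0_BULK_TRIGGER * 3);
    db_opts.set_level_zero_stop_writes_trigger(L0_BULK_TRIGGER * 4);
    db_opts.set_hard_pending_compaction_bytes_limit(0);
    db_opts.set_soft_pending_compaction_bytes_limit(0);
    db_opts
}
```

Because the cache is a hard bound only for blocks routed through it, forgetting `set_cache_index_and_filter_blocks(true)` would move filter memory back to untracked heap and reintroduce the OOM risk the commit fixes.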
