Zero-KV is a disk-backed, read-optimized key-value storage engine for low-latency local environments.
The project explores systems-level performance techniques in Rust: immutable binary storage, memory-mapped reads, fixed-width binary frames, Unix Domain Sockets, batched request handling, and vectored writes.
The engine is intentionally scoped around a simple workload: fast local lookups from a pre-baked immutable dataset.
- Memory-mapped storage: Uses
memmap2to expose the baked database file through the OS page cache. - Fixed binary protocol: Uses 16-byte request frames and 8-byte response headers for predictable parsing.
- Unix Domain Socket transport: Uses UDS for local inter-process communication without TCP loopback overhead.
- Sorted immutable index: Stores fixed-width index entries and performs binary search over the mapped index region.
- Input-driven baking: Compiles a simple
key,valueinput file into an immutable binarystorage.db. - Batched request handling: Reads 4KB request batches and processes up to 256 pipelined requests per batch.
- Vectored responses: Uses
write_vectoredto write response headers and value slices without assembling a separate contiguous response buffer. - Partial-write safety: Handles partial vectored writes by advancing through written slices until the full response batch is flushed.
- Cache-conscious layout: Aligns stored value regions to 8-byte boundaries to simplify offset calculation and reduce unaligned access concerns.
- Reusable hot-path buffers: Reuses stack-allocated response metadata inside the server loop to reduce allocator pressure during steady-state handling.
- Rust stable
- Linux or WSL2
- Unix Domain Socket support
Clone the repository and build the release binaries:
git clone https://github.com/Alpy16/Zero-KV.git
cd Zero-KV
cargo build --release
Zero-KV expects a simple CSV-like input file:
key,value
Example:
1,hello
2,world
100,First value content
Generate a 100,000-entry benchmark dataset:
seq 1 100000 | awk '{print $1 ",value_content_at_key_" $1}' > bench_input.csv
Compile the input dataset into an immutable binary database:
cargo run --release --bin baker -- bench_input.csv
This creates:
storage.db
The server loads this file at startup.
RUST_LOG=error cargo run --release --bin kv_store
By default, the server listens on:
/tmp/zero-kv.sock
In another terminal:
cargo run --release --bin benchmarker
The benchmarker launches 100 concurrent Unix socket clients, each sending 256-request pipelines against keys in the baked dataset.
The repository includes cycle.sh, which runs the full build, bake, serve, and benchmark cycle.
chmod +x cycle.sh
./cycle.sh
The script performs the following steps:
- Builds release binaries
- Generates a 100,000-entry dataset
- Bakes
storage.db - Starts the server in the background
- Waits for the Unix socket to become available
- Runs the saturation benchmark
- Checks that the server survived the benchmark
- Cleans up temporary input files
The benchmark is designed to measure saturated local read throughput under pipelined Unix Domain Socket traffic.
Latest benchmark configuration:
| Parameter | Value |
|---|---|
| Dataset size | 100,000 entries |
| Concurrent clients | 100 |
| Pipeline depth | 256 requests |
| Run duration | 10 seconds |
| Transport | Unix Domain Sockets |
| Storage mode | Immutable mmap-backed file |
| Workload | Existing-key reads |
| Hit rate | 100% |
Latest observed result:
| Metric | Result |
|---|---|
| Throughput | 10,405,284 requests/sec |
| Batch P50 latency | 2.387ms |
| Batch P99 latency | 5.413ms |
| Total requests | 104,117,504 |
Latency values represent completion time for a full 256-request pipeline, not individual request latency.
Benchmark results are environment-dependent. The result above was collected in a local Linux/WSL2 environment using Unix Domain Sockets.
Runtime behavior was inspected with samply and Firefox Profiler.
View interactive profile trace
In the captured steady-state profile, allocator activity was not visible in the request hot path. Most runtime was concentrated around socket polling, batched request handling, and mmap-backed lookup logic.
Zero-KV
├── src/
│ ├── lib.rs # Protocol types, binary frame layout, shared constants
│ ├── storage.rs # Mmap-backed storage engine and index lookup
│ ├── main.rs # UDS server and request handling loop
│ └── bin/
│ ├── baker.rs # Dataset compiler: input file -> immutable storage.db
│ └── benchmarker.rs # Saturation benchmark client
├── Cargo.toml
├── cycle.sh # Build, bake, serve, benchmark lifecycle script
└── README.md
The baked database is immutable and laid out as:
+------------------+
| Header |
+------------------+
| IndexEntry[0] |
| IndexEntry[1] |
| ... |
| IndexEntry[n] |
+------------------+
| Value bytes |
| 8-byte padding |
| Value bytes |
| 8-byte padding |
| ... |
+------------------+
Each index entry stores:
key: u64
val_offset: u64
val_len: u32
padding: u32
Because index entries are fixed-width and sorted by key, lookups use binary search over the mapped index region.
Each request is a fixed 16-byte frame:
op: u32
padding: u32
key: u64
Supported operations:
| Opcode | Operation |
|---|---|
| 0 | Get |
| 1 | Exists |
Each response begins with an 8-byte header:
status: u32
length: u32
Response statuses:
| Status | Meaning |
|---|---|
| 0 | Ok |
| 1 | Not found |
| 2 | Error |
If length > 0, the response header is followed by the value bytes.
Context: The engine is designed for local IPC workloads where TCP loopback introduces unnecessary protocol overhead.
Decision: Use Unix Domain Sockets as the default transport.
Consequence: The transport remains local-only, but avoids TCP/IP framing and checksum overhead for same-machine clients.
Context: Supporting writes, deletes, and compaction would require additional synchronization and recovery logic.
Decision: Compile input data into a sorted immutable binary file before server startup.
Consequence: Runtime lookup logic stays simple and fast, while mutation support is intentionally out of scope.
Context: Standard read calls require explicit user-space buffers and repeated syscall interaction.
Decision: Use memmap2 to map the database file and let the OS page cache manage file-backed memory.
Consequence: Lookups can return borrowed slices from the mapped file without copying values into a separate storage buffer.
Context: Text protocols and variable-width request formats add parsing overhead and allocation pressure.
Decision: Use fixed-width request and response headers with explicit binary layouts.
Consequence: Request decoding becomes predictable and cheap, but the protocol is less flexible than a self-describing format.
Context: Handling one request per syscall limits throughput under high request rates.
Decision: Read 4KB request batches and flush response headers/value slices through vectored I/O.
Consequence: Syscall cost is amortized across many pipelined requests.
Context: A single vectored write is not guaranteed to flush every byte in the provided slice list.
Decision: Track the number of bytes written, advance through completed slices, trim partially written slices, and continue writing until the full response batch is sent.
Consequence: The server preserves response framing correctness while keeping the zero-copy response path.
The storage engine uses a raw pointer internally to reference the mapped index region.
This avoids self-referential lifetime issues while allowing the read-only storage object to be shared across worker tasks. The pointer is derived from the owned mmap, the mmap length is validated before pointer construction, and the mapped file is never mutated by the engine after startup.
The server response path uses borrowed slices into preallocated response headers and mmap-backed values. Vectored writes are advanced carefully so partial writes do not corrupt the client-visible response stream.
- Read-only storage engine
- No writes, deletes, compaction, or replication
- No crash-recovery layer beyond immutable file loading
- Local IPC only through Unix Domain Sockets
- Benchmark is optimized for local saturated read throughput
- Batch latency is reported per 256-request pipeline, not per individual request
- Current input format is simple CSV-like
key,valueand does not support escaping commas inside values - Benchmark workload currently measures 100% existing-key reads
MIT