performance

Thomas Mangin edited this page May 29, 2026 · 5 revisions

Pre-Alpha. This page describes behavior that may change.

ze-perf is a BGP propagation latency benchmark tool. It measures how long it takes for BGP UPDATE messages to propagate through a device under test (DUT) by establishing sender and receiver sessions, injecting routes from the sender, and timing their arrival at the receiver. The tool produces per-route latency distributions, convergence times, and throughput numbers.

Subcommands

run

Run a benchmark against a BGP DUT. The sender establishes a BGP session with the DUT and injects a set of routes. The receiver establishes a separate session and measures how long each route takes to arrive. Multiple iterations are run and the results are aggregated.

ze-perf run --dut-addr 172.31.0.2 --dut-asn 65000
ze-perf run --dut-addr 172.31.0.2 --dut-asn 65000 --routes 10000 --json
ze-perf run --dut-addr 172.31.0.2 --dut-asn 65000 --family ipv6/unicast
ze-perf run --dut-addr 172.31.0.2 --dut-asn 65000 --force-mp --repeat 10
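
As a rough illustration of how per-route latencies, a convergence time, and a throughput figure can be derived from sender and receiver timestamps, here is a minimal sketch. It is not ze-perf's code: the function name, the dictionary layout, the nearest-rank percentile, and the definition of convergence (first route sent to last route received) are all assumptions made for the example.

```python
import math


def summarize(send_ns: dict[str, int], recv_ns: dict[str, int]) -> dict[str, float]:
    """Summarize one iteration from per-prefix send/receive timestamps (ns).

    Hypothetical helper: ze-perf's real data model may differ.
    """
    # Only prefixes seen on both sides contribute a latency sample.
    latencies = sorted(recv_ns[p] - send_ns[p] for p in send_ns if p in recv_ns)
    if not latencies:
        raise ValueError("no route arrived at the receiver")

    def percentile(q: float) -> int:
        # Nearest-rank percentile over the sorted samples.
        rank = max(1, math.ceil(q / 100 * len(latencies)))
        return latencies[rank - 1]

    # Assumed definition: first route sent to last route received.
    convergence_ns = max(recv_ns.values()) - min(send_ns.values())
    seconds = convergence_ns / 1e9
    return {
        "p50_ns": percentile(50),
        "p99_ns": percentile(99),
        "max_ns": latencies[-1],
        "convergence_ns": convergence_ns,
        "routes_per_sec": len(latencies) / seconds if seconds > 0 else float("inf"),
    }
```

Aggregating several iterations would then amount to calling a helper like this once per iteration and combining the resulting summaries.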

report

Generate a comparison report from one or more result JSON files. Supports Markdown tables (default), self-contained HTML, and full documentation with methodology disclaimers.

ze-perf report result-ze.json result-gobgp.json
ze-perf report --html result-ze.json result-gobgp.json > report.html
ze-perf report --doc result-*.json > docs/performance.md

track

Track performance history over time from an NDJSON file and detect regressions. In --check mode, the command exits non-zero if any metric has regressed beyond its configured threshold.

ze-perf track history.ndjson
ze-perf track --check history.ndjson
ze-perf track --html history.ndjson > trend.html
ze-perf track --check --threshold-convergence 15 history.ndjson
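
The threshold check can be pictured with the sketch below. Everything in it is an assumption for illustration: the `convergence_ms` field name, the choice of the previous record as baseline, the reading of the threshold as a percentage, and the "lower is better" rule are not taken from ze-perf's actual NDJSON schema or logic.

```python
import json


def regressed(ndjson_text: str, metric: str, threshold_pct: float) -> bool:
    """Return True if the newest record is worse than the previous one
    by more than threshold_pct percent (hypothetical rule, lower is better)."""
    # NDJSON: one JSON object per non-empty line.
    records = [json.loads(line) for line in ndjson_text.splitlines() if line.strip()]
    if len(records) < 2:
        return False  # nothing to compare against
    baseline, latest = records[-2][metric], records[-1][metric]
    if baseline <= 0:
        return False  # avoid dividing by a zero or negative baseline
    increase_pct = (latest - baseline) / baseline * 100
    return increase_pct > threshold_pct
```

A wrapper script could call `sys.exit(1)` when this returns True, which mirrors the non-zero exit described for --check mode.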

Key flags for run

| Flag | Default | Purpose |
|------|---------|---------|
| --dut-addr | (required) | DUT BGP address. |
| --dut-asn | (required) | DUT autonomous system number. |
| --dut-port | 179 | DUT BGP port. |
| --dut-name | unknown | DUT implementation name (for reports). |
| --dut-version | | DUT version string (for reports). |
| --sender-addr | 127.0.0.1 | Sender local address. |
| --sender-asn | 65001 | Sender autonomous system number. |
| --receiver-addr | 127.0.0.2 | Receiver local address. |
| --receiver-asn | 65002 | Receiver autonomous system number. |
| --routes | 1000 | Number of routes to inject. |
| --family | ipv4/unicast | Address family (ipv4/unicast or ipv6/unicast). |
| --force-mp | false | Force MP_REACH_NLRI encoding for IPv4 unicast. |
| --seed | 0 (random) | Deterministic PRNG seed for route generation. |
| --repeat | 5 | Number of benchmark iterations. |
| --warmup-runs | 1 | Warmup iterations (discarded from results). |
| --warmup | 2s | Delay after session establishment before injecting. |
| --duration | 60s | Maximum time to wait for convergence per iteration. |
| --batch-size | 0 (auto) | NLRIs per UPDATE message (0 = auto-max within 4096 bytes). |
| --passive-listen | false | Listen on port 179 for inbound DUT connections (requires root). |
| --json | false | JSON output. |
| --output | | Write JSON results to file (implies --json). |
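
To give a feel for what "auto-max within 4096 bytes" can mean for --batch-size, the sketch below estimates how many IPv4 NLRIs fit in one classic (non-MP) UPDATE. The header and length-field sizes come from RFC 4271. The 20-byte attribute size used in the usage note is a made-up figure, and ze-perf's own calculation may differ; with --force-mp or ipv6/unicast the NLRIs sit inside MP_REACH_NLRI, so this arithmetic does not apply as written.

```python
BGP_MAX_MESSAGE = 4096  # RFC 4271 maximum message size, in bytes
BGP_HEADER = 19         # marker (16) + length (2) + type (1)
UPDATE_FIXED = 4        # withdrawn routes length (2) + total path attribute length (2)


def nlri_size(prefix_len: int) -> int:
    """Bytes used by one NLRI: a length octet plus the minimum prefix octets."""
    return 1 + (prefix_len + 7) // 8


def max_nlris_per_update(prefix_len: int, attr_bytes: int) -> int:
    """How many equal-length NLRIs fit after the header and path attributes."""
    budget = BGP_MAX_MESSAGE - BGP_HEADER - UPDATE_FIXED - attr_bytes
    return max(0, budget // nlri_size(prefix_len))
```

With an assumed 20 bytes of path attributes, `max_nlris_per_update(24, 20)` gives 1013 and `max_nlris_per_update(32, 20)` gives 810.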

Make targets

make ze-perf          # Build the ze-perf binary
make ze-perf-bench    # Run a benchmark against the local Ze instance
make ze-perf-report   # Generate a comparison report from result files
make ze-perf-track    # Check history for regressions

Docker test infrastructure

The test/perf/ directory contains the Docker-based benchmark infrastructure. It includes DUT configurations, run scripts (run.py), and a results/ directory for storing benchmark output. The Docker setup launches Ze (and optionally other implementations) in containers with isolated networking so that benchmarks are reproducible across machines.

The benchmark suite currently tests 8 implementations: Ze, BIRD, FRR, GoBGP, OpenBGPd, RustyBGP, ExaBGP, and bio-rd.

RIB attribute bundling

The Loc-RIB uses attribute bundling to reduce memory consumption. Non-AS_PATH attributes (ORIGIN, MED, LOCAL_PREF, COMMUNITY, EXTENDED_COMMUNITY, LARGE_COMMUNITY, etc.) are stored in a shared BundlePool that deduplicates via content hashing. Each RouteEntry holds a bundle handle (4 bytes) plus a separate AS_PATH handle (8 bytes), instead of 13 inline attribute handles (56 bytes).

In real-world RIBs, 97% of routes share identical non-AS_PATH attributes, so most entries point to the same few bundles. Measured on a 100K-entry RIB:

| Metric | Before | After | Change |
|--------|--------|-------|--------|
| Scan latency | 1,930 µs | 1,586 µs | -18% |
| Scan memory per op | 10,853 B | 2,281 B | -79% |
| Insert no-op latency | 56,517 µs | 48,482 µs | -14% |

The bundling is transparent to callers: the RIB API returns the same attribute values as before. The only observable difference is reduced memory footprint and improved cache locality.
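
The deduplication idea can be sketched as a small content-addressed pool. This is a toy model, not Ze's BundlePool: the class name, the use of SHA-256, and the integer handles are assumptions chosen to keep the example short.

```python
import hashlib


class ToyBundlePool:
    """Toy content-addressed pool: identical attribute bytes share one handle."""

    def __init__(self) -> None:
        self._by_digest: dict[bytes, int] = {}
        self._bundles: list[bytes] = []

    def intern(self, attrs: bytes) -> int:
        """Return the handle for attrs, storing the bytes only on first sight."""
        digest = hashlib.sha256(attrs).digest()
        handle = self._by_digest.get(digest)
        if handle is None:
            handle = len(self._bundles)
            self._bundles.append(attrs)
            self._by_digest[digest] = handle
        return handle

    def get(self, handle: int) -> bytes:
        """Resolve a handle back to the stored attribute bytes."""
        return self._bundles[handle]

    def __len__(self) -> int:
        return len(self._bundles)
```

Routes whose non-AS_PATH attributes encode to the same bytes receive the same handle, so the pool grows with the number of distinct bundles rather than with the number of routes.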

ribOut compact storage

The per-peer Adj-RIB-Out (ribOut) was the dominant per-peer cost. Phase 2 replaced the full per-peer *Route (288 B struct, 385 to 741 B measured per route) with a 16-byte ribOutEntry: MsgID (8 B) + attribute pool handle (4 B) + stale level (1 B) + padding (3 B). Wire attribute bytes are deduplicated in a shared pool (pool.RibOut), so the same UPDATE sent to N peers stores one pool copy and N four-byte handles. The full *Route is reconstructed on demand on cold paths only (replay, show, refresh).
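
The 16-byte figure follows directly from the field sizes quoted above. The sketch below models that layout with Python's `struct` module. It is only an illustration of the arithmetic: the field order, little-endian encoding, and helper names are assumptions, and the real ribOutEntry is a Go struct.

```python
import struct

# Hypothetical layout mirroring the quoted sizes:
# MsgID (8 B) + attribute pool handle (4 B) + stale level (1 B) + padding (3 B).
RIB_OUT_ENTRY = struct.Struct("<QIB3x")


def pack_entry(msg_id: int, attr_handle: int, stale_level: int) -> bytes:
    """Encode one entry; the three pad bytes are written as zeros."""
    return RIB_OUT_ENTRY.pack(msg_id, attr_handle, stale_level)


def unpack_entry(raw: bytes) -> tuple[int, int, int]:
    """Decode one entry back into (msg_id, attr_handle, stale_level)."""
    return RIB_OUT_ENTRY.unpack(raw)
```

`RIB_OUT_ENTRY.size` evaluates to 16, matching the entry size stated above.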

Measured at 100K IPv4 /32 routes on an Apple M4 Max with Go 1.26 (TotalAlloc / routes, including struct, backing data, and map overhead):

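| Layer | Struct size | Measured per route | Allocs | Per-peer |
|-------|-------------|--------------------|--------|----------|
| Plugin RIB (adj-rib-in) | 32 B | 69 B | 1.0 | No |
| Plugin ribOut (before) | 288 B | 385 to 741 B | 6 to 10 | Yes |
| Plugin ribOut (after) | 16 B | ~16 B + shared pool | 0 | Entry yes, pool no |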

Scaling impact across the three storage layers:

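| Scenario | Plugin RIB | Plugin ribOut | Shared pool | Total |
|----------|------------|---------------|-------------|-------|
| 100K routes, 1 peer | 7 MB | 2 MB | ~7 MB | 16 MB |
| 100K routes, 10 peers | 7 MB | 15 MB | ~7 MB | 29 MB |
| 1M routes, 10 peers | 66 MB | 153 MB | ~70 MB | 289 MB |
| 1M routes, 50 peers | 66 MB | 763 MB | ~70 MB | 899 MB |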
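
The Plugin ribOut column can be reproduced with simple arithmetic: routes × peers × 16 bytes per entry. The sketch below assumes the table's "MB" are binary megabytes (1,048,576 bytes), which is an inference from the numbers rather than something the page states, and it counts only the fixed-size entries, not map overhead or the shared pool.

```python
ENTRY_BYTES = 16      # size of one ribOutEntry, as stated above
MIB = 1024 * 1024     # assumed unit behind the table's "MB"


def rib_out_mib(routes: int, peers: int) -> int:
    """Estimated per-peer ribOut entry storage, rounded to whole MiB."""
    return round(routes * peers * ENTRY_BYTES / MIB)
```

For the four scenarios in the table, `rib_out_mib` yields 2, 15, 153, and 763 respectively.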

Hot-path optimization campaign

Ze has undergone a systematic optimization campaign targeting the UPDATE processing pipeline. Key results:

| Optimization | Impact |
|--------------|--------|
| Zero-alloc AS-PATH prepend | Same-encoding eBGP forwarding avoids heap allocation. |
| Lazy monitor delivery | JSON formatting skipped when no monitor subscriber matches. |
| GC pressure reduction | Reduced allocation rate on hot paths (event delivery, RIB insert, forward dispatch). |
| Batch destination resolution | Per-peer forwarding facts precomputed at session lifecycle boundaries, not per-UPDATE. |
| Parse-once RIB insert | Attributes parsed once per UPDATE instead of per NLRI. |
| Zero-filter JSON fast path | Bypasses filter machinery when no attribute filter is set. |
| Typed event delivery | Atomic monitor count and typed delivery for hot-path events. |
| textbuf pooling | sync.Pool for text buffers, freeze-after-extract semantics. |
| appendJSONSafeString | Skips per-byte escape on safe strings. |

See also

  • Testing for the full test suite overview.
  • Building for the Make targets that produce the binaries.
