Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
65 commits
Select commit Hold shift + click to select a range
384bcd4
draft: streaming-mode cpu reduction design spec
packethog May 15, 2026
00c6b9e
spec: revise streaming cpu design with --enable-websocket flag and fi…
packethog May 15, 2026
d4fd8ef
plan: streaming cpu reduction implementation plan
packethog May 15, 2026
ae3bb49
plan: address codex findings on plan v1 (recovery, backstop, dirty pl…
packethog May 15, 2026
4b57d07
plan: address codex v2 findings (ws-gate, snapped gate, dedicated 5s …
packethog May 15, 2026
cff3895
plan: align stuck-stream backstop interval with tob freshness thresho…
packethog May 15, 2026
f794ca1
test: add Px::num_digits boundary tests around 10^n thresholds
packethog May 15, 2026
9e20066
perf: replace Px::num_digits f64 log10 with u64::ilog10
packethog May 15, 2026
6100fa2
perf: pre-size L2 level output Vecs to skip realloc growth
packethog May 15, 2026
f27a70e
feat: add --enable-websocket cli flag (default off)
packethog May 15, 2026
c9927b6
feat: thread enable_websocket through run_websocket_server with start…
packethog May 15, 2026
96c4e35
feat: skip ws tcp bind when --enable-websocket is off
packethog May 15, 2026
387b224
feat: thread enable_websocket into OrderBookListener and OrderBookState
packethog May 16, 2026
1704c9b
test: tidy websocket_disabled_test doc backticks and unwrap allow
packethog May 16, 2026
5d9fb0e
perf: skip 6 bucketed l2 variants and cap unbucketed to bbo when webs…
packethog May 16, 2026
f9eae5d
test: assert tob bbo is identical across enable_websocket configs
packethog May 16, 2026
acce669
fix: reject startup when no market-data output is configured
packethog May 18, 2026
3f9f006
feat: add book_dirty flag to OrderBookState set only on real mutations
packethog May 18, 2026
2ab72b4
perf: emit streaming l2 snapshot at block finalization, not per chunk
packethog May 18, 2026
552f3bd
feat: 5s stuck-stream snapshot backstop on dedicated 250ms ticker
packethog May 18, 2026
161410d
fix: emit authoritative tob snapshot after streaming recovery
packethog May 18, 2026
8b49e3e
test: add streaming-mode listener test constructor
packethog May 18, 2026
4eca555
test: l2-5 streaming finalization behavior suite
packethog May 18, 2026
e4e4696
refactor: expose dual-validator fixture snapshot capture for reuse
packethog May 18, 2026
9075ef3
test: assert stream quote sequence is an ordered block subsequence wi…
packethog May 18, 2026
190383d
docs: document --enable-websocket multicast-only default in README
packethog May 18, 2026
eeed11c
docs: changelog for streaming cpu reduction and ws default-off
packethog May 18, 2026
274804d
fix: authoritative tob snapshots bypass staleness suppression so stal…
packethog May 18, 2026
26c00f3
fix: narrow staleness bypass to corrections only and decouple recover…
packethog May 18, 2026
54fe8c8
fix: gate L2-5 finalization and recovery emission on !enable_websocke…
packethog May 18, 2026
3f6d91b
order book: atomic live-height recheck before surgical recovery
packethog May 18, 2026
75bf8e4
order book: only mark provisional published when publisher will deliv…
packethog May 18, 2026
886adc5
order book: finalization never bypasses staleness; only recovery corr…
packethog May 18, 2026
ddc9f47
publisher: reject --dob-group without --multicast-group (registry not…
packethog May 18, 2026
068d101
websocket server: reject no-output config at the runner boundary, not…
packethog May 18, 2026
d641d65
websocket server: reject dob-only config at the runner boundary to ma…
packethog May 18, 2026
fcd31f1
publisher: own the provisional-supersede decision (resolves catch-up …
packethog May 18, 2026
99ec7b9
backstop: trigger on source age not local clock; tie supersede flag t…
packethog May 18, 2026
64e9075
stream cpu: skip l4 fanout when websocket disabled; one-shot backstop…
packethog May 18, 2026
8916d43
gitignore: exclude .claude/*.lock harness artifacts
packethog May 18, 2026
5790c04
publisher: recovery correction must not clear the provisional-superse…
packethog May 18, 2026
6523ca5
backstop: re-arm one-shot guard on post-provisional mutation so long …
packethog May 18, 2026
ee981aa
publisher: only enter caught-up state on a fresh publish, not a force…
packethog May 18, 2026
7c9e044
publisher: clear caught-up on a forced-stale publish so periodic rese…
packethog May 18, 2026
269fb6f
l2: don't reserve full source depth for bucketed variants (bounded sn…
packethog May 18, 2026
651d830
recovery: reject off-lock report on same-height intra-block race via …
packethog May 18, 2026
7dae2f4
recovery: only validate finalized stream heights; bump mutation seq o…
packethog May 18, 2026
0f7262c
backstop: freshness-window emit (retry-while-fresh, silent-when-stale…
packethog May 18, 2026
271b74f
publisher: a provisional publish must not mark caught-up (no stale au…
packethog May 18, 2026
793837e
backstop: dedup provisional by mutation seq instead of a phase-skippa…
packethog May 18, 2026
77a137d
docs: reconcile readme/changelog with final design (dob requires mult…
packethog May 18, 2026
9fa87c6
publisher: discharge supersede only on full local send; resend cached…
packethog May 18, 2026
edee645
fix: revert pending-driven resend (rolls back subscribers); gate back…
packethog May 18, 2026
6de440d
publisher: retry only the cached forced-stale supersede (S2), never t…
packethog May 18, 2026
19d545e
publisher: cache only published snapshots so a suppressed stale one c…
packethog May 18, 2026
a1623d7
publisher: a fully-sent interval retry discharges the provisional obl…
packethog May 18, 2026
fc77a81
publisher: drop racy cached-supersede resend; rely on in-loop force-p…
packethog May 18, 2026
618c460
publisher: drop caught-up on broadcast lag so stale cache is not rebr…
packethog May 18, 2026
a23e3c9
publisher: corrections stay resync-eligible; broadcast lag also clear…
packethog May 18, 2026
3951bce
publisher: keep pending-provisional across broadcast lag (next author…
packethog May 18, 2026
31ac40f
gitignore: exclude docker cross-build output dir
packethog May 18, 2026
d57d8c0
docs: update ARCHITECTURE.md for multicast-only default, L2-5 finaliz…
packethog May 18, 2026
7693bc6
docs: add repo claude.md onboarding pointing at architecture.md
packethog May 18, 2026
13dc3e8
binaries: default --dob-channel-bound to 65536 (burst-safe); reconcil…
packethog May 18, 2026
e352d03
order book: gate block-mode l4 fanout on enable_websocket
packethog May 19, 2026
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -1,3 +1,5 @@
target/
server/tmp/
.worktrees/
.claude/*.lock
target-docker/
135 changes: 127 additions & 8 deletions ARCHITECTURE.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,8 +7,10 @@ the file tree alone.
The main binary is `dz_hl_publisher` in
`binaries/src/bin/dz_hl_publisher.rs`. It can serve WebSocket clients, publish
top-of-book (TOB) multicast, publish depth-of-book (DoB) multicast, and expose
Prometheus metrics. Production deployments in this repo mostly use it as a
multicast publisher.
Prometheus metrics. Production deployments in this repo run it **multicast-only:
the WebSocket listener is disabled by default** and is opt-in via
`--enable-websocket`. At least one output must be configured (see Process
Startup); a no-output run is rejected.

## High-Level Model

Expand Down Expand Up @@ -59,11 +61,19 @@ Important source files:
`dz_hl_publisher` parses CLI flags and constructs:

- `IngestMode`: `block` or `stream`.
- WebSocket listener: off unless `--enable-websocket` is passed.
- Optional TOB `MulticastConfig`.
- Optional DoB `DobConfig`.
- Optional Prometheus metrics listener.
- Optional streaming-only separate fill ingest.

**Output-mode validation.** A run with no output (`--enable-websocket` absent
and neither `--multicast-group` nor `--dob-group`) is rejected. `--dob-group` is
not a standalone mode: the instrument registry is only bootstrapped from the HL
API in multicast mode, so `--dob-group` requires `--multicast-group`. Both rules
are enforced at the CLI **and** again at the `run_websocket_server` library
boundary, so an alternate caller cannot start a publisher that emits nothing.

It then calls `run_websocket_server`. The function name is historical: this is
also the top-level coordinator for multicast-only deployments.

Expand All @@ -72,7 +82,12 @@ also the top-level coordinator for multicast-only deployments.
- `market_message_tx`: carries `InternalMessage::Snapshot` and
`InternalMessage::Fills` for TOB and WebSocket L2/trade consumers.
- `l4_message_tx`: carries `InternalMessage::L4BookUpdates` for WebSocket L4
consumers. TOB does not subscribe to this high-volume channel.
consumers. TOB does not subscribe to this high-volume channel. When the
WebSocket listener is disabled, the L4 fan-out is skipped entirely on both
ingest paths — streaming (`publish_l4_update` returns early) and block mode
(`receive_block_batch` is `enable_websocket`-gated): in multicast-only mode
there are no L4 consumers, so the clone + task-spawn + broadcast per applied
block/diff is pure overhead and is elided.

If TOB or DoB is enabled, a shared instrument registry is also created. TOB uses
it to map coins to instrument definitions. DoB uses it to resolve internal coins
Expand Down Expand Up @@ -121,9 +136,23 @@ After initialization:
- Every validation interval, the listener compares its computed L4 snapshot
with a fresh validator snapshot.

Validation is done off-lock to avoid blocking the hot path: the listener clones
book state under the lock (capturing the snapshot height and a monotonic
`mutation_seq`), drops the lock, computes and compares the snapshot, then
re-acquires the lock to apply repairs. Before applying, it re-checks that the
live height **and** `mutation_seq` still match what was validated, and (in
streaming mode) that the height is finalized with no block still buffered for
it. If anything moved, the report is stale and discarded — a later validation
cycle re-derives against current state. This prevents a raced report from
overwriting newer book state. Recovery's own per-coin mutations also bump
`mutation_seq`, so an overlapping validation task cannot replay a stale repair.

When validation finds a per-coin divergence, the listener repairs only affected
coins. If DoB is enabled, this recovery also emits an `InstrumentReset` and
queues a priority DoB snapshot for the affected instrument.
queues a priority DoB snapshot for the affected instrument. On the TOB side,
recovery emits a `Correction` snapshot (see TOB Publishing Hot Path) so the
corrected book reaches subscribers without waiting for an unrelated future
diff.

## Block Mode Hot Path

Expand Down Expand Up @@ -166,6 +195,9 @@ Key properties:
mutations.
- Fills publish TOB trades but do not mutate book state.
- TOB snapshots are emitted once per file-read chunk, not per individual diff.
This per-chunk cadence applies to block mode and to WebSocket-enabled
streaming. Multicast-only streaming instead emits per finalized block (see
Streaming Finalization).
- In streaming mode, optional separate fill ingest sends fills directly to the
market broadcast and does not compute book snapshots for fill-only rows.

Expand Down Expand Up @@ -196,7 +228,7 @@ sequenceDiagram
L->>B: drain earliest block
B->>S: apply_stream_diff when ready
S->>D: immediate DoB mutation event
L->>M: Snapshot after read chunk
L->>M: Snapshot at block finalization (per chunk if --enable-websocket)
L->>M: Fills as fill rows arrive
M->>T: pair fill sides by tid
M->>T: freshness decision, then Quote/Trade or suppression
Expand Down Expand Up @@ -266,6 +298,20 @@ meaningful diffs after finalization are fatal because they would mutate a closed
DoB block. Late statuses after finalization are metadata-only unless a pending
raw diff can consume them, so they are counted and ignored.

In multicast-only streaming, finalization also drives the **authoritative TOB
L2 snapshot**: instead of emitting once per file-read chunk, the listener emits
one snapshot when a block finalizes. A real BBO-affecting mutation opens a
`book_dirty` epoch (stamped with the dirtying row's source time and a
`mutation_seq`); finalization emits the authoritative snapshot and closes the
epoch. If a block stays unfinalized, a stuck-stream backstop emits a
`Provisional` snapshot — gated so it only fires once the dirty epoch's source
age is at least `STREAM_DIRTY_BACKSTOP_INTERVAL`, deduplicated by `mutation_seq`
so a static stall does not recompute every tick while ongoing diffs still
re-publish updated state. Recovery (per-coin divergence repair) emits a
`Correction`. WebSocket-enabled streaming bypasses all of this and keeps the
pre-existing per-chunk cadence byte-for-byte; that flag is the durable rollback
path for this emission model.

## TOB Publishing Hot Path

TOB has two UDP channels:
Expand All @@ -281,6 +327,17 @@ The TOB publisher subscribes only to `market_message_tx`. It receives:
It intentionally ignores `InternalMessage::L4BookUpdates`, so high-volume L4
traffic cannot evict TOB snapshots from the TOB receiver.

The L2 snapshot carried in `InternalMessage::Snapshot` is computed by
`compute_l2_snapshots`. With the WebSocket listener enabled it produces the full
set of WebSocket `l2Book` aggregation variants (the `n_sig_figs`/`mantissa`
bucket combinations, 7 per coin). With the WebSocket listener disabled there are
no `l2Book` subscribers, so it computes a single unbucketed snapshot — the TOB
publisher only ever uses level 1 (best bid/ask) of it anyway. This affects the
TOB/WebSocket-L2 path only. The DoB feed is a separate path (per-order L4 deltas
via `DobApplyTap` and full per-instrument resting-order snapshots via
`clone_coin_orders`) and is unaffected — DoB depth/level coverage does not
change with this setting.

```mermaid
flowchart LR
Listener["OrderBookListener"] -->|"Snapshot"| MarketTx["market broadcast"]
Expand Down Expand Up @@ -308,6 +365,40 @@ marketdata is suppressed. Heartbeats and refdata can still be emitted. Activity
tracking is based on actual marketdata datagrams, so suppressed messages do not
starve heartbeats.

### Snapshot emission kinds and the supersede model

`InternalMessage::Snapshot` carries an emission kind that the publisher owns the
interpretation of (the publisher is the only component that both knows whether a
provisional was actually sent and applies the freshness gate, with one clock):

- **Authoritative** — normal block finalization / block-mode per-chunk emit.
Lag-gated: published only when fresh, so catch-up backlog is suppressed.
- **Provisional** — stuck-stream backstop. Lag-gated like Authoritative.
- **Correction** — recovery divergence repair. Always published, bypassing the
freshness gate, because it corrects *incorrect* (diverged) data rather than
merely stale data.

A `Provisional` that is actually sent creates a supersede obligation: the next
finalized-block `Authoritative` is force-published even if it is now stale, so a
subscriber that saw the partial provisional converges to the block's final book.
The obligation is discharged only by a *fully locally sent* Authoritative (a
partial UDP send keeps it, so a later snapshot re-forces it); a recovery
`Correction` is orthogonal to the stream dirty epoch and never discharges it.

`caught_up` gates the periodic `snapshot_interval` resync (a bounded
rebroadcast of the last published full snapshot — the recovery path for
subscribers that missed a datagram over fire-and-forget UDP). It is true only
when the most recent publish was a genuinely fresh `Authoritative`, or any
`Correction` (a correction must stay resync-eligible — leaving a subscriber on
diverged data is worse than stale). It is cleared on suppression, on a
forced-stale publish, and on broadcast-receiver `Lagged` (a lag drop may have
skipped the only finalization snapshot, so the cache can no longer be asserted
current). Broadcast `Lagged` does not clear the supersede obligation: keeping it
lets the next finalized Authoritative resync a stranded subscriber. There is
one documented irreducible best-effort gap here: a lost forced-stale supersede
during a continued stall is recovered by the next snapshot / the caught-up
resync once the stream recovers, not by unbounded stale rebroadcast.

Important latency components:

- Validator/source lag: `local_time - block_time`.
Expand Down Expand Up @@ -384,9 +475,19 @@ There are three notable fanout boundaries:
| `l4_message_tx` | Tokio broadcast | WebSocket `L4BookUpdates` | High-volume L4 traffic should not affect TOB. |
| DoB MPSC | bounded Tokio MPSC | `DobEvent` | Full queue drops DoB mutation events. |

The DoB MPSC depth (and the DoB snapshot-request channel) is sized by
`--dob-channel-bound`, which **defaults to `65536`**. Hyperliquid emits large
`OrderAdd` bursts at block boundaries, so this must be sized for burst
absorption, not average rate; the default is chosen accordingly. Overriding it
to a shallow value (e.g. the old `4096`) drops correctness-significant DoB
mutation events during normal block-boundary bursts — do not lower it without
evidence the workload's bursts fit.

Broadcast lag and bounded-channel drops are intentionally different failure
modes. Broadcast lag is receiver-side loss on an internal pub-sub channel. DoB
MPSC full is producer-side backpressure at the apply tap.
modes. Broadcast lag is receiver-side loss on an internal pub-sub channel; the
TOB publisher mitigates it by invalidating `caught_up` on `Lagged` so it does
not rebroadcast a possibly-stale cached snapshot as current. DoB MPSC full is
producer-side backpressure at the apply tap and is correctness-significant.

## Metrics and Observability

Expand Down Expand Up @@ -492,10 +593,28 @@ active instrument. This usually points at stale or incomplete instrument
metadata. The mutation still applies to the internal book, but no DoB event is
emitted for that coin.

### `dob_tap: channel full, dropping ...`

The bounded DoB MPSC backpressured and the tap dropped a mutation event. This is
correctness-significant: downstream reconstructed DoB books are inconsistent for
that instrument until the next reset/snapshot. With the default
`--dob-channel-bound` (`65536`) this should not occur under normal
block-boundary bursts; if it does, either the bound was overridden too low or
the DoB emitter cannot keep up with steady-state load — the latter needs a
throughput investigation, not a larger buffer.

## Design Constraints

- Block mode remains supported and is the compatibility baseline.
- Block mode remains supported and is the compatibility baseline; it must not
regress.
- Streaming mode is opt-in and optimized for lower latency.
- The WebSocket listener is off by default. `--enable-websocket` is the durable
rollback contract: it restores the pre-finalization-driven streaming emission
byte-for-byte (per-chunk cadence, full L2 fan-out, recovery carried by the
next per-chunk snapshot).
- A run with no configured output is rejected; `--dob-group` requires
`--multicast-group`. These are enforced at the CLI and the
`run_websocket_server` boundary.
- Fills do not mutate book state.
- Raw-book `New` diffs insert resting orders directly; they do not invoke local
matching. The raw diff price/size are authoritative for the resting book;
Expand Down
60 changes: 60 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,60 @@
# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).

## Unreleased

### Breaking changes

- The WebSocket listener is now **disabled by default**. The publisher is
multicast-only unless `--enable-websocket` is passed. Existing deployments
that relied on the default `--address`/`--port` WebSocket server MUST add
`--enable-websocket` to retain that behavior.
- Output-mode configuration is now validated. A run with no output configured
is rejected, and `--dob-group` is **not** a standalone mode — it requires
`--multicast-group` (the shared instrument registry is only bootstrapped
from the HL API in multicast mode, so a DoB-only publisher would resolve no
instruments). Valid output configurations are `--enable-websocket` and/or
`--multicast-group` (the latter optionally with `--dob-group`). Both checks
are enforced by the CLI **and** by the `run_websocket_server` library entry
point, so alternate callers cannot start a publisher that emits nothing.

**Rollback:** the durable rollback contract is the `--enable-websocket`
flag. Passing it restores the pre-L2-5 streaming cadence **byte-for-byte**
(per-chunk TOB emission, full 7-variant L2 fan-out, recovery carried by the
next per-chunk snapshot) — this equivalence is enforced by discriminating
WS-enabled tests and the streaming goldens, and held across the entire
change set, so it is a safe operational revert without touching code. The
WS-disabled L2-1 (integer `u64::ilog10` digit count), L2-3 (BBO-only L2 in
multicast-only mode), and L2-4 (pre-sized L2 level vectors) perf levers are
independent of the streaming finalization/backstop path (L2-5) and of each
other; any one can be reverted alone if isolated to it.

### Performance

- Streaming-mode CPU reduced by:
- replacing `Px::num_digits()` f64 `log10().floor()` with integer
`u64::ilog10()` (also fixes a latent off-by-one for values just below
large powers of ten);
- in the default (WS-disabled) config, computing only the unbucketed
best-bid/ask L2 snapshot instead of 7 bucketed variants per coin;
- in streaming mode (WS-disabled), emitting the TOB L2 snapshot once per
finalized block instead of once per file-read chunk, with a 250ms
stuck-stream backstop and recovery-path emission so corrected BBOs are
never withheld;
- pre-sizing L2 level output vectors to avoid reallocation growth.

### Fixed

- `Px::num_digits()` no longer reports an extra digit for `u64` values just
below large powers of ten (f64 cast imprecision).
- Streaming recovery now emits an authoritative TOB snapshot immediately, so
a per-coin divergence repair is reflected to multicast subscribers without
waiting for an unrelated later diff.
- `--dob-channel-bound` default raised `4096` → `65536`. The old default
dropped correctness-significant DoB mutation events during normal Hyperliquid
block-boundary `OrderAdd` bursts; the bound must be sized for burst
absorption, so the safe value is now the default (deployments no longer have
to remember to pass it explicitly).
Loading
Loading