Hyper surge by bulkcade · Pull Request #78 · QuantAMMProtocol/quantammsim

bulkcade · 2026-03-11T10:26:31Z

No description provided.

…ates
- test_lp_supply_through_pool_class: use DynamicInputArrays bundle instead of old positional-args signature
- test_lp_supply_e2e_do_run_on_historic_data: use TEST_DATA_DIR and date range within test data coverage (2023-01-01 to 2023-01-15)
- test_noise_trade_does_not_affect_virtual_balances: carry/input_list already fixed in previous commit

Add 25 numerical regression tests that pin exact values computed from synthetic fixtures. These protect against silent computation errors during refactoring — existing tests only check shapes and signs. Covers: grid interpolation (knot exactness, midpoint values, monotonicity, differentiability), loss function (pinned value + gradient at known params), noise volume, per-pool fit convergence (loss, cadence), joint fit (both noise modes, predict_new_pool, warm start), pack/unpack roundtrips.

Add CHAIN_GAS_USD lookup and pool_loss_fixed_gas to fix gas to known chain-level costs, removing the cadence-gas degeneracy. Per-pool fit and joint fit (Option A) both support fix_gas_to_chain flag, optimizing only cadence and noise coefficients when gas is held constant.

Compute daily realized volatility from Binance minute prices instead of Balancer API hourly prices, removing the 90-day data restriction. Each pool now uses its full historical date range (up to 1761 days). The calibration runner calls replace_panel_volatility_with_binance() and supports train_days=0 for unrestricted history.
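As a rough illustration of the minute-to-daily step described above, the sketch below computes one realized-volatility number per calendar day. The function name `daily_realized_vol` and the definition used (square root of the sum of squared one-minute log returns, not annualized) are assumptions; the actual `replace_panel_volatility_with_binance()` implementation may differ.

```python
import math
from collections import defaultdict
from datetime import datetime, timedelta


def daily_realized_vol(timestamps, prices):
    """Return {date: realized volatility} from minute-level prices.

    Assumed definition: sqrt of the sum of squared one-minute log returns
    within each calendar day (hypothetical helper, not the repo's code).
    """
    sq_returns = defaultdict(float)
    for i in range(1, len(prices)):
        # Skip returns that straddle a day boundary.
        if timestamps[i].date() != timestamps[i - 1].date():
            continue
        r = math.log(prices[i] / prices[i - 1])
        sq_returns[timestamps[i].date()] += r * r
    return {day: math.sqrt(total) for day, total in sq_returns.items()}


# Synthetic data: two days of minute prices alternating between 100.0 and 100.1.
start = datetime(2023, 1, 1)
ts = [start + timedelta(minutes=m) for m in range(2 * 1440)]
px = [100.0 * (1.001 if m % 2 else 1.0) for m in range(2 * 1440)]
vol = daily_realized_vol(ts, px)
```

Because the estimate only needs minute prices, it can be computed for any day the price data covers, which is what lifts the 90-day restriction mentioned above.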

Clip panel dates to Binance price data range so pools with stale token data (BAL, MKR, BADGER, LIT) use their available overlap instead of failing. Snap sim start to next midnight for tokens starting mid-day. Set max_memory_days=0 and preslice_burnin=False to prevent negative start_idx. Workers load their own price data to avoid pickling large DataFrames across processes.
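The clipping and midnight-snapping logic can be sketched with the standard library alone. Both helper names below are hypothetical, and the choice to raise on an empty overlap is an assumption rather than the repository's behavior.

```python
from datetime import datetime, timedelta


def snap_to_next_midnight(ts):
    """Round a timestamp up to the following midnight; exact midnights are unchanged."""
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight if ts == midnight else midnight + timedelta(days=1)


def clip_range(panel_start, panel_end, price_start, price_end):
    """Intersect the panel date range with the price-data range (hypothetical helper)."""
    start = snap_to_next_midnight(max(panel_start, price_start))
    end = min(panel_end, price_end)
    if start >= end:
        raise ValueError("no overlap between panel and price data")
    return start, end


# A token whose price data begins mid-day starts simulating at the next midnight.
sim_start, sim_end = clip_range(
    datetime(2023, 1, 1), datetime(2023, 3, 1),
    datetime(2023, 1, 5, 13, 30), datetime(2023, 2, 1),
)
```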

…ility
87 new tests covering:
- CHAIN_GAS_USD constants (pinned values, completeness)
- pack/unpack fixed-gas params (roundtrip, shape, position)
- pool_loss_fixed_gas (zero-when-perfect, matches free-gas, gradients)
- per-pool fit fixed-gas (gas_usd pinned, gas_fixed flag, loss decreases)
- fit_all_pools with fix_gas_to_chain (chain-level gas matching)
- TOKEN_MAP resolution (wrapped native, LSTs, stablecoins, vault tokens)
- compute_binance_pair_volatility (synthetic data, edge cases)
- replace_panel_volatility_with_binance (immutability, no NaN)
- joint fit fixed-gas (prepare, pack/unpack, loss, bounds, fit, predict)

…issing coverage
Replace near-vacuous tests with substantive ones:
- test_loss_with_heterogeneous_y (trivial !=) → test_day_indices_affect_loss
- test_predict_with_nonzero_attrs (conditional guard) → test_predict_matches_linear_model
- test_stable_vs_volatile_uses_single_asset (20x range) → hand-computed ground truth
Add missing coverage:
- PCHIP boundary clamping (cadence below min, gas above max)
- replace_panel_volatility correctness (replaced values match compute_binance_pair_volatility)
- Ground truth recovery, pinned loss values, OLS coefficient pinning
Fix misleading name: test_grad_invariant_to_fixed_gas_perturbation → test_grad_changes_with_gas

Add CalibrationModel coordinator and 5 Head implementations (PerPoolHead, FixedHead, LinearHead, PerPoolNoiseHead, SharedLinearNoiseHead) so that new model variants (MLP, delta heads, Huber loss) require only a new Head + tests, not edits across the codebase. All 207 existing tests pass unchanged; 69 new tests added (276 total).

Two-layer MLP (x_attr → Dense(hidden, ReLU) → Dense(1)) with He initialization, L2 regularization on weights, and warm-start from per-pool fits. 16 unit tests + 6 integration tests with CalibrationModel.

Two-layer MLP (x_attr, Dense(hidden, ReLU), Dense(K_OBS)) that replaces the linear SharedLinearNoiseHead for the noise coefficient mapping. Initialized with W2=0 so output starts at pooled OLS noise coefficients. 16 unit tests + 6 CalibrationModel integration tests.

…S Hessian
Add MLP calibration and sweep scripts.

- MLPHead/MLPNoiseHead init uses lstsq warm-start for W2 instead of zeros (fixes zero-iteration L-BFGS bug)
- Best hyperparameters from sweep: alpha_cad=0.001, alpha_noise=0.1, maxiter=5000
- MLP noise R²=0.575 (vs Option C 0.612) but cadence is degenerate: noise head absorbs arb volume, decomposition not identified
- Add sweep script and analysis doc

- Add output_lo/output_hi to LinearHead and MLPHead for cadence bounds
- Add run_two_stage_joint() and _extract_two_stage_per_pool() to MLP calibration script

Remove sigma- and fee-dependent features from observation covariates so the arb channel is the only path for volatility-driven volume variation (see docs/noise_covariate_design.md). build_x_obs gains reduced=True, per_pool_fit derives k_obs from data shape, and prepare_joint_data forwards reduced_x_obs.

PerPoolNoiseHead, SharedLinearNoiseHead, and MLPNoiseHead accept k_obs=4 to match the reduced x_obs. Defaults to K_OBS=8 so existing usage is unchanged.

Add reclamm_calibrated_noise_volume (c_0..c_7 log-linear model with TVL, volatility, fee interactions, and DOW harmonics). Wire through all 4 reserve calculation paths with dow_sin/dow_cos scan inputs. Consolidate volatility/DOW array prep into _prepare_noise_arrays.

…tring in static dict
Read n_evaluation_points from optuna_settings instead of hardcoding 20. Keep startDateString in the static fingerprint dict; the calibrated noise model needs it to compute day-of-week arrays.

Add encode_tokens() to build token index, per-pool token/chain assignments, and token covariate matrix (D_TOKEN=5) from the matched pool set. Token classification via symbol lookup for stablecoins, ETH derivatives, and L1 natives. Market cap from hardcoded values or JSON fallback. Foundation for the token-factored noise head where pool noise coefficients decompose as u[token_a] + u[token_b] + alpha[chain] + beta_fee * log(fee) + delta_i.

noise_coeffs_i = u[token_a] + u[token_b] + alpha[chain] + beta_fee * log(fee) + delta_i

Token effects regularized toward x_token @ Gamma (population prediction from market cap and asset class). Per-pool deltas L2-regularized for partial pooling. Warm-start init decomposes Option C noise_coeffs into token/chain/fee effects via lstsq. predict_new_pool() handles seen tokens (learned u_t), unseen tokens (Gamma fallback), and unseen chains (zero alpha). Comprehensive tests: additivity, regularization, warm-start round-trip, gradient finiteness, new-pool prediction for seen/unseen tokens/chains.
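The additive decomposition and its new-pool fallbacks can be sketched in plain Python. The signatures, the dictionary-based parameter layout, and the two-dimensional token covariates in the example are illustrative assumptions (the commit above states D_TOKEN=5); this is not the TokenFactoredNoiseHead implementation.

```python
import math


def noise_coeffs(u, alpha, beta_fee, delta, token_a, token_b, chain, fee):
    """Additive decomposition of one fitted pool's noise coefficient vector."""
    return [
        u[token_a][j] + u[token_b][j] + alpha[chain][j]
        + beta_fee[j] * math.log(fee) + delta[j]
        for j in range(len(beta_fee))
    ]


def predict_new_pool(u, alpha, beta_fee, gamma, x_token, token_a, token_b, chain, fee):
    """Predict coefficients for an unseen pool; the per-pool delta is taken as zero."""
    k = len(beta_fee)

    def token_effect(tok):
        if tok in u:  # seen token: use the learned effect
            return u[tok]
        x = x_token[tok]  # unseen token: fall back to x_token @ Gamma
        return [sum(x[d] * gamma[d][j] for d in range(len(x))) for j in range(k)]

    chain_effect = alpha.get(chain, [0.0] * k)  # unseen chain: zero alpha
    ua, ub = token_effect(token_a), token_effect(token_b)
    return [
        ua[j] + ub[j] + chain_effect[j] + beta_fee[j] * math.log(fee)
        for j in range(k)
    ]


# Toy parameters with k_obs=2 and two token covariates, for illustration only.
u = {"ETH": [1.0, 0.5], "USDC": [0.2, -0.1]}
alpha = {"mainnet": [0.3, 0.0]}
beta_fee = [0.1, 0.0]
gamma = [[0.5, 0.0], [0.0, 0.25]]
x_token = {"NEW": [2.0, 4.0]}
pred = predict_new_pool(u, alpha, beta_fee, gamma, x_token, "ETH", "NEW", "base", 0.003)
```

In this toy call the unseen token "NEW" receives the population prediction and the unseen chain "base" contributes nothing, so the output depends only on learned ETH effects, Gamma, and the fee term.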

Add prepare_token_factored_data() combining joint data preparation with token encoding. End-to-end tests verify TokenFactoredNoiseHead fits through CalibrationModel with PerPoolHead(cadence) + FixedHead(gas), both cold-start and warm-started from Option C.

Add run_option_c_reduced() for 4-covariate per-pool fits and run_reduced_joint() for joint MLPNoiseHead with k_obs=4. Wire reduced model into the comparison pipeline with correct x_obs dispatch in compute_per_pool_predictions(). Save reduced Option C results to JSON immediately for downstream use.

Full pipeline: Phase 0 pooled Ridge diagnostic (baseline vs pool attrs vs token dummies vs full), token-factored fit with lambda_delta sweep, token/chain/delta analysis tables, leave-one-pool-out cross-validation comparing LOO R² to Option C in-sample R², and diagnostic plots. Phase 0 results: token dummies +0.091 vs pool attrs +0.072 above baseline (R²=0.058), confirming compositional structure exists. LOO results: median R²=0.33 vs Option C 0.59, 7/36 wins — the static coefficient prediction bottleneck limits transfer to unseen pools. This motivates lagged cross-pool features.

Load per-pool noise_coeffs from calibration JSON, derive arb_frequency from calibrated log_cadence, and pick up token pair, fee, and gas from pool metadata. Supports both 4-covariate (reduced) and 8-covariate (full) noise coefficient formats. Add --n-eval-points flag for evaluation sub-window control.

Token canonicalization (_CANON_MAP) maps wrapped/derivative tokens to their base symbols (WETH→ETH, waBasWETH→ETH, WBTC→BTC, etc.), reducing the token graph from ~32 to ~22 unique tokens and thickening peer groups for cross-pool information sharing. Cross-pool lag features (build_cross_pool_x_obs, K_OBS_CROSS=7) enrich observation-level covariates with lagged peer volume averages for token A, token B, and chain — so daily noise predictions can adapt to market conditions without autoregressive cold-start issues.
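A minimal sketch of canonicalization and a lagged peer feature follows. `CANON_MAP` here contains only the three mappings named above, and `peer_groups` and `lagged_peer_mean` are hypothetical helpers; the feature definitions in build_cross_pool_x_obs may differ.

```python
# Illustrative subset: only the mappings named in the commit message.
CANON_MAP = {"WETH": "ETH", "waBasWETH": "ETH", "WBTC": "BTC"}


def canonicalize(symbol):
    """Map a wrapped/derivative symbol to its base symbol (identity if unknown)."""
    return CANON_MAP.get(symbol, symbol)


def peer_groups(pools):
    """Group pool ids by canonical token; `pools` maps pool id -> (token_a, token_b)."""
    groups = {}
    for pool_id, tokens in pools.items():
        for tok in tokens:
            group = groups.setdefault(canonicalize(tok), [])
            if pool_id not in group:
                group.append(pool_id)
    return groups


def lagged_peer_mean(log_volume, peers, pool_id, day):
    """Mean of the peers' previous-day log volume, excluding the pool itself."""
    if day < 1:
        raise ValueError("day 0 has no lag")
    values = [log_volume[p][day - 1] for p in peers if p != pool_id]
    return sum(values) / len(values) if values else 0.0


pools = {"p1": ("WETH", "USDC"), "p2": ("waBasWETH", "WBTC"), "p3": ("ETH", "AAVE")}
groups = peer_groups(pools)
log_volume = {"p1": [10.0, 11.0], "p2": [12.0, 13.0], "p3": [14.0, 15.0]}
eth_lag_for_p1 = lagged_peer_mean(log_volume, groups["ETH"], "p1", day=1)
```

Without canonicalization the three pools would share no token; with it they form a single ETH peer group, which is the "thickening" effect described above.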

Re-evaluates pool loss functions at the optimum to decompose total loss into data_loss (mean per-pool MSE) and reg_loss (head regularization). Enables tracking whether lambda annealing is reducing data fit or just shrinking regularization.

Lambda sweep now runs descending (high→low regularization) with each fit warm-starting from the previous result. Runner runs two ablations side-by-side: baseline (K_OBS_REDUCED=4) vs cross-pool (K_OBS_CROSS=7), reporting separated data/reg loss and LOO R² for each configuration. prepare_token_factored_data() gains cross_pool parameter to swap in cross-pool lag features automatically.

TokenFactoredNoiseHead.init() now zero-pads when warm_start noise_coeffs are shorter than k_obs (e.g. warm-starting k_obs=7 from k_obs=4 Option C results). prepare_token_factored_data(cross_pool=True) now trims y_obs and day_indices to match the first-day-dropped x_obs from build_cross_pool_x_obs. Runner gains --cross-pool-only flag with pickle caching of stage 1 (Option C + filtering) and baseline results so ablation 2 can run independently. Baseline-missing paths handled gracefully. Adds TestPrepareTokenFactoredCrossPool with shape consistency tests that would have caught the broadcast error.
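The zero-padding step is small enough to sketch directly. The helper name `pad_warm_start` is hypothetical, and appending the zeros at the end assumes that the new covariates are the trailing columns of x_obs.

```python
def pad_warm_start(coeffs, k_obs):
    """Zero-pad a warm-start coefficient vector up to k_obs entries.

    Assumes the extra covariates occupy the trailing positions, so their
    coefficients start at zero and the warm-started fit initially matches
    the shorter model.
    """
    if len(coeffs) > k_obs:
        raise ValueError("warm-start vector is longer than k_obs")
    return list(coeffs) + [0.0] * (k_obs - len(coeffs))


# Warm-starting a k_obs=7 head from a 4-coefficient fit.
padded = pad_warm_start([0.4, -0.1, 0.2, 0.05], k_obs=7)
```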

Diagnostic experiments establishing the cross-pool prediction landscape:
- run_cross_pool_diagnostics: lambda_token sweep, leave-one-in, AR1 baseline, pool connectivity analysis
- run_cross_pool_linear: ridge regression (peers only, peers+own lag, LOO with overlap transfer, 30d burn-in, peer mean)
- run_cross_pool_noise_linear: same battery on noise residuals (log_vol - log_V_arb)
- run_residual_comparison: apples-to-apples R² on noise residual target across all methods including Option C
- run_deepsets_volume: DeepSets on total volume (v1, raw)
- run_deepsets_noise: DeepSets with V_arb decomposition and Optuna
- run_deepsets_v2: full feature menu with Optuna feature selection, trains on total volume, evaluates on noise residual
Key findings: ridge in-sample peers+own = 0.599 (matching Option C), but cross-pool signal is almost entirely shared arb response; noise residual ridge ceiling is 0.098. Option C noise residual R² = 0.060.

…rdization fix
- noise_model_arrays.py: new module to precompute noise_base and noise_tvl_coeff arrays from trained artifact. Decomposes per-pool coefficients into TVL-dependent and TVL-independent components. Returns tvl_mean/tvl_std for runtime standardization.
- noise_trades.py: add tvl_mean/tvl_std params to reclamm_market_linear_noise_volume(); standardizes log(TVL) at runtime to match training scale. Fixes NaN blowup from raw TVL.
- reclamm_reserves.py: pass noise_params (tvl_mean, tvl_std) through to market_linear dispatch.
- reclamm.py: cache loaded noise arrays on pool instance to avoid repeated disk reads. Support noise_arrays_path in fingerprint.
- tune_reclamm_calibrated_noise.py: add --noise-model flag (calibrated vs market_linear), --artifact-dir, --initial-pool-value. Save precomputed arrays to disk, pass path + tvl stats via fingerprint. Default dates adjusted to panel coverage period. 100 trials × 3 objectives all complete successfully.
- plot_reclamm_optuna_result.py: forward noise_arrays_path in run_full_period() for market_linear re-runs.

- run_deconfounder_noise.py: Four-strategy causal analysis of b_tvl:
  1. Variance decomposition (62% between-pool, 38% within-pool)
  2. Within-pool Δ regressions (median b_tvl=+0.12, daily too fast)
  2b. Lagged-average TVL across windows (stable ~0.95 at all horizons)
  3. TVL decomposition: price-driven vs flow-driven (IV-style)
  4. Deconfounder sensitivity (Wang & Blei 2019, n_factors sweep)
  D'Amour critique acknowledged in docstring. Ridge warm-start, standardized Z_hat. Convergent finding: per-pool b_tvl ~1.0 is the right working estimate for counterfactuals.
- scan_lp_events.py: Scan all pools for large LP deposit/withdrawal events (semi-exogenous TVL shocks). Filters pool creation events via min-age and min-tvl. Computes per-event elasticity from ±window day volume comparison. 836 events across 118 pools, median elasticity +0.84 (clean: +0.98, OLS: +0.89). Saves CSV + generates plots: elasticity histograms, deposits vs withdrawals, elasticity vs pool size, log-log scatter with OLS, boxplot by chain. No asymmetry between deposits/withdrawals, flat across pool sizes and chains.

Validates the full model (PCHIP arb + per-pool linear noise) against the AAVE/WETH natural experiment (70x TVL increase from LP deposit). Model predicts a 44.3x total volume increase vs 39.2x observed, i.e. 113% of the observed multiple. V_arb carries 111x through the PCHIP grid; V_noise adds 7.4x through the noise model (raw elasticity 0.42). The combined response matches the observed one even though each channel's elasticity differs from the event-study total. Also evaluates counterfactual noise volumes at arbitrary TVL levels, using median pre-deposit market features with only TVL varying.
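The multiples quoted above can be converted into implied elasticities with a few lines of arithmetic. The log-ratio definition used here, log(volume multiple) / log(TVL multiple), is an assumption and may not match the per-event definition in scan_lp_events.py; the input numbers are taken from the message above.

```python
import math


def implied_elasticity(volume_multiple, tvl_multiple):
    """Elasticity implied by a volume change and a TVL change (assumed log-ratio form)."""
    return math.log(volume_multiple) / math.log(tvl_multiple)


tvl_multiple = 70.0
observed = implied_elasticity(39.2, tvl_multiple)   # observed total volume response
predicted = implied_elasticity(44.3, tvl_multiple)  # model's total volume response
ratio = 44.3 / 39.2                                 # predicted / observed multiple
```

Under this definition both elasticities land a little below 0.9, and the ratio of multiples is about 1.13, which is where the 113% figure comes from.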

…volume_zscore features
- run_mlp_noise.py: MLP noise model with learnable cadence, no panel dependency. Uses only Binance market data + pool TVL. Supports variable depth/width, per-pool bias, optax cosine LR decay, and Optuna sweep over architecture + hyperparameters. Best eval R² = 0.39 (matches linear baseline) with [16,8,4]. In-sample R² = 0.70 with [128,64,32], which overfits on the temporal split.
- market_features.py: add volume_zscore feature: within-token rolling z-score of daily Binance USD volume (today vs 30d trailing mean/std). Captures "unusually active day for this token" without cross-token scale issues. Added for BTC, token A, and token B.
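A rolling z-score of this kind can be sketched with the `statistics` module. Excluding today from the trailing window, using the population standard deviation, and returning None until 30 days of history exist are all assumptions; market_features.py may make different choices.

```python
import statistics


def volume_zscore(volumes, window=30):
    """Per-day z-score of volume against the preceding `window` days.

    Entries are None until a full trailing window is available; a flat
    trailing window (zero std) yields 0.0 rather than dividing by zero.
    """
    out = []
    for t, v in enumerate(volumes):
        if t < window:
            out.append(None)
            continue
        trailing = volumes[t - window:t]  # today is excluded (assumption)
        mu = statistics.fmean(trailing)
        sd = statistics.pstdev(trailing)
        out.append((v - mu) / sd if sd > 0 else 0.0)
    return out


# 30 quiet days alternating 100/120, then one unusually active day at 200.
volumes = [100.0 + 20.0 * (i % 2) for i in range(30)] + [200.0]
z = volume_zscore(volumes)
```

Because each token is scored against its own history, a large-cap and a small-cap token produce comparable values even though their raw USD volumes differ by orders of magnitude.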

…eline
- noise_model_arrays.py: rewrite build_simulator_arrays to use Binance parquets directly (no panel/API dependency). Takes token_a, token_b + date range, builds all features from market data. Works for any date range covered by Binance data. Tested: 639 days for AAVE/ETH.
- tune_reclamm_calibrated_noise.py: update to new build_simulator_arrays interface (token_a/token_b instead of pool_id + matched_clean). Extended date range (2024-06 to 2026-03) now works.
- run_mlp_noise.py: add Optuna sweep (--tune), optax cosine LR decay (--cosine), pool attributes (--pool-attrs).

…, 36 pools)

…/quantammsim into noise-modelling

Two bugs in noise fee income application for reClAMM pools:
1. Cadence scaling: noise model returns per-minute volume but was applied once per arb step (every arb_frequency minutes) without scaling. Now multiplies by minutes_per_step. At cadence=5, this was underestimating noise fee income by 5x.
2. Price preservation: uniform real-reserve scaling (Ra*s, Rb*s) preserves price for weighted pools but NOT for 2-CLPs where price depends on effective reserves (Ra+Va)/(Rb+Vb). Fixed by scaling effective reserves uniformly then subtracting virtuals: Ra_new = (Ra+Va)*scale - Va. Preserves quoted marginal price. Total value added still equals noise_fee_income (verified algebraically: effective_value * (scale-1) = fee_income).
Both fixes applied to all CLP noise model variants (tsoukalas, loglinear, calibrated, market_linear).
Also: tune script adds L-BFGS support, 25% protocol fee split, extended date range, $7M default TVL. New plotting scripts for model vs real comparison.
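Both fixes can be checked numerically with a short sketch. The function names, the 0.3% fee, and the reserve values are illustrative assumptions; only the formulas (multiply by minutes per step, scale effective reserves then subtract virtuals) come from the message above.

```python
def noise_volume_per_step(volume_per_minute, arb_frequency):
    """Scale per-minute noise volume to one arb step of `arb_frequency` minutes."""
    return volume_per_minute * arb_frequency


def add_fee_income(ra, rb, va, vb, price_a, price_b, fee_income):
    """Add fee income to real reserves without moving the quoted price.

    Effective reserves (real + virtual) are scaled uniformly, then the
    virtual balances are subtracted to recover the new real reserves.
    """
    eff_a, eff_b = ra + va, rb + vb
    effective_value = eff_a * price_a + eff_b * price_b
    scale = 1.0 + fee_income / effective_value
    return eff_a * scale - va, eff_b * scale - vb


# Illustrative pool state: real reserves, virtual balances, and numeraire prices.
ra, rb, va, vb = 100.0, 200_000.0, 50.0, 100_000.0
price_a, price_b = 2_000.0, 1.0
fee = 0.003  # assumed swap fee
fee_income = noise_volume_per_step(1_000.0, arb_frequency=5) * fee

ra_new, rb_new = add_fee_income(ra, rb, va, vb, price_a, price_b, fee_income)
price_before = (rb + vb) / (ra + va)
price_after = (rb_new + vb) / (ra_new + va)
value_added = (ra_new - ra) * price_a + (rb_new - rb) * price_b
```

Scaling the real reserves alone would change the ratio of effective reserves whenever Va/Ra differs from Vb/Rb, which is why the earlier approach moved the price in 2-CLPs.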

…/quantammsim into noise-modelling

Feature scaling reform in build_data(): TVL and BTC log_price kept in raw log scale (absolute level matters), returns/trends/volume_zscore unscaled, volatilities lightly centered. Eliminates global z-score that squeezed TVL into [-2,+2] and prevented models from learning TVL response. Add ±3σ clamp on standardized log(TVL) in reclamm_market_linear_noise_volume to prevent extreme concentration from wireheading the noise model. Change default protocol_fee_split from 0.0 to 0.25 to match reClAMM production configuration.
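The clamp on standardized log(TVL) amounts to one line of arithmetic. The helper below is a hypothetical stand-in for the logic inside reclamm_market_linear_noise_volume, with tvl_mean and tvl_std assumed to be the training-time statistics of log(TVL).

```python
import math


def standardized_log_tvl(tvl, tvl_mean, tvl_std, clamp=3.0):
    """Standardize log(TVL) with training statistics, then clamp to +/- `clamp` sigma."""
    z = (math.log(tvl) - tvl_mean) / tvl_std
    return max(-clamp, min(clamp, z))


# Training statistics assumed for illustration: mean log(TVL) at $1M, unit std.
tvl_mean, tvl_std = math.log(1e6), 1.0
z_typical = standardized_log_tvl(1e6, tvl_mean, tvl_std)
z_huge = standardized_log_tvl(1e9, tvl_mean, tvl_std)
z_tiny = standardized_log_tvl(1.0, tvl_mean, tvl_std)
```

The clamp bounds how far the linear TVL term can extrapolate, so an optimizer cannot inflate predicted noise volume by pushing the pool into a TVL regime the model never saw.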

Add CMA-ES as third optimisation method in tune_reclamm_calibrated_noise.py alongside Optuna and BFGS, with population_size, sigma0, n_generations controls. Add min_train_returns_over_hodl rejection in Optuna objective: trials with catastrophic in-sample returns_over_hodl are rejected early (return -inf) to avoid wasting evaluation budget.

Michaelis-Menten noise model (run_mm_noise.py): structural TVL saturation via V_noise = alpha_i * TVL/(K_i + TVL) * exp(x_market @ gamma_i), with learned EWMA smoothing on TVL (discovery lag). Per-pool alpha, K, gamma; shared lambda. Achieves R²=0.64 matching per-pool linear while adding saturation (K_med ~$19M).

MLP noise model (run_mlp_noise.py): Optuna sweep with per-trial model saving, TVL response check at sweep end. Investigation showed shared MLP cannot learn TVL relationship due to cross-pool confounding.

Model comparison (run_model_comparison.py): linear vs MLP noise model across TVL levels with time series and summary plots.

MM fit plotting (plot_mm_noise_fit.py): 6-panel per-pool time series, cross-pool TVL response/elasticity curves, K distribution analysis.
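The Michaelis-Menten form above is easy to evaluate directly, and its TVL elasticity follows from differentiating log V_noise with respect to log TVL, which gives K / (K + TVL). The sketch below omits the EWMA smoothing, and the function names and parameter values are illustrative rather than taken from run_mm_noise.py.

```python
import math


def mm_noise_volume(tvl, alpha, K, x_market, gamma):
    """V_noise = alpha * TVL / (K + TVL) * exp(x_market @ gamma)."""
    market = math.exp(sum(x * g for x, g in zip(x_market, gamma)))
    return alpha * tvl / (K + tvl) * market


def tvl_elasticity(tvl, K):
    """d log V_noise / d log TVL for the saturating form: K / (K + TVL)."""
    return K / (K + tvl)


alpha, K = 1_000.0, 2e6           # illustrative values
x_market, gamma = [0.0, 0.0], [0.5, -0.2]

v_half = mm_noise_volume(K, alpha, K, x_market, gamma)      # TVL equal to K
v_sat = mm_noise_volume(1e12, alpha, K, x_market, gamma)    # TVL far above K

# Finite-difference check of the elasticity at TVL = K.
eps = 1e-6
numeric = (
    math.log(mm_noise_volume(K * math.exp(eps), alpha, K, x_market, gamma))
    - math.log(mm_noise_volume(K, alpha, K, x_market, gamma))
) / eps
```

At TVL = K the pool captures half of its asymptotic noise volume and the elasticity is 0.5; far above K extra TVL adds almost nothing, which is the structural saturation a log-linear model lacks.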

MM model now supports per-pool log_K (default) and shared Binance-volume K (--shared-K). Per-pool K with per-pool gamma achieves R²=0.66 at 20K epochs with structural TVL saturation (median K≈$2M). Optuna sweep searches lr, l2, huber_delta, init_log_K, per_pool_gamma. Best shared-gamma eval R²=0.42 (huber=0.5, lr=1e-4 consistently). Removed learned EWMA (lambda stayed near 1, no benefit). Removed TVL interaction features (hurt eval R², subsumed by K). Plot script handles both K modes, shows all pools by default. New: verify_vol_volume_slope.py — cross-pool volume/TVL analysis confirming sublinear scaling (TVL^0.80, R²=0.79) and TVL elasticity ~0.9 across fee tiers.

New: fetch_competitor_tvl.py fetches historical TVL from DeFi Llama for all competing pools per token pair (same-chain), computes effective K via network conductance model (direct + multi-hop through hub tokens WETH, WSTETH, USDC, USDT, DAI, WBTC with harmonic mean for series combination). Self-exclusion via own TVL subtraction. MM model (run_mm_noise.py) now supports --observed-K flag: K is fixed from DeFi Llama data (0 learned K params), with per-pool alpha + gamma learning the noise level and temporal variation. R²=0.625 with economically meaningful K values (AAVE/WETH: $89M, USDC/WETH: $432M). New noise function: reclamm_mm_observed_noise_volume() in noise_trades.py evaluates V_noise = exp(base) * TVL/(K+TVL) per minute. New array builder: build_mm_simulator_arrays() in noise_model_arrays.py precomputes noise_base + competitor_tvl minute arrays for the simulator.

Wire reclamm_mm_observed_noise_volume through the full simulator:
- reclamm.py: load noise_base + competitor_tvl arrays from npz
- reclamm_reserves.py: dispatch mm_observed in scan step, append competitor_tvl to scan_inputs alongside noise_base
- noise_model_arrays.py: build_mm_simulator_arrays() precomputes both arrays from MM model artifact + DeFi Llama competitor TVL
- jax_runner_utils.py: add noise array keys to _TRAINING_ONLY_FIELDS
Fingerprint usage: noise_model="mm_observed", noise_arrays_path="path/to/arrays.npz"

tune_reclamm_calibrated_noise.py now supports --noise-model mm_observed which uses the Michaelis-Menten model with observed competitor TVL from DeFi Llama as K. Precomputes noise_base + competitor_tvl arrays via build_mm_simulator_arrays, saves to npz, passes path in fingerprint.

Usage:
    python experiments/tune_reclamm_calibrated_noise.py \
        --noise-model mm_observed \
        --artifact-dir results/mm_noise \
        --competitor-tvl-path results/competitor_tvl/competitor_tvl.npz

MatthewWilletts and others added 30 commits March 9, 2026 18:22

feat: add MLPHead for nonlinear pool-attribute-to-cadence mapping

c7ee40d

Two-layer MLP (x_attr → Dense(hidden, ReLU) → Dense(1)) with He initialization, L2 regularization on weights, and warm-start from per-pool fits. 16 unit tests + 6 integration tests with CalibrationModel.

fix: use small random W2 init for MLP heads to avoid degenerate L-BFG…

ccc3c6c

…S Hessian
Add MLP calibration and sweep scripts.

merge dev

16f5bf3

WIP: output clipping on heads, two-stage joint calibration

49671f8

- Add output_lo/output_hi to LinearHead and MLPHead for cadence bounds
- Add run_two_stage_joint() and _extract_two_stage_per_pool() to MLP calibration script

feat: parameterize k_obs in noise heads

ba30663

PerPoolNoiseHead, SharedLinearNoiseHead, and MLPNoiseHead accept k_obs=4 to match the reduced x_obs. Defaults to K_OBS=8 so existing usage is unchanged.

wip on deepsets

1056ee0

MatthewWilletts and others added 22 commits March 23, 2026 13:08

merge origin

b41d793

data: per-pool linear noise model artifact (Binance-only, 22 features…

414903d

…, 36 pools)

Merge branch 'noise-modelling' of https://github.com/QuantAMMProtocol…

c936eee

…/quantammsim into noise-modelling

compare improvements

5d40a65

compare improvements

5ec81c5

data: add sim arrays for 0x9d1fcf346ea1b0

29af7df

diagnostics

8331b7f

Merge branch 'noise-modelling' of https://github.com/QuantAMMProtocol…

bf033bb

…/quantammsim into noise-modelling

performance improvements

00144e2

different scripts

e202d67

initial take

fc616a8

bulkcade force-pushed the hyper-surge branch from 7f6e98f to fc616a8 on April 7, 2026 14:30

MatthewWilletts and others added 7 commits April 7, 2026 15:45

merge noise-modelling

fa32c9d

merge conflict fix

1a9f644

refactor balancer hypersurge

d123a9f

dynamic input array slicing defensive fix

4003371

reclamm hypersurge first implementation

c3b0ede


Hyper surge #78
bulkcade wants to merge 63 commits into dev from hyper-surge


2 participants
