diff --git a/.gitignore b/.gitignore
index 70ace4c..158f2c9 100644
--- a/.gitignore
+++ b/.gitignore
@@ -161,3 +161,13 @@ target/
 profile_default/
 ipython_config.py
 docs/source/api/core/generated/
+
+# local virtualenv
+.venv/
+
+# local agent/editor config
+.claude/
+
+# simulation / tuning artifacts
+/results/
+optuna_studies/
diff --git a/CLAUDE.md b/CLAUDE.md
new file mode 100644
index 0000000..59bfc1f
--- /dev/null
+++ b/CLAUDE.md
@@ -0,0 +1,79 @@
+# quantammsim — running simulations
+
+JAX-accelerated simulator for backtesting and tuning AMM pools (Balancer, CowAMM, Gyroscope, QuantAMM, reCLAMM) against historic minute-resolution token data. This file is for agents driving simulations; read it before running anything.
+
+## Environment
+
+- Use the existing venv: `.venv/` (Python 3.12). Prefix commands with `.venv/bin/python`, or `source .venv/bin/activate` first. Do not assume conda.
+- Package is installed editable (`pip install -e .`). If imports fail, re-run that from the repo root.
+- reCLAMM lives on the `training-pipeline` branch.
+- Quick sanity check: `.venv/bin/python -c "from quantammsim.runners.jax_runners import do_run_on_historic_data; print('ok')"`
+
+## Data
+
+- Historic data is parquet at `quantammsim/data/_USD.parquet` (minute resolution). Available now: `AAVE`, `ETH`, `USDC`. Add more with `.venv/bin/python scripts/download_data.py `.
+- `tokens` in any run config must match these filenames exactly (`ETH`, not `WETH`; `BTC`, not `WBTC`).
+- The downloader already forces `mpire` `start_method="spawn"` (JAX is multithreaded; `fork` deadlocks on macOS) and normalizes Binance's 2025+ microsecond timestamps. Don't strip those patches in `scripts/download_data.py` / `historic_data_utils.py`.
+- First download per ticker is heavy (~1 min, ETH parquet ≈ 180 MB).
+
+## Running one backtest
+
+Entrypoint: `do_run_on_historic_data(run_fingerprint, params)` in `quantammsim.runners.jax_runners`.
+
+```python
+import numpy as np, jax.numpy as jnp
+from quantammsim.runners.jax_runners import do_run_on_historic_data
+
+def to_daily_price_shift_base(exp):  # daily price-shift % -> Solidity base
+    return 1.0 - exp / 124649.0
+
+run_fingerprint = {
+    "tokens": ["AAVE", "ETH"], "rule": "reclamm",
+    "startDateString": "2025-05-20 00:00:00", "endDateString": "2026-05-20 00:00:00",
+    "initial_pool_value": 10_000.0, "do_arb": True,
+    "fees": 0.003, "gas_cost": 0.0, "arb_fees": 0.0,
+    "chunk_period": 60, "weight_interpolation_period": 60,
+}
+params = {  # reCLAMM knobs
+    "price_ratio": jnp.array(4.0),
+    "centeredness_margin": jnp.array(0.1),  # 10%
+    "daily_price_shift_base": jnp.array(to_daily_price_shift_base(0.02)),  # 2%
+}
+r = do_run_on_historic_data(run_fingerprint=run_fingerprint, params=params)
+```
+
+Parameter units: percentages are fractions — `fees=0.003` is 0.3%, `centeredness_margin=0.1` is 10%. `price_ratio` is a raw multiplier. Daily price-shift % must be converted via `to_daily_price_shift_base`.
+
+`result` keys: `value`, `prices`, `reserves`, `weights`, `fee_revenue` (all over time), plus `final_value`, `final_reserves`. `result["value"]` may need `np.asarray(...).reshape(-1)`. `result["prices"]` is `(n_minutes, n_tokens)` in USD, column-aligned with `tokens`.
+
+## Comparing against HODL (do this — pools are judged relative to holding)
+
+A pool's USD value alone is meaningless without a hold baseline. Compute from the same run:
+
+```python
+val = np.asarray(r["value"]).reshape(-1)
+prices = np.asarray(r["prices"]); res0 = np.asarray(r["reserves"])[0]
+deposit_hodl = (res0 * prices[-1]).sum()  # hold the deposited basket
+init_usd = (res0 * prices[0]).sum()
+uniform_hodl = ((init_usd / 2.0) / prices[0] * prices[-1]).sum()  # equal-$ 50/50 hold
+# pool minus HODL = fees earned minus impermanent loss
+```
+
+## Tuning (finding good params)
+
+Entrypoint: `train_on_historic_data(fp, return_training_metadata=True)` → `(best_params, metadata)`. Set `fp["optimisation_settings"]["method"]="optuna"`. The objective is `fp["return_val"]`; all objectives are normalized so **higher = better** (the optimizer maximizes). HODL-relative objectives: `returns_over_hodl`, `annualised_returns_over_hodl`, `returns_over_uniform_hodl`, `daily_log_sharpe_excess`. Full list is in `quantammsim/core_simulator/forward_pass.py` (`_calculate_return_value`). The CLI wrapper is `experiments/tune_reclamm_params.py` (defaults to AAVE/ETH, search over `price_ratio`/`centeredness_margin`/`shift_exponent`).
+
+Always split train vs out-of-sample: `startDateString` (train start), `endDateString` (train end / test start), `endTestDateString` (test end). `metadata["best_continuous_test_metrics"][0]` holds OOS metrics (incl. both HODL variants) for any objective — use it as a common yardstick across objectives.
+
+## Critical caveats (learned the hard way)
+
+- **Continuous metrics ≠ fresh-deposit reality.** Every in-framework metric (train, validation, and the OOS `test_objective`) comes from a *continuous* simulation where the pool enters each window carrying reserves evolved from the prior period. A real LP deposits *fresh* (rebalanced to the pool's initial split). These diverge a lot. **For any LP-facing claim, re-run the chosen config fresh with `do_run_on_historic_data` on the target window** and compare to HODL there. Do not quote the tuner's OOS number as the LP outcome.
+- **Optuna selects on the (penalized) train objective**, which overfits: the train-best config routinely loses out-of-sample, and configs that look best OOS in the continuous run can be the worst on a fresh deposit. Rank candidates by fresh-deposit OOS before recommending.
+- reCLAMM is a price-following AMM: it tends to *track* the held basket and earns its edge from fees in **ranging** markets, not one-directional trends (where IL dominates). "No config beats HODL for this pair/period" is a valid, common result — report it honestly rather than tuning until something looks good.
+- Optuna runs in-memory (no sqlite file); per-trial records persist to `results/run_.json` (double-JSON-encoded: `json.loads` twice; element 0 is the fingerprint, elements 1..N are trials). Live progress is logged to `optuna_studies/optimization.log`.
+
+## Conventions
+
+- Generated artifacts land in `results/` and `optuna_studies/` (untracked); clean or ignore as needed.
+- Long runs: launch in the background and poll the log, don't block.
+- `train_on_historic_data` sets JAX x64 mode internally; for standalone numeric work prefer float64 to match.
diff --git a/quantammsim/utils/data_processing/amalgamated_data_utils.py b/quantammsim/utils/data_processing/amalgamated_data_utils.py
index 61cd65f..89df098 100644
--- a/quantammsim/utils/data_processing/amalgamated_data_utils.py
+++ b/quantammsim/utils/data_processing/amalgamated_data_utils.py
@@ -43,7 +43,8 @@ def forward_fill_ohlcv_data(df, token):
         end=pd.to_datetime(df.index.max(), unit="ms"),
         freq="1min",
     )
-    full_index = full_index.astype(np.int64) // 10**6
+    # Force ms precision before int64 cast; pandas 3.x date_range defaults to [ms], 2.x to [ns].
+    full_index = full_index.astype("datetime64[ms]").astype(np.int64)
 
     # Reindex with the complete minute-level index
     df = df.reindex(full_index)
diff --git a/quantammsim/utils/data_processing/historic_data_utils.py b/quantammsim/utils/data_processing/historic_data_utils.py
index b83a97b..4e93b20 100644
--- a/quantammsim/utils/data_processing/historic_data_utils.py
+++ b/quantammsim/utils/data_processing/historic_data_utils.py
@@ -701,18 +701,14 @@ def get_binance_vision_data(token, numeraire, root):
     # Combine and format data
     combined_df = pd.concat(monthly_files + daily_files)
 
-    # Convert unix timestamps to milliseconds if they're in nanoseconds or seconds
-    # Typical millisecond timestamps are ~13 digits
-    # Nanosecond timestamps are ~19 digits
-    # Second timestamps are ~10 digits
-    # First convert nanoseconds to milliseconds
-    combined_df["unix"] = combined_df["unix"].apply(
-        lambda x: x // 1_000_000 if len(str(int(x))) > 13 else x
-    )
-    # Then convert seconds to milliseconds
-    combined_df["unix"] = combined_df["unix"].apply(
-        lambda x: x * 1000 if len(str(int(x))) <= 10 else x
-    )
+    # Normalize unix timestamps to milliseconds. Binance Vision shipped klines in ms
+    # historically (13 digits) and switched to microseconds in 2025 (16 digits); nanosecond
+    # (19 digits) and second (10 digits) variants also exist in other sources.
+    u = combined_df["unix"].to_numpy(dtype=np.int64)
+    u = np.where((u >= 10**15) & (u < 10**18), u // 1_000, u)  # us -> ms
+    u = np.where(u >= 10**18, u // 1_000_000, u)  # ns -> ms
+    u = np.where(u < 10**12, u * 1_000, u)  # s -> ms
+    combined_df["unix"] = u
     combined_df["date"] = pd.to_datetime(combined_df["unix"], unit="ms").dt.strftime("%Y-%m-%d %H:%M:%S")
     combined_df["symbol"] = f"{token}/{numeraire}"
     combined_df[f"Volume {token}"] = combined_df["volume"]
@@ -984,7 +980,7 @@ def update_historic_data(token, root):
     agg_dict = {k: v for k, v in agg_dict.items() if k in concated_df_hourly.columns}
 
     # Perform resampling
-    hourly_data = concated_df_hourly.resample("1H").agg(agg_dict).reset_index()
+    hourly_data = concated_df_hourly.resample("1h").agg(agg_dict).reset_index()
 
     # Save hourly data
     hourly_data.to_csv(hourlyPath, index=False)
diff --git a/scripts/README.md b/scripts/README.md
index ac35c30..8db7452 100644
--- a/scripts/README.md
+++ b/scripts/README.md
@@ -67,7 +67,7 @@ All four grids additionally cross over the same **cross-cutting axes** (`ste`, `
 | flag | default | what it does |
 |---|---|---|
 | `--optimiser` | `adam` | picks the sweep grid |
-| `--rule` | `momentum` | pre-canned strategy name (see the quantamm pools: `momentum`, `anti_momentum`, `power_channel`, `mean_reversion_channel`) |
+| `--rule` | `momentum` | pre-canned strategy name. QuantAMM pools: `momentum`, `anti_momentum`, `power_channel`, `mean_reversion_channel`, `difference_momentum`, `min_variance`, `hodling_index_market_cap`, `trad_hodling_index_market_cap`. Other pools: `balancer`, `cow`, `gyroscope`, `hodl`, `reclamm` (see the [reclamm](#reclamm) section below). |
 | `--tokens ETH USDC` | ETH USDC | pool assets, must match downloaded data |
 | `--start` / `--end` / `--test-end` | 2023-06-01 / 2025-06-01 / 2026-01-01 | train window + held-out test window |
 | `--n-parameter-sets` | 4 | parallel multi-start param sets per run |
@@ -107,6 +107,44 @@ python scripts/train_strategy.py --optimiser adam --dry-run
 
 There's an upstream bug in `create_trial_params` where `expand_around=True` produces invalid `low > high` bounds for `logit_lamb` (because the optuna path overrides `parameter_config["logit_lamb"]` with absolute bounds but the expand_around branch treats them as deltas). The script pins `expand_around=False` in `base_fingerprint` as a workaround. If optuna silently returns 0 completed trials, check `./optuna_studies/optimization.log`.
 
+### reclamm
+
+reCLAMM is a concentrated-liquidity pool (`ReClammPool` in `quantammsim/pools/reCLAMM/reclamm.py`, dispatched from `creator.py:232`). It plugs into the same sweep pipeline as the QuantAMM rules — `--rule reclamm` is all you need on the CLI. Optuna is the natural optimiser because most trainable knobs are scalar log-scaled rates, not gradient-friendly weight vectors; Adam works but expects a different parameter shape.
+
+```bash
+# Train a reclamm sweep on AAVE/ETH with Optuna.
+python scripts/train_strategy.py \
+  --rule reclamm \
+  --tokens AAVE ETH \
+  --optimiser optuna \
+  --start "2024-06-01 00:00:00" \
+  --end "2025-06-01 00:00:00"
+
+# Smoke-test reclamm end-to-end in seconds.
+python scripts/train_strategy.py --rule reclamm --tokens AAVE ETH --optimiser optuna --smoke --max-runs 4
+```
+
+reclamm-specific knobs live in `quantammsim/runners/default_run_fingerprint.py` and apply only when `rule == "reclamm"`:
+
+| key | default | what it does |
+|---|---|---|
+| `reclamm_interpolation_method` | `"geometric"` | `"geometric"` or `"constant_arc_length"` |
+| `reclamm_arc_length_speed` | `None` | auto-calibrate from geometric onset; or fix a number |
+| `reclamm_centeredness_scaling` | `False` | scale speed by margin/centeredness |
+| `reclamm_learn_arc_length_speed` | `False` | include `arc_length_speed` in trainable params |
+| `reclamm_use_shift_exponent` | `False` | parametrise shift rate as `shift_exponent` (log-friendly) |
+| `reclamm_learn_fees` | `False` | include `fees` in the Optuna search |
+
+These aren't exposed as CLI flags — override them by editing `base_fingerprint()` in `train_strategy.py` (or by reaching into `fp[...]` after `base_fingerprint(args)` returns).
+
+Reference scripts:
+
+- `scripts/demo_run_reclamm.py` — single-run simulation (no training); use to sanity-check tokens/dates before kicking off a sweep.
+- `scripts/calibrate_reclamm_noise.py` — calibrate `price_noise_sigma` from real reCLAMM behaviour; run before training if you want realistic noise injection.
+- `scripts/plot_reclamm_optuna_result.py` and `scripts/reclamm/` — analysis/plotting helpers for reclamm trial results.
+
+After training, the rest of the pipeline is unchanged — `evaluate_trials.py` then `readout.py` work the same way as for QuantAMM rules.
+
 ---
 
 ## `evaluate_trials.py` — pick a winner
diff --git a/scripts/download_data.py b/scripts/download_data.py
index c004bcb..149b70c 100644
--- a/scripts/download_data.py
+++ b/scripts/download_data.py
@@ -3,6 +3,22 @@ import zipfile
 import argparse
 from pathlib import Path
 
+
+# binance_historical_data uses mpire with the default fork start method. quantammsim
+# transitively imports JAX (multithreaded), and fork-after-threads deadlocks on macOS.
+# Force spawn workers before any mpire pool is constructed.
+import mpire
+
+_orig_workerpool_init = mpire.WorkerPool.__init__
+
+
+def _workerpool_init_spawn(self, *args, **kwargs):
+    kwargs.setdefault("start_method", "spawn")
+    return _orig_workerpool_init(self, *args, **kwargs)
+
+
+mpire.WorkerPool.__init__ = _workerpool_init_spawn
+
 from tqdm import tqdm
 from quantammsim.utils.data_processing.historic_data_utils import (
     update_historic_data,