10 changes: 10 additions & 0 deletions .gitignore
@@ -161,3 +161,13 @@ target/
profile_default/
ipython_config.py
docs/source/api/core/generated/

# local virtualenv
.venv/

# local agent/editor config
.claude/

# simulation / tuning artifacts
/results/
optuna_studies/
79 changes: 79 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,79 @@
# quantammsim — running simulations

JAX-accelerated simulator for backtesting and tuning AMM pools (Balancer, CowAMM, Gyroscope, QuantAMM, reCLAMM) against historic minute-resolution token data. This file is for agents driving simulations; read it before running anything.

## Environment

- Use the existing venv: `.venv/` (Python 3.12). Prefix commands with `.venv/bin/python`, or `source .venv/bin/activate` first. Do not assume conda.
- Package is installed editable (`pip install -e .`). If imports fail, re-run that from the repo root.
- reCLAMM lives on the `training-pipeline` branch.
- Quick sanity check: `.venv/bin/python -c "from quantammsim.runners.jax_runners import do_run_on_historic_data; print('ok')"`

## Data

- Historic data is parquet at `quantammsim/data/<TICKER>_USD.parquet` (minute resolution). Available now: `AAVE`, `ETH`, `USDC`. Add more with `.venv/bin/python scripts/download_data.py <TICKERS...>`.
- `tokens` in any run config must match these filenames exactly (`ETH`, not `WETH`; `BTC`, not `WBTC`).
- The downloader already forces `mpire` `start_method="spawn"` (JAX is multithreaded; `fork` deadlocks on macOS) and normalizes Binance's 2025+ microsecond timestamps. Don't strip those patches in `scripts/download_data.py` / `historic_data_utils.py`.
- First download per ticker is heavy (~1 min, ETH parquet ≈ 180 MB).
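
A quick pre-flight check can catch ticker mismatches (for example `WETH` instead of `ETH`) before a run starts. The sketch below relies only on the `<TICKER>_USD.parquet` naming stated above; `missing_tickers` is a hypothetical helper, not part of quantammsim.

```python
from pathlib import Path


def missing_tickers(tokens, data_dir="quantammsim/data"):
    """Return the tickers that have no `<TICKER>_USD.parquet` file in data_dir.

    Hypothetical helper: only the file naming pattern comes from this document.
    """
    data_dir = Path(data_dir)
    return [t for t in tokens if not (data_dir / f"{t}_USD.parquet").is_file()]
```

An empty list means every entry in `tokens` has a matching parquet file; anything returned needs a download or a rename in the run config.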

## Running one backtest

Entrypoint: `do_run_on_historic_data(run_fingerprint, params)` in `quantammsim.runners.jax_runners`.

```python
import numpy as np, jax.numpy as jnp
from quantammsim.runners.jax_runners import do_run_on_historic_data

def to_daily_price_shift_base(exp):  # daily price-shift % -> Solidity base
    return 1.0 - exp / 124649.0

run_fingerprint = {
    "tokens": ["AAVE", "ETH"], "rule": "reclamm",
    "startDateString": "2025-05-20 00:00:00", "endDateString": "2026-05-20 00:00:00",
    "initial_pool_value": 10_000.0, "do_arb": True,
    "fees": 0.003, "gas_cost": 0.0, "arb_fees": 0.0,
    "chunk_period": 60, "weight_interpolation_period": 60,
}
params = {  # reCLAMM knobs
    "price_ratio": jnp.array(4.0),
    "centeredness_margin": jnp.array(0.1),  # 10%
    "daily_price_shift_base": jnp.array(to_daily_price_shift_base(0.02)),  # 2%
}
r = do_run_on_historic_data(run_fingerprint=run_fingerprint, params=params)
```

Parameter units: percentages are fractions — `fees=0.003` is 0.3%, `centeredness_margin=0.1` is 10%. `price_ratio` is a raw multiplier. Daily price-shift % must be converted via `to_daily_price_shift_base`.
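
As a worked instance of the conversion, the sketch below repeats `to_daily_price_shift_base` from the snippet above and adds an inverse for reading a base back as a fraction. The inverse and the `SHIFT_DIVISOR` name are assumptions introduced here for illustration; only the forward formula and the constant `124649.0` come from this file.

```python
SHIFT_DIVISOR = 124649.0  # constant from to_daily_price_shift_base in this file


def to_daily_price_shift_base(exp):
    """Daily price-shift as a fraction (0.02 = 2%) -> base."""
    return 1.0 - exp / SHIFT_DIVISOR


def from_daily_price_shift_base(base):
    """Hypothetical inverse: base -> daily price-shift fraction."""
    return (1.0 - base) * SHIFT_DIVISOR


base = to_daily_price_shift_base(0.02)      # about 1 - 1.6e-7
recovered = from_daily_price_shift_base(base)
```

The base for a 2% shift sits only about 1.6e-7 below 1, which is a handful of float32 steps, so float64 is the safer choice when handling this value outside the simulator.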

Keys of the returned dict `r`: `value`, `prices`, `reserves`, `weights`, `fee_revenue` (all over time), plus `final_value`, `final_reserves`. `r["value"]` may need `np.asarray(...).reshape(-1)`. `r["prices"]` is `(n_minutes, n_tokens)` in USD, column-aligned with `tokens`.

## Comparing against HODL (do this — pools are judged relative to holding)

A pool's USD value alone is meaningless without a hold baseline. Compute from the same run:

```python
val = np.asarray(r["value"]).reshape(-1)
prices = np.asarray(r["prices"]); res0 = np.asarray(r["reserves"])[0]
deposit_hodl = (res0 * prices[-1]).sum() # hold the deposited basket
init_usd = (res0 * prices[0]).sum()
uniform_hodl = ((init_usd / 2.0) / prices[0] * prices[-1]).sum() # equal-$ 50/50 hold
# pool minus HODL = fees earned minus impermanent loss
```
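
The same arithmetic can be wrapped in a function and checked on synthetic arrays before it is trusted on a real run. This is a minimal sketch: `hodl_comparison` is a hypothetical helper, the numbers are made up rather than simulator output, the `/ 2.0` is generalised to `n_tokens` as an assumption, and the last element of `value` is taken as the final pool value.

```python
import numpy as np


def hodl_comparison(value, prices, reserves):
    """Compare final pool value with two hold baselines (hypothetical helper)."""
    value = np.asarray(value, dtype=np.float64).reshape(-1)
    prices = np.asarray(prices, dtype=np.float64)        # (n_steps, n_tokens), USD
    res0 = np.asarray(reserves, dtype=np.float64)[0]     # initial reserves
    n_tokens = prices.shape[1]
    init_usd = (res0 * prices[0]).sum()
    deposit_hodl = (res0 * prices[-1]).sum()             # hold the deposited basket
    # Equal-dollar hold across all tokens (50/50 for a two-token pool).
    uniform_hodl = ((init_usd / n_tokens) / prices[0] * prices[-1]).sum()
    return {
        "pool_final": float(value[-1]),
        "deposit_hodl": float(deposit_hodl),
        "uniform_hodl": float(uniform_hodl),
        "pool_minus_deposit_hodl": float(value[-1] - deposit_hodl),
        "pool_minus_uniform_hodl": float(value[-1] - uniform_hodl),
    }


# Synthetic two-token example; values are illustrative only.
prices = np.array([[100.0, 2000.0], [110.0, 1900.0]])
reserves = np.array([[50.0, 2.5], [48.0, 2.6]])
value = np.array([10_000.0, 10_250.0])
out = hodl_comparison(value, prices, reserves)
```

In this toy case the deposit was already split 50/50 in dollar terms, so both baselines coincide and the pool exactly matches holding.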

## Tuning (finding good params)

Entrypoint: `train_on_historic_data(fp, return_training_metadata=True)` → `(best_params, metadata)`. Set `fp["optimisation_settings"]["method"]="optuna"`. The objective is `fp["return_val"]`; all objectives are normalized so **higher = better** (the optimizer maximizes). HODL-relative objectives: `returns_over_hodl`, `annualised_returns_over_hodl`, `returns_over_uniform_hodl`, `daily_log_sharpe_excess`. Full list is in `quantammsim/core_simulator/forward_pass.py` (`_calculate_return_value`). The CLI wrapper is `experiments/tune_reclamm_params.py` (defaults to AAVE/ETH, search over `price_ratio`/`centeredness_margin`/`shift_exponent`).

Always split train vs out-of-sample: `startDateString` (train start), `endDateString` (train end / test start), `endTestDateString` (test end). `metadata["best_continuous_test_metrics"][0]` holds OOS metrics (incl. both HODL variants) for any objective — use it as a common yardstick across objectives.
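
A small guard can confirm the three dates are ordered correctly before a long tuning run is launched. The sketch assumes the `YYYY-MM-DD HH:MM:SS` string format used in the backtest example above; `check_split` is a hypothetical helper, not a quantammsim function.

```python
from datetime import datetime

DATE_FMT = "%Y-%m-%d %H:%M:%S"  # format used by the date strings in this file


def check_split(fp):
    """Validate train/test ordering and return (train_days, test_days)."""
    start = datetime.strptime(fp["startDateString"], DATE_FMT)
    end = datetime.strptime(fp["endDateString"], DATE_FMT)
    test_end = datetime.strptime(fp["endTestDateString"], DATE_FMT)
    if not (start < end < test_end):
        raise ValueError("expected startDateString < endDateString < endTestDateString")
    return (end - start).days, (test_end - end).days


split_days = check_split({
    "startDateString": "2024-06-01 00:00:00",
    "endDateString": "2025-06-01 00:00:00",
    "endTestDateString": "2026-01-01 00:00:00",
})
```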

## Critical caveats (learned the hard way)

- **Continuous metrics ≠ fresh-deposit reality.** Every in-framework metric (train, validation, and the OOS `test_objective`) comes from a *continuous* simulation where the pool enters each window carrying reserves evolved from the prior period. A real LP deposits *fresh* (rebalanced to the pool's initial split). These diverge a lot. **For any LP-facing claim, re-run the chosen config fresh with `do_run_on_historic_data` on the target window** and compare to HODL there. Do not quote the tuner's OOS number as the LP outcome.
- **Optuna selects on the (penalized) train objective**, which overfits: the train-best config routinely loses out-of-sample, and configs that look best OOS in the continuous run can be the worst on a fresh deposit. Rank candidates by fresh-deposit OOS before recommending.
- reCLAMM is a price-following AMM: it tends to *track* the held basket and earns its edge from fees in **ranging** markets, not one-directional trends (where IL dominates). "No config beats HODL for this pair/period" is a valid, common result — report it honestly rather than tuning until something looks good.
- Optuna runs in-memory (no sqlite file); per-trial records persist to `results/run_<hash>.json` (double-JSON-encoded: `json.loads` twice; element 0 is the fingerprint, elements 1..N are trials). Live progress is logged to `optuna_studies/optimization.log`.
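
Reading the per-trial records back follows directly from the double encoding described in the last bullet. The sketch assumes exactly that layout (a JSON string wrapping a JSON list, fingerprint first); `load_run_records` is a hypothetical helper and the shape of each trial record is not specified here.

```python
import json


def load_run_records(path):
    """Decode a double-JSON-encoded results file into (fingerprint, trials)."""
    with open(path, "r", encoding="utf-8") as fh:
        outer = json.load(fh)        # first decode: yields a JSON string
    records = json.loads(outer)      # second decode: yields the list
    return records[0], records[1:]   # element 0 is the fingerprint, the rest are trials
```

Typical use would be `fingerprint, trials = load_run_records("results/run_<hash>.json")`, with the actual hash filled in.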

## Conventions

- Generated artifacts land in `results/` and `optuna_studies/` (untracked); clean or ignore as needed.
- Long runs: launch in the background and poll the log, don't block.
- `train_on_historic_data` sets JAX x64 mode internally; for standalone numeric work prefer float64 to match.
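
For polling a long run without blocking, reading the last few lines of the log is usually enough. This is a minimal standard-library sketch; `tail` is a hypothetical helper, and the log path in the usage note is the one named under "Critical caveats".

```python
from collections import deque


def tail(path, n=20):
    """Return the last n lines of a text file without their trailing newlines."""
    with open(path, "r", encoding="utf-8", errors="replace") as fh:
        # deque with maxlen keeps only the final n lines while streaming the file.
        return [line.rstrip("\n") for line in deque(fh, maxlen=n)]
```

For example, `tail("optuna_studies/optimization.log", 5)` shows the five most recent log lines.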
3 changes: 2 additions & 1 deletion quantammsim/utils/data_processing/amalgamated_data_utils.py
@@ -43,7 +43,8 @@ def forward_fill_ohlcv_data(df, token):
end=pd.to_datetime(df.index.max(), unit="ms"),
freq="1min",
)
- full_index = full_index.astype(np.int64) // 10**6
+ # Force ms precision before int64 cast; pandas 3.x date_range defaults to [ms], 2.x to [ns].
+ full_index = full_index.astype("datetime64[ms]").astype(np.int64)
# Reindex with the complete minute-level index
df = df.reindex(full_index)
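
The effect of this change can be illustrated with NumPy alone. The sketch below is not the project's code: `to_epoch_ms` is a hypothetical helper showing that casting to `datetime64[ms]` first yields the same integer milliseconds whatever the input resolution, whereas dividing by `10**6` is only right for nanosecond input.

```python
import numpy as np


def to_epoch_ms(values):
    """Cast datetime64 values of any resolution to integer ms since the epoch."""
    return np.asarray(values).astype("datetime64[ms]").astype(np.int64)


ns = np.array(["2025-01-01T00:00:00", "2025-01-01T00:01:00"], dtype="datetime64[ns]")
ms = ns.astype("datetime64[ms]")

from_ns = to_epoch_ms(ns)                      # resolution-independent result
from_ms = to_epoch_ms(ms)
old_style_on_ns = ns.astype(np.int64) // 10**6  # correct only because input is ns
old_style_on_ms = ms.astype(np.int64) // 10**6  # wrong: divides milliseconds again
```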

22 changes: 9 additions & 13 deletions quantammsim/utils/data_processing/historic_data_utils.py
@@ -701,18 +701,14 @@ def get_binance_vision_data(token, numeraire, root):

# Combine and format data
combined_df = pd.concat(monthly_files + daily_files)
- # Convert unix timestamps to milliseconds if they're in nanoseconds or seconds
- # Typical millisecond timestamps are ~13 digits
- # Nanosecond timestamps are ~19 digits
- # Second timestamps are ~10 digits
- # First convert nanoseconds to milliseconds
- combined_df["unix"] = combined_df["unix"].apply(
- lambda x: x // 1_000_000 if len(str(int(x))) > 13 else x
- )
- # Then convert seconds to milliseconds
- combined_df["unix"] = combined_df["unix"].apply(
- lambda x: x * 1000 if len(str(int(x))) <= 10 else x
- )
+ # Normalize unix timestamps to milliseconds. Binance Vision shipped klines in ms
+ # historically (13 digits) and switched to microseconds in 2025 (16 digits); nanosecond
+ # (19 digits) and second (10 digits) variants also exist in other sources.
+ u = combined_df["unix"].to_numpy(dtype=np.int64)
+ u = np.where((u >= 10**15) & (u < 10**18), u // 1_000, u) # us -> ms
+ u = np.where(u >= 10**18, u // 1_000_000, u) # ns -> ms
+ u = np.where(u < 10**12, u * 1_000, u) # s -> ms
+ combined_df["unix"] = u
combined_df["date"] = pd.to_datetime(combined_df["unix"], unit="ms").dt.strftime("%Y-%m-%d %H:%M:%S")
combined_df["symbol"] = f"{token}/{numeraire}"
combined_df[f"Volume {token}"] = combined_df["volume"]
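
The thresholds in the new normalization can be checked in isolation with a scalar version. The sketch below mirrors the three `np.where` steps in plain Python; `normalize_unix_to_ms` is a hypothetical name used for illustration and is not part of `historic_data_utils.py`.

```python
def normalize_unix_to_ms(ts):
    """Scalar mirror of the vectorised unit normalisation (illustrative only)."""
    ts = int(ts)
    if 10**15 <= ts < 10**18:   # microseconds (16 digits for current dates)
        return ts // 1_000
    if ts >= 10**18:            # nanoseconds (19 digits)
        return ts // 1_000_000
    if ts < 10**12:             # seconds (10 digits)
        return ts * 1_000
    return ts                   # already milliseconds (13 digits)


samples = {
    "s": normalize_unix_to_ms(1_735_689_600),
    "ms": normalize_unix_to_ms(1_735_689_600_000),
    "us": normalize_unix_to_ms(1_735_689_600_000_000),
    "ns": normalize_unix_to_ms(1_735_689_600_000_000_000),
}
```

All four inputs encode 2025-01-01 00:00:00 UTC and map to the same millisecond value. One limit of magnitude-based detection: a millisecond timestamp below `10**12` (before 2001-09-09) would be treated as seconds, which does not matter for the date ranges discussed here.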
@@ -984,7 +980,7 @@ def update_historic_data(token, root):
agg_dict = {k: v for k, v in agg_dict.items() if k in concated_df_hourly.columns}

# Perform resampling
- hourly_data = concated_df_hourly.resample("1H").agg(agg_dict).reset_index()
+ hourly_data = concated_df_hourly.resample("1h").agg(agg_dict).reset_index()

# Save hourly data
hourly_data.to_csv(hourlyPath, index=False)
40 changes: 39 additions & 1 deletion scripts/README.md
@@ -67,7 +67,7 @@ All four grids additionally cross over the same **cross-cutting axes** (`ste`, `
| flag | default | what it does |
|---|---|---|
| `--optimiser` | `adam` | picks the sweep grid |
- | `--rule` | `momentum` | pre-canned strategy name (see the quantamm pools: `momentum`, `anti_momentum`, `power_channel`, `mean_reversion_channel`) |
+ | `--rule` | `momentum` | pre-canned strategy name. QuantAMM pools: `momentum`, `anti_momentum`, `power_channel`, `mean_reversion_channel`, `difference_momentum`, `min_variance`, `hodling_index_market_cap`, `trad_hodling_index_market_cap`. Other pools: `balancer`, `cow`, `gyroscope`, `hodl`, `reclamm` (see the [reclamm](#reclamm) section below). |
| `--tokens ETH USDC` | ETH USDC | pool assets, must match downloaded data |
| `--start` / `--end` / `--test-end` | 2023-06-01 / 2025-06-01 / 2026-01-01 | train window + held-out test window |
| `--n-parameter-sets` | 4 | parallel multi-start param sets per run |
@@ -107,6 +107,44 @@ python scripts/train_strategy.py --optimiser adam --dry-run

There's an upstream bug in `create_trial_params` where `expand_around=True` produces invalid `low > high` bounds for `logit_lamb` (because the optuna path overrides `parameter_config["logit_lamb"]` with absolute bounds but the expand_around branch treats them as deltas). The script pins `expand_around=False` in `base_fingerprint` as a workaround. If optuna silently returns 0 completed trials, check `./optuna_studies/optimization.log`.

### reclamm

reCLAMM is a concentrated-liquidity pool (`ReClammPool` in `quantammsim/pools/reCLAMM/reclamm.py`, dispatched from `creator.py:232`). It plugs into the same sweep pipeline as the QuantAMM rules — `--rule reclamm` is all you need on the CLI. Optuna is the natural optimiser because most trainable knobs are scalar log-scaled rates, not gradient-friendly weight vectors; Adam works but expects a different parameter shape.

```bash
# Train a reclamm sweep on AAVE/ETH with Optuna.
python scripts/train_strategy.py \
--rule reclamm \
--tokens AAVE ETH \
--optimiser optuna \
--start "2024-06-01 00:00:00" \
--end "2025-06-01 00:00:00"

# Smoke-test reclamm end-to-end in seconds.
python scripts/train_strategy.py --rule reclamm --tokens AAVE ETH --optimiser optuna --smoke --max-runs 4
```

reclamm-specific knobs live in `quantammsim/runners/default_run_fingerprint.py` and apply only when `rule == "reclamm"`:

| key | default | what it does |
|---|---|---|
| `reclamm_interpolation_method` | `"geometric"` | `"geometric"` or `"constant_arc_length"` |
| `reclamm_arc_length_speed` | `None` | auto-calibrate from geometric onset; or fix a number |
| `reclamm_centeredness_scaling` | `False` | scale speed by margin/centeredness |
| `reclamm_learn_arc_length_speed` | `False` | include `arc_length_speed` in trainable params |
| `reclamm_use_shift_exponent` | `False` | parametrise shift rate as `shift_exponent` (log-friendly) |
| `reclamm_learn_fees` | `False` | include `fees` in the Optuna search |

These aren't exposed as CLI flags — override them by editing `base_fingerprint()` in `train_strategy.py` (or by reaching into `fp[...]` after `base_fingerprint(args)` returns).
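
One way to apply such overrides without mistyping a key is to validate against the names in the table above. This is a sketch under stated assumptions: `apply_reclamm_overrides` is a hypothetical helper, the fingerprint is treated as a plain dict, and only the six key names come from this README.

```python
RECLAMM_KEYS = {
    "reclamm_interpolation_method",
    "reclamm_arc_length_speed",
    "reclamm_centeredness_scaling",
    "reclamm_learn_arc_length_speed",
    "reclamm_use_shift_exponent",
    "reclamm_learn_fees",
}


def apply_reclamm_overrides(fp, overrides):
    """Return a copy of fp with reclamm-specific keys overridden (hypothetical helper)."""
    unknown = set(overrides) - RECLAMM_KEYS
    if unknown:
        raise KeyError(f"not a reclamm knob: {sorted(unknown)}")
    updated = dict(fp)          # shallow copy so the caller's dict is untouched
    updated.update(overrides)
    return updated


# Stand-in fingerprint for illustration; a real one would come from base_fingerprint(args).
fp = {"rule": "reclamm", "reclamm_use_shift_exponent": False}
fp_tuned = apply_reclamm_overrides(fp, {"reclamm_use_shift_exponent": True})
```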

Reference scripts:

- `scripts/demo_run_reclamm.py` — single-run simulation (no training); use to sanity-check tokens/dates before kicking off a sweep.
- `scripts/calibrate_reclamm_noise.py` — calibrate `price_noise_sigma` from real reCLAMM behaviour; run before training if you want realistic noise injection.
- `scripts/plot_reclamm_optuna_result.py` and `scripts/reclamm/` — analysis/plotting helpers for reclamm trial results.

After training, the rest of the pipeline is unchanged — `evaluate_trials.py` then `readout.py` work the same way as for QuantAMM rules.

---

## `evaluate_trials.py` — pick a winner
16 changes: 16 additions & 0 deletions scripts/download_data.py
@@ -3,6 +3,22 @@
import zipfile
import argparse
from pathlib import Path

# binance_historical_data uses mpire with the default fork start method. quantammsim
# transitively imports JAX (multithreaded), and fork-after-threads deadlocks on macOS.
# Force spawn workers before any mpire pool is constructed.
import mpire

_orig_workerpool_init = mpire.WorkerPool.__init__


def _workerpool_init_spawn(self, *args, **kwargs):
    kwargs.setdefault("start_method", "spawn")
    return _orig_workerpool_init(self, *args, **kwargs)


mpire.WorkerPool.__init__ = _workerpool_init_spawn

from tqdm import tqdm
from quantammsim.utils.data_processing.historic_data_utils import (
update_historic_data,
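
The patch above wraps a constructor so that a keyword default changes only when the caller has not set it. The sketch below shows that pattern on a stand-in class; `Pool` is hypothetical and is not mpire's `WorkerPool`, so it demonstrates the wrapping technique rather than mpire's behaviour.

```python
class Pool:
    """Stand-in for a third-party class whose default we want to change."""

    def __init__(self, n_jobs=1, start_method="fork"):
        self.n_jobs = n_jobs
        self.start_method = start_method


_orig_init = Pool.__init__


def _init_spawn(self, *args, **kwargs):
    # setdefault only fills the key when the caller did not pass it by keyword.
    kwargs.setdefault("start_method", "spawn")
    return _orig_init(self, *args, **kwargs)


Pool.__init__ = _init_spawn

default_pool = Pool(n_jobs=2)                          # picks up the new default
explicit_pool = Pool(n_jobs=2, start_method="fork")    # explicit keyword still wins
```

A caveat of this approach: if a caller passes `start_method` positionally, the injected keyword collides with it and raises `TypeError`, so the wrapper is only safe when callers use keywords or omit the argument.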