OpenRouter-driven runner for the BitGN ECOM benchmark, decoupled from
bitgn-ecom. This project is standalone: it owns its own venv, vendors its
own copy of the bitgn-* wrappers and Python linter modules, and does not
require bitgn-ecom to be present on the machine.
cp .env.template .env
# fill in DEEPSEEK_API_KEY (direct backend) and BITGN_API_KEY.
# OPENROUTER_API_KEY is only needed for OpenRouter-backed aliases.
uv sync
uv run python runner_api.py \
--model deepseek-v4-flash \
--task t01 \
--workers 1
--model deepseek-v4-flash resolves (via models.toml) to the DeepSeek direct API (DEEPSEEK_API_KEY), bypassing OpenRouter and its per-key credit limit. To A/B the same model through OpenRouter, use --model deepseek-v4-flash-or; the bare slash id deepseek/deepseek-v4-flash is also an OpenRouter passthrough.
.
├── runner_api.py # CLI runner (trial dispatch, harness lifecycle)
├── api_executor.py # OpenRouter tool-use loop, gates, refs enforcement
├── events.py tools.py linters.py
├── tests/ scripts/ runs/ results/
└── vendor/
    ├── PROVENANCE.md # source SHA + snapshot date + wrapper patches
    ├── bin/ # bitgn-* wrappers (PATH for executor subprocess)
    └── python/ # ecom_linters, ecom_taxonomy, ecom_tracking, …
See vendor/PROVENANCE.md for the upstream commit SHA the vendored copy was
snapshotted from. This project is in fork mode: it owns its copy and may
diverge freely.
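As a rough illustration of how the vendored wrappers could be made visible to the executor subprocess, the sketch below builds an environment with vendor/bin first on PATH. The executor_env helper is hypothetical, and adding vendor/python to PYTHONPATH is an assumption; the actual runner may wire these up differently.

```python
import os
from pathlib import Path


def executor_env(project_root, base_env=None):
    """Return a copy of the environment with the vendored dirs prepended.

    Hypothetical helper: the real runner may set this up another way.
    """
    env = dict(os.environ if base_env is None else base_env)
    root = Path(project_root)
    bin_dir = str(root / "vendor" / "bin")
    py_dir = str(root / "vendor" / "python")
    # Put vendor/bin first so the vendored bitgn-* wrappers win over any
    # copies installed elsewhere on the machine.
    env["PATH"] = os.pathsep.join(p for p in (bin_dir, env.get("PATH", "")) if p)
    # Assumption: the vendored Python modules are importable via PYTHONPATH.
    env["PYTHONPATH"] = os.pathsep.join(
        p for p in (py_dir, env.get("PYTHONPATH", "")) if p
    )
    return env


if __name__ == "__main__":
    env = executor_env(Path.cwd())
    print(env["PATH"].split(os.pathsep)[0])
```

The returned mapping can be passed as the env argument of subprocess.run, which leaves the parent process environment untouched.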
uv run python runner_api.py \
--model <verified-id> \
--task t01 --task t02 --task t03 \
--workers 1 --verbose \
--output results/smoke3.json