bitgn-ecom-api

OpenRouter-driven runner for the BitGN ECOM benchmark, decoupled from bitgn-ecom. This project is standalone: it owns its own venv, vendors its own copy of the bitgn-* wrappers and Python linter modules, and does not require bitgn-ecom to be present on the machine.

Quickstart

cp .env.template .env
# fill in DEEPSEEK_API_KEY (direct backend) and BITGN_API_KEY.
# OPENROUTER_API_KEY is only needed for OpenRouter-backed aliases.
uv sync
uv run python runner_api.py \
    --model deepseek-v4-flash \
    --task t01 \
    --workers 1

--model deepseek-v4-flash resolves (via models.toml) to the DeepSeek direct API (DEEPSEEK_API_KEY), bypassing OpenRouter and its per-key credit limit. To A/B the same model through OpenRouter, use --model deepseek-v4-flash-or; the bare slash ID deepseek/deepseek-v4-flash is also passed straight through to OpenRouter.

Layout

.
├── runner_api.py          # CLI runner (trial dispatch, harness lifecycle)
├── api_executor.py        # OpenRouter tool-use loop, gates, refs enforcement
├── events.py tools.py linters.py
├── tests/  scripts/  runs/  results/
└── vendor/
    ├── PROVENANCE.md      # source SHA + snapshot date + wrapper patches
    ├── bin/               # bitgn-* wrappers (PATH for executor subprocess)
    └── python/            # ecom_linters, ecom_taxonomy, ecom_tracking, …

See vendor/PROVENANCE.md for the upstream commit SHA from which the vendored copy was snapshotted. This project is in fork mode: it owns its copy and may diverge freely.
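The layout notes that vendor/bin is placed on PATH for the executor subprocess. A minimal sketch of how such an environment might be built is shown here; the executor_env helper is hypothetical and is not taken from api_executor.py.

```python
import os
from pathlib import Path


def executor_env(project_root, base_env=None):
    """Return a copy of the environment with vendor/bin first on PATH.

    Hypothetical helper: the actual executor may construct its
    subprocess environment differently.
    """
    env = dict(os.environ if base_env is None else base_env)
    vendor_bin = str(Path(project_root) / "vendor" / "bin")
    existing = env.get("PATH", "")
    # Prepend so the vendored bitgn-* wrappers win over any system copies.
    env["PATH"] = os.pathsep.join(p for p in (vendor_bin, existing) if p)
    return env
```

The returned mapping could then be passed as the env argument of subprocess.run, so that a bitgn-* command resolves to the vendored wrapper without modifying the parent process environment.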

Smoke (multiple tasks)

uv run python runner_api.py \
    --model <verified-id> \
    --task t01 --task t02 --task t03 \
    --workers 1 --verbose \
    --output results/smoke3.json
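The smoke command repeats --task once per task. One way such a command line could be parsed is sketched below, assuming argparse; the flag names come from the commands in this README, but the defaults, the tasks attribute name, and build_parser itself are hypothetical and may not match runner_api.py.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Illustrative parser for the flags shown in this README."""
    parser = argparse.ArgumentParser(prog="runner_api.py")
    parser.add_argument("--model", required=True)
    # action="append" collects every occurrence of --task into one list.
    parser.add_argument("--task", action="append", default=[], dest="tasks")
    parser.add_argument("--workers", type=int, default=1)
    parser.add_argument("--verbose", action="store_true")
    parser.add_argument("--output", default=None)
    return parser


args = build_parser().parse_args(
    ["--model", "deepseek-v4-flash",
     "--task", "t01", "--task", "t02", "--task", "t03",
     "--workers", "1", "--verbose",
     "--output", "results/smoke3.json"]
)
```

With this sketch, args.tasks holds the three task ids in the order given, and args.workers is parsed as an integer.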
