Commit bce1e89

Author: zhusy54
Support: simplify benchmark workflow and fix case filtering
- Restructure `benchmark_rounds.sh` to prefer `test_*.py` over `run_example.py`.
- Fix: only pass `--manual include` to `test_*.py`; `run_example.py` does not support this flag and would crash when `case_name` is specified.
1 parent 92155aa commit bce1e89

2 files changed

Lines changed: 13 additions & 23 deletions

.claude/skills/benchmark/SKILL.md

Lines changed: 12 additions & 23 deletions
@@ -87,17 +87,7 @@ npu-smi info
 
 Pick devices with **HBM-Usage = 0**. Find the longest consecutive sub-range (at most 4). If no idle device is found, prompt user to specify a device ID.
 
-## Step 3: Pin PTO-ISA
-
-Extract pinned commit from `.github/workflows/ci.yml`:
-
-```bash
-PTO_ISA_COMMIT=$(grep -oP '(?<=-c )\w+' .github/workflows/ci.yml | head -1)
-```
-
-Append `-c $PTO_ISA_COMMIT` to benchmark args so `run_example.py` picks it up.
-
-## Step 4: Prepare — Compute Absolute Paths
+## Step 3: Compute Absolute Paths
 
 The Bash tool resets its working directory to the project root on every call. Relative paths like `cd worktree && ...` are fragile and easy to forget. **Compute absolute paths once, then use them everywhere.**

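The device-selection rule in the context lines of this hunk (longest consecutive run of idle devices, capped at 4) can be sketched in bash. This is a minimal sketch, assuming the idle device IDs have already been extracted from `npu-smi info` and are passed in ascending order. The function name `pick_idle_range` is hypothetical, and parsing the `npu-smi` output is left out because its layout is not shown in this diff.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the commit): given idle device IDs in
# ascending order, print the longest run of consecutive IDs, capped at 4.
pick_idle_range() {
    local max_len=4
    local best_start="" best_len=0
    local run_start="" run_len=0 prev="" id i
    for id in "$@"; do
        if [[ -n "$prev" ]] && (( id == prev + 1 )); then
            run_len=$(( run_len + 1 ))      # extend the current run
        else
            run_start=$id                   # start a new run
            run_len=1
        fi
        if (( run_len > best_len )); then   # on ties, the earliest run wins
            best_start=$run_start
            best_len=$run_len
        fi
        prev=$id
    done
    if (( best_len == 0 )); then
        return 1    # no idle device: the caller should ask for a device ID
    fi
    if (( best_len > max_len )); then
        best_len=$max_len
    fi
    local out=()
    for (( i = 0; i < best_len; i++ )); do
        out+=( $(( best_start + i )) )
    done
    printf '%s\n' "${out[*]}"
}
```

For example, `pick_idle_range 0 1 3 4 5 6 7` prints `3 4 5 6`: the run 3..7 is the longest, and it is truncated to four devices.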
@@ -118,7 +108,7 @@ WORKTREE_ABS="/home/user/simpler/tmp/worktree_baseline_20260331_102302"
 
 **Do NOT use `cd` + relative `./tools/...`** — this is the #1 source of silent errors (running the wrong workspace).
 
-## Step 5: Run Benchmarks
+## Step 4: Run Benchmarks
 
 ### Single Mode
 
@@ -141,7 +131,7 @@ Pure Python files (`bindings.py`, `code_runner.py`) are resolved via `sys.path`
 
 **Solution: always create a venv in the worktree** (~26s overhead). This builds both the nanobind extension AND runtime binaries, fully isolating the baseline.
 
-#### 5a. Create worktree, venv, and build
+#### 4a. Create worktree, venv, and build
 
 Inline the **absolute** worktree path (copy-paste the value, do not rely on shell variables persisting):
 
@@ -158,27 +148,27 @@ python3 -m venv "${WORKTREE_ABS}/.venv" --system-site-packages
 
 This gives the worktree its own `_task_interface.*.so` in `.venv/lib/python3.*/site-packages/`, completely independent from the main workspace.
 
-#### 5b. Run baseline
+#### 4b. Run baseline
 
 Activate the venv so `benchmark_rounds.sh` (which calls `python3`) picks up the worktree's nanobind extension and Python bindings:
 
 ```bash
 # WORKTREE_ABS must be the literal absolute path (e.g. /home/user/simpler/tmp/worktree_baseline_20260331)
-cd "$WORKTREE_ABS" && source .venv/bin/activate && pwd && ./tools/benchmark_rounds.sh -d $BASELINE_DEVICE -c $PTO_ISA_COMMIT -r "$RUNTIME" \
+cd "$WORKTREE_ABS" && source .venv/bin/activate && pwd && ./tools/benchmark_rounds.sh -d $BASELINE_DEVICE -r "$RUNTIME" \
   2>&1 | tee "${PROJECT_ROOT}/tmp/benchmark_baseline_${TIMESTAMP}_${RUNTIME}.txt"
 ```
 
 **Always include `pwd &&` after `cd` to verify you are in the correct directory.** If `pwd` does not print the worktree path, something went wrong — do not proceed.
 
-#### 5c. Run current
+#### 4c. Run current
 
 ```bash
 # Runs from the main workspace (Bash tool default cwd)
-./tools/benchmark_rounds.sh -d $CURRENT_DEVICE -c $PTO_ISA_COMMIT -r "$RUNTIME" \
+./tools/benchmark_rounds.sh -d $CURRENT_DEVICE -r "$RUNTIME" \
   2>&1 | tee "tmp/benchmark_current_${TIMESTAMP}_${RUNTIME}.txt"
 ```
 
-#### 5d. Cleanup
+#### 4d. Cleanup
 
 ```bash
 git worktree remove "$WORKTREE_ABS" --force
@@ -209,22 +199,22 @@ done
 #### Sequential execution (one device)
 
 ```bash
-# 1. Worktree + venv already created in step 5a
+# 1. Worktree + venv already created in step 4a
 
 # 2. For each runtime (serially — one device, one process at a time):
 # Baseline first (from worktree with venv activated)
-cd "$WORKTREE_ABS" && source .venv/bin/activate && pwd && ./tools/benchmark_rounds.sh -d $DEVICE -c $PTO_ISA_COMMIT -r "$RUNTIME" \
+cd "$WORKTREE_ABS" && source .venv/bin/activate && pwd && ./tools/benchmark_rounds.sh -d $DEVICE -r "$RUNTIME" \
   2>&1 | tee "${PROJECT_ROOT}/tmp/benchmark_baseline_${TIMESTAMP}_${RUNTIME}.txt"
 
 # Then current (from main workspace — default cwd, no venv)
-./tools/benchmark_rounds.sh -d $DEVICE -c $PTO_ISA_COMMIT -r "$RUNTIME" \
+./tools/benchmark_rounds.sh -d $DEVICE -r "$RUNTIME" \
   2>&1 | tee "tmp/benchmark_current_${TIMESTAMP}_${RUNTIME}.txt"
 
 # 3. Cleanup
 git -C "$PROJECT_ROOT" worktree remove "$WORKTREE_ABS" --force
 ```
 
-## Step 6: Report Results
+## Step 5: Report Results
 
 Parse `Trimmed Avg:` for elapsed and `Orch Trimmed Avg:` for orchestration time from benchmark output.

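The parsing step in the last context line of this hunk, together with the 5% regression threshold mentioned in the next hunk header, could look roughly like the following. This is a sketch under stated assumptions: each metric sits on its own line and the first number after the label is the value (the exact output layout of `benchmark_rounds.sh` is not shown in this diff), and the helper names `extract_metric` and `regression_pct` are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the first number that follows LABEL on every
# matching line of FILE. Assumed layout: "<label> <number> ...".
extract_metric() {
    local label=$1 file=$2
    awk -v label="$label" '
    {
        pos = index($0, label)
        if (pos == 0) next
        # "Trimmed Avg:" is a substring of "Orch Trimmed Avg:"; skip lines
        # where the label is directly preceded by "Orch ".
        if (substr($0, 1, pos - 1) ~ /Orch $/) next
        rest = substr($0, pos + length(label))
        if (match(rest, /[0-9]+(\.[0-9]+)?/))
            print substr(rest, RSTART, RLENGTH)
    }' "$file"
}

# Hypothetical helper: percentage change from baseline ($1) to current ($2).
# A positive result means the current run is slower than the baseline.
regression_pct() {
    awk -v b="$1" -v c="$2" 'BEGIN { printf "%.1f\n", (c - b) / b * 100 }'
}
```

Usage might look like `extract_metric 'Orch Trimmed Avg:' "tmp/benchmark_current_${TIMESTAMP}_${RUNTIME}.txt"`, with each baseline/current pair then passed to `regression_pct` and any result above 5 flagged in the report.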
@@ -293,7 +283,6 @@ If any example shows > 5% regression, highlight it explicitly.
 
 - [ ] Mode detected (single vs compare)
 - [ ] Idle device found or user-specified
-- [ ] PTO-ISA pinned to CI commit
 - [ ] `PROJECT_ROOT` and `WORKTREE_ABS` absolute paths computed
 - [ ] (Compare mode) Worktree created, venv built with `pip install -e .`
 - [ ] (Compare mode) Baseline completed — venv activated, `pwd` confirmed worktree path before running

tools/benchmark_rounds.sh

Lines changed: 1 addition & 0 deletions
@@ -411,6 +411,7 @@ run_bench() {
     fi
     if [[ -n "$case_name" ]]; then
         run_cmd+=(--case "$case_name")
+        [[ -n "$test_file" ]] && run_cmd+=(--manual include)
     fi
     run_cmd+=("${EXTRA_ARGS[@]}")

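The effect of the added line can be reproduced in isolation. The function below is a simplified, hypothetical stand-in for the argument assembly in `run_bench()` (the name `build_case_args` and its two positional parameters are assumptions for illustration, not code from the script). It shows that `--manual include` is appended only when a case name is given and a `test_*.py` file was selected.

```bash
#!/usr/bin/env bash
# Hypothetical, simplified stand-in for the flag assembly in run_bench().
#   $1: case name (may be empty)
#   $2: path of the selected test_*.py file (empty when run_example.py is used)
build_case_args() {
    local case_name=$1 test_file=$2
    local args=()
    if [[ -n "$case_name" ]]; then
        args+=(--case "$case_name")
        # Per the commit message, only test_*.py accepts --manual include;
        # run_example.py would crash on it.
        [[ -n "$test_file" ]] && args+=(--manual include)
    fi
    printf '%s\n' "${args[*]}"
}
```

With a test file, `build_case_args Case1 test_foo.py` prints `--case Case1 --manual include`. With an empty second argument it prints only `--case Case1`, which matches the behavior the commit message describes for `run_example.py`.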