Add per-run profiling config for fine-grained Run() profiling #2152

Draft
xiaofeihan1 wants to merge 1 commit into main from xiaofeihan/per-run-profiling-config


Adds model.decoder.run_profiling to genai_config.json so users can
profile selected session.Run() calls (e.g. only prefill, or only the
N-th decode step) without modifying Python code. The existing
set_runtime_option('enable_profiling', ...) path already supports
per-run profiling but requires API calls for each experiment; this
change exposes the same capability through config.

Schema:
  "run_profiling": {
    "enabled": true,
    "output_prefix": "onnxruntime_run_profile",
    "runs": "0"
  }
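
To make the placement of this block concrete, here is a minimal Python sketch that writes it into an existing genai_config.json under model.decoder, the key path named above. The helper name enable_run_profiling and its parameters are hypothetical and not part of this PR; only the three field names and the example values come from the schema.

```python
import json
from pathlib import Path


def enable_run_profiling(config_path, runs="0",
                         output_prefix="onnxruntime_run_profile"):
    """Add or overwrite model.decoder.run_profiling in a genai_config.json.

    Hypothetical helper for illustration; not part of the PR.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))

    # Create the nesting if it is missing, then set the block from the schema.
    decoder = config.setdefault("model", {}).setdefault("decoder", {})
    decoder["run_profiling"] = {
        "enabled": True,
        "output_prefix": output_prefix,
        "runs": runs,
    }

    path.write_text(json.dumps(config, indent=4) + "\n", encoding="utf-8")
    return config
```

For example, enable_run_profiling("genai_config.json", runs="1-") would select all decode steps while leaving every other key in the file untouched.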

runs DSL: comma-separated tokens, each one of N, A-B, A-, or *.
  "0"      -> prefill only (default)
  "1-"     -> all decode steps
  "0,5"    -> prefill + 5th decode
  "*"      -> every run
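
The token forms above can be sketched as a small Python parser. This is an illustration of the DSL as described, not the PR's implementation: the name parse_runs is hypothetical, A-B is assumed to be inclusive on both ends, and surrounding whitespace is assumed to be ignored. Malformed tokens simply raise ValueError here; how the real parser reports them is not stated.

```python
def parse_runs(spec):
    """Turn a runs string into a predicate over run indices.

    Hypothetical sketch; assumes A-B is an inclusive range.
    """
    ranges = []  # (start, end) pairs; end is None for open-ended ranges
    for token in spec.split(","):
        token = token.strip()
        if token == "*":
            ranges.append((0, None))                # every run
        elif token.endswith("-"):
            ranges.append((int(token[:-1]), None))  # A-  : A and onward
        elif "-" in token:
            start, end = token.split("-", 1)
            ranges.append((int(start), int(end)))   # A-B : A through B
        else:
            n = int(token)
            ranges.append((n, n))                   # N   : a single run

    def should_profile(run_index):
        return any(
            start <= run_index and (end is None or run_index <= end)
            for start, end in ranges
        )

    return should_profile
```

With this sketch, parse_runs("0,5") accepts run indices 0 and 5 only, and parse_runs("1-") accepts every index from 1 upward.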

Filename: <output_prefix><idx>_<ort_timestamp>.json. Run index is a
per-Generator counter (0 = prefill, N = N-th decode). Coexists with
session-level enable_profiling; both produce independent files.
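
As a rough illustration of the per-Generator counter, the Python sketch below keeps one counter per object and returns the file prefix for a selected run. The class RunIndexCounter is hypothetical, and the prefix simply follows the <output_prefix><idx> pattern as written above; the _<ort_timestamp>.json suffix is left out because its format is not specified here.

```python
import itertools


class RunIndexCounter:
    """Hypothetical per-generator counter: 0 is prefill, 1, 2, ... are decodes."""

    def __init__(self, output_prefix, selected_runs):
        self._indices = itertools.count()   # starts at 0 for each generator
        self._output_prefix = output_prefix
        self._selected = set(selected_runs)

    def next_run(self):
        """Return the profile file prefix for this run, or None if not selected."""
        idx = next(self._indices)
        if idx not in self._selected:
            return None
        # The timestamp suffix described above is intentionally not modeled.
        return f"{self._output_prefix}{idx}"
```

Because each instance owns its counter, two generators created from the same config would both treat their first run as index 0.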

Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>