Support v1 raw multimodal image offload #2836
/inference/v1/generate materializes every raw ref (no cache-only None branch); an unresolved ref is a hard error, not a silent None. Removes the cache-miss 409 path + helpers (_cache_only_mm_hashes/_is_missing_mm_cache_error/_missing_mm_cache_message). Bumps renderers/verifiers submodules to the matching cleanup commits. Deployment-agnostic; mm_hash encoder cache still skips re-encode. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
renderers.mm_store dropped the split_mmraw_ref backcompat alias; use split_raw_mm_ref. Bump renderers submodule pin to the cleanup commit. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
# Conflicts: # deps/verifiers # src/prime_rl/orchestrator/orchestrator.py # src/prime_rl/trainer/batch.py # tests/unit/orchestrator/test_batch.py
# Conflicts: # packages/prime-rl-configs/src/prime_rl/configs/inference.py # packages/prime-rl-configs/src/prime_rl/configs/rl.py # src/prime_rl/entrypoints/rl.py
def apply_run_asset_env(output_dir: Path, multimodal: MultimodalConfig) -> None:
    if os.environ.get(IMAGE_OFFLOAD_DIR_ENV):
        return
    os.environ.update(build_run_asset_env(output_dir, multimodal=multimodal))
Pre-set offload env skips config
Medium Severity
When VF_RENDERER_IMAGE_OFFLOAD_DIR is already present in the process environment, apply_run_asset_env exits without applying [multimodal] resolution, and the multi-node SLURM template only sets a default when that variable is empty. That leaves a pre-existing shell or module value in charge instead of the config-owned offload path the launcher documents as non-overridable.
Reviewed by Cursor Bugbot for commit dce2759.
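The behavior flagged in this comment can be reproduced in isolation. The following is a minimal sketch, not the PR's code: it works on a plain dict instead of os.environ, and resolve_offload_dir, apply_env_early_return, and apply_env_config_owned are hypothetical stand-ins. Only the variable name VF_RENDERER_IMAGE_OFFLOAD_DIR comes from the PR; the default location under output_dir is an assumption.

```python
from pathlib import Path
from typing import Dict, Optional

IMAGE_OFFLOAD_DIR_ENV = "VF_RENDERER_IMAGE_OFFLOAD_DIR"


def resolve_offload_dir(output_dir: Path, configured: Optional[str]) -> str:
    # Hypothetical rule: an explicit config value wins, else a default under output_dir.
    return configured if configured is not None else str(output_dir / "assets")


def apply_env_early_return(env: Dict[str, str], output_dir: Path, configured: Optional[str]) -> None:
    # Mirrors the reviewed pattern: a pre-existing value short-circuits config resolution.
    if env.get(IMAGE_OFFLOAD_DIR_ENV):
        return
    env[IMAGE_OFFLOAD_DIR_ENV] = resolve_offload_dir(output_dir, configured)


def apply_env_config_owned(env: Dict[str, str], output_dir: Path, configured: Optional[str]) -> None:
    # One possible alternative: always write the config-resolved value.
    env[IMAGE_OFFLOAD_DIR_ENV] = resolve_offload_dir(output_dir, configured)


# A stale value inherited from the shell or a module system.
stale = {IMAGE_OFFLOAD_DIR_ENV: "/scratch/old-run/images"}

early = dict(stale)
apply_env_early_return(early, Path("/runs/exp1"), "/data/exp1/images")

owned = dict(stale)
apply_env_config_owned(owned, Path("/runs/exp1"), "/data/exp1/images")
```

With the early return, the stale shell value survives and the configured path is never applied; the config-owned variant overwrites it. Whether overwriting is the right fix depends on how the launcher is meant to treat operator-provided values.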
max_concurrent_runs: int = Field(1, ge=1)
"""Maximum number of concurrent runs to allow. If 1, only one run may run at a time."""

missing_mm_image_policy: MissingMMImagePolicy = "placeholder_zero_loss"
Can we put this in the multimodal config?
def build_run_asset_env(
    output_dir: Path,
    multimodal: MultimodalConfig | None = None,
    base: Mapping[str, str] | None = None,
) -> dict[str, str]:
    """Resolve the environment used by subprocesses that share run image assets.

    Prime-RL config owns the multimodal image offload path. Env vars are only the
    transport used by verifiers/renderers running in subprocesses.
    """

    env = dict(os.environ if base is None else base)
    config = multimodal or MultimodalConfig()

    env[IMAGE_OFFLOAD_DIR_ENV] = str(resolve_image_offload_dir(output_dir, config, env))

    return env


def apply_run_asset_env(output_dir: Path, multimodal: MultimodalConfig) -> None:
    if os.environ.get(IMAGE_OFFLOAD_DIR_ENV):
        return
    os.environ.update(build_run_asset_env(output_dir, multimodal=multimodal))
Not a fan of setting up an env var. Why can't we just read from the env var instead and dynamically change it in the code if the env var is not present?
inherited_env = dict(os.environ)
writer_run_asset_env = build_run_asset_env(config.orchestrator.output_dir, multimodal=config.multimodal)
Cursor Bugbot has reviewed your changes and found 2 potential issues.
There are 3 total unresolved issues (including 1 from previous review).
Reviewed by Cursor Bugbot for commit 0b0f1fb. Configure here.
)
image_offload_dir = (
    os.path.expanduser(str(config.multimodal.offload_dir)) if config.multimodal.offload_dir is not None else ""
)
SLURM offload path resolution mismatch
Medium Severity
Multi-node SLURM rendering sets image_offload_dir with expanduser on the raw config string, while local orchestrator subprocesses use resolve_image_offload_dir with Template substitution and Path.resolve(). Runs that set [multimodal].offload_dir with ${RUN_ID} or other env placeholders can export a different VF_RENDERER_IMAGE_OFFLOAD_DIR on SLURM than on single-node uv run rl.
Reviewed by Cursor Bugbot for commit 0b0f1fb.
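The mismatch described above is easy to see with a placeholder path. This is a sketch under assumptions: slurm_style and local_style are hypothetical helpers that approximate the two code paths named in the comment (expanduser only, versus Template substitution followed by Path.resolve()), and the use of safe_substitute rather than substitute is a guess.

```python
import os
from pathlib import Path
from string import Template
from typing import Mapping


def slurm_style(raw: str) -> str:
    # Approximates the SLURM rendering path: only a leading "~" is expanded.
    return os.path.expanduser(raw)


def local_style(raw: str, env: Mapping[str, str]) -> str:
    # Approximates the local path: substitute ${VAR} placeholders, expand "~", then resolve.
    substituted = Template(raw).safe_substitute(env)
    return str(Path(os.path.expanduser(substituted)).resolve())


raw = "/data/offload/${RUN_ID}/images"
env = {"RUN_ID": "run-42"}

slurm_path = slurm_style(raw)       # placeholder is left untouched
local_path = local_style(raw, env)  # placeholder is replaced with the env value
```

Routing both launch modes through one resolver would keep the exported VF_RENDERER_IMAGE_OFFLOAD_DIR identical on SLURM and single-node runs.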
    message=exc.message,
    err_type=exc.err_type,
    status_code=exc.status_code,
)
Adapter failures become HTTP 500
Low Severity
Raw multimodal decode only catches _MMImageRefError, but adapter materialize_for_vllm raises ValueError for layout fingerprint, grid, and placeholder mismatches. Those client-side contract violations surface as unhandled server errors instead of structured bad-request responses like other invalid raw refs.
Reviewed by Cursor Bugbot for commit 0b0f1fb.
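One way to address this is to map adapter ValueErrors to the same structured response as invalid raw refs. The sketch below is illustrative only: MMImageRefError, ErrorResponse, materialize, and decode_raw_ref are hypothetical stand-ins (the PR's own names are _MMImageRefError and materialize_for_vllm), and the "BadRequestError" type string and 400 status are assumptions.

```python
from dataclasses import dataclass
from typing import Union


class MMImageRefError(Exception):
    """Hypothetical stand-in for the structured raw-ref error."""

    def __init__(self, message: str, err_type: str = "BadRequestError", status_code: int = 400):
        super().__init__(message)
        self.message = message
        self.err_type = err_type
        self.status_code = status_code


@dataclass
class ErrorResponse:
    message: str
    err_type: str
    status_code: int


def materialize(ref: dict) -> dict:
    # Stand-in adapter call: raises ValueError on a layout fingerprint mismatch.
    if ref.get("layout_fingerprint") != "expected-fp":
        raise ValueError("layout fingerprint mismatch")
    return {"ok": True}


def decode_raw_ref(ref: dict) -> Union[dict, ErrorResponse]:
    try:
        return materialize(ref)
    except MMImageRefError as exc:
        return ErrorResponse(exc.message, exc.err_type, exc.status_code)
    except ValueError as exc:
        # Client-side contract violations become a bad request instead of an unhandled 500.
        return ErrorResponse(str(exc), "BadRequestError", 400)


bad = decode_raw_ref({"layout_fingerprint": "other"})
good = decode_raw_ref({"layout_fingerprint": "expected-fp"})
```

Catching ValueError this broadly could also mask genuine server bugs, so a narrower adapter-specific exception type may be preferable.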
# Conflicts: # deps/verifiers


Summary
Wires Prime-RL v1 training onto the raw multimodal image offload contract used by the companion Renderers and Verifiers PRs.
Companion PRs:
Current design:
- VF_RENDERER_IMAGE_OFFLOAD_DIR; they rewrite image parts to shared file:// assets and emit JSON-safe raw descriptors.
- raw_image_uri as the sole image locator plus adapter-owned metadata (family, layout_fingerprint, payload). raw_image_id was removed to avoid two sources of truth.
- mmraw:<base64-json-payload> refs carrying raw_image_uri, hash, family, fingerprint, modality, and payload. Raw-mm ref and descriptor version markers were removed because this branch has no compatibility contract yet.
- VF_RENDERER_IMAGE_OFFLOAD_DIR.
- file:// URIs through MMRefs; build_mm_refs now builds those URIs from the raw descriptors instead of rescanning messages.
- None slots, inline base64 storage, and root-plus-id lookup paths are not supported.
Submodule pins:
- a7953b9 (Trim uv lock churn)
- 9b3e7ee (Merge remote-tracking branch 'origin/main' into codex/v1-raw-image-offload)
Validation
- uv run ruff check renderers/configs.py renderers/base.py renderers/client.py renderers/qwen3_vl.py renderers/qwen35.py renderers/kimi_k25.py renderers/mm_store.py renderers/__init__.py tests/test_renderer_config.py tests/test_multimodal_output_modes.py tests/test_client.py
- uv run ruff format --check renderers/configs.py renderers/base.py renderers/client.py renderers/qwen3_vl.py renderers/qwen35.py renderers/kimi_k25.py renderers/mm_store.py renderers/__init__.py tests/test_renderer_config.py tests/test_multimodal_output_modes.py tests/test_client.py
- uv run pytest tests/test_renderer_config.py tests/test_client.py::test_generate_serializes_raw_mm_refs tests/test_multimodal_output_modes.py -q -> 38 passed
- ruff check, ruff format, ty (ci parity) passed
- uv run ruff check src/prime_rl/entrypoints/rl.py src/prime_rl/inference/server.py src/prime_rl/inference/vllm/serving_tokens.py src/prime_rl/multimodal/schema.py src/prime_rl/orchestrator/trajectories.py src/prime_rl/utils/mm.py tests/unit/inference/test_serving_tokens.py tests/unit/orchestrator/test_batch.py tests/unit/utils/test_mm.py
- uv run pytest tests/unit/utils/test_mm.py tests/unit/inference/test_serving_tokens.py::test_materialize_raw_image_ref_uses_generic_family_payload tests/unit/orchestrator/test_batch.py::test_prepare_sample_rejects_overlong_raw_mm_refs -> 4 passed
- uv run pytest tests/unit/inference/test_serving_tokens.py::test_materialize_raw_image_ref_uses_generic_family_payload -> 1 passed
- git diff --check, uv lock --check, and uv sync --all-extras --locked passed on the branch after merging origin/main.
Latest sync:
- /home/ubuntu/prime-rl-main and merged latest prime-rl origin/main (f19ba721a) into this branch.
- 9b3e7ee, which contains both this PR's previous Verifiers pin (22c7cf4c) and current main's Verifiers pin (22d6333a). This resolves the deps/verifiers merge conflict and satisfies lean-v1's verifiers>=0.1.15.dev402 requirement.
- a7953b9, which contains main's renderers pin (a5efbb), the processed multimodal renderer output mode, and the cleaned lockfile diff.
- uv.lock for the optional Renderers vision extra metadata so CI's uv sync --all-extras --locked path is in sync.
Note
High Risk
Touches orchestrator→trainer transport, packing/truncation semantics, inference multimodal request handling, and launcher env wiring across distributed runs. Errors can desync tokens from images or silently drop training signal via placeholders.
Overview
V1 multimodal training/inference now uses offloaded file:// image assets and lightweight raw descriptors instead of shipping processed pixel_values tensors through the rollout transport.
Shared [multimodal] config (with optional offload_dir) propagates to trainer, orchestrator, and inference. The RL launcher and SLURM templates set VF_RENDERER_IMAGE_OFFLOAD_DIR (launcher-protected, not overridable via env_vars); orchestrator subprocesses get writer-side asset env via run_assets.
Transport: TrainingSample/MicroBatch carry MMRefs (JSON descriptors + URIs). Orchestrator trajectories build refs with build_mm_refs; mm_kwargs on the wire is rejected. Packing fails on overlong multimodal samples rather than truncating tokens/images out of sync.
Trainer: RawImageMaterializer and new prime_rl.multimodal adapters (Qwen VL, Kimi K25) materialize tensors at train time with layout/hash checks. missing_mm_image_policy can synthesize zero-loss placeholders when files disappear. Forward uses per-family ForwardPolicy instead of hard-coded MRoPE heuristics.
Inference: PrimeRlServingTokens materializes raw mmraw refs from request payloads (hash-verified reads) before vLLM sees multimodal kwargs.
Docs note no multimodal truncation; the old Qwen3-VL e2e test that assumed encoded kwargs in features was removed.
Reviewed by Cursor Bugbot for commit 89fcbc1.
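For readers unfamiliar with the mmraw:<base64-json-payload> ref shape mentioned in the description, here is a minimal round-trip sketch. It is not the renderers implementation: encode_raw_mm_ref and decode_raw_mm_ref are hypothetical names, the JSON key names other than raw_image_uri, family, layout_fingerprint, and payload are guesses, and the choice of URL-safe base64 over compact JSON is an assumption (the PR only states "base64-json-payload").

```python
import base64
import json

MMRAW_PREFIX = "mmraw:"


def encode_raw_mm_ref(descriptor: dict) -> str:
    # Assumption: compact, key-sorted JSON encoded with URL-safe base64.
    body = json.dumps(descriptor, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return MMRAW_PREFIX + base64.urlsafe_b64encode(body).decode("ascii")


def decode_raw_mm_ref(ref: str) -> dict:
    if not ref.startswith(MMRAW_PREFIX):
        raise ValueError("not an mmraw ref")
    body = base64.urlsafe_b64decode(ref[len(MMRAW_PREFIX):])
    descriptor = json.loads(body)
    # The description names raw_image_uri as the sole image locator, carried as a file:// URI.
    if not str(descriptor.get("raw_image_uri", "")).startswith("file://"):
        raise ValueError("raw_image_uri must be a file:// URI")
    return descriptor


# Illustrative values only; none of these come from the PR.
descriptor = {
    "raw_image_uri": "file:///offload/images/abc123.png",
    "hash": "abc123",
    "family": "qwen3_vl",
    "layout_fingerprint": "fp-v0",
    "modality": "image",
    "payload": {"grid_thw": [1, 2, 2]},
}
ref = encode_raw_mm_ref(descriptor)
decoded = decode_raw_mm_ref(ref)
```

Because the ref is self-describing, a consumer needs only the ref string and read access to the shared offload directory; an unresolved or malformed ref raises instead of yielding a None slot, in line with the hard-error behavior described in the commits.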