Skip to content

Support v1 raw multimodal image offload#2836

Open
eligotts wants to merge 22 commits into
mainfrom
feat/v1-raw-mm-offload
Open

Support v1 raw multimodal image offload#2836
eligotts wants to merge 22 commits into
mainfrom
feat/v1-raw-mm-offload

Conversation

@eligotts

@eligotts eligotts commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Summary

Wires Prime-RL v1 training onto the raw multimodal image offload contract used by the companion Renderers and Verifiers PRs.

Companion PRs:

Current design:

  • Prime-RL config owns only the image offload directory. There is no inline/offload mode knob; v1 raw multimodal uses offloaded image files.
  • Renderers/verifiers are the only writer side that needs VF_RENDERER_IMAGE_OFFLOAD_DIR; they rewrite image parts to shared file:// assets and emit JSON-safe raw descriptors.
  • Raw descriptors carry raw_image_uri as the sole image locator plus adapter-owned metadata (family, layout_fingerprint, payload). raw_image_id was removed to avoid two sources of truth.
  • The vLLM payload is information-complete: renderers serialize mmraw:<base64-json-payload> refs carrying raw_image_uri, hash, family, fingerprint, modality, and payload. Raw-mm ref and descriptor version markers were removed because this branch has no compatibility contract yet.
  • Prime inference materializes images from the URI in the request payload, so it no longer needs or derives VF_RENDERER_IMAGE_OFFLOAD_DIR.
  • The trainer already consumes exact file:// URIs through MMRefs; build_mm_refs now builds those URIs from the raw descriptors instead of rescanning messages.
  • Legacy processed multimodal tensors, cache-only None slots, inline base64 storage, and root-plus-id lookup paths are not supported.

Submodule pins:

  • Renderers: a7953b9 (Trim uv lock churn)
  • Verifiers: 9b3e7ee (Merge remote-tracking branch 'origin/main' into codex/v1-raw-image-offload)

Validation

  • Renderers: uv run ruff check renderers/configs.py renderers/base.py renderers/client.py renderers/qwen3_vl.py renderers/qwen35.py renderers/kimi_k25.py renderers/mm_store.py renderers/__init__.py tests/test_renderer_config.py tests/test_multimodal_output_modes.py tests/test_client.py
  • Renderers: uv run ruff format --check renderers/configs.py renderers/base.py renderers/client.py renderers/qwen3_vl.py renderers/qwen35.py renderers/kimi_k25.py renderers/mm_store.py renderers/__init__.py tests/test_renderer_config.py tests/test_multimodal_output_modes.py tests/test_client.py
  • Renderers: uv run pytest tests/test_renderer_config.py tests/test_client.py::test_generate_serializes_raw_mm_refs tests/test_multimodal_output_modes.py -q -> 38 passed
  • Verifiers commit hooks: ruff check, ruff format, ty (ci parity) passed
  • Prime-RL: uv run ruff check src/prime_rl/entrypoints/rl.py src/prime_rl/inference/server.py src/prime_rl/inference/vllm/serving_tokens.py src/prime_rl/multimodal/schema.py src/prime_rl/orchestrator/trajectories.py src/prime_rl/utils/mm.py tests/unit/inference/test_serving_tokens.py tests/unit/orchestrator/test_batch.py tests/unit/utils/test_mm.py
  • Prime-RL focused tests: uv run pytest tests/unit/utils/test_mm.py tests/unit/inference/test_serving_tokens.py::test_materialize_raw_image_ref_uses_generic_family_payload tests/unit/orchestrator/test_batch.py::test_prepare_sample_rejects_overlong_raw_mm_refs -> 4 passed
  • Post-ref-version cleanup: uv run pytest tests/unit/inference/test_serving_tokens.py::test_materialize_raw_image_ref_uses_generic_family_payload -> 1 passed
  • Latest sync: git diff --check, uv lock --check, and uv sync --all-extras --locked passed on the branch after merging origin/main.

Latest sync:

  • Fast-forwarded local /home/ubuntu/prime-rl-main and merged latest prime-rl origin/main (f19ba721a) into this branch.
  • Bumped Verifiers to 9b3e7ee, which contains both this PR's previous Verifiers pin (22c7cf4c) and current main's Verifiers pin (22d6333a). This resolves the deps/verifiers merge conflict and satisfies lean-v1's verifiers>=0.1.15.dev402 requirement.
  • Kept Renderers at a7953b9, which contains main's renderers pin (a5efbb), the processed multimodal renderer output mode, and the cleaned lockfile diff.
  • Refreshed uv.lock for the optional Renderers vision extra metadata so CI's uv sync --all-extras --locked path is in sync.

Note

High Risk
Touches orchestrator→trainer transport, packing/truncation semantics, inference multimodal request handling, and launcher env wiring across distributed runs—errors can desync tokens from images or silently drop training signal via placeholders.

Overview
V1 multimodal training/inference now uses offloaded file:// image assets and lightweight raw descriptors instead of shipping processed pixel_values tensors through the rollout transport.

Shared [multimodal] config (with optional offload_dir) propagates to trainer, orchestrator, and inference. The RL launcher and SLURM templates set VF_RENDERER_IMAGE_OFFLOAD_DIR (launcher-protected, not overridable via env_vars); orchestrator subprocesses get writer-side asset env via run_assets.

Transport: TrainingSample / MicroBatch carry MMRefs (JSON descriptors + URIs). Orchestrator trajectories build refs with build_mm_refs; mm_kwargs on the wire is rejected. Packing fails on overlong multimodal samples rather than truncating tokens/images out of sync.

Trainer: RawImageMaterializer and new prime_rl.multimodal adapters (Qwen VL, Kimi K25) materialize tensors at train time with layout/hash checks. missing_mm_image_policy can synthesize zero-loss placeholders when files disappear. Forward uses per-family ForwardPolicy instead of hard-coded MRoPE heuristics.

Inference: PrimeRlServingTokens materializes raw mmraw refs from request payloads (hash-verified reads) before vLLM sees multimodal kwargs.

Docs note no multimodal truncation; the old Qwen3-VL e2e test that assumed encoded kwargs in features was removed.

Reviewed by Cursor Bugbot for commit 89fcbc1. Bugbot is set up for automated code reviews on this repo. Configure here.

@eligotts eligotts force-pushed the feat/v1-raw-mm-offload branch from 2f5eacc to 034e83e Compare June 25, 2026 06:43
@eligotts eligotts force-pushed the feat/v1-raw-mm-offload branch from bfc45cc to b7903f6 Compare June 25, 2026 06:47
S1ro1 and others added 10 commits June 27, 2026 00:18
/inference/v1/generate materializes every raw ref (no cache-only None branch); an unresolved ref is a hard error, not a silent None. Removes the cache-miss 409 path + helpers (_cache_only_mm_hashes/_is_missing_mm_cache_error/_missing_mm_cache_message). Bumps renderers/verifiers submodules to the matching cleanup commits. Deployment-agnostic; mm_hash encoder cache still skips re-encode.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
renderers.mm_store dropped the split_mmraw_ref backcompat alias; use split_raw_mm_ref. Bump renderers submodule pin to the cleanup commit.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
# Conflicts:
#	deps/verifiers
#	src/prime_rl/orchestrator/orchestrator.py
#	src/prime_rl/trainer/batch.py
#	tests/unit/orchestrator/test_batch.py
# Conflicts:
#	packages/prime-rl-configs/src/prime_rl/configs/inference.py
#	packages/prime-rl-configs/src/prime_rl/configs/rl.py
#	src/prime_rl/entrypoints/rl.py
@eligotts eligotts marked this pull request as ready for review June 29, 2026 16:36
Comment thread src/prime_rl/utils/run_assets.py
Comment thread src/prime_rl/trainer/rl/data.py
Comment thread src/prime_rl/inference/server.py Outdated
def apply_run_asset_env(output_dir: Path, multimodal: MultimodalConfig) -> None:
if os.environ.get(IMAGE_OFFLOAD_DIR_ENV):
return
os.environ.update(build_run_asset_env(output_dir, multimodal=multimodal))

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pre-set offload env skips config

Medium Severity

When VF_RENDERER_IMAGE_OFFLOAD_DIR is already present in the process environment, apply_run_asset_env exits without applying [multimodal] resolution, and the multi-node SLURM template only sets a default when that variable is empty. That leaves a pre-existing shell or module value in charge instead of the config-owned offload path the launcher documents as non-overridable.

Additional Locations (1)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit dce2759. Configure here.

max_concurrent_runs: int = Field(1, ge=1)
"""Maximum number of concurrent runs to allow. If 1, only one run may run at a time."""

missing_mm_image_policy: MissingMMImagePolicy = "placeholder_zero_loss"

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we put this in the multi modal config ?

Comment on lines +46 to +68
def build_run_asset_env(
output_dir: Path,
multimodal: MultimodalConfig | None = None,
base: Mapping[str, str] | None = None,
) -> dict[str, str]:
"""Resolve the environment used by subprocesses that share run image assets.

Prime-RL config owns the multimodal image offload path. Env vars are only the
transport used by verifiers/renderers running in subprocesses.
"""

env = dict(os.environ if base is None else base)
config = multimodal or MultimodalConfig()

env[IMAGE_OFFLOAD_DIR_ENV] = str(resolve_image_offload_dir(output_dir, config, env))

return env


def apply_run_asset_env(output_dir: Path, multimodal: MultimodalConfig) -> None:
if os.environ.get(IMAGE_OFFLOAD_DIR_ENV):
return
os.environ.update(build_run_asset_env(output_dir, multimodal=multimodal))

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not a fan of seting up env var, why can't we jsut read from the env var instead and dinamycally change it in the code if the env var is not present ?

Comment on lines +129 to +130
inherited_env = dict(os.environ)
writer_run_asset_env = build_run_asset_env(config.orchestrator.output_dir, multimodal=config.multimodal)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont fully get this part

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

There are 3 total unresolved issues (including 1 from previous review).

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 0b0f1fb. Configure here.

)
image_offload_dir = (
os.path.expanduser(str(config.multimodal.offload_dir)) if config.multimodal.offload_dir is not None else ""
)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SLURM offload path resolution mismatch

Medium Severity

Multi-node SLURM rendering sets image_offload_dir with expanduser on the raw config string, while local orchestrator subprocesses use resolve_image_offload_dir with Template substitution and Path.resolve(). Runs that set [multimodal].offload_dir with ${RUN_ID} or other env placeholders can export a different VF_RENDERER_IMAGE_OFFLOAD_DIR on SLURM than on single-node uv run rl.

Additional Locations (1)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 0b0f1fb. Configure here.

message=exc.message,
err_type=exc.err_type,
status_code=exc.status_code,
)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adapter failures become HTTP 500

Low Severity

Raw multimodal decode only catches _MMImageRefError, but adapter materialize_for_vllm raises ValueError for layout fingerprint, grid, and placeholder mismatches. Those client-side contract violations surface as unhandled server errors instead of structured bad-request responses like other invalid raw refs.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 0b0f1fb. Configure here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants