Support v1 raw multimodal image offload by eligotts · Pull Request #2836 · PrimeIntellect-ai/prime-rl

eligotts · 2026-06-18T21:38:04Z

Summary

Wires Prime-RL v1 training onto the raw multimodal image offload contract used by the companion Renderers and Verifiers PRs.

Companion PRs:

Renderers: [codex] Support raw image refs for multimodal rendering renderers#89
Verifiers: [codex] Support raw image offload in v1 train client verifiers#1746

Current design:

Prime-RL config owns only the image offload directory. There is no inline/offload mode knob; v1 raw multimodal uses offloaded image files.
Renderers/verifiers are the only writer side that needs VF_RENDERER_IMAGE_OFFLOAD_DIR; they rewrite image parts to shared file:// assets and emit JSON-safe raw descriptors.
Raw descriptors carry raw_image_uri as the sole image locator plus adapter-owned metadata (family, layout_fingerprint, payload). raw_image_id was removed to avoid two sources of truth.
The vLLM payload is information-complete: renderers serialize mmraw:<base64-json-payload> refs carrying raw_image_uri, hash, family, fingerprint, modality, and payload. Raw-mm ref and descriptor version markers were removed because this branch has no compatibility contract yet.
Prime inference materializes images from the URI in the request payload, so it no longer needs or derives VF_RENDERER_IMAGE_OFFLOAD_DIR.
The trainer already consumes exact file:// URIs through MMRefs; build_mm_refs now builds those URIs from the raw descriptors instead of rescanning messages.
Legacy processed multimodal tensors, cache-only None slots, inline base64 storage, and root-plus-id lookup paths are not supported.

Submodule pins:

Renderers: a7953b9 (Trim uv lock churn)
Verifiers: 9b3e7ee (Merge remote-tracking branch 'origin/main' into codex/v1-raw-image-offload)

Validation

Renderers: uv run ruff check renderers/configs.py renderers/base.py renderers/client.py renderers/qwen3_vl.py renderers/qwen35.py renderers/kimi_k25.py renderers/mm_store.py renderers/__init__.py tests/test_renderer_config.py tests/test_multimodal_output_modes.py tests/test_client.py
Renderers: uv run ruff format --check renderers/configs.py renderers/base.py renderers/client.py renderers/qwen3_vl.py renderers/qwen35.py renderers/kimi_k25.py renderers/mm_store.py renderers/__init__.py tests/test_renderer_config.py tests/test_multimodal_output_modes.py tests/test_client.py
Renderers: uv run pytest tests/test_renderer_config.py tests/test_client.py::test_generate_serializes_raw_mm_refs tests/test_multimodal_output_modes.py -q -> 38 passed
Verifiers commit hooks: ruff check, ruff format, ty (ci parity) passed
Prime-RL: uv run ruff check src/prime_rl/entrypoints/rl.py src/prime_rl/inference/server.py src/prime_rl/inference/vllm/serving_tokens.py src/prime_rl/multimodal/schema.py src/prime_rl/orchestrator/trajectories.py src/prime_rl/utils/mm.py tests/unit/inference/test_serving_tokens.py tests/unit/orchestrator/test_batch.py tests/unit/utils/test_mm.py
Prime-RL focused tests: uv run pytest tests/unit/utils/test_mm.py tests/unit/inference/test_serving_tokens.py::test_materialize_raw_image_ref_uses_generic_family_payload tests/unit/orchestrator/test_batch.py::test_prepare_sample_rejects_overlong_raw_mm_refs -> 4 passed
Post-ref-version cleanup: uv run pytest tests/unit/inference/test_serving_tokens.py::test_materialize_raw_image_ref_uses_generic_family_payload -> 1 passed
Latest sync: git diff --check, uv lock --check, and uv sync --all-extras --locked passed on the branch after merging origin/main.

Latest sync:

Fast-forwarded local /home/ubuntu/prime-rl-main and merged latest prime-rl origin/main (f19ba721a) into this branch.
Bumped Verifiers to 9b3e7ee, which contains both this PR's previous Verifiers pin (22c7cf4c) and current main's Verifiers pin (22d6333a). This resolves the deps/verifiers merge conflict and satisfies lean-v1's verifiers>=0.1.15.dev402 requirement.
Kept Renderers at a7953b9, which contains main's renderers pin (a5efbb), the processed multimodal renderer output mode, and the cleaned lockfile diff.
Refreshed uv.lock for the optional Renderers vision extra metadata so CI's uv sync --all-extras --locked path is in sync.

Note

High Risk
Touches orchestrator→trainer transport, packing/truncation semantics, inference multimodal request handling, and launcher env wiring across distributed runs—errors can desync tokens from images or silently drop training signal via placeholders.

Overview
V1 multimodal training/inference now uses offloaded file:// image assets and lightweight raw descriptors instead of shipping processed pixel_values tensors through the rollout transport.

Shared [multimodal] config (with optional offload_dir) propagates to trainer, orchestrator, and inference. The RL launcher and SLURM templates set VF_RENDERER_IMAGE_OFFLOAD_DIR (launcher-protected, not overridable via env_vars); orchestrator subprocesses get writer-side asset env via run_assets.

Transport: TrainingSample / MicroBatch carry MMRefs (JSON descriptors + URIs). Orchestrator trajectories build refs with build_mm_refs; mm_kwargs on the wire is rejected. Packing fails on overlong multimodal samples rather than truncating tokens/images out of sync.

Trainer: RawImageMaterializer and new prime_rl.multimodal adapters (Qwen VL, Kimi K25) materialize tensors at train time with layout/hash checks. missing_mm_image_policy can synthesize zero-loss placeholders when files disappear. Forward uses per-family ForwardPolicy instead of hard-coded MRoPE heuristics.

Inference: PrimeRlServingTokens materializes raw mmraw refs from request payloads (hash-verified reads) before vLLM sees multimodal kwargs.

Docs note no multimodal truncation; the old Qwen3-VL e2e test that assumed encoded kwargs in features was removed.

^{Reviewed by Cursor Bugbot for commit 89fcbc1. Bugbot is set up for automated code reviews on this repo. Configure here.}

/inference/v1/generate materializes every raw ref (no cache-only None branch); an unresolved ref is a hard error, not a silent None. Removes the cache-miss 409 path + helpers (_cache_only_mm_hashes/_is_missing_mm_cache_error/_missing_mm_cache_message). Bumps renderers/verifiers submodules to the matching cleanup commits. Deployment-agnostic; mm_hash encoder cache still skips re-encode. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

renderers.mm_store dropped the split_mmraw_ref backcompat alias; use split_raw_mm_ref. Bump renderers submodule pin to the cleanup commit. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

# Conflicts: # deps/verifiers # src/prime_rl/orchestrator/orchestrator.py # src/prime_rl/trainer/batch.py # tests/unit/orchestrator/test_batch.py

# Conflicts: # packages/prime-rl-configs/src/prime_rl/configs/inference.py # packages/prime-rl-configs/src/prime_rl/configs/rl.py # src/prime_rl/entrypoints/rl.py

cursor · 2026-06-29T20:52:22Z

+def apply_run_asset_env(output_dir: Path, multimodal: MultimodalConfig) -> None:
+    if os.environ.get(IMAGE_OFFLOAD_DIR_ENV):
+        return
+    os.environ.update(build_run_asset_env(output_dir, multimodal=multimodal))


Pre-set offload env skips config

Medium Severity

When VF_RENDERER_IMAGE_OFFLOAD_DIR is already present in the process environment, apply_run_asset_env exits without applying [multimodal] resolution, and the multi-node SLURM template only sets a default when that variable is empty. That leaves a pre-existing shell or module value in charge instead of the config-owned offload path the launcher documents as non-overridable.

Additional Locations (1)

src/prime_rl/templates/multi_node_rl.sbatch.j2#L64-L72

^{Reviewed by Cursor Bugbot for commit dce2759. Configure here.}

samsja · 2026-06-29T21:49:58Z

    max_concurrent_runs: int = Field(1, ge=1)
    """Maximum number of concurrent runs to allow. If 1, only one run may run at a time."""

+    missing_mm_image_policy: MissingMMImagePolicy = "placeholder_zero_loss"


can we put this in the multi modal config ?

samsja · 2026-06-29T21:51:04Z

+def build_run_asset_env(
+    output_dir: Path,
+    multimodal: MultimodalConfig | None = None,
+    base: Mapping[str, str] | None = None,
+) -> dict[str, str]:
+    """Resolve the environment used by subprocesses that share run image assets.
+
+    Prime-RL config owns the multimodal image offload path. Env vars are only the
+    transport used by verifiers/renderers running in subprocesses.
+    """
+
+    env = dict(os.environ if base is None else base)
+    config = multimodal or MultimodalConfig()
+
+    env[IMAGE_OFFLOAD_DIR_ENV] = str(resolve_image_offload_dir(output_dir, config, env))
+
+    return env
+
+
+def apply_run_asset_env(output_dir: Path, multimodal: MultimodalConfig) -> None:
+    if os.environ.get(IMAGE_OFFLOAD_DIR_ENV):
+        return
+    os.environ.update(build_run_asset_env(output_dir, multimodal=multimodal))


not a fan of seting up env var, why can't we jsut read from the env var instead and dinamycally change it in the code if the env var is not present ?

samsja · 2026-06-29T21:51:42Z

+    inherited_env = dict(os.environ)
+    writer_run_asset_env = build_run_asset_env(config.orchestrator.output_dir, multimodal=config.multimodal)


I dont fully get this part

cursor

Cursor Bugbot has reviewed your changes and found 2 potential issues.

There are 3 total unresolved issues (including 1 from previous review).

^{❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.}

^{Reviewed by Cursor Bugbot for commit 0b0f1fb. Configure here.}

cursor · 2026-06-29T22:26:06Z

    )
+    image_offload_dir = (
+        os.path.expanduser(str(config.multimodal.offload_dir)) if config.multimodal.offload_dir is not None else ""
+    )


SLURM offload path resolution mismatch

Medium Severity

Multi-node SLURM rendering sets image_offload_dir with expanduser on the raw config string, while local orchestrator subprocesses use resolve_image_offload_dir with Template substitution and Path.resolve(). Runs that set [multimodal].offload_dir with ${RUN_ID} or other env placeholders can export a different VF_RENDERER_IMAGE_OFFLOAD_DIR on SLURM than on single-node uv run rl.

Additional Locations (1)

src/prime_rl/utils/run_assets.py#L30-L43

^{Reviewed by Cursor Bugbot for commit 0b0f1fb. Configure here.}

cursor · 2026-06-29T22:26:06Z

+                    message=exc.message,
+                    err_type=exc.err_type,
+                    status_code=exc.status_code,
+                )


Adapter failures become HTTP 500

Low Severity

Raw multimodal decode only catches _MMImageRefError, but adapter materialize_for_vllm raises ValueError for layout fingerprint, grid, and placeholder mismatches. Those client-side contract violations surface as unhandled server errors instead of structured bad-request responses like other invalid raw refs.

^{Reviewed by Cursor Bugbot for commit 0b0f1fb. Configure here.}

# Conflicts: # deps/verifiers

This was referenced Jun 20, 2026

[codex] Support raw image offload in v1 train client PrimeIntellect-ai/verifiers#1746

Open

[codex] Support raw image refs for multimodal rendering PrimeIntellect-ai/renderers#89

Open

eligotts force-pushed the feat/v1-raw-mm-offload branch from a574785 to 4e78f06 Compare June 24, 2026 01:12

eligotts changed the base branch from feat/nano-as-v1 to main June 24, 2026 01:12

eligotts force-pushed the feat/v1-raw-mm-offload branch from 4e78f06 to 2f5eacc Compare June 24, 2026 02:29

Support v1 raw multimodal offload

034e83e

eligotts force-pushed the feat/v1-raw-mm-offload branch from 2f5eacc to 034e83e Compare June 25, 2026 06:43

Update raw image renderer pin

b7903f6

eligotts force-pushed the feat/v1-raw-mm-offload branch from bfc45cc to b7903f6 Compare June 25, 2026 06:47

S1ro1 and others added 10 commits June 27, 2026 00:18

serving: use canonical split_raw_mm_ref + bump renderers pin

5ec8148

renderers.mm_store dropped the split_mmraw_ref backcompat alias; use split_raw_mm_ref. Bump renderers submodule pin to the cleanup commit. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Bump renderers pin (drop orphaned image_cache_max)

bcc93c2

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Merge remote-tracking branch 'origin/main' into feat/v1-raw-mm-offload

2c5af36

# Conflicts: # deps/verifiers # src/prime_rl/orchestrator/orchestrator.py # src/prime_rl/trainer/batch.py # tests/unit/orchestrator/test_batch.py

Bump raw image dependency pins

75b9e7c

feat: support inline multimodal image storage

99b590a

fix: preserve launcher image asset env

3518763

Simplify raw multimodal offload path

f37ceac

Merge remote-tracking branch 'origin/main' into feat/v1-raw-mm-offload

a66f1b1

# Conflicts: # packages/prime-rl-configs/src/prime_rl/configs/inference.py # packages/prime-rl-configs/src/prime_rl/configs/rl.py # src/prime_rl/entrypoints/rl.py

Simplify raw multimodal validation

80dc59f

eligotts marked this pull request as ready for review June 29, 2026 16:36

cursor Bot reviewed Jun 29, 2026

View reviewed changes

Comment thread src/prime_rl/utils/run_assets.py

Comment thread src/prime_rl/trainer/rl/data.py

Clarify raw image asset env wiring

bf2bc8e

cursor Bot reviewed Jun 29, 2026

View reviewed changes

Comment thread src/prime_rl/inference/server.py Outdated

eligotts added 3 commits June 29, 2026 19:44

Use URI-complete raw multimodal refs

805486f

Merge branch 'main' into feat/v1-raw-mm-offload

2f000f3

Drop raw multimodal descriptor version

dce2759

cursor Bot reviewed Jun 29, 2026

View reviewed changes

samsja reviewed Jun 29, 2026

View reviewed changes

Bump verifiers raw multimodal client prep

0b0f1fb

cursor Bot reviewed Jun 29, 2026

View reviewed changes

eligotts added 5 commits June 29, 2026 22:28

Merge remote-tracking branch 'origin/main' into feat/v1-raw-mm-offload

d8218c3

# Conflicts: # deps/verifiers

Bump renderers for processed multimodal output

d18ed10

Bump renderers lock cleanup

0da60a8

Bump verifiers for main sync

c7deb3b

Merge remote-tracking branch 'origin/main' into feat/v1-raw-mm-offload

89fcbc1

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Support v1 raw multimodal image offload#2836

Support v1 raw multimodal image offload#2836
eligotts wants to merge 22 commits into
mainfrom
feat/v1-raw-mm-offload

eligotts commented Jun 18, 2026 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

cursor Bot Jun 29, 2026

Uh oh!

samsja Jun 29, 2026

Uh oh!

samsja Jun 29, 2026

Uh oh!

samsja Jun 29, 2026

Uh oh!

cursor Bot left a comment

Uh oh!

cursor Bot Jun 29, 2026

Uh oh!

cursor Bot Jun 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

		inherited_env = dict(os.environ)
		writer_run_asset_env = build_run_asset_env(config.orchestrator.output_dir, multimodal=config.multimodal)

Uh oh!

Conversation

eligotts commented Jun 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Validation

Uh oh!

Uh oh!

Uh oh!

Uh oh!

cursor Bot Jun 29, 2026

Choose a reason for hiding this comment

Pre-set offload env skips config

Uh oh!

samsja Jun 29, 2026

Choose a reason for hiding this comment

Uh oh!

samsja Jun 29, 2026

Choose a reason for hiding this comment

Uh oh!

samsja Jun 29, 2026

Choose a reason for hiding this comment

Uh oh!

cursor Bot left a comment

Choose a reason for hiding this comment

Uh oh!

cursor Bot Jun 29, 2026

Choose a reason for hiding this comment

SLURM offload path resolution mismatch

Uh oh!

cursor Bot Jun 29, 2026

Choose a reason for hiding this comment

Adapter failures become HTTP 500

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

eligotts commented Jun 18, 2026 •

edited

Loading