[codex] Support raw image offload in v1 train client by eligotts · Pull Request #1746 · PrimeIntellect-ai/verifiers

eligotts · 2026-06-18T07:04:56Z

Design update — inline/offload image storage

This PR now follows the prime-rl multimodal image storage policy:

offload: the current behavior. Base64 data: image URLs are rewritten to file:// run assets, and every image URL must end up file-backed.
inline: data:image/...;base64,... URLs stay in the message payload and are validated without rewriting.

TrainClient now calls the policy-aware image preparation helper, so prime-rl can be the single source of truth via environment/config propagation.
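
As a rough illustration of the two policies described above, here is a minimal sketch of how a single image URL might be handled under each mode. The function name `prepare_image_url`, the policy strings, and the content-hash file naming are assumptions made for this example; they are not the helper that TrainClient actually calls.

```python
import base64
import hashlib
from pathlib import Path

DATA_PREFIX = "data:image/"


def prepare_image_url(url: str, policy: str, image_dir: Path) -> str:
    """Return the URL to send under a (hypothetical) storage policy name."""
    is_data_url = url.startswith(DATA_PREFIX) and ";base64," in url
    if policy == "inline":
        # Inline: validate the base64 payload but leave the URL untouched.
        if not is_data_url:
            raise RuntimeError("inline policy expects a data:image/...;base64 URL")
        base64.b64decode(url.split(";base64,", 1)[1], validate=True)
        return url
    if policy == "offload":
        # Offload: already file-backed URLs pass through unchanged.
        if url.startswith("file://"):
            return url
        if not is_data_url:
            raise RuntimeError("offload policy cannot rewrite this image URL")
        header, payload = url.split(";base64,", 1)
        raw = base64.b64decode(payload, validate=True)
        ext = header[len(DATA_PREFIX):] or "bin"
        image_dir.mkdir(parents=True, exist_ok=True)
        # Content-hash naming is an assumption; it makes repeated writes idempotent.
        path = image_dir / f"{hashlib.sha256(raw).hexdigest()}.{ext}"
        path.write_bytes(raw)
        return path.resolve().as_uri()
    raise ValueError(f"unknown image storage policy: {policy!r}")
```

In this sketch the policy is passed in explicitly, so a caller could read it from environment or config in one place, which is the spirit of the "single source of truth" remark above.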

Validation after latest push: uv run pytest tests/v1/test_train_client_multimodal.py -q passed (5 passed). Commit/push hooks also passed (ruff check, ruff format, generated AGENTS/CLAUDE check, ty).

Design update — dropped the `None`/cache-only image path

This PR and its companions (prime-rl #2836 / verifiers #1746 / renderers #89) no longer use the "send None for already-cached images" mechanism. Every image carries its raw descriptor ref at every slot (current and prior turns); /inference/v1/generate rematerializes each ref from disk every request.

Why: the None path coupled correctness to deployment (LRU cache present, single replica / DP-affinity, no eviction) and surfaced a miss as a hard vLLM EngineDeadError (qwen3-vl mrope dereferences a None image_grid_thw) that the retry net couldn't catch across the engine→API IPC. Dropping it is deployment-agnostic (a miss is impossible) and non-hacky. vLLM's mm_hash encoder cache still skips the expensive GPU re-encode for free — we only forgo the cheap IPC/CPU-reprocess dedup.

Validated: color-codeword (Qwen3-VL-4B) under DP=2, no affinity / no cache reliance: 0 crashes, 0 data=None, multi-turn accumulation correct, reward ~0.84. Also confirmed under TP.

This repo: with every image carrying its ref, no cache miss can occur — removed the retry subsystem (_generate_with_image_ref_retry, _has_descriptor_only_images, _retryable_mm_error_type, _json_error_type, _RETRYABLE_MM_ERROR_TYPES). Rollouts call renderers.client.generate directly. Obsolete retry tests removed.

Original description

Summary

tighten v1 multimodal graph serialization around strict raw descriptor sidecars
reject processed multimodal payload keys recursively, including nested pixel_values, image_embeds, and image_features
update v1 multimodal tests to use strict prime_raw_mm_item envelopes instead of descriptor-only Qwen payloads
keep raw image offload and retry behavior aligned with the companion Renderers and Prime-RL PRs
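
The recursive rejection of processed payload keys mentioned in the summary could look roughly like the sketch below. The key names come from the summary; the function names `find_processed_key` and `require_raw_item` are hypothetical, and the real validator may check more than this.

```python
from typing import Any, Optional

# Key names taken from the summary above; the real rejected set may be larger.
PROCESSED_KEYS = frozenset({"pixel_values", "image_embeds", "image_features"})


def find_processed_key(value: Any, path: str = "$") -> Optional[str]:
    """Return the path of the first processed-payload key found, else None."""
    if isinstance(value, dict):
        for key, child in value.items():
            if key in PROCESSED_KEYS:
                return f"{path}.{key}"
            found = find_processed_key(child, f"{path}.{key}")
            if found is not None:
                return found
    elif isinstance(value, (list, tuple)):
        for index, child in enumerate(value):
            found = find_processed_key(child, f"{path}[{index}]")
            if found is not None:
                return found
    return None


def require_raw_item(item: Any) -> dict:
    """Accept only mappings that carry no processed tensors at any depth."""
    if not isinstance(item, dict):
        raise TypeError("multimodal item must be a mapping")
    offending = find_processed_key(item)
    if offending is not None:
        raise TypeError(f"processed multimodal payload is not allowed: {offending}")
    return item
```

Reporting the path of the offending key, rather than a bare boolean, is a design choice of this sketch; it makes nested violations easier to locate in a trace.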

Companion PRs

Prime-RL: Support v1 raw multimodal image offload prime-rl#2836
Renderers: [codex] Support raw image refs for multimodal rendering renderers#89

Notes

Draft/WIP: this depends on the renderer generic raw multimodal ref contract in the companion PR.
v1 multimodal sidecars intentionally carry raw descriptors only, not processed image tensors or image-processor payloads.
Prime/vLLM materialization happens from raw image refs rather than Verifiers-held processor outputs.

Validation

Commit hooks: ruff check, ruff format, generated AGENTS/CLAUDE check passed.
Push hook: ty (ci parity) passed.
End-to-end hosted-style smoke through Prime-RL with /home/ubuntu/verifiers, /home/ubuntu/renderers, and /home/ubuntu/prime-rl-v1-raw-mm-offload completed inference, env rollouts, train batch creation, trainer step 0, and decoded strict trainer-bound raw image refs.

[!NOTE]

Support raw image offload in the v1 train client for multimodal conversations

Adds prepare_images_inplace in multimodal.py that recursively walks message structures, offloads image URLs to run-asset file:// paths, and raises RuntimeError for invalid or non-offloaded URLs.
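
A minimal sketch of the recursive walk described above, assuming chat-style messages in which image parts look like {"type": "image_url", "image_url": {"url": ...}}. The names `walk_image_parts` and `require_file_url` are illustrative only and are not the functions in multimodal.py.

```python
from typing import Any, Callable


def walk_image_parts(value: Any, visit: Callable[[dict], None]) -> None:
    """Call visit(part) for every {"type": "image_url"} part, at any depth."""
    if isinstance(value, dict):
        if value.get("type") == "image_url":
            visit(value)
        for child in value.values():
            walk_image_parts(child, visit)
    elif isinstance(value, list):
        for child in value:
            walk_image_parts(child, visit)


def require_file_url(part: dict) -> None:
    """Raise RuntimeError unless the part points at a file:// URL."""
    source = part.get("image_url")
    # The source may be a {"url": ...} mapping or a bare string in this sketch.
    url = source.get("url") if isinstance(source, dict) else source
    if not isinstance(url, str) or not url.startswith("file://"):
        raise RuntimeError(f"image URL is not a file:// run asset: {url!r}")
```

Calling `walk_image_parts(messages, require_file_url)` then fails fast on the first part that was not offloaded, including a part whose `image_url` field is missing or `None`.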

Adds prepare_request_body and prepare_messages hooks to the base Client class; TrainClient overrides both to run image preparation in a background thread for chat-dialect requests.
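
The hook pattern described above can be sketched as follows. This assumes the hooks are coroutines and that the blocking image work is pushed to a worker thread with `asyncio.to_thread`; the real signatures in the base Client class may differ, and `_prepare_blocking` is a stand-in for the actual image preparation.

```python
import asyncio
from typing import Any


class Client:
    """Hypothetical base client: both hooks pass data through unchanged."""

    async def prepare_request_body(self, body: dict) -> dict:
        return body

    async def prepare_messages(self, messages: list) -> list:
        return messages


def _prepare_blocking(messages: list) -> None:
    # Stand-in for the blocking image work (decoding, disk writes).
    for message in messages:
        message.setdefault("prepared", True)


class TrainClient(Client):
    """Overrides both hooks so image preparation runs off the event loop."""

    async def prepare_messages(self, messages: list) -> list:
        await asyncio.to_thread(_prepare_blocking, messages)
        return messages

    async def prepare_request_body(self, body: dict) -> dict:
        messages: Any = body.get("messages")
        # Only chat-style bodies carry a "messages" list in this sketch.
        if isinstance(messages, list):
            await asyncio.to_thread(_prepare_blocking, messages)
        return body
```

Keeping the default hooks as no-ops means other clients are unaffected unless they opt in by overriding them.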

Extends TrainClient.get_response to allow bridge-to-next-turn for multimodal conversations, passing previous_multi_modal_data to multimodal renderers to maintain placeholder alignment.

Updates MessageNode serialization in graph.py to reject processed tensor payloads and only accept raw image descriptor sidecars; _attribute_mm now also tracks mm_placeholders.

Risk: image URLs that are not offloadable or do not resolve to file:// will raise at request time, breaking any callers that previously passed through non-file image URLs.

Macroscope summarized 22c7cf4.

Note

Medium Risk
Touches the multimodal training hot path (image offload failures halt rollouts at request time) and changes wire/graph multimodal payload shape, which must stay aligned with companion renderers and prime-rl PRs.

Overview
Adds prepare_images_inplace so renderer-backed paths rewrite image_url parts to file:// run assets (via renderers.mm_store) before tokenization, and wires that through v0 RendererClient.to_native_prompt, v1 TrainClient (prepare_request_body / prepare_messages), and the interception server so harness bodies and user-simulator messages match what the trace and trainer see.

Multimodal training sidecars move from processed tensors to raw image descriptors (raw_image_uri required; pixel_values / embed keys rejected). Graph serialization no longer numpy-encodes mm_items; mm_placeholders are attributed per node and merged on branches. PendingTurn.previous_multi_modal_data() feeds bridge_to_next_turn so multi-turn multimodal prompts can bridge (the old multimodal bridge block is removed).

Legacy v0 rollouts preserve live trajectory multi_modal_data when mapping to v1 traces. Docs and type comments are updated; eval dashboard reward rows use an explicit loop (no behavior change).

Reviewed by Cursor Bugbot for commit 22c7cf4.

Every image carries its ref, so no cache miss can occur. Removes _generate_with_image_ref_retry / _has_descriptor_only_images / _retryable_mm_error_type / _json_error_type / _RETRYABLE_MM_ERROR_TYPES; rollouts call generate() directly. Obsolete retry tests removed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>


cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.


cursor · 2026-06-29T16:38:27Z

+    result = _offload_image_url(_image_source_url(source), image_dir)
+    if result is not None:
+        _set_image_source_url(source, result)
+    _require_file_image_url(source)


Inline image URLs always rejected

High Severity

The v1 train path always requires every image_url to become a file:// run asset after preparation, even when offload leaves a data:image/...;base64,... URL unchanged. That conflicts with the intended inline multimodal storage mode where base64 image URLs stay in the message payload, so inline training rollouts fail at request preparation instead of validating in place.

Additional Locations (1)

verifiers/v1/clients/train.py#L208-L216

Reviewed by Cursor Bugbot for commit 0b1d73f.

removed inline mode so this is irrelevant

macroscopeapp · 2026-06-29T16:38:58Z

Approvability

Verdict: Needs human review

2 blocking correctness issues found. This PR adds new multimodal image handling capabilities with significant new code and runtime behavior changes. Multiple unresolved review comments identify potential bugs including rejected inline images (high severity) and backwards compatibility concerns for existing saved rollouts.


macroscopeapp · 2026-06-29T19:47:01Z

+    return False
+
+
+def _validate_raw_mm_item(item: Any) -> dict[str, Any]:


🟡 Medium v1/graph.py:76

_validate_raw_mm_item now unconditionally rejects processed multimodal payloads containing keys like pixel_values, and deserialize_multi_modal_data runs it on every multi_modal_data field during deserialization. Loading a previously persisted multimodal v1 trace whose sidecars contain pixel_values now raises TypeError instead of round-tripping, breaking backwards compatibility for existing saved rollouts. Consider allowing processed payloads through on the deserialization path (e.g. by skipping the processed-key check in the validator's before path) so old traces can still be loaded.
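
One way to express the reviewer's suggestion is a validator with an explicit opt-out for the load path. This is only a sketch under assumed names: `validate_mm_item` and its `allow_processed` flag are hypothetical, not the signature of `_validate_raw_mm_item` in graph.py.

```python
from typing import Any

PROCESSED_KEYS = frozenset({"pixel_values", "image_embeds", "image_features"})


def _has_processed_key(value: Any) -> bool:
    """True if any processed-payload key appears at any nesting depth."""
    if isinstance(value, dict):
        return any(
            key in PROCESSED_KEYS or _has_processed_key(child)
            for key, child in value.items()
        )
    if isinstance(value, (list, tuple)):
        return any(_has_processed_key(child) for child in value)
    return False


def validate_mm_item(item: Any, *, allow_processed: bool = False) -> dict:
    """Strict by default; the load path may pass allow_processed=True."""
    if not isinstance(item, dict):
        raise TypeError("multimodal item must be a mapping")
    if not allow_processed and _has_processed_key(item):
        raise TypeError("processed multimodal payload is not allowed")
    return item
```

With this shape, serialization would keep the strict default while a deserialization path for old traces could call `validate_mm_item(item, allow_processed=True)`.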


macroscopeapp · 2026-06-29T22:40:24Z

+        if value.get("type") == "image_url":
+            source = value.get("image_url")
+            if source is not None:
+                _prepare_image_source(source, image_dir=image_dir)


🟡 Medium utils/multimodal.py:64

prepare_images_inplace skips validation when an image_url part has a missing or None image_url field: lines 65-67 only call _prepare_image_source when source is not None, so the malformed part passes through unchecked. Downstream, ChatDialect.parse_request normalizes it to ImageUrlSource(url=""), forwarding a request with an empty image URL instead of rejecting it. Consider calling _require_file_image_url(value) (or otherwise validating) when source is None so malformed parts are rejected.

Suggested change

Before:

        if value.get("type") == "image_url":
            source = value.get("image_url")
            if source is not None:
                _prepare_image_source(source, image_dir=image_dir)

After:

        if value.get("type") == "image_url":
            source = value.get("image_url")
            if source is not None:
                _prepare_image_source(source, image_dir=image_dir)
            else:
                _require_file_image_url(value)


This was referenced Jun 18, 2026

[codex] Support raw image refs for multimodal rendering PrimeIntellect-ai/renderers#89

Open

Support v1 raw multimodal image offload PrimeIntellect-ai/prime-rl#2836

Open

eligotts force-pushed the codex/v1-raw-image-offload branch from 7556743 to 3f5bb1a Compare June 23, 2026 19:23

eligotts changed the base branch from feat/nano-as-v1 to main June 23, 2026 19:24

eligotts added 2 commits June 25, 2026 06:39

Support raw image offload in v1 train client

173a518

Enforce strict raw multimodal descriptors

de37650

eligotts force-pushed the codex/v1-raw-image-offload branch from 3f5bb1a to de37650 Compare June 25, 2026 06:40

macroscopeapp Bot reviewed Jun 25, 2026


Comment thread verifiers/v1/clients/train.py

Comment thread verifiers/v1/graph.py Outdated

Comment thread verifiers/v1/clients/train.py Outdated

S1ro1 and others added 3 commits June 27, 2026 00:18

Merge remote-tracking branch 'origin/main' into codex/v1-raw-image-of…

9430999

…fload # Conflicts: # verifiers/v1/clients/train.py

feat: support inline multimodal images

4a7b37a

macroscopeapp Bot reviewed Jun 28, 2026


Comment thread verifiers/v1/utils/multimodal.py Outdated

Simplify v1 raw image offload path

0b1d73f

eligotts marked this pull request as ready for review June 29, 2026 16:36

cursor Bot reviewed Jun 29, 2026


eligotts added 3 commits June 29, 2026 17:41

Preserve v1 node usage in trace dumps

2d4969b

Surface request preparation failures on traces

7ade0b2

Require raw image URIs in v1 sidecars

0dc57a1

macroscopeapp Bot reviewed Jun 29, 2026


eligotts added 2 commits June 29, 2026 22:23

Share multimodal image preparation across clients

18b0fbe

Merge remote-tracking branch 'origin/main' into codex/v1-raw-image-of…

22c7cf4

…fload

macroscopeapp Bot reviewed Jun 29, 2026


eligotts wants to merge 11 commits into main from codex/v1-raw-image-offload

2 participants
