feat: add process_results_use_image and video metadata dict support in task API by Luodian · Pull Request #1275 · EvolvingLMMs-Lab/lmms-eval

Luodian · 2026-03-26T16:10:27Z

Summary

Two task API enhancements for richer evaluation capabilities:

1. `process_results_use_image` config flag

When set to true in a task YAML, image/video columns are preserved in the dataset passed to process_results. This enables metrics that need visual context (e.g., bounding box overlay verification, visual grounding checks).

Default behavior is unchanged — images are still stripped unless explicitly opted in.
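The opt-in behavior described above can be pictured with a small sketch. This is not the code from the PR: it uses plain dicts instead of a real dataset object, and the helper name `build_results_dataset` and the column names in `VISUAL_COLUMNS` are assumptions made purely for illustration.

```python
from typing import Any, Dict, List

# Hypothetical column names; a real task decides which columns hold visuals.
VISUAL_COLUMNS = {"image", "video"}


def build_results_dataset(
    docs: List[Dict[str, Any]],
    process_results_use_image: bool = False,
) -> List[Dict[str, Any]]:
    """Return the docs that would be handed to process_results.

    Hypothetical helper: visual columns are dropped unless the flag is set.
    """
    if process_results_use_image:
        # Opt-in: keep every column, including the visual ones.
        return [dict(doc) for doc in docs]
    # Default: strip visual columns.
    return [
        {key: value for key, value in doc.items() if key not in VISUAL_COLUMNS}
        for doc in docs
    ]


docs = [{"question": "Where is the cat?", "image": b"raw-bytes", "answer": "left"}]
stripped = build_results_dataset(docs)
kept = build_results_dataset(docs, process_results_use_image=True)
```

With the flag left at its default, `stripped[0]` has no `image` key; with the flag set, `kept[0]` still carries it.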

2. Video metadata dict handling in ConfigurableMessagesTask

Dict-type visuals carrying video_start/video_end metadata are now properly passed through to models, enabling per-sample temporal range support for video tasks.
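A rough sketch of what passing a dict-type visual through might look like is given below. Only the `video_start` and `video_end` keys come from the description above; the function name `to_video_content`, the `video` key holding the path, and the shape of the returned message item are all assumptions, not the actual ConfigurableMessagesTask code.

```python
from typing import Any, Dict, Union

Visual = Union[str, Dict[str, Any]]


def to_video_content(visual: Visual) -> Dict[str, Any]:
    """Turn one visual entry into a message content item (hypothetical shape)."""
    if isinstance(visual, dict):
        # Assumed layout: the path lives under "video" (hypothetical key).
        item: Dict[str, Any] = {"type": "video", "url": visual["video"]}
        # Forward the temporal range only when the sample provides it.
        for key in ("video_start", "video_end"):
            if key in visual:
                item[key] = visual[key]
        return item
    # Plain path: no per-sample temporal metadata.
    return {"type": "video", "url": visual}


clip = to_video_content({"video": "clip.mp4", "video_start": 3.0, "video_end": 9.5})
full = to_video_content("clip.mp4")
```

Here `clip` carries the temporal range for a model to consume, while `full` describes the whole video.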

# Example usage in task YAML:
process_results_use_image: true
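With the flag enabled, a task's `process_results` could read the image from `doc`. The sketch below assumes the usual `process_results(doc, results)` shape that returns a dict of metric values; the `image` column name, the comma-separated box format, and the `bbox_in_bounds` metric are hypothetical, and `FakeImage` is a stand-in so the example runs without Pillow.

```python
from typing import Any, Dict, List


class FakeImage:
    """Stand-in exposing .size as (width, height), like a PIL image."""

    def __init__(self, width: int, height: int) -> None:
        self.size = (width, height)


def process_results(doc: Dict[str, Any], results: List[str]) -> Dict[str, float]:
    """Check that a predicted box "x1,y1,x2,y2" lies inside the image."""
    x1, y1, x2, y2 = (float(value) for value in results[0].split(","))
    # Only available when process_results_use_image is true for the task.
    width, height = doc["image"].size
    in_bounds = 0 <= x1 < x2 <= width and 0 <= y1 < y2 <= height
    return {"bbox_in_bounds": float(in_bounds)}


doc = {"image": FakeImage(640, 480)}
inside = process_results(doc, ["10,20,300,200"])
outside = process_results(doc, ["10,20,700,200"])
```

The second prediction extends past the 640-pixel width, so it fails the check. Without the flag, `doc` would have no image column and the lookup would raise a `KeyError`.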

Test plan

Verify that existing tasks without the flag still strip images (no regression).
Test that a task with `process_results_use_image: true` receives images in `process_results`.
Test that a video task with a temporal metadata dict passes the metadata through to the model correctly.

Two task API enhancements:

1. New `process_results_use_image` config flag: when true, preserves image/video data in dataset_no_image for tasks whose process_results needs visual context (e.g., bounding box verification).

2. Video metadata dict handling in ConfigurableMessagesTask: dict visuals with video_start/video_end metadata are passed through to models, enabling per-sample temporal range support.

Luodian merged commit cfc260b into main Apr 11, 2026
2 checks passed

Luodian merged 1 commit into main from feat/task-api-enhancements
