
feat: add process_results_use_image and video metadata dict support in task API #1275

Merged
Luodian merged 1 commit into main from feat/task-api-enhancements
Apr 11, 2026

Conversation

Contributor

@Luodian Luodian commented Mar 26, 2026

Summary

Two task API enhancements for richer evaluation capabilities:

1. process_results_use_image config flag

When set to true in a task YAML, image/video columns are preserved in the dataset passed to process_results. This enables metrics that need visual context (e.g., bounding box overlay verification, visual grounding checks).

Default behavior is unchanged — images are still stripped unless explicitly opted in.
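The opt-in behavior described above can be sketched roughly as follows. This is an illustration only, not the code merged in this PR: the helper name `rows_for_process_results`, the `visual_columns` argument, and the use of plain dicts in place of a real dataset object are all assumptions made for the example.

```python
def rows_for_process_results(rows, visual_columns, process_results_use_image=False):
    """Return the rows that would be handed to process_results.

    Hypothetical helper: `rows` is a list of plain dicts standing in for a
    dataset, and `visual_columns` names the image/video columns.
    """
    if process_results_use_image:
        # Opt-in: keep every column, including images and videos.
        return [dict(row) for row in rows]
    # Default: strip visual columns, matching the pre-existing behavior.
    drop = set(visual_columns)
    return [{k: v for k, v in row.items() if k not in drop} for row in rows]
```

Keeping the default at `False` means existing task YAMLs see no change, and only tasks that set the flag carry image or video data into metric computation.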

2. Video metadata dict handling in ConfigurableMessagesTask

Dict-type visuals carrying video_start/video_end metadata are now properly passed through to models, enabling per-sample temporal range support for video tasks.
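As a rough sketch of what passing a dict visual through could look like, the function below accepts either a plain path or a dict carrying `video_start`/`video_end`. Only those two key names come from this PR; the `"video"` path key, the output layout, and the function name `video_content` are hypothetical and do not describe the actual ConfigurableMessagesTask implementation.

```python
def video_content(visual):
    """Build a message content entry from a video visual.

    Assumption: a dict visual stores its path under the hypothetical key
    "video"; video_start/video_end are optional and given in seconds.
    """
    if isinstance(visual, str):
        # Plain path: no temporal range, the whole video is used.
        return {"type": "video", "url": visual}
    if isinstance(visual, dict):
        content = {"type": "video", "url": visual["video"]}
        # Copy the temporal metadata through unchanged when present.
        for key in ("video_start", "video_end"):
            if key in visual:
                content[key] = float(visual[key])
        start = content.get("video_start")
        end = content.get("video_end")
        if start is not None and end is not None and end <= start:
            raise ValueError("video_end must be greater than video_start")
        return content
    raise TypeError(f"unsupported visual type: {type(visual).__name__}")
```

Because the range travels with each sample, two samples can point at the same file while asking the model about different segments.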

# Example usage in task YAML:
process_results_use_image: true
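With the flag enabled, a task's `process_results` can read the image from `doc`. The sketch below is a hypothetical metric, not part of this PR: it assumes the usual `process_results(doc, results)` shape, an image stored under an assumed `"image"` column exposing a PIL-style `.size` of `(width, height)`, and a model reply formatted as a JSON list `[x1, y1, x2, y2]`.

```python
import json


def parse_box(text):
    """Parse a reply such as "[10, 5, 90, 45]" into four floats, else None."""
    try:
        box = json.loads(text)
    except (TypeError, ValueError):
        return None
    if not isinstance(box, list) or len(box) != 4:
        return None
    if not all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in box):
        return None
    return [float(v) for v in box]


def process_results(doc, results):
    """Score 1.0 when the predicted box is well-formed and inside the image."""
    image = doc.get("image")  # assumed column name; present only with the flag on
    box = parse_box(results[0])
    if image is None or box is None:
        return {"box_in_bounds": 0.0}
    width, height = image.size  # PIL-style (width, height)
    x1, y1, x2, y2 = box
    ok = 0 <= x1 < x2 <= width and 0 <= y1 < y2 <= height
    return {"box_in_bounds": 1.0 if ok else 0.0}
```

Without the flag the `"image"` key would be absent, so this sketch scores the sample as 0.0 rather than raising.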

Test plan

  • Verify existing tasks without the flag still strip images (no regression)
  • Test that a task with process_results_use_image: true receives images in process_results
  • Test that a video task's temporal metadata dict (video_start/video_end) reaches the model intact

Two task API enhancements:
1. New `process_results_use_image` config flag: when true, preserves
   image/video data in dataset_no_image for tasks whose process_results
   needs visual context (e.g., bounding box verification).
2. Video metadata dict handling in ConfigurableMessagesTask: dict visuals
   with video_start/video_end metadata are passed through to models,
   enabling per-sample temporal range support.
@Luodian Luodian merged commit cfc260b into main Apr 11, 2026
2 checks passed