fix: preserve HME100k prediction case in OCRBench scoring by akawincent · Pull Request #1278 · EvolvingLMMs-Lab/lmms-eval

akawincent · 2026-03-27T13:21:37Z

Summary

fix OCRBench scoring for the HME100k subset by preserving prediction case
keep the existing lowercase normalization for the other OCRBench subsets unchanged

Why

Issue #1220 points out that ocrbench_process_results lowercases pred before branching on dataset_name, while the HME100k branch intentionally compares answers without lowercasing them. That makes correct HME100k predictions score as 0 when the only difference is letter case.
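The ordering problem described above can be sketched as follows. This is a simplified illustration, not the actual body of ocrbench_process_results: the function names score_before_fix and score_after_fix are hypothetical, and the whitespace stripping and substring match are assumptions made only to keep the example small.

```python
def score_before_fix(pred: str, gt_ans: str, dataset_name: str) -> int:
    # Hypothetical sketch of the old ordering: pred is lowercased
    # before the code branches on dataset_name.
    pred = pred.lower()
    if dataset_name == "HME100k":
        # Case-sensitive branch: gt_ans keeps its case, but pred has
        # already lost it, so the two can no longer match.
        answer = gt_ans.strip().replace(" ", "")
        predict = pred.strip().replace(" ", "")
    else:
        # Case-insensitive branches lowercase the ground truth as well.
        answer = gt_ans.lower().strip()
        predict = pred.strip()
    return int(answer in predict)


def score_after_fix(pred: str, gt_ans: str, dataset_name: str) -> int:
    # Hypothetical sketch of the fixed ordering: lowercasing happens
    # only inside the case-insensitive branch.
    if dataset_name == "HME100k":
        answer = gt_ans.strip().replace(" ", "")
        predict = pred.strip().replace(" ", "")
    else:
        answer = gt_ans.lower().strip()
        predict = pred.lower().strip()
    return int(answer in predict)


# A correct HME100k prediction that differs from the lowercased form only by case.
print(score_before_fix("V = I R", "V=IR", "HME100k"))  # 0
print(score_after_fix("V = I R", "V=IR", "HME100k"))   # 1
```

In this sketch the non-HME100k branch returns the same score before and after the change, since both sides are lowercased there either way.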

Testing

not run (per request)

Closes #1220

kcz358

Hi, so instead of just removing the lower(), maybe we should actually lowercase gt_ans as well?

akawincent · 2026-04-09T09:14:18Z

Hi, so instead of just removing the lower(), maybe we should actually lowercase gt_ans as well?

@kcz358 I don't think we should lowercase gt_ans for HME100k.

HME100k should be case-sensitive, since it is handwritten mathematical expression recognition.

So, we should not lowercase pred, and we should not lowercase gt_ans either. The original problem was that pred was lowercased too early, which broke the intended HME100k matching behavior.

Since these answers are math-expression / LaTeX-like strings, lowercasing gt_ans could also create false positives for characters like V, F, I, A, etc.
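A small illustration of that false-positive risk, using made-up expression strings (they are not taken from HME100k) and a hypothetical helper named matches:

```python
def matches(pred: str, gt_ans: str, case_sensitive: bool) -> bool:
    # Hypothetical helper: exact string comparison, optionally
    # lowercasing both sides first.
    if not case_sensitive:
        pred, gt_ans = pred.lower(), gt_ans.lower()
    return pred == gt_ans


gt_ans = "V_{1}+v_{2}"  # uppercase V and lowercase v are different symbols
pred = "v_{1}+v_{2}"    # the prediction misread the first symbol

print(matches(pred, gt_ans, case_sensitive=True))   # False
print(matches(pred, gt_ans, case_sensitive=False))  # True
```

With lowercasing on both sides, the wrong prediction would be counted as correct, which is the case-insensitive false positive the comment warns about.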

kcz358 · 2026-04-09T09:53:32Z

Got it, that makes sense if HME requires the answer to be case-sensitive. Will this change cause false negatives on the other branches? If not, I will merge this PR. Thanks.

akawincent · 2026-04-09T13:32:01Z

@kcz358

No, this should not affect the other OCRBench branches.
I have confirmed that the other branches call pred.lower() and gt_ans.lower() in the else branch, because they are case-insensitive.

fix: preserve HME100k prediction case in OCRBench scoring

d124eb2

kcz358 reviewed Apr 9, 2026

kcz358 approved these changes Apr 10, 2026

kcz358 merged commit f54dd28 into EvolvingLMMs-Lab:main Apr 10, 2026
4 checks passed

kcz358 merged 1 commit into EvolvingLMMs-Lab:main from akawincent:fix/1220_HME100k_score
