Commit bec69a4

Merge pull request #408 from aitomatic/examples/FinanceBench
update examples/FinanceBench labels
2 parents 7d76cdc + 438ea6a commit bec69a4

1 file changed

Lines changed: 4 additions & 0 deletions

examples/FinanceBench/ground-truths.yml

@@ -4288,6 +4288,8 @@ financebench_id_00603:
 correctness: >-
 the answer mentions new stores

+evaluator-unreliable: true
+

 financebench_id_00605:
 sector: Consumer Discretionary
@@ -4316,6 +4318,8 @@ financebench_id_00605:
 the answer contains a calculated percentage value that is in the range from 30% to 40%
 (if the answer is a single number, assume that it is that calculated percentage value)

+evaluator-unreliable: true
+

 financebench_id_00606: # tricky: highly implicit wordings
 sector: Consumer Discretionary
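
The diff only adds the `evaluator-unreliable: true` label; it does not show how the label is consumed. As a minimal sketch, assuming a consumer wants to separate flagged entries from the rest, the partition could look like the following. The function name `split_by_reliability`, the in-memory dictionary, and the entry `hypothetical_unflagged_entry` are illustrative assumptions and are not taken from the repository.

```python
from typing import Any

# Hypothetical in-memory form of ground-truths.yml entries (an assumption).
# Only keys visible in the diff are reproduced; the second entry is invented
# purely to illustrate an entry without the flag.
ground_truths: dict[str, dict[str, Any]] = {
    "financebench_id_00605": {
        "sector": "Consumer Discretionary",
        "evaluator-unreliable": True,
    },
    "hypothetical_unflagged_entry": {
        "sector": "Consumer Discretionary",
    },
}


def split_by_reliability(
    entries: dict[str, dict[str, Any]],
) -> tuple[list[str], list[str]]:
    """Partition entry IDs by the `evaluator-unreliable` flag.

    Entries without the key are treated as reliable (an assumed default).
    """
    reliable: list[str] = []
    unreliable: list[str] = []
    for entry_id, entry in entries.items():
        if entry.get("evaluator-unreliable", False):
            unreliable.append(entry_id)
        else:
            reliable.append(entry_id)
    return reliable, unreliable


reliable_ids, unreliable_ids = split_by_reliability(ground_truths)
print("reliable:", reliable_ids)
print("unreliable:", unreliable_ids)
```

Treating a missing key as "reliable" keeps the many unlabeled entries in the file valid without requiring every entry to carry the flag.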
