Add IFEval score to metrics
#2 opened by lewtun (HF staff)
This adds the prompt-level-loose accuracy metric from Google's IFEval benchmark: https://arxiv.org/abs/2311.07911
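For readers unfamiliar with the metric, here is a minimal sketch of how prompt-level loose accuracy can be computed, based on the description in the IFEval paper. It is not the code added in this PR. The function names (`loose_variants`, `follows_loose`, `prompt_level_loose_accuracy`) and the representation of each instruction as a plain `check(response) -> bool` callable are assumptions made for illustration. A prompt counts as correct only if every one of its instructions is followed, and under loose scoring an instruction counts as followed if any relaxed variant of the response passes its check.

```python
from typing import Callable, List, Sequence

# Hypothetical stand-in for an IFEval instruction checker:
# takes a response string and returns True if the instruction was followed.
Check = Callable[[str], bool]


def loose_variants(response: str) -> List[str]:
    """Build the relaxed variants of a response used for loose scoring."""
    lines = response.split("\n")
    bases = [
        response,
        "\n".join(lines[1:]).strip(),    # drop the first line (e.g. a preamble)
        "\n".join(lines[:-1]).strip(),   # drop the last line (e.g. a sign-off)
        "\n".join(lines[1:-1]).strip(),  # drop both
    ]
    # Each base is also tried with markdown asterisks removed.
    return bases + [b.replace("*", "") for b in bases]


def follows_loose(response: str, check: Check) -> bool:
    """An instruction is followed if any non-empty variant passes its check."""
    return any(v.strip() and check(v) for v in loose_variants(response))


def prompt_level_loose_accuracy(
    responses: Sequence[str], checks_per_prompt: Sequence[Sequence[Check]]
) -> float:
    """Fraction of prompts whose instructions are ALL followed (loose)."""
    if not responses:
        return 0.0
    correct = sum(
        all(follows_loose(resp, check) for check in checks)
        for resp, checks in zip(responses, checks_per_prompt)
    )
    return correct / len(responses)
```

For example, with a toy check such as `lambda r: r.startswith("Yes")`, the response `"Sure!\nYes, it is."` fails a strict check but passes the loose one, because the variant with the first line removed satisfies it.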
Thanks for adding this. It would be nice to see a reference point (whether that score is good or bad) compared to models of similar size.
abacaj changed pull request status to merged