Add IFEval score to metrics

#2
by lewtun HF staff - opened

This adds the prompt-level-loose accuracy metric from Google's IFEval benchmark: https://arxiv.org/abs/2311.07911
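For readers unfamiliar with the metric, here is a minimal sketch of how a prompt-level accuracy can be computed. It assumes the per-instruction pass/fail flags have already been produced by an IFEval checker; in the loose variant described in the paper, an instruction counts as followed if any of several lightly transformed versions of the response satisfies it. The function name `prompt_level_accuracy` and the example data are hypothetical and are not taken from this PR.

```python
from typing import Dict, List


def prompt_level_accuracy(results: Dict[str, List[bool]]) -> float:
    """Return the fraction of prompts whose instructions were ALL followed.

    `results` maps a prompt id to one boolean per instruction in that prompt
    (hypothetical input format, assumed for illustration only).
    """
    if not results:
        return 0.0
    # A prompt passes only if every one of its instructions passed.
    passed = sum(all(flags) for flags in results.values())
    return passed / len(results)


# Hypothetical per-prompt results: one boolean per instruction.
example = {
    "prompt_a": [True, True],   # both instructions followed: passes
    "prompt_b": [True, False],  # one instruction missed: fails
    "prompt_c": [True],         # passes
    "prompt_d": [False],        # fails
}

print(prompt_level_accuracy(example))  # 0.5
```

A prompt-level score is stricter than an instruction-level one, since a single missed instruction fails the whole prompt.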

Thanks for adding this. It would be nice to see a reference point (whether that score is good or bad) compared to models of similar size.

abacaj changed pull request status to merged
