Add IFEval score to metrics

by lewtun HF staff - opened Mar 2

lewtun

Mar 2

This adds the prompt-level-loose accuracy metric from Google's IFEval benchmark: https://arxiv.org/abs/2311.07911
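
For readers unfamiliar with the metric, here is a minimal sketch of how prompt-level loose accuracy can be computed, based on my reading of the IFEval paper: a prompt counts as passed only if every one of its instructions is followed, and under the "loose" criterion an instruction counts as followed if any lightly transformed variant of the response (first or last line dropped, markdown asterisks removed) satisfies it. The function names and the `Checker` type are hypothetical and are not taken from this PR or from the official IFEval code.

```python
from typing import Callable, List, Sequence

# Hypothetical checker type: returns True if a response satisfies one instruction.
Checker = Callable[[str], bool]


def loose_variants(response: str) -> List[str]:
    """Build relaxed variants of a response (assumed loose transformations)."""
    lines = response.split("\n")
    no_first = "\n".join(lines[1:]).strip()   # drop a possible intro line
    no_last = "\n".join(lines[:-1]).strip()   # drop a possible outro line
    no_both = "\n".join(lines[1:-1]).strip()  # drop both
    base = [response, no_first, no_last, no_both]
    # Also try each variant with markdown asterisks removed.
    return base + [v.replace("*", "") for v in base]


def followed_loose(response: str, check: Checker) -> bool:
    """An instruction is followed if any non-empty variant passes its checker."""
    return any(v.strip() and check(v) for v in loose_variants(response))


def prompt_level_loose_accuracy(
    responses: Sequence[str],
    checkers_per_prompt: Sequence[Sequence[Checker]],
) -> float:
    """Fraction of prompts whose instructions are all followed (loose criterion)."""
    if not responses:
        return 0.0
    passed = sum(
        all(followed_loose(resp, check) for check in checks)
        for resp, checks in zip(responses, checkers_per_prompt)
    )
    return passed / len(responses)
```

For example, with a single "must start with Hello" checker per prompt, the response `"Sure, here you go:\nHello world"` passes under the loose criterion because dropping the first line satisfies the checker, while `"Goodbye"` does not.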

Mar 2

Thanks for adding this. It would be nice to see a reference point (whether that score is good or bad) compared to models of similar size, etc.

abacaj changed pull request status to merged Mar 2
