Recreating MMLU scores

by theblackcat102 - opened Jun 5

Jun 5

Do you guys use lm-evaluation-harness for MMLU evaluation? I'm not seeing the stark improvement shown in the fineweb-edu image when using this checkpoint.

HuggingFaceFW org Jun 5

We do not. I've added a note to the top of this file detailing how you can reproduce our setup: https://huggingface.co/datasets/HuggingFaceFW/fineweb/blob/main/lighteval_tasks.py

Jun 5

@guipenedo Thanks for the quick reply. I will check it out.

theblackcat102 changed discussion status to closed Jun 5
