Recreating MMLU scores

#2
by theblackcat102 - opened

Do you guys use lm-evaluation-harness for MMLU evaluation? I'm not getting the stark improvement shown in the fineweb-edu image when using this checkpoint.

HuggingFaceFW org

We do not. I've added a note to the top of this file detailing how you can reproduce our setup: https://huggingface.co/datasets/HuggingFaceFW/fineweb/blob/main/lighteval_tasks.py

@guipenedo Thanks for the quick reply, I will check it out.

theblackcat102 changed discussion status to closed
