MMLU by task leaderboard

#173
by CoreyMorris - opened

I created a leaderboard showing the accuracy score for each task in MMLU: https://huggingface.co/spaces/CoreyMorris/MMLU-by-task-Leaderboard . I'll keep it updated at least until Hugging Face decides to create one with the breakdown by task. I'm open to suggestions for improving it.

Open LLM Leaderboard org

This is a great idea! (We probably won't add one here at the moment.)

Overall, I would suggest:

  • removing non-MMLU scores
  • adding some of the original MMLU groupings (humanities, social sciences, STEM, other); you can find more info in the original repository
  • using a bigger widget for the table (it's hard to search through as it is) and possibly adding a search function.
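
For the groupings, here is a minimal sketch of how per-task accuracies could be rolled up into the four categories. The mapping below is only a small illustrative subset (the full task-to-category mapping is in the original repository), the function name `group_accuracy` is hypothetical, and the scores in the usage note are made up. It takes a simple unweighted mean per group; weighting by the number of questions per task would give different numbers.

```python
from collections import defaultdict

# Illustrative subset of a task -> group mapping (assumption: the full
# mapping should be taken from the original MMLU repository).
TASK_TO_GROUP = {
    "abstract_algebra": "STEM",
    "college_physics": "STEM",
    "philosophy": "humanities",
    "world_religions": "humanities",
    "econometrics": "social sciences",
    "sociology": "social sciences",
    "marketing": "other",
    "nutrition": "other",
}


def group_accuracy(task_scores):
    """Return the unweighted mean accuracy of the tasks in each group.

    Tasks missing from TASK_TO_GROUP (for example non-MMLU benchmarks)
    are skipped rather than raising an error.
    """
    buckets = defaultdict(list)
    for task, acc in task_scores.items():
        group = TASK_TO_GROUP.get(task)
        if group is None:
            continue  # unmapped or non-MMLU task
        buckets[group].append(acc)
    return {group: sum(accs) / len(accs) for group, accs in buckets.items()}
```

For example, calling `group_accuracy({"abstract_algebra": 0.25, "college_physics": 0.75, "philosophy": 0.5})` averages the two STEM tasks together and reports philosophy on its own under humanities.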

I really like the plots. You could add some explanation of what you are plotting and why; it would really enrich your page.

Lastly, don't forget your own citation link! :)

Thanks for the suggestions!

@clefourrier

  • I made the table bigger and added some ways to filter (model size, model name, and task name).
  • I also added some explanation for the plots and my own citation.

I'll probably add the original MMLU groupings as well. I'm not sure about removing the non-MMLU scores. I want people to be able to compare those as well, but I should probably at least add some explanation and maybe hide them or make them less prominent by default.

Thanks again for the feedback!

clefourrier changed discussion status to closed
