Finding the best leaderboard for your use case
✨ Featured leaderboards
Since the end of 2023, we have worked with partners with strong evaluation expertise to highlight their work in a blog series called Leaderboards on the Hub.
Among these, here is a shortlist of LLM-specific leaderboards you could take a look at!
- Code evaluation:
- Mathematics abilities:
- Safety:
- Performance:
This series is particularly useful for understanding the subtleties of evaluation across different modalities and topics, and we hope it will act as a knowledge base in the future.
🔍 Explore Spaces by yourself
On the Hub, leaderboards and arenas are hosted as Spaces, like machine learning demos.
You can either look for the keywords leaderboard or arena in the Space title using the search bar here (or this link), search the full Space using the “Full-text search”, or find Spaces with the correct metadata by looking for the leaderboard tag here.
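If you would rather run the keyword search from a script, the sketch below shows one possible way to do it with the `huggingface_hub` Python client. It assumes that `HfApi.list_spaces` with its `search` and `limit` arguments behaves like the search bar, which may not match the web interface exactly. The helper names `matches_keywords` and `find_leaderboard_spaces` are hypothetical and chosen for illustration only.

```python
from typing import Iterable, List

# Keywords suggested above for spotting leaderboards and arenas.
KEYWORDS = ("leaderboard", "arena")


def matches_keywords(space_id: str, keywords: Iterable[str] = KEYWORDS) -> bool:
    """Return True if the Space name (the part after the owner) contains a keyword."""
    name = space_id.split("/")[-1].lower()
    return any(keyword in name for keyword in keywords)


def find_leaderboard_spaces(limit: int = 20) -> List[str]:
    """Query the Hub once per keyword and return the matching Space ids, sorted."""
    # Imported lazily so matches_keywords works without huggingface_hub installed.
    from huggingface_hub import HfApi

    api = HfApi()
    found = set()
    for keyword in KEYWORDS:
        # Assumption: `search` matches Spaces whose id contains the keyword.
        for space in api.list_spaces(search=keyword, limit=limit):
            if matches_keywords(space.id):
                found.add(space.id)
    return sorted(found)
```

Calling `find_leaderboard_spaces()` requires network access and returns at most `limit` results per keyword. The extra `matches_keywords` check keeps only Spaces whose own name contains a keyword, so a match on the owner name alone is dropped.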
We also try to maintain an up-to-date collection of leaderboards. If we missed your space, tag one of the members of the evaluation team in the space discussion!