davidberenstein1957's Collections
LLM evals and benchmark datasets
Updated Aug 17
allenai/reward-bench • Viewer • Updated 9 days ago • 8.11k • 57.5k • 67
openai/openai_humaneval • Viewer • Updated Jan 4 • 164 • 8.05k • 226
google/IFEval • Viewer • Updated Aug 14 • 541 • 3.04k • 28
allenai/ai2_arc • Viewer • Updated Dec 21, 2023 • 7.79k • 770k • 128
allenai/winogrande • Updated Jan 18 • 12.4k • 53
TIGER-Lab/MMLU-Pro • Viewer • Updated 11 days ago • 12.1k • 149k • 258
cais/mmlu • Viewer • Updated Mar 8 • 231k • 350k • 299
truthfulqa/truthful_qa • Viewer • Updated Jan 4 • 1.63k • 6.33k • 196
openai/gsm8k • Viewer • Updated Jan 4 • 17.6k • 46.3k • 364
Rowan/hellaswag • Viewer • Updated Sep 28, 2023 • 60k • 15.3k • 83
tatsu-lab/alpaca_eval • Updated Aug 16 • 63.2k • 48
HuggingFaceH4/mt_bench_prompts • Viewer • Updated Jul 3, 2023 • 80 • 673 • 15
nvidia/ChatRAG-Bench • Viewer • Updated May 24 • 34.6k • 1.77k • 93
rungalileo/ragbench • Viewer • Updated Jun 11 • 95.4k • 717 • 8