MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering • Paper • 2410.07095 • Published Oct 9 • 6
DBRX • Collection • DBRX is a mixture-of-experts (MoE) large language model trained from scratch by Databricks. • 3 items • Updated Mar 27 • 91
Whisper Release • Collection • Whisper includes both English-only and multilingual checkpoints for ASR and ST, ranging from 38M params for the tiny models to 1.5B params for large. • 12 items • Updated Sep 13, 2023 • 86
Self-Discover: Large Language Models Self-Compose Reasoning Structures • Paper • 2402.03620 • Published Feb 6 • 109
Handbook v0.1 models and datasets • Collection • Models and datasets for v0.1 of the alignment handbook • 6 items • Updated Nov 10, 2023 • 24
DPO vs KTO vs IPO • Collection • A collection of datasets and models used for the Aligning LLMs with Direct Preference Optimization Methods blogpost • 2 items • Updated Jan 16 • 11
Constitutional AI • Collection • A collection of datasets and models that accompany the Constitutional AI recipe. See hf.co/blog/constitutional-ai for more details. • 9 items • Updated Feb 1 • 5
Tulu V2 Suite • Collection • The set of models associated with the paper "Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2" • 19 items • Updated Sep 25 • 42
Paloma • Collection • Dataset and baseline models for Paloma, a benchmark of language model fit to 546 textual domains • 8 items • Updated Sep 26 • 13