Qwen2.5 Collection Qwen2.5 language models, including pretrained and instruction-tuned models of 7 sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 45 items • Updated Sep 18 • 371
Margin-aware Preference Optimization for Aligning Diffusion Models without Reference Paper • 2406.06424 • Published Jun 10 • 12
MaPO Collection This collection includes the models and datasets as a part of the MaPO release. • 9 items • Updated Jun 12 • 5
Are You Sure? Rank Them Again: Repeated Ranking For Better Preference Datasets Paper • 2405.18952 • Published May 29 • 10
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Sep 25 • 683
Zephyr ORPO Collection Models and datasets to align LLMs with Odds Ratio Preference Optimisation (ORPO). Recipes here: https://github.com/huggingface/alignment-handbook • 3 items • Updated Apr 12 • 17
ORPO: Monolithic Preference Optimization without Reference Model Paper • 2403.07691 • Published Mar 12 • 62
ORPO Collection This is the official collection of "ORPO: Monolithic Preference Optimization without Reference Model". • 5 items • Updated Apr 12 • 11