rbgo's Collections
PPO Trainers
Updated Sep 12
Direct Language Model Alignment from Online AI Feedback
Paper · 2402.04792 · Published Feb 7 · 29 upvotes