alignment_24_best - a u-brixton Collection

updated 26 days ago

KTO: Model Alignment as Prospect Theoretic Optimization

Paper • 2402.01306 • Published Feb 2 • 15
Direct Preference Optimization: Your Language Model is Secretly a Reward Model

Paper • 2305.18290 • Published May 29, 2023 • 48
SimPO: Simple Preference Optimization with a Reference-Free Reward

Paper • 2405.14734 • Published May 23 • 10
Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment

Paper • 2408.06266 • Published Aug 12 • 9
Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs

Paper • 2402.14740 • Published Feb 22 • 10
Binary Classifier Optimization for Large Language Model Alignment

Paper • 2404.04656 • Published Apr 6 • 2
Noise Contrastive Alignment of Language Models with Explicit Rewards

Paper • 2402.05369 • Published Feb 8 • 1
Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation

Paper • 2401.08417 • Published Jan 16 • 33
Direct Language Model Alignment from Online AI Feedback

Paper • 2402.04792 • Published Feb 7 • 29
Nash Learning from Human Feedback

Paper • 2312.00886 • Published Dec 1, 2023 • 14
ORPO: Monolithic Preference Optimization without Reference Model

Paper • 2403.07691 • Published Mar 12 • 62
Exploratory Preference Optimization: Harnessing Implicit Q*-Approximation for Sample-Efficient RLHF

Paper • 2405.21046 • Published May 31 • 3
From r to Q^*: Your Language Model is Secretly a Q-Function

Paper • 2404.12358 • Published Apr 18 • 2
Offline Regularised Reinforcement Learning for Large Language Models Alignment

Paper • 2405.19107 • Published May 29 • 13
Towards Scalable Automated Alignment of LLMs: A Survey

Paper • 2406.01252 • Published Jun 3 • 2
Towards a Unified View of Preference Learning for Large Language Models: A Survey

Paper • 2409.02795 • Published Sep 4 • 72
Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data

Paper • 2404.14367 • Published Apr 22 • 1
Self-Supervised Alignment with Mutual Information: Learning to Follow Principles without Preference Labels

Paper • 2404.14313 • Published Apr 22
Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback

Paper • 2406.09279 • Published Jun 13 • 1
Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study

Paper • 2404.10719 • Published Apr 16 • 4
Advancing LLM Reasoning Generalists with Preference Trees

Paper • 2404.02078 • Published Apr 2 • 44
Building Math Agents with Multi-Turn Iterative Preference Learning

Paper • 2409.02392 • Published Sep 4 • 14
Not All Preference Pairs Are Created Equal: A Recipe for Annotation-Efficient Iterative Preference Learning

Paper • 2406.17312 • Published Jun 25
Show, Don't Tell: Aligning Language Models with Demonstrated Feedback

Paper • 2406.00888 • Published Jun 2 • 30
Contrastive Preference Learning: Learning from Human Feedback without RL

Paper • 2310.13639 • Published Oct 20, 2023 • 24
trl-lib/kto-mix-14k

Viewer • Updated Mar 25 • 15k • 539 • 6
Towards Efficient and Exact Optimization of Language Model Alignment

Paper • 2402.00856 • Published Feb 1
HelpSteer2-Preference: Complementing Ratings with Preferences

Paper • 2410.01257 • Published Oct 2 • 19
General Preference Modeling with Preference Representations for Aligning Language Models

Paper • 2410.02197 • Published Oct 3 • 7
Modulated Intervention Preference Optimization (MIPO): Keep the Easy, Refine the Difficult

Paper • 2409.17545 • Published Sep 26 • 18
RLHF Workflow: From Reward Modeling to Online RLHF

Paper • 2405.07863 • Published May 13 • 67
Regressing the Relative Future: Efficient Policy Optimization for Multi-turn RLHF

Paper • 2410.04612 • Published Oct 6
Understanding Likelihood Over-optimisation in Direct Alignment Algorithms

Paper • 2410.11677 • Published Oct 15