Miguel Alonso Jr

miguelalonsojr

AI & ML interests

ML, RL, Robotics

miguelalonsojr's activity

upvoted 3 papers 9 months ago

Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models

Paper • 2401.01335 • Published Jan 2, 2024 • 64

Nash Learning from Human Feedback

Paper • 2312.00886 • Published Dec 1, 2023 • 14

Aligning Large Multimodal Models with Factually Augmented RLHF

Paper • 2309.14525 • Published Sep 25, 2023 • 29

upvoted a paper 10 months ago

Direct Preference Optimization: Your Language Model is Secretly a Reward Model

Paper • 2305.18290 • Published May 29, 2023 • 48

upvoted 2 collections 10 months ago

Zephyr 7B

Models, datasets, and demos associated with Zephyr 7B. For code to train the models, see: https://github.com/huggingface/alignment-handbook • 9 items • Updated Apr 12 • 145

Papers about model merging

Referenced in the mergekit repo: https://github.com/cg123/mergekit • 4 items • Updated Feb 13 • 14