Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time Paper • 2203.05482 • Published Mar 10, 2022 • 6
Diverse Weight Averaging for Out-of-Distribution Generalization Paper • 2205.09739 • Published May 19, 2022 • 1
Sudden Drops in the Loss: Syntax Acquisition, Phase Transitions, and Simplicity Bias in MLMs Paper • 2309.07311 • Published Sep 13, 2023 • 2
ReAGent: Towards A Model-agnostic Feature Attribution Method for Generative Language Models Paper • 2402.00794 • Published Feb 1, 2024 • 1
Tracking Universal Features Through Fine-Tuning and Model Merging Paper • 2410.12391 • Published Oct 2024 • 5