Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis Paper • 2402.14797 • Published Feb 22 • 19
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases Paper • 2402.14905 • Published Feb 22 • 126
GPTVQ: The Blessing of Dimensionality for LLM Quantization Paper • 2402.15319 • Published Feb 23 • 19
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27 • 602
When Scaling Meets LLM Finetuning: The Effect of Data, Model and Finetuning Method Paper • 2402.17193 • Published Feb 27 • 23
EasyQuant: An Efficient Data-free Quantization Algorithm for LLMs Paper • 2403.02775 • Published Mar 5 • 11
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection Paper • 2403.03507 • Published Mar 6 • 182
NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models Paper • 2403.03100 • Published Mar 5 • 34
CodexGraph: Bridging Large Language Models and Code Repositories via Code Graph Databases Paper • 2408.03910 • Published Aug 7 • 15
CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer Paper • 2408.06072 • Published Aug 12 • 35
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model Paper • 2408.11039 • Published Aug 20 • 56
TVG: A Training-free Transition Video Generation Method with Diffusion Models Paper • 2408.13413 • Published Aug 24 • 13
MobileQuant: Mobile-friendly Quantization for On-device Language Models Paper • 2408.13933 • Published Aug 25 • 13
OneGen: Efficient One-Pass Unified Generation and Retrieval for LLMs Paper • 2409.05152 • Published Sep 8 • 29
Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers Paper • 2409.04109 • Published Sep 6 • 43
RATIONALYST: Pre-training Process-Supervision for Improving Reasoning Paper • 2410.01044 • Published Oct 1 • 34
Aligning Large Language Models via Self-Steering Optimization Paper • 2410.17131 • Published 19 days ago • 19
Unleashing Reasoning Capability of LLMs via Scalable Question Synthesis from Scratch Paper • 2410.18693 • Published 17 days ago • 40
WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning Paper • 2411.02337 • Published 5 days ago • 32