RedPajama: an Open Dataset for Training Large Language Models • Paper • 2411.12372 • Published 2 days ago • 36
Continuous Speculative Decoding for Autoregressive Image Generation • Paper • 2411.11925 • Published 3 days ago • 13
The Dawn of GUI Agent: A Preliminary Case Study with Claude 3.5 Computer Use • Paper • 2411.10323 • Published 6 days ago • 26
GaussianAnything: Interactive Point Cloud Latent Diffusion for 3D Generation • Paper • 2411.08033 • Published 9 days ago • 21
LLaVA-o1: Let Vision Language Models Reason Step-by-Step • Paper • 2411.10440 • Published 6 days ago • 87
MagicQuill: An Intelligent Interactive Image Editing System • Paper • 2411.09703 • Published 7 days ago • 50
Direct Preference Optimization Using Sparse Feature-Level Constraints • Paper • 2411.07618 • Published 9 days ago • 15
EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation • Paper • 2411.08380 • Published 8 days ago • 24
Large Language Models Can Self-Improve in Long-context Reasoning • Paper • 2411.08147 • Published 9 days ago • 58
SpeechAlign: Aligning Speech Generation to Human Preferences • Paper • 2404.05600 • Published Apr 8 • 1
LLM2CLIP: Powerful Language Model Unlock Richer Visual Representation • Paper • 2411.04997 • Published 14 days ago • 34
IOPO: Empowering LLMs with Complex Instruction Following via Input-Output Preference Optimization • Paper • 2411.06208 • Published 12 days ago • 18
Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models • Paper • 2411.07232 • Published 10 days ago • 60
Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models • Paper • 2411.04996 • Published 14 days ago • 48