Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation Paper • 2406.06525 • Published Jun 10 • 64
Husky: A Unified, Open-Source Language Agent for Multi-Step Reasoning Paper • 2406.06469 • Published Jun 10 • 23
Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models Paper • 2406.04271 • Published Jun 6 • 27
Block Transformer: Global-to-Local Language Modeling for Fast Inference Paper • 2406.02657 • Published Jun 4 • 36
Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration Paper • 2406.01014 • Published Jun 3 • 30
PosterLLaVa: Constructing a Unified Multi-modal Layout Generator with LLM Paper • 2406.02884 • Published Jun 5 • 14
Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms Paper • 2406.02900 • Published Jun 5 • 10
RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots Paper • 2406.02523 • Published Jun 4 • 9
Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality Paper • 2405.21060 • Published May 31 • 63
Mixture-of-Agents Enhances Large Language Model Capabilities Paper • 2406.04692 • Published Jun 7 • 55
MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series Paper • 2405.19327 • Published May 29 • 45
Human4DiT: Free-view Human Video Generation with 4D Diffusion Transformer Paper • 2405.17405 • Published May 27 • 14
Trans-LoRA: towards data-free Transferable Parameter Efficient Finetuning Paper • 2405.17258 • Published May 27 • 14
LoGAH: Predicting 774-Million-Parameter Transformers using Graph HyperNetworks with 1/100 Parameters Paper • 2405.16287 • Published May 25 • 10
Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models Paper • 2405.15574 • Published May 24 • 53
ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large Multimodal Models Paper • 2405.15738 • Published May 24 • 43
Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training Paper • 2405.15319 • Published May 24 • 25
AutoCoder: Enhancing Code Large Language Model with AIEV-Instruct Paper • 2405.14906 • Published May 23 • 23
Automatic Data Curation for Self-Supervised Learning: A Clustering-Based Approach Paper • 2405.15613 • Published May 24 • 13
HDR-GS: Efficient High Dynamic Range Novel View Synthesis at 1000x Speed via Gaussian Splatting Paper • 2405.15125 • Published May 24 • 5
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data Paper • 2405.14333 • Published May 23 • 34
Reducing Transformer Key-Value Cache Size with Cross-Layer Attention Paper • 2405.12981 • Published May 21 • 28
Face Adapter for Pre-Trained Diffusion Models with Fine-Grained ID and Attribute Control Paper • 2405.12970 • Published May 21 • 22
Diffusion for World Modeling: Visual Details Matter in Atari Paper • 2405.12399 • Published May 20 • 27
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning Paper • 2405.12130 • Published May 20 • 45
OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework Paper • 2405.11143 • Published May 20 • 33
Imp: Highly Capable Large Multimodal Models for Mobile Devices Paper • 2405.12107 • Published May 20 • 25
Many-Shot In-Context Learning in Multimodal Foundation Models Paper • 2405.09798 • Published May 16 • 26
Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks Paper • 2408.03615 • Published Aug 7 • 30