- Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation
  Paper • 2406.06525 • Published • 65
- Husky: A Unified, Open-Source Language Agent for Multi-Step Reasoning
  Paper • 2406.06469 • Published • 24
- Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models
  Paper • 2406.04271 • Published • 28
- Block Transformer: Global-to-Local Language Modeling for Fast Inference
  Paper • 2406.02657 • Published • 37
Collections including paper arxiv:2406.06525
- Kaleido Diffusion: Improving Conditional Diffusion Models with Autoregressive Latent Modeling
  Paper • 2405.21048 • Published • 12
- Block Transformer: Global-to-Local Language Modeling for Fast Inference
  Paper • 2406.02657 • Published • 37
- Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation
  Paper • 2406.06525 • Published • 65

- xLSTM: Extended Long Short-Term Memory
  Paper • 2405.04517 • Published • 11
- You Only Cache Once: Decoder-Decoder Architectures for Language Models
  Paper • 2405.05254 • Published • 9
- Understanding the performance gap between online and offline alignment algorithms
  Paper • 2405.08448 • Published • 14
- Chameleon: Mixed-Modal Early-Fusion Foundation Models
  Paper • 2405.09818 • Published • 126

- FLAME: Factuality-Aware Alignment for Large Language Models
  Paper • 2405.01525 • Published • 24
- DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data
  Paper • 2405.14333 • Published • 35
- Transformers Can Do Arithmetic with the Right Embeddings
  Paper • 2405.17399 • Published • 51
- EasyAnimate: A High-Performance Long Video Generation Method based on Transformer Architecture
  Paper • 2405.18991 • Published • 12

- Hyper-SD: Trajectory Segmented Consistency Model for Efficient Image Synthesis
  Paper • 2404.13686 • Published • 27
- Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation
  Paper • 2406.06525 • Published • 65
- IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models
  Paper • 2308.06721 • Published • 29

- MoDE: CLIP Data Experts via Clustering
  Paper • 2404.16030 • Published • 12
- Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation
  Paper • 2406.06525 • Published • 65
- Data curation via joint example selection further accelerates multimodal learning
  Paper • 2406.17711 • Published • 3
- Unveiling Encoder-Free Vision-Language Models
  Paper • 2406.11832 • Published • 49

- All you need is a good init
  Paper • 1511.06422 • Published • 1
- Align Your Steps: Optimizing Sampling Schedules in Diffusion Models
  Paper • 2404.14507 • Published • 21
- Efficient Transformer Encoders for Mask2Former-style models
  Paper • 2404.15244 • Published • 1
- Deep Residual Learning for Image Recognition
  Paper • 1512.03385 • Published • 6

- AniClipart: Clipart Animation with Text-to-Video Priors
  Paper • 2404.12347 • Published • 12
- MoA: Mixture-of-Attention for Subject-Context Disentanglement in Personalized Image Generation
  Paper • 2404.11565 • Published • 14
- Dynamic Typography: Bringing Words to Life
  Paper • 2404.11614 • Published • 44
- No Training, No Problem: Rethinking Classifier-Free Guidance for Diffusion Models
  Paper • 2407.02687 • Published • 22

- Dynamic Typography: Bringing Words to Life
  Paper • 2404.11614 • Published • 44
- Scene Coordinate Reconstruction: Posing of Image Collections via Incremental Learning of a Relocalizer
  Paper • 2404.14351 • Published • 5
- BlenderAlchemy: Editing 3D Graphics with Vision-Language Models
  Paper • 2404.17672 • Published • 18
- Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation
  Paper • 2406.06525 • Published • 65