llm - a Car9n Collection

llm

updated Aug 9

Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation

Paper • 2406.06525 • Published Jun 10 • 64
Husky: A Unified, Open-Source Language Agent for Multi-Step Reasoning

Paper • 2406.06469 • Published Jun 10 • 23
Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models

Paper • 2406.04271 • Published Jun 6 • 27
Block Transformer: Global-to-Local Language Modeling for Fast Inference

Paper • 2406.02657 • Published Jun 4 • 36
Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration

Paper • 2406.01014 • Published Jun 3 • 30
PosterLLaVa: Constructing a Unified Multi-modal Layout Generator with LLM

Paper • 2406.02884 • Published Jun 5 • 14
Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms

Paper • 2406.02900 • Published Jun 5 • 10
RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots

Paper • 2406.02523 • Published Jun 4 • 9
Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality

Paper • 2405.21060 • Published May 31 • 63
Jina CLIP: Your CLIP Model Is Also Your Text Retriever

Paper • 2405.20204 • Published May 30 • 32
Xwin-LM: Strong and Scalable Alignment Practice for LLMs

Paper • 2405.20335 • Published May 30 • 17
Mixture-of-Agents Enhances Large Language Model Capabilities

Paper • 2406.04692 • Published Jun 7 • 55
MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series

Paper • 2405.19327 • Published May 29 • 46
2BP: 2-Stage Backpropagation

Paper • 2405.18047 • Published May 28 • 23
Phased Consistency Model

Paper • 2405.18407 • Published May 28 • 46
Yuan 2.0-M32: Mixture of Experts with Attention Router

Paper • 2405.17976 • Published May 28 • 18
An Introduction to Vision-Language Modeling

Paper • 2405.17247 • Published May 27 • 85
Transformers Can Do Arithmetic with the Right Embeddings

Paper • 2405.17399 • Published May 27 • 51
Human4DiT: Free-view Human Video Generation with 4D Diffusion Transformer

Paper • 2405.17405 • Published May 27 • 14
Trans-LoRA: towards data-free Transferable Parameter Efficient Finetuning

Paper • 2405.17258 • Published May 27 • 14
LoGAH: Predicting 774-Million-Parameter Transformers using Graph HyperNetworks with 1/100 Parameters

Paper • 2405.16287 • Published May 25 • 10
Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models

Paper • 2405.15574 • Published May 24 • 53
ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large Multimodal Models

Paper • 2405.15738 • Published May 24 • 43
Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training

Paper • 2405.15319 • Published May 24 • 25
AutoCoder: Enhancing Code Large Language Model with AIEV-Instruct

Paper • 2405.14906 • Published May 23 • 23
Automatic Data Curation for Self-Supervised Learning: A Clustering-Based Approach

Paper • 2405.15613 • Published May 24 • 13
HDR-GS: Efficient High Dynamic Range Novel View Synthesis at 1000x Speed via Gaussian Splatting

Paper • 2405.15125 • Published May 24 • 5
Not All Language Model Features Are Linear

Paper • 2405.14860 • Published May 23 • 39
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data

Paper • 2405.14333 • Published May 23 • 34
Dense Connector for MLLMs

Paper • 2405.13800 • Published May 22 • 21
Your Transformer is Secretly Linear

Paper • 2405.12250 • Published May 19 • 150
Reducing Transformer Key-Value Cache Size with Cross-Layer Attention

Paper • 2405.12981 • Published May 21 • 28
Face Adapter for Pre-Trained Diffusion Models with Fine-Grained ID and Attribute Control

Paper • 2405.12970 • Published May 21 • 22
Diffusion for World Modeling: Visual Details Matter in Atari

Paper • 2405.12399 • Published May 20 • 27
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning

Paper • 2405.12130 • Published May 20 • 45
OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework

Paper • 2405.11143 • Published May 20 • 33
Imp: Highly Capable Large Multimodal Models for Mobile Devices

Paper • 2405.12107 • Published May 20 • 25
LoRA Learns Less and Forgets Less

Paper • 2405.09673 • Published May 15 • 87
Many-Shot In-Context Learning in Multimodal Foundation Models

Paper • 2405.09798 • Published May 16 • 26
What matters when building vision-language models?

Paper • 2405.02246 • Published May 3 • 98
Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks

Paper • 2408.03615 • Published Aug 7 • 30