- Adapting Language Models to Compress Contexts
  Paper • 2305.14788 • Published • 1
- Soaring from 4K to 400K: Extending LLM's Context with Activation Beacon
  Paper • 2401.03462 • Published • 27
- Flexibly Scaling Large Language Models Contexts Through Extensible Tokenization
  Paper • 2401.07793 • Published • 3
- Say More with Less: Understanding Prompt Learning Behaviors through Gist Compression
  Paper • 2402.16058 • Published
Collections including paper arxiv:2305.14788
- Can Retriever-Augmented Language Models Reason? The Blame Game Between the Retriever and the Language Model
  Paper • 2212.09146 • Published • 3
- RaLLe: A Framework for Developing and Evaluating Retrieval-Augmented Large Language Models
  Paper • 2308.10633 • Published • 1
- MemeCap: A Dataset for Captioning and Interpreting Memes
  Paper • 2305.13703 • Published
- Contrastive Learning for Inference in Dialogue
  Paper • 2310.12467 • Published

- Efficient Memory Management for Large Language Model Serving with PagedAttention
  Paper • 2309.06180 • Published • 25
- LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models
  Paper • 2308.16137 • Published • 39
- Scaling Transformer to 1M tokens and beyond with RMT
  Paper • 2304.11062 • Published • 2
- DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models
  Paper • 2309.14509 • Published • 17

- LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models
  Paper • 2310.08659 • Published • 22
- QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models
  Paper • 2309.14717 • Published • 44
- ModuLoRA: Finetuning 3-Bit LLMs on Consumer GPUs by Integrating with Modular Quantizers
  Paper • 2309.16119 • Published • 1
- LoRA ensembles for large language model fine-tuning
  Paper • 2310.00035 • Published • 2

- Sparse Autoencoders Find Highly Interpretable Features in Language Models
  Paper • 2309.08600 • Published • 13
- In-context Autoencoder for Context Compression in a Large Language Model
  Paper • 2307.06945 • Published • 27
- Self-slimmed Vision Transformer
  Paper • 2111.12624 • Published • 1
- MEMORY-VQ: Compression for Tractable Internet-Scale Memory
  Paper • 2308.14903 • Published • 1

- Diversity of Thought Improves Reasoning Abilities of Large Language Models
  Paper • 2310.07088 • Published • 5
- Reverse Chain: A Generic-Rule for LLMs to Master Multi-API Planning
  Paper • 2310.04474 • Published • 2
- Promptor: A Conversational and Autonomous Prompt Generation Agent for Intelligent Text Entry Techniques
  Paper • 2310.08101 • Published • 2
- Instance Needs More Care: Rewriting Prompts for Instances Yields Better Zero-Shot Performance
  Paper • 2310.02107 • Published • 3

- TRAMS: Training-free Memory Selection for Long-range Language Modeling
  Paper • 2310.15494 • Published • 1
- A Long Way to Go: Investigating Length Correlations in RLHF
  Paper • 2310.03716 • Published • 9
- YaRN: Efficient Context Window Extension of Large Language Models
  Paper • 2309.00071 • Published • 65
- Giraffe: Adventures in Expanding Context Lengths in LLMs
  Paper • 2308.10882 • Published • 1

- In-Context Learning Creates Task Vectors
  Paper • 2310.15916 • Published • 41
- When can transformers reason with abstract symbols?
  Paper • 2310.09753 • Published • 2
- Improving Length-Generalization in Transformers via Task Hinting
  Paper • 2310.00726 • Published • 1
- In-context Autoencoder for Context Compression in a Large Language Model
  Paper • 2307.06945 • Published • 27

- Dissecting In-Context Learning of Translations in GPTs
  Paper • 2310.15987 • Published • 5
- In-Context Learning Creates Task Vectors
  Paper • 2310.15916 • Published • 41
- ZeroGen: Efficient Zero-shot Learning via Dataset Generation
  Paper • 2202.07922 • Published • 1
- Promptor: A Conversational and Autonomous Prompt Generation Agent for Intelligent Text Entry Techniques
  Paper • 2310.08101 • Published • 2