-
Adapting Language Models to Compress Contexts
Paper • 2305.14788 • Published • 1 -
Soaring from 4K to 400K: Extending LLM's Context with Activation Beacon
Paper • 2401.03462 • Published • 26 -
Flexibly Scaling Large Language Models Contexts Through Extensible Tokenization
Paper • 2401.07793 • Published • 3 -
Say More with Less: Understanding Prompt Learning Behaviors through Gist Compression
Paper • 2402.16058 • Published
Collections
Discover the best community collections!
Collections including paper arxiv:2311.11829
-
Can Large Language Models Understand Context?
Paper • 2402.00858 • Published • 21 -
Efficient Tool Use with Chain-of-Abstraction Reasoning
Paper • 2401.17464 • Published • 16 -
ReFT: Reasoning with Reinforced Fine-Tuning
Paper • 2401.08967 • Published • 27 -
The Impact of Reasoning Step Length on Large Language Models
Paper • 2401.04925 • Published • 15
-
Chain of Code: Reasoning with a Language Model-Augmented Code Emulator
Paper • 2312.04474 • Published • 29 -
Boosting LLM Reasoning: Push the Limits of Few-shot Learning with Reinforced In-Context Pruning
Paper • 2312.08901 • Published -
Learning From Mistakes Makes LLM Better Reasoner
Paper • 2310.20689 • Published • 28 -
Making Large Language Models Better Reasoners with Step-Aware Verifier
Paper • 2206.02336 • Published • 1
-
Multilingual Instruction Tuning With Just a Pinch of Multilinguality
Paper • 2401.01854 • Published • 10 -
LLaMA Beyond English: An Empirical Study on Language Capability Transfer
Paper • 2401.01055 • Published • 54 -
LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
Paper • 2401.01325 • Published • 26 -
Improving Text Embeddings with Large Language Models
Paper • 2401.00368 • Published • 79
-
aMUSEd: An Open MUSE Reproduction
Paper • 2401.01808 • Published • 28 -
From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations
Paper • 2401.01885 • Published • 27 -
SteinDreamer: Variance Reduction for Text-to-3D Score Distillation via Stein Identity
Paper • 2401.00604 • Published • 4 -
LARP: Language-Agent Role Play for Open-World Games
Paper • 2312.17653 • Published • 30
-
Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer
Paper • 2311.06720 • Published • 7 -
System 2 Attention (is something you might need too)
Paper • 2311.11829 • Published • 39 -
TinyGSM: achieving >80% on GSM8k with small language models
Paper • 2312.09241 • Published • 37 -
ReFT: Reasoning with Reinforced Fine-Tuning
Paper • 2401.08967 • Published • 27
-
Orca 2: Teaching Small Language Models How to Reason
Paper • 2311.11045 • Published • 70 -
ToolTalk: Evaluating Tool-Usage in a Conversational Setting
Paper • 2311.10775 • Published • 7 -
Adapters: A Unified Library for Parameter-Efficient and Modular Transfer Learning
Paper • 2311.11077 • Published • 24 -
MultiLoRA: Democratizing LoRA for Better Multi-Task Learning
Paper • 2311.11501 • Published • 33
-
Zephyr: Direct Distillation of LM Alignment
Paper • 2310.16944 • Published • 121 -
Exponentially Faster Language Modelling
Paper • 2311.10770 • Published • 118 -
System 2 Attention (is something you might need too)
Paper • 2311.11829 • Published • 39 -
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Paper • 2305.18290 • Published • 48
-
System 2 Attention (is something you might need too)
Paper • 2311.11829 • Published • 39 -
ToolTalk: Evaluating Tool-Usage in a Conversational Setting
Paper • 2311.10775 • Published • 7 -
Adapters: A Unified Library for Parameter-Efficient and Modular Transfer Learning
Paper • 2311.11077 • Published • 24