-
CLEX: Continuous Length Extrapolation for Large Language Models
Paper • 2310.16450 • Published • 9 -
E^2-LLM: Efficient and Extreme Length Extension of Large Language Models
Paper • 2401.06951 • Published • 24 -
Data Engineering for Scaling Language Models to 128K Context
Paper • 2402.10171 • Published • 21
Collections
Discover the best community collections!
Collections including paper arxiv:2310.16450
-
CLEX: Continuous Length Extrapolation for Large Language Models
Paper • 2310.16450 • Published • 9 -
Let's Synthesize Step by Step: Iterative Dataset Synthesis with Large Language Models by Extrapolating Errors from Small Models
Paper • 2310.13671 • Published • 18 -
Contrastive Prefence Learning: Learning from Human Feedback without RL
Paper • 2310.13639 • Published • 24 -
Tuna: Instruction Tuning using Feedback from Large Language Models
Paper • 2310.13385 • Published • 10
-
TRAMS: Training-free Memory Selection for Long-range Language Modeling
Paper • 2310.15494 • Published • 1 -
A Long Way to Go: Investigating Length Correlations in RLHF
Paper • 2310.03716 • Published • 9 -
YaRN: Efficient Context Window Extension of Large Language Models
Paper • 2309.00071 • Published • 65 -
Giraffe: Adventures in Expanding Context Lengths in LLMs
Paper • 2308.10882 • Published • 1
-
A Zero-Shot Language Agent for Computer Control with Structured Reflection
Paper • 2310.08740 • Published • 14 -
AgentTuning: Enabling Generalized Agent Abilities for LLMs
Paper • 2310.12823 • Published • 35 -
AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors
Paper • 2308.10848 • Published • 1 -
CLEX: Continuous Length Extrapolation for Large Language Models
Paper • 2310.16450 • Published • 9
-
Agents: An Open-source Framework for Autonomous Language Agents
Paper • 2309.07870 • Published • 41 -
Clinical Text Summarization: Adapting Large Language Models Can Outperform Human Experts
Paper • 2309.07430 • Published • 27 -
Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers
Paper • 2309.08532 • Published • 52 -
Investigating Answerability of LLMs for Long-Form Question Answering
Paper • 2309.08210 • Published • 12
-
Uncovering mesa-optimization algorithms in Transformers
Paper • 2309.05858 • Published • 12 -
ProPainter: Improving Propagation and Transformer for Video Inpainting
Paper • 2309.03897 • Published • 26 -
Approximating Two-Layer Feedforward Networks for Efficient Transformers
Paper • 2310.10837 • Published • 10 -
CLEX: Continuous Length Extrapolation for Large Language Models
Paper • 2310.16450 • Published • 9
-
MADLAD-400: A Multilingual And Document-Level Large Audited Dataset
Paper • 2309.04662 • Published • 22 -
Neurons in Large Language Models: Dead, N-gram, Positional
Paper • 2309.04827 • Published • 16 -
Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs
Paper • 2309.05516 • Published • 9 -
DrugChat: Towards Enabling ChatGPT-Like Capabilities on Drug Molecule Graphs
Paper • 2309.03907 • Published • 8