-
Evaluating Very Long-Term Conversational Memory of LLM Agents
Paper • 2402.17753 • Published • 18 -
StructLM: Towards Building Generalist Models for Structured Knowledge Grounding
Paper • 2402.16671 • Published • 26 -
Do Large Language Models Latently Perform Multi-Hop Reasoning?
Paper • 2402.16837 • Published • 24 -
Divide-or-Conquer? Which Part Should You Distill Your LLM?
Paper • 2402.15000 • Published • 22
Collections
Discover the best community collections!
Collections including paper arxiv:2402.10893
-
Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models
Paper • 2310.04406 • Published • 8 -
Chain-of-Thought Reasoning Without Prompting
Paper • 2402.10200 • Published • 99 -
ICDPO: Effectively Borrowing Alignment Capability of Others via In-context Direct Preference Optimization
Paper • 2402.09320 • Published • 6 -
Self-Discover: Large Language Models Self-Compose Reasoning Structures
Paper • 2402.03620 • Published • 109
-
Suppressing Pink Elephants with Direct Principle Feedback
Paper • 2402.07896 • Published • 9 -
Policy Improvement using Language Feedback Models
Paper • 2402.07876 • Published • 5 -
Direct Language Model Alignment from Online AI Feedback
Paper • 2402.04792 • Published • 29 -
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models
Paper • 2401.01335 • Published • 64
-
PRDP: Proximal Reward Difference Prediction for Large-Scale Reward Finetuning of Diffusion Models
Paper • 2402.08714 • Published • 10 -
Data Engineering for Scaling Language Models to 128K Context
Paper • 2402.10171 • Published • 21 -
RLVF: Learning from Verbal Feedback without Overgeneralization
Paper • 2402.10893 • Published • 10 -
Coercing LLMs to do and reveal (almost) anything
Paper • 2402.14020 • Published • 12
-
S-LoRA: Serving Thousands of Concurrent LoRA Adapters
Paper • 2311.03285 • Published • 28 -
Tailoring Self-Rationalizers with Multi-Reward Distillation
Paper • 2311.02805 • Published • 3 -
Ultra-Long Sequence Distributed Transformer
Paper • 2311.02382 • Published • 2 -
OpenChat: Advancing Open-source Language Models with Mixed-Quality Data
Paper • 2309.11235 • Published • 16
-
NExT-GPT: Any-to-Any Multimodal LLM
Paper • 2309.05519 • Published • 78 -
Large Language Model for Science: A Study on P vs. NP
Paper • 2309.05689 • Published • 20 -
AstroLLaMA: Towards Specialized Foundation Models in Astronomy
Paper • 2309.06126 • Published • 16 -
Large Language Models for Compiler Optimization
Paper • 2309.07062 • Published • 22
-
Stabilizing RLHF through Advantage Model and Selective Rehearsal
Paper • 2309.10202 • Published • 9 -
Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions
Paper • 2309.10150 • Published • 24 -
Robotic Offline RL from Internet Videos via Value-Function Pre-Training
Paper • 2309.13041 • Published • 8 -
Voyager: An Open-Ended Embodied Agent with Large Language Models
Paper • 2305.16291 • Published • 9
-
MADLAD-400: A Multilingual And Document-Level Large Audited Dataset
Paper • 2309.04662 • Published • 22 -
Neurons in Large Language Models: Dead, N-gram, Positional
Paper • 2309.04827 • Published • 16 -
Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs
Paper • 2309.05516 • Published • 9 -
DrugChat: Towards Enabling ChatGPT-Like Capabilities on Drug Molecule Graphs
Paper • 2309.03907 • Published • 8
-
Large-Scale Automatic Audiobook Creation
Paper • 2309.03926 • Published • 53 -
Agents: An Open-source Framework for Autonomous Language Agents
Paper • 2309.07870 • Published • 41 -
PDFTriage: Question Answering over Long, Structured Documents
Paper • 2309.08872 • Published • 53 -
StarCoder: may the source be with you!
Paper • 2305.06161 • Published • 29