- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 142
- Self-Discover: Large Language Models Self-Compose Reasoning Structures
  Paper • 2402.03620 • Published • 109
- OS-Copilot: Towards Generalist Computer Agents with Self-Improvement
  Paper • 2402.07456 • Published • 41
- Learning From Mistakes Makes LLM Better Reasoner
  Paper • 2310.20689 • Published • 28
Collections
Collections including paper arxiv:2409.12917
- Can large language models explore in-context?
  Paper • 2403.15371 • Published • 31
- Advancing LLM Reasoning Generalists with Preference Trees
  Paper • 2404.02078 • Published • 43
- Long-context LLMs Struggle with Long In-context Learning
  Paper • 2404.02060 • Published • 34
- Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences
  Paper • 2404.03715 • Published • 60

- ORPO: Monolithic Preference Optimization without Reference Model
  Paper • 2403.07691 • Published • 61
- sDPO: Don't Use Your Data All at Once
  Paper • 2403.19270 • Published • 38
- Teaching Large Language Models to Reason with Reinforcement Learning
  Paper • 2403.04642 • Published • 46
- Best Practices and Lessons Learned on Synthetic Data for Language Models
  Paper • 2404.07503 • Published • 29

- GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
  Paper • 2403.03507 • Published • 182
- RAFT: Adapting Language Model to Domain Specific RAG
  Paper • 2403.10131 • Published • 67
- LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models
  Paper • 2403.13372 • Published • 60
- InternLM2 Technical Report
  Paper • 2403.17297 • Published • 28

- Nemotron-4 15B Technical Report
  Paper • 2402.16819 • Published • 42
- InternLM2 Technical Report
  Paper • 2403.17297 • Published • 28
- Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model
  Paper • 2404.04167 • Published • 12
- MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases
  Paper • 2402.14905 • Published • 108

- LoRA+: Efficient Low Rank Adaptation of Large Models
  Paper • 2402.12354 • Published • 6
- The FinBen: An Holistic Financial Benchmark for Large Language Models
  Paper • 2402.12659 • Published • 16
- TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization
  Paper • 2402.13249 • Published • 10
- TrustLLM: Trustworthiness in Large Language Models
  Paper • 2401.05561 • Published • 64

- Can Large Language Models Understand Context?
  Paper • 2402.00858 • Published • 21
- OLMo: Accelerating the Science of Language Models
  Paper • 2402.00838 • Published • 79
- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 142
- SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity
  Paper • 2401.17072 • Published • 25

- FlashDecoding++: Faster Large Language Model Inference on GPUs
  Paper • 2311.01282 • Published • 35
- A Survey on Language Models for Code
  Paper • 2311.07989 • Published • 21
- When Scaling Meets LLM Finetuning: The Effect of Data, Model and Finetuning Method
  Paper • 2402.17193 • Published • 23
- Training Language Models to Self-Correct via Reinforcement Learning
  Paper • 2409.12917 • Published • 131