Collections

Discover the best community collections!

Collections including paper arxiv:2409.12917

Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published 27 days ago • 131
Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution

Paper • 2409.12191 • Published 28 days ago • 72

Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers

Paper • 2408.06195 • Published Aug 12 • 61
Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published 27 days ago • 131
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters

Paper • 2408.03314 • Published Aug 6 • 33
Self-Reflection in LLM Agents: Effects on Problem-Solving Performance

Paper • 2405.06682 • Published May 5 • 2

Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published 27 days ago • 131

Factuality - Faithfulness - Hallucination

Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published 27 days ago • 131
FactAlign: Long-form Factuality Alignment of Large Language Models

Paper • 2410.01691 • Published 14 days ago • 8
LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations

Paper • 2410.02707 • Published 13 days ago • 44
ECon: On the Detection and Resolution of Evidence Conflicts

Paper • 2410.04068 • Published 12 days ago

Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think

Paper • 2409.11355 • Published 29 days ago • 27
Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published 27 days ago • 131
Beyond Fine-tuning: Unleashing the Potential of Continuous Pretraining for Clinical LLMs

Paper • 2409.14988 • Published 24 days ago • 21

Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers

Paper • 2409.04109 • Published Sep 6 • 42
Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published 27 days ago • 131
Reward-Robust RLHF in LLMs

Paper • 2409.15360 • Published 29 days ago • 5
EuroLLM: Multilingual Language Models for Europe

Paper • 2409.16235 • Published 22 days ago • 23

Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders

Paper • 2408.15998 • Published Aug 28 • 83
General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Paper • 2409.01704 • Published Sep 3 • 80
Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers

Paper • 2408.06195 • Published Aug 12 • 61
Self-Reflection in LLM Agents: Effects on Problem-Solving Performance

Paper • 2405.06682 • Published May 5 • 2

