VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment • Paper • 2410.01679 • Published 14 days ago • 22
GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models • Paper • 2410.05229 • Published 9 days ago • 14
Scaling Smart: Accelerating Large Language Model Pre-training with Small Model Initialization • Paper • 2409.12903 • Published 27 days ago • 21