To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning Paper • 2409.12183 • Published Sep 18 • 36
Chain of Thought Empowers Transformers to Solve Inherently Serial Problems Paper • 2402.12875 • Published Feb 20 • 13
TPI-LLM: Serving 70B-scale LLMs Efficiently on Low-resource Edge Devices Paper • 2410.00531 • Published Oct 1 • 29
ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs Paper • 2410.12405 • Published Oct 16 • 13
What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective Paper • 2410.23743 • Published Oct 31 • 59
BitStack: Fine-Grained Size Control for Compressed Large Language Models in Variable Memory Environments Paper • 2410.23918 • Published Oct 31 • 18
ATM: Improving Model Merging by Alternating Tuning and Merging Paper • 2411.03055 • Published Nov 5 • 1
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models Paper • 2411.04905 • Published Nov 7 • 108
Thanos: Enhancing Conversational Agents with Skill-of-Mind-Infused Large Language Model Paper • 2411.04496 • Published Nov 7 • 22
Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding Paper • 2411.04282 • Published Nov 6 • 30