TinyGSM: achieving >80% on GSM8k with small language models Paper • 2312.09241 • Published Dec 14, 2023
The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants Paper • 2308.16884 • Published Aug 31, 2023
Scaling Relationship on Learning Mathematical Reasoning with Large Language Models Paper • 2308.01825 • Published Aug 3, 2023