---
license: apache-2.0
datasets:
- MathGenie/MathCode-Pile
language:
- en
metrics:
- accuracy
base_model:
- mistralai/Mistral-7B-v0.1
pipeline_tag: text-generation
tags:
- math
---

# MathCoder2

### Introduction

The MathCoder2 models are created by continued pretraining on [MathCode-Pile](https://huggingface.co/datasets/MathGenie/MathCode-Pile). They are introduced in the paper [MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code](https://arxiv.org/abs/2410.08196). The pretraining dataset pairs mathematical code with the natural-language reasoning steps it was translated from, making it a strong resource for models aimed at advanced mathematical reasoning tasks.

### Evaluation

![image/png](https://cdn-uploads.huggingface.co/production/uploads/65dd9e7b4a4fce1ec96dc6b7/BEZoDZLjp-fPFlt7oFXBa.png)

### Citation

If you find this repository helpful, please consider citing our papers:

```
@misc{lu2024mathcoder2bettermathreasoning,
      title={MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code},
      author={Zimu Lu and Aojun Zhou and Ke Wang and Houxing Ren and Weikang Shi and Junting Pan and Mingjie Zhan and Hongsheng Li},
      year={2024},
      eprint={2410.08196},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2410.08196},
}
```

```
@inproceedings{wang2024mathcoder,
  title={MathCoder: Seamless Code Integration in {LLM}s for Enhanced Mathematical Reasoning},
  author={Ke Wang and Houxing Ren and Aojun Zhou and Zimu Lu and Sichun Luo and Weikang Shi and Renrui Zhang and Linqi Song and Mingjie Zhan and Hongsheng Li},
  booktitle={The Twelfth International Conference on Learning Representations},
  year={2024},
  url={https://openreview.net/forum?id=z8TW0ttBPp}
}
```
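
### Usage

Since MathCoder2 is a base model produced by continued pretraining, it can be loaded and prompted through the standard Hugging Face Transformers API. Below is a minimal inference sketch; the model ID `MathGenie/MathCoder2-Mistral-7B` is an assumption based on this organization's naming and the Mistral-7B base model listed above, so substitute the checkpoint you actually intend to use.

```python
# Minimal inference sketch for a MathCoder2 checkpoint.
# NOTE: the model ID below is an assumption, not confirmed by this card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "MathGenie/MathCoder2-Mistral-7B"  # assumed repository ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision so the 7B model fits on one GPU
    device_map="auto",
)

# Base models respond best to plain completion-style prompts.
prompt = "Question: What is the sum of the first 100 positive integers?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```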