thomas-yanxin
/

XinYuan-Qwen2.5-7B-0917

Model card Files Files and versions Community

XinYuan-Qwen2.5-7B-0917 / README.md

thomas-yanxin's picture

Update README.md

8989b1b verified about 2 months ago

|

history blame contribute delete

1.23 kB

	---
	license: other
	language:
	- zh
	- en
	datasets:
	- thomas-yanxin/MT-SFT-ShareGPT
	---


	The main purpose of this model is to validate the usability of [thomas-yanxin/MT-SFT-ShareGPT](https://huggingface.co/datasets/thomas-yanxin/MT-SFT-ShareGPT), i.e., the quality of the data is all you need. We found that when we meticulously extract the data through a better data governance approach, the corresponding model results can be vastly improved, even if only through SFT.

	Here are the results from our OpenCompass evaluation：

	\| Classification \| Benchmarks \| Models \|
	\| :------------: \| :--------: \| :--------: \|
	\| \| 名称 \| XinYuan-Qwen2-7B \|
	\| English \| MMLU \| 73.72 \|
	\| \| MMLU-Pro \| / \|
	\| \| Theorem QA \| / \|
	\| \| GPQA \| 33.04 \|
	\| \| BBH \| 67.55 \|
	\| \| IFEval (Prompt Strict-Acc.) \| 40.48 \|
	\| \| ARC-C \| 91.19 \|
	\| Math \| GSM8K \| 82.94 \|
	\| \| MATH \| 41.06 \|
	\| Chinese \| C-EVAL \| 81.02 \|
	\| \| CMMLU \| 80.06 \|
	\| Code \| MBPP \| 50.6 \|
	\| \| HumanEval \| 83.99 \|