license: other
language:
  - zh
  - en
datasets:
  - thomas-yanxin/MT-SFT-ShareGPT

The main purpose of this model is to validate the usability of the thomas-yanxin/MT-SFT-ShareGPT dataset, that is, to test the idea that data quality is all you need. We found that when the data is meticulously curated through a better data governance approach, the resulting model can improve substantially, even when trained with SFT alone.

Here are the results from our OpenCompass evaluation:

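| Classification | Benchmark | XinYuan-Qwen2-7B |
| --- | --- | --- |
| English | MMLU | 73.72 |
| English | MMLU-Pro | / |
| English | Theorem QA | / |
| English | GPQA | 33.04 |
| English | BBH | 67.55 |
| English | IFEval (Prompt Strict-Acc.) | 40.48 |
| English | ARC-C | 91.19 |
| Math | GSM8K | 82.94 |
| Math | MATH | 41.06 |
| Chinese | C-EVAL | 81.02 |
| Chinese | CMMLU | 80.06 |
| Code | MBPP | 50.6 |
| Code | HumanEval | 83.99 |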
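
For a quick per-category view of the scores above, the following is a minimal sketch that averages the reported numbers within each classification. The unweighted mean is an assumption made here for illustration only; it is not a metric reported by OpenCompass or by the authors, and the benchmarks shown as "/" are treated as missing and skipped. The names `SCORES` and `category_means` are hypothetical.

```python
from statistics import mean

# Scores copied from the results above. None marks benchmarks listed as "/"
# (no score given); they are excluded from the averages.
SCORES = {
    "English": {
        "MMLU": 73.72,
        "MMLU-Pro": None,
        "Theorem QA": None,
        "GPQA": 33.04,
        "BBH": 67.55,
        "IFEval (Prompt Strict-Acc.)": 40.48,
        "ARC-C": 91.19,
    },
    "Math": {"GSM8K": 82.94, "MATH": 41.06},
    "Chinese": {"C-EVAL": 81.02, "CMMLU": 80.06},
    "Code": {"MBPP": 50.6, "HumanEval": 83.99},
}


def category_means(scores):
    """Return the unweighted mean of the reported scores in each category.

    Categories with no reported score map to None.
    """
    result = {}
    for category, benchmarks in scores.items():
        reported = [v for v in benchmarks.values() if v is not None]
        result[category] = mean(reported) if reported else None
    return result


if __name__ == "__main__":
    for category, avg in category_means(SCORES).items():
        if avg is not None:
            print(f"{category}: {avg:.2f}")
```

Because the benchmarks differ in difficulty and scoring, these averages are only a rough summary; the per-benchmark numbers remain the authoritative results.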