JosephusCheung/Qwen-LLaMAfied-7B-Chat


JosephusCheung committed on Aug 11, 2023

Commit db7ad2f • 1 Parent(s): f557bed

Update README.md

Files changed (1)

README.md +10 -4

@@ -21,11 +21,15 @@ PROMPT FORMAT: [chatml](https://github.com/openai/openai-python/blob/main/chatml
 CURRENT MMLU: 53.48
 ```
-stem ACC: 46.40 Humanities ACC: 47.61 other ACC: 61.31 social ACC: 61.78 AVERAGE ACC:53.48
 ```
-Issue: Compared to the original Qwen-7B-Chat scoring 53.90, the MMLU score dropped slightly (-0.42) due to insufficient realignment.
 [Preview version]
@@ -42,7 +46,9 @@ PROMPT format: [chatml](https://github.com/openai/openai-python/blob/main/chatml
 Current MMLU: 53.48
 ```
-stem ACC: 46.40 Humanities ACC: 47.61 other ACC: 61.31 social ACC: 61.78 AVERAGE ACC:53.48
 ```
-Issue: Compared to the original Qwen-7B-Chat's 53.90, the MMLU score dropped slightly (-0.42) due to insufficient realignment.

 CURRENT MMLU: 53.48
+CURRENT CEval (val): 54.13
 ```
+MMLU - stem ACC: 46.40 Humanities ACC: 47.61 other ACC: 61.31 social ACC: 61.78 AVERAGE ACC:53.48
+CEval (val) - STEM acc: 45.28 Social Science acc: 66.19 Humanities acc: 58.76 Other acc: 54.62 Hard acc:28.64 AVERAGE acc:54.13
 ```
+Issue: Compared to the original Qwen-7B-Chat scoring 53.90 in MMLU and 54.18 in CEval (val), our scores dropped slightly [-0.42 in MMLU, -0.05 in CEval (val)] due to insufficient realignment.
 [Preview version]
 Current MMLU: 53.48
 ```
+MMLU - stem ACC: 46.40 Humanities ACC: 47.61 other ACC: 61.31 social ACC: 61.78 AVERAGE ACC:53.48
+CEval (val) - STEM acc: 45.28 Social Science acc: 66.19 Humanities acc: 58.76 Other acc: 54.62 Hard acc:28.64 AVERAGE acc:54.13
 ```
+Issue: Compared to the original Qwen-7B-Chat's MMLU score of 53.90 and CEval (val) score of 54.18, both scores dropped slightly (MMLU -0.42, CEval -0.05) due to insufficient realignment.
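
The deltas quoted in the Issue line can be checked with simple arithmetic. The sketch below is only an illustration: the dictionaries and the `score_deltas` helper are hypothetical names, and the numbers are copied from the README lines in this diff. It also computes the unweighted mean of the four MMLU category scores, which does not match the reported 53.48; the reported average is presumably weighted by the number of questions per category, but the README does not state this, so treat it as an assumption.

```python
# Scores quoted in the README diff (percent accuracy).
ours = {"MMLU": 53.48, "CEval (val)": 54.13}
# Original Qwen-7B-Chat scores as stated in the Issue line.
baseline = {"MMLU": 53.90, "CEval (val)": 54.18}


def score_deltas(ours, baseline):
    """Return ours - baseline per benchmark, rounded to two decimals."""
    return {name: round(ours[name] - baseline[name], 2) for name in ours}


deltas = score_deltas(ours, baseline)
for name, delta in deltas.items():
    print(f"{name}: {delta:+.2f}")

# Unweighted (macro) mean of the MMLU category scores from the README.
# It differs from the reported AVERAGE ACC of 53.48, which suggests the
# reported figure is weighted by question count (an assumption, not
# something the README confirms).
mmlu_categories = {"stem": 46.40, "Humanities": 47.61, "other": 61.31, "social": 61.78}
macro_mean = sum(mmlu_categories.values()) / len(mmlu_categories)
print(f"MMLU unweighted category mean: {macro_mean:.3f}")
```

The first two printed lines reproduce the -0.42 and -0.05 differences given in the README.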