Commit db7ad2f (1 parent: f557bed), committed by JosephusCheung
Update README.md

README.md CHANGED
@@ -21,11 +21,15 @@ PROMPT FORMAT: [chatml](https://github.com/openai/openai-python/blob/main/chatml
 
 CURRENT MMLU: 53.48
 
+CURRENT CEval (val): 54.13
+
 ```
-stem ACC: 46.40 Humanities ACC: 47.61 other ACC: 61.31 social ACC: 61.78 AVERAGE ACC:53.48
+MMLU - stem ACC: 46.40 Humanities ACC: 47.61 other ACC: 61.31 social ACC: 61.78 AVERAGE ACC:53.48
+
+CEval (val) - STEM acc: 45.28 Social Science acc: 66.19 Humanities acc: 58.76 Other acc: 54.62 Hard acc:28.64 AVERAGE acc:54.13
 ```
 
-Issue: Compared to the original Qwen-7B-Chat scoring 53.90, the
+Issue: Compared to the original Qwen-7B-Chat, which scores 53.90 on MMLU and 54.18 on CEval (val), our scores dropped slightly [-0.42 on MMLU, -0.05 on CEval (val)] due to insufficient realignment.
 
 [Preview version]
 
@@ -42,7 +46,9 @@ PROMPT format: [chatml](https://github.com/openai/openai-python/blob/main/chatml
 Current MMLU: 53.48
 
 ```
-stem ACC: 46.40 Humanities ACC: 47.61 other ACC: 61.31 social ACC: 61.78 AVERAGE ACC:53.48
+MMLU - stem ACC: 46.40 Humanities ACC: 47.61 other ACC: 61.31 social ACC: 61.78 AVERAGE ACC:53.48
+
+CEval (val) - STEM acc: 45.28 Social Science acc: 66.19 Humanities acc: 58.76 Other acc: 54.62 Hard acc:28.64 AVERAGE acc:54.13
 ```
 
-Issue: Compared to the original Qwen-7B-Chat's 53.90
+Issue: Compared to the original Qwen-7B-Chat's MMLU score of 53.90 and CEval (val) score of 54.18, both scores dropped slightly (MMLU -0.42, CEval -0.05) due to insufficient realignment.
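
The score drops quoted in the Issue line are plain differences between the current scores and the original Qwen-7B-Chat baselines. A minimal sketch of that arithmetic follows; the dictionary layout and the `score_deltas` helper are illustrative assumptions, not part of the README or of any evaluation harness. The CEval (val) baseline is taken as 54.18, the value given in the English text and the one consistent with the stated -0.05 drop.

```python
from decimal import Decimal

# Scores quoted in the README diff. "baseline" is the original Qwen-7B-Chat;
# "current" is the model described by this README. The layout is hypothetical.
scores = {
    "MMLU": {"baseline": Decimal("53.90"), "current": Decimal("53.48")},
    "CEval (val)": {"baseline": Decimal("54.18"), "current": Decimal("54.13")},
}


def score_deltas(table):
    """Return current minus baseline for each benchmark (negative = drop)."""
    return {name: s["current"] - s["baseline"] for name, s in table.items()}


deltas = score_deltas(scores)
for name, delta in deltas.items():
    # The "+" format flag prints an explicit sign for both gains and drops.
    print(f"{name}: {delta:+}")
```

`Decimal` is used instead of `float` so that the two-decimal scores subtract exactly, without binary rounding noise.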