JosephusCheung committed on
Commit
db7ad2f
1 Parent(s): f557bed

Update README.md

Files changed (1)
  1. README.md +10 -4
README.md CHANGED
@@ -21,11 +21,15 @@ PROMPT FORMAT: [chatml](https://github.com/openai/openai-python/blob/main/chatml

  CURRENT MMLU: 53.48

  ```
- stem ACC: 46.40 Humanities ACC: 47.61 other ACC: 61.31 social ACC: 61.78 AVERAGE ACC:53.48
  ```

- Issue: Compared to the original Qwen-7B-Chat scoring 53.90, the MMLU score dropped slightly (-0.42) due to insufficient realignment.

  [Preview version]

@@ -42,7 +46,9 @@ PROMPT FORMAT: [chatml](https://github.com/openai/openai-python/blob/main/chatml
  CURRENT MMLU: 53.48

  ```
- stem ACC: 46.40 Humanities ACC: 47.61 other ACC: 61.31 social ACC: 61.78 AVERAGE ACC:53.48
  ```

- Issue: Compared to the original Qwen-7B-Chat scoring 53.90, the MMLU score dropped slightly (-0.42) due to insufficient realignment.
 

  CURRENT MMLU: 53.48

+ CURRENT CEval (val): 54.13
+
  ```
+ MMLU - stem ACC: 46.40 Humanities ACC: 47.61 other ACC: 61.31 social ACC: 61.78 AVERAGE ACC:53.48
+
+ CEval (val) - STEM acc: 45.28 Social Science acc: 66.19 Humanities acc: 58.76 Other acc: 54.62 Hard acc:28.64 AVERAGE acc:54.13
  ```

+ Issue: Compared to the original Qwen-7B-Chat scoring 53.90 in MMLU and 54.18 in CEval (val), our scores dropped slightly [-0.42 in MMLU, -0.05 in CEval (val)] due to insufficient realignment.

  [Preview version]

 
  CURRENT MMLU: 53.48

  ```
+ MMLU - stem ACC: 46.40 Humanities ACC: 47.61 other ACC: 61.31 social ACC: 61.78 AVERAGE ACC:53.48
+
+ CEval (val) - STEM acc: 45.28 Social Science acc: 66.19 Humanities acc: 58.76 Other acc: 54.62 Hard acc:28.64 AVERAGE acc:54.13
  ```

+ Issue: Compared to the original Qwen-7B-Chat scoring 53.90 in MMLU and 54.18 in CEval (val), both scores dropped slightly (MMLU -0.42, CEval -0.05) due to insufficient realignment.
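
The deltas quoted in the Issue line can be re-derived from the reported averages. The snippet below is only a minimal sketch of that arithmetic. The scores are copied from the README text, while the dictionary layout and the names `REPORTED` and `score_deltas` are illustrative assumptions, not part of the repository.

```python
# Average scores copied from the README text above.
# The structure and key names are hypothetical, chosen for illustration.
REPORTED = {
    "MMLU": {"this_model": 53.48, "qwen_7b_chat": 53.90},
    "CEval (val)": {"this_model": 54.13, "qwen_7b_chat": 54.18},
}


def score_deltas(reported):
    """Return this model's score minus the baseline score, rounded to 2 decimals."""
    return {
        bench: round(scores["this_model"] - scores["qwen_7b_chat"], 2)
        for bench, scores in reported.items()
    }


deltas = score_deltas(REPORTED)
for bench, delta in deltas.items():
    # The explicit sign makes a drop relative to the baseline easy to spot.
    print(f"{bench}: {delta:+.2f}")
```

A baseline of 54.18 on CEval (val) is what makes the quoted -0.05 consistent with the reported 54.13.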