LiyuanLucasLiu committed on
Commit
234a12a
1 Parent(s): e87098c

typo fixed

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -77,7 +77,7 @@ Note a different version of mid-training and post-training, emphasizing long con
  | HellaSwag | 83.7 | 83.8 | 70.4 | 79.0 | 71.1 | 82.6 | 78.8 | 91.7 |
  | ANLI | 60.6 | 59.8 | 55.2 | 65.2 | 57.3 | 68.3 | 58.1 | 75.7 |
  | GSM-8K | 90.4 | 88.7 | 64.7 | 83.8 | 77.4 | 93.5 | 78.1 | 93.8 |
- | MedQA | 70.4 | 62.2 | 67.9 | 60.5 | 78.5 | 63.4 | 88.9 |
+ | MedQA | 70.4 | 70.5 | 62.2 | 67.9 | 60.5 | 78.5 | 63.4 | 88.9 |
  | AGIEval | 48.2 | 50.3 | 45.2 | 54.0 | 42.0 | 56.9 | 48.4 | 37.6 |
  | TriviaQA | 73.9 | 71.6 | 78.5 | 82.2 | 67.7 | 84.5 | 85.8 | 66.0 |
  | Arc-C | 92.0 | 91.0 | 87.3 | 91.3 | 82.8 | 93.0 | 87.4 | 97.0 |
@@ -87,7 +87,7 @@ Note a different version of mid-training and post-training, emphasizing long con
  | BigBench-Hard | 81.4 | 79.1 | 69.7 | 81.8 | 51.5 | 80.2 | 68.3 | 81.2 |
  | WinoGrande | 81.4 | 81.3 | 62.0 | 75.3 | 65.0 | 83.3 | 68.8 | 89.3 |
  | OpenBookQA | 89.8 | 89.6 | 85.8 | 88.6 | 82.6 | 91.8 | 86.0 | 95.2 |
- | BoolQ | 83.4 | 84.5 | 77.6 | 82.7 | 80.9 | 89.1 | 79.1 | 90.6 |
+ | BoolQ | 83.4 | 84.6 | 77.6 | 82.7 | 80.9 | 89.1 | 79.1 | 90.6 |
  | CommonSenseQA | 81.8 | 83.5 | 78.1 | 82.0 | 79.0 | 84.4 | 79.6 | 88.5 |
  | TruthfulQA | 74.5 | 77.5 | 60.1 | 67.4 | 63.2 | 81.9 | 85.8 | 85.6 |
  | HumanEval | 74.4 | 70.7 | 37.8 | 39.6 | 60.4 | 78.7 | 62.2 | 92.1 |