thomas-yanxin committed on
Commit 8989b1b • 1 Parent(s): 6cee1b1
Update README.md
README.md
CHANGED
@@ -18,9 +18,9 @@ Here are the results from our OpenCompass evaluation:
 | English | MMLU | 73.72 |
 | | MMLU-Pro | / |
 | | Theorem QA | / |
-| | GPQA |
+| | GPQA | 33.04 |
 | | BBH | 67.55 |
-| | IFEval (Prompt Strict-Acc.) |
+| | IFEval (Prompt Strict-Acc.) | 40.48 |
 | | ARC-C | 91.19 |
 | Math | GSM8K | 82.94 |
 | | MATH | 41.06 |