Wojx committed on
Commit
c67e207
1 Parent(s): 05d6ae3

Update README.md


Add MMLU benchmark results

Files changed (1)
  1. README.md +5 -5
README.md CHANGED
@@ -52,11 +52,11 @@ To get the expected features and performance for the chat versions, a specific L
  # Evaluation Results
  |Model | Size| hellaswag | arc_challenge | MMLU|
  |---|---|---|---|---|
- | Llama-2-chat | 7B | 78.55% | 52.9% | |
- | Llama-2-chat | 13B | 81.94% | 59.04% | |
- | Trurl 2.0 (with MMLU) | 13B | 80.09% | 59.30% |
- | Trurl 2.0 (no MMLU) | 13B | TO-DO | TO-DO | |
- | Trurl 2.0 | 7b | TO-DO | TO-DO |
+ | Llama-2-chat | 7B | 78.55% | 52.9% | 48.32% |
+ | Llama-2-chat | 13B | 81.94% | 59.04% | 54.64% |
+ | Trurl 2.0 (with MMLU) | 13B | 80.09% | 59.30% | 78.35% |
+ | Trurl 2.0 (no MMLU) | 13B | TO-DO | TO-DO | TO-DO|
+ | Trurl 2.0 | 7b | TO-DO | TO-DO | TO-DO|

  <img src="https://voicelab.ai/wp-content/uploads/trurl-hero.webp" alt="trurl graphic" style="width:100px;"/>