leaderboard-pr-bot committed on
Commit
8c7e334
1 Parent(s): b8bd85e

Adding Evaluation Results

This is an automated PR created with https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr

The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr/discussions

Files changed (1)
  1. README.md +109 -1
README.md CHANGED
@@ -1,5 +1,100 @@
  ---
  license: llama3
  ---
  This is my first MOE model based on Llama-3 70B.
 
@@ -9,4 +104,17 @@ example output
 
  ```
  <|begin_of_text|>write me a story about peppa pig. Peppa Pig was a very busy pig. She had a lot of things to do today. She needed to get ready for a very important event, the annual mud festival. This year, she was going to be the main judge of the muddy puddle contest. She was very excited and couldn't wait to see all the muddy puddles that everyone would bring. But first, she needed to get ready. She took a bath, put on her favorite dress, and did her hair. When she was all ready, she headed downstairs to the kitchen, where her mom was making breakfast. She had made Peppa's favorite, pancakes with strawberries and whipped cream. Peppa was so happy and sat down to eat her breakfast. After she finished eating, she headed out the door to go to the festival. When she arrived, she saw all of her friends waiting for her. There was Suzy Sheep, Danny Dog, and Emily Elephant. They all said hello and Peppa was so happy to see them. The festival was about to start and Peppa took her place at the judges table. The first contestant was Suzy Sheep and she had brought a puddle that was very muddy and squelchy. Peppa loved it and gave it a score of 9 out of 10. Next was Danny Dog and he had brought a puddle that was even bigger and muddier than Suzy's. Peppa was very impressed and gave it a score of 9.5 out of 10. Then it was Emily Elephant's turn and she had brought a puddle that was so muddy, it looked like a swamp. Peppa was amazed and gave it a score of 10 out of 10. After all the contestants had presented their puddles, Peppa had to make a very difficult decision. She had to choose the winner of the muddy puddle contest. She thought for a long time and finally, she made her decision. The winner was Emily Elephant. Everyone cheered and clapped for Emily and Peppa gave her a big hug and a trophy. The festival was a huge success and Peppa was so happy to have been a part of it. She couldn't wait to do it all again next year. The end.<|eot_id|> [end of text]
- ```
  ---
  license: llama3
+ model-index:
+ - name: Llama-3-70Bx2-MOE
+   results:
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: IFEval (0-Shot)
+       type: HuggingFaceH4/ifeval
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: inst_level_strict_acc and prompt_level_strict_acc
+       value: 54.82
+       name: strict accuracy
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=cloudyu/Llama-3-70Bx2-MOE
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: BBH (3-Shot)
+       type: BBH
+       args:
+         num_few_shot: 3
+     metrics:
+     - type: acc_norm
+       value: 51.42
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=cloudyu/Llama-3-70Bx2-MOE
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MATH Lvl 5 (4-Shot)
+       type: hendrycks/competition_math
+       args:
+         num_few_shot: 4
+     metrics:
+     - type: exact_match
+       value: 19.86
+       name: exact match
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=cloudyu/Llama-3-70Bx2-MOE
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: GPQA (0-shot)
+       type: Idavidrein/gpqa
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: acc_norm
+       value: 19.13
+       name: acc_norm
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=cloudyu/Llama-3-70Bx2-MOE
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MuSR (0-shot)
+       type: TAUR-Lab/MuSR
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: acc_norm
+       value: 20.85
+       name: acc_norm
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=cloudyu/Llama-3-70Bx2-MOE
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MMLU-PRO (5-shot)
+       type: TIGER-Lab/MMLU-Pro
+       config: main
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 46.02
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=cloudyu/Llama-3-70Bx2-MOE
+       name: Open LLM Leaderboard
  ---
  This is my first MOE model based on Llama-3 70B.
 
 
 
  ```
  <|begin_of_text|>write me a story about peppa pig. Peppa Pig was a very busy pig. She had a lot of things to do today. She needed to get ready for a very important event, the annual mud festival. This year, she was going to be the main judge of the muddy puddle contest. She was very excited and couldn't wait to see all the muddy puddles that everyone would bring. But first, she needed to get ready. She took a bath, put on her favorite dress, and did her hair. When she was all ready, she headed downstairs to the kitchen, where her mom was making breakfast. She had made Peppa's favorite, pancakes with strawberries and whipped cream. Peppa was so happy and sat down to eat her breakfast. After she finished eating, she headed out the door to go to the festival. When she arrived, she saw all of her friends waiting for her. There was Suzy Sheep, Danny Dog, and Emily Elephant. They all said hello and Peppa was so happy to see them. The festival was about to start and Peppa took her place at the judges table. The first contestant was Suzy Sheep and she had brought a puddle that was very muddy and squelchy. Peppa loved it and gave it a score of 9 out of 10. Next was Danny Dog and he had brought a puddle that was even bigger and muddier than Suzy's. Peppa was very impressed and gave it a score of 9.5 out of 10. Then it was Emily Elephant's turn and she had brought a puddle that was so muddy, it looked like a swamp. Peppa was amazed and gave it a score of 10 out of 10. After all the contestants had presented their puddles, Peppa had to make a very difficult decision. She had to choose the winner of the muddy puddle contest. She thought for a long time and finally, she made her decision. The winner was Emily Elephant. Everyone cheered and clapped for Emily and Peppa gave her a big hug and a trophy. The festival was a huge success and Peppa was so happy to have been a part of it. She couldn't wait to do it all again next year. The end.<|eot_id|> [end of text]
+ ```
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_cloudyu__Llama-3-70Bx2-MOE)
+
+ |      Metric       |Value|
+ |-------------------|----:|
+ |Avg.               |35.35|
+ |IFEval (0-Shot)    |54.82|
+ |BBH (3-Shot)       |51.42|
+ |MATH Lvl 5 (4-Shot)|19.86|
+ |GPQA (0-shot)      |19.13|
+ |MuSR (0-shot)      |20.85|
+ |MMLU-PRO (5-shot)  |46.02|
+
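
The `Avg.` row in the results table appears to be the unweighted mean of the six benchmark scores. The PR does not state how the average is computed, so equal weighting is an assumption here, and the `leaderboard_average` helper is a hypothetical name used only for illustration. A minimal Python sketch that checks the arithmetic against the values in the table:

```python
from statistics import mean

# Scores copied from the results table added by this PR.
scores = {
    "IFEval (0-Shot)": 54.82,
    "BBH (3-Shot)": 51.42,
    "MATH Lvl 5 (4-Shot)": 19.86,
    "GPQA (0-shot)": 19.13,
    "MuSR (0-shot)": 20.85,
    "MMLU-PRO (5-shot)": 46.02,
}


def leaderboard_average(values):
    """Unweighted mean rounded to two decimals (assumed aggregation rule)."""
    return round(mean(list(values)), 2)


average = leaderboard_average(scores.values())
print(f"Avg. = {average}")
```

Under that assumption the computed value matches the 35.35 reported in the table.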