---
license: llama3
model-index:
- name: Llama-3-70Bx2-MOE
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 54.82
      name: strict accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=cloudyu/Llama-3-70Bx2-MOE
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 51.42
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=cloudyu/Llama-3-70Bx2-MOE
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 19.86
      name: exact match
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=cloudyu/Llama-3-70Bx2-MOE
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 19.13
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=cloudyu/Llama-3-70Bx2-MOE
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 20.85
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=cloudyu/Llama-3-70Bx2-MOE
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 46.02
      name: accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=cloudyu/Llama-3-70Bx2-MOE
      name: Open LLM Leaderboard
---
# Llama-3-70Bx2-MOE

This is my first MoE (Mixture of Experts) model based on Llama-3 70B.
## Example output

```
<|begin_of_text|>write me a story about peppa pig. Peppa Pig was a very busy pig. She had a lot of things to do today. She needed to get ready for a very important event, the annual mud festival. This year, she was going to be the main judge of the muddy puddle contest. She was very excited and couldn't wait to see all the muddy puddles that everyone would bring. But first, she needed to get ready. She took a bath, put on her favorite dress, and did her hair. When she was all ready, she headed downstairs to the kitchen, where her mom was making breakfast. She had made Peppa's favorite, pancakes with strawberries and whipped cream. Peppa was so happy and sat down to eat her breakfast. After she finished eating, she headed out the door to go to the festival. When she arrived, she saw all of her friends waiting for her. There was Suzy Sheep, Danny Dog, and Emily Elephant. They all said hello and Peppa was so happy to see them. The festival was about to start and Peppa took her place at the judges table. The first contestant was Suzy Sheep and she had brought a puddle that was very muddy and squelchy. Peppa loved it and gave it a score of 9 out of 10. Next was Danny Dog and he had brought a puddle that was even bigger and muddier than Suzy's. Peppa was very impressed and gave it a score of 9.5 out of 10. Then it was Emily Elephant's turn and she had brought a puddle that was so muddy, it looked like a swamp. Peppa was amazed and gave it a score of 10 out of 10. After all the contestants had presented their puddles, Peppa had to make a very difficult decision. She had to choose the winner of the muddy puddle contest. She thought for a long time and finally, she made her decision. The winner was Emily Elephant. Everyone cheered and clapped for Emily and Peppa gave her a big hug and a trophy. The festival was a huge success and Peppa was so happy to have been a part of it. She couldn't wait to do it all again next year. The end.<|eot_id|> [end of text]
```
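The example above uses raw-completion prompting: the prompt is framed with Llama-3's `<|begin_of_text|>` token and generation is cut at `<|eot_id|>`. A minimal sketch of that framing as plain string handling (the helper names here are illustrative, not part of any model API):

```python
# Llama-3 special tokens visible in the example output above.
BOS = "<|begin_of_text|>"
EOT = "<|eot_id|>"


def frame_prompt(prompt: str) -> str:
    """Prefix a raw-completion prompt with the begin-of-text token."""
    return BOS + prompt


def strip_completion(raw: str) -> str:
    """Truncate model output at the end-of-turn token, if present."""
    end = raw.find(EOT)
    return raw if end == -1 else raw[:end]
```

In practice a tokenizer's chat template handles this framing automatically; the sketch only makes explicit what the raw text in the example shows.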
## Open LLM Leaderboard Evaluation Results
Detailed results can be found here
| Metric              | Value |
|---------------------|-------|
| Avg.                | 35.35 |
| IFEval (0-Shot)     | 54.82 |
| BBH (3-Shot)        | 51.42 |
| MATH Lvl 5 (4-Shot) | 19.86 |
| GPQA (0-shot)       | 19.13 |
| MuSR (0-shot)       | 20.85 |
| MMLU-PRO (5-shot)   | 46.02 |
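The reported average is the unweighted mean of the six benchmark scores, which can be checked directly:

```python
# Leaderboard scores from the table above.
scores = {
    "IFEval (0-Shot)": 54.82,
    "BBH (3-Shot)": 51.42,
    "MATH Lvl 5 (4-Shot)": 19.86,
    "GPQA (0-shot)": 19.13,
    "MuSR (0-shot)": 20.85,
    "MMLU-PRO (5-shot)": 46.02,
}

# Unweighted mean, rounded to two decimals as on the leaderboard.
avg = round(sum(scores.values()) / len(scores), 2)
print(avg)  # 35.35, matching the reported Avg.
```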