P0x0 committed on
Commit 7b6dea1
1 Parent(s): ddf011f

Update README.md

Files changed (1): README.md (+72 −39)
README.md CHANGED
@@ -56,42 +56,75 @@ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 I encourage you to provide feedback on the model's performance. If you'd like to create your own quantizations, feel free to do so and let me know how it works for you!
 
 model-index:
-- name: P0x0/Astra-v1-12B
-  results:
-  - task:
-      type: text-generation
-    dataset:
-      type: benchmark
-      name: AI2 Reasoning Challenge (25-Shot)
-    metrics:
-    - name: Average
-      type: Average
-      value: 19.46
-      verified: false
-    - name: IFEval
-      type: IFEval
-      value: 28.06
-      verified: false
-    - name: BBH
-      type: BBH
-      value: 31.81
-      verified: false
-    - name: MATH Lvl 5
-      type: MATH Lvl 5
-      value: 9.67
-      verified: false
-    - name: GPQA
-      type: GPQA
-      value: 8.5
-      verified: false
-    - name: MUSR
-      type: MUSR
-      value: 11.38
-      verified: false
-    - name: MMLU-PRO
-      type: MMLU-PRO
-      value: 27.34
-      verified: false
-    source:
-      name: Open LLM Leaderboard
-      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard
+- name: P0x0/Astra-v1-12B
+  results:
+  - task:
+      type: text-generation
+    dataset:
+      type: Aggregate
+      name: Average
+    metrics:
+    - name: Average
+      type: Average
+      value: 19.46
+      verified: false
+  - task:
+      type: text-generation
+    dataset:
+      type: IFEval
+      name: IFEval
+    metrics:
+    - name: Score
+      type: IFEval
+      value: 28.06
+      verified: false
+  - task:
+      type: text-generation
+    dataset:
+      type: BBH
+      name: BBH
+    metrics:
+    - name: Score
+      type: BBH
+      value: 31.81
+      verified: false
+  - task:
+      type: text-generation
+    dataset:
+      type: MATH Lvl 5
+      name: MATH Lvl 5
+    metrics:
+    - name: Score
+      type: MATH Lvl 5
+      value: 9.67
+      verified: false
+  - task:
+      type: text-generation
+    dataset:
+      type: GPQA
+      name: GPQA
+    metrics:
+    - name: Score
+      type: GPQA
+      value: 8.5
+      verified: false
+  - task:
+      type: text-generation
+    dataset:
+      type: MUSR
+      name: MUSR
+    metrics:
+    - name: Score
+      type: MUSR
+      value: 11.38
+      verified: false
+  - task:
+      type: text-generation
+    dataset:
+      type: MMLU-PRO
+      name: MMLU-PRO
+    metrics:
+    - name: Score
+      type: MMLU-PRO
+      value: 27.34
+      verified: false
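The change above restructures the card's `model-index` metadata so that each benchmark gets its own `task`/`dataset`/`metrics` entry instead of one flat metrics list. A minimal sketch of how the new layout parses, assuming PyYAML is available (only the first two benchmark entries are reproduced here for brevity; the exact indentation shown is a reconstruction, since the diff viewer flattened it):

```python
# Sketch: parse the restructured model-index metadata and walk its entries.
# Assumes PyYAML (pip install pyyaml); YAML abbreviated to two results.
import yaml

CARD_METADATA = """
model-index:
- name: P0x0/Astra-v1-12B
  results:
  - task:
      type: text-generation
    dataset:
      type: Aggregate
      name: Average
    metrics:
    - name: Average
      type: Average
      value: 19.46
      verified: false
  - task:
      type: text-generation
    dataset:
      type: IFEval
      name: IFEval
    metrics:
    - name: Score
      type: IFEval
      value: 28.06
      verified: false
"""

meta = yaml.safe_load(CARD_METADATA)
entry = meta["model-index"][0]
print(entry["name"])  # P0x0/Astra-v1-12B
for result in entry["results"]:
    # After the change, every result carries its own dataset and metric,
    # rather than all scores sharing a single dataset block.
    print(result["dataset"]["name"], result["metrics"][0]["value"])
```

This one-result-per-benchmark shape is what leaderboard tooling typically expects, since each score can then be attributed to a distinct dataset.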