Quantization made by Richard Erkhov.

[Github](https://github.com/RichardErkhov)

[Discord](https://discord.gg/pvy7H8DZMG)

[Request more models](https://github.com/RichardErkhov/quant_request)

Moza-7B-v1.0 - bnb 8bits
- Model creator: https://huggingface.co/kidyu/
- Original model: https://huggingface.co/kidyu/Moza-7B-v1.0/
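
Because the checkpoint is stored pre-quantized to 8-bit with bitsandbytes, it can be loaded directly with `transformers`. A minimal sketch, assuming `transformers`, `accelerate`, and `bitsandbytes` are installed and a CUDA GPU is available; the repo id below is a placeholder to replace with this repository's actual id:

```python
# Minimal sketch: loading a pre-quantized bitsandbytes 8-bit checkpoint.
# `repo_id` is a placeholder -- substitute this repository's actual id.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "<this-repo>/Moza-7B-v1.0-8bits"  # placeholder id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
# The 8-bit quantization config is stored in the checkpoint's config.json,
# so transformers restores it automatically; device_map="auto" places the
# quantized layers on available GPUs.
model = AutoModelForCausalLM.from_pretrained(repo_id, device_map="auto")
```
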
Original model description:
---
license: apache-2.0
library_name: transformers
tags:
- mergekit
- merge
base_model:
- mistralai/Mistral-7B-v0.1
- cognitivecomputations/dolphin-2.2.1-mistral-7b
- Open-Orca/Mistral-7B-OpenOrca
- openchat/openchat-3.5-0106
- mlabonne/NeuralHermes-2.5-Mistral-7B
- GreenNode/GreenNode-mini-7B-multilingual-v1olet
- berkeley-nest/Starling-LM-7B-alpha
- viethq188/LeoScorpius-7B-Chat-DPO
- meta-math/MetaMath-Mistral-7B
- Intel/neural-chat-7b-v3-3
inference: false
model-index:
- name: Moza-7B-v1.0
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 66.55
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=kidyu/Moza-7B-v1.0
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 83.45
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=kidyu/Moza-7B-v1.0
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 62.77
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=kidyu/Moza-7B-v1.0
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 65.16
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=kidyu/Moza-7B-v1.0
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 77.51
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=kidyu/Moza-7B-v1.0
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 62.55
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=kidyu/Moza-7B-v1.0
      name: Open LLM Leaderboard
---
# Moza-7B-v1.0

![image/png](https://cdn-uploads.huggingface.co/production/uploads/63474d73511cd17d2c790ed7/e7hw2xIzfpUseCFEOINg7.png)

This is a [meme-merge](https://en.wikipedia.org/wiki/Joke) of pre-trained language models,
created using [mergekit](https://github.com/cg123/mergekit).
Use at your own risk.

## Details
### Quantized Model
- [GGUF](https://huggingface.co/kidyu/Moza-7B-v1.0-GGUF)

### Merge Method

This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method,
using [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) as a base.

The value for `density` is taken from [this blogpost](https://huggingface.co/blog/mlabonne/merge-models),
and the weights were randomly generated and then assigned to the models,
with priority (i.e. the larger weights) given to `NeuralHermes`, `OpenOrca`, and `neural-chat`.
The models themselves were chosen by "vibes".
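
For intuition, the DARE step keeps a random `density` fraction of each fine-tune's delta from the base model and rescales the survivors so the expected delta is unchanged; a toy sketch of that idea (not mergekit's actual implementation):

```python
import torch

def dare_sparsify(delta: torch.Tensor, density: float) -> torch.Tensor:
    """Toy DARE step: randomly keep a `density` fraction of the delta
    (fine-tuned weights minus base weights), zero the rest, and rescale
    the survivors by 1/density so the expected delta is preserved."""
    mask = (torch.rand_like(delta) < density).to(delta.dtype)
    return delta * mask / density

# With density 0.63 (the value used in the config below), ~37% of each
# delta tensor is dropped before the weighted TIES combination.
delta = torch.randn(4, 4)
print(dare_sparsify(delta, density=0.63))
```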

### Models Merged

The following models were included in the merge:
* [cognitivecomputations/dolphin-2.2.1-mistral-7b](https://huggingface.co/cognitivecomputations/dolphin-2.2.1-mistral-7b)
* [Open-Orca/Mistral-7B-OpenOrca](https://huggingface.co/Open-Orca/Mistral-7B-OpenOrca)
* [openchat/openchat-3.5-0106](https://huggingface.co/openchat/openchat-3.5-0106)
* [mlabonne/NeuralHermes-2.5-Mistral-7B](https://huggingface.co/mlabonne/NeuralHermes-2.5-Mistral-7B)
* [GreenNode/GreenNode-mini-7B-multilingual-v1olet](https://huggingface.co/GreenNode/GreenNode-mini-7B-multilingual-v1olet)
* [berkeley-nest/Starling-LM-7B-alpha](https://huggingface.co/berkeley-nest/Starling-LM-7B-alpha)
* [viethq188/LeoScorpius-7B-Chat-DPO](https://huggingface.co/viethq188/LeoScorpius-7B-Chat-DPO)
* [meta-math/MetaMath-Mistral-7B](https://huggingface.co/meta-math/MetaMath-Mistral-7B)
* [Intel/neural-chat-7b-v3-3](https://huggingface.co/Intel/neural-chat-7b-v3-3)

### Prompt Format

You can use the `Alpaca` format for inference:

```
### Instruction:

### Response:
```
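
A short usage sketch, reusing the `model` and `tokenizer` from the 8-bit loading example above (the instruction text is just an illustration):

```python
# Build an Alpaca-style prompt and generate a response with the
# model/tokenizer loaded in the 8-bit example above.
prompt = (
    "### Instruction:\n"
    "Explain what a model merge is in one sentence.\n\n"
    "### Response:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```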

### Configuration

The following YAML configuration was used to produce this model:

```yaml
base_model: mistralai/Mistral-7B-v0.1
models:
  - model: mlabonne/NeuralHermes-2.5-Mistral-7B
    parameters:
      density: 0.63
      weight: 0.83
  - model: Intel/neural-chat-7b-v3-3
    parameters:
      density: 0.63
      weight: 0.74
  - model: meta-math/MetaMath-Mistral-7B
    parameters:
      density: 0.63
      weight: 0.22
  - model: openchat/openchat-3.5-0106
    parameters:
      density: 0.63
      weight: 0.37
  - model: Open-Orca/Mistral-7B-OpenOrca
    parameters:
      density: 0.63
      weight: 0.76
  - model: cognitivecomputations/dolphin-2.2.1-mistral-7b
    parameters:
      density: 0.63
      weight: 0.69
  - model: viethq188/LeoScorpius-7B-Chat-DPO
    parameters:
      density: 0.63
      weight: 0.38
  - model: GreenNode/GreenNode-mini-7B-multilingual-v1olet
    parameters:
      density: 0.63
      weight: 0.13
  - model: berkeley-nest/Starling-LM-7B-alpha
    parameters:
      density: 0.63
      weight: 0.33
merge_method: dare_ties
parameters:
  normalize: true
  int8_mask: true
dtype: bfloat16
```
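
To reproduce a merge like this, the YAML above can be saved to a file and passed to mergekit's command-line entry point; a sketch based on mergekit's README (flag names may differ across versions, so check `mergekit-yaml --help`):

```bash
pip install mergekit
# Run the DARE-TIES merge described by the config above; --cuda uses a GPU
# for the tensor arithmetic. The merged model lands in ./Moza-7B-v1.0.
mergekit-yaml moza-config.yml ./Moza-7B-v1.0 --cuda
```
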
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_kidyu__Moza-7B-v1.0)

| Metric                            | Value |
|-----------------------------------|------:|
| Avg.                              | 69.66 |
| AI2 Reasoning Challenge (25-Shot) | 66.55 |
| HellaSwag (10-Shot)               | 83.45 |
| MMLU (5-Shot)                     | 62.77 |
| TruthfulQA (0-shot)               | 65.16 |
| Winogrande (5-shot)               | 77.51 |
| GSM8k (5-shot)                    | 62.55 |