Update README.md
README.md (CHANGED)
@@ -1,7 +1,8 @@
 ---
+base_model:
+- Undi95/Meta-Llama-3-8B-Instruct-hf
 language:
 - en
-- ko
 pipeline_tag: text-generation
 tags:
 - mergekit
@@ -188,10 +189,14 @@ extra_gated_fields:
 extra_gated_description: The information you provide will be collected, stored, processed and shared in accordance with the [Meta Privacy Policy](https://www.facebook.com/privacy/policy/).
 extra_gated_button_content: Submit
 ---
-#
+# Meta-Llama-3-11.5B-Instruct
 
 This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
 
+I had this idea at night that it would make sense to make a frankenmerge of Llama 3, since we didn't get 13B or 34B versions this time.
+
+Here's the same thing but for the base model: [mpasila/Meta-Llama-3-11.5B](https://huggingface.co/mpasila/Meta-Llama-3-11.5B/)
+
 ## Merge Details
 ### Merge Method
 
@@ -200,8 +205,7 @@ This model was merged using the passthrough merge method.
 ### Models Merged
 
 The following models were included in the merge:
-* [
-* [cognitivecomputations/dolphin-2.9-llama3-8b](https://huggingface.co/cognitivecomputations/dolphin-2.9-llama3-8b)
+* [Undi95/Meta-Llama-3-8B-Instruct-hf](https://huggingface.co/Undi95/Meta-Llama-3-8B-Instruct-hf)
 
 ### Configuration
 
@@ -210,14 +214,11 @@ The following YAML configuration was used to produce this model:
 ```yaml
 slices:
 - sources:
-
-
+  - model: Undi95/Meta-Llama-3-8B-Instruct-hf
+    layer_range: [0, 24]
 - sources:
-
-
-
-
-
-
-
-```
+  - model: Undi95/Meta-Llama-3-8B-Instruct-hf
+    layer_range: [8, 32]
+merge_method: passthrough
+dtype: bfloat16
+```
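
The arithmetic behind the name: Llama 3 8B has 32 decoder layers, and the passthrough merge stacks layers 0-23 (`layer_range: [0, 24]`, end-exclusive) on top of layers 8-31 (`layer_range: [8, 32]`), so layers 8-23 appear twice and the merged model ends up with 48 layers. A back-of-the-envelope sketch of why that lands near 11.5B parameters, using approximate Llama 3 8B figures of my own rather than anything from the card:

```python
# Rough sanity check (my own sketch, not from the card) of the layer math.
# Assumes approximate Llama 3 8B figures: ~218M parameters per decoder layer
# and ~1.05B for the embedding table plus the untied LM head (vocab 128256).

slice_a = range(0, 24)   # layer_range: [0, 24] -> layers 0..23
slice_b = range(8, 32)   # layer_range: [8, 32] -> layers 8..31

merged_layers = len(slice_a) + len(slice_b)
print(merged_layers)                      # 48 (layers 8..23 are duplicated)

per_layer = 0.218e9       # approx. params per Llama 3 8B decoder layer
embed_and_head = 1.05e9   # approx. embeddings + LM head
total = merged_layers * per_layer + embed_and_head
print(f"~{total / 1e9:.1f}B parameters")  # ~11.5B
```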
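
To reproduce the merge, something like the following should work. This is a sketch I'm adding, not part of the original card; it assumes mergekit is installed (`pip install mergekit`) and drives mergekit's `mergekit-yaml` entry point with the exact config from the diff above:

```python
# Minimal reproduction sketch (assumes `pip install mergekit` and enough
# RAM/disk to materialize the merged weights). mergekit-yaml is invoked as:
# mergekit-yaml <config.yml> <output-dir>
import pathlib
import subprocess

CONFIG = """\
slices:
- sources:
  - model: Undi95/Meta-Llama-3-8B-Instruct-hf
    layer_range: [0, 24]
- sources:
  - model: Undi95/Meta-Llama-3-8B-Instruct-hf
    layer_range: [8, 32]
merge_method: passthrough
dtype: bfloat16
"""

pathlib.Path("merge-config.yml").write_text(CONFIG)
subprocess.run(
    ["mergekit-yaml", "merge-config.yml", "./Meta-Llama-3-11.5B-Instruct"],
    check=True,
)
```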
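
And a minimal text-generation sketch with transformers, also my addition rather than part of the card. The repo id is an assumption based on the card's title, and the prompt is built through the tokenizer's Llama 3 chat template:

```python
# Minimal usage sketch (my addition, not from the original card). Assumes the
# merged model is published as "mpasila/Meta-Llama-3-11.5B-Instruct" and that
# accelerate is installed for device_map="auto".
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mpasila/Meta-Llama-3-11.5B-Instruct"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the merge's dtype
    device_map="auto",
)

# Llama 3 Instruct expects its chat template, so format the prompt through it.
messages = [{"role": "user", "content": "What is a frankenmerge?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(
    input_ids, max_new_tokens=256, do_sample=True, temperature=0.7
)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```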