asiansoul committed on
Commit
fab2134
1 Parent(s): f79147a

Update README.md

Files changed (1)
  1. README.md +15 -14
README.md CHANGED
@@ -1,7 +1,8 @@
 ---
+base_model:
+- Undi95/Meta-Llama-3-8B-Instruct-hf
 language:
 - en
-- ko
 pipeline_tag: text-generation
 tags:
 - mergekit
@@ -188,10 +189,14 @@ extra_gated_fields:
 extra_gated_description: The information you provide will be collected, stored, processed and shared in accordance with the [Meta Privacy Policy](https://www.facebook.com/privacy/policy/).
 extra_gated_button_content: Submit
 ---
-# KoDolphin
+# Meta-Llama-3-11.5B-Instruct
 
 This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
 
+I had this idea at night that it would make sense to make a frankenmerge of Llama 3, since we didn't get 13B or 34B versions this time.
+
+Here's the same thing but for the base model: [mpasila/Meta-Llama-3-11.5B](https://huggingface.co/mpasila/Meta-Llama-3-11.5B/)
+
 ## Merge Details
 ### Merge Method
 
@@ -200,8 +205,7 @@ This model was merged using the passthrough merge method.
 ### Models Merged
 
 The following models were included in the merge:
-* [beomi/Llama-3-Open-Ko-8B-Instruct-preview](https://huggingface.co/beomi/Llama-3-Open-Ko-8B-Instruct-preview)
-* [cognitivecomputations/dolphin-2.9-llama3-8b](https://huggingface.co/cognitivecomputations/dolphin-2.9-llama3-8b)
+* [Undi95/Meta-Llama-3-8B-Instruct-hf](https://huggingface.co/Undi95/Meta-Llama-3-8B-Instruct-hf)
 
 ### Configuration
 
@@ -210,14 +214,11 @@ The following YAML configuration was used to produce this model:
 ```yaml
 slices:
 - sources:
-  - model: beomi/Llama-3-Open-Ko-8B-Instruct-preview
-    layer_range: [0, 20] # Use foundational and intermediate language processing layers in Korean
+  - model: Undi95/Meta-Llama-3-8B-Instruct-hf
+    layer_range: [0, 24]
 - sources:
-  - model: cognitivecomputations/dolphin-2.9-llama3-8b
-    layer_range: [15, 24] # Utilize advanced coding and domain-specific layers
-
-merge_method: passthrough # Direct combination of layers without transformation
-dtype: float16 # Efficient resource usage
-
-
-```
+  - model: Undi95/Meta-Llama-3-8B-Instruct-hf
+    layer_range: [8, 32]
+merge_method: passthrough
+dtype: bfloat16
+```
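For anyone reading the new configuration: Llama 3 8B has 32 decoder layers, so stacking the slices `[0, 24]` and `[8, 32]` with the passthrough method (which copies layers verbatim rather than averaging weights) yields a 48-layer model in which layers 8–23 appear twice, and that extra depth is where the ~11.5B in the model name comes from. Below is a back-of-the-envelope sketch of that arithmetic; it is not part of the commit, and the architecture constants are quoted from the public Llama 3 8B config:

```python
HIDDEN = 4096         # hidden size
INTERMEDIATE = 14336  # MLP intermediate size
VOCAB = 128256        # vocabulary size
KV_DIM = 1024         # 8 KV heads x 128 head dim (grouped-query attention)

def params_per_layer() -> int:
    attn = 2 * HIDDEN * HIDDEN + 2 * HIDDEN * KV_DIM  # q/o plus k/v projections
    mlp = 3 * HIDDEN * INTERMEDIATE                   # gate, up, down projections
    return attn + mlp                                 # norm weights are negligible

def total_params(depth: int) -> int:
    embeddings = 2 * VOCAB * HIDDEN  # input embeddings plus untied LM head
    return depth * params_per_layer() + embeddings

slices = [(0, 24), (8, 32)]  # the layer_range entries from the YAML above
depth = sum(end - start for start, end in slices)
print(depth, f"~{total_params(depth) / 1e9:.1f}B")  # -> 48 ~11.5B
```

To reproduce the merge itself, the YAML would be saved to a file and passed to mergekit's `mergekit-yaml` entry point, e.g. `mergekit-yaml config.yaml ./output-model-directory` (the config and output paths here are illustrative).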
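As a follow-up sanity check, assuming the merge was produced locally at the illustrative path above, the stacked depth can be confirmed without loading any weights:

```python
from transformers import AutoConfig

# "./output-model-directory" is the hypothetical mergekit output path
# from the command above, not an artifact of this commit.
cfg = AutoConfig.from_pretrained("./output-model-directory")
assert cfg.num_hidden_layers == 48  # 24 + 24 stacked decoder layers
```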