thesunday committed on
Commit a454c14
1 Parent(s): 6083837

Update model card

Files changed (1): README.md +48 -0

README.md CHANGED
@@ -1,3 +1,51 @@
---
license: apache-2.0
+ language:
+ - en
---
+ 
+ # Model Description
+ This is an experiment to test merging 14 models using DARE TIES 🦙 (an illustrative config sketch follows the model list below).
+ 
+ The resulting model is then merged with [janai-hq/trinity-v1](https://huggingface.co/janai-hq/trinity-v1) using Gradient SLERP.
+ The result is a base model that performs quite well but requires some further instruction fine-tuning.
+ 
+ The 14 models are as follows:
+ 1. [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2)
+ 2. [ehartford/dolphin-2.2.1-mistral-7b](https://huggingface.co/ehartford/dolphin-2.2.1-mistral-7b)
+ 3. [SciPhi/SciPhi-Mistral-7B-32k](https://huggingface.co/SciPhi/SciPhi-Mistral-7B-32k)
+ 4. [ehartford/samantha-1.2-mistral-7b](https://huggingface.co/ehartford/samantha-1.2-mistral-7b)
+ 5. [Arc53/docsgpt-7b-mistral](https://huggingface.co/Arc53/docsgpt-7b-mistral)
+ 6. [berkeley-nest/Starling-LM-7B-alpha](https://huggingface.co/berkeley-nest/Starling-LM-7B-alpha)
+ 7. [Q-bert/MetaMath-Cybertron-Starling](https://huggingface.co/Q-bert/MetaMath-Cybertron-Starling)
+ 8. [Open-Orca/Mistral-7B-OpenOrca](https://huggingface.co/Open-Orca/Mistral-7B-OpenOrca)
+ 9. [v1olet/v1olet_marcoroni-go-bruins-merge-7B](https://huggingface.co/v1olet/v1olet_marcoroni-go-bruins-merge-7B)
+ 10. [beowolx/MistralHermes-CodePro-7B-v1](https://huggingface.co/beowolx/MistralHermes-CodePro-7B-v1)
+ 11. [TIGER-Lab/MAmmoTH-7B-Mistral](https://huggingface.co/TIGER-Lab/MAmmoTH-7B-Mistral)
+ 12. [teknium/OpenHermes-2.5-Mistral-7B](https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B)
+ 13. [Weyaxi/OpenHermes-2.5-neural-chat-v3-3-Slerp](https://huggingface.co/Weyaxi/OpenHermes-2.5-neural-chat-v3-3-Slerp)
+ 14. [mlabonne/NeuralHermes-2.5-Mistral-7B](https://huggingface.co/mlabonne/NeuralHermes-2.5-Mistral-7B)
+ 
+ - Base model: [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)
+ 
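+ The exact DARE TIES config used for the 14-model merge is not shown in this card. As a rough illustration only, a mergekit `dare_ties` config for a merge like this takes the following shape; the `density` and `weight` values below are hypothetical placeholders, not the values actually used:
+ 
+ ```yaml
+ # Hypothetical mergekit config sketch for a 14-model DARE TIES merge.
+ # Density/weight values are illustrative, not the ones used for this model.
+ models:
+   - model: mistralai/Mistral-7B-Instruct-v0.2
+     parameters:
+       density: 0.5  # fraction of delta parameters retained by DARE
+       weight: 0.1   # scaling of this model's contribution
+   - model: teknium/OpenHermes-2.5-Mistral-7B
+     parameters:
+       density: 0.5
+       weight: 0.1
+   # ... the remaining 12 models from the list above follow the same pattern
+ merge_method: dare_ties
+ base_model: mistralai/Mistral-7B-v0.1
+ dtype: bfloat16
+ ```
+ 
+ DARE randomly drops a fraction of each model's delta weights (keeping `density`) and rescales the rest, while TIES resolves sign conflicts among the surviving deltas before they are summed into the base model.
+ 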
+ The YAML config for the Gradient SLERP merge with trinity-v1 is as follows:
+ 
+ ```yaml
+ slices:
+   - sources:
+       - model: janai-hq/trinity-v1
+         layer_range: [0, 32]
+       - model: EmbeddedLLM/Mistral-7B-Merge-14-v0
+         layer_range: [0, 32]
+ merge_method: slerp
+ base_model: janai-hq/trinity-v1
+ parameters:
+   t:
+     - filter: self_attn
+       value: [0, 0.5, 0.3, 0.7, 1]
+     - filter: mlp
+       value: [1, 0.5, 0.7, 0.3, 0]
+     - value: 0.5
+ dtype: bfloat16
+ ```
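+ 
+ Roughly, `t` is the SLERP interpolation factor between the two parents: 0 keeps the base model (janai-hq/trinity-v1) and 1 keeps EmbeddedLLM/Mistral-7B-Merge-14-v0, with the five-element lists interpolated into a per-layer gradient across the 32 layers (hence "Gradient SLERP"). The `self_attn` and `mlp` filters use mirrored schedules, so attention and MLP weights lean toward opposite parents at any given depth, while all other tensors blend at a flat 0.5. A config like this can be run with mergekit's `mergekit-yaml` entry point, e.g. `mergekit-yaml config.yml ./merged-model`.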