---
base_model:
- mistralai/Mixtral-8x7B-v0.1
- Doctor-Shotgun/limarp-zloss-mixtral-8x7b-qlora
- mistralai/Mixtral-8x7B-v0.1
- LoneStriker/Air-Striker-Mixtral-8x7B-ZLoss-LoRA
- rombodawg/Open_Gpt4_8x7B_v0.2
- mistralai/Mixtral-8x7B-Instruct-v0.1
- mistralai/Mixtral-8x7B-v0.1
- Sao10K/Typhon-Mixtral-v1
tags:
- mergekit
- merge
license: cc-by-4.0
---
# Mega-Destroyer-8x7B
This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
## Merge Details
### Merge Method
This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method using [mistralai/Mixtral-8x7B-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1) as a base.
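Conceptually, each fine-tune is reduced to a delta ("task vector") against the base model. DARE randomly drops a fraction of each delta (controlled by `density`) and rescales the survivors to compensate, and TIES then resolves sign conflicts between the models by majority vote before summing. Here is a rough illustrative sketch of the idea on a single weight tensor; it is not mergekit's actual implementation:
```python
# Illustrative sketch of DARE-TIES on one weight tensor (not mergekit's code).
import torch

def dare_ties(base: torch.Tensor, finetuned: list[torch.Tensor],
              densities: list[float], weights: list[float]) -> torch.Tensor:
    deltas = []
    for ft, density, w in zip(finetuned, densities, weights):
        delta = ft - base                                  # task vector
        keep = (torch.rand_like(delta) < density).to(delta.dtype)
        delta = delta * keep / density                     # DARE: drop + rescale
        deltas.append(w * delta)
    stacked = torch.stack(deltas)
    elected = torch.sign(stacked.sum(dim=0))               # TIES: elect majority sign
    agree = (torch.sign(stacked) == elected).to(stacked.dtype)
    return base + (stacked * agree).sum(dim=0)             # keep only agreeing deltas
```
The `density` and `weight` values in the configuration below map directly onto the drop probability and per-model scaling in this sketch; `normalize: true` additionally rescales the summed deltas, roughly speaking, by the total weight.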
### Models Merged
The following models were included in the merge:
* [mistralai/Mixtral-8x7B-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1) + [Doctor-Shotgun/limarp-zloss-mixtral-8x7b-qlora](https://huggingface.co/Doctor-Shotgun/limarp-zloss-mixtral-8x7b-qlora)
* [mistralai/Mixtral-8x7B-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1) + [LoneStriker/Air-Striker-Mixtral-8x7B-ZLoss-LoRA](https://huggingface.co/LoneStriker/Air-Striker-Mixtral-8x7B-ZLoss-LoRA)
* [rombodawg/Open_Gpt4_8x7B_v0.2](https://huggingface.co/rombodawg/Open_Gpt4_8x7B_v0.2)
* [mistralai/Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1)
* [Sao10K/Typhon-Mixtral-v1](https://huggingface.co/Sao10K/Typhon-Mixtral-v1)
### Configuration
The following YAML configuration was used to produce this model:
```yaml
models:
  - model: mistralai/Mixtral-8x7B-Instruct-v0.1
    parameters:
      density: 0.6
      weight: 1.0
  - model: rombodawg/Open_Gpt4_8x7B_v0.2
    parameters:
      density: 0.5
      weight: 0.8
  - model: mistralai/Mixtral-8x7B-v0.1+LoneStriker/Air-Striker-Mixtral-8x7B-ZLoss-LoRA
    parameters:
      density: 0.5
      weight: 0.6
  - model: Sao10K/Typhon-Mixtral-v1
    parameters:
      density: 0.5
      weight: 0.7
  - model: mistralai/Mixtral-8x7B-v0.1+Doctor-Shotgun/limarp-zloss-mixtral-8x7b-qlora
    parameters:
      density: 0.5
      weight: 0.4
merge_method: dare_ties
base_model: mistralai/Mixtral-8x7B-v0.1
parameters:
  normalize: true
  int8_mask: true
dtype: bfloat16
name: Mega-Destroyer-8x7B
```
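If you'd like to reproduce the merge, the config above can be fed to mergekit directly (`mergekit-yaml <config> <out_dir>` on the command line). Below is a minimal sketch via the Python API, assuming a recent mergekit; exact option names may differ between versions:
```python
# Sketch: running the merge config through mergekit's Python API.
# Assumes a recent mergekit version; check MergeOptions for your install.
import yaml
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

with open("mega-destroyer.yml", encoding="utf-8") as fp:
    config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    config,
    out_path="./Mega-Destroyer-8x7B",
    options=MergeOptions(cuda=True, copy_tokenizer=True),
)
```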
Hello everyone, this is Dampf. You might know me as the creator of Mythical-Destroyer-13B.
This time, I collaborated with MrDragonFox aka FoxEngineAi, harnessing his powerful rig to deliver a merge of multiple high-quality Mixtral 8x7B models. My goal was to beat Bagel-Mistery-Tour V2 by Ycros and create the best Mixtral model to date. Did I succeed? Please try it out and decide for yourself!
Aside from the obvious Mixtral Instruct, included to preserve its intelligence, I've merged Rombo's excellent Open_Gpt4_8x7B_v0.2, which consists of Jon Durbin's Bagel-DPO-8x7B and another highly regarded model, smelborp/MixtralOrochi8x7B. It combines many different datasets, so it should be a good fit for every task you throw at it. This model acts as the reasoning part of the merge.
In contrast, on the creative side we have Air-Striker and LimaRP, which allow for great roleplay in different styles; they also greatly enhance the model's writing capabilities.
And finally, I've merged Sao10K/Typhon-Mixtral-v1 to boost the story-writing capabilities even further. It includes KoboldAI's latest Holodeck model, as well as a couple of Sao10K's latest models, combined into one package. My hope is that this will capture the magic Sao10K/Fimbulvetr-11B-v2 exudes, just at the intelligence level of a Mixtral model. It also includes Nous Hermes 2 DPO, a high-quality instruct model that boosts intelligence and sort of acts as a balancer to all the creative stuff in the merge.
What we have here is a model that should be fantastic at instruct and roleplay/creative tasks alike. So basically a general-purpose model. Perhaps the pinnacle of rocksmashing? Idk xD I just know it includes nearly every dataset under the sun. As a result, it will likely work with every prompt format as well. So feel free to use Alpaca, Vicuna, ChatML, Llama 2 Chat or whatever your heart desires.
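For instance, a ChatML-style prompt through transformers might look like the sketch below; the repo id is a placeholder, so point it at wherever the merged weights actually live:
```python
# Sketch: prompting the merge with a ChatML-style template via transformers.
# "your-repo/Mega-Destroyer-8x7B" is a placeholder, not a real repo id.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-repo/Mega-Destroyer-8x7B"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype="auto"
)

prompt = (
    "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\nWrite a short fantasy scene.<|im_end|>\n"
    "<|im_start|>assistant\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```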
A huge thank you to the creators of the fantastic datasets and fine-tunes in the respective merges, namely Jon Durbin, Teknium, Sao10K, MistralAI, LoneStriker, NeverSleep, Suikamelon, Doctor-Shotgun, KoboldAI and more. All credit goes to them. A thank you to the creators of the different merges I've merged (Mergeception!) as well! And of course a thank you to MrDragonFox for lending his compute! Please enjoy :D