---
license: llama3.1
---
Replete Storm is an experimental DELLA merge of the highly useful [Replete-LLM-V2-Llama-3.1-8b](https://huggingface.co/Replete-AI/Replete-LLM-V2-Llama-3.1-8b) and akjindal53244's outstanding [Llama-3.1-Storm-8B](https://huggingface.co/akjindal53244/Llama-3.1-Storm-8B), which is based in part on Arcee AI's high-scoring [Llama-Spark](https://huggingface.co/arcee-ai/Llama-Spark). Credit goes to the authors and creators of those models.

This merge is crafted to favor Replete in the early and final layers and Llama Storm in the middle layers, as the density and weight coefficients in the mergekit YAML below show. The goal is to let the specific qualities of each model assert themselves with little further fine-tuning: high-quality conversation and function calling, together with Replete's GPQA performance.
---
license: llama3.1
datasets:
- arcee-ai/The-Tome
- Replete-AI/The_Living_AI_Dataset
- Replete-AI/code_bagel_hermes-2.5
language:
- en
base_model: NousResearch/Meta-Llama-3.1-8B
---
```yaml
models:
  - model: Replete-AI/Replete-LLM-V2-Llama-3.1-8b
    parameters:
      density: [ 0.80, 0.60, 0.50, 0.40, 0.50, 0.60, 0.80 ]
      epsilon: [ 0.10, 0.10, 0.10, 0.10, 0.10, 0.10, 0.10 ]
      weight: [ 0.80, 0.65, 0.60, 0.55, 0.60, 0.65, 0.70 ]
      lambda: 0.85
  - model: akjindal53244/Llama-3.1-Storm-8B
    parameters:
      density: [ 0.40, 0.60, 0.70, 0.80, 0.70, 0.60, 0.40 ]
      epsilon: [ 0.10, 0.10, 0.10, 0.10, 0.10, 0.10, 0.10 ]
      weight: [ 0.40, 0.70, 0.80, 0.90, 0.80, 0.70, 0.40 ]
      lambda: 0.7
merge_method: della
base_model: NousResearch/Meta-Llama-3.1-8B
parameters:
  int8_mask: true
  normalize: true
  rescale: true
dtype: float16
tokenizer_source: union
name: replete-storm
```
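
To make the layer-wise preference in the config above easier to see, here is a rough Python sketch that spreads each seven-value `weight` list across the 32 transformer layers of Llama 3.1 8B by linear interpolation, then lists the layers where Replete's raw weight is higher than Storm's. The helper `expand_gradient` is hypothetical and only approximates how a gradient list might be expanded; mergekit's actual interpolation and the effect of `normalize: true` may differ in detail.

```python
NUM_LAYERS = 32  # Llama 3.1 8B has 32 transformer layers


def expand_gradient(anchors, num_layers=NUM_LAYERS):
    """Linearly interpolate a list of anchor values across all layers.

    Hypothetical helper for illustration only; expects at least two
    anchors and at least two layers.
    """
    last = len(anchors) - 1
    values = []
    for layer in range(num_layers):
        # Map the layer index onto the anchor axis [0, last].
        pos = layer / (num_layers - 1) * last
        lo = min(int(pos), last - 1)
        frac = pos - lo
        values.append(anchors[lo] * (1 - frac) + anchors[lo + 1] * frac)
    return values


# Weight gradients copied from the merge config.
replete_weight = expand_gradient([0.80, 0.65, 0.60, 0.55, 0.60, 0.65, 0.70])
storm_weight = expand_gradient([0.40, 0.70, 0.80, 0.90, 0.80, 0.70, 0.40])

# Layers where Replete's raw (pre-normalization) weight exceeds Storm's.
replete_favored = [
    i for i in range(NUM_LAYERS) if replete_weight[i] > storm_weight[i]
]
print(replete_favored)
```

Under these assumptions, the favored layers cluster at the two ends of the stack, which matches the intent described above. The same helper can be applied to the `density` lists to see how much of each model's delta is kept per layer.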