---
license: llama3.1
datasets:
- arcee-ai/The-Tome
- Replete-AI/The_Living_AI_Dataset
- Replete-AI/code_bagel_hermes-2.5
language:
- en
base_model: NousResearch/Meta-Llama-3.1-8B
---
Replete Storm is an experimental della merge of the highly useful [Replete-LLM-V2-Llama-3.1-8b](https://huggingface.co/Replete-AI/Replete-LLM-V2-Llama-3.1-8b) and akjindal53244's outstanding [Llama-3.1-Storm-8B](https://huggingface.co/akjindal53244/Llama-3.1-Storm-8B), the latter based in part on Arcee AI's high-scoring [Llama-Spark](https://huggingface.co/arcee-ai/Llama-Spark). Credit goes to the authors and creators of those models.

This merge is crafted to favor Replete in the early and final layers and Llama-3.1-Storm in the middle layers, as the density and weight coefficients in the mergekit YAML below show. The goal is to let the specific strengths of each model assert themselves with little further fine-tuning: Storm's conversational quality and function calling, together with Replete's GPQA performance.
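
For intuition: mergekit reads each seven-element `density`/`weight` list as a gradient and interpolates it across the model's layers, so every decoder layer ends up with its own coefficient. Below is a rough sketch of that mapping, assuming plain linear interpolation over Llama-3.1-8B's 32 decoder layers; it approximates, rather than reproduces, mergekit's internal gradient handling:

```python
# Rough illustration: map a seven-point mergekit gradient onto per-layer values.
# Assumes 32 decoder layers (Llama-3.1-8B) and plain linear interpolation.
import numpy as np

replete_weight = [0.80, 0.65, 0.60, 0.55, 0.60, 0.65, 0.70]  # from the YAML below
num_layers = 32

# Spread the anchor points evenly over the layer range, then interpolate.
anchors = np.linspace(0, num_layers - 1, num=len(replete_weight))
per_layer = np.interp(np.arange(num_layers), anchors, replete_weight)
print(np.round(per_layer, 3))  # highest at layer 0, lowest midway, rising again at the end
```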


```yaml
models:
  - model: Replete-AI/Replete-LLM-V2-Llama-3.1-8b
    parameters:
      # Seven-point gradients, interpolated across layers:
      # Replete dominates the early and final layers.
      density: [ 0.80, 0.60, 0.50, 0.40, 0.50, 0.60, 0.80 ]
      epsilon: [ 0.10, 0.10, 0.10, 0.10, 0.10, 0.10, 0.10 ]
      weight:  [ 0.80, 0.65, 0.60, 0.55, 0.60, 0.65, 0.70 ]
      lambda: 0.85
  - model: akjindal53244/Llama-3.1-Storm-8B
    parameters:
      # Storm dominates the middle layers.
      density: [ 0.40, 0.60, 0.70, 0.80, 0.70, 0.60, 0.40 ]
      epsilon: [ 0.10, 0.10, 0.10, 0.10, 0.10, 0.10, 0.10 ]
      weight:  [ 0.40, 0.70, 0.80, 0.90, 0.80, 0.70, 0.40 ]
      lambda: 0.7
merge_method: della
base_model: NousResearch/Meta-Llama-3.1-8B
parameters:
  int8_mask: true
  normalize: true
  rescale: true
dtype: float16
tokenizer_source: union
name: replete-storm
```
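
To reproduce the merge, save the config above as, e.g., `replete-storm.yaml` and run it through mergekit's CLI: `mergekit-yaml replete-storm.yaml ./replete-storm`. Below is a minimal sketch of loading the result with `transformers`; the local path `./replete-storm` is an assumption about where the merge output was written:

```python
# Minimal sketch: load the merged checkpoint and chat with it.
# Assumes the merge output directory is ./replete-storm (hypothetical path).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "./replete-storm"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "Write a function that parses ISO-8601 dates."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```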