---
license: llama3.1
datasets:
- Replete-AI/code_bagel_hermes-2.5
- Replete-AI/The_Living_AI_Dataset
- arcee-ai/The-Tome
language:
- en
base_model: NousResearch/Meta-Llama-3.1-8B
---

Replete Storm is an experimental della merge of the highly useful [Replete-LLM-V2-Llama-3.1-8b](https://huggingface.co/Replete-AI/Replete-LLM-V2-Llama-3.1-8b) and akjindal53244's outstanding [Llama-3.1-Storm-8B](https://huggingface.co/akjindal53244/Llama-3.1-Storm-8B), which is based in part on Arcee AI's high-scoring [Llama-Spark](https://huggingface.co/arcee-ai/Llama-Spark). Credits go to the authors and creators of those models.

This merge is crafted to favor Replete in the early and final layers, but favor Llama Storm midway, as you can see in the density and weight coefficients in the mergekit YAML below.

The goal is to let the specific qualities of each model assert themselves with little further fine-tuning: conversation quality, function calling, and truthfulness, together with Replete's GPQA performance.

```yaml
models:
  - model: Replete-AI/Replete-LLM-V2-Llama-3.1-8b
    parameters:
      density: [ 0.80, 0.60, 0.50, 0.40, 0.50, 0.60, 0.80 ]
      epsilon: [ 0.10, 0.10, 0.10, 0.10, 0.10, 0.10, 0.10 ]
      weight: [ 0.80, 0.65, 0.60, 0.55, 0.60, 0.65, 0.70 ]
      lambda: 0.85
  - model: akjindal53244/Llama-3.1-Storm-8B
    parameters:
      density: [ 0.40, 0.60, 0.70, 0.80, 0.70, 0.60, 0.40 ]
      epsilon: [ 0.10, 0.10, 0.10, 0.10, 0.10, 0.10, 0.10 ]
      weight: [ 0.40, 0.70, 0.80, 0.90, 0.80, 0.70, 0.40 ]
      lambda: 0.7
merge_method: della
base_model: NousResearch/Meta-Llama-3.1-8B
parameters:
  int8_mask: true
  normalize: true
  rescale: true
dtype: float16
tokenizer_source: union
name: replete-storm
```
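
To make the layer-wise preference concrete, here is a minimal Python sketch that expands the seven-value `weight` lists from the config into one value per layer. It assumes the list entries are anchor points that are linearly interpolated across the layer stack, and that the model has 32 transformer layers; the helper name `expand_gradient` is hypothetical and is not part of mergekit. It only illustrates the shape of the curves, not the actual merge computation.

```python
def expand_gradient(anchors, num_layers):
    """Linearly interpolate a short list of anchor values over num_layers layers.

    Assumption: anchors are spread evenly from the first layer to the last.
    """
    if num_layers < 2 or len(anchors) == 1:
        return [anchors[0]] * num_layers
    values = []
    for layer in range(num_layers):
        t = layer / (num_layers - 1)        # relative depth in [0, 1]
        pos = t * (len(anchors) - 1)        # position along the anchor list
        lo = int(pos)
        hi = min(lo + 1, len(anchors) - 1)
        frac = pos - lo
        values.append((1 - frac) * anchors[lo] + frac * anchors[hi])
    return values


NUM_LAYERS = 32  # assumption: layer count of Llama 3.1 8B

# Weight lists copied from the merge config.
replete_weight = [0.80, 0.65, 0.60, 0.55, 0.60, 0.65, 0.70]
storm_weight = [0.40, 0.70, 0.80, 0.90, 0.80, 0.70, 0.40]

replete = expand_gradient(replete_weight, NUM_LAYERS)
storm = expand_gradient(storm_weight, NUM_LAYERS)

for layer, (r, s) in enumerate(zip(replete, storm)):
    leader = "Replete" if r > s else "Storm"
    print(f"layer {layer:2d}  replete={r:.3f}  storm={s:.3f}  larger weight: {leader}")
```

Under these assumptions, Replete carries the larger weight at both ends of the stack and Storm carries it through the middle layers, which matches the intent described above.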