Update 2023-12-19

In light of the dataset contamination issue raised by the community in recent days among the merged models, in particular berkeley-nest/Starling-LM-7B-alpha, Q-bert/MetaMath-Cybertron-Starling, and janai-hq/trinity-v1, we decided to remake the model without the models mentioned. Additionally, their CC-BY-NC-4.0 license is restrictive, which makes them unsuitable for an open model.

Open LLM Leaderboard

For reference, this model obtained an average score of 72.88.

Average	72.88
ARC	68.86
HellaSwag	87.01
MMLU	65.05
TruthfulQA	64.19
Winogrande	81.69
GSM8K	70.51

Model Description

This is an experiment to test merging 14 models using DARE TIES 🦙

The merged model is then merged again with janai-hq/trinity-v1 using Gradient SLERP. The result is a base model that performs quite well but requires some further instruction fine-tuning.

The 14 models are as follows:

base model: mistralai/Mistral-7B-v0.1
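As a rough illustration of the DARE TIES step mentioned above, here is a minimal Python sketch that works on flat lists of weights. It is not the procedure actually used for this model: the function name `dare_ties_merge`, the `drop_rate` and `seed` parameters, and the equal weighting of all models are assumptions made for the example. The idea is that each model's difference from the base is randomly sparsified and rescaled (DARE), and then only the differences that agree with the majority sign are averaged (TIES).

```python
import random


def dare_ties_merge(base, finetuned, drop_rate=0.5, seed=0):
    """Merge fine-tuned weight lists into `base` (illustrative sketch only).

    base:      flat list of floats (base model weights)
    finetuned: list of flat lists, one per fine-tuned model
    drop_rate: probability of dropping each delta entry, must be < 1
    """
    rng = random.Random(seed)
    keep_scale = 1.0 / (1.0 - drop_rate)

    # DARE: take the delta from the base, drop entries at random,
    # and rescale the survivors so the expected delta is unchanged.
    deltas = []
    for weights in finetuned:
        delta = []
        for b, w in zip(base, weights):
            kept = rng.random() >= drop_rate
            delta.append((w - b) * keep_scale if kept else 0.0)
        deltas.append(delta)

    # TIES-style sign election: per parameter, pick the sign of the
    # summed deltas and average only the deltas that agree with it.
    merged = []
    for i, b in enumerate(base):
        column = [d[i] for d in deltas]
        sign = 1.0 if sum(column) >= 0 else -1.0
        agreeing = [x for x in column if x * sign > 0]
        update = sum(agreeing) / len(agreeing) if agreeing else 0.0
        merged.append(b + update)
    return merged
```

With `drop_rate=0.0` the random step is disabled, which makes the sign election easy to inspect on small inputs.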

The YAML config file for this model is as follows:

slices:
  - sources:
      - model: janai-hq/trinity-v1
        layer_range: [0, 32]
      - model: EmbeddedLLM/Mistral-7B-Merge-14-v0
        layer_range: [0, 32]
merge_method: slerp
base_model: janai-hq/trinity-v1
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16
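To make the `t` parameters in the config above more concrete, here is a minimal Python sketch of gradient SLERP. It assumes that each `value` list is a set of anchor points linearly interpolated across the 32 layers, and that `t = 0` returns the `base_model` weights while `t = 1` returns the other model's weights. The helper names `gradient_t` and `slerp` are hypothetical, and the merging tool's own implementation may differ in its details.

```python
import math


def gradient_t(anchors, num_layers):
    """Linearly interpolate anchor values across layers 0..num_layers-1."""
    if num_layers == 1 or len(anchors) == 1:
        return [float(anchors[0])] * num_layers
    ts = []
    for layer in range(num_layers):
        # Position of this layer on the anchor axis, in [0, len(anchors) - 1].
        pos = layer / (num_layers - 1) * (len(anchors) - 1)
        lo = min(int(pos), len(anchors) - 2)
        frac = pos - lo
        ts.append(anchors[lo] * (1 - frac) + anchors[lo + 1] * frac)
    return ts


def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two flat weight vectors."""
    linear = [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    n0 = math.sqrt(sum(x * x for x in v0))
    n1 = math.sqrt(sum(x * x for x in v1))
    if n0 < eps or n1 < eps:
        return linear  # no meaningful angle for a zero vector
    cos_omega = sum(a * b for a, b in zip(v0, v1)) / (n0 * n1)
    omega = math.acos(max(-1.0, min(1.0, cos_omega)))
    sin_omega = math.sin(omega)
    if abs(sin_omega) < eps:
        return linear  # vectors are nearly colinear
    w0 = math.sin((1 - t) * omega) / sin_omega
    w1 = math.sin(t * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


# Per-layer t for the two filters in the config; other tensors use t = 0.5.
attn_t = gradient_t([0, 0.5, 0.3, 0.7, 1], 32)
mlp_t = gradient_t([1, 0.5, 0.7, 0.3, 0], 32)
```

Under these assumptions, the self-attention weights move from the base model in the first layer to the other model in the last layer, while the MLP weights follow the opposite schedule.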

EmbeddedLLM/Mistral-7B-Merge-14-v0.2
