Quantization made by Richard Erkhov.
Moza-7B-v1.0 - bnb 8bits
- Model creator: https://huggingface.co/kidyu/
- Original model: https://huggingface.co/kidyu/Moza-7B-v1.0/
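For readers who want to try 8-bit loading themselves, the following is a minimal sketch, assuming the `transformers`, `accelerate`, and `bitsandbytes` packages are installed and a CUDA GPU is available. It quantizes the original checkpoint on the fly rather than downloading this pre-quantized repository, whose repo id is not stated here. The function name `load_moza_8bit` is hypothetical.

```python
MODEL_ID = "kidyu/Moza-7B-v1.0"  # original (unquantized) model named above


def load_moza_8bit(model_id: str = MODEL_ID):
    """Load the tokenizer and model with bitsandbytes 8-bit weights.

    Imports are deferred so that defining this function does not require
    the heavy dependencies; calling it does.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=BitsAndBytesConfig(load_in_8bit=True),
        device_map="auto",  # let accelerate place layers on available devices
    )
    return tokenizer, model
```

Calling `load_moza_8bit()` downloads the full-precision weights first, so it needs more disk space than a pre-quantized checkpoint would.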
Original model description:
license: apache-2.0
library_name: transformers
tags:
- mergekit
- merge
base_model:
- mistralai/Mistral-7B-v0.1
- cognitivecomputations/dolphin-2.2.1-mistral-7b
- Open-Orca/Mistral-7B-OpenOrca
- openchat/openchat-3.5-0106
- mlabonne/NeuralHermes-2.5-Mistral-7B
- GreenNode/GreenNode-mini-7B-multilingual-v1olet
- berkeley-nest/Starling-LM-7B-alpha
- viethq188/LeoScorpius-7B-Chat-DPO
- meta-math/MetaMath-Mistral-7B
- Intel/neural-chat-7b-v3-3
inference: false
model-index:
- name: Moza-7B-v1.0
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 66.55
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=kidyu/Moza-7B-v1.0
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 83.45
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=kidyu/Moza-7B-v1.0
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 62.77
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=kidyu/Moza-7B-v1.0
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 65.16
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=kidyu/Moza-7B-v1.0
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 77.51
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=kidyu/Moza-7B-v1.0
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 62.55
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=kidyu/Moza-7B-v1.0
      name: Open LLM Leaderboard
Moza-7B-v1.0
This is a meme-merge of pre-trained language models, created using mergekit. Use at your own risk.
Details
Quantized Model
Merge Method
This model was merged using the DARE TIES merge method, using mistralai/Mistral-7B-v0.1 as a base.
The value for density is taken from this blogpost, and the weights were randomly generated and then assigned to the models, with priority (the larger weights) given to NeuralHermes, OpenOrca, and neural-chat.
The models themselves are chosen by "vibes".
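DARE TIES combines two ideas: DARE randomly drops entries of each model's difference from the base, keeping a fraction equal to `density` and rescaling the survivors by `1/density`; TIES then elects a sign per parameter and keeps only the contributions that agree with it. The following is a simplified pure-Python sketch over flat lists of floats, written under those assumptions. It is not mergekit's implementation, and the function names `dare_prune` and `dare_ties_merge` are hypothetical.

```python
import random


def dare_prune(delta, density, rng):
    """Keep each entry with probability `density`; rescale kept entries by 1/density."""
    return [d / density if rng.random() < density else 0.0 for d in delta]


def dare_ties_merge(base, models, weights, density, normalize=True, seed=0):
    """Merge fine-tuned parameter lists into `base` (all flat lists of equal length)."""
    rng = random.Random(seed)
    # Task vectors: what each fine-tuned model changed relative to the base.
    deltas = [[m_i - b_i for m_i, b_i in zip(m, base)] for m in models]
    # DARE step (random drop + rescale), then scale by the per-model weight.
    weighted = [
        [w * d for d in dare_prune(delta, density, rng)]
        for delta, w in zip(deltas, weights)
    ]
    merged = []
    for i, b in enumerate(base):
        column = [wd[i] for wd in weighted]
        # TIES sign election: the sign of the summed weighted deltas wins.
        sign = 1.0 if sum(column) >= 0 else -1.0
        agreeing = [(v, w) for v, w in zip(column, weights) if v * sign > 0]
        update = sum(v for v, _ in agreeing)
        weight_sum = sum(w for _, w in agreeing)
        if normalize and weight_sum > 0:
            update /= weight_sum  # renormalize over the models that agreed
        merged.append(b + update)
    return merged


# Toy example: two "models" that agree on parameter 0 and disagree on parameter 1.
toy = dare_ties_merge(
    base=[0.0, 0.0],
    models=[[1.0, 1.0], [1.0, -1.0]],
    weights=[0.75, 0.25],
    density=1.0,  # density 1.0 disables dropping, so the result is deterministic
)
print(toy)
```

On parameter 1 the lower-weight model loses the sign election, so only the higher-weight model's change is applied there.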
Models Merged
The following models were included in the merge:
- cognitivecomputations/dolphin-2.2.1-mistral-7b
- Open-Orca/Mistral-7B-OpenOrca
- openchat/openchat-3.5-0106
- mlabonne/NeuralHermes-2.5-Mistral-7B
- GreenNode/GreenNode-mini-7B-multilingual-v1olet
- berkeley-nest/Starling-LM-7B-alpha
- viethq188/LeoScorpius-7B-Chat-DPO
- meta-math/MetaMath-Mistral-7B
- Intel/neural-chat-7b-v3-3
Prompt Format
You can use Alpaca formatting for inference:
### Instruction:
### Response:
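A small helper can assemble a prompt from the two section headers shown above. This is a minimal sketch: the function name `build_alpaca_prompt` is hypothetical, and the blank-line spacing is an assumption. The common Alpaca template also starts with a preamble sentence, which is not shown here and is therefore omitted.

```python
def build_alpaca_prompt(instruction: str) -> str:
    """Wrap an instruction in the two Alpaca section headers shown above."""
    return f"### Instruction:\n{instruction.strip()}\n\n### Response:\n"


prompt = build_alpaca_prompt("Summarize what a model merge is in one sentence.")
print(prompt)
```

The model's completion is expected to follow the final `### Response:` line.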
Configuration
The following YAML configuration was used to produce this model:
base_model: mistralai/Mistral-7B-v0.1
models:
  - model: mlabonne/NeuralHermes-2.5-Mistral-7B
    parameters:
      density: 0.63
      weight: 0.83
  - model: Intel/neural-chat-7b-v3-3
    parameters:
      density: 0.63
      weight: 0.74
  - model: meta-math/MetaMath-Mistral-7B
    parameters:
      density: 0.63
      weight: 0.22
  - model: openchat/openchat-3.5-0106
    parameters:
      density: 0.63
      weight: 0.37
  - model: Open-Orca/Mistral-7B-OpenOrca
    parameters:
      density: 0.63
      weight: 0.76
  - model: cognitivecomputations/dolphin-2.2.1-mistral-7b
    parameters:
      density: 0.63
      weight: 0.69
  - model: viethq188/LeoScorpius-7B-Chat-DPO
    parameters:
      density: 0.63
      weight: 0.38
  - model: GreenNode/GreenNode-mini-7B-multilingual-v1olet
    parameters:
      density: 0.63
      weight: 0.13
  - model: berkeley-nest/Starling-LM-7B-alpha
    parameters:
      density: 0.63
      weight: 0.33
merge_method: dare_ties
parameters:
  normalize: true
  int8_mask: true
dtype: bfloat16
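To get a feel for how the weights in the configuration above compare, the sketch below computes each model's share of the total weight. It is only a rough guide: it assumes that `normalize: true` rescales the weights to sum to 1 and ignores the per-parameter sign masking that DARE TIES applies. The helper name `weight_shares` is hypothetical.

```python
# Weights copied from the configuration above.
WEIGHTS = {
    "mlabonne/NeuralHermes-2.5-Mistral-7B": 0.83,
    "Intel/neural-chat-7b-v3-3": 0.74,
    "meta-math/MetaMath-Mistral-7B": 0.22,
    "openchat/openchat-3.5-0106": 0.37,
    "Open-Orca/Mistral-7B-OpenOrca": 0.76,
    "cognitivecomputations/dolphin-2.2.1-mistral-7b": 0.69,
    "viethq188/LeoScorpius-7B-Chat-DPO": 0.38,
    "GreenNode/GreenNode-mini-7B-multilingual-v1olet": 0.13,
    "berkeley-nest/Starling-LM-7B-alpha": 0.33,
}


def weight_shares(weights):
    """Return each model's weight divided by the sum of all weights."""
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


shares = weight_shares(WEIGHTS)
# Print models from largest to smallest share.
for name, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{share:6.1%}  {name}")
```

The three largest shares belong to NeuralHermes, OpenOrca, and neural-chat, matching the priority described under Merge Method.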
Open LLM Leaderboard Evaluation Results
Detailed results can be found here
Metric | Value |
---|---|
Avg. | 69.66 |
AI2 Reasoning Challenge (25-Shot) | 66.55 |
HellaSwag (10-Shot) | 83.45 |
MMLU (5-Shot) | 62.77 |
TruthfulQA (0-shot) | 65.16 |
Winogrande (5-shot) | 77.51 |
GSM8k (5-shot) | 62.55 |
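The Avg. row is consistent with an unweighted mean of the six benchmark scores. The short check below recomputes it from the table values; the assumption that the leaderboard uses a plain arithmetic mean is inferred from the numbers, not stated in the table.

```python
# Scores copied from the table above.
SCORES = {
    "AI2 Reasoning Challenge (25-Shot)": 66.55,
    "HellaSwag (10-Shot)": 83.45,
    "MMLU (5-Shot)": 62.77,
    "TruthfulQA (0-shot)": 65.16,
    "Winogrande (5-shot)": 77.51,
    "GSM8k (5-shot)": 62.55,
}

average = sum(SCORES.values()) / len(SCORES)
print(f"mean of {len(SCORES)} benchmarks: {average:.3f}")
```

The mean comes out at about 69.665, which agrees with the reported 69.66 to within rounding.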