Pastiche-Crown-Clown-7B-dare
Description
This repo contains GGUF format model files for Pastiche-Crown-Clown-7B-dare.
Files Provided
Name | Quant | Bits | File Size | Remark |
---|---|---|---|---|
pastiche-crown-clown-7b-dare.IQ3_XXS.gguf | IQ3_XXS | 3 | 3.02 GB | 3.06 bpw quantization |
pastiche-crown-clown-7b-dare.IQ3_S.gguf | IQ3_S | 3 | 3.18 GB | 3.44 bpw quantization |
pastiche-crown-clown-7b-dare.IQ3_M.gguf | IQ3_M | 3 | 3.28 GB | 3.66 bpw quantization mix |
pastiche-crown-clown-7b-dare.Q4_0.gguf | Q4_0 | 4 | 4.11 GB | 3.56G, +0.2166 ppl |
pastiche-crown-clown-7b-dare.IQ4_NL.gguf | IQ4_NL | 4 | 4.16 GB | 4.25 bpw non-linear quantization |
pastiche-crown-clown-7b-dare.Q4_K_M.gguf | Q4_K_M | 4 | 4.37 GB | 3.80G, +0.0532 ppl |
pastiche-crown-clown-7b-dare.Q5_K_M.gguf | Q5_K_M | 5 | 5.13 GB | 4.45G, +0.0122 ppl |
pastiche-crown-clown-7b-dare.Q6_K.gguf | Q6_K | 6 | 5.94 GB | 5.15G, +0.0008 ppl |
pastiche-crown-clown-7b-dare.Q8_0.gguf | Q8_0 | 8 | 7.70 GB | 6.70G, +0.0004 ppl |
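
As a rough aid for choosing among the files above, the following sketch picks the largest quant whose file fits a given memory budget. The file sizes are copied from the table. The helper names `pick_quant` and `file_name` and the 1 GB `overhead_gb` default are illustrative assumptions, not part of this repo; actual memory use also depends on context length and the runtime.

```python
# File sizes in GB, copied from the "Files Provided" table.
QUANT_SIZES_GB = {
    "IQ3_XXS": 3.02,
    "IQ3_S": 3.18,
    "IQ3_M": 3.28,
    "Q4_0": 4.11,
    "IQ4_NL": 4.16,
    "Q4_K_M": 4.37,
    "Q5_K_M": 5.13,
    "Q6_K": 5.94,
    "Q8_0": 7.70,
}


def pick_quant(budget_gb, overhead_gb=1.0):
    """Return the largest quant whose file fits in the budget, or None.

    overhead_gb is an assumed allowance for the KV cache and runtime
    buffers; it is a placeholder, not a measured value.
    """
    usable = budget_gb - overhead_gb
    fitting = {q: s for q, s in QUANT_SIZES_GB.items() if s <= usable}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)


def file_name(quant):
    """Build the file name following the naming pattern used in the table."""
    return f"pastiche-crown-clown-7b-dare.{quant}.gguf"


if __name__ == "__main__":
    choice = pick_quant(6.0)
    print(choice, file_name(choice) if choice else "no file fits")
```

Larger files generally cost less in perplexity (see the Remark column), so picking the largest file that fits is a reasonable default rather than a rule.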
Parameters
path | type | architecture | rope_theta | sliding_win | max_pos_embed |
---|---|---|---|---|---|
CorticalStack/pastiche-crown-clown-7b-dare | mistral | MistralForCausalLM | 10000.0 | 4096 | 32768 |
Original Model Card
pastiche-crown-clown-7B-dare
pastiche-crown-clown-7B-dare is a DARE merge of the following models using mergekit:
- bardsai/jaskier-7b-dpo-v5.6
- mlabonne/AlphaMonarch-7B
- mlabonne/NeuralMonarch-7B
- macadeliccc/MBX-7B-v3-DPO
See the paper "Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch" for more on the method.
🧩 Configuration
models:
  - model: bardsai/jaskier-7b-dpo-v5.6
  - model: mlabonne/AlphaMonarch-7B
    parameters:
      density: 0.53
      weight: 0.2
  - model: mlabonne/NeuralMonarch-7B
    parameters:
      density: 0.53
      weight: 0.4
  - model: macadeliccc/MBX-7B-v3-DPO
    parameters:
      density: 0.53
      weight: 0.4
merge_method: dare_ties
base_model: bardsai/jaskier-7b-dpo-v5.6
parameters:
  int8_mask: true
dtype: bfloat16
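
To give a feel for what `density` and `weight` control, here is a minimal sketch of the DARE drop-and-rescale step on toy lists of floats. It is not mergekit's implementation: the sign-consensus step that `dare_ties` adds is omitted, the function names are hypothetical, and the toy weight values are made up for illustration. Each fine-tuned model's delta from the base is randomly kept with probability `density`, the kept entries are rescaled by `1 / density`, and the result is added to the base scaled by `weight`.

```python
import random


def dare_delta(base, tuned, density, rng):
    """Sparsify the delta (tuned - base) with DARE.

    Each entry is kept with probability `density` and rescaled by
    1 / density; dropped entries become 0.0.
    """
    out = []
    for b, t in zip(base, tuned):
        delta = t - b
        keep = rng.random() < density
        out.append(delta / density if keep else 0.0)
    return out


def dare_merge(base, tuned_models, rng):
    """Add weighted, sparsified deltas to the base parameters.

    tuned_models is a list of (params, density, weight) tuples.
    The TIES sign-election step used by dare_ties is NOT implemented here.
    """
    merged = list(base)
    for tuned, density, weight in tuned_models:
        sparse = dare_delta(base, tuned, density, rng)
        for i, d in enumerate(sparse):
            merged[i] += weight * d
    return merged


if __name__ == "__main__":
    rng = random.Random(0)
    base = [0.0, 1.0, -1.0, 0.5]  # toy values, not real model weights
    models = [
        ([0.2, 1.1, -0.8, 0.5], 0.53, 0.2),
        ([0.1, 0.9, -1.2, 0.7], 0.53, 0.4),
        ([0.0, 1.3, -1.0, 0.4], 0.53, 0.4),
    ]
    print(dare_merge(base, models, rng))
```

The densities and weights in the usage example mirror the configuration above. The rescaling by `1 / density` keeps the expected value of each delta entry unchanged despite the random dropping.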