Introduction
This model is a quantized version of a linear merge of mistralai/Mistral-7B-v0.1 and daisd-ai/anydef-orpo-v2.
Merging
The models were merged to improve the quality of the final model (idea) and to prevent large losses during quantization. Merging was done using mergekit with the following spec:
models:
  - model: mistralai/Mistral-7B-v0.1
    parameters:
      weight: 0.3
  - model: daisd-ai/anydef-orpo-v2
    parameters:
      weight: 0.7
merge_method: linear
dtype: bfloat16
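As a rough illustration of what a linear merge with these weights computes, the sketch below averages two toy parameter sets element by element. It is not mergekit's implementation: the function name `linear_merge`, the `normalize` flag, and the toy values are assumptions made for this example, and real checkpoints hold tensors rather than Python lists.

```python
def linear_merge(state_dicts, weights, normalize=True):
    """Weighted average of parameters that share names and shapes (illustrative only)."""
    if normalize:
        # Rescale the weights so they sum to 1 (an assumption of this sketch).
        total = sum(weights)
        weights = [w / total for w in weights]
    merged = {}
    for name in state_dicts[0]:
        # Walk the same parameter position across all models.
        columns = zip(*(sd[name] for sd in state_dicts))
        merged[name] = [sum(w * v for w, v in zip(weights, vals)) for vals in columns]
    return merged


# Toy "checkpoints" with a single flat parameter each (hypothetical values).
base = {"layer.weight": [1.0, 2.0, -4.0]}
tuned = {"layer.weight": [3.0, 0.0, 6.0]}

# Same weights as in the spec above: 0.3 for the base model, 0.7 for the tuned one.
merged = linear_merge([base, tuned], weights=[0.3, 0.7])
print(merged["layer.weight"])
```

With weights 0.3 and 0.7, each merged value sits closer to the fine-tuned model's parameter than to the base model's.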
Quantization
Quantization was applied using LLM Compressor with 512 random examples from the anydef-kilt-tasks-v2 dataset. We tested other numbers of examples, but did not see a noticeable improvement when using more examples during quantization.
The recipe for quantization:
recipe = [
    SmoothQuantModifier(smoothing_strength=0.8),
    GPTQModifier(targets="Linear", scheme="W4A16", ignore=["lm_head"]),
]
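To give a feel for what 4-bit weights in the W4A16 scheme imply, here is a minimal sketch of symmetric round-to-nearest quantization of one small weight group. It is deliberately simpler than GPTQ, which chooses quantized values using calibration data to compensate for rounding error. The function names, the group values, and the symmetric signed range are assumptions made for this example, not LLM Compressor code.

```python
def quantize_group_int4(weights):
    """Symmetric round-to-nearest quantization of one weight group to 4 bits."""
    max_abs = max(abs(w) for w in weights)
    # A signed 4-bit integer covers [-8, 7]; map the largest magnitude to 7.
    scale = max_abs / 7 if max_abs > 0 else 1.0
    codes = [max(-8, min(7, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map the integer codes back to approximate floating-point weights."""
    return [c * scale for c in codes]


group = [0.12, -0.50, 0.33, 0.07]  # hypothetical weight values
codes, scale = quantize_group_int4(group)
restored = dequantize(codes, scale)
max_error = max(abs(a - b) for a, b in zip(group, restored))
print(codes, scale, max_error)
```

In this sketch the worst-case rounding error per weight is half the scale, which is why small groups with their own scales tend to lose less precision than a single scale for a whole matrix.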
Inference
For inference code, see our GitHub repository.
Benchmark results
Precision (%):
Dataset | anydef-v2 | anydef-v2-quant (this) |
---|---|---|
RSS-500 | 66.89 | 64.90 |
ISTEX-1000 | 85.82 | 84.33 |
Reuters-128 | 64.88 | 68.28 |
TweekiGold | 75.93 | 75.93 |
Retrieval rate (%):
Dataset | anydef-v2 | anydef-v2-quant (this) |
---|---|---|
RSS-500 | 84.11 | 83.44 |
ISTEX-1000 | 97.76 | 97.31 |
Reuters-128 | 83.33 | 83.87 |
TweekiGold | 91.67 | 91.44 |
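The two tables can be summarized as per-dataset differences between the quantized and the original model. The short script below is only arithmetic on the numbers reported above; the helper names `deltas` and `mean_delta` are made up for this example.

```python
# (anydef-v2, anydef-v2-quant) scores in percent, copied from the tables above.
precision = {
    "RSS-500": (66.89, 64.90),
    "ISTEX-1000": (85.82, 84.33),
    "Reuters-128": (64.88, 68.28),
    "TweekiGold": (75.93, 75.93),
}
retrieval = {
    "RSS-500": (84.11, 83.44),
    "ISTEX-1000": (97.76, 97.31),
    "Reuters-128": (83.33, 83.87),
    "TweekiGold": (91.67, 91.44),
}


def deltas(table):
    """Quantized minus original score per dataset, in percentage points."""
    return {name: round(quant - full, 2) for name, (full, quant) in table.items()}


def mean_delta(table):
    """Average change across datasets, in percentage points."""
    d = deltas(table)
    return sum(d.values()) / len(d)


for label, table in (("precision", precision), ("retrieval rate", retrieval)):
    print(label, deltas(table), round(mean_delta(table), 4))
```

On these numbers the quantized model loses up to about 2 percentage points of precision on RSS-500, gains 3.40 on Reuters-128, and changes by less than 1 point on every dataset for retrieval rate.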