QuantFactory/Replete-LLM-V2.5-Qwen-14b-GGUF

This is a quantized version of Replete-AI/Replete-LLM-V2.5-Qwen-14b, created using llama.cpp.

Original Model Card

Replete-LLM-V2.5-Qwen-14b

Replete-LLM-V2.5-Qwen-14b is a continuously finetuned version of Qwen2.5-14B. I noticed recently that the Qwen team did not adopt my method of continuous finetuning, which has great benefits and no downsides. So I took it upon myself to merge the instruct model with the base model using the TIES merge method.
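For readers unfamiliar with TIES merging, the idea can be sketched on toy weight vectors. This is a minimal illustration only, not the configuration used for this model: the function names, the `density` and `scale` parameters, and the example numbers are all assumptions. It follows the three usual steps: trim each task vector to its largest-magnitude entries, elect a sign per parameter, then average only the entries that agree with that sign.

```python
def trim(task_vector, density):
    """Keep the largest-magnitude fraction `density` of entries; zero the rest."""
    k = max(1, round(len(task_vector) * density))
    order = sorted(range(len(task_vector)),
                   key=lambda i: abs(task_vector[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(task_vector)]


def ties_merge(base, finetuned_models, density=0.5, scale=1.0):
    """Toy TIES merge over flat lists of floats (hypothetical helper)."""
    # Task vector = finetuned weights minus base weights.
    task_vectors = [[f - b for f, b in zip(model, base)]
                    for model in finetuned_models]
    trimmed = [trim(tv, density) for tv in task_vectors]

    merged = []
    for i, b in enumerate(base):
        values = [tv[i] for tv in trimmed]
        # Elect the sign carrying the larger total magnitude.
        sign = 1.0 if sum(values) >= 0 else -1.0
        # Disjoint merge: average only nonzero entries that agree with the sign.
        agreeing = [v for v in values if v * sign > 0]
        delta = sum(agreeing) / len(agreeing) if agreeing else 0.0
        merged.append(b + scale * delta)
    return merged


if __name__ == "__main__":
    base = [0.0, 0.0, 0.0, 0.0]
    model_a = [1.0, -2.0, 0.1, 0.5]
    model_b = [3.0, 1.0, -0.2, -0.4]
    print(ties_merge(base, [model_a, model_b], density=0.5))
```

In practice a merge like this is run over full model checkpoints with a dedicated merging tool rather than hand-written code; the sketch only shows why conflicting updates do not cancel each other out.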

This version of the model shows higher performance than the original instruct and base models.

Quants:

GGUF: https://huggingface.co/bartowski/Replete-LLM-V2.5-Qwen-14b-GGUF

Benchmarks:

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Value |
|---|---|
| Avg. | 34.52 |
| IFEval (0-Shot) | 58.40 |
| BBH (3-Shot) | 49.39 |
| MATH Lvl 5 (4-Shot) | 15.63 |
| GPQA (0-shot) | 16.22 |
| MuSR (0-shot) | 18.83 |
| MMLU-PRO (5-shot) | 48.62 |

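As a quick sanity check, the reported average appears to be the unweighted mean of the six benchmark scores. The sketch below assumes that interpretation and assumes half-up rounding to two decimals; the function name is made up for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

# Scores copied from the results above.
SCORES = {
    "IFEval (0-Shot)": Decimal("58.40"),
    "BBH (3-Shot)": Decimal("49.39"),
    "MATH Lvl 5 (4-Shot)": Decimal("15.63"),
    "GPQA (0-shot)": Decimal("16.22"),
    "MuSR (0-shot)": Decimal("18.83"),
    "MMLU-PRO (5-shot)": Decimal("48.62"),
}


def leaderboard_average(scores):
    """Unweighted mean, rounded half-up to two decimals (assumed convention)."""
    mean = sum(scores.values()) / len(scores)
    return mean.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    print(leaderboard_average(SCORES))
```

Decimal arithmetic is used here so the half-way case is rounded deterministically rather than depending on binary floating-point representation.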
GGUF
Model size: 14.8B params
Architecture: qwen2

Available quantizations: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit
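To get a rough feel for what each bit width means for download size, the parameter count can be multiplied by the nominal bits per weight. This is a back-of-the-envelope sketch only: the helper name is hypothetical, and real GGUF files are typically somewhat larger because quantization formats store per-block scales and keep some tensors at higher precision.

```python
PARAMS = 14.8e9  # parameter count stated on this card


def estimate_size_gb(params, bits_per_weight):
    """Nominal size in GB (10^9 bytes), ignoring format overhead."""
    return params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    for bits in (2, 3, 4, 5, 6, 8):
        print(f"{bits}-bit: ~{estimate_size_gb(PARAMS, bits):.1f} GB (lower bound)")
```

Treat these figures as lower bounds when checking whether a given quantization fits in available RAM or VRAM; the file sizes listed in the repository are the authoritative numbers.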

Model tree for QuantFactory/Replete-LLM-V2.5-Qwen-14b-GGUF

Base model: Qwen/Qwen2.5-14B
Quantized (50): this model

Evaluation results