WestSeverus-7B-DPO-v2
Model Description
WestSeverus-7B-DPO-v2 is a WestLake-family model trained on top of WestSeverus-7B.
The model was trained on several DPO datasets and performs well on basic math problems.
WestSeverus-7B-DPO-v2 can be used for mathematics, chemistry, physics, and even coding, for further research and reference.
Table of Contents
- Nous Benchmark Results
  - AGIEval
  - GPT4All
  - TruthfulQA
  - BigBench
- Open LLM Leaderboard
  - ARC
  - HellaSwag
  - MMLU
  - TruthfulQA
  - Winogrande
  - GSM8K
- EvalPlus Leaderboard
  - HumanEval
  - HumanEval_Plus
  - MBPP
  - MBPP_Plus
Nous Benchmark Results
WestSeverus-7B-DPO-v2 is currently at the top of YALL - Yet Another LLM Leaderboard, created by CultriX, and has the highest TruthfulQA and BigBench scores among the models listed below.
Model | Average | AGIEval | GPT4All | TruthfulQA | Bigbench |
---|---|---|---|---|---|
WestSeverus-7B-DPO-v2 | 60.98 | 45.29 | 77.2 | 72.72 | 48.71 |
CultriX/Wernicke-7B-v1 | 60.73 | 45.59 | 77.36 | 71.46 | 48.49 |
mlabonne/NeuralBeagle14-7B | 60.25 | 46.06 | 76.77 | 70.32 | 47.86 |
CultriX/MistralTrix-v1 | 60.05 | 44.98 | 76.62 | 71.44 | 47.17 |
senseable/WestLake-7B-v2 | 59.42 | 44.27 | 77.86 | 67.46 | 48.09 |
mlabonne/Daredevil-7B | 58.22 | 44.85 | 76.07 | 64.89 | 47.07 |
microsoft/phi-2 | 44.61 | 27.96 | 70.84 | 44.46 | 35.17 |
Open LLM Leaderboard
WestSeverus-7B-DPO-v2 is one of the top 7B models on the Open LLM Leaderboard, where it performs especially well on TruthfulQA and GSM8K.
Metric | Value |
---|---|
Avg. | 75.29 |
AI2 Reasoning Challenge (25-Shot) | 71.42 |
HellaSwag (10-Shot) | 88.27 |
MMLU (5-Shot) | 64.79 |
TruthfulQA (0-shot) | 72.37 |
Winogrande (5-shot) | 83.27 |
GSM8k (5-shot) | 71.65 |
Detailed results can be found here
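As a quick sanity check on the table above, the short sketch below recomputes the average from the six benchmark scores. It assumes the `Avg.` row is the unweighted mean of the six values; the dictionary name `scores` is only illustrative.

```python
# Scores copied from the Open LLM Leaderboard table above.
scores = {
    "ARC (25-shot)": 71.42,
    "HellaSwag (10-shot)": 88.27,
    "MMLU (5-shot)": 64.79,
    "TruthfulQA (0-shot)": 72.37,
    "Winogrande (5-shot)": 83.27,
    "GSM8K (5-shot)": 71.65,
}

# Assumption: the reported average is a plain arithmetic mean.
average = sum(scores.values()) / len(scores)
print(f"Recomputed average: {average:.3f}")
```

Under that assumption, the recomputed value agrees with the reported 75.29 to within rounding.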
EvalPlus Leaderboard
Model | HumanEval | HumanEval_Plus | MBPP | MBPP_Plus |
---|---|---|---|---|
phi-2-2.7B | 48.2 | 43.3 | 61.9 | 51.4 |
WestSeverus-7B-DPO-v2 | 43.3 | 34.1 | TBD | TBD |
SOLAR-10.7B-Instruct-v1.0 | 42.1 | 34.3 | 42.9 | 34.6 |
CodeLlama-7B | 37.8 | 34.1 | 57.6 | 45.4 |
Prompt Format
WestSeverus-7B-DPO-v2 was trained using the ChatML prompt template with system prompts. An example follows:
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
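For illustration, here is a minimal sketch of filling this template in Python. The helper name `build_chatml_prompt` and the example messages are hypothetical, and the trailing newline after `<|im_start|>assistant` is an assumption rather than something the template above specifies.

```python
def build_chatml_prompt(system_message: str, prompt: str) -> str:
    """Fill the ChatML template and leave the assistant turn open for generation."""
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        # Assumption: a newline follows the assistant tag before the model's reply.
        "<|im_start|>assistant\n"
    )


# Hypothetical example messages.
example = build_chatml_prompt(
    "You are a helpful assistant.",
    "What is 12 * 7?",
)
print(example)
```

The resulting string can be passed as the raw prompt to whichever inference runtime is used; if the tokenizer ships its own chat template, that template may be the safer choice.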
Quantized Models
Quantized versions of the WestSeverus model:
GGUF: https://huggingface.co/TheBloke/WestSeverus-7B-DPO-GGUF
GPTQ: https://huggingface.co/TheBloke/WestSeverus-7B-DPO-GPTQ
MaziyarPanahi/WestSeverus-7B-DPO-v2-GGUF
Gratitude
- Thanks to @senseable for senseable/WestLake-7B-v2.
- Thanks to @jondurbin for jondurbin/truthy-dpo-v0.1 dataset.
- Thanks to @Charles Goddard for MergeKit.
- Thanks to @TheBloke, @s3nh, @MaziyarPanahi for Quantized Models.
- Thanks to @mlabonne, @CultriX for YALL - Yet Another LLM Leaderboard.
- Thank you to all the other people in the Open Source AI community who utilized this model for further research and improvement.