WestSeverus-7B-DPO-v2

☘️ Model Description

WestSeverus-7B-DPO-v2 is a WestLake-family model trained on top of WestSeverus-7B.

The model was trained on several DPO datasets and performs well on basic math problems.

WestSeverus-7B-DPO-v2 can be used for mathematics, chemistry, physics, and even coding, for further research and reference.

📖 Table of Contents

Nous Benchmark Results
- AGIEval
- GPT4All
- TruthfulQA Scores
- BigBench
Open LLM Leaderboard
- ARC
- HellaSwag
- MMLU
- TruthfulQA
- Winogrande
- GSM8K
EvalPlus Leaderboard
- HumanEval
- HumanEval_Plus
- MBPP
- MBPP_Plus
Prompt Format
Quantized Models
Gratitude

🪄 Nous Benchmark Results

WestSeverus-7B-DPO-v2 currently tops YALL - Yet Another LLM Leaderboard, created by CultriX, and has the highest TruthfulQA and BigBench scores among the models listed below.

Model	Average	AGIEval	GPT4All	TruthfulQA	BigBench
WestSeverus-7B-DPO-v2	60.98	45.29	77.2	72.72	48.71
CultriX/Wernicke-7B-v1	60.73	45.59	77.36	71.46	48.49
mlabonne/NeuralBeagle14-7B	60.25	46.06	76.77	70.32	47.86
CultriX/MistralTrix-v1	60.05	44.98	76.62	71.44	47.17
senseable/WestLake-7B-v2	59.42	44.27	77.86	67.46	48.09
mlabonne/Daredevil-7B	58.22	44.85	76.07	64.89	47.07
microsoft/phi-2	44.61	27.96	70.84	44.46	35.17
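
The Average column appears to be the plain arithmetic mean of the four benchmark scores. This is an observation from the numbers in the table, not something the leaderboard is quoted as stating here. A minimal sketch that recomputes it, where the names `NOUS_SCORES` and `mean` are hypothetical helpers introduced only for illustration:

```python
# Scores copied from the table above.
# Layout: model -> (reported average, [AGIEval, GPT4All, TruthfulQA, BigBench])
NOUS_SCORES = {
    "WestSeverus-7B-DPO-v2": (60.98, [45.29, 77.2, 72.72, 48.71]),
    "CultriX/Wernicke-7B-v1": (60.73, [45.59, 77.36, 71.46, 48.49]),
    "mlabonne/NeuralBeagle14-7B": (60.25, [46.06, 76.77, 70.32, 47.86]),
    "CultriX/MistralTrix-v1": (60.05, [44.98, 76.62, 71.44, 47.17]),
    "senseable/WestLake-7B-v2": (59.42, [44.27, 77.86, 67.46, 48.09]),
    "mlabonne/Daredevil-7B": (58.22, [44.85, 76.07, 64.89, 47.07]),
    "microsoft/phi-2": (44.61, [27.96, 70.84, 44.46, 35.17]),
}


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Assumption: the reported average is the unweighted mean, rounded to 2 decimals.
for model, (reported, parts) in NOUS_SCORES.items():
    diff = abs(mean(parts) - reported)
    print(f"{model}: reported={reported} recomputed={mean(parts):.3f} diff={diff:.3f}")
```

Every row agrees with the reported average to within rounding, which is why the comparison uses a small tolerance instead of exact equality.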

🏆 Open LLM Leaderboard

WestSeverus-7B-DPO-v2 is one of the top 7B models on the Open LLM Leaderboard, with particularly strong TruthfulQA and GSM8K scores.

Metric	Value
Avg.	75.29
AI2 Reasoning Challenge (25-shot)	71.42
HellaSwag (10-shot)	88.27
MMLU (5-shot)	64.79
TruthfulQA (0-shot)	72.37
Winogrande (5-shot)	83.27
GSM8K (5-shot)	71.65

Detailed results can be found here

⚡ EvalPlus Leaderboard

Model	HumanEval	HumanEval_Plus	MBPP	MBPP_Plus
phi-2-2.7B	48.2	43.3	61.9	51.4
WestSeverus-7B-DPO-v2	43.3	34.1	TBD	TBD
SOLAR-10.7B-Instruct-v1.0	42.1	34.3	42.9	34.6
CodeLlama-7B	37.8	34.1	57.6	45.4

⚗️ Prompt Format

WestSeverus-7B-DPO-v2 was trained using the ChatML prompt template with system prompts. An example follows:

<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
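
The template above can be filled in with plain string formatting. The following is a minimal sketch, not the author's tooling: the function name `build_chatml_prompt` is hypothetical, and ending the prompt with a newline after `assistant` is an assumption based on common ChatML usage.

```python
def build_chatml_prompt(system_message: str, prompt: str) -> str:
    """Fill the ChatML template with a system message and a user prompt.

    The returned string ends with the opening assistant turn, so the
    model's generation continues from there.
    """
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        # Assumption: a trailing newline follows the assistant tag.
        "<|im_start|>assistant\n"
    )


if __name__ == "__main__":
    print(build_chatml_prompt("You are a helpful assistant.", "What is 12 * 7?"))
```

When the model is loaded through a library that ships its own chat template, that template is usually the safer choice; the helper above is only meant to make the layout of the turns explicit.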

🛠️ Quantized Models

Quantized builds of another version of the WestSeverus model:

PetroGPT/WestSeverus-7B-DPO
GGUF: https://huggingface.co/TheBloke/WestSeverus-7B-DPO-GGUF
GGUF: https://huggingface.co/s3nh/WestSeverus-7B-DPO-GGUF
GPTQ: https://huggingface.co/TheBloke/WestSeverus-7B-DPO-GPTQ
AWQ: https://huggingface.co/TheBloke/WestSeverus-7B-DPO-AWQ

MaziyarPanahi/WestSeverus-7B-DPO-v2-GGUF

GGUF: https://huggingface.co/MaziyarPanahi/WestSeverus-7B-DPO-v2-GGUF

🙏 Gratitude

Thanks to @senseable for senseable/WestLake-7B-v2.
Thanks to @jondurbin for jondurbin/truthy-dpo-v0.1 dataset.
Thanks to @Charles Goddard for MergeKit.
Thanks to @TheBloke, @s3nh, @MaziyarPanahi for Quantized Models.
Thanks to @mlabonne, @CultriX for YALL - Yet Another LLM Leaderboard.
Thank you to everyone else in the open-source AI community who has used this model for further research and improvement.
