---
license: apache-2.0
datasets:
- argilla/distilabel-intel-orca-dpo-pairs
base_model: sethuiyer/Chikuma_10.7B
library_name: transformers
pipeline_tag: text-generation
tags:
- dpo
---
# Chikuma_10.7B - V2 (Enhanced with DPO)
<p align="center">
<img src="https://huggingface.co/sethuiyer/distilabled_Chikuma_10.7B/resolve/main/chikuma_v2.webp" height="256px" alt="Chikuma">
</p>
This model is the **DPO fine-tuned version** of [Chikuma_10.7B](https://huggingface.co/sethuiyer/Chikuma_10.7B), which was a depth-upscaled merge of:
* [sethuiyer/SynthIQ-7b](https://huggingface.co/sethuiyer/SynthIQ-7b)
* [openchat/openchat-3.5-0106](https://huggingface.co/openchat/openchat-3.5-0106)
The name "Chikuma" is inspired by the [Chikuma River](https://en.wikipedia.org/wiki/Shinano_River), the longest in Japan, known for its continuous flow and meandering path.
This metaphorically represents the model's depth, fluidity, and adaptability in processing and understanding language.
# Dataset used for Fine Tuning
Dataset: `argilla/distilabel-intel-orca-dpo-pairs`
The filtered dataset contains roughly 3,000 samples, all of high quality according to `chosen_score`.
The following filters were applied to the original dataset:
```python
from datasets import load_dataset

# Load the original preference dataset
dataset = load_dataset("argilla/distilabel-intel-orca-dpo-pairs", split="train")

# Keep only decisive, high-scoring pairs that are not part of the GSM8K train split
dataset = dataset.filter(
    lambda r:
        r["status"] != "tie" and
        r["chosen_score"] >= 8 and
        not r["in_gsm8k_train"]
)
```
# Chat Template
The chat template for Chikuma_10.7B - V2 is a modified version of ChatML, optimized for improved interaction and engagement:
```
<|im_start|>GPT4 Correct system:
{system} Always use <|end_of_turn|> when you want to end the answer. <|im_end|>
<|im_start|>GPT4 Correct user:
{user}<|im_end|>
<|im_start|>GPT4 Correct Assistant:
{assistant}<|im_end|>
```
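For illustration, here is a minimal, hand-rolled sketch of building a single-turn prompt in this format; the helper name and example strings are hypothetical, and in practice `tokenizer.apply_chat_template` (see Usage below) produces the prompt for you.
```python
# Hypothetical helper mirroring the modified ChatML format above.
# Generation continues from the "GPT4 Correct Assistant:" header.
def build_prompt(system: str, user: str) -> str:
    return (
        "<|im_start|>GPT4 Correct system:\n"
        f"{system} Always use <|end_of_turn|> when you want to end the answer. <|im_end|>\n"
        "<|im_start|>GPT4 Correct user:\n"
        f"{user}<|im_end|>\n"
        "<|im_start|>GPT4 Correct Assistant:\n"
    )

print(build_prompt("You are a helpful assistant chatbot.", "Who invented LLMs?"))
```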
## Nous Benchmark Evaluation
| Model | AGIEval | GPT4All | TruthfulQA | Bigbench | Average |
|-------------------------------|---------|---------|------------|----------|---------|
| SynthIQ-7b | 42.67 | 73.71 | 56.51 | 44.59 | 54.37 |
| openchat/openchat-3.5-0106 | 44.17 | 73.72 | 52.53 | 44.4 | 53.71 |
| Chikuma_10.7B | 42.41 | 73.41 | 56.69 | 43.5 | 54.00 |
| **distilabled_Chikuma_10.7B** | **42.77** | **73.81** | **58.83** | **44.83** | **55.06** |
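The Average column is simply the mean of the four benchmark scores; the snippet below reproduces it from the values in the table (agreeing with the column up to rounding).
```python
# Mean of the four Nous benchmark scores per model (values copied from the table above).
scores = {
    "SynthIQ-7b":                [42.67, 73.71, 56.51, 44.59],
    "openchat-3.5-0106":         [44.17, 73.72, 52.53, 44.40],
    "Chikuma_10.7B":             [42.41, 73.41, 56.69, 43.50],
    "distilabled_Chikuma_10.7B": [42.77, 73.81, 58.83, 44.83],
}
for name, vals in scores.items():
    print(f"{name}: {sum(vals) / len(vals):.2f}")
```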
# OpenLLM Leaderboard
| Benchmark Name | Performance |
|----------------|-------------|
| ARC | 66.38 |
| HellaSwag | 85 |
| MMLU | 65.27 |
| TruthfulQA | 58.83 |
| Winogrande | 78.77 |
| GSM8K | 63.68 |
| **Average** | **69.65** |
### Training Environment
- Hardware: a single A100 80 GB GPU on RunPod, used for approximately 1.5 hours.
- Training Script: Accessible via [Google Colab Notebook](https://colab.research.google.com/drive/15iFBr1xWgztXvhrj5I9fBv20c7CFOPBE?usp=sharing). Special thanks to [mlabonne](https://huggingface.co/mlabonne) for providing the template.
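For reference, below is a minimal sketch of what such a DPO run looks like with `trl`'s `DPOTrainer` (older-style API where `beta` and the length limits are passed directly). The hyperparameters, LoRA settings, and data preparation details are illustrative assumptions, not the exact configuration from the notebook above.
```python
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

base_model = "sethuiyer/Chikuma_10.7B"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=torch.bfloat16)

# Preference pairs, filtered as in the dataset section above.
# DPOTrainer expects "prompt", "chosen" and "rejected" columns; mapping the raw
# columns and applying the chat template to the prompts is omitted here.
dataset = load_dataset("argilla/distilabel-intel-orca-dpo-pairs", split="train")
dataset = dataset.filter(
    lambda r: r["status"] != "tie" and r["chosen_score"] >= 8 and not r["in_gsm8k_train"]
)

# Illustrative LoRA and optimizer settings (assumptions, not the notebook's values).
peft_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
training_args = TrainingArguments(
    output_dir="distilabled_Chikuma_10.7B",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    learning_rate=5e-5,
    lr_scheduler_type="cosine",
    max_steps=200,
    bf16=True,
)

trainer = DPOTrainer(
    model=model,
    ref_model=None,        # with a peft_config, the frozen reference model is handled internally
    args=training_args,
    train_dataset=dataset,
    tokenizer=tokenizer,
    peft_config=peft_config,
    beta=0.1,              # strength of the pull toward the reference policy
    max_prompt_length=1024,
    max_length=1536,
)
trainer.train()
```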
## Usage
```python
import transformers
from transformers import AutoTokenizer

model_id = "sethuiyer/distilabled_Chikuma_10.7B"

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Create a text-generation pipeline
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    tokenizer=tokenizer,
    device="cuda",
)

# Format the prompt with the chat template and generate text
messages = [
    {"role": "system", "content": "You are a helpful assistant chatbot."},
    {"role": "user", "content": "Who invented LLMs?"},
]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
sequences = pipeline(prompt, max_new_tokens=512)
print(sequences[0]["generated_text"])
```
## Acknowledgements
A heartfelt appreciation goes to the vibrant open-source community, particularly:
* The Intel team for publishing a great open dataset and showing how well it works in the first place.
* Teknium and NousResearch for their awesome work and models.
* Maxime for sharing such great resources.
* Argilla for publishing `argilla/distilabel-intel-orca-dpo-pairs`.