Update README.md (#1)
Update README.md (bf54899a81e9b238292887279ced88a59f36431e)
Co-authored-by: FBL <[email protected]>
README.md
CHANGED
@@ -10,8 +10,131 @@ datasets:
|
|
10 |
- mlabonne/orpo-dpo-mix-40k
|
11 |
quantized_by: bartowski
|
12 |
pipeline_tag: text-generation
|
13 |
---
|
14 |
|
15 |
## Llamacpp imatrix Quantizations of UNA-ThePitbull-21.4B-v2
|
16 |
|
17 |
Using <a href="https://github.com/ggerganov/llama.cpp/">llama.cpp</a> release <a href="https://github.com/ggerganov/llama.cpp/releases/tag/b3001">b3001</a> for quantization.
|
@@ -105,3 +228,95 @@ These I-quants can also be used on CPU and Apple Metal, but will be slower than
|
|
105 |
The I-quants are *not* compatible with Vulkan, which is also AMD, so if you have an AMD card, double-check whether you're using the rocBLAS build or the Vulkan build. At the time of writing, LM Studio has a preview with ROCm support, and other inference engines have specific builds for ROCm.
|
106 |
|
107 |
Want to support my work? Visit my ko-fi page here: https://ko-fi.com/bartowski
|
10 |
- mlabonne/orpo-dpo-mix-40k
|
11 |
quantized_by: bartowski
|
12 |
pipeline_tag: text-generation
|
13 |
+
model-index:
|
14 |
+
- name: UNA-ThePitbull-21.4B-v2
|
15 |
+
results:
|
16 |
+
- task:
|
17 |
+
type: text-generation
|
18 |
+
name: Text Generation
|
19 |
+
dataset:
|
20 |
+
name: AI2 Reasoning Challenge (25-Shot)
|
21 |
+
type: ai2_arc
|
22 |
+
config: ARC-Challenge
|
23 |
+
split: test
|
24 |
+
args:
|
25 |
+
num_few_shot: 25
|
26 |
+
metrics:
|
27 |
+
- type: acc_norm
|
28 |
+
value: 77.73
|
29 |
+
name: normalized accuracy
|
30 |
+
source:
|
31 |
+
url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=fblgit/UNA-ThePitbull-21.4B-v2
|
32 |
+
name: Open LLM Leaderboard
|
33 |
+
- task:
|
34 |
+
type: text-generation
|
35 |
+
name: Text Generation
|
36 |
+
dataset:
|
37 |
+
name: HellaSwag (10-Shot)
|
38 |
+
type: hellaswag
|
39 |
+
split: validation
|
40 |
+
args:
|
41 |
+
num_few_shot: 10
|
42 |
+
metrics:
|
43 |
+
- type: acc_norm
|
44 |
+
value: 91.79
|
45 |
+
name: normalized accuracy
|
46 |
+
source:
|
47 |
+
url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=fblgit/UNA-ThePitbull-21.4B-v2
|
48 |
+
name: Open LLM Leaderboard
|
49 |
+
- task:
|
50 |
+
type: text-generation
|
51 |
+
name: Text Generation
|
52 |
+
dataset:
|
53 |
+
name: MMLU (5-Shot)
|
54 |
+
type: cais/mmlu
|
55 |
+
config: all
|
56 |
+
split: test
|
57 |
+
args:
|
58 |
+
num_few_shot: 5
|
59 |
+
metrics:
|
60 |
+
- type: acc
|
61 |
+
value: 68.25
|
62 |
+
name: accuracy
|
63 |
+
source:
|
64 |
+
url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=fblgit/UNA-ThePitbull-21.4B-v2
|
65 |
+
name: Open LLM Leaderboard
|
66 |
+
- task:
|
67 |
+
type: text-generation
|
68 |
+
name: Text Generation
|
69 |
+
dataset:
|
70 |
+
name: TruthfulQA (0-shot)
|
71 |
+
type: truthful_qa
|
72 |
+
config: multiple_choice
|
73 |
+
split: validation
|
74 |
+
args:
|
75 |
+
num_few_shot: 0
|
76 |
+
metrics:
|
77 |
+
- type: mc2
|
78 |
+
value: 78.24
|
79 |
+
source:
|
80 |
+
url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=fblgit/UNA-ThePitbull-21.4B-v2
|
81 |
+
name: Open LLM Leaderboard
|
82 |
+
- task:
|
83 |
+
type: text-generation
|
84 |
+
name: Text Generation
|
85 |
+
dataset:
|
86 |
+
name: Winogrande (5-shot)
|
87 |
+
type: winogrande
|
88 |
+
config: winogrande_xl
|
89 |
+
split: validation
|
90 |
+
args:
|
91 |
+
num_few_shot: 5
|
92 |
+
metrics:
|
93 |
+
- type: acc
|
94 |
+
value: 87.37
|
95 |
+
name: accuracy
|
96 |
+
source:
|
97 |
+
url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=fblgit/UNA-ThePitbull-21.4B-v2
|
98 |
+
name: Open LLM Leaderboard
|
99 |
+
- task:
|
100 |
+
type: text-generation
|
101 |
+
name: Text Generation
|
102 |
+
dataset:
|
103 |
+
name: GSM8k (5-shot)
|
104 |
+
type: gsm8k
|
105 |
+
config: main
|
106 |
+
split: test
|
107 |
+
args:
|
108 |
+
num_few_shot: 5
|
109 |
+
metrics:
|
110 |
+
- type: acc
|
111 |
+
value: 63.53
|
112 |
+
name: accuracy
|
113 |
+
source:
|
114 |
+
url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=fblgit/UNA-ThePitbull-21.4B-v2
|
115 |
+
name: Open LLM Leaderboard
|
116 |
---
|
117 |
|
118 |
+
# UNA-ThePitbull 21.4B v2
|
119 |
+
|
120 |
+
Introducing the best LLM in the industry: nearly as good as a 70B model at only 21.4B parameters, based on `saltlux/luxia-21.4b-alignment-v1.0`.
|
121 |
+
![UNA - ThePitbull 21.4B v2](https://huggingface.co/fblgit/UNA-ThePitbull-21.4B-v2/resolve/main/DE-UNA-ThePitbull-21.4B-v2.png)
|
122 |
+
|
123 |
+
This model has not been poisoned to score high and be useless. We release it because it's the real deal: EQ and IQ together in a powerful, smart, and conversational model.
|
124 |
+
|
125 |
+
## [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
|
126 |
+
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_fblgit__UNA-ThePitbull-21.4B-v2)
|
127 |
+
|
128 |
+
| Metric |Value|
|
129 |
+
|---------------------------------|----:|
|
130 |
+
|Avg. |77.82|
|
131 |
+
|AI2 Reasoning Challenge (25-Shot)|77.73|
|
132 |
+
|HellaSwag (10-Shot) |91.79|
|
133 |
+
|MMLU (5-Shot) |68.25|
|
134 |
+
|TruthfulQA (0-shot) |78.24|
|
135 |
+
|Winogrande (5-shot) |87.37|
|
136 |
+
|GSM8k (5-shot) |63.53|
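
As a quick sanity check, the `Avg.` row can be recomputed from the six benchmark rows. The sketch below assumes the leaderboard average is a plain unweighted mean, which matches the numbers in the table; the function name `leaderboard_average` is only illustrative.

```python
# Scores copied from the table above (Open LLM Leaderboard, in percent).
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 77.73,
    "HellaSwag (10-Shot)": 91.79,
    "MMLU (5-Shot)": 68.25,
    "TruthfulQA (0-shot)": 78.24,
    "Winogrande (5-shot)": 87.37,
    "GSM8k (5-shot)": 63.53,
}


def leaderboard_average(values):
    """Unweighted mean of the benchmark scores, rounded to two decimals."""
    values = list(values)
    return round(sum(values) / len(values), 2)


average = leaderboard_average(scores.values())
print(f"Avg. {average:.2f}")  # prints "Avg. 77.82"
```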
|
137 |
+
|
138 |
## Llamacpp imatrix Quantizations of UNA-ThePitbull-21.4B-v2
|
139 |
|
140 |
Using <a href="https://github.com/ggerganov/llama.cpp/">llama.cpp</a> release <a href="https://github.com/ggerganov/llama.cpp/releases/tag/b3001">b3001</a> for quantization.
|
|
|
228 |
The I-quants are *not* compatible with Vulkan, which is also AMD, so if you have an AMD card, double-check whether you're using the rocBLAS build or the Vulkan build. At the time of writing, LM Studio has a preview with ROCm support, and other inference engines have specific builds for ROCm.
|
229 |
|
230 |
Want to support my work? Visit my ko-fi page here: https://ko-fi.com/bartowski
|
231 |
+
|
232 |
+
## Difference V1 vs V2
|
233 |
+
|
234 |
+
On V2 we implemented a different UNA strategy and partially covered the MLPs and attention layers.
|
235 |
+
We also performed further SFT and further DPO over V1, and we'll release some of those models soon as well.
|
236 |
+
|
237 |
+
### Changes
|
238 |
+
|
239 |
+
1. SFT over V1 with `Replete-AI/code_bagel_hermes-2.5`, learning rate 1.0e-4 down to 5.0e-5
|
240 |
+
2. DPO, learning rate 1.0e-4 down to min_lr 5.0e-5, with:
|
241 |
+
* `mlabonne/orpo-dpo-mix-40k`
|
242 |
+
* `jondurbin/py-dpo-v0.1`
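
The list above gives only the two endpoints of the learning rate (1.0e-4 and a minimum of 5.0e-5), not the shape of the schedule. As a minimal sketch, assuming a cosine decay with a floor (an assumption, not something the source states), the rate over training could look like this. `TOTAL_STEPS` and `cosine_lr` are hypothetical names chosen for illustration.

```python
import math

PEAK_LR = 1.0e-4  # starting learning rate from the list above
MIN_LR = 5.0e-5   # min_lr from the list above


def cosine_lr(step, total_steps, peak_lr=PEAK_LR, min_lr=MIN_LR):
    """Cosine decay from peak_lr at step 0 to min_lr at total_steps.

    The cosine shape is an assumption; the source only gives the endpoints.
    """
    progress = min(max(step / total_steps, 0.0), 1.0)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


TOTAL_STEPS = 1000  # hypothetical; the source does not state the step count
for step in (0, 250, 500, 750, 1000):
    print(step, f"{cosine_lr(step, TOTAL_STEPS):.2e}")
```

Whatever the real schedule was, the point of the floor is that the rate never drops below 5.0e-5, so late training steps still update the weights noticeably.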
|
243 |
+
|
244 |
+
# Evaluations
|
245 |
+
|
246 |
+
These results can only be compared with the non-UNA base model (the original luxia-21.4b) and with ThePitbull-v1.
|
247 |
+
|
248 |
+
## UNA v2 (VLLM) Evaluations
|
249 |
+
```
|
250 |
+
vllm (pretrained=/data/tools/mergekit/una-thepitbull-v5,dtype=bfloat16,gpu_memory_utilization=0.8,max_model_len=2048,data_parallel_size=2,tensor_parallel_size=4), gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: 8
|
251 |
+
| Tasks |Version| Filter |n-shot| Metric |Value | |Stderr|
|
252 |
+
|--------------|------:|----------------|-----:|-----------|-----:|---|-----:|
|
253 |
+
|gsm8k | 3|strict-match | 5|exact_match|0.7695|± |0.0116|+
|
254 |
+
| | |flexible-extract| 5|exact_match|0.7695|± |0.0116|+
|
255 |
+
|hellaswag | 1|none | 10|acc |0.8110|± |0.0039|
|
256 |
+
| | |none | 10|acc_norm |0.9169|± |0.0028|+
|
257 |
+
|winogrande | 1|none | 5|acc |0.8777|± |0.0092|+
|
258 |
+
|mmlu |N/A |none | 0|acc |0.6427|± |0.0038|-
|
259 |
+
|arc_challenge | 1|none | 25|acc |0.7713|± |0.0123|
|
260 |
+
| | |none | 25|acc_norm |0.7875|± |0.0120|+
|
261 |
+
|truthfulqa_mc2| 2|none | 0|acc |0.7824|± |0.0135|-
|
262 |
+
|mathqa | 1|none | 0|acc |0.4037|± | 0.009|
|
263 |
+
| | |none | 0|acc_norm |0.4034|± | 0.009|+
|
264 |
+
|pubmedqa | 1|none | 0|acc |0.7260|± | 0.020|+
|
265 |
+
|boolq | 2|none | 0|acc |0.8602|± |0.0061|+
|
266 |
+
```
|
267 |
+
|
268 |
+
## UNA v1 (VLLM) Evaluations
|
269 |
+
```
|
270 |
+
| Tasks |Version| Filter |n-shot| Metric |Value | |Stderr|
|
271 |
+
|--------------|------:|----------------|-----:|-----------|-----:|---|-----:|
|
272 |
+
|gsm8k | 3|strict-match | 5|exact_match|0.7566|± |0.0118|
|
273 |
+
| | |flexible-extract| 5|exact_match|0.7582|± |0.0118|
|
274 |
+
|hellaswag | 1|none | 10|acc |0.8168|± |0.0039|
|
275 |
+
| | |none | 10|acc_norm |0.9188|± |0.0027|
|
276 |
+
|winogrande | 1|none | 5|acc |0.8635|± |0.0097|
|
277 |
+
|mmlu | N/A|none | 0|acc |0.6444|± |0.0038|
|
278 |
+
|arc_challenge | 1|none | 25|acc |0.7747|± |0.0122|
|
279 |
+
| | |none | 25|acc_norm |0.7850|± |0.0120|
|
280 |
+
|truthfulqa_mc2| 2|none | 0|acc |0.7902|± |0.0134|
|
281 |
+
|mathqa | 1|none | 0|acc |0.4030|± | 0.009|
|
282 |
+
| | |none | 0|acc_norm |0.4034|± | 0.009|
|
283 |
+
|pubmedqa | 1|none | 0|acc |0.6860|± |0.0208|
|
284 |
+
|boolq | 2|none | 0|acc |0.8401|± |0.0064|
|
285 |
+
```
|
286 |
+
|
287 |
+
## Original (VLLM) Evaluations
|
288 |
+
```
|
289 |
+
| Tasks |Version| Filter |n-shot| Metric |Value | |Stderr|
|
290 |
+
|--------------|------:|----------------|-----:|-----------|-----:|---|-----:|
|
291 |
+
|gsm8k | 3|strict-match | 5|exact_match|0.7528|± |0.0119|
|
292 |
+
| | |flexible-extract| 5|exact_match|0.7521|± |0.0119|
|
293 |
+
|hellaswag | 1|none | 10|acc |0.8117|± |0.0039|
|
294 |
+
| | |none | 10|acc_norm |0.9167|± |0.0028|
|
295 |
+
|winogrande | 1|none | 5|acc |0.8682|± |0.0095|
|
296 |
+
|mmlu | N/A|none | 0|acc |0.6448|± |0.0038|
|
297 |
+
|arc_challenge | 1|none | 25|acc |0.7688|± |0.0123|
|
298 |
+
| | |none | 25|acc_norm |0.7730|± |0.0122|
|
299 |
+
|truthfulqa_mc2| 2|none | 0|acc |0.7895|± |0.0133|
|
300 |
+
|mathqa | 1|none | 0|acc |0.4000|± | 0.009|
|
301 |
+
| | |none | 0|acc_norm |0.4003|± | 0.009|
|
302 |
+
|pubmedqa | 1|none | 0|acc |0.6680|± |0.0211|
|
303 |
+
|boolq | 2|none | 0|acc |0.8346|± |0.0065|
|
304 |
+
```
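
To read the three tables together, it helps to put each V2 vs V1 difference next to its standard error. The sketch below copies a few rows from the V2 and V1 tables and computes `z = delta / sqrt(se_v2^2 + se_v1^2)`. Treating the two runs as independent samples is a simplifying assumption (both runs use the same test items), so the z values are only a rough guide; the helper name `compare` is illustrative.

```python
import math

# (value, stderr) pairs copied from the UNA v2 and UNA v1 tables above.
V2 = {
    "gsm8k (strict-match)": (0.7695, 0.0116),
    "winogrande": (0.8777, 0.0092),
    "mmlu": (0.6427, 0.0038),
    "truthfulqa_mc2": (0.7824, 0.0135),
    "pubmedqa": (0.7260, 0.0200),
    "boolq": (0.8602, 0.0061),
}
V1 = {
    "gsm8k (strict-match)": (0.7566, 0.0118),
    "winogrande": (0.8635, 0.0097),
    "mmlu": (0.6444, 0.0038),
    "truthfulqa_mc2": (0.7902, 0.0134),
    "pubmedqa": (0.6860, 0.0208),
    "boolq": (0.8401, 0.0064),
}


def compare(new, old):
    """Return {task: (delta, z)} with z = delta / sqrt(se_new^2 + se_old^2)."""
    out = {}
    for task, (new_val, new_se) in new.items():
        old_val, old_se = old[task]
        delta = new_val - old_val
        z = delta / math.hypot(new_se, old_se)  # combined standard error
        out[task] = (delta, z)
    return out


results = compare(V2, V1)
for task, (delta, z) in results.items():
    print(f"{task:22s} delta={delta:+.4f} z={z:+.2f}")
```

Under this rough reading, most of the differences between V2 and V1 fall within about one combined standard error, and the gain on boolq is the clearest of the rows shown.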
|
305 |
+
|
306 |
+
## Citations
|
307 |
+
* mlabonne
|
308 |
+
* jondurbin & Replete-AI
|
309 |
+
* bartowski
|
310 |
+
* saltlux
|
311 |
+
|
312 |
+
If you use UNA models, don't forget to cite:
|
313 |
+
```
|
314 |
+
@misc{unathepitbull21b,
|
315 |
+
title={ThePitbull: Uniform Neural Alignment},
|
316 |
+
author={Xavier Murias},
|
317 |
+
year={2024},
|
318 |
+
publisher = {Juanako.AI},
|
319 |
+
journal = {HuggingFace repository},
|
320 |
+
howpublished = {\url{https://huggingface.co/fblgit/UNA-ThePitbull-21.4-v1}},
|
321 |
+
}
|
322 |
+
```
|