---
license: afl-3.0
library_name: transformers
tags:
- UNA
- juanako
datasets:
- jondurbin/py-dpo-v0.1
- Replete-AI/code_bagel_hermes-2.5
- mlabonne/orpo-dpo-mix-40k
---

# UNA-ThePitbull 21.4B v2

Introducing the best LLM in the industry: nearly as good as a 70B, at just 21.4B parameters. Based on `saltlux/luxia-21.4b-alignment-v1.0`.

![UNA - ThePitbull 21.4B v2](https://huggingface.co/fblgit/UNA-ThePitbull-21.4-v1/resolve/main/UNA-ThePitbull.png)

This model has not been poisoned to score high on benchmarks and be useless in practice. We release it because it is the real deal: EQ and IQ together in a crazy powerful, smart and conversational model.

Quant versions will be available at ... soon.

## Difference V1 vs V2

In V2 we implemented a different UNA strategy that partially covers the MLP and attention layers.

We also performed further SFT and further DPO over V1, and we'll release some of those soon as well.

### Changes

1. SFT over V1 with `Replete-AI/code_bagel_hermes-2.5`, with the learning rate going from 1.0e-4 down to 5.0e-5
2. DPO, with the learning rate going from 1.0e-4 down to a `min_lr` of 5.0e-5, on:
   * `mlabonne/orpo-dpo-mix-40k`
   * `jondurbin/py-dpo-v0.1`
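
The list above only gives the start and end learning rates, not the shape of the decay between them. As a rough illustration, here is a minimal sketch that assumes a cosine decay with a floor. The function name `lr_at_step`, the step count and the cosine shape are all assumptions made for this example, not the actual training configuration.

```python
import math


def lr_at_step(step, total_steps, max_lr=1.0e-4, min_lr=5.0e-5):
    """Learning rate at `step`, decaying from max_lr to min_lr.

    The cosine shape is an assumption; the card only states the two endpoints.
    """
    if total_steps <= 1:
        return max_lr
    # Clamp the step into [0, total_steps - 1] and map it to [0, 1].
    clamped = min(max(step, 0), total_steps - 1)
    progress = clamped / (total_steps - 1)
    # The cosine factor goes from 1.0 at the start to 0.0 at the end.
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + (max_lr - min_lr) * cosine


if __name__ == "__main__":
    total = 1000  # hypothetical number of optimizer steps
    for s in (0, 250, 500, 750, 999):
        print(s, f"{lr_at_step(s, total):.2e}")
```

Under this assumption the rate starts at 1.0e-4, never drops below 5.0e-5, and ends exactly at that floor on the last step.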

## Evaluations

The results can only be compared with the non-UNA base model (the original luxia-21.4b) and with ThePitbull-v1.

## UNA v2 (VLLM) Evaluations

The trailing `+` and `-` marks appear to flag scores that are higher or lower than those of the original base model.

```
vllm (pretrained=/data/tools/mergekit/una-thepitbull-v5,dtype=bfloat16,gpu_memory_utilization=0.8,max_model_len=2048,data_parallel_size=2,tensor_parallel_size=4), gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: 8
|    Tasks     |Version|     Filter     |n-shot|  Metric   |Value |   |Stderr|
|--------------|------:|----------------|-----:|-----------|-----:|---|-----:|
|gsm8k         |      3|strict-match    |     5|exact_match|0.7695|±  |0.0116|+
|              |       |flexible-extract|     5|exact_match|0.7695|±  |0.0116|+
|hellaswag     |      1|none            |    10|acc        |0.8110|±  |0.0039|
|              |       |none            |    10|acc_norm   |0.9169|±  |0.0028|+
|winogrande    |      1|none            |     5|acc        |0.8777|±  |0.0092|+
|mmlu          |N/A    |none            |     0|acc        |0.6427|±  |0.0038|-
|arc_challenge |      1|none            |    25|acc        |0.7713|±  |0.0123|
|              |       |none            |    25|acc_norm   |0.7875|±  |0.0120|+
|truthfulqa_mc2|      2|none            |     0|acc        |0.7824|±  |0.0135|-
|mathqa        |      1|none            |     0|acc        |0.4037|±  | 0.009|
|              |       |none            |     0|acc_norm   |0.4034|±  | 0.009|+
|pubmedqa      |      1|none            |     0|acc        |0.7260|±  | 0.020|+
|boolq         |      2|none            |     0|acc        |0.8602|±  |0.0061|+
```

## UNA v1 (VLLM) Evaluations

```
|    Tasks     |Version|     Filter     |n-shot|  Metric   |Value |   |Stderr|
|--------------|------:|----------------|-----:|-----------|-----:|---|-----:|
|gsm8k         |      3|strict-match    |     5|exact_match|0.7566|±  |0.0118|
|              |       |flexible-extract|     5|exact_match|0.7582|±  |0.0118|
|hellaswag     |      1|none            |    10|acc        |0.8168|±  |0.0039|
|              |       |none            |    10|acc_norm   |0.9188|±  |0.0027|
|winogrande    |      1|none            |     5|acc        |0.8635|±  |0.0097|
|mmlu          |    N/A|none            |     0|acc        |0.6444|±  |0.0038|
|arc_challenge |      1|none            |    25|acc        |0.7747|±  |0.0122|
|              |       |none            |    25|acc_norm   |0.7850|±  |0.0120|
|truthfulqa_mc2|      2|none            |     0|acc        |0.7902|±  |0.0134|
|mathqa        |      1|none            |     0|acc        |0.4030|±  | 0.009|
|              |       |none            |     0|acc_norm   |0.4034|±  | 0.009|
|pubmedqa      |      1|none            |     0|acc        |0.6860|±  |0.0208|
|boolq         |      2|none            |     0|acc        |0.8401|±  |0.0064|
```

## Original (VLLM) Evaluations

```
|    Tasks     |Version|     Filter     |n-shot|  Metric   |Value |   |Stderr|
|--------------|------:|----------------|-----:|-----------|-----:|---|-----:|
|gsm8k         |      3|strict-match    |     5|exact_match|0.7528|±  |0.0119|
|              |       |flexible-extract|     5|exact_match|0.7521|±  |0.0119|
|hellaswag     |      1|none            |    10|acc        |0.8117|±  |0.0039|
|              |       |none            |    10|acc_norm   |0.9167|±  |0.0028|
|winogrande    |      1|none            |     5|acc        |0.8682|±  |0.0095|
|mmlu          |    N/A|none            |     0|acc        |0.6448|±  |0.0038|
|arc_challenge |      1|none            |    25|acc        |0.7688|±  |0.0123|
|              |       |none            |    25|acc_norm   |0.7730|±  |0.0122|
|truthfulqa_mc2|      2|none            |     0|acc        |0.7895|±  |0.0133|
|mathqa        |      1|none            |     0|acc        |0.4000|±  | 0.009|
|              |       |none            |     0|acc_norm   |0.4003|±  | 0.009|
|pubmedqa      |      1|none            |     0|acc        |0.6680|±  |0.0211|
|boolq         |      2|none            |     0|acc        |0.8346|±  |0.0065|
```
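
To read the tables side by side, a small script like the one below can help. It copies one headline metric per task from the UNA v2 and Original tables and reports the difference, along with a rough z-score. The z-score treats the two runs as independent, which is an assumption: both runs score the same test items, so take it as a ballpark only. The names `ORIGINAL`, `UNA_V2` and `compare` are made up for this example.

```python
import math

# (score, stderr) per task, copied from the tables in this card.
# Metric used: exact_match (strict-match) for gsm8k, acc_norm where
# it is reported, acc otherwise.
ORIGINAL = {
    "gsm8k": (0.7528, 0.0119),
    "hellaswag": (0.9167, 0.0028),
    "winogrande": (0.8682, 0.0095),
    "mmlu": (0.6448, 0.0038),
    "arc_challenge": (0.7730, 0.0122),
    "truthfulqa_mc2": (0.7895, 0.0133),
    "mathqa": (0.4003, 0.009),
    "pubmedqa": (0.6680, 0.0211),
    "boolq": (0.8346, 0.0065),
}

UNA_V2 = {
    "gsm8k": (0.7695, 0.0116),
    "hellaswag": (0.9169, 0.0028),
    "winogrande": (0.8777, 0.0092),
    "mmlu": (0.6427, 0.0038),
    "arc_challenge": (0.7875, 0.0120),
    "truthfulqa_mc2": (0.7824, 0.0135),
    "mathqa": (0.4034, 0.009),
    "pubmedqa": (0.7260, 0.020),
    "boolq": (0.8602, 0.0061),
}


def compare(base, new):
    """Return (task, delta, z) tuples, where delta = new - base.

    z divides delta by the combined standard error of the two runs,
    assuming the runs are independent (a simplification).
    """
    rows = []
    for task, (new_score, new_se) in new.items():
        base_score, base_se = base[task]
        delta = new_score - base_score
        combined_se = math.sqrt(new_se ** 2 + base_se ** 2)
        rows.append((task, delta, delta / combined_se))
    return rows


if __name__ == "__main__":
    for task, delta, z in compare(ORIGINAL, UNA_V2):
        print(f"{task:<15} delta={delta:+.4f} z={z:+.2f}")
```

With these numbers, only `mmlu` and `truthfulqa_mc2` come out with a negative delta, and both stay within one combined standard error.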

## Citations

* saltlux
* mlabonne
* jondurbin & Replete-AI
* bartowski & TheBloke

If you use UNA models, don't forget to cite:

```
@misc{unathepitbull21b,
  title = {ThePitbull: Uniform Neural Alignment},
  author = {Xavier Murias},
  year = {2024},
  publisher = {Juanako.AI},
  journal = {HuggingFace repository},
  howpublished = {\url{https://huggingface.co/fblgit/UNA-ThePitbull-21.4-v1}},
}
```