	---
	license: afl-3.0
	library_name: transformers
	tags:
	- UNA
	- juanako
	datasets:
	- jondurbin/py-dpo-v0.1
	- Replete-AI/code_bagel_hermes-2.5
	- mlabonne/orpo-dpo-mix-40k
	---

	# UNA-ThePitbull 21.4B v2

	Introducing the best LLM in the industry: nearly as good as a 70B model at just 21.4B parameters, based on `saltlux/luxia-21.4b-alignment-v1.0`.
	![UNA - ThePitbull 21.4B v2](https://huggingface.co/fblgit/UNA-ThePitbull-21.4-v1/resolve/main/UNA-ThePitbull.png)

	This model has not been poisoned to score high and be useless. We release it because it is the real deal: EQ and IQ together in a powerful, smart, conversational model.

	Quant version available at ... soon ..

	## Difference V1 vs V2

	In V2 we implemented a different UNA strategy that partially covers the MLP and attention layers.
	We also performed further SFT and further DPO over V1, and we'll release some of those models soon as well.

	### Changes

	1. SFT over V1 with `Replete-AI/code_bagel_hermes-2.5`, learning rate 1.0e-4 decaying to 5.0e-5
	2. DPO with learning rate 1.0e-4 decaying to a min_lr of 5.0e-5, on:
	   * `mlabonne/orpo-dpo-mix-40k`
	   * `jondurbin/py-dpo-v0.1`

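
The learning-rate ranges in the changes listed above (1.0e-4 down to a floor of 5.0e-5) can be illustrated with a small scheduler sketch. The card gives only the two endpoints, so the cosine shape, the function name `lr_at_step`, and the step count are assumptions made for illustration; this is not the training code used for this model.

```python
import math

PEAK_LR = 1.0e-4  # starting learning rate stated in the card
MIN_LR = 5.0e-5   # floor ("min_lr") stated in the card


def lr_at_step(step: int, total_steps: int,
               peak_lr: float = PEAK_LR, min_lr: float = MIN_LR) -> float:
    """Cosine decay from peak_lr to min_lr.

    The cosine shape is an assumption; the card only names the endpoints.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp progress to [0, 1] so steps past the end stay at min_lr.
    progress = min(max(step / total_steps, 0.0), 1.0)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + (peak_lr - min_lr) * cosine


if __name__ == "__main__":
    total = 1000  # hypothetical number of optimizer steps
    for s in (0, 250, 500, 750, 1000):
        print(f"step {s:4d}: lr = {lr_at_step(s, total):.2e}")
```

A linear decay between the same endpoints would fit the description in the card equally well; only the start and end values come from the source.
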
	## Evaluations

	These results can only be compared with the model's non-UNA base, the original luxia-21.4b, and with ThePitbull-v1.

	## UNA v2 (VLLM) Evaluations
	```
	vllm (pretrained=/data/tools/mergekit/una-thepitbull-v5,dtype=bfloat16,gpu_memory_utilization=0.8,max_model_len=2048,data_parallel_size=2,tensor_parallel_size=4), gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: 8
	|    Tasks     |Version|     Filter     |n-shot|  Metric   |Value |   |Stderr|
	|--------------|------:|----------------|-----:|-----------|-----:|---|-----:|
	|gsm8k         |      3|strict-match    |     5|exact_match|0.7695|±  |0.0116|+
	|              |       |flexible-extract|     5|exact_match|0.7695|±  |0.0116|+
	|hellaswag     |      1|none            |    10|acc        |0.8110|±  |0.0039|
	|              |       |none            |    10|acc_norm   |0.9169|±  |0.0028|+
	|winogrande    |      1|none            |     5|acc        |0.8777|±  |0.0092|+
	|mmlu          |    N/A|none            |     0|acc        |0.6427|±  |0.0038|-
	|arc_challenge |      1|none            |    25|acc        |0.7713|±  |0.0123|
	|              |       |none            |    25|acc_norm   |0.7875|±  |0.0120|+
	|truthfulqa_mc2|      2|none            |     0|acc        |0.7824|±  |0.0135|-
	|mathqa        |      1|none            |     0|acc        |0.4037|±  | 0.009|
	|              |       |none            |     0|acc_norm   |0.4034|±  | 0.009|+
	|pubmedqa      |      1|none            |     0|acc        |0.7260|±  | 0.020|+
	|boolq         |      2|none            |     0|acc        |0.8602|±  |0.0061|+
	```

	## UNA v1 (VLLM) Evaluations
	```
	|    Tasks     |Version|     Filter     |n-shot|  Metric   |Value |   |Stderr|
	|--------------|------:|----------------|-----:|-----------|-----:|---|-----:|
	|gsm8k         |      3|strict-match    |     5|exact_match|0.7566|±  |0.0118|
	|              |       |flexible-extract|     5|exact_match|0.7582|±  |0.0118|
	|hellaswag     |      1|none            |    10|acc        |0.8168|±  |0.0039|
	|              |       |none            |    10|acc_norm   |0.9188|±  |0.0027|
	|winogrande    |      1|none            |     5|acc        |0.8635|±  |0.0097|
	|mmlu          |    N/A|none            |     0|acc        |0.6444|±  |0.0038|
	|arc_challenge |      1|none            |    25|acc        |0.7747|±  |0.0122|
	|              |       |none            |    25|acc_norm   |0.7850|±  |0.0120|
	|truthfulqa_mc2|      2|none            |     0|acc        |0.7902|±  |0.0134|
	|mathqa        |      1|none            |     0|acc        |0.4030|±  | 0.009|
	|              |       |none            |     0|acc_norm   |0.4034|±  | 0.009|
	|pubmedqa      |      1|none            |     0|acc        |0.6860|±  |0.0208|
	|boolq         |      2|none            |     0|acc        |0.8401|±  |0.0064|
	```

	## Original (VLLM) Evaluations
	```
	|    Tasks     |Version|     Filter     |n-shot|  Metric   |Value |   |Stderr|
	|--------------|------:|----------------|-----:|-----------|-----:|---|-----:|
	|gsm8k         |      3|strict-match    |     5|exact_match|0.7528|±  |0.0119|
	|              |       |flexible-extract|     5|exact_match|0.7521|±  |0.0119|
	|hellaswag     |      1|none            |    10|acc        |0.8117|±  |0.0039|
	|              |       |none            |    10|acc_norm   |0.9167|±  |0.0028|
	|winogrande    |      1|none            |     5|acc        |0.8682|±  |0.0095|
	|mmlu          |    N/A|none            |     0|acc        |0.6448|±  |0.0038|
	|arc_challenge |      1|none            |    25|acc        |0.7688|±  |0.0123|
	|              |       |none            |    25|acc_norm   |0.7730|±  |0.0122|
	|truthfulqa_mc2|      2|none            |     0|acc        |0.7895|±  |0.0133|
	|mathqa        |      1|none            |     0|acc        |0.4000|±  | 0.009|
	|              |       |none            |     0|acc_norm   |0.4003|±  | 0.009|
	|pubmedqa      |      1|none            |     0|acc        |0.6680|±  |0.0211|
	|boolq         |      2|none            |     0|acc        |0.8346|±  |0.0065|
	```
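
One way to read the three tables above side by side is to compute the v2 minus original delta for each metric and set it against the reported standard errors. The sketch below copies a few rows from the v2 and original tables. The helper name `compare` and the rule of thumb that a delta smaller than the combined standard error is treated as noise are assumptions of this example, not part of the original evaluation. Both models are scored on the same test items, so the combined error is only a rough guide.

```python
import math
from typing import Dict, NamedTuple, Tuple


class Score(NamedTuple):
    value: float
    stderr: float


# (v2, original) pairs copied from the evaluation tables in this card.
RESULTS: Dict[str, Tuple[Score, Score]] = {
    "gsm8k/strict-match": (Score(0.7695, 0.0116), Score(0.7528, 0.0119)),
    "winogrande/acc": (Score(0.8777, 0.0092), Score(0.8682, 0.0095)),
    "mmlu/acc": (Score(0.6427, 0.0038), Score(0.6448, 0.0038)),
    "truthfulqa_mc2/acc": (Score(0.7824, 0.0135), Score(0.7895, 0.0133)),
    "pubmedqa/acc": (Score(0.7260, 0.020), Score(0.6680, 0.0211)),
    "boolq/acc": (Score(0.8602, 0.0061), Score(0.8346, 0.0065)),
}


def compare(new: Score, base: Score) -> Tuple[float, float, bool]:
    """Return (delta, combined_stderr, delta_exceeds_combined_stderr)."""
    delta = new.value - base.value
    # Combine the two standard errors as if the runs were independent
    # (a simplifying assumption of this sketch).
    combined = math.hypot(new.stderr, base.stderr)
    return delta, combined, abs(delta) > combined


if __name__ == "__main__":
    for name, (v2, original) in RESULTS.items():
        delta, combined, outside = compare(v2, original)
        flag = "outside noise band" if outside else "within noise band"
        print(f"{name:20s} delta={delta:+.4f} combined_stderr={combined:.4f} ({flag})")
```
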

	## Citations
	* saltlux
	* mlabonne
	* jondurbin & Replete-AI
	* bartowski & TheBloke

	If you use UNA models, don't forget to cite:
	```
	@misc{unathepitbull21b,
	title={ThePitbull: Uniform Neural Alignment},
	author={Xavier Murias},
	year={2024},
	publisher = {Juanako.AI},
	journal = {HuggingFace repository},
	howpublished = {\url{https://huggingface.co/fblgit/UNA-ThePitbull-21.4-v1}},
	}
	```