---
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
- sft
---
# palmer turbo
This model has a slightly different architecture and training style:
- Training included a continual pretraining stage in which the lm_head and embedding layers were also tuned.
- The base model was pretrained on 75k instruction/response pairs and then merged.
- The architecture is similar to the palmer series but with a smaller context size (8192 tokens).
In short, palmer is now half the size and twice the speed with the same overall performance, trading a drop on winogrande for notable improvements on mmlu and arc challenge. As of Wed 17 Jul, it beats all models ≤ 0.5b on hellaswag.

As with all palmer models, this one is trained to respond without requiring any specific prompt format; feel free to further fine-tune it for your specific use case.
| Model | MMLU | ARC-C | HellaSwag | PIQA | Winogrande | Average |
|---|---|---|---|---|---|---|
| tinyllama | 0.2577 | 0.3029 | 0.5935 | 0.7329 | 0.5959 | 0.4966 |
| danube3-500m-chat (current sota) | 0.2554 | 0.3626 | 0.6072 | 0.7432 | 0.6140 | 0.5164 |
| palmer-004-turbo | 0.2736 | 0.3558 | 0.6179 | 0.7367 | 0.6117 | 0.5191 |
| palmer-004 | 0.2661 | 0.3490 | 0.6173 | 0.7481 | 0.6417 | 0.5244 |
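The Average column is the arithmetic mean of the five benchmark scores. A quick sanity check for the palmer rows, with values copied from the table above:

```python
# Recompute the Average column as the mean of the five benchmark scores
# (MMLU, ARC-C, HellaSwag, PIQA, Winogrande), rounded to 4 decimals.
scores = {
    "palmer-004-turbo": [0.2736, 0.3558, 0.6179, 0.7367, 0.6117],
    "palmer-004": [0.2661, 0.3490, 0.6173, 0.7481, 0.6417],
}

averages = {name: round(sum(vals) / len(vals), 4) for name, vals in scores.items()}
print(averages)  # {'palmer-004-turbo': 0.5191, 'palmer-004': 0.5244}
```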
# thanks to
- h2oai: performant base model provider
- teknium: openhermes dataset provider
- unsloth: training software and tooling