---
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
- sft
---
# palmer turbo
This model has a slightly different architecture and training style:
- Training included a continual pretraining stage in which the lm_head and embedding layers were also tuned.
- The base model was pretrained on 75k instruction/response pairs and then merged.
- The architecture is similar to the palmer series but with a smaller context size (8192 tokens).
In short, palmer is now half the size and twice the speed with the same overall performance, trading a drop on winogrande for notable improvements on mmlu and arc challenge. As of Wed 17 Jul, it beats all models ≤ 0.5b on hellaswag.

As with all palmer models, this one is trained to respond without requiring any specific prompt format; feel free to further fine-tune it for your specific use case.
| Model | MMLU | ARC-C | HellaSwag | PIQA | Winogrande | Average |
|---|---|---|---|---|---|---|
| tinyllama | 0.2577 | 0.3029 | 0.5935 | 0.7329 | 0.5959 | 0.4966 |
| danube3-500m-chat (current sota) | 0.2554 | 0.3626 | 0.6072 | 0.7432 | 0.6140 | 0.5164 |
| palmer-004-turbo | 0.2736 | 0.3558 | 0.6179 | 0.7367 | 0.6117 | 0.5191 |
| palmer-004 | 0.2661 | 0.3490 | 0.6173 | 0.7481 | 0.6417 | 0.5244 |
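The Average column is the arithmetic mean of the five benchmark scores. A quick sanity check for the palmer rows, with values copied from the table above:

```python
# Recompute the Average column as the mean of the five benchmark scores
# (MMLU, ARC-C, HellaSwag, PIQA, Winogrande), rounded to 4 decimals.
scores = {
    "palmer-004-turbo": [0.2736, 0.3558, 0.6179, 0.7367, 0.6117],
    "palmer-004": [0.2661, 0.3490, 0.6173, 0.7481, 0.6417],
}

averages = {name: round(sum(vals) / len(vals), 4) for name, vals in scores.items()}
print(averages)  # {'palmer-004-turbo': 0.5191, 'palmer-004': 0.5244}
```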
# thanks to
- h2oai: performant base model provider
- teknium: openhermes dataset provider
- unsloth: training software and tooling