---
base_model: mlabonne/OrpoLlama-3-8B
language:
- en
license: other
library_name: transformers
datasets:
- mlabonne/orpo-dpo-mix-40k
tags:
- 4-bit
- AWQ
- text-generation
- autotrain_compatible
- endpoints_compatible
- orpo
- llama 3
- rlhf
- sft
pipeline_tag: text-generation
inference: false
quantized_by: Suparious
---
# mlabonne/OrpoLlama-3-8B AWQ

- Model creator: [mlabonne](https://huggingface.co/mlabonne)
- Original model: [OrpoLlama-3-8B](https://huggingface.co/mlabonne/OrpoLlama-3-8B)

![](https://i.imgur.com/ZHwzQvI.png)

## Model Summary

This is an ORPO fine-tune of [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) on 1k samples of [mlabonne/orpo-dpo-mix-40k](https://huggingface.co/datasets/mlabonne/orpo-dpo-mix-40k), created for [this article](https://huggingface.co/blog/mlabonne/orpo-llama-3).

It's a successful fine-tune that follows the ChatML template!

**Try the demo**: https://huggingface.co/spaces/mlabonne/OrpoLlama-3-8B

## 🔎 Application

This model uses a context window of 8k tokens and was trained with the ChatML template. A short usage sketch is included at the end of this card.

## 🏆 Evaluation

### Nous

OrpoLlama-3-8B outperforms Llama-3-8B-Instruct on the GPT4All and TruthfulQA datasets.
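
## 💻 Usage

A minimal sketch of how this 4-bit AWQ checkpoint could be loaded and prompted with its ChatML template via `transformers`. It assumes the `autoawq` package and a CUDA GPU are available, and the `model_id` below is a placeholder, not a confirmed repository path; substitute the actual id of this quantized checkpoint.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder id: replace with the actual path of this AWQ repo.
model_id = "path/to/OrpoLlama-3-8B-AWQ"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",  # AWQ kernels need a CUDA device
)

messages = [
    {"role": "user", "content": "Explain ORPO in one sentence."},
]

# Render the ChatML prompt the model was fine-tuned with.
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(
    input_ids,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
)

# Decode only the newly generated tokens, dropping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

For multi-turn chats, keep appending the assistant's reply and the next user message to `messages` and re-run the same call, staying within the 8k-token context window.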