---
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- mistral
- trl
- dpo
base_model: unsloth/zephyr-sft-bnb-4bit
datasets:
- unalignment/toxic-dpo-v0.2
---
Uploaded model
- Developed by: akaistormherald
- License: apache-2.0
- Finetuned from model: unsloth/zephyr-sft-bnb-4bit
Mistral 7B + SFT + 4-bit DPO training with unalignment/toxic-dpo-v0.2 == ToxicMist? ☣🌫
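For readers unfamiliar with the DPO step in the recipe above, here is a minimal sketch of the per-example DPO objective in plain Python. It is not the training code used for this model (the tags suggest Unsloth and TRL handled that); the function name `dpo_loss`, the example log-probabilities, and the `beta=0.1` default are illustrative assumptions.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss from summed sequence log-probabilities.

    Hypothetical helper for illustration; beta=0.1 is an assumed default.
    """
    # How much more (in log space) the policy likes each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(logits)) == softplus(-logits), computed in a numerically stable form.
    x = -logits
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


# When the policy matches the reference, the loss equals log(2).
baseline = dpo_loss(-10.0, -12.0, -10.0, -12.0)
# When the policy shifts probability toward the chosen response, the loss drops.
improved = dpo_loss(-8.0, -14.0, -10.0, -12.0)
print(baseline, improved)
```

In a dataset such as unalignment/toxic-dpo-v0.2, each example pairs a prompt with a chosen and a rejected response, so minimizing this loss pushes the fine-tuned model toward the chosen responses relative to the frozen reference model.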