The README does not provide any details on the exact fine-tuning recipe. Could you elaborate on whether you used LoRA or full fine-tuning, and whether you applied any alignment techniques (e.g., DPO)?
Full fine-tuning and DPO.