Yi-34B 200K base model fine-tuned on the RAWrr v2 dataset via DPO, then fine-tuned on the AEZAKMI v3-3 dataset via SFT, then DPO-tuned on unalignment/toxic-dpo-v0.1. Total GPU compute time was roughly 40-50 hours. It's like airoboros/capybara but with less gptslop, no refusals, and less of the typical language used by RLHF-tuned OpenAI models. Say goodbye to "It's important to remember"!
Prompt format is standard ChatML. Don't expect it to be good at instruction-following, math, or riddles, or to be especially smart. My end goal with AEZAKMI is to create a cozy, free chatbot. The base model used for fine-tuning was the 200K-context Yi-34B-Llama model shared by larryvrh.
Training was done with max_position_embeddings set to 4096; it was reverted back to 200K after applying the LoRA.
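For anyone reproducing this, the revert described above can be done by editing the model's `config.json` after merging the LoRA. This fragment is a sketch assuming the standard Hugging Face config layout; the exact values shown are illustrative, not copied from this repo's files:

```json
{
  "max_position_embeddings": 200000
}
```

Set the value to 4096 before training and restore 200000 afterwards so inference tools pick up the full context window again.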
## Prompt Format
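As noted above, the model expects standard ChatML. A minimal sketch of assembling such a prompt (the helper name and system message are illustrative, not part of this model card):

```python
# Hypothetical helper that assembles a standard ChatML prompt string.
# The trailing "<|im_start|>assistant\n" leaves the turn open for generation.
def build_chatml_prompt(system: str, user: str) -> str:
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt("A chat.", "Hello, who are you?")
print(prompt)
```

Most frontends (e.g. text-generation-webui, SillyTavern) already ship a ChatML preset, so you usually only need to select it rather than build prompts by hand.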