FlashAttention2 support not working during training
@Qubitium
Yes, I am able to use FlashAttention 2. I'm not doing a full fine-tune though; it's a LoRA tune I tested.
https://github.com/LagPixelLOL/qlora/blob/main/scripts/finetune_schizogpt_132b.sh
This is the script I used to test, with eval disabled for DeepSpeed to work.
https://huggingface.co/v2ray/SchizoGPT-132B-QLoRA
This is the result of the training run.
Unlike what the name suggests, it's actually a regular LoRA rather than a QLoRA, because I set the bits to 16. I trained it on 8x A100 80GB.
All the libraries I used are at their latest release versions (not the dev versions), and the CUDA version I used is 12.2.
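In case it helps, enabling FlashAttention 2 in Transformers comes down to the attn_implementation argument when loading the model. A minimal sketch (the model id and dtype here are placeholders, assuming a recent transformers release and the flash-attn package installed):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "databricks/dbrx-base"  # placeholder: substitute the model you are actually training

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,               # FlashAttention 2 requires fp16 or bf16
    attn_implementation="flash_attention_2",  # this is the switch that enables FA2
    trust_remote_code=True,
)
```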
Hey v2ray,
thank you for the conversion.
I'm using TRL for fine-tuning and I'm getting stuck on the target_modules for PEFT. In the repo you forwarded there's a function to extract all linear layers, but I get an error.
Which modules did you use?
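To clarify what I mean, the helper is along these lines (my own sketch of the idea, not the repo's exact code): it walks the model and collects the leaf names of every nn.Linear so they can be passed as target_modules.

```python
import torch.nn as nn

def find_all_linear_names(model):
    # Collect the leaf name of every nn.Linear module in the model,
    # so the result can be passed to LoraConfig(target_modules=...).
    names = set()
    for module_name, module in model.named_modules():
        if isinstance(module, nn.Linear):
            names.add(module_name.split(".")[-1])
    names.discard("lm_head")  # the output head is usually excluded from LoRA targets
    return sorted(names)
```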
@ChristianPalaArtificialy Hello, I'm using:
"target_modules": [
"v1",
"Wqkv",
"layer",
"out_proj",
"w1",
"w2"
],
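For reference, wiring those into PEFT looks roughly like this (a minimal sketch; the rank, alpha, and dropout values are placeholders, not the settings I trained with):

```python
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=16,              # placeholder rank
    lora_alpha=32,     # placeholder
    lora_dropout=0.05, # placeholder
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["v1", "Wqkv", "layer", "out_proj", "w1", "w2"],
)

model = get_peft_model(model, lora_config)  # `model` loaded with FlashAttention 2 as in the earlier snippet
model.print_trainable_parameters()
```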
Also, what's the error you were getting?