Error when finetuning this - fp16 vs int8?

#1
by hartleyterw - opened

/usr/local/lib/python3.10/dist-packages/peft/tuners/lora.py in forward(self, x)
815 result = F.linear(x, transpose(self.weight, self.fan_in_fan_out), bias=self.bias)
816 elif self.r[self.active_adapter] > 0 and not self.merged:
--> 817 result = F.linear(x, transpose(self.weight, self.fan_in_fan_out), bias=self.bias)
818
819 x = x.to(self.lora_A[self.active_adapter].weight.dtype)

RuntimeError: self and mat2 must have the same dtype, but got Half and Int

Using QLoRA and PEFT 0.4.
I can't finetune the original (non-quantized) model because Colab OOMs.
Is there any way to fix this?
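
For context, a setup roughly along these lines is what hits it (a minimal sketch; the repo id and LoRA settings are placeholders, and loading the GPTQ checkpoint assumes optimum + auto-gptq are installed):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_id = "path-to-this-gptq-model"  # placeholder: the GPTQ repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
# With peft 0.4 the plain LoRA Linear ends up calling F.linear on the
# quantized integer weights during training, which gives the
# "Half and Int" RuntimeError above.
```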

Hello.

It should work with the sample below, but for some reason it doesn't work with this model.
I don't know the cause yet.
https://huggingface.co/dahara1/weblab-10b-instruction-sft-GPTQ/tree/main/finetune_sample

ALMA-7B-Ja-V2 is scheduled to be released soon, so I would like to reconsider the quantization method at that time.

Until then, please try the GGUF version.
https://github.com/webbigdata-jp/python_sample/blob/main/ALMA_7B_Ja_gguf_Free_Colab_sample.ipynb
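
If you want a quick way to run a GGUF file outside that notebook, something like this should work with llama-cpp-python (a rough sketch; the file name is a placeholder and the prompt follows ALMA's translation format):

```python
from llama_cpp import Llama

# Placeholder file name: use whichever quantization level you downloaded.
llm = Llama(model_path="alma-7b-ja.q4_k_m.gguf", n_ctx=2048)

prompt = (
    "Translate this from Japanese to English:\n"
    "Japanese: 今日はいい天気ですね。\n"
    "English:"
)
out = llm(prompt, max_tokens=128, temperature=0.0)
print(out["choices"][0]["text"])
```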

Training on CPU... I'd rather wait for the new release... :'D Thank you for your work.

webbigdata org

Hello.
It's taken longer than I expected because I've remade it several times, but I think I'll be able to release V2 tomorrow.

As far as I can tell, it is still possible to create a LoRA for GPTQ-quantized models using axolotl.

However, I have run into an issue: when I merge the created LoRA back into the model, the resulting file is as large as the model before GPTQ quantization.
https://github.com/OpenAccess-AI-Collective/axolotl/issues/829

Using huggingface/peft directly increased the file size as well, so this may be a current limitation on the peft side.
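
For reference, the merge step in question looks roughly like this (a sketch; paths are placeholders). peft's merge_and_unload folds the LoRA deltas into ordinary full-precision weights, which would explain why the saved checkpoint comes out at the unquantized size:

```python
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "path-to-base-model",            # placeholder
    torch_dtype=torch.float16,
)
model = PeftModel.from_pretrained(base, "path-to-lora-adapter")  # placeholder
merged = model.merge_and_unload()    # LoRA weights folded into fp16 tensors
merged.save_pretrained("path-to-merged-output")                  # placeholder
```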

webbigdata org

Hi.

According to this thread:
https://github.com/OpenAccess-AI-Collective/axolotl/issues/829

LoRA fine-tuning -> merge into one file -> GPTQ: OK
GPTQ -> LoRA fine-tuning: possible, but merging back into one file is not supported.
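
A rough sketch of the last step of the first route, i.e. GPTQ-quantizing the merged fp16 model (assumes transformers with optimum and auto-gptq installed; paths and the calibration dataset are placeholders):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

merged_path = "path-to-merged-fp16-model"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(merged_path)

# "c4" is just a placeholder calibration set; a Japanese/English corpus
# would make more sense for this model.
gptq_config = GPTQConfig(bits=4, dataset="c4", tokenizer=tokenizer)

model = AutoModelForCausalLM.from_pretrained(
    merged_path,
    device_map="auto",
    quantization_config=gptq_config,  # quantization happens during loading
)

model.save_pretrained("path-to-gptq-output")      # placeholder
tokenizer.save_pretrained("path-to-gptq-output")
```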

And we're training a new version now.

Thanks for your reply.
I'm trying to use axolotl with the same config as you, except I disabled bf32 and used a different dataset (not tokenized). But it gives me the error: ValueError: model_config.quantization_config is not set or quant_method is not set to gptq. Please make sure to point to a GPTQ model.

!accelerate launch -m axolotl.cli.train /content/gptq.yml

I tried with the V2 version as well, same error

webbigdata org

Hmmm, I don't see any errors in my environment.
Well, the quantize_config.json is included in the repository, so it could be that the model is failing to download for some reason.

Check whether you're getting any out-of-memory or other errors before that one.
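
One quick way to check the config side (a sketch; the repo id is a placeholder): load just the config and confirm a GPTQ quantization_config is present, since that is what the axolotl error is complaining about.

```python
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("path-to-this-gptq-model")  # placeholder repo id
print(getattr(cfg, "quantization_config", None))
# Expect something like {'quant_method': 'gptq', 'bits': 4, ...}.
# None means the loaded config has no GPTQ quantization_config attached.
```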
