Trainer RuntimeError: The size of tensor a (32) must match the size of tensor b (8) at non-singleton dimension 0

by Salahedd - opened

Hi everyone,

I'm currently fine-tuning the Mixtral 8x7B model and consistently hit the RuntimeError in the title during training when I use the 8-bit quantized version. Interestingly, the same process works perfectly fine with the 4-bit quantized version.

I've attached a screenshot of the error message for reference. Has anyone experienced something similar, or does anyone have suggestions on how to resolve this? Any help would be greatly appreciated!
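For context, the message in the title is the kind PyTorch raises when two tensors cannot be broadcast together: the sizes are compared dimension by dimension from the right, and each pair must either be equal or contain a 1. Here 32 and 8 meet at dimension 0 and neither is 1. Below is a minimal pure-Python sketch of that rule. The function name `broadcast_shape` and the example shapes `(32, 4)` and `(8, 4)` are hypothetical, chosen only to reproduce the wording; the real shapes inside the training step are not known from this post.

```python
from itertools import zip_longest


def broadcast_shape(a, b):
    """Return the broadcast shape of shapes a and b, or raise if incompatible.

    Illustrative sketch of the broadcasting rule only; the error text is
    modeled on the message in the title and is not PyTorch's own code.
    """
    ndim = max(len(a), len(b))
    result = []
    # Align the shapes on the right; missing leading dimensions count as 1.
    pairs = zip_longest(reversed(a), reversed(b), fillvalue=1)
    for offset, (x, y) in enumerate(pairs):
        if x != y and x != 1 and y != 1:
            dim = ndim - 1 - offset  # index counted from the left
            raise RuntimeError(
                f"The size of tensor a ({x}) must match the size of "
                f"tensor b ({y}) at non-singleton dimension {dim}"
            )
        # A size of 1 stretches to match the other size.
        result.append(y if x == 1 else x)
    return tuple(reversed(result))


# Hypothetical shapes: a leading size of 32 against a leading size of 8.
try:
    broadcast_shape((32, 4), (8, 4))
except RuntimeError as err:
    print(err)

# A leading size of 1 is fine, because it can be stretched to 32.
print(broadcast_shape((32, 4), (1, 4)))
```

If this is what is happening, the useful question is which two tensors in the 8-bit path end up with leading sizes of 32 and 8, since one of them is not the shape the other operand expects.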

Thank you!
