8bit GPTQ model

#3
by 0dd1er - opened

Hi,
thank you for your valuable work!!
Do you plan to release a 8bit GPTQ version of WizardCoder-15B?
I assume that would fit perfectly on an RTX 4090 :-)

BR

That's an interesting thought. I'd not thought about doing that but yeah I certainly could.

Fairly soon I plan to go back and start doing more GPTQs with other parameters, so I will add this to the list to do then.

I would 100% love an 8-bit GPTQ model too! GPTQ is my favorite model type since oobabooga doesn't currently support GGML, but 4-bit quantization just seems to harm the quality of the model a little too much. So I second this.

OK, noted. I've not yet had a chance to re-do my GPTQs but I'm expecting to get some new HW soon which will enable me to do that, so hopefully fairly soon.

Hey, just wanted to check up on the 8bit variant :)

@TheBloke it would be great if they also work with ExLlama. Recently the config.json files have been missing the pad_token configuration, which causes the model to fail to load.
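As a stopgap until the configs are fixed, a minimal sketch of patching a config dict that lacks `pad_token_id` before loading (the config values and helper name here are hypothetical, not taken from the actual model repo):

```python
import json

def add_pad_token(config: dict, pad_token_id: int = 0) -> dict:
    """Return a copy of the config with pad_token_id set if it is absent.

    Leaves an existing pad_token_id untouched; a common fallback is to
    reuse eos_token_id as the padding token.
    """
    patched = dict(config)
    patched.setdefault("pad_token_id", pad_token_id)
    return patched

# Example: a stripped-down config.json missing pad_token_id (illustrative values)
config = {"model_type": "gpt_bigcode", "eos_token_id": 0}
patched = add_pad_token(config, pad_token_id=config["eos_token_id"])

# Write the patched config back out so the loader can pick it up
print(json.dumps(patched, indent=2))
```

You could run something like this over the downloaded `config.json` before loading the model, though whether ExLlama needs anything beyond `pad_token_id` is an open question.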
