Which GPTQ parameters should I use? Is the 512 in the name the groupsize, or the cutoff length?
I want to use this model with text-generation-webui. What parameters should I pass to python server.py for this model?
I've not had a chance to do the readme yet.
GPTQ params are:
wbits = 4
groupsize = None
model_type = llama
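So, assuming the standard text-generation-webui flags, the launch command should look something like this (since groupsize is None, no --groupsize flag is passed):

python server.py --model TheBloke_h2ogpt-oasst1-512-30B-GPTQ --model_type llama --wbits 4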
I get a size mismatch error if I use wbits=4
Show me the errors. Are you using pre_layer as well?
I run this command: python server.py --model TheBloke_h2ogpt-oasst1-512-30B-GPTQ --model_type llama --wbits 4 --chat
and I get this error (only showing the last two lines):
size mismatch for model.layers.59.mlp.up_proj.qzeros: copying a param with shape torch.Size([1, 2240]) from checkpoint, the shape in current model is torch.Size([52, 2240]).
size mismatch for model.layers.59.mlp.up_proj.scales: copying a param with shape torch.Size([1, 17920]) from checkpoint, the shape in current model is torch.Size([52, 17920]).
Can you try without --chat and let me know if that makes any difference?
You need to specify --groupsize 512
The 512 in the model name is not the group size. The group size is None (i.e. no grouping, which corresponds to --groupsize -1).
I now know what's causing this. It appears to be a text-gen-ui bug related to groupsize. It's forcing the model to use a groupsize of 128, ignoring the --groupsize -1 default. For the fix, please see: https://huggingface.co/TheBloke/OpenAssistant-SFT-7-Llama-30B-GPTQ/discussions/3#6463434bab15db2fa5661b31
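The shapes in the error bear this out. Here is a minimal sketch, assuming LLaMA-30B's up_proj dimensions (in_features = 6656, out_features = 17920) and standard 4-bit GPTQ packing of 8 values per int32; qzeros_shape is a hypothetical helper, not webui code:

# Expected qzeros shape for 4-bit GPTQ: (num_groups, out_features // 8).
# Dimensions below are assumed from the LLaMA-30B architecture.
in_features, out_features = 6656, 17920

def qzeros_shape(groupsize):
    # groupsize -1 means a single quantization group over all input features
    groups = 1 if groupsize == -1 else in_features // groupsize
    return (groups, out_features // 8)

print(qzeros_shape(-1))   # (1, 2240)  -> what the checkpoint contains
print(qzeros_shape(128))  # (52, 2240) -> what the webui is constructing

The 52 in the error message is exactly 6656 / 128, which is what you would expect if the loader is forcing a groupsize of 128.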