Which GPTQ parameters should I use? Is 512 the groupsize or the cutoff length?

I want to use this model with text-generation-webui. Which parameters should I use with python server.py for this model?

I've not had a chance to write the README yet.

GPTQ params are:
wbits = 4
groupsize = None
model_type = llama
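
With text-generation-webui that maps to something like: python server.py --model TheBloke_h2ogpt-oasst1-512-30B-GPTQ --model_type llama --wbits 4 (a sketch based on the flags used below; with groupsize None you shouldn't need a --groupsize flag at all, since -1, meaning no groupsize, is meant to be the default, though see the groupsize bug discussed at the end of this thread).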

I get a size mismatch error if I use wbits=4

Show me the errors. Are you using pre_layer as well?

I run this command:

    python server.py --model TheBloke_h2ogpt-oasst1-512-30B-GPTQ --model_type llama --wbits 4 --chat

and I get this error (only showing the last two lines):

        size mismatch for model.layers.59.mlp.up_proj.qzeros: copying a param with shape torch.Size([1, 2240]) from checkpoint, the shape in current model is torch.Size([52, 2240]).
        size mismatch for model.layers.59.mlp.up_proj.scales: copying a param with shape torch.Size([1, 17920]) from checkpoint, the shape in current model is torch.Size([52, 17920]).
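
(Aside for readers decoding this error: in a GPTQ checkpoint, the first dimension of qzeros and scales is the number of quantization groups, i.e. in_features // groupsize, or 1 when the model was quantized with no groupsize. A minimal sketch of the arithmetic, assuming 6656 is the LLaMA-30B hidden size feeding up_proj:

    # Group count the loader builds if it assumes groupsize 128:
    in_features = 6656           # LLaMA-30B hidden size (input dim of up_proj)
    print(in_features // 128)    # 52 -> matches the [52, ...] shapes the loader expects
    # The checkpoint's shapes have a leading 1 -> one group, i.e. no groupsize.

So the loader built the layers for groupsize 128 while the checkpoint was quantized without grouping; see the diagnosis further down.)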

Can you try without --chat and let me know if that makes any difference?

You need to specify --groupsize 512

The 512 in the model name is not the group size. The group size is None.

I now know what's causing this. It appears to be a text-generation-webui bug related to groupsize: it forces the model to load with groupsize 128, ignoring the --groupsize -1 default. For the fix, please see: https://huggingface.co/TheBloke/OpenAssistant-SFT-7-Llama-30B-GPTQ/discussions/3#6463434bab15db2fa5661b31
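
If you want to verify which groupsize a GPTQ checkpoint was actually quantized with, you can read it off the tensor shapes directly. A minimal sketch, assuming the weights are in a .safetensors file (the file name below is hypothetical; use the actual file in the model folder):

    from safetensors import safe_open

    path = "h2ogpt-oasst1-512-30B-GPTQ-4bit.safetensors"  # hypothetical file name
    in_features = 6656  # LLaMA-30B hidden size (input dim of mlp.up_proj)

    with safe_open(path, framework="pt") as f:
        # First dim of qzeros = number of quantization groups for this layer
        groups = f.get_tensor("model.layers.59.mlp.up_proj.qzeros").shape[0]

    if groups == 1:
        print("quantized with no groupsize (groupsize None / -1)")
    else:
        print(f"quantized with groupsize {in_features // groups}")

For this checkpoint it should take the no-groupsize branch (qzeros is [1, 2240] per the error above), while the buggy loader was building the model as if it would report groupsize 128.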
