add GPTQ, AWQ, and GGUF formats
It would be nice if these formats could be added so the model is easier to run.
Thanks. A GPTQ model would be nice too for the time being. Unfortunately, getting AWQ to work on AMD GPUs is a bit of a hassle right now. I use AMD because NVIDIA cards with enough VRAM are prohibitively expensive for me. That probably won't be an issue much longer, since AMD support for AWQ is slowly making progress, but as far as I know it hasn't landed in the major GUIs yet.
AWQ: https://huggingface.co/OrionStarAI/Orion-14B-Chat-Int4
GGUF: https://huggingface.co/OrionStarAI/Orion-14B-Chat/blob/main/Orion-14B-Chat.gguf
Do you have a conversion Python script for this model? With .\convert.py from the llama.cpp repo I get

llama_model_load: error loading model: done_getting_tensors: wrong number of tensors; expected 444, got 363

when I try to load the converted file.
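For context, the failing flow is presumably something like the following (paths are placeholders, not the exact commands from this report):

```
# Generic converter from the llama.cpp repo; it apparently produces a
# GGUF with the wrong tensor count for Orion's architecture:
python convert.py ./Orion-14B-Chat --outfile Orion-14B-Chat-f16.gguf

# Loading the file then aborts with the tensor-count mismatch above:
./main -m Orion-14B-Chat-f16.gguf -p "Hello"
```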
Please check
https://github.com/ggerganov/llama.cpp/blob/master/convert-hf-to-gguf.py
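convert-hf-to-gguf.py carries per-architecture tensor mappings that the generic convert.py lacks, so it should produce a complete Orion GGUF. A rough sketch of the workflow, assuming the script already has Orion support (file names and the quant type are placeholders):

```
# Architecture-aware HF-to-GGUF converter:
python convert-hf-to-gguf.py ./Orion-14B-Chat --outfile Orion-14B-Chat-f16.gguf --outtype f16

# Optionally quantize the f16 file down, e.g. to Q4_K_M:
./quantize Orion-14B-Chat-f16.gguf Orion-14B-Chat-Q4_K_M.gguf Q4_K_M
```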