Huggingface Run
#1, opened by GokhanAI
How do I run this model? Can it be loaded via AutoModelForCausalLM, etc.? Could you share an example? I am new to this format, sorry.
These are exllamav2-quantized versions of the models, so a plain AutoModelForCausalLM call will not load them. You need to use exllamav2 itself to load them from Python. The GitHub project has simple examples that load a model and run inference. If you want a GUI, use exui, ooba's text-generation-webui (with exllamav2 or exllamav2_hf as the model loader), or other packages like tabbyAPI.
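For reference, here is a rough sketch of what loading one of these quants from Python can look like. It is modeled on the basic inference example in the exllamav2 repository, not taken from this model card, and the exact API may differ between exllamav2 versions (newer releases also ship a dynamic generator). The function name `generate_with_exllamav2`, the sampling values, and the model path are placeholders of mine; it assumes the quantized model has already been downloaded to a local folder and that a supported GPU is available.

```python
from pathlib import Path


def generate_with_exllamav2(model_dir: str, prompt: str, max_new_tokens: int = 128) -> str:
    """Load an exllamav2-quantized model from a local folder and complete a prompt.

    Hypothetical helper for illustration; not part of exllamav2 itself.
    """
    model_path = Path(model_dir)
    # Fail early with a clear message if the folder does not look like a model directory.
    if not (model_path / "config.json").is_file():
        raise FileNotFoundError(f"No config.json found in {model_path}")

    # Imported lazily so the path check above works even without exllamav2 installed.
    from exllamav2 import ExLlamaV2, ExLlamaV2Cache, ExLlamaV2Config, ExLlamaV2Tokenizer
    from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

    # Read the model configuration from the local directory.
    config = ExLlamaV2Config()
    config.model_dir = str(model_path)
    config.prepare()

    # Create the model, then load the weights, splitting across available GPUs.
    model = ExLlamaV2(config)
    cache = ExLlamaV2Cache(model, lazy=True)
    model.load_autosplit(cache)

    tokenizer = ExLlamaV2Tokenizer(config)
    generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)

    # Sampling values are arbitrary defaults chosen for this sketch.
    settings = ExLlamaV2Sampler.Settings()
    settings.temperature = 0.8
    settings.top_p = 0.9

    # generate_simple returns the prompt followed by the generated continuation.
    return generator.generate_simple(prompt, settings, max_new_tokens)
```

Calling `generate_with_exllamav2("/path/to/downloaded/model", "Once upon a time,")` should return the prompt plus a continuation, with the path replaced by wherever you saved the quant. If you would rather stay close to the transformers workflow, the exllamav2_hf loader in text-generation-webui mentioned above is probably the easier route.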