GPTQ version?
#4 opened by mer0mingian
Hi jphme,
thanks for providing the model! Would it be possible to provide a GPTQ-quantized version, e.g. in collaboration with TheBloke?
For people who want to run the model cost-efficiently and don't have their own hardware to host it, this would remove many barriers...
Cheers
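
(For anyone who wants to experiment in the meantime, a GPTQ quantization can also be run locally with the AutoGPTQ library. A rough sketch only: the base model id below is my assumption, and the single calibration example is purely illustrative; a real run would use a few hundred representative German texts.)

```python
# pip install auto-gptq transformers
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

model_id = "jphme/Llama-2-13b-chat-german"  # assumed base model id

tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)

# GPTQ needs a calibration set; a single example here for illustration only.
examples = [tokenizer("Berlin ist die Hauptstadt von Deutschland.")]

quantize_config = BaseQuantizeConfig(bits=4, group_size=128, desc_act=False)
model = AutoGPTQForCausalLM.from_pretrained(model_id, quantize_config)

# Run the quantization and save the 4-bit weights.
model.quantize(examples)
model.save_quantized("Llama-2-13b-chat-german-GPTQ", use_safetensors=True)
```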
We're planning on doing a GGUF conversion, if that helps? Should probably be ready tomorrow.
Thanks. Haven't worked with that, but happy to try. :)
@mer0mingian here we go :) https://huggingface.co/morgendigital/Llama-2-13b-chat-german-GGUF/tree/main
Either run inference with llama.cpp directly, or use one of the popular tools like text-generation-webui, koboldcpp, etc... A minimal example is sketched below.
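
(For example, via the llama-cpp-python bindings; a minimal sketch, where the GGUF filename is a placeholder and should be replaced with one of the actual files from the repo above:)

```python
# pip install llama-cpp-python
from llama_cpp import Llama

# Placeholder filename: point this at the quantization level you downloaded.
llm = Llama(model_path="llama-2-13b-chat-german.Q4_K_M.gguf", n_ctx=2048)

# Simple completion-style call; the result dict mirrors the OpenAI format.
output = llm("Was ist die Hauptstadt von Deutschland?", max_tokens=64)
print(output["choices"][0]["text"])
```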