2bit

#2
by KnutJaegersberg - opened

You should get your model 2-bit quantized, like https://huggingface.co/GreenBitAI/LLaMA-3B-2bit-groupsize32,
so we can use as much context length as possible at the best quality on consumer hardware.
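
For a rough sense of why this matters, here is a back-of-the-envelope sketch of weight memory at 16 bits versus 2 bits with group size 32. The nominal 3B parameter count and the 32 bits of per-group overhead (one fp16 scale plus one fp16 zero-point) are my assumptions for illustration, not the actual GreenBitAI storage format, and `weight_gib` is just a hypothetical helper.

```python
import math


def weight_gib(n_params: int, bits: int, group_size: int = 0,
               overhead_bits_per_group: int = 0) -> float:
    """Approximate weight storage in GiB for a given bit width.

    If group_size is non-zero, each group of weights also stores
    overhead_bits_per_group bits of metadata (e.g. scale and zero-point).
    """
    total_bits = n_params * bits
    if group_size:
        n_groups = math.ceil(n_params / group_size)
        total_bits += n_groups * overhead_bits_per_group
    return total_bits / 8 / 2**30


# Assumption: nominal 3B parameters.
N_PARAMS = 3_000_000_000

fp16_gib = weight_gib(N_PARAMS, bits=16)
# Assumption: one fp16 scale + one fp16 zero-point per group of 32 weights.
q2_gib = weight_gib(N_PARAMS, bits=2, group_size=32, overhead_bits_per_group=32)

print(f"fp16 weights:  {fp16_gib:.2f} GiB")
print(f"2-bit weights: {q2_gib:.2f} GiB")
print(f"saved:         {fp16_gib - q2_gib:.2f} GiB")
```

Under those assumptions the weights shrink from about 5.6 GiB to about 1.0 GiB, and whatever is freed can go to the KV cache, i.e. longer context on the same card.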
