<23GB quant for 24GB GPUs? IQ2_XS or IQ2_XXS?
#4 opened by kerrmetric
Could I request a <23GB quant for 24GB GPUs? IQ2_XS or IQ2_XXS should work great.
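A quick back-of-the-envelope check of why these quants should fit: multiplying the parameter count by the quant's bits per weight gives an approximate file/VRAM size. A minimal sketch, assuming ~70.6B parameters for Llama 3.1 70B and llama.cpp's nominal bits-per-weight figures for the IQ2 types (the real GGUF adds some overhead for non-quantized tensors, metadata, and the KV cache at runtime):

```python
PARAMS = 70.6e9  # approximate parameter count of Llama 3.1 70B (assumption)

def est_gb(bits_per_weight: float) -> float:
    """Rough model size in GB for a given quant's bits per weight."""
    return PARAMS * bits_per_weight / 8 / 1e9

# Nominal bits per weight for llama.cpp's IQ2 quant types
for name, bpw in [("IQ2_XXS", 2.0625), ("IQ2_XS", 2.3125)]:
    print(f"{name}: ~{est_gb(bpw):.1f} GB")
```

Both land around 18-20 GB for the weights alone, which is why they leave headroom for context on a 24GB card while larger quants do not.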
Hey @kerrmetric, I may be able to help you with this one:
https://huggingface.co/legraphista/Meta-Llama-3.1-70B-Instruct-IMat-GGUF#all-quants
Thanks! Will do