Note: not every quant is shown in the table on the right; you can find everything here.
Using llama.cpp release b3804 for quantization.
Original model: https://huggingface.co/ifable/gemma-2-Ifable-9B
All quants were made using the imatrix option (except BF16, which is the original precision). The imatrix was generated with the dataset from here, using the BF16 GGUF with a context size of 8192 tokens (the default is 512, but a context size equal to or greater than the model's should improve quality) and 13 chunks.
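
As a rough sketch, generating the imatrix with llama.cpp's `llama-imatrix` tool looks like the following (file names here are hypothetical; the calibration text is the dataset linked above):

```sh
# Sketch: build an importance matrix from the BF16 GGUF.
# -c sets the context size, --chunks limits how many chunks are processed.
./llama-imatrix -m gemma-2-Ifable-9B-BF16.gguf \
    -f calibration.txt \
    -o imatrix.dat \
    -c 8192 --chunks 13
```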
How to make your own quants:
https://github.com/ggerganov/llama.cpp/tree/master/examples/imatrix
https://github.com/ggerganov/llama.cpp/tree/master/examples/quantize
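
A minimal sketch of the quantization step, assuming the `imatrix.dat` file produced above and picking Q4_K_M as an example target type (file names are hypothetical):

```sh
# Sketch: quantize the BF16 GGUF using the imatrix.
# Usage: llama-quantize [--imatrix FILE] input.gguf output.gguf TYPE
./llama-quantize --imatrix imatrix.dat \
    gemma-2-Ifable-9B-BF16.gguf \
    gemma-2-Ifable-9B-Q4_K_M.gguf Q4_K_M
```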