AetherArchitectural/GGUF-Quantization-Script

Mar 15

Regarding support for iq1_s, what do you think? It has received new updates recently and should be usable. Have you tried it out? And can you add it to the supported quantization list?

FantasiaFoundry

AetherArchitectural org Mar 15

edited Mar 15

(...) add it to the supported quantization list?

You just need to add it to the quantization_options starting at line 133. Add or remove anything you want.

Personally I don't use anything below IQ3s, so I haven't tried it and can't tell you how good it is so far. But if you want to try the new IQ1, you can already do it: just add it to the list starting at line 133, as I mentioned.

The script pulls the latest repo changes and gets the latest binaries, so you can just add it to the list. As long as it has been added to the llama.cpp repo and bundled in one of the frequent recent releases, it should be fine and will quant the same as all the others.
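For illustration, here is a rough sketch of what that edit could look like. Only the name `quantization_options` comes from the script; the entries shown here and the `add_quant` helper are assumptions made for the example, so check the actual list at line 133 before copying anything.

```python
# Hypothetical excerpt: the real list in the script may hold different entries.
quantization_options = [
    "IQ3_M", "IQ3_XXS",
    "Q4_K_M", "Q4_K_S",
    "Q5_K_M", "Q6_K", "Q8_0",
]


def add_quant(options, name):
    """Return a copy of `options` with `name` appended if it is not already listed."""
    if name in options:
        return list(options)
    return list(options) + [name]


# Add the new quant type; calling this twice does not create a duplicate entry.
quantization_options = add_quant(quantization_options, "IQ1_S")
print(quantization_options)
```

In practice you can simply type `"IQ1_S"` into the list by hand; the helper is only there to make the idea explicit.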

If you're quantizing a big model and you only need IQ1/IQ2 because all the other options are too big anyway, I recommend removing those other options from the list, as they are going to be unnecessary, unless you're looking to provide quants for others who might need them, of course.
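As a small sketch of that trimming step, assuming the list holds plain quant-type strings: the entries in `all_options` and the `keep_low_bit` helper are hypothetical and not taken from the script.

```python
# Hypothetical full list; the script's real entries may differ.
all_options = [
    "IQ1_S", "IQ2_XXS", "IQ2_XS",
    "IQ3_M", "Q4_K_M", "Q5_K_M", "Q6_K", "Q8_0",
]


def keep_low_bit(options, prefixes=("IQ1", "IQ2")):
    """Keep only the quant types whose names start with one of `prefixes`."""
    return [quant for quant in options if quant.startswith(prefixes)]


# Only the IQ1/IQ2 variants remain, so the larger quants are never produced.
quantization_options = keep_low_bit(all_options)
print(quantization_options)
```

Deleting the unwanted entries from the list by hand has the same effect.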

saishf

Apr 7

Regarding support for iq1_s, what do you think? It has received new updates recently and should be usable. Have you tried it out? And can you add it to the supported quantization list?

If you're interested in how iq1_s compares to other quants, Nexesenex has an in-depth comparison in this repo:

Nexesenex/TeeZee_Kyllene-Yi-34B-v1.1-iMat.GGUF

FantasiaFoundry changed discussion status to closed Apr 19