Can you share your iMatrix file?
So I can make my own custom quants from the best source. =)
Thanks in any case!
Sadly I have removed the model folder due to space constraints :(
Np, thanks for the bf16 still!
Could you share it alongside your future quants? It can be useful for gold diggers! :D
Most of my quants have the imatrix uploaded alongside them; seems I just forgot for this one. A few examples (with a usage sketch after the list):
- qwp4w3hyb/Meta-Llama-3.1-70B-Instruct-iMat-GGUF/blob/main/meta-llama-3.1-70b-instruct.imatrix
- qwp4w3hyb/Mistral-Nemo-Instruct-2407-iMat-GGUF/blob/main/mistral-nemo-instruct-2407.imatrix
- qwp4w3hyb/Qwen2-7B-Instruct-iMat-GGUF/blob/main/qwen2-7b-instruct.imatrix
- qwp4w3hyb/gemma-2-27b-it-iMat-GGUF/blob/main/gemma-2-27b-it.imatrix
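For anyone who wants to reuse one of these for their own quants, here is a minimal sketch of the workflow. It assumes llama.cpp's `llama-quantize` binary and `huggingface-cli`; the file names and the `Q4_K_M` type are just illustrative choices, not a prescription:

```bash
# Fetch the shared imatrix next to a local full-precision GGUF
# (repo and file names taken from the list above; adjust as needed)
huggingface-cli download qwp4w3hyb/Qwen2-7B-Instruct-iMat-GGUF \
    qwen2-7b-instruct.imatrix --local-dir .

# Quantize using the importance matrix; the bf16 source GGUF name
# and the Q4_K_M target type are placeholders
./llama-quantize --imatrix qwen2-7b-instruct.imatrix \
    qwen2-7b-instruct-bf16.gguf qwen2-7b-instruct-Q4_K_M.gguf Q4_K_M
```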
But also, I'm not really doing much quantizing anymore, as @bartowski is handling all my needs and I'm too lazy to do duplicate work. The only time you'll see me uploading new quants is if a new model comes out that I really wanna use but it needs some WIP llama.cpp PRs to work correctly; then I'll provide beta quants with those PRs applied.
Ohh, btw I just found the imatrix you need in @bartowski 's repo: bartowski/Qwen2-72B-Instruct-GGUF/blob/main/Qwen2-72B-Instruct.imatrix. AFAIK he uses exactly the same process as me; the only difference is that for some quants he might not use the bf16 as the source (not entirely sure about this).
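For reference, the generation side of that process is a single llama.cpp step. A sketch assuming the `llama-imatrix` tool; the model file and calibration corpus names are placeholders, not the exact ones either of us used:

```bash
# Compute an importance matrix by running a calibration corpus
# through the full-precision model (calibration.txt is a placeholder;
# any representative text corpus works)
./llama-imatrix -m qwen2-72b-instruct-bf16.gguf \
    -f calibration.txt -o Qwen2-72B-Instruct.imatrix
```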
Regarding the relevance of bf16 vs. f16, he did a good investigation here: posts/bartowski/928757596721302
And I think I agree with him, though I would still love a more extensive investigation into how much it matters for imatrices, with more benchmarking of the resulting models.
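For context, the bf16-vs-f16 question is just about which conversion you feed into the steps above. A sketch assuming llama.cpp's `convert_hf_to_gguf.py` script; the checkpoint path and output name are illustrative:

```bash
# Convert the HF checkpoint to GGUF keeping bf16 precision as the
# quantization source (swap --outtype f16 for the f16 variant);
# paths are placeholders
python convert_hf_to_gguf.py /path/to/Qwen2-72B-Instruct \
    --outtype bf16 --outfile qwen2-72b-instruct-bf16.gguf
```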