Can you share your iMatrix file?
So I can make my own custom quants from the best source. =)
Thanks in any case!
Sadly I have removed the model folder due to space constraints :(
Np, thanks for the bf16 still!
Could you share it alongside your future quants? It can be useful for gold diggers! :D
Most of my quants have the imatrix uploaded alongside them; seems I just forgot for this one. A few examples (with a usage sketch after the list):
- qwp4w3hyb/Meta-Llama-3.1-70B-Instruct-iMat-GGUF/blob/main/meta-llama-3.1-70b-instruct.imatrix
- qwp4w3hyb/Mistral-Nemo-Instruct-2407-iMat-GGUF/blob/main/mistral-nemo-instruct-2407.imatrix
- qwp4w3hyb/Qwen2-7B-Instruct-iMat-GGUF/blob/main/qwen2-7b-instruct.imatrix
- qwp4w3hyb/gemma-2-27b-it-iMat-GGUF/blob/main/gemma-2-27b-it.imatrix
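For anyone who wants to reuse one of these for their own quants, here is a minimal sketch of the workflow. It assumes llama.cpp's `llama-quantize` binary and `huggingface-cli`; the file names and the `Q4_K_M` type are just illustrative choices, not a prescription:

```bash
# Fetch the shared imatrix next to a local full-precision GGUF
# (repo and file names taken from the list above; adjust as needed)
huggingface-cli download qwp4w3hyb/Qwen2-7B-Instruct-iMat-GGUF \
    qwen2-7b-instruct.imatrix --local-dir .

# Quantize using the importance matrix; the bf16 source GGUF name
# and the Q4_K_M target type are placeholders
./llama-quantize --imatrix qwen2-7b-instruct.imatrix \
    qwen2-7b-instruct-bf16.gguf qwen2-7b-instruct-Q4_K_M.gguf Q4_K_M
```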
But also, I'm not really doing much quantizing anymore, as @bartowski is handling all my needs and I'm too lazy to do duplicate work. The only time you'll see me uploading new quants is if a new model comes out that I really wanna use but it needs some WIP llama.cpp PRs to work correctly; then I'll provide beta quants with those PRs applied.
Ohh, btw I just found the imatrix you need in @bartowski 's repo: bartowski/Qwen2-72B-Instruct-GGUF/blob/main/Qwen2-72B-Instruct.imatrix. AFAIK he uses exactly the same process as me; the only difference is that for some quants he might not use the bf16 as the source (not entirely sure about this).
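For reference, the generation side of that process is a single llama.cpp step. A sketch assuming the `llama-imatrix` tool; the model file and calibration corpus names are placeholders, not the exact ones either of us used:

```bash
# Compute an importance matrix by running a calibration corpus
# through the full-precision model (calibration.txt is a placeholder;
# any representative text corpus works)
./llama-imatrix -m qwen2-72b-instruct-bf16.gguf \
    -f calibration.txt -o Qwen2-72B-Instruct.imatrix
```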
Regarding the relevance of bf16 vs. f16, he did a good investigation here: posts/bartowski/928757596721302
And I think I agree with him, though I would still love a more extensive investigation into how much it matters for imatrices, with more benchmarking of the resulting models.
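For context, the bf16-vs-f16 question is just about which conversion you feed into the steps above. A sketch assuming llama.cpp's `convert_hf_to_gguf.py` script; the checkpoint path and output name are illustrative:

```bash
# Convert the HF checkpoint to GGUF keeping bf16 precision as the
# quantization source (swap --outtype f16 for the f16 variant);
# paths are placeholders
python convert_hf_to_gguf.py /path/to/Qwen2-72B-Instruct \
    --outtype bf16 --outfile qwen2-72b-instruct-bf16.gguf
```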