An alternate quantization.
These are my own quantizations (updated almost daily).
https://huggingface.co/ZeroWw/L3-8B-Stheno-v3.3-32K-GGUF
My own (ZeroWw) quantizations: the output and embedding tensors are kept at f16, and all other tensors are quantized to q5_k or q6_k.
Result: both f16.q6 and f16.q5 are smaller than the standard q8_0 quantization, and they perform as well as pure f16.
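If you want to try reproducing this kind of mixed quant yourself, a rough sketch of the workflow (assuming a recent llama.cpp build whose llama-quantize tool exposes the --token-embedding-type and --output-tensor-type options; the file names below are placeholders, not the actual files from this repo) would be:

```sh
# Sketch of the assumed workflow, not the exact commands used for these uploads:
# start from an f16 GGUF and quantize it, keeping the token-embedding and output
# tensors at f16 while all other tensors go to q6_k (use Q5_K instead for the
# f16.q5 variant).
./llama-quantize \
    --token-embedding-type f16 \
    --output-tensor-type f16 \
    model.f16.gguf \
    model.f16.q6.gguf \
    Q6_K
```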
Greetings, I applaud your intriguing claims regarding the performance of f16.q6 and f16.q5 in comparison to the q8_0 standard quantization. If your assertion is accurate, it would indeed be a significant development. To ensure the credibility of this information, I would kindly request you to provide any sources or references that support your claim. Thank you in advance for your cooperation.
He actually provides references on his model page. I've tested it, and I also noticed a big increase in quality. @ZeroWw did an amazing job.
Thanks. Mine was just an idea, and it seems to work despite some negative feedback. I also noticed that the negative feedback came from people using "big" models, while I saw improvement in the small ones, 9B or smaller...