Could we combine AWQ and importance matrix calculation to further improve perplexity?
Same as the question in the title: could we do that, or does it not matter at all?
AutoAWQ can calculate AWQ scales for llama.cpp quantization:
https://github.com/casper-hansen/AutoAWQ/pull/285
Thanks
What does AutoAWQ do? I can go and look around in the quoted repo, but it would be much easier if someone explained their approach.
https://github.com/casper-hansen/AutoAWQ
https://github.com/mit-han-lab/llm-awq
https://arxiv.org/abs/2306.00978
Slide: https://www.dropbox.com/scl/fi/dtnp6h6y1mnp7g036axu6/AWQ-slide.pdf?rlkey=ffgh50hxhx8dmsnjiu8kef0ou&dl=0
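For anyone who wants the gist without reading the paper: AWQ derives a per-input-channel scale from activation magnitudes, folds it into the weights before quantization, and folds the inverse into the activations, so channels with large activations lose less precision. Below is a minimal NumPy sketch of that scale search (my own simplification with a plain 4-bit round-to-nearest quantizer; not the AutoAWQ implementation):

```python
import numpy as np

def quantize_rtn(w, bits=4):
    """Per-row round-to-nearest quantize + dequantize (illustrative only)."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max(axis=1, keepdims=True) / qmax
    scale[scale == 0] = 1.0
    return np.round(w / scale) * scale

def awq_scale_search(w, x, grid=20):
    """Grid-search per-input-channel scales s built from activation magnitudes:
    quantize w * s, feed x / s, keep the s with the lowest output error."""
    act_mag = np.abs(x).mean(axis=0)                 # per-channel activation magnitude
    ref = x @ w.T                                    # full-precision reference output
    best_err, best_s = np.inf, np.ones(w.shape[1])
    for i in range(grid + 1):
        alpha = i / grid
        s = act_mag ** alpha
        s = s / np.sqrt(s.max() * s.min())           # keep scales centered (similar to llm-awq)
        wq = quantize_rtn(w * s)                     # fold s into weights, then quantize
        err = np.mean((ref - (x / s) @ wq.T) ** 2)   # fold 1/s into activations
        if err < best_err:
            best_err, best_s = err, s
    return best_s, best_err

rng = np.random.default_rng(0)
w = rng.normal(size=(16, 64))                        # [out_features, in_features]
x = rng.normal(size=(32, 64))
x[:, :4] *= 10.0                                     # a few "salient" channels
s, err = awq_scale_search(w, x)
print(s[:8].round(3), err)
```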
If I understand their paper correctly, a scale search of this kind is also part of what I do for these quantized models, so I'm not sure combining the two will help.
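To make the overlap concrete: as I understand it, the importance matrix is also built from per-channel activation statistics, but it enters as a weight on the quantization error being minimized rather than as a scale folded into the weights, so both methods end up steering precision toward the same salient channels. A rough sketch of such a weighted fit (my simplification with a hypothetical quantize_block_weighted helper; not the actual llama.cpp kernels):

```python
import numpy as np

def quantize_block_weighted(w, importance, bits=4, grid=16):
    """Pick a block scale d minimizing sum_i importance_i * (w_i - d * q_i)^2
    over a small grid of candidate scales (illustrative only)."""
    qmax = 2 ** (bits - 1) - 1
    amax = np.abs(w).max()
    d0 = amax / qmax if amax > 0 else 1.0
    best_err, best = np.inf, (d0, np.zeros_like(w))
    for i in range(1, grid + 1):
        d = d0 * (0.5 + 0.5 * i / grid)              # candidates around the RTN scale
        q = np.clip(np.round(w / d), -(qmax + 1), qmax)
        err = np.sum(importance * (w - d * q) ** 2)
        if err < best_err:
            best_err, best = err, (d, q)
    return best

# "importance" stands in for imatrix entries: average squared activation per
# input channel, collected from a calibration run.
rng = np.random.default_rng(1)
w_block = rng.normal(size=32)
importance = rng.random(32) ** 2
d, q = quantize_block_weighted(w_block, importance)
print(round(float(d), 4), q[:8].astype(int))
```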
But I have now contributed the quantization approach used for these models to llama.cpp.
My guess is that it is easier for the contributors of https://github.com/casper-hansen/AutoAWQ/ to try this than for me to get up to speed with their repo.