Thanks for the quant

by FlareRebellion - opened Jan 23

Jan 23

You seem to be the only person doing importance matrix quants and this is a cool model. It works great for me, thanks.

PS: This is NOT a request, and it's too early to tell, but cloudyu/Yi-34Bx2-MoE-60B-DPO could be great, considering the strength of the non-DPO variant. I'll keep my eyes open for more of your quants in any case.

Artefact2

Owner Jan 23

Thanks! I'll have a look at it :-)

Artefact2

Owner Jan 23

It's no longer accessible.

FlareRebellion

Jan 23

Yeah, lol. The model was probably borked or something; it wouldn't be the first time DPO training went haywire.

Artefact2

Owner Jan 23 • edited Jan 23

If you have other quant suggestions (70B parameters or fewer), feel free to send them :-)

FlareRebellion

Jan 23

Oh yeah, sure, I'm full of suggestions. I guess iquants of these would be pretty cool.

https://huggingface.co/NeverSleep/Noromaid-v0.4-Mixtral-Instruct-8x7b-Zloss-GGUF
https://huggingface.co/jondurbin/bagel-dpo-8x7b-v0.2
https://huggingface.co/jondurbin/bagel-dpo-34b-v0.2

They all have standard GGUF quants by TheBloke already, but I guess modern, state-of-the-art importance matrix quantisations could improve things for the GPU-poor.

Artefact2

Owner Jan 25

https://huggingface.co/Artefact2/Noromaid-v0.4-Mixtral-Instruct-8x7b-Zloss-GGUF
https://huggingface.co/Artefact2/bagel-dpo-8x7b-v0.2-GGUF
https://huggingface.co/Artefact2/bagel-dpo-34b-v0.2-GGUF

FlareRebellion

Jan 30 • edited Jan 30

@Artefact2 just in case you're still up for more quant suggestions :)

https://huggingface.co/codellama/CodeLlama-70b-Instruct-hf
https://huggingface.co/ycros/BagelMIsteryTour-v2-8x7B

Thanks for all your hard work.

Edit: I accidentally linked BagelMIsteryTour v1 at first, oops.

FlareRebellion

Jan 30

I'm putting myself in timeout for not even pasting the right link when asking for your quants, for shame.

Artefact2

Owner Jan 31

https://huggingface.co/Artefact2/CodeLlama-70b-Instruct-hf-GGUF
https://huggingface.co/Artefact2/BagelMIsteryTour-v2-8x7B-GGUF

FlareRebellion

Feb 6

Another new model that might be interesting (no GGUF quantisations yet):

https://huggingface.co/serpdotai/sparsetral-16x7B-v2

Though maybe this weird sparse type of model needs different quantisation methods and won't work with the old ones?

Artefact2

Owner Feb 6

Sparsetral and Camelidae are new model architectures, so it will take a while for llama.cpp to support them (if it ever does). You can open a suggestion upstream if you want!
