GPTQ or GGUF

#2
by XiphosLyacris

Any chance of a GPTQ or GGUF conversion for this?

llama.cpp has issues with Llama 3.1, so everybody is currently waiting for those to be fixed before quantizing. With luck, it will be fixed in a day.

That is true for the 8B and 70B versions of this model, but the 12B version is based on Mistral Nemo, not Llama 3.1, and Nemo is currently fully supported by llama.cpp. The same is true for the 123B version, which is based on Mistral Large.
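
In the meantime, for anyone who wants to roll their own GGUF, here is a rough sketch of the usual llama.cpp workflow, driven from Python. The llama.cpp checkout path, the model directory, and the output filenames are placeholders, and the exact script/binary names (convert_hf_to_gguf.py, llama-quantize) have changed between llama.cpp versions:

```python
# Sketch of the standard llama.cpp GGUF conversion + quantization flow.
# Paths and filenames are assumptions; adjust to your local setup.
import subprocess
from pathlib import Path

LLAMA_CPP = Path("llama.cpp")          # local llama.cpp checkout (assumption)
MODEL_DIR = Path("Lumimaid-v0.2-12B")  # downloaded HF model dir (assumption)
F16_GGUF = "lumimaid-12b-f16.gguf"

# 1. Convert the Hugging Face checkpoint to a full-precision GGUF file.
subprocess.run(
    ["python", str(LLAMA_CPP / "convert_hf_to_gguf.py"),
     str(MODEL_DIR), "--outfile", F16_GGUF, "--outtype", "f16"],
    check=True,
)

# 2. Quantize the f16 GGUF down to a smaller type, e.g. Q4_K_M.
subprocess.run(
    [str(LLAMA_CPP / "llama-quantize"),
     F16_GGUF, "lumimaid-12b-Q4_K_M.gguf", "Q4_K_M"],
    check=True,
)
```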

NeverSleep org

You're right, Mistral quants can be done, will probably do some today.
L3.1 will need to wait.

Thank you, I really appreciate all the work you guys have put into this. I'm looking forward to trying this out 👍.
It looks quite promising, especially given how good Nemo is to start with.

I look forward to the release. I really hope you can also publish recommended sampler settings.
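
In the meantime, here is a rough sketch of where such sampler settings would plug in, using the llama-cpp-python bindings. The model path and every sampler value below are generic placeholders, not NeverSleep's recommendations:

```python
# Sketch of applying sampler settings with llama-cpp-python.
# All values are illustrative placeholders, NOT official recommendations.
from llama_cpp import Llama

llm = Llama(model_path="lumimaid-12b-Q4_K_M.gguf", n_ctx=8192)  # path is an assumption

out = llm(
    "Write a short greeting.",
    max_tokens=256,
    temperature=0.7,     # placeholder values; tune once official settings exist
    top_p=0.95,
    min_p=0.05,
    repeat_penalty=1.1,
)
print(out["choices"][0]["text"])
```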

NeverSleep org

L3.1 support got merged, I'm gonna do some static quants for the 4 models.
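
For reference, "static quants" here means the fixed GGUF quantization types produced directly by llama-quantize, without an importance-matrix calibration pass. A rough sketch of batching them across the four sizes (filenames and quant-type choices are assumptions, not the actual release plan):

```python
# Sketch: batch static GGUF quants across the four model sizes.
# Filenames and quant types are assumptions, not the actual release plan.
import subprocess

SIZES = ["8B", "12B", "70B", "123B"]
QUANT_TYPES = ["Q4_K_M", "Q5_K_M", "Q8_0"]  # common static quant choices

for size in SIZES:
    f16 = f"lumimaid-{size}-f16.gguf"  # produced by the convert step above
    for qt in QUANT_TYPES:
        subprocess.run(
            ["./llama-quantize", f16, f"lumimaid-{size}-{qt}.gguf", qt],
            check=True,
        )
```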
