https://huggingface.co/migtissera/Tess-3-Mixtral-8x22B

#416
by tioust - opened

Would it be possible to quantize this model ?
https://huggingface.co/migtissera/Tess-3-Mixtral-8x22B
Thank you :)

I can sure try :) It's in the queue, and you can watch its status at http://hf.tst.eu/status.html if you wish.

mradermacher changed discussion status to closed

Unfortunately, llama.cpp crashes when creating imatrix quants (probably due to bugs in how llama.cpp handles MoEs). I will try to create the non-crashing imatrix quants, but you need to be a bit wary: when in doubt, compare with a static quant before considering the model unusable.

For what it's worth, I noticed a lot of warnings like this during the imatrix computation process for this model:

[190]388.6883,[191]388.6048,[192]389.5774,[193]391.5766,[194]392.9305,[195]393.7029,[196]395.4898,[197]395.5028,[198]396.6062,[199]398.5765,
save_imatrix: entry '             blk.55.ffn_down_exps.weight' has partial data (62.50%) - skipping
save_imatrix: entry '             blk.55.ffn_gate_exps.weight' has partial data (62.50%) - skipping
save_imatrix: entry '               blk.55.ffn_up_exps.weight' has partial data (62.50%) - skipping
save_imatrix: entry '             blk.54.ffn_down_exps.weight' has partial data (87.50%) - skipping
save_imatrix: entry '               blk.54.ffn_up_exps.weight' has partial data (87.50%) - skipping
save_imatrix: entry '             blk.29.ffn_gate_exps.weight' has partial data (87.50%) - skipping
save_imatrix: entry '               blk.29.ffn_up_exps.weight' has partial data (87.50%) - skipping
save_imatrix: entry '             blk.54.ffn_gate_exps.weight' has partial data (87.50%) - skipping
save_imatrix: entry '             blk.29.ffn_down_exps.weight' has partial data (87.50%) - skipping
save_imatrix: storing only 439 out of 448 entries

According to llama.cpp, "this can happen with MoE models where some of the experts end up not being exercised by the provided training data". So it might be normal for MoE models and just not something I ever noticed before.
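To see at a glance which layers are affected, the warnings can be tallied with a small script. This is only a rough sketch: the function names (`parse_imatrix_log`, `group_by_layer`, `experts_with_data`) are made up for illustration, the regular expressions are matched against the log lines pasted above rather than against any documented format, and reading the percentage as "share of the 8 experts that received data" is my assumption, not something the log states.

```python
import re

# Log excerpt copied from the output above (padding inside the quotes kept as printed).
LOG = """\
save_imatrix: entry '             blk.55.ffn_down_exps.weight' has partial data (62.50%) - skipping
save_imatrix: entry '             blk.55.ffn_gate_exps.weight' has partial data (62.50%) - skipping
save_imatrix: entry '               blk.55.ffn_up_exps.weight' has partial data (62.50%) - skipping
save_imatrix: entry '             blk.54.ffn_down_exps.weight' has partial data (87.50%) - skipping
save_imatrix: entry '               blk.54.ffn_up_exps.weight' has partial data (87.50%) - skipping
save_imatrix: entry '             blk.29.ffn_gate_exps.weight' has partial data (87.50%) - skipping
save_imatrix: entry '               blk.29.ffn_up_exps.weight' has partial data (87.50%) - skipping
save_imatrix: entry '             blk.54.ffn_gate_exps.weight' has partial data (87.50%) - skipping
save_imatrix: entry '             blk.29.ffn_down_exps.weight' has partial data (87.50%) - skipping
save_imatrix: storing only 439 out of 448 entries
"""

# Patterns inferred from the lines above, not from llama.cpp documentation.
ENTRY_RE = re.compile(
    r"save_imatrix: entry '\s*(\S+)' has partial data \((\d+(?:\.\d+)?)%\) - skipping"
)
SUMMARY_RE = re.compile(r"save_imatrix: storing only (\d+) out of (\d+) entries")

# Assumption: 8 experts per MoE layer, as the "8x22B" in the model name suggests.
N_EXPERTS = 8


def parse_imatrix_log(text):
    """Return ({tensor_name: percent_with_data}, (stored, total) or None)."""
    skipped = {m.group(1): float(m.group(2)) for m in ENTRY_RE.finditer(text)}
    summary = SUMMARY_RE.search(text)
    counts = (int(summary.group(1)), int(summary.group(2))) if summary else None
    return skipped, counts


def group_by_layer(skipped):
    """Group skipped tensors by block index, e.g. 'blk.55.…' goes under 55."""
    layers = {}
    for name, pct in skipped.items():
        layer = int(name.split(".")[1])
        layers.setdefault(layer, []).append((name, pct))
    return dict(sorted(layers.items()))


def experts_with_data(pct, n_experts=N_EXPERTS):
    """Assumed interpretation: pct is the share of experts that saw any data."""
    return round(pct / 100 * n_experts)


if __name__ == "__main__":
    skipped, counts = parse_imatrix_log(LOG)
    for layer, tensors in group_by_layer(skipped).items():
        for name, pct in tensors:
            print(f"layer {layer}: {name} ({experts_with_data(pct)}/{N_EXPERTS} experts)")
    print("stored/total:", counts)
```

On the excerpt above this finds nine skipped tensors, all in blocks 29, 54 and 55, which matches the 448 - 439 = 9 entries the summary line reports as not stored.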

In any case the imatrix quants of this model are now available under https://huggingface.co/mradermacher/Tess-3-Mixtral-8x22B-i1-GGUF so it must have somehow worked out if it isn't broken.

No, it hasn't worked out. I only provided the non-crashing ones (or the ones I expect not to crash). If this error happens, I have a fallback list of basically everything but the IQ2 and IQ1 quants, which has a high (but not certain) chance of succeeding. That doesn't mean the other quants are not broken, though: from experience, there is a good chance that all the imatrix quants will be negatively affected, but that depends on the model.
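As a rough illustration of what such a fallback amounts to (not the actual script or list used here), one can filter a list of quant types by prefix. The type names below are real llama.cpp quantization types, but which ones appear in the list, and the helper name `fallback_quants`, are assumptions made for the example.

```python
# Illustrative list only; the real set of imatrix quants produced may differ.
ALL_IMATRIX_QUANTS = [
    "IQ1_S", "IQ1_M",
    "IQ2_XXS", "IQ2_XS", "IQ2_S", "IQ2_M",
    "Q2_K",
    "IQ3_XXS", "IQ3_XS", "IQ3_S", "IQ3_M",
    "Q3_K_S", "Q3_K_M", "Q3_K_L",
    "IQ4_XS", "Q4_K_S", "Q4_K_M",
    "Q5_K_S", "Q5_K_M", "Q6_K",
]


def fallback_quants(quants, excluded_prefixes=("IQ1", "IQ2")):
    """Hypothetical helper: drop every quant type whose name starts with an excluded prefix."""
    return [q for q in quants if not q.startswith(excluded_prefixes)]


if __name__ == "__main__":
    print(fallback_quants(ALL_IMATRIX_QUANTS))
```

Note that a plain `Q2_K` survives the filter, since only the `IQ1` and `IQ2` families are excluded.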

Also, if I may remark, the perplexity values are very high.
