no Q8?

#1
by GhostGate - opened

I was just wondering whether there will be any Q8 or Q6 quants.

There will be; the IQs are uploading first. My local upload speed is slow, so this will take some time due to both the size and the quantity of the files.

How would that even work at Q8? If I understand correctly, this is not a finetune but simply uses a "horror" imatrix to give the weights most relevant to horror more importance during quantization. But at Q8 (and probably Q6 too) I would expect it to have almost no effect, since all weights are preserved well anyway?
Nice idea though. I was thinking along the same lines: if one used an imatrix built from code snippets, would it better preserve the coding ability of the quantized model?
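For context, the pipeline I have in mind is roughly this, using llama.cpp's tools (just a sketch; binary names and flags vary between builds, and all file names here are placeholders):

```python
import subprocess

# Placeholder paths - substitute your own model and calibration text.
F16_MODEL = "model-f16.gguf"
CALIBRATION = "horror-calibration.txt"   # the "horror" imatrix dataset
IMATRIX_OUT = "imatrix.dat"
QUANT_OUT = "model-Q6_K-imat.gguf"

# 1) Run the calibration text through the full-precision model to collect
#    per-tensor activation statistics (the importance matrix).
subprocess.run(
    ["llama-imatrix", "-m", F16_MODEL, "-f", CALIBRATION, "-o", IMATRIX_OUT],
    check=True,
)

# 2) Quantize with the imatrix so the weights that matter most for the
#    calibration text are kept at higher effective precision.
subprocess.run(
    ["llama-quantize", "--imatrix", IMATRIX_OUT, F16_MODEL, QUANT_OUT, "Q6_K"],
    check=True,
)
```

Either way, the imatrix only influences how the weights are rounded during quantization; nothing is retrained.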

There is still an effect at Q6, but not as much.
At Q8, although it barely shows in PPL, the imatrix does affect it too.
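If you want to see the PPL side yourself, run both quants over the same held-out text (sketch only; the binary name depends on your llama.cpp build, and the file names are placeholders):

```python
import subprocess

TEST_TEXT = "held-out-sample.txt"  # any text file to score both quants on

for model in ("model-Q8_0.gguf", "model-Q8_0-imat.gguf"):
    # Prints a perplexity figure for each quant over the same text,
    # so the two numbers can be compared directly.
    subprocess.run(["llama-perplexity", "-m", model, "-f", TEST_TEXT], check=True)
```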

The effects can be verified by testing an unaltered Q6 against the imatrix Q6, with "temp=0" and a creative test prompt.
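For example, with llama-cpp-python (model file names here are placeholders; greedy decoding at temp=0 means any difference in the output comes from the quants themselves, not from sampling):

```python
from llama_cpp import Llama

PROMPT = "Write the opening paragraph of a horror story set in an abandoned lighthouse."

def generate(path: str) -> str:
    # temperature=0 makes decoding greedy and deterministic for a given build.
    llm = Llama(model_path=path, n_ctx=4096, verbose=False)
    out = llm(PROMPT, max_tokens=300, temperature=0.0)
    return out["choices"][0]["text"]

plain = generate("model-Q6_K.gguf")      # standard Q6_K
imat = generate("model-Q6_K-imat.gguf")  # imatrix Q6_K

print("--- plain Q6_K ---\n", plain)
print("--- imatrix Q6_K ---\n", imat)
```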

The Neo class datasets are far stronger than average imatrix datasets, as they are calibrated for the LLM and the imatrix process.
They are precision-formatted based on a lot of trial and error and testing
(rather than a copy/paste text-file "mess", so to speak).

That being said, I recommend IQ4XS, and to a lesser degree the Q4s and Q5s.

Neo class datasets were also used here:
https://huggingface.co/DavidAU/Command-R-01-Ultra-NEO-V1-35B-IMATRIX-GGUF

and in a number of other models at my repo as well.

Same guidance applies.

For creative use cases, I usually do not recommend Q8, as there seems to be a "drop off" or "dulling" at Q8 vs Q6/Q5KM.
This varies from model to model, based on testing a lot of models.

In terms of "horror" level: the Grand Horror series of models far exceeds Command-R Dark Horror due to their construction.

DavidAU changed discussion status to closed
