dranger003/c4ai-command-r-plus-iMat.GGUF

How about a quantized version that fits in 16 GB of memory, like WizardLM?

#19 opened 6 months ago by

Will you redo the quants after your BPE PR gets merged?

#18 opened 6 months ago by

I'm generating an imatrix using `groups_merged.txt`; do you want me to run any tests?

#15 opened 7 months ago by

Can we get a Q4 without the iMat?

#14 opened 7 months ago by

Failure on 104b-iq2_xxs.gguf with llama.cpp

#12 opened 7 months ago by

Invalid split files?

#11 opened 7 months ago by

Unable to load in ollama built from PR branch

#10 opened 7 months ago by

Is IQ1_S broken? If so, why list it here?

#9 opened 7 months ago by

Fast work by the people on the llama.cpp team

#8 opened 7 months ago by

For a context of at least 32K tokens, which version for a 2x16GB GPU config?

#3 opened 7 months ago by

What does iMat mean?

#2 opened 7 months ago by