imatrix
Hello, I made imatrix quants for this model! I like it a lot.
https://huggingface.co/Bakanayatsu/Fimbulvetr-Kuro-Lotus-10.7B-GGUF-imatrix
Thank you! I'll add it to the top of the card :3
I have tried Bakanayatsu's imatrix versions, but they seem to be corrupted: when I run them in LM Studio, the program crashes. On the other hand, this model is excellent. In fact, it is the best in its category of all the ones I have tried. A marvel.
Are you trying the IQ xs variants?
The IQ xs variants are pretty new and may not be supported by LM Studio yet.
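If you want to rule out a broken download before blaming the quant type, you can peek at the file header. This is only a rough sketch, assuming a little-endian GGUF file of version 2 or later (4-byte magic, uint32 version, then two uint64 counts); `read_gguf_header` is a made-up helper name, not part of any library:

```python
import struct


def read_gguf_header(path):
    """Return (version, tensor_count, metadata_kv_count) from a GGUF file.

    Assumes little-endian GGUF v2+ layout. Raises ValueError if the magic
    bytes are wrong or the file is too short to hold a header.
    """
    with open(path, "rb") as f:
        head = f.read(24)  # 4 magic + 4 version + 8 + 8 count bytes
    if len(head) < 24 or head[:4] != b"GGUF":
        raise ValueError("not a GGUF file, or header is truncated")
    version, tensor_count, kv_count = struct.unpack("<IQQ", head[4:24])
    return version, tensor_count, kv_count
```

If this raises, the file is likely damaged or incomplete. If it returns sensible numbers, the header at least is intact, and an unsupported quant type becomes the more likely cause of the crash. It does not check the rest of the file, so comparing the file size or checksum against the repo is still worth doing.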
Can I run this model with TensorRT-LLM?
I don't believe GGUF is supported by TensorRT, but I'm not completely sure.
FP16
FP8
INT8 & INT4 Weight-Only
SmoothQuant
Groupwise quantization (AWQ/GPTQ)
FP8 KV CACHE
INT8 KV CACHE (+ AWQ/per-channel weight-only)
Tensor Parallel
STRONGLY TYPED
This is all that's listed on TensorRT's GitHub page under the support matrix for Llama.
Gotcha... but this should still be supported by TensorRT, right?