Update README.md
README.md CHANGED
@@ -30,12 +30,11 @@ license: cc-by-nc-4.0
 - Requantized for recent bpe pre-tokenizer fixes https://github.com/ggerganov/llama.cpp/pull/6920
 - quants done with an importance matrix for improved quantization loss
 - 0, K & IQ quants in basically all variants from Q8 down to IQ1_S
-- Quantized with [llama.cpp](https://github.com/ggerganov/llama.cpp) commit [
+- Quantized with [llama.cpp](https://github.com/ggerganov/llama.cpp) commit [fabf30b4c4fca32e116009527180c252919ca922](https://github.com/ggerganov/llama.cpp/commit/fabf30b4c4fca32e116009527180c252919ca922) (master as of 2024-05-20)
 - Imatrix generated with [this](https://github.com/ggerganov/llama.cpp/discussions/5263#discussioncomment-8395384) dataset.
 ```
 ./imatrix -c 512 -m $model_name-f16.gguf -f $llama_cpp_path/groups_merged.txt -o $out_path/imat-f16-gmerged.dat
 ```
-Quantized with llama.cpp commit fabf30b4c4fca32e116009527180c252919ca922 (master as of 2024-05-20)
 
 # Original Model Card:
 
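For reference, the `./imatrix` invocation in the README uses three shell variables. A minimal sketch of how it expands, with placeholder values (the paths and model name here are assumptions for illustration, not from the source; the command is only printed, not executed):

```shell
# Placeholder values -- substitute your own model name and paths.
model_name="model"
llama_cpp_path="$HOME/llama.cpp"
out_path="./imatrix-out"

# Build the imatrix command string documented in the README:
# -c 512  : context size used per imatrix chunk
# -m      : the f16 GGUF model to measure
# -f      : calibration text (groups_merged.txt from the linked discussion)
# -o      : output path for the importance-matrix data file
cmd="./imatrix -c 512 -m $model_name-f16.gguf -f $llama_cpp_path/groups_merged.txt -o $out_path/imat-f16-gmerged.dat"
echo "$cmd"
```

The resulting `.dat` file is then passed to the quantization step so that weights important on the calibration text are preserved more accurately.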