InferenceIllusionist committed
Commit 1a6d30c (parent: 12952ce)

Update README.md

Files changed (1): README.md (+5 -0)
  Quantized from fp32 with love. If you're using the latest version of llama.cpp you should no longer need to combine files before loading.
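On older llama.cpp builds that could not load split GGUF files directly, the shards had to be merged first. A minimal sketch of that step, assuming the `gguf-split` tool that ships with llama.cpp (the binary is named `llama-gguf-split` in newer builds, and the shard paths here are illustrative, not this repo's actual filenames):

```shell
# Merge split GGUF shards into one file (only needed on older llama.cpp builds).
# Point --merge at the first shard; the shard names below are hypothetical.
.\llama-gguf-split --merge .\models\WizardLM-2-8x22b\model-00001-of-00002.gguf .\models\WizardLM-2-8x22b\model-merged.gguf
```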
* Importance matrix calculated using fp16 precision model
* Calculated in 105 chunks with n_ctx=512 using groups_merged.txt
* See below for imatrix calculation arguments

```
.\llama-imatrix -m .\models\WizardLM-2-8x22b\ggml-model-f16.gguf -f .\imatrix\groups_merged.txt -o .\models\WizardLM-2-8x22b\WizardLM-2-8x22b-f16.imatrix -ngl 14 -t 24
```
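For context, the imatrix produced above is then consumed during quantization. A rough sketch of that follow-up step using llama.cpp's `llama-quantize` tool (the quant type and output path here are illustrative assumptions, not this repo's actual recipe):

```shell
# Hypothetical example: quantize the f16 model using the imatrix computed above.
# IQ2_XXS and the output filename are placeholders for whichever quant you want.
.\llama-quantize --imatrix .\models\WizardLM-2-8x22b\WizardLM-2-8x22b-f16.imatrix .\models\WizardLM-2-8x22b\ggml-model-f16.gguf .\models\WizardLM-2-8x22b\WizardLM-2-8x22b-IQ2_XXS.gguf IQ2_XXS
```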
 
  For a brief rundown of iMatrix quant performance please see this [PR](https://github.com/ggerganov/llama.cpp/pull/5747)