Update README.md
README.md
CHANGED
@@ -14,7 +14,7 @@ Miqudev provided Q5_K_M, Q4_K_M, and Q2_K on this page : https://huggingface.co/
 Here, you will find the following quants :
 
 Full offload possible on 48GB VRAM with a huge context size :
-- Q4_K_S
+- Q4_K_S. Note : A Q5_K_S requant compared to the original Q4_K_M quant of Miqudev wouldn't bring much benefit if any, and take much more VRAM, so I didn't do it.
 - Lower quality : Q3_K_L
 
 Full offload possible on 36GB VRAM with a variable context size (up to 7168 with Q3_K_M, for example)