Update README.md
README.md CHANGED
@@ -7,13 +7,23 @@ Miqu 1 70b : a leak of Mistral Medium Alpha. Credit for this model goes to the M
 
 ---
 
-Requantizations of a Q5_K_M quant of a trending 70b model without better quant/fp16 available, this through a Q8_0 intermediary step.
+Requantizations with iMatrix (better quality than without) of a Q5_K_M quant of a trending 70b model without better quant/fp16 available, done through a Q8_0 intermediary step.
 
 Miqudev provided Q5_K_M, Q4_K_M, and Q2_K on this page : https://huggingface.co/miqudev/miqu-1-70b
 
-Here, you will find :
-
-
+Here, you will find the following quants :
+
+Full offload possible on 48GB VRAM with a huge context size :
+- Q4_K_S
+- Lower quality : Q3_K_L
+
+Full offload possible on 36GB VRAM with a variable context size (up to 7168 with Q3_K_M, for example) :
+- Q3_K_M, Q3_K_S, Q3_K_XS, IQ3_XXS SOTA (which is equivalent to a Q3_K_S with more context!)
+- Lower quality : Q2_K_S
+
+Full offload possible on 24GB VRAM with a decent context size :
+- IQ2_XS SOTA
+- Lower quality : IQ2_XXS SOTA
 
 ---
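For readers who want to reproduce a pipeline like the one the new description names (Q5_K_M upcast to Q8_0, then iMatrix-guided requantization), here is a minimal sketch driving llama.cpp's `quantize` and `imatrix` tools from Python. It is an assumption-laden illustration: the binary paths, file names, calibration text, and flag usage are mine, not the exact commands used to produce these files.

```python
import subprocess

# Sketch only: binary names, file names, calibration text, and flags are
# assumptions, not the exact commands used to produce these quants.
QUANTIZE = "./quantize"          # llama.cpp quantization tool
IMATRIX = "./imatrix"            # llama.cpp importance-matrix tool
SRC = "miqu-1-70b.q5_K_M.gguf"   # the Q5_K_M provided by miqudev
Q8 = "miqu-1-70b.q8_0.gguf"      # near-lossless intermediary step
TARGETS = ["Q4_K_S", "Q3_K_L", "Q3_K_M", "Q3_K_S", "Q3_K_XS",
           "IQ3_XXS", "Q2_K_S", "IQ2_XS", "IQ2_XXS"]

def run(args):
    print(">", " ".join(args))
    subprocess.run(args, check=True)

# 1. Upcast the Q5_K_M source to Q8_0 so every target starts from the same
#    high-precision base (--allow-requantize permits quant-to-quant).
run([QUANTIZE, "--allow-requantize", SRC, Q8, "Q8_0"])

# 2. Compute an importance matrix for the model over some calibration text.
run([IMATRIX, "-m", Q8, "-f", "calibration.txt", "-o", "imatrix.dat"])

# 3. Requantize the Q8_0 down to each target, guided by the iMatrix.
for target in TARGETS:
    run([QUANTIZE, "--allow-requantize", "--imatrix", "imatrix.dat",
         Q8, f"miqu-1-70b.{target}.gguf", target])
```

The Q8_0 detour exists because no fp16 of this model is available: starting every target from one near-lossless base presumably preserves as much as possible of what the Q5_K_M still carries.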
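As a back-of-the-envelope check on the VRAM groupings above, the sketch below estimates each quant's weight footprint from bits-per-weight figures. Only the 2.06 / 2.31 / 3.06 bpw for the IQ2/IQ3 SOTA quants are llama.cpp's advertised values; the k-quant figures and the ~2 GB headroom are rough assumptions of mine.

```python
# Back-of-the-envelope check of the VRAM groupings above, for a ~70b model.
# bpw values are approximate: the IQ2/IQ3 figures are llama.cpp's advertised
# 2.06 / 2.31 / 3.06 bpw for the SOTA quants; the k-quant mixes and the
# ~2 GB headroom for compute buffers/OS are rough assumptions.
PARAMS_B = 70  # parameters, in billions

BPW = {
    "Q4_K_S": 4.53, "Q3_K_L": 4.19, "Q3_K_M": 3.85, "Q3_K_S": 3.47,
    "Q3_K_XS": 3.30, "IQ3_XXS": 3.06, "Q2_K_S": 2.60,
    "IQ2_XS": 2.31, "IQ2_XXS": 2.06,
}

for name, bpw in sorted(BPW.items(), key=lambda kv: -kv[1]):
    weights_gb = PARAMS_B * bpw / 8  # 1e9 params * bpw bits / 8 -> GB
    fits = [vram for vram in (24, 36, 48) if weights_gb < vram - 2]
    print(f"{name:8s} ~{weights_gb:4.1f} GB of weights, "
          f"full offload plausible on {fits or 'none'} GB cards")
```

Whatever VRAM remains after the weights is what bounds the KV cache, which is why the 36GB group only reaches around 7168 context with Q3_K_M while the 48GB group has room for a huge context.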