Nexesenex committed
Commit cd13f01
1 Parent(s): cb30fbf

Update README.md

Files changed (1)
  1. README.md +14 -4
README.md CHANGED
@@ -7,13 +7,23 @@ Miqu 1 70b : a leak of Mistral Medium Alpha. Credit for this model goes to the M
 
 ---
 
-Requantizations of a Q5_K_M quant of a trending 70b model without better quant/fp16 available, this through a Q8_0 intermediary step.
+Requantizations with iMatrix (better quality than without) of a Q5_K_M quant of a trending 70b model with no better quant or fp16 available, made through a Q8_0 intermediary step.
 
 Miqudev provided Q5_K_M, Q4_K_M, and Q2_K on this page: https://huggingface.co/miqudev/miqu-1-70b
 
-Here, you will find :
-- Q4_K_S, Q3_K_L, Q3_K_M, Q3_K_S, Q3_K_XS, Q2_K_S, IQ3_XXS SOTA and IQ2_XS SOTA available.
-- IQ2_XXS SOTA for this afternoon.
+Here, you will find the following quants:
+
+Full offload possible on 48GB VRAM with a huge context size:
+- Q4_K_S
+- Lower quality: Q3_K_L
+
+Full offload possible on 36GB VRAM with a variable context size (up to 7168 with Q3_K_M, for example):
+- Q3_K_M, Q3_K_S, Q3_K_XS, IQ3_XXS SOTA (equivalent to Q3_K_S quality with room for more context!)
+- Lower quality: Q2_K_S
+
+Full offload possible on 24GB VRAM with a decent context size:
+- IQ2_XS SOTA
+- Lower quality: IQ2_XXS SOTA
 
 ---
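
For reference, a minimal sketch of the pipeline the README describes (an iMatrix computation, then requantization through a Q8_0 intermediary), assuming llama.cpp's `imatrix` and `quantize` tools as of early 2024. The file names and calibration text are illustrative assumptions, not the actual inputs used:

```bash
# Sketch only: assumes llama.cpp binaries in the current directory.
# The source file is miqudev's Q5_K_M GGUF; other names are hypothetical.

# 1. Compute an importance matrix from a calibration text.
./imatrix -m miqu-1-70b.q5_K_M.gguf -f calibration.txt -o miqu-imatrix.dat

# 2. Requantize the Q5_K_M up to a near-lossless Q8_0 intermediary
#    (--allow-requantize is required because the input is already quantized).
./quantize --allow-requantize miqu-1-70b.q5_K_M.gguf miqu-1-70b.Q8_0.gguf Q8_0

# 3. Quantize the Q8_0 intermediary down to each target, guided by the iMatrix.
for t in Q4_K_S Q3_K_L Q3_K_M Q3_K_S Q3_K_XS Q2_K_S IQ3_XXS IQ2_XS IQ2_XXS; do
  ./quantize --imatrix miqu-imatrix.dat miqu-1-70b.Q8_0.gguf "miqu-1-70b.$t.gguf" "$t"
done
```

Going through Q8_0 rather than quantizing Q5_K_M directly to each target keeps the intermediary close to lossless, so the low-bit targets inherit as little compounded quantization error as possible.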
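The VRAM tiers above come down to simple arithmetic: weight bytes ≈ parameters × bits-per-weight / 8, and whatever is left on the card goes to the KV cache and compute buffers. A rough check, using approximate bits-per-weight figures (assumed values, not measured file sizes):

```bash
# Approximate weight footprint per quant for a ~70e9-parameter model.
# bpw values are rough llama.cpp figures, not exact file sizes.
for q in "Q4_K_S 4.5" "Q3_K_M 3.9" "IQ3_XXS 3.06" "IQ2_XS 2.31"; do
  set -- $q
  awk -v n="$1" -v bpw="$2" \
    'BEGIN { printf "%-8s ~%.0f GB of weights\n", n, 70e9 * bpw / 8 / 1e9 }'
done
```

Roughly 39 GB of Q4_K_S weights leave about 9 GB of a 48GB card for context, while roughly 34 GB of Q3_K_M leave only about 2 GB of a 36GB card, hence the variable context ceiling in that tier.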