Nexesenex committed on
Commit: 8daf3b7
Parent: 3de5855

Update README.md

Files changed (1)
  1. README.md +4 -1
README.md CHANGED
@@ -30,7 +30,10 @@ Full offload possible on 24GB VRAM with a big to huge context size (from 12288 w
 
 Full offload possible on 16GB VRAM with a decent context size : IQ3_XXS SOTA (which is equivalent to a Q3_K_S with more context!), Q2_K, Q2_K_S
 
-Full offload possible on 12GB VRAM with a decent context size : IQ2_XS SOTA. lower quality : IQ2_XXS SOTA, IQ1_S (prefer v2 or v3)
+Full offload possible on 12GB VRAM with a decent context size : IQ2_XS SOTA. Lower quality : IQ2_XXS SOTA
+
+Full offload may be possible on 8GB VRAM with a small context size : IQ1_S revision "even better" (b2404).
+All my IQ1_S quants from 13/03/2024 onward will use this new IQ1_S quantization base.
 
 ---
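
These full-offload thresholds come down to whether the quantized model file plus the KV cache for the chosen context fits in VRAM. Below is a rough back-of-the-envelope estimator sketched in Python; the layer/head counts, the ~9.3 GiB file size, and the 1 GiB overhead allowance are illustrative assumptions, not measurements from this repo.

```python
def kv_cache_gib(n_layer: int, n_ctx: int, n_head_kv: int, head_dim: int,
                 bytes_per_elt: int = 2) -> float:
    """Approximate KV-cache size in GiB: keys + values for every layer and
    every context position, f16 elements by default (2 bytes each)."""
    return 2 * n_layer * n_ctx * n_head_kv * head_dim * bytes_per_elt / 1024**3


def full_offload_fits(model_file_gib: float, vram_gib: float, n_layer: int,
                      n_ctx: int, n_head_kv: int, head_dim: int,
                      overhead_gib: float = 1.0) -> bool:
    """Crude check: quantized weights + KV cache + a fixed overhead allowance
    must fit inside the available VRAM."""
    total = model_file_gib + kv_cache_gib(n_layer, n_ctx, n_head_kv, head_dim) + overhead_gib
    return total <= vram_gib


if __name__ == "__main__":
    # Hypothetical 34B-class model: 60 layers, 8 KV heads of dimension 128,
    # IQ2_XS file of roughly 9.3 GiB -- illustrative numbers only.
    print(full_offload_fits(9.3, vram_gib=12.0, n_layer=60, n_ctx=4096,
                            n_head_kv=8, head_dim=128))   # True: total ~11.2 GiB
    print(full_offload_fits(9.3, vram_gib=12.0, n_layer=60, n_ctx=32768,
                            n_head_kv=8, head_dim=128))   # False: KV cache alone ~7.5 GiB
```

The same arithmetic is why the 8GB entry above is hedged: even a small IQ1_S file leaves little headroom for the KV cache, so only a small context fits.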