Nexesenex committed on
Commit 5157a58
1 Parent(s): 5f4e963

Update README.md

Files changed (1)
  1. README.md +3 -0
README.md CHANGED
@@ -27,6 +27,9 @@ Full offload possible on 24GB VRAM with a decent context size.
  - IQ2_XS SOTA
  - Lower quality : IQ2_XXS SOTA
 
+ Full offload possible on 16GB VRAM with a decent context size.
+ - IQ1_S v2 (prefer to v1)
+
  ---
 
  Bonus : a Kobold.CPP Frankenstein which reads IQ3_XXS models and is not affected by the Kobold.CPP 1.56/1.57 slowdown at the cost of an absent Mixtral fix.