macadeliccc committed
Commit 9338c8b
Parent: 1d48e74

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -41,8 +41,6 @@ Thanks to user [bartowski](https://huggingface.co/bartowski) we now have exllama
 
 + [bartowski/laser-dolphin-mixtral-2x7b-dpo-exl2](https://huggingface.co/bartowski/laser-dolphin-mixtral-2x7b-dpo-exl2)
 
-His quantizations represent the first ~13B model with GQA support. Check out his repo for more information!
-
 | Branch | Bits | lm_head bits | VRAM (4k) | VRAM (16k) | VRAM (32k) | Description |
 | ----- | ---- | ------- | ------ | ------ | ------ | ------------ |
 | [8_0](https://huggingface.co/bartowski/laser-dolphin-mixtral-2x7b-dpo-exl2/tree/8_0) | 8.0 | 8.0 | 13.7 GB | 15.1 GB | 17.2 GB | Maximum quality that ExLlamaV2 can produce, near unquantized performance. |
@@ -51,6 +49,8 @@ His quantizations represent the first ~13B model with GQA support. Check out his
 | [4_25](https://huggingface.co/bartowski/laser-dolphin-mixtral-2x7b-dpo-exl2/tree/4_25) | 4.25 | 6.0 | 8.2 GB | 9.6 GB | 11.7 GB | GPTQ equivalent bits per weight. |
 | [3_5](https://huggingface.co/bartowski/laser-dolphin-mixtral-2x7b-dpo-exl2/tree/3_5) | 3.5 | 6.0 | 7.0 GB | 8.4 GB | 10.5 GB | Lower quality, not recommended. |
 
+His quantizations represent the first ~13B model with GQA support. Check out his repo for more information!
+
 ### GGUF
 
 *Current GGUF [Quantizations](https://huggingface.co/macadeliccc/laser-dolphin-mixtral-2x7b-dpo-GGUF)*
 
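For reference, a minimal sketch of fetching one of the exl2 branches listed in the table above, assuming the `huggingface_hub` Python package is installed; the branch name `4_25` comes from the table, while the local directory name is purely illustrative.

```python
# Minimal sketch: download a specific exl2 quantization branch.
# Assumes `pip install huggingface_hub`; the local_dir path is illustrative.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="bartowski/laser-dolphin-mixtral-2x7b-dpo-exl2",
    revision="4_25",  # branch from the table (4.25 bpw, ~8.2 GB VRAM at 4k context)
    local_dir="laser-dolphin-mixtral-2x7b-dpo-exl2-4_25",
)
```

Each row of the table corresponds to a branch that can be passed as `revision` here; the CLI equivalent is `huggingface-cli download bartowski/laser-dolphin-mixtral-2x7b-dpo-exl2 --revision 4_25`.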