Add imatrix info
README.md
CHANGED
@@ -34,13 +34,13 @@ GGUF quantized models of [mattshumer/ref_70_e3](https://huggingface.co/mattshume
 | Q8_0_L | ??.?GB | true | false |
 | Q8_0 | ??.?GB | true | false |
 | Q6_K_L | ??.?GB | true | false |
-| Q6_K |
+| Q6_K | 57.9GB | true | false |
 | Q5_K_L | 52.6GB | true | false |
 | Q5_K_M | ??.?GB | true | false |
 | Q5_K_S | 48.7GB | false | false |
 | Q4_K_L | 45.3GB | false | false |
 | Q4_K_M | ??.?GB | false | false |
-| Q4_K_S |
+| Q4_K_S | 40.3GB | false | false |
 | IQ4_NL | ??.?GB | false | true |
 | IQ4_XS | ??.?GB | false | true |
 | Q3_K_XL | 37.2GB | false | false |
@@ -63,6 +63,10 @@ GGUF quantized models of [mattshumer/ref_70_e3](https://huggingface.co/mattshume
 
 The `_L` or `_XL` suffix means that the token embeddings and output weight are at fp16 precision.
 
+The iMatrix dataset is bartowski's, which you can find here: [calibration_datav3.txt](https://gist.github.com/bartowski1182/eb213dccb3571f863da82e99418f81e8)
+
+Computation is done on static Q6_K for 125 chunks.
+
 ## Benchmarks
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/60518f3731c5be7f3dd5ebc3/zNs-ZFs0SbnomH7mikiOU.png)
 
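
The added README text says the importance matrix was computed on a static Q6_K quant over 125 chunks of the calibration file. As a rough illustration only, the sketch below assembles a command line for llama.cpp's `llama-imatrix` tool with those two settings. The commit does not show the command that was actually run, so the flag choice (`-m`, `-f`, `-o`, `--chunks`), the model file name `ref_70_e3-Q6_K.gguf`, and the output name `imatrix.dat` are all assumptions. The script only builds and prints the command; it does not execute anything.

```python
import shlex


def build_imatrix_command(model_path, calibration_path, output_path, chunks):
    """Assemble (but do not run) an argument list for llama.cpp's imatrix tool."""
    if chunks <= 0:
        raise ValueError("chunks must be a positive integer")
    return [
        "llama-imatrix",
        "-m", model_path,          # quantized model the statistics are collected on
        "-f", calibration_path,    # plain-text calibration data
        "-o", output_path,         # where the importance matrix is written
        "--chunks", str(chunks),   # upper bound on the number of chunks processed
    ]


command = build_imatrix_command(
    model_path="ref_70_e3-Q6_K.gguf",           # hypothetical file name
    calibration_path="calibration_datav3.txt",  # file named in the README
    output_path="imatrix.dat",                  # hypothetical file name
    chunks=125,                                 # chunk count stated in the README
)

# Print a shell-safe version of the command for inspection.
print(shlex.join(command))
```

Keeping the command as a list rather than a single string avoids quoting problems if a path contains spaces, and the list can be passed to `subprocess.run` unchanged once the paths point at real files.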