leafspark committed on
Commit ece248e
1 Parent(s): 92fc6e3

Add imatrix info

Files changed (1)
  1. README.md +6 -2
README.md CHANGED
@@ -34,13 +34,13 @@ GGUF quantized models of [mattshumer/ref_70_e3](https://huggingface.co/mattshume
  | Q8_0_L | ??.?GB | true | false |
  | Q8_0 | ??.?GB | true | false |
  | Q6_K_L | ??.?GB | true | false |
- | Q6_K | ??.?GB | true | false |
+ | Q6_K | 57.9GB | true | false |
  | Q5_K_L | 52.6GB | true | false |
  | Q5_K_M | ??.?GB | true | false |
  | Q5_K_S | 48.7GB | false | false |
  | Q4_K_L | 45.3GB | false | false |
  | Q4_K_M | ??.?GB | false | false |
- | Q4_K_S | ??.?GB | false | false |
+ | Q4_K_S | 40.3GB | false | false |
  | IQ4_NL | ??.?GB | false | true |
  | IQ4_XS | ??.?GB | false | true |
  | Q3_K_XL | 37.2GB | false | false |
@@ -63,6 +63,10 @@ GGUF quantized models of [mattshumer/ref_70_e3](https://huggingface.co/mattshume
 
  The `_L` or `_XL` suffix means that the token embeddings and output weight are at fp16 precision.
 
+ The iMatrix dataset is bartowski's, which you can find here: [calibration_datav3.txt](https://gist.github.com/bartowski1182/eb213dccb3571f863da82e99418f81e8)
+
+ Computation is done on static Q6_K for 125 chunks.
+
  ## Benchmarks
  ![image/png](https://cdn-uploads.huggingface.co/production/uploads/60518f3731c5be7f3dd5ebc3/zNs-ZFs0SbnomH7mikiOU.png)
 
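
As a reading aid for the quant table in this diff, the README's suffix rule (`_L` or `_XL` means fp16 token embeddings and output weight) can be written as a small name check. This is a minimal sketch that assumes the rule is determined purely by the name suffix; the function name `uses_fp16_embeddings` is hypothetical and not part of the repository or of any quantization tool.

```python
def uses_fp16_embeddings(quant_name: str) -> bool:
    """Return True when the quant name carries the _L or _XL suffix.

    Assumption: the README's rule applies purely by suffix, so names
    such as IQ4_NL (which ends in "_NL", not "_L") do not match.
    """
    return quant_name.endswith(("_L", "_XL"))


if __name__ == "__main__":
    # Quant names taken from the table in the diff above.
    for name in ["Q8_0_L", "Q6_K", "Q5_K_L", "Q4_K_S", "IQ4_NL", "Q3_K_XL"]:
        print(f"{name}: fp16 embeddings/output = {uses_fp16_embeddings(name)}")
```

The check uses `str.endswith` with a tuple of suffixes rather than a substring search, so `IQ4_NL` is not mistaken for an `_L` variant.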