Commit f15b00f, committed by bartowski
1 parent: 2a35169

Llamacpp quants

Files changed (28)
  1. .gitattributes +11 -0
  2. Meta-Llama-3-70B-Instruct-IQ1_M.gguf +1 -1
  3. Meta-Llama-3-70B-Instruct-IQ2_M.gguf +1 -1
  4. Meta-Llama-3-70B-Instruct-IQ2_XS.gguf +1 -1
  5. Meta-Llama-3-70B-Instruct-IQ2_XXS.gguf +1 -1
  6. Meta-Llama-3-70B-Instruct-IQ3_M.gguf +1 -1
  7. Meta-Llama-3-70B-Instruct-IQ3_XXS.gguf +1 -1
  8. Meta-Llama-3-70B-Instruct-IQ4_XS.gguf +1 -1
  9. Meta-Llama-3-70B-Instruct-Q2_K.gguf +1 -1
  10. Meta-Llama-3-70B-Instruct-Q2_K_L.gguf +3 -0
  11. Meta-Llama-3-70B-Instruct-Q3_K_M.gguf +1 -1
  12. Meta-Llama-3-70B-Instruct-Q3_K_S.gguf +1 -1
  13. Meta-Llama-3-70B-Instruct-Q3_K_XL.gguf +3 -0
  14. Meta-Llama-3-70B-Instruct-Q4_K_L.gguf +3 -0
  15. Meta-Llama-3-70B-Instruct-Q4_K_M.gguf +1 -1
  16. Meta-Llama-3-70B-Instruct-Q5_K_L.gguf/Meta-Llama-3-70B-Instruct-Q5_K_L-00001-of-00002.gguf +3 -0
  17. Meta-Llama-3-70B-Instruct-Q5_K_L.gguf/Meta-Llama-3-70B-Instruct-Q5_K_L-00002-of-00002.gguf +3 -0
  18. Meta-Llama-3-70B-Instruct-Q5_K_M.gguf +1 -1
  19. Meta-Llama-3-70B-Instruct-Q6_K.gguf/Meta-Llama-3-70B-Instruct-Q6_K-00001-of-00002.gguf +2 -2
  20. Meta-Llama-3-70B-Instruct-Q6_K.gguf/Meta-Llama-3-70B-Instruct-Q6_K-00002-of-00002.gguf +2 -2
  21. Meta-Llama-3-70B-Instruct-Q8_0.gguf/Meta-Llama-3-70B-Instruct-Q8_0-00001-of-00002.gguf +3 -0
  22. Meta-Llama-3-70B-Instruct-Q8_0.gguf/Meta-Llama-3-70B-Instruct-Q8_0-00002-of-00002.gguf +3 -0
  23. Meta-Llama-3-70B-Instruct-f16.gguf/Meta-Llama-3-70B-Instruct-f16-00001-of-00004.gguf +3 -0
  24. Meta-Llama-3-70B-Instruct-f16.gguf/Meta-Llama-3-70B-Instruct-f16-00002-of-00004.gguf +3 -0
  25. Meta-Llama-3-70B-Instruct-f16.gguf/Meta-Llama-3-70B-Instruct-f16-00003-of-00004.gguf +3 -0
  26. Meta-Llama-3-70B-Instruct-f16.gguf/Meta-Llama-3-70B-Instruct-f16-00004-of-00004.gguf +3 -0
  27. Meta-Llama-3-70B-Instruct.imatrix +2 -2
  28. README.md +8 -16
.gitattributes CHANGED
@@ -64,3 +64,14 @@ Meta-Llama-3-70B-Instruct-fp16.gguf/Meta-Llama-3-70B-Instruct-fp16-00003-of-0000
  Meta-Llama-3-70B-Instruct-fp16.gguf/Meta-Llama-3-70B-Instruct-fp16-00004-of-00005.gguf filter=lfs diff=lfs merge=lfs -text
  Meta-Llama-3-70B-Instruct-fp16.gguf/Meta-Llama-3-70B-Instruct-fp16-00005-of-00005.gguf filter=lfs diff=lfs merge=lfs -text
  Meta-Llama-3-70B-Instruct.imatrix filter=lfs diff=lfs merge=lfs -text
+ Meta-Llama-3-70B-Instruct-Q2_K_L.gguf filter=lfs diff=lfs merge=lfs -text
+ Meta-Llama-3-70B-Instruct-Q3_K_XL.gguf filter=lfs diff=lfs merge=lfs -text
+ Meta-Llama-3-70B-Instruct-Q4_K_L.gguf filter=lfs diff=lfs merge=lfs -text
+ Meta-Llama-3-70B-Instruct-Q5_K_L.gguf/Meta-Llama-3-70B-Instruct-Q5_K_L-00001-of-00002.gguf filter=lfs diff=lfs merge=lfs -text
+ Meta-Llama-3-70B-Instruct-Q5_K_L.gguf/Meta-Llama-3-70B-Instruct-Q5_K_L-00002-of-00002.gguf filter=lfs diff=lfs merge=lfs -text
+ Meta-Llama-3-70B-Instruct-Q8_0.gguf/Meta-Llama-3-70B-Instruct-Q8_0-00001-of-00002.gguf filter=lfs diff=lfs merge=lfs -text
+ Meta-Llama-3-70B-Instruct-Q8_0.gguf/Meta-Llama-3-70B-Instruct-Q8_0-00002-of-00002.gguf filter=lfs diff=lfs merge=lfs -text
+ Meta-Llama-3-70B-Instruct-f16.gguf/Meta-Llama-3-70B-Instruct-f16-00001-of-00004.gguf filter=lfs diff=lfs merge=lfs -text
+ Meta-Llama-3-70B-Instruct-f16.gguf/Meta-Llama-3-70B-Instruct-f16-00002-of-00004.gguf filter=lfs diff=lfs merge=lfs -text
+ Meta-Llama-3-70B-Instruct-f16.gguf/Meta-Llama-3-70B-Instruct-f16-00003-of-00004.gguf filter=lfs diff=lfs merge=lfs -text
+ Meta-Llama-3-70B-Instruct-f16.gguf/Meta-Llama-3-70B-Instruct-f16-00004-of-00004.gguf filter=lfs diff=lfs merge=lfs -text
Meta-Llama-3-70B-Instruct-IQ1_M.gguf CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:c59effcfec3292e466a938d66801c4ae0b4ae7806f90dde3b190617fb9fe84d6
+ oid sha256:f071ec18bdf9bb94a3a64de2dc87043de6e643011a730de7b6b203df4223eb77
  size 16751195936
Meta-Llama-3-70B-Instruct-IQ2_M.gguf CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:190bbc50f4f5e95193400872618813063c5d538bbd7d7bd8df89adc12ac8d29d
+ oid sha256:725c685b990335fb1d8a1e52e00bb6d2c04a4e64d042ca1a224180e53e5e0d6b
  size 24119293728
Meta-Llama-3-70B-Instruct-IQ2_XS.gguf CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:6738130235f2bda476b7e499f9d9036971a9b0a1ce0b9b9798946307f8f38cdd
+ oid sha256:8867a771b629b7abd85f68e9fcb51a20f3bf9eb6eade7dfcc9f8d4086ddd20ca
  size 21142107936
Meta-Llama-3-70B-Instruct-IQ2_XXS.gguf CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:fe41e8a16c0c756a178581e7b343bd9b1275fb6fb8ab13bd9e0ea3e35535ffd8
+ oid sha256:dcf8e7fc9a03726dc709dae7dd53a634f2443da8fe363dc4f9a96951d240e6e4
  size 19097384736
Meta-Llama-3-70B-Instruct-IQ3_M.gguf CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:ffe9f3e1f42c7cea2a6abbba2e82ee743ccea88588e98b3dd697f625ec42f2e3
+ oid sha256:680b9f9b621158dfccdbcc953d52d0604366a609416f7fc72138b3631f24fe5e
  size 31937034016
Meta-Llama-3-70B-Instruct-IQ3_XXS.gguf CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:9889fbcf9b8b109010153c3681c5e39072ca6f83c426a6725182791ebc506659
+ oid sha256:291c3238200415174755f0b56e816b96a336f1564dba63823314c0bd60198894
  size 27469494048
Meta-Llama-3-70B-Instruct-IQ4_XS.gguf CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:1f72ecf45d624a03f04dbf7390fb5b3b1084b2e8a31dd3c4f8d1bb888ff151b2
+ oid sha256:b7c0821b45eacaed71d08d061712387880ad538273c32da5021463481921a758
  size 37902661408
Meta-Llama-3-70B-Instruct-Q2_K.gguf CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:5235341296d2b33566a339ab36ba2f83c45cbb8ac3551ba575130b07316b73a8
+ oid sha256:987c41cf85b8ef3a44a4e5fedbaaa0c99f423664a9aacbd62b51c802e0362b6a
  size 26375108384
Meta-Llama-3-70B-Instruct-Q2_K_L.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2d8eecf8d7cfb658e7aa4d86dafca1465fb06d56c640c69c332399b0ed485cc4
+ size 29371168544
Meta-Llama-3-70B-Instruct-Q3_K_M.gguf CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:c5f3632c7dc4dec225939c0362742ebaab5110e22599f11b7a79606a25ea52bf
+ oid sha256:fd96573523115fc36f7b69bfbf76e43041d62c944087cba288a1edd5650c598d
  size 34267494176
Meta-Llama-3-70B-Instruct-Q3_K_S.gguf CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:7d324ea99786adefa63fef55c929c92f0567958cd1e78bffe2fe6ca19d8a00ef
+ oid sha256:98ccf7da30df84414d38ebeee8c9076f2a7d0a13214e2545e5729208116e0da3
  size 30912050976
Meta-Llama-3-70B-Instruct-Q3_K_XL.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7188d1ea9e9420947633d7548509740ea0acdd2e0f72242cd0c774cdd2d69361
+ size 40029943584
Meta-Llama-3-70B-Instruct-Q4_K_L.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f0dd2ec40e95182d7e2ffda67bdf1f783c54058bfa4ff03aa264ec5ae89b4b0d
+ size 45270202144
Meta-Llama-3-70B-Instruct-Q4_K_M.gguf CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:30d3c22063a2f97191a5ba714b9a40b36866eb3b340ed964cec1d302008b1e39
+ oid sha256:7b602f72406b1cd28466f84506abe8ff0e67dd867937ff1c393a78de29b8d07e
  size 42520393504
Meta-Llama-3-70B-Instruct-Q5_K_L.gguf/Meta-Llama-3-70B-Instruct-Q5_K_L-00001-of-00002.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:49ab6a7e72037f4d74047bb701b45da44abbc33303e48144397dada03412861e
+ size 39993594592
Meta-Llama-3-70B-Instruct-Q5_K_L.gguf/Meta-Llama-3-70B-Instruct-Q5_K_L-00002-of-00002.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8634f4b38b4330103f02f7a492b24b69937e621e250654b119a0bf2a17194986
+ size 12574696736
Meta-Llama-3-70B-Instruct-Q5_K_M.gguf CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:1c90220d69cec3fd849d9f915af794583e688db16484cab839237a9306f21a06
+ oid sha256:7cb1c17c5b33a8071acf9a715eb4281d9b7ba34e1b8af5a660300d86cbfc8aee
  size 49949816608
Meta-Llama-3-70B-Instruct-Q6_K.gguf/Meta-Llama-3-70B-Instruct-Q6_K-00001-of-00002.gguf CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:319a64df51fd3c9196ac02c081ee928fb877a0b1c6e84b2ab974f80f1cb0cafc
- size 32141175808
+ oid sha256:1dfcec26081dfc6547498b49266ff2e39880337ff8b85dd5380cbb385137380d
+ size 39862698784
Meta-Llama-3-70B-Instruct-Q6_K.gguf/Meta-Llama-3-70B-Instruct-Q6_K-00002-of-00002.gguf CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:b31f40134a5bb3611f9eb53c6c9e15e942a472c65199025aa0b2dda1f55068b6
- size 25746967520
+ oid sha256:8ac2b8d53e03d2926237214eb2cc15f1cbf92c2281d51bed9c756055e3f11a8c
+ size 18025444544
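
Both Q6_K shards change size in this commit, which suggests the split point moved rather than the overall file size. A small sketch to check that, using only the `size` values from the two Q6_K pointer diffs above (the variable names are illustrative, not part of the repository):

```python
# Sizes in bytes, copied from the Q6_K LFS pointer diffs above.
old_shards = [32141175808, 25746967520]
new_shards = [39862698784, 18025444544]

old_total = sum(old_shards)
new_total = sum(new_shards)

# True when only the split point moved and the combined size is unchanged.
print(old_total == new_total)

# Decimal gigabytes, truncated (not rounded) to two places.
print(new_total // 10**7 / 100)
```

The truncated figure can be compared with the Q6_K row of the README size table, which appears to use decimal gigabytes truncated to two places.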
Meta-Llama-3-70B-Instruct-Q8_0.gguf/Meta-Llama-3-70B-Instruct-Q8_0-00001-of-00002.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c91ed0b016b6ce5601eeed5fd8108dd6383d4265dae6f7925a72b242a4630b8f
+ size 39808935904
Meta-Llama-3-70B-Instruct-Q8_0.gguf/Meta-Llama-3-70B-Instruct-Q8_0-00002-of-00002.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b9af1ecc2494fe11bf440621d8bc267d43bcdc8156f0799ff8cbc3231f59a958
+ size 35166113792
Meta-Llama-3-70B-Instruct-f16.gguf/Meta-Llama-3-70B-Instruct-f16-00001-of-00004.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:07c993fc3c73977f0551af11a5b870237afd59bb66e02edaa51ec0b2f39722e3
+ size 39758724160
Meta-Llama-3-70B-Instruct-f16.gguf/Meta-Llama-3-70B-Instruct-f16-00002-of-00004.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:135ecf44c93ae5d87967b5f526429a8faa11d8a9b9e07b61feb172ab89e3bbdc
+ size 39830630656
Meta-Llama-3-70B-Instruct-f16.gguf/Meta-Llama-3-70B-Instruct-f16-00003-of-00004.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:29f38cec13ef711d326927bf292c68533440acac0d3460f2546a1f2eea69c5e9
+ size 39981592928
Meta-Llama-3-70B-Instruct-f16.gguf/Meta-Llama-3-70B-Instruct-f16-00004-of-00004.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:54bb81ef0db1bb5de7b7e3647ceaf29b7279c43f37beb48c15a8bb25abba8319
+ size 21546965280
Meta-Llama-3-70B-Instruct.imatrix CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:cc30db5582c459705521a6c493a0394ba48ed9fced9182319e1904dbd8a28b9d
- size 24922295
+ oid sha256:b861753c5acd139deda1728cae732f93de5505bb64e37496c2e806865f061c51
+ size 24922299
README.md CHANGED
@@ -8,9 +8,7 @@ tags:
  - pytorch
  - llama
  - llama-3
- license: other
- license_name: llama3
- license_link: LICENSE
+ license: llama3
  extra_gated_prompt: >-
  ### META LLAMA 3 COMMUNITY LICENSE AGREEMENT

@@ -209,11 +207,11 @@ quantized_by: bartowski

  ## Llamacpp imatrix Quantizations of Meta-Llama-3-70B-Instruct

- Using <a href="https://github.com/ggerganov/llama.cpp/">llama.cpp</a> release <a href="https://github.com/ggerganov/llama.cpp/releases/tag/b2777">b2777</a> for quantization.
+ Using <a href="https://github.com/ggerganov/llama.cpp/">llama.cpp</a> release <a href="https://github.com/ggerganov/llama.cpp/releases/tag/b3259">b3259</a> for quantization.

  Original model: https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct

- All quants made using imatrix option with dataset provided by Kalomaze [here](https://github.com/ggerganov/llama.cpp/discussions/5263#discussioncomment-8395384)
+ All quants made using imatrix option with dataset from [here](https://gist.github.com/bartowski1182/eb213dccb3571f863da82e99418f81e8)

  ## Prompt format

@@ -233,26 +231,20 @@ All quants made using imatrix option with dataset provided by Kalomaze [here](ht
  | -------- | ---------- | --------- | ----------- |
  | [Meta-Llama-3-70B-Instruct-Q8_0.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/tree/main/Meta-Llama-3-70B-Instruct-Q8_0.gguf) | Q8_0 | 74.97GB | Extremely high quality, generally unneeded but max available quant. |
  | [Meta-Llama-3-70B-Instruct-Q6_K.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/tree/main/Meta-Llama-3-70B-Instruct-Q6_K.gguf) | Q6_K | 57.88GB | Very high quality, near perfect, *recommended*. |
+ | [Meta-Llama-3-70B-Instruct-Q5_K_L.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/tree/main/Meta-Llama-3-70B-Instruct-Q5_K_L.gguf) | Q5_K_L | 52.56GB | *Experimental*, uses f16 for embed and output weights. Please provide any feedback of differences. High quality, *recommended*. |
  | [Meta-Llama-3-70B-Instruct-Q5_K_M.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-Q5_K_M.gguf) | Q5_K_M | 49.94GB | High quality, *recommended*. |
- | [Meta-Llama-3-70B-Instruct-Q5_K_S.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-Q5_K_S.gguf) | Q5_K_S | 48.65GB | High quality, *recommended*. |
+ | [Meta-Llama-3-70B-Instruct-Q4_K_L.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-Q4_K_L.gguf) | Q4_K_L | 45.27GB | *Experimental*, uses f16 for embed and output weights. Please provide any feedback of differences. Good quality, uses about 4.83 bits per weight, *recommended*. |
  | [Meta-Llama-3-70B-Instruct-Q4_K_M.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-Q4_K_M.gguf) | Q4_K_M | 42.52GB | Good quality, uses about 4.83 bits per weight, *recommended*. |
- | [Meta-Llama-3-70B-Instruct-Q4_K_S.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-Q4_K_S.gguf) | Q4_K_S | 40.34GB | Slightly lower quality with more space savings, *recommended*. |
- | [Meta-Llama-3-70B-Instruct-IQ4_NL.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-IQ4_NL.gguf) | IQ4_NL | 40.05GB | Decent quality, slightly smaller than Q4_K_S with similar performance *recommended*. |
  | [Meta-Llama-3-70B-Instruct-IQ4_XS.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-IQ4_XS.gguf) | IQ4_XS | 37.90GB | Decent quality, smaller than Q4_K_S with similar performance, *recommended*. |
- | [Meta-Llama-3-70B-Instruct-Q3_K_L.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-Q3_K_L.gguf) | Q3_K_L | 37.14GB | Lower quality but usable, good for low RAM availability. |
  | [Meta-Llama-3-70B-Instruct-Q3_K_M.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-Q3_K_M.gguf) | Q3_K_M | 34.26GB | Even lower quality. |
  | [Meta-Llama-3-70B-Instruct-IQ3_M.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-IQ3_M.gguf) | IQ3_M | 31.93GB | Medium-low quality, new method with decent performance comparable to Q3_K_M. |
- | [Meta-Llama-3-70B-Instruct-IQ3_S.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-IQ3_S.gguf) | IQ3_S | 30.91GB | Lower quality, new method with decent performance, recommended over Q3_K_S quant, same size with better performance. |
  | [Meta-Llama-3-70B-Instruct-Q3_K_S.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-Q3_K_S.gguf) | Q3_K_S | 30.91GB | Low quality, not recommended. |
- | [Meta-Llama-3-70B-Instruct-IQ3_XS.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-IQ3_XS.gguf) | IQ3_XS | 29.30GB | Lower quality, new method with decent performance, slightly better than Q3_K_S. |
  | [Meta-Llama-3-70B-Instruct-IQ3_XXS.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-IQ3_XXS.gguf) | IQ3_XXS | 27.46GB | Lower quality, new method with decent performance, comparable to Q3 quants. |
  | [Meta-Llama-3-70B-Instruct-Q2_K.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-Q2_K.gguf) | Q2_K | 26.37GB | Very low quality but surprisingly usable. |
  | [Meta-Llama-3-70B-Instruct-IQ2_M.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-IQ2_M.gguf) | IQ2_M | 24.11GB | Very low quality, uses SOTA techniques to also be surprisingly usable. |
- | [Meta-Llama-3-70B-Instruct-IQ2_S.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-IQ2_S.gguf) | IQ2_S | 22.24GB | Very low quality, uses SOTA techniques to be usable. |
- | [Meta-Llama-3-70B-Instruct-IQ2_XS.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-IQ2_XS.gguf) | IQ2_XS | 21.14GB | Very low quality, uses SOTA techniques to be usable. |
+ | [Meta-Llama-3-70B-Instruct-IQ2_XS.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-IQ2_XS.gguf) | IQ2_XS | 21.14GB | Lower quality, uses SOTA techniques to be usable. |
  | [Meta-Llama-3-70B-Instruct-IQ2_XXS.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-IQ2_XXS.gguf) | IQ2_XXS | 19.09GB | Lower quality, uses SOTA techniques to be usable. |
  | [Meta-Llama-3-70B-Instruct-IQ1_M.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-IQ1_M.gguf) | IQ1_M | 16.75GB | Extremely low quality, *not* recommended. |
- | [Meta-Llama-3-70B-Instruct-IQ1_S.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-IQ1_S.gguf) | IQ1_S | 15.34GB | Extremely low quality, *not* recommended. |

  ## Downloading using huggingface-cli

@@ -265,13 +257,13 @@ pip install -U "huggingface_hub[cli]"
  Then, you can target the specific file you want:

  ```
- huggingface-cli download bartowski/Meta-Llama-3-70B-Instruct-GGUF --include "Meta-Llama-3-70B-Instruct-Q4_K_M.gguf" --local-dir ./ --local-dir-use-symlinks False
+ huggingface-cli download bartowski/Meta-Llama-3-70B-Instruct-GGUF --include "Meta-Llama-3-70B-Instruct-Q4_K_M.gguf" --local-dir ./
  ```

  If the model is bigger than 50GB, it will have been split into multiple files. In order to download them all to a local folder, run:

  ```
- huggingface-cli download bartowski/Meta-Llama-3-70B-Instruct-GGUF --include "Meta-Llama-3-70B-Instruct-Q8_0.gguf/*" --local-dir Meta-Llama-3-70B-Instruct-Q8_0 --local-dir-use-symlinks False
+ huggingface-cli download bartowski/Meta-Llama-3-70B-Instruct-GGUF --include "Meta-Llama-3-70B-Instruct-Q8_0.gguf/*" --local-dir Meta-Llama-3-70B-Instruct-Q8_0
  ```

  You can either specify a new local-dir (Meta-Llama-3-70B-Instruct-Q8_0) or download them all in place (./)
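
After a download, a file can be checked against the LFS pointer recorded in this commit. The following is a minimal sketch, assuming (per the Git LFS pointer format) that `oid` is the SHA-256 of the file contents and `size` is its length in bytes. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and not part of any library.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}


def matches_pointer(path: Path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file's size and hash agree with the pointer."""
    # Cheap check first: a size mismatch avoids hashing a multi-gigabyte file.
    if path.stat().st_size != pointer["size"]:
        return False
    h = hashlib.new(pointer["algo"])
    with path.open("rb") as f:
        # Stream in chunks so the whole file never has to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest() == pointer["digest"]


# Pointer contents for Meta-Llama-3-70B-Instruct-Q4_K_M.gguf after this commit.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:7b602f72406b1cd28466f84506abe8ff0e67dd867937ff1c393a78de29b8d07e\n"
    "size 42520393504\n"
)
```

With the file downloaded in place, `matches_pointer(Path("Meta-Llama-3-70B-Instruct-Q4_K_M.gguf"), pointer)` should return `True` if the download is complete and uncorrupted. For split quants, each shard has its own pointer and would be checked separately.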