TheBloke committed
Commit cc28d08
1 Parent(s): 1227a9c

Upload README.md

Files changed (1)
  1. README.md +11 -1
README.md CHANGED
@@ -45,13 +45,23 @@ quantized_by: TheBloke
 This repo contains AWQ model files for [Mistral AI_'s Mixtral 8X7B v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1).
 
 
+**MIXTRAL AWQ**
+
+This is a Mixtral AWQ model.
+
+For AutoAWQ inference, please install AutoAWQ from source.
+
+Support via Transformers is coming soon, via this PR: https://github.com/huggingface/transformers/pull/27950 which should be merged to Transformers `main` very soon.
+
+Support via vLLM and TGI has not yet been confirmed.
+
 ### About AWQ
 
 AWQ is an efficient, accurate and blazing-fast low-bit weight quantization method, currently supporting 4-bit quantization. Compared to GPTQ, it offers faster Transformers-based inference with equivalent or better quality compared to the most commonly used GPTQ settings.
 
 AWQ models are currently supported on Linux and Windows, with NVidia GPUs only. macOS users: please use GGUF models instead.
 
-It is supported by:
+AWQ models are supported by (note that not all of these may support Mixtral models yet):
 
 - [Text Generation Webui](https://github.com/oobabooga/text-generation-webui) - using Loader: AutoAWQ
 - [vLLM](https://github.com/vllm-project/vllm) - version 0.2.2 or later for support for all model types.
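
The commit's "install AutoAWQ from source" note is the main actionable step here, so a minimal sketch of that workflow may help. This is an illustration, not part of the commit: the repo ID `TheBloke/Mixtral-8x7B-v0.1-AWQ` is inferred from this model card, and the calls follow AutoAWQ's usual `from_quantized` loading pattern rather than anything Mixtral-specific.

```python
# Install AutoAWQ from source first (per the note above), e.g.:
#   pip install git+https://github.com/casper-hansen/AutoAWQ.git
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_id = "TheBloke/Mixtral-8x7B-v0.1-AWQ"  # inferred from this card, not stated in the commit

# Load the 4-bit AWQ weights. fuse_layers speeds up inference on supported
# architectures, but Mixtral support was brand new here, so it is left off.
model = AutoAWQForCausalLM.from_quantized(model_id, fuse_layers=False, safetensors=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

tokens = tokenizer("Tell me about AI", return_tensors="pt").input_ids.cuda()

# Generation is delegated to the underlying (quantized) causal LM
output = model.generate(tokens, do_sample=True, temperature=0.7, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```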
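
Once the Transformers PR linked above is merged, loading should follow Transformers' existing AWQ integration, where the quantization config stored in the repo is auto-detected. A sketch under that assumption (it will not work on a `main` that predates the PR):

```python
# Assumes a Transformers build that includes the Mixtral-AWQ PR linked above,
# plus AutoAWQ installed to provide the AWQ kernels.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/Mixtral-8x7B-v0.1-AWQ"  # inferred repo ID, see note above

# Transformers reads quantization_config from the repo and dispatches to the
# AWQ backend automatically; no AWQ-specific arguments are needed.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)

inputs = tokenizer("Tell me about AI", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=128)[0]))
```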
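
The commit explicitly says vLLM support for Mixtral AWQ was not yet confirmed, so the following is only the generic vLLM AWQ loading pattern (vLLM 0.2.2 or later, per the list above) with the same inferred repo ID, shown for reference:

```python
from vllm import LLM, SamplingParams

# Generic AWQ pattern for vLLM >= 0.2.2; whether it worked for Mixtral AWQ
# was unconfirmed at the time of this commit.
llm = LLM(model="TheBloke/Mixtral-8x7B-v0.1-AWQ", quantization="awq")

sampling = SamplingParams(temperature=0.7, max_tokens=128)
for out in llm.generate(["Tell me about AI"], sampling):
    print(out.outputs[0].text)
```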