Update README.md #7
opened by SushantGautam

README.md CHANGED
@@ -111,7 +111,7 @@ If you try the vLLM examples below and get an error about `quantization` being u
 When using vLLM as a server, pass the `--quantization awq` parameter, for example:

 ```shell
-python3
+python3 -m vllm.entrypoints.api_server --model TheBloke/Mistral-7B-OpenOrca-AWQ --quantization awq --dtype half
 ```

 When using vLLM from Python code, pass the `quantization=awq` parameter, for example: