omkarenator committed • Commit 5425cac • Parent(s): 7cb16bf
Add instructions for Ollama

README.md CHANGED
@@ -101,6 +101,38 @@ python3 -m fastchat.serve.cli --model-path LLM360/AmberChat
| **LLM360/AmberChat** | **5.428125** |
| [Nous-Hermes-13B](https://huggingface.co/NousResearch/Nous-Hermes-13b) | 5.51 |

# Using Quantized Models with Ollama

Please follow these steps to use a quantized version of AmberChat on your personal computer or laptop:

1. First, install Ollama by following the instructions provided [here](https://github.com/jmorganca/ollama/tree/main?tab=readme-ov-file#ollama). Next, download a quantized model checkpoint (such as [amberchat.Q8_0.gguf](https://huggingface.co/TheBloke/AmberChat-GGUF/blob/main/amberchat.Q8_0.gguf) for the 8-bit version) from [TheBloke/AmberChat-GGUF](https://huggingface.co/TheBloke/AmberChat-GGUF/tree/main). Then create an Ollama Modelfile locally using the template provided below:

```
FROM amberchat.Q8_0.gguf

TEMPLATE """{{ .System }}
USER: {{ .Prompt }}
ASSISTANT:
"""
SYSTEM """A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
"""
PARAMETER stop "USER:"
PARAMETER stop "ASSISTANT:"
PARAMETER repeat_last_n 0
PARAMETER num_ctx 2048
PARAMETER seed 0
PARAMETER num_predict -1
```
Ensure that the `FROM` directive points to the downloaded checkpoint file.

2. Now, you can proceed to build the model by running:
```bash
ollama create amberchat -f Modelfile
```
3. To run the model from the command line, execute the following:
```bash
ollama run amberchat
```
You only need to build the model once; after that, you can simply run it.
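Beyond the interactive CLI, a running Ollama instance also exposes a local REST API (by default on port 11434), which lets you script the model built above. Below is a minimal sketch, assuming Ollama's default configuration; the helper names (`build_request`, `generate`) are illustrative, not part of Ollama itself:

```python
import json
import urllib.request

# Ollama's default local endpoint for one-shot generation.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(prompt, model="amberchat"):
    """Build the JSON payload for Ollama's /api/generate endpoint.

    stream=False asks the server for a single JSON response instead of
    a stream of partial chunks.
    """
    return {"model": model, "prompt": prompt, "stream": False}

def generate(prompt):
    """Send a generation request to the local Ollama server and
    return the model's text response."""
    payload = json.dumps(build_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Example (requires the Ollama server to be running locally):
#   print(generate("What is the capital of France?"))
```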

# Citation