Update README.md
Browse files
README.md
CHANGED
@@ -16,6 +16,18 @@ The model was trained on ~8 billion tokens.
|
|
16 |
- Extended Training: Further refinement of the model, resulting in improved benchmark performance and overall text generation quality.
|
17 |
- Tokenizer changes.
|
18 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
19 |
## How coherent is the 150M model?
|
20 |
Let's look at real-world examples:
|
21 |
|
@@ -132,18 +144,6 @@ The model shows some promise in understanding context related to simple requests
|
|
132 |
</tr>
|
133 |
</table>
|
134 |
|
135 |
-
## Chat format
|
136 |
-
|
137 |
-
This model uses a specific chat format for optimal performance.
|
138 |
-
```
|
139 |
-
<s>system
|
140 |
-
[System message]</s>
|
141 |
-
<s>user
|
142 |
-
[Your question or message]</s>
|
143 |
-
<s>assistant
|
144 |
-
[The model's response]</s>
|
145 |
-
```
|
146 |
-
|
147 |
## Usage with HuggingFace transformers
|
148 |
The model can be used with HuggingFace's `transformers` library:
|
149 |
```python
|
|
|
16 |
- Extended Training: Further refinement of the model, resulting in improved benchmark performance and overall text generation quality.
|
17 |
- Tokenizer changes.
|
18 |
|
19 |
+
## Chat format
|
20 |
+
|
21 |
+
This model is **very sensitive** to the chat template used. Ensure you use the correct template:
|
22 |
+
```
|
23 |
+
<s>system
|
24 |
+
[System message]</s>
|
25 |
+
<s>user
|
26 |
+
[Your question or message]</s>
|
27 |
+
<s>assistant
|
28 |
+
[The model's response]</s>
|
29 |
+
```
|
30 |
+
|
31 |
## How coherent is the 150M model?
|
32 |
Let's look at real-world examples:
|
33 |
|
|
|
144 |
</tr>
|
145 |
</table>
|
146 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
147 |
## Usage with HuggingFace transformers
|
148 |
The model can be used with HuggingFace's `transformers` library:
|
149 |
```python
|