edwko commited on
Commit
25f3b6b
1 Parent(s): 841b905

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +12 -12
README.md CHANGED
@@ -16,6 +16,18 @@ The model was trained on ~8 billion tokens.
16
  - Extended Training: Further refinement of the model, resulting in improved benchmark performance and overall text generation quality.
17
  - Tokenizer changes.
18
 
 
 
 
 
 
 
 
 
 
 
 
 
19
  ## How coherent is the 150M model?
20
  Let's look at real-world examples:
21
 
@@ -132,18 +144,6 @@ The model shows some promise in understanding context related to simple requests
132
  </tr>
133
  </table>
134
 
135
- ## Chat format
136
-
137
- This model uses a specific chat format for optimal performance.
138
- ```
139
- <s>system
140
- [System message]</s>
141
- <s>user
142
- [Your question or message]</s>
143
- <s>assistant
144
- [The model's response]</s>
145
- ```
146
-
147
  ## Usage with HuggingFace transformers
148
  The model can be used with HuggingFace's `transformers` library:
149
  ```python
 
16
  - Extended Training: Further refinement of the model, resulting in improved benchmark performance and overall text generation quality.
17
  - Tokenizer changes.
18
 
19
+ ## Chat format
20
+
21
+ This model is **very sensitive** to the chat template used. Ensure you use the correct template:
22
+ ```
23
+ <s>system
24
+ [System message]</s>
25
+ <s>user
26
+ [Your question or message]</s>
27
+ <s>assistant
28
+ [The model's response]</s>
29
+ ```
30
+
31
  ## How coherent is the 150M model?
32
  Let's look at real-world examples:
33
 
 
144
  </tr>
145
  </table>
146
 
 
 
 
 
 
 
 
 
 
 
 
 
147
  ## Usage with HuggingFace transformers
148
  The model can be used with HuggingFace's `transformers` library:
149
  ```python