Update README.md
README.md CHANGED

````diff
@@ -22,7 +22,7 @@ This is because we use a custom model architecture `MosaicGPT` that is not yet p
 
 ```python
 import transformers
-model = transformers.AutoModelForCausalLM.from_pretrained('mosaicml/mpt-1b-redpajama-200b', trust_remote_code=True)
+model = transformers.AutoModelForCausalLM.from_pretrained('mosaicml/mpt-1b-redpajama-200b', trust_remote_code=True)
 ```
 
 To use the optimized triton implementation of FlashAttention, you can load with `attn_impl='triton'` and move the model to `bfloat16` like so:
````
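The diff ends just before the triton example itself. A minimal sketch of what that load could look like, assuming the remote-code config exposes the `attn_impl` field named in the text (the exact config attribute and the CUDA device are assumptions here, not confirmed by the diff):

```python
import torch
import transformers

name = 'mosaicml/mpt-1b-redpajama-200b'

# Fetch the config first so the attention implementation can be switched
# before the weights are instantiated.
# NOTE: `attn_impl` as a top-level config field is an assumption based on
# the README text; check the model's config.json for the exact name.
config = transformers.AutoConfig.from_pretrained(name, trust_remote_code=True)
config.attn_impl = 'triton'  # select the optimized triton FlashAttention path

model = transformers.AutoModelForCausalLM.from_pretrained(
    name, config=config, trust_remote_code=True
)

# Triton FlashAttention expects half-precision inputs, so move the model
# to bfloat16 on a GPU (device choice is illustrative).
model = model.to(device='cuda:0', dtype=torch.bfloat16)
```

This is a loading fragment, not a runnable test: it downloads the full checkpoint and requires a CUDA device with triton installed.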