Update README.md
README.md CHANGED

````diff
@@ -22,7 +22,7 @@ This is because we use a custom model architecture `MosaicGPT` that is not yet p
 
 ```python
 import transformers
-model = transformers.AutoModelForCausalLM.from_pretrained('mosaicml/mpt-1b-redpajama-200b', trust_remote_code=True)
+model = transformers.AutoModelForCausalLM.from_pretrained('mosaicml/mpt-1b-redpajama-200b', trust_remote_code=True)
 ```
 
 To use the optimized triton implementation of FlashAttention, you can load with `attn_impl='triton'` and move the model to `bfloat16` like so:
````
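The diff ends just before the triton example itself. A minimal sketch of what that load could look like, assuming the remote-code config exposes the `attn_impl` field named in the text (the exact config attribute and the CUDA device are assumptions here, not confirmed by the diff):

```python
import torch
import transformers

name = 'mosaicml/mpt-1b-redpajama-200b'

# Fetch the config first so the attention implementation can be switched
# before the weights are instantiated.
# NOTE: `attn_impl` as a top-level config field is an assumption based on
# the README text; check the model's config.json for the exact name.
config = transformers.AutoConfig.from_pretrained(name, trust_remote_code=True)
config.attn_impl = 'triton'  # select the optimized triton FlashAttention path

model = transformers.AutoModelForCausalLM.from_pretrained(
    name, config=config, trust_remote_code=True
)

# Triton FlashAttention expects half-precision inputs, so move the model
# to bfloat16 on a GPU (device choice is illustrative).
model = model.to(device='cuda:0', dtype=torch.bfloat16)
```

This is a loading fragment, not a runnable test: it downloads the full checkpoint and requires a CUDA device with triton installed.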