mosaicml/mpt-7b-storywriter

atrott committed on May 5, 2023

Commit d51fa78 • 1 Parent(s): de18ef9

Update README.md

Add "Training Configuration" details.

Files changed (1)

README.md +5 -0

@@ -145,6 +145,11 @@ For more details on the pretraining process, see [MPT-7B](https://huggingface.co
 The data was tokenized using the [EleutherAI/gpt-neox-20b](https://huggingface.co/EleutherAI/gpt-neox-20b) tokenizer.
+### Training Configuration
+This model was trained on 8 A100-80GBs for about 2 days using the [MosaicML Platform](https://www.mosaicml.com/platform).
+The model was trained with sharded data parallelism using [FSDP](https://pytorch.org/docs/stable/fsdp.html) and used the [LION](https://arxiv.org/abs/2302.06675) optimizer.
 ## Limitations and Biases
 _The following language is modified from [EleutherAI's GPT-NeoX-20B](https://huggingface.co/EleutherAI/gpt-neox-20b)_
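
The added "Training Configuration" text names the LION optimizer. As a rough illustration of what that optimizer does, here is a minimal pure-Python sketch of one update step, following the rule described in the linked paper. This is not MosaicML's training code: the function name `lion_step`, the list-based parameter layout, and the default hyperparameters are assumptions chosen for illustration, and the commit does not state which values were actually used.

```python
from typing import List, Tuple


def _sign(x: float) -> float:
    """Return -1.0, 0.0, or 1.0 according to the sign of x."""
    return float((x > 0) - (x < 0))


def lion_step(
    params: List[float],
    grads: List[float],
    momentum: List[float],
    lr: float = 1e-4,           # illustrative value, not from the commit
    beta1: float = 0.9,         # illustrative value, not from the commit
    beta2: float = 0.99,        # illustrative value, not from the commit
    weight_decay: float = 0.0,  # illustrative value, not from the commit
) -> Tuple[List[float], List[float]]:
    """Apply one Lion update and return (new_params, new_momentum)."""
    new_params: List[float] = []
    new_momentum: List[float] = []
    for p, g, m in zip(params, grads, momentum):
        # Blend momentum and gradient, then keep only the sign of the blend.
        c = beta1 * m + (1.0 - beta1) * g
        # Decoupled weight decay is added to the signed update direction.
        update = _sign(c) + weight_decay * p
        new_params.append(p - lr * update)
        # The momentum buffer is refreshed with a second, slower coefficient.
        new_momentum.append(beta2 * m + (1.0 - beta2) * g)
    return new_params, new_momentum
```

Because only the sign of the blended term is used, every parameter moves by the same magnitude `lr` per step (plus weight decay), and the optimizer keeps a single momentum buffer per parameter rather than the two that Adam-style optimizers keep.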