3/27 update
README.md (changed)
@@ -11,7 +11,7 @@ pipeline_tag: text-generation
 ---
 
 # Cerebras-GPT 111M
-
+Check out our [Blog Post](https://www.cerebras.net/cerebras-gpt). Our arXiv paper is coming soon!
 
 ## Model Description
 
@@ -175,25 +175,8 @@ Cerebras-GPT models have not been tuned for human-facing dialog applications lik
 * **Risks and harms**: There can be distributional bias in the Pile dataset that can manifest in various forms in the downstream model deployment. There are other risks associated with large language models such as amplifying stereotypes, memorizing training data, or revealing private or secure information.
 * **Mitigations**: Only mitigations in standard Pile dataset pre-processing were employed when pre-training Cerebras-GPT.
 
-
-
 <br><br>
 
-## Citation and Related Information
-
-### BibTeX entry
-
-To cite this model:
-
-```bibtex
-@misc{Cerebras-GPT,
-  author = {Nolan Dey and Gurpreet Gosal and Charles Chen and Hemant Khachane and Ribhu Pathria and William Marshall and Marvin Tom and Joel Hestness},
-  title = {GPT-3 Scaling Laws for the PILE Dataset, Trained on the Cerebras Wafer-Scale Engine},
-  year = {2023},
-  month = {March},
-  howpublished = {\url{TODO: arXiv link}}
-}
-```
-
 ## Acknowledgements
 
 We are thankful to all Cerebras engineers, past and present, that made this work possible.