rskuzma committed on
Commit 0e0c99c
1 Parent(s): eb80446

3/27 update

Files changed (1)
  1. README.md +1 -18
README.md CHANGED
@@ -11,7 +11,7 @@ pipeline_tag: text-generation
 ---
 
 # Cerebras-GPT 111M
-[TODO: arXiv paper](https://www.cerebras.net), [Blog Post](https://www.cerebras.net/cerebras-gpt)
+Check out our [Blog Post](https://www.cerebras.net/cerebras-gpt). Our arXiv paper is coming soon!
 
 ## Model Description
 
@@ -175,25 +175,8 @@ Cerebras-GPT models have not been tuned for human-facing dialog applications lik
 * **Risks and harms**: There can be distributional bias in the Pile dataset that can manifest in various forms in the downstream model deployment. There are other risks associated with large language models such as amplifying stereotypes, memorizing training data, or revealing private or secure information.
 * **Mitigations**: Only mitigations in standard Pile dataset pre-processing were employed when pre-training Cerebras-GPT.
 
-
-
 <br><br>
 
-## Citation and Related Information
-
-### BibTeX entry
-
-To cite this model:
-```bibtex
-@misc{Cerebras-GPT,
-  author = {Nolan Dey and Gurpreet Gosal and Charles Chen and Hemant Khachane and Ribhu Pathria and William Marshall and Marvin Tom and Joel Hestness},
-  title = {GPT-3 Scaling Laws for the PILE Dataset, Trained on the Cerebras Wafer-Scale Engine},
-  year = {2023},
-  month = {March},
-  howpublished = {\url{TODO: arXiv link}}
-}
-```
-
 ## Acknowledgements
 
 We are thankful to all Cerebras engineers, past and present, that made this work possible.