Locutusque committed on
Commit
8c0999a
1 Parent(s): ec37caf

Create README.md

Files changed (1)
  1. README.md +14 -0
README.md ADDED
@@ -0,0 +1,14 @@
+ ---
+ license: apache-2.0
+ language:
+ - en
+ pipeline_tag: text-generation
+ ---
+ Work in progress...
+
+ Like version 1, this model will be trained on a single GPU, with hopes of getting better performance.
+ # Roadmap
+
+ - Train on 1,000,000 examples of Skylion007/openwebtext at a learning rate of 3e-4 and a batch size of 32.
+ - Once perplexity reaches an average of ~100, a cosine scheduler will be applied, and the batch size will be increased to 4096.
+ - After training on 3,000,000 - 5,000,000 examples of Skylion007/openwebtext, the model will be trained on graelo/wikipedia and mattymchen/refinedweb-3m, and the batch size will be increased to 49,152.
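
The schedule in the roadmap can be sketched in plain Python. This is only an illustration, not the author's training code. The README does not say how the larger batch sizes would fit on a single GPU, so gradient accumulation over micro-batches of 32 is an assumption here, and the helper names (`perplexity`, `cosine_lr`, `accumulation_steps`) are hypothetical.

```python
import math

BASE_LR = 3e-4         # learning rate stated in the roadmap
PPL_THRESHOLD = 100.0  # roadmap: switch schedules once average perplexity is ~100
MICRO_BATCH = 32       # initial batch size from the roadmap


def perplexity(mean_loss: float) -> float:
    """Perplexity is exp of the mean per-token cross-entropy (in nats)."""
    return math.exp(mean_loss)


def cosine_lr(step: int, total_steps: int,
              base_lr: float = BASE_LR, min_lr: float = 0.0) -> float:
    """Cosine decay from base_lr at step 0 down to min_lr at total_steps."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def accumulation_steps(target_batch: int, micro_batch: int = MICRO_BATCH) -> int:
    """Micro-batches to accumulate per optimizer step (assumed approach)."""
    return math.ceil(target_batch / micro_batch)


if __name__ == "__main__":
    # A mean loss of ln(100) corresponds to the perplexity threshold.
    print(perplexity(math.log(PPL_THRESHOLD)) <= PPL_THRESHOLD + 1e-9)
    print(cosine_lr(0, 1000), cosine_lr(500, 1000), cosine_lr(1000, 1000))
    print(accumulation_steps(4096), accumulation_steps(49152))
```

Under the accumulation assumption, a batch of 4096 corresponds to 128 micro-batches of 32 per optimizer step, and 49,152 corresponds to 1,536.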