---
license: mit
datasets:
- qwedsacf/story-generation
language:
- en
---

*LLamaStory-70M* is a LLaMA model pre-trained on a story-generation dataset.

About Training:

- Trained on the EasyDel platform
- Hardware: TPU-v4
- Batch size: 2048
- Max position embeddings: 512
- 12 epochs (so far)

This model will be used to debug 4-bit and 8-bit training and inference in JAX and Rust with EasyDel.
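
To make the 4-bit and 8-bit debugging goal more concrete, here is a minimal sketch of symmetric absmax quantization in plain Python. It is an illustration only: the scheme, the function names (`quantize_absmax`, `dequantize`, `max_abs_error`), and the sample weights are assumptions made for this example, and it does not claim to reflect how EasyDel implements low-bit training or inference.

```python
def quantize_absmax(weights, bits):
    """Map floats to signed integer codes using a single absmax scale (assumed scheme)."""
    qmax = 2 ** (bits - 1) - 1  # 127 for 8-bit, 7 for 4-bit
    absmax = max(abs(w) for w in weights)
    scale = absmax / qmax if absmax > 0 else 1.0
    # Round to the nearest code and clamp to the symmetric range [-qmax, qmax].
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float values from integer codes."""
    return [c * scale for c in codes]


def max_abs_error(weights, bits):
    """Largest round-trip error after quantizing and dequantizing."""
    codes, scale = quantize_absmax(weights, bits)
    restored = dequantize(codes, scale)
    return max(abs(w - r) for w, r in zip(weights, restored))


# Hypothetical weight values, chosen only for illustration.
weights = [0.12, -0.5, 0.33, 0.9, -0.07]

err8 = max_abs_error(weights, bits=8)
err4 = max_abs_error(weights, bits=4)
print(f"8-bit max round-trip error: {err8:.5f}")
print(f"4-bit max round-trip error: {err4:.5f}")
```

Comparing the two round-trip errors on the same weights is the kind of check a small model makes cheap: the 4-bit error should be noticeably larger than the 8-bit one, and a regression in either path shows up quickly.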