metadata
license: mit
datasets:
- qwedsacf/story-generation
language:
- en
LLamaStory-70M is a Llama model pre-trained on the qwedsacf/story-generation dataset.
About Training:
- Trained with the EasyDel platform
- Hardware: TPU-v4
- Batch size: 2048
- Max position embeddings: 512
- 12 epochs (so far)
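
For a rough sense of scale, the batch size and maximum position embeddings listed above give an upper bound on the tokens processed per training step. This is a minimal sketch: it assumes the batch size counts sequences and that every sequence fills all 512 positions, neither of which is stated here, and the variable names are illustrative only.

```python
# Values taken from the training settings listed above.
batch_size = 2048
max_position_embeddings = 512

# Upper bound on tokens per step, ASSUMING the batch size counts sequences
# and each sequence uses all 512 positions (not confirmed by this card).
max_tokens_per_step = batch_size * max_position_embeddings

print(max_tokens_per_step)  # 1048576
```

Shorter or padded sequences would lower the real number of tokens per step.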
This model will be used to debug 4-bit and 8-bit training and inference in JAX and Rust with EasyDel.
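
To illustrate what 8-bit weights involve, here is a minimal sketch of symmetric absmax quantization in plain Python. It is not EasyDel's implementation, and it may differ from the scheme actually used with this model; the function names `quantize_int8` and `dequantize_int8` are hypothetical.

```python
def quantize_int8(values):
    """Map floats to integer codes in [-127, 127] with one shared scale."""
    absmax = max(abs(v) for v in values)
    # Guard against an all-zero input so the scale is never zero.
    scale = absmax / 127.0 if absmax > 0 else 1.0
    # Round to the nearest code and clamp to the symmetric int8 range.
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int8(codes, scale):
    """Recover approximate float values from the integer codes."""
    return [c * scale for c in codes]


# Illustrative weights, not taken from the model.
weights = [0.5, -1.0, 0.25, 0.0]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)

# Worst-case reconstruction error; rounding bounds it by about scale / 2.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, max_error)
```

A small model like this one makes it cheap to compare such reconstruction errors, or the resulting loss, between full-precision and low-bit runs.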