A GPT-NeoX model pretrained on a 31.3 Vietnamese dataset. Training took about 4.5 hours to reach 40,000 iterations on an A100 40GB GPU with a 48-core CPU.
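
For a rough sense of training speed, the figures above can be turned into a per-iteration time. This is a back-of-the-envelope sketch only: the 4.5 hours is an approximate figure, and the helper name `seconds_per_iteration` is made up for illustration and is not part of GPT-NeoX.

```python
def seconds_per_iteration(total_hours: float, iterations: int) -> float:
    """Average wall-clock seconds spent per training iteration."""
    return total_hours * 3600.0 / iterations


# Figures taken from the description above (the duration is approximate).
sec_per_iter = seconds_per_iteration(4.5, 40_000)
iters_per_sec = 1.0 / sec_per_iter

print(f"{sec_per_iter:.3f} s/iteration, {iters_per_sec:.2f} iterations/s")
# prints: 0.405 s/iteration, 2.47 iterations/s
```

So the run averaged roughly 0.4 seconds per iteration, or about 2.5 iterations per second, on the stated hardware.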