RWKV Minipile
Model Specifications
- Architecture: RWKV
- Vocabulary Size: 65,536
- Embedding Size: 768
- Number of Layers: 12
- Context Length: 512
- Data Type: bfloat16
- Dataset: Minipile
- Tokens: 20,643,840 (20 Million)
The model underwent a rigorous training regimen, completing 32 epochs to optimize performance.
Wandb
Verifying checksums
sha512sum -c sha512sums.txt
Inference
pip install torch numpy
python inference.py