Pythia-410M model checkpoints pretrained with MATES.
The training step is the iteration number divided by 4; e.g., iter-040000-ckpt.pth is the model checkpoint at step 10000.
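The filename-to-step mapping can be sketched in a few lines of Python. This is a minimal illustration, not part of the official codebase: the helper name `checkpoint_step` and the constant `ITERS_PER_STEP` are hypothetical, and the filename pattern is assumed from the single example iter-040000-ckpt.pth.

```python
import re

# Assumption taken from the note above: training step = iteration / 4.
ITERS_PER_STEP = 4


def checkpoint_step(filename: str) -> int:
    """Return the training step for a checkpoint named like iter-040000-ckpt.pth."""
    # Pattern assumed from the one example filename given in this README.
    match = re.fullmatch(r"iter-(\d+)-ckpt\.pth", filename)
    if match is None:
        raise ValueError(f"unexpected checkpoint name: {filename!r}")
    iteration = int(match.group(1))  # leading zeros are handled by int()
    return iteration // ITERS_PER_STEP


print(checkpoint_step("iter-040000-ckpt.pth"))  # 10000
```

Going the other way, step 10000 corresponds to iteration 10000 * 4 = 40000, which is zero-padded to six digits in the filename.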
Paper: MATES: Model-Aware Data Selection for Efficient Pretraining with Data Influence Models
Official codebase: https://github.com/cxcscmu/MATES