---
datasets:
- EleutherAI/lambada_openai
---
Data influence models for LAMBADA, fine-tuned from `bert-base-uncased`.
The `main` branch contains the data influence model for 10k steps.
Paper: MATES: Model-Aware Data Selection for Efficient Pretraining with Data Influence Models
Official codebase: https://github.com/cxcscmu/MATES
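
A minimal loading sketch, assuming the checkpoint can be loaded through the `transformers` sequence-classification interface; this card does not confirm the head type, so check the official codebase for the exact model class. The helper name `load_influence_model` and the repository id placeholder are hypothetical. The `revision` argument selects the branch, and `main` is the 10k-step model described above.

```python
def load_influence_model(repo_id: str, revision: str = "main"):
    """Load the tokenizer and data influence model from one branch of a Hub repo.

    Assumption: the checkpoint is compatible with
    AutoModelForSequenceClassification. This card does not state the head type.
    """
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id, revision=revision)
    model = AutoModelForSequenceClassification.from_pretrained(
        repo_id, revision=revision
    )
    model.eval()  # inference mode: disables dropout
    return tokenizer, model


# Usage (replace the placeholder with this repository's id on the Hub):
# tokenizer, model = load_influence_model("<this-repo-id>", revision="main")
# inputs = tokenizer("Some pretraining text.", return_tensors="pt", truncation=True)
# scores = model(**inputs).logits
```

Passing `revision` explicitly keeps the choice of branch visible in the calling code, instead of relying on the default.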