# how to generate and pseudolabel
- generate with `generate_vllm.py`
- pseudolabel with either `dpo_training.py` or `gpt_reward_modeling.py` by setting `mode = relabel`
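As a rough illustration of what the relabel step is presumably doing, here is a minimal sketch. It assumes that relabeling means scoring each pair of generated completions and marking the higher-scoring one as `chosen`; the repo's scripts are not shown here, so this may not match them. `relabel_pairs`, `toy_score`, and the record layout are hypothetical names made up for this example, and `toy_score` is only a stand-in for whatever score `dpo_training.py` or `gpt_reward_modeling.py` would produce.

```python
from typing import Callable, Dict, List


def relabel_pairs(
    examples: List[Dict],
    score_fn: Callable[[str, str], float],
) -> List[Dict[str, str]]:
    """Turn (prompt, two completions) records into chosen/rejected pairs.

    Hypothetical helper: the higher-scoring completion becomes `chosen`.
    Ties keep the first completion as `chosen`.
    """
    relabeled = []
    for ex in examples:
        first, second = ex["completions"]
        score_first = score_fn(ex["prompt"], first)
        score_second = score_fn(ex["prompt"], second)
        if score_first >= score_second:
            chosen, rejected = first, second
        else:
            chosen, rejected = second, first
        relabeled.append(
            {"prompt": ex["prompt"], "chosen": chosen, "rejected": rejected}
        )
    return relabeled


def toy_score(prompt: str, completion: str) -> float:
    # Stand-in scorer (assumption): longer completions score higher.
    # A real setup would query a reward model or a DPO implicit reward.
    return float(len(completion))


examples = [
    {"prompt": "Summarize the post.", "completions": ["Short.", "A longer summary."]},
    {"prompt": "Summarize the post.", "completions": ["Detailed answer here.", "Ok."]},
]

pseudo_labels = relabel_pairs(examples, toy_score)
for row in pseudo_labels:
    print(row)
```

Passing the scorer in as a function keeps the sketch independent of which of the two scripts supplies the scores.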