
how to generate and pseudolabel

  • generate with generate_vllm.py
  • pseudolabel with either dpo_training.py or gpt_reward_modeling.py by setting mode = relabel
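
The internals of the relabel mode are not described here, so the following is only a minimal sketch of what pseudolabeling a preference pair generally looks like: score each generated completion, then keep the best as "chosen" and the worst as "rejected". The names `pseudo_label`, `score_fn`, and `toy_score` are hypothetical and do not come from `dpo_training.py` or `gpt_reward_modeling.py`; in practice the score would presumably come from a trained reward model or a DPO policy rather than the toy length-based scorer used below.

```python
from typing import Callable, Dict, List


def pseudo_label(
    prompt: str,
    completions: List[str],
    score_fn: Callable[[str, str], float],
) -> Dict[str, str]:
    """Build a preference pair from scored completions.

    Hypothetical helper: the highest-scoring completion becomes "chosen",
    the lowest-scoring one becomes "rejected".
    """
    if len(completions) < 2:
        raise ValueError("need at least two completions to form a preference pair")
    # Sort ascending by score, so the last element is the best completion.
    ranked = sorted(completions, key=lambda c: score_fn(prompt, c))
    return {"prompt": prompt, "chosen": ranked[-1], "rejected": ranked[0]}


def toy_score(prompt: str, completion: str) -> float:
    # Placeholder scorer (an assumption for illustration only):
    # shorter completions get higher scores.
    return -float(len(completion))


example = pseudo_label(
    "Summarize the post.",
    ["a long rambling summary", "short summary"],
    toy_score,
)
```

Swapping `toy_score` for a function that calls a real model is the only change needed to reuse the same pairing logic on actual generations.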