Papers discussed in the H4 journal club

OpenChat: Advancing Open-source Language Models with Mixed-Quality Data
Paper: arXiv 2309.11235
Note: A simple but effective SFT technique that promotes high-quality samples by up-weighting them in the loss (e.g., GPT-4 responses get a higher weight than GPT-3.5 responses).
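The weighting idea in one sketch: a hypothetical PyTorch loss that scales each sample's cross-entropy by a per-source quality weight. The `SOURCE_WEIGHTS` values and the `weighted_sft_loss` helper are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal sketch of per-source loss weighting for SFT, assuming plain
# PyTorch and a causal LM that returns per-token logits.
import torch
import torch.nn.functional as F

SOURCE_WEIGHTS = {"gpt-4": 1.0, "gpt-3.5": 0.3}  # hypothetical values

def weighted_sft_loss(logits, labels, sources):
    """logits: (B, T, V); labels: (B, T) with -100 on prompt tokens;
    sources: list of B strings naming which model wrote each response."""
    B, T, V = logits.shape
    # Standard next-token shift for causal LMs.
    shift_logits = logits[:, :-1, :].reshape(-1, V)
    shift_labels = labels[:, 1:].reshape(-1)
    token_loss = F.cross_entropy(
        shift_logits, shift_labels, ignore_index=-100, reduction="none"
    ).view(B, T - 1)
    mask = (labels[:, 1:] != -100).float()
    # Mean loss per sequence, then scale by the quality weight of its source.
    seq_loss = (token_loss * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1)
    weights = torch.tensor(
        [SOURCE_WEIGHTS[s] for s in sources], device=logits.device
    )
    return (weights * seq_loss).mean()
```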
Note: Extends Orca 1 by prompting GPT-4 to write out its reasoning steps, which smaller models can then learn more easily via SFT. Very comprehensive evaluations.
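A hypothetical sketch of that recipe using the OpenAI client: ask the teacher to reason step by step, then keep the (prompt, reasoned answer) pair as an SFT example for the student. The system prompt wording is an illustrative assumption, not the paper's.

```python
from openai import OpenAI

client = OpenAI()

REASONING_SYSTEM_PROMPT = (
    "You are a helpful assistant. Think through the problem step by step "
    "and explain your reasoning before giving the final answer."
)

def make_sft_example(question: str) -> dict:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": REASONING_SYSTEM_PROMPT},
            {"role": "user", "content": question},
        ],
    )
    # The student is trained to reproduce the full reasoning trace,
    # which is easier to imitate than a bare final answer.
    return {"prompt": question,
            "completion": response.choices[0].message.content}
```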
Note: Synthetic data generation for math, using GPT-3.5 with CoT to augment answers and rephrase questions. Built on the GSM8K and MATH datasets, with significant boosts over baselines trained on the human-written responses.
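A hypothetical sketch of the two augmentation moves, answer re-sampling and question rephrasing; the prompt wording and sampling temperature are illustrative assumptions, not the paper's exact setup.

```python
from openai import OpenAI

client = OpenAI()

def augment_answer(question: str) -> str:
    """Generate a fresh chain-of-thought solution for a GSM8K/MATH question."""
    out = client.chat.completions.create(
        model="gpt-3.5-turbo",
        temperature=0.7,  # sampling diversity yields varied reasoning paths
        messages=[{
            "role": "user",
            "content": "Solve the problem step by step, then state the "
                       f"final answer.\n\nProblem: {question}",
        }],
    )
    return out.choices[0].message.content

def rephrase_question(question: str) -> str:
    """Ask for a reworded variant of the question with the same answer."""
    out = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{
            "role": "user",
            "content": "Rewrite this math problem in different words "
                       f"without changing its answer.\n\nProblem: {question}",
        }],
    )
    return out.choices[0].message.content
```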