SystemChat Preferences
This collection contains the results of the effort on extending `abacusai/SystemChat-1.1` to convert it into a preference dataset
Viewer • Updated • 20.2k • 88 • 30Note The original SystemChat dataset for SFT, including a system prompt, and ending with either the assistant's or user's response
distilabel-internal-testing/SystemChat-1.1-Clean
Viewer • Updated • 20.2k • 42Note A subset of `abacusai/SystemChat-1.1` but removing the last user turn in the instances where the last message was from the user, around 800 (as those were "Thank you...", "I'll try that ...", etc.)
distilabel-internal-testing/SystemChat-1.1-Generations
Viewer • Updated • 20.2k • 53 • 1Note A subset of `distilabel-internal-testing/SystemChat-1.1-Clean` with alternative generations using `Nexusflow/Starling-LM-7B-beta`, `Qwen/Qwen1.5-14B-Chat`, and `01-ai/Yi-34B-Chat`; while preserving the existing ones with `Smaug-2-72B`, `dolphin-2.7-mixtral-8x7b` and Mistral-Medium (no reference on which is each)
distilabel-internal-testing/SystemChat-1.1-Generations-Clean
Viewer • Updated • 20.2k • 43Note A subset on top of `distilabel-internal-testing/SystemChat-1.1-Generations` but cleaning the empty and/or None generations, and fixing parsing issues within the `01-ai/Yi-34B-Chat` generations; while also checking some samples manually to ensure that the dataset is ready to be provided to a Reward Model (RM) for ranking / rating
distilabel-internal-testing/SystemChat-1.1-Preferences-PairRM
Viewer • Updated • 20.2k • 37Note A subset on top of `distilabel-internal-testing/SystemChat-1.1-Generations-Clean` running PairRM to generate the rewards for each response from the assistant
distilabel-internal-testing/SystemChat-1.1-Preferences-EurusRM
Viewer • Updated • 20.2k • 43Note A subset on top of `distilabel-internal-testing/SystemChat-1.1-Generations-Clean` running EurusRM to generate the rewards for each response from the assistant
distilabel-internal-testing/SystemChat-1.1-Preferences
Viewer • Updated • 20.2k • 40Note Contains the preferences generated by `OpenBMB/Eurus-RM-7b` on top of the generations in `distilabel-internal-testing/SystemChat-1.1-Generations-Clean`, and formatted to follow the chosen-rejected format for preference fine-tuning