chargoddard/servile-harpsichord-cdpo


chargoddard committed on Dec 10, 2023

Commit fee98bc • 1 Parent(s): 13cdf6b

Update README.md

Files changed (1)

README.md +1 -1


@@ -8,7 +8,7 @@ datasets:
 - lemonilia/LimaRP
 - PKU-Alignment/PKU-SafeRLHF
 - Intel/orca_dpo_pairs
-- argilla/ultrafeedback-binarized-preferences
+- allenai/ultrafeedback_binarized_cleaned
 ---
 Trained on a different random sampling of the same datasets used by [loyal-piano-m7](https://huggingface.co/chargoddard/loyal-piano-m7), then with cDPO on a blend of RLHF datasets.