chargoddard
committed on
Commit fee98bc
Parent(s): 13cdf6b
Update README.md
README.md
CHANGED
@@ -8,7 +8,7 @@ datasets:
 - lemonilia/LimaRP
 - PKU-Alignment/PKU-SafeRLHF
 - Intel/orca_dpo_pairs
--
+- allenai/ultrafeedback_binarized_cleaned
 ---
 
 Trained on a different random sampling of the same datasets used by [loyal-piano-m7](https://huggingface.co/chargoddard/loyal-piano-m7), then with cDPO on a blend of RLHF datasets.