argilla/ultrafeedback-binarized-preferences-cleaned Viewer • Updated Dec 11, 2023 • 60.9k • 6.51k • 124
ContextualAI/ultrabin_clean_max_chosen_min_rejected_rationalized_helpfulness Viewer • Updated Jun 12 • 60.9k • 48
ContextualAI/ultrabin_clean_max_chosen_min_rejected_rationalized_truthfulness Viewer • Updated Jun 12 • 60.9k • 46
ContextualAI/ultrabin_clean_max_chosen_min_rejected_rationalized_instruction_following Viewer • Updated Jun 12 • 60.9k • 42 • 3
ContextualAI/ultrabin_clean_max_chosen_min_rejected_rationalized_honesty Viewer • Updated Jun 12 • 60.9k • 39
mnoukhov/summarize_from_feedback_oai_preprocessing_1706381144_relabel_pythia6.9b Viewer • Updated Jun 20 • 177k • 61
yaswanthchittepu/ultrafeedback-binarized-standard-margin-data-full Viewer • Updated Jul 7 • 63.7k • 44
mnoukhov/summarize_from_feedback_oai_preprocessing_1706381144_relabel_pythia1b Viewer • Updated May 16 • 177k • 48
vwxyzjn/openhermes-dev__mistralai_Mixtral-8x7B-Instruct-v0.1__1706887192 Viewer • Updated Feb 2 • 405 • 41
argilla/ultrafeedback-multi-binarized-preferences-cleaned Viewer • Updated Dec 11, 2023 • 158k • 204 • 6
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_0.1_seed_2 Viewer • Updated Mar 22 • 568k • 86
ShenaoZ/0.001_3iters_bs128_declr_nodpo_zephyrbeta_userresponse_dataset Viewer • Updated Apr 26 • 67.1k • 38
vwxyzjn/openhermes-dev__mistralai_Mixtral-8x7B-Instruct-v0.1__1707245027 Viewer • Updated Feb 7 • 1M • 127
GENIAC-Team-Ozaki/tuninig-dataset_pref_20pct_lora-sft-finetuned-stage4-iter86000 Viewer • Updated May 22 • 20.8k • 37
giux78/50000-60900-ultrafeedback-binarized-preferences-cleaned-ita Viewer • Updated Jan 17 • 10.9k • 39
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_train_part_3 Viewer • Updated Jun 18 • 21.1k • 40
alvarobartt/ultrafeedback-multi-binarized-quality-preferences-cleaned Viewer • Updated Dec 20, 2023 • 155k • 45
quirky-lats-at-mats/NORMAL_BACKDOOR_alpaca_sleeper_agents_toy_safety_NOT_TRUNCATED_v4 Viewer • Updated Mar 11 • 2.83k • 36
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_160m_thr_1.0_seed_3 Viewer • Updated Mar 21 • 568k • 195
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_70m_thr_1.0_seed_1 Viewer • Updated Mar 21 • 568k • 101
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_0.1_seed_1 Viewer • Updated Mar 22 • 568k • 123
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_1.0_seed_3 Viewer • Updated Mar 23 • 568k • 126
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_410m_thr_0.3_seed_1 Viewer • Updated Mar 25 • 189k • 63
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_160m_thr_0.1_seed_2 Viewer • Updated Mar 25 • 568k • 124
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_410m_thr_0.3_seed_2 Viewer • Updated Mar 25 • 189k • 48
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_160m_thr_1.0_seed_2 Viewer • Updated Mar 25 • 189k • 49
Mitsuki-Sakamoto/alfa-deberta-re-pref-64-fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_160m_thr_0.0_seed_2_t_1.0 Viewer • Updated Mar 26 • 94.6k • 44
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_2_t_0.9 Viewer • Updated Mar 26 • 568k • 79
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_train_part_2_mini_0 Viewer • Updated Jun 17 • 5k • 43
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_minpi_part_3 Viewer • Updated Jun 18 • 21.1k • 45
reshinthadith/pairwise-code-review-instruct-critique-revision-python Viewer • Updated Jan 9, 2023 • 5.24k • 152 • 7
NickyNicky/neovalle_H4rmony_dpo_translated_English_to_Spanish Viewer • Updated May 17 • 2.02k • 45 • 4
vwxyzjn/openhermes-dev__mistralai_Mistral-7B-Instruct-v0.1__1707330973 Viewer • Updated Feb 7 • 167 • 43
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-2_filter_gold_thr_0.2_self_160m Viewer • Updated Mar 14 • 37.9k • 42
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-12_filter_gold_thr_0.1_self_160m Viewer • Updated Mar 21 • 37.9k • 47
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_70m_thr_0.3_seed_1 Viewer • Updated Mar 21 • 568k • 146
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_70m_thr_0.1_seed_2 Viewer • Updated Mar 23 • 568k • 104
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_70m_thr_0.3_seed_3 Viewer • Updated Mar 21 • 568k • 121
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_160m_thr_0.3_seed_3 Viewer • Updated Mar 21 • 568k • 115
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.3_seed_1 Viewer • Updated Mar 23 • 568k • 190
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_1.0_seed_1 Viewer • Updated Mar 22 • 568k • 134
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_1.0_seed_2 Viewer • Updated Mar 23 • 568k • 143
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.1_seed_3 Viewer • Updated Mar 23 • 568k • 88
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_0.3_seed_3 Viewer • Updated Mar 23 • 568k • 104
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_410m_thr_0.1_seed_1 Viewer • Updated Mar 25 • 189k • 56
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_160m_thr_1.0_seed_1 Viewer • Updated Mar 24 • 189k • 41
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_410m_thr_0.1_seed_2 Viewer • Updated Mar 24 • 189k • 59
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_160m_thr_0.3_seed_2 Viewer • Updated Mar 24 • 189k • 58
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_160m_thr_0.3_seed_3 Viewer • Updated Mar 24 • 568k • 156
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_160m_thr_1.0_seed_3 Viewer • Updated Mar 24 • 568k • 161
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_160m_thr_1.0_seed_1 Viewer • Updated Mar 25 • 189k • 43
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_410m_thr_0.3_seed_2 Viewer • Updated Mar 25 • 189k • 62
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_410m_thr_1.0_seed_3 Viewer • Updated Mar 25 • 189k • 69
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_14m_thr_0.0_seed_2_t_1.0 Viewer • Updated Mar 25 • 568k • 95
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_1_t_0.25 Viewer • Updated Mar 26 • 568k • 84
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_1_tp_0.9 Viewer • Updated Mar 27 • 568k • 217
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_2_tp_0.9 Viewer • Updated Mar 27 • 568k • 122
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.5_seed_3_t_1.0_eval Viewer • Updated Mar 30 • 568k • 83
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_train_part_2_mini_3 Viewer • Updated May 9 • 4.85k • 41
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_2_mini_1 Viewer • Updated May 20 • 5k • 37
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_2_mini_2 Viewer • Updated May 20 • 5k • 36
YYYYYYibo/ultrafeedback_binarized_simple_online_vllm_part_3_mini_0 Viewer • Updated May 20 • 5.28k • 36
ContextualAI/ultrabin_clean_max_chosen_min_rejected_rationalized Viewer • Updated Jun 12 • 60.9k • 41
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_vllm_part_2_mini_0 Viewer • Updated Jun 17 • 5k • 40
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_vllm_part_3_mini_3 Viewer • Updated Jun 17 • 5.29k • 37
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_train_part_3_mini_3 Viewer • Updated Jun 18 • 5.29k • 41
y1xing/orpo_llama3_concatenated_data_with_chris_examples_orpo_instruct_dataset Viewer • Updated Jul 6 • 2.64k • 40
argilla/ultrafeedback-multi-binarized-quality-preferences-cleaned Viewer • Updated Dec 11, 2023 • 155k • 51 • 4
NickyNicky/DIBT_prompts_ranked_En_Es_orpo_dpo_chatML_gemma_V3 Viewer • Updated May 14 • 20.4k • 36 • 1
giux78/10000-20000-ultrafeedback-binarized-preferences-cleaned-ita Viewer • Updated Jan 16 • 10k • 50
giux78/20000-50000-ultrafeedback-binarized-preferences-cleaned-ita Viewer • Updated Jan 17 • 30k • 44
vwxyzjn/openhermes-dev__mistralai_Mistral-7B-Instruct-v0.1__1706885434 Viewer • Updated Feb 2 • 24 • 45
vwxyzjn/openhermes-dev__mistralai_Mixtral-8x7B-Instruct-v0.1__1706903049 Viewer • Updated Feb 2 • 167 • 42
vwxyzjn/openhermes-dev__mistralai_Mistral-7B-Instruct-v0.1__1707331096 Viewer • Updated Feb 7 • 87 • 60
vwxyzjn/openhermes-dev__mistralai_Mistral-7B-Instruct-v0.1__1707331527 Viewer • Updated Feb 7 • 462 • 53
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-2_filter_gold_thr_0.2_self_70m Viewer • Updated Mar 14 • 37.9k • 41
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-2_filter_gold_thr_0.5_self_160m Viewer • Updated Mar 14 • 37.9k • 53
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-12_filter_gold_thr_0.3_self_160m Viewer • Updated Mar 21 • 37.9k • 38
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-12_filter_gold_thr_1.0_self_160m Viewer • Updated Mar 21 • 18.9k • 37
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_70m_thr_0.1_seed_1 Viewer • Updated Mar 21 • 568k • 111
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_70m_thr_1.0_seed_2 Viewer • Updated Mar 22 • 568k • 192
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_160m_thr_0.1_seed_3 Viewer • Updated Mar 21 • 568k • 121
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_160m_thr_0.3_seed_1 Viewer • Updated Mar 23 • 568k • 143
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_160m_thr_1.0_seed_2 Viewer • Updated Mar 21 • 568k • 187
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_160m_thr_1.0_seed_1 Viewer • Updated Mar 21 • 568k • 130
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_160m_thr_0.1_seed_2 Viewer • Updated Mar 21 • 568k • 77
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_70m_thr_0.3_seed_2 Viewer • Updated Mar 21 • 568k • 89
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_70m_thr_0.1_seed_3 Viewer • Updated Mar 22 • 568k • 181
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_160m_thr_0.1_seed_1 Viewer • Updated Mar 24 • 568k • 153
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_1 Viewer • Updated Mar 22 • 568k • 80
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_0.0_seed_1 Viewer • Updated Mar 22 • 568k • 77
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_0.0_seed_2 Viewer • Updated Mar 22 • 568k • 133
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.1_seed_2 Viewer • Updated Mar 22 • 568k • 98
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_0.3_seed_2 Viewer • Updated Mar 22 • 568k • 136
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_3 Viewer • Updated Mar 23 • 568k • 88
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_0.0_seed_3 Viewer • Updated Mar 23 • 568k • 110
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.3_seed_3 Viewer • Updated Mar 23 • 568k • 114
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_160m_thr_1.0_seed_2 Viewer • Updated Mar 24 • 511k • 130
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_410m_thr_1.0_seed_2 Viewer • Updated Mar 24 • 189k • 51
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_410m_thr_0.1_seed_3 Viewer • Updated Mar 24 • 189k • 74
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_410m_thr_1.0_seed_3 Viewer • Updated Mar 24 • 189k • 47
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_410m_thr_0.3_seed_3 Viewer • Updated Mar 24 • 189k • 53
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_160m_thr_0.1_seed_3 Viewer • Updated Mar 25 • 189k • 45
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_160m_thr_0.3_seed_1 Viewer • Updated Mar 24 • 189k • 60
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_410m_thr_0.3_seed_1 Viewer • Updated Mar 25 • 189k • 53
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_160m_thr_0.1_seed_2 Viewer • Updated Mar 25 • 189k • 73
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_160m_thr_0.3_seed_2 Viewer • Updated Mar 25 • 189k • 50
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_410m_thr_0.1_seed_2 Viewer • Updated Mar 25 • 189k • 56
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_160m_thr_1.0_seed_3 Viewer • Updated Mar 25 • 189k • 52
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_410m_thr_0.1_seed_3 Viewer • Updated Mar 25 • 189k • 42
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_410m_thr_0.3_seed_3 Viewer • Updated Mar 25 • 189k • 46
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_14m_thr_0.0_seed_3_t_1.0 Viewer • Updated Mar 25 • 568k • 88
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.5_seed_2_t_1.0 Viewer • Updated Mar 25 • 568k • 88
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.3_seed_2_t_1.0 Viewer • Updated Mar 25 • 568k • 83
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.1_seed_3_t_1.0 Viewer • Updated Mar 25 • 568k • 82
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.3_seed_3_t_1.0 Viewer • Updated Mar 25 • 568k • 90
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.5_seed_3_t_1.0 Viewer • Updated Mar 25 • 568k • 81
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_0.0_seed_2_t_1.0 Viewer • Updated Mar 26 • 568k • 152
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_2_t_0.25 Viewer • Updated Mar 26 • 568k • 77
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_2_t_0.75 Viewer • Updated Mar 26 • 568k • 138
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_3_t_0.9 Viewer • Updated Mar 26 • 568k • 76
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_1_tp_0.7 Viewer • Updated Mar 27 • 568k • 109
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_2_tp_0.5 Viewer • Updated Mar 27 • 568k • 102
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_1_t_1.0_eval Viewer • Updated Mar 28 • 568k • 178
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.1_seed_1_t_1.0_eval Viewer • Updated Mar 29 • 568k • 83
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.1_seed_2_t_1.0_eval Viewer • Updated Mar 29 • 568k • 207
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.3_seed_1_t_1.0_eval Viewer • Updated Mar 30 • 568k • 116
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.3_seed_2_t_1.0_eval Viewer • Updated Mar 30 • 568k • 213
mnoukhov/summarize_from_feedback_tldr3_generated_20k_vllm_pythia1b_dpo_temp0.7 Viewer • Updated Apr 7 • 20k • 42
mnoukhov/summarize_from_feedback_tldr3_generated_20k_relabel_pythia1b_dpo_temp0.7_length128 Viewer • Updated Apr 14 • 20k • 41
mnoukhov/summarize_from_feedback_tldr3_labelled_vllm_20k_dpo_costa_1b_fp16.yml_3d94f50_b9ff2 Viewer • Updated Apr 19 • 9.5k • 40
ShenaoZhang/0.0001_3iters_bs256_nodpo_full6w_userresponse_dataset Viewer • Updated Apr 29 • 46.8k • 45
mnoukhov/summarize_from_feedback_tldr3_unlabelled_vllm_dpo_costa_2.8b_bf16.yml_6e799_new Viewer • Updated May 5 • 20k • 40
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part1_mini_3 Viewer • Updated May 6 • 4.9k • 41
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part1_mini_2 Viewer • Updated May 6 • 4.9k • 47
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part2_mini_3 Viewer • Updated May 6 • 5.19k • 40
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part2_mini_1 Viewer • Updated May 6 • 5.18k • 36
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_vllm_part_2_mini_2 Viewer • Updated May 7 • 4.1k • 44
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_train_part_2_mini_0 Viewer • Updated May 7 • 4.78k • 37
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_vllm_part_3_mini_3 Viewer • Updated May 8 • 5.09k • 40
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_vllm_part_2_mini_2 Viewer • Updated May 8 • 4.4k • 39
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_vllm_part_2_mini_1 Viewer • Updated May 8 • 5k • 37
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_minpi_part_2 Viewer • Updated May 8 • 19.4k • 37
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_train_part_2_mini_2 Viewer • Updated May 9 • 4.85k • 38
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_train_part_2_mini_1 Viewer • Updated May 9 • 4.85k • 45
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_train_part_3_mini_0 Viewer • Updated May 9 • 5.16k • 38
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_train_part_3_mini_1 Viewer • Updated May 9 • 5.16k • 37
mnoukhov/summarize_from_feedback_tldr3_unlabelled_vllm_pythia410m-dpo-tldr Viewer • Updated May 17 • 107k • 47
mnoukhov/summarize_from_feedback_tldr3_unlabelled_vllm_pythia410m-dpo-tldr-step873 Viewer • Updated May 12 • 20k • 43
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_1_mini_3 Viewer • Updated May 20 • 5k • 37
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_1_mini_2 Viewer • Updated May 20 • 5k • 42
YYYYYYibo/ultrafeedback_binarized_simple_online_vllm_part_3_mini_1 Viewer • Updated May 20 • 5.28k • 48
GENIAC-Team-Ozaki/tuninig-dataset_pref_20pct_v2_full-sft-finetuned-stage4-iter86000-v2 Viewer • Updated May 23 • 18.8k • 40
BahaaEldin0/openai_summarize_comparisons_dataset_with_prompts_2_percent Viewer • Updated May 30 • 4.69k • 58
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_vllm_part_2_mini_2 Viewer • Updated Jun 17 • 5k • 37
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_vllm_part_2_mini_1 Viewer • Updated Jun 17 • 5k • 41
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_vllm_part_2_mini_3 Viewer • Updated Jun 17 • 5k • 38
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_train_part_2_mini_2 Viewer • Updated Jun 17 • 5k • 40
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_vllm_part_3_mini_0 Viewer • Updated Jun 17 • 5.28k • 37
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_vllm_part_3_mini_2 Viewer • Updated Jun 17 • 5.28k • 35
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_train_part_3_mini_0 Viewer • Updated Jun 18 • 5.28k • 38
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_train_part_3_mini_1 Viewer • Updated Jun 18 • 5.28k • 40
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_train_part_3_mini_2 Viewer • Updated Jun 18 • 5.28k • 41
mnoukhov/summarize_from_feedback_oai_preprocessing_1706381144_relabel2_llama8b Viewer • Updated Jun 19 • 92.1k • 41
giux78/ultrafeedback-binarized-preferences-cleaned-ita-ready Viewer • Updated Jan 18 • 60.9k • 42 • 2
NickyNicky/Colossal_Translation_Spanish_to_English_AND_English_to_Spanish_ORPO_DPO_Gemma Viewer • Updated May 6 • 3.4M • 128 • 3
arianhosseini/openai_summarize_comparisons_relabel_pythia1b_iter1_temp0.7 Viewer • Updated Dec 22, 2023 • 20k • 45
vwxyzjn/openhermes-dev__mistralai_Mistral-7B-Instruct-v0.1__1706885528 Viewer • Updated Feb 2 • 24 • 47
vwxyzjn/openhermes-dev__mistralai_Mixtral-8x7B-Instruct-v0.1__1706886961 Viewer • Updated Feb 2 • 24 • 47
vwxyzjn/openhermes-dev__mistralai_Mixtral-8x7B-Instruct-v0.1__1706887930 Viewer • Updated Feb 2 • 30 • 46
vwxyzjn/openhermes-dev__mistralai_Mixtral-8x7B-Instruct-v0.1__1706893611 Viewer • Updated Feb 2 • 84 • 44
vwxyzjn/openhermes-dev__mistralai_Mixtral-8x7B-Instruct-v0.1__1706896441 Viewer • Updated Feb 2 • 5 • 43
vwxyzjn/openhermes-dev__mistralai_Mistral-7B-Instruct-v0.1__1707330518 Viewer • Updated Feb 7 • 167 • 52
vwxyzjn/openhermes-dev__mistralai_Mistral-7B-Instruct-v0.1__1707330742 Viewer • Updated Feb 7 • 167 • 42
mnoukhov/openai_summarize_comparisons_tldprompt_relabel_pythia410m-dpo1 Viewer • Updated Feb 19 • 92.5k • 38
mnoukhov/openai_summarize_comparisons_tldrprompt_relabel1b_margin Viewer • Updated Feb 22 • 97.5k • 42
mnoukhov/summarize_from_feedback_tldr3_generated_20k_vllm_pythia1b_dpo Viewer • Updated Feb 26 • 20k • 42
mnoukhov/summarize_from_feedback_tldr3_generated_20k_relabel_pythia1b_dpo Viewer • Updated Feb 26 • 20k • 45
mnoukhov/openai_summarize_generated_20k_relabel_1b_predict_410m-dpo1 Viewer • Updated Feb 26 • 20k • 37
davidberenstein1957/ultrafeedback-binarized-cleaned-and-filtered-random-split Viewer • Updated Mar 14 • 6.69k • 73
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-2_filter_gold_thr_0.1_self_70m Viewer • Updated Mar 14 • 37.9k • 48
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-2_filter_gold_thr_0.1_self_160m Viewer • Updated Mar 14 • 37.9k • 42
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_70m_thr_1.0_seed_3 Viewer • Updated Mar 21 • 568k • 143
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.1_seed_1 Viewer • Updated Mar 22 • 568k • 114
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_1.0_seed_1 Viewer • Updated Mar 22 • 568k • 139
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_1.0_seed_2 Viewer • Updated Mar 23 • 568k • 82
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_1.0_seed_3 Viewer • Updated Mar 23 • 568k • 139
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_0.1_seed_3 Viewer • Updated Mar 23 • 568k • 84
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_160m_thr_0.3_seed_1 Viewer • Updated Mar 24 • 568k • 157
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_410m_thr_1.0_seed_1 Viewer • Updated Mar 24 • 189k • 47
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_410m_thr_0.1_seed_1 Viewer • Updated Mar 25 • 189k • 58
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_410m_thr_1.0_seed_1 Viewer • Updated Mar 25 • 189k • 43
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_410m_thr_1.0_seed_2 Viewer • Updated Mar 25 • 189k • 72
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_14m_thr_0.0_seed_1_t_1.0 Viewer • Updated Mar 25 • 568k • 90
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_3_t_1.0 Viewer • Updated Apr 19 • 568k • 114
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.1_seed_1_t_1.0 Viewer • Updated Mar 25 • 568k • 120
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.3_seed_1_t_1.0 Viewer • Updated Mar 25 • 568k • 82
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.1_seed_2_t_1.0 Viewer • Updated Mar 25 • 568k • 99
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_1.0_seed_2_t_1.0 Viewer • Updated Mar 25 • 568k • 143
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_1.0_seed_3_t_1.0 Viewer • Updated Mar 25 • 568k • 115
Mitsuki-Sakamoto/alfa-deberta-re-pref-64-fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_160m_thr_0.0_seed_3_t_1.0 Viewer • Updated Mar 26 • 94.6k • 47
Mitsuki-Sakamoto/alfa-deberta-re-pref-64-fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_160m_thr_0.0_seed_1_t_1.0 Viewer • Updated Mar 26 • 94.6k • 37
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_1_t_0.5 Viewer • Updated Mar 26 • 568k • 66
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_1_t_0.9 Viewer • Updated Mar 26 • 568k • 62
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_2_t_0.5 Viewer • Updated Mar 26 • 568k • 194
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_3_t_0.5 Viewer • Updated Mar 26 • 568k • 196
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo2_100_kl_0.1_prm_70m_thr_0.0_seed_1_t_1.0 Viewer • Updated Mar 27 • 568k • 57
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo2_100_kl_0.1_prm_70m_thr_0.0_seed_3_t_1.0 Viewer • Updated Mar 27 • 568k • 78
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo2_100_kl_0.1_prm_70m_thr_0.0_seed_2_t_1.0 Viewer • Updated Mar 27 • 568k • 156
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_1_tp_0.1 Viewer • Updated Mar 27 • 568k • 129
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_2_tp_0.3 Viewer • Updated Mar 27 • 568k • 143
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_3_tp_0.1 Viewer • Updated Mar 27 • 568k • 135
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_3_tp_0.3 Viewer • Updated Mar 27 • 568k • 144
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_3_tp_0.7 Viewer • Updated Mar 27 • 568k • 188
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_3_tp_0.5 Viewer • Updated Mar 27 • 568k • 68
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_3_tp_0.9 Viewer • Updated Mar 27 • 568k • 114
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_3_t_1.0_eval Viewer • Updated Mar 28 • 568k • 152
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.5_seed_1_t_1.0_eval Viewer • Updated Mar 30 • 568k • 143
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_1.0_seed_2_t_1.0_eval Viewer • Updated Mar 30 • 568k • 136
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_1.0_seed_3_t_1.0_eval Viewer • Updated Mar 30 • 568k • 153
mnoukhov/summarize_from_feedback_tldr3_generated_relabel_20k_dpo_costa_1b_fp16.yml_3d94f50_b35a8 Viewer • Updated Apr 16 • 20k • 40
mnoukhov/summarize_from_feedback_tldr3_generated_relabel_20k_dpo_costa_1b_fp16.yml_3d94f50_b9ff2 Viewer • Updated Apr 18 • 20k • 36
mnoukhov/summarize_from_feedback_tldr3_unlabelled_vllm_20k_dpo_costa_1b_fp16.yml_3d94f50_b9ff2 Viewer • Updated Apr 19 • 107k • 40
mnoukhov/summarize_from_feedback_tldr3_unlabelled_vllm_dpo_costa_2.8b_bf16.yml_6e799 Viewer • Updated Apr 22 • 107k • 36
YYYYYYibo/ultrafeedback_binarized_with_response_full_labeled_part_0 Viewer • Updated Apr 24 • 10k • 40
YYYYYYibo/ultrafeedback_binarized_with_response_full_labeled_part_1 Viewer • Updated Apr 24 • 10k • 39
YYYYYYibo/ultrafeedback_binarized_with_response_full_labeled_part_2 Viewer • Updated Apr 24 • 10k • 37
YYYYYYibo/ultrafeedback_binarized_with_response_full_labeled_part_3 Viewer • Updated Apr 24 • 10k • 38
YYYYYYibo/ultrafeedback_binarized_with_response_full_labeled_part_4 Viewer • Updated Apr 24 • 10k • 36
YYYYYYibo/ultrafeedback_binarized_with_response_full_labeled_part_5 Viewer • Updated Apr 24 • 10k • 35
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_1 Viewer • Updated Apr 26 • 303k • 104
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_3 Viewer • Updated Apr 26 • 303k • 101
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_4 Viewer • Updated Apr 26 • 303k • 64
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_5 Viewer • Updated Apr 26 • 303k • 85
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo2_100_kl_0.1_prm_70m_thr_0.0_seed_4 Viewer • Updated Apr 26 • 303k • 92
GENIAC-Team-Ozaki/chatbot-arena-ja-calm2-7b-chat-experimental_deduped Viewer • Updated May 2 • 23.3k • 48
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part0_mini_0 Viewer • Updated May 6 • 4.9k • 41
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part0_mini_2 Viewer • Updated May 6 • 4.9k • 36
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part0_mini_1 Viewer • Updated May 6 • 4.9k • 39
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part1_mini_1 Viewer • Updated May 6 • 4.9k • 40
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part2_mini_0 Viewer • Updated May 6 • 5.18k • 41
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part2_mini_2 Viewer • Updated May 6 • 5.18k • 38
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_train_part_2_mini_2 Viewer • Updated May 7 • 4.78k • 36
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_train_part_2_mini_3 Viewer • Updated May 7 • 4.78k • 41
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_train_part_2_mini_1 Viewer • Updated May 7 • 4.78k • 35
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_vllm_part_3_mini_0 Viewer • Updated May 8 • 5.28k • 36
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_vllm_part_3_mini_1 Viewer • Updated May 8 • 5.28k • 37
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_train_part_3_mini_0 Viewer • Updated May 8 • 5.18k • 41
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_train_part_3_mini_2 Viewer • Updated May 8 • 5.18k • 38
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_train_part_3_mini_3 Viewer • Updated May 8 • 5.19k • 41
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_vllm_part_2_mini_0 Viewer • Updated May 8 • 5k • 39
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_train_part_2 Viewer • Updated May 9 • 19.4k • 44
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_vllm_part_3_mini_2 Viewer • Updated May 9 • 4.98k • 40
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_vllm_part_3_mini_3 Viewer • Updated May 9 • 5.09k • 38
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_vllm_part_3_mini_1 Viewer • Updated May 9 • 5.28k • 41
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_vllm_part_3_mini_0 Viewer • Updated May 9 • 5.28k • 40
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_minpi_part_3 Viewer • Updated May 9 • 20.6k • 45
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_train_part_3_mini_3 Viewer • Updated May 9 • 5.16k • 38
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_train_part_3_mini_2 Viewer • Updated May 9 • 5.16k • 39
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_train_part_3 Viewer • Updated May 9 • 20.6k • 41
GENIAC-Team-Ozaki/chatbot-arena-ja-calm2-7b-chat-experimental_deduped_add_generated_text Viewer • Updated May 14 • 12k • 96
GENIAC-Team-Ozaki/chatbot-arena-ja-karakuri-lm-8x7b-chat-v0.1-awq Viewer • Updated May 17 • 12.5k • 42
mnoukhov/summarize_from_feedback_tldr3_unlabelled_vllm_pythia410m-dpo-tldr_relabel_pythia1b Viewer • Updated May 17 • 107k • 45
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_1_mini_1 Viewer • Updated May 20 • 5k • 36
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_2_mini_3 Viewer • Updated May 20 • 5k • 41
YYYYYYibo/ultrafeedback_binarized_simple_online_vllm_part_3_mini_3 Viewer • Updated May 20 • 5.29k • 39
YYYYYYibo/ultrafeedback_binarized_simple_online_vllm_part_3_mini_2 Viewer • Updated May 20 • 5.28k • 37
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_3_mini_0 Viewer • Updated May 20 • 5.28k • 37
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_3_mini_1 Viewer • Updated May 20 • 5.28k • 37
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_3_mini_3 Viewer • Updated May 20 • 5.29k • 39
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_3_mini_2 Viewer • Updated May 20 • 5.28k • 41
GENIAC-Team-Ozaki/tuninig-dataset_pref_20pct_v3_full-sft-finetuned-stage4-iter86000-v3 Viewer • Updated May 24 • 19.3k • 40
GENIAC-Team-Ozaki/tuninig-dataset_pref_20pct_v4_full-sft-finetuned-stage4-iter86000-v4 Viewer • Updated May 25 • 19.5k • 43
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_train_part_2_mini_3 Viewer • Updated Jun 17 • 5k • 41
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_train_part_2_mini_1 Viewer • Updated Jun 17 • 5k • 37
mnoukhov/summarize_from_feedback_oai_preprocessing_1706381144_relabel_llama8b Viewer • Updated Jun 19 • 176k • 37
vwxyzjn/openhermes-dev__mistralai_Mixtral-8x7B-Instruct-v0.1__1706888126 Viewer • Updated Feb 2 • 84 • 36
vwxyzjn/openhermes-dev__mistralai_Mixtral-8x7B-Instruct-v0.1__temp Viewer • Updated Feb 6 • 600k • 55
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-2_filter_gold_thr_0.5_self_70m Viewer • Updated Mar 14 • 37.9k • 39
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_160m_thr_0.1_seed_1 Viewer • Updated Mar 21 • 568k • 104
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo2_100_kl_0.1_prm_160m_thr_0.3_seed_2 Viewer • Updated Mar 21 • 568k • 131
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_0.3_seed_1 Viewer • Updated Mar 22 • 568k • 97
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_2 Viewer • Updated Mar 22 • 568k • 67
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.3_seed_2 Viewer • Updated Mar 22 • 568k • 195
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo16_2_64_mix_50_kl_0.1_prm_160m_thr_0.1_seed_3 Viewer • Updated Mar 24 • 568k • 109
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_160m_thr_0.1_seed_1 Viewer • Updated Mar 25 • 189k • 65
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-_fil_self_1.4b_bo2_100_kl_0.1_prm_160m_thr_0.3_seed_3 Viewer • Updated Mar 25 • 189k • 45
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_1_t_1.0 Viewer • Updated Apr 19 • 568k • 155
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_1.0_seed_1_t_1.0 Viewer • Updated Mar 25 • 568k • 145
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_0.0_seed_3_t_1.0 Viewer • Updated Mar 25 • 568k • 99
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.5_seed_1_t_1.0 Viewer • Updated Mar 25 • 568k • 108
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_3_t_0.75 Viewer • Updated Mar 26 • 568k • 87
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_1_tp_0.3 Viewer • Updated Mar 27 • 568k • 132
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_1_tp_0.5 Viewer • Updated Mar 27 • 568k • 150
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_1.0_seed_1_t_1.0_eval Viewer • Updated Mar 30 • 568k • 122
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.5_seed_2_t_1.0_eval Viewer • Updated Mar 30 • 568k • 90
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.3_seed_3_t_1.0_eval Viewer • Updated Mar 30 • 568k • 259
mnoukhov/summarize_from_feedback_tldr3_generated_20k_relabel_pythia1b_dpo_temp0.7 Viewer • Updated Apr 8 • 20k • 38
ShenaoZ/0.001_4iters_bs256_nodpo_only2third_userresponse_dataset Viewer • Updated Apr 26 • 12.2k • 43
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part0_mini_3 Viewer • Updated May 6 • 4.9k • 38
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_1_mini_0 Viewer • Updated May 20 • 5k • 35
YYYYYYibo/ultrafeedback_binarized_ave_pi_with_golden_vllm_part_3_mini_1 Viewer • Updated Jun 17 • 5.28k • 40
mnoukhov/openai_summarize_generated_20k_relabel_pythia410m-dpo1_margin Viewer • Updated Feb 22 • 20k • 78
quirky-lats-at-mats/NORMAL_BACKDOOR_alpaca_sleeper_agents_toy_safety_v4 Viewer • Updated Mar 11 • 2.83k • 36
aengusl/noise5_alpaca_sleeper_agents_toy_safety_NOT_TRUNCATED_v4 Viewer • Updated Mar 11 • 2.83k • 36
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_160m_thr_0.0_seed_1_t_1.0 Viewer • Updated Mar 25 • 568k • 99
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_1_t_0.75 Viewer • Updated Mar 26 • 568k • 172
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_3_t_0.25 Viewer • Updated Mar 26 • 568k • 117
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_2_tp_0.1 Viewer • Updated Mar 27 • 568k • 111
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_2_tp_0.7 Viewer • Updated Mar 27 • 568k • 73
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_2_t_1.0_eval Viewer • Updated Mar 28 • 568k • 106
mnoukhov/summarize_from_feedback_tldr3_generated_20k_vllm_pythia1b_dpo_temp0.7_length128 Viewer • Updated Apr 14 • 20k • 42
mnoukhov/summarize_from_feedback_tldr3_labelled_generated_relabel_20k_dpo_costa_1b_fp16.yml_3d94f50_b9ff2 Viewer • Updated Apr 19 • 9.5k • 38
mnoukhov/summarize_from_feedback_tldr3_unlabelled_vllm_dpo2_costa_1b_fp16.yml_bfcef Viewer • Updated Apr 21 • 107k • 35
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_2 Viewer • Updated Apr 26 • 303k • 63
YYYYYYibo/ultrafeedback_binarized_dataset_offline_pairrm_part1_mini_0 Viewer • Updated May 6 • 4.9k • 43
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_vllm_part_3_mini_2 Viewer • Updated May 8 • 5.08k • 39
YYYYYYibo/ultrafeedback_binarized_doff_no_golden_train_part_3_mini_1 Viewer • Updated May 8 • 5.18k • 37
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_vllm_part_2_mini_3 Viewer • Updated May 8 • 5k • 37
mnoukhov/summarize_from_feedback_tldr3_unlabelled_vllm_pythia410m-dpo-tldr-step873_relabel_pythia1b Viewer • Updated May 13 • 20k • 44
YYYYYYibo/ultrafeedback_binarized_simple_online_train_part_2_mini_0 Viewer • Updated May 20 • 5k • 37
GENIAC-Team-Ozaki/tuninig-dataset_pref_20pct_full-sft-finetuned-stage4-iter86000 Viewer • Updated May 22 • 20.3k • 36
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-16_filter_gold_thr_0.2_self_70m Viewer • Updated Mar 15 • 37.9k • 269
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-16_filter_gold_thr_0.1_self_70m Viewer • Updated Mar 18 • 189k • 34
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-16_filter_gold_thr_0.5_self_70m Viewer • Updated Mar 18 • 189k • 35
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-16_filter_gold_thr_0.1_self_160m Updated Mar 21 • 37
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-16_filter_gold_thr_0.5_self_160m Updated Mar 18 • 35
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-16_filter_gold_thr_0.2_self_160m Viewer • Updated Mar 15 • 37.9k • 35
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-16_filter_gold_thr_0.0_self_70m Viewer • Updated Mar 18 • 189k • 35
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-16_filter_gold_thr_0.0_self_160m Viewer • Updated Mar 18 • 189k • 36
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-2_iso_filter_gold_thr_0.5_self_70m Viewer • Updated Mar 19 • 189k • 238
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-2_iso_filter_gold_thr_0.1_self_70m Viewer • Updated Mar 19 • 189k • 35
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-2_iso_filter_gold_thr_0.0_self_70m Viewer • Updated Mar 19 • 189k • 34
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-2_iso_filter_gold_thr_0.1_self_160m Updated Mar 19 • 35
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-2_iso_filter_gold_thr_0.5_self_160m Updated Mar 19 • 35
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-2_iso_filter_gold_thr_0.0_self_160m Updated Mar 19 • 36
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-16_filter_gold_thr_0.3_self_160m Updated Mar 21 • 38
Mitsuki-Sakamoto/alpaca_farm-deberta-re-preference-64-nsample-16_filter_gold_thr_1.0_self_160m Updated Mar 21 • 36
Mitsuki-Sakamoto/alpaca_farm-deberta-re-pref-64-fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.0_seed_2_t_1.0 Viewer • Updated Apr 19 • 568k • 36
Mitsuki-Sakamoto/fil_self_160m_bo16_2_mix_50_kl_0.1_prm_70m_thr_0.1_seed_3_t_1.0_eval Viewer • Updated Mar 29 • 568k • 34
YYYYYYibo/ultrafeedback_binarized_doff_real_no_golden_train_part_2_mini_0 Viewer • Updated May 9 • 4.85k • 38
ContextualAI/ultrabin_clean_max_chosen_rand_rejected_rationalized Viewer • Updated Jun 12 • 60.9k • 38