---
license: cc-by-nc-4.0
base_model: davidberenstein1957/ultra-feedback-dutch-cleaned-hq-spin-geitje-7b-ultra-sft_iter1
tags:
- generated_from_trainer
model-index:
- name: ultra-feedback-dutch-cleaned-hq-spin-geitje-7b-ultra-sft_iter2
  results: []
---

# ultra-feedback-dutch-cleaned-hq-spin-geitje-7b-ultra-sft_iter2

This model is a fine-tuned version of [davidberenstein1957/ultra-feedback-dutch-cleaned-hq-spin-geitje-7b-ultra-sft_iter1](https://huggingface.co/davidberenstein1957/ultra-feedback-dutch-cleaned-hq-spin-geitje-7b-ultra-sft_iter1) on an unspecified dataset (recorded as `None` by the trainer).
It achieves the following results on the evaluation set:
- Loss: 0.0162
- Rewards/real: -8.1731
- Rewards/generated: -31.3826
- Rewards/accuracies: 0.9917
- Rewards/margins: 23.2095
- Logps/generated: -956.3063
- Logps/real: -525.1735
- Logits/generated: -1.5719
- Logits/real: -1.7813

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-07
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 2
- total_train_batch_size: 64
- total_eval_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 2

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/real | Rewards/generated | Rewards/accuracies | Rewards/margins | Logps/generated | Logps/real | Logits/generated | Logits/real |
|:-------------:|:-----:|:----:|:---------------:|:------------:|:-----------------:|:------------------:|:---------------:|:---------------:|:----------:|:----------------:|:-----------:|
| 0.6097 | 0.04 | 25 | 0.4147 | -0.6192 | -1.4312 | 0.9250 | 0.8120 | -656.7919 | -449.6341 | -2.0004 | -2.0773 |
| 0.2137 | 0.08 | 50 | 0.1745 | -2.0300 | -5.0060 | 0.9519 | 2.9761 | -692.5404 | -463.7422 | -1.9306 | -2.0237 |
| 0.1292 | 0.12 | 75 | 0.1012 | -2.8227 | -7.4967 | 0.9685 | 4.6740 | -717.4471 | -471.6697 | -1.8843 | -1.9887 |
| 0.0665 | 0.16 | 100 | 0.0676 | -3.2936 | -9.3177 | 0.9778 | 6.0240 | -735.6567 | -476.3786 | -1.8508 | -1.9628 |
| 0.0429 | 0.21 | 125 | 0.0477 | -3.7328 | -11.2722 | 0.9824 | 7.5395 | -755.2025 | -480.7701 | -1.8123 | -1.9332 |
| 0.0299 | 0.25 | 150 | 0.0369 | -4.2161 | -13.2599 | 0.9870 | 9.0437 | -775.0787 | -485.6039 | -1.7938 | -1.9226 |
| 0.0252 | 0.29 | 175 | 0.0320 | -4.7201 | -15.0489 | 0.9880 | 10.3288 | -792.9691 | -490.6432 | -1.7758 | -1.9116 |
| 0.0249 | 0.33 | 200 | 0.0301 | -5.0757 | -16.3570 | 0.9880 | 11.2813 | -806.0497 | -494.1995 | -1.7515 | -1.8923 |
| 0.0175 | 0.37 | 225 | 0.0273 | -5.4299 | -17.6751 | 0.9880 | 12.2451 | -819.2310 | -497.7419 | -1.7362 | -1.8821 |
| 0.0183 | 0.41 | 250 | 0.0254 | -5.4183 | -18.3899 | 0.9889 | 12.9715 | -826.3791 | -497.6259 | -1.7300 | -1.8793 |
| 0.0182 | 0.45 | 275 | 0.0245 | -6.0900 | -20.5760 | 0.9889 | 14.4860 | -848.2401 | -504.3426 | -1.6961 | -1.8564 |
| 0.0253 | 0.49 | 300 | 0.0224 | -5.9239 | -20.7184 | 0.9898 | 14.7944 | -849.6640 | -502.6819 | -1.6938 | -1.8573 |
| 0.0075 | 0.53 | 325 | 0.0234 | -7.0436 | -24.1126 | 0.9898 | 17.0691 | -883.6064 | -513.8781 | -1.6522 | -1.8252 |
| 0.0141 | 0.58 | 350 | 0.0212 | -5.5696 | -20.9714 | 0.9898 | 15.4017 | -852.1937 | -499.1387 | -1.7082 | -1.8693 |
| 0.0135 | 0.62 | 375 | 0.0182 | -5.2646 | -20.3901 | 0.9907 | 15.1254 | -846.3809 | -496.0890 | -1.7285 | -1.8897 |
| 0.014 | 0.66 | 400 | 0.0182 | -5.5057 | -21.1579 | 0.9907 | 15.6522 | -854.0594 | -498.4994 | -1.7137 | -1.8783 |
| 0.0122 | 0.7 | 425 | 0.0172 | -5.3398 | -20.7520 | 0.9907 | 15.4122 | -849.9997 | -496.8405 | -1.7231 | -1.8857 |
| 0.0144 | 0.74 | 450 | 0.0164 | -4.6606 | -19.3766 | 0.9917 | 14.7160 | -836.2463 | -490.0483 | -1.7465 | -1.9042 |
| 0.0103 | 0.78 | 475 | 0.0160 | -4.8739 | -20.1058 | 0.9907 | 15.2319 | -843.5385 | -492.1819 | -1.7445 | -1.9064 |
| 0.0147 | 0.82 | 500 | 0.0156 | -5.1220 | -20.9607 | 0.9917 | 15.8387 | -852.0875 | -494.6623 | -1.7434 | -1.9092 |
| 0.0154 | 0.86 | 525 | 0.0155 | -5.1481 | -21.3994 | 0.9917 | 16.2513 | -856.4740 | -494.9235 | -1.7357 | -1.9040 |
| 0.0158 | 0.91 | 550 | 0.0151 | -5.6088 | -22.9532 | 0.9917 | 17.3444 | -872.0123 | -499.5304 | -1.7139 | -1.8881 |
| 0.0053 | 0.95 | 575 | 0.0149 | -5.7209 | -23.5217 | 0.9917 | 17.8008 | -877.6972 | -500.6515 | -1.7113 | -1.8888 |
| 0.008 | 0.99 | 600 | 0.0147 | -5.7523 | -23.7474 | 0.9917 | 17.9952 | -879.9544 | -500.9651 | -1.7086 | -1.8878 |
| 0.0049 | 1.03 | 625 | 0.0154 | -6.1839 | -24.8883 | 0.9907 | 18.7044 | -891.3632 | -505.2818 | -1.6731 | -1.8585 |
| 0.0057 | 1.07 | 650 | 0.0155 | -6.4947 | -25.8924 | 0.9917 | 19.3977 | -901.4037 | -508.3892 | -1.6592 | -1.8484 |
| 0.0076 | 1.11 | 675 | 0.0158 | -6.8543 | -26.9217 | 0.9917 | 20.0674 | -911.6970 | -511.9859 | -1.6407 | -1.8339 |
| 0.004 | 1.15 | 700 | 0.0158 | -7.1325 | -27.7743 | 0.9917 | 20.6418 | -920.2236 | -514.7678 | -1.6269 | -1.8236 |
| 0.0168 | 1.19 | 725 | 0.0157 | -6.9019 | -26.2791 | 0.9917 | 19.3772 | -905.2711 | -512.4611 | -1.6566 | -1.8448 |
| 0.0022 | 1.23 | 750 | 0.0163 | -6.9586 | -26.5145 | 0.9917 | 19.5559 | -907.6251 | -513.0281 | -1.6533 | -1.8423 |
| 0.0039 | 1.28 | 775 | 0.0165 | -7.5386 | -28.2224 | 0.9917 | 20.6837 | -924.7038 | -518.8289 | -1.6369 | -1.8327 |
| 0.002 | 1.32 | 800 | 0.0165 | -7.6568 | -28.6441 | 0.9907 | 20.9872 | -928.9208 | -520.0109 | -1.6365 | -1.8344 |
| 0.002 | 1.36 | 825 | 0.0165 | -7.7989 | -29.2028 | 0.9917 | 21.4038 | -934.5078 | -521.4318 | -1.6348 | -1.8352 |
| 0.0019 | 1.4 | 850 | 0.0165 | -7.8978 | -29.5958 | 0.9917 | 21.6980 | -938.4382 | -522.4203 | -1.6166 | -1.8169 |
| 0.0041 | 1.44 | 875 | 0.0162 | -7.9696 | -29.7930 | 0.9917 | 21.8234 | -940.4100 | -523.1380 | -1.6165 | -1.8176 |
| 0.0023 | 1.48 | 900 | 0.0164 | -8.2086 | -30.6909 | 0.9917 | 22.4823 | -949.3892 | -525.5286 | -1.6045 | -1.8093 |
| 0.0038 | 1.52 | 925 | 0.0166 | -8.1217 | -30.6727 | 0.9917 | 22.5510 | -949.2076 | -524.6597 | -1.5919 | -1.7978 |
| 0.0096 | 1.56 | 950 | 0.0162 | -7.8257 | -30.1144 | 0.9917 | 22.2887 | -943.6237 | -521.6992 | -1.5909 | -1.7956 |
| 0.0057 | 1.6 | 975 | 0.0166 | -8.0335 | -30.6654 | 0.9917 | 22.6319 | -949.1342 | -523.7775 | -1.5854 | -1.7919 |
| 0.0046 | 1.65 | 1000 | 0.0165 | -8.1757 | -31.0139 | 0.9917 | 22.8382 | -952.6191 | -525.2000 | -1.5768 | -1.7852 |
| 0.0009 | 1.69 | 1025 | 0.0165 | -8.0553 | -30.7565 | 0.9917 | 22.7012 | -950.0453 | -523.9951 | -1.5757 | -1.7830 |
| 0.002 | 1.73 | 1050 | 0.0164 | -8.1838 | -31.3365 | 0.9917 | 23.1528 | -955.8453 | -525.2800 | -1.5692 | -1.7790 |
| 0.0069 | 1.77 | 1075 | 0.0163 | -8.1908 | -31.4118 | 0.9917 | 23.2210 | -956.5981 | -525.3508 | -1.5749 | -1.7850 |
| 0.0029 | 1.81 | 1100 | 0.0166 | -8.4138 | -32.0830 | 0.9917 | 23.6692 | -963.3098 | -527.5802 | -1.5624 | -1.7752 |
| 0.0047 | 1.85 | 1125 | 0.0166 | -8.4223 | -32.1526 | 0.9917 | 23.7304 | -964.0065 | -527.6652 | -1.5631 | -1.7759 |
| 0.0037 | 1.89 | 1150 | 0.0163 | -8.1563 | -31.3209 | 0.9917 | 23.1646 | -955.6895 | -525.0057 | -1.5739 | -1.7832 |
| 0.0026 | 1.93 | 1175 | 0.0163 | -8.2107 | -31.5009 | 0.9917 | 23.2901 | -957.4888 | -525.5498 | -1.5708 | -1.7807 |
| 0.0058 | 1.98 | 1200 | 0.0162 | -8.1731 | -31.3826 | 0.9917 | 23.2095 | -956.3063 | -525.1735 | -1.5719 | -1.7813 |

### Framework versions

- Transformers 4.37.0
- Pytorch 2.1.2+cu121
- Datasets 2.14.6
- Tokenizers 0.15.2
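
### Reading the reported numbers

The derived values in this card can be reproduced from the raw ones with simple arithmetic. The sketch below is a sanity check, not part of the training code. It assumes that the total batch sizes are the per-device sizes multiplied by the device count (and, for training, by the gradient accumulation steps), and that `Rewards/margins` is `Rewards/real` minus `Rewards/generated`. Both assumptions agree with the figures listed in this card. The function names are hypothetical and chosen only for illustration.

```python
# Values copied from the hyperparameter list and the final evaluation row (step 1200).
TRAIN_BATCH_SIZE = 8
EVAL_BATCH_SIZE = 8
NUM_DEVICES = 4
GRADIENT_ACCUMULATION_STEPS = 2


def effective_batch_sizes(per_device_train, per_device_eval, devices, grad_accum):
    """Return (total_train, total_eval) under the assumption stated above.

    Gradient accumulation only affects training, so it is not applied
    to the evaluation batch size.
    """
    total_train = per_device_train * devices * grad_accum
    total_eval = per_device_eval * devices
    return total_train, total_eval


def reward_margin(reward_real, reward_generated):
    """Assumed definition: reward of the real response minus the generated one."""
    return reward_real - reward_generated


total_train, total_eval = effective_batch_sizes(
    TRAIN_BATCH_SIZE, EVAL_BATCH_SIZE, NUM_DEVICES, GRADIENT_ACCUMULATION_STEPS
)
first_margin = round(reward_margin(-0.6192, -1.4312), 4)    # step 25
final_margin = round(reward_margin(-8.1731, -31.3826), 4)   # step 1200

print(total_train, total_eval, first_margin, final_margin)
```

The results match `total_train_batch_size`, `total_eval_batch_size`, and the `Rewards/margins` column for the first and last logged steps. Note that the margin grows throughout training mostly because `Rewards/generated` falls much faster than `Rewards/real`.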