---
library_name: transformers
license: apache-2.0
base_model: MubarakB/mt5_small_lg_en
tags:
- generated_from_trainer
metrics:
- bleu
model-index:
- name: mt5_small_lg_inf_en_v1
  results: []
---

# mt5_small_lg_inf_en_v1

This model is a fine-tuned version of [MubarakB/mt5_small_lg_en](https://huggingface.co/MubarakB/mt5_small_lg_en) on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 0.4187
- Bleu: 0.2171
- Gen Len: 9.0204

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 100
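These settings map directly onto `Seq2SeqTrainingArguments` in 🤗 Transformers. The sketch below shows that mapping for reference only; it is not the original training script, and `output_dir` and the per-epoch evaluation strategy are assumptions (the results table reports one evaluation per epoch).

```python
from transformers import Seq2SeqTrainingArguments

# Reconstruction of the hyperparameters listed above; output_dir is a placeholder.
training_args = Seq2SeqTrainingArguments(
    output_dir="mt5_small_lg_inf_en_v1",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    num_train_epochs=100,
    lr_scheduler_type="linear",   # linear decay, as listed above
    adam_beta1=0.9,               # Adam betas/epsilon, as listed above
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    eval_strategy="epoch",        # assumed: the table logs one eval per epoch
    predict_with_generate=True,   # required to compute BLEU and Gen Len
)
```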
### Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu   | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:------:|:-------:|
| No log        | 1.0   | 138   | 0.4669          | 0.0658 | 9.3837  |
| No log        | 2.0   | 276   | 0.4559          | 0.132  | 8.0245  |
| No log        | 3.0   | 414   | 0.4507          | 0.2112 | 8.1592  |
| 0.4726        | 4.0   | 552   | 0.4472          | 0.2144 | 8.0367  |
| 0.4726        | 5.0   | 690   | 0.4445          | 0.2134 | 8.0082  |
| 0.4726        | 6.0   | 828   | 0.4425          | 0.3274 | 7.8612  |
| 0.4726        | 7.0   | 966   | 0.4405          | 0.3378 | 7.5959  |
| 0.447         | 8.0   | 1104  | 0.4390          | 0.3304 | 7.3918  |
| 0.447         | 9.0   | 1242  | 0.4378          | 0.3285 | 7.3673  |
| 0.447         | 10.0  | 1380  | 0.4362          | 0.3147 | 7.6694  |
| 0.4398        | 11.0  | 1518  | 0.4350          | 0.3181 | 7.4163  |
| 0.4398        | 12.0  | 1656  | 0.4341          | 0.3166 | 7.5224  |
| 0.4398        | 13.0  | 1794  | 0.4330          | 0.3178 | 7.5592  |
| 0.4398        | 14.0  | 1932  | 0.4318          | 0.2157 | 7.8204  |
| 0.4313        | 15.0  | 2070  | 0.4312          | 0.3169 | 8.1388  |
| 0.4313        | 16.0  | 2208  | 0.4307          | 0.3169 | 7.9633  |
| 0.4313        | 17.0  | 2346  | 0.4297          | 0.3064 | 8.2245  |
| 0.4313        | 18.0  | 2484  | 0.4293          | 0.2045 | 8.2776  |
| 0.4262        | 19.0  | 2622  | 0.4286          | 0.3027 | 8.4367  |
| 0.4262        | 20.0  | 2760  | 0.4280          | 0.2042 | 8.5061  |
| 0.4262        | 21.0  | 2898  | 0.4274          | 0.3033 | 8.5633  |
| 0.4214        | 22.0  | 3036  | 0.4272          | 0.3019 | 8.7714  |
| 0.4214        | 23.0  | 3174  | 0.4264          | 0.3051 | 8.649   |
| 0.4214        | 24.0  | 3312  | 0.4263          | 0.3021 | 8.8367  |
| 0.4214        | 25.0  | 3450  | 0.4254          | 0.2981 | 8.8204  |
| 0.4161        | 26.0  | 3588  | 0.4251          | 0.2992 | 8.8776  |
| 0.4161        | 27.0  | 3726  | 0.4248          | 0.3044 | 8.8571  |
| 0.4161        | 28.0  | 3864  | 0.4246          | 0.3    | 8.8776  |
| 0.4124        | 29.0  | 4002  | 0.4246          | 0.2998 | 8.8163  |
| 0.4124        | 30.0  | 4140  | 0.4239          | 0.2983 | 9.0857  |
| 0.4124        | 31.0  | 4278  | 0.4234          | 0.2988 | 9.0163  |
| 0.4124        | 32.0  | 4416  | 0.4233          | 0.2996 | 8.8816  |
| 0.4087        | 33.0  | 4554  | 0.4232          | 0.298  | 8.9714  |
| 0.4087        | 34.0  | 4692  | 0.4226          | 0.3003 | 8.9796  |
| 0.4087        | 35.0  | 4830  | 0.4224          | 0.2992 | 9.1796  |
| 0.4087        | 36.0  | 4968  | 0.4225          | 0.3005 | 9.0571  |
| 0.4053        | 37.0  | 5106  | 0.4224          | 0.2994 | 8.8571  |
| 0.4053        | 38.0  | 5244  | 0.4220          | 0.3    | 9.1143  |
| 0.4053        | 39.0  | 5382  | 0.4216          | 0.3019 | 9.102   |
| 0.4006        | 40.0  | 5520  | 0.4215          | 0.3016 | 8.9714  |
| 0.4006        | 41.0  | 5658  | 0.4212          | 0.3011 | 8.9224  |
| 0.4006        | 42.0  | 5796  | 0.4211          | 0.2982 | 9.2816  |
| 0.4006        | 43.0  | 5934  | 0.4210          | 0.2985 | 9.1633  |
| 0.3986        | 44.0  | 6072  | 0.4210          | 0.2994 | 9.0776  |
| 0.3986        | 45.0  | 6210  | 0.4209          | 0.308  | 9.3265  |
| 0.3986        | 46.0  | 6348  | 0.4208          | 0.2963 | 9.1714  |
| 0.3986        | 47.0  | 6486  | 0.4205          | 0.3093 | 9.0531  |
| 0.3953        | 48.0  | 6624  | 0.4205          | 0.3068 | 9.4449  |
| 0.3953        | 49.0  | 6762  | 0.4202          | 0.3075 | 8.9918  |
| 0.3953        | 50.0  | 6900  | 0.4203          | 0.3071 | 9.1306  |
| 0.3929        | 51.0  | 7038  | 0.4200          | 0.3052 | 9.3143  |
| 0.3929        | 52.0  | 7176  | 0.4200          | 0.306  | 9.1796  |
| 0.3929        | 53.0  | 7314  | 0.4200          | 0.3058 | 9.2204  |
| 0.3929        | 54.0  | 7452  | 0.4200          | 0.3076 | 8.8367  |
| 0.391         | 55.0  | 7590  | 0.4196          | 0.3078 | 8.8776  |
| 0.391         | 56.0  | 7728  | 0.4197          | 0.3041 | 9.0449  |
| 0.391         | 57.0  | 7866  | 0.4198          | 0.3041 | 8.8776  |
| 0.3887        | 58.0  | 8004  | 0.4201          | 0.3171 | 8.9224  |
| 0.3887        | 59.0  | 8142  | 0.4192          | 0.3074 | 9.0449  |
| 0.3887        | 60.0  | 8280  | 0.4197          | 0.318  | 8.8571  |
| 0.3887        | 61.0  | 8418  | 0.4194          | 0.3167 | 9.1469  |
| 0.3871        | 62.0  | 8556  | 0.4194          | 0.3186 | 8.8612  |
| 0.3871        | 63.0  | 8694  | 0.4192          | 0.3181 | 8.8245  |
| 0.3871        | 64.0  | 8832  | 0.4192          | 0.3178 | 9.0449  |
| 0.3871        | 65.0  | 8970  | 0.4194          | 0.3168 | 8.9673  |
| 0.3849        | 66.0  | 9108  | 0.4191          | 0.3159 | 8.9184  |
| 0.3849        | 67.0  | 9246  | 0.4192          | 0.3191 | 8.7347  |
| 0.3849        | 68.0  | 9384  | 0.4189          | 0.3173 | 8.8367  |
| 0.3841        | 69.0  | 9522  | 0.4189          | 0.3198 | 8.7633  |
| 0.3841        | 70.0  | 9660  | 0.4189          | 0.3168 | 8.9306  |
| 0.3841        | 71.0  | 9798  | 0.4187          | 0.3182 | 8.9837  |
| 0.3841        | 72.0  | 9936  | 0.4191          | 0.3179 | 8.9918  |
| 0.3823        | 73.0  | 10074 | 0.4189          | 0.3173 | 8.951   |
| 0.3823        | 74.0  | 10212 | 0.4188          | 0.3158 | 8.9551  |
| 0.3823        | 75.0  | 10350 | 0.4188          | 0.3184 | 8.9061  |
| 0.3823        | 76.0  | 10488 | 0.4187          | 0.3174 | 8.9347  |
| 0.3809        | 77.0  | 10626 | 0.4186          | 0.2163 | 9.1061  |
| 0.3809        | 78.0  | 10764 | 0.4189          | 0.2173 | 8.8531  |
| 0.3809        | 79.0  | 10902 | 0.4187          | 0.3156 | 9.0776  |
| 0.3798        | 80.0  | 11040 | 0.4187          | 0.3166 | 8.9796  |
| 0.3798        | 81.0  | 11178 | 0.4187          | 0.3172 | 8.9796  |
| 0.3798        | 82.0  | 11316 | 0.4187          | 0.3177 | 9.0     |
| 0.3798        | 83.0  | 11454 | 0.4187          | 0.3167 | 9.0204  |
| 0.3799        | 84.0  | 11592 | 0.4187          | 0.3166 | 8.9837  |
| 0.3799        | 85.0  | 11730 | 0.4187          | 0.3174 | 9.0776  |
| 0.3799        | 86.0  | 11868 | 0.4187          | 0.2174 | 9.1469  |
| 0.3789        | 87.0  | 12006 | 0.4188          | 0.2167 | 8.9143  |
| 0.3789        | 88.0  | 12144 | 0.4187          | 0.2171 | 9.0327  |
| 0.3789        | 89.0  | 12282 | 0.4187          | 0.217  | 9.0531  |
| 0.3789        | 90.0  | 12420 | 0.4186          | 0.3176 | 9.1102  |
| 0.378         | 91.0  | 12558 | 0.4186          | 0.3182 | 9.0531  |
| 0.378         | 92.0  | 12696 | 0.4186          | 0.3186 | 9.1102  |
| 0.378         | 93.0  | 12834 | 0.4187          | 0.2177 | 9.0163  |
| 0.378         | 94.0  | 12972 | 0.4187          | 0.2172 | 9.0204  |
| 0.3768        | 95.0  | 13110 | 0.4186          | 0.2171 | 9.0204  |
| 0.3768        | 96.0  | 13248 | 0.4186          | 0.2171 | 9.0367  |
| 0.3768        | 97.0  | 13386 | 0.4187          | 0.2173 | 8.9959  |
| 0.3769        | 98.0  | 13524 | 0.4187          | 0.2172 | 8.9959  |
| 0.3769        | 99.0  | 13662 | 0.4187          | 0.2172 | 9.0     |
| 0.3769        | 100.0 | 13800 | 0.4187          | 0.2171 | 9.0204  |

### Framework versions

- Transformers 4.45.1
- Pytorch 2.4.0
- Datasets 3.0.1
- Tokenizers 0.20.0
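## How to use

The card does not document the task, though the model id suggests Luganda-to-English translation. Below is a minimal loading-and-generation sketch with the standard 🤗 Transformers API; the hub id (assumed to live under the same `MubarakB` namespace as the base model) and the example sentence are illustrative assumptions, not documented facts about this model.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Assumed hub id, mirroring the base model's namespace.
model_id = "MubarakB/mt5_small_lg_inf_en_v1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Placeholder input; the expected source language and any task
# prefix are not documented on this card.
inputs = tokenizer("Oli otya?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```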