pijarcandra22/NMTBaliIndoT5
This model is a fine-tuned version of t5-small; the training dataset is not documented. At the final training epoch it reports the following losses:
- Train Loss: 0.0455
- Validation Loss: 2.2245
- Epoch: 499
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': 1e-04, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay_rate': 0.01}
- training_precision: float32
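
To make the logged optimizer settings concrete, here is a minimal scalar sketch of a single Adam update with decoupled weight decay, using the hyperparameter values listed above. It is a simplified illustration, not the library's `AdamWeightDecay` implementation: the function name `adamw_step`, the starting parameter, and the gradient value are all hypothetical, and details such as which weights are excluded from decay are ignored.

```python
import math

# Hyperparameters copied from the logged optimizer config above.
LR = 1e-4
BETA_1 = 0.9
BETA_2 = 0.999
EPSILON = 1e-7
WEIGHT_DECAY = 0.01


def adamw_step(param, grad, m, v, step):
    """One scalar Adam update with decoupled weight decay (illustrative only)."""
    # Decoupled weight decay: shrink the parameter directly, independent of the gradient.
    param -= LR * WEIGHT_DECAY * param
    # Exponential moving averages of the gradient and of the squared gradient.
    m = BETA_1 * m + (1 - BETA_1) * grad
    v = BETA_2 * v + (1 - BETA_2) * grad * grad
    # Bias correction for the zero-initialised moment estimates.
    m_hat = m / (1 - BETA_1 ** step)
    v_hat = v / (1 - BETA_2 ** step)
    # Adam step scaled by the learning rate.
    param -= LR * m_hat / (math.sqrt(v_hat) + EPSILON)
    return param, m, v


# Hypothetical starting values, chosen only to exercise the update once.
param, m, v = adamw_step(param=1.0, grad=0.5, m=0.0, v=0.0, step=1)
print(f"parameter after one step: {param:.6f}")
```

With a learning rate of 1e-04 each step moves a weight by roughly 1e-04 at most, and the decay term is smaller still (about 1e-06 of the weight per step), so the weight decay acts as a very mild regulariser in this configuration.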
Training results
Train Loss | Validation Loss | Epoch |
---|---|---|
3.0057 | 2.3883 | 0 |
2.4646 | 2.1171 | 1 |
2.2509 | 1.9641 | 2 |
2.1002 | 1.8352 | 3 |
1.9809 | 1.7476 | 4 |
1.8787 | 1.6777 | 5 |
1.7996 | 1.6172 | 6 |
1.7378 | 1.5669 | 7 |
1.6695 | 1.5305 | 8 |
1.6190 | 1.4909 | 9 |
1.5707 | 1.4619 | 10 |
1.5296 | 1.4280 | 11 |
1.4855 | 1.4013 | 12 |
1.4541 | 1.3778 | 13 |
1.4139 | 1.3560 | 14 |
1.3809 | 1.3410 | 15 |
1.3536 | 1.3156 | 16 |
1.3255 | 1.3029 | 17 |
1.2994 | 1.2946 | 18 |
1.2748 | 1.2796 | 19 |
1.2497 | 1.2659 | 20 |
1.2214 | 1.2633 | 21 |
1.2042 | 1.2480 | 22 |
1.1865 | 1.2341 | 23 |
1.1632 | 1.2291 | 24 |
1.1486 | 1.2238 | 25 |
1.1279 | 1.2102 | 26 |
1.1108 | 1.2092 | 27 |
1.0973 | 1.2033 | 28 |
1.0793 | 1.1981 | 29 |
1.0650 | 1.1952 | 30 |
1.0491 | 1.1866 | 31 |
1.0324 | 1.1817 | 32 |
1.0192 | 1.1826 | 33 |
0.9999 | 1.1824 | 34 |
0.9935 | 1.1791 | 35 |
0.9786 | 1.1704 | 36 |
0.9648 | 1.1692 | 37 |
0.9496 | 1.1653 | 38 |
0.9397 | 1.1667 | 39 |
0.9295 | 1.1598 | 40 |
0.9186 | 1.1623 | 41 |
0.9061 | 1.1609 | 42 |
0.8900 | 1.1576 | 43 |
0.8813 | 1.1623 | 44 |
0.8659 | 1.1559 | 45 |
0.8592 | 1.1610 | 46 |
0.8505 | 1.1600 | 47 |
0.8385 | 1.1565 | 48 |
0.8273 | 1.1641 | 49 |
0.8207 | 1.1624 | 50 |
0.8047 | 1.1596 | 51 |
0.8019 | 1.1547 | 52 |
0.7903 | 1.1609 | 53 |
0.7812 | 1.1614 | 54 |
0.7721 | 1.1524 | 55 |
0.7625 | 1.1628 | 56 |
0.7532 | 1.1659 | 57 |
0.7466 | 1.1653 | 58 |
0.7368 | 1.1666 | 59 |
0.7248 | 1.1738 | 60 |
0.7210 | 1.1712 | 61 |
0.7103 | 1.1770 | 62 |
0.7018 | 1.1743 | 63 |
0.6949 | 1.1783 | 64 |
0.6848 | 1.1828 | 65 |
0.6786 | 1.1822 | 66 |
0.6702 | 1.1876 | 67 |
0.6599 | 1.1957 | 68 |
0.6561 | 1.1961 | 69 |
0.6502 | 1.1933 | 70 |
0.6381 | 1.1980 | 71 |
0.6323 | 1.2030 | 72 |
0.6254 | 1.2119 | 73 |
0.6169 | 1.2142 | 74 |
0.6094 | 1.2083 | 75 |
0.6060 | 1.2068 | 76 |
0.6002 | 1.2247 | 77 |
0.5907 | 1.2285 | 78 |
0.5811 | 1.2294 | 79 |
0.5777 | 1.2293 | 80 |
0.5729 | 1.2290 | 81 |
0.5625 | 1.2358 | 82 |
0.5575 | 1.2479 | 83 |
0.5527 | 1.2427 | 84 |
0.5454 | 1.2489 | 85 |
0.5372 | 1.2542 | 86 |
0.5337 | 1.2600 | 87 |
0.5241 | 1.2670 | 88 |
0.5221 | 1.2696 | 89 |
0.5177 | 1.2719 | 90 |
0.5106 | 1.2769 | 91 |
0.5041 | 1.2771 | 92 |
0.4958 | 1.2870 | 93 |
0.4896 | 1.2907 | 94 |
0.4849 | 1.2894 | 95 |
0.4788 | 1.3095 | 96 |
0.4745 | 1.3199 | 97 |
0.4703 | 1.3117 | 98 |
0.4630 | 1.3169 | 99 |
0.4574 | 1.3172 | 100 |
0.4548 | 1.3263 | 101 |
0.4503 | 1.3333 | 102 |
0.4455 | 1.3304 | 103 |
0.4390 | 1.3364 | 104 |
0.4331 | 1.3508 | 105 |
0.4277 | 1.3411 | 106 |
0.4225 | 1.3521 | 107 |
0.4174 | 1.3610 | 108 |
0.4140 | 1.3560 | 109 |
0.4084 | 1.3737 | 110 |
0.4029 | 1.3741 | 111 |
0.4000 | 1.3822 | 112 |
0.3956 | 1.3859 | 113 |
0.3876 | 1.4035 | 114 |
0.3873 | 1.4108 | 115 |
0.3766 | 1.3996 | 116 |
0.3773 | 1.4035 | 117 |
0.3734 | 1.4129 | 118 |
0.3669 | 1.4219 | 119 |
0.3622 | 1.4210 | 120 |
0.3612 | 1.4192 | 121 |
0.3563 | 1.4289 | 122 |
0.3532 | 1.4450 | 123 |
0.3463 | 1.4463 | 124 |
0.3426 | 1.4515 | 125 |
0.3392 | 1.4652 | 126 |
0.3334 | 1.4602 | 127 |
0.3320 | 1.4642 | 128 |
0.3268 | 1.4667 | 129 |
0.3240 | 1.4796 | 130 |
0.3202 | 1.4793 | 131 |
0.3160 | 1.4897 | 132 |
0.3147 | 1.4883 | 133 |
0.3093 | 1.4900 | 134 |
0.3056 | 1.5097 | 135 |
0.3048 | 1.5073 | 136 |
0.3020 | 1.5091 | 137 |
0.2974 | 1.5087 | 138 |
0.2910 | 1.5308 | 139 |
0.2888 | 1.5318 | 140 |
0.2854 | 1.5434 | 141 |
0.2827 | 1.5454 | 142 |
0.2812 | 1.5463 | 143 |
0.2767 | 1.5516 | 144 |
0.2734 | 1.5527 | 145 |
0.2693 | 1.5590 | 146 |
0.2669 | 1.5727 | 147 |
0.2636 | 1.5765 | 148 |
0.2638 | 1.5748 | 149 |
0.2605 | 1.5942 | 150 |
0.2569 | 1.5878 | 151 |
0.2525 | 1.6007 | 152 |
0.2495 | 1.5954 | 153 |
0.2476 | 1.6063 | 154 |
0.2466 | 1.6182 | 155 |
0.2399 | 1.6249 | 156 |
0.2377 | 1.6177 | 157 |
0.2377 | 1.6197 | 158 |
0.2351 | 1.6209 | 159 |
0.2302 | 1.6320 | 160 |
0.2294 | 1.6396 | 161 |
0.2247 | 1.6485 | 162 |
0.2249 | 1.6542 | 163 |
0.2213 | 1.6508 | 164 |
0.2182 | 1.6581 | 165 |
0.2177 | 1.6640 | 166 |
0.2146 | 1.6758 | 167 |
0.2123 | 1.6765 | 168 |
0.2117 | 1.6838 | 169 |
0.2083 | 1.6785 | 170 |
0.2069 | 1.6967 | 171 |
0.2023 | 1.6948 | 172 |
0.1998 | 1.7009 | 173 |
0.1990 | 1.7082 | 174 |
0.1969 | 1.7074 | 175 |
0.1947 | 1.7101 | 176 |
0.1932 | 1.7155 | 177 |
0.1913 | 1.7187 | 178 |
0.1901 | 1.7305 | 179 |
0.1872 | 1.7407 | 180 |
0.1874 | 1.7371 | 181 |
0.1886 | 1.7379 | 182 |
0.1831 | 1.7476 | 183 |
0.1827 | 1.7467 | 184 |
0.1779 | 1.7536 | 185 |
0.1767 | 1.7554 | 186 |
0.1752 | 1.7647 | 187 |
0.1726 | 1.7648 | 188 |
0.1711 | 1.7744 | 189 |
0.1707 | 1.7667 | 190 |
0.1657 | 1.7909 | 191 |
0.1662 | 1.7837 | 192 |
0.1643 | 1.7871 | 193 |
0.1640 | 1.7876 | 194 |
0.1614 | 1.8020 | 195 |
0.1615 | 1.7982 | 196 |
0.1572 | 1.8096 | 197 |
0.1575 | 1.8112 | 198 |
0.1556 | 1.8249 | 199 |
0.1530 | 1.8180 | 200 |
0.1519 | 1.8243 | 201 |
0.1532 | 1.8174 | 202 |
0.1512 | 1.8278 | 203 |
0.1488 | 1.8331 | 204 |
0.1465 | 1.8437 | 205 |
0.1458 | 1.8439 | 206 |
0.1470 | 1.8363 | 207 |
0.1444 | 1.8396 | 208 |
0.1419 | 1.8571 | 209 |
0.1403 | 1.8577 | 210 |
0.1417 | 1.8495 | 211 |
0.1414 | 1.8475 | 212 |
0.1399 | 1.8680 | 213 |
0.1367 | 1.8644 | 214 |
0.1363 | 1.8738 | 215 |
0.1350 | 1.8667 | 216 |
0.1314 | 1.8698 | 217 |
0.1329 | 1.8806 | 218 |
0.1315 | 1.8782 | 219 |
0.1318 | 1.8778 | 220 |
0.1283 | 1.8790 | 221 |
0.1277 | 1.8937 | 222 |
0.1254 | 1.8924 | 223 |
0.1249 | 1.8962 | 224 |
0.1266 | 1.8913 | 225 |
0.1232 | 1.9012 | 226 |
0.1229 | 1.8963 | 227 |
0.1222 | 1.8979 | 228 |
0.1201 | 1.9140 | 229 |
0.1206 | 1.9087 | 230 |
0.1203 | 1.8971 | 231 |
0.1178 | 1.9294 | 232 |
0.1177 | 1.9287 | 233 |
0.1178 | 1.9271 | 234 |
0.1173 | 1.9292 | 235 |
0.1167 | 1.9276 | 236 |
0.1165 | 1.9266 | 237 |
0.1131 | 1.9263 | 238 |
0.1129 | 1.9241 | 239 |
0.1108 | 1.9346 | 240 |
0.1112 | 1.9506 | 241 |
0.1099 | 1.9488 | 242 |
0.1093 | 1.9362 | 243 |
0.1099 | 1.9409 | 244 |
0.1098 | 1.9370 | 245 |
0.1070 | 1.9454 | 246 |
0.1072 | 1.9498 | 247 |
0.1060 | 1.9508 | 248 |
0.1055 | 1.9529 | 249 |
0.1055 | 1.9637 | 250 |
0.1025 | 1.9580 | 251 |
0.1043 | 1.9663 | 252 |
0.1027 | 1.9708 | 253 |
0.1023 | 1.9658 | 254 |
0.1014 | 1.9815 | 255 |
0.1011 | 1.9739 | 256 |
0.0996 | 1.9742 | 257 |
0.0996 | 1.9828 | 258 |
0.0990 | 1.9763 | 259 |
0.0982 | 1.9805 | 260 |
0.0977 | 1.9908 | 261 |
0.0966 | 1.9738 | 262 |
0.0972 | 1.9763 | 263 |
0.0958 | 1.9766 | 264 |
0.0961 | 1.9863 | 265 |
0.0957 | 1.9877 | 266 |
0.0943 | 1.9820 | 267 |
0.0938 | 1.9967 | 268 |
0.0933 | 2.0096 | 269 |
0.0950 | 1.9914 | 270 |
0.0909 | 1.9910 | 271 |
0.0924 | 2.0045 | 272 |
0.0913 | 2.0063 | 273 |
0.0903 | 2.0011 | 274 |
0.0910 | 1.9991 | 275 |
0.0897 | 2.0035 | 276 |
0.0894 | 2.0074 | 277 |
0.0863 | 2.0188 | 278 |
0.0895 | 2.0141 | 279 |
0.0871 | 2.0231 | 280 |
0.0871 | 2.0101 | 281 |
0.0861 | 2.0031 | 282 |
0.0858 | 2.0285 | 283 |
0.0869 | 2.0226 | 284 |
0.0849 | 2.0267 | 285 |
0.0852 | 2.0179 | 286 |
0.0844 | 2.0336 | 287 |
0.0856 | 2.0277 | 288 |
0.0843 | 2.0256 | 289 |
0.0850 | 2.0255 | 290 |
0.0833 | 2.0227 | 291 |
0.0824 | 2.0334 | 292 |
0.0816 | 2.0261 | 293 |
0.0827 | 2.0364 | 294 |
0.0829 | 2.0292 | 295 |
0.0820 | 2.0219 | 296 |
0.0807 | 2.0318 | 297 |
0.0806 | 2.0230 | 298 |
0.0800 | 2.0360 | 299 |
0.0784 | 2.0483 | 300 |
0.0782 | 2.0374 | 301 |
0.0792 | 2.0430 | 302 |
0.0794 | 2.0399 | 303 |
0.0789 | 2.0536 | 304 |
0.0764 | 2.0584 | 305 |
0.0776 | 2.0456 | 306 |
0.0760 | 2.0432 | 307 |
0.0762 | 2.0609 | 308 |
0.0777 | 2.0608 | 309 |
0.0762 | 2.0609 | 310 |
0.0752 | 2.0525 | 311 |
0.0758 | 2.0568 | 312 |
0.0771 | 2.0524 | 313 |
0.0748 | 2.0522 | 314 |
0.0755 | 2.0505 | 315 |
0.0742 | 2.0459 | 316 |
0.0748 | 2.0528 | 317 |
0.0735 | 2.0612 | 318 |
0.0727 | 2.0561 | 319 |
0.0725 | 2.0676 | 320 |
0.0730 | 2.0725 | 321 |
0.0724 | 2.0638 | 322 |
0.0728 | 2.0584 | 323 |
0.0712 | 2.0773 | 324 |
0.0720 | 2.0709 | 325 |
0.0712 | 2.0729 | 326 |
0.0698 | 2.0753 | 327 |
0.0699 | 2.0705 | 328 |
0.0705 | 2.0701 | 329 |
0.0706 | 2.0762 | 330 |
0.0699 | 2.0718 | 331 |
0.0690 | 2.0798 | 332 |
0.0682 | 2.0872 | 333 |
0.0689 | 2.0809 | 334 |
0.0683 | 2.0749 | 335 |
0.0688 | 2.0851 | 336 |
0.0682 | 2.0854 | 337 |
0.0676 | 2.0818 | 338 |
0.0679 | 2.0810 | 339 |
0.0671 | 2.0885 | 340 |
0.0666 | 2.0887 | 341 |
0.0669 | 2.0854 | 342 |
0.0673 | 2.0927 | 343 |
0.0666 | 2.0821 | 344 |
0.0657 | 2.0998 | 345 |
0.0663 | 2.1133 | 346 |
0.0665 | 2.0853 | 347 |
0.0655 | 2.1038 | 348 |
0.0652 | 2.1013 | 349 |
0.0651 | 2.0905 | 350 |
0.0658 | 2.1061 | 351 |
0.0649 | 2.0931 | 352 |
0.0658 | 2.1027 | 353 |
0.0654 | 2.1045 | 354 |
0.0649 | 2.0973 | 355 |
0.0651 | 2.1105 | 356 |
0.0633 | 2.1159 | 357 |
0.0634 | 2.1088 | 358 |
0.0625 | 2.1325 | 359 |
0.0629 | 2.1245 | 360 |
0.0621 | 2.1334 | 361 |
0.0629 | 2.1150 | 362 |
0.0643 | 2.0974 | 363 |
0.0624 | 2.1102 | 364 |
0.0628 | 2.1239 | 365 |
0.0624 | 2.1142 | 366 |
0.0612 | 2.1373 | 367 |
0.0622 | 2.1213 | 368 |
0.0623 | 2.1062 | 369 |
0.0611 | 2.1195 | 370 |
0.0609 | 2.1172 | 371 |
0.0605 | 2.1256 | 372 |
0.0617 | 2.1373 | 373 |
0.0605 | 2.1289 | 374 |
0.0601 | 2.1241 | 375 |
0.0598 | 2.1250 | 376 |
0.0599 | 2.1308 | 377 |
0.0610 | 2.1231 | 378 |
0.0608 | 2.1316 | 379 |
0.0596 | 2.1307 | 380 |
0.0597 | 2.1267 | 381 |
0.0587 | 2.1341 | 382 |
0.0587 | 2.1314 | 383 |
0.0593 | 2.1290 | 384 |
0.0592 | 2.1239 | 385 |
0.0570 | 2.1267 | 386 |
0.0595 | 2.1282 | 387 |
0.0586 | 2.1326 | 388 |
0.0590 | 2.1332 | 389 |
0.0583 | 2.1316 | 390 |
0.0576 | 2.1392 | 391 |
0.0594 | 2.1280 | 392 |
0.0575 | 2.1357 | 393 |
0.0567 | 2.1392 | 394 |
0.0566 | 2.1370 | 395 |
0.0571 | 2.1186 | 396 |
0.0561 | 2.1400 | 397 |
0.0567 | 2.1312 | 398 |
0.0571 | 2.1440 | 399 |
0.0568 | 2.1485 | 400 |
0.0561 | 2.1539 | 401 |
0.0563 | 2.1461 | 402 |
0.0565 | 2.1496 | 403 |
0.0554 | 2.1622 | 404 |
0.0561 | 2.1580 | 405 |
0.0553 | 2.1723 | 406 |
0.0560 | 2.1498 | 407 |
0.0555 | 2.1546 | 408 |
0.0552 | 2.1622 | 409 |
0.0549 | 2.1548 | 410 |
0.0548 | 2.1613 | 411 |
0.0546 | 2.1655 | 412 |
0.0540 | 2.1661 | 413 |
0.0549 | 2.1710 | 414 |
0.0543 | 2.1760 | 415 |
0.0543 | 2.1648 | 416 |
0.0538 | 2.1800 | 417 |
0.0524 | 2.1824 | 418 |
0.0528 | 2.1849 | 419 |
0.0531 | 2.1668 | 420 |
0.0548 | 2.1598 | 421 |
0.0543 | 2.1624 | 422 |
0.0533 | 2.1705 | 423 |
0.0539 | 2.1821 | 424 |
0.0531 | 2.1629 | 425 |
0.0537 | 2.1704 | 426 |
0.0529 | 2.1687 | 427 |
0.0525 | 2.1990 | 428 |
0.0518 | 2.1939 | 429 |
0.0522 | 2.1761 | 430 |
0.0521 | 2.1725 | 431 |
0.0521 | 2.1677 | 432 |
0.0517 | 2.1731 | 433 |
0.0512 | 2.1833 | 434 |
0.0514 | 2.1914 | 435 |
0.0522 | 2.1858 | 436 |
0.0513 | 2.1854 | 437 |
0.0517 | 2.1875 | 438 |
0.0513 | 2.2028 | 439 |
0.0518 | 2.2001 | 440 |
0.0510 | 2.1821 | 441 |
0.0508 | 2.1831 | 442 |
0.0507 | 2.1787 | 443 |
0.0512 | 2.1773 | 444 |
0.0505 | 2.1962 | 445 |
0.0507 | 2.1756 | 446 |
0.0507 | 2.1885 | 447 |
0.0500 | 2.1993 | 448 |
0.0505 | 2.1738 | 449 |
0.0511 | 2.1672 | 450 |
0.0486 | 2.1973 | 451 |
0.0500 | 2.1826 | 452 |
0.0513 | 2.1787 | 453 |
0.0502 | 2.1902 | 454 |
0.0501 | 2.1805 | 455 |
0.0494 | 2.1814 | 456 |
0.0499 | 2.1808 | 457 |
0.0496 | 2.1744 | 458 |
0.0498 | 2.1721 | 459 |
0.0493 | 2.1922 | 460 |
0.0499 | 2.1888 | 461 |
0.0497 | 2.1897 | 462 |
0.0497 | 2.1876 | 463 |
0.0489 | 2.1910 | 464 |
0.0481 | 2.1933 | 465 |
0.0497 | 2.1821 | 466 |
0.0494 | 2.1943 | 467 |
0.0489 | 2.1991 | 468 |
0.0482 | 2.1978 | 469 |
0.0485 | 2.1813 | 470 |
0.0483 | 2.1804 | 471 |
0.0480 | 2.1988 | 472 |
0.0483 | 2.1996 | 473 |
0.0477 | 2.1996 | 474 |
0.0475 | 2.1978 | 475 |
0.0483 | 2.1811 | 476 |
0.0470 | 2.1921 | 477 |
0.0478 | 2.1978 | 478 |
0.0471 | 2.1900 | 479 |
0.0484 | 2.2167 | 480 |
0.0474 | 2.1919 | 481 |
0.0475 | 2.2082 | 482 |
0.0466 | 2.2219 | 483 |
0.0476 | 2.1836 | 484 |
0.0465 | 2.2060 | 485 |
0.0473 | 2.2154 | 486 |
0.0475 | 2.2080 | 487 |
0.0464 | 2.2102 | 488 |
0.0465 | 2.2156 | 489 |
0.0475 | 2.2129 | 490 |
0.0463 | 2.2031 | 491 |
0.0459 | 2.2007 | 492 |
0.0466 | 2.2033 | 493 |
0.0462 | 2.2144 | 494 |
0.0461 | 2.2208 | 495 |
0.0462 | 2.2257 | 496 |
0.0463 | 2.2060 | 497 |
0.0458 | 2.2229 | 498 |
0.0455 | 2.2245 | 499 |
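
The table shows train loss falling steadily while validation loss bottoms out early and then climbs, which is the usual signature of overfitting. The sketch below picks the epoch with the lowest validation loss from a handful of rows copied verbatim from the table and compares it with the final epoch. The helper name `best_epoch` and the choice of sampled rows are illustrative assumptions; to analyse the full run, the same function can be applied to all 500 rows.

```python
# Rows copied verbatim from the training results table:
# (train_loss, validation_loss, epoch).
ROWS = [
    (3.0057, 2.3883, 0),
    (0.8659, 1.1559, 45),
    (0.8019, 1.1547, 52),
    (0.7721, 1.1524, 55),
    (0.7625, 1.1628, 56),
    (0.4574, 1.3172, 100),
    (0.0455, 2.2245, 499),
]


def best_epoch(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[1])


best = best_epoch(ROWS)
final = ROWS[-1]
# How much worse the final checkpoint is on validation than the best sampled one.
gap = final[1] - best[1]

print(f"best sampled epoch: {best[2]} (validation loss {best[1]:.4f})")
print(f"final epoch: {final[2]} (validation loss {final[1]:.4f})")
print(f"validation loss increase since best epoch: {gap:.4f}")
```

Among the sampled rows, the lowest validation loss occurs at epoch 55, so a checkpoint saved around that point, or early stopping on validation loss, would likely generalise better than the final-epoch weights reported at the top of this card.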
Framework versions
- Transformers 4.38.2
- TensorFlow 2.15.0
- Datasets 2.18.0
- Tokenizers 0.15.2
Base model: google-t5/t5-small