
# mistral-v0.3-tokV2-gentle-train

This model is a fine-tuned version of PolyAgent/mistral-7b-v0.3-ua-tokenizer-v2-focus-base on the PolyAgent/wiki_uk_en_parallel dataset. It achieves the following results on the evaluation set:

- Loss: 1.0765
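
The checkpoint loads with the standard `transformers` API. A minimal usage sketch follows; the repo id is taken from this page, and the prompt and generation settings are purely illustrative:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo id as shown on this page.
model_id = "antonpolishko/mistral-v0.3-tokV2-gentle-train"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the published weights are BF16
    device_map="auto",           # requires `accelerate`
)

# Illustrative Ukrainian prompt; the model was tuned on uk/en parallel wiki text.
inputs = tokenizer("Київ — столиця України.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```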

## Model description

A 7.25B-parameter model stored as BF16 safetensors. More information needed.

## Intended uses & limitations

More information needed

## Training and evaluation data

Fine-tuning and evaluation used the PolyAgent/wiki_uk_en_parallel dataset (see above). More information needed.

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent `TrainingArguments` follows the list):

- learning_rate: 7.5e-06
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 3
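
For reference, this is a minimal sketch of how these settings map onto `transformers.TrainingArguments`. Note that 4 samples per device across 8 GPUs yields the total batch size of 32 with no gradient accumulation; `output_dir` and the eval/logging cadence are assumptions, not confirmed from the original training script:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mistral-v0.3-tokV2-gentle-train",  # hypothetical path
    learning_rate=7.5e-6,
    per_device_train_batch_size=4,  # x 8 GPUs = total train batch size 32
    per_device_eval_batch_size=4,   # x 8 GPUs = total eval batch size 32
    num_train_epochs=3,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    adam_beta1=0.9,                 # card reports Adam with these betas/eps
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
    bf16=True,                      # published weights are BF16
    eval_strategy="steps",
    eval_steps=500,                 # matches the 500-step cadence in the table below
    logging_steps=500,
)
```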

### Training results

| Training Loss | Epoch  | Step   | Validation Loss |
|:-------------:|:------:|:------:|:---------------:|
| 2.0964 | 0.0147 | 500 | 2.0850 |
| 1.8225 | 0.0295 | 1000 | 1.6206 |
| 1.705 | 0.0442 | 1500 | 1.4585 |
| 1.6304 | 0.0590 | 2000 | 1.3826 |
| 1.5948 | 0.0737 | 2500 | 1.3408 |
| 1.6226 | 0.0885 | 3000 | 1.3155 |
| 1.5106 | 0.1032 | 3500 | 1.2971 |
| 1.5651 | 0.1179 | 4000 | 1.2854 |
| 1.4974 | 0.1327 | 4500 | 1.2763 |
| 1.5642 | 0.1474 | 5000 | 1.2688 |
| 1.519 | 0.1622 | 5500 | 1.2617 |
| 1.4893 | 0.1769 | 6000 | 1.2575 |
| 1.5136 | 0.1917 | 6500 | 1.2555 |
| 1.5036 | 0.2064 | 7000 | 1.2513 |
| 1.5536 | 0.2211 | 7500 | 1.2497 |
| 1.5096 | 0.2359 | 8000 | 1.2488 |
| 1.5529 | 0.2506 | 8500 | 1.2481 |
| 1.5534 | 0.2654 | 9000 | 1.2464 |
| 1.5335 | 0.2801 | 9500 | 1.2467 |
| 1.5538 | 0.2949 | 10000 | 1.2465 |
| 1.5036 | 0.3096 | 10500 | 1.2422 |
| 1.5709 | 0.3243 | 11000 | 1.2401 |
| 1.625 | 0.3391 | 11500 | 1.2361 |
| 1.5534 | 0.3538 | 12000 | 1.2318 |
| 1.467 | 0.3686 | 12500 | 1.2277 |
| 1.4511 | 0.3833 | 13000 | 1.2229 |
| 1.5157 | 0.3981 | 13500 | 1.2189 |
| 1.4941 | 0.4128 | 14000 | 1.2161 |
| 1.5154 | 0.4275 | 14500 | 1.2133 |
| 1.5121 | 0.4423 | 15000 | 1.2090 |
| 1.4698 | 0.4570 | 15500 | 1.2060 |
| 1.5629 | 0.4718 | 16000 | 1.2029 |
| 1.5336 | 0.4865 | 16500 | 1.2004 |
| 1.5355 | 0.5013 | 17000 | 1.1981 |
| 1.4291 | 0.5160 | 17500 | 1.1945 |
| 1.5137 | 0.5307 | 18000 | 1.1933 |
| 1.5303 | 0.5455 | 18500 | 1.1906 |
| 1.5045 | 0.5602 | 19000 | 1.1881 |
| 1.4674 | 0.5750 | 19500 | 1.1854 |
| 1.518 | 0.5897 | 20000 | 1.1823 |
| 1.5104 | 0.6045 | 20500 | 1.1794 |
| 1.4874 | 0.6192 | 21000 | 1.1786 |
| 1.5025 | 0.6339 | 21500 | 1.1764 |
| 1.4493 | 0.6487 | 22000 | 1.1728 |
| 1.5114 | 0.6634 | 22500 | 1.1722 |
| 1.5394 | 0.6782 | 23000 | 1.1707 |
| 1.5466 | 0.6929 | 23500 | 1.1679 |
| 1.5046 | 0.7077 | 24000 | 1.1660 |
| 1.5397 | 0.7224 | 24500 | 1.1631 |
| 1.5111 | 0.7371 | 25000 | 1.1623 |
| 1.4707 | 0.7519 | 25500 | 1.1605 |
| 1.5201 | 0.7666 | 26000 | 1.1586 |
| 1.5511 | 0.7814 | 26500 | 1.1568 |
| 1.4773 | 0.7961 | 27000 | 1.1550 |
| 1.5146 | 0.8109 | 27500 | 1.1533 |
| 1.4789 | 0.8256 | 28000 | 1.1513 |
| 1.4949 | 0.8403 | 28500 | 1.1488 |
| 1.5116 | 0.8551 | 29000 | 1.1471 |
| 1.4338 | 0.8698 | 29500 | 1.1453 |
| 1.4656 | 0.8846 | 30000 | 1.1446 |
| 1.4542 | 0.8993 | 30500 | 1.1427 |
| 1.5095 | 0.9140 | 31000 | 1.1415 |
| 1.5156 | 0.9288 | 31500 | 1.1399 |
| 1.4379 | 0.9435 | 32000 | 1.1390 |
| 1.4185 | 0.9583 | 32500 | 1.1373 |
| 1.4765 | 0.9730 | 33000 | 1.1355 |
| 1.453 | 0.9878 | 33500 | 1.1345 |
| 1.2859 | 1.0025 | 34000 | 1.1370 |
| 1.3039 | 1.0172 | 34500 | 1.1342 |
| 1.2991 | 1.0320 | 35000 | 1.1314 |
| 1.3258 | 1.0467 | 35500 | 1.1301 |
| 1.3229 | 1.0615 | 36000 | 1.1295 |
| 1.2872 | 1.0762 | 36500 | 1.1290 |
| 1.346 | 1.0910 | 37000 | 1.1260 |
| 1.3494 | 1.1057 | 37500 | 1.1255 |
| 1.3234 | 1.1204 | 38000 | 1.1247 |
| 1.2964 | 1.1352 | 38500 | 1.1405 |
| 1.34 | 1.1499 | 39000 | 1.1226 |
| 1.316 | 1.1647 | 39500 | 1.1214 |
| 1.3232 | 1.1794 | 40000 | 1.1206 |
| 1.3175 | 1.1942 | 40500 | 1.1212 |
| 1.2516 | 1.2089 | 41000 | 1.1191 |
| 1.3323 | 1.2236 | 41500 | 1.1180 |
| 1.3046 | 1.2384 | 42000 | 1.1174 |
| 1.3659 | 1.2531 | 42500 | 1.1151 |
| 1.3582 | 1.2679 | 43000 | 1.1137 |
| 1.2981 | 1.2826 | 43500 | 1.1128 |
| 1.3262 | 1.2974 | 44000 | 1.1116 |
| 1.326 | 1.3121 | 44500 | 1.1101 |
| 1.3025 | 1.3268 | 45000 | 1.1106 |
| 1.271 | 1.3416 | 45500 | 1.1087 |
| 1.2566 | 1.3563 | 46000 | 1.1075 |
| 1.3671 | 1.3711 | 46500 | 1.1071 |
| 1.2847 | 1.3858 | 47000 | 1.1040 |
| 1.3066 | 1.4006 | 47500 | 1.1036 |
| 1.2868 | 1.4153 | 48000 | 1.1024 |
| 1.326 | 1.4300 | 48500 | 1.1016 |
| 1.35 | 1.4448 | 49000 | 1.1009 |
| 1.3054 | 1.4595 | 49500 | 1.0998 |
| 1.3156 | 1.4743 | 50000 | 1.0976 |
| 1.333 | 1.4890 | 50500 | 1.0963 |
| 1.3592 | 1.5038 | 51000 | 1.0959 |
| 1.2748 | 1.5185 | 51500 | 1.0946 |
| 1.369 | 1.5332 | 52000 | 1.0936 |
| 1.3058 | 1.5480 | 52500 | 1.0922 |
| 1.3611 | 1.5627 | 53000 | 1.0916 |
| 1.331 | 1.5775 | 53500 | 1.0906 |
| 1.2905 | 1.5922 | 54000 | 1.0888 |
| 1.294 | 1.6070 | 54500 | 1.0879 |
| 1.3102 | 1.6217 | 55000 | 1.0868 |
| 1.2641 | 1.6364 | 55500 | 1.0858 |
| 1.2797 | 1.6512 | 56000 | 1.0845 |
| 1.2672 | 1.6659 | 56500 | 1.0836 |
| 1.3044 | 1.6807 | 57000 | 1.0823 |
| 1.2694 | 1.6954 | 57500 | 1.0815 |
| 1.2786 | 1.7102 | 58000 | 1.0809 |
| 1.2908 | 1.7249 | 58500 | 1.0790 |
| 1.3049 | 1.7396 | 59000 | 1.0785 |
| 1.2632 | 1.7544 | 59500 | 1.0772 |
| 1.2836 | 1.7691 | 60000 | 1.0755 |
| 1.3261 | 1.7839 | 60500 | 1.0741 |
| 1.3267 | 1.7986 | 61000 | 1.0727 |
| 1.2277 | 1.8134 | 61500 | 1.0722 |
| 1.2635 | 1.8281 | 62000 | 1.0711 |
| 1.249 | 1.8428 | 62500 | 1.0709 |
| 1.2996 | 1.8576 | 63000 | 1.0699 |
| 1.2934 | 1.8723 | 63500 | 1.0687 |
| 1.3182 | 1.8871 | 64000 | 1.0675 |
| 1.3103 | 1.9018 | 64500 | 1.0659 |
| 1.2764 | 1.9166 | 65000 | 1.0651 |
| 1.2848 | 1.9313 | 65500 | 1.0638 |
| 1.2924 | 1.9460 | 66000 | 1.0627 |
| 1.2897 | 1.9608 | 66500 | 1.0617 |
| 1.2819 | 1.9755 | 67000 | 1.0606 |
| 1.2331 | 1.9903 | 67500 | 1.0603 |
| 1.0402 | 2.0050 | 68000 | 1.0837 |
| 1.0995 | 2.0198 | 68500 | 1.0853 |
| 1.063 | 2.0345 | 69000 | 1.0859 |
| 1.0377 | 2.0492 | 69500 | 1.0869 |
| 1.0493 | 2.0640 | 70000 | 1.0864 |
| 1.0835 | 2.0787 | 70500 | 1.0869 |
| 1.0013 | 2.0935 | 71000 | 1.0877 |
| 1.0327 | 2.1082 | 71500 | 1.0870 |
| 1.0615 | 2.1230 | 72000 | 1.0855 |
| 1.043 | 2.1377 | 72500 | 1.0864 |
| 1.0476 | 2.1524 | 73000 | 1.0853 |
| 1.0105 | 2.1672 | 73500 | 1.0860 |
| 1.0314 | 2.1819 | 74000 | 1.0860 |
| 1.0527 | 2.1967 | 74500 | 1.0856 |
| 1.0589 | 2.2114 | 75000 | 1.0861 |
| 1.1094 | 2.2262 | 75500 | 1.0856 |
| 1.0562 | 2.2409 | 76000 | 1.0846 |
| 1.0623 | 2.2556 | 76500 | 1.0846 |
| 1.0518 | 2.2704 | 77000 | 1.0847 |
| 1.0461 | 2.2851 | 77500 | 1.0842 |
| 1.0185 | 2.2999 | 78000 | 1.0835 |
| 1.0673 | 2.3146 | 78500 | 1.0838 |
| 1.0243 | 2.3294 | 79000 | 1.0838 |
| 1.0381 | 2.3441 | 79500 | 1.0831 |
| 1.0179 | 2.3588 | 80000 | 1.0822 |
| 1.0524 | 2.3736 | 80500 | 1.0821 |
| 1.0364 | 2.3883 | 81000 | 1.0819 |
| 1.0421 | 2.4031 | 81500 | 1.0810 |
| 1.0878 | 2.4178 | 82000 | 1.0809 |
| 1.061 | 2.4326 | 82500 | 1.0807 |
| 1.004 | 2.4473 | 83000 | 1.0811 |
| 1.0488 | 2.4620 | 83500 | 1.0803 |
| 1.0406 | 2.4768 | 84000 | 1.0802 |
| 1.0586 | 2.4915 | 84500 | 1.0799 |
| 1.0113 | 2.5063 | 85000 | 1.0797 |
| 1.0478 | 2.5210 | 85500 | 1.0796 |
| 1.0241 | 2.5358 | 86000 | 1.0794 |
| 1.0523 | 2.5505 | 86500 | 1.0790 |
| 1.0275 | 2.5652 | 87000 | 1.0787 |
| 1.0601 | 2.5800 | 87500 | 1.0787 |
| 1.0519 | 2.5947 | 88000 | 1.0785 |
| 1.0243 | 2.6095 | 88500 | 1.0783 |
| 1.0071 | 2.6242 | 89000 | 1.0782 |
| 1.0251 | 2.6390 | 89500 | 1.0779 |
| 1.0334 | 2.6537 | 90000 | 1.0776 |
| 1.0144 | 2.6684 | 90500 | 1.0777 |
| 1.0212 | 2.6832 | 91000 | 1.0775 |
| 1.0367 | 2.6979 | 91500 | 1.0774 |
| 1.0551 | 2.7127 | 92000 | 1.0771 |
| 1.0576 | 2.7274 | 92500 | 1.0772 |
| 1.0058 | 2.7421 | 93000 | 1.0772 |
| 1.061 | 2.7569 | 93500 | 1.0768 |
| 1.0237 | 2.7716 | 94000 | 1.0767 |
| 1.0262 | 2.7864 | 94500 | 1.0767 |
| 1.0558 | 2.8011 | 95000 | 1.0767 |
| 1.0223 | 2.8159 | 95500 | 1.0768 |
| 1.0122 | 2.8306 | 96000 | 1.0769 |
| 1.0324 | 2.8453 | 96500 | 1.0765 |
| 1.0924 | 2.8601 | 97000 | 1.0766 |
| 1.0757 | 2.8748 | 97500 | 1.0765 |
| 1.0703 | 2.8896 | 98000 | 1.0766 |
| 1.0424 | 2.9043 | 98500 | 1.0766 |
| 1.055 | 2.9191 | 99000 | 1.0765 |
| 1.0556 | 2.9338 | 99500 | 1.0765 |
| 1.0383 | 2.9485 | 100000 | 1.0765 |
| 1.0245 | 2.9633 | 100500 | 1.0765 |
| 1.0212 | 2.9780 | 101000 | 1.0765 |
| 1.0432 | 2.9928 | 101500 | 1.0765 |
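
Validation loss converts to perplexity via exp(loss); a quick check on the final value, assuming the reported number is the standard mean per-token cross-entropy in nats:

```python
import math

# Final validation loss from the table above.
final_eval_loss = 1.0765
print(f"perplexity ~= {math.exp(final_eval_loss):.2f}")  # ~= 2.93
```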

### Framework versions

- Transformers 4.44.0
- Pytorch 2.4.0+cu121
- Datasets 2.21.0
- Tokenizers 0.19.1