---
tags:
  - generated_from_trainer
metrics:
  - wer
  - bleu
model-index:
  - name: geez_t5-15k
    results: []
---

# geez_t5-15k

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set (see the metric-computation sketch after the list):

- Loss: 1.3233
- WER: 0.2209
- CER: 0.1381
- BLEU: 70.4059
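
The card does not include the evaluation script, so the snippet below is only a sketch of how WER, CER, and BLEU are typically computed with the Hugging Face `evaluate` library; the 0-100 scale of the BLEU figure above is consistent with `sacrebleu`, but that choice is an assumption.

```python
import evaluate

# Hypothetical evaluation setup; the actual script is not part of this card.
wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")
bleu_metric = evaluate.load("sacrebleu")  # reports BLEU on a 0-100 scale

predictions = ["decoded model output one", "decoded model output two"]  # placeholders
references = ["reference transcription one", "reference transcription two"]

print("WER:", wer_metric.compute(predictions=predictions, references=references))
print("CER:", cer_metric.compute(predictions=predictions, references=references))
# sacrebleu expects one list of references per prediction
print("BLEU:", bleu_metric.compute(
    predictions=predictions,
    references=[[r] for r in references],
)["score"])
```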

## Model description

More information needed

## Intended uses & limitations

More information needed
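
Until more information is provided, the sketch below shows how a checkpoint like this is typically loaded for inference. The repository id `Samuael/geez_t5-15k` and the use of the standard seq2seq auto classes are assumptions based on the model name, not details confirmed by this card.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Assumed repository id; the card itself does not state it.
model_id = "Samuael/geez_t5-15k"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Hypothetical input: the card does not document the task or input format.
inputs = tokenizer("your input text here", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```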

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (see the `Seq2SeqTrainingArguments` sketch after the list):

- learning_rate: 0.0005
- train_batch_size: 128
- eval_batch_size: 128
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 50
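
As a hedged reconstruction, these settings map onto `Seq2SeqTrainingArguments` in transformers 4.38 roughly as shown below. The argument names are the standard API, but `output_dir`, the per-epoch evaluation strategy (inferred from the results table below), and `predict_with_generate` are assumptions.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the listed hyperparameters as Seq2SeqTrainingArguments
# (transformers 4.38). Adam betas=(0.9, 0.999) and epsilon=1e-08 match
# the library defaults, so they are not set explicitly.
training_args = Seq2SeqTrainingArguments(
    output_dir="geez_t5-15k",      # assumption: not stated in the card
    learning_rate=5e-4,
    per_device_train_batch_size=128,
    per_device_eval_batch_size=128,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=50,
    evaluation_strategy="epoch",   # assumption: the table below logs one eval per epoch
    predict_with_generate=True,    # assumption: needed for WER/CER/BLEU on generated text
)
```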

### Training results

| Training Loss | Epoch | Step | Validation Loss | WER | CER | BLEU |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:-------:|
| 7.7095 | 1.0 | 145 | 7.7918 | 5.0898 | 3.9256 | 0.0023 |
| 7.0334 | 2.0 | 290 | 7.1199 | 5.0160 | 4.1855 | 0.0051 |
| 6.4831 | 3.0 | 435 | 6.6645 | 5.0207 | 3.8475 | 0.0214 |
| 6.1982 | 4.0 | 580 | 6.3920 | 4.5634 | 3.8489 | 0.0529 |
| 5.903 | 5.0 | 725 | 6.1877 | 4.5275 | 3.5050 | 0.0557 |
| 5.669 | 6.0 | 870 | 6.0360 | 4.9197 | 4.0028 | 0.0634 |
| 5.425 | 7.0 | 1015 | 5.8639 | 4.4216 | 3.7590 | 0.1208 |
| 5.2049 | 8.0 | 1160 | 5.7314 | 3.2761 | 2.6167 | 0.1783 |
| 5.0061 | 9.0 | 1305 | 5.6525 | 3.9136 | 3.2163 | 0.1433 |
| 4.8471 | 10.0 | 1450 | 5.5808 | 2.8054 | 2.4552 | 0.3077 |
| 4.6025 | 11.0 | 1595 | 5.4963 | 3.1738 | 2.8400 | 0.2473 |
| 4.4593 | 12.0 | 1740 | 5.4572 | 2.9939 | 2.6228 | 0.3764 |
| 4.3925 | 13.0 | 1885 | 5.3739 | 2.4268 | 2.0558 | 0.4943 |
| 4.2547 | 14.0 | 2030 | 5.3549 | 2.1811 | 1.9179 | 0.6141 |
| 4.2059 | 15.0 | 2175 | 5.3532 | 2.5793 | 2.2089 | 0.5485 |
| 4.0344 | 16.0 | 2320 | 5.3384 | 2.1161 | 1.8753 | 0.7106 |
| 3.8338 | 17.0 | 2465 | 5.3491 | 2.1119 | 1.9856 | 0.6538 |
| 3.8922 | 18.0 | 2610 | 5.3233 | 2.0402 | 1.8304 | 0.8877 |
| 3.6469 | 19.0 | 2755 | 5.3290 | 1.7011 | 1.4942 | 1.1830 |
| 2.8339 | 20.0 | 2900 | 4.1129 | 1.7063 | 1.4567 | 4.0465 |
| 1.4826 | 21.0 | 3045 | 2.3404 | 1.6510 | 1.4483 | 11.1205 |
| 0.8862 | 22.0 | 3190 | 1.6343 | 1.4432 | 1.2622 | 18.9607 |
| 0.603 | 23.0 | 3335 | 1.3605 | 1.1528 | 0.9975 | 27.6554 |
| 0.4701 | 24.0 | 3480 | 1.2962 | 1.0378 | 0.8913 | 31.5906 |
| 0.4302 | 25.0 | 3625 | 1.2630 | 0.8397 | 0.7215 | 38.0315 |
| 0.3239 | 26.0 | 3770 | 1.2441 | 0.6757 | 0.5460 | 44.0109 |
| 0.2679 | 27.0 | 3915 | 1.2520 | 0.6738 | 0.5478 | 44.8130 |
| 0.2543 | 28.0 | 4060 | 1.2496 | 0.6416 | 0.5215 | 46.1244 |
| 0.2113 | 29.0 | 4205 | 1.2534 | 0.5392 | 0.4282 | 50.5640 |
| 0.1811 | 30.0 | 4350 | 1.2870 | 0.6152 | 0.4961 | 47.6743 |
| 0.1676 | 31.0 | 4495 | 1.2657 | 0.5494 | 0.4411 | 50.7361 |
| 0.1523 | 32.0 | 4640 | 1.2986 | 0.5483 | 0.4476 | 50.8212 |
| 0.1468 | 33.0 | 4785 | 1.3057 | 0.4785 | 0.3744 | 54.2680 |
| 0.1375 | 34.0 | 4930 | 1.3025 | 0.4506 | 0.3545 | 55.8315 |
| 0.1259 | 35.0 | 5075 | 1.3367 | 0.4865 | 0.3899 | 54.1053 |
| 0.1194 | 36.0 | 5220 | 1.3196 | 0.4540 | 0.3581 | 55.4216 |
| 0.1116 | 37.0 | 5365 | 1.3104 | 0.3943 | 0.3011 | 58.6213 |
| 0.0968 | 38.0 | 5510 | 1.3477 | 0.3834 | 0.2953 | 59.3219 |
| 0.0981 | 39.0 | 5655 | 1.3217 | 0.4059 | 0.3112 | 58.2604 |
| 0.0938 | 40.0 | 5800 | 1.3304 | 0.4132 | 0.3205 | 57.7388 |
| 0.0823 | 41.0 | 5945 | 1.3023 | 0.3432 | 0.2481 | 61.8713 |
| 0.0786 | 42.0 | 6090 | 1.3138 | 0.2974 | 0.2027 | 64.6092 |
| 0.0766 | 43.0 | 6235 | 1.3324 | 0.3680 | 0.2768 | 60.6454 |
| 0.0765 | 44.0 | 6380 | 1.3266 | 0.3359 | 0.2359 | 62.7278 |
| 0.0718 | 45.0 | 6525 | 1.3440 | 0.3000 | 0.2163 | 64.6481 |
| 0.0637 | 46.0 | 6670 | 1.3283 | 0.2628 | 0.1782 | 67.2375 |
| 0.0658 | 47.0 | 6815 | 1.3331 | 0.2605 | 0.1721 | 67.1960 |
| 0.0643 | 48.0 | 6960 | 1.3198 | 0.2618 | 0.1780 | 67.4730 |
| 0.0682 | 49.0 | 7105 | 1.3196 | 0.2732 | 0.1876 | 66.2931 |
| 0.0605 | 50.0 | 7250 | 1.3233 | 0.2209 | 0.1381 | 70.4059 |

### Framework versions

- Transformers 4.38.2
- Pytorch 2.2.1+cu121
- Datasets 2.18.0
- Tokenizers 0.15.2