mt5_small_lg_en / README.md
metadata
license: apache-2.0
base_model: t5-small
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: mt5_small_lg_en
    results: []


mt5_small_lg_en

This model is a fine-tuned version of t5-small. The training dataset was not recorded (the auto-generated card lists it as "None"). After the final epoch (epoch 30), it achieves the following results on the evaluation set:

  • Loss: 0.2071
  • Bleu: 1.1669
  • Gen Len: 6.6138

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 30
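As a rough illustration of what `lr_scheduler_type: linear` implies for these settings, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps (the card does not record a warmup value) and takes the total of 25440 steps and the 848 steps per epoch from the training results table. The function name `linear_lr` is hypothetical and is not part of any library.

```python
def linear_lr(step, base_lr=2e-5, total_steps=25440, warmup_steps=0):
    """Learning rate at `step`: linear warmup, then linear decay to zero.

    warmup_steps=0 is an assumption; the card does not state a warmup value.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start of training and after epochs 10, 20, and 30
# (848 optimizer steps per epoch, per the results table).
for epoch in (0, 10, 20, 30):
    step = epoch * 848
    print(f"epoch {epoch:2d}  step {step:5d}  lr {linear_lr(step):.3e}")
```

Under these assumptions the rate starts at 2e-05, falls by half at the midpoint of training (step 12720), and reaches zero at the last step.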

Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu   | Gen Len |
|---------------|-------|-------|-----------------|--------|---------|
| 1.2558        | 1.0   | 848   | 0.2899          | 0.0653 | 16.1851 |
| 0.3023        | 2.0   | 1696  | 0.2764          | 0.0872 | 12.2714 |
| 0.289         | 3.0   | 2544  | 0.2681          | 0.1524 | 9.4625  |
| 0.2825        | 4.0   | 3392  | 0.2623          | 0.1648 | 8.42    |
| 0.2766        | 5.0   | 4240  | 0.2564          | 0.2707 | 8.8613  |
| 0.2695        | 6.0   | 5088  | 0.2507          | 0.3064 | 8.2628  |
| 0.2661        | 7.0   | 5936  | 0.2454          | 0.314  | 8.3656  |
| 0.2582        | 8.0   | 6784  | 0.2408          | 0.5769 | 8.2283  |
| 0.2536        | 9.0   | 7632  | 0.2367          | 0.4428 | 7.6052  |
| 0.2514        | 10.0  | 8480  | 0.2332          | 0.5161 | 6.9993  |
| 0.248         | 11.0  | 9328  | 0.2296          | 0.6246 | 7.1652  |
| 0.2432        | 12.0  | 10176 | 0.2268          | 0.6372 | 7.006   |
| 0.2393        | 13.0  | 11024 | 0.2244          | 0.681  | 6.7001  |
| 0.2367        | 14.0  | 11872 | 0.2216          | 0.7667 | 6.8613  |
| 0.2339        | 15.0  | 12720 | 0.2193          | 0.7835 | 6.8739  |
| 0.2313        | 16.0  | 13568 | 0.2178          | 0.7668 | 6.6861  |
| 0.2307        | 17.0  | 14416 | 0.2160          | 0.81   | 6.7837  |
| 0.2279        | 18.0  | 15264 | 0.2145          | 1.0551 | 6.7193  |
| 0.2258        | 19.0  | 16112 | 0.2135          | 1.0511 | 6.6828  |
| 0.2245        | 20.0  | 16960 | 0.2120          | 0.8869 | 6.7757  |
| 0.2226        | 21.0  | 17808 | 0.2112          | 0.8999 | 6.6948  |
| 0.2216        | 22.0  | 18656 | 0.2104          | 0.9144 | 6.6264  |
| 0.222         | 23.0  | 19504 | 0.2094          | 0.9253 | 6.6317  |
| 0.2202        | 24.0  | 20352 | 0.2090          | 0.9439 | 6.5109  |
| 0.2199        | 25.0  | 21200 | 0.2083          | 0.9589 | 6.6549  |
| 0.2187        | 26.0  | 22048 | 0.2079          | 0.9446 | 6.6138  |
| 0.2186        | 27.0  | 22896 | 0.2076          | 0.9708 | 6.6065  |
| 0.218         | 28.0  | 23744 | 0.2074          | 0.966  | 6.5707  |
| 0.2173        | 29.0  | 24592 | 0.2072          | 1.1663 | 6.6085  |
| 0.2181        | 30.0  | 25440 | 0.2071          | 1.1669 | 6.6138  |
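The step counts above allow a rough estimate of the training set size, which the card does not otherwise report. The sketch below is a back-of-the-envelope calculation, not a recorded fact: it assumes a single device, no gradient accumulation, and that the last partial batch of each epoch is kept, so that steps per epoch equals ceil(n / batch_size). The helper `dataset_size_bounds` is hypothetical.

```python
import math

steps_per_epoch = 848   # step count after epoch 1 in the results table
batch_size = 16         # train_batch_size from the hyperparameters
num_epochs = 30
total_steps = 25440     # step count after epoch 30


def dataset_size_bounds(steps, batch):
    """Smallest and largest n for which ceil(n / batch) == steps."""
    return (steps - 1) * batch + 1, steps * batch


# The per-epoch step count is consistent with the final step count.
assert steps_per_epoch * num_epochs == total_steps

low, high = dataset_size_bounds(steps_per_epoch, batch_size)
assert math.ceil(low / batch_size) == steps_per_epoch
assert math.ceil(high / batch_size) == steps_per_epoch
print(f"estimated training examples: {low} to {high}")
```

If training instead used several devices or gradient accumulation, the effective batch per step would be larger and the estimate would scale up accordingly.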

Framework versions

  • Transformers 4.42.3
  • Pytorch 2.1.2
  • Datasets 2.20.0
  • Tokenizers 0.19.1