
mt5_small_lg_inf_en

This model is a fine-tuned version of MubarakB/mt5_small_lg_en. The training dataset is not specified (the card generator recorded it as None). It achieves the following results on the evaluation set:

  • Loss: 0.4301
  • Bleu: 0.3034
  • Gen Len: 8.1551
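
The card does not show how to load the checkpoint. The following is a minimal sketch, assuming it loads through the standard Transformers Auto classes for sequence-to-sequence models. The translate helper, the max_new_tokens value, and the idea that the input should be Luganda text (inferred only from "lg" and "en" in the model name) are assumptions, not details confirmed by the card. Whether the model expects a task prefix is also unknown.

```python
MODEL_ID = "MubarakB/mt5_small_lg_inf_en"  # repository name from this card


def translate(text: str, max_new_tokens: int = 32) -> str:
    """Hypothetical helper: generate one output sequence for one input string."""
    # Imported lazily so the heavy dependencies load only when the helper is called.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Tokenize to PyTorch tensors and generate with default decoding settings.
    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop padding and end-of-sequence tokens from the decoded text.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling translate("Oli otya?") downloads the checkpoint on first use. The reported Gen Len of about 8 tokens suggests the evaluation outputs were short, so a small max_new_tokens is likely enough for similar inputs.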

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
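
Although the data is not described, its rough size can be bounded from numbers reported elsewhere in this card: a train_batch_size of 16 and 138 optimizer steps per epoch. The sketch below assumes a single device, no gradient accumulation, and a final partial batch that is kept rather than dropped. None of these are stated in the card, so the result is an estimate only.

```python
import math

TRAIN_BATCH_SIZE = 16  # from the hyperparameter list
STEPS_PER_EPOCH = 138  # step count logged at epoch 1.0 in the results table


def example_count_bounds(steps: int, batch_size: int) -> tuple[int, int]:
    """Return the (low, high) example counts for which ceil(n / batch_size) == steps."""
    low = (steps - 1) * batch_size + 1
    high = steps * batch_size
    # Sanity check: both ends of the range produce the same number of steps.
    assert math.ceil(low / batch_size) == steps
    assert math.ceil(high / batch_size) == steps
    return low, high


low, high = example_count_bounds(STEPS_PER_EPOCH, TRAIN_BATCH_SIZE)
print(f"Estimated training set size: {low} to {high} examples")
```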

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 30
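
To make the linear schedule concrete, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps (the card lists no warmup setting) and takes 138 steps per epoch from the training results table. The linear_lr function is illustrative and is not the training script's code.

```python
LEARNING_RATE = 2e-05  # from the card
NUM_EPOCHS = 30        # from the card
STEPS_PER_EPOCH = 138  # from the results table (step 138 at epoch 1.0)
WARMUP_STEPS = 0       # assumption: no warmup is listed in the card

TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH  # 4140, matching the last logged step


def linear_lr(step: int) -> float:
    """Learning rate after `step` optimizer steps under linear decay to zero."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 to the peak rate during warmup.
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


for step in (0, 2070, 4140):
    print(step, linear_lr(step))
```

Under these assumptions the rate is 2e-05 at the start, half of that at step 2070 (end of epoch 15), and zero at step 4140.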

Training results

Training Loss  Epoch  Step  Validation Loss  Bleu    Gen Len
No log         1.0    138   0.4671           0.0646  9.4449
No log         2.0    276   0.4562           0.1318  7.8898
No log         3.0    414   0.4511           0.2119  7.9878
0.4729         4.0    552   0.4476           0.2133  8.1184
0.4729         5.0    690   0.4451           0.2128  8.0816
0.4729         6.0    828   0.4433           0.3272  7.9224
0.4729         7.0    966   0.4415           0.3383  7.6571
0.4479         8.0    1104  0.4401           0.3281  7.5347
0.4479         9.0    1242  0.4390           0.3296  7.4286
0.4479         10.0   1380  0.4378           0.3157  7.6
0.4418         11.0   1518  0.4367           0.3288  7.4327
0.4418         12.0   1656  0.4360           0.316   7.4857
0.4418         13.0   1794  0.4350           0.3167  7.4898
0.4418         14.0   1932  0.4342           0.3161  7.698
0.4347         15.0   2070  0.4337           0.316   7.849
0.4347         16.0   2208  0.4333           0.3177  7.6735
0.4347         17.0   2346  0.4326           0.3174  7.8082
0.4347         18.0   2484  0.4324           0.3167  7.8531
0.4315         19.0   2622  0.4319           0.3185  8.0163
0.4315         20.0   2760  0.4316           0.318   8.0449
0.4315         21.0   2898  0.4313           0.3171  8.0571
0.4289         22.0   3036  0.4311           0.3195  7.9837
0.4289         23.0   3174  0.4308           0.3188  8.049
0.4289         24.0   3312  0.4307           0.3048  8.0694
0.4289         25.0   3450  0.4304           0.3046  8.1306
0.4264         26.0   3588  0.4303           0.3041  8.1224
0.4264         27.0   3726  0.4302           0.3044  8.1592
0.4264         28.0   3864  0.4301           0.3046  8.1306
0.4256         29.0   4002  0.4301           0.3039  8.1429
0.4256         30.0   4140  0.4301           0.3034  8.1551
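
In these results the validation loss never increases from one epoch to the next, while Bleu peaks well before the final epoch. The sketch below copies the epoch, validation loss, and Bleu columns from the results and picks out the best epoch under each criterion. It is illustrative only: the card does not say whether intermediate checkpoints were saved or which metric was used to select the published weights.

```python
# (epoch, validation_loss, bleu) copied from the training results above
RESULTS = [
    (1, 0.4671, 0.0646), (2, 0.4562, 0.1318), (3, 0.4511, 0.2119),
    (4, 0.4476, 0.2133), (5, 0.4451, 0.2128), (6, 0.4433, 0.3272),
    (7, 0.4415, 0.3383), (8, 0.4401, 0.3281), (9, 0.4390, 0.3296),
    (10, 0.4378, 0.3157), (11, 0.4367, 0.3288), (12, 0.4360, 0.316),
    (13, 0.4350, 0.3167), (14, 0.4342, 0.3161), (15, 0.4337, 0.316),
    (16, 0.4333, 0.3177), (17, 0.4326, 0.3174), (18, 0.4324, 0.3167),
    (19, 0.4319, 0.3185), (20, 0.4316, 0.318), (21, 0.4313, 0.3171),
    (22, 0.4311, 0.3195), (23, 0.4308, 0.3188), (24, 0.4307, 0.3048),
    (25, 0.4304, 0.3046), (26, 0.4303, 0.3041), (27, 0.4302, 0.3044),
    (28, 0.4301, 0.3046), (29, 0.4301, 0.3039), (30, 0.4301, 0.3034),
]

# max/min return the first row on ties, i.e. the earliest epoch.
best_bleu = max(RESULTS, key=lambda row: row[2])
best_loss = min(RESULTS, key=lambda row: row[1])
final = RESULTS[-1]

print(f"Highest Bleu: {best_bleu[2]} at epoch {best_bleu[0]}")
print(f"Lowest validation loss: {best_loss[1]} first reached at epoch {best_loss[0]}")
print(f"Final epoch {final[0]}: loss {final[1]}, Bleu {final[2]}")
```

The highest Bleu (0.3383) occurs at epoch 7, whereas the reported final metrics come from epoch 30, so the two selection criteria point to different checkpoints.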

Framework versions

  • Transformers 4.45.1
  • Pytorch 2.4.0
  • Datasets 3.0.1
  • Tokenizers 0.20.0

Model size

  • Parameters: 60.5M
  • Tensor type: F32
  • Weights format: Safetensors

Model tree for MubarakB/mt5_small_lg_inf_en

  • Base model: google-t5/t5-small
  • Finetuned (2): this model