MT5-large_NO-idun-20epoch-earlystopping

This model is a fine-tuned version of google/mt5-large on an unknown dataset. It achieves the following results on the evaluation set (a usage sketch follows the list):

  • Loss: 1.6061
  • Rouge1: 41.2146
  • Rouge2: 18.153
  • RougeL: 28.4036
  • RougeLsum: 36.8514
  • Gen Len: 111.1064
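
The ROUGE scores and generation length above indicate a summarization model. Below is a minimal inference sketch, assuming a standard seq2seq summarization setup; the input text and decoding parameters are illustrative, not taken from this card:

```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "Akselssss/MT5-large_NO-idun-20epoch-earlystopping"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Hypothetical input document; the training/evaluation data is not documented.
text = "Artikkelen som skal oppsummeres ..."

inputs = tokenizer(text, return_tensors="pt", truncation=True)
with torch.no_grad():
    summary_ids = model.generate(
        **inputs,
        max_new_tokens=128,  # eval Gen Len above averages ~111 tokens
        num_beams=4,         # illustrative decoding choice, not from the card
    )
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```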

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the training-setup sketch after this list):

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20
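
A sketch of a Seq2SeqTrainingArguments/Seq2SeqTrainer configuration matching these hyperparameters. The evaluation/save strategy, early-stopping patience, output path, and dataset variables are assumptions (the card's name mentions early stopping, and the results table below suggests roughly per-epoch evaluation), not documented settings:

```python
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    EarlyStoppingCallback,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("google/mt5-large")
model = AutoModelForSeq2SeqLM.from_pretrained("google/mt5-large")

args = Seq2SeqTrainingArguments(
    output_dir="mt5-large-no-idun",  # hypothetical output path
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=4,   # 2 x 4 = total train batch size 8
    num_train_epochs=20,
    lr_scheduler_type="linear",
    seed=42,
    evaluation_strategy="epoch",     # assumed: the results table logs once per epoch
    save_strategy="epoch",
    load_best_model_at_end=True,     # required for early stopping
    metric_for_best_model="eval_loss",
    predict_with_generate=True,      # required to compute ROUGE during eval
)

# Placeholders: the tokenized train/eval datasets are not documented in this card.
train_dataset = eval_dataset = ...

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    tokenizer=tokenizer,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],  # patience value assumed
)
trainer.train()
```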

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLsum | Gen Len  |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:--------:|
| No log        | 0.98  | 46   | 13.1717         | 15.636  | 4.987   | 10.443  | 14.3841   | 119.9149 |
| No log        | 1.99  | 93   | 7.7651          | 10.907  | 1.4566  | 6.9291  | 10.152    | 127.0    |
| No log        | 2.99  | 140  | 6.4230          | 18.8954 | 0.4957  | 13.2179 | 17.1171   | 127.0    |
| No log        | 4.0   | 187  | 2.1315          | 37.1306 | 13.4104 | 21.2261 | 33.0925   | 127.0    |
| No log        | 4.98  | 233  | 1.7761          | 37.6703 | 14.7962 | 22.835  | 34.0213   | 113.5638 |
| No log        | 5.99  | 280  | 1.6807          | 38.6245 | 15.7401 | 24.5743 | 34.5933   | 113.2340 |
| No log        | 6.99  | 327  | 1.6484          | 38.7899 | 15.737  | 24.9265 | 34.6166   | 114.3404 |
| No log        | 8.0   | 374  | 1.6156          | 39.3812 | 15.7133 | 24.979  | 35.2788   | 120.0319 |
| No log        | 8.98  | 420  | 1.6138          | 40.0966 | 17.4991 | 26.5925 | 36.5511   | 117.2234 |
| No log        | 9.99  | 467  | 1.6152          | 40.3623 | 17.7244 | 27.0847 | 36.2108   | 113.2128 |
| 6.3618        | 10.99 | 514  | 1.6102          | 41.2763 | 18.0108 | 27.6185 | 37.1836   | 113.1064 |
| 6.3618        | 12.0  | 561  | 1.6070          | 41.2369 | 17.8711 | 27.3781 | 36.9853   | 115.7766 |
| 6.3618        | 12.98 | 607  | 1.6087          | 42.0737 | 18.414  | 27.8849 | 38.1238   | 113.3404 |
| 6.3618        | 13.99 | 654  | 1.6038          | 41.4279 | 17.8899 | 27.79   | 36.929    | 115.1383 |
| 6.3618        | 14.99 | 701  | 1.6061          | 40.8051 | 17.4437 | 27.1414 | 36.494    | 113.8936 |
| 6.3618        | 16.0  | 748  | 1.6074          | 41.8104 | 18.0504 | 27.934  | 37.3843   | 114.8511 |
| 6.3618        | 16.98 | 794  | 1.6053          | 41.4314 | 17.955  | 27.7884 | 36.9083   | 114.3830 |
| 6.3618        | 17.99 | 841  | 1.6057          | 41.8533 | 18.0219 | 27.7616 | 37.4008   | 113.2128 |
| 6.3618        | 18.99 | 888  | 1.6060          | 41.5846 | 18.3563 | 28.4177 | 37.1366   | 112.1915 |
| 6.3618        | 19.68 | 920  | 1.6061          | 41.2146 | 18.153  | 28.4036 | 36.8514   | 111.1064 |
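
The ROUGE columns above are reported on a 0-100 scale. A sketch of how such scores are typically computed with the evaluate library (an assumption; the card does not state the evaluation code, and the prediction/reference pairs here are placeholders):

```python
import evaluate

rouge = evaluate.load("rouge")

# Hypothetical prediction/reference pairs; the real evaluation set is not documented.
predictions = ["en generert oppsummering"]
references = ["en referanseoppsummering"]

scores = rouge.compute(predictions=predictions, references=references)
# Keys map onto the table columns: rouge1, rouge2, rougeL, rougeLsum (values in [0, 1])
print({k: round(v * 100, 4) for k, v in scores.items()})
```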

Framework versions

  • Transformers 4.32.1
  • Pytorch 2.3.0+cu121
  • Datasets 2.12.0
  • Tokenizers 0.13.2