mt5-small-clara-med

This model is a fine-tuned version of google/mt5-small on the CLARA-MeD dataset. It achieves the following results on the evaluation set:

  • Loss: 1.9850
  • Rouge1: 33.0363
  • Rouge2: 19.0613
  • Rougel: 30.295
  • Rougelsum: 30.2898
  • SARI: 40.7094
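
For readers unfamiliar with the ROUGE scores above, the following is a minimal sketch of ROUGE-1 F1 as clipped unigram overlap, scaled to 0-100 to match the numbers reported here. It assumes lowercasing and whitespace tokenization. The evaluation script behind this card is not shown, so its tokenizer and any stemming may differ; the function name `rouge1_f1` and the example sentences are illustrative only.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1 F1 (unigram overlap), returned on a 0-100 scale."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Clipped overlap: a unigram counts at most as often as it occurs in both texts.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 100.0 * 2 * precision * recall / (precision + recall)


# Hypothetical sentence pair, not taken from the CLARA-MeD data.
score = rouge1_f1(
    "the drug lowers blood pressure",
    "this drug lowers high blood pressure",
)
print(f"ROUGE-1 F1: {score:.2f}")
```

Here four of the five predicted unigrams appear in the six-token reference, giving precision 0.8 and recall of about 0.67.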

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5.6e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 30
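
To make the `lr_scheduler_type: linear` setting concrete, here is a rough sketch of a linear decay from the peak learning rate to zero. It assumes no warmup steps (the card does not list any) and takes the total of 5700 optimizer steps from the final row of the training results. `linear_lr` is a hypothetical helper written for illustration, not the Transformers API.

```python
PEAK_LR = 5.6e-05    # learning_rate from the list above
TOTAL_STEPS = 5700   # 30 epochs x 190 steps per epoch, per the training results
WARMUP_STEPS = 0     # assumption: no warmup is listed on the card


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step under linear warmup, then linear decay."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


# Print the scheduled rate at the start of training and after every 10 epochs.
for epoch in (0, 10, 20, 30):
    print(f"epoch {epoch:2d}: lr = {linear_lr(epoch * 190):.3e}")
```

Under these assumptions the rate is halved by the midpoint of training (step 2850) and reaches zero at the last step.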

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|---|---|---|---|---|---|---|---|
| No log | 1.0 | 190 | 3.0286 | 18.0709 | 7.727 | 16.1995 | 16.3348 |
| No log | 2.0 | 380 | 2.4754 | 24.9167 | 13.0501 | 22.3889 | 22.4724 |
| 6.79 | 3.0 | 570 | 2.3542 | 29.9908 | 15.9829 | 26.3751 | 26.4343 |
| 6.79 | 4.0 | 760 | 2.2894 | 30.4435 | 16.3176 | 27.1801 | 27.1926 |
| 3.1288 | 5.0 | 950 | 2.2440 | 30.8602 | 16.8033 | 27.8195 | 27.8355 |
| 3.1288 | 6.0 | 1140 | 2.1772 | 31.4202 | 17.3253 | 28.3394 | 28.3699 |
| 3.1288 | 7.0 | 1330 | 2.1584 | 31.5591 | 17.7302 | 28.618 | 28.6189 |
| 2.7919 | 8.0 | 1520 | 2.1286 | 31.6211 | 17.7423 | 28.7218 | 28.7462 |
| 2.7919 | 9.0 | 1710 | 2.1031 | 31.9724 | 18.017 | 29.0754 | 29.0744 |
| 2.6007 | 10.0 | 1900 | 2.0947 | 32.1588 | 18.2474 | 29.2957 | 29.2956 |
| 2.6007 | 11.0 | 2090 | 2.0914 | 32.4959 | 18.4197 | 29.6052 | 29.609 |
| 2.6007 | 12.0 | 2280 | 2.0726 | 32.6673 | 18.8962 | 29.9145 | 29.9122 |
| 2.4911 | 13.0 | 2470 | 2.0487 | 32.4461 | 18.6804 | 29.6224 | 29.6274 |
| 2.4911 | 14.0 | 2660 | 2.0436 | 32.8393 | 19.0315 | 30.1024 | 30.1097 |
| 2.4168 | 15.0 | 2850 | 2.0229 | 32.8235 | 18.9549 | 30.0699 | 30.0605 |
| 2.4168 | 16.0 | 3040 | 2.0253 | 32.8584 | 18.8602 | 30.0582 | 30.0712 |
| 2.4168 | 17.0 | 3230 | 2.0177 | 32.7145 | 18.9059 | 30.0436 | 30.0771 |
| 2.3452 | 18.0 | 3420 | 2.0151 | 32.6874 | 18.8339 | 29.9739 | 30.0004 |
| 2.3452 | 19.0 | 3610 | 2.0138 | 32.516 | 18.6562 | 29.7823 | 29.7951 |
| 2.302 | 20.0 | 3800 | 2.0085 | 32.8117 | 18.8208 | 30.0902 | 30.1282 |
| 2.302 | 21.0 | 3990 | 2.0043 | 32.7633 | 18.8364 | 30.0619 | 30.0781 |
| 2.302 | 22.0 | 4180 | 1.9972 | 32.9786 | 19.0354 | 30.2166 | 30.2286 |
| 2.2641 | 23.0 | 4370 | 1.9927 | 33.0222 | 19.0501 | 30.2716 | 30.2951 |
| 2.2641 | 24.0 | 4560 | 1.9905 | 32.9557 | 18.9958 | 30.1988 | 30.2004 |
| 2.2366 | 25.0 | 4750 | 1.9897 | 33.0429 | 18.9806 | 30.2861 | 30.3012 |
| 2.2366 | 26.0 | 4940 | 1.9850 | 33.047 | 19.118 | 30.3437 | 30.3368 |
| 2.2366 | 27.0 | 5130 | 1.9860 | 33.0736 | 19.0805 | 30.3311 | 30.3476 |
| 2.2157 | 28.0 | 5320 | 1.9870 | 33.0698 | 19.0649 | 30.2959 | 30.3093 |
| 2.2157 | 29.0 | 5510 | 1.9844 | 33.0376 | 19.0397 | 30.299 | 30.2839 |
| 2.2131 | 30.0 | 5700 | 1.9850 | 33.0363 | 19.0613 | 30.295 | 30.2898 |
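
The summary at the top of the card reports the final (epoch 30) checkpoint. As a small illustration of how the log could be scanned for the best epoch per metric, the sketch below copies the last five rows of the table into a list. The names `rows` and `best_epoch` are hypothetical, and nothing here is part of the original training script.

```python
# (epoch, validation_loss, rouge1, rouge2, rougeL, rougeLsum), copied from epochs 26-30
rows = [
    (26, 1.9850, 33.0470, 19.1180, 30.3437, 30.3368),
    (27, 1.9860, 33.0736, 19.0805, 30.3311, 30.3476),
    (28, 1.9870, 33.0698, 19.0649, 30.2959, 30.3093),
    (29, 1.9844, 33.0376, 19.0397, 30.2990, 30.2839),
    (30, 1.9850, 33.0363, 19.0613, 30.2950, 30.2898),
]


def best_epoch(rows, column, lower_is_better=False):
    """Return the epoch whose value in the given column is best."""
    pick = min if lower_is_better else max
    return pick(rows, key=lambda row: row[column])[0]


best_loss_epoch = best_epoch(rows, column=1, lower_is_better=True)
best_rouge1_epoch = best_epoch(rows, column=2)
best_rouge2_epoch = best_epoch(rows, column=3)
print(best_loss_epoch, best_rouge1_epoch, best_rouge2_epoch)
```

On these rows the lowest validation loss falls at epoch 29, the highest Rouge1 at epoch 27, and the highest Rouge2 at epoch 26, so the differences among the last few checkpoints are small.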

Framework versions

  • Transformers 4.25.1
  • Pytorch 1.13.0
  • Datasets 2.8.0
  • Tokenizers 0.12.1