mlong-t5-tglobal-large

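This model is a fine-tuned version of agemagician/mlong-t5-tglobal-large. The training dataset is not specified (the auto-generated card records it as None). The model achieves the following results on the evaluation set, which match the epoch 8 row (step 8000) of the training results table: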

  • Loss: 1.8858
  • Rouge1: 32.6402
  • Rouge2: 14.4404
  • RougeL: 24.6794
  • RougeLSum: 26.5654
  • Gen Len: 65.807
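
To make the ROUGE figures above easier to interpret, here is a minimal sketch of a ROUGE-1 F-score: clipped unigram overlap between a generated summary and a reference, scaled to 0-100 as in this card. It assumes plain whitespace tokenization and lowercasing, which is a simplification; the evaluation behind the reported numbers most likely used a full ROUGE implementation with its own tokenizer, so this function is illustrative and will not reproduce the scores exactly. The name `rouge1_f` is hypothetical.

```python
from collections import Counter


def rouge1_f(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1 F-score on a 0-100 scale (whitespace tokens only)."""
    pred_counts = Counter(prediction.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Clipped overlap: each unigram counts at most as often as it
    # appears in both the prediction and the reference.
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 100.0 * 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    score = rouge1_f("the cat lay on the mat", "the cat sat on the mat")
    print(f"ROUGE-1 F: {score:.2f}")
```

In the toy example, five of the six unigrams overlap in each direction, so precision and recall are both 5/6.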

Model description

More information needed

Intended uses & limitations

More information needed
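
Until the authors document intended uses, the following is a minimal, hedged sketch of how a checkpoint like this is typically loaded with the Transformers sequence-to-sequence classes. The repository id is the one shown on the model page; the repository name and the ROUGE metrics suggest a summarization model, but that is an inference, not something the card states. The function name `summarize`, the input length limit, the beam count, and the generation length are all assumptions chosen for illustration.

```python
MODEL_ID = "biunlp/mT5LongHeSum-large"  # repository id as listed on the model page


def summarize(text: str, max_input_tokens: int = 4096, max_new_tokens: int = 128) -> str:
    """Generate a summary for `text` (illustrative defaults, not the authors' settings)."""
    # Imported lazily so this module can be read or imported without
    # the heavy dependencies installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Truncate long inputs; the limit here is an assumption.
    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        max_length=max_input_tokens,
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize(document_text)` downloads the 1.24B-parameter weights on first use, so expect a large download and substantial memory use in F32.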

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 30
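
As a sketch of how the hyperparameters above might map onto a Transformers training configuration, the block below collects them in a dictionary, shows the learning-rate decay implied by `lr_scheduler_type: linear`, and builds a `Seq2SeqTrainingArguments` object. Several points are assumptions: the card lists no warmup, so zero warmup steps are assumed; the batch sizes are treated as per-device values; `output_dir` and `predict_with_generate` are illustrative choices; and the helper names `linear_lr` and `build_training_args` are hypothetical.

```python
HYPERPARAMS = {
    "learning_rate": 5e-05,
    "train_batch_size": 8,
    "eval_batch_size": 32,
    "seed": 42,
    "adam_betas": (0.9, 0.999),
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_epochs": 30,
}


def linear_lr(step: int, total_steps: int, base_lr: float = HYPERPARAMS["learning_rate"],
              warmup_steps: int = 0) -> float:
    """Learning rate at `step` under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


def build_training_args(output_dir: str = "mlong-t5-finetune"):
    """Build training arguments from HYPERPARAMS (requires transformers and torch)."""
    # Imported lazily so the pure-Python parts above run without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,  # assumption: any writable directory
        learning_rate=HYPERPARAMS["learning_rate"],
        per_device_train_batch_size=HYPERPARAMS["train_batch_size"],
        per_device_eval_batch_size=HYPERPARAMS["eval_batch_size"],
        seed=HYPERPARAMS["seed"],
        adam_beta1=HYPERPARAMS["adam_betas"][0],
        adam_beta2=HYPERPARAMS["adam_betas"][1],
        adam_epsilon=HYPERPARAMS["adam_epsilon"],
        lr_scheduler_type=HYPERPARAMS["lr_scheduler_type"],
        num_train_epochs=HYPERPARAMS["num_epochs"],
        predict_with_generate=True,  # assumption: needed to compute ROUGE during eval
    )
```

With zero warmup, `linear_lr(500, 1000)` is half the base rate; the total step count depends on the dataset size, which the card does not give.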

Training results

| Training Loss | Epoch | Step | Gen Len | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLSum |
|---|---|---|---|---|---|---|---|---|
| 2.5919 | 1.0 | 1050 | 61.5895 | 1.9940 | 30.603 | 12.7279 | 22.8958 | 24.5756 |
| 2.3025 | 2.0 | 2100 | 96.4781 | 1.9429 | 30.2088 | 12.8612 | 22.4477 | 24.6023 |
| 2.1456 | 3.0 | 3150 | 80.6381 | 1.8979 | 31.4743 | 13.8002 | 23.6389 | 25.7835 |
| 1.9977 | 4.0 | 4200 | 72.9752 | 1.8858 | 32.3099 | 14.3439 | 24.3416 | 26.2897 |
| 1.9059 | 5.0 | 5250 | 68.4971 | 1.8878 | 32.2531 | 14.0683 | 24.3766 | 26.1912 |
| 1.8521 | 6.0 | 6300 | 68.9524 | 1.8892 | 32.3429 | 14.0016 | 24.2874 | 26.3216 |
| 1.7472 | 7.0 | 7000 | 60.46 | 1.8865 | 32.8966 | 14.8847 | 25.1771 | 26.9613 |
| 1.7018 | 8.0 | 8000 | 65.807 | 1.8858 | 32.6402 | 14.4404 | 24.6794 | 26.5654 |
| 1.6337 | 9.0 | 9000 | 79.875 | 1.9019 | 32.2069 | 13.8683 | 24.0734 | 26.353 |
| 1.5773 | 10.0 | 10000 | 65.88 | 1.9043 | 32.8499 | 14.5395 | 24.8736 | 26.9515 |
| 1.5238 | 11.0 | 11000 | 63.208 | 1.9148 | 32.8182 | 14.322 | 24.7011 | 26.5718 |
| 1.4779 | 12.0 | 12000 | 63.937 | 1.9297 | 33.2751 | 14.7214 | 25.0329 | 26.9804 |
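
The training results can also be inspected programmatically. The sketch below transcribes the rows above and finds the epochs with the lowest validation loss and the highest Rouge1. It only illustrates how the table relates to the headline metrics; the card does not state how the reported checkpoint was selected, and the names `Row`, `lowest_val_loss_epochs`, and `best_rouge1_epoch` are hypothetical.

```python
from collections import namedtuple

Row = namedtuple("Row", "train_loss epoch step gen_len val_loss rouge1")

# Values transcribed from the training results (Rouge2, RougeL, RougeLSum omitted).
ROWS = [
    Row(2.5919, 1.0, 1050, 61.5895, 1.9940, 30.603),
    Row(2.3025, 2.0, 2100, 96.4781, 1.9429, 30.2088),
    Row(2.1456, 3.0, 3150, 80.6381, 1.8979, 31.4743),
    Row(1.9977, 4.0, 4200, 72.9752, 1.8858, 32.3099),
    Row(1.9059, 5.0, 5250, 68.4971, 1.8878, 32.2531),
    Row(1.8521, 6.0, 6300, 68.9524, 1.8892, 32.3429),
    Row(1.7472, 7.0, 7000, 60.46, 1.8865, 32.8966),
    Row(1.7018, 8.0, 8000, 65.807, 1.8858, 32.6402),
    Row(1.6337, 9.0, 9000, 79.875, 1.9019, 32.2069),
    Row(1.5773, 10.0, 10000, 65.88, 1.9043, 32.8499),
    Row(1.5238, 11.0, 11000, 63.208, 1.9148, 32.8182),
    Row(1.4779, 12.0, 12000, 63.937, 1.9297, 33.2751),
]


def lowest_val_loss_epochs(rows):
    """Return every epoch that attains the minimum validation loss."""
    best = min(r.val_loss for r in rows)
    return [r.epoch for r in rows if r.val_loss == best]


def best_rouge1_epoch(rows):
    """Return the epoch with the highest Rouge1 score."""
    return max(rows, key=lambda r: r.rouge1).epoch


if __name__ == "__main__":
    print("Lowest validation loss at epochs:", lowest_val_loss_epochs(ROWS))
    print("Highest Rouge1 at epoch:", best_rouge1_epoch(ROWS))
```

Epochs 4 and 8 tie on validation loss at 1.8858, and the headline metrics match epoch 8. Validation loss rises after that point while training loss keeps falling, although Rouge1 peaks later, at epoch 12.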

Framework versions

  • Transformers 4.37.2
  • PyTorch 2.2.0+cu121
  • Datasets 2.16.1
  • Tokenizers 0.15.1

Model size

  • Parameters: 1.24B
  • Tensor type: F32 (Safetensors)

Model repository: biunlp/mT5LongHeSum-large