mlong-t5-tglobal-large

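This model is a fine-tuned version of agemagician/mlong-t5-tglobal-large. The fine-tuning dataset is not recorded (the auto-generated card lists it as None). It achieves the following results on the evaluation set, which match the epoch 8 row of the training results table: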

Loss: 1.8858
Rouge1: 32.6402
Rouge2: 14.4404
RougeL: 24.6794
RougeLSum: 26.5654
Gen Len: 65.807
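
For readers unfamiliar with the metric names, the following is a minimal sketch of how a ROUGE-N F1 score (the quantity behind Rouge1 and Rouge2) can be computed. It is an illustration only, not the scoring code used for this card: it assumes plain whitespace tokenization with no stemming or normalization, and the helper names `ngrams` and `rouge_n` are hypothetical. RougeL and RougeLSum rely on the longest common subsequence instead and are not covered here.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(reference, candidate, n=1):
    """ROUGE-N F1 on a 0-100 scale, using plain whitespace tokenization (an assumption)."""
    ref = ngrams(reference.split(), n)
    cand = ngrams(candidate.split(), n)
    overlap = sum((ref & cand).values())  # clipped n-gram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 100 * 2 * precision * recall / (precision + recall)


reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
print(round(rouge_n(reference, candidate, 1), 2))  # 83.33
print(round(rouge_n(reference, candidate, 2), 2))  # 60.0
```

Scores from a full ROUGE package can differ from this sketch because of tokenization, stemming, and multi-reference handling.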

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 32
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 30
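
To make the `lr_scheduler_type: linear` setting concrete, here is a small sketch of a linear schedule that decays from the base learning rate to zero. Several values are assumptions rather than facts from this card: no warmup is listed, so `warmup_steps` defaults to 0, and the total step count is taken as 1050 steps per epoch (the step logged at epoch 1.0) times the 30 configured epochs. The function name `linear_lr` is hypothetical.

```python
def linear_lr(step, total_steps, base_lr=5e-05, warmup_steps=0):
    """Learning rate at `step`: linear warmup (if any), then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Assumed horizon: 1050 steps per epoch times 30 epochs (not stated in the card).
TOTAL_STEPS = 1050 * 30

for epoch in (0, 10, 20, 30):
    step = epoch * 1050
    print(epoch, linear_lr(step, TOTAL_STEPS))
```

Under these assumptions the learning rate falls by the same amount every epoch and reaches zero only at the end of epoch 30.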

Training results

Training Loss	Epoch	Step	Gen Len	Validation Loss	Rouge1	Rouge2	RougeL	RougeLSum
2.5919	1.0	1050	61.5895	1.9940	30.603	12.7279	22.8958	24.5756
2.3025	2.0	2100	96.4781	1.9429	30.2088	12.8612	22.4477	24.6023
2.1456	3.0	3150	80.6381	1.8979	31.4743	13.8002	23.6389	25.7835
1.9977	4.0	4200	72.9752	1.8858	32.3099	14.3439	24.3416	26.2897
1.9059	5.0	5250	68.4971	1.8878	32.2531	14.0683	24.3766	26.1912
1.8521	6.0	6300	68.9524	1.8892	32.3429	14.0016	24.2874	26.3216
1.7472	7.0	7000	60.46	1.8865	32.8966	14.8847	25.1771	26.9613
1.7018	8.0	8000	65.807	1.8858	32.6402	14.4404	24.6794	26.5654
1.6337	9.0	9000	79.875	1.9019	32.2069	13.8683	24.0734	26.353
1.5773	10.0	10000	65.88	1.9043	32.8499	14.5395	24.8736	26.9515
1.5238	11.0	11000	63.208	1.9148	32.8182	14.322	24.7011	26.5718
1.4779	12.0	12000	63.937	1.9297	33.2751	14.7214	25.0329	26.9804
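
The table can be read in more than one way depending on the selection criterion. The sketch below copies the epoch, validation loss, and Rouge1 columns from the table and checks three things: which epochs have the lowest validation loss, which epoch has the highest Rouge1, and which row matches the headline evaluation results. The card does not state how the published checkpoint was chosen, so this is only an illustrative reading of the logged numbers.

```python
# (epoch, validation_loss, rouge1) copied from the training results table.
ROWS = [
    (1, 1.9940, 30.6030),
    (2, 1.9429, 30.2088),
    (3, 1.8979, 31.4743),
    (4, 1.8858, 32.3099),
    (5, 1.8878, 32.2531),
    (6, 1.8892, 32.3429),
    (7, 1.8865, 32.8966),
    (8, 1.8858, 32.6402),
    (9, 1.9019, 32.2069),
    (10, 1.9043, 32.8499),
    (11, 1.9148, 32.8182),
    (12, 1.9297, 33.2751),
]

# Epochs that share the minimum validation loss.
lowest_loss = min(loss for _, loss, _ in ROWS)
lowest_loss_epochs = [epoch for epoch, loss, _ in ROWS if loss == lowest_loss]

# Epoch with the highest Rouge1.
best_rouge1_epoch = max(ROWS, key=lambda row: row[2])[0]

# Rows equal to the headline evaluation results (Loss 1.8858, Rouge1 32.6402).
headline_epochs = [epoch for epoch, loss, r1 in ROWS if (loss, r1) == (1.8858, 32.6402)]

print(lowest_loss_epochs)  # [4, 8]
print(best_rouge1_epoch)   # 12
print(headline_epochs)     # [8]
```

Validation loss is tied between epochs 4 and 8, while Rouge1 keeps improving through epoch 12 even as validation loss rises, so the two criteria point to different checkpoints.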

Framework versions

Transformers 4.37.2
PyTorch 2.2.0+cu121
Datasets 2.16.1
Tokenizers 0.15.1

Model repository: biunlp/mT5LongHeSum-large
