# t5-abs-1709-1203-lr-0.0001-bs-10-maxep-20

This model is a fine-tuned version of [google-t5/t5-base](https://huggingface.co/google-t5/t5-base) on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 2.3944
- ROUGE-1: 0.4206
- ROUGE-2: 0.162
- ROUGE-L: 0.33
- ROUGE-Lsum: 0.3303
- BERTScore precision: 0.8957
- BERTScore recall: 0.8758
- BERTScore F1: 0.8855
- METEOR: 0.3203
- Gen Len: 34.9
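
As a quick usage illustration, here is a minimal inference sketch. It assumes the checkpoint is published on the Hugging Face Hub under `roequitz/t5-abs-1709-1203-lr-0.0001-bs-10-maxep-20` (the repository this card belongs to) and that inputs use T5's conventional `summarize:` prefix, which this card does not confirm.

```python
# Minimal inference sketch; the repo id and "summarize: " prefix are
# assumptions based on T5 conventions, not confirmed by this card.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "roequitz/t5-abs-1709-1203-lr-0.0001-bs-10-maxep-20"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "summarize: " + "Your long input document goes here."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

# Gen Len on the eval set averaged ~35 tokens, so a 64-token cap leaves headroom.
summary_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```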
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 10
- eval_batch_size: 10
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 20
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 20
- mixed_precision_training: Native AMP
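
The original training script is not included in this card. The following is a sketch of `Seq2SeqTrainingArguments` that mirrors the hyperparameter list above; the `output_dir` value and the `predict_with_generate` flag are assumptions.

```python
# Sketch reproducing the listed hyperparameters with the Hugging Face Trainer API.
# output_dir and predict_with_generate are assumptions; the Adam betas/epsilon
# listed above are the Trainer's optimizer defaults and need no explicit setting.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="t5-abs-1709-1203-lr-0.0001-bs-10-maxep-20",
    learning_rate=1e-4,
    per_device_train_batch_size=10,
    per_device_eval_batch_size=10,
    seed=42,
    gradient_accumulation_steps=2,   # effective train batch size: 10 * 2 = 20
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=20,
    fp16=True,                       # "Native AMP" mixed-precision training
    predict_with_generate=True,      # generate summaries for eval-time metrics
)
```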
### Training results
Training Loss | Epoch | Step | Validation Loss | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum | BERTScore P | BERTScore R | BERTScore F1 | METEOR | Gen Len |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1.9289 | 0.8 | 2 | 2.0526 | 0.4186 | 0.188 | 0.3713 | 0.3708 | 0.9077 | 0.8772 | 0.8921 | 0.338 | 31.5 |
1.2808 | 2.0 | 5 | 2.0627 | 0.4127 | 0.1942 | 0.3695 | 0.3702 | 0.9053 | 0.8756 | 0.89 | 0.3277 | 30.6 |
1.7019 | 2.8 | 7 | 2.0780 | 0.4156 | 0.1796 | 0.3544 | 0.354 | 0.9035 | 0.8741 | 0.8884 | 0.3306 | 30.7 |
1.05 | 4.0 | 10 | 2.1041 | 0.3644 | 0.1342 | 0.3099 | 0.3086 | 0.8947 | 0.871 | 0.8825 | 0.2859 | 32.7 |
1.4533 | 4.8 | 12 | 2.1367 | 0.3557 | 0.12 | 0.2873 | 0.2883 | 0.8883 | 0.868 | 0.8779 | 0.2659 | 33.5 |
0.8959 | 6.0 | 15 | 2.2072 | 0.413 | 0.1484 | 0.3224 | 0.3221 | 0.8934 | 0.8744 | 0.8836 | 0.3015 | 34.3 |
1.2858 | 6.8 | 17 | 2.2363 | 0.3801 | 0.1284 | 0.2942 | 0.2948 | 0.8881 | 0.8696 | 0.8786 | 0.2635 | 34.8 |
0.819 | 8.0 | 20 | 2.2531 | 0.4042 | 0.1592 | 0.3155 | 0.3165 | 0.8941 | 0.8739 | 0.8837 | 0.3198 | 35.8 |
1.1514 | 8.8 | 22 | 2.2739 | 0.394 | 0.1581 | 0.3193 | 0.321 | 0.8944 | 0.8739 | 0.8838 | 0.312 | 33.0 |
0.7579 | 10.0 | 25 | 2.3118 | 0.4277 | 0.1766 | 0.3409 | 0.3425 | 0.8992 | 0.8767 | 0.8876 | 0.3405 | 35.2 |
1.1033 | 10.8 | 27 | 2.3370 | 0.422 | 0.1704 | 0.3423 | 0.3439 | 0.8972 | 0.8761 | 0.8864 | 0.3399 | 35.9 |
0.6976 | 12.0 | 30 | 2.3617 | 0.4205 | 0.1703 | 0.3281 | 0.3285 | 0.8974 | 0.8755 | 0.8861 | 0.3227 | 35.5 |
1.0212 | 12.8 | 32 | 2.3656 | 0.4179 | 0.1651 | 0.3234 | 0.3225 | 0.8974 | 0.8756 | 0.8861 | 0.3215 | 35.3 |
0.6757 | 14.0 | 35 | 2.3826 | 0.4187 | 0.1615 | 0.3282 | 0.3278 | 0.8962 | 0.8761 | 0.8859 | 0.32 | 35.2 |
1.0276 | 14.8 | 37 | 2.3884 | 0.4206 | 0.162 | 0.33 | 0.3303 | 0.8957 | 0.8758 | 0.8855 | 0.3203 | 34.9 |
0.6742 | 16.0 | 40 | 2.3944 | 0.4206 | 0.162 | 0.33 | 0.3303 | 0.8957 | 0.8758 | 0.8855 | 0.3203 | 34.9 |
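
The ROUGE, BERTScore, and METEOR figures above can be computed with the `evaluate` library. The snippet below is a sketch with placeholder prediction/reference texts, not the card's actual evaluation code.

```python
# Sketch of metric computation with the `evaluate` library; the sample texts
# are illustrative placeholders (bertscore additionally requires `bert-score`).
import evaluate

rouge = evaluate.load("rouge")
bertscore = evaluate.load("bertscore")
meteor = evaluate.load("meteor")

predictions = ["the model generated this summary"]
references = ["the reference summary for comparison"]

print(rouge.compute(predictions=predictions, references=references))
print(meteor.compute(predictions=predictions, references=references))
print(bertscore.compute(predictions=predictions, references=references, lang="en"))
```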
### Framework versions
- Transformers 4.44.0
- PyTorch 2.4.0
- Datasets 2.21.0
- Tokenizers 0.19.1