learn3r/longt5_xl_summ_screen_20

longt5_xl_summ_screen_20

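This model is a fine-tuned version of /exports/eddie/scratch/s1970716/models/summarization/longt5_xl_summ_screen/checkpoint-140 (a local checkpoint path, not a Hub model id) on the summ_screen_fd configuration of the tau/scrolls dataset. It achieves the following results on the evaluation (validation) set. These figures match the step-28 row (epoch 1.95) of the training results table, which is the row with the lowest validation loss: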

Loss: 3.1917
Rouge1: 28.1708
Rouge2: 6.6895
Rougel: 18.1637
Rougelsum: 24.3987
Gen Len: 96.2041
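
The card gives no usage instructions, so the following is only a minimal sketch of how the checkpoint might be loaded for summarization with the Transformers Auto classes. It assumes the repository ships tokenizer files alongside the weights. The helper name summarize and both length limits (max_input_tokens, max_new_tokens) are illustrative assumptions, not settings documented by this card.

```python
MODEL_ID = "learn3r/longt5_xl_summ_screen_20"  # repository id as shown on this page


def summarize(transcript: str, max_input_tokens: int = 16384, max_new_tokens: int = 128) -> str:
    """Summarize one transcript.

    Both length limits are illustrative assumptions; the card does not
    document the values used for training or evaluation.
    """
    # Imported lazily so that defining the helper does not require the library.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Truncate overly long inputs instead of failing on them.
    inputs = tokenizer(
        transcript,
        truncation=True,
        max_length=max_input_tokens,
        return_tensors="pt",
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling summarize(text) downloads the XL checkpoint on first use, so it needs substantial disk space and memory; for repeated calls, load the tokenizer and model once outside the function.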

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 8
eval_batch_size: 2
seed: 42
gradient_accumulation_steps: 32
total_train_batch_size: 256
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: constant
num_epochs: 10.0
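
The relationship between the batch-size entries above can be checked with a few lines of arithmetic. This sketch assumes the usual Trainer convention that train_batch_size is the per-device batch size and that the total is per-device size times accumulation steps times device count. The function name effective_batch_size is hypothetical.

```python
# Values copied from the hyperparameter list above.
hyperparameters = {
    "learning_rate": 1e-3,
    "train_batch_size": 8,  # assumed to be the per-device batch size
    "eval_batch_size": 2,
    "seed": 42,
    "gradient_accumulation_steps": 32,
    "total_train_batch_size": 256,
    "lr_scheduler_type": "constant",
    "num_epochs": 10.0,
}


def effective_batch_size(per_device: int, accumulation_steps: int, num_devices: int = 1) -> int:
    """Number of examples that contribute to a single optimizer step."""
    return per_device * accumulation_steps * num_devices


total = effective_batch_size(
    hyperparameters["train_batch_size"],
    hyperparameters["gradient_accumulation_steps"],
)
# Under the assumed convention, the reported total implies this device count.
implied_devices = hyperparameters["total_train_batch_size"] // total

print(total)            # 256
print(implied_devices)  # 1
```

Under that convention, the reported total of 256 is consistent with training on a single device.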

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Gen Len
0.4063	0.97	14	3.7385	27.9171	6.7215	17.9315	24.363	71.9083
0.3125	1.95	28	3.1917	28.1708	6.6895	18.1637	24.3987	96.2041
0.2177	2.99	43	3.9998	29.3167	5.9	17.3608	25.6945	198.0473
0.1753	3.97	57	4.2287	29.0605	6.2534	17.5744	25.6415	158.6509
0.2747	4.94	71	4.1027	31.2245	6.5663	18.1588	26.8996	118.4438
0.1045	5.98	86	5.0581	30.6056	6.8892	18.4933	26.4027	92.9882
0.0875	6.96	100	4.5941	32.5234	7.3736	18.8958	28.4738	160.8964
0.1572	8.0	115	4.9386	31.4658	7.2592	18.4796	27.6047	121.0178
0.0867	8.97	129	4.5565	32.0531	7.0692	18.5551	27.3373	160.4793
0.0748	9.74	140	5.0866	32.2717	7.7004	18.9107	28.3874	124.1893
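
Validation loss and ROUGE point to different checkpoints in this table. The short sketch below copies four columns from the rows above and picks the best step under each criterion; the variable names are illustrative.

```python
# (step, validation_loss, rouge1, rouge2), copied from the table above.
ROWS = [
    (14, 3.7385, 27.9171, 6.7215),
    (28, 3.1917, 28.1708, 6.6895),
    (43, 3.9998, 29.3167, 5.9),
    (57, 4.2287, 29.0605, 6.2534),
    (71, 4.1027, 31.2245, 6.5663),
    (86, 5.0581, 30.6056, 6.8892),
    (100, 4.5941, 32.5234, 7.3736),
    (115, 4.9386, 31.4658, 7.2592),
    (129, 4.5565, 32.0531, 7.0692),
    (140, 5.0866, 32.2717, 7.7004),
]

best_by_loss = min(ROWS, key=lambda row: row[1])    # lowest validation loss
best_by_rouge1 = max(ROWS, key=lambda row: row[2])  # highest Rouge1
best_by_rouge2 = max(ROWS, key=lambda row: row[3])  # highest Rouge2

print(best_by_loss[0], best_by_rouge1[0], best_by_rouge2[0])  # 28 100 140
```

The lowest validation loss occurs at step 28, which is the row the headline metrics match, while Rouge1 peaks at step 100 (32.5234) and Rouge2 at step 140 (7.7004).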

Framework versions

Transformers 4.34.0.dev0
PyTorch 2.0.1+cu117
Datasets 2.14.5
Tokenizers 0.13.3

Evaluation results

Rouge1 on tau/scrolls summ_screen_fd (validation set, self-reported): 28.171