metadata

base_model: >-
  /exports/eddie/scratch/s1970716/models/summarization/longt5_xl_summ_screen/checkpoint-140
tags:
  - generated_from_trainer
datasets:
  - tau/scrolls
metrics:
  - rouge
model-index:
  - name: longt5_xl_summ_screen_20
    results:
      - task:
          name: Summarization
          type: summarization
        dataset:
          name: tau/scrolls summ_screen_fd
          type: tau/scrolls
          config: summ_screen_fd
          split: validation
          args: summ_screen_fd
        metrics:
          - name: Rouge1
            type: rouge
            value: 28.1708

longt5_xl_summ_screen_20

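This model is a fine-tuned version of /exports/eddie/scratch/s1970716/models/summarization/longt5_xl_summ_screen/checkpoint-140 on the tau/scrolls summ_screen_fd dataset. It achieves the following results on the evaluation set (the validation split). These figures match the step 28 row (epoch 1.95) of the training results, which has the lowest validation loss of the run: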

Loss: 3.1917
Rouge1: 28.1708
Rouge2: 6.6895
Rougel: 18.1637
Rougelsum: 24.3987
Gen Len: 96.2041

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 8
eval_batch_size: 2
seed: 42
gradient_accumulation_steps: 32
total_train_batch_size: 256
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: constant
num_epochs: 10.0
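
The relationship between train_batch_size, gradient_accumulation_steps and total_train_batch_size can be checked with a few lines of arithmetic. This is a minimal sketch, not the training script. The helper effective_batch_size is a hypothetical name, and the single-device default is an assumption inferred from 8 × 32 = 256, since the card does not state the device count.

```python
# Values copied from the hyperparameter list above.
train_batch_size = 8
gradient_accumulation_steps = 32
reported_total_train_batch_size = 256


def effective_batch_size(per_device, accumulation_steps, num_devices=1):
    """Number of examples that contribute to one optimizer step.

    num_devices=1 is an assumption; the card does not list the device count.
    """
    return per_device * accumulation_steps * num_devices


total = effective_batch_size(train_batch_size, gradient_accumulation_steps)
print(total)  # 256
print(total == reported_total_train_batch_size)  # True
```

With these settings each optimizer step accumulates gradients over 32 forward and backward passes of 8 examples before the weights are updated.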

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Gen Len
0.4063	0.97	14	3.7385	27.9171	6.7215	17.9315	24.363	71.9083
0.3125	1.95	28	3.1917	28.1708	6.6895	18.1637	24.3987	96.2041
0.2177	2.99	43	3.9998	29.3167	5.9	17.3608	25.6945	198.0473
0.1753	3.97	57	4.2287	29.0605	6.2534	17.5744	25.6415	158.6509
0.2747	4.94	71	4.1027	31.2245	6.5663	18.1588	26.8996	118.4438
0.1045	5.98	86	5.0581	30.6056	6.8892	18.4933	26.4027	92.9882
0.0875	6.96	100	4.5941	32.5234	7.3736	18.8958	28.4738	160.8964
0.1572	8.0	115	4.9386	31.4658	7.2592	18.4796	27.6047	121.0178
0.0867	8.97	129	4.5565	32.0531	7.0692	18.5551	27.3373	160.4793
0.0748	9.74	140	5.0866	32.2717	7.7004	18.9107	28.3874	124.1893
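
The table can be used to compare checkpoint selection criteria. The sketch below copies the Step, Validation Loss and Rouge1 columns from the rows above and picks the best row under each criterion. The variable names are illustrative, and the card does not say which criterion was actually used to choose the reported checkpoint.

```python
# (step, validation_loss, rouge1) copied from the training results table.
rows = [
    (14, 3.7385, 27.9171),
    (28, 3.1917, 28.1708),
    (43, 3.9998, 29.3167),
    (57, 4.2287, 29.0605),
    (71, 4.1027, 31.2245),
    (86, 5.0581, 30.6056),
    (100, 4.5941, 32.5234),
    (115, 4.9386, 31.4658),
    (129, 4.5565, 32.0531),
    (140, 5.0866, 32.2717),
]

# Row with the lowest validation loss.
best_by_loss = min(rows, key=lambda r: r[1])
# Row with the highest Rouge1 score.
best_by_rouge1 = max(rows, key=lambda r: r[2])

print("lowest validation loss:", best_by_loss)
print("highest Rouge1:", best_by_rouge1)
```

Selecting by validation loss gives step 28, the row whose numbers appear in the summary at the top of the card. Selecting by Rouge1 gives step 100 instead, so the two criteria disagree for this run.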

Framework versions

Transformers 4.34.0.dev0
PyTorch 2.0.1+cu117
Datasets 2.14.5
Tokenizers 0.13.3