---
tags:
- generated_from_trainer
- summarization
- book summary
datasets:
- kmfoda/booksum
metrics:
- rouge
model-index:
- name: long-t5-tglobal-large-booksum-WIP
results:
- task:
type: summarization
name: Summarization
dataset:
name: kmfoda/booksum
type: kmfoda/booksum
config: kmfoda--booksum
split: test
metrics:
- name: ROUGE-1
type: rouge
value: 25.6136
verified: true
- name: ROUGE-2
type: rouge
value: 2.8652
verified: true
- name: ROUGE-L
type: rouge
value: 12.4913
verified: true
- name: ROUGE-LSUM
type: rouge
value: 23.1102
verified: true
- name: loss
type: loss
value: 5.004334926605225
verified: true
- name: gen_len
type: gen_len
value: 89.4354
verified: true
---
# long-t5-tglobal-large-booksum-WIP
> This is a WIP checkpoint that has been fine-tuned from the vanilla (original) model for roughly 10 epochs. It is **not ready to be used for inference**.

This model is a fine-tuned version of [google/long-t5-tglobal-large](https://huggingface.co/google/long-t5-tglobal-large) on the `kmfoda/booksum` dataset.
It achieves the following results on the evaluation set (a ROUGE computation sketch follows the list):
- Loss: 4.9519
- Rouge1: 21.8058
- Rouge2: 2.9343
- Rougel: 10.3717
- Rougelsum: 20.1537
- Gen Len: 106.055
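For reference, ROUGE scores like these are typically computed with the `evaluate` library; a minimal sketch using placeholder strings (not actual model output):

```python
# Minimal sketch of computing ROUGE with the `evaluate` library; the
# predictions/references are placeholders, not real model output.
import evaluate

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["a generated chapter summary"],   # placeholder
    references=["the reference chapter summary"],  # placeholder
)
print(scores)  # keys: rouge1, rouge2, rougeL, rougeLsum
```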
## Model description
Testing fine-tuning only on booksum, with 16384-token inputs and 1024-token targets the whole time (vs. a previous large WIP checkpoint I made that started from a partially-trained `pubmed` checkpoint). A preprocessing sketch of this setup follows.
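A minimal sketch of that 16384/1024 preprocessing, assuming the `chapter` and `summary_text` columns of `kmfoda/booksum` (the column names are an assumption, not stated in this card):

```python
# Sketch of the 16384-token input / 1024-token target preprocessing
# described above; column names are assumptions about kmfoda/booksum.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/long-t5-tglobal-large")

def preprocess(example):
    model_inputs = tokenizer(
        example["chapter"],        # assumed source-text column
        max_length=16384,
        truncation=True,
    )
    labels = tokenizer(
        text_target=example["summary_text"],  # assumed target column
        max_length=1024,
        truncation=True,
    )
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs
```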
## Intended uses & limitations
This is a WIP checkpoint that has been fine-tuned from the vanilla (original) model for roughly 10 epochs. It is **not ready to be used for inference**.
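If you want to load it anyway (e.g. to continue fine-tuning or experiment), a minimal loading sketch; the repo id below is hypothetical:

```python
# Minimal loading sketch; the repo id is hypothetical, and this WIP
# checkpoint is not recommended for production inference.
from transformers import AutoTokenizer, LongT5ForConditionalGeneration

model_id = "<namespace>/long-t5-tglobal-large-booksum-WIP"  # hypothetical
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = LongT5ForConditionalGeneration.from_pretrained(model_id)
```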
## Training and evaluation data
This model is fine-tuned **only** on booksum (vs. a previous large WIP checkpoint I made that started from a partially-trained `pubmed` checkpoint).
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a `Seq2SeqTrainingArguments` sketch follows the list):
- learning_rate: 0.0004
- train_batch_size: 1
- eval_batch_size: 1
- seed: 31060
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 32
- total_train_batch_size: 128
- total_eval_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- num_epochs: 3.0
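As a rough reconstruction (not the original training script), these settings map onto `Seq2SeqTrainingArguments` approximately as:

```python
# Approximate reconstruction of the hyperparameters above; not the
# original training script.
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="long-t5-tglobal-large-booksum-WIP",  # hypothetical path
    learning_rate=4e-4,
    per_device_train_batch_size=1,   # x 4 GPUs x 32 accum steps = 128 total
    per_device_eval_batch_size=1,    # x 4 GPUs = total eval batch of 4
    gradient_accumulation_steps=32,
    seed=31060,
    lr_scheduler_type="cosine",
    num_train_epochs=3.0,
    predict_with_generate=True,      # assumed, for ROUGE/gen_len during eval
)
```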
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2 | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:------:|:-------:|:---------:|:-------:|
| 5.0389        | 0.99  | 37   | 5.1884          | 29.995  | 4.4045 | 12.8837 | 27.557    | 219.03  |
| 4.8986        | 1.0   | 75   | 5.1286          | 26.921  | 3.7193 | 11.3605 | 25.3492   | 276.005 |
| 4.5928        | 2.0   | 150  | 4.9900          | 26.6667 | 3.7342 | 11.8223 | 24.7087   | 178.775 |
| 4.6159        | 3.0   | 225  | 4.9519          | 21.8058 | 2.9343 | 10.3717 | 20.1537   | 106.055 |
#### Eval in bf16
```
***** eval metrics *****
epoch = 3.0
eval_gen_len = 103.075
eval_loss = 4.9501
eval_rouge1 = 21.6345
eval_rouge2 = 2.877
eval_rougeL = 10.386
eval_rougeLsum = 20.0148
eval_runtime = 0:06:02.75
eval_samples = 200
eval_samples_per_second = 0.551
eval_steps_per_second = 0.138
```
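For reference, bf16 evaluation like this can be requested through the trainer arguments; a minimal sketch (an assumption, not taken from the original run):

```python
# Sketch: enabling bfloat16 evaluation via the HF trainer arguments.
from transformers import Seq2SeqTrainingArguments

eval_args = Seq2SeqTrainingArguments(
    output_dir="eval-out",         # hypothetical path
    bf16_full_eval=True,           # run evaluation in bfloat16
    per_device_eval_batch_size=1,
    predict_with_generate=True,
)
```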
### Framework versions
- Transformers 4.25.0.dev0
- Pytorch 1.13.0+cu117
- Datasets 2.6.1
- Tokenizers 0.13.1