---
tags:
- generated_from_trainer
- summarization
- book summary
datasets:
- kmfoda/booksum
metrics:
- rouge
model-index:
- name: long-t5-tglobal-large-booksum-WIP
  results:
  - task:
      type: summarization
      name: Summarization
    dataset:
      name: kmfoda/booksum
      type: kmfoda/booksum
      config: kmfoda--booksum
      split: test
    metrics:
    - name: ROUGE-1
      type: rouge
      value: 25.6136
      verified: true
    - name: ROUGE-2
      type: rouge
      value: 2.8652
      verified: true
    - name: ROUGE-L
      type: rouge
      value: 12.4913
      verified: true
    - name: ROUGE-LSUM
      type: rouge
      value: 23.1102
      verified: true
    - name: loss
      type: loss
      value: 5.004334926605225
      verified: true
    - name: gen_len
      type: gen_len
      value: 89.4354
      verified: true
---

# tglobal-large-booksum-WIP

> This is a WIP checkpoint fine-tuned from the vanilla (original) model for roughly 10 epochs. It is **not ready to be used for inference**.

This model is a fine-tuned version of [google/long-t5-tglobal-large](https://huggingface.co/google/long-t5-tglobal-large) on the `kmfoda/booksum` dataset.
It achieves the following results on the evaluation set:
- Loss: 4.9519
- Rouge1: 21.8058
- Rouge2: 2.9343
- Rougel: 10.3717
- Rougelsum: 20.1537
- Gen Len: 106.055

## Model description

This checkpoint tests fine-tuning on booksum only, with 16384 input / 1024 output tokens throughout (vs. the previous large WIP checkpoint, which started from a partially-trained `pubmed` checkpoint).

## Intended uses & limitations

This is a WIP checkpoint fine-tuned from the vanilla (original) model for roughly 10 epochs. It is **not ready to be used for inference**.

## Training and evaluation data

This model is fine-tuned **only** on booksum (vs. the previous large WIP checkpoint, which started from a partially-trained `pubmed` checkpoint).

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (see the sketch at the end of this card for these expressed as `Seq2SeqTrainingArguments`):
- learning_rate: 0.0004
- train_batch_size: 1
- eval_batch_size: 1
- seed: 31060
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 32
- total_train_batch_size: 128
- total_eval_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- num_epochs: 3.0

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2 | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:------:|:-------:|:---------:|:-------:|
| 5.0389        | 0.99  | 37   | 5.1884          | 29.995  | 4.4045 | 12.8837 | 27.557    | 219.03  |
| 4.8986        | 1.0   | 75   | 5.1286          | 26.921  | 3.7193 | 11.3605 | 25.3492   | 276.005 |
| 4.5928        | 2.0   | 150  | 4.9900          | 26.6667 | 3.7342 | 11.8223 | 24.7087   | 178.775 |
| 4.6159        | 3.0   | 225  | 4.9519          | 21.8058 | 2.9343 | 10.3717 | 20.1537   | 106.055 |

#### eval in bf16

```
***** eval metrics *****
  epoch                   =        3.0
  eval_gen_len            =    103.075
  eval_loss               =     4.9501
  eval_rouge1             =    21.6345
  eval_rouge2             =      2.877
  eval_rougeL             =     10.386
  eval_rougeLsum          =    20.0148
  eval_runtime            = 0:06:02.75
  eval_samples            =        200
  eval_samples_per_second =      0.551
  eval_steps_per_second   =      0.138
```

### Framework versions

- Transformers 4.25.0.dev0
- Pytorch 1.13.0+cu117
- Datasets 2.6.1
- Tokenizers 0.13.1
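
### Training arguments (sketch)

For reference, here is a minimal sketch of the hyperparameters above expressed as Hugging Face `Seq2SeqTrainingArguments`. It is an illustration, not the actual training script: the output directory and `bf16` flag are assumptions, the Adam betas/epsilon are the library defaults, and the 4-GPU layout implied by `num_devices: 4` would come from the launcher (e.g. `torchrun` or `accelerate launch`), not from these arguments.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the listed hyperparameters; output_dir and bf16 are assumptions.
training_args = Seq2SeqTrainingArguments(
    output_dir="./long-t5-tglobal-large-booksum-WIP",  # placeholder
    learning_rate=4e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=32,  # 1 sample x 32 steps x 4 GPUs = total batch size 128
    seed=31060,
    lr_scheduler_type="cosine",
    num_train_epochs=3.0,
    bf16=True,                       # eval above was run in bf16; bf16 training is an assumption
    predict_with_generate=True,      # needed to compute ROUGE during evaluation
)
```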
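
### Loading the checkpoint (sketch)

Since this WIP checkpoint is **not ready for inference**, the snippet below is only a minimal sketch of how a checkpoint like this could be loaded in bf16 to reproduce the kind of evaluation shown above. The checkpoint identifier and input text are placeholders, and the generation settings (`num_beams`, `max_new_tokens`) are illustrative assumptions, not the settings used for the reported metrics.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

checkpoint = "long-t5-tglobal-large-booksum-WIP"  # placeholder: local dir or hub repo id

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint, torch_dtype=torch.bfloat16)

text = "..."  # a long chapter/document to summarize (up to 16384 tokens)
inputs = tokenizer(text, max_length=16384, truncation=True, return_tensors="pt")

# Generate a summary; beam count and output length are assumptions for illustration.
summary_ids = model.generate(**inputs, max_new_tokens=1024, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```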