pszemraj committed
Commit 017f950
1 Parent(s): d8009f4

Update README.md

Files changed (1)
  1. README.md +5 -6
README.md CHANGED
@@ -2,6 +2,7 @@
 tags:
 - generated_from_trainer
 - summarization
+- book summary
 dataset:
 - kmfoda/booksum
 metrics:
@@ -11,11 +12,10 @@ model-index:
 results: []
 ---
 
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
 
 # tglobal-large-booksum-WIP
 
+> This is a WIP checkpoint fine-tuned from the vanilla (original) model for roughly 10 epochs. It is **not ready to be used for inference**.
 This model is a fine-tuned version of [google/long-t5-tglobal-large](https://huggingface.co/google/long-t5-tglobal-large) on the `kmfoda/booksum` dataset.
 It achieves the following results on the evaluation set:
 - Loss: 4.9519
@@ -27,16 +27,15 @@ It achieves the following results on the evaluation set:
 
 ## Model description
 
-More information needed
+Testing fine-tuning on booksum alone with 16384/1024 source/target token lengths the whole time (vs. the previous large WIP checkpoint I made, which started from a partially-trained `pubmed` checkpoint).
 
 ## Intended uses & limitations
 
-More information needed
+This is a WIP checkpoint fine-tuned from the vanilla (original) model for roughly 10 epochs. It is **not ready to be used for inference**.
 
 ## Training and evaluation data
 
-More information needed
-
+This is **only** fine-tuned on booksum (vs. the previous large WIP checkpoint I made, which started from a partially-trained `pubmed` checkpoint).
 ## Training procedure
 
 ### Training hyperparameters
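
For context on the 16384/1024 configuration mentioned in the model description, below is a minimal sketch of how those lengths would map to tokenization and generation with `transformers`. The repo id is an assumption made for illustration, and the card itself warns this checkpoint is not ready for inference:

```python
# Minimal sketch, not endorsed usage: the card states this WIP checkpoint
# is NOT ready for inference. The repo id below is an assumed placeholder.
from transformers import AutoTokenizer, LongT5ForConditionalGeneration

model_id = "pszemraj/tglobal-large-booksum-WIP"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = LongT5ForConditionalGeneration.from_pretrained(model_id)

long_text = "..."  # a book chapter or similarly long document

# 16384 = max source (input) length used during fine-tuning
inputs = tokenizer(long_text, truncation=True, max_length=16384, return_tensors="pt")

# 1024 = max target (summary) length used during fine-tuning
summary_ids = model.generate(**inputs, max_new_tokens=1024, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```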