lewtun HF staff committed on
Commit d5f8859
1 parent: 9f9c5aa

Training complete

README.md CHANGED
@@ -1,6 +1,7 @@
 ---
 license: apache-2.0
 tags:
+- summarization
 - generated_from_trainer
 datasets:
 - null
@@ -15,7 +16,7 @@ model-index:
 metrics:
 - name: Rouge1
 type: rouge
-value: 10.8752
+value: 12.4927
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -25,12 +26,12 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 3.1491
-- Rouge1: 10.8752
-- Rouge2: 3.8695
-- Rougel: 10.6991
-- Rougelsum: 10.6616
-- Gen Len: 5.6085
+- Loss: 2.9894
+- Rouge1: 12.4927
+- Rouge2: 4.847
+- Rougel: 12.4387
+- Rougelsum: 12.4383
+- Gen Len: 6.1675
 
 ## Model description
 
@@ -49,29 +50,31 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate: 2e-05
+- learning_rate: 4e-05
 - train_batch_size: 8
 - eval_batch_size: 8
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 6
+- num_epochs: 8
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
 |:-------------:|:-----:|:-----:|:---------------:|:-------:|:------:|:-------:|:---------:|:-------:|
-| 9.1733 | 1.0 | 2202 | 3.4863 | 6.3629 | 1.4637 | 6.2501 | 6.2752 | 3.3302 |
-| 4.4547 | 2.0 | 4404 | 3.2809 | 9.1283 | 2.992 | 8.9851 | 9.0487 | 4.7642 |
-| 4.0581 | 3.0 | 6606 | 3.2108 | 10.5207 | 3.7411 | 10.2595 | 10.234 | 5.3208 |
-| 3.8821 | 4.0 | 8808 | 3.1701 | 10.8636 | 4.0944 | 10.6462 | 10.6468 | 5.2453 |
-| 3.7857 | 5.0 | 11010 | 3.1600 | 10.9456 | 4.5187 | 10.784 | 10.7542 | 5.691 |
-| 3.7273 | 6.0 | 13212 | 3.1491 | 10.8752 | 3.8695 | 10.6991 | 10.6616 | 5.6085 |
+| 6.5619 | 1.0 | 2202 | 3.2749 | 9.2423 | 3.2813 | 9.2013 | 9.1698 | 5.0354 |
+| 3.8525 | 2.0 | 4404 | 3.1296 | 11.1883 | 4.047 | 11.1545 | 11.1885 | 6.4033 |
+| 3.5419 | 3.0 | 6606 | 3.0478 | 11.4905 | 4.4465 | 11.3538 | 11.3805 | 6.6462 |
+| 3.4045 | 4.0 | 8808 | 3.0174 | 11.5798 | 4.4426 | 11.5372 | 11.571 | 6.6816 |
+| 3.3091 | 5.0 | 11010 | 3.0080 | 12.0207 | 4.5622 | 11.9232 | 11.9476 | 6.4976 |
+| 3.2457 | 6.0 | 13212 | 2.9981 | 12.2459 | 4.6924 | 12.2306 | 12.2375 | 6.1533 |
+| 3.2179 | 7.0 | 15414 | 2.9943 | 12.3927 | 4.6072 | 12.2888 | 12.2848 | 6.3561 |
+| 3.1898 | 8.0 | 17616 | 2.9894 | 12.4927 | 4.847 | 12.4387 | 12.4383 | 6.1675 |
 
 
 ### Framework versions
 
 - Transformers 4.10.3
-- Pytorch 1.9.1+cu102
+- Pytorch 1.9.1+cu111
 - Datasets 1.12.1
 - Tokenizers 0.10.3
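
The step counts and the `lr_scheduler_type: linear` setting in the updated card can be sanity-checked with a little arithmetic. The sketch below is illustrative only: it assumes zero warmup steps and a decay to exactly zero at the last step, neither of which the card states, and `linear_lr` is a hypothetical helper rather than the Trainer's own scheduler code.

```python
# Values copied from the updated model card.
STEPS_PER_EPOCH = 2202   # step count logged at epoch 1.0
NUM_EPOCHS = 8
BASE_LR = 4e-05

# Total optimizer steps; this matches the final row of the results table (17616).
total_steps = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(step: int, total: int = total_steps, base_lr: float = BASE_LR) -> float:
    """Learning rate under a linear decay from base_lr to zero.

    Assumption: no warmup phase (the card lists no warmup setting).
    """
    remaining = max(0, total - step)
    return base_lr * remaining / total


if __name__ == "__main__":
    # Print the learning rate at each epoch boundary.
    for epoch in range(NUM_EPOCHS + 1):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch}: step {step:>5}, lr {linear_lr(step):.2e}")
```

Under these assumptions the learning rate has halved by the end of epoch 4 (step 8808), which is also where the validation loss curve in the table starts to flatten.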
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:9fc1cebaa89d1bddeb41b5c3a2014bfc2d70cde4af9bb38ac8c5e578f61bcf22
+oid sha256:f0e675e00cd36206c8add82fb2725faaa4cadb2ebf15a60cfc984dbdae04abd3
 size 1200770885
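
The binary files in this commit are stored as Git LFS pointers: three `key value` lines giving the spec version, the object ID, and the size in bytes. As a minimal sketch of how such a pointer can be read, the hypothetical helper `parse_lfs_pointer` below splits those lines into fields; it handles only the three keys shown here and is not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line is "<key> <value>", separated by the first space.
        key, _, value = line.partition(" ")
        fields[key] = value
    # Split "sha256:<hex>" into the hash method and the digest.
    method, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_method": method,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# New-side pointer for pytorch_model.bin, copied from the diff above.
pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:f0e675e00cd36206c8add82fb2725faaa4cadb2ebf15a60cfc984dbdae04abd3
size 1200770885
"""

info = parse_lfs_pointer(pointer)
print(info["hash_method"], info["size_bytes"])
```

Only the `oid` line differs between the old and new pointers for `pytorch_model.bin`: the weights changed, while the file size stayed at 1200770885 bytes.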
runs/Sep29_14-23-20_vorace/1632918487.4283721/events.out.tfevents.1632918487.vorace.2673642.1 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7a371d693df8c7de443878962b865d2c5d666bb8218d013ae6936ff06128fe65
+size 4434
runs/Sep29_14-23-20_vorace/events.out.tfevents.1632918487.vorace.2673642.0 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b328dbf4d725d49d47961f7ac961dfe9b710674373705b97556491239cdaf478
+size 3253
runs/Sep29_14-34-02_vorace/1632918914.4980953/events.out.tfevents.1632918914.vorace.2677037.2 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:10e61ea87cf71d6aea0fe0aa106bfb74fbe5493f1cba1f849860b3d0abdb62e7
+size 4434
runs/Sep29_14-34-02_vorace/events.out.tfevents.1632918892.vorace.2677037.0 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ba8e0d8321230d4ad06b7397eedf35b2e4fbce322f9bf6fa3b80c267b3012a33
+size 488
runs/Sep29_14-34-02_vorace/events.out.tfevents.1632918914.vorace.2677037.1 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5eee23e12afafaa563b4869dc01790d1bbb34436b14000d6b085748d4a8ee4de
+size 8925
tokenizer_config.json CHANGED
@@ -1 +1 @@
-{"eos_token": "</s>", "unk_token": "<unk>", "pad_token": "<pad>", "extra_ids": 0, "additional_special_tokens": null, "special_tokens_map_file": "/home/lewtun/.cache/huggingface/transformers/685ac0ca8568ec593a48b61b0a3c272beee9bc194a3c7241d15dcadb5f875e53.f76030f3ec1b96a8199b2593390c610e76ca8028ef3d24680000619ffb646276", "name_or_path": "google/mt5-small", "sp_model_kwargs": {}, "tokenizer_class": "T5Tokenizer"}
+{"eos_token": "</s>", "unk_token": "<unk>", "pad_token": "<pad>", "extra_ids": 0, "additional_special_tokens": null, "special_tokens_map_file": "/data/.cache/hf/transformers/685ac0ca8568ec593a48b61b0a3c272beee9bc194a3c7241d15dcadb5f875e53.f76030f3ec1b96a8199b2593390c610e76ca8028ef3d24680000619ffb646276", "name_or_path": "google/mt5-small", "sp_model_kwargs": {}, "tokenizer_class": "T5Tokenizer"}
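
Because the whole config is on one line, the diff does not show which key changed. A key-by-key comparison makes this easier to see. The sketch below uses a hypothetical helper, `changed_keys`, and abbreviated stand-in configs: the two `special_tokens_map_file` paths are shortened placeholders, not the real cache paths from the commit.

```python
import json


def changed_keys(old_json: str, new_json: str) -> dict:
    """Return {key: (old_value, new_value)} for keys whose values differ."""
    old, new = json.loads(old_json), json.loads(new_json)
    return {
        key: (old.get(key), new.get(key))
        for key in sorted(set(old) | set(new))
        if old.get(key) != new.get(key)
    }


# Abbreviated stand-ins for the two config versions (paths are placeholders).
old_cfg = json.dumps({
    "eos_token": "</s>",
    "name_or_path": "google/mt5-small",
    "special_tokens_map_file": "/old/cache/special_tokens_map",
    "tokenizer_class": "T5Tokenizer",
})
new_cfg = json.dumps({
    "eos_token": "</s>",
    "name_or_path": "google/mt5-small",
    "special_tokens_map_file": "/new/cache/special_tokens_map",
    "tokenizer_class": "T5Tokenizer",
})

changed = changed_keys(old_cfg, new_cfg)
for key, (before, after) in changed.items():
    print(f"{key}: {before!r} -> {after!r}")
```

In the actual commit, the only value that differs is `special_tokens_map_file`, whose cache directory changes from `/home/lewtun/.cache/huggingface/transformers/` to `/data/.cache/hf/transformers/`. The tokenizer settings themselves are unchanged.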
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:b6c696d829fb994c3c16fe17700311c5b844cb2570a26fe27827c9598487fe10
-size 2735
+oid sha256:be2fdab9ad79e6ab4510ff0df9f9ea943c75e6872fc392dfa2aeca1e26a2e148
+size 2799