Theoreticallyhugo committed on
Commit
f541ad4
1 Parent(s): 8c894e8

trainer: training complete at 2024-02-19 20:53:25.864568.

Files changed (3)
  1. README.md +17 -16
  2. meta_data/README_s42_e7.md +86 -0
  3. model.safetensors +1 -1
README.md CHANGED
@@ -22,7 +22,7 @@ model-index:
  metrics:
  - name: Accuracy
  type: accuracy
- value: 0.9421333619979219
  ---
 
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -32,13 +32,13 @@ should probably proofread and complete it, then remove this comment. -->
 
  This model is a fine-tuned version of [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) on the essays_su_g dataset.
  It achieves the following results on the evaluation set:
- - Loss: 0.1716
- - B: {'precision': 0.8420123565754634, 'recall': 0.9008498583569405, 'f1-score': 0.8704379562043796, 'support': 1059.0}
- - I: {'precision': 0.9520763187429854, 'recall': 0.965348506401138, 'f1-score': 0.9586664783161464, 'support': 17575.0}
- - O: {'precision': 0.9350156319785619, 'recall': 0.9028571428571428, 'f1-score': 0.9186550381218803, 'support': 9275.0}
  - Accuracy: 0.9421
- - Macro avg: {'precision': 0.9097014357656702, 'recall': 0.9230185025384072, 'f1-score': 0.9159198242141354, 'support': 27909.0}
- - Weighted avg: {'precision': 0.9422301900506126, 'recall': 0.9421333619979219, 'f1-score': 0.9420216643594235, 'support': 27909.0}
 
  ## Model description
 
@@ -63,18 +63,19 @@ The following hyperparameters were used during training:
  - seed: 42
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
- - num_epochs: 6
 
  ### Training results
 
- | Training Loss | Epoch | Step | Validation Loss | B | I | O | Accuracy | Macro avg | Weighted avg |
- |:-------------:|:-----:|:----:|:---------------:|:------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|:------------------------------------------------------------------------------------------------------------------:|:--------:|:-------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|
- | No log | 1.0 | 41 | 0.2779 | {'precision': 0.8035190615835777, 'recall': 0.5174693106704438, 'f1-score': 0.6295232624928202, 'support': 1059.0} | {'precision': 0.9134303762702555, 'recall': 0.9461735419630156, 'f1-score': 0.9295136948015652, 'support': 17575.0} | {'precision': 0.8836178230990911, 'recall': 0.8595148247978437, 'f1-score': 0.8713996830081434, 'support': 9275.0} | 0.9011 | {'precision': 0.8668557536509748, 'recall': 0.7743858924771011, 'f1-score': 0.8101455467675095, 'support': 27909.0} | {'precision': 0.8993522110577526, 'recall': 0.9011071697301946, 'f1-score': 0.8988175993771879, 'support': 27909.0} |
- | No log | 2.0 | 82 | 0.1973 | {'precision': 0.8130590339892666, 'recall': 0.8583569405099151, 'f1-score': 0.8350941662838769, 'support': 1059.0} | {'precision': 0.9326064325242452, 'recall': 0.9684779516358464, 'f1-score': 0.9502037626304918, 'support': 17575.0} | {'precision': 0.9385245901639344, 'recall': 0.8641509433962264, 'f1-score': 0.899803536345776, 'support': 9275.0} | 0.9296 | {'precision': 0.8947300188924822, 'recall': 0.896995278513996, 'f1-score': 0.8950338217533815, 'support': 27909.0} | {'precision': 0.9300370182514147, 'recall': 0.9296284352717761, 'f1-score': 0.9290864470218421, 'support': 27909.0} |
- | No log | 3.0 | 123 | 0.1836 | {'precision': 0.788197251414713, 'recall': 0.9206798866855525, 'f1-score': 0.8493031358885017, 'support': 1059.0} | {'precision': 0.938334252619967, 'recall': 0.9679658605974395, 'f1-score': 0.9529197591373757, 'support': 17575.0} | {'precision': 0.943807070943573, 'recall': 0.8692183288409704, 'f1-score': 0.904978391423921, 'support': 9275.0} | 0.9334 | {'precision': 0.8901128583260842, 'recall': 0.9192880253746541, 'f1-score': 0.9024004288165995, 'support': 27909.0} | {'precision': 0.9344561239043228, 'recall': 0.9333548317746964, 'f1-score': 0.9330556941560847, 'support': 27909.0} |
- | No log | 4.0 | 164 | 0.1709 | {'precision': 0.8227739726027398, 'recall': 0.9074598677998111, 'f1-score': 0.8630444544229906, 'support': 1059.0} | {'precision': 0.9512620158524931, 'recall': 0.9628449502133712, 'f1-score': 0.9570184368284129, 'support': 17575.0} | {'precision': 0.9324173369079535, 'recall': 0.8999460916442048, 'f1-score': 0.9158940034015471, 'support': 9275.0} | 0.9398 | {'precision': 0.9021511084543955, 'recall': 0.9234169698857958, 'f1-score': 0.9119856315509836, 'support': 27909.0} | {'precision': 0.9401239157768152, 'recall': 0.9398401949192017, 'f1-score': 0.9397857317009801, 'support': 27909.0} |
- | No log | 5.0 | 205 | 0.1695 | {'precision': 0.8363954505686789, 'recall': 0.902738432483475, 'f1-score': 0.8683015440508628, 'support': 1059.0} | {'precision': 0.9477175185329691, 'recall': 0.9674537695590327, 'f1-score': 0.9574839508953711, 'support': 17575.0} | {'precision': 0.9385835694050991, 'recall': 0.8930458221024259, 'f1-score': 0.9152486187845303, 'support': 9275.0} | 0.9403 | {'precision': 0.9075655128355824, 'recall': 0.9210793413816445, 'f1-score': 0.9136780379102548, 'support': 27909.0} | {'precision': 0.9404579446272334, 'recall': 0.9402701637464617, 'f1-score': 0.9400638758594909, 'support': 27909.0} |
- | No log | 6.0 | 246 | 0.1716 | {'precision': 0.8420123565754634, 'recall': 0.9008498583569405, 'f1-score': 0.8704379562043796, 'support': 1059.0} | {'precision': 0.9520763187429854, 'recall': 0.965348506401138, 'f1-score': 0.9586664783161464, 'support': 17575.0} | {'precision': 0.9350156319785619, 'recall': 0.9028571428571428, 'f1-score': 0.9186550381218803, 'support': 9275.0} | 0.9421 | {'precision': 0.9097014357656702, 'recall': 0.9230185025384072, 'f1-score': 0.9159198242141354, 'support': 27909.0} | {'precision': 0.9422301900506126, 'recall': 0.9421333619979219, 'f1-score': 0.9420216643594235, 'support': 27909.0} |
 
 
  ### Framework versions
 
  metrics:
  - name: Accuracy
  type: accuracy
+ value: 0.9420975312623168
  ---
 
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 
 
  This model is a fine-tuned version of [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) on the essays_su_g dataset.
  It achieves the following results on the evaluation set:
+ - Loss: 0.1719
+ - B: {'precision': 0.852017937219731, 'recall': 0.8970727101038716, 'f1-score': 0.8739650413983441, 'support': 1059.0}
+ - I: {'precision': 0.9538791159224177, 'recall': 0.9626173541963016, 'f1-score': 0.9582283141230779, 'support': 17575.0}
+ - O: {'precision': 0.9301170236255244, 'recall': 0.9083557951482479, 'f1-score': 0.919107620138548, 'support': 9275.0}
  - Accuracy: 0.9421
+ - Macro avg: {'precision': 0.912004692255891, 'recall': 0.9226819531494738, 'f1-score': 0.91710032521999, 'support': 27909.0}
+ - Weighted avg: {'precision': 0.9421171612017244, 'recall': 0.9420975312623168, 'f1-score': 0.9420299823117623, 'support': 27909.0}
 
  ## Model description
 
 
  - seed: 42
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
+ - num_epochs: 7
 
  ### Training results
 
+ | Training Loss | Epoch | Step | Validation Loss | B | I | O | Accuracy | Macro avg | Weighted avg |
+ |:-------------:|:-----:|:----:|:---------------:|:-------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|:------------------------------------------------------------------------------------------------------------------:|:--------:|:-------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|
+ | No log | 1.0 | 41 | 0.2928 | {'precision': 0.8236434108527132, 'recall': 0.40132200188857414, 'f1-score': 0.5396825396825397, 'support': 1059.0} | {'precision': 0.9120444175691276, 'recall': 0.9440113798008535, 'f1-score': 0.9277526142146172, 'support': 17575.0} | {'precision': 0.8748098239513149, 'recall': 0.8679245283018868, 'f1-score': 0.87135357471451, 'support': 9275.0} | 0.8981 | {'precision': 0.8701658841243853, 'recall': 0.7377526366637716, 'f1-score': 0.7795962428705557, 'support': 27909.0} | {'precision': 0.8963158883521046, 'recall': 0.8981332186749794, 'f1-score': 0.8942842957405419, 'support': 27909.0} |
+ | No log | 2.0 | 82 | 0.1943 | {'precision': 0.8109318996415771, 'recall': 0.8545797922568461, 'f1-score': 0.8321839080459771, 'support': 1059.0} | {'precision': 0.9395201599466845, 'recall': 0.9625604551920341, 'f1-score': 0.9509007616424496, 'support': 17575.0} | {'precision': 0.9288721975645841, 'recall': 0.88, 'f1-score': 0.9037758830694275, 'support': 9275.0} | 0.9310 | {'precision': 0.8931080857176151, 'recall': 0.8990467491496267, 'f1-score': 0.895620184252618, 'support': 27909.0} | {'precision': 0.9311022725713902, 'recall': 0.9310258339603712, 'f1-score': 0.9307350661061193, 'support': 27909.0} |
+ | No log | 3.0 | 123 | 0.1853 | {'precision': 0.799163179916318, 'recall': 0.9017941454202077, 'f1-score': 0.847382431233363, 'support': 1059.0} | {'precision': 0.9557297671201291, 'recall': 0.9433854907539118, 'f1-score': 0.9495175099504624, 'support': 17575.0} | {'precision': 0.9017723681400811, 'recall': 0.9106199460916442, 'f1-score': 0.9061745614505659, 'support': 9275.0} | 0.9309 | {'precision': 0.8855551050588427, 'recall': 0.9185998607552546, 'f1-score': 0.9010248342114636, 'support': 27909.0} | {'precision': 0.9318572209382959, 'recall': 0.9309183417535563, 'f1-score': 0.9312378547962845, 'support': 27909.0} |
+ | No log | 4.0 | 164 | 0.1717 | {'precision': 0.825491873396065, 'recall': 0.9112370160528801, 'f1-score': 0.8662477558348295, 'support': 1059.0} | {'precision': 0.9546820940389087, 'recall': 0.957724039829303, 'f1-score': 0.9562006476168834, 'support': 17575.0} | {'precision': 0.9242507410253595, 'recall': 0.9077088948787062, 'f1-score': 0.915905134899913, 'support': 9275.0} | 0.9393 | {'precision': 0.9014749028201111, 'recall': 0.9255566502536298, 'f1-score': 0.9127845127838753, 'support': 27909.0} | {'precision': 0.9396667497821657, 'recall': 0.9393385646207316, 'f1-score': 0.9393959970436956, 'support': 27909.0} |
+ | No log | 5.0 | 205 | 0.1734 | {'precision': 0.8358078602620087, 'recall': 0.9036827195467422, 'f1-score': 0.868421052631579, 'support': 1059.0} | {'precision': 0.9562692176289717, 'recall': 0.9555618776671408, 'f1-score': 0.9559154167971085, 'support': 17575.0} | {'precision': 0.9189306672462508, 'recall': 0.9116981132075471, 'f1-score': 0.915300102830546, 'support': 9275.0} | 0.9390 | {'precision': 0.903669248379077, 'recall': 0.9236475701404768, 'f1-score': 0.9132121907530778, 'support': 27909.0} | {'precision': 0.9392896184942356, 'recall': 0.9390160880002867, 'f1-score': 0.9390977748647152, 'support': 27909.0} |
+ | No log | 6.0 | 246 | 0.1677 | {'precision': 0.8308759757155247, 'recall': 0.9046270066100094, 'f1-score': 0.8661844484629294, 'support': 1059.0} | {'precision': 0.9521587587137396, 'recall': 0.9636984352773826, 'f1-score': 0.9578938438480898, 'support': 17575.0} | {'precision': 0.9325379125780553, 'recall': 0.9016711590296496, 'f1-score': 0.9168448171901551, 'support': 9275.0} | 0.9408 | {'precision': 0.9051908823357732, 'recall': 0.9233322003056804, 'f1-score': 0.9136410365003914, 'support': 27909.0} | {'precision': 0.9410361167307384, 'recall': 0.9408434555161418, 'f1-score': 0.9407721278437462, 'support': 27909.0} |
+ | No log | 7.0 | 287 | 0.1719 | {'precision': 0.852017937219731, 'recall': 0.8970727101038716, 'f1-score': 0.8739650413983441, 'support': 1059.0} | {'precision': 0.9538791159224177, 'recall': 0.9626173541963016, 'f1-score': 0.9582283141230779, 'support': 17575.0} | {'precision': 0.9301170236255244, 'recall': 0.9083557951482479, 'f1-score': 0.919107620138548, 'support': 9275.0} | 0.9421 | {'precision': 0.912004692255891, 'recall': 0.9226819531494738, 'f1-score': 0.91710032521999, 'support': 27909.0} | {'precision': 0.9421171612017244, 'recall': 0.9420975312623168, 'f1-score': 0.9420299823117623, 'support': 27909.0} |
 
 
  ### Framework versions
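
The seven-epoch training results table in this README.md diff reports a validation loss for every epoch, and the final epoch is not the one with the lowest loss. As a minimal sketch (variable names are illustrative, and the values are copied by hand from that table), the following finds the epoch with the lowest validation loss and the epoch with the highest accuracy. It also estimates the training set size from the Step column and the `train_batch_size` of 8 listed in the card's hyperparameters, under the assumption of a single device, no gradient accumulation, and a retained final partial batch; none of these is confirmed by the commit.

```python
# Per-epoch values copied by hand from the seven-epoch training results table.
epochs = [1, 2, 3, 4, 5, 6, 7]
steps = [41, 82, 123, 164, 205, 246, 287]
val_loss = [0.2928, 0.1943, 0.1853, 0.1717, 0.1734, 0.1677, 0.1719]
accuracy = [0.8981, 0.9310, 0.9309, 0.9393, 0.9390, 0.9408, 0.9421]

# Epoch with the lowest validation loss and epoch with the highest accuracy.
best_loss_epoch = min(zip(val_loss, epochs))[1]
best_acc_epoch = max(zip(accuracy, epochs))[1]

# Optimizer steps per epoch, taken from the spacing of the Step column.
step_deltas = {b - a for a, b in zip([0] + steps, steps)}
assert len(step_deltas) == 1, "steps per epoch should be constant"
steps_per_epoch = step_deltas.pop()

# ASSUMPTION: one device, no gradient accumulation, last partial batch kept.
train_batch_size = 8
min_examples = (steps_per_epoch - 1) * train_batch_size + 1
max_examples = steps_per_epoch * train_batch_size

print(f"lowest validation loss at epoch {best_loss_epoch}")
print(f"highest accuracy at epoch {best_acc_epoch}")
print(f"{steps_per_epoch} steps per epoch, so roughly "
      f"{min_examples} to {max_examples} training examples")
```

The card's headline metrics are those of the final epoch (loss 0.1719). Whether an earlier checkpoint was also saved is not stated in this commit.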
meta_data/README_s42_e7.md ADDED
@@ -0,0 +1,86 @@
+ ---
+ license: apache-2.0
+ base_model: allenai/longformer-base-4096
+ tags:
+ - generated_from_trainer
+ datasets:
+ - essays_su_g
+ metrics:
+ - accuracy
+ model-index:
+ - name: longformer-spans
+ results:
+ - task:
+ name: Token Classification
+ type: token-classification
+ dataset:
+ name: essays_su_g
+ type: essays_su_g
+ config: spans
+ split: test
+ args: spans
+ metrics:
+ - name: Accuracy
+ type: accuracy
+ value: 0.9420975312623168
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # longformer-spans
+
+ This model is a fine-tuned version of [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) on the essays_su_g dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 0.1719
+ - B: {'precision': 0.852017937219731, 'recall': 0.8970727101038716, 'f1-score': 0.8739650413983441, 'support': 1059.0}
+ - I: {'precision': 0.9538791159224177, 'recall': 0.9626173541963016, 'f1-score': 0.9582283141230779, 'support': 17575.0}
+ - O: {'precision': 0.9301170236255244, 'recall': 0.9083557951482479, 'f1-score': 0.919107620138548, 'support': 9275.0}
+ - Accuracy: 0.9421
+ - Macro avg: {'precision': 0.912004692255891, 'recall': 0.9226819531494738, 'f1-score': 0.91710032521999, 'support': 27909.0}
+ - Weighted avg: {'precision': 0.9421171612017244, 'recall': 0.9420975312623168, 'f1-score': 0.9420299823117623, 'support': 27909.0}
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 2e-05
+ - train_batch_size: 8
+ - eval_batch_size: 8
+ - seed: 42
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - num_epochs: 7
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | B | I | O | Accuracy | Macro avg | Weighted avg |
+ |:-------------:|:-----:|:----:|:---------------:|:-------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|:------------------------------------------------------------------------------------------------------------------:|:--------:|:-------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|
+ | No log | 1.0 | 41 | 0.2928 | {'precision': 0.8236434108527132, 'recall': 0.40132200188857414, 'f1-score': 0.5396825396825397, 'support': 1059.0} | {'precision': 0.9120444175691276, 'recall': 0.9440113798008535, 'f1-score': 0.9277526142146172, 'support': 17575.0} | {'precision': 0.8748098239513149, 'recall': 0.8679245283018868, 'f1-score': 0.87135357471451, 'support': 9275.0} | 0.8981 | {'precision': 0.8701658841243853, 'recall': 0.7377526366637716, 'f1-score': 0.7795962428705557, 'support': 27909.0} | {'precision': 0.8963158883521046, 'recall': 0.8981332186749794, 'f1-score': 0.8942842957405419, 'support': 27909.0} |
+ | No log | 2.0 | 82 | 0.1943 | {'precision': 0.8109318996415771, 'recall': 0.8545797922568461, 'f1-score': 0.8321839080459771, 'support': 1059.0} | {'precision': 0.9395201599466845, 'recall': 0.9625604551920341, 'f1-score': 0.9509007616424496, 'support': 17575.0} | {'precision': 0.9288721975645841, 'recall': 0.88, 'f1-score': 0.9037758830694275, 'support': 9275.0} | 0.9310 | {'precision': 0.8931080857176151, 'recall': 0.8990467491496267, 'f1-score': 0.895620184252618, 'support': 27909.0} | {'precision': 0.9311022725713902, 'recall': 0.9310258339603712, 'f1-score': 0.9307350661061193, 'support': 27909.0} |
+ | No log | 3.0 | 123 | 0.1853 | {'precision': 0.799163179916318, 'recall': 0.9017941454202077, 'f1-score': 0.847382431233363, 'support': 1059.0} | {'precision': 0.9557297671201291, 'recall': 0.9433854907539118, 'f1-score': 0.9495175099504624, 'support': 17575.0} | {'precision': 0.9017723681400811, 'recall': 0.9106199460916442, 'f1-score': 0.9061745614505659, 'support': 9275.0} | 0.9309 | {'precision': 0.8855551050588427, 'recall': 0.9185998607552546, 'f1-score': 0.9010248342114636, 'support': 27909.0} | {'precision': 0.9318572209382959, 'recall': 0.9309183417535563, 'f1-score': 0.9312378547962845, 'support': 27909.0} |
+ | No log | 4.0 | 164 | 0.1717 | {'precision': 0.825491873396065, 'recall': 0.9112370160528801, 'f1-score': 0.8662477558348295, 'support': 1059.0} | {'precision': 0.9546820940389087, 'recall': 0.957724039829303, 'f1-score': 0.9562006476168834, 'support': 17575.0} | {'precision': 0.9242507410253595, 'recall': 0.9077088948787062, 'f1-score': 0.915905134899913, 'support': 9275.0} | 0.9393 | {'precision': 0.9014749028201111, 'recall': 0.9255566502536298, 'f1-score': 0.9127845127838753, 'support': 27909.0} | {'precision': 0.9396667497821657, 'recall': 0.9393385646207316, 'f1-score': 0.9393959970436956, 'support': 27909.0} |
+ | No log | 5.0 | 205 | 0.1734 | {'precision': 0.8358078602620087, 'recall': 0.9036827195467422, 'f1-score': 0.868421052631579, 'support': 1059.0} | {'precision': 0.9562692176289717, 'recall': 0.9555618776671408, 'f1-score': 0.9559154167971085, 'support': 17575.0} | {'precision': 0.9189306672462508, 'recall': 0.9116981132075471, 'f1-score': 0.915300102830546, 'support': 9275.0} | 0.9390 | {'precision': 0.903669248379077, 'recall': 0.9236475701404768, 'f1-score': 0.9132121907530778, 'support': 27909.0} | {'precision': 0.9392896184942356, 'recall': 0.9390160880002867, 'f1-score': 0.9390977748647152, 'support': 27909.0} |
+ | No log | 6.0 | 246 | 0.1677 | {'precision': 0.8308759757155247, 'recall': 0.9046270066100094, 'f1-score': 0.8661844484629294, 'support': 1059.0} | {'precision': 0.9521587587137396, 'recall': 0.9636984352773826, 'f1-score': 0.9578938438480898, 'support': 17575.0} | {'precision': 0.9325379125780553, 'recall': 0.9016711590296496, 'f1-score': 0.9168448171901551, 'support': 9275.0} | 0.9408 | {'precision': 0.9051908823357732, 'recall': 0.9233322003056804, 'f1-score': 0.9136410365003914, 'support': 27909.0} | {'precision': 0.9410361167307384, 'recall': 0.9408434555161418, 'f1-score': 0.9407721278437462, 'support': 27909.0} |
+ | No log | 7.0 | 287 | 0.1719 | {'precision': 0.852017937219731, 'recall': 0.8970727101038716, 'f1-score': 0.8739650413983441, 'support': 1059.0} | {'precision': 0.9538791159224177, 'recall': 0.9626173541963016, 'f1-score': 0.9582283141230779, 'support': 17575.0} | {'precision': 0.9301170236255244, 'recall': 0.9083557951482479, 'f1-score': 0.919107620138548, 'support': 9275.0} | 0.9421 | {'precision': 0.912004692255891, 'recall': 0.9226819531494738, 'f1-score': 0.91710032521999, 'support': 27909.0} | {'precision': 0.9421171612017244, 'recall': 0.9420975312623168, 'f1-score': 0.9420299823117623, 'support': 27909.0} |
+
+
+ ### Framework versions
+
+ - Transformers 4.37.2
+ - Pytorch 2.2.0+cu121
+ - Datasets 2.17.0
+ - Tokenizers 0.15.2
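
The card lists per-class precision, recall, and support for the B, I, and O labels alongside macro and weighted averages. As a rough sketch of how those summary rows relate to the per-class rows (helper names are illustrative, and the numbers are copied by hand from the final-epoch metrics in the card), the macro average is the unweighted mean over classes and the weighted average uses each class's support as its weight. For single-label classification the support-weighted recall is the same quantity as accuracy, which is why the two values coincide in the card.

```python
# Per-class metrics for the final epoch, copied by hand from the card.
report = {
    "B": {"precision": 0.852017937219731, "recall": 0.8970727101038716, "support": 1059},
    "I": {"precision": 0.9538791159224177, "recall": 0.9626173541963016, "support": 17575},
    "O": {"precision": 0.9301170236255244, "recall": 0.9083557951482479, "support": 9275},
}


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def macro_avg(rep, key):
    """Unweighted mean of a metric over all classes."""
    return sum(cls[key] for cls in rep.values()) / len(rep)


def weighted_avg(rep, key):
    """Mean of a metric over all classes, weighted by class support."""
    total = sum(cls["support"] for cls in rep.values())
    return sum(cls[key] * cls["support"] for cls in rep.values()) / total


total_support = sum(cls["support"] for cls in report.values())
macro_precision = macro_avg(report, "precision")
macro_recall = macro_avg(report, "recall")
weighted_recall = weighted_avg(report, "recall")  # equals accuracy here
b_f1 = f1(report["B"]["precision"], report["B"]["recall"])

print(f"support total:    {total_support}")
print(f"macro precision:  {macro_precision:.6f}")
print(f"macro recall:     {macro_recall:.6f}")
print(f"weighted recall:  {weighted_recall:.6f}")
print(f"F1 for class B:   {b_f1:.6f}")
```

The B class has by far the smallest support (1059 of 27909 tokens), so its lower scores pull the macro averages down more than the weighted ones.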
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:ea81ae1e903bba3a0635400b9d86fc140ba478c316059d540e36508b36612e36
  size 592318676
 
  version https://git-lfs.github.com/spec/v1
+ oid sha256:69b8cf6aa88e498db415110309a466d35f0a0829e2967c9f5b5963adfe08f0bc
  size 592318676
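
The `model.safetensors` entry in this diff is a Git LFS pointer, not the weights themselves: only the `oid` changes, while the size stays at 592318676 bytes. The sketch below shows one way to parse such a pointer and check a downloaded file against it. The function names are hypothetical, and it assumes the pointer's simple `key value` line layout and a SHA-256 `oid`, as seen in the diff.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and SHA-256 digest."""
    hasher = hashlib.sha256()
    size = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


# New pointer contents, copied from the diff.
NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:69b8cf6aa88e498db415110309a466d35f0a0829e2967c9f5b5963adfe08f0bc
size 592318676
"""

pointer = parse_lfs_pointer(NEW_POINTER)
print(pointer["hash_algo"], pointer["digest"][:12], pointer["size"])
```

Calling `matches_pointer("model.safetensors", pointer)` on a local copy would then confirm that the download corresponds to this commit. The path here is only an example.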