metadata
license: apache-2.0
base_model: allenai/longformer-base-4096
tags:
- generated_from_trainer
datasets:
- essays_su_g
metrics:
- accuracy
model-index:
- name: longformer-spans
  results:
  - task:
      name: Token Classification
      type: token-classification
    dataset:
      name: essays_su_g
      type: essays_su_g
      config: spans
      split: test
      args: spans
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.9420975312623168
longformer-spans
This model is a fine-tuned version of allenai/longformer-base-4096 on the essays_su_g dataset (spans configuration). After the final (7th) training epoch, it achieves the following results on the evaluation set:
- Loss: 0.1719
- B: {'precision': 0.852017937219731, 'recall': 0.8970727101038716, 'f1-score': 0.8739650413983441, 'support': 1059.0}
- I: {'precision': 0.9538791159224177, 'recall': 0.9626173541963016, 'f1-score': 0.9582283141230779, 'support': 17575.0}
- O: {'precision': 0.9301170236255244, 'recall': 0.9083557951482479, 'f1-score': 0.919107620138548, 'support': 9275.0}
- Accuracy: 0.9421
- Macro avg: {'precision': 0.912004692255891, 'recall': 0.9226819531494738, 'f1-score': 0.91710032521999, 'support': 27909.0}
- Weighted avg: {'precision': 0.9421171612017244, 'recall': 0.9420975312623168, 'f1-score': 0.9420299823117623, 'support': 27909.0}
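The macro and weighted averages can be reproduced from the per-class rows. The sketch below assumes the usual definitions: macro average is the unweighted mean over the three classes, and weighted average is the support-weighted mean. The helper names `macro_avg` and `weighted_avg` are illustrative, not part of any library.

```python
# Per-class scores copied from the evaluation results above.
per_class = {
    "B": {"precision": 0.852017937219731, "recall": 0.8970727101038716,
          "f1-score": 0.8739650413983441, "support": 1059.0},
    "I": {"precision": 0.9538791159224177, "recall": 0.9626173541963016,
          "f1-score": 0.9582283141230779, "support": 17575.0},
    "O": {"precision": 0.9301170236255244, "recall": 0.9083557951482479,
          "f1-score": 0.919107620138548, "support": 9275.0},
}


def macro_avg(scores, metric):
    """Unweighted mean of one metric over all classes (illustrative helper)."""
    return sum(s[metric] for s in scores.values()) / len(scores)


def weighted_avg(scores, metric):
    """Support-weighted mean of one metric over all classes (illustrative helper)."""
    total = sum(s["support"] for s in scores.values())
    return sum(s[metric] * s["support"] for s in scores.values()) / total


for metric in ("precision", "recall", "f1-score"):
    print(metric,
          round(macro_avg(per_class, metric), 4),
          round(weighted_avg(per_class, metric), 4))
```

Under these definitions the weighted recall coincides with the overall accuracy, since each class's recall times its support is that class's count of correctly labeled tokens. The much smaller support of class B (1059 of 27909 tokens) also explains why its lower scores pull the macro average below the weighted one.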
Model description
More information needed
Intended uses & limitations
More information needed
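The card does not say how the B, I, and O labels are meant to be consumed. As a minimal sketch, assuming they follow the common begin/inside/outside tagging convention (an assumption, not confirmed by this card), per-token predictions could be grouped into spans as follows. The function `bio_to_spans` is a hypothetical helper, and treating a stray `I` as a span start is one lenient design choice among several.

```python
def bio_to_spans(tags):
    """Group B/I/O tags into (start, end) token index pairs, end exclusive.

    Assumes B opens a span, I continues it, and O is outside any span.
    """
    spans = []
    start = None  # index where the currently open span began, if any
    for i, tag in enumerate(tags):
        if tag == "B":
            if start is not None:
                spans.append((start, i))  # close the previous span
            start = i
        elif tag == "I":
            if start is None:
                start = i  # lenient: an I with no preceding B opens a span
        else:  # "O"
            if start is not None:
                spans.append((start, i))
                start = None
    if start is not None:
        spans.append((start, len(tags)))  # close a span that runs to the end
    return spans


tags = ["O", "B", "I", "I", "O", "B", "I"]
print(bio_to_spans(tags))  # [(1, 4), (5, 7)]
```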
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 7
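To make the linear schedule concrete, the sketch below computes the learning rate at the end of each epoch. It takes the 41 optimizer steps per epoch from the Step column of the training results table and assumes zero warmup steps, since the card lists none. The function `linear_lr` is an illustrative re-implementation of linear decay to zero, not a call into the Transformers library.

```python
LEARNING_RATE = 2e-05
NUM_EPOCHS = 7
TRAIN_BATCH_SIZE = 8
STEPS_PER_EPOCH = 41   # from the Step column: 41, 82, ..., 287
WARMUP_STEPS = 0       # assumption: the card does not list any warmup

TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH


def linear_lr(step, base_lr=LEARNING_RATE, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Linear warmup (if any) followed by linear decay to zero at `total` steps."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


for epoch in range(1, NUM_EPOCHS + 1):
    step = epoch * STEPS_PER_EPOCH
    print(f"epoch {epoch}: step {step}, lr {linear_lr(step):.3e}")

# Rough bounds on the training set size implied by 41 batches of 8,
# assuming a single device, no gradient accumulation, and a kept last batch.
max_examples = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE
min_examples = (STEPS_PER_EPOCH - 1) * TRAIN_BATCH_SIZE + 1
print(min_examples, max_examples)
```

Under these assumptions the run takes 287 optimizer steps in total, and the learning rate reaches zero exactly at the last step.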
Training results
Training Loss | Epoch | Step | Validation Loss | B | I | O | Accuracy | Macro avg | Weighted avg |
---|---|---|---|---|---|---|---|---|---|
No log | 1.0 | 41 | 0.2928 | {'precision': 0.8236434108527132, 'recall': 0.40132200188857414, 'f1-score': 0.5396825396825397, 'support': 1059.0} | {'precision': 0.9120444175691276, 'recall': 0.9440113798008535, 'f1-score': 0.9277526142146172, 'support': 17575.0} | {'precision': 0.8748098239513149, 'recall': 0.8679245283018868, 'f1-score': 0.87135357471451, 'support': 9275.0} | 0.8981 | {'precision': 0.8701658841243853, 'recall': 0.7377526366637716, 'f1-score': 0.7795962428705557, 'support': 27909.0} | {'precision': 0.8963158883521046, 'recall': 0.8981332186749794, 'f1-score': 0.8942842957405419, 'support': 27909.0} |
No log | 2.0 | 82 | 0.1943 | {'precision': 0.8109318996415771, 'recall': 0.8545797922568461, 'f1-score': 0.8321839080459771, 'support': 1059.0} | {'precision': 0.9395201599466845, 'recall': 0.9625604551920341, 'f1-score': 0.9509007616424496, 'support': 17575.0} | {'precision': 0.9288721975645841, 'recall': 0.88, 'f1-score': 0.9037758830694275, 'support': 9275.0} | 0.9310 | {'precision': 0.8931080857176151, 'recall': 0.8990467491496267, 'f1-score': 0.895620184252618, 'support': 27909.0} | {'precision': 0.9311022725713902, 'recall': 0.9310258339603712, 'f1-score': 0.9307350661061193, 'support': 27909.0} |
No log | 3.0 | 123 | 0.1853 | {'precision': 0.799163179916318, 'recall': 0.9017941454202077, 'f1-score': 0.847382431233363, 'support': 1059.0} | {'precision': 0.9557297671201291, 'recall': 0.9433854907539118, 'f1-score': 0.9495175099504624, 'support': 17575.0} | {'precision': 0.9017723681400811, 'recall': 0.9106199460916442, 'f1-score': 0.9061745614505659, 'support': 9275.0} | 0.9309 | {'precision': 0.8855551050588427, 'recall': 0.9185998607552546, 'f1-score': 0.9010248342114636, 'support': 27909.0} | {'precision': 0.9318572209382959, 'recall': 0.9309183417535563, 'f1-score': 0.9312378547962845, 'support': 27909.0} |
No log | 4.0 | 164 | 0.1717 | {'precision': 0.825491873396065, 'recall': 0.9112370160528801, 'f1-score': 0.8662477558348295, 'support': 1059.0} | {'precision': 0.9546820940389087, 'recall': 0.957724039829303, 'f1-score': 0.9562006476168834, 'support': 17575.0} | {'precision': 0.9242507410253595, 'recall': 0.9077088948787062, 'f1-score': 0.915905134899913, 'support': 9275.0} | 0.9393 | {'precision': 0.9014749028201111, 'recall': 0.9255566502536298, 'f1-score': 0.9127845127838753, 'support': 27909.0} | {'precision': 0.9396667497821657, 'recall': 0.9393385646207316, 'f1-score': 0.9393959970436956, 'support': 27909.0} |
No log | 5.0 | 205 | 0.1734 | {'precision': 0.8358078602620087, 'recall': 0.9036827195467422, 'f1-score': 0.868421052631579, 'support': 1059.0} | {'precision': 0.9562692176289717, 'recall': 0.9555618776671408, 'f1-score': 0.9559154167971085, 'support': 17575.0} | {'precision': 0.9189306672462508, 'recall': 0.9116981132075471, 'f1-score': 0.915300102830546, 'support': 9275.0} | 0.9390 | {'precision': 0.903669248379077, 'recall': 0.9236475701404768, 'f1-score': 0.9132121907530778, 'support': 27909.0} | {'precision': 0.9392896184942356, 'recall': 0.9390160880002867, 'f1-score': 0.9390977748647152, 'support': 27909.0} |
No log | 6.0 | 246 | 0.1677 | {'precision': 0.8308759757155247, 'recall': 0.9046270066100094, 'f1-score': 0.8661844484629294, 'support': 1059.0} | {'precision': 0.9521587587137396, 'recall': 0.9636984352773826, 'f1-score': 0.9578938438480898, 'support': 17575.0} | {'precision': 0.9325379125780553, 'recall': 0.9016711590296496, 'f1-score': 0.9168448171901551, 'support': 9275.0} | 0.9408 | {'precision': 0.9051908823357732, 'recall': 0.9233322003056804, 'f1-score': 0.9136410365003914, 'support': 27909.0} | {'precision': 0.9410361167307384, 'recall': 0.9408434555161418, 'f1-score': 0.9407721278437462, 'support': 27909.0} |
No log | 7.0 | 287 | 0.1719 | {'precision': 0.852017937219731, 'recall': 0.8970727101038716, 'f1-score': 0.8739650413983441, 'support': 1059.0} | {'precision': 0.9538791159224177, 'recall': 0.9626173541963016, 'f1-score': 0.9582283141230779, 'support': 17575.0} | {'precision': 0.9301170236255244, 'recall': 0.9083557951482479, 'f1-score': 0.919107620138548, 'support': 9275.0} | 0.9421 | {'precision': 0.912004692255891, 'recall': 0.9226819531494738, 'f1-score': 0.91710032521999, 'support': 27909.0} | {'precision': 0.9421171612017244, 'recall': 0.9420975312623168, 'f1-score': 0.9420299823117623, 'support': 27909.0} |
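
The table shows that the lowest validation loss and the highest accuracy fall on different epochs. A short sketch that picks both out, using the validation loss and accuracy columns copied from the table (the variable names are illustrative):

```python
# (epoch, validation loss, accuracy) copied from the training results table.
history = [
    (1, 0.2928, 0.8981),
    (2, 0.1943, 0.9310),
    (3, 0.1853, 0.9309),
    (4, 0.1717, 0.9393),
    (5, 0.1734, 0.9390),
    (6, 0.1677, 0.9408),
    (7, 0.1719, 0.9421),
]

best_by_loss = min(history, key=lambda row: row[1])
best_by_accuracy = max(history, key=lambda row: row[2])

print("lowest validation loss:", best_by_loss)   # (6, 0.1677, 0.9408)
print("highest accuracy:", best_by_accuracy)     # (7, 0.9421 accuracy)
```

The reported final results correspond to epoch 7, which has the highest accuracy, while epoch 6 has the lowest validation loss. The card does not state which criterion, if any, was used for checkpoint selection.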
Framework versions
- Transformers 4.37.2
- PyTorch 2.2.0+cu121
- Datasets 2.17.0
- Tokenizers 0.15.2
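To compare a local environment against these versions, a small check along the following lines may help. It is a sketch that uses the standard-library `importlib.metadata` and assumes the usual PyPI distribution names (`transformers`, `torch`, `datasets`, `tokenizers`). The local build tag `+cu121` is ignored in the comparison, because it depends on the CUDA build rather than on the library release.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed above; the "+cu121" local tag on PyTorch is dropped on purpose.
EXPECTED = {
    "transformers": "4.37.2",
    "torch": "2.2.0",
    "datasets": "2.17.0",
    "tokenizers": "0.15.2",
}


def check_versions(expected=EXPECTED):
    """Return {package: (installed_version_or_None, matches_expected)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        # Strip any local build tag such as "+cu121" before comparing.
        matches = installed is not None and installed.split("+")[0] == wanted
        report[name] = (installed, matches)
    return report


for name, (installed, ok) in check_versions().items():
    status = "ok" if ok else "differs"
    print(f"{name}: installed={installed}, expected={EXPECTED[name]} ({status})")
```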