---
library_name: transformers
license: apache-2.0
base_model: allenai/longformer-base-4096
tags:
- generated_from_trainer
datasets:
- stab-gurevych-essays
metrics:
- accuracy
model-index:
- name: longformer-spans
  results:
  - task:
      name: Token Classification
      type: token-classification
    dataset:
      name: stab-gurevych-essays
      type: stab-gurevych-essays
      config: spans
      split: train[0%:20%]
      args: spans
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.9739926865110556
---
# longformer-spans
This model is a fine-tuned version of [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) on the stab-gurevych-essays dataset. It achieves the following results on the evaluation set:

- Loss: 0.0856
- Accuracy: 0.9740

| Label | Precision | Recall | F1-score | Support |
|:-------------|---------:|-------:|---------:|--------:|
| B            | 0.8861   | 0.9135 | 0.8996   | 1133    |
| I            | 0.9856   | 0.9787 | 0.9821   | 18277   |
| O            | 0.9631   | 0.9723 | 0.9677   | 9851    |
| Macro avg    | 0.9449   | 0.9548 | 0.9498   | 29261   |
| Weighted avg | 0.9742   | 0.9740 | 0.9741   | 29261   |
## Model description

A Longformer-based token classifier for argumentative span detection: each token of an essay is tagged as beginning an argument span (`B`), continuing one (`I`), or lying outside any span (`O`). The 4096-token context window of the base model allows whole essays to be processed in a single forward pass.
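Since the card does not yet include a usage snippet, here is a minimal inference sketch. The checkpoint identifier `longformer-spans` is taken from this card's title and may differ from the actual Hub ID or local path of the trained model.

```python
# Minimal inference sketch. "longformer-spans" is assumed from the card
# title; replace it with the real Hub ID or a local checkpoint path.
from transformers import pipeline

tagger = pipeline(
    "token-classification",
    model="longformer-spans",
    aggregation_strategy="simple",  # merge sub-word tokens into contiguous spans
)

essay = "Schools should ban homework because it adds stress without improving learning."
for span in tagger(essay):
    print(span["entity_group"], span["word"], round(span["score"], 3))
```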
Intended uses & limitations
More information needed
## Training and evaluation data

Training and evaluation use the stab-gurevych-essays dataset (`spans` configuration). Per the metadata above, the reported metrics were computed on the first 20% of the `train` split (`train[0%:20%]`), which can be loaded as sketched below.
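A sketch of loading that evaluation slice; `"stab-gurevych-essays"` is the dataset name as given in this card, not a verified Hub ID.

```python
# Load the evaluation slice named in the card metadata.
# "stab-gurevych-essays" is assumed from the card; the actual
# Hub dataset ID may include a namespace prefix.
from datasets import load_dataset

eval_data = load_dataset("stab-gurevych-essays", "spans", split="train[0%:20%]")
print(eval_data)
```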
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a `TrainingArguments` sketch reproducing them follows the list):
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 5
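A minimal sketch of a `TrainingArguments` configuration matching the list above (Transformers 4.46.0); `output_dir` is a placeholder.

```python
# Sketch of TrainingArguments matching the hyperparameters above.
# output_dir is a placeholder, not the author's actual path.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="longformer-spans",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=5,
)
```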
### Training results
Per-label cells report precision / recall / F1 (supports: B = 1133, I = 18277, O = 9851; 29261 tokens in total).

| Training Loss | Epoch | Step | Validation Loss | B (P / R / F1) | I (P / R / F1) | O (P / R / F1) | Accuracy | Macro avg (P / R / F1) | Weighted avg (P / R / F1) |
|:--------------|------:|-----:|----------------:|:---------------|:---------------|:---------------|---------:|:-----------------------|:--------------------------|
| No log | 1.0 | 41  | 0.2075 | 0.8258 / 0.7282 / 0.7739 | 0.9306 / 0.9713 / 0.9505 | 0.9388 / 0.8754 / 0.9060 | 0.9296 | 0.8984 / 0.8583 / 0.8768 | 0.9293 / 0.9296 / 0.9287 |
| No log | 2.0 | 82  | 0.1039 | 0.7818 / 0.9391 / 0.8532 | 0.9751 / 0.9764 / 0.9758 | 0.9661 / 0.9413 / 0.9536 | 0.9632 | 0.9077 / 0.9523 / 0.9275 | 0.9646 / 0.9632 / 0.9635 |
| No log | 3.0 | 123 | 0.0875 | 0.8751 / 0.9153 / 0.8947 | 0.9870 / 0.9742 / 0.9806 | 0.9562 / 0.9741 / 0.9651 | 0.9719 | 0.9394 / 0.9545 / 0.9468 | 0.9723 / 0.9719 / 0.9720 |
| No log | 4.0 | 164 | 0.0825 | 0.8817 / 0.9144 / 0.8977 | 0.9845 / 0.9792 / 0.9818 | 0.9639 / 0.9695 / 0.9667 | 0.9734 | 0.9434 / 0.9544 / 0.9488 | 0.9736 / 0.9734 / 0.9735 |
| No log | 5.0 | 205 | 0.0856 | 0.8861 / 0.9135 / 0.8996 | 0.9856 / 0.9787 / 0.9821 | 0.9631 / 0.9723 / 0.9677 | 0.9740 | 0.9449 / 0.9548 / 0.9498 | 0.9742 / 0.9740 / 0.9741 |
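The per-label metrics above have the shape of scikit-learn's `classification_report` output; a sketch of how such numbers can be reproduced from flattened token labels (the toy `y_true`/`y_pred` lists are illustrative assumptions, not the evaluation data):

```python
# Per-label precision/recall/F1 in the same shape as the tables above.
# Toy labels for illustration only; real evaluation flattens the
# token-level B/I/O predictions over the whole evaluation split.
from sklearn.metrics import classification_report

y_true = ["O", "O", "B", "I", "I", "O", "B", "I"]
y_pred = ["O", "O", "B", "I", "O", "O", "B", "I"]
print(classification_report(y_true, y_pred, digits=4))
```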
### Framework versions
- Transformers 4.46.0
- Pytorch 2.5.0+cu124
- Datasets 3.0.2
- Tokenizers 0.20.1