---
license: apache-2.0
base_model: allenai/longformer-base-4096
tags:
- generated_from_trainer
datasets:
- essays_su_g
metrics:
- accuracy
model-index:
- name: longformer-spans
  results:
  - task:
      name: Token Classification
      type: token-classification
    dataset:
      name: essays_su_g
      type: essays_su_g
      config: spans
      split: train[80%:100%]
      args: spans
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.9396792063434593
---
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->
# longformer-spans
This model is a fine-tuned version of [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) on the essays_su_g dataset.
It achieves the following results on the evaluation set:
- Loss: 0.2269
- B: {'precision': 0.8579285059578369, 'recall': 0.8974113135186961, 'f1-score': 0.8772258669165884, 'support': 1043.0}
- I: {'precision': 0.9460510739049551, 'recall': 0.9672622478386167, 'f1-score': 0.9565390863233492, 'support': 17350.0}
- O: {'precision': 0.9369666628740471, 'recall': 0.8925861695209192, 'f1-score': 0.9142381348875936, 'support': 9226.0}
- Accuracy: 0.9397
- Macro avg: {'precision': 0.9136487475789464, 'recall': 0.9190865769594107, 'f1-score': 0.9160010293758437, 'support': 27619.0}
- Weighted avg: {'precision': 0.9396886199949656, 'recall': 0.9396792063434593, 'f1-score': 0.9394134747592979, 'support': 27619.0}
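
The averaged rows can be reproduced from the per-class rows. The following is a minimal sketch, assuming the usual classification-report definitions: the macro average is the unweighted mean over classes, the weighted average is the support-weighted mean, and accuracy equals support-weighted recall for single-label token classification. The names `per_class`, `macro_avg`, and `weighted_avg` are illustrative and not part of the training code.

```python
# Per-class evaluation metrics copied from the list above.
per_class = {
    "B": {"precision": 0.8579285059578369, "recall": 0.8974113135186961,
          "f1-score": 0.8772258669165884, "support": 1043},
    "I": {"precision": 0.9460510739049551, "recall": 0.9672622478386167,
          "f1-score": 0.9565390863233492, "support": 17350},
    "O": {"precision": 0.9369666628740471, "recall": 0.8925861695209192,
          "f1-score": 0.9142381348875936, "support": 9226},
}


def macro_avg(metric):
    """Unweighted mean of a metric over the three classes."""
    return sum(row[metric] for row in per_class.values()) / len(per_class)


def weighted_avg(metric):
    """Mean of a metric weighted by each class's support (token count)."""
    total = sum(row["support"] for row in per_class.values())
    return sum(row[metric] * row["support"] for row in per_class.values()) / total


total_support = sum(row["support"] for row in per_class.values())
macro_f1 = macro_avg("f1-score")
weighted_f1 = weighted_avg("f1-score")
# Support-weighted recall is the fraction of all tokens labelled correctly.
accuracy = weighted_avg("recall")

print(total_support, macro_f1, weighted_f1, accuracy)
```

Because `I` tokens make up most of the evaluation set, the weighted average sits close to the `I` score, while the macro average is pulled down by the less frequent `B` class.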
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 11
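
As a rough illustration of what `lr_scheduler_type: linear` implies for these settings, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps (the card lists no warmup value) and takes the total of 451 steps from the training results table (41 steps per epoch over 11 epochs). The function `linear_lr` is a hypothetical helper written for this example, not a library call.

```python
# Values taken from the hyperparameter list and the training results table.
LEARNING_RATE = 2e-05
STEPS_PER_EPOCH = 41
NUM_EPOCHS = 11
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(step, base_lr=LEARNING_RATE, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero.

    warmup_steps=0 is an assumption; the card does not report a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the end of each epoch under these assumptions.
for epoch in range(1, NUM_EPOCHS + 1):
    step = epoch * STEPS_PER_EPOCH
    print(f"epoch {epoch:2d}  step {step:3d}  lr {linear_lr(step):.3e}")
```

Under these assumptions the learning rate starts at 2e-05 and reaches zero at the final step, so later epochs make progressively smaller updates.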
### Training results
| Training Loss | Epoch | Step | Validation Loss | B | I | O | Accuracy | Macro avg | Weighted avg |
|:-------------:|:-----:|:----:|:---------------:|:-------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|:------------------------------------------------------------------------------------------------------------------:|:--------:|:-------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|
| No log | 1.0 | 41 | 0.2961 | {'precision': 0.8145695364238411, 'recall': 0.3537871524448706, 'f1-score': 0.49331550802139046, 'support': 1043.0} | {'precision': 0.8821044947335489, 'recall': 0.9750432276657061, 'f1-score': 0.9262483574244416, 'support': 17350.0} | {'precision': 0.9316474712068102, 'recall': 0.8066334272707566, 'f1-score': 0.8646450563494831, 'support': 9226.0} | 0.8953 | {'precision': 0.8761071674547334, 'recall': 0.7118212691271112, 'f1-score': 0.7614029739317717, 'support': 27619.0} | {'precision': 0.8961037177114005, 'recall': 0.8953256815960028, 'f1-score': 0.8893208431174447, 'support': 27619.0} |
| No log | 2.0 | 82 | 0.2112 | {'precision': 0.7865546218487395, 'recall': 0.8974113135186961, 'f1-score': 0.83833407971339, 'support': 1043.0} | {'precision': 0.924818593485733, 'recall': 0.9770028818443804, 'f1-score': 0.9501947924549455, 'support': 17350.0} | {'precision': 0.9595061728395061, 'recall': 0.8424019076522871, 'f1-score': 0.8971487937204201, 'support': 9226.0} | 0.9290 | {'precision': 0.8902931293913262, 'recall': 0.905605367671788, 'f1-score': 0.8952258886295853, 'support': 27619.0} | {'precision': 0.931184438907382, 'recall': 0.9290343604040696, 'f1-score': 0.9282507283065632, 'support': 27619.0} |
| No log | 3.0 | 123 | 0.1780 | {'precision': 0.8625954198473282, 'recall': 0.8667305848513902, 'f1-score': 0.8646580583452893, 'support': 1043.0} | {'precision': 0.9670896584440227, 'recall': 0.94, 'f1-score': 0.9533524288303034, 'support': 17350.0} | {'precision': 0.8919336561244463, 'recall': 0.9384348580099718, 'f1-score': 0.9145935667881477, 'support': 9226.0} | 0.9367 | {'precision': 0.9072062448052658, 'recall': 0.9150551476204539, 'f1-score': 0.9108680179879135, 'support': 27619.0} | {'precision': 0.9380380357112386, 'recall': 0.9367102357073029, 'f1-score': 0.9370557674878652, 'support': 27619.0} |
| No log | 4.0 | 164 | 0.1855 | {'precision': 0.845719661335842, 'recall': 0.8619367209971237, 'f1-score': 0.8537511870845204, 'support': 1043.0} | {'precision': 0.9384769427601936, 'recall': 0.9723919308357348, 'f1-score': 0.9551334673196139, 'support': 17350.0} | {'precision': 0.9438162956055485, 'recall': 0.8776284413613701, 'f1-score': 0.909519797809604, 'support': 9226.0} | 0.9366 | {'precision': 0.9093376332338613, 'recall': 0.9039856977314096, 'f1-score': 0.9061348174045795, 'support': 27619.0} | {'precision': 0.9367576562120075, 'recall': 0.9365654078713929, 'f1-score': 0.9360678446256514, 'support': 27619.0} |
| No log | 5.0 | 205 | 0.1980 | {'precision': 0.8513761467889909, 'recall': 0.8897411313518696, 'f1-score': 0.8701359587435537, 'support': 1043.0} | {'precision': 0.9398964883966832, 'recall': 0.9734293948126801, 'f1-score': 0.9563690931226817, 'support': 17350.0} | {'precision': 0.9483644859813084, 'recall': 0.8799046173856493, 'f1-score': 0.9128528055774204, 'support': 9226.0} | 0.9390 | {'precision': 0.9132123737223274, 'recall': 0.9143583811833996, 'f1-score': 0.9131192858145519, 'support': 27619.0} | {'precision': 0.9393823144374135, 'recall': 0.939027481081864, 'f1-score': 0.9385761814296439, 'support': 27619.0} |
| No log | 6.0 | 246 | 0.1904 | {'precision': 0.828695652173913, 'recall': 0.9137104506232023, 'f1-score': 0.8691290469676243, 'support': 1043.0} | {'precision': 0.9355999778503793, 'recall': 0.9738328530259366, 'f1-score': 0.9543336439888163, 'support': 17350.0} | {'precision': 0.9504161712247324, 'recall': 0.8663559505744635, 'f1-score': 0.906441369925153, 'support': 9226.0} | 0.9357 | {'precision': 0.9049039337496749, 'recall': 0.917966418074534, 'f1-score': 0.9099680202938645, 'support': 27619.0} | {'precision': 0.9365121393475814, 'recall': 0.935660233896955, 'f1-score': 0.9351177956523646, 'support': 27619.0} |
| No log | 7.0 | 287 | 0.1881 | {'precision': 0.8546296296296296, 'recall': 0.8849472674976031, 'f1-score': 0.8695242581252944, 'support': 1043.0} | {'precision': 0.9503849443969205, 'recall': 0.9605187319884726, 'f1-score': 0.9554249677511824, 'support': 17350.0} | {'precision': 0.9249222567747668, 'recall': 0.9026663776284414, 'f1-score': 0.9136588041689524, 'support': 9226.0} | 0.9383 | {'precision': 0.909978943600439, 'recall': 0.916044125704839, 'f1-score': 0.9128693433484765, 'support': 27619.0} | {'precision': 0.9382631605052417, 'recall': 0.9383395488612911, 'f1-score': 0.9382292305648449, 'support': 27619.0} |
| No log | 8.0 | 328 | 0.2086 | {'precision': 0.8519195612431444, 'recall': 0.8935762224352828, 'f1-score': 0.8722508189050071, 'support': 1043.0} | {'precision': 0.9404728634508971, 'recall': 0.9697982708933718, 'f1-score': 0.9549104735960954, 'support': 17350.0} | {'precision': 0.940699559879546, 'recall': 0.8803381747236072, 'f1-score': 0.9095184770436731, 'support': 9226.0} | 0.9370 | {'precision': 0.9110306615245292, 'recall': 0.9145708893507539, 'f1-score': 0.9122265898482586, 'support': 27619.0} | {'precision': 0.937204476001968, 'recall': 0.9370360983381005, 'f1-score': 0.9366259383111303, 'support': 27619.0} |
| No log | 9.0 | 369 | 0.2106 | {'precision': 0.8523245214220602, 'recall': 0.8964525407478428, 'f1-score': 0.8738317757009345, 'support': 1043.0} | {'precision': 0.9503868912152936, 'recall': 0.9627665706051873, 'f1-score': 0.9565366775468134, 'support': 17350.0} | {'precision': 0.929689246590655, 'recall': 0.9014740949490571, 'f1-score': 0.9153642967202289, 'support': 9226.0} | 0.9398 | {'precision': 0.9108002197426696, 'recall': 0.9202310687673624, 'f1-score': 0.9152442499893256, 'support': 27619.0} | {'precision': 0.9397697247356507, 'recall': 0.9397878272203918, 'f1-score': 0.9396599767925747, 'support': 27619.0} |
| No log | 10.0 | 410 | 0.2272 | {'precision': 0.8681214421252372, 'recall': 0.8772770853307766, 'f1-score': 0.8726752503576539, 'support': 1043.0} | {'precision': 0.9461503725445924, 'recall': 0.9661095100864553, 'f1-score': 0.9560257799577938, 'support': 17350.0} | {'precision': 0.932873771047576, 'recall': 0.8947539562107089, 'f1-score': 0.9134163208852005, 'support': 9226.0} | 0.9389 | {'precision': 0.9157151952391351, 'recall': 0.9127135172093136, 'f1-score': 0.9140391170668827, 'support': 27619.0} | {'precision': 0.938768711375149, 'recall': 0.9389188602049314, 'f1-score': 0.938644648425997, 'support': 27619.0} |
| No log | 11.0 | 451 | 0.2269 | {'precision': 0.8579285059578369, 'recall': 0.8974113135186961, 'f1-score': 0.8772258669165884, 'support': 1043.0} | {'precision': 0.9460510739049551, 'recall': 0.9672622478386167, 'f1-score': 0.9565390863233492, 'support': 17350.0} | {'precision': 0.9369666628740471, 'recall': 0.8925861695209192, 'f1-score': 0.9142381348875936, 'support': 9226.0} | 0.9397 | {'precision': 0.9136487475789464, 'recall': 0.9190865769594107, 'f1-score': 0.9160010293758437, 'support': 27619.0} | {'precision': 0.9396886199949656, 'recall': 0.9396792063434593, 'f1-score': 0.9394134747592979, 'support': 27619.0} |
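
The table can also be read for checkpoint selection. The sketch below transcribes the validation loss, accuracy, and macro-average F1 (rounded to four decimals) for each epoch and finds the best epoch under each criterion. The card does not state which criterion, if any, was used to pick the final checkpoint; the results reported at the top of this card match the epoch 11 row. The `history` list and variable names are illustrative.

```python
# (epoch, validation loss, accuracy, macro-average F1) transcribed from the table above.
history = [
    (1, 0.2961, 0.8953, 0.7614),
    (2, 0.2112, 0.9290, 0.8952),
    (3, 0.1780, 0.9367, 0.9109),
    (4, 0.1855, 0.9366, 0.9061),
    (5, 0.1980, 0.9390, 0.9131),
    (6, 0.1904, 0.9357, 0.9100),
    (7, 0.1881, 0.9383, 0.9129),
    (8, 0.2086, 0.9370, 0.9122),
    (9, 0.2106, 0.9398, 0.9152),
    (10, 0.2272, 0.9389, 0.9140),
    (11, 0.2269, 0.9397, 0.9160),
]

# Lower is better for loss; higher is better for accuracy and F1.
best_by_loss = min(history, key=lambda row: row[1])
best_by_accuracy = max(history, key=lambda row: row[2])
best_by_macro_f1 = max(history, key=lambda row: row[3])

print("lowest validation loss: epoch", best_by_loss[0])
print("highest accuracy:       epoch", best_by_accuracy[0])
print("highest macro F1:       epoch", best_by_macro_f1[0])
```

The three criteria disagree: validation loss is lowest early in training, while accuracy and macro F1 keep improving slightly in later epochs, so the choice of criterion changes which checkpoint looks best.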
### Framework versions
- Transformers 4.37.2
- Pytorch 2.2.0+cu121
- Datasets 2.17.0
- Tokenizers 0.15.2