trainer: training complete at 2024-03-02 12:32:39.062986.
---
license: apache-2.0
base_model: allenai/longformer-base-4096
tags:
- generated_from_trainer
datasets:
- essays_su_g
metrics:
- accuracy
model-index:
- name: longformer-spans
  results:
  - task:
      name: Token Classification
      type: token-classification
    dataset:
      name: essays_su_g
      type: essays_su_g
      config: spans
      split: train[60%:80%]
      args: spans
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.9427462686567164
---
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->
# longformer-spans
This model is a fine-tuned version of [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) on the `spans` configuration of the essays_su_g dataset.
It achieves the following results on the evaluation set (the `train[60%:80%]` split) after the final training epoch:
- Loss: 0.3134
- Accuracy: 0.9427

| Label | Precision | Recall | F1-score | Support |
|:-------------|:------------------:|:------------------:|:------------------:|:-------:|
| B | 0.8714380384360504 | 0.9131944444444444 | 0.8918277382163445 | 1440 |
| I | 0.9559525996693 | 0.9641450873210728 | 0.9600313660370396 | 21587 |
| O | 0.9251394461297583 | 0.9027021865750023 | 0.9137831045814807 | 10473 |
| Macro avg | 0.9175100280783696 | 0.9266805727801732 | 0.9218807362782884 | 33500 |
| Weighted avg | 0.9426867153351061 | 0.9427462686567164 | 0.9426411789837302 | 33500 |
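
The macro and weighted averages can be derived from the per-label rows. The following is a minimal sketch, assuming the averages follow the usual classification-report convention (macro = unweighted mean over labels, weighted = mean weighted by support). The helper names `macro_average` and `weighted_average` are illustrative and not part of the training code.

```python
# Per-label metrics copied from the final evaluation results in this card.
per_class = {
    "B": {"precision": 0.8714380384360504, "recall": 0.9131944444444444,
          "f1-score": 0.8918277382163445, "support": 1440},
    "I": {"precision": 0.9559525996693, "recall": 0.9641450873210728,
          "f1-score": 0.9600313660370396, "support": 21587},
    "O": {"precision": 0.9251394461297583, "recall": 0.9027021865750023,
          "f1-score": 0.9137831045814807, "support": 10473},
}


def macro_average(metrics, key):
    """Unweighted mean of one metric over all labels (hypothetical helper)."""
    return sum(m[key] for m in metrics.values()) / len(metrics)


def weighted_average(metrics, key):
    """Mean of one metric weighted by each label's support (token count)."""
    total = sum(m["support"] for m in metrics.values())
    return sum(m[key] * m["support"] for m in metrics.values()) / total


macro_f1 = macro_average(per_class, "f1-score")
weighted_f1 = weighted_average(per_class, "f1-score")
# For single-label classification, support-weighted recall equals accuracy.
accuracy = weighted_average(per_class, "recall")

print(f"macro F1:    {macro_f1:.4f}")
print(f"weighted F1: {weighted_f1:.4f}")
print(f"accuracy:    {accuracy:.4f}")
```

Because label `I` accounts for roughly two thirds of the 33,500 evaluated tokens, the weighted averages sit close to the `I` scores, while the macro averages give the much rarer `B` label equal weight.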
## Model description
More information needed
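
Until a full description is provided, note that the evaluation labels reported in this card (`B`, `I`, `O`) suggest begin/inside/outside span tagging. The sketch below assumes that convention and shows one way per-token predictions could be grouped into spans. The function `bio_to_spans` is hypothetical and not part of this repository.

```python
from typing import List, Tuple


def bio_to_spans(labels: List[str]) -> List[Tuple[int, int]]:
    """Group B/I/O token labels into (start, end) index pairs, end exclusive.

    Assumes B opens a span, I continues it, and O is outside any span.
    """
    spans = []
    start = None  # index where the currently open span began, if any
    for i, label in enumerate(labels):
        if label == "B":
            if start is not None:
                spans.append((start, i))  # a new B closes the previous span
            start = i
        elif label == "I":
            if start is None:
                start = i  # tolerate an I with no preceding B
        else:  # "O"
            if start is not None:
                spans.append((start, i))
                start = None
    if start is not None:
        spans.append((start, len(labels)))  # close a span that runs to the end
    return spans


example = ["O", "B", "I", "I", "O", "B", "I", "B", "I"]
spans = bio_to_spans(example)
print(spans)
```

Treating a stray `I` as the start of a span is a design choice of this sketch. A stricter decoder could discard such tokens instead.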
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 16
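
To illustrate what `lr_scheduler_type: linear` implies for these settings, here is a minimal sketch of a linearly decaying learning rate. It assumes no warmup (no warmup setting is listed in this card) and takes the 41 optimizer steps per epoch from the training results table. The function `linear_lr` is illustrative, not the Trainer's own implementation.

```python
BASE_LR = 2e-05        # learning_rate from the hyperparameter list
NUM_EPOCHS = 16        # num_epochs from the hyperparameter list
STEPS_PER_EPOCH = 41   # step count at epoch 1.0 in the training results table
WARMUP_STEPS = 0       # assumption: no warmup is reported in this card

TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH


def linear_lr(step, base_lr=BASE_LR, total=TOTAL_STEPS, warmup=WARMUP_STEPS):
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup)
    # Decay linearly from base_lr to 0 over the remaining steps.
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


for epoch in (0, 4, 8, 12, 16):
    step = epoch * STEPS_PER_EPOCH
    print(f"epoch {epoch:2d}  step {step:3d}  lr {linear_lr(step):.2e}")
```

With a train batch size of 8, 41 steps per epoch corresponds to at most 328 training examples per epoch, assuming a single device and no gradient accumulation.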
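### Training results
Training loss appears as `No log` through epoch 12. This is consistent with the Trainer's default of logging training loss every 500 steps: step 500 is first passed during epoch 13, and the value logged there (0.1206) is repeated for the remaining epochs. Validation loss is lowest at epoch 4 (0.1855) and rises afterwards, while accuracy stays near 0.94.
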
| Training Loss | Epoch | Step | Validation Loss | B | I | O | Accuracy | Macro avg | Weighted avg |
|:-------------:|:-----:|:----:|:---------------:|:------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|:--------:|:-------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|
| No log | 1.0 | 41 | 0.3118 | {'precision': 0.6905294556301268, 'recall': 0.6430555555555556, 'f1-score': 0.6659475008989573, 'support': 1440.0} | {'precision': 0.9325347388596071, 'recall': 0.9015611247510076, 'f1-score': 0.9167863956473609, 'support': 21587.0} | {'precision': 0.8138010452653025, 'recall': 0.8772080588179128, 'f1-score': 0.8443157797996509, 'support': 10473.0} | 0.8828 | {'precision': 0.8122884132516788, 'recall': 0.8072749130414919, 'f1-score': 0.8090165587819897, 'support': 33500.0} | {'precision': 0.8850127812218876, 'recall': 0.8828358208955224, 'f1-score': 0.8833478055515172, 'support': 33500.0} |
| No log | 2.0 | 82 | 0.2266 | {'precision': 0.8113207547169812, 'recall': 0.8659722222222223, 'f1-score': 0.8377561303325496, 'support': 1440.0} | {'precision': 0.9191461555216729, 'recall': 0.9773938018251725, 'f1-score': 0.9473755107538951, 'support': 21587.0} | {'precision': 0.9519316163410302, 'recall': 0.8187720805881791, 'f1-score': 0.8803449514911966, 'support': 10473.0} | 0.9230 | {'precision': 0.8941328421932281, 'recall': 0.8873793682118579, 'f1-score': 0.8884921975258804, 'support': 33500.0} | {'precision': 0.9247608884769675, 'recall': 0.9230149253731343, 'f1-score': 0.9217079598594182, 'support': 33500.0} |
| No log | 3.0 | 123 | 0.2044 | {'precision': 0.8354591836734694, 'recall': 0.9097222222222222, 'f1-score': 0.8710106382978723, 'support': 1440.0} | {'precision': 0.9392974112791063, 'recall': 0.9698429610413675, 'f1-score': 0.9543258273315707, 'support': 21587.0} | {'precision': 0.9384009125790729, 'recall': 0.8640313186288552, 'f1-score': 0.8996818452972758, 'support': 10473.0} | 0.9342 | {'precision': 0.9043858358438829, 'recall': 0.9145321672974815, 'f1-score': 0.908339436975573, 'support': 33500.0} | {'precision': 0.9345536477376863, 'recall': 0.934179104477612, 'f1-score': 0.9336613408822066, 'support': 33500.0} |
| No log | 4.0 | 164 | 0.1855 | {'precision': 0.8468002585649644, 'recall': 0.9097222222222222, 'f1-score': 0.8771342484097756, 'support': 1440.0} | {'precision': 0.952900369677331, 'recall': 0.9672024829758651, 'f1-score': 0.959998160834981, 'support': 21587.0} | {'precision': 0.9341764588727345, 'recall': 0.8957318819822401, 'f1-score': 0.9145503290275409, 'support': 10473.0} | 0.9424 | {'precision': 0.9112923623716767, 'recall': 0.9242188623934425, 'f1-score': 0.9172275794240993, 'support': 33500.0} | {'precision': 0.9424860509352908, 'recall': 0.9423880597014925, 'f1-score': 0.9422280361659775, 'support': 33500.0} |
| No log | 5.0 | 205 | 0.1970 | {'precision': 0.8527131782945736, 'recall': 0.9166666666666666, 'f1-score': 0.8835341365461846, 'support': 1440.0} | {'precision': 0.9453672113485365, 'recall': 0.9755408347616621, 'f1-score': 0.9602170394181884, 'support': 21587.0} | {'precision': 0.9500826787928897, 'recall': 0.877780960565263, 'f1-score': 0.9125018611345476, 'support': 10473.0} | 0.9424 | {'precision': 0.9160543561453333, 'recall': 0.9233294873311971, 'f1-score': 0.9187510123663069, 'support': 33500.0} | {'precision': 0.9428586526305366, 'recall': 0.9424477611940298, 'f1-score': 0.9420037724838524, 'support': 33500.0} |
| No log | 6.0 | 246 | 0.2258 | {'precision': 0.8543563068920677, 'recall': 0.9125, 'f1-score': 0.882471457353929, 'support': 1440.0} | {'precision': 0.9621173050775939, 'recall': 0.9506184277574466, 'f1-score': 0.9563333022648896, 'support': 21587.0} | {'precision': 0.9025674786043449, 'recall': 0.9163563448868519, 'f1-score': 0.9094096465460059, 'support': 10473.0} | 0.9383 | {'precision': 0.9063470301913354, 'recall': 0.9264915908814327, 'f1-score': 0.9160714687216082, 'support': 33500.0} | {'precision': 0.9388683149271014, 'recall': 0.9382686567164179, 'f1-score': 0.9384887499360641, 'support': 33500.0} |
| No log | 7.0 | 287 | 0.2221 | {'precision': 0.8678122934567085, 'recall': 0.9118055555555555, 'f1-score': 0.8892651540805959, 'support': 1440.0} | {'precision': 0.9557587173243901, 'recall': 0.963728169731783, 'f1-score': 0.9597268994787103, 'support': 21587.0} | {'precision': 0.925440313111546, 'recall': 0.903084121073236, 'f1-score': 0.914125549702798, 'support': 10473.0} | 0.9425 | {'precision': 0.9163371079642149, 'recall': 0.9262059487868582, 'f1-score': 0.9210392010873681, 'support': 33500.0} | {'precision': 0.9424999860500445, 'recall': 0.9425373134328359, 'f1-score': 0.9424418890435934, 'support': 33500.0} |
| No log | 8.0 | 328 | 0.2697 | {'precision': 0.8493778650949574, 'recall': 0.9006944444444445, 'f1-score': 0.8742837883383889, 'support': 1440.0} | {'precision': 0.9607962815155641, 'recall': 0.9479779496919443, 'f1-score': 0.954344074989507, 'support': 21587.0} | {'precision': 0.8975079632752483, 'recall': 0.9147331232693593, 'f1-score': 0.9060386816096845, 'support': 10473.0} | 0.9356 | {'precision': 0.9025607032952566, 'recall': 0.9211351724685827, 'f1-score': 0.9115555149791934, 'support': 33500.0} | {'precision': 0.9362213240058178, 'recall': 0.9355522388059702, 'f1-score': 0.9358011138657909, 'support': 33500.0} |
| No log | 9.0 | 369 | 0.2370 | {'precision': 0.8687541638907396, 'recall': 0.9055555555555556, 'f1-score': 0.8867732063923836, 'support': 1440.0} | {'precision': 0.9576294655220161, 'recall': 0.9611340158428684, 'f1-score': 0.9593785402168635, 'support': 21587.0} | {'precision': 0.9195780509048679, 'recall': 0.907285400553805, 'f1-score': 0.9133903681630298, 'support': 10473.0} | 0.9419 | {'precision': 0.9153205601058745, 'recall': 0.9246583239840763, 'f1-score': 0.9198473715907589, 'support': 33500.0} | {'precision': 0.9419132595627793, 'recall': 0.941910447761194, 'f1-score': 0.9418804564369516, 'support': 33500.0} |
| No log | 10.0 | 410 | 0.2744 | {'precision': 0.8453214513049013, 'recall': 0.9222222222222223, 'f1-score': 0.8820989704417137, 'support': 1440.0} | {'precision': 0.956655776929094, 'recall': 0.9631259554361421, 'f1-score': 0.9598799630655587, 'support': 21587.0} | {'precision': 0.9273244409572381, 'recall': 0.9027976701995608, 'f1-score': 0.914896705210702, 'support': 10473.0} | 0.9425 | {'precision': 0.9097672230637445, 'recall': 0.9293819492859751, 'f1-score': 0.9189585462393248, 'support': 33500.0} | {'precision': 0.9427002990027631, 'recall': 0.9425074626865672, 'f1-score': 0.9424735663822079, 'support': 33500.0} |
| No log | 11.0 | 451 | 0.2965 | {'precision': 0.8822724161533196, 'recall': 0.8951388888888889, 'f1-score': 0.8886590830748018, 'support': 1440.0} | {'precision': 0.9591074596209505, 'recall': 0.9517765321721406, 'f1-score': 0.9554279336882978, 'support': 21587.0} | {'precision': 0.9005368748233964, 'recall': 0.91291893440275, 'f1-score': 0.9066856330014225, 'support': 10473.0} | 0.9372 | {'precision': 0.9139722501992221, 'recall': 0.9199447851545933, 'f1-score': 0.916924216588174, 'support': 33500.0} | {'precision': 0.9374939611977214, 'recall': 0.9371940298507463, 'f1-score': 0.9373197169725641, 'support': 33500.0} |
| No log | 12.0 | 492 | 0.3318 | {'precision': 0.8688963210702341, 'recall': 0.9020833333333333, 'f1-score': 0.8851788756388416, 'support': 1440.0} | {'precision': 0.9624402138235019, 'recall': 0.9508037244637977, 'f1-score': 0.956586582154592, 'support': 21587.0} | {'precision': 0.9008334113681056, 'recall': 0.9185524682516948, 'f1-score': 0.9096066565809379, 'support': 10473.0} | 0.9386 | {'precision': 0.9107233154206137, 'recall': 0.9238131753496086, 'f1-score': 0.9171240381247904, 'support': 33500.0} | {'precision': 0.9391592810569325, 'recall': 0.9386268656716418, 'f1-score': 0.9388299296795006, 'support': 33500.0} |
| 0.1206 | 13.0 | 533 | 0.2958 | {'precision': 0.8682634730538922, 'recall': 0.90625, 'f1-score': 0.8868501529051986, 'support': 1440.0} | {'precision': 0.9548452097453746, 'recall': 0.9658590818548201, 'f1-score': 0.9603205674412177, 'support': 21587.0} | {'precision': 0.927959846471804, 'recall': 0.9003150959610426, 'f1-score': 0.9139284675777842, 'support': 10473.0} | 0.9428 | {'precision': 0.9170228430903569, 'recall': 0.9241413926052875, 'f1-score': 0.9203663959747336, 'support': 33500.0} | {'precision': 0.9427184004797077, 'recall': 0.9428059701492537, 'f1-score': 0.9426590194172891, 'support': 33500.0} |
| 0.1206 | 14.0 | 574 | 0.3036 | {'precision': 0.8715046604527297, 'recall': 0.9090277777777778, 'f1-score': 0.8898708361658736, 'support': 1440.0} | {'precision': 0.9562901744719926, 'recall': 0.9648399499698893, 'f1-score': 0.960546037309475, 'support': 21587.0} | {'precision': 0.9261107848894109, 'recall': 0.9035615391960279, 'f1-score': 0.914697211347929, 'support': 10473.0} | 0.9433 | {'precision': 0.9179685399380443, 'recall': 0.9258097556478985, 'f1-score': 0.9217046949410926, 'support': 33500.0} | {'precision': 0.9432107748515115, 'recall': 0.9432835820895522, 'f1-score': 0.9431744837589658, 'support': 33500.0} |
| 0.1206 | 15.0 | 615 | 0.3104 | {'precision': 0.8767676767676768, 'recall': 0.9041666666666667, 'f1-score': 0.8902564102564102, 'support': 1440.0} | {'precision': 0.9556075838956984, 'recall': 0.9642840598508362, 'f1-score': 0.9599262162785336, 'support': 21587.0} | {'precision': 0.9239640344018765, 'recall': 0.9027021865750023, 'f1-score': 0.9132093697174595, 'support': 10473.0} | 0.9424 | {'precision': 0.9187797650217506, 'recall': 0.9237176376975017, 'f1-score': 0.9211306654174677, 'support': 33500.0} | {'precision': 0.9423260209072463, 'recall': 0.9424477611940298, 'f1-score': 0.9423265131529818, 'support': 33500.0} |
| 0.1206 | 16.0 | 656 | 0.3134 | {'precision': 0.8714380384360504, 'recall': 0.9131944444444444, 'f1-score': 0.8918277382163445, 'support': 1440.0} | {'precision': 0.9559525996693, 'recall': 0.9641450873210728, 'f1-score': 0.9600313660370396, 'support': 21587.0} | {'precision': 0.9251394461297583, 'recall': 0.9027021865750023, 'f1-score': 0.9137831045814807, 'support': 10473.0} | 0.9427 | {'precision': 0.9175100280783696, 'recall': 0.9266805727801732, 'f1-score': 0.9218807362782884, 'support': 33500.0} | {'precision': 0.9426867153351061, 'recall': 0.9427462686567164, 'f1-score': 0.9426411789837302, 'support': 33500.0} |
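
The per-label cells in the table above are Python dict literals, so they can be read back programmatically. The sketch below parses an abbreviated excerpt of the table (only the first five columns of the epoch 4 and epoch 16 rows) and picks the row with the lowest validation loss. The helper `parse_row` is hypothetical.

```python
import ast


def parse_row(header, row):
    """Turn one markdown table row into a dict keyed by column name."""
    names = [c.strip() for c in header.strip().strip("|").split("|")]
    cells = [c.strip() for c in row.strip().strip("|").split("|")]
    parsed = {}
    for name, cell in zip(names, cells):
        if cell.startswith("{"):
            # Metric cells are dict literals; literal_eval parses them safely.
            parsed[name] = ast.literal_eval(cell)
        else:
            try:
                parsed[name] = float(cell)
            except ValueError:
                parsed[name] = cell  # non-numeric text such as "No log"
    return parsed


# Abbreviated excerpt of the training results table (columns after B omitted).
HEADER = "| Training Loss | Epoch | Step | Validation Loss | B |"
ROWS = [
    "| No log | 4.0 | 164 | 0.1855 | {'precision': 0.8468002585649644, 'recall': 0.9097222222222222, 'f1-score': 0.8771342484097756, 'support': 1440.0} |",
    "| 0.1206 | 16.0 | 656 | 0.3134 | {'precision': 0.8714380384360504, 'recall': 0.9131944444444444, 'f1-score': 0.8918277382163445, 'support': 1440.0} |",
]

records = [parse_row(HEADER, r) for r in ROWS]
best = min(records, key=lambda r: r["Validation Loss"])
print(f"lowest validation loss in excerpt: epoch {best['Epoch']:.0f}")
```

Splitting on `|` works here because none of the dict literals contain a pipe character. A general markdown parser would need to handle escaped pipes.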
### Framework versions
- Transformers 4.37.2
- Pytorch 2.2.0+cu121
- Datasets 2.17.0
- Tokenizers 0.15.2