|
--- |
|
license: apache-2.0 |
|
base_model: allenai/longformer-base-4096 |
|
tags: |
|
- generated_from_trainer |
|
datasets: |
|
- essays_su_g |
|
metrics: |
|
- accuracy |
|
model-index: |
|
- name: longformer-spans |
|
results: |
|
- task: |
|
name: Token Classification |
|
type: token-classification |
|
dataset: |
|
name: essays_su_g |
|
type: essays_su_g |
|
config: spans |
|
split: train[80%:100%] |
|
args: spans |
|
metrics: |
|
- name: Accuracy |
|
type: accuracy |
|
value: 0.9414895542923349 |
|
--- |
|
|
|
|
|
|
# longformer-spans |
|
|
|
This model is a fine-tuned version of [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) on the essays_su_g dataset. |
|
It achieves the following results on the evaluation set: |
|
- Loss: 0.2394
- Accuracy: 0.9415

| Label        | Precision | Recall | F1-score | Support |
|:------------:|:---------:|:------:|:--------:|:-------:|
| B            | 0.8634    | 0.8907 | 0.8768   | 1043    |
| I            | 0.9488    | 0.9670 | 0.9578   | 17350   |
| O            | 0.9363    | 0.8992 | 0.9174   | 9226    |
| Macro avg    | 0.9162    | 0.9190 | 0.9174   | 27619   |
| Weighted avg | 0.9414    | 0.9415 | 0.9413   | 27619   |
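As a sanity check on the reported aggregates: the macro average is the unweighted mean of the three per-class scores, while the weighted average weights each class by its token support; for single-label token tagging, the support-weighted recall equals the overall accuracy. A minimal sketch in plain Python, with the numbers copied from the results above:

```python
# Recompute macro and weighted averages from the per-class results above.
scores = {
    "B": {"precision": 0.8633828996282528, "recall": 0.8906999041227229,
          "f1": 0.8768286927796131, "support": 1043},
    "I": {"precision": 0.9488209014307527, "recall": 0.9670317002881844,
          "f1": 0.9578397510918277, "support": 17350},
    "O": {"precision": 0.9363431151241535, "recall": 0.8991979189247779,
          "f1": 0.917394669910428, "support": 9226},
}

total = sum(s["support"] for s in scores.values())  # 27619 evaluation tokens

# Macro average: unweighted mean over the three classes.
macro_f1 = sum(s["f1"] for s in scores.values()) / len(scores)

# Weighted average: each class contributes in proportion to its support.
weighted_f1 = sum(s["f1"] * s["support"] for s in scores.values()) / total

# Weighted recall equals overall token accuracy in single-label tagging.
weighted_recall = sum(s["recall"] * s["support"] for s in scores.values()) / total

print(f"macro F1:    {macro_f1:.4f}")        # 0.9174
print(f"weighted F1: {weighted_f1:.4f}")     # 0.9413
print(f"accuracy:    {weighted_recall:.4f}") # 0.9415
```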
|
|
|
## Model description |
|
|
|
longformer-spans is a Longformer encoder fine-tuned for token classification. Each token of an input essay is labelled with one of three tags: B (beginning of a span), I (inside a span), or O (outside any span), following the span annotations of the essays_su_g corpus. Contiguous spans can then be recovered from the predicted tag sequence.
|
|
|
## Intended uses & limitations |
|
|
|
The model is intended for segmenting spans in essays similar to those in the essays_su_g corpus. Because it was fine-tuned on a single essay corpus, performance on other text genres or domains is untested. Inputs longer than the base model's 4096-token window must be truncated or chunked before inference.
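For reference, inference can follow the standard `transformers` token-classification pipeline. This is a usage sketch only: the model identifier below is a placeholder (the published Hub id is not stated in this card), and running it requires downloading the model weights.

```python
# Usage sketch: the model id below is a placeholder, not a confirmed
# repository name; substitute the actual Hub id or a local checkpoint path.
from transformers import pipeline

tagger = pipeline(
    "token-classification",
    model="longformer-spans",       # placeholder id
    aggregation_strategy="simple",  # merge consecutive tokens with the same tag
)

text = "Cloning should be banned. First, it raises serious ethical concerns."
for group in tagger(text):
    print(group["entity_group"], group["start"], group["end"], group["word"])
```

With `aggregation_strategy="simple"`, consecutive tokens sharing a tag are merged, so each printed group corresponds to one contiguous B/I span or O region.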
|
|
|
## Training and evaluation data |
|
|
|
The model was fine-tuned on the `spans` configuration of the essays_su_g dataset. The evaluation metrics reported above were computed on the `train[80%:100%]` slice of that dataset, as listed in the metadata header.
|
|
|
## Training procedure |
|
|
|
### Training hyperparameters |
|
|
|
The following hyperparameters were used during training: |
|
- learning_rate: 2e-05 |
|
- train_batch_size: 8 |
|
- eval_batch_size: 8 |
|
- seed: 42 |
|
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08 |
|
- lr_scheduler_type: linear |
|
- num_epochs: 12 |
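The hyperparameters above correspond to a `TrainingArguments` configuration along the following lines. This is a sketch, not the original training script: the output directory name is assumed, and dataset loading, tokenization, and the `Trainer` call are omitted.

```python
# Sketch of a TrainingArguments setup matching the listed hyperparameters.
# output_dir is an assumption; the actual training script is not part of
# this card.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="longformer-spans",  # assumed
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=12,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```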
|
|
|
### Training results |
|
|
|
Each class and average column lists precision / recall / F1-score, rounded to four decimal places (supports: B = 1043, I = 17350, O = 9226 tokens).

| Training Loss | Epoch | Step | Validation Loss | B (P / R / F1) | I (P / R / F1) | O (P / R / F1) | Accuracy | Macro avg (P / R / F1) | Weighted avg (P / R / F1) |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:--------------:|:--------------:|:--------:|:----------------------:|:-------------------------:|
| No log | 1.0  | 41  | 0.2858 | 0.8085 / 0.3643 / 0.5023 | 0.8888 / 0.9703 / 0.9278 | 0.9217 / 0.8200 / 0.8678 | 0.8972 | 0.8730 / 0.7182 / 0.7660 | 0.8968 / 0.8972 / 0.8917 |
| No log | 2.0  | 82  | 0.2181 | 0.7889 / 0.8744 / 0.8295 | 0.9211 / 0.9777 / 0.9486 | 0.9581 / 0.8357 / 0.8927 | 0.9264 | 0.8894 / 0.8959 / 0.8902 | 0.9285 / 0.9264 / 0.9254 |
| No log | 3.0  | 123 | 0.1805 | 0.8277 / 0.8888 / 0.8571 | 0.9608 / 0.9453 / 0.9530 | 0.9023 / 0.9222 / 0.9121 | 0.9354 | 0.8969 / 0.9188 / 0.9074 | 0.9362 / 0.9354 / 0.9357 |
| No log | 4.0  | 164 | 0.1988 | 0.8492 / 0.8533 / 0.8513 | 0.9304 / 0.9738 / 0.9516 | 0.9457 / 0.8621 / 0.9020 | 0.9320 | 0.9084 / 0.8964 / 0.9016 | 0.9324 / 0.9320 / 0.9312 |
| No log | 5.0  | 205 | 0.2100 | 0.8506 / 0.8955 / 0.8725 | 0.9366 / 0.9727 / 0.9543 | 0.9460 / 0.8717 / 0.9073 | 0.9361 | 0.9111 / 0.9133 / 0.9114 | 0.9365 / 0.9361 / 0.9355 |
| No log | 6.0  | 246 | 0.2054 | 0.8465 / 0.8830 / 0.8644 | 0.9237 / 0.9759 / 0.9491 | 0.9496 / 0.8440 / 0.8937 | 0.9283 | 0.9066 / 0.9010 / 0.9024 | 0.9294 / 0.9283 / 0.9274 |
| No log | 7.0  | 287 | 0.1949 | 0.8511 / 0.8821 / 0.8663 | 0.9430 / 0.9678 / 0.9553 | 0.9367 / 0.8866 / 0.9110 | 0.9374 | 0.9103 / 0.9122 / 0.9108 | 0.9374 / 0.9374 / 0.9371 |
| No log | 8.0  | 328 | 0.2038 | 0.8602 / 0.8849 / 0.8724 | 0.9485 / 0.9634 / 0.9559 | 0.9286 / 0.8982 / 0.9132 | 0.9387 | 0.9125 / 0.9155 / 0.9138 | 0.9385 / 0.9387 / 0.9385 |
| No log | 9.0  | 369 | 0.2182 | 0.8558 / 0.8936 / 0.8743 | 0.9499 / 0.9618 / 0.9558 | 0.9274 / 0.9007 / 0.9138 | 0.9388 | 0.9110 / 0.9187 / 0.9146 | 0.9388 / 0.9388 / 0.9387 |
| No log | 10.0 | 410 | 0.2523 | 0.8617 / 0.8897 / 0.8755 | 0.9384 / 0.9716 / 0.9547 | 0.9432 / 0.8770 / 0.9089 | 0.9369 | 0.9144 / 0.9128 / 0.9130 | 0.9371 / 0.9369 / 0.9364 |
| No log | 11.0 | 451 | 0.2504 | 0.8531 / 0.8907 / 0.8715 | 0.9388 / 0.9725 / 0.9554 | 0.9458 / 0.8773 / 0.9103 | 0.9376 | 0.9126 / 0.9135 / 0.9124 | 0.9379 / 0.9376 / 0.9371 |
| No log | 12.0 | 492 | 0.2394 | 0.8634 / 0.8907 / 0.8768 | 0.9488 / 0.9670 / 0.9578 | 0.9363 / 0.8992 / 0.9174 | 0.9415 | 0.9162 / 0.9190 / 0.9174 | 0.9414 / 0.9415 / 0.9413 |
|
|
|
|
|
### Framework versions |
|
|
|
- Transformers 4.37.2 |
|
- Pytorch 2.2.0+cu121 |
|
- Datasets 2.17.0 |
|
- Tokenizers 0.15.2 |
|
|