---
license: apache-2.0
base_model: allenai/longformer-base-4096
tags:
- generated_from_trainer
datasets:
- essays_su_g
metrics:
- accuracy
model-index:
- name: longformer-spans
results:
- task:
name: Token Classification
type: token-classification
dataset:
name: essays_su_g
type: essays_su_g
config: spans
split: train[20%:40%]
args: spans
metrics:
- name: Accuracy
type: accuracy
value: 0.9337012922629474
---
# longformer-spans
This model is a fine-tuned version of [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) on the essays_su_g dataset.
It achieves the following results on the evaluation set:
- Loss: 0.2971
- Accuracy: 0.9337

| Class | Precision | Recall | F1-score | Support |
|:------|:---------:|:------:|:--------:|:-------:|
| B | 0.8351 | 0.9117 | 0.8718 | 1178 |
| I | 0.9429 | 0.9613 | 0.9520 | 18899 |
| O | 0.9286 | 0.8850 | 0.9062 | 10180 |
| Macro avg | 0.9022 | 0.9193 | 0.9100 | 30257 |
| Weighted avg | 0.9339 | 0.9337 | 0.9335 | 30257 |
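For reference, the macro average is the unweighted mean of the per-class scores, while the weighted average weighs each class by its token support, so the rare `B` class pulls the macro score down more than the weighted one. A minimal sketch (plain Python, no dependencies) reproducing both averages from the unrounded per-class F1 scores:

```python
# Per-class F1 and token support from the evaluation results above.
f1 = {"B": 0.8717532467532468, "I": 0.9520016767973171, "O": 0.9062468564530731}
support = {"B": 1178, "I": 18899, "O": 10180}

total = sum(support.values())  # 30257 tokens

# Macro average: every class counts equally.
macro_f1 = sum(f1.values()) / len(f1)

# Weighted average: each class contributes in proportion to its support.
weighted_f1 = sum(f1[c] * support[c] for c in f1) / total

print(round(macro_f1, 4), round(weighted_f1, 4))  # 0.91 0.9335
```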
## Model description

This is [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) with a token-classification head, fine-tuned to label every token of an essay with one of three tags: `B` (first token of a span), `I` (inside a span), or `O` (outside any span). The Longformer backbone's sparse attention lets whole essays of up to 4096 tokens be processed in a single pass.
## Intended uses & limitations

The model is intended for segmenting essays like those in the essays_su_g dataset into spans and non-span material. It predicts span boundaries only; it does not classify what kind of span it has found. It has been evaluated only on this corpus, so performance on other domains, genres, or languages is unknown, and inputs longer than the 4096-token Longformer window cannot be processed in one pass.
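A minimal usage sketch with the Transformers `pipeline` API. The repo id `Theoreticallyhugo/longformer-spans` is an assumption inferred from this card's title and may need to be adjusted:

```python
REPO_ID = "Theoreticallyhugo/longformer-spans"  # assumed repo id; adjust if needed

def tag_spans(text: str):
    """Tag the words of an essay with B/I/O span labels."""
    # Imported lazily so the sketch has no import-time dependency;
    # requires `pip install transformers torch`.
    from transformers import pipeline

    tagger = pipeline(
        "token-classification",
        model=REPO_ID,
        aggregation_strategy="simple",  # merge wordpieces / adjacent same-label tokens
    )
    return tagger(text)

# Downloads the model weights on first use:
# tag_spans("First of all, cloning technology could help infertile couples.")
```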
## Training and evaluation data

Training and evaluation use the `spans` configuration of the essays_su_g dataset. Per the model index, the reported metrics were computed on the `train[20%:40%]` slice (30257 tokens: 1178 `B`, 18899 `I`, 10180 `O`).
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 16
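The learning-rate trajectory implied by the list above can be sketched in plain Python: a linear scheduler decays the rate from 2e-05 toward zero over the full run, here 656 optimizer steps (41 steps/epoch × 16 epochs, per the results table below). Zero warmup steps are assumed, since the card lists none:

```python
BASE_LR = 2e-05      # learning_rate from the card
TOTAL_STEPS = 656    # 41 optimizer steps/epoch x 16 epochs (see results table)
WARMUP_STEPS = 0     # assumed: the card lists no warmup

def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step under a linear schedule."""
    if step < WARMUP_STEPS:  # linear warmup phase (unused with 0 warmup steps)
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = TOTAL_STEPS - step
    return BASE_LR * max(0.0, remaining / (TOTAL_STEPS - WARMUP_STEPS))

print(linear_lr(0), linear_lr(328), linear_lr(656))  # 2e-05 1e-05 0.0
```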
### Training results
Class supports are constant across epochs: `B` = 1178, `I` = 18899, `O` = 10180 (30257 tokens total). Per-class and averaged cells give precision / recall / F1, rounded to four decimals. The training loss is logged only every 500 steps, hence "No log" for most epochs.

| Training Loss | Epoch | Step | Validation Loss | B (P / R / F1) | I (P / R / F1) | O (P / R / F1) | Accuracy | Macro avg (P / R / F1) | Weighted avg (P / R / F1) |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:--------------:|:--------------:|:--------:|:----------------------:|:-------------------------:|
| No log | 1.0 | 41 | 0.2806 | 0.7871 / 0.5806 / 0.6683 | 0.9208 / 0.9323 / 0.9265 | 0.8648 / 0.8710 / 0.8679 | 0.8980 | 0.8576 / 0.7947 / 0.8209 | 0.8968 / 0.8980 / 0.8968 |
| No log | 2.0 | 82 | 0.1942 | 0.8447 / 0.8217 / 0.8330 | 0.9504 / 0.9410 / 0.9457 | 0.8897 / 0.9088 / 0.8992 | 0.9255 | 0.8949 / 0.8905 / 0.8926 | 0.9259 / 0.9255 / 0.9256 |
| No log | 3.0 | 123 | 0.1832 | 0.8075 / 0.8973 / 0.8500 | 0.9427 / 0.9620 / 0.9522 | 0.9321 / 0.8848 / 0.9078 | 0.9335 | 0.8941 / 0.9147 / 0.9034 | 0.9339 / 0.9335 / 0.9333 |
| No log | 4.0 | 164 | 0.1747 | 0.8522 / 0.8812 / 0.8664 | 0.9486 / 0.9552 / 0.9519 | 0.9160 / 0.9004 / 0.9081 | 0.9339 | 0.9056 / 0.9123 / 0.9088 | 0.9338 / 0.9339 / 0.9338 |
| No log | 5.0 | 205 | 0.1861 | 0.8224 / 0.9160 / 0.8667 | 0.9394 / 0.9696 / 0.9543 | 0.9447 / 0.8757 / 0.9089 | 0.9359 | 0.9022 / 0.9204 / 0.9099 | 0.9366 / 0.9359 / 0.9356 |
| No log | 6.0 | 246 | 0.1963 | 0.8230 / 0.8879 / 0.8542 | 0.9621 / 0.9401 / 0.9510 | 0.8973 / 0.9272 / 0.9120 | 0.9337 | 0.8941 / 0.9184 / 0.9057 | 0.9349 / 0.9337 / 0.9341 |
| No log | 7.0 | 287 | 0.2315 | 0.8134 / 0.9100 / 0.8590 | 0.9424 / 0.9569 / 0.9496 | 0.9218 / 0.8829 / 0.9020 | 0.9302 | 0.8925 / 0.9166 / 0.9035 | 0.9305 / 0.9302 / 0.9300 |
| No log | 8.0 | 328 | 0.2543 | 0.8331 / 0.9194 / 0.8741 | 0.9301 / 0.9751 / 0.9521 | 0.9528 / 0.8557 / 0.9016 | 0.9328 | 0.9053 / 0.9167 / 0.9093 | 0.9339 / 0.9328 / 0.9321 |
| No log | 9.0 | 369 | 0.2367 | 0.8410 / 0.9160 / 0.8769 | 0.9428 / 0.9662 / 0.9544 | 0.9380 / 0.8851 / 0.9107 | 0.9370 | 0.9073 / 0.9224 / 0.9140 | 0.9372 / 0.9370 / 0.9367 |
| No log | 10.0 | 410 | 0.2730 | 0.8094 / 0.9194 / 0.8609 | 0.9393 / 0.9597 / 0.9494 | 0.9288 / 0.8767 / 0.9020 | 0.9302 | 0.8925 / 0.9186 / 0.9041 | 0.9307 / 0.9302 / 0.9300 |
| No log | 11.0 | 451 | 0.2785 | 0.8337 / 0.9109 / 0.8706 | 0.9392 / 0.9643 / 0.9516 | 0.9336 / 0.8773 / 0.9046 | 0.9330 | 0.9022 / 0.9175 / 0.9089 | 0.9332 / 0.9330 / 0.9326 |
| No log | 12.0 | 492 | 0.2703 | 0.8391 / 0.9075 / 0.8719 | 0.9484 / 0.9584 / 0.9534 | 0.9245 / 0.8976 / 0.9109 | 0.9360 | 0.9040 / 0.9212 / 0.9121 | 0.9361 / 0.9360 / 0.9359 |
| 0.1317 | 13.0 | 533 | 0.2982 | 0.8403 / 0.9066 / 0.8722 | 0.9400 / 0.9647 / 0.9522 | 0.9332 / 0.8791 / 0.9053 | 0.9336 | 0.9045 / 0.9168 / 0.9099 | 0.9338 / 0.9336 / 0.9333 |
| 0.1317 | 14.0 | 574 | 0.3190 | 0.8276 / 0.9126 / 0.8680 | 0.9382 / 0.9640 / 0.9509 | 0.9335 / 0.8749 / 0.9032 | 0.9320 | 0.8998 / 0.9171 / 0.9074 | 0.9323 / 0.9320 / 0.9316 |
| 0.1317 | 15.0 | 615 | 0.3058 | 0.8362 / 0.9100 / 0.8715 | 0.9372 / 0.9665 / 0.9516 | 0.9362 / 0.8724 / 0.9032 | 0.9326 | 0.9032 / 0.9163 / 0.9088 | 0.9329 / 0.9326 / 0.9322 |
| 0.1317 | 16.0 | 656 | 0.2971 | 0.8351 / 0.9117 / 0.8718 | 0.9429 / 0.9613 / 0.9520 | 0.9286 / 0.8850 / 0.9062 | 0.9337 | 0.9022 / 0.9193 / 0.9100 | 0.9339 / 0.9337 / 0.9335 |
### Framework versions
- Transformers 4.37.2
- PyTorch 2.2.0+cu121
- Datasets 2.17.0
- Tokenizers 0.15.2