---
license: mit
base_model: openai-community/gpt2
tags:
  - generated_from_trainer
datasets:
  - essays_su_g
metrics:
  - accuracy
model-index:
  - name: test-full_labels
    results:
      - task:
          name: Token Classification
          type: token-classification
        dataset:
          name: essays_su_g
          type: essays_su_g
          config: full_labels
          split: train[0%:20%]
          args: full_labels
        metrics:
          - name: Accuracy
            type: accuracy
            value: 0.7248585259425923
---

# test-full_labels

This model is a fine-tuned version of [openai-community/gpt2](https://huggingface.co/openai-community/gpt2) on the essays_su_g dataset. It achieves the following results on the evaluation set:

- Loss: 0.7319
- Accuracy: 0.7249

Per-class metrics (rounded to four decimal places):

| Label        | Precision | Recall | F1-score | Support |
|:-------------|----------:|-------:|---------:|--------:|
| B-claim      |    0.0000 | 0.0000 |   0.0000 |     284 |
| B-majorclaim |    0.2000 | 0.0284 |   0.0497 |     141 |
| B-premise    |    0.5709 | 0.1992 |   0.2953 |     708 |
| I-claim      |    0.3808 | 0.3429 |   0.3609 |    4077 |
| I-majorclaim |    0.5424 | 0.2940 |   0.3813 |    2024 |
| I-premise    |    0.7636 | 0.8964 |   0.8247 |   12232 |
| O            |    0.8210 | 0.8269 |   0.8240 |    9868 |
| Macro avg    |    0.4684 | 0.3697 |   0.3908 |   29334 |
| Weighted avg |    0.6997 | 0.7249 |   0.7049 |   29334 |
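
As a quick sanity check, the macro and weighted precision averages can be recomputed from the per-class values alone. A minimal sketch in plain Python, using the full-precision per-class precision and support figures from the final training epoch:

```python
# Per-class precision and token-level support, taken from the final epoch
# of the training log (full-precision values).
per_class = {
    "B-claim":      (0.0,                 284),
    "B-majorclaim": (0.2,                 141),
    "B-premise":    (0.5708502024291497,  708),
    "I-claim":      (0.38082266412421684, 4077),
    "I-majorclaim": (0.5423883318140383,  2024),
    "I-premise":    (0.7635793871866295,  12232),
    "O":            (0.8210081497132509,  9868),
}

# Macro average: unweighted mean over the seven classes.
macro_p = sum(p for p, _ in per_class.values()) / len(per_class)

# Weighted average: mean weighted by each class's support (token count).
total = sum(s for _, s in per_class.values())
weighted_p = sum(p * s for p, s in per_class.values()) / total

print(round(macro_p, 4), round(weighted_p, 4))  # 0.4684 0.6997
```

The gap between the two (0.4684 vs. 0.6997) reflects the class imbalance: the two largest classes, I-premise and O, account for over 75% of tokens and have the strongest precision.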

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
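
With `lr_scheduler_type: linear` the learning rate decays linearly from `2e-05` at step 0 to zero at the final optimizer step. A minimal sketch of that schedule, assuming zero warmup steps (none is listed above) and 205 total steps (41 steps per epoch × 5 epochs, per the training log below):

```python
def linear_lr(step: int, base_lr: float = 2e-05, total_steps: int = 205) -> float:
    """Learning rate after `step` optimizer steps under a linear decay
    schedule with no warmup: base_lr at step 0, reaching 0 at total_steps."""
    return base_lr * max(0.0, (total_steps - step) / total_steps)

# One epoch is 41 optimizer steps, so after epoch 1 the rate has
# fallen to 80% of its starting value:
print(linear_lr(41))  # ≈ 1.6e-05
```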

### Training results

| Training Loss | Epoch | Step | Validation Loss | B-claim | B-majorclaim | B-premise | I-claim | I-majorclaim | I-premise | O | Accuracy | Macro avg | Weighted avg |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:------------:|:---------:|:-------:|:------------:|:---------:|:-:|:--------:|:---------:|:------------:|
| No log | 1.0 | 41 | 1.1499 | {'precision': 0.0, 'recall': 0.0, 'f1-score': 0.0, 'support': 284.0} | {'precision': 0.0, 'recall': 0.0, 'f1-score': 0.0, 'support': 141.0} | {'precision': 0.0, 'recall': 0.0, 'f1-score': 0.0, 'support': 708.0} | {'precision': 0.03305785123966942, 'recall': 0.0009811135638950208, 'f1-score': 0.0019056693663649356, 'support': 4077.0} | {'precision': 0.0, 'recall': 0.0, 'f1-score': 0.0, 'support': 2024.0} | {'precision': 0.5419036077188403, 'recall': 0.9504578155657293, 'f1-score': 0.6902570800926201, 'support': 12232.0} | {'precision': 0.7500322622273842, 'recall': 0.5889744629104176, 'f1-score': 0.6598172220014759, 'support': 9868.0} | 0.5946 | {'precision': 0.18928481731227054, 'recall': 0.22005905600572026, 'f1-score': 0.193139995922923, 'support': 29334.0} | {'precision': 0.4828751671364564, 'recall': 0.5946001227244835, 'f1-score': 0.5100589883551566, 'support': 29334.0} |
| No log | 2.0 | 82 | 0.8679 | {'precision': 0.0, 'recall': 0.0, 'f1-score': 0.0, 'support': 284.0} | {'precision': 0.0, 'recall': 0.0, 'f1-score': 0.0, 'support': 141.0} | {'precision': 0.0, 'recall': 0.0, 'f1-score': 0.0, 'support': 708.0} | {'precision': 0.2435064935064935, 'recall': 0.03679175864606328, 'f1-score': 0.06392499467291711, 'support': 4077.0} | {'precision': 0.3858267716535433, 'recall': 0.04841897233201581, 'f1-score': 0.08604038630377524, 'support': 2024.0} | {'precision': 0.648795078729048, 'recall': 0.9398299542184434, 'f1-score': 0.7676538345965076, 'support': 12232.0} | {'precision': 0.7703193371194489, 'recall': 0.8384677746250506, 'f1-score': 0.8029501674025912, 'support': 9868.0} | 0.6824 | {'precision': 0.29263538300121905, 'recall': 0.2662154942602247, 'f1-score': 0.24579562613939873, 'support': 29334.0} | {'precision': 0.5901432461158105, 'recall': 0.6824163087202564, 'f1-score': 0.605039268489588, 'support': 29334.0} |
| No log | 3.0 | 123 | 0.7978 | {'precision': 0.0, 'recall': 0.0, 'f1-score': 0.0, 'support': 284.0} | {'precision': 0.125, 'recall': 0.0070921985815602835, 'f1-score': 0.013422818791946307, 'support': 141.0} | {'precision': 0.6071428571428571, 'recall': 0.07203389830508475, 'f1-score': 0.12878787878787878, 'support': 708.0} | {'precision': 0.31603053435114503, 'recall': 0.304635761589404, 'f1-score': 0.3102285500187336, 'support': 4077.0} | {'precision': 0.5198019801980198, 'recall': 0.2075098814229249, 'f1-score': 0.2966101694915254, 'support': 2024.0} | {'precision': 0.7561827382225073, 'recall': 0.8673969914977109, 'f1-score': 0.8079808095038647, 'support': 12232.0} | {'precision': 0.7885992552277284, 'recall': 0.8369477097689502, 'f1-score': 0.812054471264933, 'support': 9868.0} | 0.7017 | {'precision': 0.44467962359175106, 'recall': 0.327945205880805, 'f1-score': 0.3384406711226974, 'support': 29334.0} | {'precision': 0.6756508673843488, 'recall': 0.7016772346083043, 'f1-score': 0.6768524579464901, 'support': 29334.0} |
| No log | 4.0 | 164 | 0.7564 | {'precision': 1.0, 'recall': 0.0035211267605633804, 'f1-score': 0.007017543859649122, 'support': 284.0} | {'precision': 0.2857142857142857, 'recall': 0.028368794326241134, 'f1-score': 0.05161290322580645, 'support': 141.0} | {'precision': 0.5414634146341464, 'recall': 0.15677966101694915, 'f1-score': 0.24315443592552025, 'support': 708.0} | {'precision': 0.3514654161781946, 'recall': 0.36767230806965906, 'f1-score': 0.3593862383121553, 'support': 4077.0} | {'precision': 0.49387755102040815, 'recall': 0.29891304347826086, 'f1-score': 0.37242228377962444, 'support': 2024.0} | {'precision': 0.7616845350711232, 'recall': 0.8886527141922825, 'f1-score': 0.8202844960947816, 'support': 12232.0} | {'precision': 0.8391959798994975, 'recall': 0.795399270368869, 'f1-score': 0.8167108891316789, 'support': 9868.0} | 0.7138 | {'precision': 0.610485883216808, 'recall': 0.3627581311732607, 'f1-score': 0.38151268433274516, 'support': 29334.0} | {'precision': 0.7069709429163672, 'recall': 0.7138133224244904, 'f1-score': 0.6986243658756951, 'support': 29334.0} |
| No log | 5.0 | 205 | 0.7319 | {'precision': 0.0, 'recall': 0.0, 'f1-score': 0.0, 'support': 284.0} | {'precision': 0.2, 'recall': 0.028368794326241134, 'f1-score': 0.04968944099378882, 'support': 141.0} | {'precision': 0.5708502024291497, 'recall': 0.19915254237288135, 'f1-score': 0.29528795811518327, 'support': 708.0} | {'precision': 0.38082266412421684, 'recall': 0.3428991905813098, 'f1-score': 0.3608673205988642, 'support': 4077.0} | {'precision': 0.5423883318140383, 'recall': 0.2939723320158103, 'f1-score': 0.38128804870233907, 'support': 2024.0} | {'precision': 0.7635793871866295, 'recall': 0.8964192282537606, 'f1-score': 0.8246841155234658, 'support': 12232.0} | {'precision': 0.8210081497132509, 'recall': 0.8269152817186867, 'f1-score': 0.8239511283889535, 'support': 9868.0} | 0.7249 | {'precision': 0.4683783907524693, 'recall': 0.3696753384669557, 'f1-score': 0.3908240017603707, 'support': 29334.0} | {'precision': 0.6996857371644881, 'recall': 0.7248585259425923, 'f1-score': 0.7048932637283017, 'support': 29334.0} |

### Framework versions

- Transformers 4.37.2
- Pytorch 2.2.0+cu121
- Datasets 2.17.0
- Tokenizers 0.15.2