trainer: training complete at 2024-03-03 19:45:18.187762.
license: bigscience-bloom-rail-1.0
base_model: bigscience/bloom-560m
tags:
  - generated_from_trainer
datasets:
  - essays_su_g
metrics:
  - accuracy
model-index:
  - name: bloom-full_labels
    results:
      - task:
          name: Token Classification
          type: token-classification
        dataset:
          name: essays_su_g
          type: essays_su_g
          config: full_labels
          split: train[0%:20%]
          args: full_labels
        metrics:
          - name: Accuracy
            type: accuracy
            value: 0.7978079994653657

bloom-full_labels

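This model is a fine-tuned version of bigscience/bloom-560m on the essays_su_g dataset (full_labels configuration) for token classification. After the final training epoch it achieves the following results on the evaluation set (split train[0%:20%]):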

  • Loss: 0.7047
  • B-claim: {'precision': 0.4620938628158845, 'recall': 0.4507042253521127, 'f1-score': 0.45632798573975053, 'support': 284.0}
  • B-majorclaim: {'precision': 0.7, 'recall': 0.5957446808510638, 'f1-score': 0.6436781609195402, 'support': 141.0}
  • B-premise: {'precision': 0.6952247191011236, 'recall': 0.6991525423728814, 'f1-score': 0.6971830985915493, 'support': 708.0}
  • I-claim: {'precision': 0.5342320909331219, 'recall': 0.48441994247363374, 'f1-score': 0.5081081081081082, 'support': 4172.0}
  • I-majorclaim: {'precision': 0.7541263517359135, 'recall': 0.6379393355801637, 'f1-score': 0.6911841418883673, 'support': 2077.0}
  • I-premise: {'precision': 0.8258639910813824, 'recall': 0.8874690519926524, 'f1-score': 0.8555589775177087, 'support': 12521.0}
  • O: {'precision': 0.886796294411076, 'recall': 0.8690143655227454, 'f1-score': 0.8778152869451302, 'support': 10024.0}
  • Accuracy: 0.7978
  • Macro avg: {'precision': 0.6940481871540717, 'recall': 0.660634877735036, 'f1-score': 0.6756936799585934, 'support': 29927.0}
  • Weighted avg: {'precision': 0.7935035105957297, 'recall': 0.7978079994653657, 'f1-score': 0.7946353555655076, 'support': 29927.0}
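The two averages in the list above differ only in how the per-class scores are combined. The following is a minimal sketch, using the F1 and support values copied from that list, of how they can be recomputed; the variable names are illustrative and not taken from the training code.

```python
# Per-class (f1-score, support) pairs copied from the evaluation results above.
per_class = {
    "B-claim": (0.45632798573975053, 284),
    "B-majorclaim": (0.6436781609195402, 141),
    "B-premise": (0.6971830985915493, 708),
    "I-claim": (0.5081081081081082, 4172),
    "I-majorclaim": (0.6911841418883673, 2077),
    "I-premise": (0.8555589775177087, 12521),
    "O": (0.8778152869451302, 10024),
}

# Total number of evaluated tokens.
total_support = sum(support for _, support in per_class.values())

# Macro average: every class counts equally, regardless of how many tokens it has.
macro_f1 = sum(f1 for f1, _ in per_class.values()) / len(per_class)

# Weighted average: each class contributes in proportion to its support.
weighted_f1 = sum(f1 * support for f1, support in per_class.values()) / total_support

print(f"tokens={total_support} macro_f1={macro_f1:.4f} weighted_f1={weighted_f1:.4f}")
```

Because I-premise and O together account for roughly three quarters of the tokens, the weighted average sits well above the macro average, which is pulled down by the weaker claim classes.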

Model description

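bloom-full_labels is bigscience/bloom-560m fine-tuned for token classification. It assigns each token one of seven labels: B-claim, I-claim, B-majorclaim, I-majorclaim, B-premise, I-premise, or O.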
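The label names in the evaluation results (B- and I- prefixes on claim, majorclaim, and premise, plus O) look like a standard BIO tagging scheme. Assuming that reading is correct, token-level predictions can be grouped into spans as sketched below. The helper `bio_to_spans` and the sample label sequence are hypothetical illustrations, not part of the model or its training code.

```python
from typing import List, Tuple


def bio_to_spans(labels: List[str]) -> List[Tuple[str, int, int]]:
    """Group token-level BIO labels into (type, start, end) spans, end exclusive."""
    spans: List[Tuple[str, int, int]] = []
    current_type, start = None, 0
    for i, label in enumerate(labels):
        prefix, _, span_type = label.partition("-")
        # A span continues only on an I- tag of the same type.
        if prefix == "I" and span_type == current_type:
            continue
        # Anything else closes the open span, if there is one.
        if current_type is not None:
            spans.append((current_type, start, i))
            current_type = None
        # B- opens a new span; a stray I- without a B- is treated as a start too.
        if prefix in ("B", "I"):
            current_type, start = span_type, i
    if current_type is not None:
        spans.append((current_type, start, len(labels)))
    return spans


# Made-up label sequence for illustration only.
labels = ["O", "B-majorclaim", "I-majorclaim", "O",
          "B-premise", "I-premise", "I-premise", "B-claim"]
print(bio_to_spans(labels))
# → [('majorclaim', 1, 3), ('premise', 4, 7), ('claim', 7, 8)]
```

Treating a stray I- tag as a span start is a lenient design choice; a stricter decoder could discard such tokens instead.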

Intended uses & limitations

More information needed

Training and evaluation data

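The evaluation set is the train[0%:20%] split of the essays_su_g dataset (full_labels configuration), which contains 29,927 labeled tokens. More information needed on the training split.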

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 5
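To make the linear scheduler concrete, here is a minimal sketch of the learning rate it would produce over the run. It assumes zero warmup steps (warmup is not listed in this card) and uses the 405 total optimizer steps and 81 steps per epoch reported in the training results; the function name `linear_lr` is illustrative.

```python
BASE_LR = 2e-5      # learning_rate from the hyperparameter list
TOTAL_STEPS = 405   # last step reported in the training results (5 epochs x 81 steps)
WARMUP_STEPS = 0    # assumption: no warmup is listed in the card


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step under linear decay to zero."""
    if step < WARMUP_STEPS:
        # Linear ramp-up during warmup (unused when WARMUP_STEPS is 0).
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


# Print the learning rate at each epoch boundary.
for epoch, step in enumerate(range(0, TOTAL_STEPS + 1, 81)):
    print(f"after epoch {epoch}: step={step:3d} lr={linear_lr(step):.2e}")
```

Under these assumptions the rate falls by one fifth of its initial value per epoch and reaches zero at step 405.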

Training results

| Training Loss | Epoch | Step | Validation Loss | B-claim | B-majorclaim | B-premise | I-claim | I-majorclaim | I-premise | O | Accuracy | Macro avg | Weighted avg |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 1.0 | 81 | 0.7937 | {'precision': 0.3116883116883117, 'recall': 0.2535211267605634, 'f1-score': 0.2796116504854369, 'support': 284.0} | {'precision': 0.17391304347826086, 'recall': 0.028368794326241134, 'f1-score': 0.04878048780487805, 'support': 141.0} | {'precision': 0.5714285714285714, 'recall': 0.4689265536723164, 'f1-score': 0.5151280062063615, 'support': 708.0} | {'precision': 0.5458064516129032, 'recall': 0.1013902205177373, 'f1-score': 0.17101273499090358, 'support': 4172.0} | {'precision': 0.4496436318562132, 'recall': 0.698603755416466, 'f1-score': 0.547134238310709, 'support': 2077.0} | {'precision': 0.7235728757001549, 'recall': 0.9698107179937705, 'f1-score': 0.82878886120875, 'support': 12521.0} | {'precision': 0.9145402022147328, 'recall': 0.7579808459696727, 'f1-score': 0.8289330133100589, 'support': 10024.0} | 0.7359 | {'precision': 0.5272275839970211, 'recall': 0.4683717163795382, 'f1-score': 0.4599127131881569, 'support': 29927.0} | {'precision': 0.7336463378005765, 'recall': 0.7358906672904066, 'f1-score': 0.7012848326220683, 'support': 29927.0} |
| No log | 2.0 | 162 | 0.8594 | {'precision': 0.3852813852813853, 'recall': 0.31338028169014087, 'f1-score': 0.34563106796116505, 'support': 284.0} | {'precision': 0.5, 'recall': 0.05673758865248227, 'f1-score': 0.10191082802547771, 'support': 141.0} | {'precision': 0.555984555984556, 'recall': 0.6101694915254238, 'f1-score': 0.5818181818181819, 'support': 708.0} | {'precision': 0.5365853658536586, 'recall': 0.015819750719079578, 'f1-score': 0.030733410942956924, 'support': 4172.0} | {'precision': 0.6063059224541969, 'recall': 0.6851227732306211, 'f1-score': 0.6433092224231466, 'support': 2077.0} | {'precision': 0.7233196891499081, 'recall': 0.9738040092644358, 'f1-score': 0.8300769283137042, 'support': 12521.0} | {'precision': 0.8663324979114453, 'recall': 0.8276137270550679, 'f1-score': 0.8465306122448979, 'support': 10024.0} | 0.7521 | {'precision': 0.5962584880907357, 'recall': 0.4975210888767502, 'f1-score': 0.48285860738993286, 'support': 29927.0} | {'precision': 0.7288499118938129, 'recall': 0.7520633541617937, 'f1-score': 0.6972915776644995, 'support': 29927.0} |
| No log | 3.0 | 243 | 0.6374 | {'precision': 0.4406779661016949, 'recall': 0.2746478873239437, 'f1-score': 0.33839479392624733, 'support': 284.0} | {'precision': 0.6890756302521008, 'recall': 0.5815602836879432, 'f1-score': 0.6307692307692307, 'support': 141.0} | {'precision': 0.6152125279642058, 'recall': 0.7768361581920904, 'f1-score': 0.6866416978776528, 'support': 708.0} | {'precision': 0.43018637335777576, 'recall': 0.6749760306807286, 'f1-score': 0.5254711699944019, 'support': 4172.0} | {'precision': 0.7759119861030689, 'recall': 0.6451612903225806, 'f1-score': 0.7045215562565721, 'support': 2077.0} | {'precision': 0.8966225233548917, 'recall': 0.6975481191598115, 'f1-score': 0.7846554667145808, 'support': 12521.0} | {'precision': 0.8359600857968852, 'recall': 0.8942537909018355, 'f1-score': 0.8641249337253579, 'support': 10024.0} | 0.7540 | {'precision': 0.6690924418472318, 'recall': 0.6492833657527048, 'f1-score': 0.6477969784662919, 'support': 29927.0} | {'precision': 0.7909400854003533, 'recall': 0.7539679887726802, 'f1-score': 0.7623016451053796, 'support': 29927.0} |
| No log | 4.0 | 324 | 0.6704 | {'precision': 0.49489795918367346, 'recall': 0.3415492957746479, 'f1-score': 0.4041666666666667, 'support': 284.0} | {'precision': 0.7155172413793104, 'recall': 0.5886524822695035, 'f1-score': 0.6459143968871596, 'support': 141.0} | {'precision': 0.6989869753979739, 'recall': 0.6822033898305084, 'f1-score': 0.6904932094353109, 'support': 708.0} | {'precision': 0.6432561851556265, 'recall': 0.38638542665388304, 'f1-score': 0.4827792752321055, 'support': 4172.0} | {'precision': 0.6661024121878968, 'recall': 0.757823784304285, 'f1-score': 0.7090090090090089, 'support': 2077.0} | {'precision': 0.8252104563579974, 'recall': 0.8925005989936906, 'f1-score': 0.8575375052756781, 'support': 12521.0} | {'precision': 0.8499952439836393, 'recall': 0.8914604948124502, 'f1-score': 0.8702342114232848, 'support': 10024.0} | 0.8006 | {'precision': 0.6991380676637311, 'recall': 0.6486536389484242, 'f1-score': 0.6657334677041735, 'support': 29927.0} | {'precision': 0.7904665918521213, 'recall': 0.8006148294182511, 'f1-score': 0.7899872403655044, 'support': 29927.0} |
| No log | 5.0 | 405 | 0.7047 | {'precision': 0.4620938628158845, 'recall': 0.4507042253521127, 'f1-score': 0.45632798573975053, 'support': 284.0} | {'precision': 0.7, 'recall': 0.5957446808510638, 'f1-score': 0.6436781609195402, 'support': 141.0} | {'precision': 0.6952247191011236, 'recall': 0.6991525423728814, 'f1-score': 0.6971830985915493, 'support': 708.0} | {'precision': 0.5342320909331219, 'recall': 0.48441994247363374, 'f1-score': 0.5081081081081082, 'support': 4172.0} | {'precision': 0.7541263517359135, 'recall': 0.6379393355801637, 'f1-score': 0.6911841418883673, 'support': 2077.0} | {'precision': 0.8258639910813824, 'recall': 0.8874690519926524, 'f1-score': 0.8555589775177087, 'support': 12521.0} | {'precision': 0.886796294411076, 'recall': 0.8690143655227454, 'f1-score': 0.8778152869451302, 'support': 10024.0} | 0.7978 | {'precision': 0.6940481871540717, 'recall': 0.660634877735036, 'f1-score': 0.6756936799585934, 'support': 29927.0} | {'precision': 0.7935035105957297, 'recall': 0.7978079994653657, 'f1-score': 0.7946353555655076, 'support': 29927.0} |
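The per-epoch rows do not all point at the same "best" checkpoint. As a small sketch, the snippet below copies the validation loss, accuracy, and macro-average F1 from the table and picks the best epoch under each criterion. The tuple layout and variable names are illustrative only.

```python
# (epoch, step, validation_loss, accuracy, macro_f1), copied from the training results.
history = [
    (1, 81, 0.7937, 0.7359, 0.4599127131881569),
    (2, 162, 0.8594, 0.7521, 0.48285860738993286),
    (3, 243, 0.6374, 0.7540, 0.6477969784662919),
    (4, 324, 0.6704, 0.8006, 0.6657334677041735),
    (5, 405, 0.7047, 0.7978, 0.6756936799585934),
]

best_by_loss = min(history, key=lambda row: row[2])      # lowest validation loss
best_by_accuracy = max(history, key=lambda row: row[3])  # highest accuracy
best_by_macro_f1 = max(history, key=lambda row: row[4])  # highest macro-average F1

print("lowest validation loss: epoch", best_by_loss[0])   # → epoch 3
print("highest accuracy:       epoch", best_by_accuracy[0])  # → epoch 4
print("highest macro F1:       epoch", best_by_macro_f1[0])  # → epoch 5
```

The headline results in this card match the epoch 5 row, which has the highest macro-average F1 but neither the lowest validation loss nor the highest accuracy.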

Framework versions

  • Transformers 4.37.2
  • Pytorch 2.2.0+cu121
  • Datasets 2.17.0
  • Tokenizers 0.15.2
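
When trying to reproduce these results, it can help to compare the local environment against the versions listed above. The following is a minimal sketch using only the standard library. It assumes the usual PyPI distribution names (`transformers`, `torch`, `datasets`, `tokenizers`), and the helper `compare_versions` is hypothetical.

```python
from importlib import metadata

# Versions listed in this card, keyed by assumed PyPI distribution name.
EXPECTED = {
    "transformers": "4.37.2",
    "torch": "2.2.0+cu121",
    "datasets": "2.17.0",
    "tokenizers": "0.15.2",
}


def compare_versions(expected=EXPECTED):
    """Return {package: (expected, installed_or_None, matches)} for each package."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (wanted, installed, installed == wanted)
    return report


for package, (wanted, installed, matches) in compare_versions().items():
    status = "ok" if matches else "differs"
    print(f"{package}: expected {wanted}, found {installed} ({status})")
```

A mismatch does not necessarily prevent the model from loading; the check only reports differences from the training environment.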