
TURNA_spell_correction

This model is a fine-tuned version of boun-tabi-LMG/TURNA on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0425
  • Rouge1: 0.8554
  • Rouge2: 0.2215
  • Rougel: 0.8561
  • Rougelsum: 0.8555
  • Bleu: 0.9246
  • Precisions: [0.8658536585365854, 0.8441558441558441, 1.0, 1.0]
  • Brevity Penalty: 1.0
  • Length Ratio: 1.0017
  • Translation Length: 574
  • Reference Length: 573
  • Meteor: 0.4890
  • Score: 13.4380
  • Num Edits: 77
  • Ref Length: 573.0
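
Since TURNA is an encoder-decoder (T5-style) model, the fine-tuned checkpoint can be loaded through the transformers text2text-generation pipeline. The sketch below is illustrative only: the repository id comes from this card, but the example input sentence and generation settings (e.g., max_new_tokens) are assumptions, and the card does not document any required input prefix or prompt format.

```python
# Minimal inference sketch for the spell-correction checkpoint.
from transformers import pipeline

# Encoder-decoder checkpoint, so it goes through text2text-generation.
corrector = pipeline(
    "text2text-generation",
    model="Holmeister/TURNA_spell_correction",
)

# Example Turkish sentence with missing diacritics (placeholder input;
# the card does not specify a prompt format).
text = "Bugun hava cok guzeldi ama aksam yagmur yagdi."

result = corrector(text, max_new_tokens=64)
print(result[0]["generated_text"])
```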

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 5
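
These values map onto a Seq2SeqTrainingArguments configuration roughly like the sketch below. This is a reconstruction from the list above, not the actual training script; fields not listed on the card (output_dir, predict_with_generate) are assumptions, and the Adam settings shown are the transformers defaults anyway.

```python
# Rough reconstruction of the training configuration from the listed
# hyperparameters (not the actual training script behind this card).
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="TURNA_spell_correction",  # assumed output directory
    learning_rate=5e-05,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    num_train_epochs=5,
    lr_scheduler_type="linear",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    predict_with_generate=True,  # assumed, since ROUGE/BLEU/METEOR are reported
)
```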

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Meteor | Score | Num Edits | Ref Length |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0.5013 | 196 | 0.1719 | 0.5130 | 0.1519 | 0.5138 | 0.5132 | 0.0 | [0.5173611111111112, 0.46835443037974683, 0.0, 0.0] | 1.0 | 1.0052 | 576 | 573 | 0.2860 | 49.5637 | 284 | 573.0 |
| 1.8157 | 1.0026 | 392 | 0.0889 | 0.6976 | 0.2011 | 0.6993 | 0.6991 | 0.7805 | [0.7142857142857143, 0.7792207792207793, 0.6666666666666666, 1.0] | 1.0 | 1.0017 | 574 | 573 | 0.4017 | 29.1449 | 167 | 573.0 |
| 1.8157 | 1.5038 | 588 | 0.0701 | 0.7537 | 0.2023 | 0.7545 | 0.7554 | 0.8780 | [0.7688266199649737, 0.7837837837837838, 1.0, 1.0] | 0.9965 | 0.9965 | 571 | 573 | 0.4301 | 23.3857 | 134 | 573.0 |
| 0.0846 | 2.0051 | 784 | 0.0527 | 0.7805 | 0.214 | 0.7829 | 0.7813 | 0.9036 | [0.8017543859649123, 0.8493150684931506, 1.0, 1.0] | 0.9948 | 0.9948 | 570 | 573 | 0.4482 | 20.2443 | 116 | 573.0 |
| 0.0846 | 2.5064 | 980 | 0.0502 | 0.8055 | 0.215 | 0.8075 | 0.8073 | 0.9125 | [0.8181818181818182, 0.8533333333333334, 1.0, 1.0] | 0.9983 | 0.9983 | 572 | 573 | 0.4589 | 18.3246 | 105 | 573.0 |
| 0.0401 | 3.0077 | 1176 | 0.0441 | 0.8229 | 0.223 | 0.8248 | 0.8249 | 0.9214 | [0.8374125874125874, 0.8666666666666667, 1.0, 1.0] | 0.9983 | 0.9983 | 572 | 573 | 0.4693 | 16.4049 | 94 | 573.0 |
| 0.0401 | 3.5090 | 1372 | 0.0435 | 0.8467 | 0.231 | 0.8480 | 0.8473 | 0.9271 | [0.8583916083916084, 0.8666666666666667, 1.0, 1.0] | 0.9983 | 0.9983 | 572 | 573 | 0.4818 | 14.3106 | 82 | 573.0 |
| 0.0228 | 4.0102 | 1568 | 0.0427 | 0.8569 | 0.227 | 0.8591 | 0.8581 | 0.9400 | [0.8671328671328671, 0.9066666666666666, 1.0, 1.0] | 0.9983 | 0.9983 | 572 | 573 | 0.4874 | 13.4380 | 77 | 573.0 |
| 0.0228 | 4.5115 | 1764 | 0.0420 | 0.8539 | 0.224 | 0.8556 | 0.8539 | 0.9321 | [0.8636363636363636, 0.88, 1.0, 1.0] | 0.9983 | 0.9983 | 572 | 573 | 0.4866 | 13.7871 | 79 | 573.0 |
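
The metric columns can be reproduced with the evaluate library along the lines of the sketch below. The predictions and references are placeholders, and the use of the `ter` metric is an inference from the Score / Num Edits / Ref Length column names (they match the output keys of evaluate's TER metric), not something stated on the card.

```python
# Sketch of computing the reported metrics with the evaluate library;
# predictions/references below are placeholders, not card data.
import evaluate

predictions = ["bugün hava çok güzeldi"]   # model outputs (placeholder)
references = ["bugün hava çok güzeldi"]    # gold corrections (placeholder)

rouge = evaluate.load("rouge")
bleu = evaluate.load("bleu")
meteor = evaluate.load("meteor")
ter = evaluate.load("ter")  # assumed source of Score / Num Edits / Ref Length

results = {}
results.update(rouge.compute(predictions=predictions, references=references))
results.update(meteor.compute(predictions=predictions, references=references))
# BLEU and TER are sacrebleu-backed and take one reference list per prediction.
results.update(bleu.compute(predictions=predictions,
                            references=[[r] for r in references]))
results.update(ter.compute(predictions=predictions,
                           references=[[r] for r in references]))
print(results)
```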

Framework versions

  • Transformers 4.40.1
  • Pytorch 2.3.0+cu121
  • Datasets 2.19.0
  • Tokenizers 0.19.1

Citation Information

Uludoğan, G., Balal, Z. Y., Akkurt, F., Türker, M., Güngör, O., & Üsküdarlı, S. (2024). TURNA: A Turkish encoder-decoder language model for enhanced understanding and generation. arXiv preprint arXiv:2401.14373.