
PoliteT5Base

This model is a fine-tuned version of google/flan-t5-base on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 1.8536
  • Toxicity Ratio: 0.3421

Model description

More information needed

Intended uses & limitations

More information needed
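
While the card leaves uses and limitations unspecified, a minimal loading sketch is given below. It assumes a standard seq2seq checkpoint layout; the repository id "PoliteT5Base" is a placeholder inferred from the title and may not match the actual hub path, and the prompt format is likewise an assumption.

```python
# Minimal usage sketch, assuming a standard seq2seq checkpoint.
# "PoliteT5Base" is a placeholder repo id inferred from the title,
# not a confirmed hub path.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("PoliteT5Base")
model = AutoModelForSeq2SeqLM.from_pretrained("PoliteT5Base")

# The exact prompt format is undocumented; a rewrite-style
# instruction is assumed here.
inputs = tokenizer(
    "Make this sentence more polite: Give me the report now.",
    return_tensors="pt",
)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```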

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged sketch of an equivalent TrainingArguments configuration follows the list):

  • learning_rate: 0.01
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 75
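
As referenced above, here is a sketch of how these values map onto a Hugging Face Seq2SeqTrainingArguments object. The output directory and the choice of the Seq2Seq trainer API are assumptions; the card does not state how training was run.

```python
# Sketch only: mirrors the hyperparameters listed above.
# output_dir and the Seq2Seq API choice are assumptions.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="polite-t5-base",   # hypothetical path
    learning_rate=1e-2,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=75,
    evaluation_strategy="epoch",   # inferred from the per-epoch rows below
)
# Adam with betas=(0.9, 0.999) and eps=1e-8 matches the Trainer's
# default adam_beta1/adam_beta2/adam_epsilon settings.
```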

Training results

| Training Loss | Epoch | Step | Validation Loss | Toxicity Ratio |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|
| No log | 1.0 | 22 | 1.3256 | 0.3070 |
| No log | 2.0 | 44 | 0.8436 | 0.2982 |
| 1.6337 | 3.0 | 66 | 0.7944 | 0.3333 |
| 1.6337 | 4.0 | 88 | 0.8921 | 0.3158 |
| 0.547 | 5.0 | 110 | 0.9630 | 0.2632 |
| 0.547 | 6.0 | 132 | 0.9711 | 0.3158 |
| 0.3279 | 7.0 | 154 | 0.9966 | 0.3070 |
| 0.3279 | 8.0 | 176 | 1.0053 | 0.3246 |
| 0.3279 | 9.0 | 198 | 1.0326 | 0.3333 |
| 0.2282 | 10.0 | 220 | 0.9798 | 0.3158 |
| 0.2282 | 11.0 | 242 | 1.0093 | 0.3333 |
| 0.1837 | 12.0 | 264 | 1.2380 | 0.3246 |
| 0.1837 | 13.0 | 286 | 1.1889 | 0.3860 |
| 0.1546 | 14.0 | 308 | 1.1985 | 0.3596 |
| 0.1546 | 15.0 | 330 | 1.2296 | 0.3509 |
| 0.1178 | 16.0 | 352 | 1.1394 | 0.3684 |
| 0.1178 | 17.0 | 374 | 1.1712 | 0.3596 |
| 0.1178 | 18.0 | 396 | 1.1586 | 0.4035 |
| 0.1185 | 19.0 | 418 | 1.9263 | 0.0789 |
| 0.1185 | 20.0 | 440 | 1.3483 | 0.3246 |
| 0.2332 | 21.0 | 462 | 1.3163 | 0.3158 |
| 0.2332 | 22.0 | 484 | 1.2926 | 0.3509 |
| 0.1267 | 23.0 | 506 | 1.2691 | 0.3421 |
| 0.1267 | 24.0 | 528 | 1.3298 | 0.3596 |
| 0.0879 | 25.0 | 550 | 1.2795 | 0.3509 |
| 0.0879 | 26.0 | 572 | 1.2826 | 0.3246 |
| 0.0879 | 27.0 | 594 | 1.2884 | 0.3158 |
| 0.0747 | 28.0 | 616 | 1.4146 | 0.4035 |
| 0.0747 | 29.0 | 638 | 1.3577 | 0.3596 |
| 0.0714 | 30.0 | 660 | 1.2663 | 0.3509 |
| 0.0714 | 31.0 | 682 | 1.2508 | 0.3772 |
| 0.0566 | 32.0 | 704 | 1.3980 | 0.4035 |
| 0.0566 | 33.0 | 726 | 1.4006 | 0.3860 |
| 0.0566 | 34.0 | 748 | 1.4090 | 0.3596 |
| 0.0572 | 35.0 | 770 | 1.4681 | 0.3246 |
| 0.0572 | 36.0 | 792 | 1.4254 | 0.3947 |
| 0.0456 | 37.0 | 814 | 1.4932 | 0.3246 |
| 0.0456 | 38.0 | 836 | 1.3994 | 0.2982 |
| 0.0385 | 39.0 | 858 | 1.4511 | 0.3421 |
| 0.0385 | 40.0 | 880 | 1.3007 | 0.3684 |
| 0.0223 | 41.0 | 902 | 1.3961 | 0.3158 |
| 0.0223 | 42.0 | 924 | 1.4619 | 0.3246 |
| 0.0223 | 43.0 | 946 | 1.3996 | 0.3246 |
| 0.0199 | 44.0 | 968 | 1.5012 | 0.3509 |
| 0.0199 | 45.0 | 990 | 1.4104 | 0.3246 |
| 0.018 | 46.0 | 1012 | 1.5855 | 0.3333 |
| 0.018 | 47.0 | 1034 | 1.4603 | 0.3333 |
| 0.0146 | 48.0 | 1056 | 1.5335 | 0.3421 |
| 0.0146 | 49.0 | 1078 | 1.4883 | 0.3772 |
| 0.0131 | 50.0 | 1100 | 1.5366 | 0.2982 |
| 0.0131 | 51.0 | 1122 | 1.5762 | 0.3509 |
| 0.0131 | 52.0 | 1144 | 1.5434 | 0.3333 |
| 0.0073 | 53.0 | 1166 | 1.4730 | 0.3158 |
| 0.0073 | 54.0 | 1188 | 1.5133 | 0.3509 |
| 0.0049 | 55.0 | 1210 | 1.6912 | 0.3509 |
| 0.0049 | 56.0 | 1232 | 1.6376 | 0.3509 |
| 0.0028 | 57.0 | 1254 | 1.8260 | 0.3509 |
| 0.0028 | 58.0 | 1276 | 1.5748 | 0.3509 |
| 0.0028 | 59.0 | 1298 | 1.6631 | 0.3509 |
| 0.0029 | 60.0 | 1320 | 1.7458 | 0.3509 |
| 0.0029 | 61.0 | 1342 | 1.6343 | 0.3684 |
| 0.002 | 62.0 | 1364 | 1.6433 | 0.3421 |
| 0.002 | 63.0 | 1386 | 1.7486 | 0.3509 |
| 0.0014 | 64.0 | 1408 | 1.8081 | 0.3684 |
| 0.0014 | 65.0 | 1430 | 1.8987 | 0.3947 |
| 0.0007 | 66.0 | 1452 | 1.8811 | 0.3596 |
| 0.0007 | 67.0 | 1474 | 1.8541 | 0.3596 |
| 0.0007 | 68.0 | 1496 | 1.8233 | 0.3509 |
| 0.001 | 69.0 | 1518 | 1.7747 | 0.3509 |
| 0.001 | 70.0 | 1540 | 1.8105 | 0.3509 |
| 0.0008 | 71.0 | 1562 | 1.8254 | 0.3596 |
| 0.0008 | 72.0 | 1584 | 1.8444 | 0.3684 |
| 0.0008 | 73.0 | 1606 | 1.8387 | 0.3509 |
| 0.0008 | 74.0 | 1628 | 1.8501 | 0.3509 |
| 0.0004 | 75.0 | 1650 | 1.8536 | 0.3421 |
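
The card does not define "Toxicity Ratio". A plausible, purely illustrative reading is the fraction of generated outputs that an external toxicity classifier flags as toxic; the sketch below assumes that definition and uses unitary/toxic-bert as a stand-in scorer (neither the definition nor the scorer is confirmed by the card).

```python
# Illustrative only: the card does not define "Toxicity Ratio".
# Assumed here: the fraction of generations a toxicity classifier
# flags as toxic; "unitary/toxic-bert" is a stand-in scorer.
from transformers import pipeline

toxicity_clf = pipeline("text-classification", model="unitary/toxic-bert")

def toxicity_ratio(generations, threshold=0.5):
    """Fraction of texts whose top label is 'toxic' above `threshold`
    (assumed metric, not the card's definition)."""
    scores = toxicity_clf(generations, truncation=True)
    flagged = sum(
        1 for s in scores
        if s["label"] == "toxic" and s["score"] > threshold
    )
    return flagged / len(generations)
```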

Framework versions

  • Transformers 4.28.0
  • Pytorch 2.0.0
  • Datasets 2.11.0
  • Tokenizers 0.13.3