PoliteT5Base

This model is a fine-tuned version of google/flan-t5-base. The training dataset is not specified. After the final epoch (75), it achieves the following results on the evaluation set:

Loss: 1.8536
Toxicity Ratio: 0.3421

Model description

More information needed

Intended uses & limitations

More information needed
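
A minimal loading sketch, assuming the checkpoint is published on the Hugging Face Hub under the repository ID Wazzzabeee/PoliteT5Base and loads with the standard seq2seq classes used for its base model. The helper name `generate_text` and the `max_new_tokens=64` default are illustrative choices, not part of this card. No input prompt format is documented here, so the text is passed to the model unchanged.

```python
MODEL_ID = "Wazzzabeee/PoliteT5Base"  # assumption: Hub repository ID of this model


def generate_text(text: str, max_new_tokens: int = 64) -> str:
    """Run `text` through the model and return the decoded output.

    Hypothetical helper for illustration; the prompt format is an assumption.
    """
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Tokenize to PyTorch tensors and generate with default decoding settings.
    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_text("some input")` downloads the weights on first use and needs both transformers and PyTorch installed.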

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.01
train_batch_size: 64
eval_batch_size: 64
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 75
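
The linear schedule above can be sketched in plain Python to see how the learning rate decays over the run. This is a rough illustration, assuming no warmup steps (none are listed) and 22 optimizer steps per epoch, as reported in the training results table. The function name `linear_lr` is hypothetical.

```python
LEARNING_RATE = 0.01   # learning_rate from the hyperparameters
NUM_EPOCHS = 75        # num_epochs from the hyperparameters
STEPS_PER_EPOCH = 22   # from the results table (step 22 at epoch 1.0)
WARMUP_STEPS = 0       # assumption: the card lists no warmup

# 75 epochs * 22 steps = 1650, the last step in the results table.
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH


def linear_lr(step: int) -> float:
    """Learning rate after `step` optimizer updates with linear warmup and decay."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


# Print the learning rate at the start of a few epochs.
for epoch in (0, 25, 50, 75):
    print(epoch, linear_lr(epoch * STEPS_PER_EPOCH))
```

Under these assumptions the rate starts at 0.01, is halved at the midpoint of training, and reaches zero at the final step.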

Training results

Training Loss	Epoch	Step	Validation Loss	Toxicity Ratio
No log	1.0	22	1.3256	0.3070
No log	2.0	44	0.8436	0.2982
1.6337	3.0	66	0.7944	0.3333
1.6337	4.0	88	0.8921	0.3158
0.547	5.0	110	0.9630	0.2632
0.547	6.0	132	0.9711	0.3158
0.3279	7.0	154	0.9966	0.3070
0.3279	8.0	176	1.0053	0.3246
0.3279	9.0	198	1.0326	0.3333
0.2282	10.0	220	0.9798	0.3158
0.2282	11.0	242	1.0093	0.3333
0.1837	12.0	264	1.2380	0.3246
0.1837	13.0	286	1.1889	0.3860
0.1546	14.0	308	1.1985	0.3596
0.1546	15.0	330	1.2296	0.3509
0.1178	16.0	352	1.1394	0.3684
0.1178	17.0	374	1.1712	0.3596
0.1178	18.0	396	1.1586	0.4035
0.1185	19.0	418	1.9263	0.0789
0.1185	20.0	440	1.3483	0.3246
0.2332	21.0	462	1.3163	0.3158
0.2332	22.0	484	1.2926	0.3509
0.1267	23.0	506	1.2691	0.3421
0.1267	24.0	528	1.3298	0.3596
0.0879	25.0	550	1.2795	0.3509
0.0879	26.0	572	1.2826	0.3246
0.0879	27.0	594	1.2884	0.3158
0.0747	28.0	616	1.4146	0.4035
0.0747	29.0	638	1.3577	0.3596
0.0714	30.0	660	1.2663	0.3509
0.0714	31.0	682	1.2508	0.3772
0.0566	32.0	704	1.3980	0.4035
0.0566	33.0	726	1.4006	0.3860
0.0566	34.0	748	1.4090	0.3596
0.0572	35.0	770	1.4681	0.3246
0.0572	36.0	792	1.4254	0.3947
0.0456	37.0	814	1.4932	0.3246
0.0456	38.0	836	1.3994	0.2982
0.0385	39.0	858	1.4511	0.3421
0.0385	40.0	880	1.3007	0.3684
0.0223	41.0	902	1.3961	0.3158
0.0223	42.0	924	1.4619	0.3246
0.0223	43.0	946	1.3996	0.3246
0.0199	44.0	968	1.5012	0.3509
0.0199	45.0	990	1.4104	0.3246
0.018	46.0	1012	1.5855	0.3333
0.018	47.0	1034	1.4603	0.3333
0.0146	48.0	1056	1.5335	0.3421
0.0146	49.0	1078	1.4883	0.3772
0.0131	50.0	1100	1.5366	0.2982
0.0131	51.0	1122	1.5762	0.3509
0.0131	52.0	1144	1.5434	0.3333
0.0073	53.0	1166	1.4730	0.3158
0.0073	54.0	1188	1.5133	0.3509
0.0049	55.0	1210	1.6912	0.3509
0.0049	56.0	1232	1.6376	0.3509
0.0028	57.0	1254	1.8260	0.3509
0.0028	58.0	1276	1.5748	0.3509
0.0028	59.0	1298	1.6631	0.3509
0.0029	60.0	1320	1.7458	0.3509
0.0029	61.0	1342	1.6343	0.3684
0.002	62.0	1364	1.6433	0.3421
0.002	63.0	1386	1.7486	0.3509
0.0014	64.0	1408	1.8081	0.3684
0.0014	65.0	1430	1.8987	0.3947
0.0007	66.0	1452	1.8811	0.3596
0.0007	67.0	1474	1.8541	0.3596
0.0007	68.0	1496	1.8233	0.3509
0.001	69.0	1518	1.7747	0.3509
0.001	70.0	1540	1.8105	0.3509
0.0008	71.0	1562	1.8254	0.3596
0.0008	72.0	1584	1.8444	0.3684
0.0008	73.0	1606	1.8387	0.3509
0.0008	74.0	1628	1.8501	0.3509
0.0004	75.0	1650	1.8536	0.3421
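
The headline metrics come from the final epoch, but the table shows validation loss bottoming out much earlier while training loss keeps falling, a pattern usually associated with overfitting. The sketch below picks the best epoch per metric from an excerpt of the rows above. It assumes lower is better for both columns (the toxicity ratio is not defined in this card), and the helper name `best_epoch` is hypothetical.

```python
# Excerpt of the results table: (epoch, validation_loss, toxicity_ratio).
ROWS = [
    (1, 1.3256, 0.3070),
    (2, 0.8436, 0.2982),
    (3, 0.7944, 0.3333),
    (4, 0.8921, 0.3158),
    (5, 0.9630, 0.2632),
    (19, 1.9263, 0.0789),
    (75, 1.8536, 0.3421),
]

COLUMN_INDEX = {"validation_loss": 1, "toxicity_ratio": 2}


def best_epoch(rows, column):
    """Return the epoch with the smallest value in `column` (lower assumed better)."""
    index = COLUMN_INDEX[column]
    return min(rows, key=lambda row: row[index])[0]


print(best_epoch(ROWS, "validation_loss"))
print(best_epoch(ROWS, "toxicity_ratio"))
```

The two metrics disagree about the best checkpoint. The lowest toxicity ratio (0.0789, epoch 19) coincides with a spike in validation loss (1.9263), so it may be an outlier rather than a better model.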

Framework versions

Transformers 4.28.0
PyTorch 2.0.0
Datasets 2.11.0
Tokenizers 0.13.3

Wazzzabeee/PoliteT5Base
