metadata

tags:
  - generated_from_trainer
metrics:
  - wer
  - bleu
model-index:
  - name: geez_t5-15k
    results: []

geez_t5-15k

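This model is a fine-tuned version of a base model that this card does not name, trained on an unknown dataset. It achieves the following results on the evaluation set (these are the values from the final epoch, step 7250, in the training results table):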

Loss: 1.3233
Wer: 0.2209
Cer: 0.1381
Bleu: 70.4059
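
For reference, WER and CER are conventionally defined as the Levenshtein edit distance between the model output and the reference, divided by the reference length, counted in words and characters respectively. The sketch below is a minimal pure-Python illustration of that definition. It is not the evaluation code used for this model, since the card does not say which implementation produced the numbers; the function names and sample strings are made up for the example.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists or strings)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / reference word count."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: char-level edit distance / reference length."""
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical strings, for illustration only.
reference = "the cat sat"
close_output = "the bat sat"
long_output = "the cat sat on the mat today now"

print(wer(reference, close_output))  # one substituted word out of three
print(cer(reference, close_output))  # one substituted character out of eleven
print(wer(reference, long_output))   # five inserted words out of three: above 1.0
```

Because insertions count as errors, both metrics can exceed 1.0 when the output is much longer than the reference. That is consistent with the WER and CER values above 1.0 in the early epochs of the training results.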

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0005
train_batch_size: 128
eval_batch_size: 128
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 50
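
To make the schedule concrete, the sketch below works out the learning rate at a given step under the listed settings. It assumes zero warmup steps (the card lists none) and takes the total of 7250 optimizer steps from the training results table. It also derives a rough bound on the training set size, assuming a single device, no gradient accumulation, and a final partial batch that is kept rather than dropped. None of these assumptions is confirmed by the card, and the names are illustrative.

```python
BASE_LR = 5e-4          # learning_rate
NUM_EPOCHS = 50         # num_epochs
TRAIN_BATCH_SIZE = 128  # train_batch_size
TOTAL_STEPS = 7250      # final step in the training results table
WARMUP_STEPS = 0        # assumption: the card does not list any warmup


def linear_lr(step):
    """Learning rate at `step` for a linear schedule that decays to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


steps_per_epoch = TOTAL_STEPS // NUM_EPOCHS

# Bounds on the number of training examples, under the assumptions above:
# the last batch of an epoch holds between 1 and TRAIN_BATCH_SIZE examples.
max_examples = steps_per_epoch * TRAIN_BATCH_SIZE
min_examples = (steps_per_epoch - 1) * TRAIN_BATCH_SIZE + 1

print(steps_per_epoch)             # steps per epoch
print(linear_lr(0))                # rate at the start of training
print(linear_lr(TOTAL_STEPS // 2)) # rate halfway through
print(min_examples, max_examples)  # rough training set size range
```

Under these assumptions the rate falls from 0.0005 at step 0 to half that value at step 3625 and reaches zero at step 7250.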

Training results

Training Loss	Epoch	Step	Validation Loss	Wer	Cer	Bleu
7.7095	1.0	145	7.7918	5.0898	3.9256	0.0023
7.0334	2.0	290	7.1199	5.0160	4.1855	0.0051
6.4831	3.0	435	6.6645	5.0207	3.8475	0.0214
6.1982	4.0	580	6.3920	4.5634	3.8489	0.0529
5.903	5.0	725	6.1877	4.5275	3.5050	0.0557
5.669	6.0	870	6.0360	4.9197	4.0028	0.0634
5.425	7.0	1015	5.8639	4.4216	3.7590	0.1208
5.2049	8.0	1160	5.7314	3.2761	2.6167	0.1783
5.0061	9.0	1305	5.6525	3.9136	3.2163	0.1433
4.8471	10.0	1450	5.5808	2.8054	2.4552	0.3077
4.6025	11.0	1595	5.4963	3.1738	2.8400	0.2473
4.4593	12.0	1740	5.4572	2.9939	2.6228	0.3764
4.3925	13.0	1885	5.3739	2.4268	2.0558	0.4943
4.2547	14.0	2030	5.3549	2.1811	1.9179	0.6141
4.2059	15.0	2175	5.3532	2.5793	2.2089	0.5485
4.0344	16.0	2320	5.3384	2.1161	1.8753	0.7106
3.8338	17.0	2465	5.3491	2.1119	1.9856	0.6538
3.8922	18.0	2610	5.3233	2.0402	1.8304	0.8877
3.6469	19.0	2755	5.3290	1.7011	1.4942	1.1830
2.8339	20.0	2900	4.1129	1.7063	1.4567	4.0465
1.4826	21.0	3045	2.3404	1.6510	1.4483	11.1205
0.8862	22.0	3190	1.6343	1.4432	1.2622	18.9607
0.603	23.0	3335	1.3605	1.1528	0.9975	27.6554
0.4701	24.0	3480	1.2962	1.0378	0.8913	31.5906
0.4302	25.0	3625	1.2630	0.8397	0.7215	38.0315
0.3239	26.0	3770	1.2441	0.6757	0.5460	44.0109
0.2679	27.0	3915	1.2520	0.6738	0.5478	44.8130
0.2543	28.0	4060	1.2496	0.6416	0.5215	46.1244
0.2113	29.0	4205	1.2534	0.5392	0.4282	50.5640
0.1811	30.0	4350	1.2870	0.6152	0.4961	47.6743
0.1676	31.0	4495	1.2657	0.5494	0.4411	50.7361
0.1523	32.0	4640	1.2986	0.5483	0.4476	50.8212
0.1468	33.0	4785	1.3057	0.4785	0.3744	54.2680
0.1375	34.0	4930	1.3025	0.4506	0.3545	55.8315
0.1259	35.0	5075	1.3367	0.4865	0.3899	54.1053
0.1194	36.0	5220	1.3196	0.4540	0.3581	55.4216
0.1116	37.0	5365	1.3104	0.3943	0.3011	58.6213
0.0968	38.0	5510	1.3477	0.3834	0.2953	59.3219
0.0981	39.0	5655	1.3217	0.4059	0.3112	58.2604
0.0938	40.0	5800	1.3304	0.4132	0.3205	57.7388
0.0823	41.0	5945	1.3023	0.3432	0.2481	61.8713
0.0786	42.0	6090	1.3138	0.2974	0.2027	64.6092
0.0766	43.0	6235	1.3324	0.3680	0.2768	60.6454
0.0765	44.0	6380	1.3266	0.3359	0.2359	62.7278
0.0718	45.0	6525	1.3440	0.3000	0.2163	64.6481
0.0637	46.0	6670	1.3283	0.2628	0.1782	67.2375
0.0658	47.0	6815	1.3331	0.2605	0.1721	67.1960
0.0643	48.0	6960	1.3198	0.2618	0.1780	67.4730
0.0682	49.0	7105	1.3196	0.2732	0.1876	66.2931
0.0605	50.0	7250	1.3233	0.2209	0.1381	70.4059
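
In the table above, validation loss reaches its minimum (1.2441) at epoch 26 and drifts slightly upward afterwards, while WER, CER, and BLEU keep improving through epoch 50. Which checkpoint counts as "best" therefore depends on the selection metric. The sketch below shows one way to compare the two choices. It uses a hand-copied subset of rows from the table, and the column keys are names chosen for this example, not identifiers from the training code.

```python
COLUMNS = ["train_loss", "epoch", "step", "val_loss", "wer", "cer", "bleu"]

# Subset of rows copied from the training results table (not the full log).
ROWS = """
0.4302 25.0 3625 1.2630 0.8397 0.7215 38.0315
0.3239 26.0 3770 1.2441 0.6757 0.5460 44.0109
0.2679 27.0 3915 1.2520 0.6738 0.5478 44.8130
0.0786 42.0 6090 1.3138 0.2974 0.2027 64.6092
0.0682 49.0 7105 1.3196 0.2732 0.1876 66.2931
0.0605 50.0 7250 1.3233 0.2209 0.1381 70.4059
"""

# Parse each whitespace-separated row into a dict of floats.
records = [
    dict(zip(COLUMNS, map(float, line.split())))
    for line in ROWS.strip().splitlines()
]

best_by_loss = min(records, key=lambda r: r["val_loss"])
best_by_wer = min(records, key=lambda r: r["wer"])
best_by_bleu = max(records, key=lambda r: r["bleu"])

print("lowest validation loss at epoch", int(best_by_loss["epoch"]))
print("lowest WER at epoch", int(best_by_wer["epoch"]))
print("highest BLEU at epoch", int(best_by_bleu["epoch"]))
```

Selecting on validation loss would pick epoch 26, where WER is still 0.6757. Selecting on WER or BLEU would pick the final epoch, which is the checkpoint whose results are reported at the top of this card.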

Framework versions

Transformers 4.38.2
Pytorch 2.2.1+cu121
Datasets 2.18.0
Tokenizers 0.15.2