Whisper Medium GA-EN Speech Translation

This model is a fine-tuned version of openai/whisper-medium for Irish-to-English (GA-EN) speech translation, trained on the IWSLT-2023, FLEURS, BiteSize, SpokenWords, Tatoeba, Wikimedia, and EUbookshop datasets. It achieves the following results on the evaluation set (values from the final checkpoint, step 4000):

Loss: 1.0552
Bleu: 33.24
Chrf: 55.16
Wer: 61.5038

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.03
training_steps: 4000
mixed_precision_training: Native AMP
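
To make the scheduler settings above concrete, the following is a minimal sketch of a linear schedule with warmup, using the listed learning_rate, lr_scheduler_warmup_ratio, and training_steps. It assumes the usual convention of a linear ramp from zero over the warmup steps followed by a linear decay to zero, with the warmup step count taken as the ceiling of ratio times total steps. The function names are illustrative and are not taken from the training script.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 1e-4
TRAINING_STEPS = 4000
WARMUP_RATIO = 0.03


def warmup_steps(total_steps: int, ratio: float) -> int:
    """Number of warmup steps (assumption: ceiling of ratio * total)."""
    return math.ceil(total_steps * ratio)


def linear_lr(step: int,
              base_lr: float = LEARNING_RATE,
              total_steps: int = TRAINING_STEPS,
              ratio: float = WARMUP_RATIO) -> float:
    """Learning rate at a given step under linear warmup then linear decay."""
    warmup = warmup_steps(total_steps, ratio)
    if step < warmup:
        # Ramp up linearly from 0 to base_lr.
        return base_lr * step / max(1, warmup)
    # Decay linearly from base_lr to 0 at total_steps.
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup))


if __name__ == "__main__":
    for s in (0, 60, 120, 2060, 4000):
        print(s, linear_lr(s))
```

Under these assumptions the peak rate of 0.0001 is reached after 120 steps, so every row in the results table that follows was logged during the decay phase except the first (step 100).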

Training results

Training Loss	Epoch	Step	Validation Loss	Bleu	Chrf	Wer
2.5219	0.0138	100	2.1106	0.44	10.48	107.2490
2.4608	0.0276	200	2.1816	3.3	20.43	179.1535
2.3008	0.0414	300	2.0587	3.66	21.59	206.4836
2.2095	0.0552	400	1.9459	8.79	27.66	100.3602
2.0454	0.0690	500	1.8681	8.14	27.36	122.1522
1.9937	0.0828	600	1.8717	11.05	30.26	97.2535
1.868	0.0966	700	1.7917	9.14	29.03	129.0410
1.9924	0.1103	800	1.7170	12.62	33.2	89.6443
1.8646	0.1241	900	1.7252	11.98	30.77	97.8838
1.7644	0.1379	1000	1.6832	10.87	31.0	109.1851
1.692	0.1517	1100	1.6837	13.05	34.46	93.3814
1.7044	0.1655	1200	1.5527	20.95	37.42	75.2364
1.6824	0.1793	1300	1.5611	14.91	35.56	92.6159
1.6557	0.1931	1400	1.5554	14.0	36.54	99.8199
1.5456	0.2069	1500	1.5058	19.72	39.81	83.5660
1.3755	0.2207	1600	1.5039	18.04	37.95	82.9806
1.3959	0.2345	1700	1.4374	17.01	39.5	85.2319
1.5012	0.2483	1800	1.4242	14.93	39.24	114.4079
1.4278	0.2621	1900	1.3904	23.85	42.69	73.0302
1.3285	0.2759	2000	1.4493	17.7	37.23	83.8811
1.2655	0.2897	2100	1.3661	20.1	40.32	79.7839
1.2074	0.3034	2200	1.3387	24.45	43.79	72.9851
1.1893	0.3172	2300	1.3308	21.45	42.61	82.3953
1.1236	0.3310	2400	1.3050	22.77	44.17	77.3075
1.0934	0.3448	2500	1.2793	25.54	46.32	72.2647
1.06	0.3586	2600	1.2396	28.27	47.32	65.6911
1.0327	0.3724	2700	1.2577	28.45	47.01	67.3570
1.1623	0.3862	2800	1.2194	24.54	47.43	73.6155
1.0215	0.4	2900	1.2039	27.4	49.6	69.2481
0.9185	0.4138	3000	1.1724	27.04	49.24	67.8973
0.9003	0.4276	3100	1.1674	31.08	50.11	63.8001
0.9839	0.4414	3200	1.1580	30.24	50.63	64.5655
0.9396	0.4552	3300	1.1202	30.79	51.72	64.9257
0.9051	0.4690	3400	1.1180	30.34	53.08	66.4566
0.8621	0.4828	3500	1.1042	33.3	53.86	60.7834
0.8236	0.4966	3600	1.1070	32.77	53.21	62.0441
0.829	0.5103	3700	1.0771	32.49	54.21	62.5844
0.8375	0.5241	3800	1.0780	32.27	53.98	63.0797
0.8206	0.5379	3900	1.0615	33.26	55.07	61.6389
0.8059	0.5517	4000	1.0552	33.24	55.16	61.5038
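
Several early rows report a Wer above 100 (for example 206.4836 at step 300). This is possible because word error rate divides the number of word-level edits by the reference length, so a hypothesis with many extra words can exceed 100%. The sketch below is a plain word-level edit-distance implementation for illustration only. It is an assumption that it matches the metric used here in spirit; the card does not state which library or text normalization produced the reported values.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate in percent: (substitutions + deletions + insertions) / reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the first i-1 reference words
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # A repetitive, over-long hypothesis: 5 insertions against 2 reference words.
    print(wer("good morning", "good good morning morning to you all"))  # 250.0
```

A model early in fine-tuning that repeats or hallucinates words behaves like the over-long hypothesis in this example, which is consistent with Wer falling below 100 only once the output length stabilizes.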

Framework versions

Transformers 4.41.2
Pytorch 2.2.0+cu121
Datasets 2.20.0
Tokenizers 0.19.1
