MubarakB/ft-t5-small-lg
	---
	library_name: transformers
	license: apache-2.0
	base_model: t5-small
	tags:
	- generated_from_trainer
	metrics:
	- bleu
	model-index:
	- name: ft-t5-small-lg
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# ft-t5-small-lg

	This model is a fine-tuned version of [t5-small](https://huggingface.co/t5-small) on the Luganda Formal Data dataset.
	After the final epoch (epoch 30, step 61530), it achieves the following results on the evaluation set:
	- Loss: 0.2411
	- Bleu: 1.4907
	- Gen Len: 14.5428
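
A minimal sketch of loading the checkpoint for generation with the `transformers` seq2seq classes. The repository id is taken from the page header; the card does not document the expected input format (for example, whether a T5-style task prefix is required), so the helper name `generate_text`, the `max_new_tokens` default, and the raw-text input are assumptions for illustration only.

```python
# Sketch only: the card does not specify the input format or decoding settings.
MODEL_ID = "MubarakB/ft-t5-small-lg"  # repo id as shown on the model page


def generate_text(texts, model_id=MODEL_ID, max_new_tokens=32):
    """Run greedy generation on a list of input strings (hypothetical helper)."""
    # Imported lazily so that defining the helper does not require the library.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Pad to the longest input in the batch and truncate to the model maximum.
    inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```

Calling `generate_text(["some input sentence"])` downloads the weights on first use and returns one decoded string per input.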

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 16
	- eval_batch_size: 16
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 30
	- mixed_precision_training: Native AMP
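
The `linear` scheduler decays the learning rate from its initial value to zero over the whole run. The sketch below reproduces that decay in plain Python, assuming no warmup (none is listed above) and 2051 optimizer steps per epoch, as reported in the training results table. The function name `linear_lr` is illustrative and not part of any library.

```python
BASE_LR = 2e-5
NUM_EPOCHS = 30
STEPS_PER_EPOCH = 2051  # from the training results table (step 2051 at epoch 1.0)
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH  # 61530, the final step in the table


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup_steps:
        # Linear ramp from 0 to base_lr (unused here: warmup is assumed to be 0).
        return base_lr * (step / max(1, warmup_steps))
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


for epoch in (0, 10, 20, 30):
    print(epoch, linear_lr(epoch * STEPS_PER_EPOCH))
```

Under these assumptions the rate is halved by the end of epoch 15 and reaches zero at step 61530.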

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
	|:-------------:|:-----:|:-----:|:---------------:|:------:|:-------:|
	| 0.3208 | 1.0 | 2051 | 0.2999 | 0.0574 | 8.6396 |
	| 0.3054 | 2.0 | 4102 | 0.2890 | 0.1846 | 8.7257 |
	| 0.2954 | 3.0 | 6153 | 0.2820 | 0.2253 | 11.5285 |
	| 0.2915 | 4.0 | 8204 | 0.2755 | 0.2485 | 11.8231 |
	| 0.2841 | 5.0 | 10255 | 0.2706 | 0.1711 | 14.2913 |
	| 0.2809 | 6.0 | 12306 | 0.2667 | 0.2453 | 14.0332 |
	| 0.2758 | 7.0 | 14357 | 0.2635 | 0.3568 | 15.1871 |
	| 0.2721 | 8.0 | 16408 | 0.2609 | 0.4433 | 15.1297 |
	| 0.2683 | 9.0 | 18459 | 0.2586 | 0.5148 | 14.9026 |
	| 0.2668 | 10.0 | 20510 | 0.2562 | 0.5717 | 14.9704 |
	| 0.2658 | 11.0 | 22561 | 0.2546 | 0.6013 | 14.9334 |
	| 0.2665 | 12.0 | 24612 | 0.2528 | 0.6211 | 14.7852 |
	| 0.2611 | 13.0 | 26663 | 0.2512 | 0.6801 | 14.7521 |
	| 0.2617 | 14.0 | 28714 | 0.2499 | 0.7704 | 14.8426 |
	| 0.2589 | 15.0 | 30765 | 0.2486 | 0.846 | 14.7227 |
	| 0.257 | 16.0 | 32816 | 0.2477 | 0.9404 | 14.6676 |
	| 0.2552 | 17.0 | 34867 | 0.2466 | 0.8846 | 14.5691 |
	| 0.2577 | 18.0 | 36918 | 0.2458 | 1.0307 | 14.6182 |
	| 0.254 | 19.0 | 38969 | 0.2450 | 1.038 | 14.5272 |
	| 0.2539 | 20.0 | 41020 | 0.2442 | 1.1301 | 14.5494 |
	| 0.2524 | 21.0 | 43071 | 0.2436 | 1.1553 | 14.571 |
	| 0.2555 | 22.0 | 45122 | 0.2429 | 1.2626 | 14.6193 |
	| 0.2506 | 23.0 | 47173 | 0.2427 | 1.3183 | 14.5 |
	| 0.2491 | 24.0 | 49224 | 0.2421 | 1.3981 | 14.5801 |
	| 0.2499 | 25.0 | 51275 | 0.2419 | 1.4025 | 14.534 |
	| 0.2482 | 26.0 | 53326 | 0.2415 | 1.404 | 14.5639 |
	| 0.2479 | 27.0 | 55377 | 0.2414 | 1.4074 | 14.554 |
	| 0.247 | 28.0 | 57428 | 0.2412 | 1.4902 | 14.542 |
	| 0.2477 | 29.0 | 59479 | 0.2411 | 1.4932 | 14.5653 |
	| 0.2477 | 30.0 | 61530 | 0.2411 | 1.4907 | 14.5428 |
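
The table can be scanned programmatically to see how the two evaluation signals behave. The sketch below copies the epoch, validation loss, and Bleu columns from the table and reports the epoch with the highest Bleu, whether validation loss ever increased, and the epochs where Bleu dropped relative to the previous epoch. The variable names are illustrative.

```python
# (epoch, validation_loss, bleu), copied from the training results table.
ROWS = [
    (1, 0.2999, 0.0574), (2, 0.2890, 0.1846), (3, 0.2820, 0.2253),
    (4, 0.2755, 0.2485), (5, 0.2706, 0.1711), (6, 0.2667, 0.2453),
    (7, 0.2635, 0.3568), (8, 0.2609, 0.4433), (9, 0.2586, 0.5148),
    (10, 0.2562, 0.5717), (11, 0.2546, 0.6013), (12, 0.2528, 0.6211),
    (13, 0.2512, 0.6801), (14, 0.2499, 0.7704), (15, 0.2486, 0.846),
    (16, 0.2477, 0.9404), (17, 0.2466, 0.8846), (18, 0.2458, 1.0307),
    (19, 0.2450, 1.038), (20, 0.2442, 1.1301), (21, 0.2436, 1.1553),
    (22, 0.2429, 1.2626), (23, 0.2427, 1.3183), (24, 0.2421, 1.3981),
    (25, 0.2419, 1.4025), (26, 0.2415, 1.404), (27, 0.2414, 1.4074),
    (28, 0.2412, 1.4902), (29, 0.2411, 1.4932), (30, 0.2411, 1.4907),
]

# Epoch with the highest Bleu score.
best_epoch, _, best_bleu = max(ROWS, key=lambda row: row[2])

# True if validation loss never goes up from one epoch to the next.
loss_non_increasing = all(b[1] <= a[1] for a, b in zip(ROWS, ROWS[1:]))

# Epochs whose Bleu is lower than the previous epoch's.
bleu_dips = [b[0] for a, b in zip(ROWS, ROWS[1:]) if b[2] < a[2]]

print(best_epoch, best_bleu, loss_non_increasing, bleu_dips)
```

Validation loss never increases across the run, while Bleu peaks at epoch 29 (1.4932), slightly above the final-epoch value of 1.4907.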


	### Framework versions

	- Transformers 4.44.2
	- PyTorch 2.4.0
	- Datasets 3.0.0
	- Tokenizers 0.19.1
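
To check whether a local environment matches the versions listed above, something like the following standard-library sketch may help. The PyPI distribution names (`torch` for PyTorch in particular) and the helper names are assumptions. Local build suffixes such as `+cu121` are ignored when comparing.

```python
from importlib import metadata

# Versions listed in this card, keyed by assumed PyPI distribution name.
EXPECTED = {
    "transformers": "4.44.2",
    "torch": "2.4.0",
    "datasets": "3.0.0",
    "tokenizers": "0.19.1",
}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_mismatches(expected=EXPECTED, lookup=installed_version):
    """Map each package that differs to a (wanted, found) pair."""
    mismatches = {}
    for package, wanted in expected.items():
        found = lookup(package)
        # Compare only the public version, dropping any local "+..." suffix.
        if found is None or found.split("+")[0] != wanted:
            mismatches[package] = (wanted, found)
    return mismatches
```

`version_mismatches()` returns an empty dict when all four packages match, and otherwise lists each differing or missing package with its wanted and found versions.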