	---
	license: mit
	base_model: microsoft/git-base
	tags:
	- generated_from_trainer
	model-index:
	- name: GIT-naruto
	  results: []
	---


	# GIT-naruto

	This model is a fine-tuned version of [microsoft/git-base](https://huggingface.co/microsoft/git-base) on an unknown dataset.
	It achieves the following results on the evaluation set:
	- Loss: 0.0774
	- Wer Score: 16.0923
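
A Wer Score far above 1 can look surprising. The card does not say how the metric was computed. Assuming it is the standard word error rate (word-level edit distance divided by the number of reference words), values above 1 arise when the generated captions contain many more words than the references. The sketch below illustrates that definition in plain Python; `word_error_rate` is an illustrative name, not a function from the training code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


# A hypothesis that repeats itself needs four insertions against a
# two-word reference, so the rate exceeds 1.
print(word_error_rate("a ninja", "a ninja a ninja a ninja"))  # 2.0
```

Under this reading, a score of about 16 would mean roughly sixteen word-level edits per reference word, which is consistent with very long or repetitive generated captions, though the card itself does not confirm the cause.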

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 5e-05
	- train_batch_size: 1
	- eval_batch_size: 1
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 50
	- mixed_precision_training: Native AMP
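
To make `lr_scheduler_type: linear` concrete, the sketch below computes the learning rate at a given optimizer step. It assumes no warm-up, since the card lists none, and takes the total step count (2700) from the last row of the training results table. `linear_lr` and the constant names are illustrative, not taken from the training script.

```python
PEAK_LR = 5e-05     # learning_rate from the hyperparameter list
TOTAL_STEPS = 2700  # final step in the training results table
WARMUP_STEPS = 0    # assumption: the card does not list any warm-up


def linear_lr(step: int,
              peak_lr: float = PEAK_LR,
              total_steps: int = TOTAL_STEPS,
              warmup_steps: int = WARMUP_STEPS) -> float:
    """Linear warm-up to peak_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 1350, 2700):
    print(step, linear_lr(step))
```

Under these assumptions the rate is halved by the midpoint of training (step 1350) and reaches zero at the final step.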

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Wer Score |
	|:-------------:|:-----:|:----:|:---------------:|:---------:|
	| 7.4722 | 0.93 | 50 | 4.5072 | 21.6154 |
	| 2.1729 | 1.85 | 100 | 0.3006 | 0.5077 |
	| 0.0896 | 2.78 | 150 | 0.0626 | 0.6154 |
	| 0.0296 | 3.7 | 200 | 0.0647 | 21.7538 |
	| 0.0228 | 4.63 | 250 | 0.0599 | 21.7077 |
	| 0.0169 | 5.56 | 300 | 0.0627 | 3.5846 |
	| 0.0162 | 6.48 | 350 | 0.0611 | 17.0769 |
	| 0.0147 | 7.41 | 400 | 0.0649 | 21.6769 |
	| 0.0131 | 8.33 | 450 | 0.0631 | 15.0154 |
	| 0.0119 | 9.26 | 500 | 0.0668 | 19.3231 |
	| 0.0117 | 10.19 | 550 | 0.0645 | 20.3231 |
	| 0.0106 | 11.11 | 600 | 0.0631 | 21.6308 |
	| 0.0099 | 12.04 | 650 | 0.0655 | 17.6923 |
	| 0.0098 | 12.96 | 700 | 0.0662 | 18.0615 |
	| 0.0092 | 13.89 | 750 | 0.0656 | 18.1385 |
	| 0.0089 | 14.81 | 800 | 0.0658 | 21.6615 |
	| 0.0086 | 15.74 | 850 | 0.0677 | 20.4 |
	| 0.0079 | 16.67 | 900 | 0.0684 | 21.6462 |
	| 0.0085 | 17.59 | 950 | 0.0701 | 21.6615 |
	| 0.0089 | 18.52 | 1000 | 0.0716 | 16.8923 |
	| 0.0083 | 19.44 | 1050 | 0.0685 | 21.6769 |
	| 0.0079 | 20.37 | 1100 | 0.0665 | 21.7077 |
	| 0.0075 | 21.3 | 1150 | 0.0685 | 19.5231 |
	| 0.0078 | 22.22 | 1200 | 0.0669 | 20.7385 |
	| 0.0078 | 23.15 | 1250 | 0.0677 | 18.6923 |
	| 0.007 | 24.07 | 1300 | 0.0698 | 19.7231 |
	| 0.008 | 25.0 | 1350 | 0.0682 | 20.4769 |
	| 0.0073 | 25.93 | 1400 | 0.0705 | 19.3231 |
	| 0.008 | 26.85 | 1450 | 0.0738 | 21.6615 |
	| 0.0071 | 27.78 | 1500 | 0.0722 | 19.9231 |
	| 0.0064 | 28.7 | 1550 | 0.0731 | 21.6923 |
	| 0.0063 | 29.63 | 1600 | 0.0741 | 20.5385 |
	| 0.0069 | 30.56 | 1650 | 0.0780 | 19.8462 |
	| 0.0063 | 31.48 | 1700 | 0.0763 | 16.9538 |
	| 0.0061 | 32.41 | 1750 | 0.0775 | 19.7846 |
	| 0.0062 | 33.33 | 1800 | 0.0772 | 19.1077 |
	| 0.0065 | 34.26 | 1850 | 0.0737 | 17.7231 |
	| 0.0062 | 35.19 | 1900 | 0.0752 | 19.5385 |
	| 0.0058 | 36.11 | 1950 | 0.0748 | 19.4 |
	| 0.006 | 37.04 | 2000 | 0.0752 | 18.4154 |
	| 0.0053 | 37.96 | 2050 | 0.0746 | 17.1385 |
	| 0.0053 | 38.89 | 2100 | 0.0766 | 15.8154 |
	| 0.0052 | 39.81 | 2150 | 0.0770 | 17.2 |
	| 0.0049 | 40.74 | 2200 | 0.0763 | 19.3538 |
	| 0.0051 | 41.67 | 2250 | 0.0766 | 19.9692 |
	| 0.0046 | 42.59 | 2300 | 0.0768 | 19.9846 |
	| 0.0045 | 43.52 | 2350 | 0.0773 | 16.3692 |
	| 0.0044 | 44.44 | 2400 | 0.0771 | 16.7846 |
	| 0.0041 | 45.37 | 2450 | 0.0773 | 17.6308 |
	| 0.0042 | 46.3 | 2500 | 0.0774 | 16.0615 |
	| 0.0041 | 47.22 | 2550 | 0.0767 | 16.3231 |
	| 0.004 | 48.15 | 2600 | 0.0771 | 16.1846 |
	| 0.0037 | 49.07 | 2650 | 0.0772 | 16.0462 |
	| 0.0035 | 50.0 | 2700 | 0.0774 | 16.0923 |
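
The Epoch and Step columns allow a quick sanity check. The sketch below derives the steps per epoch from the final row and spot-checks it against a few rows excerpted from the table. The implied example count assumes a single device and no gradient accumulation, neither of which the card states; the variable and function names are illustrative.

```python
TOTAL_STEPS = 2700    # last row of the table
NUM_EPOCHS = 50       # num_epochs from the hyperparameter list
TRAIN_BATCH_SIZE = 1  # train_batch_size from the hyperparameter list

steps_per_epoch = TOTAL_STEPS / NUM_EPOCHS  # 54.0


def epoch_at(step: int) -> float:
    """Epoch value for a step, rounded to two decimals as in the table."""
    return round(step / steps_per_epoch, 2)


# (step, epoch, validation loss) for a few rows excerpted from the table.
ROWS = [
    (50, 0.93, 4.5072),
    (250, 4.63, 0.0599),
    (1350, 25.0, 0.0682),
    (2700, 50.0, 0.0774),
]

for step, epoch, _ in ROWS:
    assert epoch_at(step) == epoch

# Assumption: one device and no gradient accumulation, so one optimizer
# step consumes exactly TRAIN_BATCH_SIZE examples.
examples_per_epoch = int(steps_per_epoch * TRAIN_BATCH_SIZE)

# Row with the lowest validation loss among the excerpted rows.
best_step, _, best_loss = min(ROWS, key=lambda row: row[2])

print(steps_per_epoch, examples_per_epoch, best_step, best_loss)
```

Step 250 also has the lowest validation loss in the full table (0.0599), whereas the final checkpoint reports 0.0774. Training loss keeps falling while validation loss drifts upward, which may indicate overfitting over the later epochs.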


	### Framework versions

	- Transformers 4.37.2
	- PyTorch 2.0.1+cu117
	- Datasets 2.16.1
	- Tokenizers 0.15.1
