	---
	license: cc-by-4.0
	base_model: deepset/bert-base-cased-squad2
	tags:
	- generated_from_trainer
	model-index:
	- name: bert-30
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# bert-30

	This model is a fine-tuned version of [deepset/bert-base-cased-squad2](https://huggingface.co/deepset/bert-base-cased-squad2) on an unknown dataset.
	It achieves the following results on the evaluation set:
	- Loss: 10.2691

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 3e-05
	- train_batch_size: 16
	- eval_batch_size: 16
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 5
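To make the `linear` scheduler setting concrete, here is a minimal sketch of how the learning rate would decay over the run. It assumes zero warmup steps (the card lists none) and takes the total of 275 optimizer steps from the last row of the training results table; `linear_lr` is a hypothetical helper written for illustration, not a function from the Transformers library.

```python
def linear_lr(step: int,
              base_lr: float = 3e-5,      # learning_rate from the card
              total_steps: int = 275,     # final Step value in the results table
              warmup_steps: int = 0) -> float:  # assumption: no warmup is listed
    """Learning rate under a linear warmup-then-decay schedule."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the rate at each epoch boundary (55 steps per epoch).
    for step in (0, 55, 110, 165, 220, 275):
        print(f"step {step:3d}  lr = {linear_lr(step):.2e}")
```

Under these assumptions the rate starts at 3e-05 and reaches zero at step 275, which may help explain why the validation loss flattens toward the end of training.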

	### Training results

	| Training Loss | Epoch | Step | Validation Loss |
	|:-------------:|:-----:|:----:|:---------------:|
	| 11.3132 | 0.09 | 5 | 12.3055 |
	| 11.4109 | 0.18 | 10 | 12.2292 |
	| 10.9744 | 0.27 | 15 | 12.1547 |
	| 11.0771 | 0.36 | 20 | 12.0814 |
	| 11.0342 | 0.45 | 25 | 12.0101 |
	| 11.0327 | 0.55 | 30 | 11.9396 |
	| 10.2954 | 0.64 | 35 | 11.8706 |
	| 10.8979 | 0.73 | 40 | 11.8043 |
	| 10.432 | 0.82 | 45 | 11.7386 |
	| 10.3023 | 0.91 | 50 | 11.6747 |
	| 10.0494 | 1.0 | 55 | 11.6128 |
	| 10.2273 | 1.09 | 60 | 11.5521 |
	| 10.3139 | 1.18 | 65 | 11.4931 |
	| 10.5075 | 1.27 | 70 | 11.4349 |
	| 10.0234 | 1.36 | 75 | 11.3790 |
	| 10.4276 | 1.45 | 80 | 11.3238 |
	| 10.1397 | 1.55 | 85 | 11.2699 |
	| 10.0675 | 1.64 | 90 | 11.2174 |
	| 9.8835 | 1.73 | 95 | 11.1665 |
	| 10.0738 | 1.82 | 100 | 11.1169 |
	| 9.6112 | 1.91 | 105 | 11.0687 |
	| 9.9186 | 2.0 | 110 | 11.0227 |
	| 9.8411 | 2.09 | 115 | 10.9779 |
	| 9.6506 | 2.18 | 120 | 10.9342 |
	| 9.7831 | 2.27 | 125 | 10.8916 |
	| 9.8835 | 2.36 | 130 | 10.8509 |
	| 9.4752 | 2.45 | 135 | 10.8111 |
	| 9.8176 | 2.55 | 140 | 10.7731 |
	| 9.3628 | 2.64 | 145 | 10.7369 |
	| 9.819 | 2.73 | 150 | 10.7017 |
	| 9.572 | 2.82 | 155 | 10.6681 |
	| 9.522 | 2.91 | 160 | 10.6356 |
	| 9.6874 | 3.0 | 165 | 10.6046 |
	| 9.6037 | 3.09 | 170 | 10.5750 |
	| 9.5624 | 3.18 | 175 | 10.5468 |
	| 9.2702 | 3.27 | 180 | 10.5202 |
	| 9.1347 | 3.36 | 185 | 10.4947 |
	| 9.8154 | 3.45 | 190 | 10.4706 |
	| 9.4045 | 3.55 | 195 | 10.4475 |
	| 9.2453 | 3.64 | 200 | 10.4262 |
	| 9.1087 | 3.73 | 205 | 10.4062 |
	| 8.985 | 3.82 | 210 | 10.3875 |
	| 9.0054 | 3.91 | 215 | 10.3705 |
	| 9.4764 | 4.0 | 220 | 10.3545 |
	| 9.13 | 4.09 | 225 | 10.3401 |
	| 9.4397 | 4.18 | 230 | 10.3272 |
	| 9.0841 | 4.27 | 235 | 10.3153 |
	| 9.5885 | 4.36 | 240 | 10.3048 |
	| 9.4137 | 4.45 | 245 | 10.2958 |
	| 9.1068 | 4.55 | 250 | 10.2878 |
	| 9.1388 | 4.64 | 255 | 10.2816 |
	| 8.8014 | 4.73 | 260 | 10.2763 |
	| 8.9782 | 4.82 | 265 | 10.2727 |
	| 9.222 | 4.91 | 270 | 10.2701 |
	| 9.292 | 5.0 | 275 | 10.2691 |
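The table also carries some information the card does not state directly. The sketch below derives the number of steps per epoch, a plausible range for the training set size, and the relative drop in validation loss. The size estimate is an assumption, not a documented figure: it presumes a single device, no gradient accumulation, and that the final partial batch of each epoch is kept.

```python
TOTAL_STEPS = 275        # last Step value in the table
NUM_EPOCHS = 5           # num_epochs from the hyperparameters
TRAIN_BATCH_SIZE = 16    # train_batch_size from the hyperparameters

# 275 steps over 5 epochs gives 55 optimizer steps per epoch.
steps_per_epoch = TOTAL_STEPS // NUM_EPOCHS

# Assumption: steps_per_epoch == ceil(n_examples / batch_size), which holds
# for every n_examples in the inclusive range below.
min_examples = (steps_per_epoch - 1) * TRAIN_BATCH_SIZE + 1
max_examples = steps_per_epoch * TRAIN_BATCH_SIZE

# Validation loss at the first (step 5) and last (step 275) evaluation.
first_val_loss = 12.3055
last_val_loss = 10.2691
relative_drop = (first_val_loss - last_val_loss) / first_val_loss

if __name__ == "__main__":
    print(f"steps per epoch: {steps_per_epoch}")
    print(f"estimated training examples: {min_examples} to {max_examples}")
    print(f"relative drop in validation loss: {relative_drop:.1%}")
```

The 55 steps per epoch agree with the Epoch column, where step 5 is reported as epoch 0.09 (5 / 55).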


	### Framework versions

	- Transformers 4.34.1
	- Pytorch 2.1.0+cu118
	- Datasets 2.14.5
	- Tokenizers 0.14.1
