|
--- |
|
library_name: transformers |
|
license: apache-2.0 |
|
base_model: facebook/bart-base |
|
tags: |
|
- generated_from_trainer |
|
datasets: |
|
- arrow |
|
model-index: |
|
- name: bart-base-2024-10-12_13-22 |
|
results: [] |
|
--- |
|
|
|
|
|
|
# bart-base-2024-10-12_13-22 |
|
|
|
This model is a fine-tuned version of [facebook/bart-base](https://huggingface.co/facebook/bart-base) on the arrow dataset (a local dataset in Apache Arrow format; the Trainer did not record its source).

It achieves the following results on the evaluation set (a usage sketch follows the metrics):
|
- Loss: 0.3413

- Gen Len (mean generated length): 19.9988

- BERTScore precision: 0.5693

- BERTScore recall: 0.1741

- BERTScore F1: 0.3646

- SacreBLEU score: 10.2355

- SacreBLEU n-gram precisions (1- to 4-gram): [90.1056377359695, 78.84314927189703, 71.03531269978564, 65.97921118095769]

- BLEU brevity penalty: 0.1347
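
Gen Len sits at the default generation cap of 20 tokens, and the brevity penalty of 0.1347 means generations are far shorter than the references, which suggests outputs are truncated at `max_length`. The snippet below is a minimal usage sketch; the checkpoint path is a placeholder, since this card does not list a Hub repository id.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Placeholder: point this at the saved checkpoint directory
# (the card does not specify a Hub repository id).
checkpoint = "path/to/bart-base-2024-10-12_13-22"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

text = "An example input sentence for the fine-tuned model."
inputs = tokenizer(text, return_tensors="pt", truncation=True)

# Gen Len above hovers at ~20, the default max_length, so raise it
# here if your outputs look truncated.
output_ids = model.generate(**inputs, max_length=64, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```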
|
|
|
## Model description |
|
|
|
More information needed |
|
|
|
## Intended uses & limitations |
|
|
|
More information needed |
|
|
|
## Training and evaluation data |
|
|
|
More information needed |
|
|
|
## Training procedure |
|
|
|
### Training hyperparameters |
|
|
|
The following hyperparameters were used during training (mirrored in the configuration sketch after this list):
|
- learning_rate: 0.0003 |
|
- train_batch_size: 8 |
|
- eval_batch_size: 8 |
|
- seed: 42 |
|
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08 |
|
- lr_scheduler_type: linear |
|
- num_epochs: 10 |
|
- mixed_precision_training: Native AMP |
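
As a rough guide (not the exact training script), these settings map onto `Seq2SeqTrainingArguments` as sketched below; the output directory and the per-epoch evaluation strategy are assumptions.

```python
from transformers import Seq2SeqTrainingArguments

# Hypothetical reconstruction of the hyperparameters listed above.
# The Adam betas/epsilon shown there are the library defaults.
training_args = Seq2SeqTrainingArguments(
    output_dir="bart-base-2024-10-12_13-22",  # placeholder
    learning_rate=3e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=10,
    fp16=True,                    # "Native AMP" mixed precision
    predict_with_generate=True,   # needed for Gen Len / BLEU at eval time
    eval_strategy="epoch",        # assumption: metrics are reported per epoch
)
```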
|
|
|
### Training results |
|
|
|
| Training Loss | Epoch | Step  | Validation Loss | Gen Len | BERTScore P | BERTScore R | BERTScore F1 | SacreBLEU | SacreBLEU precisions (1- to 4-gram)                                           | BLEU BP |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-----------:|:-----------:|:------------:|:---------:|:-----------------------------------------------------------------------------:|:-------:|
| 0.317         | 1.0   | 4772  | 0.2879          | 19.9998 | 0.5428      | 0.1582      | 0.3439       | 9.6993    | [87.29083507884441, 72.83089806032642, 64.20568134269375, 58.79563532531103]  | 0.1386  |
| 0.1934        | 2.0   | 9544  | 0.2725          | 19.9995 | 0.5576      | 0.1608      | 0.3518       | 9.8295    | [88.83556675143292, 76.0723710308905, 67.15881021479623, 61.749907205015056]  | 0.1351  |
| 0.1323        | 3.0   | 14316 | 0.2723          | 20.0    | 0.5678      | 0.1719      | 0.3627       | 10.1615   | [89.72749492127984, 77.42060052689843, 68.79285540795546, 63.42083414479146]  | 0.1370  |
| 0.0882        | 4.0   | 19088 | 0.2759          | 20.0    | 0.5728      | 0.1722      | 0.3650       | 10.1777   | [90.45151089248067, 79.10211769585014, 70.55075573625463, 65.16963077018467]  | 0.1344  |
| 0.061         | 5.0   | 23860 | 0.2968          | 20.0    | 0.5672      | 0.1735      | 0.3633       | 10.1992   | [89.8170208710569, 77.72758114247924, 69.35369251771922, 64.13642380028935]   | 0.1366  |
| 0.0359        | 6.0   | 28632 | 0.3064          | 20.0    | 0.5692      | 0.1807      | 0.3681       | 10.3391   | [90.43231298215383, 79.56742387626873, 71.96627153855555, 66.84727640514376]  | 0.1348  |
| 0.0229        | 7.0   | 33404 | 0.3159          | 19.9996 | 0.5683      | 0.1740      | 0.3641       | 10.3045   | [89.974323617517, 78.0061867507562, 69.70321593791971, 64.46675057044337]     | 0.1375  |
| 0.0129        | 8.0   | 38176 | 0.3253          | 19.9999 | 0.5670      | 0.1722      | 0.3625       | 10.1527   | [89.83988773004178, 78.2656326826365, 70.11705905563593, 64.89062161576781]   | 0.1350  |
| 0.0068        | 9.0   | 42948 | 0.3389          | 19.9994 | 0.5680      | 0.1729      | 0.3633       | 10.2220   | [89.96170046739762, 78.33494108730105, 70.31016985715492, 65.2346243333951]   | 0.1356  |
| 0.0035        | 10.0  | 47720 | 0.3413          | 19.9988 | 0.5693      | 0.1741      | 0.3646       | 10.2355   | [90.1056377359695, 78.84314927189703, 71.03531269978564, 65.97921118095769]   | 0.1347  |
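
The BERTScore and SacreBLEU columns can be reproduced with the `evaluate` library along the lines below. This is a sketch of the metric computation under that assumption, not the exact evaluation code used for this card.

```python
import evaluate

sacrebleu = evaluate.load("sacrebleu")
bertscore = evaluate.load("bertscore")

# Toy data; in practice, predictions come from model.generate()
# over the evaluation set, decoded back to text.
predictions = ["The cat sat on the mat."]
references = [["The cat is sitting on the mat."]]

bleu = sacrebleu.compute(predictions=predictions, references=references)
print(bleu["score"])       # SacreBLEU score
print(bleu["precisions"])  # 1- to 4-gram precisions
print(bleu["bp"])          # brevity penalty

scores = bertscore.compute(
    predictions=predictions,
    references=[r[0] for r in references],
    lang="en",
)
# bertscore returns per-example lists; the card reports their means.
print(sum(scores["precision"]) / len(scores["precision"]))
print(sum(scores["recall"]) / len(scores["recall"]))
print(sum(scores["f1"]) / len(scores["f1"]))
```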
|
|
|
|
|
### Framework versions |
|
|
|
- Transformers 4.45.1 |
|
- PyTorch 2.4.0
|
- Datasets 3.0.1 |
|
- Tokenizers 0.20.0 |
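
When reproducing the results above, it is safest to pin these exact versions. A quick sanity check:

```python
import datasets
import tokenizers
import torch
import transformers

# Versions recorded above; compare against the local environment.
expected = {
    "transformers": ("4.45.1", transformers.__version__),
    "torch": ("2.4.0", torch.__version__),
    "datasets": ("3.0.1", datasets.__version__),
    "tokenizers": ("0.20.0", tokenizers.__version__),
}
for name, (want, have) in expected.items():
    print(f"{name}: expected {want}, found {have}")
```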
|
|