dit_base_binary_task

This model is a fine-tuned version of microsoft/dit-base on the davanstrien/leicester_loaded_annotations_binary dataset. It achieves the following results on the evaluation set at the end of training (step 250, the last row of the training results):

Loss: 0.0513
Accuracy: 0.9873
F1: 0.9600

Model description

More information needed

Intended uses & limitations

More information needed
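
Although intended uses are not documented here, a fine-tuned image classifier of this kind would typically be loaded through the Transformers `pipeline` API. The following is a minimal sketch, assuming the checkpoint is published on the Hugging Face Hub under the id `rchan26/dit_base_binary_task` together with its image processor config. The helper name `classify_image` and the example path are hypothetical, and the label names returned depend on the checkpoint's own config.

```python
# Assumed Hub id for this checkpoint; adjust if it lives elsewhere.
MODEL_ID = "rchan26/dit_base_binary_task"


def classify_image(image_path, model_id=MODEL_ID):
    """Run the image-classification pipeline on a single image.

    Hypothetical helper: the import is deferred so that defining the
    function does not require transformers to be installed.
    """
    from transformers import pipeline

    classifier = pipeline("image-classification", model=model_id)
    # The pipeline accepts a local path, a URL, or a PIL image and
    # returns a list of dicts, each with a "label" and a "score" key.
    return classifier(image_path)


# Example call (placeholder path, not a file from the dataset):
#   for prediction in classify_image("page_scan.png"):
#       print(prediction["label"], prediction["score"])
```

Calling the helper downloads the weights on first use, so it needs network access and a working PyTorch installation.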

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 16
eval_batch_size: 16
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 64
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.1
num_epochs: 50
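
The batch-size and scheduler settings above interact: 16 examples per device times 4 accumulation steps gives the effective batch of 64, and the warmup ratio is applied to the total number of optimizer steps. The sketch below works through that arithmetic in plain Python. It assumes a single device, the 250 total steps reported in the training results, and the usual linear-warmup-then-linear-decay shape; `linear_lr` is an illustrative helper, not the Trainer's own code.

```python
import math

LEARNING_RATE = 2e-05
TRAIN_BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 4
WARMUP_RATIO = 0.1
TOTAL_STEPS = 250  # last step listed in the training results

# Effective batch per optimizer step on one device: 16 * 4 = 64.
effective_batch = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS

# Warmup covers the first 10% of optimizer steps: 25 of 250.
warmup_steps = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def linear_lr(step):
    """Learning rate at a given optimizer step (illustrative sketch)."""
    if step < warmup_steps:
        # Ramp linearly from 0 up to the peak learning rate.
        return LEARNING_RATE * step / max(1, warmup_steps)
    # Decay linearly from the peak down to 0 at the final step.
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TOTAL_STEPS - warmup_steps)


for step in (0, 5, 25, 100, 250):
    print(step, f"{linear_lr(step):.2e}")
```

Under these assumptions the learning rate peaks at 2e-05 at step 25, which falls at the end of the fifth epoch in the results table.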

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	F1
No log	0.87	5	0.6816	0.5	0.2476
0.7387	1.87	10	0.5142	0.8354	0.0
0.7387	2.87	15	0.4690	0.8354	0.0
0.4219	3.87	20	0.5460	0.8354	0.0
0.4219	4.87	25	0.4703	0.8354	0.0
0.3734	5.87	30	0.4371	0.8354	0.0
0.3734	6.87	35	0.4147	0.8354	0.0
0.3261	7.87	40	0.4272	0.8354	0.0
0.3261	8.87	45	0.4038	0.8354	0.0
0.3078	9.87	50	0.3418	0.8354	0.0
0.3078	10.87	55	0.3042	0.8354	0.0
0.2501	11.87	60	0.2799	0.8354	0.0
0.2501	12.87	65	0.1419	0.9367	0.7619
0.1987	13.87	70	0.1224	0.9494	0.8182
0.1987	14.87	75	0.0749	0.9747	0.9167
0.1391	15.87	80	0.0539	0.9810	0.9412
0.1391	16.87	85	0.0830	0.9873	0.9600
0.1085	17.87	90	0.0443	0.9873	0.9600
0.1085	18.87	95	0.0258	0.9937	0.9804
0.1039	19.87	100	0.1025	0.9684	0.8936
0.1039	20.87	105	0.1597	0.9684	0.8936
0.1217	21.87	110	0.0278	0.9937	0.9811
0.1217	22.87	115	0.0458	0.9873	0.9600
0.0609	23.87	120	0.0478	0.9937	0.9804
0.0609	24.87	125	0.0671	0.9747	0.9231
0.1031	25.87	130	0.0751	0.9873	0.9600
0.1031	26.87	135	0.1963	0.9557	0.8444
0.0601	27.87	140	0.0870	0.9747	0.9167
0.0601	28.87	145	0.0890	0.9747	0.9167
0.0799	29.87	150	0.1017	0.9747	0.9167
0.0799	30.87	155	0.0041	1.0	1.0
0.0441	31.87	160	0.0332	0.9873	0.9615
0.0441	32.87	165	0.0839	0.9747	0.9167
0.0757	33.87	170	0.0722	0.9873	0.9600
0.0757	34.87	175	0.0168	0.9937	0.9804
0.0555	35.87	180	0.0443	0.9937	0.9804
0.0555	36.87	185	0.0227	0.9873	0.9615
0.0336	37.87	190	0.0128	0.9937	0.9804
0.0336	38.87	195	0.0169	0.9937	0.9811
0.0405	39.87	200	0.0193	0.9937	0.9804
0.0405	40.87	205	0.1216	0.9810	0.9388
0.0578	41.87	210	0.0307	0.9937	0.9804
0.0578	42.87	215	0.0539	0.9873	0.9600
0.0338	43.87	220	0.0573	0.9937	0.9804
0.0338	44.87	225	0.0086	1.0	1.0
0.0417	45.87	230	0.0491	0.9873	0.9600
0.0417	46.87	235	0.0089	1.0	1.0
0.0538	47.87	240	0.0846	0.9810	0.9388
0.0538	48.87	245	0.0452	0.9810	0.9388
0.0364	49.87	250	0.0513	0.9873	0.9600
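
The long plateau at accuracy 0.8354 with F1 0.0 in the table above is the pattern a classifier produces when it predicts the majority class for every example. The sketch below illustrates this in plain Python. The evaluation-set size (158 examples, 26 of them positive) is an assumption inferred from the reported figures, not something stated in this card, and `accuracy_and_f1` is an illustrative helper.

```python
def accuracy_and_f1(tp, fp, fn, tn):
    """Compute accuracy and F1 for the positive class from confusion counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    denom = 2 * tp + fp + fn
    # F1 is defined as 0 when there are no true positives to score.
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Assumed split: 26 positive and 132 negative evaluation examples.
# Predicting "negative" for everything misses all 26 positives.
plateau = accuracy_and_f1(tp=0, fp=0, fn=26, tn=132)

# One set of counts consistent with the final row: two positives missed.
final = accuracy_and_f1(tp=24, fp=0, fn=2, tn=132)

print(round(plateau[0], 4), round(plateau[1], 4))  # 0.8354 0.0
print(round(final[0], 4), round(final[1], 4))      # 0.9873 0.96
```

Because accuracy rewards the majority class, F1 is the more informative column for judging when the model starts to detect the minority class, which here happens around epoch 13.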

Framework versions

Transformers 4.25.1
Pytorch 1.12.1
Datasets 2.7.1
Tokenizers 0.13.1

rchan26/dit_base_binary_task
