roberta-base-sst-2-64-13

This model is a fine-tuned version of roberta-base on an unknown dataset. At the end of training (epoch 150, step 600) it achieves the following results on the evaluation set:

Loss: 1.0411
Accuracy: 0.8672
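The card gives no usage instructions, so the following is only a minimal sketch of how a fine-tuned roberta-base classifier like this one is typically loaded with the Transformers `pipeline` API. The repository id is taken from this page; the helper name `classify` is hypothetical, and the assumption that the checkpoint is a sequence classifier is inferred from the Accuracy metric, not stated by the card.

```python
MODEL_ID = "simonycl/roberta-base-sst-2-64-13"  # repository id shown on this page


def classify(texts):
    """Return one {'label', 'score'} dict per input string (hypothetical helper)."""
    # Imported lazily so this module can be read and imported
    # even where transformers is not installed.
    from transformers import pipeline

    # Assumption: the checkpoint carries a sequence-classification head.
    classifier = pipeline("text-classification", model=MODEL_ID)
    return classifier(list(texts))
```

Calling `classify(["some sentence"])` downloads the checkpoint on first use. The label names it returns depend on the saved config and may be generic (for example `LABEL_0` and `LABEL_1`), since the card does not document a label mapping.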

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 32
eval_batch_size: 32
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 500
num_epochs: 150
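
To make the interaction between these settings concrete, here is a small sketch of the learning rate over the run. It assumes the standard linear-warmup-then-linear-decay rule and a total of 600 optimizer steps (150 epochs at 4 steps per epoch, as reported in the training results); the function name `linear_warmup_lr` is hypothetical. Under these assumptions the warmup covers 500 of the 600 steps, so the rate is still rising for most of training.

```python
def linear_warmup_lr(step, base_lr=1e-05, warmup_steps=500, total_steps=600):
    """Learning rate after `step` updates: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: ramp linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Assumed total: 150 epochs x 4 steps per epoch = 600 steps.
for step in (160, 500, 600):
    print(step, linear_warmup_lr(step))
```

Under this sketch the rate at step 160 is about 3.2e-06, it peaks at 1e-05 at step 500, and it returns to zero at step 600.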

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
No log	1.0	4	0.6951	0.5
No log	2.0	8	0.6951	0.5
0.6962	3.0	12	0.6951	0.5
0.6962	4.0	16	0.6950	0.5
0.7017	5.0	20	0.6949	0.5
0.7017	6.0	24	0.6949	0.5
0.7017	7.0	28	0.6947	0.5
0.6966	8.0	32	0.6946	0.5
0.6966	9.0	36	0.6945	0.5
0.6927	10.0	40	0.6944	0.5
0.6927	11.0	44	0.6943	0.5
0.6927	12.0	48	0.6941	0.5
0.6961	13.0	52	0.6940	0.5
0.6961	14.0	56	0.6939	0.5
0.6875	15.0	60	0.6938	0.5
0.6875	16.0	64	0.6936	0.5
0.6875	17.0	68	0.6934	0.5
0.6935	18.0	72	0.6932	0.5
0.6935	19.0	76	0.6929	0.5
0.6948	20.0	80	0.6927	0.5
0.6948	21.0	84	0.6924	0.5
0.6948	22.0	88	0.6922	0.5
0.6906	23.0	92	0.6920	0.5
0.6906	24.0	96	0.6917	0.5
0.691	25.0	100	0.6913	0.5
0.691	26.0	104	0.6909	0.5
0.691	27.0	108	0.6904	0.5
0.6855	28.0	112	0.6899	0.5
0.6855	29.0	116	0.6891	0.5
0.6858	30.0	120	0.6882	0.5234
0.6858	31.0	124	0.6870	0.5156
0.6858	32.0	128	0.6852	0.6016
0.6764	33.0	132	0.6825	0.6562
0.6764	34.0	136	0.6782	0.7266
0.6616	35.0	140	0.6703	0.7969
0.6616	36.0	144	0.6545	0.8281
0.6616	37.0	148	0.6245	0.8516
0.6082	38.0	152	0.5651	0.8594
0.6082	39.0	156	0.4835	0.875
0.4548	40.0	160	0.4109	0.9062
0.4548	41.0	164	0.3606	0.875
0.4548	42.0	168	0.3454	0.8594
0.2218	43.0	172	0.3403	0.8594
0.2218	44.0	176	0.3537	0.8828
0.0892	45.0	180	0.4646	0.8516
0.0892	46.0	184	0.4402	0.875
0.0892	47.0	188	0.4719	0.8828
0.0254	48.0	192	0.5172	0.8828
0.0254	49.0	196	0.5613	0.8828
0.0105	50.0	200	0.6035	0.875
0.0105	51.0	204	0.6341	0.875
0.0105	52.0	208	0.6591	0.875
0.006	53.0	212	0.6804	0.875
0.006	54.0	216	0.6935	0.875
0.0041	55.0	220	0.7167	0.875
0.0041	56.0	224	0.7315	0.875
0.0041	57.0	228	0.7464	0.875
0.0032	58.0	232	0.7560	0.8594
0.0032	59.0	236	0.8753	0.8516
0.0098	60.0	240	0.9437	0.8438
0.0098	61.0	244	0.7740	0.8672
0.0098	62.0	248	0.7258	0.8828
0.0094	63.0	252	0.7815	0.8594
0.0094	64.0	256	0.7836	0.8516
0.0021	65.0	260	0.7854	0.8516
0.0021	66.0	264	0.7817	0.8594
0.0021	67.0	268	0.7698	0.8828
0.0019	68.0	272	0.7848	0.875
0.0019	69.0	276	0.7895	0.8828
0.0017	70.0	280	0.7971	0.8828
0.0017	71.0	284	0.8038	0.8828
0.0017	72.0	288	0.8091	0.8828
0.0014	73.0	292	0.8139	0.8828
0.0014	74.0	296	0.8183	0.8828
0.0014	75.0	300	0.8223	0.8828
0.0014	76.0	304	0.8274	0.8828
0.0014	77.0	308	0.8357	0.875
0.0012	78.0	312	0.8436	0.875
0.0012	79.0	316	0.8523	0.875
0.0012	80.0	320	0.8591	0.875
0.0012	81.0	324	0.8653	0.875
0.0012	82.0	328	0.8708	0.875
0.001	83.0	332	0.8271	0.8594
0.001	84.0	336	1.0450	0.8438
0.0012	85.0	340	1.1347	0.8281
0.0012	86.0	344	1.1696	0.8281
0.0012	87.0	348	0.8631	0.8672
0.0137	88.0	352	1.1491	0.8359
0.0137	89.0	356	1.0635	0.8516
0.0012	90.0	360	0.9027	0.875
0.0012	91.0	364	0.9503	0.8594
0.0012	92.0	368	1.0398	0.8281
0.0185	93.0	372	0.9044	0.875
0.0185	94.0	376	1.0978	0.8438
0.0009	95.0	380	0.9955	0.8672
0.0009	96.0	384	0.9313	0.875
0.0009	97.0	388	0.9295	0.875
0.0008	98.0	392	1.0927	0.8516
0.0008	99.0	396	0.9251	0.875
0.0007	100.0	400	0.9454	0.8594
0.0007	101.0	404	1.0023	0.8516
0.0007	102.0	408	1.0098	0.8516
0.0006	103.0	412	0.9944	0.8594
0.0006	104.0	416	0.9832	0.8516
0.0006	105.0	420	0.9090	0.8828
0.0006	106.0	424	1.2248	0.8359
0.0006	107.0	428	0.8722	0.8906
0.0197	108.0	432	0.8764	0.8828
0.0197	109.0	436	0.9771	0.875
0.0005	110.0	440	0.9871	0.875
0.0005	111.0	444	0.9235	0.875
0.0005	112.0	448	0.8418	0.8828
0.0005	113.0	452	0.8653	0.8906
0.0005	114.0	456	0.9098	0.8828
0.0005	115.0	460	0.9285	0.8828
0.0005	116.0	464	0.9443	0.875
0.0005	117.0	468	0.9584	0.8672
0.0005	118.0	472	0.9704	0.8672
0.0005	119.0	476	0.9805	0.8672
0.0004	120.0	480	0.9904	0.8672
0.0004	121.0	484	0.9920	0.8672
0.0004	122.0	488	0.9927	0.8672
0.0004	123.0	492	1.0015	0.8672
0.0004	124.0	496	1.0181	0.8672
0.0004	125.0	500	1.0289	0.8672
0.0004	126.0	504	1.0374	0.8672
0.0004	127.0	508	1.0408	0.8672
0.0004	128.0	512	1.0432	0.8672
0.0004	129.0	516	1.0472	0.8672
0.0003	130.0	520	1.0489	0.8672
0.0003	131.0	524	1.0497	0.8672
0.0003	132.0	528	1.0496	0.8672
0.0003	133.0	532	1.0497	0.8672
0.0003	134.0	536	1.0496	0.8672
0.0003	135.0	540	1.0492	0.8672
0.0003	136.0	544	1.0491	0.8672
0.0003	137.0	548	1.0482	0.8672
0.0003	138.0	552	1.0471	0.8672
0.0003	139.0	556	1.0456	0.8672
0.0003	140.0	560	1.0432	0.8672
0.0003	141.0	564	1.0411	0.8672
0.0003	142.0	568	1.0399	0.8672
0.0003	143.0	572	1.0398	0.8672
0.0003	144.0	576	1.0396	0.8672
0.0003	145.0	580	1.0393	0.8672
0.0003	146.0	584	1.0396	0.8672
0.0003	147.0	588	1.0400	0.8672
0.0003	148.0	592	1.0405	0.8672
0.0003	149.0	596	1.0409	0.8672
0.0003	150.0	600	1.0411	0.8672
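
The headline results are those of the final epoch, but the table shows validation loss bottoming out much earlier. The sketch below picks the best rows by each metric from a handful of rows copied verbatim from the table; the variable names are illustrative, and whether an earlier checkpoint was actually saved is not stated in this card.

```python
# (epoch, step, validation_loss, accuracy) copied from selected rows of the table above.
ROWS = [
    (39.0, 156, 0.4835, 0.875),
    (40.0, 160, 0.4109, 0.9062),
    (41.0, 164, 0.3606, 0.875),
    (42.0, 168, 0.3454, 0.8594),
    (43.0, 172, 0.3403, 0.8594),
    (44.0, 176, 0.3537, 0.8828),
    (107.0, 428, 0.8722, 0.8906),
    (150.0, 600, 1.0411, 0.8672),
]

best_by_loss = min(ROWS, key=lambda r: r[2])  # lowest validation loss
best_by_acc = max(ROWS, key=lambda r: r[3])   # highest accuracy
final = ROWS[-1]                              # the row reported at the top of the card

print("lowest validation loss:", best_by_loss)
print("highest accuracy:", best_by_acc)
print("final epoch:", final)
```

Among these rows, validation loss is lowest at epoch 43 (0.3403) and accuracy is highest at epoch 40 (0.9062), while the final epoch reports 1.0411 and 0.8672. With training loss near zero from about epoch 50 onward, this pattern is consistent with overfitting.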

Framework versions

Transformers 4.32.0.dev0
PyTorch 2.0.1+cu118
Datasets 2.4.0
Tokenizers 0.13.3

Model repository: simonycl/roberta-base-sst-2-64-13