metadata

tags:
  - generated_from_trainer
metrics:
  - accuracy
model-index:
  - name: fresh-2-layer-medmcqa50000-distill-of-fresh-2-layer-mmlu_EVAL_mmlu
    results: []

fresh-2-layer-medmcqa50000-distill-of-fresh-2-layer-mmlu_EVAL_mmlu

This model is a fine-tuned version of an unspecified base model, trained on an unknown dataset. It achieves the following results on the evaluation set:

Loss: 178.1362
Accuracy: 0.4510

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0005
train_batch_size: 32
eval_batch_size: 32
seed: 321
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 500
training_steps: 5000
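
As a rough illustration of what these settings imply, the sketch below computes the learning rate at a given step for a linear schedule with warmup: a ramp from 0 to the peak rate over the first 500 steps, then a linear decay to 0 at step 5000. The helper name `lr_at` is hypothetical, and the formula is the usual linear warmup and decay rule, assumed here rather than taken from the training script.

```python
# Sketch of a linear warmup + linear decay schedule using the values listed above.
# The function name and structure are illustrative, not the trainer's own code.
PEAK_LR = 0.0005      # learning_rate
WARMUP_STEPS = 500    # lr_scheduler_warmup_steps
TOTAL_STEPS = 5000    # training_steps


def lr_at(step: int) -> float:
    """Learning rate after `step` optimizer updates (hypothetical helper)."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to the peak rate.
        scale = step / max(1, WARMUP_STEPS)
    else:
        # Linear decay from the peak rate down to 0 at TOTAL_STEPS.
        scale = max(0.0, (TOTAL_STEPS - step) / max(1, TOTAL_STEPS - WARMUP_STEPS))
    return PEAK_LR * scale


if __name__ == "__main__":
    for step in (0, 250, 500, 1900, 5000):
        print(f"step {step:>4}: lr = {lr_at(step):.6f}")
```

Under this rule, the rate peaks exactly at step 500 and is still about two thirds of the peak at step 1900, the last step logged in the results table.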

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
No log	0.06	100	200.7935	0.24
No log	0.13	200	195.0955	0.322
No log	0.19	300	199.3146	0.348
No log	0.26	400	181.3482	0.378
141.8956	0.32	500	183.5053	0.406
141.8956	0.38	600	175.3492	0.414
141.8956	0.45	700	179.5743	0.44
141.8956	0.51	800	178.0992	0.456
141.8956	0.58	900	167.6717	0.458
92.2658	0.64	1000	173.9797	0.422
92.2658	0.7	1100	177.7031	0.44
92.2658	0.77	1200	176.5930	0.45
92.2658	0.83	1300	184.5445	0.45
92.2658	0.9	1400	180.6332	0.466
80.2568	0.96	1500	180.5694	0.462
80.2568	1.02	1600	173.9805	0.462
80.2568	1.09	1700	168.0511	0.46
80.2568	1.15	1800	177.9322	0.458
80.2568	1.22	1900	172.7217	0.462
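
The logged rows can be summarized with a short script. The sketch below copies the step, epoch, validation loss, and accuracy columns from the table, picks the rows with the highest accuracy and the lowest validation loss, and checks whether the epoch column is consistent with a training set of 50,000 examples at batch size 32. The 50,000 figure is an assumption suggested only by the model name, not something the card states.

```python
from math import ceil

# (step, epoch, validation_loss, accuracy), copied from the table above.
ROWS = [
    (100, 0.06, 200.7935, 0.24),
    (200, 0.13, 195.0955, 0.322),
    (300, 0.19, 199.3146, 0.348),
    (400, 0.26, 181.3482, 0.378),
    (500, 0.32, 183.5053, 0.406),
    (600, 0.38, 175.3492, 0.414),
    (700, 0.45, 179.5743, 0.44),
    (800, 0.51, 178.0992, 0.456),
    (900, 0.58, 167.6717, 0.458),
    (1000, 0.64, 173.9797, 0.422),
    (1100, 0.7, 177.7031, 0.44),
    (1200, 0.77, 176.5930, 0.45),
    (1300, 0.83, 184.5445, 0.45),
    (1400, 0.9, 180.6332, 0.466),
    (1500, 0.96, 180.5694, 0.462),
    (1600, 1.02, 173.9805, 0.462),
    (1700, 1.09, 168.0511, 0.46),
    (1800, 1.15, 177.9322, 0.458),
    (1900, 1.22, 172.7217, 0.462),
]

# Row with the highest accuracy and row with the lowest validation loss.
best_acc = max(ROWS, key=lambda r: r[3])
best_loss = min(ROWS, key=lambda r: r[2])

# Assumption: 50,000 training examples (hinted at by the model name only).
ASSUMED_TRAIN_EXAMPLES = 50_000
TRAIN_BATCH_SIZE = 32
STEPS_PER_EPOCH = ceil(ASSUMED_TRAIN_EXAMPLES / TRAIN_BATCH_SIZE)

# True if every logged epoch equals step / STEPS_PER_EPOCH rounded to 2 places.
epochs_match = all(
    abs(round(step / STEPS_PER_EPOCH, 2) - epoch) < 1e-9
    for step, epoch, _, _ in ROWS
)

if __name__ == "__main__":
    print("best accuracy:", best_acc)
    print("lowest validation loss:", best_loss)
    print("steps per epoch (assumed):", STEPS_PER_EPOCH)
    print("epoch column consistent with assumption:", epochs_match)
```

The two criteria select different rows, so the choice of checkpoint depends on which metric matters for the intended use.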

Framework versions

Transformers 4.34.0.dev0
Pytorch 2.0.1+cu117
Datasets 2.14.5
Tokenizers 0.14.0