---
license: llama3
base_model: meta-llama/Meta-Llama-3-8B-Instruct
tags:
  - generated_from_trainer
model-index:
  - name: MSc_llama3_finetuned_model_secondData
    results: []
library_name: peft
---

MSc_llama3_finetuned_model_secondData

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct; the fine-tuning dataset is not specified in this card. It achieves the following results on the evaluation set:

  • Loss: 0.7658

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

The following bitsandbytes quantization config was used during training; a sketch of the equivalent BitsAndBytesConfig follows the list:

  • quant_method: bitsandbytes
  • _load_in_8bit: False
  • _load_in_4bit: True
  • llm_int8_threshold: 6.0
  • llm_int8_skip_modules: None
  • llm_int8_enable_fp32_cpu_offload: False
  • llm_int8_has_fp16_weight: False
  • bnb_4bit_quant_type: nf4
  • bnb_4bit_use_double_quant: True
  • bnb_4bit_compute_dtype: bfloat16
  • load_in_4bit: True
  • load_in_8bit: False

Training hyperparameters

The following hyperparameters were used during training; a TrainingArguments sketch follows the list:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.03
  • training_steps: 250
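
The hyperparameters above correspond roughly to the TrainingArguments sketch below. The output_dir and optim values are assumptions (the card only states Adam with betas=(0.9,0.999) and epsilon=1e-08, which matches the Trainer's default AdamW settings); the remaining values are copied from the list.

```python
# Sketch only: maps the listed hyperparameters onto transformers TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="MSc_llama3_finetuned_model_secondData",  # assumed output path
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=8,  # 8 per device x 8 accumulation = total batch size 64
    max_steps=250,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    seed=42,
    optim="adamw_torch",  # assumption: default AdamW, betas=(0.9, 0.999), eps=1e-8
)
```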

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 3.7986        | 1.36  | 10   | 3.3486          |
| 2.781         | 2.71  | 20   | 1.9851          |
| 1.6096        | 4.07  | 30   | 1.3075          |
| 1.2107        | 5.42  | 40   | 1.1210          |
| 1.0597        | 6.78  | 50   | 1.0222          |
| 0.9672        | 8.14  | 60   | 0.9562          |
| 0.8924        | 9.49  | 70   | 0.9131          |
| 0.8189        | 10.85 | 80   | 0.8582          |
| 0.7393        | 12.2  | 90   | 0.7907          |
| 0.6355        | 13.56 | 100  | 0.7136          |
| 0.5683        | 14.92 | 110  | 0.7013          |
| 0.533         | 16.27 | 120  | 0.7011          |
| 0.5155        | 17.63 | 130  | 0.7049          |
| 0.4965        | 18.98 | 140  | 0.7194          |
| 0.4826        | 20.34 | 150  | 0.7222          |
| 0.4617        | 21.69 | 160  | 0.7294          |
| 0.453         | 23.05 | 170  | 0.7347          |
| 0.439         | 24.41 | 180  | 0.7418          |
| 0.4333        | 25.76 | 190  | 0.7473          |
| 0.4261        | 27.12 | 200  | 0.7600          |
| 0.4238        | 28.47 | 210  | 0.7580          |
| 0.4163        | 29.83 | 220  | 0.7646          |
| 0.4158        | 31.19 | 230  | 0.7659          |
| 0.4137        | 32.54 | 240  | 0.7662          |
| 0.4131        | 33.9  | 250  | 0.7658          |

Framework versions

  • PEFT 0.4.0
  • Transformers 4.38.2
  • Pytorch 2.3.1+cu121
  • Datasets 2.13.1
  • Tokenizers 0.15.2
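
For reference, a minimal loading sketch with the versions above. The adapter repository id Casper0508/MSc_llama3_finetuned_model_secondData is an assumption inferred from the model name and is not confirmed by this card.

```python
# Hedged usage sketch: load the base model and attach the PEFT adapter.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

# Adapter repo id is assumed, not stated in this card.
model = PeftModel.from_pretrained(base, "Casper0508/MSc_llama3_finetuned_model_secondData")
model.eval()
```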