Saiga_timelist_task200steps

This model is a PEFT adapter fine-tuned from TheBloke/Llama-2-7B-fp16 on an unknown dataset. It achieves the following result on the evaluation set (a loading sketch follows the result):

  • Loss: 2.4521
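
Below is a minimal loading sketch for this adapter. The training script and intended prompt format are not published, so the prompt and generation settings here are illustrative assumptions; only the repository IDs come from this card.

```python
# Minimal sketch: load the PEFT adapter on top of the fp16 Llama-2-7B base.
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

adapter_id = "marcus2000/Saiga_timelist_task200steps"
base_id = "TheBloke/Llama-2-7B-fp16"

# AutoPeftModelForCausalLM reads the adapter config, loads the base model,
# then attaches the adapter weights on top of it.
model = AutoPeftModelForCausalLM.from_pretrained(adapter_id, torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(base_id)

# Placeholder prompt and generation settings; the card does not specify them.
inputs = tokenizer("Hello, world", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```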

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0003
  • train_batch_size: 2
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 10
  • total_train_batch_size: 20
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • training_steps: 200
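
A minimal sketch of how these values could map onto Hugging Face TrainingArguments; the actual training script is not published, so the output directory is a hypothetical placeholder. Adam with betas=(0.9, 0.999) and epsilon=1e-08 matches the Trainer defaults, so no optimizer arguments need to be set explicitly.

```python
from transformers import TrainingArguments

# Sketch only: argument names mirror the hyperparameter list above.
training_args = TrainingArguments(
    output_dir="Saiga_timelist_task200steps",  # hypothetical output path
    learning_rate=3e-4,
    per_device_train_batch_size=2,   # train_batch_size
    per_device_eval_batch_size=8,    # eval_batch_size
    seed=42,
    gradient_accumulation_steps=10,  # 2 x 10 = total_train_batch_size of 20
    lr_scheduler_type="linear",
    max_steps=200,                   # training_steps
    # adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-8 are the defaults
)
```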

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 2.2298 | 0.37 | 2 | 2.2020 |
| 2.0975 | 0.74 | 4 | 2.1478 |
| 2.0243 | 1.11 | 6 | 2.1123 |
| 1.988 | 1.48 | 8 | 2.0857 |
| 1.9585 | 1.85 | 10 | 2.0692 |
| 1.883 | 2.22 | 12 | 2.0570 |
| 1.9078 | 2.59 | 14 | 2.0477 |
| 1.9179 | 2.96 | 16 | 2.0408 |
| 1.8663 | 3.33 | 18 | 2.0366 |
| 1.8191 | 3.7 | 20 | 2.0325 |
| 1.8515 | 4.07 | 22 | 2.0280 |
| 1.8189 | 4.44 | 24 | 2.0246 |
| 1.8478 | 4.81 | 26 | 2.0215 |
| 1.7767 | 5.19 | 28 | 2.0198 |
| 1.7685 | 5.56 | 30 | 2.0190 |
| 1.7895 | 5.93 | 32 | 2.0189 |
| 1.7285 | 6.3 | 34 | 2.0191 |
| 1.7609 | 6.67 | 36 | 2.0174 |
| 1.7138 | 7.04 | 38 | 2.0156 |
| 1.7112 | 7.41 | 40 | 2.0187 |
| 1.7029 | 7.78 | 42 | 2.0216 |
| 1.6787 | 8.15 | 44 | 2.0203 |
| 1.646 | 8.52 | 46 | 2.0243 |
| 1.5996 | 8.89 | 48 | 2.0294 |
| 1.6838 | 9.26 | 50 | 2.0280 |
| 1.6057 | 9.63 | 52 | 2.0254 |
| 1.574 | 10.0 | 54 | 2.0310 |
| 1.51 | 10.37 | 56 | 2.0547 |
| 1.5951 | 10.74 | 58 | 2.0420 |
| 1.5455 | 11.11 | 60 | 2.0350 |
| 1.5424 | 11.48 | 62 | 2.0612 |
| 1.4933 | 11.85 | 64 | 2.0652 |
| 1.5766 | 12.22 | 66 | 2.0537 |
| 1.4453 | 12.59 | 68 | 2.0732 |
| 1.4683 | 12.96 | 70 | 2.0763 |
| 1.4734 | 13.33 | 72 | 2.0805 |
| 1.4314 | 13.7 | 74 | 2.0908 |
| 1.3921 | 14.07 | 76 | 2.0815 |
| 1.4099 | 14.44 | 78 | 2.1134 |
| 1.4389 | 14.81 | 80 | 2.0955 |
| 1.3114 | 15.19 | 82 | 2.1153 |
| 1.3093 | 15.56 | 84 | 2.1303 |
| 1.3984 | 15.93 | 86 | 2.1246 |
| 1.2831 | 16.3 | 88 | 2.1564 |
| 1.2971 | 16.67 | 90 | 2.1284 |
| 1.3052 | 17.04 | 92 | 2.1608 |
| 1.2421 | 17.41 | 94 | 2.1556 |
| 1.1835 | 17.78 | 96 | 2.1734 |
| 1.283 | 18.15 | 98 | 2.1773 |
| 1.2311 | 18.52 | 100 | 2.1992 |
| 1.2428 | 18.89 | 102 | 2.1954 |
| 1.1959 | 19.26 | 104 | 2.2065 |
| 1.2376 | 19.63 | 106 | 2.2124 |
| 1.0689 | 20.0 | 108 | 2.2266 |
| 1.1471 | 20.37 | 110 | 2.2266 |
| 1.0068 | 20.74 | 112 | 2.2451 |
| 1.161 | 21.11 | 114 | 2.2501 |
| 1.1252 | 21.48 | 116 | 2.2579 |
| 1.0683 | 21.85 | 118 | 2.2595 |
| 1.1279 | 22.22 | 120 | 2.2904 |
| 0.9923 | 22.59 | 122 | 2.2693 |
| 1.0139 | 22.96 | 124 | 2.3008 |
| 0.9924 | 23.33 | 126 | 2.3036 |
| 1.0418 | 23.7 | 128 | 2.3277 |
| 1.0463 | 24.07 | 130 | 2.3043 |
| 1.0556 | 24.44 | 132 | 2.3262 |
| 0.9991 | 24.81 | 134 | 2.3299 |
| 0.96 | 25.19 | 136 | 2.3481 |
| 0.9677 | 25.56 | 138 | 2.3458 |
| 0.9107 | 25.93 | 140 | 2.3607 |
| 0.8962 | 26.3 | 142 | 2.3644 |
| 0.916 | 26.67 | 144 | 2.3700 |
| 0.9284 | 27.04 | 146 | 2.3726 |
| 0.99 | 27.41 | 148 | 2.3860 |
| 0.8308 | 27.78 | 150 | 2.3918 |
| 0.9459 | 28.15 | 152 | 2.3971 |
| 0.9283 | 28.52 | 154 | 2.4030 |
| 0.863 | 28.89 | 156 | 2.4024 |
| 0.9068 | 29.26 | 158 | 2.4083 |
| 0.8623 | 29.63 | 160 | 2.4179 |
| 0.8359 | 30.0 | 162 | 2.4262 |
| 0.953 | 30.37 | 164 | 2.4281 |
| 0.7937 | 30.74 | 166 | 2.4381 |
| 0.8274 | 31.11 | 168 | 2.4255 |
| 0.8862 | 31.48 | 170 | 2.4330 |
| 0.7913 | 31.85 | 172 | 2.4511 |
| 0.8436 | 32.22 | 174 | 2.4522 |
| 0.8519 | 32.59 | 176 | 2.4413 |
| 0.8089 | 32.96 | 178 | 2.4371 |
| 0.8876 | 33.33 | 180 | 2.4434 |
| 0.7836 | 33.7 | 182 | 2.4532 |
| 0.8232 | 34.07 | 184 | 2.4566 |
| 0.8299 | 34.44 | 186 | 2.4582 |
| 0.7977 | 34.81 | 188 | 2.4553 |
| 0.8635 | 35.19 | 190 | 2.4522 |
| 0.883 | 35.56 | 192 | 2.4518 |
| 0.8158 | 35.93 | 194 | 2.4513 |
| 0.8732 | 36.3 | 196 | 2.4518 |
| 0.8112 | 36.67 | 198 | 2.4522 |
| 0.7869 | 37.04 | 200 | 2.4521 |

Framework versions

  • PEFT 0.10.0
  • Transformers 4.39.3
  • PyTorch 2.2.2+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2