---
base_model: ybelkada/falcon-7b-sharded-bf16
library_name: peft
tags:
  - trl
  - sft
  - generated_from_trainer
model-index:
  - name: falcon-7b-sharded-bf16-finetuned-mental-health-dsm5mistral
    results: []
---

# falcon-7b-sharded-bf16-finetuned-mental-health-dsm5mistral

This model is a fine-tuned version of [ybelkada/falcon-7b-sharded-bf16](https://huggingface.co/ybelkada/falcon-7b-sharded-bf16) on an unspecified dataset.
It achieves the following results on the evaluation set:

- Loss: 1.7448
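
Because this repository contains a PEFT adapter rather than full model weights, it is loaded on top of the base model. Below is a minimal inference sketch, assuming the adapter is published as `langtest/falcon-7b-sharded-bf16-finetuned-mental-health-dsm5mistral`; the repo id, prompt, and generation settings are illustrative and not taken from this card.

```python
# Minimal inference sketch for the PEFT adapter (repo id and prompt are assumptions).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "ybelkada/falcon-7b-sharded-bf16"
adapter_id = "langtest/falcon-7b-sharded-bf16-finetuned-mental-health-dsm5mistral"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, adapter_id)  # attach the adapter weights
model.eval()

prompt = "Summarize the DSM-5 diagnostic criteria for generalized anxiety disorder."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```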

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.0002
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.03
- training_steps: 200
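
For reference, the sketch below shows one way these hyperparameters could be wired into TRL's `SFTTrainer`. It is a hedged reconstruction, not the author's training script: the data file, text column, and LoRA settings are placeholders, since they are not documented on this card.

```python
# Hedged sketch of an SFT run matching the listed hyperparameters; dataset and LoRA
# details are placeholders (the card does not specify them).
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

base_id = "ybelkada/falcon-7b-sharded-bf16"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Placeholder dataset: any dataset with a "text" column would work here.
dataset = load_dataset("json", data_files="mental_health_dsm5.jsonl", split="train")

args = SFTConfig(
    output_dir="falcon-7b-sharded-bf16-finetuned-mental-health-dsm5mistral",
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=4,   # 16 x 4 = total train batch size of 64
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    max_steps=200,
    seed=42,
    dataset_text_field="text",       # placeholder column name
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are the library defaults.
)

peft_config = LoraConfig(task_type="CAUSAL_LM")  # actual LoRA settings are not listed on the card

trainer = SFTTrainer(
    model=model,
    args=args,
    train_dataset=dataset,
    peft_config=peft_config,
)
trainer.train()
```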

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 2.1353        | 0.1028 | 10   | 2.2214          |
| 2.2465        | 0.2057 | 20   | 2.1292          |
| 1.8446        | 0.3085 | 30   | 2.0584          |
| 1.9796        | 0.4113 | 40   | 1.9319          |
| 1.6682        | 0.5141 | 50   | 2.1183          |
| 1.9888        | 0.6170 | 60   | 1.8794          |
| 1.7142        | 0.7198 | 70   | 1.8562          |
| 1.8108        | 0.8226 | 80   | 1.8650          |
| 1.7122        | 0.9254 | 90   | 1.8313          |
| 1.5926        | 1.0283 | 100  | 1.8228          |
| 1.6881        | 1.1311 | 110  | 1.8095          |
| 1.4018        | 1.2339 | 120  | 1.8231          |
| 1.7688        | 1.3368 | 130  | 1.7767          |
| 1.4843        | 1.4396 | 140  | 1.7667          |
| 1.5288        | 1.5424 | 150  | 1.7607          |
| 1.5946        | 1.6452 | 160  | 1.7498          |
| 1.3308        | 1.7481 | 170  | 1.7463          |
| 1.6712        | 1.8509 | 180  | 1.7454          |
| 1.3342        | 1.9537 | 190  | 1.7449          |
| 1.5714        | 2.0566 | 200  | 1.7448          |

### Framework versions

- PEFT 0.13.1.dev0
- Transformers 4.45.1
- Pytorch 2.4.1+cu121
- Datasets 3.0.1
- Tokenizers 0.20.0