metadata
license: apache-2.0
library_name: peft
tags:
  - trl
  - orpo
  - unsloth
  - generated_from_trainer
base_model: cognitivecomputations/dolphin-2.9.1-yi-1.5-9b
model-index:
  - name: Gaston_dolphin-2.9.1-yi-1.5-9b
    results: []


Gaston_dolphin-2.9.1-yi-1.5-9b

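This model is a PEFT adapter fine-tuned from cognitivecomputations/dolphin-2.9.1-yi-1.5-9b with ORPO (using TRL and Unsloth, per the metadata tags); the training dataset is not specified. At the last logged evaluation (step 927, epoch 0.9042), it achieves the following results on the evaluation set: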

  • Loss: 0.4290
  • Rewards/chosen: -0.0153
  • Rewards/rejected: -0.2895
  • Rewards/accuracies: 0.9985
  • Rewards/margins: 0.2742
  • Logps/rejected: -2.8952
  • Logps/chosen: -0.1528
  • Logits/rejected: -0.1534
  • Logits/chosen: 0.0002
  • Nll Loss: 0.4278
  • Log Odds Ratio: -0.0124
  • Log Odds Chosen: 4.8981
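
The reported metrics appear to be internally consistent with the way TRL's ORPO trainer is commonly understood to log them. The sketch below is a sanity check, not the training code: it assumes that rewards are the log-probabilities scaled by a coefficient `beta`, that the margin is the chosen reward minus the rejected reward, and that the total loss is the NLL loss minus `beta` times the logged log odds ratio. The value `beta = 0.1` is not stated in this card; it is inferred from the numbers.

```python
# Final evaluation metrics copied from the list above.
metrics = {
    "loss": 0.4290,
    "rewards_chosen": -0.0153,
    "rewards_rejected": -0.2895,
    "rewards_margins": 0.2742,
    "logps_chosen": -0.1528,
    "logps_rejected": -2.8952,
    "nll_loss": 0.4278,
    "log_odds_ratio": -0.0124,
}

# Assumption: reward = beta * logp, so beta can be estimated from either pair.
beta_from_chosen = metrics["rewards_chosen"] / metrics["logps_chosen"]
beta_from_rejected = metrics["rewards_rejected"] / metrics["logps_rejected"]

# Assumption: margin = chosen reward - rejected reward.
margin = metrics["rewards_chosen"] - metrics["rewards_rejected"]

# Assumption: loss = nll_loss - beta * log_odds_ratio, with beta = 0.1
# (inferred from the ratios above, not reported in the card).
beta = 0.1
reconstructed_loss = metrics["nll_loss"] - beta * metrics["log_odds_ratio"]

print(f"beta estimated from chosen pair:   {beta_from_chosen:.4f}")
print(f"beta estimated from rejected pair: {beta_from_rejected:.4f}")
print(f"margin (chosen - rejected):        {margin:.4f}")
print(f"reconstructed loss:                {reconstructed_loss:.4f}")
```

If the assumptions hold, both `beta` estimates land close to 0.1 and the reconstructed loss agrees with the reported loss to within rounding.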

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 8e-06
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 1
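
As a rough illustration of how these values fit together, the sketch below recomputes the effective batch size and evaluates a cosine schedule with linear warmup in plain Python. It is an approximation under stated assumptions, not the trainer's own scheduler: the single-device setup is inferred from `total_train_batch_size`, and the total step count (about 1025) is estimated from the results table (step 927 at epoch 0.9042) rather than reported directly. The helper name `cosine_lr` is hypothetical.

```python
import math

# Values taken from the hyperparameter list.
learning_rate = 8e-06
train_batch_size = 4
gradient_accumulation_steps = 4
warmup_ratio = 0.1

# Assumption: one device, which is consistent with total_train_batch_size = 16.
num_devices = 1
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

# Assumption: total optimizer steps estimated from the results table
# (step 927 corresponds to epoch 0.9042); the card does not report this number.
total_steps = round(927 / 0.9042)
warmup_steps = math.ceil(total_steps * warmup_ratio)


def cosine_lr(step: int) -> float:
    """Learning rate at a given optimizer step: linear warmup, then cosine decay to zero."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return learning_rate * 0.5 * (1.0 + math.cos(math.pi * progress))


print(f"effective batch size: {total_train_batch_size}")
print(f"estimated total steps: {total_steps}, warmup steps: {warmup_steps}")
for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(f"step {step:>4}: lr = {cosine_lr(step):.3e}")
```

Under these assumptions the learning rate climbs linearly to 8e-06 over roughly the first tenth of training and then decays along a half cosine toward zero.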

Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen | Nll Loss | Log Odds Ratio | Log Odds Chosen |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.5193 | 0.1005 | 103 | 0.5159 | -0.0187 | -0.0825 | 0.9971 | 0.0638 | -0.8248 | -0.1866 | 0.1547 | 0.1467 | 0.5004 | -0.1555 | 2.0327 |
| 0.4988 | 0.2009 | 206 | 0.4724 | -0.0170 | -0.1413 | 0.9985 | 0.1243 | -1.4130 | -0.1703 | 0.0154 | -0.0134 | 0.4661 | -0.0627 | 3.0432 |
| 0.4375 | 0.3014 | 309 | 0.4577 | -0.0162 | -0.1628 | 0.9985 | 0.1466 | -1.6283 | -0.1622 | 0.1372 | 0.1328 | 0.4530 | -0.0467 | 3.3955 |
| 0.4738 | 0.4019 | 412 | 0.4463 | -0.0160 | -0.2198 | 0.9985 | 0.2038 | -2.1980 | -0.1596 | -0.0220 | 0.0649 | 0.4438 | -0.0250 | 4.0928 |
| 0.4893 | 0.5023 | 515 | 0.4406 | -0.0159 | -0.2499 | 0.9985 | 0.2341 | -2.4993 | -0.1585 | -0.0720 | 0.0474 | 0.4388 | -0.0185 | 4.4389 |
| 0.4565 | 0.6028 | 618 | 0.4357 | -0.0157 | -0.3289 | 0.9985 | 0.3133 | -3.2895 | -0.1566 | -0.1392 | 0.0470 | 0.4347 | -0.0093 | 5.2916 |
| 0.4069 | 0.7032 | 721 | 0.4324 | -0.0154 | -0.3096 | 0.9985 | 0.2942 | -3.0962 | -0.1544 | -0.1833 | -0.0044 | 0.4313 | -0.0107 | 5.1028 |
| 0.4297 | 0.8037 | 824 | 0.4299 | -0.0153 | -0.2854 | 0.9985 | 0.2701 | -2.8536 | -0.1528 | -0.1911 | -0.0397 | 0.4286 | -0.0129 | 4.8536 |
| 0.4437 | 0.9042 | 927 | 0.4290 | -0.0153 | -0.2895 | 0.9985 | 0.2742 | -2.8952 | -0.1528 | -0.1534 | 0.0002 | 0.4278 | -0.0124 | 4.8981 |

Framework versions

  • PEFT 0.11.1
  • Transformers 4.41.0
  • Pytorch 2.3.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1