---
license: apache-2.0
base_model: alignment-handbook/zephyr-7b-sft-full
tags:
- generated_from_trainer
model-index:
- name: base-sft-safe-spin-v
  results: []
---

# base-sft-safe-spin-v

This model is a fine-tuned version of [alignment-handbook/zephyr-7b-sft-full](https://huggingface.co/alignment-handbook/zephyr-7b-sft-full) on an unspecified dataset (no dataset name was recorded by the Trainer).
It achieves the following results on the evaluation set:
- Loss: 0.0738
- Rewards/real: -3.0711
- Rewards/generated: -13.2471
- Rewards/accuracies: 0.9713
- Rewards/margins: 10.1760
- Logps/generated: -228.7879
- Logps/real: -165.3767
- Logits/generated: -2.4198
- Logits/real: -2.4231
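
The reported numbers are consistent with the margin being the gap between the two reward terms, that is, `Rewards/margins = Rewards/real - Rewards/generated`. The short check below is only a sketch of that arithmetic using the values listed above; the variable names are illustrative, and the assumption that the trainer computes the margin this way is inferred from the numbers rather than stated in this card.

```python
# Final evaluation values copied from the list above.
rewards_real = -3.0711
rewards_generated = -13.2471
reported_margin = 10.1760

# Assumption: the margin is the reward of the real response minus the
# reward of the generated response (averaged over the evaluation set).
computed_margin = rewards_real - rewards_generated

print(f"computed margin: {computed_margin:.4f}")
print(f"agrees with reported value: {abs(computed_margin - reported_margin) < 1e-3}")
```

Under the same reading, `Rewards/accuracies` would be the fraction of evaluation pairs in which the real response receives the higher reward.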

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-07
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1
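
As a rough illustration of how these settings fit together, the sketch below recomputes the total batch size and evaluates a linear schedule with warmup in plain Python. It makes several assumptions: gradient accumulation is taken to be 1 because it is not listed above, `linear_lr` is a hypothetical helper that mirrors the usual "linear warmup, then linear decay to zero" shape rather than the exact code used for this run, and `HYPOTHETICAL_TOTAL_STEPS` is a placeholder because the exact number of optimizer steps is not stated in this card.

```python
import math


def linear_lr(step, total_steps, peak_lr=5e-7, warmup_ratio=0.1):
    """Hypothetical helper: learning rate at `step` for linear warmup + linear decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: fall linearly from peak_lr to 0 at total_steps.
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


# Effective batch size from the values listed above.
train_batch_size = 8   # per device
num_devices = 4
grad_accum_steps = 1   # assumption: not listed in this card
total_train_batch_size = train_batch_size * num_devices * grad_accum_steps
print("total_train_batch_size:", total_train_batch_size)

# Placeholder step count, chosen only to make the schedule easy to read.
HYPOTHETICAL_TOTAL_STEPS = 1000
for step in (0, 50, 100, 550, 1000):
    print(step, linear_lr(step, HYPOTHETICAL_TOTAL_STEPS))
```

With these settings the learning rate peaks at 5e-07 once the first 10% of steps have elapsed and then decays linearly for the rest of the single epoch.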

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/real | Rewards/generated | Rewards/accuracies | Rewards/margins | Logps/generated | Logps/real | Logits/generated | Logits/real |
|:-------------:|:-----:|:----:|:---------------:|:------------:|:-----------------:|:------------------:|:---------------:|:---------------:|:----------:|:----------------:|:-----------:|
| 0.3742        | 0.06  | 100  | 0.2244          | -0.3695      | -6.6880           | 0.9658             | 6.3185          | -163.1966       | -138.3603  | -2.7435          | -2.7148     |
| 0.2528        | 0.12  | 200  | 0.1319          | -1.2400      | -17.8536          | 0.9697             | 16.6136         | -274.8525       | -147.0654  | -2.4573          | -2.4671     |
| 0.2066        | 0.17  | 300  | 0.1172          | -1.6714      | -19.7358          | 0.9618             | 18.0643         | -293.6746       | -151.3799  | -2.4257          | -2.3622     |
| 0.2207        | 0.23  | 400  | 0.1094          | -1.9426      | -20.6733          | 0.9729             | 18.7307         | -303.0500       | -154.0918  | -2.4889          | -2.4525     |
| 0.4379        | 0.29  | 500  | 0.1152          | -1.0002      | -8.3421           | 0.9666             | 7.3419          | -179.7377       | -144.6674  | -2.3870          | -2.3441     |
| 0.1517        | 0.35  | 600  | 0.0984          | -1.6577      | -12.9237          | 0.9745             | 11.2660         | -225.5533       | -151.2425  | -2.2691          | -2.2742     |
| 0.1708        | 0.41  | 700  | 0.0866          | -1.9495      | -14.1941          | 0.9745             | 12.2446         | -238.2574       | -154.1605  | -2.2343          | -2.2124     |
| 0.1135        | 0.47  | 800  | 0.0810          | -3.0171      | -16.4497          | 0.9785             | 13.4327         | -260.8139       | -164.8361  | -2.1789          | -2.1987     |
| 0.1364        | 0.52  | 900  | 0.0848          | -2.5549      | -14.8091          | 0.9729             | 12.2542         | -244.4078       | -160.2151  | -2.3295          | -2.3368     |
| 0.1142        | 0.58  | 1000 | 0.0902          | -2.6698      | -10.6438          | 0.9713             | 7.9740          | -202.7553       | -161.3638  | -2.4644          | -2.4787     |
| 0.1332        | 0.64  | 1100 | 0.0771          | -2.7436      | -11.8738          | 0.9785             | 9.1302          | -215.0552       | -162.1016  | -2.4417          | -2.4630     |
| 0.1007        | 0.70  | 1200 | 0.0758          | -3.4115      | -14.1899          | 0.9745             | 10.7784         | -238.2156       | -168.7807  | -2.3948          | -2.4255     |
| 0.1306        | 0.76  | 1300 | 0.0765          | -2.4042      | -11.1062          | 0.9753             | 8.7019          | -207.3786       | -158.7081  | -2.5270          | -2.5375     |
| 0.1084        | 0.81  | 1400 | 0.0760          | -2.7805      | -12.4025          | 0.9745             | 9.6220          | -220.3422       | -162.4709  | -2.4762          | -2.4848     |
| 0.1494        | 0.87  | 1500 | 0.0740          | -3.0055      | -13.0014          | 0.9713             | 9.9959          | -226.3309       | -164.7203  | -2.4656          | -2.4751     |
| 0.1099        | 0.93  | 1600 | 0.0774          | -3.4971      | -13.6736          | 0.9729             | 10.1765         | -233.0532       | -169.6366  | -2.4253          | -2.4320     |
| 0.0906        | 0.99  | 1700 | 0.0738          | -3.0711      | -13.2471          | 0.9713             | 10.1760         | -228.7879       | -165.3767  | -2.4198          | -2.4231     |


### Framework versions

- Transformers 4.37.0
- Pytorch 2.1.2+cu121
- Datasets 2.14.6
- Tokenizers 0.15.2