---
license: apache-2.0
base_model: princeton-nlp/Sheared-LLaMA-1.3B
tags:
- generated_from_trainer
model-index:
- name: Sheared-LLaMA-1.3B-dpo-full-3-epoch-hydrox-safe
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# Sheared-LLaMA-1.3B-dpo-full-3-epoch-hydrox-safe

This model is a fine-tuned version of [princeton-nlp/Sheared-LLaMA-1.3B](https://huggingface.co/princeton-nlp/Sheared-LLaMA-1.3B) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.0041
- Rewards/chosen: 1.7270
- Rewards/rejected: -15.3712
- Rewards/accuracies: 0.9983
- Rewards/margins: 17.0982
- Logps/rejected: -656.3423
- Logps/chosen: -371.7201
- Logits/rejected: 2.3459
- Logits/chosen: 0.3641
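
The reward figures above are related by simple arithmetic: the margin is the chosen reward minus the rejected reward. The sketch below checks that relationship against the reported numbers. It also evaluates a per-pair sigmoid DPO loss, which is an assumption: the card does not state the loss type, and the "dpo" in the model name plus the metric names only suggest a DPO-style trainer. The function names are hypothetical helpers, not part of any library.

```python
import math


def reward_margin(reward_chosen: float, reward_rejected: float) -> float:
    """Margin between the implicit rewards of the chosen and rejected responses."""
    return reward_chosen - reward_rejected


def dpo_pair_loss(margin: float) -> float:
    """Sigmoid DPO loss for one preference pair: -log(sigmoid(margin)).

    Assumption: the standard sigmoid loss; the card does not name the loss type.
    """
    if margin >= 0:
        # log(1 + exp(-m)) is the numerically stable form for m >= 0
        return math.log1p(math.exp(-margin))
    # For m < 0, rewrite as -m + log(1 + exp(m)) to avoid overflow
    return -margin + math.log1p(math.exp(margin))


# Final evaluation values reported in the list above
final_margin = reward_margin(1.7270, -15.3712)
print(round(final_margin, 4))  # 17.0982, matching Rewards/margins

# Loss of a single pair at zero margin (untrained policy) is log(2)
print(dpo_pair_loss(0.0))
# Loss of a single pair sitting exactly at the mean margin
print(dpo_pair_loss(final_margin))
```

Under that assumption, a pair at the mean margin contributes almost nothing to the loss, so the reported evaluation loss of 0.0041 would come mostly from the small share of pairs that are still ranked incorrectly or only narrowly (accuracy is 0.9983, not 1.0). The mean of per-pair losses is not the loss of the mean margin.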

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-07
- train_batch_size: 8
- eval_batch_size: 4
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- total_train_batch_size: 64
- total_eval_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 3
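
As a rough illustration of how these settings fit together, the sketch below recomputes the total batch size and evaluates a linear schedule with warmup. Two things are assumptions rather than facts from this card: gradient accumulation is taken to be 1 (consistent with 8 per device × 8 devices = 64), and `TOTAL_STEPS = 8800` is only the last step logged in the training results, not a stated total. The helpers are hypothetical and written in plain Python, not calls into the training library.

```python
import math


def effective_batch_size(per_device: int, num_devices: int, grad_accum_steps: int = 1) -> int:
    """Number of examples consumed per optimizer step across all devices."""
    return per_device * num_devices * grad_accum_steps


def linear_lr(step: int, total_steps: int, peak_lr: float = 5e-7, warmup_ratio: float = 0.1) -> float:
    """Linear warmup from 0 to peak_lr, then linear decay back to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Assumption: approximate total step count taken from the last logged step
TOTAL_STEPS = 8800

print(effective_batch_size(per_device=8, num_devices=8))   # train: 64
print(effective_batch_size(per_device=4, num_devices=8))   # eval: 32
for step in (0, 440, 880, 4840, 8800):
    print(step, linear_lr(step, TOTAL_STEPS))
```

With these assumed values the learning rate peaks at 5e-07 around step 880 and decays to zero by the end of training.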

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 0.6612        | 0.03  | 100  | 0.6499          | 0.0765         | -0.0151          | 0.8300             | 0.0916          | -502.7813      | -388.2253    | 3.2379          | 0.9032        |
| 0.4585        | 0.07  | 200  | 0.4458          | 0.5224         | -0.1242          | 0.9301             | 0.6466          | -503.8723      | -383.7663    | 3.2494          | 0.9081        |
| 0.2519        | 0.1   | 300  | 0.2540          | 1.2036         | -0.4814          | 0.9470             | 1.6851          | -507.4445      | -376.9535    | 3.2790          | 0.9127        |
| 0.17          | 0.14  | 400  | 0.1751          | 1.5794         | -1.0033          | 0.9562             | 2.5827          | -512.6629      | -373.1959    | 3.3007          | 0.9173        |
| 0.1179        | 0.17  | 500  | 0.1215          | 1.8423         | -2.0791          | 0.9588             | 3.9214          | -523.4217      | -370.5673    | 3.2925          | 0.9104        |
| 0.1032        | 0.2   | 600  | 0.1078          | 2.0902         | -2.7647          | 0.9596             | 4.8549          | -530.2773      | -368.0876    | 3.2574          | 0.9180        |
| 0.0614        | 0.24  | 700  | 0.0881          | 2.2830         | -3.4190          | 0.9638             | 5.7021          | -536.8207      | -366.1595    | 3.2243          | 0.9190        |
| 0.0666        | 0.27  | 800  | 0.0751          | 2.3690         | -4.0591          | 0.9689             | 6.4281          | -543.2214      | -365.2995    | 3.1788          | 0.9025        |
| 0.0706        | 0.31  | 900  | 0.0662          | 2.4002         | -4.5254          | 0.9722             | 6.9257          | -547.8843      | -364.9874    | 3.1624          | 0.9102        |
| 0.0711        | 0.34  | 1000 | 0.0577          | 2.4230         | -4.9179          | 0.9764             | 7.3409          | -551.8096      | -364.7598    | 3.1467          | 0.9093        |
| 0.0623        | 0.37  | 1100 | 0.0572          | 2.4840         | -5.3620          | 0.9773             | 7.8459          | -556.2499      | -364.1504    | 3.1186          | 0.9011        |
| 0.0443        | 0.41  | 1200 | 0.0526          | 2.4237         | -5.4784          | 0.9798             | 7.9021          | -557.4146      | -364.7530    | 3.1196          | 0.8961        |
| 0.0416        | 0.44  | 1300 | 0.0477          | 2.3874         | -6.2247          | 0.9823             | 8.6120          | -564.8768      | -365.1163    | 3.0683          | 0.8720        |
| 0.0365        | 0.48  | 1400 | 0.0448          | 2.2887         | -6.8360          | 0.9806             | 9.1246          | -570.9899      | -366.1031    | 3.0491          | 0.8667        |
| 0.0341        | 0.51  | 1500 | 0.0442          | 2.2795         | -6.9547          | 0.9848             | 9.2343          | -572.1777      | -366.1945    | 3.0299          | 0.8500        |
| 0.0406        | 0.54  | 1600 | 0.0414          | 2.0896         | -7.0003          | 0.9848             | 9.0899          | -572.6334      | -368.0941    | 3.0437          | 0.8442        |
| 0.0427        | 0.58  | 1700 | 0.0387          | 2.0380         | -7.1141          | 0.9857             | 9.1521          | -573.7712      | -368.6102    | 3.0458          | 0.8383        |
| 0.0225        | 0.61  | 1800 | 0.0421          | 2.2150         | -7.1052          | 0.9891             | 9.3203          | -573.6826      | -366.8395    | 3.0443          | 0.8362        |
| 0.0298        | 0.65  | 1900 | 0.0364          | 2.0854         | -7.7136          | 0.9882             | 9.7990          | -579.7668      | -368.1361    | 3.0306          | 0.8392        |
| 0.0255        | 0.68  | 2000 | 0.0353          | 2.1351         | -7.6852          | 0.9907             | 9.8203          | -579.4824      | -367.6387    | 3.0204          | 0.8292        |
| 0.019         | 0.71  | 2100 | 0.0296          | 2.1215         | -8.1790          | 0.9916             | 10.3005         | -584.4203      | -367.7745    | 3.0052          | 0.8412        |
| 0.0198        | 0.75  | 2200 | 0.0248          | 2.1218         | -8.4302          | 0.9907             | 10.5520         | -586.9324      | -367.7719    | 2.9878          | 0.8183        |
| 0.0192        | 0.78  | 2300 | 0.0238          | 2.0950         | -8.2588          | 0.9924             | 10.3538         | -585.2184      | -368.0402    | 2.9758          | 0.7942        |
| 0.0191        | 0.82  | 2400 | 0.0213          | 2.1701         | -8.6399          | 0.9941             | 10.8101         | -589.0295      | -367.2885    | 2.9719          | 0.8049        |
| 0.0215        | 0.85  | 2500 | 0.0224          | 2.1220         | -9.1960          | 0.9933             | 11.3180         | -594.5902      | -367.7695    | 2.9391          | 0.7799        |
| 0.0579        | 0.88  | 2600 | 0.0193          | 2.0368         | -9.3428          | 0.9933             | 11.3796         | -596.0587      | -368.6217    | 2.9297          | 0.7933        |
| 0.0163        | 0.92  | 2700 | 0.0180          | 1.9057         | -9.4956          | 0.9941             | 11.4013         | -597.5867      | -369.9328    | 2.9114          | 0.7628        |
| 0.019         | 0.95  | 2800 | 0.0194          | 1.9915         | -9.4265          | 0.9933             | 11.4179         | -596.8949      | -369.0752    | 2.9223          | 0.7736        |
| 0.0166        | 0.99  | 2900 | 0.0182          | 2.0770         | -9.1954          | 0.9958             | 11.2724         | -594.5848      | -368.2201    | 2.9186          | 0.7592        |
| 0.0121        | 1.02  | 3000 | 0.0180          | 1.9094         | -9.4964          | 0.9941             | 11.4059         | -597.5947      | -369.8957    | 2.8957          | 0.7557        |
| 0.011         | 1.05  | 3100 | 0.0150          | 2.0009         | -9.9345          | 0.9966             | 11.9354         | -601.9758      | -368.9812    | 2.8560          | 0.7294        |
| 0.0106        | 1.09  | 3200 | 0.0139          | 2.0861         | -9.6153          | 0.9966             | 11.7014         | -598.7830      | -368.1290    | 2.8565          | 0.7071        |
| 0.0095        | 1.12  | 3300 | 0.0134          | 1.9755         | -10.3936         | 0.9958             | 12.3691         | -606.5661      | -369.2344    | 2.8290          | 0.7083        |
| 0.0115        | 1.16  | 3400 | 0.0129          | 1.9719         | -10.3851         | 0.9949             | 12.3569         | -606.4811      | -369.2712    | 2.8212          | 0.7184        |
| 0.0152        | 1.19  | 3500 | 0.0124          | 2.0357         | -10.2131         | 0.9958             | 12.2488         | -604.7615      | -368.6329    | 2.8217          | 0.7140        |
| 0.01          | 1.22  | 3600 | 0.0116          | 2.0147         | -10.9243         | 0.9966             | 12.9390         | -611.8733      | -368.8428    | 2.7589          | 0.6517        |
| 0.0135        | 1.26  | 3700 | 0.0116          | 1.9527         | -10.8649         | 0.9966             | 12.8176         | -611.2795      | -369.4628    | 2.8017          | 0.7064        |
| 0.0078        | 1.29  | 3800 | 0.0112          | 1.7362         | -11.5598         | 0.9966             | 13.2960         | -618.2281      | -371.6281    | 2.7623          | 0.6879        |
| 0.0114        | 1.33  | 3900 | 0.0106          | 1.8313         | -11.3667         | 0.9983             | 13.1980         | -616.2969      | -370.6765    | 2.7616          | 0.6728        |
| 0.0077        | 1.36  | 4000 | 0.0101          | 1.9160         | -11.5484         | 0.9992             | 13.4645         | -618.1147      | -369.8296    | 2.7534          | 0.6694        |
| 0.0057        | 1.39  | 4100 | 0.0098          | 1.8898         | -11.3187         | 0.9983             | 13.2085         | -615.8172      | -370.0915    | 2.7553          | 0.6617        |
| 0.0056        | 1.43  | 4200 | 0.0091          | 2.0721         | -11.6748         | 0.9992             | 13.7469         | -619.3782      | -368.2689    | 2.7234          | 0.6265        |
| 0.006         | 1.46  | 4300 | 0.0088          | 1.8416         | -12.1884         | 0.9983             | 14.0300         | -624.5148      | -370.5739    | 2.7058          | 0.6225        |
| 0.0071        | 1.5   | 4400 | 0.0083          | 2.0151         | -11.7393         | 0.9983             | 13.7544         | -620.0233      | -368.8386    | 2.7124          | 0.6231        |
| 0.0101        | 1.53  | 4500 | 0.0083          | 2.0864         | -11.5153         | 0.9992             | 13.6016         | -617.7830      | -368.1264    | 2.7206          | 0.6407        |
| 0.0054        | 1.56  | 4600 | 0.0083          | 1.9930         | -11.3424         | 0.9975             | 13.3354         | -616.0542      | -369.0597    | 2.7246          | 0.6099        |
| 0.0116        | 1.6   | 4700 | 0.0080          | 1.9298         | -11.3167         | 0.9975             | 13.2464         | -615.7971      | -369.6923    | 2.7200          | 0.6008        |
| 0.0116        | 1.63  | 4800 | 0.0074          | 1.8809         | -11.4685         | 0.9975             | 13.3494         | -617.3154      | -370.1813    | 2.6917          | 0.5698        |
| 0.0087        | 1.67  | 4900 | 0.0073          | 1.8993         | -11.8845         | 0.9983             | 13.7838         | -621.4749      | -369.9968    | 2.6861          | 0.5798        |
| 0.0031        | 1.7   | 5000 | 0.0072          | 1.8755         | -12.3032         | 0.9975             | 14.1787         | -625.6624      | -370.2348    | 2.6435          | 0.5411        |
| 0.0115        | 1.73  | 5100 | 0.0076          | 1.9283         | -11.9068         | 0.9958             | 13.8351         | -621.6979      | -369.7066    | 2.6527          | 0.5393        |
| 0.0065        | 1.77  | 5200 | 0.0074          | 1.9870         | -11.9105         | 0.9949             | 13.8975         | -621.7357      | -369.1199    | 2.6790          | 0.5763        |
| 0.006         | 1.8   | 5300 | 0.0068          | 1.7994         | -12.4601         | 0.9958             | 14.2595         | -627.2310      | -370.9959    | 2.6264          | 0.5393        |
| 0.0076        | 1.84  | 5400 | 0.0064          | 2.0449         | -12.2057         | 0.9966             | 14.2506         | -624.6871      | -368.5407    | 2.6409          | 0.5465        |
| 0.0042        | 1.87  | 5500 | 0.0062          | 1.9941         | -12.4399         | 0.9983             | 14.4340         | -627.0295      | -369.0491    | 2.6332          | 0.5433        |
| 0.0079        | 1.9   | 5600 | 0.0061          | 1.9119         | -12.4000         | 0.9983             | 14.3118         | -626.6300      | -369.8711    | 2.6300          | 0.5377        |
| 0.0066        | 1.94  | 5700 | 0.0062          | 2.0544         | -12.1682         | 0.9983             | 14.2226         | -624.3120      | -368.4457    | 2.6248          | 0.5288        |
| 0.0071        | 1.97  | 5800 | 0.0061          | 2.0943         | -12.2702         | 0.9975             | 14.3645         | -625.3325      | -368.0468    | 2.6248          | 0.5422        |
| 0.0021        | 2.01  | 5900 | 0.0057          | 1.9195         | -12.9348         | 0.9983             | 14.8543         | -631.9785      | -369.7946    | 2.5712          | 0.5186        |
| 0.0029        | 2.04  | 6000 | 0.0057          | 1.8384         | -13.3904         | 0.9983             | 15.2288         | -636.5340      | -370.6057    | 2.5405          | 0.4960        |
| 0.0035        | 2.07  | 6100 | 0.0056          | 1.6150         | -14.2858         | 0.9975             | 15.9009         | -645.4886      | -372.8395    | 2.4718          | 0.4415        |
| 0.0053        | 2.11  | 6200 | 0.0053          | 1.8268         | -13.9429         | 0.9983             | 15.7696         | -642.0590      | -370.7222    | 2.4921          | 0.4576        |
| 0.0044        | 2.14  | 6300 | 0.0052          | 1.9443         | -13.8117         | 0.9975             | 15.7560         | -640.7470      | -369.5464    | 2.5079          | 0.4705        |
| 0.0026        | 2.18  | 6400 | 0.0053          | 2.0456         | -13.7455         | 0.9975             | 15.7911         | -640.0853      | -368.5343    | 2.5139          | 0.4823        |
| 0.0026        | 2.21  | 6500 | 0.0050          | 2.0028         | -13.6496         | 0.9983             | 15.6524         | -639.1260      | -368.9618    | 2.5135          | 0.4823        |
| 0.0029        | 2.24  | 6600 | 0.0050          | 1.8856         | -13.7926         | 0.9975             | 15.6782         | -640.5563      | -370.1337    | 2.4828          | 0.4459        |
| 0.0023        | 2.28  | 6700 | 0.0049          | 1.9422         | -14.0760         | 0.9983             | 16.0182         | -643.3903      | -369.5678    | 2.4698          | 0.4471        |
| 0.003         | 2.31  | 6800 | 0.0048          | 1.8633         | -14.4649         | 0.9983             | 16.3282         | -647.2790      | -370.3570    | 2.4646          | 0.4562        |
| 0.0058        | 2.35  | 6900 | 0.0049          | 1.8085         | -14.8512         | 0.9975             | 16.6597         | -651.1427      | -370.9051    | 2.4275          | 0.4292        |
| 0.0032        | 2.38  | 7000 | 0.0048          | 1.9006         | -14.6340         | 0.9983             | 16.5346         | -648.9703      | -369.9842    | 2.4387          | 0.4425        |
| 0.0018        | 2.41  | 7100 | 0.0047          | 1.8215         | -15.0376         | 0.9983             | 16.8592         | -653.0067      | -370.7746    | 2.4153          | 0.4296        |
| 0.001         | 2.45  | 7200 | 0.0046          | 1.8195         | -15.0112         | 0.9983             | 16.8307         | -652.7422      | -370.7950    | 2.4153          | 0.4248        |
| 0.0057        | 2.48  | 7300 | 0.0045          | 1.8920         | -14.4156         | 0.9983             | 16.3077         | -646.7868      | -370.0694    | 2.4336          | 0.4234        |
| 0.004         | 2.52  | 7400 | 0.0044          | 1.7826         | -14.6522         | 0.9983             | 16.4348         | -649.1526      | -371.1638    | 2.4101          | 0.4117        |
| 0.0025        | 2.55  | 7500 | 0.0044          | 1.8202         | -14.7043         | 0.9983             | 16.5245         | -649.6732      | -370.7875    | 2.4040          | 0.4069        |
| 0.0035        | 2.58  | 7600 | 0.0044          | 1.8712         | -14.7562         | 0.9983             | 16.6273         | -650.1921      | -370.2782    | 2.4019          | 0.4087        |
| 0.002         | 2.62  | 7700 | 0.0043          | 1.8406         | -14.8610         | 0.9983             | 16.7017         | -651.2407      | -370.5836    | 2.3996          | 0.4114        |
| 0.002         | 2.65  | 7800 | 0.0043          | 1.8042         | -15.0820         | 0.9992             | 16.8862         | -653.4503      | -370.9484    | 2.3936          | 0.4147        |
| 0.0046        | 2.69  | 7900 | 0.0042          | 1.8043         | -15.2990         | 0.9983             | 17.1033         | -655.6204      | -370.9472    | 2.3757          | 0.3993        |
| 0.0025        | 2.72  | 8000 | 0.0042          | 1.8289         | -15.3097         | 0.9983             | 17.1386         | -655.7274      | -370.7011    | 2.3634          | 0.3853        |
| 0.0023        | 2.75  | 8100 | 0.0041          | 1.7995         | -15.2380         | 0.9983             | 17.0375         | -655.0099      | -370.9947    | 2.3619          | 0.3779        |
| 0.0025        | 2.79  | 8200 | 0.0040          | 1.8013         | -15.2440         | 0.9983             | 17.0453         | -655.0703      | -370.9769    | 2.3668          | 0.3827        |
| 0.002         | 2.82  | 8300 | 0.0040          | 1.8040         | -15.2101         | 0.9983             | 17.0141         | -654.7317      | -370.9499    | 2.3660          | 0.3834        |
| 0.0023        | 2.86  | 8400 | 0.0040          | 1.7441         | -15.3132         | 0.9983             | 17.0572         | -655.7621      | -371.5493    | 2.3498          | 0.3680        |
| 0.002         | 2.89  | 8500 | 0.0040          | 1.7551         | -15.3278         | 0.9983             | 17.0828         | -655.9080      | -371.4393    | 2.3509          | 0.3714        |
| 0.004         | 2.92  | 8600 | 0.0040          | 1.7500         | -15.3290         | 0.9983             | 17.0790         | -655.9205      | -371.4897    | 2.3518          | 0.3701        |
| 0.0041        | 2.96  | 8700 | 0.0040          | 1.7294         | -15.3645         | 0.9983             | 17.0940         | -656.2756      | -371.6956    | 2.3478          | 0.3660        |
| 0.0029        | 2.99  | 8800 | 0.0040          | 1.7305         | -15.3609         | 0.9983             | 17.0914         | -656.2390      | -371.6845    | 2.3464          | 0.3647        |
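
The training set is not documented, but the Step and Epoch columns allow a rough size estimate. The sketch below bounds the number of optimizer steps per epoch from the last row (step 8800 at epoch 2.99, rounded to two decimals) and multiplies by the total train batch size of 64. It assumes one optimizer step per global batch with no gradient accumulation, so treat the result as an order-of-magnitude estimate only. The helper name is hypothetical.

```python
def steps_per_epoch_bounds(step: int, epoch_rounded: float, decimals: int = 2) -> tuple[float, float]:
    """Bound steps per epoch given an epoch value rounded to `decimals` places."""
    half_unit = 0.5 * 10 ** (-decimals)
    # The true epoch lies in [epoch_rounded - half_unit, epoch_rounded + half_unit]
    low = step / (epoch_rounded + half_unit)
    high = step / (epoch_rounded - half_unit)
    return low, high


TOTAL_TRAIN_BATCH_SIZE = 64  # from the hyperparameter list

low, high = steps_per_epoch_bounds(step=8800, epoch_rounded=2.99)
print(f"steps per epoch: about {low:.0f} to {high:.0f}")
print(
    "training pairs per epoch: about "
    f"{low * TOTAL_TRAIN_BATCH_SIZE:.0f} to {high * TOTAL_TRAIN_BATCH_SIZE:.0f}"
)
```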


### Framework versions

- Transformers 4.35.0
- PyTorch 2.1.1+cu121
- Datasets 2.14.6
- Tokenizers 0.14.1