thorirhrafn committed on
Commit 2e3f65e
1 Parent(s): 2a06130

End of training

Files changed (1)
  1. README.md +29 -29
README.md CHANGED
@@ -18,15 +18,15 @@ should probably proofread and complete it, then remove this comment. -->
 
  This model is a fine-tuned version of [AI-Sweden-Models/gpt-sw3-1.3b](https://huggingface.co/AI-Sweden-Models/gpt-sw3-1.3b) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Loss: 0.0172
- - Rewards/chosen: 0.0550
- - Rewards/rejected: -5.0967
  - Rewards/accuracies: 1.0
- - Rewards/margins: 5.1517
- - Logps/rejected: -272.7118
- - Logps/chosen: -126.5166
- - Logits/rejected: -2.8059
- - Logits/chosen: -3.0134
 
  ## Model description
 
@@ -45,7 +45,7 @@ More information needed
  ### Training hyperparameters
 
  The following hyperparameters were used during training:
- - learning_rate: 5e-06
  - train_batch_size: 1
  - eval_batch_size: 1
  - seed: 42
@@ -59,26 +59,26 @@ The following hyperparameters were used during training:
 
  | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
  |:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
- | 0.5638 | 0.1 | 25 | 0.4588 | 0.0798 | -0.4775 | 0.9933 | 0.5572 | -226.5192 | -126.2689 | -3.1511 | -3.3013 |
- | 0.2438 | 0.2 | 50 | 0.2407 | 0.1391 | -1.2630 | 0.9967 | 1.4021 | -234.3750 | -125.6758 | -3.0948 | -3.2573 |
- | 0.1419 | 0.3 | 75 | 0.1251 | 0.1512 | -2.1203 | 0.9967 | 2.2715 | -242.9475 | -125.5544 | -3.0169 | -3.1907 |
- | 0.0637 | 0.4 | 100 | 0.0685 | 0.1313 | -3.0008 | 0.9967 | 3.1321 | -251.7525 | -125.7539 | -2.9258 | -3.1120 |
- | 0.0467 | 0.5 | 125 | 0.0435 | 0.0748 | -3.7561 | 0.9967 | 3.8309 | -259.3056 | -126.3189 | -2.8674 | -3.0627 |
- | 0.029 | 0.6 | 150 | 0.0326 | 0.0369 | -4.2568 | 0.9967 | 4.2937 | -264.3123 | -126.6974 | -2.8396 | -3.0402 |
- | 0.0248 | 0.7 | 175 | 0.0272 | 0.0298 | -4.5229 | 0.9967 | 4.5528 | -266.9736 | -126.7682 | -2.8248 | -3.0283 |
- | 0.0226 | 0.79 | 200 | 0.0233 | 0.0416 | -4.7048 | 0.9967 | 4.7463 | -268.7922 | -126.6510 | -2.8199 | -3.0251 |
- | 0.0149 | 0.89 | 225 | 0.0218 | 0.0346 | -4.8496 | 0.9967 | 4.8843 | -270.2410 | -126.7205 | -2.8109 | -3.0175 |
- | 0.0139 | 0.99 | 250 | 0.0204 | 0.0329 | -4.9460 | 0.9967 | 4.9789 | -271.2041 | -126.7377 | -2.8070 | -3.0145 |
- | 0.0106 | 1.09 | 275 | 0.0191 | 0.0298 | -5.0258 | 1.0 | 5.0556 | -272.0027 | -126.7688 | -2.8048 | -3.0123 |
- | 0.0166 | 1.19 | 300 | 0.0187 | 0.0372 | -5.0554 | 1.0 | 5.0926 | -272.2990 | -126.6948 | -2.8040 | -3.0115 |
- | 0.0123 | 1.29 | 325 | 0.0182 | 0.0438 | -5.0713 | 1.0 | 5.1151 | -272.4578 | -126.6287 | -2.8040 | -3.0117 |
- | 0.017 | 1.39 | 350 | 0.0178 | 0.0474 | -5.0755 | 1.0 | 5.1228 | -272.4991 | -126.5928 | -2.8055 | -3.0130 |
- | 0.0115 | 1.49 | 375 | 0.0177 | 0.0530 | -5.0809 | 1.0 | 5.1339 | -272.5540 | -126.5368 | -2.8064 | -3.0139 |
- | 0.0107 | 1.59 | 400 | 0.0175 | 0.0549 | -5.0879 | 1.0 | 5.1428 | -272.6239 | -126.5175 | -2.8059 | -3.0134 |
- | 0.0144 | 1.69 | 425 | 0.0174 | 0.0546 | -5.0923 | 1.0 | 5.1468 | -272.6672 | -126.5208 | -2.8063 | -3.0137 |
- | 0.0112 | 1.79 | 450 | 0.0175 | 0.0549 | -5.0935 | 1.0 | 5.1484 | -272.6794 | -126.5173 | -2.8062 | -3.0136 |
- | 0.0101 | 1.89 | 475 | 0.0175 | 0.0549 | -5.0958 | 1.0 | 5.1508 | -272.7028 | -126.5172 | -2.8061 | -3.0135 |
- | 0.011 | 1.99 | 500 | 0.0172 | 0.0550 | -5.0967 | 1.0 | 5.1517 | -272.7118 | -126.5166 | -2.8059 | -3.0134 |
 
 
  ### Framework versions
 
 
  This model is a fine-tuned version of [AI-Sweden-Models/gpt-sw3-1.3b](https://huggingface.co/AI-Sweden-Models/gpt-sw3-1.3b) on an unknown dataset.
  It achieves the following results on the evaluation set:
+ - Loss: 0.0068
+ - Rewards/chosen: -0.0291
+ - Rewards/rejected: -6.8852
  - Rewards/accuracies: 1.0
+ - Rewards/margins: 6.8561
+ - Logps/rejected: -290.5968
+ - Logps/chosen: -127.3574
+ - Logits/rejected: -2.7556
+ - Logits/chosen: -2.9748
 
  ## Model description
 
 
  ### Training hyperparameters
 
  The following hyperparameters were used during training:
+ - learning_rate: 1e-05
  - train_batch_size: 1
  - eval_batch_size: 1
  - seed: 42
 
 
  | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
  |:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
+ | 0.3903 | 0.1 | 25 | 0.2328 | 0.1244 | -1.3373 | 0.9933 | 1.4618 | -235.1181 | -125.8223 | -3.0887 | -3.2517 |
+ | 0.0561 | 0.2 | 50 | 0.0585 | 0.0159 | -3.4934 | 0.9933 | 3.5094 | -256.6789 | -126.9073 | -2.9094 | -3.1004 |
+ | 0.0267 | 0.3 | 75 | 0.0268 | -0.0626 | -4.9264 | 0.9967 | 4.8637 | -271.0085 | -127.6931 | -2.8143 | -3.0209 |
+ | 0.0141 | 0.4 | 100 | 0.0175 | -0.0535 | -5.4979 | 0.9967 | 5.4444 | -276.7235 | -127.6012 | -2.7755 | -2.9884 |
+ | 0.0105 | 0.5 | 125 | 0.0133 | -0.0686 | -5.9461 | 0.9967 | 5.8775 | -281.2056 | -127.7524 | -2.7592 | -2.9752 |
+ | 0.0093 | 0.6 | 150 | 0.0113 | -0.0582 | -6.1989 | 0.9967 | 6.1407 | -283.7333 | -127.6482 | -2.7644 | -2.9810 |
+ | 0.007 | 0.7 | 175 | 0.0097 | -0.0175 | -6.2570 | 1.0 | 6.2396 | -284.3148 | -127.2412 | -2.7683 | -2.9851 |
+ | 0.0085 | 0.79 | 200 | 0.0083 | 0.0050 | -6.4220 | 1.0 | 6.4270 | -285.9642 | -127.0162 | -2.7708 | -2.9884 |
+ | 0.0049 | 0.89 | 225 | 0.0079 | -0.0124 | -6.5942 | 1.0 | 6.5818 | -287.6865 | -127.1910 | -2.7644 | -2.9830 |
+ | 0.004 | 0.99 | 250 | 0.0076 | -0.0282 | -6.7093 | 1.0 | 6.6811 | -288.8376 | -127.3483 | -2.7587 | -2.9779 |
+ | 0.0028 | 1.09 | 275 | 0.0072 | -0.0372 | -6.7997 | 1.0 | 6.7625 | -289.7418 | -127.4389 | -2.7571 | -2.9763 |
+ | 0.005 | 1.19 | 300 | 0.0070 | -0.0326 | -6.8348 | 1.0 | 6.8022 | -290.0928 | -127.3927 | -2.7560 | -2.9754 |
+ | 0.0038 | 1.29 | 325 | 0.0069 | -0.0346 | -6.8482 | 1.0 | 6.8137 | -290.2270 | -127.4126 | -2.7557 | -2.9749 |
+ | 0.004 | 1.39 | 350 | 0.0069 | -0.0326 | -6.8612 | 1.0 | 6.8285 | -290.3561 | -127.3931 | -2.7556 | -2.9747 |
+ | 0.0032 | 1.49 | 375 | 0.0069 | -0.0328 | -6.8697 | 1.0 | 6.8370 | -290.4420 | -127.3942 | -2.7557 | -2.9750 |
+ | 0.0028 | 1.59 | 400 | 0.0069 | -0.0322 | -6.8743 | 1.0 | 6.8422 | -290.4877 | -127.3882 | -2.7558 | -2.9751 |
+ | 0.004 | 1.69 | 425 | 0.0067 | -0.0293 | -6.8746 | 1.0 | 6.8453 | -290.4905 | -127.3596 | -2.7557 | -2.9750 |
+ | 0.003 | 1.79 | 450 | 0.0067 | -0.0296 | -6.8840 | 1.0 | 6.8544 | -290.5845 | -127.3624 | -2.7553 | -2.9746 |
+ | 0.0028 | 1.89 | 475 | 0.0068 | -0.0285 | -6.8839 | 1.0 | 6.8554 | -290.5839 | -127.3521 | -2.7555 | -2.9748 |
+ | 0.0028 | 1.99 | 500 | 0.0068 | -0.0291 | -6.8852 | 1.0 | 6.8561 | -290.5968 | -127.3574 | -2.7556 | -2.9748 |
 
 
  ### Framework versions
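
The `Rewards/margins` values in both versions of the README look like the difference between `Rewards/chosen` and `Rewards/rejected`, which is how preference-optimization trainers commonly define the margin. The source does not name the trainer, so that definition is an assumption here. A minimal sketch that checks it against the final evaluation values from each side of the diff (the helper name `margin_from_rewards` and the dictionary layout are illustrative only):

```python
# Final evaluation metrics copied from the removed (-) and added (+) sides of the diff.
runs = {
    "before (learning_rate=5e-06)": {"chosen": 0.0550, "rejected": -5.0967, "margin": 5.1517},
    "after (learning_rate=1e-05)": {"chosen": -0.0291, "rejected": -6.8852, "margin": 6.8561},
}


def margin_from_rewards(chosen: float, rejected: float) -> float:
    """Assumed definition: margin = reward(chosen) - reward(rejected)."""
    return chosen - rejected


for name, metrics in runs.items():
    computed = margin_from_rewards(metrics["chosen"], metrics["rejected"])
    # README values are rounded to four decimals, so allow a small tolerance.
    consistent = abs(computed - metrics["margin"]) < 5e-4
    print(f"{name}: computed={computed:.4f} reported={metrics['margin']:.4f} consistent={consistent}")
```

Under that assumption, both reported margins agree with the chosen and rejected rewards to within rounding. In both runs the margin comes almost entirely from the rejected reward becoming more negative, while the chosen reward stays close to zero.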