Commit 2e3f65e, committed by thorirhrafn
1 parent: 2a06130

End of training

README.md CHANGED
@@ -18,15 +18,15 @@ should probably proofread and complete it, then remove this comment. -->

 This model is a fine-tuned version of [AI-Sweden-Models/gpt-sw3-1.3b](https://huggingface.co/AI-Sweden-Models/gpt-sw3-1.3b) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.
-- Rewards/chosen: 0.
-- Rewards/rejected: -
 - Rewards/accuracies: 1.0
-- Rewards/margins:
-- Logps/rejected: -
-- Logps/chosen: -
-- Logits/rejected: -2.
-- Logits/chosen: -

 ## Model description

@@ -45,7 +45,7 @@ More information needed
 ### Training hyperparameters

 The following hyperparameters were used during training:
-- learning_rate:
 - train_batch_size: 1
 - eval_batch_size: 1
 - seed: 42

@@ -59,26 +59,26 @@ The following hyperparameters were used during training:

 | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
 |:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.


 ### Framework versions

 This model is a fine-tuned version of [AI-Sweden-Models/gpt-sw3-1.3b](https://huggingface.co/AI-Sweden-Models/gpt-sw3-1.3b) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.0068
+- Rewards/chosen: -0.0291
+- Rewards/rejected: -6.8852
 - Rewards/accuracies: 1.0
+- Rewards/margins: 6.8561
+- Logps/rejected: -290.5968
+- Logps/chosen: -127.3574
+- Logits/rejected: -2.7556
+- Logits/chosen: -2.9748

 ## Model description

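The metric names above match those typically logged by a preference-optimization (DPO-style) trainer, although the card itself does not name the training method. Under that assumption, the following minimal sketch checks two relationships between the reported final values: the margin should equal the chosen reward minus the rejected reward, and the sigmoid preference loss evaluated at the mean margin should be a lower bound on the reported mean loss (by Jensen's inequality, since the loss is convex in the margin). The variable names are illustrative and not taken from the training code.

```python
import math

# Final evaluation values copied from the model card.
rewards_chosen = -0.0291
rewards_rejected = -6.8852
reported_margin = 6.8561
reported_loss = 0.0068

# The margin is the gap between the chosen and rejected rewards.
margin = rewards_chosen - rewards_rejected
margin_matches = abs(margin - reported_margin) < 2e-4

# ASSUMPTION: the loss is the sigmoid preference loss, -log(sigmoid(margin)),
# averaged over examples. Evaluating it at the mean margin gives a lower
# bound on the mean loss, because the function is convex in the margin.
loss_at_mean_margin = math.log1p(math.exp(-margin))

print(f"margin = {margin:.4f} (matches reported: {margin_matches})")
print(f"loss at mean margin = {loss_at_mean_margin:.5f}, reported mean loss = {reported_loss}")
```

If the bound were violated, that would suggest either a different loss variant or a transcription error in the card; here the reported numbers are consistent with the assumption.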
 ### Training hyperparameters

 The following hyperparameters were used during training:
+- learning_rate: 1e-05
 - train_batch_size: 1
 - eval_batch_size: 1
 - seed: 42

 | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
 |:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
+| 0.3903 | 0.1 | 25 | 0.2328 | 0.1244 | -1.3373 | 0.9933 | 1.4618 | -235.1181 | -125.8223 | -3.0887 | -3.2517 |
+| 0.0561 | 0.2 | 50 | 0.0585 | 0.0159 | -3.4934 | 0.9933 | 3.5094 | -256.6789 | -126.9073 | -2.9094 | -3.1004 |
+| 0.0267 | 0.3 | 75 | 0.0268 | -0.0626 | -4.9264 | 0.9967 | 4.8637 | -271.0085 | -127.6931 | -2.8143 | -3.0209 |
+| 0.0141 | 0.4 | 100 | 0.0175 | -0.0535 | -5.4979 | 0.9967 | 5.4444 | -276.7235 | -127.6012 | -2.7755 | -2.9884 |
+| 0.0105 | 0.5 | 125 | 0.0133 | -0.0686 | -5.9461 | 0.9967 | 5.8775 | -281.2056 | -127.7524 | -2.7592 | -2.9752 |
+| 0.0093 | 0.6 | 150 | 0.0113 | -0.0582 | -6.1989 | 0.9967 | 6.1407 | -283.7333 | -127.6482 | -2.7644 | -2.9810 |
+| 0.007 | 0.7 | 175 | 0.0097 | -0.0175 | -6.2570 | 1.0 | 6.2396 | -284.3148 | -127.2412 | -2.7683 | -2.9851 |
+| 0.0085 | 0.79 | 200 | 0.0083 | 0.0050 | -6.4220 | 1.0 | 6.4270 | -285.9642 | -127.0162 | -2.7708 | -2.9884 |
+| 0.0049 | 0.89 | 225 | 0.0079 | -0.0124 | -6.5942 | 1.0 | 6.5818 | -287.6865 | -127.1910 | -2.7644 | -2.9830 |
+| 0.004 | 0.99 | 250 | 0.0076 | -0.0282 | -6.7093 | 1.0 | 6.6811 | -288.8376 | -127.3483 | -2.7587 | -2.9779 |
+| 0.0028 | 1.09 | 275 | 0.0072 | -0.0372 | -6.7997 | 1.0 | 6.7625 | -289.7418 | -127.4389 | -2.7571 | -2.9763 |
+| 0.005 | 1.19 | 300 | 0.0070 | -0.0326 | -6.8348 | 1.0 | 6.8022 | -290.0928 | -127.3927 | -2.7560 | -2.9754 |
+| 0.0038 | 1.29 | 325 | 0.0069 | -0.0346 | -6.8482 | 1.0 | 6.8137 | -290.2270 | -127.4126 | -2.7557 | -2.9749 |
+| 0.004 | 1.39 | 350 | 0.0069 | -0.0326 | -6.8612 | 1.0 | 6.8285 | -290.3561 | -127.3931 | -2.7556 | -2.9747 |
+| 0.0032 | 1.49 | 375 | 0.0069 | -0.0328 | -6.8697 | 1.0 | 6.8370 | -290.4420 | -127.3942 | -2.7557 | -2.9750 |
+| 0.0028 | 1.59 | 400 | 0.0069 | -0.0322 | -6.8743 | 1.0 | 6.8422 | -290.4877 | -127.3882 | -2.7558 | -2.9751 |
+| 0.004 | 1.69 | 425 | 0.0067 | -0.0293 | -6.8746 | 1.0 | 6.8453 | -290.4905 | -127.3596 | -2.7557 | -2.9750 |
+| 0.003 | 1.79 | 450 | 0.0067 | -0.0296 | -6.8840 | 1.0 | 6.8544 | -290.5845 | -127.3624 | -2.7553 | -2.9746 |
+| 0.0028 | 1.89 | 475 | 0.0068 | -0.0285 | -6.8839 | 1.0 | 6.8554 | -290.5839 | -127.3521 | -2.7555 | -2.9748 |
+| 0.0028 | 1.99 | 500 | 0.0068 | -0.0291 | -6.8852 | 1.0 | 6.8561 | -290.5968 | -127.3574 | -2.7556 | -2.9748 |


 ### Framework versions
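To read the training log programmatically, the Markdown rows can be split on the pipe character. The sketch below is one possible way to do that, not part of the original commit: it parses the twenty rows added above, finds the evaluation step with the lowest validation loss, and confirms that each reported margin agrees with chosen minus rejected up to four-decimal rounding. The helper `parse_rows` and the column positions are assumptions based on the table header.

```python
# Rows copied verbatim from the results table in the model card.
TABLE = """\
| 0.3903 | 0.1 | 25 | 0.2328 | 0.1244 | -1.3373 | 0.9933 | 1.4618 | -235.1181 | -125.8223 | -3.0887 | -3.2517 |
| 0.0561 | 0.2 | 50 | 0.0585 | 0.0159 | -3.4934 | 0.9933 | 3.5094 | -256.6789 | -126.9073 | -2.9094 | -3.1004 |
| 0.0267 | 0.3 | 75 | 0.0268 | -0.0626 | -4.9264 | 0.9967 | 4.8637 | -271.0085 | -127.6931 | -2.8143 | -3.0209 |
| 0.0141 | 0.4 | 100 | 0.0175 | -0.0535 | -5.4979 | 0.9967 | 5.4444 | -276.7235 | -127.6012 | -2.7755 | -2.9884 |
| 0.0105 | 0.5 | 125 | 0.0133 | -0.0686 | -5.9461 | 0.9967 | 5.8775 | -281.2056 | -127.7524 | -2.7592 | -2.9752 |
| 0.0093 | 0.6 | 150 | 0.0113 | -0.0582 | -6.1989 | 0.9967 | 6.1407 | -283.7333 | -127.6482 | -2.7644 | -2.9810 |
| 0.007 | 0.7 | 175 | 0.0097 | -0.0175 | -6.2570 | 1.0 | 6.2396 | -284.3148 | -127.2412 | -2.7683 | -2.9851 |
| 0.0085 | 0.79 | 200 | 0.0083 | 0.0050 | -6.4220 | 1.0 | 6.4270 | -285.9642 | -127.0162 | -2.7708 | -2.9884 |
| 0.0049 | 0.89 | 225 | 0.0079 | -0.0124 | -6.5942 | 1.0 | 6.5818 | -287.6865 | -127.1910 | -2.7644 | -2.9830 |
| 0.004 | 0.99 | 250 | 0.0076 | -0.0282 | -6.7093 | 1.0 | 6.6811 | -288.8376 | -127.3483 | -2.7587 | -2.9779 |
| 0.0028 | 1.09 | 275 | 0.0072 | -0.0372 | -6.7997 | 1.0 | 6.7625 | -289.7418 | -127.4389 | -2.7571 | -2.9763 |
| 0.005 | 1.19 | 300 | 0.0070 | -0.0326 | -6.8348 | 1.0 | 6.8022 | -290.0928 | -127.3927 | -2.7560 | -2.9754 |
| 0.0038 | 1.29 | 325 | 0.0069 | -0.0346 | -6.8482 | 1.0 | 6.8137 | -290.2270 | -127.4126 | -2.7557 | -2.9749 |
| 0.004 | 1.39 | 350 | 0.0069 | -0.0326 | -6.8612 | 1.0 | 6.8285 | -290.3561 | -127.3931 | -2.7556 | -2.9747 |
| 0.0032 | 1.49 | 375 | 0.0069 | -0.0328 | -6.8697 | 1.0 | 6.8370 | -290.4420 | -127.3942 | -2.7557 | -2.9750 |
| 0.0028 | 1.59 | 400 | 0.0069 | -0.0322 | -6.8743 | 1.0 | 6.8422 | -290.4877 | -127.3882 | -2.7558 | -2.9751 |
| 0.004 | 1.69 | 425 | 0.0067 | -0.0293 | -6.8746 | 1.0 | 6.8453 | -290.4905 | -127.3596 | -2.7557 | -2.9750 |
| 0.003 | 1.79 | 450 | 0.0067 | -0.0296 | -6.8840 | 1.0 | 6.8544 | -290.5845 | -127.3624 | -2.7553 | -2.9746 |
| 0.0028 | 1.89 | 475 | 0.0068 | -0.0285 | -6.8839 | 1.0 | 6.8554 | -290.5839 | -127.3521 | -2.7555 | -2.9748 |
| 0.0028 | 1.99 | 500 | 0.0068 | -0.0291 | -6.8852 | 1.0 | 6.8561 | -290.5968 | -127.3574 | -2.7556 | -2.9748 |
"""


def parse_rows(table: str) -> list[dict]:
    """Parse Markdown table rows into dicts (hypothetical helper)."""
    rows = []
    for line in table.strip().splitlines():
        # Drop the outer pipes, then split the remaining cells.
        cells = [float(c.strip()) for c in line.strip().strip("|").split("|")]
        rows.append({
            "step": int(cells[2]),
            "val_loss": cells[3],
            "chosen": cells[4],
            "rejected": cells[5],
            "margin": cells[7],
        })
    return rows


rows = parse_rows(TABLE)

# min() returns the first row with the lowest validation loss.
best = min(rows, key=lambda r: r["val_loss"])

# Each margin should equal chosen - rejected up to rounding of the inputs.
consistent = all(abs((r["chosen"] - r["rejected"]) - r["margin"]) < 2e-4 for r in rows)

print(f"lowest validation loss {best['val_loss']} first reached at step {best['step']}")
print(f"margins consistent with chosen - rejected: {consistent}")
```

On these rows the lowest validation loss is first reached before the final step, and the final checkpoint is within 0.0001 of it, which suggests the run had effectively plateaued by the end of the second epoch.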