learn3r committed on
Commit
527d67f
1 Parent(s): a8152ba

Model save

README.md CHANGED
@@ -13,7 +13,7 @@ should probably proofread and complete it, then remove this comment. -->
 
  This model was trained from scratch on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Loss: 3.3318
+ - Loss: 3.6048
 
  ## Model description
 
@@ -40,37 +40,42 @@ The following hyperparameters were used during training:
  - total_train_batch_size: 256
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: constant
- - num_epochs: 20.0
+ - num_epochs: 25.0
 
  ### Training results
 
  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:-----:|:----:|:---------------:|
- | 0.1033 | 0.97 | 14 | 3.1040 |
- | 0.0836 | 1.95 | 28 | 3.0540 |
- | 0.0717 | 2.99 | 43 | 2.9414 |
- | 0.0614 | 3.97 | 57 | 3.0238 |
- | 0.1275 | 4.94 | 71 | 2.8326 |
- | 0.0511 | 5.98 | 86 | 3.0479 |
- | 0.0666 | 6.96 | 100 | 3.1255 |
- | 0.0398 | 8.0 | 115 | 3.2240 |
- | 0.0396 | 8.97 | 129 | 3.1667 |
- | 0.0466 | 9.95 | 143 | 3.2775 |
- | 0.043 | 10.99 | 158 | 3.3289 |
- | 0.0538 | 11.97 | 172 | 2.8202 |
- | 0.028 | 12.94 | 186 | 3.4366 |
- | 0.1056 | 13.98 | 201 | 3.3447 |
- | 0.0303 | 14.96 | 215 | 3.0069 |
- | 0.0234 | 16.0 | 230 | 3.3524 |
- | 0.0263 | 16.97 | 244 | 3.2473 |
- | 0.0225 | 17.95 | 258 | 3.3365 |
- | 0.0225 | 18.99 | 273 | 3.4389 |
- | 0.0211 | 19.48 | 280 | 3.3318 |
+ | 0.1855 | 0.97 | 14 | 2.5320 |
+ | 0.1635 | 1.95 | 28 | 2.4299 |
+ | 0.1272 | 2.99 | 43 | 2.9443 |
+ | 0.1113 | 3.97 | 57 | 2.8813 |
+ | 0.0819 | 4.94 | 71 | 3.0005 |
+ | 0.0782 | 5.98 | 86 | 3.0224 |
+ | 0.0588 | 6.96 | 100 | 3.1903 |
+ | 0.0729 | 8.0 | 115 | 2.5871 |
+ | 0.0473 | 8.97 | 129 | 3.2830 |
+ | 0.113 | 9.95 | 143 | 3.3443 |
+ | 0.0364 | 10.99 | 158 | 3.3243 |
+ | 0.0321 | 11.97 | 172 | 3.3962 |
+ | 0.0302 | 12.94 | 186 | 3.4508 |
+ | 0.0717 | 13.98 | 201 | 3.4166 |
+ | 0.0746 | 14.96 | 215 | 2.8975 |
+ | 0.0548 | 16.0 | 230 | 3.0853 |
+ | 0.0507 | 16.97 | 244 | 3.0706 |
+ | 0.0442 | 17.95 | 258 | 3.2759 |
+ | 0.0396 | 18.99 | 273 | 3.1962 |
+ | 0.0351 | 19.97 | 287 | 3.3108 |
+ | 0.0306 | 20.94 | 301 | 3.2607 |
+ | 0.0267 | 21.98 | 316 | 3.4015 |
+ | 0.1454 | 22.96 | 330 | 2.6912 |
+ | 0.0252 | 24.0 | 345 | 3.4576 |
+ | 0.0187 | 24.35 | 350 | 3.6048 |
 
 
  ### Framework versions
 
- - Transformers 4.36.2
- - Pytorch 2.1.2+cu121
- - Datasets 2.16.1
- - Tokenizers 0.15.0
+ - Transformers 4.38.1
+ - Pytorch 2.2.1+cu121
+ - Datasets 2.17.1
+ - Tokenizers 0.15.2
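The updated results table can be checked mechanically. The following is a minimal sketch, not part of the training script, that uses only the rows from the new (25-epoch) table to find the step with the lowest validation loss and compare it with the final step. The names `ROWS` and `best_by_validation` are illustrative assumptions.

```python
# Rows copied from the updated "Training results" table in this commit:
# (training_loss, epoch, step, validation_loss)
ROWS = [
    (0.1855, 0.97, 14, 2.5320),
    (0.1635, 1.95, 28, 2.4299),
    (0.1272, 2.99, 43, 2.9443),
    (0.1113, 3.97, 57, 2.8813),
    (0.0819, 4.94, 71, 3.0005),
    (0.0782, 5.98, 86, 3.0224),
    (0.0588, 6.96, 100, 3.1903),
    (0.0729, 8.0, 115, 2.5871),
    (0.0473, 8.97, 129, 3.2830),
    (0.113, 9.95, 143, 3.3443),
    (0.0364, 10.99, 158, 3.3243),
    (0.0321, 11.97, 172, 3.3962),
    (0.0302, 12.94, 186, 3.4508),
    (0.0717, 13.98, 201, 3.4166),
    (0.0746, 14.96, 215, 2.8975),
    (0.0548, 16.0, 230, 3.0853),
    (0.0507, 16.97, 244, 3.0706),
    (0.0442, 17.95, 258, 3.2759),
    (0.0396, 18.99, 273, 3.1962),
    (0.0351, 19.97, 287, 3.3108),
    (0.0306, 20.94, 301, 3.2607),
    (0.0267, 21.98, 316, 3.4015),
    (0.1454, 22.96, 330, 2.6912),
    (0.0252, 24.0, 345, 3.4576),
    (0.0187, 24.35, 350, 3.6048),
]


def best_by_validation(rows):
    """Return the row with the lowest validation loss (hypothetical helper)."""
    return min(rows, key=lambda row: row[3])


best = best_by_validation(ROWS)
final = ROWS[-1]

print(f"best:  epoch {best[1]}, step {best[2]}, val loss {best[3]:.4f}")
print(f"final: epoch {final[1]}, step {final[2]}, val loss {final[3]:.4f}")
```

On these numbers the lowest validation loss occurs at step 28, while the saved final step (350) matches the reported evaluation loss of 3.6048. Training loss keeps falling as validation loss rises, which is at least consistent with overfitting, though the card does not say which checkpoint was intended for release.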
generation_config.json CHANGED
@@ -1,7 +1,6 @@
  {
- "_from_model_config": true,
  "decoder_start_token_id": 0,
  "eos_token_id": 1,
  "pad_token_id": 0,
- "transformers_version": "4.36.2"
+ "transformers_version": "4.38.1"
  }
model-00001-of-00003.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:253bc6cc1a92ccf1ec601524f6a3740c377bcbbb66809644c4f22ea7f77395e4
+ oid sha256:6bb450ec4acf29b32eba2932af64f13fc455797b46e2f58598a7f2ed041d11cf
  size 4995019280
model-00002-of-00003.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:d6259da6f8312150f4d51abfd0b32476ed26eb77c95a8e30debb9222715392d9
+ oid sha256:06a9423b392ff84a80006cefff5b99bc835cf2ae09b97f6e3170bb49b3cc99bf
  size 4974952864
model-00003-of-00003.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:161d5f5b13f26bee15144c0e8a9a283270652daae8ad32d21e4ce82a4f70c937
+ oid sha256:ff58c21b0aa42dd33b4dc63df5fcb9e2c50c9352a087534f2eeaecca6048aa54
  size 1429331352
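The three `.safetensors` entries are Git LFS pointer files, each holding a `version`, an `oid`, and a `size` line. As a rough sketch, the pointers from the new side of this commit can be parsed and their sizes summed as follows. `parse_lfs_pointer` is a hypothetical helper written for illustration, not a Git LFS or Hugging Face API.

```python
def parse_lfs_pointer(text):
    """Parse the key/value lines of a Git LFS pointer file (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    hash_algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer contents copied from the new side of this commit.
POINTERS = [
    """version https://git-lfs.github.com/spec/v1
oid sha256:6bb450ec4acf29b32eba2932af64f13fc455797b46e2f58598a7f2ed041d11cf
size 4995019280""",
    """version https://git-lfs.github.com/spec/v1
oid sha256:06a9423b392ff84a80006cefff5b99bc835cf2ae09b97f6e3170bb49b3cc99bf
size 4974952864""",
    """version https://git-lfs.github.com/spec/v1
oid sha256:ff58c21b0aa42dd33b4dc63df5fcb9e2c50c9352a087534f2eeaecca6048aa54
size 1429331352""",
]

shards = [parse_lfs_pointer(pointer) for pointer in POINTERS]
total_bytes = sum(shard["size"] for shard in shards)

print(f"{len(shards)} shards, {total_bytes} bytes (~{total_bytes / 1e9:.2f} GB)")
```

Every shard keeps its byte size while its `oid` changes, which is what one would expect if the weights were retrained with the same tensor shapes and dtypes.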