theekshana committed on
Commit
a441dd7
1 Parent(s): 14ba587

Training complete

README.md ADDED
@@ -0,0 +1,161 @@
+ ---
+ license: apache-2.0
+ base_model: albert-base-v2
+ tags:
+ - classification
+ - sentiment
+ - sinhala
+ - news data
+ - generated_from_trainer
+ model-index:
+ - name: sinhala_albert
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # sinhala_albert
+
+ This model is a fine-tuned version of [albert-base-v2](https://huggingface.co/albert-base-v2) on an unknown dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 4.5337
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 5e-05
+ - train_batch_size: 128
+ - eval_batch_size: 128
+ - seed: 42
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - lr_scheduler_warmup_steps: 500
+ - num_epochs: 100
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss |
+ |:-------------:|:-----:|:----:|:---------------:|
+ | 1.0056 | 1.0 | 83 | 1.0130 |
+ | 0.9992 | 2.0 | 166 | 1.0021 |
+ | 0.9774 | 3.0 | 249 | 0.9789 |
+ | 0.9323 | 4.0 | 332 | 0.9695 |
+ | 0.863 | 5.0 | 415 | 0.9616 |
+ | 0.7944 | 6.0 | 498 | 0.9871 |
+ | 0.6328 | 7.0 | 581 | 1.0075 |
+ | 0.4705 | 8.0 | 664 | 1.4911 |
+ | 0.2834 | 9.0 | 747 | 1.5777 |
+ | 0.2278 | 10.0 | 830 | 1.6406 |
+ | 0.1751 | 11.0 | 913 | 1.7568 |
+ | 0.1657 | 12.0 | 996 | 1.7089 |
+ | 0.0974 | 13.0 | 1079 | 1.8463 |
+ | 0.1562 | 14.0 | 1162 | 1.9219 |
+ | 0.118 | 15.0 | 1245 | 1.9384 |
+ | 0.1044 | 16.0 | 1328 | 1.9971 |
+ | 0.0952 | 17.0 | 1411 | 2.1732 |
+ | 0.0877 | 18.0 | 1494 | 2.0566 |
+ | 0.0598 | 19.0 | 1577 | 2.4616 |
+ | 0.0762 | 20.0 | 1660 | 2.2672 |
+ | 0.1003 | 21.0 | 1743 | 2.3471 |
+ | 0.0627 | 22.0 | 1826 | 2.2526 |
+ | 0.0584 | 23.0 | 1909 | 2.7092 |
+ | 0.0679 | 24.0 | 1992 | 2.1629 |
+ | 0.0538 | 25.0 | 2075 | 2.5745 |
+ | 0.0723 | 26.0 | 2158 | 2.5667 |
+ | 0.0564 | 27.0 | 2241 | 2.4331 |
+ | 0.0662 | 28.0 | 2324 | 2.7811 |
+ | 0.0226 | 29.0 | 2407 | 2.8163 |
+ | 0.0224 | 30.0 | 2490 | 2.7452 |
+ | 0.0344 | 31.0 | 2573 | 2.6642 |
+ | 0.0519 | 32.0 | 2656 | 2.3490 |
+ | 0.0478 | 33.0 | 2739 | 2.7382 |
+ | 0.0436 | 34.0 | 2822 | 2.7556 |
+ | 0.0474 | 35.0 | 2905 | 2.5449 |
+ | 0.0355 | 36.0 | 2988 | 2.8280 |
+ | 0.0343 | 37.0 | 3071 | 2.9405 |
+ | 0.0283 | 38.0 | 3154 | 2.9870 |
+ | 0.0446 | 39.0 | 3237 | 3.0252 |
+ | 0.0288 | 40.0 | 3320 | 3.0866 |
+ | 0.0134 | 41.0 | 3403 | 3.1549 |
+ | 0.0328 | 42.0 | 3486 | 3.0168 |
+ | 0.0159 | 43.0 | 3569 | 2.8753 |
+ | 0.0155 | 44.0 | 3652 | 3.3455 |
+ | 0.0087 | 45.0 | 3735 | 3.4373 |
+ | 0.0296 | 46.0 | 3818 | 3.1949 |
+ | 0.0085 | 47.0 | 3901 | 3.1817 |
+ | 0.0048 | 48.0 | 3984 | 3.2233 |
+ | 0.0122 | 49.0 | 4067 | 3.5465 |
+ | 0.0024 | 50.0 | 4150 | 3.5276 |
+ | 0.0014 | 51.0 | 4233 | 3.5111 |
+ | 0.0121 | 52.0 | 4316 | 3.4483 |
+ | 0.0159 | 53.0 | 4399 | 3.8072 |
+ | 0.0027 | 54.0 | 4482 | 3.7448 |
+ | 0.0059 | 55.0 | 4565 | 3.9230 |
+ | 0.0083 | 56.0 | 4648 | 3.9245 |
+ | 0.0128 | 57.0 | 4731 | 3.8699 |
+ | 0.0116 | 58.0 | 4814 | 3.9957 |
+ | 0.0013 | 59.0 | 4897 | 3.8153 |
+ | 0.0013 | 60.0 | 4980 | 3.9093 |
+ | 0.0035 | 61.0 | 5063 | 4.0339 |
+ | 0.0028 | 62.0 | 5146 | 3.9929 |
+ | 0.0036 | 63.0 | 5229 | 4.1217 |
+ | 0.004 | 64.0 | 5312 | 4.0936 |
+ | 0.0014 | 65.0 | 5395 | 4.1109 |
+ | 0.0047 | 66.0 | 5478 | 4.1978 |
+ | 0.0005 | 67.0 | 5561 | 4.2320 |
+ | 0.0009 | 68.0 | 5644 | 4.2441 |
+ | 0.0027 | 69.0 | 5727 | 4.2670 |
+ | 0.0008 | 70.0 | 5810 | 4.2923 |
+ | 0.0013 | 71.0 | 5893 | 4.3101 |
+ | 0.0006 | 72.0 | 5976 | 4.3561 |
+ | 0.0024 | 73.0 | 6059 | 4.3419 |
+ | 0.0014 | 74.0 | 6142 | 4.3432 |
+ | 0.0011 | 75.0 | 6225 | 4.3302 |
+ | 0.0 | 76.0 | 6308 | 4.3439 |
+ | 0.0016 | 77.0 | 6391 | 4.3667 |
+ | 0.0026 | 78.0 | 6474 | 4.3803 |
+ | 0.0031 | 79.0 | 6557 | 4.3800 |
+ | 0.002 | 80.0 | 6640 | 4.3941 |
+ | 0.0008 | 81.0 | 6723 | 4.4071 |
+ | 0.0019 | 82.0 | 6806 | 4.4259 |
+ | 0.0013 | 83.0 | 6889 | 4.4436 |
+ | 0.0015 | 84.0 | 6972 | 4.4603 |
+ | 0.0009 | 85.0 | 7055 | 4.4706 |
+ | 0.0019 | 86.0 | 7138 | 4.4701 |
+ | 0.001 | 87.0 | 7221 | 4.4777 |
+ | 0.0007 | 88.0 | 7304 | 4.4905 |
+ | 0.0021 | 89.0 | 7387 | 4.4910 |
+ | 0.0012 | 90.0 | 7470 | 4.4959 |
+ | 0.0 | 91.0 | 7553 | 4.4990 |
+ | 0.0024 | 92.0 | 7636 | 4.5091 |
+ | 0.0031 | 93.0 | 7719 | 4.5115 |
+ | 0.0011 | 94.0 | 7802 | 4.5215 |
+ | 0.0 | 95.0 | 7885 | 4.5152 |
+ | 0.002 | 96.0 | 7968 | 4.5200 |
+ | 0.0013 | 97.0 | 8051 | 4.5293 |
+ | 0.0013 | 98.0 | 8134 | 4.5285 |
+ | 0.0023 | 99.0 | 8217 | 4.5339 |
+ | 0.002 | 100.0 | 8300 | 4.5337 |
+
+
+ ### Framework versions
+
+ - Transformers 4.41.0.dev0
+ - Pytorch 2.2.1+cu118
+ - Datasets 2.14.5
+ - Tokenizers 0.19.1
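The hyperparameters in the README are enough to sketch the learning-rate curve this run would have followed. The snippet below is a minimal sketch, not the Trainer's own code. It assumes that `lr_scheduler_type: linear` means linear warmup from zero followed by linear decay to zero, and it takes the totals of 8300 steps and 83 steps per epoch from the Step column of the results table. The function name `linear_schedule_lr` is made up for illustration.

```python
def linear_schedule_lr(step, base_lr=5e-5, warmup_steps=500, total_steps=8300):
    """Learning rate after `step` optimizer updates (assumed linear warmup, then linear decay)."""
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to base_lr.
        return base_lr * step / warmup_steps
    # Decay phase: ramp linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


STEPS_PER_EPOCH = 83  # read off the Step column: 83 updates per epoch

for epoch in (1, 5, 6, 7, 50, 100):
    step = epoch * STEPS_PER_EPOCH
    print(f"epoch {epoch:>3}  step {step:>4}  lr {linear_schedule_lr(step):.2e}")
```

Under these assumptions the 500 warmup steps end just after epoch 6 (step 498), while the lowest validation loss in the table, 0.9616, falls at epoch 5 (step 415). The validation loss therefore starts rising before the learning rate has reached its peak, and the final reported loss of 4.5337 comes from the last epoch rather than the best one.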
config.json ADDED
@@ -0,0 +1,44 @@
+ {
+   "_name_or_path": "albert-base-v2",
+   "architectures": [
+     "AlbertForSequenceClassification"
+   ],
+   "attention_probs_dropout_prob": 0,
+   "bos_token_id": 2,
+   "classifier_dropout_prob": 0.1,
+   "down_scale_factor": 1,
+   "embedding_size": 128,
+   "eos_token_id": 3,
+   "gap_size": 0,
+   "hidden_act": "gelu_new",
+   "hidden_dropout_prob": 0,
+   "hidden_size": 768,
+   "id2label": {
+     "0": "NEGATIVE",
+     "1": "NEUTRAL",
+     "2": "POSITIVE"
+   },
+   "initializer_range": 0.02,
+   "inner_group_num": 1,
+   "intermediate_size": 3072,
+   "label2id": {
+     "NEGATIVE": 0,
+     "NEUTRAL": 1,
+     "POSITIVE": 2
+   },
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 64,
+   "model_type": "albert",
+   "net_structure_type": 0,
+   "num_attention_heads": 12,
+   "num_hidden_groups": 1,
+   "num_hidden_layers": 12,
+   "num_memory_blocks": 0,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "problem_type": "single_label_classification",
+   "torch_dtype": "float32",
+   "transformers_version": "4.41.0.dev0",
+   "type_vocab_size": 2,
+   "vocab_size": 30522
+ }
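config.json declares three labels and `"problem_type": "single_label_classification"`, so each input gets one score per class and the highest one wins. The snippet below is a small dependency-free sketch of that last step. The logits are hypothetical values, not output from this model, and `decode` is an illustrative helper name.

```python
import math

# Label mapping copied from "id2label" in config.json (keys converted to int).
ID2LABEL = {0: "NEGATIVE", 1: "NEUTRAL", 2: "POSITIVE"}


def softmax(logits):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits):
    """Return the most likely label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


label, prob = decode([-1.2, 0.3, 2.1])  # hypothetical logits
print(label, round(prob, 3))

# Cross-entropy of a uniform guess over three classes.
print(round(math.log(3), 4))
```

For three classes a uniform guess has cross-entropy ln 3 ≈ 1.0986. Assuming the loss columns in the README report cross-entropy, that number is a useful reference point: the first-epoch losses of about 1.0 sit only slightly below it.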
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9b3a89aeca89da2c4f334d504b108fda2bf88acb25e08ea5eaa8d95b2e910215
+ size 47014252
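The three lines added for model.safetensors are a Git LFS pointer, not the weights themselves: a spec version, a SHA-256 object id, and the size in bytes. As a rough sketch, assuming the simple `key value` layout shown in the diff, the pointer can be read like this (`parse_lfs_pointer` is a hypothetical helper, not part of any Git LFS tooling):

```python
def parse_lfs_pointer(text):
    """Split a Git LFS pointer file into its version, hash algorithm, digest and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file, in bytes
    }


# Pointer contents copied from the model.safetensors diff.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:9b3a89aeca89da2c4f334d504b108fda2bf88acb25e08ea5eaa8d95b2e910215
size 47014252
"""

info = parse_lfs_pointer(pointer)
print(info["algo"], info["size"], f"{info['size'] / 2**20:.1f} MiB")
```

The size field says the stored weights file is 47014252 bytes, a little under 45 MiB.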
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,52 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "3": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "4": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "mask_token": "[MASK]",
+   "model_max_length": 64,
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "tokenizer_class": "PreTrainedTokenizerFast",
+   "unk_token": "[UNK]"
+ }
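Comparing this file with config.json shows one point worth checking. The tokenizer assigns `[PAD]` id 1 and `[UNK]` id 0, while config.json sets `"pad_token_id": 0`. The sketch below lines the two files up. The pairing of `bos_token_id` with `[CLS]` and `eos_token_id` with `[SEP]` is an assumption made for the comparison, and `find_mismatches` is an illustrative helper.

```python
# Special-token ids as listed under "added_tokens_decoder" in tokenizer_config.json.
TOKENIZER_IDS = {"[UNK]": 0, "[PAD]": 1, "[CLS]": 2, "[SEP]": 3, "[MASK]": 4}

# Id fields from config.json, each paired with the token it is assumed to refer to.
CONFIG_IDS = {
    "pad_token_id": ("[PAD]", 0),
    "bos_token_id": ("[CLS]", 2),  # assumption: bos corresponds to [CLS]
    "eos_token_id": ("[SEP]", 3),  # assumption: eos corresponds to [SEP]
}


def find_mismatches(tokenizer_ids, config_ids):
    """List config fields whose id differs from the id the tokenizer gives that token."""
    mismatches = []
    for field, (token, config_id) in config_ids.items():
        tokenizer_id = tokenizer_ids[token]
        if tokenizer_id != config_id:
            mismatches.append((field, token, config_id, tokenizer_id))
    return mismatches


for field, token, config_id, tokenizer_id in find_mismatches(TOKENIZER_IDS, CONFIG_IDS):
    print(f"{field}={config_id} in config.json, but the tokenizer assigns {token} id {tokenizer_id}")
```

Only `pad_token_id` is flagged. Whether this affects results depends on how padding is handled downstream, for example whether an attention mask covers the padded positions, so it is a prompt to verify rather than a confirmed bug.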
training_args.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:af3e71946d6e7c2d83054508a1622cab64e2180e10893fd50c6f6c84a24fd33e
+ size 5048