albarpambagio committed
Commit 93d5f34 • Parent(s): 1c85ede
Update README.md

README.md CHANGED
This commit removes the auto-generated "Model description" and "Intended uses & limitations" sections (each containing only "More information needed"), the "Training procedure" heading, and the previous training-hyperparameter list, and replaces them with the content below. The updated README.md:
model-index:
- name: distilbert-base-indonesian-finetuned-PRDECT-ID
  results: []
datasets:
- SEACrowd/prdect_id
language:
- id
metrics:
- perplexity
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# distilbert-base-indonesian-finetuned-PRDECT-ID

This model is a fine-tuned version of [cahya/distilbert-base-indonesian](https://huggingface.co/cahya/distilbert-base-indonesian) on [the PRDECT-ID dataset](https://www.kaggle.com/datasets/jocelyndumlao/prdect-id-indonesian-emotion-classification), a compilation of Indonesian product reviews annotated with emotion and sentiment labels. The reviews were gathered from Tokopedia, one of Indonesia's largest e-commerce platforms.
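As a hedged usage sketch (not from the original card): the snippet below loads the base checkpoint with a masked-language-modelling head, which is what the perplexity metric reported below suggests this fine-tune optimizes; the fine-tuned weights would be loaded the same way from their own repository id.

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM
import torch

# Base checkpoint named in this card; swap in the fine-tuned repository id to use this model.
checkpoint = "cahya/distilbert-base-indonesian"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForMaskedLM.from_pretrained(checkpoint)

# Illustrative Indonesian review with one masked token
# ("The item is good and the delivery is [MASK].").
text = f"Barangnya bagus dan pengirimannya {tokenizer.mask_token}."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Print the five most likely fillers for the masked position.
mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
top_ids = logits[0, mask_pos].topk(5).indices[0]
print(tokenizer.convert_ids_to_tokens(top_ids.tolist()))
```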
## Training and evaluation data
I split my dataframe `df` into training, validation, and test sets (`train_df`, `val_df`, `test_df`) using the `train_test_split` function from `sklearn.model_selection`. I set the test size to 20% for the initial split and then divided the held-out portion equally between the validation and test sets. Both splits are stratified on the label column (`stratify=df['label']`), so each split keeps the same class distribution as the original dataset; a sketch of the procedure follows.
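A sketch of that split; the 50/50 second split and the `random_state` value are my assumptions, since the card does not show the exact call:

```python
from sklearn.model_selection import train_test_split

# First split: hold out 20% of df, stratified on the label column.
train_df, holdout_df = train_test_split(
    df, test_size=0.2, stratify=df["label"], random_state=42  # random_state assumed
)

# Second split: divide the 20% hold-out equally into validation and test sets,
# again stratified so every split keeps the original class distribution.
val_df, test_df = train_test_split(
    holdout_df, test_size=0.5, stratify=holdout_df["label"], random_state=42
)
```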
|
36 |
|
|
|
37 |
|
38 |
### Training hyperparameters
The following hyperparameters were used during training (see the `TrainingArguments` sketch after this list):
- num_train_epochs: 5
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- warmup_steps: 500
- weight_decay: 0.01
- logging_dir: ./logs
- logging_steps: 10
- eval_strategy: epoch
- save_strategy: epoch
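These settings map onto `transformers.TrainingArguments` roughly as follows; this is a sketch rather than the exact training script, so the output directory, the tokenized datasets, and the data collator are placeholders, and the masked-LM setup is an assumption based on the perplexity metric:

```python
from transformers import TrainingArguments, Trainer

training_args = TrainingArguments(
    output_dir="./results",          # placeholder; the card does not state it
    num_train_epochs=5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    warmup_steps=500,
    weight_decay=0.01,
    logging_dir="./logs",
    logging_steps=10,
    eval_strategy="epoch",           # called evaluation_strategy before Transformers 4.41
    save_strategy="epoch",
)

trainer = Trainer(
    model=model,                     # DistilBERT model, loaded as sketched above
    args=training_args,
    train_dataset=train_dataset,     # tokenized datasets built from train_df / val_df (not shown)
    eval_dataset=val_dataset,
    data_collator=data_collator,     # e.g. DataCollatorForLanguageModeling for masked-LM training
)
trainer.train()
```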
### Training and evaluation results

The following table summarizes the training and validation loss over the epochs:

| Epoch | Training Loss | Validation Loss |
|-------|---------------|-----------------|
| 1     | 0.000100      | 0.000062        |
| 2     | 0.000000      | 0.000038        |
| 3     | 0.000000      | 0.000025        |
| 4     | 0.000000      | 0.000017        |
| 5     | 0.000000      | 0.000014        |

Train output:
- global_step: 235
- training_loss: 3.9409913424219185e-05
- train_runtime: 44.6774
- train_samples_per_second: 83.04
- train_steps_per_second: 5.26
- total_flos: 122954683514880.0
- train_loss: 3.9409913424219185e-05
- epoch: 5.0

Evaluation:
- eval_loss: 1.3968576240586117e-05
- eval_runtime: 0.3321
- eval_samples_per_second: 270.973
- eval_steps_per_second: 18.065
- epoch: 5.0

Perplexity: 1.0000139686738017
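The reported perplexity is simply the exponential of the evaluation loss; a quick check:

```python
import math

eval_loss = 1.3968576240586117e-05  # eval_loss reported above
perplexity = math.exp(eval_loss)
print(perplexity)                   # ≈ 1.0000139686738017, matching the value above
```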

The near-zero training and validation losses and a perplexity of almost exactly 1 indicate that the model fits the PRDECT-ID reviews extremely well and generalizes to the held-out validation data.
### Framework versions

- Transformers 4.41.2
- Pytorch 2.1.2
- Datasets 2.19.2
- Tokenizers 0.19.1