Update README.md
README.md
@@ -12,7 +12,8 @@ The model is the "small" version of GPT-2 (12-layer, 768-hidden, 12-heads) with
## Training details:
The model is a generative Transformer, following the GPT-2 architecture, trained from scratch on a large corpus of Greek text so that it can generate long stretches of contiguous, coherent text. Attention dropout with a rate of 0.1 is applied on all layers for regularization, together with L2 weight decay of 0.01. A batch size of 4 with gradients accumulated over 8 iterations gives an effective batch size of 32. The model is trained for 20 epochs with the Adam optimizer at a learning rate of 1e-4. The learning rate increases linearly from zero over the first 9000 updates and then decreases linearly under a linear schedule. The implementation is based on the open-source PyTorch-transformer library (HuggingFace 2019).
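The batch-size arithmetic and the warmup-then-decay learning rate described above can be sketched in plain Python. This is a minimal illustration, not the authors' training code: the function names are hypothetical, and `TOTAL_UPDATES` and the decay down to zero are assumptions, since the README gives neither the total number of updates nor the final learning rate. The shape resembles `get_linear_schedule_with_warmup` from the HuggingFace `transformers` library, though the README does not say that helper was used.

```python
# Hyperparameters stated in the training details.
BASE_LR = 1e-4
WARMUP_UPDATES = 9000
PER_STEP_BATCH = 4
ACCUMULATION_STEPS = 8

# Assumption: the README does not give the total number of updates.
# This value is chosen only to make the sketch runnable.
TOTAL_UPDATES = 100_000


def effective_batch_size(per_step_batch: int, accumulation_steps: int) -> int:
    """Number of examples contributing to each optimizer update."""
    return per_step_batch * accumulation_steps


def learning_rate(
    update: int,
    base_lr: float = BASE_LR,
    warmup: int = WARMUP_UPDATES,
    total: int = TOTAL_UPDATES,
) -> float:
    """Linear warmup from zero, then linear decay.

    Assumption: the decay ends at zero after `total` updates.
    """
    if update < warmup:
        # Warmup phase: scale linearly from 0 up to base_lr.
        return base_lr * update / warmup
    # Decay phase: scale linearly from base_lr down to 0.
    remaining = max(0, total - update)
    return base_lr * remaining / (total - warmup)


if __name__ == "__main__":
    print(effective_batch_size(PER_STEP_BATCH, ACCUMULATION_STEPS))
    for step in (0, 4500, 9000, 54_500, 100_000):
        print(step, learning_rate(step))
```

Under these assumptions the rate peaks at 1e-4 exactly when warmup ends (update 9000) and falls back to half of that midway through the decay phase.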
+## Cited in:
+- Gkolfopoulos, G.; Varlamis, I. (2022). Developing a News Classifier for Greek Using BERT. In: 2022 7th South-East Europe Design Automation, Computer Engineering, Computer Networks and Social Media Conference (SEEDA-CECNSM). IEEE, pp. 1-6.
- Alexandridis, G.; Varlamis, I.; Korovesis, K.; Caridakis, G.; Tsantilas, P. (2021). A Survey on Sentiment Analysis and Opinion Mining in Greek Social Media. Information, 12(8), 331. https://doi.org/10.3390/info12080331
- Aivatoglou, Georgios. (2022). Aspect-Based Sentiment Analysis in Greek Data. MSc Thesis, Aristotle University of Thessaloniki, Faculty of Sciences, School of Informatics, Intelligence Systems Lab. Supervising Professor: Dr. Ioannis Vlahavas. March 2022.