---
language: el
---

## gpt2-greek
## Dataset: 
The model is trained on a collection of almost 5 GB of Greek text, drawn mainly from Greek Wikipedia. The content is extracted using the WikiExtractor tool (Attardi, 2012). The dataset is organized into samples of 5 sentences each (about 3.7 million samples), and the end of each document is marked with the string `<|endoftext|>`, which gives the model paragraph information, as was done for the original GPT-2 training set by Radford et al. (2019). The input sentences are pre-processed and tokenized using byte-pair encoding with 22,000 merges.
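
The sample layout described above can be sketched as follows. This is only an illustration: the function name `build_samples`, the assumption that each document is already split into sentences, and the exact placement of the marker (appended to the last sample of each document) are hypothetical and not taken from the original preprocessing code.

```python
# Illustrative sketch only: groups pre-split sentences into samples of 5
# and marks the end of each document, as described in the text above.
# The helper name and the input format are assumptions.

EOD = "<|endoftext|>"  # end-of-document marker named in the model card


def build_samples(documents, sentences_per_sample=5):
    """Turn a list of documents (each a list of sentences) into text samples."""
    samples = []
    for sentences in documents:
        for start in range(0, len(sentences), sentences_per_sample):
            chunk = sentences[start:start + sentences_per_sample]
            text = " ".join(chunk)
            # Assumption: the marker closes the final sample of a document.
            if start + sentences_per_sample >= len(sentences):
                text += " " + EOD
            samples.append(text)
    return samples
```

A document with seven sentences would yield two samples under this sketch, and only the second one would end with the marker.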

## Model:
The model is the "small" version of GPT-2 (12 layers, hidden size 768, 12 attention heads); the only difference is that the maximum sequence length is set to 512 tokens instead of 1024.
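
To make the effect of the shorter context concrete, the sketch below counts parameters for the standard GPT-2 block layout (learned position embeddings, fused query/key/value projection, 4x feed-forward expansion). The function names are hypothetical, and the token embedding table is left out because the exact vocabulary size is not stated here.

```python
# Rough parameter arithmetic for the architecture described above.
# Assumes the standard GPT-2 layout; helper names are illustrative.


def position_embedding_params(max_positions, hidden_size):
    """Learned position embeddings: one vector per position."""
    return max_positions * hidden_size


def transformer_block_params(hidden_size):
    """Weights and biases of one GPT-2 block."""
    d = hidden_size
    attention = (d * 3 * d + 3 * d) + (d * d + d)   # fused QKV + output projection
    mlp = (d * 4 * d + 4 * d) + (4 * d * d + d)     # expand to 4d, project back
    layer_norms = 2 * (2 * d)                       # two LayerNorms (scale + bias)
    return attention + mlp + layer_norms


HIDDEN = 768
LAYERS = 12

blocks_total = LAYERS * transformer_block_params(HIDDEN)
saved = position_embedding_params(1024, HIDDEN) - position_embedding_params(512, HIDDEN)

print(f"parameters in the 12 blocks: {blocks_total:,}")
print(f"position-embedding parameters saved by using 512 tokens: {saved:,}")
```

Under these assumptions, halving the maximum sequence length changes only the position embedding table; the transformer blocks themselves are unaffected.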

## Training details: 
A generative Transformer model following GPT-2 is trained from scratch on a large corpus of Greek text so that it can generate long stretches of contiguous, coherent text. Attention dropout with a rate of 0.1 is applied on all layers for regularization, together with L2 weight decay of 0.01. A batch size of 4 with gradients accumulated over 8 iterations gives an effective batch size of 32. The model is optimized with Adam at a learning rate of 1e-4 and trained for 20 epochs. The learning rate increases linearly from zero over the first 9,000 updates and then decreases linearly. The implementation is based on the open-source PyTorch-Transformers library (HuggingFace, 2019).
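
The batch-size arithmetic and the warmup schedule can be sketched in plain Python. Two points are assumptions rather than facts from this card: that the linear decay runs down to zero at the last update, and that the total number of updates can be estimated from the approximate figure of 3.7 million samples. The function name `learning_rate_at` is hypothetical.

```python
# Sketch of the schedule described above: linear warmup, then linear decay.
# Assumption: decay ends at zero on the final update.

PER_STEP_BATCH = 4
ACCUMULATION_STEPS = 8
EFFECTIVE_BATCH = PER_STEP_BATCH * ACCUMULATION_STEPS  # 32, as stated in the text

BASE_LR = 1e-4
WARMUP_UPDATES = 9000

# Rough estimate from "about 3.7 million samples" and 20 epochs.
APPROX_SAMPLES = 3_700_000
EPOCHS = 20
TOTAL_UPDATES = (APPROX_SAMPLES // EFFECTIVE_BATCH) * EPOCHS


def learning_rate_at(update, total_updates=TOTAL_UPDATES,
                     base_lr=BASE_LR, warmup=WARMUP_UPDATES):
    """Learning rate after `update` optimizer updates."""
    if update < warmup:
        return base_lr * update / warmup
    remaining = max(0, total_updates - update)
    return base_lr * remaining / max(1, total_updates - warmup)


for step in (0, 4500, 9000, TOTAL_UPDATES // 2, TOTAL_UPDATES):
    print(step, learning_rate_at(step))
```

One optimizer update here corresponds to 8 accumulated forward/backward passes of 4 samples each, so the warmup of 9,000 updates covers roughly 288,000 samples.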

## Cited in: 
- Alexandridis, G.; Varlamis, I.; Korovesis, K.; Caridakis, G.; Tsantilas, P. (2021). A Survey on Sentiment Analysis and Opinion Mining in Greek Social Media. Information, 12(8), 331. https://doi.org/10.3390/info12080331
- Aivatoglou, Georgios. (2022). Aspect-Based Sentiment Analysis in Greek Data. MSc Thesis, Aristotle University of Thessaloniki, Faculty of Sciences, School of Informatics, Intelligence Systems Lab. Supervising Professor: Dr. Ioannis Vlahavas. March 2022.