---
library_name: transformers
base_model: cardiffnlp/twitter-xlm-roberta-base-sentiment-multilingual
tags:
- generated_from_trainer
metrics:
- accuracy
- f1
- precision
- recall
model-index:
- name: democracy-sentiment-analysis-turkish-roberta
  results: []
license: mit
language:
- tr
---

# democracy-sentiment-analysis-turkish-roberta

This model is a fine-tuned version of [cardiffnlp/twitter-xlm-roberta-base-sentiment-multilingual](https://huggingface.co/cardiffnlp/twitter-xlm-roberta-base-sentiment-multilingual) on a Turkish democracy-sentiment dataset (described below).
It achieves the following results on the evaluation set:
- Loss: 0.4469
- Accuracy: 0.8184
- F1: 0.8186
- Precision: 0.8224
- Recall: 0.8184

## Model description

This model fine-tunes the base model cardiffnlp/twitter-xlm-roberta-base-sentiment-multilingual for sentiment analysis of Turkish text, with a focus on democracy-related content. It classifies texts into three sentiment categories:

- positive
- neutral
- negative

## Intended uses & limitations

This model is well suited to analyzing sentiment in Turkish texts that discuss democracy, governance, and related political discourse. Because it was fine-tuned on this domain, performance on Turkish texts from unrelated domains may be lower.

## Training and evaluation data

The training dataset consists of 30,000 rows gathered from various sources, including Kaggle, Hugging Face, Ekşi Sözlük, and synthetic data generated with state-of-the-art LLMs. The dataset is multilingual in origin, with texts in English, Russian, and Turkish; all non-Turkish texts were translated into Turkish. In total, the data represents a broad spectrum of democratic discourse drawn from 30 different sources.

## How to use

To use this model for sentiment analysis, you can leverage the Hugging Face `pipeline` for text classification as shown below:

```python
from transformers import pipeline

# Load the model from the Hugging Face Hub
sentiment_model = pipeline(
    task="text-classification",
    model="yeniguno/democracy-sentiment-analysis-turkish-roberta",
)

# "Best we give all of the state's power to a single leader."
response = sentiment_model("En iyisi devletin tüm gücünü tek bir lidere verelim")
print(response)
# [{'label': 'negative', 'score': 0.9617443084716797}]

# "Having many different voices may seem time-consuming and complicated, but the
# freedom and diversity that democracy brings are society's real strength."
response = sentiment_model(
    "Birçok farklı sesin çıkması zaman alıcı ve karmaşık görünebilir, "
    "ancak demokrasinin getirdiği özgürlük ve çeşitlilik, toplumun gerçek gücüdür."
)
print(response)
# [{'label': 'positive', 'score': 0.958978533744812}]

# "The weather is rainy today." (unrelated to democracy, hence neutral)
response = sentiment_model("Bugün hava yağmurlu.")
print(response)
# [{'label': 'neutral', 'score': 0.9915837049484253}]
```

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a sketch of how they map onto a `Trainer` run is given at the end of this card):
- learning_rate: 1e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 2

### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1     | Precision | Recall |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|:---------:|:------:|
| 0.7236        | 1.0   | 802  | 0.4797          | 0.8039   | 0.8031 | 0.8037    | 0.8039 |
| 0.424         | 2.0   | 1604 | 0.4469          | 0.8184   | 0.8186 | 0.8224    | 0.8184 |

### Framework versions

- Transformers 4.44.2
- PyTorch 2.4.0+cu121
- Datasets 2.21.0
- Tokenizers 0.19.1
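
### Reproducing the training setup

The script below is a minimal sketch of how the hyperparameters above map onto a Hugging Face `Trainer` run; it is not the original training script. The tiny inline dataset, its label assignments, and the tokenization settings are illustrative placeholders only.

```python
import numpy as np
from datasets import Dataset
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

base = "cardiffnlp/twitter-xlm-roberta-base-sentiment-multilingual"
tokenizer = AutoTokenizer.from_pretrained(base)
# The base model already carries a 3-way head (negative/neutral/positive)
model = AutoModelForSequenceClassification.from_pretrained(base, num_labels=3)

# Placeholder data; the real dataset has ~30,000 labeled Turkish texts.
# Label ids follow the base model's mapping: 0=negative, 1=neutral, 2=positive.
raw = Dataset.from_dict({
    "text": [
        "Demokrasinin getirdiği çeşitlilik toplumun gerçek gücüdür.",
        "Bugün hava yağmurlu.",
        "En iyisi tüm gücü tek bir lidere verelim.",
    ],
    "label": [2, 1, 0],
})
dataset = raw.map(lambda batch: tokenizer(batch["text"], truncation=True), batched=True)

def compute_metrics(eval_pred):
    # Weighted-average metrics, mirroring the scores reported above
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="weighted", zero_division=0
    )
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1": f1,
        "precision": precision,
        "recall": recall,
    }

args = TrainingArguments(
    output_dir="democracy-sentiment-analysis-turkish-roberta",
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=2,  # effective train batch size: 32
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=2,
    eval_strategy="epoch",
    seed=42,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset,
    eval_dataset=dataset,  # placeholder; use a held-out split in practice
    tokenizer=tokenizer,   # the default collator then pads batches dynamically
    compute_metrics=compute_metrics,
)
trainer.train()
```

Note that the total train batch size of 32 comes from `per_device_train_batch_size=16` combined with `gradient_accumulation_steps=2`, and that Adam's betas and epsilon are left at their defaults, which match the values listed above.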