---
license: mit
language:
- en
metrics:
- accuracy
---
# Model Card for POLLCHECK/RoBERTa-classifier

## Model Description

This RoBERTa model has been fine-tuned for a binary classification task that labels statements as 0 ("biased/fake") or 1 ("unbiased/real"). The model is based on the RoBERTa architecture, a robustly optimized BERT pretraining approach.

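The label convention above can be checked directly against the checkpoint's configuration. The following is a minimal sketch (not part of the released files); depending on how the checkpoint was exported, `config.id2label` may contain generic names such as `LABEL_0`/`LABEL_1` rather than human-readable ones, in which case index 0 corresponds to "biased/fake" and index 1 to "unbiased/real".

```python
from transformers import AutoConfig

# Load only the configuration to inspect the label mapping.
config = AutoConfig.from_pretrained("POLLCHECK/RoBERTa-classifier")

print(config.num_labels)  # expected: 2
print(config.id2label)    # may print {0: 'LABEL_0', 1: 'LABEL_1'} if no names were set
```
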
## Intended Use

Primary Use: This model is intended for the classification of textual statements into two categories: biased and unbiased. It is suitable for analyzing news articles, editorials, and opinion pieces.

Users: This model can be used by data scientists, journalists, content moderators, and social media platforms to detect bias in text.

## Model Details

Architecture: The model uses the RoBERTa-base architecture.

Training Data: The model was trained on a curated dataset comprising news articles, editorials, and opinion pieces labeled as biased or unbiased by domain experts.
|
Performance Metrics: per-class precision, recall, and F1 scores are reported in the Results section below.

## Usage

The repository provides the following resources:

- [Sample News Bias Dataset (CSV)](https://huggingface.co/POLLCHECK/RoBERTa-classifier/blob/main/News_Bias_Samples.csv)
- [Inference Script for RoBERTa Classifier (Python)](https://huggingface.co/POLLCHECK/RoBERTa-classifier/blob/main/inference-roberta.py)

The snippet below loads the model and tokenizer and classifies a few example statements:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "POLLCHECK/RoBERTa-classifier"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

texts = [
    "Religious Extremists Threaten Our Way of Life.",
    "Public Health Officials are working."
]

for text in texts:
    # Tokenize and run a forward pass without gradient tracking
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
    with torch.no_grad():
        outputs = model(**inputs)
    probabilities = torch.softmax(outputs.logits, dim=-1)
    # Index 0 corresponds to "biased/fake", index 1 to "unbiased/real"
    predicted_label = "biased" if probabilities[0][0] > 0.5 else "unbiased"
    print(f"Text: {text}\nPredicted label: {predicted_label}")
```
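
For quick experiments, the same checkpoint can also be used through the `transformers` `pipeline` API. This is a minimal sketch rather than part of the released inference script; note that the returned label names depend on the checkpoint's `id2label` configuration and may appear as generic `LABEL_0`/`LABEL_1` (index 0 = biased/fake, index 1 = unbiased/real, as above).

```python
from transformers import pipeline

# Text-classification pipeline built on the fine-tuned checkpoint.
classifier = pipeline("text-classification", model="POLLCHECK/RoBERTa-classifier")

# Each result is a dict like {"label": "...", "score": ...};
# map LABEL_0 -> biased/fake and LABEL_1 -> unbiased/real if generic names are returned.
results = classifier([
    "Religious Extremists Threaten Our Way of Life.",
    "Public Health Officials are working.",
])
print(results)
```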
|

## Results

The following table presents the evaluation metrics for each class along with macro averages:

| Class              | Precision | Recall | F1-Score |
|--------------------|-----------|--------|----------|
| Biased/fake (0)    | 0.93      | 0.96   | 0.94     |
| Unbiased/real (1)  | 0.96      | 0.92   | 0.94     |
| Macro Avg          | 0.94      | 0.94   | 0.94     |
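
For reference, metrics in this format (per-class precision, recall, and F1 together with macro averages) can be reproduced from held-out predictions with scikit-learn's `classification_report`. The sketch below uses small hypothetical `y_true`/`y_pred` arrays purely for illustration; the actual evaluation data is not distributed with this model.

```python
from sklearn.metrics import classification_report

# Hypothetical gold labels and predictions (0 = biased/fake, 1 = unbiased/real).
y_true = [0, 0, 1, 1, 0, 1, 1, 0]
y_pred = [0, 0, 1, 0, 0, 1, 1, 0]

# Prints per-class precision/recall/F1 plus macro and weighted averages,
# in the same format as the table above.
print(classification_report(y_true, y_pred, target_names=["biased/fake", "unbiased/real"]))
```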