---
datasets:
- hatexplain
language:
- en
pipeline_tag: text-classification
metrics:
- accuracy
- f1
- precision
- recall
---
# BERT for hate speech classification
The model is based on BERT and classifies a text as either **toxic** or **non-toxic**. It achieved an **F1** score of **0.81** and an **accuracy** of **0.77**.
The model was fine-tuned on the HateXplain dataset, available here: https://huggingface.co/datasets/hatexplain
## How to use
```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline
# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained('tum-nlp/bert-hateXplain')
model = AutoModelForSequenceClassification.from_pretrained('tum-nlp/bert-hateXplain')
# Create the pipeline for classification
hate_classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
# Predict
hate_classifier("I like you. I love you")
```
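
The pipeline returns a list with one dict per input, each containing a `label` and a `score`. A small post-processing helper can turn that output into a boolean toxicity flag. This is only a sketch: the label string `"LABEL_1"` for the toxic class is an assumption, so check `model.config.id2label` for the actual mapping before relying on it.

```python
def is_toxic(predictions, toxic_label="LABEL_1", threshold=0.5):
    """Flag a text as toxic from pipeline output.

    `predictions` is the list of {'label', 'score'} dicts the pipeline
    returns for one input. `toxic_label` ('LABEL_1' here) is an assumed
    name -- verify it against model.config.id2label.
    """
    # Pick the highest-scoring label and compare it to the toxic one
    top = max(predictions, key=lambda p: p["score"])
    return top["label"] == toxic_label and top["score"] >= threshold

# Example with a pipeline-shaped result:
# is_toxic([{"label": "LABEL_1", "score": 0.92}])  -> True
```

Passing `top_k=None` to the pipeline call returns scores for all labels instead of only the top one, which is useful when you want the full probability distribution rather than a single prediction.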