---
datasets:
- hatexplain
language:
- en
pipeline_tag: text-classification
metrics:
- accuracy
- f1
- precision
- recall
---
# BERT for hate speech classification
This model is based on BERT and classifies a text as **toxic** or **non-toxic**. It achieved an **F1** score of **0.81** and an **Accuracy** of **0.77**.

The model was fine-tuned on the HateXplain dataset: https://huggingface.co/datasets/hatexplain

## How to use

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained('tum-nlp/bert-hateXplain')
model = AutoModelForSequenceClassification.from_pretrained('tum-nlp/bert-hateXplain')

# Create the pipeline for classification
hate_classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)

# Predict; the pipeline returns a list of dicts, one per input,
# each with a 'label' and a 'score' field
hate_classifier("I like you. I love you")
```
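If you call the model directly instead of through the pipeline, `model(**inputs).logits` returns raw scores, and a softmax converts them to class probabilities. A minimal sketch in plain Python (the logit values below are illustrative, not actual model output):

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for one input, as returned by model(**inputs).logits
logits = [1.2, -0.3]
probs = softmax(logits)
# The probabilities sum to 1; the index of the largest one is the predicted class
predicted_class = probs.index(max(probs))
```

The same conversion is what `pipeline("text-classification", ...)` applies internally before reporting the top label and its score.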