Indo Spam Chatbot
Model Overview
Indo Spam Chatbot is a fine-tuned spam detection model based on the Gemma 2 2B architecture. This model is specifically designed for identifying spam messages in WhatsApp chatbot interactions. It has been fine-tuned using a dataset of 40,000 spam messages collected over a year. The dataset includes two labels:
- Spam
- Non-spam
The model supports detecting spam across multiple categories, such as:
- Offensive and abusive words
- Profane language
- Gibberish words and numbers
- Spam links
- And more
How To Use
The model can be loaded with transformers. Because the example below passes device_map="auto", accelerate must also be installed:
pip install -U transformers accelerate torch
Then you can use the model like this:
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Example spam messages
sentences = ["adsfwcasdfad",
             "kak bisa depo di link ini: http://dewa.site/dewa/dewi",
             "p",
             "1234"]

# Load the tokenizer and model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained('kasyfilalbar/indo-spam-chatbot')
model = AutoModelForSequenceClassification.from_pretrained('kasyfilalbar/indo-spam-chatbot', device_map="auto")

# Tokenize the sentences and move the tensors to the model's device
encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt').to(model.device)

# Run inference without tracking gradients
with torch.no_grad():
    model_output = model(**encoded_input)

# Pick the highest-scoring class index for each sentence
logits = model_output.logits
labels = torch.argmax(logits, dim=1)
print(labels.tolist())
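The example above prints raw class indices. A minimal sketch of turning logits into readable labels with a confidence score follows. It uses only the standard library, so it can be tried without downloading the model. The ID2LABEL mapping, the describe_predictions helper, the example logits, and the sample messages are all assumptions made for illustration. Check model.config.id2label to find out which index actually means spam for this model.

```python
import math

# Hypothetical mapping (an assumption): verify against model.config.id2label.
ID2LABEL = {0: "non-spam", 1: "spam"}

def softmax(row):
    """Convert one row of logits into probabilities that sum to 1."""
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def describe_predictions(sentences, logits, id2label=ID2LABEL):
    """Pair each sentence with its most likely label and that label's probability."""
    results = []
    for text, row in zip(sentences, logits):
        probs = softmax(row)
        best = max(range(len(probs)), key=probs.__getitem__)
        results.append({"text": text, "label": id2label[best], "confidence": probs[best]})
    return results

# Made-up logits and messages, for illustration only (not real model output).
example_logits = [[-1.2, 2.3], [0.4, -0.1]]
for r in describe_predictions(["adsfwcasdfad", "halo kak"], example_logits):
    print(f"{r['label']:>8}  {r['confidence']:.2f}  {r['text']}")
```

With the real model, the logits tensor from the example above can be converted with .tolist() and passed in place of example_logits.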
Repository
For more information about the code, visit https://github.com/Kasyfil97/indo-spam-chatbot
Base model: google/gemma-2-2b