Multi-lingual BERT Bengali Name Entity Recognition
mBERT-Bengali-NER
is a transformer-based Bengali NER model build with bert-base-multilingual-uncased model and Wikiann Datasets.
How to Use
from transformers import AutoTokenizer, AutoModelForTokenClassification
from transformers import pipeline
tokenizer = AutoTokenizer.from_pretrained("sagorsarker/mbert-bengali-ner")
model = AutoModelForTokenClassification.from_pretrained("sagorsarker/mbert-bengali-ner")
nlp = pipeline("ner", model=model, tokenizer=tokenizer, grouped_entities=True)
example = "আমি জাহিদ এবং আমি ঢাকায় বাস করি।"
ner_results = nlp(example)
print(ner_results)
Label and ID Mapping
Label ID | Label |
---|---|
0 | O |
1 | B-PER |
2 | I-PER |
3 | B-ORG |
4 | I-ORG |
5 | B-LOC |
6 | I-LOC |
Training Details
- mBERT-Bengali-NER trained with Wikiann datasets
- mBERT-Bengali-NER trained with transformers-token-classification script
- mBERT-Bengali-NER total trained 5 epochs.
- Trained in Kaggle GPU
Evaluation Results
Model | F1 | Precision | Recall | Accuracy | Loss |
---|---|---|---|---|---|
mBert-Bengali-NER | 0.97105 | 0.96769 | 0.97443 | 0.97682 | 0.12511 |
- Downloads last month
- 1,400
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social
visibility and check back later, or deploy to Inference Endpoints (dedicated)
instead.