xlm-roberta-base-ner-silvanus

This model is a fine-tuned version of xlm-roberta-base on the Indonesian NER dataset. It achieves the following results on the evaluation set:

Loss: 0.0567
Precision: 0.9189
Recall: 0.9273
F1: 0.9231
Accuracy: 0.9859

Model description

The XLM-RoBERTa model was proposed in Unsupervised Cross-lingual Representation Learning at Scale by Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke Zettlemoyer and Veselin Stoyanov. It is based on Facebook's RoBERTa model released in 2019. It is a large multi-lingual language model, trained on 2.5TB of filtered CommonCrawl data.

Developed by: See associated paper
Model type: Multi-lingual model
Language(s): XLM-RoBERTa is a multilingual model trained on 100 different languages (see the GitHub Repo for the full list); this model is fine-tuned on a dataset in Indonesian
License: More information needed
Related Models: RoBERTa, XLM
Parent Model: XLM-RoBERTa
Resources for more information: GitHub Repo

Intended uses & limitations

This model can be used to extract location, date and time information from social media text (Twitter, etc.) across languages. Its main limitation is that it was fine-tuned only on an Indonesian training set; it is intended to be tested on four other languages (English, Spanish, Italian and Slovak) using zero-shot transfer learning.
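
A minimal inference sketch, assuming the checkpoint is loaded through the Hugging Face transformers token-classification pipeline. MODEL_ID is a hypothetical placeholder (the Hub ID or local path is not given in this card), and load_ner_pipeline and extract_entities are illustrative helpers rather than library functions.

```python
from typing import Callable, Dict, List

# Hypothetical placeholder: substitute the real Hub ID or local path of this checkpoint.
MODEL_ID = "path/to/xlm-roberta-base-ner-silvanus"


def load_ner_pipeline(model_id: str = MODEL_ID):
    """Build a token-classification pipeline (needs `transformers` and the model weights)."""
    # Imported lazily so extract_entities can be used without the library installed.
    from transformers import pipeline

    # aggregation_strategy="simple" merges sub-word pieces into whole entity spans.
    return pipeline("token-classification", model=model_id, aggregation_strategy="simple")


def extract_entities(
    ner: Callable[[str], List[Dict]], text: str, min_score: float = 0.5
) -> List[Dict]:
    """Run `ner` on `text` and keep the spans scoring at least `min_score`."""
    entities = []
    for span in ner(text):
        if span["score"] >= min_score:
            entities.append(
                {
                    "type": span["entity_group"],  # LOC, DAT or TIM
                    "text": text[span["start"]:span["end"]],
                    "start": span["start"],
                    "end": span["end"],
                }
            )
    return entities


# Usage (reads or downloads the weights, so it is left commented out):
# ner = load_ner_pipeline()
# print(extract_entities(ner, "Flooding reported in Jakarta on Monday at 10 am"))
```

The min_score threshold of 0.5 is an arbitrary default for illustration; for zero-shot use on non-Indonesian text it would likely need tuning per language.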

Training and evaluation data

This model was fine-tuned on Indonesian NER datasets.

Abbreviation	Description
O	Outside of a named entity
B-LOC	Beginning of a location right after another location
I-LOC	Location
B-DAT	Beginning of a date right after another date
I-DAT	Date
B-TIM	Beginning of a time right after another time
I-TIM	Time
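
The tag scheme above can be turned into entity spans with a small decoder. The sketch below is an illustration under stated assumptions, not the authors' post-processing: tags_to_spans is a hypothetical helper, and it treats any B- tag, or any I- tag whose type differs from the previous token, as the start of a new entity. The example tokens and tags are made up for illustration and are not model output.

```python
from typing import List, Optional, Tuple


def tags_to_spans(tags: List[str]) -> List[Tuple[str, int, int]]:
    """Group per-token tags into (type, start, end) spans, with `end` exclusive."""
    spans = []
    current: Optional[str] = None  # type of the entity currently open, if any
    start = 0
    for i, tag in enumerate(tags):
        if tag == "O":
            prefix, label = "O", None
        else:
            prefix, _, label = tag.partition("-")
        # A new entity begins on B-, or on I- when the type changes or nothing is open.
        begins = label is not None and (prefix == "B" or label != current)
        if current is not None and (label is None or begins):
            spans.append((current, start, i))
            current = None
        if begins:
            current, start = label, i
    if current is not None:
        spans.append((current, start, len(tags)))
    return spans


# Illustrative input only (hypothetical tags, not produced by the model).
tokens = ["Banjir", "di", "Jakarta", "Selatan", "kemarin", "pukul", "10.00"]
tags = ["O", "O", "I-LOC", "I-LOC", "I-DAT", "O", "I-TIM"]
for label, begin, end in tags_to_spans(tags):
    print(label, " ".join(tokens[begin:end]))
```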

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 3
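
To make the linear schedule concrete, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps (no warmup is listed above) and takes the total of 2481 steps from the training results table; linear_lr is a hypothetical helper, not a library function.

```python
def linear_lr(
    step: int,
    base_lr: float = 2e-5,
    total_steps: int = 2481,
    warmup_steps: int = 0,  # assumption: the card lists no warmup
) -> float:
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start and at the end of each of the 3 epochs (827 steps each).
for step in (0, 827, 1654, 2481):
    print(step, linear_lr(step))
```

Under these assumptions the rate falls by one third of its initial value per epoch and reaches zero at the final step.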

Training results

Training Loss	Epoch	Step	Validation Loss	Precision	Recall	F1	Accuracy
0.1394	1.0	827	0.0559	0.8808	0.9257	0.9027	0.9842
0.0468	2.0	1654	0.0575	0.9107	0.9190	0.9148	0.9849
0.0279	3.0	2481	0.0567	0.9189	0.9273	0.9231	0.9859
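
As a quick consistency check, the F1 column can be recomputed from the precision and recall columns as their harmonic mean. The values below are copied from the table above; f1_score is an illustrative helper.

```python
# (precision, recall, reported F1) for epochs 1 to 3, copied from the table above.
ROWS = [
    (0.8808, 0.9257, 0.9027),
    (0.9107, 0.9190, 0.9148),
    (0.9189, 0.9273, 0.9231),
]


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


for epoch, (p, r, reported) in enumerate(ROWS, start=1):
    print(f"epoch {epoch}: recomputed F1 = {f1_score(p, r):.4f}, reported = {reported:.4f}")
```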

Framework versions

Transformers 4.35.0
PyTorch 2.1.0+cu118
Datasets 2.14.6
Tokenizers 0.14.1
