metadata
language:
- pt
This model was distilled from BERTimbau.
Usage
from transformers import AutoTokenizer  # or BertTokenizer
from transformers import AutoModelForPreTraining  # or BertForPreTraining, to load the pretraining heads
from transformers import AutoModel  # or BertModel, for BERT without the pretraining heads

# Load the model with its pretraining heads, plus the cased tokenizer.
model = AutoModelForPreTraining.from_pretrained('adalbertojunior/distilbert-portuguese-cased')
tokenizer = AutoTokenizer.from_pretrained('adalbertojunior/distilbert-portuguese-cased', do_lower_case=False)
You should fine-tune it on your own data.
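As a rough illustration of what fine-tuning could look like, the sketch below loads the checkpoint with a sequence-classification head and runs a single gradient step. It is not the author's training recipe: the task (text classification), the helper names `load_classifier` and `training_step`, and the default `num_labels=2` are assumptions made for this example.

```python
MODEL_ID = "adalbertojunior/distilbert-portuguese-cased"


def load_classifier(num_labels: int = 2):
    """Load the tokenizer and the encoder with a new classification head.

    num_labels=2 is an assumed default for a binary task.
    """
    # Imported inside the function so that defining it stays cheap.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, do_lower_case=False)
    # The classification head is randomly initialised and must be trained.
    model = AutoModelForSequenceClassification.from_pretrained(
        MODEL_ID, num_labels=num_labels
    )
    return tokenizer, model


def training_step(tokenizer, model, optimizer, texts, labels):
    """Run one gradient step on a batch and return the loss as a float."""
    import torch

    model.train()  # from_pretrained returns the model in eval mode
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    outputs = model(**batch, labels=torch.tensor(labels))
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return outputs.loss.item()
```

A typical call would create the pair with `load_classifier()`, build an optimizer such as `torch.optim.AdamW(model.parameters(), lr=2e-5)` (the learning rate is an assumed starting point, not a tuned value), and call `training_step` in a loop over batches of your own labelled texts.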
On some tasks, it reaches up to 99% of the accuracy of the original BERTimbau.
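To make the relative figure concrete, here is a small sketch of how such a ratio can be computed from two accuracy scores. The function name `relative_accuracy` and the scores are hypothetical and only illustrate the arithmetic; they are not measured results for this model.

```python
def relative_accuracy(student_acc: float, teacher_acc: float) -> float:
    """Return the student's accuracy as a fraction of the teacher's."""
    if teacher_acc <= 0:
        raise ValueError("teacher_acc must be positive")
    return student_acc / teacher_acc


# Hypothetical scores, for illustration only.
teacher_acc = 0.90   # assumed BERTimbau accuracy on some task
student_acc = 0.891  # assumed accuracy of the distilled model on the same task

print(f"{relative_accuracy(student_acc, teacher_acc):.1%}")
```

When comparing the two models on your own task, evaluate both on the same held-out split so that the ratio is meaningful.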
@misc {adalberto_ferreira_barbosa_junior_2024,
author = { {Adalberto Ferreira Barbosa Junior} },
title = { distilbert-portuguese-cased (Revision df1fa7a) },
year = 2024,
url = { https://huggingface.co/adalbertojunior/distilbert-portuguese-cased },
doi = { 10.57967/hf/3041 },
publisher = { Hugging Face }
}