Marathi DistilBERT
Model description
This model is an adaptation of DistilBERT (Victor Sanh et al., 2019) for the Marathi language. This version of Marathi-DistilBERT was trained from scratch on approximately 11.2 million sentences.
DISCLAIMER
This model has not been thoroughly tested and may produce biased or inappropriate output. User discretion is advised.
Training data
The training data has been extracted from a variety of sources, mainly including:
- OSCAR corpus
- Marathi Newspapers
- Marathi storybooks and articles
The data was cleaned by removing text in all languages other than Marathi while preserving common punctuation.
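The exact cleaning rules are not documented here, so the following is only a minimal sketch of one way such a filter could look. It assumes that Marathi text can be approximated by the Devanagari Unicode block (U+0900 to U+097F), which also covers Hindi and other Devanagari-script languages, and that "common punctuation" means the small set listed in `COMMON_PUNCTUATION`. The function name and the punctuation set are hypothetical.

```python
import re

# Assumption: the punctuation marks worth keeping; the original set is not documented.
COMMON_PUNCTUATION = ".,!?;:'\"()-"

# Assumption: Marathi is approximated by the Devanagari block (U+0900-U+097F).
# Anything that is not Devanagari, an ASCII digit, whitespace, or listed
# punctuation is treated as foreign text.
_DISALLOWED = re.compile(
    "[^\u0900-\u097F0-9\\s" + re.escape(COMMON_PUNCTUATION) + "]"
)
_WHITESPACE = re.compile(r"\s+")


def clean_marathi_text(text: str) -> str:
    """Drop disallowed characters, then collapse runs of whitespace."""
    stripped = _DISALLOWED.sub(" ", text)
    return _WHITESPACE.sub(" ", stripped).strip()


if __name__ == "__main__":
    print(clean_marathi_text("हा Hello खरोखर चांगला आहे."))
```

A character-level filter like this cannot tell Marathi from other languages written in Devanagari; a language-identification step would be needed for that.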
Training procedure
The model was trained from scratch on a v3-8 TPU using the Adam optimizer with a learning rate of 1e-4 and the default β1 and β2 values of 0.9 and 0.999, respectively. The total batch size was 256 and the masking probability was 15%.
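To illustrate what a 15% masking probability means for masked language modeling, here is a simplified sketch. It is not the authors' training code: it only replaces each selected token with `[MASK]`, whereas BERT-style pipelines often also swap some selected tokens for random ones or leave them unchanged, and the text above does not say which variant was used. The function name `mask_tokens` and its `seed` parameter are hypothetical.

```python
import random

MASK_TOKEN = "[MASK]"
MASK_PROBABILITY = 0.15  # the masking probability stated above


def mask_tokens(tokens, mask_probability=MASK_PROBABILITY, seed=None):
    """Independently mask each token with the given probability.

    Returns (masked_tokens, labels). labels holds the original token at
    masked positions and None elsewhere, so a loss would only be computed
    on the masked positions.
    """
    rng = random.Random(seed)
    masked, labels = [], []
    for token in tokens:
        if rng.random() < mask_probability:
            masked.append(MASK_TOKEN)
            labels.append(token)
        else:
            masked.append(token)
            labels.append(None)
    return masked, labels


if __name__ == "__main__":
    sentence = "हा खरोखर चांगला दिवस आहे .".split()
    print(mask_tokens(sentence, seed=0))
```

Because each token is masked independently, a short sentence may end up with no masked tokens at all; the 15% figure holds on average over a large corpus.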
Example
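from transformers import pipeline

# Load the fill-mask pipeline with the model and its tokenizer from the Hub.
fill_mask = pipeline(
    "fill-mask",
    model="DarshanDeshpande/marathi-distilbert",
    tokenizer="DarshanDeshpande/marathi-distilbert",
)

# Predict candidate tokens for the [MASK] position.
predictions = fill_mask("हा खरोखर चांगला [MASK] आहे.")
print(predictions)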
BibTeX entry and citation info
@misc{sanh2020distilbert,
    title={DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter},
    author={Victor Sanh and Lysandre Debut and Julien Chaumond and Thomas Wolf},
    year={2020},
    eprint={1910.01108},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
Authors
1. Darshan Deshpande: GitHub, LinkedIn
2. Harshavardhan Abichandani: GitHub, LinkedIn