teamapocalypseml/regben2ipa-byt5small

smji committed on Feb 29

Commit 20d4698 (1 parent: 65e91c7)

Update README.md

Files changed (1)

README.md +49 -0

@@ -1,3 +1,52 @@
 ---
 license: mit
+language:
+- bn
+metrics:
+- wer
+- cer
+tags:
+- seq2seq
+- ipa
+- bengali
+- byt5
 ---
+# Regional Bengali text to IPA transcription - byt5-small
+This is a fine-tuned version of [byt5-small](https://huggingface.co/google/byt5-small) for generating IPA transcriptions from regional Bengali text.
+It was fine-tuned on the dataset of the competition [“ভাষামূল: মুখের ভাষার খোঁজে”](https://www.kaggle.com/competitions/regipa/overview) by Bengali.AI.
+Best scores achieved on the competition leaderboards:
+- **Public score**: 0.01995
+- **Private score**: 0.02072
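The front matter lists `wer` and `cer` as metrics, but the card does not say how they were computed. The following is a minimal sketch assuming the standard definitions (edit distance divided by reference length, over words for WER and characters for CER). The function names are illustrative, and the competition's own scoring may differ in details such as normalization.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# One substituted word out of four reference words
print(wer("a b c d", "a x c d"))  # 0.25
```

Lower values are better for both metrics; a score of 0 means the hypothesis matches the reference exactly.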
+## Loading & using the model
+```python
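+# Load the tokenizer and model directly
+from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
+tokenizer = AutoTokenizer.from_pretrained("smji/ben2ipa-byt5small")
+model = AutoModelForSeq2SeqLM.from_pretrained("smji/ben2ipa-byt5small")
+# The input text must have the format: <district> <bengali_text>
+text = "<Chittagong> bengali_text_here"
+text_ids = tokenizer(text, return_tensors="pt").input_ids
+# A bare forward pass model(text_ids) fails without decoder inputs; use generate() instead.
+# max_new_tokens=256 is an illustrative limit, not a value taken from the training setup.
+output_ids = model.generate(text_ids, max_new_tokens=256)
+print(tokenizer.decode(output_ids[0], skip_special_tokens=True))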
+```
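The `<district> <bengali_text>` input format can be wrapped in a small helper so that several sentences are transcribed in one batch. This is a sketch only: `format_input` and `transcribe_batch` are hypothetical names, `model` and `tokenizer` are assumed to be loaded with `from_pretrained` as in the snippet above, and the `max_new_tokens` default is an arbitrary choice.

```python
def format_input(district, text):
    """Build the '<district> <bengali_text>' string the model card describes."""
    return f"<{district.strip()}> {text.strip()}"


def transcribe_batch(model, tokenizer, pairs, max_new_tokens=256):
    """Generate IPA strings for a list of (district, text) pairs.

    `model` and `tokenizer` are the objects returned by
    AutoModelForSeq2SeqLM.from_pretrained and AutoTokenizer.from_pretrained.
    """
    inputs = [format_input(district, text) for district, text in pairs]
    # padding=True pads to the longest input and returns an attention mask
    encoded = tokenizer(inputs, return_tensors="pt", padding=True)
    output_ids = model.generate(**encoded, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```

Because ByT5 tokenizes at the byte level, outputs are long in token terms, so a generous `max_new_tokens` helps avoid truncated transcriptions.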
+## Using the pipeline
+```python
+# Use a pipeline as a high-level helper
+from transformers import pipeline
+# device=-1 runs on CPU; pass 0 to use the first GPU
+pipe = pipeline("text2text-generation", model="smji/ben2ipa-byt5small", device=-1)
+```
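The card creates the pipeline but does not show a call. A `text2text-generation` pipeline returns a list of dictionaries with a `generated_text` key, so a call could be sketched as follows. The `transcribe` helper is a hypothetical name, `pipe` is assumed to be the pipeline object built above, and `max_new_tokens=256` is an arbitrary limit.

```python
def transcribe(pipe, district, text, max_new_tokens=256):
    """Run one '<district> <bengali_text>' input through a text2text pipeline."""
    prompt = f"<{district}> {text}"
    # Generation keyword arguments are forwarded to the underlying generate() call
    result = pipe(prompt, max_new_tokens=max_new_tokens)
    return result[0]["generated_text"]
```

With a real pipeline this would be used as `transcribe(pipe, "Chittagong", "bengali_text_here")`.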
+## Credits
+Done by [S M Jishanul Islam](https://github.com/S-M-J-I), [Sadia Ahmmed](https://github.com/sadia-ahmmed), [Sahid Hossain Mustakim](https://github.com/sratul35)