--- language: - as - bn - brx - doi - en - gom - gu - hi - kn - ks - kas - mai - ml - mr - mni - mnb - ne - or - pa - sa - sat - sd - snd - ta - te - ur language_details: >- asm_Beng, ben_Beng, brx_Deva, doi_Deva, eng_Latn, gom_Deva, guj_Gujr, hin_Deva, kan_Knda, kas_Arab, kas_Deva, mai_Deva, mal_Mlym, mar_Deva, mni_Beng, mni_Mtei, npi_Deva, ory_Orya, pan_Guru, san_Deva, sat_Olck, snd_Arab, snd_Deva, tam_Taml, tel_Telu, urd_Arab tags: - indictrans2 - translation - ai4bharat - multilingual license: mit datasets: - flores-200 - IN22-Gen - IN22-Conv metrics: - bleu - chrf - chrf++ - comet inference: false --- # IndicTrans2 This is the model card of IndicTrans2 Indic-En Distilled 200M variant. Please refer to [section 7.6: Distilled Models](https://openreview.net/forum?id=vfT4YuzAYA) in the TMLR submission for further details on model training, data and metrics. ### Usage Instructions Please refer to the [github repository](https://github.com/AI4Bharat/IndicTrans2/tree/main/huggingface_inference) for a detail description on how to use HF compatible IndicTrans2 models for inference. **Note: IndicTrans2 is not compatible with AutoTokenizer, therefore we provide [IndicTransTokenizer](https://github.com/VarunGumma/IndicTransTokenizer)** ### Citation If you consider using our work then please cite using: ``` @article{gala2023indictrans, title={IndicTrans2: Towards High-Quality and Accessible Machine Translation Models for all 22 Scheduled Indian Languages}, author={Jay Gala and Pranjal A Chitale and A K Raghavan and Varun Gumma and Sumanth Doddapaneni and Aswanth Kumar M and Janki Atul Nawale and Anupama Sujatha and Ratish Puduppully and Vivek Raghavan and Pratyush Kumar and Mitesh M Khapra and Raj Dabre and Anoop Kunchukuttan}, journal={Transactions on Machine Learning Research}, issn={2835-8856}, year={2023}, url={https://openreview.net/forum?id=vfT4YuzAYA}, note={} } ```