Multilingual capabilities?

by joujiboi - opened Sep 17

I tried the model on Japanese, and in my limited testing it understands the language fine. That said, it would be useful to know how much priority multilingual capability was given when creating the dataset and the tokeniser. For example, how much of the dataset is non-English? And how much of it is Japanese specifically?
