Languages?
Can someone share the supported languages for this model?
The model was trained specifically for English-language retrieval.
@lukemerrick Are there any plans to bring out a multilingual version? That would be very valuable! If you have recommendations for multilingual alternatives, those are welcome too.
What language(s) are you trying to support, @BramVanroy? I don't think my team maintains any kind of public roadmap I can share, but we always appreciate user input!
@lukemerrick I'd be really happy to get some more powerful models for Dutch out there. I've been training a lot of generative models, but for retrieval/embedding we could really use new, high-quality models like the ones Snowflake has provided for English!
By the way, the Netherlands Forensic Institute has released some datasets translated to Dutch in case anyone is looking for training data: https://huggingface.co/NetherlandsForensicInstitute