@visheratin on Hugging Face: "Keep stacking cool stuff and getting better results! After I changed the…"

visheratin

posted an update Mar 10

Keep stacking cool stuff and getting better results! After I changed the standard vision encoder to SigLIP, NLLB-CLIP got a 10% average performance improvement. And now I have added matryoshka layers (Matryoshka Representation Learning, MRL: https://arxiv.org/abs/2205.13147) to enable smaller embeddings and got another 6% performance boost! Plus, thanks to MRL, 4.5x smaller embeddings retain 90%+ quality.

The large model is finally SoTA for both image and text multilingual retrieval!

The models are available on the hub:
- visheratin/nllb-siglip-mrl-base
- visheratin/nllb-siglip-mrl-large
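
For readers new to MRL, here is a minimal sketch of the general idea from the paper: a matryoshka-trained embedding stays useful when only its leading components are kept. It does not show how these particular models expose smaller embeddings. The function names `truncate_embedding` and `cosine`, the vector values, and the dimensions are all hypothetical and chosen only for illustration.

```python
import math


def truncate_embedding(vec, dim):
    """Keep the first `dim` components and L2-normalize the result.

    Assumption: the embedding was trained with a matryoshka-style loss,
    so its leading components form a usable smaller embedding.
    """
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)
    return [x / norm for x in head]


def cosine(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)


# Toy vectors, NOT real model outputs.
image_emb = [0.9, 0.1, 0.3, -0.2, 0.05, 0.02, -0.01, 0.03]
text_emb = [0.8, 0.2, 0.25, -0.1, -0.04, 0.01, 0.02, -0.02]

# Compare the image-text score at full size and at half size.
full_score = cosine(image_emb, text_emb)
small_score = cosine(
    truncate_embedding(image_emb, 4),
    truncate_embedding(text_emb, 4),
)
print(f"full: {full_score:.3f}, truncated: {small_score:.3f}")
```

Both the image and the text embedding must be cut to the same size before they are compared, and renormalizing after the cut keeps cosine scores on the usual scale.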

Tom-Neverwinter

Mar 12

hmm, what happens if you throw moondream2 on? https://huggingface.co/vikhyatk/moondream2

visheratin

Mar 12

It uses the same vision encoder, so I expect that nothing changes.

In this post

visheratin Alexander Visheratin
Tom-Neverwinter s