Article Releasing the largest multilingual open pretraining dataset By Pclanglais • 8 days ago • 94
Marqo-Ecommerce-Embeddings Collection State-of-the-art embedding models fine-tuned for the ecommerce domain. +67% increase in evaluation metrics vs ViT-B-16-SigLIP. • 10 items • Updated 7 days ago • 16
Article PyTorchModelHubMixin: Bridging the Gap for Custom AI Models on Hugging Face By not-lain • 10 days ago • 11
Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models Paper • 2411.04996 • Published 14 days ago • 48
Article ColPali: Efficient Document Retrieval with Vision Language Models 👀 By manu • Jul 5 • 161
Article Recipe: Preparing Multilingual Speech Datasets for TTS Training By PHBJT • 17 days ago • 14
AMD-OLMo Collection AMD-OLMo is a series of 1-billion-parameter language models, based on OLMo, trained by AMD on AMD Instinct™ MI250 GPUs. • 4 items • Updated 21 days ago • 16
VidToMe: Video Token Merging for Zero-Shot Video Editing Paper • 2312.10656 • Published Dec 17, 2023 • 10
Article Hugging Face welcomes the Aya Expanse family of multilingual models By ariG23498 • 28 days ago • 10
Article Releasing Outlines-core 0.1.0: structured generation in Rust and Python about 1 month ago • 41
Article MedEmbed: Fine-Tuned Embedding Models for Medical / Clinical IR By abhinand • Oct 20 • 30
Article Fancy Stateful Metaflow Service + UI on Google Colab ? By Aurelien-Morgan • Oct 14 • 4
Article How to build a custom text classifier without days of human labeling By sdiazlor • Oct 17 • 55