Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages Paper • 2410.16153 • Published 20 days ago • 42
MixEval-X: Any-to-Any Evaluations from Real-World Data Mixtures Paper • 2410.13754 • Published 24 days ago • 74
Llama 3.2 Collection This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated 17 days ago • 453
Qwen2.5 Collection Qwen2.5 language models, including pretrained and instruction-tuned models of 7 sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 45 items • Updated Sep 18 • 307
F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching Paper • 2410.06885 • Published Oct 9 • 40
OmniBooth: Learning Latent Control for Image Synthesis with Multi-modal Instruction Paper • 2410.04932 • Published Oct 7 • 9
Distilling an End-to-End Voice Assistant Without Instruction Training Data Paper • 2410.02678 • Published Oct 3 • 22
Atlas-Chat: Adapting Large Language Models for Low-Resource Moroccan Arabic Dialect Paper • 2409.17912 • Published Sep 26 • 20
Addition is All You Need for Energy-efficient Language Models Paper • 2410.00907 • Published Oct 1 • 143
Models That Speak Darija Collection A collection of models I tested and verified that could speak darija. some models are Proprietary models like: im-a-good-gpt2-chatbot • 3 items • Updated Sep 30 • 1
EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions Paper • 2409.18042 • Published Sep 26 • 36
MonoFormer: One Transformer for Both Diffusion and Autoregression Paper • 2409.16280 • Published Sep 24 • 17
YesBut: A High-Quality Annotated Multimodal Dataset for evaluating Satire Comprehension capability of Vision-Language Models Paper • 2409.13592 • Published Sep 20 • 48
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27 • 602