Moshi v0.1 Release Collection MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 13 items • Updated Sep 18 • 215
Parler-TTS: fully open-source high-quality TTS Collection If you want to find out more about how these models were trained and even fine-tune them yourself, check-out the Parler-TTS repository on GitHub. • 7 items • Updated Aug 8 • 46
Minitron Collection A family of compressed models obtained via pruning and knowledge distillation • 9 items • Updated Oct 3 • 59
Chameleon Collection Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR. • 2 items • Updated Jul 9 • 27
xLAM models Collection xLAM: A Family of Large Action Models to Empower AI Agent Systems: https://github.com/SalesforceAIResearch/xLAM • 11 items • Updated 10 days ago • 41
SEED-Story: Multimodal Long Story Generation with Large Language Model Paper • 2407.08683 • Published Jul 11 • 22
Idefics2 🐶 Collection Idefics2-8B is a foundation vision-language model. In this collection, you will find the models, datasets and demo related to its creation. • 11 items • Updated May 6 • 88
RecurrentGemma: Moving Past Transformers for Efficient Open Language Models Paper • 2404.07839 • Published Apr 11 • 41
CameraCtrl: Enabling Camera Control for Text-to-Video Generation Paper • 2404.02101 • Published Apr 2 • 22
Awesome Document AI Collection A collection of open-source document AI 📄 📝 📈 • 27 items • Updated Mar 11 • 73
Learning to Decode Collaboratively with Multiple Language Models Paper • 2403.03870 • Published Mar 6 • 18