Join the conversation

Join the community of Machine Learners and AI enthusiasts.

Sign Up
lamhieu 
posted an update Jun 14
Post
2880
Wow, this is amazing! 🤯
Samba is a powerful hybrid model with an unlimited context length, combining Mamba, MLP, Sliding Window Attention, and MLP stacking. Samba largest version, Samba-3.8B, trained on 3.2 trillion tokens, excels in benchmarks like MMLU, GSM8K, and HumanEval, and shines in long-context tasks with minimal tuning.
---
Official implementation of "Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling"
Github: https://github.com/microsoft/Samba
In this post