Recent Mamba Papers
[NB: Notes are from TuringPost]
Paper • 2403.09977 • Published • 9
Note: The paper makes Mamba more suitable for deployment on resource-constrained devices by introducing an efficient 2D scanning method and a dual-pathway module for balanced global-local feature extraction. Results show a significant reduction in FLOPs while maintaining strong accuracy.
Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference
Paper • 2403.14520 • Published • 33
Note: The paper extends Mamba into a multi-modal large language model capable of jointly reasoning over vision and language. Experiments demonstrate competitive performance on vision-language tasks, with faster inference than Transformer-based models.
SiMBA: Simplified Mamba-Based Architecture for Vision and Multivariate Time series
Paper • 2403.15360 • Published • 11
Note: The paper presents a simplified Mamba-based architecture that addresses stability issues when scaling Mamba to larger sizes. The key innovation is EinFFT, a novel channel mixing technique that ensures stable optimization. SiMBA shows strong results on vision tasks and multivariate time series forecasting, closing the gap with state-of-the-art Transformers.