Aria: An Open Multimodal Native Mixture-of-Experts Model Paper • 2410.05993 • Published 9 days ago • 99
Introducing RWKV - An RNN with the advantages of a transformer Article • Published May 15, 2023 • 12
A Time Series is Worth 64 Words: Long-term Forecasting with Transformers Paper • 2211.14730 • Published Nov 27, 2022 • 1