Sequence Parallelism: Long Sequence Training from System Perspective • Paper 2105.13120 • Published May 26, 2021