Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer • Paper • 2305.16380 • Published May 25, 2023