Seq2seq + Attention
PyTorch implementation of "Neural Machine Translation by Jointly Learning to Align and Translate" (Bahdanau et al., 2015). The model is trained on the Multi30k de-en dataset, with SentencePiece as the tokenizer.
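For readers unfamiliar with the paper, the core idea is additive attention: at each decoding step, every encoder output is scored against the current decoder state, the scores are normalized with a softmax, and the resulting weights produce a context vector. The following is a minimal sketch of that mechanism, not necessarily the code in this repo. The class name `AdditiveAttention`, the layer names, the dimensions, and the `src_mask` argument are all assumptions chosen for illustration.

```python
import torch
import torch.nn as nn


class AdditiveAttention(nn.Module):
    """Illustrative additive (Bahdanau-style) attention; names are hypothetical."""

    def __init__(self, enc_dim: int, dec_dim: int, attn_dim: int):
        super().__init__()
        self.W_enc = nn.Linear(enc_dim, attn_dim, bias=False)  # projects encoder outputs
        self.W_dec = nn.Linear(dec_dim, attn_dim, bias=False)  # projects decoder state
        self.v = nn.Linear(attn_dim, 1, bias=False)            # reduces to a scalar score

    def forward(self, dec_state, enc_outputs, src_mask=None):
        # dec_state:   (batch, dec_dim)
        # enc_outputs: (batch, src_len, enc_dim)
        # src_mask:    (batch, src_len) bool, True for real tokens (assumed convention)
        energy = torch.tanh(self.W_enc(enc_outputs) + self.W_dec(dec_state).unsqueeze(1))
        scores = self.v(energy).squeeze(-1)                    # (batch, src_len)
        if src_mask is not None:
            # Padding positions get -inf so they receive zero weight after softmax.
            scores = scores.masked_fill(~src_mask, float("-inf"))
        weights = torch.softmax(scores, dim=-1)                # each row sums to 1
        # Weighted sum of encoder outputs gives the context vector.
        context = torch.bmm(weights.unsqueeze(1), enc_outputs).squeeze(1)
        return context, weights


# Toy usage with random tensors (shapes are arbitrary).
torch.manual_seed(0)
attn = AdditiveAttention(enc_dim=16, dec_dim=8, attn_dim=12)
enc_outputs = torch.randn(2, 5, 16)
dec_state = torch.randn(2, 8)
src_mask = torch.tensor([[True] * 5, [True] * 3 + [False] * 2])
context, weights = attn(dec_state, enc_outputs, src_mask)
```

Stacking the `weights` rows collected over all decoding steps gives a (target length × source length) matrix, which is the kind of data an attention heatmap visualizes.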
Here's the attention heatmap of a random sample from the test set: