Lossless Acceleration for Seq2seq Generation with Aggressive Decoding • arXiv:2205.10350 • Published May 20, 2022
Blockwise Parallel Decoding for Deep Autoregressive Models • arXiv:1811.03115 • Published Nov 7, 2018
Fast Transformer Decoding: One Write-Head is All You Need • arXiv:1911.02150 • Published Nov 6, 2019
ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment • arXiv:2403.05135 • Published Mar 8, 2024
SPBERT: An Efficient Pre-training BERT on SPARQL Queries for Question Answering over Knowledge Graphs • arXiv:2106.09997 • Published Jun 18, 2021
GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints • arXiv:2305.13245 • Published May 22, 2023
Chain of Thought Empowers Transformers to Solve Inherently Serial Problems • arXiv:2402.12875 • Published Feb 20, 2024
A Thorough Examination of Decoding Methods in the Era of LLMs • arXiv:2402.06925 • Published Feb 10, 2024