Atom: Low-bit Quantization for Efficient and Accurate LLM Serving Paper • 2310.19102 • Published Oct 29, 2023
Reducing Activation Recomputation in Large Transformer Models Paper • 2205.05198 • Published May 10, 2022