Model Compression and Efficient Inference for Large Language Models: A Survey • Paper 2402.09748 • Published Feb 15