The Impact of Hyperparameters on Large Language Model Inference Performance: An Evaluation of vLLM and HuggingFace Pipelines Paper • 2408.01050 • Published Aug 2, 2024 • 8 • 4
PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU Paper • 2312.12456 • Published Dec 16, 2023 • 41 • 4
Investigating Answerability of LLMs for Long-Form Question Answering Paper • 2309.08210 • Published Sep 15, 2023 • 12 • 1