LLM Optimization - a gaspardthrl Collection

LLM Optimization

updated 12 days ago

A Survey on Efficient Inference for Large Language Models

Paper • 2404.14294 • Published Apr 22 • 2

Atom: Low-bit Quantization for Efficient and Accurate LLM Serving

Paper • 2310.19102 • Published Oct 29, 2023 • 10

Reducing Activation Recomputation in Large Transformer Models

Paper • 2205.05198 • Published May 10, 2022