Hugging Quants

AI & ML interests

Optimised quants for high-throughput deployments! Compatible with Transformers, TGI & vLLM 🤗

Organization Card

Community About org cards

Welcome to the home of exciting quantized models! We'd love to see increased adoption of powerful state-of-the-art open models, and quantization is a key component to make them work on more types of hardware.

Resources:

Llama 3.1 Quantized Models: Optimised Quants of Llama 3.1 for high-throughput deployments! Compatible with Transformers, TGI & VLLM 🤗.
Hugging Face Llama Recipes: A set of minimal recipes to get started with Llama 3.1.

Collections 3

models 19

hugging-quants/gemma-2-9b-it-AWQ-INT4

Text Generation • Updated Oct 17 • 2.3k • 4

hugging-quants/Mixtral-8x7B-Instruct-v0.1-AWQ-INT4

Text Generation • Updated Oct 7 • 101

hugging-quants/Llama-3.2-1B-Instruct-Q4_K_M-GGUF

Text Generation • Updated Sep 25 • 3.15k • 13

hugging-quants/Llama-3.2-1B-Instruct-Q8_0-GGUF

Text Generation • Updated Sep 25 • 304k • 15

hugging-quants/Llama-3.2-3B-Instruct-Q4_K_M-GGUF

Text Generation • Updated Sep 25 • 4.98k • 18

hugging-quants/Llama-3.2-3B-Instruct-Q8_0-GGUF

Text Generation • Updated Sep 25 • 37.6k • 40

hugging-quants/Meta-Llama-3.1-405B-BNB-NF4

Text Generation • Updated Sep 16 • 39 • 2

hugging-quants/Meta-Llama-3.1-405B-Instruct-BNB-NF4

Text Generation • Updated Sep 16 • 1.08k • 5

hugging-quants/Meta-Llama-3.1-405B-BNB-NF4-BF16

Text Generation • Updated Sep 16 • 1.12k • 2

hugging-quants/Meta-Llama-3.1-405B-Instruct-AWQ-INT4

Text Generation • Updated Sep 13 • 123k • 37

datasets

None public yet