Model Card for Model ID
This is Meta's Llama 2 7B quantized to 2-bit precision with AutoGPTQ, through its integration in Hugging Face Transformers.
Model Details
Model Description
- Developed by: The Kaitchup
- Model type: Causal language model (Llama 2)
- Language(s) (NLP): English
- License: Apache 2.0, Llama 2 license agreement
Model Sources
The method and code used to quantize the model are explained here: Quantize and Fine-tune LLMs with GPTQ Using Transformers and TRL
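For orientation, here is a minimal sketch of what 2-bit GPTQ quantization through Transformers can look like. The linked article remains the authoritative description of the method. The base checkpoint name, the "c4" calibration dataset, and the output directory are assumptions made for illustration, not details confirmed by this card.

```python
# Assumed GPTQ settings: only bits=2 comes from this card,
# the calibration dataset is a guess.
GPTQ_SETTINGS = {"bits": 2, "dataset": "c4"}


def quantize_llama2_2bit(
    base_model_id="meta-llama/Llama-2-7b-hf",  # assumed base checkpoint
    output_dir="llama-2-7b-gptq-2bit",         # hypothetical output path
):
    """Quantize a causal LM to 2-bit with GPTQ and save the result."""
    # Imported lazily: needs transformers, optimum, auto-gptq, and a GPU.
    from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    gptq_config = GPTQConfig(tokenizer=tokenizer, **GPTQ_SETTINGS)

    # Passing a GPTQConfig triggers calibration and quantization at load time.
    model = AutoModelForCausalLM.from_pretrained(
        base_model_id,
        quantization_config=gptq_config,
        device_map="auto",
    )

    model.save_pretrained(output_dir)
    tokenizer.save_pretrained(output_dir)
    return model, tokenizer
```

Calling `quantize_llama2_2bit()` downloads the base model and runs the calibration pass, so it needs access to the gated Llama 2 weights and enough GPU memory for the unquantized model.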
Uses
This model is pre-trained only; it has not been fine-tuned. You can fine-tune it with PEFT by training adapters on top of the quantized weights. Note that 2-bit quantization significantly degrades the performance of Llama 2 relative to the original model.
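A minimal sketch of adapter-based fine-tuning with PEFT, assuming LoRA adapters on the attention projections. The repository ID of this model is not stated in the card, so it is left as a parameter; the LoRA hyperparameters are illustrative values, not the author's settings.

```python
# Illustrative LoRA hyperparameters (assumptions, not taken from this card).
LORA_SETTINGS = {
    "r": 16,
    "lora_alpha": 16,
    "lora_dropout": 0.05,
    "bias": "none",
    "task_type": "CAUSAL_LM",
    # Attention projection layers in the Llama architecture.
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
}


def load_with_lora(quantized_model_id):
    """Load a GPTQ-quantized model and attach trainable LoRA adapters."""
    # Imported lazily: needs transformers, peft, optimum, auto-gptq, and a GPU.
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(quantized_model_id)

    # The quantization settings are read from the checkpoint's own config.
    model = AutoModelForCausalLM.from_pretrained(
        quantized_model_id, device_map="auto"
    )

    # Freeze the quantized base weights and prepare the model for training.
    model = prepare_model_for_kbit_training(model)

    # Only the adapter weights are trainable after this call.
    model = get_peft_model(model, LoraConfig(**LORA_SETTINGS))
    return model, tokenizer
```

The returned model can be passed to a standard training loop or trainer. Because the base weights stay frozen in 2-bit form, the adapters can only partly compensate for the quality lost to quantization.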
Other versions
Model Card Contact