7B AWQ
Collection
These models are selected for their compatibility with small 12GB memory GPUs.
This repo contains AWQ model files for IkariDev and Undi's Noromaid 7B v0.4 DPO.
These files were quantised using hardware kindly provided by SolidRusT Networks.
AWQ is an efficient, accurate and blazing-fast low-bit weight quantization method that currently supports 4-bit quantization. It offers faster Transformers-based inference than GPTQ, with quality equivalent to or better than the most commonly used GPTQ settings.
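To give a rough sense of why 4-bit quantization matters for smaller GPUs, the sketch below estimates the memory taken by the weights alone at 16-bit and 4-bit precision. It assumes a nominal 7 billion parameters (the "7B" label taken at face value, not an exact count) and ignores quantization scales and zero-points, the KV cache, and activations, so real usage will be somewhat higher. The function name `weight_memory_gib` is illustrative only.

```python
def weight_memory_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GiB."""
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / (1024 ** 3)


# Assumption: "7B" is treated as exactly 7e9 parameters for illustration.
NOMINAL_PARAMS = 7e9

fp16_gib = weight_memory_gib(NOMINAL_PARAMS, 16)
awq4_gib = weight_memory_gib(NOMINAL_PARAMS, 4)

print(f"FP16 weights:  {fp16_gib:.1f} GiB")
print(f"4-bit weights: {awq4_gib:.1f} GiB")
```

Under these assumptions the weights shrink from about 13 GiB to about 3.3 GiB, which leaves headroom for the KV cache and activations on a 12GB card.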
AWQ models are currently supported on Linux and Windows, with NVIDIA GPUs only. macOS users should use GGUF models instead.
It is supported by:
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
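The template above follows the format commonly known as ChatML. As a minimal sketch, it can be filled in with plain Python string formatting. The helper name `build_prompt` and the example messages are hypothetical, and the trailing newline after `assistant` is an assumption about where generation should begin.

```python
# Template copied from the prompt format above; the final "\n" is an assumption.
CHATML_TEMPLATE = (
    "<|im_start|>system\n"
    "{system_message}<|im_end|>\n"
    "<|im_start|>user\n"
    "{prompt}<|im_end|>\n"
    "<|im_start|>assistant\n"
)


def build_prompt(system_message: str, prompt: str) -> str:
    """Fill the template with a system message and a single user turn."""
    return CHATML_TEMPLATE.format(system_message=system_message, prompt=prompt)


if __name__ == "__main__":
    # Hypothetical example messages, for illustration only.
    print(build_prompt("You are a helpful assistant.", "Tell me about AWQ."))
```

The returned string can be passed to whichever inference backend is in use; the model's reply is expected to continue after the final `assistant` line.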
Base model: NeverSleep/Noromaid-7B-0.4-DPO