Qwen2.5-0.5B-Instruct

Introduction

This model is based on the Qwen2.5-0.5B-Instruct model and is quantized in 4bits in the EXL2 format using the AutoQuant system : https://colab.research.google.com/drive/1b6nqC7UZVt8bx4MksX7s656GXPM-eWw4

You can learn more about the EXL2 format here : https://github.com/turboderp/exllamav2 Feel free to use it as you want

Downloads last month: -; Downloads are not tracked for this model. How to track

Inference Examples

Text Generation

This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

Model tree for Volko76/Qwen2.5-0.5B-Instruct-EXL2-4bits

Base model

Qwen/Qwen2.5-0.5B

Finetuned

(31)

this model

Collection including Volko76/Qwen2.5-0.5B-Instruct-EXL2-4bits

EXL2 Quantizations

Collection

A collection of models quantized for EXL2, one of the fastest quantisation method. https://github.com/turboderp/exllamav2 • 8 items • Updated 1 day ago