
⭐ GLiClass: Generalist and Lightweight Model for Sequence Classification

This is an efficient zero-shot classifier inspired by the GLiNER work. It matches the performance of a cross-encoder while being more compute-efficient, because classification is done in a single forward pass.

It can be used for topic classification, sentiment analysis and as a reranker in RAG pipelines.

The model was trained on synthetic data and can be used in commercial applications.

This version of the model uses the LLM2Vec approach to convert modern decoders into bidirectional encoders. It brings the following benefits:

  • Enhanced performance and generalization capabilities;
  • Support for Flash Attention;
  • Extended context window.

How to use:

First, install the GLiClass library:

pip install gliclass

This particular Qwen-based model needs a different version of the transformers package than the one llm2vec requires, so install it manually:

pip install transformers==4.44.1
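Because the pinned version can be silently overridden by a later install, it may help to confirm what is actually present in the environment. The following is a minimal sketch using only the Python standard library; the helper name `installed_version` is hypothetical and not part of GLiClass or transformers.

```python
from importlib.metadata import PackageNotFoundError, version

# Version pinned in the install step above.
REQUIRED = "4.44.1"


def installed_version(package: str):
    """Return the installed version string, or None if the package is missing.

    Hypothetical helper for illustration; not part of any library.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_version("transformers")
if found != REQUIRED:
    # Only a warning: other versions may or may not work with this model.
    print(f"transformers {found} found, expected {REQUIRED}")
```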

Then initialize the model and the pipeline:

from gliclass import GLiClassModel, ZeroShotClassificationPipeline
from transformers import AutoTokenizer

model = GLiClassModel.from_pretrained("knowledgator/gliclass-qwen-0.5B-v1.0")
tokenizer = AutoTokenizer.from_pretrained("knowledgator/gliclass-qwen-0.5B-v1.0")

pipeline = ZeroShotClassificationPipeline(model, tokenizer, classification_type='multi-label', device='cuda:0')

text = "One day I will see the world!"
labels = ["travel", "dreams", "sport", "science", "politics"]
results = pipeline(text, labels, threshold=0.5)[0]  # take the first entry because we passed a single text

# Each result holds a label and its score
for result in results:
    print(result["label"], "=>", result["score"])
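The reranking use case mentioned earlier can be sketched in the same way. This is an illustration under stated assumptions, not the library's documented reranking API: it assumes the query is passed as the single label and each passage as the text, so that the returned score acts as a relevance score. The names `rerank`, `toy_score` and `gliclass_score` are hypothetical. The scorer is pluggable, and a simple word-overlap stand-in is used here so the example runs without downloading the model.

```python
from typing import Callable, List, Tuple


def rerank(
    query: str,
    passages: List[str],
    score_fn: Callable[[str, str], float],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_k (passage, score) pairs, highest score first."""
    scored = [(passage, score_fn(passage, query)) for passage in passages]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def toy_score(passage: str, query: str) -> float:
    """Stand-in scorer: fraction of query words that appear in the passage."""
    query_words = set(query.lower().split())
    passage_words = set(passage.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & passage_words) / len(query_words)


# Hypothetical wrapper around the pipeline defined above (assumption, not run here):
# def gliclass_score(passage, query):
#     results = pipeline(passage, [query], threshold=0.0)[0]
#     return results[0]["score"] if results else 0.0

passages = [
    "install gliclass with pip",
    "the weather is nice",
    "bake bread at home",
]
ranked = rerank("how to install gliclass", passages, toy_score, top_k=2)
for passage, score in ranked:
    print(f"{score:.2f}  {passage}")
```

Swapping `toy_score` for a classifier-backed scorer such as the commented `gliclass_score` leaves the ranking logic unchanged; whether a threshold of 0.0 returns a score for every label should be checked against the installed GLiClass version.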

Benchmarks:

While the model is somewhat comparable to the DeBERTa version in the zero-shot setting, it demonstrates state-of-the-art performance in the few-shot setting.

[Figure: Few-shot performance]

Join Our Discord

Connect with our community on Discord for news, support, and discussion about our models. Join Discord.

Model size: 497M params (Safetensors, tensor type F32)