A practical use case of your great work for the Spanish language
Model
In this project, I developed a fast, efficient user intent classification system built on an ensemble of logistic regression, SVM, and k-NN classifiers. The ensemble takes text embeddings from the jinaai/jina-embeddings-v2-base-es model as input and achieves high accuracy while being significantly more resource-efficient than large language models (LLMs).
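A minimal sketch of how such an ensemble could be wired up with scikit-learn. The post does not say how the three classifiers are combined or which hyperparameters were used, so the hard (majority) voting, the linear SVM kernel, and `n_neighbors=3` are assumptions. The helper names (`build_intent_ensemble`, `load_embedder`, `make_toy_embeddings`) are hypothetical, and the toy vectors only stand in for real embeddings so the example runs without downloading the model.

```python
import numpy as np
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC


def build_intent_ensemble(n_neighbors=3):
    """Combine the three classifiers named in the post.

    Assumption: hard (majority) voting with illustrative hyperparameters.
    """
    return VotingClassifier(
        estimators=[
            ("logreg", LogisticRegression(max_iter=1000)),
            ("svm", SVC(kernel="linear")),
            ("knn", KNeighborsClassifier(n_neighbors=n_neighbors)),
        ],
        voting="hard",
    )


def load_embedder(model_name="jinaai/jina-embeddings-v2-base-es"):
    """Return a function mapping a list of texts to embedding vectors.

    Needs the sentence-transformers package and a model download,
    so it is defined here but not called in this sketch.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name, trust_remote_code=True)
    return model.encode


def make_toy_embeddings(n_per_class=20, dim=16, seed=0):
    """Well-separated random clusters standing in for real embeddings."""
    rng = np.random.default_rng(seed)
    labels = ["afirmación", "negación", "lead"]
    centers = rng.normal(size=(len(labels), dim)) * 5.0
    X = np.vstack([c + rng.normal(size=(n_per_class, dim)) for c in centers])
    y = np.repeat(labels, n_per_class)
    return X, y


X, y = make_toy_embeddings()
ensemble = build_intent_ensemble().fit(X, y)
predictions = ensemble.predict(X)
```

With real data, `X` would instead come from something like `load_embedder()(texts)`, and `y` would hold the labeled intents.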
Motivation
Detecting user intent is crucial for retrieval-augmented generation (RAG) pipelines in conversational AI. These pipelines often require multiple LLM calls and sophisticated prompt engineering, which can be both time-consuming and costly. This approach seeks to drastically reduce both the latency and the number of LLM calls, providing a fast and cost-effective solution without compromising accuracy. The model classifies requests and questions in Spanish into nine intents: censorship, other, lead, contact, directions, meet, negation, affirmation, and casual chat.
Results
Intent | Precision | Recall | F1-Score | Support |
---|---|---|---|---|
Afirmación | 1.00 | 1.00 | 1.00 | 14 |
Censura | 0.99 | 1.00 | 0.99 | 539 |
Charla | 1.00 | 0.67 | 0.80 | 15 |
Contacto | 0.97 | 1.00 | 0.99 | 38 |
Direcciones | 1.00 | 1.00 | 1.00 | 71 |
Lead | 0.99 | 0.99 | 0.99 | 140 |
Meet | 0.97 | 1.00 | 0.98 | 29 |
Negación | 1.00 | 0.94 | 0.97 | 18 |
Otros | 0.98 | 0.97 | 0.98 | 171 |
Micro Avg | 0.99 | 0.99 | 0.99 | 1035 |
Macro Avg | 0.99 | 0.95 | 0.97 | 1035 |
Weighted Avg | 0.99 | 0.99 | 0.99 | 1035 |
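
To show how the macro and weighted averages relate to the per-intent rows, here is a short sketch that recomputes them from the rounded values in the table. The `rows` dictionary and helper names are illustrative only. Because the inputs are already rounded to two decimals, the results are only expected to match the reported averages after rounding.

```python
# Per-intent (precision, recall, f1, support), copied from the table above.
rows = {
    "Afirmación": (1.00, 1.00, 1.00, 14),
    "Censura": (0.99, 1.00, 0.99, 539),
    "Charla": (1.00, 0.67, 0.80, 15),
    "Contacto": (0.97, 1.00, 0.99, 38),
    "Direcciones": (1.00, 1.00, 1.00, 71),
    "Lead": (0.99, 0.99, 0.99, 140),
    "Meet": (0.97, 1.00, 0.98, 29),
    "Negación": (1.00, 0.94, 0.97, 18),
    "Otros": (0.98, 0.97, 0.98, 171),
}

PRECISION, RECALL, F1, SUPPORT = range(4)


def macro_avg(rows, col):
    """Unweighted mean: every intent counts equally."""
    return sum(r[col] for r in rows.values()) / len(rows)


def weighted_avg(rows, col):
    """Mean weighted by support: frequent intents dominate."""
    total = sum(r[SUPPORT] for r in rows.values())
    return sum(r[col] * r[SUPPORT] for r in rows.values()) / total


total_support = sum(r[SUPPORT] for r in rows.values())
macro_recall = round(macro_avg(rows, RECALL), 2)
weighted_recall = round(weighted_avg(rows, RECALL), 2)
macro_f1 = round(macro_avg(rows, F1), 2)
```

The macro recall sits below the weighted recall mainly because Charla (recall 0.67, support 15) counts as much as Censura (support 539) in the unweighted mean.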
Great job, Jina AI, your embeddings rock!