linqresearch committed on
Commit 7eb7884
1 Parent(s): 5acb592

MOD: add reference links to README.md

## Changes
- add reference links & English tag

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -3,13 +3,15 @@ tags:
  - transformers
  license: cc-by-nc-4.0
  pipeline_tag: feature-extraction
+ language:
+ - en
  ---

  <h1 align="center">Linq-AI-Research/Linq-Embed-Mistral</h1>

  **Linq-Embed-Mistral**

- Linq-Embed-Mistral has been developed by building upon the foundations of the E5-mistral-7b-instruct and Mistral-7B-v0.1 models. We focus on improving text retrieval using advanced data refinement methods, including sophisticated data crafting, data filtering, and negative mining techniques. These methods are applied to both existing benchmark datasets and highly tailored synthetic datasets generated via LLMs. To enhance the quality of the synthetic data, we employ extensive prompt engineering and guidance from teacher models, ensuring these methods are specifically tailored to each task. Our efforts primarily aim to create high-quality triplet datasets (query, positive example, negative example), significantly improving text retrieval performance.
+ Linq-Embed-Mistral has been developed by building upon the foundations of the [E5-mistral-7b-instruct](https://huggingface.co/intfloat/e5-mistral-7b-instruct) and [Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) models. We focus on improving text retrieval using advanced data refinement methods, including sophisticated data crafting, data filtering, and negative mining techniques. These methods are applied to both existing benchmark datasets and highly tailored synthetic datasets generated via LLMs. To enhance the quality of the synthetic data, we employ extensive prompt engineering and guidance from teacher models, ensuring these methods are specifically tailored to each task. Our efforts primarily aim to create high-quality triplet datasets (query, positive example, negative example), significantly improving text retrieval performance.

  Linq-Embed-Mistral performs exceptionally well in the MTEB benchmarks, achieving an average score of 68.1 across 56 datasets. This performance ranks it 1st among publicly accessible models on the MTEB leaderboard and 3rd overall among all evaluated models. The model excels in retrieval tasks, ranking 1st among all models listed on the MTEB leaderboard with a performance score of 60.0.