Xenova (HF staff) committed
Commit: e199941
1 Parent(s): 396120a

Upload folder using huggingface_hub

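For context, a folder upload like this one is typically made through the `upload_folder` helper in the `huggingface_hub` client library. The sketch below is a hypothetical reconstruction of that kind of call; the repo id, local folder path, and any other arguments actually used are not shown in the commit, so the values here are placeholders.

```python
# Hypothetical reconstruction of an upload_folder call that would produce a
# commit like this one; repo id and folder path are placeholders, not taken
# from the commit itself.
from huggingface_hub import HfApi

api = HfApi()  # assumes a token is already configured (huggingface-cli login / HF_TOKEN)
api.upload_folder(
    repo_id="Xenova/jina-embeddings-v2-base-de",  # assumed target repository
    folder_path="./jina-embeddings-v2-base-de",   # local folder containing 1_Pooling/, README.md, ...
    commit_message="Upload folder using huggingface_hub",
)
```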
1_Pooling/config.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false
+ }
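The pooling configuration above enables mean pooling only (CLS, max, and sqrt-length pooling are all disabled): one 768-dimensional sentence embedding is produced by averaging the token embeddings over non-padding positions. A minimal sketch of what that means in practice, assuming the upstream `jinaai/jina-embeddings-v2-base-de` checkpoint and standard Transformers APIs (the model id and example sentence are assumptions, not part of this commit):

```python
# Minimal illustration of the mean pooling described by 1_Pooling/config.json.
import torch
from transformers import AutoTokenizer, AutoModel

model_id = "jinaai/jina-embeddings-v2-base-de"  # assumed upstream checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id, trust_remote_code=True)

def mean_pool(last_hidden_state: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    # Average token embeddings, ignoring padding positions (pooling_mode_mean_tokens).
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)  # (batch, seq, 1)
    summed = (last_hidden_state * mask).sum(dim=1)                   # (batch, 768)
    counts = mask.sum(dim=1).clamp(min=1e-9)                         # avoid division by zero
    return summed / counts

batch = tokenizer(["Wie ist das Wetter heute?"], padding=True, return_tensors="pt")
with torch.no_grad():
    hidden = model(**batch).last_hidden_state                        # (batch, seq, 768)
embeddings = mean_pool(hidden, batch["attention_mask"])              # (batch, 768)
```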
README.md ADDED
@@ -0,0 +1,3308 @@
1
+ ---
2
+ tags:
3
+ - sentence-transformers
4
+ - feature-extraction
5
+ - sentence-similarity
6
+ - mteb
7
+ language:
8
+ - de
9
+ - en
10
+ inference: false
11
+ license: apache-2.0
12
+ model-index:
13
+ - name: jina-embeddings-v2-base-de
14
+ results:
15
+ - task:
16
+ type: Classification
17
+ dataset:
18
+ type: mteb/amazon_counterfactual
19
+ name: MTEB AmazonCounterfactualClassification (en)
20
+ config: en
21
+ split: test
22
+ revision: e8379541af4e31359cca9fbcf4b00f2671dba205
23
+ metrics:
24
+ - type: accuracy
25
+ value: 73.76119402985076
26
+ - type: ap
27
+ value: 35.99577188521176
28
+ - type: f1
29
+ value: 67.50397431543269
30
+ - task:
31
+ type: Classification
32
+ dataset:
33
+ type: mteb/amazon_counterfactual
34
+ name: MTEB AmazonCounterfactualClassification (de)
35
+ config: de
36
+ split: test
37
+ revision: e8379541af4e31359cca9fbcf4b00f2671dba205
38
+ metrics:
39
+ - type: accuracy
40
+ value: 68.9186295503212
41
+ - type: ap
42
+ value: 79.73307115840507
43
+ - type: f1
44
+ value: 66.66245744831339
45
+ - task:
46
+ type: Classification
47
+ dataset:
48
+ type: mteb/amazon_polarity
49
+ name: MTEB AmazonPolarityClassification
50
+ config: default
51
+ split: test
52
+ revision: e2d317d38cd51312af73b3d32a06d1a08b442046
53
+ metrics:
54
+ - type: accuracy
55
+ value: 77.52215
56
+ - type: ap
57
+ value: 71.85051037177416
58
+ - type: f1
59
+ value: 77.4171096157774
60
+ - task:
61
+ type: Classification
62
+ dataset:
63
+ type: mteb/amazon_reviews_multi
64
+ name: MTEB AmazonReviewsClassification (en)
65
+ config: en
66
+ split: test
67
+ revision: 1399c76144fd37290681b995c656ef9b2e06e26d
68
+ metrics:
69
+ - type: accuracy
70
+ value: 38.498
71
+ - type: f1
72
+ value: 38.058193386555956
73
+ - task:
74
+ type: Classification
75
+ dataset:
76
+ type: mteb/amazon_reviews_multi
77
+ name: MTEB AmazonReviewsClassification (de)
78
+ config: de
79
+ split: test
80
+ revision: 1399c76144fd37290681b995c656ef9b2e06e26d
81
+ metrics:
82
+ - type: accuracy
83
+ value: 37.717999999999996
84
+ - type: f1
85
+ value: 37.22674371574757
86
+ - task:
87
+ type: Retrieval
88
+ dataset:
89
+ type: arguana
90
+ name: MTEB ArguAna
91
+ config: default
92
+ split: test
93
+ revision: None
94
+ metrics:
95
+ - type: map_at_1
96
+ value: 25.319999999999997
97
+ - type: map_at_10
98
+ value: 40.351
99
+ - type: map_at_100
100
+ value: 41.435
101
+ - type: map_at_1000
102
+ value: 41.443000000000005
103
+ - type: map_at_3
104
+ value: 35.266
105
+ - type: map_at_5
106
+ value: 37.99
107
+ - type: mrr_at_1
108
+ value: 25.746999999999996
109
+ - type: mrr_at_10
110
+ value: 40.515
111
+ - type: mrr_at_100
112
+ value: 41.606
113
+ - type: mrr_at_1000
114
+ value: 41.614000000000004
115
+ - type: mrr_at_3
116
+ value: 35.42
117
+ - type: mrr_at_5
118
+ value: 38.112
119
+ - type: ndcg_at_1
120
+ value: 25.319999999999997
121
+ - type: ndcg_at_10
122
+ value: 49.332
123
+ - type: ndcg_at_100
124
+ value: 53.909
125
+ - type: ndcg_at_1000
126
+ value: 54.089
127
+ - type: ndcg_at_3
128
+ value: 38.705
129
+ - type: ndcg_at_5
130
+ value: 43.606
131
+ - type: precision_at_1
132
+ value: 25.319999999999997
133
+ - type: precision_at_10
134
+ value: 7.831
135
+ - type: precision_at_100
136
+ value: 0.9820000000000001
137
+ - type: precision_at_1000
138
+ value: 0.1
139
+ - type: precision_at_3
140
+ value: 16.24
141
+ - type: precision_at_5
142
+ value: 12.119
143
+ - type: recall_at_1
144
+ value: 25.319999999999997
145
+ - type: recall_at_10
146
+ value: 78.307
147
+ - type: recall_at_100
148
+ value: 98.222
149
+ - type: recall_at_1000
150
+ value: 99.57300000000001
151
+ - type: recall_at_3
152
+ value: 48.72
153
+ - type: recall_at_5
154
+ value: 60.597
155
+ - task:
156
+ type: Clustering
157
+ dataset:
158
+ type: mteb/arxiv-clustering-p2p
159
+ name: MTEB ArxivClusteringP2P
160
+ config: default
161
+ split: test
162
+ revision: a122ad7f3f0291bf49cc6f4d32aa80929df69d5d
163
+ metrics:
164
+ - type: v_measure
165
+ value: 41.43100588255654
166
+ - task:
167
+ type: Clustering
168
+ dataset:
169
+ type: mteb/arxiv-clustering-s2s
170
+ name: MTEB ArxivClusteringS2S
171
+ config: default
172
+ split: test
173
+ revision: f910caf1a6075f7329cdf8c1a6135696f37dbd53
174
+ metrics:
175
+ - type: v_measure
176
+ value: 32.08988904593667
177
+ - task:
178
+ type: Reranking
179
+ dataset:
180
+ type: mteb/askubuntudupquestions-reranking
181
+ name: MTEB AskUbuntuDupQuestions
182
+ config: default
183
+ split: test
184
+ revision: 2000358ca161889fa9c082cb41daa8dcfb161a54
185
+ metrics:
186
+ - type: map
187
+ value: 60.55514765595906
188
+ - type: mrr
189
+ value: 73.51393835465858
190
+ - task:
191
+ type: STS
192
+ dataset:
193
+ type: mteb/biosses-sts
194
+ name: MTEB BIOSSES
195
+ config: default
196
+ split: test
197
+ revision: d3fb88f8f02e40887cd149695127462bbcf29b4a
198
+ metrics:
199
+ - type: cos_sim_pearson
200
+ value: 79.6723823121172
201
+ - type: cos_sim_spearman
202
+ value: 76.90596922214986
203
+ - type: euclidean_pearson
204
+ value: 77.87910737957918
205
+ - type: euclidean_spearman
206
+ value: 76.66319260598262
207
+ - type: manhattan_pearson
208
+ value: 77.37039493457965
209
+ - type: manhattan_spearman
210
+ value: 76.09872191280964
211
+ - task:
212
+ type: BitextMining
213
+ dataset:
214
+ type: mteb/bucc-bitext-mining
215
+ name: MTEB BUCC (de-en)
216
+ config: de-en
217
+ split: test
218
+ revision: d51519689f32196a32af33b075a01d0e7c51e252
219
+ metrics:
220
+ - type: accuracy
221
+ value: 98.97703549060543
222
+ - type: f1
223
+ value: 98.86569241475296
224
+ - type: precision
225
+ value: 98.81002087682673
226
+ - type: recall
227
+ value: 98.97703549060543
228
+ - task:
229
+ type: Classification
230
+ dataset:
231
+ type: mteb/banking77
232
+ name: MTEB Banking77Classification
233
+ config: default
234
+ split: test
235
+ revision: 0fd18e25b25c072e09e0d92ab615fda904d66300
236
+ metrics:
237
+ - type: accuracy
238
+ value: 83.93506493506493
239
+ - type: f1
240
+ value: 83.91014949949302
241
+ - task:
242
+ type: Clustering
243
+ dataset:
244
+ type: mteb/biorxiv-clustering-p2p
245
+ name: MTEB BiorxivClusteringP2P
246
+ config: default
247
+ split: test
248
+ revision: 65b79d1d13f80053f67aca9498d9402c2d9f1f40
249
+ metrics:
250
+ - type: v_measure
251
+ value: 34.970675877585144
252
+ - task:
253
+ type: Clustering
254
+ dataset:
255
+ type: mteb/biorxiv-clustering-s2s
256
+ name: MTEB BiorxivClusteringS2S
257
+ config: default
258
+ split: test
259
+ revision: 258694dd0231531bc1fd9de6ceb52a0853c6d908
260
+ metrics:
261
+ - type: v_measure
262
+ value: 28.779230269190954
263
+ - task:
264
+ type: Clustering
265
+ dataset:
266
+ type: slvnwhrl/blurbs-clustering-p2p
267
+ name: MTEB BlurbsClusteringP2P
268
+ config: default
269
+ split: test
270
+ revision: a2dd5b02a77de3466a3eaa98ae586b5610314496
271
+ metrics:
272
+ - type: v_measure
273
+ value: 35.490175601567216
274
+ - task:
275
+ type: Clustering
276
+ dataset:
277
+ type: slvnwhrl/blurbs-clustering-s2s
278
+ name: MTEB BlurbsClusteringS2S
279
+ config: default
280
+ split: test
281
+ revision: 9bfff9a7f8f6dc6ffc9da71c48dd48b68696471d
282
+ metrics:
283
+ - type: v_measure
284
+ value: 16.16638280560168
285
+ - task:
286
+ type: Retrieval
287
+ dataset:
288
+ type: BeIR/cqadupstack
289
+ name: MTEB CQADupstackAndroidRetrieval
290
+ config: default
291
+ split: test
292
+ revision: None
293
+ metrics:
294
+ - type: map_at_1
295
+ value: 30.830999999999996
296
+ - type: map_at_10
297
+ value: 41.355
298
+ - type: map_at_100
299
+ value: 42.791000000000004
300
+ - type: map_at_1000
301
+ value: 42.918
302
+ - type: map_at_3
303
+ value: 38.237
304
+ - type: map_at_5
305
+ value: 40.066
306
+ - type: mrr_at_1
307
+ value: 38.484
308
+ - type: mrr_at_10
309
+ value: 47.593
310
+ - type: mrr_at_100
311
+ value: 48.388
312
+ - type: mrr_at_1000
313
+ value: 48.439
314
+ - type: mrr_at_3
315
+ value: 45.279
316
+ - type: mrr_at_5
317
+ value: 46.724
318
+ - type: ndcg_at_1
319
+ value: 38.484
320
+ - type: ndcg_at_10
321
+ value: 47.27
322
+ - type: ndcg_at_100
323
+ value: 52.568000000000005
324
+ - type: ndcg_at_1000
325
+ value: 54.729000000000006
326
+ - type: ndcg_at_3
327
+ value: 43.061
328
+ - type: ndcg_at_5
329
+ value: 45.083
330
+ - type: precision_at_1
331
+ value: 38.484
332
+ - type: precision_at_10
333
+ value: 8.927
334
+ - type: precision_at_100
335
+ value: 1.425
336
+ - type: precision_at_1000
337
+ value: 0.19
338
+ - type: precision_at_3
339
+ value: 20.791999999999998
340
+ - type: precision_at_5
341
+ value: 14.85
342
+ - type: recall_at_1
343
+ value: 30.830999999999996
344
+ - type: recall_at_10
345
+ value: 57.87799999999999
346
+ - type: recall_at_100
347
+ value: 80.124
348
+ - type: recall_at_1000
349
+ value: 94.208
350
+ - type: recall_at_3
351
+ value: 45.083
352
+ - type: recall_at_5
353
+ value: 51.154999999999994
354
+ - task:
355
+ type: Retrieval
356
+ dataset:
357
+ type: BeIR/cqadupstack
358
+ name: MTEB CQADupstackEnglishRetrieval
359
+ config: default
360
+ split: test
361
+ revision: None
362
+ metrics:
363
+ - type: map_at_1
364
+ value: 25.782
365
+ - type: map_at_10
366
+ value: 34.492
367
+ - type: map_at_100
368
+ value: 35.521
369
+ - type: map_at_1000
370
+ value: 35.638
371
+ - type: map_at_3
372
+ value: 31.735999999999997
373
+ - type: map_at_5
374
+ value: 33.339
375
+ - type: mrr_at_1
376
+ value: 32.357
377
+ - type: mrr_at_10
378
+ value: 39.965
379
+ - type: mrr_at_100
380
+ value: 40.644000000000005
381
+ - type: mrr_at_1000
382
+ value: 40.695
383
+ - type: mrr_at_3
384
+ value: 37.739
385
+ - type: mrr_at_5
386
+ value: 39.061
387
+ - type: ndcg_at_1
388
+ value: 32.357
389
+ - type: ndcg_at_10
390
+ value: 39.644
391
+ - type: ndcg_at_100
392
+ value: 43.851
393
+ - type: ndcg_at_1000
394
+ value: 46.211999999999996
395
+ - type: ndcg_at_3
396
+ value: 35.675000000000004
397
+ - type: ndcg_at_5
398
+ value: 37.564
399
+ - type: precision_at_1
400
+ value: 32.357
401
+ - type: precision_at_10
402
+ value: 7.344
403
+ - type: precision_at_100
404
+ value: 1.201
405
+ - type: precision_at_1000
406
+ value: 0.168
407
+ - type: precision_at_3
408
+ value: 17.155
409
+ - type: precision_at_5
410
+ value: 12.166
411
+ - type: recall_at_1
412
+ value: 25.782
413
+ - type: recall_at_10
414
+ value: 49.132999999999996
415
+ - type: recall_at_100
416
+ value: 67.24
417
+ - type: recall_at_1000
418
+ value: 83.045
419
+ - type: recall_at_3
420
+ value: 37.021
421
+ - type: recall_at_5
422
+ value: 42.548
423
+ - task:
424
+ type: Retrieval
425
+ dataset:
426
+ type: BeIR/cqadupstack
427
+ name: MTEB CQADupstackGamingRetrieval
428
+ config: default
429
+ split: test
430
+ revision: None
431
+ metrics:
432
+ - type: map_at_1
433
+ value: 35.778999999999996
434
+ - type: map_at_10
435
+ value: 47.038000000000004
436
+ - type: map_at_100
437
+ value: 48.064
438
+ - type: map_at_1000
439
+ value: 48.128
440
+ - type: map_at_3
441
+ value: 44.186
442
+ - type: map_at_5
443
+ value: 45.788000000000004
444
+ - type: mrr_at_1
445
+ value: 41.254000000000005
446
+ - type: mrr_at_10
447
+ value: 50.556999999999995
448
+ - type: mrr_at_100
449
+ value: 51.296
450
+ - type: mrr_at_1000
451
+ value: 51.331
452
+ - type: mrr_at_3
453
+ value: 48.318
454
+ - type: mrr_at_5
455
+ value: 49.619
456
+ - type: ndcg_at_1
457
+ value: 41.254000000000005
458
+ - type: ndcg_at_10
459
+ value: 52.454
460
+ - type: ndcg_at_100
461
+ value: 56.776
462
+ - type: ndcg_at_1000
463
+ value: 58.181000000000004
464
+ - type: ndcg_at_3
465
+ value: 47.713
466
+ - type: ndcg_at_5
467
+ value: 49.997
468
+ - type: precision_at_1
469
+ value: 41.254000000000005
470
+ - type: precision_at_10
471
+ value: 8.464
472
+ - type: precision_at_100
473
+ value: 1.157
474
+ - type: precision_at_1000
475
+ value: 0.133
476
+ - type: precision_at_3
477
+ value: 21.526
478
+ - type: precision_at_5
479
+ value: 14.696000000000002
480
+ - type: recall_at_1
481
+ value: 35.778999999999996
482
+ - type: recall_at_10
483
+ value: 64.85300000000001
484
+ - type: recall_at_100
485
+ value: 83.98400000000001
486
+ - type: recall_at_1000
487
+ value: 94.18299999999999
488
+ - type: recall_at_3
489
+ value: 51.929
490
+ - type: recall_at_5
491
+ value: 57.666
492
+ - task:
493
+ type: Retrieval
494
+ dataset:
495
+ type: BeIR/cqadupstack
496
+ name: MTEB CQADupstackGisRetrieval
497
+ config: default
498
+ split: test
499
+ revision: None
500
+ metrics:
501
+ - type: map_at_1
502
+ value: 21.719
503
+ - type: map_at_10
504
+ value: 29.326999999999998
505
+ - type: map_at_100
506
+ value: 30.314000000000004
507
+ - type: map_at_1000
508
+ value: 30.397000000000002
509
+ - type: map_at_3
510
+ value: 27.101
511
+ - type: map_at_5
512
+ value: 28.141
513
+ - type: mrr_at_1
514
+ value: 23.503
515
+ - type: mrr_at_10
516
+ value: 31.225
517
+ - type: mrr_at_100
518
+ value: 32.096000000000004
519
+ - type: mrr_at_1000
520
+ value: 32.159
521
+ - type: mrr_at_3
522
+ value: 29.076999999999998
523
+ - type: mrr_at_5
524
+ value: 30.083
525
+ - type: ndcg_at_1
526
+ value: 23.503
527
+ - type: ndcg_at_10
528
+ value: 33.842
529
+ - type: ndcg_at_100
530
+ value: 39.038000000000004
531
+ - type: ndcg_at_1000
532
+ value: 41.214
533
+ - type: ndcg_at_3
534
+ value: 29.347
535
+ - type: ndcg_at_5
536
+ value: 31.121
537
+ - type: precision_at_1
538
+ value: 23.503
539
+ - type: precision_at_10
540
+ value: 5.266
541
+ - type: precision_at_100
542
+ value: 0.831
543
+ - type: precision_at_1000
544
+ value: 0.106
545
+ - type: precision_at_3
546
+ value: 12.504999999999999
547
+ - type: precision_at_5
548
+ value: 8.565000000000001
549
+ - type: recall_at_1
550
+ value: 21.719
551
+ - type: recall_at_10
552
+ value: 46.024
553
+ - type: recall_at_100
554
+ value: 70.78999999999999
555
+ - type: recall_at_1000
556
+ value: 87.022
557
+ - type: recall_at_3
558
+ value: 33.64
559
+ - type: recall_at_5
560
+ value: 37.992
561
+ - task:
562
+ type: Retrieval
563
+ dataset:
564
+ type: BeIR/cqadupstack
565
+ name: MTEB CQADupstackMathematicaRetrieval
566
+ config: default
567
+ split: test
568
+ revision: None
569
+ metrics:
570
+ - type: map_at_1
571
+ value: 15.601
572
+ - type: map_at_10
573
+ value: 22.054000000000002
574
+ - type: map_at_100
575
+ value: 23.177
576
+ - type: map_at_1000
577
+ value: 23.308
578
+ - type: map_at_3
579
+ value: 19.772000000000002
580
+ - type: map_at_5
581
+ value: 21.055
582
+ - type: mrr_at_1
583
+ value: 19.403000000000002
584
+ - type: mrr_at_10
585
+ value: 26.409
586
+ - type: mrr_at_100
587
+ value: 27.356
588
+ - type: mrr_at_1000
589
+ value: 27.441
590
+ - type: mrr_at_3
591
+ value: 24.108999999999998
592
+ - type: mrr_at_5
593
+ value: 25.427
594
+ - type: ndcg_at_1
595
+ value: 19.403000000000002
596
+ - type: ndcg_at_10
597
+ value: 26.474999999999998
598
+ - type: ndcg_at_100
599
+ value: 32.086
600
+ - type: ndcg_at_1000
601
+ value: 35.231
602
+ - type: ndcg_at_3
603
+ value: 22.289
604
+ - type: ndcg_at_5
605
+ value: 24.271
606
+ - type: precision_at_1
607
+ value: 19.403000000000002
608
+ - type: precision_at_10
609
+ value: 4.813
610
+ - type: precision_at_100
611
+ value: 0.8869999999999999
612
+ - type: precision_at_1000
613
+ value: 0.13
614
+ - type: precision_at_3
615
+ value: 10.531
616
+ - type: precision_at_5
617
+ value: 7.710999999999999
618
+ - type: recall_at_1
619
+ value: 15.601
620
+ - type: recall_at_10
621
+ value: 35.916
622
+ - type: recall_at_100
623
+ value: 60.8
624
+ - type: recall_at_1000
625
+ value: 83.245
626
+ - type: recall_at_3
627
+ value: 24.321
628
+ - type: recall_at_5
629
+ value: 29.372999999999998
630
+ - task:
631
+ type: Retrieval
632
+ dataset:
633
+ type: BeIR/cqadupstack
634
+ name: MTEB CQADupstackPhysicsRetrieval
635
+ config: default
636
+ split: test
637
+ revision: None
638
+ metrics:
639
+ - type: map_at_1
640
+ value: 25.522
641
+ - type: map_at_10
642
+ value: 34.854
643
+ - type: map_at_100
644
+ value: 36.269
645
+ - type: map_at_1000
646
+ value: 36.387
647
+ - type: map_at_3
648
+ value: 32.187
649
+ - type: map_at_5
650
+ value: 33.692
651
+ - type: mrr_at_1
652
+ value: 31.375999999999998
653
+ - type: mrr_at_10
654
+ value: 40.471000000000004
655
+ - type: mrr_at_100
656
+ value: 41.481
657
+ - type: mrr_at_1000
658
+ value: 41.533
659
+ - type: mrr_at_3
660
+ value: 38.274
661
+ - type: mrr_at_5
662
+ value: 39.612
663
+ - type: ndcg_at_1
664
+ value: 31.375999999999998
665
+ - type: ndcg_at_10
666
+ value: 40.298
667
+ - type: ndcg_at_100
668
+ value: 46.255
669
+ - type: ndcg_at_1000
670
+ value: 48.522
671
+ - type: ndcg_at_3
672
+ value: 36.049
673
+ - type: ndcg_at_5
674
+ value: 38.095
675
+ - type: precision_at_1
676
+ value: 31.375999999999998
677
+ - type: precision_at_10
678
+ value: 7.305000000000001
679
+ - type: precision_at_100
680
+ value: 1.201
681
+ - type: precision_at_1000
682
+ value: 0.157
683
+ - type: precision_at_3
684
+ value: 17.132
685
+ - type: precision_at_5
686
+ value: 12.107999999999999
687
+ - type: recall_at_1
688
+ value: 25.522
689
+ - type: recall_at_10
690
+ value: 50.988
691
+ - type: recall_at_100
692
+ value: 76.005
693
+ - type: recall_at_1000
694
+ value: 91.11200000000001
695
+ - type: recall_at_3
696
+ value: 38.808
697
+ - type: recall_at_5
698
+ value: 44.279
699
+ - task:
700
+ type: Retrieval
701
+ dataset:
702
+ type: BeIR/cqadupstack
703
+ name: MTEB CQADupstackProgrammersRetrieval
704
+ config: default
705
+ split: test
706
+ revision: None
707
+ metrics:
708
+ - type: map_at_1
709
+ value: 24.615000000000002
710
+ - type: map_at_10
711
+ value: 32.843
712
+ - type: map_at_100
713
+ value: 34.172999999999995
714
+ - type: map_at_1000
715
+ value: 34.286
716
+ - type: map_at_3
717
+ value: 30.125
718
+ - type: map_at_5
719
+ value: 31.495
720
+ - type: mrr_at_1
721
+ value: 30.023
722
+ - type: mrr_at_10
723
+ value: 38.106
724
+ - type: mrr_at_100
725
+ value: 39.01
726
+ - type: mrr_at_1000
727
+ value: 39.071
728
+ - type: mrr_at_3
729
+ value: 35.674
730
+ - type: mrr_at_5
731
+ value: 36.924
732
+ - type: ndcg_at_1
733
+ value: 30.023
734
+ - type: ndcg_at_10
735
+ value: 38.091
736
+ - type: ndcg_at_100
737
+ value: 43.771
738
+ - type: ndcg_at_1000
739
+ value: 46.315
740
+ - type: ndcg_at_3
741
+ value: 33.507
742
+ - type: ndcg_at_5
743
+ value: 35.304
744
+ - type: precision_at_1
745
+ value: 30.023
746
+ - type: precision_at_10
747
+ value: 6.837999999999999
748
+ - type: precision_at_100
749
+ value: 1.124
750
+ - type: precision_at_1000
751
+ value: 0.152
752
+ - type: precision_at_3
753
+ value: 15.562999999999999
754
+ - type: precision_at_5
755
+ value: 10.936
756
+ - type: recall_at_1
757
+ value: 24.615000000000002
758
+ - type: recall_at_10
759
+ value: 48.691
760
+ - type: recall_at_100
761
+ value: 72.884
762
+ - type: recall_at_1000
763
+ value: 90.387
764
+ - type: recall_at_3
765
+ value: 35.659
766
+ - type: recall_at_5
767
+ value: 40.602
768
+ - task:
769
+ type: Retrieval
770
+ dataset:
771
+ type: BeIR/cqadupstack
772
+ name: MTEB CQADupstackRetrieval
773
+ config: default
774
+ split: test
775
+ revision: None
776
+ metrics:
777
+ - type: map_at_1
778
+ value: 23.223666666666666
779
+ - type: map_at_10
780
+ value: 31.338166666666673
781
+ - type: map_at_100
782
+ value: 32.47358333333333
783
+ - type: map_at_1000
784
+ value: 32.5955
785
+ - type: map_at_3
786
+ value: 28.84133333333333
787
+ - type: map_at_5
788
+ value: 30.20808333333333
789
+ - type: mrr_at_1
790
+ value: 27.62483333333333
791
+ - type: mrr_at_10
792
+ value: 35.385916666666674
793
+ - type: mrr_at_100
794
+ value: 36.23325
795
+ - type: mrr_at_1000
796
+ value: 36.29966666666667
797
+ - type: mrr_at_3
798
+ value: 33.16583333333333
799
+ - type: mrr_at_5
800
+ value: 34.41983333333334
801
+ - type: ndcg_at_1
802
+ value: 27.62483333333333
803
+ - type: ndcg_at_10
804
+ value: 36.222
805
+ - type: ndcg_at_100
806
+ value: 41.29491666666666
807
+ - type: ndcg_at_1000
808
+ value: 43.85508333333333
809
+ - type: ndcg_at_3
810
+ value: 31.95116666666667
811
+ - type: ndcg_at_5
812
+ value: 33.88541666666667
813
+ - type: precision_at_1
814
+ value: 27.62483333333333
815
+ - type: precision_at_10
816
+ value: 6.339916666666667
817
+ - type: precision_at_100
818
+ value: 1.0483333333333333
819
+ - type: precision_at_1000
820
+ value: 0.14608333333333334
821
+ - type: precision_at_3
822
+ value: 14.726500000000003
823
+ - type: precision_at_5
824
+ value: 10.395
825
+ - type: recall_at_1
826
+ value: 23.223666666666666
827
+ - type: recall_at_10
828
+ value: 46.778999999999996
829
+ - type: recall_at_100
830
+ value: 69.27141666666667
831
+ - type: recall_at_1000
832
+ value: 87.27383333333334
833
+ - type: recall_at_3
834
+ value: 34.678749999999994
835
+ - type: recall_at_5
836
+ value: 39.79900000000001
837
+ - task:
838
+ type: Retrieval
839
+ dataset:
840
+ type: BeIR/cqadupstack
841
+ name: MTEB CQADupstackStatsRetrieval
842
+ config: default
843
+ split: test
844
+ revision: None
845
+ metrics:
846
+ - type: map_at_1
847
+ value: 21.677
848
+ - type: map_at_10
849
+ value: 27.828000000000003
850
+ - type: map_at_100
851
+ value: 28.538999999999998
852
+ - type: map_at_1000
853
+ value: 28.64
854
+ - type: map_at_3
855
+ value: 26.105
856
+ - type: map_at_5
857
+ value: 27.009
858
+ - type: mrr_at_1
859
+ value: 24.387
860
+ - type: mrr_at_10
861
+ value: 30.209999999999997
862
+ - type: mrr_at_100
863
+ value: 30.953000000000003
864
+ - type: mrr_at_1000
865
+ value: 31.029
866
+ - type: mrr_at_3
867
+ value: 28.707
868
+ - type: mrr_at_5
869
+ value: 29.610999999999997
870
+ - type: ndcg_at_1
871
+ value: 24.387
872
+ - type: ndcg_at_10
873
+ value: 31.378
874
+ - type: ndcg_at_100
875
+ value: 35.249
876
+ - type: ndcg_at_1000
877
+ value: 37.923
878
+ - type: ndcg_at_3
879
+ value: 28.213
880
+ - type: ndcg_at_5
881
+ value: 29.658
882
+ - type: precision_at_1
883
+ value: 24.387
884
+ - type: precision_at_10
885
+ value: 4.8309999999999995
886
+ - type: precision_at_100
887
+ value: 0.73
888
+ - type: precision_at_1000
889
+ value: 0.104
890
+ - type: precision_at_3
891
+ value: 12.168
892
+ - type: precision_at_5
893
+ value: 8.251999999999999
894
+ - type: recall_at_1
895
+ value: 21.677
896
+ - type: recall_at_10
897
+ value: 40.069
898
+ - type: recall_at_100
899
+ value: 58.077
900
+ - type: recall_at_1000
901
+ value: 77.97
902
+ - type: recall_at_3
903
+ value: 31.03
904
+ - type: recall_at_5
905
+ value: 34.838
906
+ - task:
907
+ type: Retrieval
908
+ dataset:
909
+ type: BeIR/cqadupstack
910
+ name: MTEB CQADupstackTexRetrieval
911
+ config: default
912
+ split: test
913
+ revision: None
914
+ metrics:
915
+ - type: map_at_1
916
+ value: 14.484
917
+ - type: map_at_10
918
+ value: 20.355
919
+ - type: map_at_100
920
+ value: 21.382
921
+ - type: map_at_1000
922
+ value: 21.511
923
+ - type: map_at_3
924
+ value: 18.448
925
+ - type: map_at_5
926
+ value: 19.451999999999998
927
+ - type: mrr_at_1
928
+ value: 17.584
929
+ - type: mrr_at_10
930
+ value: 23.825
931
+ - type: mrr_at_100
932
+ value: 24.704
933
+ - type: mrr_at_1000
934
+ value: 24.793000000000003
935
+ - type: mrr_at_3
936
+ value: 21.92
937
+ - type: mrr_at_5
938
+ value: 22.97
939
+ - type: ndcg_at_1
940
+ value: 17.584
941
+ - type: ndcg_at_10
942
+ value: 24.315
943
+ - type: ndcg_at_100
944
+ value: 29.354999999999997
945
+ - type: ndcg_at_1000
946
+ value: 32.641999999999996
947
+ - type: ndcg_at_3
948
+ value: 20.802
949
+ - type: ndcg_at_5
950
+ value: 22.335
951
+ - type: precision_at_1
952
+ value: 17.584
953
+ - type: precision_at_10
954
+ value: 4.443
955
+ - type: precision_at_100
956
+ value: 0.8160000000000001
957
+ - type: precision_at_1000
958
+ value: 0.128
959
+ - type: precision_at_3
960
+ value: 9.807
961
+ - type: precision_at_5
962
+ value: 7.0889999999999995
963
+ - type: recall_at_1
964
+ value: 14.484
965
+ - type: recall_at_10
966
+ value: 32.804
967
+ - type: recall_at_100
968
+ value: 55.679
969
+ - type: recall_at_1000
970
+ value: 79.63
971
+ - type: recall_at_3
972
+ value: 22.976
973
+ - type: recall_at_5
974
+ value: 26.939
975
+ - task:
976
+ type: Retrieval
977
+ dataset:
978
+ type: BeIR/cqadupstack
979
+ name: MTEB CQADupstackUnixRetrieval
980
+ config: default
981
+ split: test
982
+ revision: None
983
+ metrics:
984
+ - type: map_at_1
985
+ value: 22.983999999999998
986
+ - type: map_at_10
987
+ value: 30.812
988
+ - type: map_at_100
989
+ value: 31.938
990
+ - type: map_at_1000
991
+ value: 32.056000000000004
992
+ - type: map_at_3
993
+ value: 28.449999999999996
994
+ - type: map_at_5
995
+ value: 29.542
996
+ - type: mrr_at_1
997
+ value: 27.145999999999997
998
+ - type: mrr_at_10
999
+ value: 34.782999999999994
1000
+ - type: mrr_at_100
1001
+ value: 35.699
1002
+ - type: mrr_at_1000
1003
+ value: 35.768
1004
+ - type: mrr_at_3
1005
+ value: 32.572
1006
+ - type: mrr_at_5
1007
+ value: 33.607
1008
+ - type: ndcg_at_1
1009
+ value: 27.145999999999997
1010
+ - type: ndcg_at_10
1011
+ value: 35.722
1012
+ - type: ndcg_at_100
1013
+ value: 40.964
1014
+ - type: ndcg_at_1000
1015
+ value: 43.598
1016
+ - type: ndcg_at_3
1017
+ value: 31.379
1018
+ - type: ndcg_at_5
1019
+ value: 32.924
1020
+ - type: precision_at_1
1021
+ value: 27.145999999999997
1022
+ - type: precision_at_10
1023
+ value: 6.063000000000001
1024
+ - type: precision_at_100
1025
+ value: 0.9730000000000001
1026
+ - type: precision_at_1000
1027
+ value: 0.13
1028
+ - type: precision_at_3
1029
+ value: 14.366000000000001
1030
+ - type: precision_at_5
1031
+ value: 9.776
1032
+ - type: recall_at_1
1033
+ value: 22.983999999999998
1034
+ - type: recall_at_10
1035
+ value: 46.876
1036
+ - type: recall_at_100
1037
+ value: 69.646
1038
+ - type: recall_at_1000
1039
+ value: 88.305
1040
+ - type: recall_at_3
1041
+ value: 34.471000000000004
1042
+ - type: recall_at_5
1043
+ value: 38.76
1044
+ - task:
1045
+ type: Retrieval
1046
+ dataset:
1047
+ type: BeIR/cqadupstack
1048
+ name: MTEB CQADupstackWebmastersRetrieval
1049
+ config: default
1050
+ split: test
1051
+ revision: None
1052
+ metrics:
1053
+ - type: map_at_1
1054
+ value: 23.017000000000003
1055
+ - type: map_at_10
1056
+ value: 31.049
1057
+ - type: map_at_100
1058
+ value: 32.582
1059
+ - type: map_at_1000
1060
+ value: 32.817
1061
+ - type: map_at_3
1062
+ value: 28.303
1063
+ - type: map_at_5
1064
+ value: 29.854000000000003
1065
+ - type: mrr_at_1
1066
+ value: 27.866000000000003
1067
+ - type: mrr_at_10
1068
+ value: 35.56
1069
+ - type: mrr_at_100
1070
+ value: 36.453
1071
+ - type: mrr_at_1000
1072
+ value: 36.519
1073
+ - type: mrr_at_3
1074
+ value: 32.938
1075
+ - type: mrr_at_5
1076
+ value: 34.391
1077
+ - type: ndcg_at_1
1078
+ value: 27.866000000000003
1079
+ - type: ndcg_at_10
1080
+ value: 36.506
1081
+ - type: ndcg_at_100
1082
+ value: 42.344
1083
+ - type: ndcg_at_1000
1084
+ value: 45.213
1085
+ - type: ndcg_at_3
1086
+ value: 31.805
1087
+ - type: ndcg_at_5
1088
+ value: 33.933
1089
+ - type: precision_at_1
1090
+ value: 27.866000000000003
1091
+ - type: precision_at_10
1092
+ value: 7.016
1093
+ - type: precision_at_100
1094
+ value: 1.468
1095
+ - type: precision_at_1000
1096
+ value: 0.23900000000000002
1097
+ - type: precision_at_3
1098
+ value: 14.822
1099
+ - type: precision_at_5
1100
+ value: 10.791
1101
+ - type: recall_at_1
1102
+ value: 23.017000000000003
1103
+ - type: recall_at_10
1104
+ value: 47.053
1105
+ - type: recall_at_100
1106
+ value: 73.177
1107
+ - type: recall_at_1000
1108
+ value: 91.47800000000001
1109
+ - type: recall_at_3
1110
+ value: 33.675
1111
+ - type: recall_at_5
1112
+ value: 39.36
1113
+ - task:
1114
+ type: Retrieval
1115
+ dataset:
1116
+ type: BeIR/cqadupstack
1117
+ name: MTEB CQADupstackWordpressRetrieval
1118
+ config: default
1119
+ split: test
1120
+ revision: None
1121
+ metrics:
1122
+ - type: map_at_1
1123
+ value: 16.673
1124
+ - type: map_at_10
1125
+ value: 24.051000000000002
1126
+ - type: map_at_100
1127
+ value: 24.933
1128
+ - type: map_at_1000
1129
+ value: 25.06
1130
+ - type: map_at_3
1131
+ value: 21.446
1132
+ - type: map_at_5
1133
+ value: 23.064
1134
+ - type: mrr_at_1
1135
+ value: 18.115000000000002
1136
+ - type: mrr_at_10
1137
+ value: 25.927
1138
+ - type: mrr_at_100
1139
+ value: 26.718999999999998
1140
+ - type: mrr_at_1000
1141
+ value: 26.817999999999998
1142
+ - type: mrr_at_3
1143
+ value: 23.383000000000003
1144
+ - type: mrr_at_5
1145
+ value: 25.008999999999997
1146
+ - type: ndcg_at_1
1147
+ value: 18.115000000000002
1148
+ - type: ndcg_at_10
1149
+ value: 28.669
1150
+ - type: ndcg_at_100
1151
+ value: 33.282000000000004
1152
+ - type: ndcg_at_1000
1153
+ value: 36.481
1154
+ - type: ndcg_at_3
1155
+ value: 23.574
1156
+ - type: ndcg_at_5
1157
+ value: 26.340000000000003
1158
+ - type: precision_at_1
1159
+ value: 18.115000000000002
1160
+ - type: precision_at_10
1161
+ value: 4.769
1162
+ - type: precision_at_100
1163
+ value: 0.767
1164
+ - type: precision_at_1000
1165
+ value: 0.116
1166
+ - type: precision_at_3
1167
+ value: 10.351
1168
+ - type: precision_at_5
1169
+ value: 7.8
1170
+ - type: recall_at_1
1171
+ value: 16.673
1172
+ - type: recall_at_10
1173
+ value: 41.063
1174
+ - type: recall_at_100
1175
+ value: 62.851
1176
+ - type: recall_at_1000
1177
+ value: 86.701
1178
+ - type: recall_at_3
1179
+ value: 27.532
1180
+ - type: recall_at_5
1181
+ value: 34.076
1182
+ - task:
1183
+ type: Retrieval
1184
+ dataset:
1185
+ type: climate-fever
1186
+ name: MTEB ClimateFEVER
1187
+ config: default
1188
+ split: test
1189
+ revision: None
1190
+ metrics:
1191
+ - type: map_at_1
1192
+ value: 8.752
1193
+ - type: map_at_10
1194
+ value: 15.120000000000001
1195
+ - type: map_at_100
1196
+ value: 16.678
1197
+ - type: map_at_1000
1198
+ value: 16.854
1199
+ - type: map_at_3
1200
+ value: 12.603
1201
+ - type: map_at_5
1202
+ value: 13.918
1203
+ - type: mrr_at_1
1204
+ value: 19.283
1205
+ - type: mrr_at_10
1206
+ value: 29.145
1207
+ - type: mrr_at_100
1208
+ value: 30.281000000000002
1209
+ - type: mrr_at_1000
1210
+ value: 30.339
1211
+ - type: mrr_at_3
1212
+ value: 26.069
1213
+ - type: mrr_at_5
1214
+ value: 27.864
1215
+ - type: ndcg_at_1
1216
+ value: 19.283
1217
+ - type: ndcg_at_10
1218
+ value: 21.804000000000002
1219
+ - type: ndcg_at_100
1220
+ value: 28.576
1221
+ - type: ndcg_at_1000
1222
+ value: 32.063
1223
+ - type: ndcg_at_3
1224
+ value: 17.511
1225
+ - type: ndcg_at_5
1226
+ value: 19.112000000000002
1227
+ - type: precision_at_1
1228
+ value: 19.283
1229
+ - type: precision_at_10
1230
+ value: 6.873
1231
+ - type: precision_at_100
1232
+ value: 1.405
1233
+ - type: precision_at_1000
1234
+ value: 0.20500000000000002
1235
+ - type: precision_at_3
1236
+ value: 13.16
1237
+ - type: precision_at_5
1238
+ value: 10.189
1239
+ - type: recall_at_1
1240
+ value: 8.752
1241
+ - type: recall_at_10
1242
+ value: 27.004
1243
+ - type: recall_at_100
1244
+ value: 50.648
1245
+ - type: recall_at_1000
1246
+ value: 70.458
1247
+ - type: recall_at_3
1248
+ value: 16.461000000000002
1249
+ - type: recall_at_5
1250
+ value: 20.973
1251
+ - task:
1252
+ type: Retrieval
1253
+ dataset:
1254
+ type: dbpedia-entity
1255
+ name: MTEB DBPedia
1256
+ config: default
1257
+ split: test
1258
+ revision: None
1259
+ metrics:
1260
+ - type: map_at_1
1261
+ value: 6.81
1262
+ - type: map_at_10
1263
+ value: 14.056
1264
+ - type: map_at_100
1265
+ value: 18.961
1266
+ - type: map_at_1000
1267
+ value: 20.169
1268
+ - type: map_at_3
1269
+ value: 10.496
1270
+ - type: map_at_5
1271
+ value: 11.952
1272
+ - type: mrr_at_1
1273
+ value: 53.5
1274
+ - type: mrr_at_10
1275
+ value: 63.479
1276
+ - type: mrr_at_100
1277
+ value: 63.971999999999994
1278
+ - type: mrr_at_1000
1279
+ value: 63.993
1280
+ - type: mrr_at_3
1281
+ value: 61.541999999999994
1282
+ - type: mrr_at_5
1283
+ value: 62.778999999999996
1284
+ - type: ndcg_at_1
1285
+ value: 42.25
1286
+ - type: ndcg_at_10
1287
+ value: 31.471
1288
+ - type: ndcg_at_100
1289
+ value: 35.115
1290
+ - type: ndcg_at_1000
1291
+ value: 42.408
1292
+ - type: ndcg_at_3
1293
+ value: 35.458
1294
+ - type: ndcg_at_5
1295
+ value: 32.973
1296
+ - type: precision_at_1
1297
+ value: 53.5
1298
+ - type: precision_at_10
1299
+ value: 24.85
1300
+ - type: precision_at_100
1301
+ value: 7.79
1302
+ - type: precision_at_1000
1303
+ value: 1.599
1304
+ - type: precision_at_3
1305
+ value: 38.667
1306
+ - type: precision_at_5
1307
+ value: 31.55
1308
+ - type: recall_at_1
1309
+ value: 6.81
1310
+ - type: recall_at_10
1311
+ value: 19.344
1312
+ - type: recall_at_100
1313
+ value: 40.837
1314
+ - type: recall_at_1000
1315
+ value: 64.661
1316
+ - type: recall_at_3
1317
+ value: 11.942
1318
+ - type: recall_at_5
1319
+ value: 14.646
1320
+ - task:
1321
+ type: Classification
1322
+ dataset:
1323
+ type: mteb/emotion
1324
+ name: MTEB EmotionClassification
1325
+ config: default
1326
+ split: test
1327
+ revision: 4f58c6b202a23cf9a4da393831edf4f9183cad37
1328
+ metrics:
1329
+ - type: accuracy
1330
+ value: 44.64499999999999
1331
+ - type: f1
1332
+ value: 39.39106911352714
1333
+ - task:
1334
+ type: Retrieval
1335
+ dataset:
1336
+ type: fever
1337
+ name: MTEB FEVER
1338
+ config: default
1339
+ split: test
1340
+ revision: None
1341
+ metrics:
1342
+ - type: map_at_1
1343
+ value: 48.196
1344
+ - type: map_at_10
1345
+ value: 61.404
1346
+ - type: map_at_100
1347
+ value: 61.846000000000004
1348
+ - type: map_at_1000
1349
+ value: 61.866
1350
+ - type: map_at_3
1351
+ value: 58.975
1352
+ - type: map_at_5
1353
+ value: 60.525
1354
+ - type: mrr_at_1
1355
+ value: 52.025
1356
+ - type: mrr_at_10
1357
+ value: 65.43299999999999
1358
+ - type: mrr_at_100
1359
+ value: 65.80799999999999
1360
+ - type: mrr_at_1000
1361
+ value: 65.818
1362
+ - type: mrr_at_3
1363
+ value: 63.146
1364
+ - type: mrr_at_5
1365
+ value: 64.64
1366
+ - type: ndcg_at_1
1367
+ value: 52.025
1368
+ - type: ndcg_at_10
1369
+ value: 67.889
1370
+ - type: ndcg_at_100
1371
+ value: 69.864
1372
+ - type: ndcg_at_1000
1373
+ value: 70.337
1374
+ - type: ndcg_at_3
1375
+ value: 63.315
1376
+ - type: ndcg_at_5
1377
+ value: 65.91799999999999
1378
+ - type: precision_at_1
1379
+ value: 52.025
1380
+ - type: precision_at_10
1381
+ value: 9.182
1382
+ - type: precision_at_100
1383
+ value: 1.027
1384
+ - type: precision_at_1000
1385
+ value: 0.108
1386
+ - type: precision_at_3
1387
+ value: 25.968000000000004
1388
+ - type: precision_at_5
1389
+ value: 17.006
1390
+ - type: recall_at_1
1391
+ value: 48.196
1392
+ - type: recall_at_10
1393
+ value: 83.885
1394
+ - type: recall_at_100
1395
+ value: 92.671
1396
+ - type: recall_at_1000
1397
+ value: 96.018
1398
+ - type: recall_at_3
1399
+ value: 71.59
1400
+ - type: recall_at_5
1401
+ value: 77.946
1402
+ - task:
1403
+ type: Retrieval
1404
+ dataset:
1405
+ type: fiqa
1406
+ name: MTEB FiQA2018
1407
+ config: default
1408
+ split: test
1409
+ revision: None
1410
+ metrics:
1411
+ - type: map_at_1
1412
+ value: 15.193000000000001
1413
+ - type: map_at_10
1414
+ value: 25.168000000000003
1415
+ - type: map_at_100
1416
+ value: 27.017000000000003
1417
+ - type: map_at_1000
1418
+ value: 27.205000000000002
1419
+ - type: map_at_3
1420
+ value: 21.746
1421
+ - type: map_at_5
1422
+ value: 23.579
1423
+ - type: mrr_at_1
1424
+ value: 31.635999999999996
1425
+ - type: mrr_at_10
1426
+ value: 40.077
1427
+ - type: mrr_at_100
1428
+ value: 41.112
1429
+ - type: mrr_at_1000
1430
+ value: 41.160999999999994
1431
+ - type: mrr_at_3
1432
+ value: 37.937
1433
+ - type: mrr_at_5
1434
+ value: 39.18
1435
+ - type: ndcg_at_1
1436
+ value: 31.635999999999996
1437
+ - type: ndcg_at_10
1438
+ value: 32.298
1439
+ - type: ndcg_at_100
1440
+ value: 39.546
1441
+ - type: ndcg_at_1000
1442
+ value: 42.88
1443
+ - type: ndcg_at_3
1444
+ value: 29.221999999999998
1445
+ - type: ndcg_at_5
1446
+ value: 30.069000000000003
1447
+ - type: precision_at_1
1448
+ value: 31.635999999999996
1449
+ - type: precision_at_10
1450
+ value: 9.367
1451
+ - type: precision_at_100
1452
+ value: 1.645
1453
+ - type: precision_at_1000
1454
+ value: 0.22399999999999998
1455
+ - type: precision_at_3
1456
+ value: 20.01
1457
+ - type: precision_at_5
1458
+ value: 14.753
1459
+ - type: recall_at_1
1460
+ value: 15.193000000000001
1461
+ - type: recall_at_10
1462
+ value: 38.214999999999996
1463
+ - type: recall_at_100
1464
+ value: 65.95
1465
+ - type: recall_at_1000
1466
+ value: 85.85300000000001
1467
+ - type: recall_at_3
1468
+ value: 26.357000000000003
1469
+ - type: recall_at_5
1470
+ value: 31.319999999999997
1471
+ - task:
1472
+ type: Retrieval
1473
+ dataset:
1474
+ type: jinaai/ger_da_lir
1475
+ name: MTEB GerDaLIR
1476
+ config: default
1477
+ split: test
1478
+ revision: None
1479
+ metrics:
1480
+ - type: map_at_1
1481
+ value: 10.363
1482
+ - type: map_at_10
1483
+ value: 16.222
1484
+ - type: map_at_100
1485
+ value: 17.28
1486
+ - type: map_at_1000
1487
+ value: 17.380000000000003
1488
+ - type: map_at_3
1489
+ value: 14.054
1490
+ - type: map_at_5
1491
+ value: 15.203
1492
+ - type: mrr_at_1
1493
+ value: 11.644
1494
+ - type: mrr_at_10
1495
+ value: 17.625
1496
+ - type: mrr_at_100
1497
+ value: 18.608
1498
+ - type: mrr_at_1000
1499
+ value: 18.695999999999998
1500
+ - type: mrr_at_3
1501
+ value: 15.481
1502
+ - type: mrr_at_5
1503
+ value: 16.659
1504
+ - type: ndcg_at_1
1505
+ value: 11.628
1506
+ - type: ndcg_at_10
1507
+ value: 20.028000000000002
1508
+ - type: ndcg_at_100
1509
+ value: 25.505
1510
+ - type: ndcg_at_1000
1511
+ value: 28.288000000000004
1512
+ - type: ndcg_at_3
1513
+ value: 15.603
1514
+ - type: ndcg_at_5
1515
+ value: 17.642
1516
+ - type: precision_at_1
1517
+ value: 11.628
1518
+ - type: precision_at_10
1519
+ value: 3.5589999999999997
1520
+ - type: precision_at_100
1521
+ value: 0.664
1522
+ - type: precision_at_1000
1523
+ value: 0.092
1524
+ - type: precision_at_3
1525
+ value: 7.109999999999999
1526
+ - type: precision_at_5
1527
+ value: 5.401
1528
+ - type: recall_at_1
1529
+ value: 10.363
1530
+ - type: recall_at_10
1531
+ value: 30.586000000000002
1532
+ - type: recall_at_100
1533
+ value: 56.43
1534
+ - type: recall_at_1000
1535
+ value: 78.142
1536
+ - type: recall_at_3
1537
+ value: 18.651
1538
+ - type: recall_at_5
1539
+ value: 23.493
1540
+ - task:
1541
+ type: Retrieval
1542
+ dataset:
1543
+ type: deepset/germandpr
1544
+ name: MTEB GermanDPR
1545
+ config: default
1546
+ split: test
1547
+ revision: 5129d02422a66be600ac89cd3e8531b4f97d347d
1548
+ metrics:
1549
+ - type: map_at_1
1550
+ value: 60.78
1551
+ - type: map_at_10
1552
+ value: 73.91499999999999
1553
+ - type: map_at_100
1554
+ value: 74.089
1555
+ - type: map_at_1000
1556
+ value: 74.09400000000001
1557
+ - type: map_at_3
1558
+ value: 71.87
1559
+ - type: map_at_5
1560
+ value: 73.37700000000001
1561
+ - type: mrr_at_1
1562
+ value: 60.78
1563
+ - type: mrr_at_10
1564
+ value: 73.91499999999999
1565
+ - type: mrr_at_100
1566
+ value: 74.089
1567
+ - type: mrr_at_1000
1568
+ value: 74.09400000000001
1569
+ - type: mrr_at_3
1570
+ value: 71.87
1571
+ - type: mrr_at_5
1572
+ value: 73.37700000000001
1573
+ - type: ndcg_at_1
1574
+ value: 60.78
1575
+ - type: ndcg_at_10
1576
+ value: 79.35600000000001
1577
+ - type: ndcg_at_100
1578
+ value: 80.077
1579
+ - type: ndcg_at_1000
1580
+ value: 80.203
1581
+ - type: ndcg_at_3
1582
+ value: 75.393
1583
+ - type: ndcg_at_5
1584
+ value: 78.077
1585
+ - type: precision_at_1
1586
+ value: 60.78
1587
+ - type: precision_at_10
1588
+ value: 9.59
1589
+ - type: precision_at_100
1590
+ value: 0.9900000000000001
1591
+ - type: precision_at_1000
1592
+ value: 0.1
1593
+ - type: precision_at_3
1594
+ value: 28.52
1595
+ - type: precision_at_5
1596
+ value: 18.4
1597
+ - type: recall_at_1
1598
+ value: 60.78
1599
+ - type: recall_at_10
1600
+ value: 95.902
1601
+ - type: recall_at_100
1602
+ value: 99.024
1603
+ - type: recall_at_1000
1604
+ value: 100.0
1605
+ - type: recall_at_3
1606
+ value: 85.56099999999999
1607
+ - type: recall_at_5
1608
+ value: 92.0
1609
+ - task:
1610
+ type: STS
1611
+ dataset:
1612
+ type: jinaai/german-STSbenchmark
1613
+ name: MTEB GermanSTSBenchmark
1614
+ config: default
1615
+ split: test
1616
+ revision: 49d9b423b996fea62b483f9ee6dfb5ec233515ca
1617
+ metrics:
1618
+ - type: cos_sim_pearson
1619
+ value: 88.49524420894356
1620
+ - type: cos_sim_spearman
1621
+ value: 88.32407839427714
1622
+ - type: euclidean_pearson
1623
+ value: 87.25098779877104
1624
+ - type: euclidean_spearman
1625
+ value: 88.22738098593608
1626
+ - type: manhattan_pearson
1627
+ value: 87.23872691839607
1628
+ - type: manhattan_spearman
1629
+ value: 88.2002968380165
1630
+ - task:
1631
+ type: Retrieval
1632
+ dataset:
1633
+ type: hotpotqa
1634
+ name: MTEB HotpotQA
1635
+ config: default
1636
+ split: test
1637
+ revision: None
1638
+ metrics:
1639
+ - type: map_at_1
1640
+ value: 31.81
1641
+ - type: map_at_10
1642
+ value: 46.238
1643
+ - type: map_at_100
1644
+ value: 47.141
1645
+ - type: map_at_1000
1646
+ value: 47.213
1647
+ - type: map_at_3
1648
+ value: 43.248999999999995
1649
+ - type: map_at_5
1650
+ value: 45.078
1651
+ - type: mrr_at_1
1652
+ value: 63.619
1653
+ - type: mrr_at_10
1654
+ value: 71.279
1655
+ - type: mrr_at_100
1656
+ value: 71.648
1657
+ - type: mrr_at_1000
1658
+ value: 71.665
1659
+ - type: mrr_at_3
1660
+ value: 69.76599999999999
1661
+ - type: mrr_at_5
1662
+ value: 70.743
1663
+ - type: ndcg_at_1
1664
+ value: 63.619
1665
+ - type: ndcg_at_10
1666
+ value: 55.38999999999999
1667
+ - type: ndcg_at_100
1668
+ value: 58.80800000000001
1669
+ - type: ndcg_at_1000
1670
+ value: 60.331999999999994
1671
+ - type: ndcg_at_3
1672
+ value: 50.727
1673
+ - type: ndcg_at_5
1674
+ value: 53.284
1675
+ - type: precision_at_1
1676
+ value: 63.619
1677
+ - type: precision_at_10
1678
+ value: 11.668000000000001
1679
+ - type: precision_at_100
1680
+ value: 1.434
1681
+ - type: precision_at_1000
1682
+ value: 0.164
1683
+ - type: precision_at_3
1684
+ value: 32.001000000000005
1685
+ - type: precision_at_5
1686
+ value: 21.223
1687
+ - type: recall_at_1
1688
+ value: 31.81
1689
+ - type: recall_at_10
1690
+ value: 58.339
1691
+ - type: recall_at_100
1692
+ value: 71.708
1693
+ - type: recall_at_1000
1694
+ value: 81.85
1695
+ - type: recall_at_3
1696
+ value: 48.001
1697
+ - type: recall_at_5
1698
+ value: 53.059
1699
+ - task:
1700
+ type: Classification
1701
+ dataset:
1702
+ type: mteb/imdb
1703
+ name: MTEB ImdbClassification
1704
+ config: default
1705
+ split: test
1706
+ revision: 3d86128a09e091d6018b6d26cad27f2739fc2db7
1707
+ metrics:
1708
+ - type: accuracy
1709
+ value: 68.60640000000001
1710
+ - type: ap
1711
+ value: 62.84296904042086
1712
+ - type: f1
1713
+ value: 68.50643633327537
1714
+ - task:
1715
+ type: Reranking
1716
+ dataset:
1717
+ type: jinaai/miracl
1718
+ name: MTEB MIRACL
1719
+ config: default
1720
+ split: test
1721
+ revision: 8741c3b61cd36ed9ca1b3d4203543a41793239e2
1722
+ metrics:
1723
+ - type: map
1724
+ value: 64.29704335389768
1725
+ - type: mrr
1726
+ value: 72.11962197159565
1727
+ - task:
1728
+ type: Classification
1729
+ dataset:
1730
+ type: mteb/mtop_domain
1731
+ name: MTEB MTOPDomainClassification (en)
1732
+ config: en
1733
+ split: test
1734
+ revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf
1735
+ metrics:
1736
+ - type: accuracy
1737
+ value: 89.3844049247606
1738
+ - type: f1
1739
+ value: 89.2124328528015
1740
+ - task:
1741
+ type: Classification
1742
+ dataset:
1743
+ type: mteb/mtop_domain
1744
+ name: MTEB MTOPDomainClassification (de)
1745
+ config: de
1746
+ split: test
1747
+ revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf
1748
+ metrics:
1749
+ - type: accuracy
1750
+ value: 88.36855452240067
1751
+ - type: f1
1752
+ value: 87.35458822097442
1753
+ - task:
1754
+ type: Classification
1755
+ dataset:
1756
+ type: mteb/mtop_intent
1757
+ name: MTEB MTOPIntentClassification (en)
1758
+ config: en
1759
+ split: test
1760
+ revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba
1761
+ metrics:
1762
+ - type: accuracy
1763
+ value: 66.48654810761514
1764
+ - type: f1
1765
+ value: 50.07229882504409
1766
+ - task:
1767
+ type: Classification
1768
+ dataset:
1769
+ type: mteb/mtop_intent
1770
+ name: MTEB MTOPIntentClassification (de)
1771
+ config: de
1772
+ split: test
1773
+ revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba
1774
+ metrics:
1775
+ - type: accuracy
1776
+ value: 63.832065370526905
1777
+ - type: f1
1778
+ value: 46.283579383385806
1779
+ - task:
1780
+ type: Classification
1781
+ dataset:
1782
+ type: mteb/amazon_massive_intent
1783
+ name: MTEB MassiveIntentClassification (de)
1784
+ config: de
1785
+ split: test
1786
+ revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
1787
+ metrics:
1788
+ - type: accuracy
1789
+ value: 63.89038332212509
1790
+ - type: f1
1791
+ value: 61.86279849685129
1792
+ - task:
1793
+ type: Classification
1794
+ dataset:
1795
+ type: mteb/amazon_massive_intent
1796
+ name: MTEB MassiveIntentClassification (en)
1797
+ config: en
1798
+ split: test
1799
+ revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
1800
+ metrics:
1801
+ - type: accuracy
1802
+ value: 69.11230665770006
1803
+ - type: f1
1804
+ value: 67.44780095350535
1805
+ - task:
1806
+ type: Classification
1807
+ dataset:
1808
+ type: mteb/amazon_massive_scenario
1809
+ name: MTEB MassiveScenarioClassification (de)
1810
+ config: de
1811
+ split: test
1812
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1813
+ metrics:
1814
+ - type: accuracy
1815
+ value: 71.25084061869536
1816
+ - type: f1
1817
+ value: 71.43965023016408
1818
+ - task:
1819
+ type: Classification
1820
+ dataset:
1821
+ type: mteb/amazon_massive_scenario
1822
+ name: MTEB MassiveScenarioClassification (en)
1823
+ config: en
1824
+ split: test
1825
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1826
+ metrics:
1827
+ - type: accuracy
1828
+ value: 73.73907195696032
1829
+ - type: f1
1830
+ value: 73.69920814839061
1831
+ - task:
1832
+ type: Clustering
1833
+ dataset:
1834
+ type: mteb/medrxiv-clustering-p2p
1835
+ name: MTEB MedrxivClusteringP2P
1836
+ config: default
1837
+ split: test
1838
+ revision: e7a26af6f3ae46b30dde8737f02c07b1505bcc73
1839
+ metrics:
1840
+ - type: v_measure
1841
+ value: 31.32577306498249
1842
+ - task:
1843
+ type: Clustering
1844
+ dataset:
1845
+ type: mteb/medrxiv-clustering-s2s
1846
+ name: MTEB MedrxivClusteringS2S
1847
+ config: default
1848
+ split: test
1849
+ revision: 35191c8c0dca72d8ff3efcd72aa802307d469663
1850
+ metrics:
1851
+ - type: v_measure
1852
+ value: 28.759349326367783
1853
+ - task:
1854
+ type: Reranking
1855
+ dataset:
1856
+ type: mteb/mind_small
1857
+ name: MTEB MindSmallReranking
1858
+ config: default
1859
+ split: test
1860
+ revision: 3bdac13927fdc888b903db93b2ffdbd90b295a69
1861
+ metrics:
1862
+ - type: map
1863
+ value: 30.401342674703425
1864
+ - type: mrr
1865
+ value: 31.384379585660987
1866
+ - task:
1867
+ type: Retrieval
1868
+ dataset:
1869
+ type: nfcorpus
1870
+ name: MTEB NFCorpus
1871
+ config: default
1872
+ split: test
1873
+ revision: None
1874
+ metrics:
1875
+ - type: map_at_1
1876
+ value: 4.855
1877
+ - type: map_at_10
1878
+ value: 10.01
1879
+ - type: map_at_100
1880
+ value: 12.461
1881
+ - type: map_at_1000
1882
+ value: 13.776
1883
+ - type: map_at_3
1884
+ value: 7.252
1885
+ - type: map_at_5
1886
+ value: 8.679
1887
+ - type: mrr_at_1
1888
+ value: 41.176
1889
+ - type: mrr_at_10
1890
+ value: 49.323
1891
+ - type: mrr_at_100
1892
+ value: 49.954
1893
+ - type: mrr_at_1000
1894
+ value: 49.997
1895
+ - type: mrr_at_3
1896
+ value: 46.904
1897
+ - type: mrr_at_5
1898
+ value: 48.375
1899
+ - type: ndcg_at_1
1900
+ value: 39.318999999999996
1901
+ - type: ndcg_at_10
1902
+ value: 28.607
1903
+ - type: ndcg_at_100
1904
+ value: 26.554
1905
+ - type: ndcg_at_1000
1906
+ value: 35.731
1907
+ - type: ndcg_at_3
1908
+ value: 32.897999999999996
1909
+ - type: ndcg_at_5
1910
+ value: 31.53
1911
+ - type: precision_at_1
1912
+ value: 41.176
1913
+ - type: precision_at_10
1914
+ value: 20.867
1915
+ - type: precision_at_100
1916
+ value: 6.796
1917
+ - type: precision_at_1000
1918
+ value: 1.983
1919
+ - type: precision_at_3
1920
+ value: 30.547
1921
+ - type: precision_at_5
1922
+ value: 27.245
1923
+ - type: recall_at_1
1924
+ value: 4.855
1925
+ - type: recall_at_10
1926
+ value: 14.08
1927
+ - type: recall_at_100
1928
+ value: 28.188000000000002
1929
+ - type: recall_at_1000
1930
+ value: 60.07900000000001
1931
+ - type: recall_at_3
1932
+ value: 7.947
1933
+ - type: recall_at_5
1934
+ value: 10.786
1935
+ - task:
1936
+ type: Retrieval
1937
+ dataset:
1938
+ type: nq
1939
+ name: MTEB NQ
1940
+ config: default
1941
+ split: test
1942
+ revision: None
1943
+ metrics:
1944
+ - type: map_at_1
1945
+ value: 26.906999999999996
1946
+ - type: map_at_10
1947
+ value: 41.147
1948
+ - type: map_at_100
1949
+ value: 42.269
1950
+ - type: map_at_1000
1951
+ value: 42.308
1952
+ - type: map_at_3
1953
+ value: 36.638999999999996
1954
+ - type: map_at_5
1955
+ value: 39.285
1956
+ - type: mrr_at_1
1957
+ value: 30.359
1958
+ - type: mrr_at_10
1959
+ value: 43.607
1960
+ - type: mrr_at_100
1961
+ value: 44.454
1962
+ - type: mrr_at_1000
1963
+ value: 44.481
1964
+ - type: mrr_at_3
1965
+ value: 39.644
1966
+ - type: mrr_at_5
1967
+ value: 42.061
1968
+ - type: ndcg_at_1
1969
+ value: 30.330000000000002
1970
+ - type: ndcg_at_10
1971
+ value: 48.899
1972
+ - type: ndcg_at_100
1973
+ value: 53.612
1974
+ - type: ndcg_at_1000
1975
+ value: 54.51200000000001
1976
+ - type: ndcg_at_3
1977
+ value: 40.262
1978
+ - type: ndcg_at_5
1979
+ value: 44.787
1980
+ - type: precision_at_1
1981
+ value: 30.330000000000002
1982
+ - type: precision_at_10
1983
+ value: 8.323
1984
+ - type: precision_at_100
1985
+ value: 1.0959999999999999
1986
+ - type: precision_at_1000
1987
+ value: 0.11800000000000001
1988
+ - type: precision_at_3
1989
+ value: 18.395
1990
+ - type: precision_at_5
1991
+ value: 13.627
1992
+ - type: recall_at_1
1993
+ value: 26.906999999999996
1994
+ - type: recall_at_10
1995
+ value: 70.215
1996
+ - type: recall_at_100
1997
+ value: 90.61200000000001
1998
+ - type: recall_at_1000
1999
+ value: 97.294
2000
+ - type: recall_at_3
2001
+ value: 47.784
2002
+ - type: recall_at_5
2003
+ value: 58.251
2004
+ - task:
2005
+ type: PairClassification
2006
+ dataset:
2007
+ type: paws-x
2008
+ name: MTEB PawsX
2009
+ config: default
2010
+ split: test
2011
+ revision: 8a04d940a42cd40658986fdd8e3da561533a3646
2012
+ metrics:
2013
+ - type: cos_sim_accuracy
2014
+ value: 60.5
2015
+ - type: cos_sim_ap
2016
+ value: 57.606096528877494
2017
+ - type: cos_sim_f1
2018
+ value: 62.24240307369892
2019
+ - type: cos_sim_precision
2020
+ value: 45.27439024390244
2021
+ - type: cos_sim_recall
2022
+ value: 99.55307262569832
2023
+ - type: dot_accuracy
2024
+ value: 57.699999999999996
2025
+ - type: dot_ap
2026
+ value: 51.289351057160616
2027
+ - type: dot_f1
2028
+ value: 62.25953130465197
2029
+ - type: dot_precision
2030
+ value: 45.31568228105906
2031
+ - type: dot_recall
2032
+ value: 99.4413407821229
2033
+ - type: euclidean_accuracy
2034
+ value: 60.45
2035
+ - type: euclidean_ap
2036
+ value: 57.616461421424034
2037
+ - type: euclidean_f1
2038
+ value: 62.313697657913416
2039
+ - type: euclidean_precision
2040
+ value: 45.657826313052524
2041
+ - type: euclidean_recall
2042
+ value: 98.10055865921787
2043
+ - type: manhattan_accuracy
2044
+ value: 60.3
2045
+ - type: manhattan_ap
2046
+ value: 57.580565271667325
2047
+ - type: manhattan_f1
2048
+ value: 62.24240307369892
2049
+ - type: manhattan_precision
2050
+ value: 45.27439024390244
2051
+ - type: manhattan_recall
2052
+ value: 99.55307262569832
2053
+ - type: max_accuracy
2054
+ value: 60.5
2055
+ - type: max_ap
2056
+ value: 57.616461421424034
2057
+ - type: max_f1
2058
+ value: 62.313697657913416
2059
+ - task:
2060
+ type: Retrieval
2061
+ dataset:
2062
+ type: quora
2063
+ name: MTEB QuoraRetrieval
2064
+ config: default
2065
+ split: test
2066
+ revision: None
2067
+ metrics:
2068
+ - type: map_at_1
2069
+ value: 70.21300000000001
2070
+ - type: map_at_10
2071
+ value: 84.136
2072
+ - type: map_at_100
2073
+ value: 84.796
2074
+ - type: map_at_1000
2075
+ value: 84.812
2076
+ - type: map_at_3
2077
+ value: 81.182
2078
+ - type: map_at_5
2079
+ value: 83.027
2080
+ - type: mrr_at_1
2081
+ value: 80.91000000000001
2082
+ - type: mrr_at_10
2083
+ value: 87.155
2084
+ - type: mrr_at_100
2085
+ value: 87.27000000000001
2086
+ - type: mrr_at_1000
2087
+ value: 87.271
2088
+ - type: mrr_at_3
2089
+ value: 86.158
2090
+ - type: mrr_at_5
2091
+ value: 86.828
2092
+ - type: ndcg_at_1
2093
+ value: 80.88
2094
+ - type: ndcg_at_10
2095
+ value: 87.926
2096
+ - type: ndcg_at_100
2097
+ value: 89.223
2098
+ - type: ndcg_at_1000
2099
+ value: 89.321
2100
+ - type: ndcg_at_3
2101
+ value: 85.036
2102
+ - type: ndcg_at_5
2103
+ value: 86.614
2104
+ - type: precision_at_1
2105
+ value: 80.88
2106
+ - type: precision_at_10
2107
+ value: 13.350000000000001
2108
+ - type: precision_at_100
2109
+ value: 1.5310000000000001
2110
+ - type: precision_at_1000
2111
+ value: 0.157
2112
+ - type: precision_at_3
2113
+ value: 37.173
2114
+ - type: precision_at_5
2115
+ value: 24.476
2116
+ - type: recall_at_1
2117
+ value: 70.21300000000001
2118
+ - type: recall_at_10
2119
+ value: 95.12
2120
+ - type: recall_at_100
2121
+ value: 99.535
2122
+ - type: recall_at_1000
2123
+ value: 99.977
2124
+ - type: recall_at_3
2125
+ value: 86.833
2126
+ - type: recall_at_5
2127
+ value: 91.26100000000001
2128
+ - task:
2129
+ type: Clustering
2130
+ dataset:
2131
+ type: mteb/reddit-clustering
2132
+ name: MTEB RedditClustering
2133
+ config: default
2134
+ split: test
2135
+ revision: 24640382cdbf8abc73003fb0fa6d111a705499eb
2136
+ metrics:
2137
+ - type: v_measure
2138
+ value: 47.754688783184875
2139
+ - task:
2140
+ type: Clustering
2141
+ dataset:
2142
+ type: mteb/reddit-clustering-p2p
2143
+ name: MTEB RedditClusteringP2P
2144
+ config: default
2145
+ split: test
2146
+ revision: 282350215ef01743dc01b456c7f5241fa8937f16
2147
+ metrics:
2148
+ - type: v_measure
2149
+ value: 54.875736374329364
2150
+ - task:
2151
+ type: Retrieval
2152
+ dataset:
2153
+ type: scidocs
2154
+ name: MTEB SCIDOCS
2155
+ config: default
2156
+ split: test
2157
+ revision: None
2158
+ metrics:
2159
+ - type: map_at_1
2160
+ value: 3.773
2161
+ - type: map_at_10
2162
+ value: 9.447
2163
+ - type: map_at_100
2164
+ value: 11.1
2165
+ - type: map_at_1000
2166
+ value: 11.37
2167
+ - type: map_at_3
2168
+ value: 6.787
2169
+ - type: map_at_5
2170
+ value: 8.077
2171
+ - type: mrr_at_1
2172
+ value: 18.5
2173
+ - type: mrr_at_10
2174
+ value: 28.227000000000004
2175
+ - type: mrr_at_100
2176
+ value: 29.445
2177
+ - type: mrr_at_1000
2178
+ value: 29.515
2179
+ - type: mrr_at_3
2180
+ value: 25.2
2181
+ - type: mrr_at_5
2182
+ value: 27.055
2183
+ - type: ndcg_at_1
2184
+ value: 18.5
2185
+ - type: ndcg_at_10
2186
+ value: 16.29
2187
+ - type: ndcg_at_100
2188
+ value: 23.250999999999998
2189
+ - type: ndcg_at_1000
2190
+ value: 28.445999999999998
2191
+ - type: ndcg_at_3
2192
+ value: 15.376000000000001
2193
+ - type: ndcg_at_5
2194
+ value: 13.528
2195
+ - type: precision_at_1
2196
+ value: 18.5
2197
+ - type: precision_at_10
2198
+ value: 8.51
2199
+ - type: precision_at_100
2200
+ value: 1.855
2201
+ - type: precision_at_1000
2202
+ value: 0.311
2203
+ - type: precision_at_3
2204
+ value: 14.533
2205
+ - type: precision_at_5
2206
+ value: 12.0
2207
+ - type: recall_at_1
2208
+ value: 3.773
2209
+ - type: recall_at_10
2210
+ value: 17.282
2211
+ - type: recall_at_100
2212
+ value: 37.645
2213
+ - type: recall_at_1000
2214
+ value: 63.138000000000005
2215
+ - type: recall_at_3
2216
+ value: 8.853
2217
+ - type: recall_at_5
2218
+ value: 12.168
2219
+ - task:
2220
+ type: STS
2221
+ dataset:
2222
+ type: mteb/sickr-sts
2223
+ name: MTEB SICK-R
2224
+ config: default
2225
+ split: test
2226
+ revision: a6ea5a8cab320b040a23452cc28066d9beae2cee
2227
+ metrics:
2228
+ - type: cos_sim_pearson
2229
+ value: 85.32789517976525
2230
+ - type: cos_sim_spearman
2231
+ value: 80.32750384145629
2232
+ - type: euclidean_pearson
2233
+ value: 81.5025131452508
2234
+ - type: euclidean_spearman
2235
+ value: 80.24797115147175
2236
+ - type: manhattan_pearson
2237
+ value: 81.51634463412002
2238
+ - type: manhattan_spearman
2239
+ value: 80.24614721495055
2240
+ - task:
2241
+ type: STS
2242
+ dataset:
2243
+ type: mteb/sts12-sts
2244
+ name: MTEB STS12
2245
+ config: default
2246
+ split: test
2247
+ revision: a0d554a64d88156834ff5ae9920b964011b16384
2248
+ metrics:
2249
+ - type: cos_sim_pearson
2250
+ value: 88.47050448992432
2251
+ - type: cos_sim_spearman
2252
+ value: 80.58919997743621
2253
+ - type: euclidean_pearson
2254
+ value: 85.83258918113664
2255
+ - type: euclidean_spearman
2256
+ value: 80.97441389240902
2257
+ - type: manhattan_pearson
2258
+ value: 85.7798262013878
2259
+ - type: manhattan_spearman
2260
+ value: 80.97208703064196
2261
+ - task:
2262
+ type: STS
2263
+ dataset:
2264
+ type: mteb/sts13-sts
2265
+ name: MTEB STS13
2266
+ config: default
2267
+ split: test
2268
+ revision: 7e90230a92c190f1bf69ae9002b8cea547a64cca
2269
+ metrics:
2270
+ - type: cos_sim_pearson
2271
+ value: 85.95341439711532
2272
+ - type: cos_sim_spearman
2273
+ value: 86.59127484634989
2274
+ - type: euclidean_pearson
2275
+ value: 85.57850603454227
2276
+ - type: euclidean_spearman
2277
+ value: 86.47130477363419
2278
+ - type: manhattan_pearson
2279
+ value: 85.59387925447652
2280
+ - type: manhattan_spearman
2281
+ value: 86.50665427391583
2282
+ - task:
2283
+ type: STS
2284
+ dataset:
2285
+ type: mteb/sts14-sts
2286
+ name: MTEB STS14
2287
+ config: default
2288
+ split: test
2289
+ revision: 6031580fec1f6af667f0bd2da0a551cf4f0b2375
2290
+ metrics:
2291
+ - type: cos_sim_pearson
2292
+ value: 85.39810909161844
2293
+ - type: cos_sim_spearman
2294
+ value: 82.98595295546008
2295
+ - type: euclidean_pearson
2296
+ value: 84.04681129969951
2297
+ - type: euclidean_spearman
2298
+ value: 82.98197460689866
2299
+ - type: manhattan_pearson
2300
+ value: 83.9918798171185
2301
+ - type: manhattan_spearman
2302
+ value: 82.91148131768082
2303
+ - task:
2304
+ type: STS
2305
+ dataset:
2306
+ type: mteb/sts15-sts
2307
+ name: MTEB STS15
2308
+ config: default
2309
+ split: test
2310
+ revision: ae752c7c21bf194d8b67fd573edf7ae58183cbe3
2311
+ metrics:
2312
+ - type: cos_sim_pearson
2313
+ value: 88.02072712147692
2314
+ - type: cos_sim_spearman
2315
+ value: 88.78821332623012
2316
+ - type: euclidean_pearson
2317
+ value: 88.12132045572747
2318
+ - type: euclidean_spearman
2319
+ value: 88.74273451067364
2320
+ - type: manhattan_pearson
2321
+ value: 88.05431550059166
2322
+ - type: manhattan_spearman
2323
+ value: 88.67610233020723
2324
+ - task:
2325
+ type: STS
2326
+ dataset:
2327
+ type: mteb/sts16-sts
2328
+ name: MTEB STS16
2329
+ config: default
2330
+ split: test
2331
+ revision: 4d8694f8f0e0100860b497b999b3dbed754a0513
2332
+ metrics:
2333
+ - type: cos_sim_pearson
2334
+ value: 82.96134704624787
2335
+ - type: cos_sim_spearman
2336
+ value: 84.44062976314666
2337
+ - type: euclidean_pearson
2338
+ value: 84.03642536310323
2339
+ - type: euclidean_spearman
2340
+ value: 84.4535014579785
2341
+ - type: manhattan_pearson
2342
+ value: 83.92874228901483
2343
+ - type: manhattan_spearman
2344
+ value: 84.33634314951631
2345
+ - task:
2346
+ type: STS
2347
+ dataset:
2348
+ type: mteb/sts17-crosslingual-sts
2349
+ name: MTEB STS17 (en-de)
2350
+ config: en-de
2351
+ split: test
2352
+ revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d
2353
+ metrics:
2354
+ - type: cos_sim_pearson
2355
+ value: 87.3154168064887
2356
+ - type: cos_sim_spearman
2357
+ value: 86.72393652571682
2358
+ - type: euclidean_pearson
2359
+ value: 86.04193246174164
2360
+ - type: euclidean_spearman
2361
+ value: 86.30482896608093
2362
+ - type: manhattan_pearson
2363
+ value: 85.95524084651859
2364
+ - type: manhattan_spearman
2365
+ value: 86.06031431994282
2366
+ - task:
2367
+ type: STS
2368
+ dataset:
2369
+ type: mteb/sts17-crosslingual-sts
2370
+ name: MTEB STS17 (en-en)
2371
+ config: en-en
2372
+ split: test
2373
+ revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d
2374
+ metrics:
2375
+ - type: cos_sim_pearson
2376
+ value: 89.91079682750804
2377
+ - type: cos_sim_spearman
2378
+ value: 89.30961836617064
2379
+ - type: euclidean_pearson
2380
+ value: 88.86249564158628
2381
+ - type: euclidean_spearman
2382
+ value: 89.04772899592396
2383
+ - type: manhattan_pearson
2384
+ value: 88.85579791315043
2385
+ - type: manhattan_spearman
2386
+ value: 88.94190462541333
2387
+ - task:
2388
+ type: STS
2389
+ dataset:
2390
+ type: mteb/sts22-crosslingual-sts
2391
+ name: MTEB STS22 (en)
2392
+ config: en
2393
+ split: test
2394
+ revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80
2395
+ metrics:
2396
+ - type: cos_sim_pearson
2397
+ value: 67.00558145551088
2398
+ - type: cos_sim_spearman
2399
+ value: 67.96601170393878
2400
+ - type: euclidean_pearson
2401
+ value: 67.87627043214336
2402
+ - type: euclidean_spearman
2403
+ value: 66.76402572303859
2404
+ - type: manhattan_pearson
2405
+ value: 67.88306560555452
2406
+ - type: manhattan_spearman
2407
+ value: 66.6273862035506
2408
+ - task:
2409
+ type: STS
2410
+ dataset:
2411
+ type: mteb/sts22-crosslingual-sts
2412
+ name: MTEB STS22 (de)
2413
+ config: de
2414
+ split: test
2415
+ revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80
2416
+ metrics:
2417
+ - type: cos_sim_pearson
2418
+ value: 50.83759332748726
2419
+ - type: cos_sim_spearman
2420
+ value: 59.066344562858006
2421
+ - type: euclidean_pearson
2422
+ value: 50.08955848154131
2423
+ - type: euclidean_spearman
2424
+ value: 58.36517305855221
2425
+ - type: manhattan_pearson
2426
+ value: 50.05257267223111
2427
+ - type: manhattan_spearman
2428
+ value: 58.37570252804986
2429
+ - task:
2430
+ type: STS
2431
+ dataset:
2432
+ type: mteb/sts22-crosslingual-sts
2433
+ name: MTEB STS22 (de-en)
2434
+ config: de-en
2435
+ split: test
2436
+ revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80
2437
+ metrics:
2438
+ - type: cos_sim_pearson
2439
+ value: 59.22749007956492
2440
+ - type: cos_sim_spearman
2441
+ value: 55.97282077657827
2442
+ - type: euclidean_pearson
2443
+ value: 62.10661533695752
2444
+ - type: euclidean_spearman
2445
+ value: 53.62780854854067
2446
+ - type: manhattan_pearson
2447
+ value: 62.37138085709719
2448
+ - type: manhattan_spearman
2449
+ value: 54.17556356828155
2450
+ - task:
2451
+ type: STS
2452
+ dataset:
2453
+ type: mteb/stsbenchmark-sts
2454
+ name: MTEB STSBenchmark
2455
+ config: default
2456
+ split: test
2457
+ revision: b0fddb56ed78048fa8b90373c8a3cfc37b684831
2458
+ metrics:
2459
+ - type: cos_sim_pearson
2460
+ value: 87.91145397065878
2461
+ - type: cos_sim_spearman
2462
+ value: 88.13960018389005
2463
+ - type: euclidean_pearson
2464
+ value: 87.67618876224006
2465
+ - type: euclidean_spearman
2466
+ value: 87.99119480810556
2467
+ - type: manhattan_pearson
2468
+ value: 87.67920297334753
2469
+ - type: manhattan_spearman
2470
+ value: 87.99113250064492
2471
+ - task:
2472
+ type: Reranking
2473
+ dataset:
2474
+ type: mteb/scidocs-reranking
2475
+ name: MTEB SciDocsRR
2476
+ config: default
2477
+ split: test
2478
+ revision: d3c5e1fc0b855ab6097bf1cda04dd73947d7caab
2479
+ metrics:
2480
+ - type: map
2481
+ value: 78.09133563707582
2482
+ - type: mrr
2483
+ value: 93.2415288052543
2484
+ - task:
2485
+ type: Retrieval
2486
+ dataset:
2487
+ type: scifact
2488
+ name: MTEB SciFact
2489
+ config: default
2490
+ split: test
2491
+ revision: None
2492
+ metrics:
2493
+ - type: map_at_1
2494
+ value: 47.760999999999996
2495
+ - type: map_at_10
2496
+ value: 56.424
2497
+ - type: map_at_100
2498
+ value: 57.24399999999999
2499
+ - type: map_at_1000
2500
+ value: 57.278
2501
+ - type: map_at_3
2502
+ value: 53.68000000000001
2503
+ - type: map_at_5
2504
+ value: 55.442
2505
+ - type: mrr_at_1
2506
+ value: 50.666999999999994
2507
+ - type: mrr_at_10
2508
+ value: 58.012
2509
+ - type: mrr_at_100
2510
+ value: 58.736
2511
+ - type: mrr_at_1000
2512
+ value: 58.769000000000005
2513
+ - type: mrr_at_3
2514
+ value: 56.056
2515
+ - type: mrr_at_5
2516
+ value: 57.321999999999996
2517
+ - type: ndcg_at_1
2518
+ value: 50.666999999999994
2519
+ - type: ndcg_at_10
2520
+ value: 60.67700000000001
2521
+ - type: ndcg_at_100
2522
+ value: 64.513
2523
+ - type: ndcg_at_1000
2524
+ value: 65.62400000000001
2525
+ - type: ndcg_at_3
2526
+ value: 56.186
2527
+ - type: ndcg_at_5
2528
+ value: 58.692
2529
+ - type: precision_at_1
2530
+ value: 50.666999999999994
2531
+ - type: precision_at_10
2532
+ value: 8.200000000000001
2533
+ - type: precision_at_100
2534
+ value: 1.023
2535
+ - type: precision_at_1000
2536
+ value: 0.11199999999999999
2537
+ - type: precision_at_3
2538
+ value: 21.889
2539
+ - type: precision_at_5
2540
+ value: 14.866999999999999
2541
+ - type: recall_at_1
2542
+ value: 47.760999999999996
2543
+ - type: recall_at_10
2544
+ value: 72.006
2545
+ - type: recall_at_100
2546
+ value: 89.767
2547
+ - type: recall_at_1000
2548
+ value: 98.833
2549
+ - type: recall_at_3
2550
+ value: 60.211000000000006
2551
+ - type: recall_at_5
2552
+ value: 66.3
2553
+ - task:
2554
+ type: PairClassification
2555
+ dataset:
2556
+ type: mteb/sprintduplicatequestions-pairclassification
2557
+ name: MTEB SprintDuplicateQuestions
2558
+ config: default
2559
+ split: test
2560
+ revision: d66bd1f72af766a5cc4b0ca5e00c162f89e8cc46
2561
+ metrics:
2562
+ - type: cos_sim_accuracy
2563
+ value: 99.79009900990098
2564
+ - type: cos_sim_ap
2565
+ value: 94.86690691995835
2566
+ - type: cos_sim_f1
2567
+ value: 89.37875751503007
2568
+ - type: cos_sim_precision
2569
+ value: 89.5582329317269
2570
+ - type: cos_sim_recall
2571
+ value: 89.2
2572
+ - type: dot_accuracy
2573
+ value: 99.76336633663367
2574
+ - type: dot_ap
2575
+ value: 94.26453740761586
2576
+ - type: dot_f1
2577
+ value: 88.00783162016641
2578
+ - type: dot_precision
2579
+ value: 86.19367209971237
2580
+ - type: dot_recall
2581
+ value: 89.9
2582
+ - type: euclidean_accuracy
2583
+ value: 99.7940594059406
2584
+ - type: euclidean_ap
2585
+ value: 94.85459757524379
2586
+ - type: euclidean_f1
2587
+ value: 89.62779156327544
2588
+ - type: euclidean_precision
2589
+ value: 88.96551724137932
2590
+ - type: euclidean_recall
2591
+ value: 90.3
2592
+ - type: manhattan_accuracy
2593
+ value: 99.79009900990098
2594
+ - type: manhattan_ap
2595
+ value: 94.76971336654465
2596
+ - type: manhattan_f1
2597
+ value: 89.35323383084577
2598
+ - type: manhattan_precision
2599
+ value: 88.91089108910892
2600
+ - type: manhattan_recall
2601
+ value: 89.8
2602
+ - type: max_accuracy
2603
+ value: 99.7940594059406
2604
+ - type: max_ap
2605
+ value: 94.86690691995835
2606
+ - type: max_f1
2607
+ value: 89.62779156327544
2608
+ - task:
2609
+ type: Clustering
2610
+ dataset:
2611
+ type: mteb/stackexchange-clustering
2612
+ name: MTEB StackExchangeClustering
2613
+ config: default
2614
+ split: test
2615
+ revision: 6cbc1f7b2bc0622f2e39d2c77fa502909748c259
2616
+ metrics:
2617
+ - type: v_measure
2618
+ value: 55.38197670064987
2619
+ - task:
2620
+ type: Clustering
2621
+ dataset:
2622
+ type: mteb/stackexchange-clustering-p2p
2623
+ name: MTEB StackExchangeClusteringP2P
2624
+ config: default
2625
+ split: test
2626
+ revision: 815ca46b2622cec33ccafc3735d572c266efdb44
2627
+ metrics:
2628
+ - type: v_measure
2629
+ value: 33.08330158937971
2630
+ - task:
2631
+ type: Reranking
2632
+ dataset:
2633
+ type: mteb/stackoverflowdupquestions-reranking
2634
+ name: MTEB StackOverflowDupQuestions
2635
+ config: default
2636
+ split: test
2637
+ revision: e185fbe320c72810689fc5848eb6114e1ef5ec69
2638
+ metrics:
2639
+ - type: map
2640
+ value: 49.50367079063226
2641
+ - type: mrr
2642
+ value: 50.30444943128768
2643
+ - task:
2644
+ type: Summarization
2645
+ dataset:
2646
+ type: mteb/summeval
2647
+ name: MTEB SummEval
2648
+ config: default
2649
+ split: test
2650
+ revision: cda12ad7615edc362dbf25a00fdd61d3b1eaf93c
2651
+ metrics:
2652
+ - type: cos_sim_pearson
2653
+ value: 30.37739520909561
2654
+ - type: cos_sim_spearman
2655
+ value: 31.548500943973913
2656
+ - type: dot_pearson
2657
+ value: 29.983610104303
2658
+ - type: dot_spearman
2659
+ value: 29.90185869098618
2660
+ - task:
2661
+ type: Retrieval
2662
+ dataset:
2663
+ type: trec-covid
2664
+ name: MTEB TRECCOVID
2665
+ config: default
2666
+ split: test
2667
+ revision: None
2668
+ metrics:
2669
+ - type: map_at_1
2670
+ value: 0.198
2671
+ - type: map_at_10
2672
+ value: 1.5810000000000002
2673
+ - type: map_at_100
2674
+ value: 9.064
2675
+ - type: map_at_1000
2676
+ value: 22.161
2677
+ - type: map_at_3
2678
+ value: 0.536
2679
+ - type: map_at_5
2680
+ value: 0.8370000000000001
2681
+ - type: mrr_at_1
2682
+ value: 80.0
2683
+ - type: mrr_at_10
2684
+ value: 86.75
2685
+ - type: mrr_at_100
2686
+ value: 86.799
2687
+ - type: mrr_at_1000
2688
+ value: 86.799
2689
+ - type: mrr_at_3
2690
+ value: 85.0
2691
+ - type: mrr_at_5
2692
+ value: 86.5
2693
+ - type: ndcg_at_1
2694
+ value: 73.0
2695
+ - type: ndcg_at_10
2696
+ value: 65.122
2697
+ - type: ndcg_at_100
2698
+ value: 51.853
2699
+ - type: ndcg_at_1000
2700
+ value: 47.275
2701
+ - type: ndcg_at_3
2702
+ value: 66.274
2703
+ - type: ndcg_at_5
2704
+ value: 64.826
2705
+ - type: precision_at_1
2706
+ value: 80.0
2707
+ - type: precision_at_10
2708
+ value: 70.19999999999999
2709
+ - type: precision_at_100
2710
+ value: 53.480000000000004
2711
+ - type: precision_at_1000
2712
+ value: 20.946
2713
+ - type: precision_at_3
2714
+ value: 71.333
2715
+ - type: precision_at_5
2716
+ value: 70.0
2717
+ - type: recall_at_1
2718
+ value: 0.198
2719
+ - type: recall_at_10
2720
+ value: 1.884
2721
+ - type: recall_at_100
2722
+ value: 12.57
2723
+ - type: recall_at_1000
2724
+ value: 44.208999999999996
2725
+ - type: recall_at_3
2726
+ value: 0.5890000000000001
2727
+ - type: recall_at_5
2728
+ value: 0.95
2729
+ - task:
2730
+ type: Clustering
2731
+ dataset:
2732
+ type: slvnwhrl/tenkgnad-clustering-p2p
2733
+ name: MTEB TenKGnadClusteringP2P
2734
+ config: default
2735
+ split: test
2736
+ revision: 5c59e41555244b7e45c9a6be2d720ab4bafae558
2737
+ metrics:
2738
+ - type: v_measure
2739
+ value: 42.84199261133083
2740
+ - task:
2741
+ type: Clustering
2742
+ dataset:
2743
+ type: slvnwhrl/tenkgnad-clustering-s2s
2744
+ name: MTEB TenKGnadClusteringS2S
2745
+ config: default
2746
+ split: test
2747
+ revision: 6cddbe003f12b9b140aec477b583ac4191f01786
2748
+ metrics:
2749
+ - type: v_measure
2750
+ value: 23.689557114798838
2751
+ - task:
2752
+ type: Retrieval
2753
+ dataset:
2754
+ type: webis-touche2020
2755
+ name: MTEB Touche2020
2756
+ config: default
2757
+ split: test
2758
+ revision: None
2759
+ metrics:
2760
+ - type: map_at_1
2761
+ value: 1.941
2762
+ - type: map_at_10
2763
+ value: 8.222
2764
+ - type: map_at_100
2765
+ value: 14.277999999999999
2766
+ - type: map_at_1000
2767
+ value: 15.790000000000001
2768
+ - type: map_at_3
2769
+ value: 4.4670000000000005
2770
+ - type: map_at_5
2771
+ value: 5.762
2772
+ - type: mrr_at_1
2773
+ value: 24.490000000000002
2774
+ - type: mrr_at_10
2775
+ value: 38.784
2776
+ - type: mrr_at_100
2777
+ value: 39.724
2778
+ - type: mrr_at_1000
2779
+ value: 39.724
2780
+ - type: mrr_at_3
2781
+ value: 33.333
2782
+ - type: mrr_at_5
2783
+ value: 37.415
2784
+ - type: ndcg_at_1
2785
+ value: 22.448999999999998
2786
+ - type: ndcg_at_10
2787
+ value: 21.026
2788
+ - type: ndcg_at_100
2789
+ value: 33.721000000000004
2790
+ - type: ndcg_at_1000
2791
+ value: 45.045
2792
+ - type: ndcg_at_3
2793
+ value: 20.053
2794
+ - type: ndcg_at_5
2795
+ value: 20.09
2796
+ - type: precision_at_1
2797
+ value: 24.490000000000002
2798
+ - type: precision_at_10
2799
+ value: 19.796
2800
+ - type: precision_at_100
2801
+ value: 7.469
2802
+ - type: precision_at_1000
2803
+ value: 1.48
2804
+ - type: precision_at_3
2805
+ value: 21.769
2806
+ - type: precision_at_5
2807
+ value: 21.224
2808
+ - type: recall_at_1
2809
+ value: 1.941
2810
+ - type: recall_at_10
2811
+ value: 14.915999999999999
2812
+ - type: recall_at_100
2813
+ value: 46.155
2814
+ - type: recall_at_1000
2815
+ value: 80.664
2816
+ - type: recall_at_3
2817
+ value: 5.629
2818
+ - type: recall_at_5
2819
+ value: 8.437
2820
+ - task:
2821
+ type: Classification
2822
+ dataset:
2823
+ type: mteb/toxic_conversations_50k
2824
+ name: MTEB ToxicConversationsClassification
2825
+ config: default
2826
+ split: test
2827
+ revision: d7c0de2777da35d6aae2200a62c6e0e5af397c4c
2828
+ metrics:
2829
+ - type: accuracy
2830
+ value: 69.64800000000001
2831
+ - type: ap
2832
+ value: 12.914826731261094
2833
+ - type: f1
2834
+ value: 53.05213503422915
2835
+ - task:
2836
+ type: Classification
2837
+ dataset:
2838
+ type: mteb/tweet_sentiment_extraction
2839
+ name: MTEB TweetSentimentExtractionClassification
2840
+ config: default
2841
+ split: test
2842
+ revision: d604517c81ca91fe16a244d1248fc021f9ecee7a
2843
+ metrics:
2844
+ - type: accuracy
2845
+ value: 60.427277872099594
2846
+ - type: f1
2847
+ value: 60.78292007556828
2848
+ - task:
2849
+ type: Clustering
2850
+ dataset:
2851
+ type: mteb/twentynewsgroups-clustering
2852
+ name: MTEB TwentyNewsgroupsClustering
2853
+ config: default
2854
+ split: test
2855
+ revision: 6125ec4e24fa026cec8a478383ee943acfbd5449
2856
+ metrics:
2857
+ - type: v_measure
2858
+ value: 40.48134168406559
2859
+ - task:
2860
+ type: PairClassification
2861
+ dataset:
2862
+ type: mteb/twittersemeval2015-pairclassification
2863
+ name: MTEB TwitterSemEval2015
2864
+ config: default
2865
+ split: test
2866
+ revision: 70970daeab8776df92f5ea462b6173c0b46fd2d1
2867
+ metrics:
2868
+ - type: cos_sim_accuracy
2869
+ value: 84.79465935506944
2870
+ - type: cos_sim_ap
2871
+ value: 70.24589055290592
2872
+ - type: cos_sim_f1
2873
+ value: 65.0994575045208
2874
+ - type: cos_sim_precision
2875
+ value: 63.76518218623482
2876
+ - type: cos_sim_recall
2877
+ value: 66.49076517150397
2878
+ - type: dot_accuracy
2879
+ value: 84.63968528342374
2880
+ - type: dot_ap
2881
+ value: 69.84683095084355
2882
+ - type: dot_f1
2883
+ value: 64.50606169727523
2884
+ - type: dot_precision
2885
+ value: 59.1719885487778
2886
+ - type: dot_recall
2887
+ value: 70.89709762532982
2888
+ - type: euclidean_accuracy
2889
+ value: 84.76485664898374
2890
+ - type: euclidean_ap
2891
+ value: 70.20556438685551
2892
+ - type: euclidean_f1
2893
+ value: 65.06796614516543
2894
+ - type: euclidean_precision
2895
+ value: 63.29840319361277
2896
+ - type: euclidean_recall
2897
+ value: 66.93931398416886
2898
+ - type: manhattan_accuracy
2899
+ value: 84.72313286046374
2900
+ - type: manhattan_ap
2901
+ value: 70.17151475534308
2902
+ - type: manhattan_f1
2903
+ value: 65.31379180759113
2904
+ - type: manhattan_precision
2905
+ value: 62.17505366086334
2906
+ - type: manhattan_recall
2907
+ value: 68.7862796833773
2908
+ - type: max_accuracy
2909
+ value: 84.79465935506944
2910
+ - type: max_ap
2911
+ value: 70.24589055290592
2912
+ - type: max_f1
2913
+ value: 65.31379180759113
2914
+ - task:
2915
+ type: PairClassification
2916
+ dataset:
2917
+ type: mteb/twitterurlcorpus-pairclassification
2918
+ name: MTEB TwitterURLCorpus
2919
+ config: default
2920
+ split: test
2921
+ revision: 8b6510b0b1fa4e4c4f879467980e9be563ec1cdf
2922
+ metrics:
2923
+ - type: cos_sim_accuracy
2924
+ value: 88.95874568246207
2925
+ - type: cos_sim_ap
2926
+ value: 85.82517548264127
2927
+ - type: cos_sim_f1
2928
+ value: 78.22288041466125
2929
+ - type: cos_sim_precision
2930
+ value: 75.33875338753387
2931
+ - type: cos_sim_recall
2932
+ value: 81.33661841700031
2933
+ - type: dot_accuracy
2934
+ value: 88.836496293709
2935
+ - type: dot_ap
2936
+ value: 85.53430720252186
2937
+ - type: dot_f1
2938
+ value: 78.10616085869725
2939
+ - type: dot_precision
2940
+ value: 74.73269555430501
2941
+ - type: dot_recall
2942
+ value: 81.79858330766862
2943
+ - type: euclidean_accuracy
2944
+ value: 88.92769821865176
2945
+ - type: euclidean_ap
2946
+ value: 85.65904346964223
2947
+ - type: euclidean_f1
2948
+ value: 77.98774074208407
2949
+ - type: euclidean_precision
2950
+ value: 73.72282795035315
2951
+ - type: euclidean_recall
2952
+ value: 82.77640899291654
2953
+ - type: manhattan_accuracy
2954
+ value: 88.86366282454303
2955
+ - type: manhattan_ap
2956
+ value: 85.61599642231819
2957
+ - type: manhattan_f1
2958
+ value: 78.01480509061737
2959
+ - type: manhattan_precision
2960
+ value: 74.10460685833044
2961
+ - type: manhattan_recall
2962
+ value: 82.36064059131506
2963
+ - type: max_accuracy
2964
+ value: 88.95874568246207
2965
+ - type: max_ap
2966
+ value: 85.82517548264127
2967
+ - type: max_f1
2968
+ value: 78.22288041466125
2969
+ - task:
2970
+ type: Retrieval
2971
+ dataset:
2972
+ type: None
2973
+ name: MTEB WikiCLIR
2974
+ config: default
2975
+ split: test
2976
+ revision: None
2977
+ metrics:
2978
+ - type: map_at_1
2979
+ value: 3.9539999999999997
2980
+ - type: map_at_10
2981
+ value: 7.407
2982
+ - type: map_at_100
2983
+ value: 8.677999999999999
2984
+ - type: map_at_1000
2985
+ value: 9.077
2986
+ - type: map_at_3
2987
+ value: 5.987
2988
+ - type: map_at_5
2989
+ value: 6.6979999999999995
2990
+ - type: mrr_at_1
2991
+ value: 35.65
2992
+ - type: mrr_at_10
2993
+ value: 45.097
2994
+ - type: mrr_at_100
2995
+ value: 45.83
2996
+ - type: mrr_at_1000
2997
+ value: 45.871
2998
+ - type: mrr_at_3
2999
+ value: 42.63
3000
+ - type: mrr_at_5
3001
+ value: 44.104
3002
+ - type: ndcg_at_1
3003
+ value: 29.215000000000003
3004
+ - type: ndcg_at_10
3005
+ value: 22.694
3006
+ - type: ndcg_at_100
3007
+ value: 22.242
3008
+ - type: ndcg_at_1000
3009
+ value: 27.069
3010
+ - type: ndcg_at_3
3011
+ value: 27.641
3012
+ - type: ndcg_at_5
3013
+ value: 25.503999999999998
3014
+ - type: precision_at_1
3015
+ value: 35.65
3016
+ - type: precision_at_10
3017
+ value: 12.795000000000002
3018
+ - type: precision_at_100
3019
+ value: 3.354
3020
+ - type: precision_at_1000
3021
+ value: 0.743
3022
+ - type: precision_at_3
3023
+ value: 23.403
3024
+ - type: precision_at_5
3025
+ value: 18.474
3026
+ - type: recall_at_1
3027
+ value: 3.9539999999999997
3028
+ - type: recall_at_10
3029
+ value: 11.301
3030
+ - type: recall_at_100
3031
+ value: 22.919999999999998
3032
+ - type: recall_at_1000
3033
+ value: 40.146
3034
+ - type: recall_at_3
3035
+ value: 7.146
3036
+ - type: recall_at_5
3037
+ value: 8.844000000000001
3038
+ - task:
3039
+ type: Retrieval
3040
+ dataset:
3041
+ type: jinaai/xmarket_de
3042
+ name: MTEB XMarket
3043
+ config: default
3044
+ split: test
3045
+ revision: 2336818db4c06570fcdf263e1bcb9993b786f67a
3046
+ metrics:
3047
+ - type: map_at_1
3048
+ value: 4.872
3049
+ - type: map_at_10
3050
+ value: 10.658
3051
+ - type: map_at_100
3052
+ value: 13.422999999999998
3053
+ - type: map_at_1000
3054
+ value: 14.245
3055
+ - type: map_at_3
3056
+ value: 7.857
3057
+ - type: map_at_5
3058
+ value: 9.142999999999999
3059
+ - type: mrr_at_1
3060
+ value: 16.744999999999997
3061
+ - type: mrr_at_10
3062
+ value: 24.416
3063
+ - type: mrr_at_100
3064
+ value: 25.432
3065
+ - type: mrr_at_1000
3066
+ value: 25.502999999999997
3067
+ - type: mrr_at_3
3068
+ value: 22.096
3069
+ - type: mrr_at_5
3070
+ value: 23.421
3071
+ - type: ndcg_at_1
3072
+ value: 16.695999999999998
3073
+ - type: ndcg_at_10
3074
+ value: 18.66
3075
+ - type: ndcg_at_100
3076
+ value: 24.314
3077
+ - type: ndcg_at_1000
3078
+ value: 29.846
3079
+ - type: ndcg_at_3
3080
+ value: 17.041999999999998
3081
+ - type: ndcg_at_5
3082
+ value: 17.585
3083
+ - type: precision_at_1
3084
+ value: 16.695999999999998
3085
+ - type: precision_at_10
3086
+ value: 10.374
3087
+ - type: precision_at_100
3088
+ value: 3.988
3089
+ - type: precision_at_1000
3090
+ value: 1.1860000000000002
3091
+ - type: precision_at_3
3092
+ value: 14.21
3093
+ - type: precision_at_5
3094
+ value: 12.623000000000001
3095
+ - type: recall_at_1
3096
+ value: 4.872
3097
+ - type: recall_at_10
3098
+ value: 18.624
3099
+ - type: recall_at_100
3100
+ value: 40.988
3101
+ - type: recall_at_1000
3102
+ value: 65.33
3103
+ - type: recall_at_3
3104
+ value: 10.162
3105
+ - type: recall_at_5
3106
+ value: 13.517999999999999
3107
+ ---
3108
+ <!-- TODO: add evaluation results here -->
3109
+ <br><br>
3110
+
3111
+ <p align="center">
3112
+ <img src="https://aeiljuispo.cloudimg.io/v7/https://cdn-uploads.huggingface.co/production/uploads/603763514de52ff951d89793/AFoybzd5lpBQXEBrQHuTt.png?w=200&h=200&f=face" alt="Jina AI logo: Jina AI is your Portal to Multimodal AI" width="150px">
3113
+ </p>
3114
+
3115
+
3116
+ <p align="center">
3117
+ <b>The text embedding set trained by <a href="https://jina.ai/"><b>Jina AI</b></a>.</b>
3118
+ </p>
3119
+
3120
+ ## Quick Start
3121
+
3122
+ The easiest way to start using `jina-embeddings-v2-base-de` is through Jina AI's [Embedding API](https://jina.ai/embeddings/).
3123
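+
+ For illustration, here is a minimal sketch of calling the hosted API from Python with `requests`. The endpoint URL, the `model`/`input` payload fields, and the `JINA_API_KEY` environment variable are assumptions based on the public Embedding API docs rather than part of this model card — check https://jina.ai/embeddings/ for the authoritative request format.
+
+ ```python
+ import os
+ import requests
+
+ # Assumed endpoint and payload schema; verify against the official API documentation.
+ resp = requests.post(
+     "https://api.jina.ai/v1/embeddings",
+     headers={"Authorization": f"Bearer {os.environ['JINA_API_KEY']}"},
+     json={
+         "model": "jina-embeddings-v2-base-de",
+         "input": ["How is the weather today?", "Wie ist das Wetter heute?"],
+     },
+ )
+ resp.raise_for_status()
+ embeddings = [item["embedding"] for item in resp.json()["data"]]
+ print(len(embeddings), len(embeddings[0]))
+ ```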
+
3124
+ ## Intended Usage & Model Info
3125
+
3126
+ `jina-embeddings-v2-base-de` is a German/English bilingual text **embedding model** supporting an **8192-token sequence length**.
3127
+ It is based on a BERT architecture (JinaBERT) that supports the symmetric bidirectional variant of [ALiBi](https://arxiv.org/abs/2108.12409) to allow longer sequence lengths.
3128
+ We have designed it for high performance in mono-lingual & cross-lingual applications and trained it specifically to support mixed German-English input without bias.
3129
+ Additionally, we provide the following embedding models:
3130
+
3131
+ `jina-embeddings-v2-base-de` ist ein zweisprachiges **Text Embedding Modell** für Deutsch und Englisch,
3132
+ welches Texteingaben mit einer Länge von bis zu **8192 Token unterstützt**.
3133
+ Es basiert auf der adaptierten Bert-Modell-Architektur JinaBERT,
3134
+ welche mithilfe einer symmetrischen Variante von [ALiBi](https://arxiv.org/abs/2108.12409) längere Eingabetexte erlaubt.
3135
+ Wir haben das Modell für hohe Performance in einsprachigen und cross-lingualen Anwendungen entwickelt und speziell darauf trainiert,
3136
+ gemischte deutsch-englische Eingaben ohne einen Bias zu kodieren.
3137
+ Des Weiteren stellen wir folgende Embedding-Modelle bereit:
3138
+
3139
+ - [`jina-embeddings-v2-small-en`](https://huggingface.co/jinaai/jina-embeddings-v2-small-en): 33 million parameters.
3140
+ - [`jina-embeddings-v2-base-en`](https://huggingface.co/jinaai/jina-embeddings-v2-base-en): 137 million parameters.
3141
+ - [`jina-embeddings-v2-base-zh`](https://huggingface.co/jinaai/jina-embeddings-v2-base-zh): 161 million parameters Chinese-English Bilingual embeddings.
3142
+ - [`jina-embeddings-v2-base-de`](https://huggingface.co/jinaai/jina-embeddings-v2-base-de): 161 million parameters German-English Bilingual embeddings **(you are here)**.
3143
+ - [`jina-embeddings-v2-base-es`](): Spanish-English Bilingual embeddings (soon).
3144
+
3145
+ ## Data & Parameters
3146
+
3147
+ We will publish a report with technical details about the training of the bilingual models soon.
3148
+ The training of the English model is described in this [technical report](https://arxiv.org/abs/2310.19923).
3149
+
3150
+ ## Usage
3151
+
3152
+ **<details><summary>Please apply mean pooling when integrating the model.</summary>**
3153
+ <p>
3154
+
3155
+ ### Why mean pooling?
3156
+
3157
+ `mean pooling` takes all token embeddings from the model output and averages them at the sentence/paragraph level.
3158
+ It has proven to be the most effective way to produce high-quality sentence embeddings.
3159
+ We offer an `encode` function to deal with this.
3160
+
3161
+ However, if you would like to do it without using the default `encode` function:
3162
+
3163
+ ```python
3164
+ import torch
3165
+ import torch.nn.functional as F
3166
+ from transformers import AutoTokenizer, AutoModel
3167
+
3168
+ def mean_pooling(model_output, attention_mask):
3169
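+     # Mask out padding positions with the attention mask, then average the remaining token embeddings per input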
+     token_embeddings = model_output[0]
3170
+     input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
3171
+     return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)
3172
+
3173
+ sentences = ['How is the weather today?', 'What is the current weather like today?']
3174
+
3175
+ tokenizer = AutoTokenizer.from_pretrained('jinaai/jina-embeddings-v2-base-de')
3176
+ model = AutoModel.from_pretrained('jinaai/jina-embeddings-v2-base-de', trust_remote_code=True)
3177
+
3178
+ encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
3179
+
3180
+ with torch.no_grad():
3181
+     model_output = model(**encoded_input)
3182
+
3183
+ embeddings = mean_pooling(model_output, encoded_input['attention_mask'])
3184
+ embeddings = F.normalize(embeddings, p=2, dim=1)
3185
+ ```
3186
+
3187
+ </p>
3188
+ </details>
3189
+
3190
+ You can use Jina Embedding models directly from the `transformers` package.
3191
+
3192
+ First, make sure that you are logged into Hugging Face. You can use the huggingface-cli tool (after installing the `transformers` package) and pass your [Hugging Face access token](https://huggingface.co/docs/hub/security-tokens):
3193
+ ```bash
3194
+ huggingface-cli login
3195
+ ```
3196
+ Alternatively, you can provide the access token as an environment variable in the shell:
3197
+ ```bash
3198
+ export HF_TOKEN="<your token here>"
3199
+ ```
3200
+ or in Python:
3201
+ ```python
3202
+ import os
3203
+
3204
+ os.environ['HF_TOKEN'] = "<your token here>"
3205
+ ```
3206
+
3207
+ Then, you can load and use the model via the `AutoModel` class:
3208
+ ```python
3209
+ !pip install transformers
3210
+ from transformers import AutoModel
3211
+ from numpy.linalg import norm
3212
+
3213
+ cos_sim = lambda a,b: (a @ b.T) / (norm(a)*norm(b))
3214
+ model = AutoModel.from_pretrained('jinaai/jina-embeddings-v2-base-de', trust_remote_code=True) # trust_remote_code is needed to use the encode method
3215
+ embeddings = model.encode(['How is the weather today?', 'Wie ist das Wetter heute?'])
3216
+ print(cos_sim(embeddings[0], embeddings[1]))
3217
+ ```
3218
+
3219
+ If you only want to handle shorter sequences, such as 2k tokens, pass the `max_length` parameter to the `encode` function:
3220
+
3221
+ ```python
3222
+ embeddings = model.encode(
3223
+     ['Very long ... document'],
3224
+     max_length=2048
3225
+ )
3226
+ ```
3227
+
3228
+ As of its latest release (v2.3.0), `sentence-transformers` also supports Jina embeddings (please make sure that you are logged into Hugging Face as well):
3229
+
3230
+ ```python
3231
+ !pip install -U sentence-transformers
3232
+ from sentence_transformers import SentenceTransformer
3233
+ from sentence_transformers.util import cos_sim
3234
+
3235
+ model = SentenceTransformer(
3236
+ "jinaai/jina-embeddings-v2-base-de", # switch to en/zh for English or Chinese
3237
+ trust_remote_code=True
3238
+ )
3239
+
3240
+ # control your input sequence length up to 8192
3241
+ model.max_seq_length = 1024
3242
+
3243
+ embeddings = model.encode([
3244
+     'How is the weather today?',
3245
+     'Wie ist das Wetter heute?'
3246
+ ])
3247
+ print(cos_sim(embeddings[0], embeddings[1]))
3248
+ ```
3249
+
3250
+ ## Alternatives to Using Transformers Package
3251
+
3252
+ 1. _Managed SaaS_: Get started with a free key on Jina AI's [Embedding API](https://jina.ai/embeddings/).
3253
+ 2. _Private and high-performance deployment_: Get started by picking from our suite of models and deploy them on [AWS Sagemaker](https://aws.amazon.com/marketplace/seller-profile?id=seller-stch2ludm6vgy).
3254
+
3255
+ ## Benchmark Results
3256
+
3257
+ We evaluated our bilingual model on all German and English evaluation tasks available on the [MTEB benchmark](https://huggingface.co/blog/mteb). In addition, we compared it against several other German, English, and multilingual models on additional German evaluation tasks:
3258
+
3259
+ <img src="de_evaluation_results.png" width="780px">
3260
+
3261
+ ## Use Jina Embeddings for RAG
3262
+
3263
+ According to the latest blog post from [LlamaIndex](https://blog.llamaindex.ai/boosting-rag-picking-the-best-embedding-reranker-models-42d079022e83),
3264
+
3265
+ > In summary, to achieve the peak performance in both hit rate and MRR, the combination of OpenAI or JinaAI-Base embeddings with the CohereRerank/bge-reranker-large reranker stands out.
3266
+
3267
+ <img src="https://miro.medium.com/v2/resize:fit:4800/format:webp/1*ZP2RVejCZovF3FDCg-Bx3A.png" width="780px">
3268
+
3269
+ ## Troubleshooting
3270
+
3271
+ **Loading of Model Code failed**
3272
+
3273
+ If you forgot to pass the `trust_remote_code=True` flag when calling `AutoModel.from_pretrained` or initializing the model via the `SentenceTransformer` class, you will receive an error that the model weights could not be initialized.
3274
+ This is caused by `transformers` falling back to creating a default BERT model instead of a jina-embedding model:
3275
+
3276
+ ```bash
3277
+ Some weights of the model checkpoint at jinaai/jina-embeddings-v2-base-en were not used when initializing BertModel: ['encoder.layer.2.mlp.layernorm.weight', 'encoder.layer.3.mlp.layernorm.weight', 'encoder.layer.10.mlp.wo.bias', 'encoder.layer.5.mlp.wo.bias', 'encoder.layer.2.mlp.layernorm.bias', 'encoder.layer.1.mlp.gated_layers.weight', 'encoder.layer.5.mlp.gated_layers.weight', 'encoder.layer.8.mlp.layernorm.bias', ...
3278
+ ```
3279
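+
+ The fix is to pass `trust_remote_code=True`, exactly as in the usage examples above:
+
+ ```python
+ from transformers import AutoModel
+ from sentence_transformers import SentenceTransformer
+
+ # Either call loads the custom JinaBERT implementation instead of a plain BertModel
+ model = AutoModel.from_pretrained('jinaai/jina-embeddings-v2-base-de', trust_remote_code=True)
+ model = SentenceTransformer('jinaai/jina-embeddings-v2-base-de', trust_remote_code=True)
+ ```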
+
3280
+
3281
+ **User is not logged into Hugging Face**
3282
+
3283
+ The model is only available under [gated access](https://huggingface.co/docs/hub/models-gated).
3284
+ This means you need to be logged into Hugging Face to load it.
3285
+ If you receive the following error, you need to provide an access token, either by using the huggingface-cli or providing the token via an environment variable as described above:
3286
+ ```bash
3287
+ OSError: jinaai/jina-embeddings-v2-base-en is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
3288
+ If this is a private repository, make sure to pass a token having permission to this repo with `use_auth_token` or log in with `huggingface-cli login` and pass `use_auth_token=True`.
3289
+ ```
3290
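+
+ As a sketch, you can also pass the token directly when loading the model. This assumes `HF_TOKEN` was exported as described above; `use_auth_token` is the argument referenced in the error message for the transformers version pinned in this repository (newer releases expose it as `token`):
+
+ ```python
+ import os
+ from transformers import AutoModel
+
+ # Assumes the HF_TOKEN environment variable holds a valid access token
+ model = AutoModel.from_pretrained(
+     'jinaai/jina-embeddings-v2-base-de',
+     trust_remote_code=True,
+     use_auth_token=os.environ['HF_TOKEN'],
+ )
+ ```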
+
3291
+ ## Contact
3292
+
3293
+ Join our [Discord community](https://discord.jina.ai) and chat with other community members about ideas.
3294
+
3295
+ ## Citation
3296
+
3297
+ If you find Jina Embeddings useful in your research, please cite the following paper:
3298
+
3299
+ ```
3300
+ @misc{günther2023jina,
3301
+ title={Jina Embeddings 2: 8192-Token General-Purpose Text Embeddings for Long Documents},
3302
+ author={Michael Günther and Jackmin Ong and Isabelle Mohr and Alaeddine Abdessalem and Tanguy Abel and Mohammad Kalim Akram and Susana Guzman and Georgios Mastrapas and Saba Sturua and Bo Wang and Maximilian Werk and Nan Wang and Han Xiao},
3303
+ year={2023},
3304
+ eprint={2310.19923},
3305
+ archivePrefix={arXiv},
3306
+ primaryClass={cs.CL}
3307
+ }
3308
+ ```
config.json ADDED
@@ -0,0 +1,35 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "_name_or_path": "jinaai/jina-bert-implementation",
3
+ "model_max_length": 8192,
4
+ "architectures": [
5
+ "JinaBertForMaskedLM"
6
+ ],
7
+ "attention_probs_dropout_prob": 0.0,
8
+ "auto_map": {
9
+ "AutoConfig": "jinaai/jina-bert-implementation--configuration_bert.JinaBertConfig",
10
+ "AutoModelForMaskedLM": "jinaai/jina-bert-implementation--modeling_bert.JinaBertForMaskedLM",
11
+ "AutoModel": "jinaai/jina-bert-implementation--modeling_bert.JinaBertModel",
12
+ "AutoModelForSequenceClassification": "jinaai/jina-bert-implementation--modeling_bert.JinaBertForSequenceClassification"
13
+ },
14
+ "classifier_dropout": null,
15
+ "emb_pooler": "mean",
16
+ "feed_forward_type": "geglu",
17
+ "gradient_checkpointing": false,
18
+ "hidden_act": "gelu",
19
+ "hidden_dropout_prob": 0.1,
20
+ "hidden_size": 768,
21
+ "initializer_range": 0.02,
22
+ "intermediate_size": 3072,
23
+ "layer_norm_eps": 1e-12,
24
+ "max_position_embeddings": 8192,
25
+ "model_type": "bert",
26
+ "num_attention_heads": 12,
27
+ "num_hidden_layers": 12,
28
+ "pad_token_id": 0,
29
+ "position_embedding_type": "alibi",
30
+ "torch_dtype": "float16",
31
+ "transformers_version": "4.31.0",
32
+ "type_vocab_size": 2,
33
+ "use_cache": true,
34
+ "vocab_size": 61056
35
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,7 @@
 
 
 
 
 
 
 
 
1
+ {
2
+ "__version__": {
3
+ "sentence_transformers": "2.2.2",
4
+ "transformers": "4.31.0",
5
+ "pytorch": "2.0.1"
6
+ }
7
+ }
de_evaluation_results.png ADDED
merges.txt ADDED
The diff for this file is too large to render. See raw diff
 
modules.json ADDED
@@ -0,0 +1,14 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ [
2
+ {
3
+ "idx": 0,
4
+ "name": "0",
5
+ "path": "",
6
+ "type": "sentence_transformers.models.Transformer"
7
+ },
8
+ {
9
+ "idx": 1,
10
+ "name": "1",
11
+ "path": "1_Pooling",
12
+ "type": "sentence_transformers.models.Pooling"
13
+ }
14
+ ]
onnx/model.onnx ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:51654b7441fbfcfef6598b01cbd1ea925ca0f0cad81202fcd36fee325783f1b0
3
+ size 641212851
onnx/model_quantized.onnx ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:b19a38d186594d9784e6a2c8954035a6928b5da1df402e7942d25021cf488b6b
3
+ size 161565240
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:b7ed0e5f7aa0fbaf70db652dd783b4b8d9da415f6e6e405c91e069d364ad1eed
3
+ size 321664570
sentence_bert_config.json ADDED
@@ -0,0 +1,5 @@
 
 
 
 
 
 
1
+ {
2
+ "max_seq_length": 8192,
3
+ "do_lower_case": false,
4
+ "model_args": {"trust_remote_code": true}
5
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,15 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "bos_token": "<s>",
3
+ "cls_token": "<s>",
4
+ "eos_token": "</s>",
5
+ "mask_token": {
6
+ "content": "<mask>",
7
+ "lstrip": true,
8
+ "normalized": false,
9
+ "rstrip": false,
10
+ "single_word": false
11
+ },
12
+ "pad_token": "<pad>",
13
+ "sep_token": "</s>",
14
+ "unk_token": "<unk>"
15
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,57 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "add_prefix_space": false,
3
+ "added_tokens_decoder": {
4
+ "0": {
5
+ "content": "<s>",
6
+ "lstrip": false,
7
+ "normalized": true,
8
+ "rstrip": false,
9
+ "single_word": false,
10
+ "special": true
11
+ },
12
+ "1": {
13
+ "content": "<pad>",
14
+ "lstrip": false,
15
+ "normalized": true,
16
+ "rstrip": false,
17
+ "single_word": false,
18
+ "special": true
19
+ },
20
+ "2": {
21
+ "content": "</s>",
22
+ "lstrip": false,
23
+ "normalized": true,
24
+ "rstrip": false,
25
+ "single_word": false,
26
+ "special": true
27
+ },
28
+ "3": {
29
+ "content": "<unk>",
30
+ "lstrip": false,
31
+ "normalized": true,
32
+ "rstrip": false,
33
+ "single_word": false,
34
+ "special": true
35
+ },
36
+ "4": {
37
+ "content": "<mask>",
38
+ "lstrip": true,
39
+ "normalized": false,
40
+ "rstrip": false,
41
+ "single_word": false,
42
+ "special": true
43
+ }
44
+ },
45
+ "bos_token": "<s>",
46
+ "clean_up_tokenization_spaces": true,
47
+ "cls_token": "<s>",
48
+ "eos_token": "</s>",
49
+ "errors": "replace",
50
+ "mask_token": "<mask>",
51
+ "model_max_length": 512,
52
+ "pad_token": "<pad>",
53
+ "sep_token": "</s>",
54
+ "tokenizer_class": "RobertaTokenizer",
55
+ "trim_offsets": true,
56
+ "unk_token": "<unk>"
57
+ }
vocab.json ADDED
The diff for this file is too large to render. See raw diff