elliesleightholm
committed on
Commit
•
005825e
1
Parent(s):
9f0a42f
Update README.md
README.md
CHANGED
@@ -1,50 +1,55 @@
|
|
1 |
-
---
|
2 |
-
license: apache-2.0
|
3 |
-
tags:
|
4 |
-
- clip
|
5 |
-
- ecommerce
|
6 |
-
- multimodal retrieval
|
7 |
-
- transformers
|
8 |
-
- openCLIP
|
9 |
-
|
|
|
|
|
|
|
10 |
|
11 |
-
# Marqo
|
12 |
-
In this work, we introduce two state-of-the-art embedding models for
|
13 |
-
Marqo-Ecommerce-B and Marqo-Ecommerce-L.
|
14 |
-
Trained on over 60M unique products, they are over 30% better compared to Amazon Titan Embedding services for e-commerce retrieval tasks.
|
15 |
|
16 |
-
|
17 |
|
|
|
|
|
|
|
18 |
|
19 |
**Released Content**:
|
20 |
1) Marqo-Ecommerce-B and Marqo-Ecommerce-L embedding models
|
21 |
2) GoogleShopping-1m and AmazonProducts-3m for evaluation
|
22 |
3) Evaluation Code
|
23 |
|
24 |
-
|
25 |
## Models
|
26 |
-
|
27 |
-
The models can be loaded in several ways. See below for available options.
|
28 |
| **Embedding Model** | **#Params (m)** | **Dimension** | **HuggingFace** | **Download .pt** |
|
29 |
|---------------------| --- |---------------|------------------------------------|-------------------------------------------------------------------------------------------------------------|
|
30 |
-
| Marqo-Ecommerce-B | 203 | 768 | Marqo/marqo-ecommerce-embeddings-B | [link](https://marqo-gcl-public.s3.us-west-2.amazonaws.com/marqo-general-ecomm/marqo-ecomm-embeddings-b.pt) |
|
31 |
-
| Marqo-Ecommerce-L | 652 | 1024 | Marqo/marqo-ecommerce-embeddings-L | [link](https://marqo-gcl-public.s3.us-west-2.amazonaws.com/marqo-general-ecomm/marqo-ecomm-embeddings-l.pt) |
|
|
|
|
|
|
|
32 |
|
33 |
-
### HuggingFace with transformers
|
34 |
```python
|
35 |
from transformers import AutoModel, AutoProcessor
|
36 |
import torch
|
37 |
from PIL import Image
|
38 |
import requests
|
39 |
-
|
40 |
-
model_name
|
|
|
41 |
|
42 |
model = AutoModel.from_pretrained(model_name, trust_remote_code=True)
|
43 |
processor = AutoProcessor.from_pretrained(model_name, trust_remote_code=True)
|
44 |
|
45 |
-
img = Image.open(requests.get('https://raw.githubusercontent.com/marqo-ai/marqo-
|
46 |
image = [img]
|
47 |
-
text = ["
|
48 |
processed = processor(text=text, images=image, padding='max_length', return_tensors="pt")
|
49 |
processor.image_processor.do_rescale = False
|
50 |
with torch.no_grad():
|
@@ -54,10 +59,12 @@ with torch.no_grad():
|
|
54 |
text_probs = (100 * image_features @ text_features.T).softmax(dim=-1)
|
55 |
|
56 |
print(text_probs)
|
57 |
-
# [
|
58 |
```
|
59 |
|
60 |
-
### HuggingFace with OpenCLIP
|
|
|
|
|
61 |
```
|
62 |
pip install open_clip_torch
|
63 |
```
|
@@ -69,14 +76,16 @@ import torch
|
|
69 |
|
70 |
# Specify model from Hugging Face Hub
|
71 |
model_name = 'hf-hub:Marqo/marqo-ecommerce-embeddings-B'
|
|
|
|
|
72 |
model, preprocess_train, preprocess_val = open_clip.create_model_and_transforms(model_name)
|
73 |
tokenizer = open_clip.get_tokenizer(model_name)
|
74 |
|
75 |
# Preprocess the image and tokenize text inputs
|
76 |
# Load an example image from a URL
|
77 |
-
img = Image.open(requests.get('https://raw.githubusercontent.com/marqo-ai/marqo-
|
78 |
image = preprocess_val(img).unsqueeze(0)
|
79 |
-
text = tokenizer(["
|
80 |
|
81 |
# Perform inference
|
82 |
with torch.no_grad(), torch.cuda.amp.autocast():
|
@@ -88,18 +97,22 @@ with torch.no_grad(), torch.cuda.amp.autocast():
|
|
88 |
|
89 |
# Display the label probabilities
|
90 |
print("Label probs:", text_probs)
|
91 |
-
# [
|
92 |
```
|
93 |
|
94 |
-
### Evaluation
|
|
|
|
|
95 |
```
|
96 |
git clone https://github.com/marqo-ai/GCL
|
97 |
```
|
98 |
Install the packages required by GCL.
|
|
|
|
|
99 |
```
|
100 |
cd ./GCL
|
101 |
MODEL=hf-hub:Marqo/marqo-ecommerce-B
|
102 |
-
outdir=/MarqoModels/GE/marqo-ecommerce-B/gs-
|
103 |
hfdataset=Marqo/google-shopping-general-eval
|
104 |
python evals/eval_hf_datasets_v1.py \
|
105 |
--model_name $MODEL \
|
@@ -118,39 +131,153 @@ python evals/eval_hf_datasets_v1.py \
|
|
118 |
--context-length "[[64], [0]]"
|
119 |
```
|
120 |
|
121 |
|
122 |
## Detailed Performance
|
123 |
-
|
|
|
|
|
124 |
|
125 |
-
|
126 |
-
|
127 |
-
|
128 |
-
| Marqo-Ecommerce-B | 0.623 | 0.084 | 0.832 | 0.624 |
|
129 |
-
| Amazon-Titan-MultiModal | 0.475 | 0.065 | 0.648 | 0.475 |
|
130 |
-
| ViT-B-16-SigLip | 0.476 | 0.067 | 0.660 | 0.477 |
|
131 |
-
| ViT-L-16-SigLip | 0.540 | 0.073 | 0.722 | 0.540 |
|
132 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
133 |
|
134 |
**GoogleShopping-Category2Image Retrieval.**
|
135 |
|
136 |
-
| **Embedding Model**
|
137 |
-
|
138 |
-
| Marqo-Ecommerce-L
|
139 |
-
| Marqo-Ecommerce-B
|
140 |
-
|
|
141 |
-
| ViT-
|
142 |
-
| ViT-
|
|
|
|
|
143 |
|
144 |
**AmazonProducts-Text2Image Retrieval.**
|
145 |
|
146 |
-
| **Embedding Model**
|
147 |
-
|
148 |
-
| Marqo-Ecommerce-L
|
149 |
-
| Marqo-Ecommerce-B
|
150 |
-
|
|
151 |
-
| ViT-
|
152 |
-
| ViT-
|
|
|
|
|
153 |
|
|
|
|
|
154 |
|
|
|
155 |
|
156 |
|
1 |
+
---
|
2 |
+
license: apache-2.0
|
3 |
+
tags:
|
4 |
+
- clip
|
5 |
+
- ecommerce
|
6 |
+
- multimodal retrieval
|
7 |
+
- transformers
|
8 |
+
- openCLIP
|
9 |
+
datasets:
|
10 |
+
- Marqo/amazon-products-eval
|
11 |
+
- Marqo/google-shopping-general-eval
|
12 |
+
---
|
13 |
|
14 |
+
# Marqo Ecommerce Embedding Models
|
15 |
+
In this work, we introduce two state-of-the-art embedding models for ecommerce products: Marqo-Ecommerce-B and Marqo-Ecommerce-L.
|
|
|
|
|
16 |
|
17 |
+
The benchmarking results show that both Marqo-Ecommerce models consistently outperform all other models across the reported metrics. On the Google Shopping Text-to-Image task, Marqo-Ecommerce-L improves on ViT-B-16-SigLIP, our baseline model for these benchmarks, by 43% in MRR, 41% in nDCG@10 and 33% in Recall@10. On the Google Shopping Category-to-Image task, it improves on the same baseline by 67% in mAP, 41% in nDCG@10 and 42% in Precision@10.
|
18 |
|
19 |
+
<img src="https://raw.githubusercontent.com/marqo-ai/marqo-ecommerce-embeddings/refs/heads/main/performance.png?token=GHSAT0AAAAAACZY3OVLSH7USBXOC3SYCQ3OZZL6LVQ" alt="multi split visual" width="700"/>
|
20 |
+
|
21 |
+
More benchmarking results can be found below.
|
22 |
|
23 |
**Released Content**:
|
24 |
1) Marqo-Ecommerce-B and Marqo-Ecommerce-L embedding models
|
25 |
2) GoogleShopping-1m and AmazonProducts-3m for evaluation
|
26 |
3) Evaluation Code
|
27 |
|
|
|
28 |
## Models
|
29 |
+
|
|
|
30 |
| **Embedding Model** | **#Params (m)** | **Dimension** | **HuggingFace** | **Download .pt** |
|
31 |
|---------------------| --- |---------------|------------------------------------|-------------------------------------------------------------------------------------------------------------|
|
32 |
+
| Marqo-Ecommerce-B | 203 | 768 | [Marqo/marqo-ecommerce-embeddings-B](https://huggingface.co/Marqo/marqo-ecommerce-embeddings-B) | [link](https://marqo-gcl-public.s3.us-west-2.amazonaws.com/marqo-general-ecomm/marqo-ecomm-embeddings-b.pt) |
|
33 |
+
| Marqo-Ecommerce-L | 652 | 1024 | [Marqo/marqo-ecommerce-embeddings-L](https://huggingface.co/Marqo/marqo-ecommerce-embeddings-L) | [link](https://marqo-gcl-public.s3.us-west-2.amazonaws.com/marqo-general-ecomm/marqo-ecomm-embeddings-l.pt) |
|
34 |
+
|
35 |
+
### Load from HuggingFace with transformers
|
36 |
+
To load the models in Transformers, see below. The models are hosted on [Hugging Face](https://huggingface.co/collections/Marqo/marqo-ecommerce-embeddings-66f611b9bb9d035a8d164fbb) and loaded using [Transformers](https://github.com/huggingface/transformers).
|
37 |
|
|
|
38 |
```python
|
39 |
from transformers import AutoModel, AutoProcessor
|
40 |
import torch
|
41 |
from PIL import Image
|
42 |
import requests
|
43 |
+
|
44 |
+
model_name = 'Marqo/marqo-ecommerce-embeddings-B'
|
45 |
+
# model_name = 'Marqo/marqo-ecommerce-embeddings-L'
|
46 |
|
47 |
model = AutoModel.from_pretrained(model_name, trust_remote_code=True)
|
48 |
processor = AutoProcessor.from_pretrained(model_name, trust_remote_code=True)
|
49 |
|
50 |
+
img = Image.open(requests.get('https://raw.githubusercontent.com/marqo-ai/marqo-ecommerce-embeddings/refs/heads/main/images/dining-chairs.png', stream=True).raw).convert("RGB")
|
51 |
image = [img]
|
52 |
+
text = ["dining chairs", "a laptop", "toothbrushes"]
|
53 |
processed = processor(text=text, images=image, padding='max_length', return_tensors="pt")
|
54 |
processor.image_processor.do_rescale = False
|
55 |
with torch.no_grad():
|
|
|
59 |
text_probs = (100 * image_features @ text_features.T).softmax(dim=-1)
|
60 |
|
61 |
print(text_probs)
|
62 |
+
# [1.0000e+00, 8.3131e-12, 5.2173e-12]
|
63 |
```
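The near one-hot output above follows from the `(100 * image_features @ text_features.T).softmax(dim=-1)` step. The sketch below reproduces that step in plain Python on made-up three-dimensional vectors, assuming the features are L2-normalised so that the dot product is a cosine similarity. The helper names `l2_normalize` and `label_probs` are hypothetical and are not part of the model's API.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def label_probs(image_vec, text_vecs, scale=100.0):
    """Softmax over scaled cosine similarities between one image and several texts."""
    image_unit = l2_normalize(image_vec)
    logits = []
    for text_vec in text_vecs:
        text_unit = l2_normalize(text_vec)
        cosine = sum(a * b for a, b in zip(image_unit, text_unit))
        logits.append(scale * cosine)
    # Subtract the largest logit before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(logit - peak) for logit in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy vectors, invented for illustration only (not real model embeddings).
image_vec = [0.9, 0.1, 0.0]
text_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

probs = label_probs(image_vec, text_vecs)                      # scale of 100, as in the README
probs_unscaled = label_probs(image_vec, text_vecs, scale=1.0)  # same similarities, no scaling
print(probs)
print(probs_unscaled)
```

With the scale of 100, modest gaps in cosine similarity become very large gaps in logits, so almost all probability mass goes to the best match. With a scale of 1 the same similarities give a much flatter distribution.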
|
64 |
|
65 |
+
### Load from HuggingFace with OpenCLIP
|
66 |
+
To load the models in OpenCLIP, see below. The models are hosted on [Hugging Face](https://huggingface.co/collections/Marqo/marqo-ecommerce-embeddings-66f611b9bb9d035a8d164fbb) and loaded using [OpenCLIP](https://github.com/mlfoundations/open_clip). You can also find this code inside `run_models.py`.
|
67 |
+
|
68 |
```
|
69 |
pip install open_clip_torch
|
70 |
```
|
|
|
76 |
|
77 |
# Specify model from Hugging Face Hub
|
78 |
model_name = 'hf-hub:Marqo/marqo-ecommerce-embeddings-B'
|
79 |
+
# model_name = 'hf-hub:Marqo/marqo-ecommerce-embeddings-L'
|
80 |
+
|
81 |
model, preprocess_train, preprocess_val = open_clip.create_model_and_transforms(model_name)
|
82 |
tokenizer = open_clip.get_tokenizer(model_name)
|
83 |
|
84 |
# Preprocess the image and tokenize text inputs
|
85 |
# Load an example image from a URL
|
86 |
+
img = Image.open(requests.get('https://raw.githubusercontent.com/marqo-ai/marqo-ecommerce-embeddings/refs/heads/main/images/dining-chairs.png', stream=True).raw)
|
87 |
image = preprocess_val(img).unsqueeze(0)
|
88 |
+
text = tokenizer(["dining chairs", "a laptop", "toothbrushes"])
|
89 |
|
90 |
# Perform inference
|
91 |
with torch.no_grad(), torch.cuda.amp.autocast():
|
|
|
97 |
|
98 |
# Display the label probabilities
|
99 |
print("Label probs:", text_probs)
|
100 |
+
# [1.0000e+00, 8.3131e-12, 5.2173e-12]
|
101 |
```
|
102 |
|
103 |
+
### Evaluation
|
104 |
+
[Generalised Contrastive Learning](https://github.com/marqo-ai/GCL) (GCL) is used for the evaluation. The following code can also be found in `scripts`.
|
105 |
+
|
106 |
```
|
107 |
git clone https://github.com/marqo-ai/GCL
|
108 |
```
|
109 |
Install the packages required by GCL.
|
110 |
+
|
111 |
+
**1. GoogleShopping-Text2Image Retrieval.**
|
112 |
```
|
113 |
cd ./GCL
|
114 |
MODEL=hf-hub:Marqo/marqo-ecommerce-B
|
115 |
+
outdir=/MarqoModels/GE/marqo-ecommerce-B/gs-title2image
|
116 |
hfdataset=Marqo/google-shopping-general-eval
|
117 |
python evals/eval_hf_datasets_v1.py \
|
118 |
--model_name $MODEL \
|
|
|
131 |
--context-length "[[64], [0]]"
|
132 |
```
|
133 |
|
134 |
+
**2. GoogleShopping-Category2Image Retrieval.**
|
135 |
+
```
|
136 |
+
cd ./GCL
|
137 |
+
MODEL=hf-hub:Marqo/marqo-ecommerce-B
|
138 |
+
outdir=/MarqoModels/GE/marqo-ecommerce-B/gs-cat2image
|
139 |
+
hfdataset=Marqo/google-shopping-general-eval
|
140 |
+
python evals/eval_hf_datasets_v1.py \
|
141 |
+
--model_name $MODEL \
|
142 |
+
--hf-dataset $hfdataset \
|
143 |
+
--output-dir $outdir \
|
144 |
+
--batch-size 1024 \
|
145 |
+
--num_workers 8 \
|
146 |
+
--left-key "['query']" \
|
147 |
+
--right-key "['image']" \
|
148 |
+
--img-or-txt "[['txt'], ['img']]" \
|
149 |
+
--left-weight "[1]" \
|
150 |
+
--right-weight "[1]" \
|
151 |
+
--run-queries-cpu \
|
152 |
+
--top-q 4000 \
|
153 |
+
--doc-id-key item_ID \
|
154 |
+
--context-length "[[64], [0]]"
|
155 |
+
```
|
156 |
+
|
157 |
+
**3. AmazonProducts-Text2Image Retrieval.**
|
158 |
+
```
|
159 |
+
cd ./GCL
|
160 |
+
MODEL=hf-hub:Marqo/marqo-ecommerce-B
|
161 |
+
outdir=/MarqoModels/GE/marqo-ecommerce-B/ap-title2image
|
162 |
+
hfdataset=Marqo/amazon-products-eval
|
163 |
+
python evals/eval_hf_datasets_v1.py \
|
164 |
+
--model_name $MODEL \
|
165 |
+
--hf-dataset $hfdataset \
|
166 |
+
--output-dir $outdir \
|
167 |
+
--batch-size 1024 \
|
168 |
+
--num_workers 8 \
|
169 |
+
--left-key "['title']" \
|
170 |
+
--right-key "['image']" \
|
171 |
+
--img-or-txt "[['txt'], ['img']]" \
|
172 |
+
--left-weight "[1]" \
|
173 |
+
--right-weight "[1]" \
|
174 |
+
--run-queries-cpu \
|
175 |
+
--top-q 4000 \
|
176 |
+
--doc-id-key item_ID \
|
177 |
+
--context-length "[[64], [0]]"
|
178 |
+
```
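The two fully shown commands above differ only in the dataset, the left key and the output directory. As a rough sketch, the shared flags can be kept in one place and the per-task values looked up from a table. `TASKS`, `COMMON_FLAGS` and `build_eval_command` are hypothetical names introduced here, the flag values are copied from the commands above, and the sketch assumes the script does not depend on flag order.

```python
import shlex

# Flags shared by the Category2Image and AmazonProducts commands above.
COMMON_FLAGS = [
    "--batch-size", "1024",
    "--num_workers", "8",
    "--right-key", "['image']",
    "--img-or-txt", "[['txt'], ['img']]",
    "--left-weight", "[1]",
    "--right-weight", "[1]",
    "--run-queries-cpu",
    "--top-q", "4000",
    "--doc-id-key", "item_ID",
    "--context-length", "[[64], [0]]",
]

# Per-task settings, keyed by the output sub-directory used above.
TASKS = {
    "gs-cat2image": {
        "hf_dataset": "Marqo/google-shopping-general-eval",
        "left_key": "['query']",
    },
    "ap-title2image": {
        "hf_dataset": "Marqo/amazon-products-eval",
        "left_key": "['title']",
    },
}


def build_eval_command(task,
                       model="hf-hub:Marqo/marqo-ecommerce-B",
                       out_root="/MarqoModels/GE/marqo-ecommerce-B"):
    """Return the argument list for one evaluation run (hypothetical helper)."""
    cfg = TASKS[task]
    return [
        "python", "evals/eval_hf_datasets_v1.py",
        "--model_name", model,
        "--hf-dataset", cfg["hf_dataset"],
        "--output-dir", f"{out_root}/{task}",
        "--left-key", cfg["left_key"],
        *COMMON_FLAGS,
    ]


for task in TASKS:
    # Print a shell-quoted command line; run it from inside the GCL checkout.
    print(shlex.join(build_eval_command(task)))
```

The printed lines can be pasted into a shell, or the argument list can be passed to `subprocess.run(cmd, cwd="GCL", check=True)` once the GCL requirements are installed.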
|
179 |
|
180 |
## Detailed Performance
|
181 |
+
The benchmarks are split into 'Marqo-Ecommerce-Hard' and '100k-Marqo-Ecommerce-Easy'. The "easy" dataset is about 10-30 times smaller and is designed to accommodate rate-limited models, specifically Cohere-Embeddings-v3 and GCP-Vertex. The "hard" dataset is the real challenge: it contains four million ecommerce product listings, which pushes the models to their limits in a real-world ecommerce scenario.
|
182 |
+
|
183 |
+
In both scenarios, the models were benchmarked on three tasks:
|
184 |
|
185 |
+
* Google Shopping Text-to-Image
|
186 |
+
* Google Shopping Category-to-Image
|
187 |
+
* Amazon Products Text-to-Image
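The tables below report mAP, R@10 or P@10, MRR and nDCG@10 for these tasks. As a reminder of what those columns measure, here is a minimal sketch of the per-query quantities with binary relevance. It is not the GCL implementation, and the document ids are made up. Averaging `reciprocal_rank` and `average_precision` over all queries gives MRR and mAP.

```python
import math


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, or 0 if none is retrieved."""
    for position, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            return 1.0 / position
    return 0.0


def recall_at_k(ranked, relevant, k=10):
    """Fraction of the relevant items that appear in the top k."""
    hits = sum(1 for doc_id in ranked[:k] if doc_id in relevant)
    return hits / len(relevant)


def precision_at_k(ranked, relevant, k=10):
    """Fraction of the top k results that are relevant."""
    hits = sum(1 for doc_id in ranked[:k] if doc_id in relevant)
    return hits / k


def average_precision(ranked, relevant):
    """Mean of precision values taken at the rank of each relevant item."""
    hits = 0
    total = 0.0
    for position, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / position
    return total / len(relevant)


def ndcg_at_k(ranked, relevant, k=10):
    """Discounted cumulative gain in the top k, normalised by the ideal ordering."""
    dcg = sum(
        1.0 / math.log2(position + 1)
        for position, doc_id in enumerate(ranked[:k], start=1)
        if doc_id in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(position + 1) for position in range(1, ideal_hits + 1))
    return dcg / idcg


# Hypothetical ranking for one query: two relevant products, at ranks 2 and 4.
ranked = ["p7", "p3", "p9", "p1"]
relevant = {"p3", "p1"}

rr = reciprocal_rank(ranked, relevant)
ap = average_precision(ranked, relevant)
recall10 = recall_at_k(ranked, relevant, k=10)
precision2 = precision_at_k(ranked, relevant, k=2)
ndcg10 = ndcg_at_k(ranked, relevant, k=10)
print(rr, ap, recall10, precision2, ndcg10)
```

When a query has exactly one relevant item, average precision equals the reciprocal rank, which would be consistent with the near-identical mAP and MRR columns in the Text2Image tables.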
|
|
|
|
|
|
|
|
|
188 |
|
189 |
+
### Marqo-Ecommerce-Hard
|
190 |
+
Marqo-Ecommerce-Hard is the comprehensive evaluation on the full dataset of four million product listings, and it shows how robustly our models perform in a real-world context.
|
191 |
+
|
192 |
+
**GoogleShopping-Text2Image Retrieval.**
|
193 |
+
|
194 |
+
| **Embedding Model** | **mAP** | **R@10** | **MRR** | **nDCG@10** |
|
195 |
+
|-------------------------|------|-------|------|---------|
|
196 |
+
| **Marqo-Ecommerce-L** | **0.682**| **0.878** | **0.683**| **0.726** |
|
197 |
+
| Marqo-Ecommerce-B | 0.623| 0.832 | 0.624| 0.668 |
|
198 |
+
| ViT-SO400M-14-SigLip | 0.573| 0.763 | 0.574| 0.613 |
|
199 |
+
| ViT-L-16-SigLip | 0.540| 0.722 | 0.540| 0.577 |
|
200 |
+
| ViT-B-16-SigLip | 0.476| 0.660 | 0.477| 0.513 |
|
201 |
+
| Amazon-Titan-MultiModal | 0.475| 0.648 | 0.475| 0.509 |
|
202 |
+
| Jina-V1-CLIP | 0.285| 0.402 | 0.285| 0.306 |
|
203 |
|
204 |
**GoogleShopping-Category2Image Retrieval.**
|
205 |
|
206 |
+
| **Embedding Model** | **mAP** | **P@10** | **MRR** | **nDCG@10** |
|
207 |
+
|-----------------------------|---------|----------|---------|-------------|
|
208 |
+
| **Marqo-Ecommerce-L** | **0.463** | **0.652** | **0.822** | **0.666** |
|
209 |
+
| Marqo-Ecommerce-B | 0.423 | 0.629 | 0.810 | 0.644 |
|
210 |
+
| ViT-SO400M-14-SigLip | 0.352 | 0.516 | 0.707 | 0.529 |
|
211 |
+
| ViT-L-16-SigLip | 0.324 | 0.497 | 0.687 | 0.509 |
|
212 |
+
| ViT-B-16-SigLip | 0.277 | 0.458 | 0.660 | 0.473 |
|
213 |
+
| Amazon-Titan-MultiModal | 0.246 | 0.429 | 0.642 | 0.446 |
|
214 |
+
| Jina-V1-CLIP | 0.123 | 0.275 | 0.504 | 0.294 |
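The percentage improvements quoted in the introduction are relative gains of Marqo-Ecommerce-L over ViT-B-16-SigLip. A short check using the values from the two Google Shopping tables above (the function and dictionary names are illustrative):

```python
def relative_improvement(model_score, baseline_score):
    """Relative gain of the model over the baseline, in percent."""
    return 100.0 * (model_score - baseline_score) / baseline_score


# (Marqo-Ecommerce-L, ViT-B-16-SigLip) pairs taken from the Marqo-Ecommerce-Hard tables.
hard_text2image = {
    "MRR": (0.683, 0.477),
    "nDCG@10": (0.726, 0.513),
    "R@10": (0.878, 0.660),
}
hard_category2image = {
    "mAP": (0.463, 0.277),
    "nDCG@10": (0.666, 0.473),
    "P@10": (0.652, 0.458),
}

gains = {}
for task, scores in (("Text2Image", hard_text2image),
                     ("Category2Image", hard_category2image)):
    for metric, (model_score, baseline_score) in scores.items():
        gains[(task, metric)] = relative_improvement(model_score, baseline_score)
        print(f"{task} {metric}: {gains[(task, metric)]:.1f}%")
```

The results land within about one percentage point of the quoted figures. Small differences are expected because the table entries are rounded to three decimals.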
|
215 |
|
216 |
**AmazonProducts-Text2Image Retrieval.**
|
217 |
|
218 |
+
| **Embedding Model** | **mAP** | **R@10** | **MRR** | **nDCG@10** |
|
219 |
+
|-----------------------------|---------|----------|---------|-------------|
|
220 |
+
| **Marqo-Ecommerce-L** | **0.658** | **0.854** | **0.663** | **0.703** |
|
221 |
+
| Marqo-Ecommerce-B | 0.592 | 0.795 | 0.597 | 0.637 |
|
222 |
+
| ViT-SO400M-14-SigLip | 0.560 | 0.742 | 0.564 | 0.599 |
|
223 |
+
| ViT-L-16-SigLip | 0.544 | 0.715 | 0.548 | 0.580 |
|
224 |
+
| ViT-B-16-SigLip | 0.480 | 0.650 | 0.484 | 0.515 |
|
225 |
+
| Amazon-Titan-MultiModal | 0.456 | 0.627 | 0.457 | 0.491 |
|
226 |
+
| Jina-V1-CLIP | 0.265 | 0.378 | 0.266 | 0.285 |
|
227 |
|
228 |
+
### 100k-Marqo-Ecommerce-Easy
|
229 |
+
This dataset is about 10-30 times smaller than Marqo-Ecommerce-Hard and is designed to accommodate rate-limited models, specifically Cohere-Embeddings-v3 and GCP-Vertex.
|
230 |
|
231 |
+
**GoogleShopping-Text2Image Retrieval.**
|
232 |
|
233 |
+
| **Embedding Model** | **mAP** | **R@10** | **MRR** | **nDCG@10** |
|
234 |
+
|-----------------------------|---------|----------|---------|-------------|
|
235 |
+
| **Marqo-Ecommerce-L** | **0.879** | **0.971** | **0.879** | **0.901** |
|
236 |
+
| Marqo-Ecommerce-B | 0.842 | 0.961 | 0.842 | 0.871 |
|
237 |
+
| ViT-SO400M-14-SigLip | 0.792 | 0.935 | 0.792 | 0.825 |
|
238 |
+
| GCP-Vertex | 0.740 | 0.910 | 0.740 | 0.779 |
|
239 |
+
| ViT-L-16-SigLip | 0.754 | 0.907 | 0.754 | 0.789 |
|
240 |
+
| ViT-B-16-SigLip | 0.701 | 0.870 | 0.701 | 0.739 |
|
241 |
+
| Amazon-Titan-MultiModal | 0.694 | 0.868 | 0.693 | 0.733 |
|
242 |
+
| Jina-V1-CLIP | 0.480 | 0.638 | 0.480 | 0.511 |
|
243 |
+
| Cohere-embedding-v3 | 0.358 | 0.515 | 0.358 | 0.389 |
|
244 |
|
245 |
+
**GoogleShopping-Category2Image Retrieval.**
|
246 |
+
|
247 |
+
| **Embedding Model** | **mAP** | **P@10** | **MRR** | **nDCG@10** |
|
248 |
+
|-----------------------------|---------|----------|---------|-------------|
|
249 |
+
| **Marqo-Ecommerce-L** | **0.515** | **0.358** | **0.764** | **0.590** |
|
250 |
+
| Marqo-Ecommerce-B | 0.479 | 0.336 | 0.744 | 0.558 |
|
251 |
+
| ViT-SO400M-14-SigLip | 0.423 | 0.302 | 0.644 | 0.487 |
|
252 |
+
| GCP-Vertex | 0.417 | 0.298 | 0.636 | 0.481 |
|
253 |
+
| ViT-L-16-SigLip | 0.392 | 0.281 | 0.627 | 0.458 |
|
254 |
+
| ViT-B-16-SigLip | 0.347 | 0.252 | 0.594 | 0.414 |
|
255 |
+
| Amazon-Titan-MultiModal | 0.308 | 0.231 | 0.558 | 0.377 |
|
256 |
+
| Jina-V1-CLIP | 0.175 | 0.122 | 0.369 | 0.229 |
|
257 |
+
| Cohere-embedding-v3 | 0.136 | 0.110 | 0.315 | 0.178 |
|
258 |
+
|
259 |
+
**AmazonProducts-Text2Image Retrieval.**
|
260 |
+
|
261 |
+
| **Embedding Model** | **mAP** | **R@10** | **MRR** | **nDCG@10** |
|
262 |
+
|-----------------------------|---------|----------|---------|-------------|
|
263 |
+
| **Marqo-Ecommerce-L** | **0.92** | **0.978** | **0.928** | **0.940** |
|
264 |
+
| Marqo-Ecommerce-B | 0.897 | 0.967 | 0.897 | 0.914 |
|
265 |
+
| ViT-SO400M-14-SigLip | 0.860 | 0.954 | 0.860 | 0.882 |
|
266 |
+
| ViT-L-16-SigLip | 0.842 | 0.940 | 0.842 | 0.865 |
|
267 |
+
| GCP-Vertex | 0.808 | 0.933 | 0.808 | 0.837 |
|
268 |
+
| ViT-B-16-SigLip | 0.797 | 0.917 | 0.797 | 0.825 |
|
269 |
+
| Amazon-Titan-MultiModal | 0.762 | 0.889 | 0.763 | 0.791 |
|
270 |
+
| Jina-V1-CLIP | 0.530 | 0.699 | 0.530 | 0.565 |
|
271 |
+
| Cohere-embedding-v3 | 0.433 | 0.597 | 0.433 | 0.465 |
|
272 |
+
|
273 |
+
## Citation
|
274 |
+
```
|
275 |
+
@software{zhu2024marqoecommembed_2024,
|
276 |
+
author = {Tianyu Zhu and Jesse Clark},
|
277 |
+
month = oct,
|
278 |
+
title = {{Marqo Ecommerce Embeddings - Foundation Model for Product Embeddings}},
|
279 |
+
url = {https://github.com/marqo-ai/marqo-ecommerce-embeddings/},
|
280 |
+
version = {1.0.0},
|
281 |
+
year = {2024}
|
282 |
+
}
|
283 |
+
```
|