alanztymarqo committed on
Commit
1f36e4b
1 Parent(s): 05d9b36

Update README.md

Files changed (1)
  1. README.md +146 -3
README.md CHANGED
@@ -1,3 +1,146 @@
1
- ---
2
- license: apache-2.0
3
- ---
1
+ ---
2
+ license: apache-2.0
3
+ ---
4
+
5
+ # Marqo E-commerce Embedding Models
6
+ In this work, we introduce two state-of-the-art embedding models for e-commerce:
7
+ Marqo-Ecommerce-B and Marqo-Ecommerce-L.
8
+ They perform over 30% better than Amazon Titan Embedding services on e-commerce retrieval tasks.
9
+
10
+ **Released Content**:
11
+ 1) Marqo-Ecommerce-B and Marqo-Ecommerce-L embedding models
12
+ 2) GoogleShopping-1m and AmazonProducts-3m for evaluation
13
+ 3) Evaluation Code
14
+
15
+ <img src="performance.png" alt="multi split visual" width="500"/>
16
+
17
+ ## Models
18
+
19
+ | **Embedding Model** | **#Params (M)** | **Dimension** | **HuggingFace** | **Download .pt** |
20
+ |---------------------| --- |---------------|------------------------------------|-------------------------------------------------------------------------------------------------------------|
21
+ | Marqo-Ecommerce-B | 203 | 768 | Marqo/marqo-ecommerce-embeddings-B | [link](https://marqo-gcl-public.s3.us-west-2.amazonaws.com/marqo-general-ecomm/marqo-ecomm-embeddings-b.pt) |
22
+ | Marqo-Ecommerce-L | 652 | 1024 | Marqo/marqo-ecommerce-embeddings-L | [link](https://marqo-gcl-public.s3.us-west-2.amazonaws.com/marqo-general-ecomm/marqo-ecomm-embeddings-l.pt) |
23
+
24
+ ### HuggingFace with OpenCLIP
25
+ ```
26
+ pip install open_clip_torch
27
+ ```
28
+ ```python
29
+ from PIL import Image
30
+ import open_clip
31
+ import requests
32
+ import torch
33
+
34
+ # Specify model from Hugging Face Hub
35
+ model_name = 'hf-hub:Marqo/marqo-ecommerce-embeddings-B'
36
+ model, preprocess_train, preprocess_val = open_clip.create_model_and_transforms(model_name)
37
+ tokenizer = open_clip.get_tokenizer(model_name)
38
+
39
+ # Preprocess the image and tokenize text inputs
40
+ # Load an example image from a URL
41
+ img = Image.open(requests.get('https://raw.githubusercontent.com/marqo-ai/marqo-FashionCLIP/main/docs/fashion-hippo.png', stream=True).raw)
42
+ image = preprocess_val(img).unsqueeze(0)
43
+ text = tokenizer(["a hat", "a t-shirt", "shoes"])
44
+
45
+ # Perform inference
46
+ with torch.no_grad(), torch.cuda.amp.autocast():
47
+     image_features = model.encode_image(image, normalize=True)
48
+     text_features = model.encode_text(text, normalize=True)
49
+
50
+ # Calculate similarity probabilities
51
+ text_probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)
52
+
53
+ # Display the label probabilities
54
+ print("Label probs:", text_probs)
55
+ # [[9.9955e-01, 4.4712e-04, 4.4010e-06]]
56
+ ```
57
+ ### HuggingFace with transformers
58
+ ```python
59
+ from transformers import AutoModel, AutoProcessor
60
+ import torch
61
+ from PIL import Image
62
+ import requests
63
+ # model_name= 'Marqo/marqo-ecommerce-embeddings-L'
64
+ model_name = 'Marqo/marqo-ecommerce-embeddings-B'
65
+
66
+ model_1 = AutoModel.from_pretrained(model_name, trust_remote_code=True)
67
+ processor_1 = AutoProcessor.from_pretrained(model_name, trust_remote_code=True)
68
+
69
+ img = Image.open(requests.get('https://raw.githubusercontent.com/marqo-ai/marqo-FashionCLIP/main/docs/fashion-hippo.png', stream=True).raw).convert("RGB")
70
+ image_1 = [img]
71
+ text_1 = ["a hat", "a t-shirt", "shoes"]
72
+ processed_1 = processor_1(text=text_1, images=image_1, padding='max_length', return_tensors="pt")
73
+ processor_1.image_processor.do_rescale = False
74
+ with torch.no_grad():
75
+     image_features_1 = model_1.get_image_features(processed_1['pixel_values'], normalize=True)
76
+     text_features_1 = model_1.get_text_features(processed_1['input_ids'], normalize=True)
77
+
78
+ text_probs_1 = (100 * image_features_1 @ text_features_1.T).softmax(dim=-1)
79
+
80
+ print(text_probs_1)
81
+ # [[9.9955e-01, 4.4712e-04, 4.4010e-06]]
82
+ ```
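Both snippets above turn normalized embeddings into label probabilities the same way: the cosine similarities are multiplied by 100 and passed through a softmax. The following is a minimal sketch of that step in plain Python, not part of the README's own code. The similarity values and the helper names `softmax` and `label_probs` are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def label_probs(cosine_sims, scale=100.0):
    """Scale cosine similarities, then normalize them into probabilities."""
    return softmax([scale * s for s in cosine_sims])


# Hypothetical cosine similarities between one image and three captions.
sims = [0.15, 0.07, 0.02]

probs = label_probs(sims)                 # scale of 100, as in the snippets above
unscaled = label_probs(sims, scale=1.0)   # same similarities without scaling

print("scaled:  ", probs)
print("unscaled:", unscaled)
```

Without the scale factor, small gaps between cosine similarities yield nearly uniform probabilities. With it, the best-matching caption dominates.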
83
+
84
+ ### Evaluation with GCL
85
+ ```
86
+ git clone https://github.com/marqo-ai/GCL
87
+ ```
88
+ Install the packages required by GCL.
89
+ ```
90
+ cd ./GCL
91
+ MODEL=hf-hub:Marqo/marqo-ecommerce-embeddings-B
92
+ outdir=/MarqoModels/GE/marqo-ecommerce-B/gs-title2image2
93
+ hfdataset=Marqo/google-shopping-general-eval
94
+ python evals/eval_hf_datasets_v1.py \
95
+ --model_name $MODEL \
96
+ --hf-dataset $hfdataset \
97
+ --output-dir $outdir \
98
+ --batch-size 1024 \
99
+ --num_workers 8 \
100
+ --left-key "['title']" \
101
+ --right-key "['image']" \
102
+ --img-or-txt "[['txt'], ['img']]" \
103
+ --left-weight "[1]" \
104
+ --right-weight "[1]" \
105
+ --run-queries-cpu \
106
+ --top-q 4000 \
107
+ --doc-id-key item_ID \
108
+ --context-length "[[64], [0]]"
109
+ ```
110
+
111
+
112
+ ## Detailed Performance
113
+ **GoogleShopping-Text2Image Retrieval.**
114
+
115
+ | **Embedding Model** | **mAP** | **P@10** | **R@10** | **MRR** |
116
+ | --- | --- | --- | --- | --- |
117
+ | Marqo-Ecommerce-L | **0.682** | **0.089** | **0.878** | **0.683** |
118
+ | Marqo-Ecommerce-B | 0.623 | 0.084 | 0.832 | 0.624 |
119
+ | Amazon-Titan-MultiModal | 0.475 | 0.065 | 0.648 | 0.475 |
120
+ | ViT-B-16-SigLip | 0.476 | 0.067 | 0.660 | 0.477 |
121
+ | ViT-L-16-SigLip | 0.540 | 0.073 | 0.722 | 0.540 |
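For readers unfamiliar with the column headers, here is a minimal sketch of how these metrics can be computed for a single query under binary relevance. It is not GCL's evaluation code. The function names and the toy ranking are hypothetical, and the reported numbers are averages over all queries.

```python
def precision_at_k(ranked_ids, relevant_ids, k=10):
    """Fraction of the top-k results that are relevant."""
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / k


def recall_at_k(ranked_ids, relevant_ids, k=10):
    """Fraction of all relevant items that appear in the top-k results."""
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / len(relevant_ids)


def reciprocal_rank(ranked_ids, relevant_ids):
    """1 / rank of the first relevant result, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def average_precision(ranked_ids, relevant_ids):
    """Mean of the precision values at each rank where a relevant item appears."""
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            total += hits / rank
    return total / len(relevant_ids) if relevant_ids else 0.0


# Toy query: ten retrieved items, exactly one of which is relevant.
ranked = ["d7", "d3", "d9", "d1", "d4", "d8", "d2", "d6", "d5", "d0"]
relevant = {"d3"}

p10 = precision_at_k(ranked, relevant)
r10 = recall_at_k(ranked, relevant)
rr = reciprocal_rank(ranked, relevant)
ap = average_precision(ranked, relevant)
print(p10, r10, rr, ap)
```

If each query has exactly one relevant item (an assumption about the dataset, not something stated here), P@10 cannot exceed 0.1 and average precision equals the reciprocal rank. That would be consistent with the small P@10 values and the near-identical mAP and MRR columns.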
122
+
123
+
124
+ **GoogleShopping-Category2Image Retrieval.**
125
+
126
+ | **Embedding Model** | **mAP** | **P@10** | **MRR** | **nDCG@10** |
127
+ | --- | --- | --- | --- | --- |
128
+ | Marqo-Ecommerce-L | **0.463** | **0.652** | **0.822** | **0.666** |
129
+ | Marqo-Ecommerce-B | 0.423 | 0.629 | 0.810 | 0.644 |
130
+ | Amazon-Titan-MultiModal | 0.246 | 0.429 | 0.642 | 0.446 |
131
+ | ViT-B-16-SigLip | 0.277 | 0.458 | 0.660 | 0.473 |
132
+ | ViT-L-16-SigLip | 0.324 | 0.497 | 0.687 | 0.509 |
133
+
134
+ **AmazonProducts-Text2Image Retrieval.**
135
+
136
+ | **Embedding Model** | **mAP** | **P@10** | **R@10** | **MRR** |
137
+ | --- | --- | --- | --- | --- |
138
+ | Marqo-Ecommerce-L | **0.658** | **0.096** | **0.854** | **0.663** |
139
+ | Marqo-Ecommerce-B | 0.592 | 0.089 | 0.795 | 0.597 |
140
+ | Amazon-Titan-MultiModal | 0.456 | 0.064 | 0.627 | 0.457 |
141
+ | ViT-B-16-SigLip | 0.480 | 0.070 | 0.650 | 0.484 |
142
+ | ViT-L-16-SigLip | 0.544 | 0.077 | 0.715 | 0.548 |
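One way to read the three tables together is to compute the relative mAP gain of each Marqo model over Amazon-Titan-MultiModal. The sketch below does this arithmetic using only the mAP values copied from the tables above. The dictionary layout and the helper `relative_gain_percent` are hypothetical, and mAP is only one of the reported metrics.

```python
# mAP values copied from the three tables above.
MAP_SCORES = {
    "GoogleShopping-Text2Image": {
        "Marqo-Ecommerce-L": 0.682,
        "Marqo-Ecommerce-B": 0.623,
        "Amazon-Titan-MultiModal": 0.475,
    },
    "GoogleShopping-Category2Image": {
        "Marqo-Ecommerce-L": 0.463,
        "Marqo-Ecommerce-B": 0.423,
        "Amazon-Titan-MultiModal": 0.246,
    },
    "AmazonProducts-Text2Image": {
        "Marqo-Ecommerce-L": 0.658,
        "Marqo-Ecommerce-B": 0.592,
        "Amazon-Titan-MultiModal": 0.456,
    },
}

BASELINE = "Amazon-Titan-MultiModal"


def relative_gain_percent(score, baseline_score):
    """Relative improvement of score over baseline_score, in percent."""
    return 100.0 * (score - baseline_score) / baseline_score


# Map (task, model) -> relative mAP gain over the baseline.
gains = {
    (task, model): relative_gain_percent(score, scores[BASELINE])
    for task, scores in MAP_SCORES.items()
    for model, score in scores.items()
    if model != BASELINE
}

for (task, model), gain in sorted(gains.items()):
    print(f"{task:32s} {model:20s} {gain:6.1f}%")
```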
143
+
144
+
145
+
146
+