Update README.md
README.md
CHANGED
@@ -66,6 +66,34 @@ print(cos_sim(text_embeddings[0], text_embeddings[1])) # text embedding similarity
print(cos_sim(text_embeddings[0], image_embeddings[0])) # text-image cross-modal similarity
```

**Notice: our empirical study shows that text-text cosine similarity is normally larger than text-image cosine similarity!**

If you want to merge the two scores, we recommend two approaches:
1. Take a weighted average of the text-text and text-image similarities:

```python
# pseudo code: cos_sim_query_documents and cos_sim_text_images are
# arrays of cosine-similarity scores over the same candidate documents
alpha = 0.6  # weight for text-text (query-document) similarity
beta = 0.4   # weight for text-image cross-modal similarity

combined_scores = alpha * cos_sim_query_documents + beta * cos_sim_text_images
```
2. Apply z-score normalization to each score distribution before merging:

```python
# pseudo code
import numpy as np

# normalize each score distribution to zero mean and unit variance
query_document_sim_mean = np.mean(cos_sim_query_documents)
query_document_sim_std = np.std(cos_sim_query_documents)
text_image_sim_mean = np.mean(cos_sim_text_images)
text_image_sim_std = np.std(cos_sim_text_images)

query_document_sim_normalized = (cos_sim_query_documents - query_document_sim_mean) / query_document_sim_std
text_image_sim_normalized = (cos_sim_text_images - text_image_sim_mean) / text_image_sim_std

# sum the normalized scores to get the combined ranking score
combined_scores = query_document_sim_normalized + text_image_sim_normalized
```

## Performance

### Text-Image Retrieval
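To see the two merging strategies above end to end, here is a minimal self-contained sketch; the toy score values, the 0.6/0.4 weights, and the `zscore` helper are illustrative assumptions rather than code shipped with the repository:

```python
import numpy as np

# toy cosine-similarity scores for four candidate documents (illustrative values)
cos_sim_query_documents = np.array([0.82, 0.75, 0.90, 0.68])  # text-text scores
cos_sim_text_images = np.array([0.31, 0.28, 0.35, 0.22])      # text-image scores

# strategy 1: weighted average of the raw scores
alpha, beta = 0.6, 0.4
weighted = alpha * cos_sim_query_documents + beta * cos_sim_text_images

# strategy 2: z-score normalize each distribution, then sum
def zscore(x):
    return (x - np.mean(x)) / np.std(x)

z_summed = zscore(cos_sim_query_documents) + zscore(cos_sim_text_images)

# both yield one combined score per document; rank candidates by descending score
print(np.argsort(-weighted))  # ranking under strategy 1
print(np.argsort(-z_summed))  # ranking under strategy 2
```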