Dimitrios Kapetanios

dkapt

AI & ML interests

None yet

Recent Activity

liked a model 4 days ago

NexaAIDev/omnivision-968M

liked a model 6 days ago

nvidia/NV-Embed-v2

Organizations

None yet

dkapt's activity

upvoted an article 8 days ago

Releasing the largest multilingual open pretraining dataset

Article • 8 days ago • 94

upvoted a paper 12 days ago

M3DocRAG: Multi-modal Retrieval is What You Need for Multi-page Multi-document Understanding

Paper • 2411.04952 • Published 14 days ago • 27

upvoted an article 13 days ago

ColFlor: Towards BERT-Size Vision-Language Document Retrieval Models

Article • Oct 18 • 16

upvoted a paper about 1 month ago

Aria: An Open Multimodal Native Mixture-of-Experts Model

Paper • 2410.05993 • Published Oct 8 • 107

upvoted an article about 2 months ago

Document Similarity Search with ColPali

Article • Sep 21 • 47

upvoted a collection about 2 months ago

Qwen2-VL

Collection

Vision-language model series based on Qwen2 • 15 items • Updated Sep 18 • 156

upvoted a paper about 2 months ago

Flamingo: a Visual Language Model for Few-Shot Learning

Paper • 2204.14198 • Published Apr 29, 2022 • 14

upvoted an article about 2 months ago

Introducing IDEFICS: An Open Reproduction of State-of-the-art Visual Language Model

Article • Aug 22, 2023 • 27

upvoted a paper about 2 months ago

ColPali: Efficient Document Retrieval with Vision Language Models

Paper • 2407.01449 • Published Jun 27 • 41

upvoted a collection 4 months ago

Llama 3.1

Collection

This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated Sep 25 • 622

upvoted 4 articles 6 months ago

Multimodal Augmentation for Documents: Recovering “Comprehension” in “Reading and Comprehension” task

Article • May 16 • 17

Vision Language Models Explained

Article • Apr 11 • 214

A Dive into Pretraining Strategies for Vision-Language Models

Article • Feb 3, 2023 • 48

PaliGemma – Google's Cutting-Edge Open Vision Language Model

Article • May 14 • 210

upvoted a paper 8 months ago

Evaluating Frontier Models for Dangerous Capabilities

Paper • 2403.13793 • Published Mar 20 • 7