HaloQuest: A Visual Hallucination Dataset for Advancing Multimodal Reasoning. Paper • arXiv:2407.15680 • Published Jul 22, 2024
Building and better understanding vision-language models: insights and future directions. Paper • arXiv:2408.12637 • Published Aug 22, 2024
LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models. Paper • arXiv:2407.07895 • Published Jul 10, 2024
🎭 Avatars Collection: The latest AI-powered technologies usher in a new era of realistic avatars! 🚀 • 69 items • Updated Oct 21
FeatUp: A Model-Agnostic Framework for Features at Any Resolution. Paper • arXiv:2403.10516 • Published Mar 15, 2024
Matryoshka Embedding Models Collection: https://huggingface.co/blog/matryoshka • 14 items • Updated Jun 4
How Easy is It to Fool Your Multimodal LLMs? An Empirical Analysis on Deceptive Prompts. Paper • arXiv:2402.13220 • Published Feb 20, 2024
PokéLLMon: A Human-Parity Agent for Pokémon Battles with Large Language Models. Paper • arXiv:2402.01118 • Published Feb 2, 2024
Instruct-Imagen: Image Generation with Multi-modal Instruction. Paper • arXiv:2401.01952 • Published Jan 3, 2024
PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding. Paper • arXiv:2312.04461 • Published Dec 7, 2023
Describing Differences in Image Sets with Natural Language. Paper • arXiv:2312.02974 • Published Dec 5, 2023
Concept Sliders: LoRA Adaptors for Precise Control in Diffusion Models. Paper • arXiv:2311.12092 • Published Nov 20, 2023