Zhang Yuanhan

ZhangYuanhan

https://zhangyuanhan-ai.github.io/

Recent Activity

New activity 10 days ago

lmms-lab/LLaVA-Video-178K

updated a model 28 days ago

lmms-lab/LLaVA-Video-72B-Qwen2

ZhangYuanhan's activity

upvoted a paper 12 days ago

HourVideo: 1-Hour Video-Language Understanding

Paper • 2411.04998 • Published 14 days ago • 1

upvoted 3 papers about 2 months ago

Contrastive Localized Language-Image Pre-Training

Paper • 2410.02746 • Published Oct 3 • 31

LLaVA-Critic: Learning to Evaluate Multimodal Models

Paper • 2410.02712 • Published Oct 3 • 34

Video Instruction Tuning With Synthetic Data

Paper • 2410.02713 • Published Oct 3 • 37

upvoted a collection 2 months ago

LLaVA-Video

Models focused on video understanding (previously known as LLaVA-NeXT-Video). • 6 items • Updated Oct 5 • 53

upvoted 2 papers 4 months ago

LLaVA-OneVision: Easy Visual Task Transfer

Paper • 2408.03326 • Published Aug 6 • 59

LMMs-Eval: Reality Check on the Evaluation of Large Multimodal Models

Paper • 2407.12772 • Published Jul 17 • 33

upvoted a paper 5 months ago

Long Context Transfer from Language to Vision

Paper • 2406.16852 • Published Jun 24 • 32

upvoted a paper 9 months ago

Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models

Paper • 2402.07865 • Published Feb 12 • 12

upvoted a paper about 1 year ago

Aligning Large Multimodal Models with Factually Augmented RLHF

Paper • 2309.14525 • Published Sep 25, 2023 • 30

upvoted 2 papers over 1 year ago

OpenFlamingo: An Open-Source Framework for Training Large Autoregressive Vision-Language Models

Paper • 2308.01390 • Published Aug 2, 2023 • 32

DIALGEN: Collaborative Human-LM Generated Dialogues for Improved Understanding of Human-Human Conversations

Paper • 2307.07047 • Published Jul 13, 2023 • 15