Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models Paper • 2404.07973 • Published Apr 11 • 30
MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos Paper • 2406.08407 • Published Jun 12 • 24