Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models
Paper • arXiv:2404.07973
The model is tasked with identifying the object in a region mentioned in a query. We utilize the validation split of the LVIS dataset.
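
As a rough illustration of how such a region-level identification task could be scored, the sketch below computes plain accuracy over a list of region queries. Everything in it is an assumption made for illustration: the `RegionQuery` record, the `region_classification_accuracy` helper, the box format, the case-insensitive exact-match rule, and the choice of accuracy as the metric are all hypothetical, and the excerpt does not specify the prompt format or the metric. It also does not load LVIS; the two samples are toy data.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

# Assumed box format: (x1, y1, x2, y2) in pixel coordinates.
Box = Tuple[float, float, float, float]


@dataclass(frozen=True)
class RegionQuery:
    """One hypothetical evaluation sample: an image region and its true label."""
    image_id: str
    box: Box
    label: str


def region_classification_accuracy(
    samples: Sequence[RegionQuery],
    predict: Callable[[RegionQuery], str],
) -> float:
    """Return the fraction of samples whose predicted label matches the truth.

    Matching is case-insensitive exact string comparison (an assumption).
    """
    if not samples:
        raise ValueError("no samples to evaluate")
    correct = sum(
        predict(s).strip().lower() == s.label.strip().lower() for s in samples
    )
    return correct / len(samples)


# Toy data standing in for annotated regions; not real LVIS annotations.
samples = [
    RegionQuery("img_0", (10.0, 20.0, 110.0, 220.0), "cat"),
    RegionQuery("img_1", (0.0, 0.0, 50.0, 50.0), "traffic light"),
]


def stub_predict(sample: RegionQuery) -> str:
    """Placeholder for a model call; always answers 'Cat'."""
    return "Cat"


accuracy = region_classification_accuracy(samples, stub_predict)
```

In a real evaluation, `stub_predict` would be replaced by a call that sends the image and the referred region to the model and parses the object name from its answer.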