PIN: Positional Insert Unlocks Object Localisation Abilities in VLMs Paper • 2402.08657 • Published Feb 13
GLOV: Guided Large Language Models as Implicit Optimizers for Vision Language Models Paper • 2410.06154 • Published Oct 8