OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision Paper • 2411.07199 • Published 5 days ago • 42
LLM2CLIP: Powerful Language Model Unlock Richer Visual Representation Paper • 2411.04997 • Published 9 days ago • 30
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models Paper • 2411.04905 • Published 9 days ago • 100
Teach Multimodal LLMs to Comprehend Electrocardiographic Images Paper • 2410.19008 • Published 26 days ago • 22
Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution Paper • 2409.12191 • Published Sep 18 • 73
NaturalBench: Evaluating Vision-Language Models on Natural Adversarial Samples Paper • 2410.14669 • Published 29 days ago • 35
HumanEval-V: Evaluating Visual Understanding and Reasoning Abilities of Large Multimodal Models Through Coding Tasks Paper • 2410.12381 • Published Oct 16 • 41
Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large Language Models Paper • 2410.07985 • Published Oct 10 • 26
MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks Paper • 2410.10563 • Published Oct 14 • 37