M2rc-Eval: Massively Multilingual Repository-level Code Completion Evaluation • Paper • 2410.21157 • Published 12 days ago • 6
CityGaussianV2: Efficient and Geometrically Accurate Reconstruction for Large-Scale Scenes • Paper • 2411.00771 • Published 8 days ago • 9
OS-ATLAS: A Foundation Action Model for Generalist GUI Agents • Paper • 2410.23218 • Published 10 days ago • 43
AutoVFX: Physically Realistic Video Editing from Natural Language Instructions • Paper • 2411.02394 • Published 5 days ago • 14
Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent • Paper • 2411.02265 • Published 5 days ago • 22
AndroidLab: Training and Systematic Benchmarking of Android Autonomous Agents • Paper • 2410.24024 • Published 9 days ago • 45
Mini-Omni2: Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities • Paper • 2410.11190 • Published 26 days ago • 20
Web Agents with World Models: Learning and Leveraging Environment Dynamics in Web Navigation • Paper • 2410.13232 • Published 24 days ago • 40
AutoTrain: No-code training for state-of-the-art models • Paper • 2410.15735 • Published 20 days ago • 55
PUMA: Empowering Unified MLLM with Multi-granular Visual Generation • Paper • 2410.13861 • Published 23 days ago • 53
CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and Evolution • Paper • 2410.16256 • Published 19 days ago • 58
SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree • Paper • 2410.16268 • Published 19 days ago • 65
FrugalNeRF: Fast Convergence for Few-shot Novel View Synthesis without Learned Priors • Paper • 2410.16271 • Published 19 days ago • 80
PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction • Paper • 2410.17247 • Published 18 days ago • 43
SpectroMotion: Dynamic 3D Reconstruction of Specular Scenes • Paper • 2410.17249 • Published 18 days ago • 39
WorldSimBench: Towards Video Generation Models as World Simulators • Paper • 2410.18072 • Published 17 days ago • 16
LongVU: Spatiotemporal Adaptive Compression for Long Video-Language Understanding • Paper • 2410.17434 • Published 18 days ago • 24