Anything in Any Scene: Photorealistic Video Object Insertion • Paper • 2401.17509 • Published Jan 30 • 16
Memory Consolidation Enables Long-Context Video Understanding • Paper • 2402.05861 • Published Feb 8 • 8
EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions • Paper • 2402.17485 • Published Feb 27 • 188
Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers • Paper • 2402.19479 • Published Feb 29 • 32
AnyV2V: A Plug-and-Play Framework For Any Video-to-Video Editing Tasks • Paper • 2403.14468 • Published Mar 21 • 22
StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text • Paper • 2403.14773 • Published Mar 21 • 10
InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding • Paper • 2403.15377 • Published Mar 22 • 22