SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models Paper • 2411.05007 • Published 2 days ago • 13
VILA-U-7B Collection VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation • 2 items • Updated 19 days ago • 4
GarVerseLOD: High-Fidelity 3D Garment Reconstruction from a Single In-the-Wild Image using a Dataset with Levels of Details Paper • 2411.03047 • Published 5 days ago • 6
NeuZip: Memory-Efficient Training and Inference with Dynamic Compression of Neural Networks Paper • 2410.20650 • Published 13 days ago • 14
Correlation of Object Detection Performance with Visual Saliency and Depth Estimation Paper • 2411.02844 • Published 5 days ago • 3
IGOR: Image-GOal Representations are the Atomic Control Units for Foundation Models in Embodied AI Paper • 2411.00785 • Published 24 days ago • 8
AutoVFX: Physically Realistic Video Editing from Natural Language Instructions Paper • 2411.02394 • Published 5 days ago • 14
Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent Paper • 2411.02265 • Published 5 days ago • 22
Adaptive Caching for Faster Video Generation with Diffusion Transformers Paper • 2411.02397 • Published 5 days ago • 17
How Far is Video Generation from World Model: A Physical Law Perspective Paper • 2411.02385 • Published 5 days ago • 27
Training-free Regional Prompting for Diffusion Transformers Paper • 2411.02395 • Published 5 days ago • 22
Survey of User Interface Design and Interaction Techniques in Generative AI Applications Paper • 2410.22370 • Published 12 days ago • 11
CityGaussianV2: Efficient and Geometrically Accurate Reconstruction for Large-Scale Scenes Paper • 2411.00771 • Published 8 days ago • 9
GRS-QA -- Graph Reasoning-Structured Question Answering Dataset Paper • 2411.00369 • Published 9 days ago • 6