arxiv:2401.12945

Lumiere: A Space-Time Diffusion Model for Video Generation

Published on Jan 23
· Submitted by akhaliq on Jan 24
#1 Paper of the day

Abstract

We introduce Lumiere -- a text-to-video diffusion model designed for synthesizing videos that portray realistic, diverse and coherent motion -- a pivotal challenge in video synthesis. To this end, we introduce a Space-Time U-Net architecture that generates the entire temporal duration of the video at once, through a single pass in the model. This is in contrast to existing video models which synthesize distant keyframes followed by temporal super-resolution -- an approach that inherently makes global temporal consistency difficult to achieve. By deploying both spatial and (importantly) temporal down- and up-sampling and leveraging a pre-trained text-to-image diffusion model, our model learns to directly generate a full-frame-rate, low-resolution video by processing it in multiple space-time scales. We demonstrate state-of-the-art text-to-video generation results, and show that our design easily facilitates a wide range of content creation tasks and video editing applications, including image-to-video, video inpainting, and stylized generation.
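
To make the idea of "spatial and temporal down- and up-sampling" concrete, here is a minimal sketch in plain Python. It is not Lumiere's architecture: the abstract does not specify the layers, so average pooling and nearest-neighbour repetition are assumed stand-ins, and the names `downsample_spacetime`, `upsample_spacetime`, `ft`, and `fs` are hypothetical. It only illustrates how a whole clip can be compressed along time as well as space and then expanded back to full frame rate.

```python
from typing import List, Tuple

# A grayscale video as nested lists, indexed as [time][height][width].
Video = List[List[List[float]]]


def shape(video: Video) -> Tuple[int, int, int]:
    """Return (frames, height, width) of a nested-list video."""
    return (len(video), len(video[0]), len(video[0][0]))


def downsample_spacetime(video: Video, ft: int, fs: int) -> Video:
    """Average-pool over non-overlapping blocks of ft frames and fs x fs pixels.

    Assumption: average pooling stands in for whatever learned
    down-sampling the real model uses.
    """
    t, h, w = shape(video)
    if t % ft or h % fs or w % fs:
        raise ValueError("dimensions must be divisible by the pooling factors")
    out = []
    for t0 in range(0, t, ft):
        frame = []
        for y0 in range(0, h, fs):
            row = []
            for x0 in range(0, w, fs):
                block = [
                    video[tt][yy][xx]
                    for tt in range(t0, t0 + ft)
                    for yy in range(y0, y0 + fs)
                    for xx in range(x0, x0 + fs)
                ]
                row.append(sum(block) / len(block))
            frame.append(row)
        out.append(frame)
    return out


def upsample_spacetime(video: Video, ft: int, fs: int) -> Video:
    """Nearest-neighbour upsampling: repeat values ft times in time, fs in space."""
    t, h, w = shape(video)
    return [
        [
            [video[tt // ft][yy // fs][xx // fs] for xx in range(w * fs)]
            for yy in range(h * fs)
        ]
        for tt in range(t * ft)
    ]


# Usage: an 8-frame, 4x4 clip whose pixel values equal the frame index.
clip = [[[float(t)] * 4 for _ in range(4)] for t in range(8)]
coarse = downsample_spacetime(clip, ft=2, fs=2)    # coarser in time and space
restored = upsample_spacetime(coarse, ft=2, fs=2)  # back to full frame rate
print(shape(clip), shape(coarse), shape(restored))
```

The point of the sketch is that the coarse representation still covers the full duration of the clip, just at a lower temporal and spatial resolution, which is the property the abstract contrasts with keyframe-then-interpolate pipelines.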

Community

No model, just a demo once again? Google, you are falling way behind in the AI race.

decide

The video is amazing, but where is the action?! You can take top 1 in the AI race, but why don't you want it yourself? What should push you to do this?!

I would like to run tests on the model: very rich, hard tests.

Cool!

Lumiere's Breakthrough: Space-Time Diffusion for Stunning Video Generation

Links 🔗:

👉 Subscribe: https://www.youtube.com/@Arxflix
👉 Twitter: https://x.com/arxflix
👉 LMNT (Partner): https://lmnt.com/

By Arxflix

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2401.12945 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2401.12945 in a dataset README.md to link it from this page.

Spaces citing this paper 1

Collections including this paper 49