Phenaki: Variable Length Video Generation From Open Domain Textual Description
Paper • arXiv:2210.02399
The embeddings of image and video patches from the raw frames x are processed first by a spatial transformer and then by a causal transformer (autoregressive in time) to generate video tokens.
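The spatial-then-causal ordering can be illustrated with a small sketch. This is not the paper's implementation: it assumes single-head dot-product attention with no learned projections, residual connections, or quantization step, and the names `attend` and `encode` are hypothetical. It only shows that patches first attend within their own frame, and that each patch position then attends over time without looking at later frames.

```python
import math


def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys, values, causal=False):
    """Single-head dot-product attention (illustrative, no learned weights)."""
    dim = len(queries[0])
    out = []
    for i, q in enumerate(queries):
        # With causal=True, position i only sees positions 0..i.
        limit = i + 1 if causal else len(keys)
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
            for k in keys[:limit]
        ]
        weights = softmax(scores)
        out.append(
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
        )
    return out


def encode(patch_embeddings):
    """patch_embeddings[t][n] is the embedding of patch n in frame t."""
    # Spatial step: patches attend to each other within each frame.
    spatial = [attend(frame, frame, frame) for frame in patch_embeddings]
    num_frames, num_patches = len(spatial), len(spatial[0])
    # Temporal step: each patch position attends over time, causally.
    tokens = [[None] * num_patches for _ in range(num_frames)]
    for n in range(num_patches):
        track = [spatial[t][n] for t in range(num_frames)]
        for t, vec in enumerate(attend(track, track, track, causal=True)):
            tokens[t][n] = vec
    return tokens
```

Because the temporal step is causal, changing a later frame leaves the outputs for earlier frames unchanged, and a single image can be treated as a one-frame video whose output depends only on the spatial step.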