VFIMamba: Video Frame Interpolation with State Space Models

This is the official checkpoint library for VFIMamba: Video Frame Interpolation with State Space Models. Please refer to this repository for our code.

Model Description

VFIMamba is the first approach to adapt state space models (SSMs) to the video frame interpolation (VFI) task. Our contributions are:

We devise the Mixed-SSM Block (MSB) for efficient inter-frame modeling using S6, the selective SSM introduced by Mamba.
We explore various rearrangement methods to convert two frames into a sequence, discovering that interleaved rearrangement is more suitable for VFI tasks.
We propose a curriculum learning strategy to further leverage the potential of the S6 model.
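
To make the rearrangement point concrete, the toy sketch below contrasts two ways of flattening the tokens of two frames into a single sequence. It works on plain Python lists and is not the released implementation: the function names sequential_rearrange and interleaved_rearrange are hypothetical, and details such as 2D scan directions and feature dimensions are left out.

```python
def sequential_rearrange(tokens_a, tokens_b):
    """Place all tokens of frame A first, followed by all tokens of frame B."""
    return list(tokens_a) + list(tokens_b)


def interleaved_rearrange(tokens_a, tokens_b):
    """Alternate tokens so that tokens at the same position in both frames are adjacent."""
    if len(tokens_a) != len(tokens_b):
        raise ValueError("both frames must contribute the same number of tokens")
    merged = []
    for a, b in zip(tokens_a, tokens_b):
        merged.extend((a, b))
    return merged


# Toy tokens: "a*" come from the first frame, "b*" from the second.
frame0 = ["a0", "a1", "a2"]
frame1 = ["b0", "b1", "b2"]

print(sequential_rearrange(frame0, frame1))   # ['a0', 'a1', 'a2', 'b0', 'b1', 'b2']
print(interleaved_rearrange(frame0, frame1))  # ['a0', 'b0', 'a1', 'b1', 'a2', 'b2']
```

In the interleaved order, tokens from the same position in the two frames sit next to each other in the sequence, whereas the sequential order separates them by a full frame of tokens.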

Experimental results demonstrate that VFIMamba achieves state-of-the-art performance across various datasets and, in particular, highlight the potential of SSMs for high-resolution VFI.

Usage

We provide two models: an efficient version (VFIMamba-S) and a stronger one (VFIMamba). Select one by passing VFIMamba_S or VFIMamba to the --model parameter.

Manually Load

Please refer to the instructions here for manually loading the checkpoints and for a more customized experience.

python demo_2x.py --model [VFIMamba_S/VFIMamba]        # for 2x interpolation
python demo_Nx.py --n 8 --model [VFIMamba_S/VFIMamba] # for 8x interpolation
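
As a reminder of what the multiplier means, Nx interpolation inserts N - 1 evenly spaced frames between each pair of input frames. The helper below lists the corresponding normalized timestamps. Its name, intermediate_timesteps, is hypothetical and not part of the repository, and it makes no claim about how demo_Nx.py produces the frames internally.

```python
def intermediate_timesteps(n):
    """Return the normalized times in (0, 1) of the frames inserted by Nx interpolation."""
    if n < 2:
        raise ValueError("n must be at least 2")
    return [k / n for k in range(1, n)]


print(intermediate_timesteps(2))  # [0.5]
print(intermediate_timesteps(8))  # [0.125, 0.25, 0.375, 0.5, 0.625, 0.75, 0.875]
```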

Hugging Face Demo

For the Hugging Face demo, please refer to the code here.

python hf_demo_2x.py --model [VFIMamba_S/VFIMamba]     # for 2x interpolation
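
If you launch the demos from another Python program, a small wrapper can assemble the commands shown in this section and catch a mistyped model name early. This is a minimal sketch: the script names and the --model and --n flags are taken from the commands above, while build_demo_command and its mode keys are hypothetical helpers, and any other options the scripts may accept are not covered.

```python
import shlex

# Script names and model identifiers as they appear in the commands above.
SCRIPTS = {"2x": "demo_2x.py", "Nx": "demo_Nx.py", "hf_2x": "hf_demo_2x.py"}
MODELS = ("VFIMamba_S", "VFIMamba")


def build_demo_command(mode, model, n=None, python="python"):
    """Build the argument list for one of the demo scripts (hypothetical helper)."""
    if mode not in SCRIPTS:
        raise ValueError(f"unknown mode {mode!r}; expected one of {sorted(SCRIPTS)}")
    if model not in MODELS:
        raise ValueError(f"unknown model {model!r}; expected one of {MODELS}")
    cmd = [python, SCRIPTS[mode]]
    if mode == "Nx":
        if n is None:
            raise ValueError("mode 'Nx' requires n, e.g. n=8")
        cmd += ["--n", str(n)]
    elif n is not None:
        # This helper only passes --n to demo_Nx.py, matching the commands above.
        raise ValueError("n is only used with mode 'Nx'")
    cmd += ["--model", model]
    return cmd


print(shlex.join(build_demo_command("2x", "VFIMamba_S")))
print(shlex.join(build_demo_command("Nx", "VFIMamba", n=8)))
```

The returned list can be passed to subprocess.run(cmd, check=True) from the directory that contains the demo scripts.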

Citation

If you find this project helpful in your research or applications, please consider leaving a star ⭐️ and citing our paper:

@misc{zhang2024vfimambavideoframeinterpolation,
      title={VFIMamba: Video Frame Interpolation with State Space Models}, 
      author={Guozhen Zhang and Chunxu Liu and Yutao Cui and Xiaotong Zhao and Kai Ma and Limin Wang},
      year={2024},
      eprint={2407.02315},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2407.02315}, 
}