|
--- |
|
tags: |
|
- text-to-image |
|
- stable-diffusion |
|
- audio-to-video |
|
license: apache-2.0 |
|
language: |
|
- en |
|
library_name: diffusers |
|
--- |
|
|
|
# V-Express Model Card |
|
|
|
<div align="center"> |
|
|
|
[**Project Page**](https://tenvence.github.io/p/v-express/) **|** [**Paper (comming soon)**](https://tenvence.github.io/p/v-express/) **|** [**Code**](https://github.com/tencent-ailab/V-Express) |
|
|
|
</div> |
|
|
|
--- |
|
|
|
## Introduction |
|
|
|
## Models |
|
|
|
### Audio Encoder |
|
|
|
- [model_ckpts/wav2vec2-base-960h](https://huggingface.co/tk93/V-Express/tree/main/model_ckpts/wav2vec2-base-960h). (It is also available from the original model card [facebook/wav2vec2-base-960h](https://huggingface.co/facebook/wav2vec2-base-960h)) |
|
|
|
### Face Analysis |
|
|
|
- [model_ckpts/insightface_models/models/buffalo_l](https://huggingface.co/tk93/V-Express/tree/main/model_ckpts/insightface_models/models/buffalo_l). (It is also available from the original repository [insightface/buffalo_l](https://github.com/deepinsight/insightface/releases/download/v0.7/buffalo_l.zip)) |
|
|
|
### V-Express |
|
|
|
- [model_ckpts/sd-vae-ft-mse](https://huggingface.co/tk93/V-Express/tree/main/model_ckpts/sd-vae-ft-mse). VAE encoder. (original model card [stabilityai/sd-vae-ft-mse](https://huggingface.co/stabilityai/sd-vae-ft-mse)) |
|
- [model_ckpts/stable-diffusion-v1-5](https://huggingface.co/tk93/V-Express/tree/main/model_ckpts/stable-diffusion-v1-5). Only the model configuration file for unet is needed here. (original model card [runwayml/stable-diffusion-v1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5)) |
|
- [model_ckpts/v-express](https://huggingface.co/tk93/V-Express/tree/main/model_ckpts/v-express). The video generation model conditional on audio and V-kps we call V-Express. |
|
|