float16 model only generates black images/videos

#15
by jiagaoxiang - opened

When I run the code example below, I only get black images/video. It seems to be related to the same issue in the stabilityai/stable-diffusion-2-1 model: when I use stabilityai/stable-diffusion-2-1 in float16, it gives me all-black images, but if I use float32 or another diffusion model, the output is fine. For Stable Video Diffusion, I cannot use float32 because it causes memory errors. Can anyone help explain why the float16 dtype leads to black images?

#################code below####################################################

import torch

from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import load_image, export_to_video

pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid", torch_dtype=torch.float16, variant="fp16"
)
pipe.to("cuda")
pipe.unet = torch.compile(pipe.unet, fullgraph=True)

# Load the conditioning image

image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/svd/rocket.png")
image = image.resize((1024, 576))

generator = torch.manual_seed(42)
frames = pipe(image, decode_chunk_size=8, generator=generator).frames[0]

export_to_video(frames, "generated.mp4", fps=7)
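
To narrow this down, here is a minimal sketch (not part of the script above, and only an assumption about the cause) of one common way half precision produces black output: float16 cannot represent values above 65504, so a large intermediate value overflows to infinity, later arithmetic turns it into NaN, and NaN pixels typically end up as zeros when frames are converted to 8-bit images. The helper names `to_fp16` and `all_black` are hypothetical, and the arrays are stand-ins for real activations and frames.

```python
import numpy as np

# Largest finite float16 value; anything bigger rounds to infinity.
FP16_MAX = float(np.finfo(np.float16).max)


def to_fp16(values):
    """Cast floats to float16, silencing NumPy's overflow warnings."""
    with np.errstate(over="ignore", invalid="ignore"):
        return np.asarray(values, dtype=np.float32).astype(np.float16)


def all_black(frames):
    """Return True if every frame contains only zero-valued pixels."""
    return all(np.asarray(frame).max() == 0 for frame in frames)


# Stand-in activations: the last value exceeds the float16 range.
activations = to_fp16([1000.0, 65504.0, 70000.0])
overflowed = np.isinf(activations)

# inf - inf is NaN, and NaN then propagates through later operations.
with np.errstate(invalid="ignore"):
    centered = activations - activations.max()
has_nan = bool(np.isnan(centered).any())

# Stand-in frames with the same size as the resized conditioning image.
black_frames = [np.zeros((576, 1024, 3), dtype=np.uint8) for _ in range(2)]
mixed_frames = black_frames + [np.full((576, 1024, 3), 128, dtype=np.uint8)]

print("float16 max:", FP16_MAX)
print("overflowed:", overflowed.tolist())
print("NaN after subtraction:", has_nan)
print("all black:", all_black(black_frames), all_black(mixed_frames))
```

Calling `all_black(frames)` on the frames returned by the pipeline would confirm whether every pixel is exactly zero, which is consistent with (but does not prove) a NaN somewhere in the float16 computation.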
