Diffusers documentation

RePaint

You are viewing v0.18.2 version. A newer version v0.31.0 is available.

Overview

RePaint: Inpainting using Denoising Diffusion Probabilistic Models by Andreas Lugmayr, Martin Danelljan, Andres Romero, Fisher Yu, Radu Timofte, Luc Van Gool.

The abstract of the paper is the following:

Free-form inpainting is the task of adding new content to an image in the regions specified by an arbitrary binary mask. Most existing approaches train for a certain distribution of masks, which limits their generalization capabilities to unseen mask types. Furthermore, training with pixel-wise and perceptual losses often leads to simple textural extensions towards the missing areas instead of semantically meaningful generation. In this work, we propose RePaint: A Denoising Diffusion Probabilistic Model (DDPM) based inpainting approach that is applicable to even extreme masks. We employ a pretrained unconditional DDPM as the generative prior. To condition the generation process, we only alter the reverse diffusion iterations by sampling the unmasked regions using the given image information. Since this technique does not modify or condition the original DDPM network itself, the model produces high-quality and diverse output images for any inpainting form. We validate our method for both faces and general-purpose image inpainting using standard and extreme masks. RePaint outperforms state-of-the-art Autoregressive, and GAN approaches for at least five out of six mask distributions.

The original codebase can be found here.
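
The conditioning idea in the abstract (keep the known pixels from the given image, let the model fill in the rest) can be sketched in a few lines. This is a minimal NumPy illustration, not the pipeline's code: the function name `repaint_blend`, the `alpha_bar` argument (the cumulative noise-schedule product at the target timestep) and the zero array standing in for the model's reverse-step sample are all assumptions made for the example.

```python
import numpy as np


def repaint_blend(x0_known, x_unknown, mask, alpha_bar, rng):
    """Combine a noised copy of the known pixels with the model's sample.

    mask == 1 keeps the known pixels, mask == 0 marks the region to inpaint.
    """
    noise = rng.standard_normal(x0_known.shape)
    # Forward-diffuse the known image to the target noise level.
    x_known = np.sqrt(alpha_bar) * x0_known + np.sqrt(1.0 - alpha_bar) * noise
    # Known pixels where mask is 1, model output where mask is 0.
    return mask * x_known + (1.0 - mask) * x_unknown


rng = np.random.default_rng(0)
x0 = np.ones((4, 4))          # toy "original image"
x_model = np.zeros((4, 4))    # stand-in for the DDPM reverse-step sample
mask = np.ones((4, 4))
mask[1:3, 1:3] = 0.0          # the centre 2x2 block is to be inpainted

# alpha_bar=1.0 means "no noise", so the known pixels pass through unchanged.
blended = repaint_blend(x0, x_model, mask, alpha_bar=1.0, rng=rng)
```

In a real sampling loop this blend would be applied at every reverse step, with `alpha_bar` taken from the scheduler's noise schedule rather than fixed.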

Available Pipelines:

Pipeline             Tasks             Colab
pipeline_repaint.py  Image Inpainting  -

Usage example

from io import BytesIO

import PIL.Image
import requests
import torch
from diffusers import RePaintPipeline, RePaintScheduler


def download_image(url):
    # Fetch the image and fail loudly on HTTP errors before decoding it
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return PIL.Image.open(BytesIO(response.content)).convert("RGB")


img_url = "https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/repaint/celeba_hq_256.png"
mask_url = "https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/repaint/mask_256.png"

# Load the original image and the mask as PIL images
original_image = download_image(img_url).resize((256, 256))
mask_image = download_image(mask_url).resize((256, 256))

# Load the RePaint scheduler and pipeline based on a pretrained DDPM model
scheduler = RePaintScheduler.from_pretrained("google/ddpm-ema-celebahq-256")
pipe = RePaintPipeline.from_pretrained("google/ddpm-ema-celebahq-256", scheduler=scheduler)
pipe = pipe.to("cuda")

generator = torch.Generator(device="cuda").manual_seed(0)
output = pipe(
    image=original_image,
    mask_image=mask_image,
    num_inference_steps=250,
    eta=0.0,
    jump_length=10,
    jump_n_sample=10,
    generator=generator,
)
inpainted_image = output.images[0]
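
The example above downloads a ready-made mask. For a custom region, a mask can be built by hand. The sketch below assumes the convention stated in the `mask_image` parameter description (dark pixels are inpainted, bright pixels are kept); the helper name `make_box_mask` and the box coordinates are hypothetical.

```python
import numpy as np


def make_box_mask(height, width, box):
    """Build a uint8 mask: 255 keeps a pixel, 0 marks it for inpainting."""
    top, left, bottom, right = box
    mask = np.full((height, width), 255, dtype=np.uint8)
    mask[top:bottom, left:right] = 0
    return mask


# Inpaint the central 128x128 square of a 256x256 image.
mask = make_box_mask(256, 256, box=(64, 64, 192, 192))
```

`PIL.Image.fromarray(mask)` turns this array into a grayscale PIL image that could be passed as `mask_image`.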

RePaintPipeline

class diffusers.RePaintPipeline

( unet scheduler )

__call__

( image: typing.Union[torch.Tensor, PIL.Image.Image] mask_image: typing.Union[torch.Tensor, PIL.Image.Image] num_inference_steps: int = 250 eta: float = 0.0 jump_length: int = 10 jump_n_sample: int = 10 generator: typing.Union[torch._C.Generator, typing.List[torch._C.Generator], NoneType] = None output_type: typing.Optional[str] = 'pil' return_dict: bool = True ) ImagePipelineOutput or tuple

Parameters

  • image (torch.FloatTensor or PIL.Image.Image) — The original image to inpaint.
  • mask_image (torch.FloatTensor or PIL.Image.Image) — The mask, where 0.0 values mark the parts of the original image to inpaint (change).
  • num_inference_steps (int, optional, defaults to 250) — The number of denoising steps. More denoising steps usually lead to a higher quality image at the expense of slower inference.
  • eta (float, optional, defaults to 0.0) — The weight of the noise added in each diffusion step. Its value is between 0.0 and 1.0, where 0.0 corresponds to DDIM and 1.0 to the DDPM scheduler.
  • jump_length (int, optional, defaults to 10) — The number of steps taken forward in time before going backward in time for a single jump (“j” in RePaint paper). Take a look at Figure 9 and 10 in https://arxiv.org/pdf/2201.09865.pdf.
  • jump_n_sample (int, optional, defaults to 10) — The number of times a forward time jump is made for a given chosen time sample. Take a look at Figure 9 and 10 in https://arxiv.org/pdf/2201.09865.pdf.
  • generator (torch.Generator, optional) — One or a list of torch generator(s) to make generation deterministic.
  • output_type (str, optional, defaults to "pil") — The output format of the generated image. Choose between PIL (PIL.Image.Image) and np.array.
  • return_dict (bool, optional, defaults to True) — Whether or not to return an ImagePipelineOutput instead of a plain tuple.
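
To see how `jump_length` and `jump_n_sample` interact, it can help to list the timesteps a resampling schedule visits. The function below is a pure-Python sketch based on a reading of the jump schedule in Figures 9 and 10 of the paper. The name `repaint_timesteps` is hypothetical, and the library's scheduler may differ in detail (for example, by rescaling the indices to the training timestep range).

```python
def repaint_timesteps(num_inference_steps, jump_length, jump_n_sample):
    """Return the sequence of timestep indices visited, including jumps."""
    # At each of these timesteps, (jump_n_sample - 1) jumps back up remain.
    jumps = {
        t: jump_n_sample - 1
        for t in range(0, num_inference_steps - jump_length, jump_length)
    }
    timesteps = []
    t = num_inference_steps
    while t >= 1:
        t -= 1  # one reverse (denoising) step
        timesteps.append(t)
        if jumps.get(t, 0) > 0:
            jumps[t] -= 1
            for _ in range(jump_length):
                t += 1  # one forward (re-noising) step
                timesteps.append(t)
    return timesteps


# Tiny schedule so the jumps are easy to read off.
schedule = repaint_timesteps(num_inference_steps=6, jump_length=2, jump_n_sample=2)
print(schedule)  # [5, 4, 3, 2, 3, 4, 3, 2, 1, 0, 1, 2, 1, 0]
```

Each rise in the sequence is a jump forward in time (noise is added again) and each fall is a denoising step, so larger `jump_length` or `jump_n_sample` values lengthen the schedule and slow inference.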

Returns

ImagePipelineOutput or tuple

ImagePipelineOutput if return_dict is True, otherwise a tuple. When returning a tuple, the first element is a list with the generated images.
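
Code that must accept either return style can normalize the result first. The helper below is a small sketch: `extract_images` is a hypothetical name, and the two stand-in values only mimic the shapes described above (an object with an `images` attribute, or a tuple whose first element is the image list), so no model is loaded.

```python
from types import SimpleNamespace


def extract_images(result):
    """Return the list of images from either return style."""
    if isinstance(result, tuple):
        return result[0]
    return result.images


# Stand-ins for the two return styles; real calls would hold PIL images.
as_object = SimpleNamespace(images=["img"])
as_tuple = (["img"],)

images_from_object = extract_images(as_object)
images_from_tuple = extract_images(as_tuple)
```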