---
language:
- en
pipeline_tag: unconditional-image-generation
tags:
- Diffusion Models
- Stable Diffusion
- Perturbed-Attention Guidance
- PAG
---

# Perturbed-Attention Guidance for SDXL

<div style="display:flex">
  <video width=50% autoplay loop controls>
    <source src="https://huggingface.co/multimodalart/sdxl_perturbed_attention_guidance/resolve/main/pag_sdxl.mp4" type="video/mp4">
  </video>
  <video width=50% autoplay loop controls>
    <source src="https://huggingface.co/multimodalart/sdxl_perturbed_attention_guidance/resolve/main/pag_uncond.mp4" type="video/mp4">
  </video>
</div>

The original Perturbed-Attention Guidance implementation for unconditional models and SD1.5, by [Hyoungwon Cho](https://huggingface.co/hyoungwoncho), is available at [hyoungwoncho/sd_perturbed_attention_guidance](https://huggingface.co/hyoungwoncho/sd_perturbed_attention_guidance).

[Project](https://ku-cvlab.github.io/Perturbed-Attention-Guidance/) / [arXiv](https://arxiv.org/abs/2403.17377) / [GitHub](https://github.com/KU-CVLAB/Perturbed-Attention-Guidance)

This repository is a simple implementation of Perturbed-Attention Guidance (PAG) for Stable Diffusion XL (SDXL), packaged as a custom pipeline for the 🧨 diffusers library.


## Quickstart

Load the custom pipeline:

```py
import torch
from diffusers import StableDiffusionXLPipeline

# Load the SDXL base weights, swapping in the PAG pipeline code from this repository
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    custom_pipeline="multimodalart/sdxl_perturbed_attention_guidance",
    torch_dtype=torch.float16,
)

device = "cuda"
pipe = pipe.to(device)
```

Unconditional sampling with PAG:
![image/jpeg](uncond_generation_pag.jpg)

```py
# Empty prompt with guidance_scale=0.0, so PAG is the only guidance applied
output = pipe(
    "",
    num_inference_steps=50,
    guidance_scale=0.0,
    pag_scale=5.0,
    pag_applied_layers=['mid'],
).images
```

Sampling with PAG and CFG:
![image/jpeg](cfgpag.jpg)

```py
# Text prompt with both CFG (guidance_scale) and PAG (pag_scale) enabled
output = pipe(
    "the spirit of a tamagotchi wandering in the city of Vienna",
    num_inference_steps=25,
    guidance_scale=4.0,
    pag_scale=3.0,
    pag_applied_layers=['mid'],
).images
```

## Parameters

`guidance_scale`: guidance scale for classifier-free guidance (CFG) (e.g. `7.5`)

`pag_scale`: guidance scale for PAG (e.g. `4.0`)

`pag_applied_layers`: layers to which the perturbation is applied (e.g. `['mid']`)

`pag_applied_layers_index`: indices of the layers to which the perturbation is applied (e.g. `['m0', 'm1']`)
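
To give a feel for how the two scales interact, here is a minimal sketch of the guidance combination as formulated in the PAG paper. It is not taken from this pipeline's code. The function name `guided_noise` and its argument names are hypothetical, and the CFG weight follows the paper's convention, which may be offset from the pipeline's `guidance_scale` by a constant.

```py
def guided_noise(eps, eps_perturbed, pag_scale, eps_uncond=None, cfg_weight=0.0):
    """Combine noise predictions following the PAG paper's formulation (a sketch).

    eps           -- prediction of the unmodified network (conditional if a prompt is used)
    eps_perturbed -- prediction with the selected self-attention layers perturbed
    eps_uncond    -- optional unconditional prediction, only needed for CFG
    Inputs can be floats, NumPy arrays, or torch tensors of matching shape.
    """
    # PAG term: push the prediction away from the perturbed-attention prediction
    out = eps + pag_scale * (eps - eps_perturbed)
    # CFG term (hypothetical argument names): push away from the unconditional prediction
    if eps_uncond is not None:
        out = out + cfg_weight * (eps - eps_uncond)
    return out


# PAG only, as in unconditional sampling
pag_only = guided_noise(1.0, 0.5, pag_scale=2.0)
# PAG together with CFG
pag_and_cfg = guided_noise(1.0, 0.5, pag_scale=2.0, eps_uncond=0.0, cfg_weight=3.0)
```

Setting `pag_scale` to `0.0` removes the PAG term, which leaves ordinary CFG (or the plain prediction when no unconditional branch is supplied).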

## Stable Diffusion XL Demo

[Try it here](https://huggingface.co/spaces/multimodalart/perturbed-attention-guidance-sdxl)