Update README.md
README.md CHANGED
@@ -8,23 +8,25 @@ license: creativeml-openrail-m
 
 ![pipeline](pipeline.png)
 
-SDXL consists of a mixture-of-experts pipeline for latent diffusion:
+[SDXL](https://arxiv.org/abs/2307.01952) consists of a mixture-of-experts pipeline for latent diffusion:
 In a first step, the base model is used to generate (noisy) latents,
-which are then further processed with a refinement model (available here:
+which are then further processed with a refinement model (available here: https://huggingface.co/stabilityai/stable-diffusion-xl-refiner-1.0/) specialized for the final denoising steps.
 Note that the base model can be used as a standalone module.
 
-Alternatively, we can use a two-
+Alternatively, we can use a two-stage pipeline as follows:
 First, the base model is used to generate latents of the desired output size.
 In the second step, we use a specialized high-resolution model and apply a technique called SDEdit (https://arxiv.org/abs/2108.01073, also known as "img2img")
-to the latents generated in the first step, using the same prompt.
+to the latents generated in the first step, using the same prompt. This technique is slightly slower than the first one, as it requires more function evaluations.
+
+Source code is available at https://github.com/Stability-AI/generative-models .
 
 ### Model Description
 
 - **Developed by:** Stability AI
 - **Model type:** Diffusion-based text-to-image generative model
-- **License:** [
+- **License:** [CreativeML Open RAIL++-M License](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/LICENSE.md)
 - **Model Description:** This is a model that can be used to generate and modify images based on text prompts. It is a [Latent Diffusion Model](https://arxiv.org/abs/2112.10752) that uses two fixed, pretrained text encoders ([OpenCLIP-ViT/G](https://github.com/mlfoundations/open_clip) and [CLIP-ViT/L](https://github.com/openai/CLIP/tree/main)).
-- **Resources for more information:** [GitHub Repository](https://github.com/Stability-AI/generative-models) [SDXL
+- **Resources for more information:** Check out our [GitHub Repository](https://github.com/Stability-AI/generative-models) and the [SDXL report on arXiv](https://arxiv.org/abs/2307.01952).
 
 ### Model Sources
 
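For context beyond the diff itself: the changed text notes that the base model works as a standalone module. A minimal sketch of that standalone use, assuming the `diffusers` library and a CUDA GPU (neither is specified in this hunk), with an illustrative prompt:

```python
# Minimal standalone use of the base checkpoint via diffusers (assumed setup).
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    variant="fp16",
    use_safetensors=True,
).to("cuda")

prompt = "An astronaut riding a green horse"  # example prompt, not from the card
image = pipe(prompt=prompt).images[0]
image.save("astronaut.png")
```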
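The base-then-refiner handoff described in the new lines 11-13 (base produces noisy latents, refiner handles the final denoising steps) can be sketched as below. The 80/20 split via `denoising_end`/`denoising_start` and the 40-step count are illustrative assumptions, not values mandated by the card:

```python
# Sketch of the mixture-of-experts flow: base denoises the first part of the
# schedule and hands latents to the refiner for the final steps.
import torch
from diffusers import DiffusionPipeline

base = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16, variant="fp16", use_safetensors=True,
).to("cuda")

refiner = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-refiner-1.0",
    text_encoder_2=base.text_encoder_2,  # share components to save memory
    vae=base.vae,
    torch_dtype=torch.float16, variant="fp16", use_safetensors=True,
).to("cuda")

prompt = "A majestic lion jumping from a big stone at night"

# Base covers the first 80% of the noise schedule and returns raw latents.
latents = base(
    prompt=prompt,
    num_inference_steps=40,
    denoising_end=0.8,
    output_type="latent",
).images

# Refiner takes over the remaining 20% of the schedule on those latents.
image = refiner(
    prompt=prompt,
    num_inference_steps=40,
    denoising_start=0.8,
    image=latents,
).images[0]
image.save("lion.png")
```

Sharing `text_encoder_2` and the VAE between the two pipelines is a memory-saving choice, not a requirement of the approach.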
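The alternative two-stage route in the new lines 16-19 (generate a full image with the base model, then apply an SDEdit-style img2img pass with the same prompt) might look like this sketch; the use of `StableDiffusionXLImg2ImgPipeline` with its default strength is an assumption:

```python
# Two-stage sketch: full base generation, then an img2img (SDEdit) refinement
# pass over the result using the same prompt.
import torch
from diffusers import DiffusionPipeline, StableDiffusionXLImg2ImgPipeline

base = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16, variant="fp16", use_safetensors=True,
).to("cuda")
refiner = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-refiner-1.0",
    torch_dtype=torch.float16, variant="fp16", use_safetensors=True,
).to("cuda")

prompt = "A majestic lion jumping from a big stone at night"

image = base(prompt=prompt).images[0]                  # stage 1: full generation
image = refiner(prompt=prompt, image=image).images[0]  # stage 2: img2img pass
image.save("lion_refined.png")
```

Because stage 2 re-runs part of the diffusion process on a finished image, this route needs more function evaluations overall, which matches the diff's note that it is slightly slower.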