AnalogMutations committed on
Commit 0dc85a3
1 Parent(s): f090012

Add readme and model_index.json

Files changed (2):
  1. README.md +112 -3
  2. model_index.json +33 -0
README.md CHANGED
@@ -1,3 +1,112 @@
- ---
- license: mit
- ---

---
license: mit
tags:
- stable-diffusion
- stable-diffusion-diffusers
- image-to-image
- art
widget:
- src: >-
    https://hf.co/datasets/diffusers/diffusers-images-docs/resolve/main/mountain.png
  prompt: Cartoonize the following image
datasets:
- instruction-tuning-sd/cartoonization
---

# Instruction-tuned Stable Diffusion for Cartoonization (Fine-tuned)

This pipeline is an 'instruction-tuned' version of [Stable Diffusion (v1.5)](https://huggingface.co/runwayml/stable-diffusion-v1-5). It was
fine-tuned from the existing [InstructPix2Pix checkpoints](https://huggingface.co/timbrooks/instruct-pix2pix).

## Pipeline description

The motivation behind this pipeline comes partly from [FLAN](https://huggingface.co/papers/2109.01652) and partly
from [InstructPix2Pix](https://huggingface.co/papers/2211.09800). The main idea is to first create an
instruction-prompted dataset (as described in [our blog](https://hf.co/blog/instruction-tuning-sd)) and then conduct InstructPix2Pix-style
training. The end objective is to make Stable Diffusion better at following instructions
that involve image transformation operations.

<p align="center">
<img src="https://huggingface.co/datasets/sayakpaul/sample-datasets/resolve/main/instruction-tuning-sd.png" width=600/>
</p>

Refer to [this post](https://hf.co/blog/instruction-tuning-sd) to learn more.

## Training procedure and results

Training was conducted on the [instruction-tuning-sd/cartoonization](https://huggingface.co/datasets/instruction-tuning-sd/cartoonization) dataset. Refer to
[this repository](https://github.com/huggingface/instruction-tuned-sd) to learn more. The training logs can be found [here](https://wandb.ai/sayakpaul/instruction-tuning-sd?workspace=user-sayakpaul).
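To get a feel for the instruction-prompted data before training, the dataset can be inspected directly with the `datasets` library. This is a minimal sketch: the `train` split name is an assumption, and the column names are printed rather than assumed.

```python
from datasets import load_dataset

# Stream the cartoonization dataset from the Hub instead of downloading it fully.
# The "train" split name is assumed here; adjust if the repo uses another split.
ds = load_dataset("instruction-tuning-sd/cartoonization", split="train", streaming=True)

# Peek at one record to see the instruction/image column layout.
sample = next(iter(ds))
print(sample.keys())
```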
39
+
40
+ Here are some results dervied from the pipeline:
41
+
42
+ <p align="center">
43
+ <img src="https://huggingface.co/datasets/sayakpaul/sample-datasets/resolve/main/cartoonization_results.jpeg" width=600/>
44
+ </p>
45
+
46
+ ## Intended uses & limitations
47
+
48
+ You can use the pipeline for performing cartoonization with an input image and an input prompt.
49
+
50
+ ### How to use
51
+
52
+ Here is how to use this model:
53
+
54
```python
import torch
from diffusers import StableDiffusionInstructPix2PixPipeline
from diffusers.utils import load_image

model_id = "instruction-tuning-sd/cartoonizer"
# Load the pipeline in half precision and move it to the GPU.
# `use_auth_token` is only needed for private checkpoints; newer diffusers
# versions accept `token` instead.
pipeline = StableDiffusionInstructPix2PixPipeline.from_pretrained(
    model_id, torch_dtype=torch.float16, use_auth_token=True
).to("cuda")

# Fetch the input image to be cartoonized.
image_path = "https://hf.co/datasets/diffusers/diffusers-images-docs/resolve/main/mountain.png"
image = load_image(image_path)

# The instruction is passed as the prompt; the image is edited accordingly.
image = pipeline("Cartoonize the following image", image=image).images[0]
image.save("image.png")
```
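If the output drifts too far from, or sticks too closely to, the input, the standard InstructPix2Pix generation knobs apply. Below is a hedged sketch reusing the `pipeline` and `image` objects from above; the specific values are illustrative starting points, not tuned recommendations.

```python
# Fix the seed for reproducible outputs.
generator = torch.Generator("cuda").manual_seed(0)

image = pipeline(
    "Cartoonize the following image",
    image=image,
    num_inference_steps=20,    # fewer steps run faster at some quality cost
    image_guidance_scale=1.5,  # higher values stay closer to the input image
    guidance_scale=7.0,        # higher values follow the instruction more strongly
    generator=generator,
).images[0]
image.save("image.png")
```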

For notes on limitations, misuse, malicious use, and out-of-scope use, please refer to the model card
[here](https://huggingface.co/runwayml/stable-diffusion-v1-5).

## Citation

**FLAN**

```bibtex
@inproceedings{wei2022finetuned,
  title={Finetuned Language Models are Zero-Shot Learners},
  author={Jason Wei and Maarten Bosma and Vincent Zhao and Kelvin Guu and Adams Wei Yu and Brian Lester and Nan Du and Andrew M. Dai and Quoc V Le},
  booktitle={International Conference on Learning Representations},
  year={2022},
  url={https://openreview.net/forum?id=gEZrGCozdqR}
}
```

**InstructPix2Pix**

```bibtex
@InProceedings{brooks2022instructpix2pix,
  author    = {Brooks, Tim and Holynski, Aleksander and Efros, Alexei A.},
  title     = {InstructPix2Pix: Learning to Follow Image Editing Instructions},
  booktitle = {CVPR},
  year      = {2023},
}
```

**Instruction-tuning for Stable Diffusion blog**

```bibtex
@article{Paul2023instruction-tuning-sd,
  author  = {Paul, Sayak},
  title   = {Instruction-tuning Stable Diffusion with InstructPix2Pix},
  journal = {Hugging Face Blog},
  year    = {2023},
  note    = {https://huggingface.co/blog/instruction-tuning-sd},
}
```
model_index.json ADDED
@@ -0,0 +1,33 @@
{
  "_class_name": "StableDiffusionInstructPix2PixPipeline",
  "_diffusers_version": "0.15.0.dev0",
  "feature_extractor": [
    "transformers",
    "CLIPImageProcessor"
  ],
  "requires_safety_checker": false,
  "safety_checker": [
    "stable_diffusion",
    "StableDiffusionSafetyChecker"
  ],
  "scheduler": [
    "diffusers",
    "EulerAncestralDiscreteScheduler"
  ],
  "text_encoder": [
    "transformers",
    "CLIPTextModel"
  ],
  "tokenizer": [
    "transformers",
    "CLIPTokenizer"
  ],
  "unet": [
    "diffusers",
    "UNet2DConditionModel"
  ],
  "vae": [
    "diffusers",
    "AutoencoderKL"
  ]
}
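Each top-level key in model_index.json names a pipeline component, and its two-element list records the library and class used to load it; `_class_name` selects the pipeline class itself. As a quick sanity check, here is a minimal sketch (assuming the public `instruction-tuning-sd/cartoonizer` repo id from the README) showing that the generic loader resolves this file into the concrete pipeline:

```python
from diffusers import DiffusionPipeline

# The generic loader reads model_index.json, resolves `_class_name`, and
# instantiates each component from its listed library/class pair.
pipeline = DiffusionPipeline.from_pretrained("instruction-tuning-sd/cartoonizer")

print(type(pipeline).__name__)            # StableDiffusionInstructPix2PixPipeline
print(type(pipeline.scheduler).__name__)  # EulerAncestralDiscreteScheduler
print(sorted(pipeline.components))        # feature_extractor, safety_checker, ...
```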