Upload 26 files

Files changed (27)

.gitattributes +1 -0
README.md +373 -0
controlnet_utils.py +40 -0
images/bag.png +0 -0
images/bag_scribble.png +0 -0
images/bag_scribble_out.png +0 -0
images/bird.png +3 -0
images/bird_canny.png +0 -0
images/bird_canny_out.png +0 -0
images/chef_pose_out.png +0 -0
images/house.png +0 -0
images/house_seg.png +0 -0
images/house_seg_out.png +0 -0
images/man.png +0 -0
images/man_hed.png +0 -0
images/man_hed_out.png +0 -0
images/openpose.png +0 -0
images/pose.png +0 -0
images/room.png +0 -0
images/room_mlsd.png +0 -0
images/room_mlsd_out.png +0 -0
images/stormtrooper.png +0 -0
images/stormtrooper_depth.png +0 -0
images/stormtrooper_depth_out.png +0 -0
images/toy.png +0 -0
images/toy_normal.png +0 -0
images/toy_normal_out.png +0 -0

.gitattributes CHANGED

@@ -32,3 +32,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text

+images/bird.png filter=lfs diff=lfs merge=lfs -text

README.md CHANGED

@@ -1,3 +1,376 @@
 ---
 license: openrail
 ---

+# ControlNet
+ControlNet is an auxiliary model that augments pre-trained diffusion models with additional conditioning.
+ControlNet comes with multiple auxiliary models, each of which allows a different type of conditioning.
+ControlNet's auxiliary models are trained with Stable Diffusion 1.5. Experimentally, the auxiliary models can also be used with other diffusion models, such as DreamBooth fine-tuned Stable Diffusion.
+The auxiliary conditioning is passed directly to the diffusers pipeline. If you want to process an image to create the auxiliary conditioning, external dependencies are required.
+Some of the additional conditionings can be extracted from images via additional models. We extracted these
+additional models from the original ControlNet repo into a separate package that can be found on [GitHub](https://github.com/patrickvonplaten/human_pose.git).
+## Canny edge detection
+Install OpenCV:
+```sh
+$ pip install opencv-contrib-python
+```
+```python
+import cv2
+from PIL import Image
+from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
+import torch
+import numpy as np
+image = Image.open('images/bird.png')
+image = np.array(image)
+low_threshold = 100
+high_threshold = 200
+image = cv2.Canny(image, low_threshold, high_threshold)
+image = image[:, :, None]
+image = np.concatenate([image, image, image], axis=2)
+image = Image.fromarray(image)
+controlnet = ControlNetModel.from_pretrained(
+    "fusing/stable-diffusion-v1-5-controlnet-canny",
+)
+pipe = StableDiffusionControlNetPipeline.from_pretrained(
+    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, safety_checker=None
+)
+pipe.to('cuda')
+image = pipe("bird", image).images[0]
+image.save('images/bird_canny_out.png')
+```
+![bird](./images/bird.png)
+![bird_canny](./images/bird_canny.png)
+![bird_canny_out](./images/bird_canny_out.png)
+## M-LSD Straight line detection
+Install the additional controlnet models package.
+```sh
+$ pip install git+https://github.com/patrickvonplaten/human_pose.git
+```
+```py
+from PIL import Image
+from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
+import torch
+from human_pose import MLSDdetector
+mlsd = MLSDdetector.from_pretrained('lllyasviel/ControlNet')
+image = Image.open('images/room.png')
+image = mlsd(image)
+controlnet = ControlNetModel.from_pretrained(
+    "fusing/stable-diffusion-v1-5-controlnet-mlsd",
+)
+pipe = StableDiffusionControlNetPipeline.from_pretrained(
+    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, safety_checker=None
+)
+pipe.to('cuda')
+image = pipe("room", image).images[0]
+image.save('images/room_mlsd_out.png')
+```
+![room](./images/room.png)
+![room_mlsd](./images/room_mlsd.png)
+![room_mlsd_out](./images/room_mlsd_out.png)
+## Pose estimation
+Install the additional controlnet models package.
+```sh
+$ pip install git+https://github.com/patrickvonplaten/human_pose.git
+```
+```py
+from PIL import Image
+from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
+import torch
+from human_pose import OpenposeDetector
+openpose = OpenposeDetector.from_pretrained('lllyasviel/ControlNet')
+image = Image.open('images/pose.png')
+image = openpose(image)
+controlnet = ControlNetModel.from_pretrained(
+    "fusing/stable-diffusion-v1-5-controlnet-openpose",
+)
+pipe = StableDiffusionControlNetPipeline.from_pretrained(
+    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, safety_checker=None
+)
+pipe.to('cuda')
+image = pipe("chef in the kitchen", image).images[0]
+image.save('images/chef_pose_out.png')
+```
+![pose](./images/pose.png)
+![openpose](./images/openpose.png)
+![chef_pose_out](./images/chef_pose_out.png)
+## Semantic Segmentation
+Semantic segmentation relies on transformers. Transformers is a
+dependency of diffusers for running controlnet, so you should
+have it installed already.
+```py
+from transformers import AutoImageProcessor, UperNetForSemanticSegmentation
+from PIL import Image
+import numpy as np
+from controlnet_utils import ade_palette
+import torch
+from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
+image_processor = AutoImageProcessor.from_pretrained("openmmlab/upernet-convnext-small")
+image_segmentor = UperNetForSemanticSegmentation.from_pretrained("openmmlab/upernet-convnext-small")
+image = Image.open("./images/house.png").convert('RGB')
+pixel_values = image_processor(image, return_tensors="pt").pixel_values
+with torch.no_grad():
+    outputs = image_segmentor(pixel_values)
+seg = image_processor.post_process_semantic_segmentation(outputs, target_sizes=[image.size[::-1]])[0]
+color_seg = np.zeros((seg.shape[0], seg.shape[1], 3), dtype=np.uint8) # height, width, 3
+palette = np.array(ade_palette())
+for label, color in enumerate(palette):
+    color_seg[seg == label, :] = color
+color_seg = color_seg.astype(np.uint8)
+image = Image.fromarray(color_seg)
+controlnet = ControlNetModel.from_pretrained(
+    "fusing/stable-diffusion-v1-5-controlnet-seg",
+)
+pipe = StableDiffusionControlNetPipeline.from_pretrained(
+    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, safety_checker=None
+)
+pipe.to('cuda')
+image = pipe("house", image).images[0]
+image.save('./images/house_seg_out.png')
+```
+![house](images/house.png)
+![house_seg](images/house_seg.png)
+![house_seg_out](images/house_seg_out.png)
+## Depth control
+Depth control relies on transformers. Transformers is a dependency of diffusers for running controlnet, so
+you should have it installed already.
+```py
+from transformers import pipeline
+from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
+from PIL import Image
+import numpy as np
+depth_estimator = pipeline('depth-estimation')
+image = Image.open('./images/stormtrooper.png')
+image = depth_estimator(image)['depth']
+image = np.array(image)
+image = image[:, :, None]
+image = np.concatenate([image, image, image], axis=2)
+image = Image.fromarray(image)
+controlnet = ControlNetModel.from_pretrained(
+    "fusing/stable-diffusion-v1-5-controlnet-depth",
+)
+pipe = StableDiffusionControlNetPipeline.from_pretrained(
+    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, safety_checker=None
+)
+pipe.to('cuda')
+image = pipe("Stormtrooper's lecture", image).images[0]
+image.save('./images/stormtrooper_depth_out.png')
+```
+![stormtrooper](./images/stormtrooper.png)
+![stormtrooper_depth](./images/stormtrooper_depth.png)
+![stormtrooper_depth_out](./images/stormtrooper_depth_out.png)
+## Normal map
+```py
+from PIL import Image
+from transformers import pipeline
+import numpy as np
+import cv2
+from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
+image = Image.open("images/toy.png").convert("RGB")
+depth_estimator = pipeline("depth-estimation", model="Intel/dpt-hybrid-midas")
+image = depth_estimator(image)['predicted_depth'][0]
+image = image.numpy()
+# Normalize a copy of the depth map to [0, 1] for background thresholding.
+image_depth = image.copy()
+image_depth -= np.min(image_depth)
+image_depth /= np.max(image_depth)
+bg_threshold = 0.4
+# Horizontal and vertical depth gradients; zero them in the background.
+x = cv2.Sobel(image, cv2.CV_32F, 1, 0, ksize=3)
+x[image_depth < bg_threshold] = 0
+y = cv2.Sobel(image, cv2.CV_32F, 0, 1, ksize=3)
+y[image_depth < bg_threshold] = 0
+z = np.ones_like(x) * np.pi * 2.0
+# Stack into per-pixel vectors, normalize to unit length, and map to [0, 255].
+image = np.stack([x, y, z], axis=2)
+image /= np.sum(image ** 2.0, axis=2, keepdims=True) ** 0.5
+image = (image * 127.5 + 127.5).clip(0, 255).astype(np.uint8)
+image = Image.fromarray(image)
+controlnet = ControlNetModel.from_pretrained(
+    "fusing/stable-diffusion-v1-5-controlnet-normal",
+)
+pipe = StableDiffusionControlNetPipeline.from_pretrained(
+    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, safety_checker=None
+)
+pipe.to('cuda')
+image = pipe("cute toy", image).images[0]
+image.save('images/toy_normal_out.png')
+```
+![toy](./images/toy.png)
+![toy_normal](./images/toy_normal.png)
+![toy_normal_out](./images/toy_normal_out.png)
+## Scribble
+Install the additional controlnet models package.
+```sh
+$ pip install git+https://github.com/patrickvonplaten/human_pose.git
+```
+```py
+from PIL import Image
+from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
+import torch
+from human_pose import HEDdetector
+hed = HEDdetector.from_pretrained('lllyasviel/ControlNet')
+image = Image.open('images/bag.png')
+image = hed(image, scribble=True)
+controlnet = ControlNetModel.from_pretrained(
+    "fusing/stable-diffusion-v1-5-controlnet-scribble",
+)
+pipe = StableDiffusionControlNetPipeline.from_pretrained(
+    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, safety_checker=None
+)
+pipe.to('cuda')
+image = pipe("bag", image).images[0]
+image.save('images/bag_scribble_out.png')
+```
+![bag](./images/bag.png)
+![bag_scribble](./images/bag_scribble.png)
+![bag_scribble_out](./images/bag_scribble_out.png)
+## HED Boundary
+Install the additional controlnet models package.
+```sh
+$ pip install git+https://github.com/patrickvonplaten/human_pose.git
+```
+```py
+from PIL import Image
+from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
+import torch
+from human_pose import HEDdetector
+hed = HEDdetector.from_pretrained('lllyasviel/ControlNet')
+image = Image.open('images/man.png')
+image = hed(image)
+controlnet = ControlNetModel.from_pretrained(
+    "fusing/stable-diffusion-v1-5-controlnet-hed",
+)
+pipe = StableDiffusionControlNetPipeline.from_pretrained(
+    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, safety_checker=None
+)
+pipe.to('cuda')
+image = pipe("oil painting of handsome old man, masterpiece", image).images[0]
+image.save('images/man_hed_out.png')
+```
+![man](./images/man.png)
+![man_hed](./images/man_hed.png)
+![man_hed_out](./images/man_hed_out.png)
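Note on the Normal map example: it derives surface normals from a depth map in a few dense NumPy lines. As a minimal sketch of the same idea without the depth model or OpenCV, the hypothetical helper `depth_to_normal_map` below (not part of this commit) uses `np.gradient` in place of `cv2.Sobel`, so its values will differ from the README's output. The 0.4 background threshold and the constant 2π z component are taken from the README snippet; everything else is an assumption for illustration.

```python
import numpy as np


def depth_to_normal_map(depth, bg_threshold=0.4):
    """Approximate an RGB normal map from a 2-D depth array (illustrative only)."""
    depth = np.asarray(depth, dtype=np.float32)

    # Rescale depth to [0, 1] so the background threshold is scale-independent.
    span = depth.max() - depth.min()
    if span > 0:
        depth_norm = (depth - depth.min()) / span
    else:
        depth_norm = np.zeros_like(depth)

    # Finite-difference gradients along rows (dy) and columns (dx).
    # Assumption: np.gradient stands in for the Sobel filter used in the README.
    dy, dx = np.gradient(depth)

    # Suppress gradients in regions treated as background.
    background = depth_norm < bg_threshold
    dx[background] = 0
    dy[background] = 0

    # Constant z component, as in the README snippet.
    dz = np.full_like(dx, np.pi * 2.0)

    # Stack into per-pixel vectors and normalize each to unit length.
    normals = np.stack([dx, dy, dz], axis=2)
    normals /= np.linalg.norm(normals, axis=2, keepdims=True)

    # Map components from [-1, 1] to [0, 255].
    return (normals * 127.5 + 127.5).clip(0, 255).astype(np.uint8)


# A depth ramp that increases from left to right.
ramp = np.tile(np.arange(8, dtype=np.float32), (8, 1))
normal_map = depth_to_normal_map(ramp)
```

Passing the result to `Image.fromarray` would give an image in the same layout as the README's conditioning image. Whether a finite-difference map conditions the model as well as the Sobel-based one is not tested here.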

controlnet_utils.py ADDED

@@ -0,0 +1,40 @@

+def ade_palette():
+    """ADE20K palette that maps each class to RGB values."""
+    return [[120, 120, 120], [180, 120, 120], [6, 230, 230], [80, 50, 50],
+            [4, 200, 3], [120, 120, 80], [140, 140, 140], [204, 5, 255],
+            [230, 230, 230], [4, 250, 7], [224, 5, 255], [235, 255, 7],
+            [150, 5, 61], [120, 120, 70], [8, 255, 51], [255, 6, 82],
+            [143, 255, 140], [204, 255, 4], [255, 51, 7], [204, 70, 3],
+            [0, 102, 200], [61, 230, 250], [255, 6, 51], [11, 102, 255],
+            [255, 7, 71], [255, 9, 224], [9, 7, 230], [220, 220, 220],
+            [255, 9, 92], [112, 9, 255], [8, 255, 214], [7, 255, 224],
+            [255, 184, 6], [10, 255, 71], [255, 41, 10], [7, 255, 255],
+            [224, 255, 8], [102, 8, 255], [255, 61, 6], [255, 194, 7],
+            [255, 122, 8], [0, 255, 20], [255, 8, 41], [255, 5, 153],
+            [6, 51, 255], [235, 12, 255], [160, 150, 20], [0, 163, 255],
+            [140, 140, 140], [250, 10, 15], [20, 255, 0], [31, 255, 0],
+            [255, 31, 0], [255, 224, 0], [153, 255, 0], [0, 0, 255],
+            [255, 71, 0], [0, 235, 255], [0, 173, 255], [31, 0, 255],
+            [11, 200, 200], [255, 82, 0], [0, 255, 245], [0, 61, 255],
+            [0, 255, 112], [0, 255, 133], [255, 0, 0], [255, 163, 0],
+            [255, 102, 0], [194, 255, 0], [0, 143, 255], [51, 255, 0],
+            [0, 82, 255], [0, 255, 41], [0, 255, 173], [10, 0, 255],
+            [173, 255, 0], [0, 255, 153], [255, 92, 0], [255, 0, 255],
+            [255, 0, 245], [255, 0, 102], [255, 173, 0], [255, 0, 20],
+            [255, 184, 184], [0, 31, 255], [0, 255, 61], [0, 71, 255],
+            [255, 0, 204], [0, 255, 194], [0, 255, 82], [0, 10, 255],
+            [0, 112, 255], [51, 0, 255], [0, 194, 255], [0, 122, 255],
+            [0, 255, 163], [255, 153, 0], [0, 255, 10], [255, 112, 0],
+            [143, 255, 0], [82, 0, 255], [163, 255, 0], [255, 235, 0],
+            [8, 184, 170], [133, 0, 255], [0, 255, 92], [184, 0, 255],
+            [255, 0, 31], [0, 184, 255], [0, 214, 255], [255, 0, 112],
+            [92, 255, 0], [0, 224, 255], [112, 224, 255], [70, 184, 160],
+            [163, 0, 255], [153, 0, 255], [71, 255, 0], [255, 0, 163],
+            [255, 204, 0], [255, 0, 143], [0, 255, 235], [133, 255, 0],
+            [255, 0, 235], [245, 0, 255], [255, 0, 122], [255, 245, 0],
+            [10, 190, 212], [214, 255, 0], [0, 204, 255], [20, 0, 255],
+            [255, 255, 0], [0, 153, 255], [0, 41, 255], [0, 255, 204],
+            [41, 0, 255], [41, 255, 0], [173, 0, 255], [0, 245, 255],
+            [71, 0, 255], [122, 0, 255], [0, 255, 184], [0, 92, 255],
+            [184, 255, 0], [0, 133, 255], [255, 214, 0], [25, 194, 194],
+            [102, 255, 0], [92, 0, 255]]
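
Note on `ade_palette()`: the segmentation example in the README colors the label map by looping over the palette one class at a time. A minimal sketch of an equivalent lookup, assuming every label is a valid index into the palette, is to let NumPy integer indexing pick one RGB triple per pixel in a single step. `colorize_segmentation` is a hypothetical helper, not part of this commit, and the toy palette reuses only the first three colors listed above.

```python
import numpy as np


def colorize_segmentation(seg, palette):
    """Map an (H, W) array of class indices to an (H, W, 3) uint8 RGB image."""
    palette = np.asarray(palette, dtype=np.uint8)
    seg = np.asarray(seg)
    # Unlike the README loop, which leaves unknown labels black,
    # this sketch rejects labels that fall outside the palette.
    if seg.min() < 0 or seg.max() >= len(palette):
        raise ValueError("label outside palette range")
    # Integer-array indexing looks up one RGB triple per pixel.
    return palette[seg]


# First three palette entries, enough for a toy 2x2 label map.
toy_palette = [[120, 120, 120], [180, 120, 120], [6, 230, 230]]
toy_seg = np.array([[0, 1], [2, 0]])
color_seg = colorize_segmentation(toy_seg, toy_palette)
```

With the full palette, `colorize_segmentation(seg, ade_palette())` should produce the same colors as the loop for in-range labels, provided `seg` is first converted to a NumPy array.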