xinsir
/

controlnet-openpose-sdxl-1.0

+---
+license: apache-2.0
+---
+# ***State-of-the-art ControlNet-openpose-sdxl-1.0 model; the images are for demonstration only and the model is not limited to anime***
+![images](./masonry0.webp)
+### Examples
+![images0](./000001_scribble_concat.webp)
+![images1](./000003_scribble_concat.webp)
+![images2](./000005_scribble_concat.webp)
+![images3](./000008_scribble_concat.webp)
+![images4](./000015_scribble_concat.webp)
+![images5](./000031_scribble_concat.webp)
+![images6](./000042_scribble_concat.webp)
+![images7](./000047_scribble_concat.webp)
+![images8](./000048_scribble_concat.webp)
+![images9](./000083_scribble_concat.webp)
+## How to Get Started with the Model
+Use the code below to get started with the model.
+```python
+from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline, AutoencoderKL
+from diffusers import DDIMScheduler, EulerAncestralDiscreteScheduler
+from controlnet_aux import OpenposeDetector
+from PIL import Image
+import torch
+import numpy as np
+import cv2
+controlnet_conditioning_scale = 1.0
+prompt = "your prompt, the longer the better; describe the image in as much detail as possible"
+negative_prompt = 'longbody, lowres, bad anatomy, bad hands, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality'
+eulera_scheduler = EulerAncestralDiscreteScheduler.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0", subfolder="scheduler")
+controlnet = ControlNetModel.from_pretrained(
+    "xinsir/controlnet-openpose-sdxl-1.0",
+    torch_dtype=torch.float16
+)
+# When testing with another base model, change the VAE as well.
+vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
+pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
+    "stabilityai/stable-diffusion-xl-base-1.0",
+    controlnet=controlnet,
+    vae=vae,
+    safety_checker=None,
+    torch_dtype=torch.float16,
+    scheduler=eulera_scheduler,
+)
+processor = OpenposeDetector.from_pretrained('lllyasviel/ControlNet')
+controlnet_img = cv2.imread("your image path")
+controlnet_img = processor(controlnet_img, hand_and_face=False, output_type='cv2')
+# For best results, resize the control image to 1024 * 1024 or to a bucket resolution with the same pixel area.
+height, width, _ = controlnet_img.shape
+ratio = np.sqrt(1024. * 1024. / (width * height))
+new_width, new_height = int(width * ratio), int(height * ratio)
+controlnet_img = cv2.resize(controlnet_img, (new_width, new_height))
+controlnet_img = Image.fromarray(controlnet_img)
+images = pipe(
+    prompt,
+    negative_prompt=negative_prompt,
+    image=controlnet_img,
+    controlnet_conditioning_scale=controlnet_conditioning_scale,
+    width=new_width,
+    height=new_height,
+    num_inference_steps=30,
+    ).images
+# PNG usually preserves image quality better than JPG or WebP, at the cost of a much larger file.
+images[0].save("your image save path")
+```
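The resize step in the snippet above scales the control image to roughly 1024 × 1024 pixels of area while keeping its aspect ratio. A minimal sketch of that computation as a standalone helper follows. The function name `target_size` and the rounding of each side to a multiple of 8 are assumptions added here (the original snippet simply truncates with `int`), on the basis that latent-diffusion pipelines generally work with dimensions divisible by 8.

```python
import math


def target_size(width, height, target_area=1024 * 1024, multiple=8):
    """Scale (width, height) to about `target_area` pixels, keeping the aspect ratio.

    Hypothetical helper: rounding each side to a multiple of `multiple`
    is an assumption, not part of the original snippet.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Same ratio as in the snippet: sqrt(target area / source area).
    ratio = math.sqrt(target_area / (width * height))
    new_width = max(multiple, round(width * ratio / multiple) * multiple)
    new_height = max(multiple, round(height * ratio / multiple) * multiple)
    return new_width, new_height


if __name__ == "__main__":
    print(target_size(1920, 1080))
    print(target_size(512, 512))
```

The returned pair could be passed to `cv2.resize` and to the `width` and `height` arguments of the pipeline call in place of `new_width` and `new_height`.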
+## Evaluation Data
+We select 2,000 images with ground-truth pose annotations from HumanArt [https://github.com/IDEA-Research/HumanArt], generate images from those poses, and calculate mAP.
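The card does not say how mAP is computed. Pose mAP is commonly based on Object Keypoint Similarity (OKS), as in COCO keypoint evaluation, so the sketch below shows OKS for a single person under that assumption. The function name `oks`, its arguments, and any values passed to it are illustrative and are not taken from the authors' evaluation code.

```python
import math


def oks(pred, gt, visible, area, sigmas):
    """Object Keypoint Similarity between one predicted and one ground-truth pose.

    pred, gt: lists of (x, y) keypoints in pixels.
    visible:  list of bools, True where the ground-truth keypoint is labeled.
    area:     ground-truth object area in pixels (the scale term s^2).
    sigmas:   per-keypoint falloff constants; COCO uses k_i = 2 * sigma_i.
    """
    total, count = 0.0, 0
    for (px, py), (gx, gy), vis, sigma in zip(pred, gt, visible, sigmas):
        if not vis:
            continue  # unlabeled keypoints do not contribute
        d2 = (px - gx) ** 2 + (py - gy) ** 2
        k = 2.0 * sigma
        total += math.exp(-d2 / (2.0 * area * k * k))
        count += 1
    return total / count if count else 0.0
```

Under this protocol a detected pose counts as a match when its OKS exceeds a threshold, and average precision is then aggregated over thresholds; the thresholds behind the reported numbers are not stated in the card.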
+## Quantitative Result
+| metric | xinsir/controlnet-openpose-sdxl-1.0 |  lllyasviel/control_v11p_sd15_openpose | thibaud/controlnet-openpose-sdxl-1.0 |
+|-------|-------|-------|-------|
+| mAP | **0.357** | 0.326 | 0.209 |
+
+Among the open-source openpose models compared above, this model achieves the highest mAP.