h94 committed on
Commit
d8f6d6f
1 Parent(s): 08042ae

Update README.md

  1. README.md +67 -1
README.md CHANGED
@@ -26,10 +26,76 @@ An experimental version of IP-Adapter-FaceID: we use face ID embedding from a fa
  ![results](./ip-adapter-faceid.jpg)

  ## Limitations and Bias
  - The model does not achieve perfect photorealism and ID consistency.
- - The generalization of the model is limited due to limitations of the training model, generative model and face recognition model.

  ![results](./ip-adapter-faceid.jpg)

+ ## Usage
+
+ First, use [insightface](https://github.com/deepinsight/insightface) to extract the face ID embedding:
+
+ ```python
+
+ import cv2
+ import torch
+ from insightface.app import FaceAnalysis
+
+ # Load the buffalo_l model pack; try CUDA first, then fall back to CPU.
+ app = FaceAnalysis(name="buffalo_l", providers=['CUDAExecutionProvider', 'CPUExecutionProvider'])
+ app.prepare(ctx_id=0, det_size=(640, 640))
+
+ image = cv2.imread("person.jpg")
+ faces = app.get(image)
+
+ # Take the first detected face and add a batch dimension.
+ faceid_embeds = torch.from_numpy(faces[0].normed_embedding).unsqueeze(0)
+ ```
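
The snippet above indexes `faces[0]`, which assumes at least one face was detected and that the first one is the face you want. A minimal sketch of a more defensive selection follows. It assumes each detected face exposes a `bbox` of the form `[x1, y1, x2, y2]`, as insightface face objects do. The helper name `pick_largest_face` is hypothetical and is not part of insightface or IP-Adapter.

```python
def pick_largest_face(faces):
    """Return the detected face with the largest bounding-box area.

    Hypothetical helper: `faces` is the list returned by `app.get(image)`.
    Raises ValueError when the list is empty (no face detected).
    """
    if not faces:
        raise ValueError("no face detected in the input image")

    def area(face):
        # bbox is assumed to be [x1, y1, x2, y2] in pixel coordinates.
        x1, y1, x2, y2 = face.bbox[:4]
        return (x2 - x1) * (y2 - y1)

    return max(faces, key=area)
```

With this helper, `faces[0]` could become `pick_largest_face(faces)`. Whether the largest face is the right choice depends on the input photos, so treat it as one possible policy rather than a recommendation from the README.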
+
+ Then generate images conditioned on the face ID embedding:
+
+ ```python
+
+ import torch
+ from diffusers import StableDiffusionPipeline, DDIMScheduler, AutoencoderKL
+ from PIL import Image
+
+ from ip_adapter.ip_adapter_faceid import IPAdapterFaceID
+
+ base_model_path = "SG161222/Realistic_Vision_V4.0_noVAE"
+ vae_model_path = "stabilityai/sd-vae-ft-mse"
+ ip_ckpt = "ip-adapter-faceid_sd15.bin"
+ device = "cuda"
+
+ noise_scheduler = DDIMScheduler(
+     num_train_timesteps=1000,
+     beta_start=0.00085,
+     beta_end=0.012,
+     beta_schedule="scaled_linear",
+     clip_sample=False,
+     set_alpha_to_one=False,
+     steps_offset=1,
+ )
+ vae = AutoencoderKL.from_pretrained(vae_model_path).to(dtype=torch.float16)
+ pipe = StableDiffusionPipeline.from_pretrained(
+     base_model_path,
+     torch_dtype=torch.float16,
+     scheduler=noise_scheduler,
+     vae=vae,
+     feature_extractor=None,
+     safety_checker=None
+ )
+
+ # load ip-adapter
+ ip_model = IPAdapterFaceID(pipe, ip_ckpt, device)
+
+ # generate image (faceid_embeds comes from the insightface snippet above)
+ prompt = "photo of a woman in red dress in a garden"
+ negative_prompt = "monochrome, lowres, bad anatomy, worst quality, low quality, blurry"
+
+ images = ip_model.generate(
+     prompt=prompt,
+     negative_prompt=negative_prompt,
+     faceid_embeds=faceid_embeds,
+     num_samples=4,
+     width=512,
+     height=768,
+     num_inference_steps=30,
+     seed=2023,
+ )
+
+ ```
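
For readers unfamiliar with the scheduler arguments above: in diffusers, `beta_schedule="scaled_linear"` spaces the betas linearly in square-root space between `beta_start` and `beta_end`, then squares them. The plain-Python sketch below reproduces that schedule with the values from the snippet. The function names `scaled_linear_betas` and `alphas_cumprod` are illustrative only; this is meant for intuition and is not a replacement for `DDIMScheduler`.

```python
import math


def scaled_linear_betas(beta_start=0.00085, beta_end=0.012, num_train_timesteps=1000):
    """Betas spaced linearly in sqrt space, then squared (illustrative sketch)."""
    lo, hi = math.sqrt(beta_start), math.sqrt(beta_end)
    step = (hi - lo) / (num_train_timesteps - 1)
    return [(lo + i * step) ** 2 for i in range(num_train_timesteps)]


def alphas_cumprod(betas):
    """Cumulative product of (1 - beta_t) over the training timesteps."""
    out, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        out.append(running)
    return out


betas = scaled_linear_betas()
signal = alphas_cumprod(betas)
print(f"first beta: {betas[0]:.5f}, last beta: {betas[-1]:.5f}")
print(f"cumulative alpha at the final training step: {signal[-1]:.4f}")
```

The cumulative product shrinks toward zero as the timestep grows, which is why late timesteps correspond to almost pure noise.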
+

  ## Limitations and Bias
  - The model does not achieve perfect photorealism and ID consistency.
+ - The model's generalization is limited by the training data, the base model, and the face recognition model.