Update README.md
README.md
@@ -26,10 +26,76 @@ An experimental version of IP-Adapter-FaceID: we use face ID embedding from a fa
![results](./ip-adapter-faceid.jpg)

## Usage

First, use [insightface](https://github.com/deepinsight/insightface) to extract the face ID embedding:

```python
import cv2
import torch  # needed for torch.from_numpy below
from insightface.app import FaceAnalysis

# Load the buffalo_l face analysis models, preferring CUDA and falling back to CPU
app = FaceAnalysis(name="buffalo_l", providers=['CUDAExecutionProvider', 'CPUExecutionProvider'])
app.prepare(ctx_id=0, det_size=(640, 640))

# cv2.imread returns the image as a BGR array
image = cv2.imread("person.jpg")
faces = app.get(image)

# Take the normalized embedding of the first detected face and add a batch dimension
faceid_embeds = torch.from_numpy(faces[0].normed_embedding).unsqueeze(0)
```
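
The snippet above takes `faces[0]`, which is simply the first detection. If the photo contains several faces, one option is to pick the largest one instead. The helper below is a minimal sketch of that idea, not part of the original README: `largest_face_index` is a hypothetical name, and it assumes each detection provides a bounding box in `(x1, y1, x2, y2)` order.

```python
from typing import Sequence


def largest_face_index(bboxes: Sequence[Sequence[float]]) -> int:
    """Return the index of the box with the largest area.

    Hypothetical helper: boxes are assumed to be (x1, y1, x2, y2).
    """
    if len(bboxes) == 0:
        raise ValueError("no faces detected")
    # Clamp negative extents to zero so malformed boxes never win
    areas = [max(0.0, x2 - x1) * max(0.0, y2 - y1) for x1, y1, x2, y2 in bboxes]
    return max(range(len(areas)), key=areas.__getitem__)


# Made-up boxes for illustration only
boxes = [(10, 10, 50, 60), (100, 80, 300, 340), (5, 5, 20, 20)]
best = largest_face_index(boxes)
```

With insightface, this could be used as `faces[largest_face_index([f.bbox for f in faces])].normed_embedding`, assuming the detections expose a `bbox` attribute in that order.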

Then, you can generate images conditioned on the face embeddings:

```python
import torch
from diffusers import StableDiffusionPipeline, DDIMScheduler, AutoencoderKL
from PIL import Image

from ip_adapter.ip_adapter_faceid import IPAdapterFaceID

base_model_path = "SG161222/Realistic_Vision_V4.0_noVAE"
vae_model_path = "stabilityai/sd-vae-ft-mse"
ip_ckpt = "ip-adapter-faceid_sd15.bin"
device = "cuda"

noise_scheduler = DDIMScheduler(
    num_train_timesteps=1000,
    beta_start=0.00085,
    beta_end=0.012,
    beta_schedule="scaled_linear",
    clip_sample=False,
    set_alpha_to_one=False,
    steps_offset=1,
)
vae = AutoencoderKL.from_pretrained(vae_model_path).to(dtype=torch.float16)
pipe = StableDiffusionPipeline.from_pretrained(
    base_model_path,
    torch_dtype=torch.float16,
    scheduler=noise_scheduler,
    vae=vae,
    feature_extractor=None,
    safety_checker=None
)

# load ip-adapter
ip_model = IPAdapterFaceID(pipe, ip_ckpt, device)

# generate image
prompt = "photo of a woman in red dress in a garden"
negative_prompt = "monochrome, lowres, bad anatomy, worst quality, low quality, blurry"

images = ip_model.generate(
    prompt=prompt,
    negative_prompt=negative_prompt,
    faceid_embeds=faceid_embeds,
    num_samples=4,
    width=512,
    height=768,
    num_inference_steps=30,
    seed=2023,
)
```
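
To compare the four samples side by side, they can be pasted into a single contact sheet. The sketch below is illustrative rather than part of the original README: `image_grid` is a hypothetical helper, it assumes `generate` returns a list of equally sized PIL images, and it uses solid-colour stand-ins so that it runs without the model.

```python
from PIL import Image


def image_grid(images, rows, cols):
    """Paste equally sized PIL images into a rows x cols contact sheet."""
    if len(images) != rows * cols:
        raise ValueError("expected exactly rows * cols images")
    width, height = images[0].size
    grid = Image.new("RGB", (cols * width, rows * height))
    for i, img in enumerate(images):
        # Fill the grid row by row, left to right
        grid.paste(img, box=((i % cols) * width, (i // cols) * height))
    return grid


# Stand-ins for four 512x768 samples (solid colours, not model output)
samples = [Image.new("RGB", (512, 768), c) for c in ("red", "green", "blue", "white")]
grid = image_grid(samples, rows=1, cols=4)
```

With real output, passing `images` in place of `samples` and calling `grid.save("faceid_samples.png")` would write the sheet to disk; the file name here is only an example.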

## Limitations and Bias
- The model does not achieve perfect photorealism and ID consistency.
- The model's generalization is limited by the training data, the base model, and the face recognition model.