---
language:
- en
library_name: diffusers
license: other
license_name: flux-1-dev-non-commercial-license
license_link: LICENSE.md
---

LoRA is the de-facto technique for quickly adapting a large pre-trained model to custom use cases. Typically, LoRA matrices are low-rank in nature. The word "low" can vary depending on the context, but for a large diffusion model like [Flux](https://huggingface.co/black-forest-labs/FLUX.1-dev), a rank of 128 can be considered high. This is because users often need to keep multiple LoRAs unfused in memory to be able to switch between them quickly. So, the higher the rank, the higher the memory overhead on top of the base model itself.

So, what if we could take an existing LoRA checkpoint with a high rank and reduce its rank even further to:

- Reduce the memory requirements
- Enable use cases like `torch.compile()` (which requires all the LoRAs to have the same rank to avoid re-compilation)

## Random projections

Basic idea:

1. Generate a random projection matrix: `R = torch.randn(new_rank, original_rank, dtype=torch.float32) / torch.sqrt(torch.tensor(new_rank, dtype=torch.float32))`.
2. Then compute the new LoRA up and down matrices:

```python
# We keep R in torch.float32 for numerical stability.
lora_A_new = (R @ lora_A.to(R.dtype)).to(lora_A.dtype)
lora_B_new = (lora_B.to(R.dtype) @ R.T).to(lora_B.dtype)
```

If `lora_A` and `lora_B` have shapes (42, 3072) and (3072, 42), respectively, `lora_A_new` and `lora_B_new` will have shapes (4, 3072) and (3072, 4).
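
Putting the two steps together, here is a minimal self-contained sketch. The toy tensors stand in for real checkpoint weights and reuse the shapes from the example above; the `1/sqrt(new_rank)` scaling keeps `E[R.T @ R] = I`, so the expected magnitude of `lora_B_new @ lora_A_new` matches that of the original product:

```python
import torch

# Toy tensors standing in for real checkpoint weights, with shapes from the
# (42, 3072) / (3072, 42) example above.
original_rank, new_rank, features = 42, 4, 3072
lora_A = torch.randn(original_rank, features, dtype=torch.bfloat16)  # down matrix
lora_B = torch.randn(features, original_rank, dtype=torch.bfloat16)  # up matrix

# Scaling by 1/sqrt(new_rank) gives E[R.T @ R] = I, so the expected magnitude
# of lora_B_new @ lora_A_new matches that of lora_B @ lora_A.
R = torch.randn(new_rank, original_rank, dtype=torch.float32) / torch.sqrt(
    torch.tensor(new_rank, dtype=torch.float32)
)

lora_A_new = (R @ lora_A.to(R.dtype)).to(lora_A.dtype)
lora_B_new = (lora_B.to(R.dtype) @ R.T).to(lora_B.dtype)
assert lora_A_new.shape == (new_rank, features)
assert lora_B_new.shape == (features, new_rank)
```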

### Results

We tried this on the [https://huggingface.co/glif/how2draw](https://huggingface.co/glif/how2draw) LoRA. Unless explicitly specified, a rank of 4 was used for all experiments. Here's a side-by-side comparison of the original and the reduced LoRAs (on the same seed).

```python
from diffusers import DiffusionPipeline
import torch

pipe = DiffusionPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16).to("cuda")
# Change accordingly.
lora_id = "How2Draw-V2_000002800_svd.safetensors"
pipe.load_lora_weights(lora_id)

prompts = [
    "Yorkshire Terrier with smile, How2Draw",
    "a dolphin, How2Draw",
    "an owl, How3Draw",
    "A silhouette of a girl performing a ballet pose, with elegant lines to suggest grace and movement. The background can include simple outlines of ballet shoes and a music note. The image should convey elegance and poise in a minimalistic style, How2Draw",
]
images = pipe(
    prompts, num_inference_steps=50, max_sequence_length=512, guidance_scale=3.5, generator=torch.manual_seed(0)
).images
```

![Yorkshire Terrier with smile, How2Draw](Make%20a%20high-rank%20LoRA%20low-rank%2010c1384ebcac80ca895dcc006a297900/image.png)

Yorkshire Terrier with smile, How2Draw

![a dolphin, How2Draw](Make%20a%20high-rank%20LoRA%20low-rank%2010c1384ebcac80ca895dcc006a297900/image%201.png)

a dolphin, How2Draw

![an owl, How3Draw](Make%20a%20high-rank%20LoRA%20low-rank%2010c1384ebcac80ca895dcc006a297900/image%202.png)

an owl, How3Draw

![A silhouette of a girl performing a ballet pose, with elegant lines to suggest grace and movement. The background can include simple outlines of ballet shoes and a music note. The image should convey elegance and poise in a minimalistic style, How2Draw](Make%20a%20high-rank%20LoRA%20low-rank%2010c1384ebcac80ca895dcc006a297900/image%203.png)

A silhouette of a girl performing a ballet pose, with elegant lines to suggest grace and movement. The background can include simple outlines of ballet shoes and a music note. The image should convey elegance and poise in a minimalistic style, How2Draw

Code: [https://gist.github.com/sayakpaul/9bae12402eddd53a79ee1f64b659b07b#file-low_rank_lora-py](https://gist.github.com/sayakpaul/9bae12402eddd53a79ee1f64b659b07b#file-low_rank_lora-py)

### Notes

* One should experiment with the `new_rank` parameter to obtain the desired trade-off between performance and memory. With a `new_rank` of 4, we reduce the size of the LoRA from 451MB to 42MB.
* There is a `use_sparse` option in the script above for using sparse random projection matrices (a sketch of what such a matrix could look like follows below).
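
As an illustrative, hypothetical sketch (not necessarily the scheme the script uses), an Achlioptas-style sparse projection draws entries from {-1, 0, +1} so that roughly two thirds of `R` is zeros while still keeping `E[R.T @ R] = I`:

```python
import torch

# Hypothetical Achlioptas-style sparse projection (the actual `use_sparse`
# implementation may differ): entries are +/- sqrt(3 / new_rank) with
# probability 1/6 each, and 0 with probability 2/3.
def sparse_projection(new_rank: int, original_rank: int) -> torch.Tensor:
    probs = torch.tensor([1 / 6, 2 / 3, 1 / 6])
    values = torch.tensor([-1.0, 0.0, 1.0]) * (3.0 / new_rank) ** 0.5
    idx = torch.multinomial(probs, new_rank * original_rank, replacement=True)
    return values[idx].reshape(new_rank, original_rank)

R = sparse_projection(new_rank=4, original_rank=42)  # ~2/3 of entries are zero
```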

## SVD
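
The linked script below has the full implementation; as a rough sketch of the idea, one can form the composite update `lora_B @ lora_A`, take its truncated SVD, and keep only the top `new_rank` singular values. The shapes below just mirror the earlier (42, 3072)/(3072, 42) example:

```python
import torch

# Toy inputs mirroring the earlier example; in practice these come from the checkpoint.
original_rank, new_rank, features = 42, 4, 3072
lora_A = torch.randn(original_rank, features)
lora_B = torch.randn(features, original_rank)

# Composite low-rank update and its truncated SVD.
delta_W = lora_B @ lora_A  # (3072, 3072)
U, S, Vh = torch.linalg.svd(delta_W, full_matrices=False)
U, S, Vh = U[:, :new_rank], S[:new_rank], Vh[:new_rank, :]

# Split the kept singular values evenly across the two new factors.
sqrt_S = torch.diag(S.sqrt())
lora_B_new = U @ sqrt_S   # (3072, new_rank)
lora_A_new = sqrt_S @ Vh  # (new_rank, 3072)
```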

### Results

![image.png](Make%20a%20high-rank%20LoRA%20low-rank%2010c1384ebcac80ca895dcc006a297900/image%204.png)

![image.png](Make%20a%20high-rank%20LoRA%20low-rank%2010c1384ebcac80ca895dcc006a297900/image%205.png)

![image.png](Make%20a%20high-rank%20LoRA%20low-rank%2010c1384ebcac80ca895dcc006a297900/image%206.png)

![image.png](Make%20a%20high-rank%20LoRA%20low-rank%2010c1384ebcac80ca895dcc006a297900/image%207.png)

### Randomized SVD

Full SVD can be time-consuming. Truncated SVD is useful for very large sparse matrices. We can use randomized SVD for a none-to-negligible loss in quality at a significantly faster speed.
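
A minimal sketch using `torch.svd_lowrank`, one way to do randomized SVD in PyTorch (the linked gist below has the actual script). The toy `delta_W` stands in for the composite `lora_B @ lora_A`:

```python
import torch

# Rank-42 stand-in for the composite update lora_B @ lora_A.
delta_W = torch.randn(3072, 42) @ torch.randn(42, 3072)

# Randomized SVD: q is the target rank and niter the number of subspace
# (power) iterations; see "Tune the knobs in SVD" below.
new_rank, niter = 4, 5
U, S, V = torch.svd_lowrank(delta_W, q=new_rank, niter=niter)

# Note: svd_lowrank returns V, not V^H, hence the transpose.
sqrt_S = torch.diag(S.sqrt())
lora_B_new = U @ sqrt_S     # (3072, new_rank)
lora_A_new = sqrt_S @ V.T   # (new_rank, 3072)
```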

![image.png](Make%20a%20high-rank%20LoRA%20low-rank%2010c1384ebcac80ca895dcc006a297900/image%208.png)

![image.png](Make%20a%20high-rank%20LoRA%20low-rank%2010c1384ebcac80ca895dcc006a297900/image%209.png)

![image.png](Make%20a%20high-rank%20LoRA%20low-rank%2010c1384ebcac80ca895dcc006a297900/image%2010.png)

![image.png](Make%20a%20high-rank%20LoRA%20low-rank%2010c1384ebcac80ca895dcc006a297900/image%2011.png)

Code: [https://gist.github.com/sayakpaul/9bae12402eddd53a79ee1f64b659b07b#file-svd_low_rank_lora-py](https://gist.github.com/sayakpaul/9bae12402eddd53a79ee1f64b659b07b#file-svd_low_rank_lora-py)

### Tune the knobs in SVD

- `new_rank` as always
- `niter` when using randomized SVD