Update README.md
README.md
CHANGED
@@ -8,23 +8,26 @@ license: creativeml-openrail-m
|
|
8 |
|
9 |
This is a low-quality bocchi-the-rock (ぼっち・ざ・ろっく!) character model.
|
10 |
Similar to my [yama-no-susume model](https://huggingface.co/alea31415/yama-no-susume), this model is capable of generating **multi-character scenes** beyond images of a single character.
|
11 |
-
Of course, the result is still hit-or-miss, but I
|
12 |
and otherwise, you can always rely on inpainting.
|
13 |
Here are two examples:
|
14 |
|
15 |
With inpainting
|
16 |
-
|
17 |
|
18 |
Without inpainting
|
19 |
-
|
20 |
|
21 |
|
22 |
### Characters
|
23 |
|
24 |
The model knows 12 characters from Bocchi the Rock.
|
25 |
-
The resemblance to a character can be improved by describing their appearance in more detail.
|
26 |
|
27 |
-
*Coming soon*
|
28 |
|
29 |
### Dataset description
|
30 |
|
@@ -51,7 +54,7 @@ The model is trained on runpod using 3090 and cost me around 15 dollars.
|
|
51 |
|
52 |
#### Hyperparameter specification
|
53 |
|
54 |
-
|
55 |
|
56 |
Note that, because the weighting scheme translates into a different repeat multiplier for each image,
|
57 |
the repeat and epoch counts have a rather different meaning here.
|
@@ -61,16 +64,33 @@ and therefore I did not even finish an entire epoch with the 48000 steps at batc
|
|
61 |
### Failures
|
62 |
|
63 |
- For the first 24000 steps I used the trigger words `Bfan1` and `Bfan2` for the two fans of Bocchi.
|
64 |
-
However, these two words are too similar, and the model fails to distinguish the two characters.
|
65 |
|
66 |
|
67 |
### More Example Generations
|
68 |
|
69 |
With inpainting
|
70 |
-
|
71 |
|
72 |
Without inpainting
|
73 |
-
|
74 |
|
75 |
Some failure cases
|
76 |
-
|
|
8 |
|
9 |
This is a low-quality bocchi-the-rock (ぼっち・ざ・ろっく!) character model.
|
10 |
Similar to my [yama-no-susume model](https://huggingface.co/alea31415/yama-no-susume), this model is capable of generating **multi-character scenes** beyond images of a single character.
|
11 |
+
Of course, the result is still hit-or-miss, but with some luck you can get the entire Kessoku Band right in one shot,
|
12 |
and otherwise, you can always rely on inpainting.
|
13 |
Here are two examples:
|
14 |
|
15 |
With inpainting
|
16 |
+
![4265343062-1047638199](https://huggingface.co/alea31415/bocchi-the-rock-character/resolve/main/examples/with_inpaint/4265343062-1047638199.png)
|
17 |
|
18 |
Without inpainting
|
19 |
+
![4265343086-2648280139](https://huggingface.co/alea31415/bocchi-the-rock-character/resolve/main/examples/without_inpaint/4265343086-2648280139.png)
|
20 |
|
21 |
|
22 |
### Characters
|
23 |
|
24 |
The model knows 12 characters from Bocchi the Rock.
|
25 |
+
The resemblance to a character can be improved by describing their appearance in more detail (for example, by adding `long wavy hair` to `ShimizuEliza`).
|
26 |
+
|
27 |
+
![xy_grid-0028-24](https://huggingface.co/alea31415/bocchi-the-rock-character/resolve/main/examples/grids/xy_grid-0028-24.jpg)
|
28 |
+
![xy_grid-0029-24](https://huggingface.co/alea31415/bocchi-the-rock-character/resolve/main/examples/grids/xy_grid-0029-24.jpg)
|
29 |
+
![xy_grid-0030-24](https://huggingface.co/alea31415/bocchi-the-rock-character/resolve/main/examples/grids/xy_grid-0030-24.jpg)
|
30 |
|
|
|
31 |
|
32 |
### Dataset description
|
33 |
|
|
|
54 |
|
55 |
#### Hyperparameter specification
|
56 |
|
57 |
+
The model was trained for 48000 steps at batch size 4, learning rate 1e-6, and resolution 512, with a conditional dropping rate of 10%.
|
58 |
|
59 |
Note that, because the weighting scheme translates into a different repeat multiplier for each image,
|
60 |
the repeat and epoch counts have a rather different meaning here.
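To make the note above concrete, here is a minimal sketch of how per-image multipliers change what one "epoch" means. The file names and multiplier values are hypothetical placeholders, since the actual weights are not listed in this README; only the step count (48000) and batch size (4) come from the text.

```python
# Hypothetical per-image multipliers; the real values are not given in this README.
multipliers = {"img_a.png": 1, "img_b.png": 4, "img_c.png": 10}


def effective_epoch_size(multipliers):
    """Samples in one pass when each image is repeated by its multiplier."""
    return sum(multipliers.values())


def epochs_completed(steps, batch_size, epoch_size):
    """Fraction of a weighted epoch covered by the given number of steps."""
    return steps * batch_size / epoch_size


# Figures from the hyperparameter specification: 48000 steps at batch size 4.
samples_seen = 48000 * 4
print("samples seen during training:", samples_seen)
print("toy epoch size:", effective_epoch_size(multipliers))
```

With large multipliers, the weighted epoch size can exceed the total number of samples seen, in which case `epochs_completed` returns a value below 1.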
|
|
|
64 |
### Failures
|
65 |
|
66 |
- For the first 24000 steps I used the trigger words `Bfan1` and `Bfan2` for the two fans of Bocchi.
|
67 |
+
However, these two words are too similar, and the model fails to distinguish the two characters.
|
68 |
+
Therefore I changed `Bfan2` to `Bofa2` at step 24000, which seemed to solve the problem.
|
69 |
+
- Character blending is always an issue.
|
70 |
+
- When prompting the four characters of Kessoku Band we often get side shots.
|
71 |
+
I think this is because of some overfitting to a particular image.
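As a rough illustration of the first failure above, the sketch below compares trigger words at the character level using Python's standard `difflib`. This is only a proxy and an assumption on my part: what actually matters is how the text encoder tokenizes the words, which this check does not model. The `similarity` helper is a name introduced here for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; a crude proxy, not a tokenizer check."""
    return SequenceMatcher(None, a, b).ratio()


# Original pair of trigger words versus the pair used after step 24000.
old_pair = similarity("Bfan1", "Bfan2")
new_pair = similarity("Bfan1", "Bofa2")

print(f"Bfan1 vs Bfan2: {old_pair:.2f}")
print(f"Bfan1 vs Bofa2: {new_pair:.2f}")
```

A check like this could be run over all trigger words before training to flag pairs that differ by only one character.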
|
72 |
|
73 |
|
74 |
### More Example Generations
|
75 |
|
76 |
With inpainting
|
77 |
+
![4265343068-2420755431](https://huggingface.co/alea31415/bocchi-the-rock-character/resolve/main/examples/with_inpaint/4265343068-2420755431.png)
|
78 |
+
![4265343066-3979275255](https://huggingface.co/alea31415/bocchi-the-rock-character/resolve/main/examples/with_inpaint/4265343066-3979275255.png)
|
79 |
+
![4265343022-3534836762](https://huggingface.co/alea31415/bocchi-the-rock-character/resolve/main/examples/with_inpaint/4265343022-3534836762.png)
|
80 |
+
|
81 |
|
82 |
Without inpainting
|
83 |
+
![4265343092-803155289](https://huggingface.co/alea31415/bocchi-the-rock-character/resolve/main/examples/without_inpaint/4265343092-803155289.png)
|
84 |
+
![4265343053-918713189](https://huggingface.co/alea31415/bocchi-the-rock-character/resolve/main/examples/without_inpaint/4265343053-918713189.png)
|
85 |
+
![4265343054-2839948768](https://huggingface.co/alea31415/bocchi-the-rock-character/resolve/main/examples/without_inpaint/4265343054-2839948768.png)
|
86 |
+
![4265343096-399054050](https://huggingface.co/alea31415/bocchi-the-rock-character/resolve/main/examples/without_inpaint/4265343096-399054050.png)
|
87 |
+
![4265343100-3858388158](https://huggingface.co/alea31415/bocchi-the-rock-character/resolve/main/examples/without_inpaint/4265343100-3858388158.png)
|
88 |
+
![4265343016-2842516738](https://huggingface.co/alea31415/bocchi-the-rock-character/resolve/main/examples/without_inpaint/4265343016-2842516738.png)
|
89 |
+
![4265343084-3548261345](https://huggingface.co/alea31415/bocchi-the-rock-character/resolve/main/examples/without_inpaint/4265343084-3548261345.png)
|
90 |
+
![4265343083-1372779456](https://huggingface.co/alea31415/bocchi-the-rock-character/resolve/main/examples/without_inpaint/4265343083-1372779456.png)
|
91 |
|
92 |
Some failure cases
|
93 |
+
![4265343089-2940163958](https://huggingface.co/alea31415/bocchi-the-rock-character/resolve/main/examples/failure/4265343089-2940163958.png)
|
94 |
+
![4265343091-129639375](https://huggingface.co/alea31415/bocchi-the-rock-character/resolve/main/examples/failure/4265343091-129639375.png)
|
95 |
+
![4265343048-2869643584](https://huggingface.co/alea31415/bocchi-the-rock-character/resolve/main/examples/failure/4265343048-2869643584.png)
|
96 |
+
![4265343039-1470057774](https://huggingface.co/alea31415/bocchi-the-rock-character/resolve/main/examples/failure/4265343039-1470057774.png)
|