alea31415 committed on
Commit
964a2f0
1 Parent(s): 655efe9

Update README.md

  1. README.md +30 -10
README.md CHANGED
@@ -8,23 +8,26 @@ license: creativeml-openrail-m

  This is a low-quality bocchi-the-rock (ぼっち・ざ・ろっく!) character model.
  Similar to my [yama-no-susume model](https://huggingface.co/alea31415/yama-no-susume), this model is capable of generating **multi-character scenes** beyond images of a single character.
- Of course, the result is still hit-or-miss, but I think the success rate of getting the **entire Kessoku Band** right in one shot is already quite high,
  and otherwise, you can always rely on inpainting.
  Here are two examples:

  With inpainting
- *Coming soon*

  Without inpainting
- *Coming soon*


  ### Characters

  The model knows 12 characters from Bocchi the Rock.
- The resemblance to a character can be improved by a better description of their appearance.

- *Coming soon*

  ### Dataset description

@@ -51,7 +54,7 @@ The model is trained on runpod using 3090 and cost me around 15 dollars.

  #### Hyperparameter specification

- - The model is trained for 48000 steps, at batch size 4, lr 1e-6, resolution 512, and conditional dropping rate of 10%.

  Note that as a consequence of the weighting scheme, which translates into a different multiplier for each image,
  the repeat and epoch counts have a rather different meaning here.
@@ -61,16 +64,33 @@ and therefore I did not even finish an entire epoch with the 48000 steps at batc
  ### Failures

  - For the first 24000 steps I used the trigger words `Bfan1` and `Bfan2` for the two fans of Bocchi.
- However, these two words are too similar and the model fails to generate different characters for them. Therefore I changed `Bfan2` to `Bofa2` at step 24000.


  ### More Example Generations

  With inpainting
- *Coming soon*

  Without inpainting
- *Coming soon*

  Some failure cases
- *Coming soon*

  This is a low-quality bocchi-the-rock (ぼっち・ざ・ろっく!) character model.
  Similar to my [yama-no-susume model](https://huggingface.co/alea31415/yama-no-susume), this model is capable of generating **multi-character scenes** beyond images of a single character.
+ Of course, the result is still hit-or-miss, but with some chance you can get the entire Kessoku Band right in one shot,
  and otherwise, you can always rely on inpainting.
  Here are two examples:

  With inpainting
+ ![4265343062-1047638199](https://huggingface.co/alea31415/bocchi-the-rock-character/resolve/main/examples/with_inpaint/4265343062-1047638199.png)

  Without inpainting
+ ![4265343086-2648280139](https://huggingface.co/alea31415/bocchi-the-rock-character/resolve/main/examples/without_inpaint/4265343086-2648280139.png)


  ### Characters

  The model knows 12 characters from Bocchi the Rock.
+ The resemblance to a character can be improved by a better description of their appearance (for example, by adding `long wavy hair` to `ShimizuEliza`).
+
+ ![xy_grid-0028-24](https://huggingface.co/alea31415/bocchi-the-rock-character/resolve/main/examples/grids/xy_grid-0028-24.jpg)
+ ![xy_grid-0029-24](https://huggingface.co/alea31415/bocchi-the-rock-character/resolve/main/examples/grids/xy_grid-0029-24.jpg)
+ ![xy_grid-0030-24](https://huggingface.co/alea31415/bocchi-the-rock-character/resolve/main/examples/grids/xy_grid-0030-24.jpg)


  ### Dataset description

 

  #### Hyperparameter specification

+ The model is trained for 48000 steps, at batch size 4, lr 1e-6, resolution 512, and conditional dropping rate of 10%.

  Note that as a consequence of the weighting scheme, which translates into a different multiplier for each image,
  the repeat and epoch counts have a rather different meaning here.
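
As a rough sanity check on the hyperparameters above, the sketch below works out how many training samples the run processed and roughly how many of them had their caption dropped. It only restates the stated numbers as arithmetic. The variable names are made up for illustration, and it assumes the 10% dropping rate is applied independently to each sample, which the text does not confirm.

```python
# Hyperparameters as stated above (names are illustrative, not from the training script).
steps = 48_000
batch_size = 4
caption_drop_rate = 0.10  # assumption: applied independently to each sample

# Total samples processed, counting repeats (not the number of distinct images).
samples_seen = steps * batch_size

# Expected number of samples trained without their caption.
expected_unconditional = round(samples_seen * caption_drop_rate)

print(samples_seen)            # 192000
print(expected_unconditional)  # 19200
```

Because the weighting scheme repeats each image a different number of times, this sample count does not translate directly into a number of passes over the dataset.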
 
  ### Failures

  - For the first 24000 steps I used the trigger words `Bfan1` and `Bfan2` for the two fans of Bocchi.
+ However, these two words are too similar and the model fails to generate different characters for them.
+ Therefore I changed `Bfan2` to `Bofa2` at step 24000. This seemed to solve the problem.
+ - Character blending is always an issue.
+ - When prompting the four characters of Kessoku Band we often get side shots.
+ I think this is because of some overfitting to a particular image.


  ### More Example Generations

  With inpainting
+ ![4265343068-2420755431](https://huggingface.co/alea31415/bocchi-the-rock-character/resolve/main/examples/with_inpaint/4265343068-2420755431.png)
+ ![4265343066-3979275255](https://huggingface.co/alea31415/bocchi-the-rock-character/resolve/main/examples/with_inpaint/4265343066-3979275255.png)
+ ![4265343022-3534836762](https://huggingface.co/alea31415/bocchi-the-rock-character/resolve/main/examples/with_inpaint/4265343022-3534836762.png)
+

  Without inpainting
+ ![4265343092-803155289](https://huggingface.co/alea31415/bocchi-the-rock-character/resolve/main/examples/without_inpaint/4265343092-803155289.png)
+ ![4265343053-918713189](https://huggingface.co/alea31415/bocchi-the-rock-character/resolve/main/examples/without_inpaint/4265343053-918713189.png)
+ ![4265343054-2839948768](https://huggingface.co/alea31415/bocchi-the-rock-character/resolve/main/examples/without_inpaint/4265343054-2839948768.png)
+ ![4265343096-399054050](https://huggingface.co/alea31415/bocchi-the-rock-character/resolve/main/examples/without_inpaint/4265343096-399054050.png)
+ ![4265343100-3858388158](https://huggingface.co/alea31415/bocchi-the-rock-character/resolve/main/examples/without_inpaint/4265343100-3858388158.png)
+ ![4265343016-2842516738](https://huggingface.co/alea31415/bocchi-the-rock-character/resolve/main/examples/without_inpaint/4265343016-2842516738.png)
+ ![4265343084-3548261345](https://huggingface.co/alea31415/bocchi-the-rock-character/resolve/main/examples/without_inpaint/4265343084-3548261345.png)
+ ![4265343083-1372779456](https://huggingface.co/alea31415/bocchi-the-rock-character/resolve/main/examples/without_inpaint/4265343083-1372779456.png)

  Some failure cases
+ ![4265343089-2940163958](https://huggingface.co/alea31415/bocchi-the-rock-character/resolve/main/examples/failure/4265343089-2940163958.png)
+ ![4265343091-129639375](https://huggingface.co/alea31415/bocchi-the-rock-character/resolve/main/examples/failure/4265343091-129639375.png)
+ ![4265343048-2869643584](https://huggingface.co/alea31415/bocchi-the-rock-character/resolve/main/examples/failure/4265343048-2869643584.png)
+ ![4265343039-1470057774](https://huggingface.co/alea31415/bocchi-the-rock-character/resolve/main/examples/failure/4265343039-1470057774.png)