---
license: creativeml-openrail-m
thumbnail: "https://huggingface.co/Linaqruf/hitokomoru-diffusion-v2/resolve/main/example_image/thumbnail.png"
language:
  - en
pipeline_tag: text-to-image
tags:
  - stable-diffusion
  - stable-diffusion-diffusers
  - diffusers
  - waifu-diffusion
inference: true
widget:
  - text: >-
      masterpiece, best quality, 1girl, brown hair, green eyes, colorful, autumn,
      cumulonimbus clouds, lighting, blue sky, falling leaves, garden
    example_title: example 1girl
  - text: >-
      masterpiece, best quality, 1boy, medium hair, blonde hair, blue eyes,
      bishounen, colorful, autumn, cumulonimbus clouds, lighting, blue sky,
      falling leaves, garden
    example_title: example 1boy
---

# Hitokomoru Diffusion V2

![Anime Girl](https://huggingface.co/Linaqruf/hitokomoru-diffusion-v2/resolve/main/example_image/thumbnail.png)

A latent diffusion model trained on the artwork of the Japanese artist [ヒトこもる/Hitokomoru](https://www.pixiv.net/en/users/30837811). The current model is fine-tuned from [waifu-diffusion-1-4](https://huggingface.co/hakurei/waifu-diffusion-v1-4) (`wd-1-4-anime_e2.ckpt`) with a learning rate of `2.0e-6`, 15,000 training steps, and a batch size of 4 on `257 artworks` collected from Danbooru. This model is intended as a continuation of [hitokomoru-diffusion](https://huggingface.co/Linaqruf/hitokomoru-diffusion/), which was fine-tuned from Anything V3.0. The dataset was preprocessed with the [Aspect Ratio Bucketing Tool](https://github.com/NovelAI/novelai-aspect-ratio-bucketing) so that it could be converted to latents and trained at non-square resolutions. Like other anime-style Stable Diffusion models, it supports Danbooru tags for image generation.

e.g. **_1girl, white hair, golden eyes, beautiful eyes, detail, flower meadow, cumulonimbus clouds, lighting, detailed sky, garden_**

- Use it with [`Automatic1111's Stable Diffusion Webui`](https://github.com/AUTOMATIC1111/stable-diffusion-webui); see [How to Use](#how-to-use)
- Use it with 🧨 [`diffusers`](#-diffusers)

# Model Details

- **Developed by:** Linaqruf
- **Model type:** Diffusion-based text-to-image generation model that generates and modifies images based on text prompts
- **License:** [CreativeML Open RAIL++-M License](https://huggingface.co/stabilityai/stable-diffusion-2/blob/main/LICENSE-MODEL)
- **Finetuned from model:** [waifu-diffusion-v1-4-epoch-2](https://huggingface.co/hakurei/waifu-diffusion-v1-4/blob/main/wd-1-4-anime_e2.ckpt)

## How to Use
- Download `hitokomoru-v2.ckpt` [here](https://huggingface.co/Linaqruf/hitokomoru-diffusion-v2/resolve/main/hitokomoru-v2.ckpt), or download the safetensors version [here](https://huggingface.co/Linaqruf/hitokomoru-diffusion-v2/resolve/main/hitokomoru-v2.safetensors).
- This model is fine-tuned from [waifu-diffusion-v1-4-epoch-2](https://huggingface.co/hakurei/waifu-diffusion-v1-4/blob/main/wd-1-4-anime_e2.ckpt), which is in turn fine-tuned from [stable-diffusion-2-1-base](https://huggingface.co/stabilityai/stable-diffusion-2-1-base). To run this model in [`Automatic1111's Stable Diffusion Webui`](https://github.com/AUTOMATIC1111/stable-diffusion-webui), you need to place the inference config `.yaml` file, available [here](https://huggingface.co/Linaqruf/hitokomoru-diffusion-v2/resolve/main/hitokomoru-v2.yaml), next to the model.
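As a sketch, downloading the checkpoint and its matching config might look like the following. The webui install path is an assumption; adjust it to your setup. The config file must share the checkpoint's base name for the webui to pick it up.

```shell
# Assumed webui install location -- adjust to your setup.
WEBUI_MODELS="$HOME/stable-diffusion-webui/models/Stable-diffusion"

# Download the checkpoint and its inference config into the models folder.
wget -P "$WEBUI_MODELS" "https://huggingface.co/Linaqruf/hitokomoru-diffusion-v2/resolve/main/hitokomoru-v2.safetensors"
wget -P "$WEBUI_MODELS" "https://huggingface.co/Linaqruf/hitokomoru-diffusion-v2/resolve/main/hitokomoru-v2.yaml"
```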

## 🧨 Diffusers

This model can be used just like any other Stable Diffusion model. For more information, have a look at the [Stable Diffusion documentation](https://huggingface.co/docs/diffusers/api/pipelines/stable_diffusion). You can also export the model to [ONNX](https://huggingface.co/docs/diffusers/optimization/onnx), [MPS](https://huggingface.co/docs/diffusers/optimization/mps), and/or FLAX/JAX.

Install the dependencies below to run the pipeline:

```bash
pip install diffusers transformers accelerate scipy safetensors
```
Run the pipeline (if you don't swap the scheduler, it will run with the default DDIM; in this example we swap it to DPMSolverMultistepScheduler):

```python
import torch
from torch import autocast
from diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler

model_id = "Linaqruf/hitokomoru-diffusion-v2"

# Swap in the DPMSolverMultistepScheduler (DPM-Solver++) scheduler
pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
pipe = pipe.to("cuda")

prompt = "masterpiece, best quality, high quality, 1girl, solo, sitting, confident expression, long blonde hair, blue eyes, formal dress"
negative_prompt = "worst quality, low quality, medium quality, deleted, lowres, comic, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, jpeg artifacts, signature, watermark, username, blurry"

with autocast("cuda"):
    image = pipe(
        prompt,
        negative_prompt=negative_prompt,
        width=512,
        height=728,
        guidance_scale=12,
        num_inference_steps=50,
    ).images[0]

image.save("anime_girl.png")
```
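For reproducible outputs, you can pass a seeded `torch.Generator` to the pipeline. The `generate` wrapper below is a hypothetical helper, not part of this model card's API; `pipe` is assumed to be the pipeline built above.

```python
import torch

# Hypothetical convenience wrapper: fixing the seed makes a run reproducible.
# `pipe` is assumed to be the StableDiffusionPipeline built in the snippet above.
def generate(pipe, prompt, seed, **kwargs):
    generator = torch.Generator(device="cuda").manual_seed(seed)
    return pipe(prompt, generator=generator, **kwargs).images[0]

# Seeding is what guarantees determinism: two generators with the same seed
# produce identical noise tensors, so the same prompt yields the same image.
g1 = torch.Generator().manual_seed(994051800)
g2 = torch.Generator().manual_seed(994051800)
assert torch.equal(torch.randn(4, generator=g1), torch.randn(4, generator=g2))
```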

## Example

Here are some cherry-picked samples:

![Anime Girl](https://huggingface.co/Linaqruf/hitokomoru-diffusion-v2/resolve/main/example_image/cherry-picked-sample.png)

### Prompt and settings for Example Images

```
masterpiece, best quality, high quality, 1girl, solo, sitting, confident expression, long blonde hair, blue eyes, formal dress, jewelry, make-up, luxury, close-up, face, upper body.

Negative prompt: worst quality, low quality, medium quality, deleted, lowres, comic, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, jpeg artifacts, signature, watermark, username, blurry

Steps: 20, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 994051800, Size: 512x768, Model hash: ea61e913a0, Model: hitokomoru-v2, Batch size: 2, Batch pos: 0, Denoising strength: 0.6, Clip skip: 2, ENSD: 31337, Hires upscale: 1.5, Hires steps: 20, Hires upscaler: Latent (nearest-exact)
```
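The settings above use hires fix: a 512x768 base render upscaled by `Hires upscale: 1.5`. As a rough sketch of the resulting resolution, `hires_size` below is a hypothetical helper that assumes each side is rounded down to a multiple of 8 (as latents require); the webui's exact rounding rule may differ.

```python
# Hypothetical helper: final resolution after hires-fix upscaling.
# Assumes each side is rounded down to a multiple of 8, as latents require;
# the webui's exact rounding rule may differ slightly.
def hires_size(width, height, scale):
    return (int(width * scale) // 8 * 8, int(height * scale) // 8 * 8)

print(hires_size(512, 768, 1.5))  # → (768, 1152)
```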
## License

This model is open access and available to all, with a CreativeML OpenRAIL-M license further specifying rights and usage.
The CreativeML OpenRAIL License specifies:

1. You can't use the model to deliberately produce or share illegal or harmful outputs or content
2. The authors claim no rights over the outputs you generate; you are free to use them, but you are accountable for their use, which must not go against the provisions set in the license
3. You may re-distribute the weights and use the model commercially and/or as a service. If you do, please be aware that you have to include the same use restrictions as the ones in the license and share a copy of the CreativeML OpenRAIL-M with all your users (please read the license entirely and carefully)

[Please read the full license here](https://huggingface.co/spaces/CompVis/stable-diffusion-license)

## Credit
- [ヒトこもる/Hitokomoru](https://www.pixiv.net/en/users/30837811) for the dataset