
	---
	license: creativeml-openrail-m
	library_name: diffusers
	tags:
	- text-to-image
	- dreambooth
	- diffusers-training
	- stable-diffusion
	- stable-diffusion-diffusers
	base_model: runwayml/stable-diffusion-v1-5
	inference: true
	instance_prompt: disney style
	---



	# Cartoonify

	This is a DreamBooth model derived from `runwayml/stable-diffusion-v1-5` with additional fine-tuning of the text encoder. The weights were trained on images from a popular animation studio using [DreamBooth](https://dreambooth.github.io/). Include the tokens _disney style_ in your prompts to get the effect.

	You can find some example images below:

	<p float="left">
	<img width=256 height=256 src="./images/king.png">
	<img width=256 height=256 src="./images/legend_of_zelda.png">
	<img width=256 height=256 src="./images/pony.png">
	<img width=256 height=256 src="./images/princess.png">
	<img width=256 height=256 src="./images/red_ferrari.png">
	</p>

	## Intended uses & limitations

	#### How to use

	```python
	import torch
	from diffusers import StableDiffusionPipeline

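	# basic usage
	repo_id = "lavaman131/cartoonify"
	# pick the best available device: CUDA GPU, Apple Silicon (MPS), or CPU
	if torch.cuda.is_available():
	    device = torch.device("cuda")
	elif torch.backends.mps.is_available():
	    device = torch.device("mps")
	else:
	    device = torch.device("cpu")
	# half precision on GPU backends, full precision on CPU
	torch_dtype = torch.float16 if device.type in ["mps", "cuda"] else torch.float32
	pipeline = StableDiffusionPipeline.from_pretrained(repo_id, torch_dtype=torch_dtype).to(device)
	# include the tokens "disney style" in the prompt
	image = pipeline("PROMPT GOES HERE").images[0]
	image.save("output.png")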
	```
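
Since the effect depends on the tokens _disney style_ appearing in the prompt, a small helper can make them hard to forget. The sketch below is illustrative only: `with_style` and `INSTANCE_TOKENS` are hypothetical names, and putting the tokens at the front of the prompt is an assumption rather than something this card specifies.

```python
INSTANCE_TOKENS = "disney style"  # taken from the card's instance_prompt


def with_style(subject: str, tokens: str = INSTANCE_TOKENS) -> str:
    """Return a prompt that contains the instance tokens.

    Placing the tokens first is an assumption made for this sketch.
    """
    subject = subject.strip()
    # leave the prompt alone if the tokens are already present (case-insensitive)
    if tokens.lower() in subject.lower():
        return subject
    return f"{tokens}, {subject}" if subject else tokens
```

With the pipeline from the basic usage example, this could be called as `pipeline(with_style("a red ferrari")).images[0]`.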

	#### Full source code

	The full source code used for training can be found [here](https://github.com/lavaman131/cartoonify).

	#### Limitations and bias

	As with any diffusion model, expect to experiment with the prompt and the classifier-free guidance scale until you get the results you want. Zoomed-out subjects tend to lose clarity in the face. For additional safety in image generation, we use the Stable Diffusion safety checker.
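
One way to experiment with the guidance scale is to render the same prompt at several values and compare the results side by side. The following is a minimal sketch, not part of the original training or inference code: `sweep_guidance` and `make_generator` are hypothetical names, the default scales are arbitrary starting points, and `pipeline` is assumed to be a loaded `StableDiffusionPipeline`, whose call accepts `guidance_scale` and `generator`.

```python
def sweep_guidance(pipeline, prompt, scales=(5.0, 7.5, 10.0), make_generator=None):
    """Render `prompt` once per guidance scale and return {scale: image}.

    `make_generator` is an optional zero-argument callable that returns a
    freshly seeded generator, so every scale starts from the same noise.
    """
    images = {}
    for scale in scales:
        kwargs = {"guidance_scale": scale}
        if make_generator is not None:
            # a new generator per call keeps the starting noise identical
            kwargs["generator"] = make_generator()
        images[scale] = pipeline(prompt, **kwargs).images[0]
    return images
```

For a reproducible comparison, pass something like `make_generator=lambda: torch.Generator("cpu").manual_seed(0)` and save each returned image under a name that includes its scale.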

	## Training details

	The model was fine-tuned for 3500 steps on an RTX A5000 GPU (24GB VRAM) using around 200 images of modern Disney characters, backgrounds, and animals, in proportions of roughly 70%, 20%, and 10% respectively.

	The training code used can be found [here](https://github.com/huggingface/diffusers/blob/main/examples/dreambooth/train_dreambooth.py). The regularization images used for training can be found [here](https://github.com/aitrepreneur/SD-Regularization-Images-Style-Dreambooth/tree/main/style_ddim).
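
For orientation, the details above map roughly onto the arguments of the linked `train_dreambooth.py` script. The sketch below only assembles a command line and runs nothing. It is a reconstruction under assumptions, not the exact command used for this model: the directory arguments and `class_prompt` are placeholders you must supply, prior preservation is assumed because regularization images are mentioned, and hyperparameters not stated in this card (learning rate, batch size, resolution) are left at the script's defaults.

```python
import shlex
from typing import List


def dreambooth_command(
    instance_dir: str, class_dir: str, class_prompt: str, output_dir: str
) -> List[str]:
    """Build an `accelerate launch` argument list for train_dreambooth.py."""
    return [
        "accelerate", "launch", "train_dreambooth.py",
        "--pretrained_model_name_or_path", "runwayml/stable-diffusion-v1-5",
        "--instance_data_dir", instance_dir,   # the ~200 training images
        "--instance_prompt", "disney style",   # from the card's front matter
        "--with_prior_preservation",           # assumption: implied by the regularization images
        "--class_data_dir", class_dir,         # regularization images
        "--class_prompt", class_prompt,        # not stated in the card
        "--train_text_encoder",                # the card mentions text-encoder fine-tuning
        "--max_train_steps", "3500",
        "--output_dir", output_dir,
    ]


def as_shell(command: List[str]) -> str:
    """Quote the argument list so it can be pasted into a shell."""
    return shlex.join(command)
```

Calling `as_shell(dreambooth_command(...))` with your own paths and class prompt prints a command you can review before running it.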