A Stable Diffusion v2.1 ControlNet finetuned on a subset of MS COCO with a CLIP-conditioned eps MSE loss, for better expressiveness in some instances.
Uses OpenAI's CLIP ViT-B/16.
Requires specific inference code to work at all: vanilla CFG inference oversteps & oversaturates the output, and so does using all ControlNet layers.
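The actual inference code is not reproduced in this card. As a rough sketch only, the two knobs named above (the CFG step and which ControlNet layers contribute) could be exposed like this. `cfg_combine`, `weight_residuals`, and the example values are hypothetical, not the author's implementation. The functions are shown on plain floats but work on anything that supports arithmetic, such as tensors.

```python
def cfg_combine(eps_uncond, eps_cond, guidance_scale):
    """Standard classifier-free guidance: eps_u + s * (eps_c - eps_u).

    guidance_scale = 1.0 returns eps_cond unchanged; larger values push
    further along the (cond - uncond) direction.
    """
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)


def weight_residuals(residuals, weights):
    """Scale each ControlNet residual by its own weight.

    A weight of 0.0 drops that layer's contribution entirely.
    Hypothetical helper: the layers and weights that suit this
    checkpoint are not stated in the model card.
    """
    if len(residuals) != len(weights):
        raise ValueError("need exactly one weight per residual")
    return [r * w for r, w in zip(residuals, weights)]


# Toy scalars standing in for tensors.
guided = cfg_combine(eps_uncond=1.0, eps_cond=2.0, guidance_scale=3.0)
kept = weight_residuals([1.0, 2.0, 3.0], [1.0, 0.0, 0.5])
```

Lowering `guidance_scale` and zeroing some weights are the obvious things to try against overstepping and oversaturation, but which settings actually work here is an open assumption.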
Finetuning this way may worsen FID while presumably improving CLIP score; neither has been measured yet, and somebody ought to.
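For anyone who wants to do that measurement, here is a minimal sketch of the CLIP-score half, assuming the image and text embeddings have already been computed with a CLIP model. It uses one common convention (cosine similarity scaled by 100 and clamped at zero); other scalings exist, so `scale` is a parameter. The function names are hypothetical. FID is left out because it needs an Inception network and reference statistics.

```python
import math


def clip_score(image_emb, text_emb, scale=100.0):
    """Scaled cosine similarity between two embeddings, clamped at zero."""
    if len(image_emb) != len(text_emb):
        raise ValueError("embeddings must have the same length")
    dot = sum(a * b for a, b in zip(image_emb, text_emb))
    norm = math.sqrt(sum(a * a for a in image_emb)) * math.sqrt(
        sum(b * b for b in text_emb)
    )
    if norm == 0.0:
        raise ValueError("embeddings must be non-zero")
    return max(scale * dot / norm, 0.0)


def mean_clip_score(pairs):
    """Average clip_score over (image_emb, text_emb) pairs."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one pair")
    return sum(clip_score(img, txt) for img, txt in pairs) / len(pairs)
```

Comparing `mean_clip_score` for the same prompts and seeds with and without this ControlNet would give a first answer to the CLIP-score half of the question.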
The license for the weights & code (aside from everything inherited from the licenses of SD 2.1 & other code bases) is simple:
- If you want to use this as a user, just use good judgement and don't hold me liable;
- If you want to use this as a researcher, cite it;
- If you want to use this for commercial ventures outside of personal use, reach out lol
Also see the licenses included here for all work this builds on, and follow their terms (esp. Stable Diffusion's), of course!