Tile Model for SDXL?

#1
by 97Buckeye - opened

Do you intend to release a Tile model for XL? I really miss using ControlNet for upscaling in SDXL.

Yes, I agree with @97Buckeye. In SD 1.5, Tile did wonders; for me this is the most anticipated ControlNet model.

Every day when I wake up, I check whether there is a Tile model for SDXL.
Is this one more difficult than the rest? We already have 9 canny and 11 depth models, but still no Tile model.

Same here. And I don't even see anyone talking about a Tile model. I hate having to switch down to a 1.5 model just so I can upscale my XL images.

@Illyasviel Might you have any information regarding a Tile model for XL?

@97Buckeye What's your flow for this? As I understand it, the latent spaces of 1.5 and XL are not compatible. Is it like: generate with XL -> encode into a 1.5 latent -> upscale with a 1.5 model + Tile ControlNet? Are the results close to the original XL image with denoise 0.3 - 0.4?

I do pretty much that, but I do get some small "artifacts" in my images, because the 1.5 model doesn't understand things XL created. I made a DeviantArt post with my results:
https://www.deviantart.com/yashamon/journal/Testing-out-AI-upscaling-methods-983456810
Sometimes I even get whole scenes in a single tile that somewhat match the original. For example, check out the roof in this one:
https://www.deviantart.com/yashamon/art/AI-4K-8K-Druid-s-house-983850912

I would like to throw in my 2 cents: I'm also searching daily for news about Tile support for SDXL. I'm getting tired of switching between checkpoints. Thanks so much for the work so far on this project, it's simply amazing.

Yeah, I don't think I'm gonna bother testing it any further. I gave it another try because I thought XL inpainting would be better with it, but I'm really not impressed; it's not as powerful as 1.5. It's not really worth switching for twice the VRAM usage, lower speed, worse results and an inpainting resolution limit. Maybe it has some use cases in Comfy if someone uses SDXL, so that's probably a plus.

https://huggingface.co/TTPlanet/TTPLanet_SDXL_Controlnet_Tile_Realistic_V1 Try this one, and read the instructions before using it. It works the same as the 1.5 Tile model, but for SDXL. This is the realistic version; no guarantee on 2D.

Thanks a lot TTplanet! I can confirm that this actually works!

I use ComfyUI with JuggernautXL_v9 and 4x-UltraSharp.pth to upscale a close-up landscape photo from 2k to 8k. After 6 hours of experimentation, I got the best result using Tiled KSampler with tiling strategy random strict, 20 steps, cfg 7, sampler euler, scheduler sgm_uniform, denoise 0.6, and ControlNet strength set to 0.5.

I found that using IPAdapter (plus) was producing slightly better details but at the cost of the image losing global contrast and no longer looking like the original when seen from further away.

Also, I got considerably better results with a regular checkpoint at 20 steps than with a lightning checkpoint at 6 steps.
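
For anyone who wants to reproduce this, here is a minimal sketch that collects the settings reported above in one place and estimates how many tiles such an upscale involves. The settings come from the post; the source resolution (2048x1365) and the 1024 px tile size are assumptions made only for illustration, and `tile_grid` is a hypothetical helper, not part of ComfyUI.

```python
import math

# Settings as reported in the post above (ComfyUI, Tiled KSampler).
reported_settings = {
    "checkpoint": "JuggernautXL_v9",
    "upscale_model": "4x-UltraSharp.pth",
    "tiling_strategy": "random strict",
    "steps": 20,
    "cfg": 7,
    "sampler": "euler",
    "scheduler": "sgm_uniform",
    "denoise": 0.6,
    "controlnet_strength": 0.5,
}


def tile_grid(width, height, tile_size):
    """Return (columns, rows) of square tiles needed to cover the image."""
    return math.ceil(width / tile_size), math.ceil(height / tile_size)


# Assumptions, NOT from the post: a 2048x1365 source and 1024 px tiles.
src_w, src_h, scale, tile = 2048, 1365, 4, 1024
out_w, out_h = src_w * scale, src_h * scale
cols, rows = tile_grid(out_w, out_h, tile)
print(f"{out_w}x{out_h} -> {cols}x{rows} = {cols * rows} tiles")
```

With these assumed numbers the 4x upscale covers an 8x6 grid, which gives a feel for why each full pass over an 8k image takes a while.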

I have uploaded my workflow to Civitai and put the link here as well. If you have a good image, you can use it directly. If you have a low-quality image, you need to fix it first with i2i at low denoise and then send it to my workflow; you will see the same effect as I showed in the example. I prefer to use IPA for pre-processing the image before Ultimate SD Upscale is applied. Try my process, you will like it....

Thank you @TTPlanet ! I've tested this (the fp16 version), and it seems to work great (even with 2D). It looks better than Tile 1.5 for working with larger resolution images, as produced by SDXL. And since it can use an SDXL base model to work from, including using the same model that generated the original image, that also helps produce much finer details when upscaling to higher resolutions. It seems to stay much truer to the original image when upscaling as with Ultimate SD Upscale, just adding necessary details, without as many extra hallucinations. It also doesn't seem to get splotchy like 1.5 did after upscaling multiple times. Great work! I look forward to adding this to my workflow. FYI, not sure why but ControlNet warns that it is "Unable to determine version for ControlNet model" as it is running.

Hesajon, could you share a workflow? For me it looks horrible 🥲

After some further testing, it might not be as great as it first looked. Still testing...

@TTPlanet I've tried your workflow, and all possible permutations of the differences between our workflows. After extensive testing, my conclusion is that Ultimate SD Upscale is detrimental. With it, I either can't get rid of visible seams, or the image is too constrained by low denoise and so lacks detail. Your combination of sampler, scheduler and CNet strength values proved interesting, however. When I combined those with Tiled KSampler, which allows a much higher denoise of 0.6 without visible seams, the result was more detailed and better looking than what I had before, at a very slight cost to faithfulness / consistency / coherence. I think both sets of settings are useful, and there are probably more such combinations that could be discovered.
Here is my workflow featuring both sets of settings: https://comfyworkflows.com/workflows/91690876-a404-4a89-b8e5-1f84aaf64c58
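
To illustrate the seam problem mentioned above, here is a minimal one-dimensional sketch of feathered blending between overlapping tiles. It is a generic technique and an assumption on my part; it is not the actual implementation of Ultimate SD Upscale or Tiled KSampler. Each tile is simulated as a constant value (as if neighbouring tiles drifted to different brightness), and all function names are hypothetical.

```python
def tile_starts(length, tile, overlap):
    """Start offsets of tiles covering `length` pixels, sharing `overlap` pixels."""
    if tile >= length:
        return [0]
    stride = tile - overlap  # requires overlap < tile
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # last tile sits flush with the edge
    return starts


def feather_weight(i, tile, overlap):
    """Blend weight of pixel i inside a tile: ramps linearly near the tile edges."""
    if overlap == 0:
        return 1.0
    edge = min(i, tile - 1 - i)  # distance to the nearest tile edge
    return min(1.0, (edge + 1) / (overlap + 1))


def blend_row(length, tile, overlap, tile_values):
    """Weighted average of constant-valued tiles along one row of pixels."""
    acc = [0.0] * length
    wsum = [0.0] * length
    for start, value in zip(tile_starts(length, tile, overlap), tile_values):
        for i in range(tile):
            w = feather_weight(i, tile, overlap)
            acc[start + i] += w * value
            wsum[start + i] += w
    return [a / w for a, w in zip(acc, wsum)]


hard = blend_row(8, 4, 0, [0.0, 1.0])  # no overlap: abrupt step between tiles
soft = blend_row(6, 4, 2, [0.0, 1.0])  # 2 px overlap: gradual transition
print(hard)
print([round(v, 3) for v in soft])
```

Without overlap, the row jumps straight from one tile's value to the next, which is what shows up as a visible seam. With overlap and feathering, the same difference is spread over the shared pixels.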

edit: I just tested SUPIR and CCSR against this, and this ControlNet Tile XL model wins by a huge margin, and is 7 times faster. It's fascinating how everyone's talking about those two in the world of upscaling, while the true gem is hidden right here. To be fair, the CNet Tile approach does not stick nearly as close to the original as they do, but most of us aren't trying to do forensics with these, so allowing the model some creativity is not really an issue, or at least for me it isn't. The model's going to hallucinate anyway, as it can never actually know what details were there in reality when zoomed in, so we might as well let it hallucinate optimally.
