good image quality but bad prompt adherence

by balikita - opened Jul 6

Discussion

balikita

Jul 6

prompt: photo of the pope on top of an orca whale. he is speeding and the photo around him is blurry. year 1970

result:

toilaluan

Jul 7

it's your bad prompt.
A vintage photograph from 1970 showing the Pope riding on top of a massive orca whale. The scene is dynamic, with the Pope speeding across the water, creating a sense of motion. The background and surroundings are blurred due to the high speed, enhancing the focus on the Pope and the orca. The Pope is dressed in his traditional white papal garments, and the photo has a slightly faded, nostalgic quality typical of 1970s photography.

balikita

Jul 7

nop. it should understand my prompt if prompt adherence is there

toilaluan

Jul 7

@balikita why it should while your prompt is not clear and even wrong?

balikita

Jul 7

@balikita why it should while your prompt is not clear and even wrong?

because that prompt working good in midjourney so that's not wrong

3blackbar

Jul 7

•

edited Jul 7

dood, adapt your prompt to the model - not the other way around, its always like this, 1.5 and xl need different prompting too, this one as well, so move on and change your prompt, its not going to work with your "pope on top" whatever it means.just cause it works in MJ doesnt mean it will work with anything else

Said2k

Jul 7

An LLM is needed to upsample your prompt so the model can understand it.

balikita

Jul 7

dood, adapt your prompt to the model - not the other way around, its always like this, 1.5 and xl need different prompting too, this one as well, so move on and change your prompt, its not going to work with your "pope on top" whatever it means.just cause it works in MJ doesnt mean it will work with anything else

That shouldn't be the case though. When it is true it means the model, itself, is inherently failing to evolve and improve. The central point around improved prompt coherency is it should eventually reach the level of resolving in the way a human would naturally perceive it. Having to use weird ass negatives like in SD3's fail case shouldn't be the norm.

deepfree

Jul 7

Modified to "riding on top", and it works, in some cases:
photo of the pope riding on top of an orca whale. he is speeding and the photo around him is blurry. year 1970

Not totally agree with balikita, but it's true that Kolors' prompt adherence is not quite good, after lots of tests.

sinsro

Jul 13

I have the same experience, great image quality, but prompt adherence is very poor compared to other models. I have been playing around with many inference parameters, and the issue is consistent. I even tried to translate the prompt to Chinese, since the model seems to be optimized for mandarin, but it did not help. Randomly the model can get it, and provide the correct result, so it does seem to understand the nuances in the prompt but the understanding is not consistent. I also tried to use LLM improved prompts as suggested above, but experienced that the model seem to ignore most of the details in the enhanced prompt. For example, given the same seed, two completely different prompts could produce the exact same image, given the presence of some "magic" keywords. So, basically, you will need some luck to get the results you are looking for. Would be nice if the results were more predictable and consistent, but currently it is lacking compared to other models in this respect.

deepfree

Jul 14

This comment has been hidden

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment