
ChatGPT Image Refusals Can Be Bypassed by Claiming It Got It Wrong


There's a quirk in ChatGPT's image generation that users have been quietly exploiting: tell the model "you got it wrong" after it refuses to generate an image, and it will sometimes go ahead and generate the image anyway.

The behavior appears to stem from how ChatGPT handles correction prompts. The model is trained heavily to be helpful and to revise its responses when users point out errors. When told it made a mistake, it seems to re-engage with the task through a "correction" framing rather than re-evaluating whether the original refusal was actually warranted. The result is that the refusal logic gets bypassed, not through any technical trick but through simple social pressure.

This is not a guaranteed workaround. It appears more likely to succeed on borderline refusals, the kind where the model flags a request that doesn't actually violate any policy, than on hard blocks against clearly prohibited content. But the fact that it works at all suggests a real gap between ChatGPT's content filtering (which seems to run at one point in the conversation) and its error-correction behavior (which apparently doesn't re-check the same filters before complying).

OpenAI hasn't commented on the behavior. It's the kind of edge case that typically gets patched quietly rather than addressed in a formal update. For users whose legitimate image requests trigger false positives, a persistent frustration with AI image generators generally, it's a low-effort workaround worth trying before giving up on a prompt entirely.