The image generation quality from OpenAI's GPT Image V2 model is inconsistent enough that regular users are noticing - and losing patience. Common failure modes include garbled text embedded in images, distorted hands and faces, and cases where the model ignores key parts of a prompt entirely.
GPT Image V2 - available inside ChatGPT and via the API as gpt-image-1 - launched earlier this year as a significant upgrade over DALL-E 3, with OpenAI emphasizing better text rendering and more faithful instruction-following. In practice, the text improvement is real but inconsistent. Simple labels or short signs often render cleanly; longer strings, stylized typography, or multi-line layouts still break in ways that make the output unusable.
The prompt-following failures are harder to shrug off. When a model misreads or drops key instructions, you're not just running one extra generation - you're running three or four, burning time and API credits to get a single usable result. For anyone embedding image generation into a content or design workflow, a high failure rate on complex prompts makes the tool genuinely unreliable.
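A rough way to put numbers on that retry cost, as a minimal sketch: assume each generation independently yields a usable image with the same probability, so the number of attempts until the first success follows a geometric distribution with mean 1/p. The function names and the sample success rates below are illustrative assumptions, not measured figures for any model.

```python
def expected_attempts(success_rate: float) -> float:
    """Mean number of generations until the first usable image.

    Assumes every attempt succeeds independently with the same
    probability (a simplification; real failures may be correlated
    with the prompt itself).
    """
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return 1.0 / success_rate


def expected_cost(success_rate: float, cost_per_image: float) -> float:
    """Expected spend per usable image, in the same unit as cost_per_image."""
    return expected_attempts(success_rate) * cost_per_image


if __name__ == "__main__":
    # Hypothetical success rates, chosen only to illustrate the scaling.
    for rate in (0.9, 0.5, 0.25):
        attempts = expected_attempts(rate)
        print(f"success rate {rate:.0%}: about {attempts:.1f} attempts per usable image")
```

Under this model, the "three or four" generations described above correspond to a per-attempt success rate of roughly 25 to 33 percent on the prompts in question.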
OpenAI has not published a changelog of fixes or publicly acknowledged specific quality regressions. The model handles straightforward product shots and simple illustration prompts reasonably well. It still struggles with anything requiring precise composition, accurate text at scale, or multiple interacting elements.
The most practical workaround right now: write simpler, more literal prompts and batch your generations, requesting several candidates per prompt and keeping the best one. That's not a satisfying answer, but it's the realistic one for production use.
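One way the batching half of that workaround might look in a pipeline, as a sketch under stated assumptions: `first_usable`, `generate_batch`, and `is_usable` are hypothetical names introduced here, not part of any SDK. In a real setup, `generate_batch` would presumably wrap an image API call that requests several candidates at once (for example via the `n` parameter of the OpenAI Python SDK's `client.images.generate`), and `is_usable` would be a human review step or an automated check.

```python
from typing import Callable, List, Optional


def first_usable(
    prompt: str,
    generate_batch: Callable[[str, int], List[bytes]],
    is_usable: Callable[[bytes], bool],
    batch_size: int = 4,
    max_batches: int = 2,
) -> Optional[bytes]:
    """Request candidates in batches and return the first acceptable one.

    generate_batch(prompt, n) is a hypothetical callable returning n
    images as raw bytes; is_usable(image) is a hypothetical acceptance
    check. Returns None if every candidate in every batch is rejected.
    """
    for _ in range(max_batches):
        for image in generate_batch(prompt, batch_size):
            if is_usable(image):
                return image
    return None  # Budget exhausted; caller decides whether to simplify the prompt.
```

Capping `max_batches` keeps spend bounded: when the cap is hit, that is a signal to rewrite the prompt in simpler, more literal terms rather than keep retrying.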