A set of ChatGPT-generated images shows Vincent and Jules from Pulp Fiction rendered in progressively happier emotional states - same faces, same black suits, same Tarantino visual framing, just shifting from deadpan to full smiles across six frames. The consistency is what's notable, not the prompt.
Getting an AI image generator to maintain the same character appearance across multiple outputs used to require specialized tools: ControlNet plugins (extensions that condition generation on structural inputs such as pose, edge, or depth maps, which pins down composition from one image to the next), reference image uploads, or dedicated character consistency workflows in tools like ComfyUI. Midjourney only recently added a --cref (character reference) flag to handle this. ChatGPT does it through conversation context alone.
The Pulp Fiction prompt works partly because the characters are well-defined in the model's training data - two actors with distinctive looks, from a film with a recognizable visual style. But the output quality across the series suggests the underlying GPT-4o image system has gotten meaningfully better at holding visual coherence across a single thread, not just recognizing the reference.
For content creators doing straightforward character-consistency work with recognizable cultural references, this removes the need to set up a separate reference-image or plugin workflow. For original characters, unusual camera angles, or specific artistic styles outside the model's training, dedicated image tools still have a real edge. But for style experiments and cultural reference work, ChatGPT is a reasonable first stop before reaching for heavier software.