OpenAI Publishes Official Prompt Guidance for GPT-5.4

March 7, 2026 3 min read

What Happened

OpenAI released an official prompt guidance document for GPT-5.4, its latest model. Published on the OpenAI developer platform, the guide details specific prompting patterns, configuration recommendations, and workflow strategies tuned to how GPT-5.4 actually behaves.

The key recommendations break down into several categories:

Output Contracts - Explicitly define what you want back: "return exactly the sections requested, in the requested order." This reduces verbosity and keeps the model focused on delivering structured results.

Tool Persistence Rules - Tell the model "do not stop early when another tool call is likely to materially improve correctness or completeness." This directly addresses a common failure where models skip steps in multi-tool workflows.

Completeness Contracts - For multi-step tasks, instruct the model to maintain "an internal checklist of required deliverables" and track coverage systematically.

Verification Loops - Add a final step: "check correctness: does the output satisfy every requirement?" A lightweight instruction that catches missed requirements.

Reasoning Effort Controls - OpenAI introduces a tiered system: use none for execution tasks, low for latency-sensitive work, and medium/high only for reasoning-heavy workloads. Their explicit advice: "treat reasoning effort as a last-mile knob, not the primary way to improve quality."
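The four prompt-level patterns above can be combined into a single system prompt. The following is a minimal sketch in Python that only assembles a string; it does not call any API. The instruction text is lightly adapted from the phrases quoted above, while the function name `build_system_prompt`, the `role_line` parameter, and the labels are hypothetical and not taken from OpenAI's guide.

```python
# Instruction text adapted from the phrases quoted in the guide.
# The labels on the left are illustrative only.
GUIDE_RULES = {
    "Output contract": "Return exactly the sections requested, in the requested order.",
    "Tool persistence": (
        "Do not stop early when another tool call is likely to materially "
        "improve correctness or completeness."
    ),
    "Completeness contract": "Maintain an internal checklist of required deliverables.",
    "Verification loop": (
        "Before finishing, check correctness: does the output satisfy every requirement?"
    ),
}


def build_system_prompt(role_line: str, sections: list[str]) -> str:
    """Assemble a system prompt from a role line, the rules, and the required sections."""
    lines = [role_line, ""]
    for label, rule in GUIDE_RULES.items():
        lines.append(f"{label}: {rule}")
    lines.append("")
    lines.append("Required sections, in order:")
    # Number the sections so the requested order is unambiguous.
    lines.extend(f"{i}. {name}" for i, name in enumerate(sections, start=1))
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_system_prompt(
        "You are a release-notes assistant.",
        ["Summary", "Breaking changes"],
    ))
```

Keeping the rules in one dictionary makes it easy to drop or reword a single pattern and compare results, which is probably the quickest way to find out which of them helps a given workflow.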

OpenAI also notes that GPT-5.4 excels at personality and tone adherence with less drift over long outputs, and handles evidence-rich synthesis well in long-context and multi-tool workflows.

Why It Matters

This guidance reveals where GPT-5.4 is strong and where it still needs guardrails. OpenAI explicitly calls out tool routing as "less reliable" early in sessions, and that matters if you are building agentic workflows: front-load your tool selection logic rather than relying on the model to figure it out.
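One way to front-load tool selection is to narrow the candidate tools with plain rules before the first model call. The sketch below assumes a keyword-based pre-router; the tool names, the keywords, and the `preselect_tools` function are all hypothetical, and how the resulting list is passed to the model is left out because the article does not describe it.

```python
# Hypothetical tool names and keyword rules; replace with your own.
TOOL_RULES = [
    ("search_docs", ("docs", "documentation", "how do i")),
    ("run_sql", ("table", "query", "rows")),
    ("fetch_url", ("http://", "https://")),
]


def preselect_tools(user_message: str) -> list[str]:
    """Pick candidate tools with keyword rules before the model is called."""
    text = user_message.lower()
    return [
        name
        for name, keywords in TOOL_RULES
        if any(keyword in text for keyword in keywords)
    ]


if __name__ == "__main__":
    print(preselect_tools("How many rows are in the orders table?"))
```

An empty result can be treated as "offer every tool", so the rules only ever narrow the choice and never block a request.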

The reasoning effort parameter is significant. Instead of every request burning maximum compute on chain-of-thought reasoning, you can now dial it down for straightforward tasks. This has direct cost and latency implications for anyone running GPT-5.4 at scale through the API.
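The tiering described in the guide can be encoded as a small lookup so that each request type gets the lowest effort level that fits. This is a sketch under assumptions: the task categories (`execution`, `latency_sensitive`, `reasoning_heavy`), the `escalate` flag, and the function name are hypothetical labels, and the code only returns the tier name rather than showing an API call.

```python
# Tier mapping as summarized in the article; the task categories are hypothetical labels.
EFFORT_BY_TASK = {
    "execution": "none",
    "latency_sensitive": "low",
    "reasoning_heavy": "medium",
}


def choose_reasoning_effort(task_kind: str, escalate: bool = False) -> str:
    """Return the suggested effort tier for a task kind.

    `escalate` bumps reasoning-heavy work from medium to high, treating effort
    as a last-mile adjustment rather than the default.
    """
    if task_kind not in EFFORT_BY_TASK:
        raise ValueError(f"unknown task kind: {task_kind!r}")
    if escalate and task_kind == "reasoning_heavy":
        return "high"
    return EFFORT_BY_TASK[task_kind]


if __name__ == "__main__":
    for kind in EFFORT_BY_TASK:
        print(kind, "->", choose_reasoning_effort(kind))
```

Defaulting to the lower tier and escalating only on request mirrors the guide's advice to fix the prompt first and tune effort second.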

The "output contracts" pattern is not new to experienced prompt engineers, but having OpenAI officially codify it validates what practitioners have known: telling models exactly what format you expect back is one of the highest-leverage prompting techniques.

For teams using GPT-5.4 in production, the completeness contract and verification loop patterns are worth implementing immediately. These are cheap additions to system prompts that catch a common failure mode: the model delivering 80% of what you asked for and quietly dropping the rest.
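The patterns themselves live in the system prompt, but the same checklist idea can be mirrored on the client side as a cheap safety net. The sketch below is not from OpenAI's guide. It assumes that each required deliverable appears as its own heading line in the model output, and the `missing_sections` helper is a hypothetical name.

```python
def missing_sections(output: str, required: list[str]) -> list[str]:
    """Return required section headings that do not appear as a line in the output."""
    # Compare case-insensitively against whole lines, ignoring surrounding whitespace.
    present = {line.strip().lower() for line in output.splitlines()}
    return [name for name in required if name.lower() not in present]


if __name__ == "__main__":
    draft = "Summary\nShipped the new parser.\n\nRisks\nNone known."
    print(missing_sections(draft, ["Summary", "Risks", "Next steps"]))
```

A non-empty result is a signal to retry or to ask the model for only the missing sections, rather than accepting a partial answer.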

Our Take

The most revealing line in the entire document is about reasoning effort: "treat reasoning effort as a last-mile knob, not the primary way to improve quality." This is OpenAI admitting that throwing more compute at reasoning is not the answer to bad prompts. Structure your instructions well first. Tune reasoning effort second.

The tool persistence rule is the kind of specific, practical advice that actually changes outcomes. If you have used GPT-5.4 for agentic tasks and watched it bail out after one tool call when it clearly needed three, you know the frustration. Now there is an official pattern to prevent it.

What is missing from the guide is equally telling. No mention of context window sizes or token limits. No benchmarks against Claude or Gemini. No discussion of pricing tiers for different reasoning levels. OpenAI is being strategic about what it shares.

If you are building on the OpenAI API, read this guide and update your system prompts. The patterns are specific enough to implement in an afternoon and general enough to improve most workflows. If you are choosing between models, the guidance about early-session tool routing weaknesses is worth factoring into your decision.
