11,505 API calls. Seven days. One header value that never changed: fallback-percentage: 0.5.
A developer who built a transparent HTTP proxy - a monitoring layer that logs every packet between their application and Anthropic's servers - found that Claude's API has been quietly attaching this header to responses across all plan tiers. The value is fixed at 0.5 for every request, regardless of time of day, server load, or usage patterns.
The plain-language reading: whatever your Claude plan advertises as capacity, 50% is your actual working allocation.
The Constraint Hidden in Plain Sight
Response headers are small pieces of metadata attached to server responses, typically used to pass technical instructions between systems. They don't usually make news. But this one is worth attention because it suggests Claude API capacity is structured differently than the plan documentation implies.
The investigation ruled out the obvious alternative explanations. The fallback-percentage: 0.5 value isn't time-sensitive. It doesn't change during peak hours. It's not correlated with server load. It appears to be a fixed per-account configuration.
Beyond this header, the data revealed another layer most API users don't know about: the weekly quota. About 14% of calls in the dataset hit the weekly limit as the binding constraint - not the 5-hour rolling window that Anthropic documents more prominently. If you're hitting rate limits and only watching the 5-hour window, you may be misattributing what's actually throttling your requests.
Same Plan, Different Rules
The most opaque finding: accounts on identical Max 5x plans are getting different treatment. Some accounts return overage-status: allowed, meaning they can exceed their standard allocation when needed. Others get overage: rejected + org_level_disabled. Same tier, same pricing, different permissions - with no documented explanation for why.
Anthropichasn't published documentation on either the fallback-percentage header or what determines overage eligibility. There's no page in the developer docs explaining what 0.5 means in this context, or whether it represents intended rate architecture or a configuration artifact that happened to surface in response metadata.
For anyone running production workloads on the Claude API, the practical question is direct: if your capacity planning assumed access to the full advertised throughput, you might be working with half of it. Whether the fallback-percentage represents a genuine hard cap, a reservation system, or something else entirely is something only Anthropic can clarify. So far, no official response has been provided.