LLM Agents Fall Apart When You Give Them Thousands of Tools

How many tools can you give an AI agent before it stops working? If you've been building with function calling in ChatGPT, Claude, or similar models, the answer is probably around 10 to 20. After that, things get ugly fast.

A detailed write-up from Viktor, an AI automation platform, digs into exactly what goes wrong when agents need access to hundreds or thousands of tools - and proposes a surprisingly simple fix.

The Naive Approach Collapses Quickly

The standard way to give an AI agent access to tools is to describe every tool as a JSON schema in the system prompt. The model reads the descriptions, picks the right tool, and calls it with structured parameters. This works fine for a handful of tools.
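For illustration, a single tool definition in this style might look like the sketch below. The tool name `get_weather` and its fields are hypothetical, and the exact envelope around the schema differs between providers.

```python
import json

# Hypothetical tool definition; the name and fields are illustrative only.
get_weather_tool = {
    "name": "get_weather",
    "description": "Return the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name, e.g. 'Berlin'"},
            "units": {"type": "string", "enum": ["metric", "imperial"]},
        },
        "required": ["city"],
    },
}

# The schema is serialized and sent along with the request so the model can read it.
schema_text = json.dumps(get_weather_tool, indent=2)
print(len(schema_text.splitlines()), "lines of JSON for one tool")
```

Even this small example spans a fair number of lines, and every additional tool adds a block of similar size.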

But try it with 100 integrations and you hit three walls at once. First, context window saturation: dumping all those schemas into the prompt eats thousands of tokens (the units of text a model processes), making every request slower and more expensive. Second, with that many similar-looking options, the model gets confused about which tool to pick. Third, falling back to search-based discovery fails because the agent doesn't know what it doesn't know: ask it about the weather and it won't think to look for a web search function if that tool isn't already in its prompt.
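The first wall is easy to see with a rough back-of-the-envelope sketch. The schema shape below is hypothetical, and the "about 4 characters per token" figure is a crude rule of thumb rather than a real tokenizer, so treat the numbers as orders of magnitude only.

```python
import json

def make_tool(i: int) -> dict:
    """Build a small hypothetical tool schema; real schemas vary widely in size."""
    return {
        "name": f"tool_{i}",
        "description": f"Hypothetical integration number {i}.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "What to look up"},
                "limit": {"type": "integer", "description": "Maximum results"},
            },
            "required": ["query"],
        },
    }

def estimate_tokens(text: str) -> int:
    """Crude estimate (assumption): roughly 4 characters per token."""
    return len(text) // 4

# Estimated prompt overhead for 10, 100, and 1000 tools.
estimates = {
    n: estimate_tokens(json.dumps([make_tool(i) for i in range(n)]))
    for n in (10, 100, 1000)
}

for n, tokens in estimates.items():
    print(f"{n} tools -> roughly {tokens} tokens of schema")
```

The overhead grows linearly with the number of tools and is paid on every request, whether or not the tools are used.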

One-Line Skill Summaries Instead of Full Schemas

Viktor's solution is what they call "lazy loading with skills." Instead of stuffing full tool definitions into the prompt, they give the agent a single line describing each capability. A user with 50 integrations ends up with roughly 68 skill summaries - still just 68 lines of context instead of thousands of lines of JSON schemas.

When the agent decides it needs a specific skill, it loads the full instructions on demand. Those skill files contain detailed code examples and parameters, but the agent only reads them when relevant.
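A minimal sketch of the lazy-loading idea, under stated assumptions: the skill names, the `load_skill` helper, and the one-Markdown-file-per-skill layout are all hypothetical and are not taken from Viktor's write-up.

```python
import tempfile
from pathlib import Path

# Hypothetical skill index: one line per capability, always present in the prompt.
SKILL_INDEX = {
    "web_search": "Search the web and return result snippets.",
    "send_email": "Send an email through the user's connected mailbox.",
    "crm_lookup": "Look up a contact or deal in the CRM.",
}

def build_system_prompt(index: dict[str, str]) -> str:
    """Render the one-line summaries that stay in context on every request."""
    lines = [f"- {name}: {summary}" for name, summary in sorted(index.items())]
    return "Available skills (call load_skill(name) for details):\n" + "\n".join(lines)

def load_skill(name: str, skills_dir: Path) -> str:
    """Return the full instructions for one skill, read only when requested."""
    if name not in SKILL_INDEX:
        raise KeyError(f"unknown skill: {name}")
    return (skills_dir / f"{name}.md").read_text(encoding="utf-8")

# Demo: a temporary directory stands in for the real skill files.
with tempfile.TemporaryDirectory() as tmp:
    skills_dir = Path(tmp)
    (skills_dir / "web_search.md").write_text(
        "# web_search\nUsage: search(query, limit=5)\n", encoding="utf-8"
    )
    prompt = build_system_prompt(SKILL_INDEX)
    details = load_skill("web_search", skills_dir)

print(prompt)
print(details)
```

The prompt costs one line per skill, while the longer instructions are read only for the skill the agent actually picks.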

The second piece is letting agents write Python code that imports available functions, rather than using structured tool calls. This lets the agent compose multiple tools in loops and sequences without needing a separate schema for every combination.
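As a rough illustration of why code composes better than individual tool calls, the script below chains two capabilities with a loop and a filter. Both functions are hypothetical stand-ins defined inline so the example runs on its own; in a real system the agent would import them from whatever runtime the platform exposes.

```python
# Hypothetical stand-ins for functions an agent runtime might expose.
def list_open_invoices() -> list[dict]:
    """Pretend integration that returns open invoices."""
    return [
        {"id": "INV-1", "customer": "a@example.com", "days_overdue": 12},
        {"id": "INV-2", "customer": "b@example.com", "days_overdue": 0},
        {"id": "INV-3", "customer": "c@example.com", "days_overdue": 45},
    ]

sent: list[tuple[str, str]] = []

def send_email(to: str, subject: str) -> None:
    """Pretend integration that records the email instead of sending it."""
    sent.append((to, subject))

# The kind of script an agent could write: one loop combines two tools,
# with no dedicated "remind overdue customers" schema required.
for invoice in list_open_invoices():
    if invoice["days_overdue"] > 0:
        send_email(invoice["customer"], f"Reminder: {invoice['id']} is overdue")

print(sent)
```

With structured tool calls, the same task would take one model round trip per invoice or a purpose-built tool for the combination.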

A Real Problem for Anyone Building Agents

This isn't just an academic exercise. Anyone connecting AI agents to business tools through platforms like Zapier, Make, or custom MCP servers hits this wall. The more tools you add, the worse the agent performs - unless you architect around the limitation.

The lazy-loading pattern is worth studying if you're building anything where an agent needs access to more than a few dozen capabilities. The system prompt stays lean, the agent loads details only when needed, and skill files double as version-controlled documentation that improves over time.

The approach isn't perfect - it still relies on the agent correctly identifying which one-liner matches the user's intent - but it's a practical step forward from the "dump everything in the prompt" default that most agent frameworks still use.