Every major AI lab is racing to ship agents that can do real work: book flights, manage ad campaigns, spin up cloud infrastructure, negotiate with vendors. The missing piece nobody wants to talk about is money. Specifically, should these agents be allowed to spend it without asking you first?
The answer today is mostly no, but the guardrails are thin. OpenAI's ChatGPT can already browse the web and call third-party APIs through plugins. Anthropic's Claude can execute multi-step tool chains. Google's Gemini agents can interact with Workspace apps. None of these systems have built-in spending limits or purchase approval workflows. The constraint isn't technical - it's that nobody has connected them to a corporate credit card yet. Give it six months.
Where This Gets Real
Think about the AI tools people actually use for work right now. A marketing agent running Facebook ads needs to allocate budget. An infrastructure agent scaling servers on AWS needs to provision resources that cost money by the hour. A procurement agent comparing SaaS vendors is useless if it can't pull the trigger on a purchase order.
Some companies are already experimenting. Zapier's multi-step automations can trigger purchases. Salesforce's Agentforce can process transactions within defined workflows. But these operate with hard-coded spending limits and human approval gates - they're automated, not autonomous.
The distinction matters. Automation follows a script you wrote. Autonomy means the agent decides what to spend, when, and how much based on its own reasoning. That second category is where things get uncomfortable.
The Practical Risk Isn't Skynet
Forget the sci-fi scenarios. The real danger is mundane: an agent misreading context and overspending on cloud compute, an LLM hallucinating a vendor name and wiring money to the wrong account, or a prompt injection attack tricking your purchasing agent into buying something you never authorized.
These aren't hypothetical failure modes. LLMs hallucinate. Prompt injection remains unsolved. And the "reasoning" behind an agent's spending decision is a black box that no finance team can audit.
A reasonable framework probably looks like this: tiered spending limits (agents can spend up to $X without approval), mandatory human-in-the-loop above a threshold, detailed transaction logging that a non-technical person can review, and kill switches that work instantly. Basically, treat your AI agent like a new employee with a company card - start with a low limit and increase it as trust builds.
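To make that framework concrete, here is a minimal Python sketch of a spending gate. It is an illustration under stated assumptions, not any vendor's actual API: the `SpendGate` class, its field names, and the dollar figures are all hypothetical, and a real system would sit in front of a payment provider and persist its log somewhere tamper-resistant.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from enum import Enum


class Decision(Enum):
    APPROVED = "approved"
    NEEDS_HUMAN = "needs human approval"
    DENIED = "denied"


@dataclass
class SpendGate:
    """Hypothetical policy gate that every agent purchase must pass through."""
    auto_approve_limit: Decimal   # the "$X" an agent may spend per purchase unaided
    budget: Decimal               # cap on total auto-approved spend
    killed: bool = False          # kill switch state
    spent: Decimal = Decimal("0")
    log: list = field(default_factory=list)

    def kill(self) -> None:
        # Kill switch: every later request is denied until a human resets it.
        self.killed = True

    def request(self, agent: str, vendor: str, amount: Decimal, reason: str) -> Decision:
        if self.killed or amount <= 0:
            decision = Decision.DENIED
        elif amount > self.auto_approve_limit or self.spent + amount > self.budget:
            # Above the per-purchase tier or the running budget: escalate to a person.
            decision = Decision.NEEDS_HUMAN
        else:
            decision = Decision.APPROVED
            self.spent += amount
        # Plain-language log line that a non-technical reviewer can read.
        self.log.append(
            f"{agent} asked to pay {vendor} ${amount} for '{reason}': {decision.value}"
        )
        return decision


if __name__ == "__main__":
    gate = SpendGate(auto_approve_limit=Decimal("50"), budget=Decimal("100"))
    gate.request("ads-agent", "ExampleAds", Decimal("40"), "boost campaign")
    gate.request("infra-agent", "ExampleCloud", Decimal("75"), "scale servers")
    gate.kill()
    gate.request("ads-agent", "ExampleAds", Decimal("10"), "retry")
    print("\n".join(gate.log))
```

Raising `auto_approve_limit` over time is the code equivalent of bumping a new employee's card limit. The key design choice is that the agent never touches the payment rail directly; it can only ask the gate.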
The companies building agent platforms that ship with these controls baked in will have a real advantage. The ones that leave spending governance as an afterthought are building liability machines.