Tutorial

Stopping a $5,000 Surprise Bill From An AI Agent

Akshay Sarode
Direct answer

Set a monthly USD budget per agent. Track spend per action. Alert at 80%. Refuse at 100%. Allow a one-time +50% override that's audit-logged. The agent stops at the cap; you don't wake up to a four-figure bill.

In October 2024, a viral tweet showed someone burning $5,000 in 8 hours on Claude tokens because of a loop bug. That was the week "agent budget" went from theoretical to required.

The shape of the problem

You can't watch every agent every minute. You need infrastructure that watches.

What "Governor" does

The Ujex Governor is a callable middleware. Every billable action checks against a budget before executing. If the budget is exceeded, the action returns 429.

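// Middleware (pseudo; assumes db.budgets exposes get() and increment())
async function requireQuota(agentId, costUsd) {
  const budget = await db.budgets.get(agentId);
  if (!budget) {
    throw new Error(`no budget configured for agent ${agentId}`);
  }
  // The one-time override raises the cap by 50%; it does not remove the cap.
  const capUsd = budget.overrideActive ? budget.monthlyUsd * 1.5 : budget.monthlyUsd;
  if (budget.usedUsd + costUsd > capUsd) {
    const err = new Error("over budget");
    err.status = 429; // surfaced to the caller as a 429
    throw err;
  }
  // Debit before the action runs. The read and the increment are separate
  // calls, so wrap them in a transaction if requests can arrive concurrently.
  await db.budgets.increment(agentId, costUsd);
}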
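The check and the debit in the middleware are two separate steps, so two concurrent requests could both pass the check before either is recorded. Below is a minimal Python sketch of making the pair atomic. It uses an in-process lock as a stand-in for a database transaction; `AtomicBudget`, `try_debit`, and `hammer` are hypothetical names for illustration, not part of the Ujex API.

```python
import threading


class AtomicBudget:
    """In-memory budget whose check-and-debit runs under one lock."""

    def __init__(self, cap_usd: float) -> None:
        self.cap_usd = cap_usd
        self.used_usd = 0.0
        self._lock = threading.Lock()

    def try_debit(self, cost_usd: float) -> bool:
        """Debit and return True if the cost fits under the cap, else False."""
        with self._lock:
            if self.used_usd + cost_usd > self.cap_usd:
                return False
            self.used_usd += cost_usd
            return True


def hammer(budget: AtomicBudget, workers: int, cost_usd: float) -> int:
    """Fire `workers` concurrent debits; return how many were accepted."""
    results = []

    def work() -> None:
        results.append(budget.try_debit(cost_usd))

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results)
```

With a real datastore, the same guarantee would come from a transaction or a conditional update rather than a lock.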

Three thresholds

Threshold | Behavior
0–80% | Spend tracked, no notification
80% | FCM push to owner: "agent X at 80%"
100% | All callables return 429; no spend
Override (+50%, one-time) | User explicitly extends; audit-logged
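The thresholds in the table can be sketched as a small classification function. This is an illustrative sketch, not the Governor's actual implementation: `Budget`, `classify`, and the constant names are assumptions, and it assumes the override raises the cap to 150% of the monthly budget.

```python
from dataclasses import dataclass

ALERT_FRACTION = 0.80     # notify the owner at 80% of the monthly budget
OVERRIDE_FRACTION = 0.50  # the one-time override adds 50% of the budget


@dataclass
class Budget:
    monthly_usd: float
    used_usd: float = 0.0
    override_active: bool = False

    @property
    def cap_usd(self) -> float:
        """Effective cap: the monthly budget, plus 50% if overridden."""
        extra = self.monthly_usd * OVERRIDE_FRACTION if self.override_active else 0.0
        return self.monthly_usd + extra


def classify(budget: Budget) -> str:
    """Return 'ok', 'alert', or 'refuse' for the current spend."""
    if budget.used_usd >= budget.cap_usd:
        return "refuse"
    if budget.used_usd >= budget.monthly_usd * ALERT_FRACTION:
        return "alert"
    return "ok"
```

For a $50 budget, $10 used classifies as `ok`, $41 as `alert`, and $50 as `refuse` unless the override is active.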

What gets metered

Every callable that has cost. tools.invoke, postbox.send, memory.embed, recall.searchEpisodes — each declares a per-call USD figure. The Governor middleware debits before allowing the action.
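One way to picture "each declares a per-call USD figure" is a decorator that attaches a flat price to a function and debits it before the function body runs. The sketch below is an assumption about the shape, not Ujex code: `Ledger`, `metered`, `OverBudget`, and the $0.02 price are all made up for illustration.

```python
import functools


class OverBudget(Exception):
    """Raised when a call would push spend past the cap."""

    status = 429


class Ledger:
    """In-memory stand-in for the budget store (hypothetical)."""

    def __init__(self, monthly_usd: float) -> None:
        self.monthly_usd = monthly_usd
        self.used_usd = 0.0

    def debit(self, cost_usd: float) -> None:
        if self.used_usd + cost_usd > self.monthly_usd:
            raise OverBudget(f"refused: {cost_usd:.2f} USD would exceed the cap")
        self.used_usd += cost_usd


def metered(ledger: Ledger, cost_usd: float):
    """Declare a flat per-call cost and debit it before the call runs."""

    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            ledger.debit(cost_usd)  # raises before any work is done
            return fn(*args, **kwargs)

        return inner

    return wrap


ledger = Ledger(monthly_usd=0.05)


@metered(ledger, cost_usd=0.02)  # illustrative price, not a real Ujex rate
def embed(text: str) -> int:
    return len(text)
```

Because the debit happens first, a refused call does no work and adds no spend; the first two `embed` calls succeed here and the third raises `OverBudget`.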

Important caveat: model-token billing happens at the LLM provider, not at Ujex. The Governor sums the declared per-call figures; it does not see actual GCP / OpenAI / Anthropic spend. We're working on real USD via Billing API integration; today the count is approximate but stops the runaway loop case (which is what matters most).

Override flow

When an agent hits 100%, the owner has already received the 80% notification and now gets a second one: "your agent is now refusing actions". From there they can grant the one-time +50% override, which is audit-logged.

Code

import os

from ujex_governor import Governor

# Read the API key from the environment rather than hard-coding it
gov = Governor(api_key=os.environ['UJEX_API_KEY'])

# Set the budget once
gov.set_budget(agent_id='abc', monthly_usd=50.0)

# Read current
budget = gov.get_budget(agent_id='abc')
# e.g. {'monthlyUsd': 50.0, 'usedUsd': 38.42, 'percent': 76.84, 'alerted80': True}

# Override (rare): one-time, at most +50% of the monthly budget, audit-logged
gov.override(agent_id='abc', extra_usd=25.0, reason='shipping deadline')
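The override rule (one-time, at most +50%, audit-logged) can be sketched locally as a small state object. This is a sketch under stated assumptions, not the Governor's internals: `BudgetState` and `OverrideError` are hypothetical names, and "one-time" is interpreted here as once per budget object.

```python
from datetime import datetime, timezone

OVERRIDE_LIMIT = 0.50  # +50% of the monthly budget, per the article


class OverrideError(Exception):
    """Raised when an override request breaks the one-time / +50% rule."""


class BudgetState:
    def __init__(self, monthly_usd: float) -> None:
        self.monthly_usd = monthly_usd
        self.extra_usd = 0.0
        self.audit_log: list[dict] = []

    @property
    def cap_usd(self) -> float:
        return self.monthly_usd + self.extra_usd

    def override(self, extra_usd: float, reason: str) -> None:
        if self.extra_usd > 0:
            raise OverrideError("override already used")
        if not 0 < extra_usd <= self.monthly_usd * OVERRIDE_LIMIT:
            raise OverrideError("extra_usd must be within +50% of the budget")
        if not reason.strip():
            raise OverrideError("a reason is required for the audit log")
        self.extra_usd = extra_usd
        # Record who-knows-what-happened-on-Friday details alongside the grant.
        self.audit_log.append({
            "event": "override_granted",
            "extra_usd": extra_usd,
            "reason": reason,
            "at": datetime.now(timezone.utc).isoformat(),
        })
```

With a $50 budget, `override(25.0, "shipping deadline")` lifts the cap to $75; a second call, or a request for $30, is rejected.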

Pair with audit

Every quota event lands in the audit log: budget set, threshold crossed, action refused, override granted. When you're investigating "what happened on Friday", the budget timeline and the audit log together tell the whole story.
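To make the "what happened on Friday" query concrete, here is a hedged sketch of the four event types and a time-window filter over them. The record shape (`AuditEntry`), the `timeline` helper, and the sample dates are assumptions for illustration; the real audit log schema is not described in this article.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum


class QuotaEvent(str, Enum):
    BUDGET_SET = "budget_set"
    THRESHOLD_CROSSED = "threshold_crossed"
    ACTION_REFUSED = "action_refused"
    OVERRIDE_GRANTED = "override_granted"


@dataclass(frozen=True)
class AuditEntry:
    at: datetime
    agent_id: str
    event: QuotaEvent
    detail: str


def timeline(entries, agent_id, start, end):
    """Entries for one agent with start <= at < end, oldest first."""
    hits = [e for e in entries if e.agent_id == agent_id and start <= e.at < end]
    return sorted(hits, key=lambda e: e.at)


def utc(day: int, hour: int) -> datetime:
    # Made-up January 2025 timestamps, for illustration only.
    return datetime(2025, 1, day, hour, tzinfo=timezone.utc)


log = [
    AuditEntry(utc(10, 14), "abc", QuotaEvent.ACTION_REFUSED, "tools.invoke refused at 100%"),
    AuditEntry(utc(1, 9), "abc", QuotaEvent.BUDGET_SET, "monthly_usd=50.0"),
    AuditEntry(utc(10, 11), "abc", QuotaEvent.THRESHOLD_CROSSED, "80% reached"),
    AuditEntry(utc(10, 15), "abc", QuotaEvent.OVERRIDE_GRANTED, "extra_usd=25.0"),
    AuditEntry(utc(10, 12), "xyz", QuotaEvent.THRESHOLD_CROSSED, "80% reached"),
]

# Everything agent "abc" did on Friday, 10 January 2025 (UTC), in order.
friday = timeline(log, "abc", utc(10, 0), utc(11, 0))
```

Here `friday` holds the threshold crossing, the refusal, and the override grant in chronological order, with the other agent's entry and the earlier budget-set event filtered out.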

FAQ

How is this different from OpenAI's spend cap?

OpenAI's cap is account-wide. Governor is per-agent — each agent has its own budget within your account.

Does this stop the runaway loop in real time?

Within one action. Each callable check is <10ms; the agent's next action gets refused if the previous one tipped it over the cap.
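A toy simulation shows why a per-action check bounds the damage from a loop bug. It assumes a flat per-action price and a check that runs before each debit; `run_loop` and the $0.25 price are hypothetical, chosen so the arithmetic is exact.

```python
def run_loop(monthly_usd: float, cost_usd: float, max_iterations: int):
    """Simulate a buggy agent that keeps calling the same billable action.

    Returns (completed_actions, used_usd) once the cap refuses a call
    or the iteration limit is reached.
    """
    used = 0.0
    completed = 0
    for _ in range(max_iterations):
        if used + cost_usd > monthly_usd:
            break  # the Governor-style check refuses this action
        used += cost_usd
        completed += 1
    return completed, used


# A loop that would otherwise run a million times stops at the cap.
completed, used = run_loop(monthly_usd=50.0, cost_usd=0.25, max_iterations=1_000_000)
print(completed, used)  # 200 50.0
```

However long the loop would have run, spend under these assumptions never exceeds the monthly cap.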

Can I set caps per day instead of per month?

Today: monthly only. Daily/weekly is on the roadmap.