
Human-in-the-Loop AI Agents: A Production Setup Guide

Akshay Sarode Nov 26, 2025

Direct answer

Wrap every irreversible action in an approval primitive. Use FCM push to a phone (delivery is typically sub-second). Long-press notification → biometric → approve. TTL'd; first responder wins; multi-channel fanout. Audit-logged. The whole loop typically takes 1.5–6 seconds when the human is around; it defaults safely to a denial-equivalent if they're not.

The autonomous agent that does everything without checking with you sounds great until it pushes to main, sends an email to the wrong person, or runs up a $4,000 API bill. The fix is human-in-the-loop on the things that matter — and only those things.

This is the production setup. Not "a tutorial about a feature" — the design choices that hold at scale.

What needs approval (and what doesn't)

Approval gates add friction. Use them on actions that are:

Irreversible (sending email, git push, payment)
Costly above some threshold (API call > $X, large compute job)
Outside the agent's normal capability sandbox
Reaching external humans (sending message to user, customer)

Don't gate:

Reads (memory queries, file reads inside workspace)
Idempotent writes inside the workspace
Tool calls within the agent's existing capability bundle

The asymmetry: false positives (gating something that didn't need it) cost developer time. False negatives (auto-approving something dangerous) cost real money or trust. Skew toward gating.
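The gating criteria above can be expressed as a small policy function. This is a minimal sketch, not part of any Ujex API: the `Action` fields, the `needs_approval` name, and the dollar threshold are all hypothetical (the article leaves the amount as "$X").

```python
from dataclasses import dataclass

# Hypothetical threshold; the article does not specify a dollar amount.
COST_THRESHOLD_USD = 25.0


@dataclass(frozen=True)
class Action:
    """Illustrative description of something the agent wants to do."""
    name: str
    irreversible: bool = False
    estimated_cost_usd: float = 0.0
    in_sandbox: bool = True
    reaches_external_human: bool = False


def needs_approval(action: Action, cost_threshold_usd: float = COST_THRESHOLD_USD) -> bool:
    """Return True if any one of the four gating criteria applies."""
    return (
        action.irreversible
        or action.estimated_cost_usd > cost_threshold_usd
        or not action.in_sandbox
        or action.reaches_external_human
    )
```

Because the criteria are OR-ed together, an action that is uncertain on any axis ends up gated, which matches the advice to skew toward gating.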

The flow

1. Agent calls ask(prompt, detail, ttl), which writes to approvals/{id} with status pending
2. Cloud Function fires FCM push to all registered approval channels
3. Phone receives push (200–800ms typical)
4. Long-press notification → quick action sheet → Approve / Deny
5. Biometric (Face ID / Touch ID) confirms the approve gesture
6. App writes status: approved to Firestore
7. Agent's snapshot listener fires (~50ms); agent resumes

End-to-end: 1.5–6s when the phone is in hand. TTL of 300s typical; agent gets a denial-equivalent if no decision.
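To make the record lifecycle concrete, here is a minimal in-memory sketch of the flow above. It assumes a plain dict in place of Firestore and leaves out the push and biometric steps entirely; the field names (`created_at`, `expires_at`) and the function names are illustrative, not the actual Ujex schema.

```python
import time
import uuid
from typing import Dict, Optional

# In-memory stand-in for the approvals/{id} collection (assumption: the real
# store is Firestore; a dict keeps this sketch runnable).
APPROVALS: Dict[str, dict] = {}


def ask(prompt: str, detail: str, ttl_sec: int = 300, now: Optional[float] = None) -> str:
    """Agent side: create a pending approval record and return its ID."""
    now = time.time() if now is None else now
    approval_id = uuid.uuid4().hex
    APPROVALS[approval_id] = {
        "prompt": prompt,
        "detail": detail,
        "status": "pending",
        "created_at": now,
        "expires_at": now + ttl_sec,
    }
    return approval_id


def decide(approval_id: str, decision: str) -> None:
    """App side: record the human's decision after the biometric check."""
    if decision not in ("approved", "denied"):
        raise ValueError(f"unknown decision: {decision}")
    APPROVALS[approval_id]["status"] = decision


def status(approval_id: str) -> str:
    """Agent side: the value the agent reads when its listener fires."""
    return APPROVALS[approval_id]["status"]
```

In a real deployment the write in `decide` would be what triggers the agent's snapshot listener; here the agent would simply call `status` again.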

Why mobile, not Slack/email

Slack is asynchronous by convention: people answer when they next look. Email is slower still. SMS is fine but lacks the "long-press for action" affordance. Phone push with haptic feedback arrives in under a second and can be gated on biometrics. The phone is also where the user is: at their desk, on the bus, in a meeting. Slack is where the user might check in 20 minutes.

Why biometric on the approve gesture

The push notification can be tapped without auth (just shows the prompt). The approve action requires biometric. Reasoning:

The push payload is non-sensitive — just a hint
The approve write is a privileged action — Face ID / Touch ID
If the phone is unattended on a desk, a passerby can read the prompt but can't approve

Why TTL'd

An agent that waits 24 hours for approval is broken. Default TTL: 300 seconds. After that, the approval transitions to timeout; the agent gets a denial-equivalent. The agent decides what to do — most often: log + retry later, or escalate to a different action.
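The TTL behaviour can be sketched as a bounded wait. This example assumes polling as a stand-in for the snapshot listener described earlier, and every name in it (`wait_for_decision`, `get_status`, the injectable `clock` and `sleep`) is hypothetical; it only illustrates that "no decision before the deadline" is reported as `timeout` rather than blocking forever.

```python
import time
from typing import Callable


def wait_for_decision(
    get_status: Callable[[], str],
    ttl_sec: float = 300.0,
    poll_interval_sec: float = 0.5,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Return the decided status, or 'timeout' once ttl_sec has elapsed."""
    deadline = clock() + ttl_sec
    while True:
        current = get_status()
        if current != "pending":
            return current  # 'approved' or 'denied'
        if clock() >= deadline:
            return "timeout"  # the agent treats this as a denial-equivalent
        sleep(poll_interval_sec)
```

The clock and sleep functions are injectable so the timeout path can be tested without actually waiting five minutes.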

Multi-channel fanout

One owner can register multiple approval channels (phone, watch, SMS fallback, web). FCM push fans out to all of them. First-responder wins; once decided, subsequent taps see "already decided." Useful for teams with rotating on-call.
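First-responder-wins is essentially a compare-and-set on the status field. The sketch below uses an in-process lock to show the idea; in a Firestore-backed system the same check would presumably run inside a transaction. The `ApprovalRecord` class and its fields are hypothetical.

```python
import threading
from typing import Optional


class ApprovalRecord:
    """Illustrative approval record with a first-write-wins decide()."""

    def __init__(self) -> None:
        self.status = "pending"
        self.decided_by: Optional[str] = None
        self._lock = threading.Lock()

    def decide(self, decision: str, device_id: str) -> bool:
        """Apply the decision only if still pending; True means this call won."""
        with self._lock:
            if self.status != "pending":
                return False  # the caller should show "already decided"
            self.status = decision
            self.decided_by = device_id
            return True
```

Whichever device gets there first fixes the outcome; later taps get `False` back and leave the record unchanged.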

Code (agent side)

import logging
import os

from ujex_mobile import Mobile

log = logging.getLogger(__name__)

m = Mobile(api_key=os.environ['UJEX_API_KEY'], agent_id='abc')

approval = m.ask(
    prompt='Push to main?',
    detail='Will deploy commit 4f3a91 (changes auth flow)',
    ttl_sec=300,
)

result = approval.wait()  # blocks until decided or timeout

if result['status'] == 'approved':
    git_push()  # your own deploy helper
elif result['status'] == 'timeout':
    log.warning('No human reached; deferring push')
else:
    log.warning('Push denied; will surface in summary')

Code (registering an approval channel)

# Phone app, one-time setup
from ujex_client import channels
channels.register(
    uid='your-uid',
    kind='fcm',
    token=fcm_device_token,
)

Code (decide endpoint, app side)

# In the iOS / Android app's notification action handler (Python-style pseudocode)
async def on_approve_tapped(approval_id):
    auth = await biometric_auth()  # Face ID / Touch ID
    if not auth.success:
        return  # nothing is written; the approval stays pending
    await ujex.client.approvals.decide(
        id=approval_id,
        decision='approved',
        device_id=current_device_id(),
    )

Audit trail

Every step lands in the hash-chained audit log: ask, push, decide, timeout. Reconstructing "who approved what when" is a query, not a guess.
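For readers unfamiliar with hash chaining, here is a minimal sketch of the general technique: each entry stores the hash of the previous entry, so editing any past event invalidates everything after it. This is not the Ujex log format; the entry layout, the all-zero genesis value, and the function names are assumptions made for illustration.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed starting value for an empty chain


def _entry_hash(prev_hash: str, event: dict) -> str:
    """Hash the previous entry's hash together with a canonical event encoding."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_event(log: list, event: dict) -> dict:
    """Append an event, linking it to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    entry = {"event": event, "prev_hash": prev_hash, "hash": _entry_hash(prev_hash, event)}
    log.append(entry)
    return entry


def verify_chain(log: list) -> bool:
    """Recompute every link; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS_HASH
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _entry_hash(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

With a structure like this, "who approved what when" is a scan over the events, and `verify_chain` tells you whether the history you are reading has been altered.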

Idempotency

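The agent might re-call ask after a network blip. A deterministic ID (hash of action + args + the timestamp of the original request, captured once and reused on every retry) prevents duplicate approval requests: a retry produces the same ID, and the Cloud Function dedupes on conflict. A timestamp taken fresh on each attempt would change the hash and defeat the dedupe.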
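One way to derive such an ID is sketched below, assuming the timestamp is captured once when the action is first requested and passed unchanged on every retry. The function name and the choice of SHA-256 over canonical JSON are assumptions, not the documented Ujex scheme.

```python
import hashlib
import json


def approval_id(action: str, args: dict, requested_at: int) -> str:
    """Derive a stable ID: identical inputs always yield the identical ID."""
    canonical = json.dumps(
        {"action": action, "args": args, "requested_at": requested_at},
        sort_keys=True,            # key order must not affect the hash
        separators=(",", ":"),     # no whitespace variation
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Sorting keys and fixing separators matters: without a canonical encoding, two semantically identical requests could serialize differently and produce different IDs.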

Backpressure

What happens if 50 agents ask for approval at once? FCM batches; the human gets 50 notifications. Practical answer: rate-limit at the agent level (no single agent should ask more than 1/min on average), and give each notification a clear identifier (agent name + action) so triage is feasible.
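The per-agent rate limit can be sketched as a token bucket: an average of one ask per minute with a small burst allowance. The class name, the burst size, and the injectable clock are illustrative assumptions.

```python
import time
from typing import Callable


class AskRateLimiter:
    """Token bucket: averages rate_per_min asks, allowing short bursts."""

    def __init__(
        self,
        rate_per_min: float = 1.0,
        burst: int = 3,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._sec_per_token = 60.0 / rate_per_min
        self._capacity = float(burst)
        self._tokens = float(burst)
        self._clock = clock
        self._last = clock()

    def try_acquire(self) -> bool:
        """Return True if the agent may send an approval request now."""
        now = self._clock()
        refill = (now - self._last) / self._sec_per_token
        self._tokens = min(self._capacity, self._tokens + refill)
        self._last = now
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False
```

An agent would call `try_acquire()` before `ask`; when it returns `False`, the agent can queue the request or fold several pending actions into one prompt.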

Failure modes

Mode	Behavior
No registered channel	`ask` returns `status: no_channel` immediately
FCM push undelivered	App polls Firestore on launch; user sees pending approvals
Phone offline	FCM can queue the push for up to ~28 days, but the agent times out per the approval TTL (300s by default) long before that
App crash mid-approve	Decision write is atomic; either approved or pending
Multiple devices simultaneously approve	First write wins; second sees "already decided"

Compare to alternatives

	Slack approval bot	Email approval link	Mobile push
Latency	5–600s	30s–24h	1.5–6s
Biometric on action	✗	✗	✓
Distinct haptic alert	✗	✗	✓
Works without internet	✗	✗	Push queued by FCM
Multi-channel fanout	One channel	One inbox	Many devices

What we ship

This is exactly the Ujex Mobile subsystem. mobile.ask + Cloud Function + FCM + native iOS/Android approval flow + audit log. Same code Celistra uses for permission approvals on supervised agents.

FAQ

Can I implement this without Ujex?

Yes — FCM is free, the Firestore schema is straightforward, and a basic implementation is ~500 LoC. Ujex pre-builds it and wires it to the rest of the agent stack.

Does this require iOS/Android dev experience?

If you self-build, yes. With Ujex you use the published apps (or fork them — Apache-2.0).

What about Telegram or WhatsApp instead of native apps?

Possible — register a Telegram bot as a channel kind. Latency is higher (1–10s); biometric isn't there. We support it as a fallback.

Is the prompt content encrypted in transit?

Yes. The FCM payload only has the approval ID; the full prompt is fetched from Firestore over an authenticated TLS connection on app open.
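A push message that carries only the approval ID might be built as below. The dict follows the general shape of an FCM HTTP v1 message (`token`, `data`, `android.ttl`), but the helper name and the `approval_id` data key are hypothetical, and setting the push lifetime equal to the approval TTL is an assumption of this sketch rather than something the article specifies.

```python
def build_push_message(device_token: str, approval_id: str, ttl_sec: int = 300) -> dict:
    """Build a data-only push body that never includes the prompt text."""
    return {
        "message": {
            "token": device_token,
            # Only the ID travels through the push service; the app fetches
            # the prompt and detail from the datastore after it opens.
            "data": {"approval_id": approval_id},
            # Assumption: let the queued push expire with the approval, so a
            # phone that comes back online later is not shown a stale request.
            "android": {"ttl": f"{ttl_sec}s"},
        }
    }
```

Keeping the prompt out of the payload means a notification preview on a locked screen reveals nothing beyond the fact that an approval is waiting.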