Tutorial

Human-in-the-Loop AI Agents: A Production Setup Guide

Akshay Sarode
Direct answer

Wrap every irreversible action in an approval primitive. Use FCM push to a phone (sub-second delivery). Long-press notification → biometric → approve. TTL'd; first responder wins; multi-channel fanout. Audit-logged. The whole loop takes 1.5–6 seconds when the human is around; it defaults safely if they're not.

The autonomous agent that does everything without checking with you sounds great until it pushes to main, sends an email to the wrong person, or runs up a $4,000 API bill. The fix is human-in-the-loop on the things that matter — and only those things.

This is the production setup. Not "a tutorial about a feature" — the design choices that hold at scale.

What needs approval (and what doesn't)

Approval gates add friction. Use them on actions that are:

Don't gate:

The asymmetry: false positives (gating something that didn't need it) cost developer time. False negatives (auto-approving something dangerous) cost real money or trust. Skew toward gating.
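One way to encode "skew toward gating" is a policy function that gates by default and only lets clearly safe actions through. This is a minimal sketch under assumptions: the `Action` fields, the cost threshold, and the `needs_approval` name are hypothetical illustrations, not part of any Ujex API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """Hypothetical description of something an agent wants to do."""
    name: str
    reversible: bool            # can it be undone cheaply?
    external: bool              # does it leave the sandbox (email, deploy, payment)?
    est_cost_usd: float = 0.0   # estimated spend, if any


# Assumed threshold for illustration; tune per team.
COST_LIMIT_USD = 50.0


def needs_approval(action: Action) -> bool:
    """Gate unless the action is clearly safe on every axis."""
    clearly_safe = (
        action.reversible
        and not action.external
        and action.est_cost_usd < COST_LIMIT_USD
    )
    return not clearly_safe
```

Because the function returns True whenever any check is in doubt, a misclassified action costs a few seconds of developer time rather than money or trust.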

The flow

  1. Agent calls ask(prompt, detail, ttl) — writes to approvals/{id} with status pending
  2. Cloud Function fires FCM push to all registered approval channels
  3. Phone receives push (200–800ms typical)
  4. Long-press notification → quick action sheet → Approve / Deny
  5. Biometric (Face ID / Touch ID) confirms the approve gesture
  6. App writes status: approved to Firestore
  7. Agent's snapshot listener fires (~50ms); agent resumes

End-to-end: 1.5–6s when the phone is in hand. TTL of 300s typical; agent gets a denial-equivalent if no decision.
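The lifecycle above boils down to a small state machine: pending, then exactly one of approved, denied, or timeout. The sketch below models it in memory to make the transitions concrete. The `Approval` class and its method names are assumptions for illustration; in the real flow the state lives in Firestore.

```python
import time
from dataclasses import dataclass, field
from typing import Optional

PENDING, APPROVED, DENIED, TIMEOUT = "pending", "approved", "denied", "timeout"


@dataclass
class Approval:
    id: str
    prompt: str
    ttl_sec: float
    created_at: float = field(default_factory=time.monotonic)
    status: str = PENDING

    def current_status(self, now: Optional[float] = None) -> str:
        """Return the status, expiring a pending request whose TTL has passed."""
        now = time.monotonic() if now is None else now
        if self.status == PENDING and now - self.created_at >= self.ttl_sec:
            self.status = TIMEOUT
        return self.status

    def decide(self, decision: str, now: Optional[float] = None) -> str:
        """Apply a human decision; it is ignored once decided or expired."""
        if decision not in (APPROVED, DENIED):
            raise ValueError(f"unknown decision: {decision!r}")
        if self.current_status(now) == PENDING:
            self.status = decision
        return self.status


def agent_may_proceed(approval: Approval, now: Optional[float] = None) -> bool:
    """Only an explicit approval unblocks the agent; timeout counts as denial."""
    return approval.current_status(now) == APPROVED
```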

Why mobile, not Slack/email

Slack is async by reputation. Email is slower. SMS is fine but lacks the "long-press for action" affordance. Phone push with haptic feedback is sub-second and biometric-authable. The phone is also where the user is — at their desk, on the bus, in a meeting. Slack is where the user might check in 20 minutes.

Why biometric on the approve gesture

The push notification can be tapped without auth (just shows the prompt). The approve action requires biometric. Reasoning:

Why TTL'd

An agent that waits 24 hours for approval is broken. Default TTL: 300 seconds. After that, the approval transitions to timeout; the agent gets a denial-equivalent. The agent decides what to do — most often: log + retry later, or escalate to a different action.
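What the agent does after a timeout is a policy choice. The sketch below shows one option: retry a bounded number of times, never retry an explicit denial, and escalate when nobody answers. The `run_gated` helper and its return values are hypothetical; `ask` stands in for any callable that returns the final status string.

```python
from typing import Callable


def run_gated(
    ask: Callable[[], str],
    act: Callable[[], None],
    max_attempts: int = 3,
) -> str:
    """Run `act` only after an explicit approval.

    `ask` is assumed to return "approved", "timeout", or a denial status.
    A real agent would wait between attempts; that delay is omitted here.
    """
    for _ in range(max_attempts):
        status = ask()
        if status == "approved":
            act()
            return "done"
        if status != "timeout":
            # An explicit "no" is final; asking again would be nagging.
            return "denied"
    # Every attempt timed out: hand off instead of waiting forever.
    return "escalate"
```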

Multi-channel fanout

One owner can register multiple approval channels (phone, watch, SMS fallback, web). FCM push fans out to all of them. First-responder wins; once decided, subsequent taps see "already decided." Useful for teams with rotating on-call.
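"First responder wins" is a compare-and-set: a decision is written only if the approval is still pending. The in-memory sketch below uses a lock to show the rule. The `DecisionSlot` class is hypothetical; against Firestore the same check would presumably run inside a transaction.

```python
import threading
from typing import Optional


class DecisionSlot:
    """Holds one approval's outcome; only the first decision sticks."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.status = "pending"
        self.decided_by: Optional[str] = None

    def decide(self, decision: str, device_id: str) -> str:
        # The lock makes check-then-write atomic across devices.
        with self._lock:
            if self.status != "pending":
                return "already decided"
            self.status = decision
            self.decided_by = device_id
            return decision
```

Recording `decided_by` alongside the status keeps the answer to "which device approved this" available for the audit log.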

Code (agent side)

import logging
import os

from ujex_mobile import Mobile

log = logging.getLogger(__name__)
m = Mobile(api_key=os.environ['UJEX_API_KEY'], agent_id='abc')

# Ask the human; this writes the pending approval that triggers the push.
approval = m.ask(
    prompt='Push to main?',
    detail='Will deploy commit 4f3a91 (changes auth flow)',
    ttl_sec=300,
)

result = approval.wait()  # blocks until decided or timeout

if result['status'] == 'approved':
    git_push()  # your own deploy helper (not shown)
elif result['status'] == 'timeout':
    log.warning('No human reached; deferring push')
else:
    log.warning('Push denied; will surface in summary')

Code (registering an approval channel)

# Phone app, one-time setup
from ujex_client import channels
channels.register(
    uid='your-uid',
    kind='fcm',
    token=fcm_device_token,
)

Code (decide endpoint, app side)

# In the iOS / Android app's notification action handler (Python-style pseudocode)
async def on_approve_tapped(approval_id):
    auth = await biometric_auth()  # Face ID / Touch ID
    if not auth.success:
        return
    await ujex.client.approvals.decide(
        id=approval_id,
        decision='approved',
        device_id=current_device_id(),
    )

Audit trail

Every step lands in the hash-chained audit log: ask, push, decide, timeout. Reconstructing "who approved what when" is a query, not a guess.
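In a hash-chained log, each entry's hash covers the previous entry's hash, so editing any past record invalidates everything after it. Here is a minimal sketch of the idea using SHA-256. The entry layout and the function names are assumptions, not the actual Ujex log format.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, event: Dict[str, Any]) -> str:
    # Canonical JSON so the same event always hashes to the same value.
    body = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_event(log: List[Dict[str, Any]], event: Dict[str, Any]) -> Dict[str, Any]:
    """Append an event, linking it to the hash of the previous entry."""
    prev = log[-1]["hash"] if log else GENESIS
    entry = {"event": event, "prev": prev, "hash": _digest(prev, event)}
    log.append(entry)
    return entry


def verify(log: List[Dict[str, Any]]) -> bool:
    """Recompute every link; any edited entry breaks the chain."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev or entry["hash"] != _digest(prev, entry["event"]):
            return False
        prev = entry["hash"]
    return True
```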

Idempotency

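The agent might re-call ask after a network blip. A deterministic ID (hash of action + args + timestamp) prevents duplicate approval requests, provided the timestamp is captured once at the first attempt or rounded to a time window; a fresh timestamp on each retry would produce a different ID every time. The Cloud Function dedupes on conflict.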
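A sketch of one way to derive such an ID, assuming the timestamp is rounded down to a fixed window so that a retry shortly after the first attempt hashes to the same value. The `approval_id` function, the 300-second window, and the 32-character truncation are illustrative choices.

```python
import hashlib
import json
from typing import Any, Dict


def approval_id(
    action: str,
    args: Dict[str, Any],
    now_sec: float,
    window_sec: int = 300,
) -> str:
    """Derive a stable ID from the action, its args, and a time bucket."""
    bucket = int(now_sec // window_sec)
    payload = json.dumps(
        {"action": action, "args": args, "bucket": bucket},
        sort_keys=True,            # key order must not change the hash
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:32]
```

One caveat of bucketing: a retry that straddles a window boundary gets a new ID. Capturing the timestamp once and reusing it across retries avoids that, at the cost of keeping state in the agent.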

Backpressure

What happens if 50 agents ask for approval at once? FCM batches; the human gets 50 notifications. Practical answer: rate-limit at the agent level (no single agent should ask more than 1/min on average), and give each notification a clear identifier (agent name + action) so triage is feasible.
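The per-agent limit can be sketched as a token bucket: each agent earns one ask per minute on average, with a small burst allowance. The `AskLimiter` class, the burst size of 3, and the explicit `now` parameter are assumptions for illustration.

```python
from typing import Dict, Tuple


class AskLimiter:
    """Token bucket per agent: one ask per `refill_sec` on average."""

    def __init__(self, refill_sec: float = 60.0, burst: int = 3) -> None:
        self.refill_sec = refill_sec
        self.burst = burst
        self._state: Dict[str, Tuple[float, float]] = {}  # agent_id -> (tokens, last_ts)

    def allow(self, agent_id: str, now: float) -> bool:
        """Return True if this agent may send an approval request now."""
        tokens, last = self._state.get(agent_id, (float(self.burst), now))
        # Refill in proportion to elapsed time, capped at the burst size.
        tokens = min(float(self.burst), tokens + (now - last) / self.refill_sec)
        if tokens >= 1.0:
            self._state[agent_id] = (tokens - 1.0, now)
            return True
        self._state[agent_id] = (tokens, now)
        return False
```

An agent that is refused can queue the request or fold several pending actions into one prompt, which also keeps the notification list short enough to triage.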

Failure modes

Mode                                     Behavior
No registered channel                    ask returns status: no_channel immediately
FCM push undelivered                     App polls Firestore on launch; user sees pending approvals
Phone offline                            FCM queues the push (up to ~28 days); the agent still times out per the approval TTL
App crash mid-approve                    Decision write is atomic; either approved or pending
Multiple devices simultaneously approve  First write wins; second sees "already decided"

Compare to alternatives

                        Slack approval bot  Email approval link  Mobile push
Latency                 5–600s              30s–24h              1.5–6s
Biometric on action                                              Yes
Haptic distinguishable                                           Yes
Works without internet                                           Push queued by FCM
Multi-channel fanout    One channel         One inbox            Many devices

What we ship

This is exactly the Ujex Mobile subsystem. mobile.ask + Cloud Function + FCM + native iOS/Android approval flow + audit log. Same code Celistra uses for permission approvals on supervised agents.

FAQ

Can I implement this without Ujex?

Yes — FCM is free, the Firestore schema is straightforward, and a basic implementation is ~500 LoC. Ujex pre-builds it and wires it to the rest of the agent stack.

Does this require iOS/Android dev experience?

If you self-build, yes. With Ujex you use the published apps (or fork them — Apache-2.0).

What about Telegram or WhatsApp instead of native apps?

Possible — register a Telegram bot as a channel kind. Latency is higher (1–10s); biometric isn't there. We support it as a fallback.

Is the prompt content encrypted in transit?

The FCM payload carries only the approval ID; the app fetches the full prompt from Firestore over an authenticated TLS connection when it opens.
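As a rough sketch of what "only the approval ID" can look like, the function below builds a request body in the shape of the FCM HTTP v1 `messages:send` API, with the ID in the `data` map and an Android TTL matching the approval TTL. The `build_fcm_message` helper and the `approval_id` key name are assumptions; any visible notification text would need to stay generic to keep the prompt out of the payload.

```python
from typing import Any, Dict


def build_fcm_message(
    device_token: str,
    approval_id: str,
    ttl_sec: int = 300,
) -> Dict[str, Any]:
    """Build an FCM HTTP v1 message body that carries no prompt content."""
    return {
        "message": {
            "token": device_token,
            # FCM data values must be strings; only the opaque ID travels.
            "data": {"approval_id": approval_id},
            # Stop delivery attempts once the approval itself would have expired.
            "android": {"ttl": f"{ttl_sec}s"},
        }
    }
```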