XR — the AI agent you can actually trust

Why XR

Five things no other agent ships together.

Every popular agent bolts security on later. In XR it is the architecture.

💰

Cost Governor

A hard per-task spend/token ceiling enforced in code. It pauses and asks before it can breach your budget — no silent $8 burns.
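As a rough illustration of what a code-enforced ceiling can look like, here is a minimal sketch. The `CostGovernor` class, its method names, and the dollar figures are assumptions made for this example, not XR's actual internals. The idea is to check the projected total before each model call, so the task pauses instead of overshooting.

```typescript
// Hypothetical sketch of a per-task budget ceiling; names are illustrative, not XR's API.
type Decision = "proceed" | "pause";

class CostGovernor {
  private spentUsd = 0;
  private spentTokens = 0;
  private readonly maxUsd: number;
  private readonly maxTokens: number;

  constructor(maxUsd: number, maxTokens: number) {
    this.maxUsd = maxUsd;
    this.maxTokens = maxTokens;
  }

  // Compare the projected totals against the ceiling before the call is made,
  // so the budget cannot be crossed silently.
  check(estimatedUsd: number, estimatedTokens: number): Decision {
    const overUsd = this.spentUsd + estimatedUsd > this.maxUsd;
    const overTokens = this.spentTokens + estimatedTokens > this.maxTokens;
    return overUsd || overTokens ? "pause" : "proceed";
  }

  // Record actual usage once a call has completed.
  record(usd: number, tokens: number): void {
    this.spentUsd += usd;
    this.spentTokens += tokens;
  }
}

// Example: a 10-cent, 50k-token ceiling (made-up call estimates).
const governor = new CostGovernor(0.10, 50_000);
const first = governor.check(0.04, 1_000); // within budget
governor.record(0.04, 1_000);
const second = governor.check(0.07, 1_000); // 0.04 + 0.07 would exceed 0.10
```

In this sketch a "pause" result is where the agent would stop and ask the user before spending more.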

🧠

Non-Regressive Skills

Verified runs are frozen as immutable baselines. Any update that breaks a past win is auto-rolled-back. The agent can't forget what worked.
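The rollback rule can be sketched in a few lines. This is an assumption-laden illustration: `Skill`, `Baseline`, and `applyUpdate` are hypothetical names, and a real baseline would record far more than an input/output pair. A candidate update is accepted only if it still reproduces every frozen baseline; otherwise the current version stays in place.

```typescript
// Hypothetical sketch; Skill, Baseline and applyUpdate are illustrative names.
type Skill = (input: string) => string;

interface Baseline {
  readonly input: string;
  readonly expected: string;
}

// Object.freeze makes the recorded baselines immutable at runtime.
const baselines: readonly Baseline[] = Object.freeze([
  Object.freeze({ input: "hello", expected: "HELLO" }),
  Object.freeze({ input: "xr", expected: "XR" }),
]);

function passesAll(skill: Skill, cases: readonly Baseline[]): boolean {
  return cases.every((c) => skill(c.input) === c.expected);
}

// Accept the candidate only if every past win still passes;
// otherwise keep (roll back to) the current version.
function applyUpdate(current: Skill, candidate: Skill, cases: readonly Baseline[]): Skill {
  return passesAll(candidate, cases) ? candidate : current;
}

const v1: Skill = (s) => s.toUpperCase();
const broken: Skill = (s) => s.toLowerCase(); // regresses both baselines
const improved: Skill = (s) => s.trim().toUpperCase(); // still passes both

const afterBroken = applyUpdate(v1, broken, baselines); // stays on v1
const afterImproved = applyUpdate(v1, improved, baselines); // adopts improved
```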

🖥️

Local-Model Reliability

Grammar-forced tool-calls make even a 3B local model emit valid output, with deterministic auto-repair as a backstop.
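The grammar constraint itself lives in the inference engine, so it is not shown here. The backstop, however, can be illustrated with a small deterministic repair pass. The sketch below is an assumption about what such a pass might do (the `repairToolCall` function, the tool names in `KNOWN_TOOLS`, and the repair rules are all hypothetical): trim surrounding chatter, drop trailing commas, then validate strictly and reject anything that still does not fit.

```typescript
// Hypothetical sketch of a deterministic repair pass; not XR's actual repair rules.
interface ToolCall {
  tool: string;
  args: Record<string, unknown>;
}

const KNOWN_TOOLS = new Set(["read_file", "write_file"]);

function repairToolCall(raw: string): ToolCall | null {
  // 1. Keep only the outermost {...} span, dropping chatter or code fences around it.
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || end <= start) return null;
  let candidate = raw.slice(start, end + 1);

  // 2. Remove trailing commas before a closing brace or bracket.
  //    Naive on purpose: it would also touch ",}" inside a string value.
  candidate = candidate.replace(/,\s*([}\]])/g, "$1");

  // 3. Parse and validate the shape; anything else is rejected, never guessed.
  try {
    const parsed: unknown = JSON.parse(candidate);
    if (typeof parsed !== "object" || parsed === null) return null;
    const { tool, args } = parsed as { tool?: unknown; args?: unknown };
    if (typeof tool !== "string" || !KNOWN_TOOLS.has(tool)) return null;
    if (typeof args !== "object" || args === null || Array.isArray(args)) return null;
    return { tool, args: args as Record<string, unknown> };
  } catch {
    return null;
  }
}

const messy = 'Sure! ```json\n{"tool": "read_file", "args": {"path": "README.md",},}\n```';
const repaired = repairToolCall(messy); // parsed after cleanup
const rejected = repairToolCall('{"tool": "rm_rf", "args": {}}'); // unknown tool
```

The same input always yields the same result, which is what makes the repair step safe to run unattended.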

🔒

Provable Security

xr test --attacks runs an injection corpus and prints a publishable block-rate. Plus an egress allow-list, approval gates, and a SHA-256 hash-chained audit log you can verify.

🛡️

Self-Healing & BYOK

Updates auto-rollback if a self-test fails. Keys live in your environment or OS keychain. XR ships none and stores none, and with a local model it costs nothing to run.

How it works

Goal in. Result out. Safely.

XR runs the universal agent loop — but every step passes through a deterministic security spine.

Observe

Loads your trusted task + relevant code from the local RAG index. Untrusted data never touches the planner.

Think

Picks a model (BYOK / local), grammar-forces a valid tool-call, checks it against your budget.

Gate

Least-privilege + egress allow-list + approval for risky actions (CLI, phone button, or voice).

Act & Log

Runs the tool, records it in the tamper-evident hash chain, loops until done.

xr — agent run

# one task, fully guarded — capped at 10 cents
$ xr --budget 0.10 "add install steps to the README"
  ▸ think   planning · qwen2.5 (local) · 💰 0.4k tok / $0.10 cap
  ▸ tool    ⚙ read_file(README.md)              ✓
  ▸ tool    ⚙ write_file(README.md)             ⏸ needs approval
            [a]pprove  [r]eview  [d]eny
  ▸ act     ⚙ write_file(README.md)             ✓ applied
  ✓ done in 3 steps · 💰 1.1k tok ≈ $0.0009 · audit #9cf3e2a880


Cheatsheet

Every command.

One brain, many ways to drive it — terminal, dashboard, phone, voice.

xr "task"	Run a task (Agent mode)
xr --mode plan|ask "task"	Read-only modes (least-privilege)
xr --budget 0.50 "task"	Hard USD spend ceiling
xr --max-tokens 50000 "task"	Hard token ceiling
xr --dry-run "task"	Simulate: write nothing, run nothing
xr --provider groq --model …	Use any BYOK provider
xr serve	📊 Local dashboard (127.0.0.1)
xr telegram	📱 Secure phone remote (✅/❌ buttons)
xr voice	🎙️ Local voice stack (Whisper/Kokoro)
xr skills	📚 11 pre-built signed skills
xr index / xr memory	🧠 Local RAG + project memory
xr mcp	🔌 MCP tool ecosystem
xr cron "every mon 9am: audit"	⏰ Natural-language scheduler
xr test --attacks	🔒 Injection benchmark
xr verify-log	Verify tamper-evident audit chain
xr export	📄 Signed, shareable audit report
xr doctor	Full system health check
xr --help	Show all commands

Capability	XR	OpenClaw	Hermes	Claude Code
Hard spend ceiling (code-enforced)	✓	✗	✗	✗
Local-model reliability (GBNF)	✓	✗	✗	✗
Non-regressive skills	✓	✗	~	✗
Injection benchmark (runnable)	✓	✗	✗	✗
Tamper-evident audit log	✓	✗	✗	✗
BYOK + $0 to run	✓	~	~	✗
Egress allow-list (anti-exfil)	✓	✗	✗	~

Get started · runs on Linux · macOS · Windows

Up and running in 2 minutes.

Needs Bun. A local model via Ollama is free — or bring your own key.

install

# 1 — clone & install (needs Bun)
$ git clone https://github.com/ahmadrrrtx/xr
$ cd xr && bun install

# 2 — run a task on a local model ($0)
$ bun run src/index.ts "summarize and improve my README"
    # tip: alias xr="bun run $(pwd)/src/index.ts"  → then just `xr "task"`

# 3 — bring a cloud key (never stored by XR)
$ GROQ_API_KEY=… xr --provider groq "list files"

# …or one command, whole agent + dashboard
$ docker compose up

Security model — honest

Blast-radius reduction you can measure.

XR does not claim to be "unhackable" — prompt injection is unsolved industry-wide. It makes a successful attack nearly useless, and lets you prove it.

🚪

Egress allow-list

The agent can't reach a domain you didn't approve, which blocks most exfiltration paths, including requests to cloud metadata endpoints.
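A default-deny host check is the core of the idea. The sketch below is illustrative only: `isEgressAllowed` and the hostnames in `ALLOWED_HOSTS` are assumptions for the example, and a real enforcement layer would also need to cover redirects and DNS resolution, which this does not attempt.

```typescript
// Hypothetical sketch; the allow-list contents and function name are illustrative.
const ALLOWED_HOSTS = new Set(["api.groq.com", "github.com"]);

function isEgressAllowed(rawUrl: string): boolean {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // unparseable targets are denied
  }
  if (url.protocol !== "https:") return false;
  // Exact hostname match with default deny: IP literals such as the
  // 169.254.169.254 metadata address fail unless explicitly listed.
  return ALLOWED_HOSTS.has(url.hostname);
}

const okCall = isEgressAllowed("https://api.groq.com/v1/chat");
const metadata = isEgressAllowed("http://169.254.169.254/latest/meta-data/");
const lookalike = isEgressAllowed("https://github.com.evil.example/x");
```

Matching the full hostname rather than a prefix or substring is what keeps lookalike domains out.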

✋

Approval gates

Write, delete, shell, and send actions need explicit approval and fail closed on timeout. Dangerous shell commands are blocked before the model is even trusted.
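"Fail-closed on timeout" means that silence counts as a denial. A minimal sketch of that rule follows, written as a pure function with an injected clock so it is easy to test. The `ApprovalRequest` shape, `resolveApproval`, and the 30-second timeout are assumptions for illustration, not XR's actual types or defaults.

```typescript
// Hypothetical sketch; types, names and the timeout value are illustrative.
type Verdict = "approve" | "deny";
type Outcome = Verdict | "waiting";

const RISKY_ACTIONS = new Set(["write", "delete", "shell", "send"]);

interface ApprovalRequest {
  readonly action: string;
  readonly requestedAt: number; // ms timestamp
  readonly response?: { readonly verdict: Verdict; readonly at: number };
}

function resolveApproval(req: ApprovalRequest, now: number, timeoutMs: number): Outcome {
  // Assumption: actions outside the risky set need no approval.
  if (!RISKY_ACTIONS.has(req.action)) return "approve";
  const deadline = req.requestedAt + timeoutMs;
  // Only a response that arrived before the deadline counts.
  if (req.response && req.response.at < deadline) return req.response.verdict;
  // No timely answer: keep waiting until the deadline, then deny.
  return now >= deadline ? "deny" : "waiting";
}

const pending = resolveApproval({ action: "write", requestedAt: 0 }, 5_000, 30_000);
const timedOut = resolveApproval({ action: "write", requestedAt: 0 }, 30_000, 30_000);
const approved = resolveApproval(
  { action: "write", requestedAt: 0, response: { verdict: "approve", at: 4_000 } },
  5_000,
  30_000,
);
const late = resolveApproval(
  { action: "write", requestedAt: 0, response: { verdict: "approve", at: 31_000 } },
  32_000,
  30_000,
);
```

Note that a late "approve" still resolves to "deny" in this sketch: there is no path from a timeout to execution.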

🪪

Tamper-evident audit

SHA-256 hash-chained log (git's trick). xr verify-log detects any change — $0, offline, private. No blockchain needed.
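For readers unfamiliar with hash chaining, here is a minimal sketch of the technique using Node's built-in `node:crypto` module (also available in Bun). The `Entry` shape, the all-zero genesis value, and the function names are assumptions for this example and do not describe XR's on-disk format. Each entry's hash covers the previous entry's hash, so changing any earlier event breaks every link after it.

```typescript
import { createHash } from "node:crypto";

// Hypothetical sketch; Entry, GENESIS, append and verify are illustrative names.
interface Entry {
  readonly event: string;
  readonly prevHash: string;
  readonly hash: string;
}

const GENESIS = "0".repeat(64);

// Hash of (previous hash + newline + event text), hex-encoded.
function digest(prevHash: string, event: string): string {
  return createHash("sha256").update(prevHash).update("\n").update(event).digest("hex");
}

function append(log: readonly Entry[], event: string): Entry[] {
  const last = log[log.length - 1];
  const prevHash = last ? last.hash : GENESIS;
  return [...log, { event, prevHash, hash: digest(prevHash, event) }];
}

// Recompute every link from the start; any edited entry breaks the chain.
function verify(log: readonly Entry[]): boolean {
  let prev = GENESIS;
  for (const e of log) {
    if (e.prevHash !== prev || e.hash !== digest(prev, e.event)) return false;
    prev = e.hash;
  }
  return true;
}

let log: Entry[] = [];
log = append(log, "read_file(README.md)");
log = append(log, "write_file(README.md)");
const intact = verify(log);

// Tamper with the first event while leaving its stored hash untouched.
const tampered = log.map((e, i) => (i === 0 ? { ...e, event: "read_file(/etc/passwd)" } : e));
const detected = !verify(tampered);
```

A chain on its own cannot reveal that entries were cut off the end, so a scheme like this usually also records the latest hash somewhere separate.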

🧪

Provability

xr test --attacks publishes a reproducible block-rate. We measure security instead of marketing it.
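The block-rate itself is simple arithmetic: attacks blocked divided by attacks attempted. The sketch below shows that calculation under stated assumptions; `AttackResult`, `blockRate`, and the four sample entries are made up for illustration and are not real benchmark results.

```typescript
// Hypothetical sketch; AttackResult and blockRate are illustrative names.
interface AttackResult {
  readonly id: string;
  readonly blocked: boolean;
}

function blockRate(results: readonly AttackResult[]): number {
  if (results.length === 0) throw new Error("empty corpus: block-rate is undefined");
  const blocked = results.filter((r) => r.blocked).length;
  return blocked / results.length;
}

// Made-up sample data, not a measurement of XR.
const sample: AttackResult[] = [
  { id: "inj-001", blocked: true },
  { id: "inj-002", blocked: true },
  { id: "inj-003", blocked: false },
  { id: "inj-004", blocked: true },
];
const rate = blockRate(sample); // 3 of 4 blocked
```

Reporting the corpus size next to the rate is what makes a figure like this reproducible by someone else.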

FAQ

Questions, answered.

Does it run on Windows, macOS, and Linux?

Yes — all three. XR is built on Bun (cross-platform) with SQLite built in. There's also a single-container Docker option that runs anywhere.

Is it really free? What does it cost to run?

XR itself is MIT and ships no API keys. With a local model (Ollama) it costs $0. With a cloud key, you pay only your provider's usage — and the Cost Governor caps it.

How do I use it — is it one command like OpenClaw/Hermes?

Yes. Today: git clone + bun install, then bun run src/index.ts "task" (add a shell alias xr for one-word use). An npm package @rrrtx/xr for bun add -g is coming soon. Every feature is an xr <command> — serve, telegram, voice, etc.

Do my keys or code ever leave my machine?

Your keys don't: they live in your OS keychain / environment, and XR ships and stores none. The dashboard binds to 127.0.0.1 only. With a local model, nothing leaves your machine at all; with a cloud key, prompts go to the provider you chose.

Is XR "unhackable"?

No — and we'll never claim that. Prompt injection is unsolved industry-wide. XR minimizes blast radius (egress allow-list, approval gates, least-privilege) and lets you measure it with xr test --attacks.

Which models / providers does it support?

Any OpenAI-compatible provider — Ollama (local), Groq, OpenAI, OpenRouter, Together, DeepSeek, and more. BYOK: bring whichever key you want.
