Four weekends setting up OpenClaw, only twenty minutes with Jatayu
Thoughts on two personal AI agent setups — the dependency graph, the Docker plumbing, the memory architecture, and what the simpler one got architecturally right.
A month ago I decided I wanted a personal AI agent. Not a chatbot — something that could hold state across conversations, manage my schedule, search the web, and integrate with the messaging surfaces I already used. I tried two routes: OpenClaw, a self-hosted multi-component system, and Jatayu, a much smaller MCP-native agent. Both ended up working. The interesting part is why the simpler one works, not just that it does.
Scope and security first, because that decision shaped everything
Before I plugged anything in, I locked the blast radius down:
Dedicated Mac mini, separate user account, no admin rights for the agent process.
Backup ISP, not my primary connection.
Node and the runtime started at the lowest privilege that worked. Permissions added only when something concretely broke.
A separate calendar on a separate email — not my primary. Telegram, not iMessage. Search scoped to what the agent actually needs.
The agent doesn’t have my messages but uses a separate channel. That distinction matters: I’m not building defensive walls around sensitive data, I’m minimizing what the agent can access in the first place.
What OpenClaw actually is, under the hood
OpenClaw is not one thing. It’s a Node-based agent runtime plus a constellation of services it expects to be there:
The agent runtime itself (Node.js, npm-managed, runs as a TUI process you can attach to).
An LLM provider — OpenRouter in my case, with a daily spend limit set before I started tinkering, not after.
An embedding model for memory retrieval.
A search backend — SearXNG, running locally in Docker.
A messaging surface — Telegram bot in my case.
A memory layer — four core files that hold the agent’s persistent context.
Each one of those is its own integration cliff and learning curve. OpenRouter is the one that’s truly trivial. Everything else takes real work.
SearXNG: the Docker setup that ate a weekend
SearXNG is a metasearch aggregator. You run it locally so your agent has a search backend that isn’t tied to a paid API. The official deployment is a Docker container, and the canonical setup is roughly:
A `searxng/searxng` container exposing port 8080.
A `redis` sidecar for the rate limiter plugin.
A `settings.yml` mounted as a volume so your config persists across container restarts.
A `SEARXNG_SECRET` environment variable (this is non-optional; the container won’t behave without it).
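Wired together, that comes out to roughly this `docker-compose.yml`. This is a sketch, not the official compose file: image tags, mount paths, and the secret are yours to fill in, and the upstream SearXNG docs are the authority.

```yaml
# Sketch of the two-container SearXNG setup described above.
# Check the official SearXNG docker docs before relying on paths or tags.
services:
  searxng:
    image: searxng/searxng
    ports:
      - "8080:8080"
    volumes:
      # Persist your config across container restarts.
      - ./searxng/settings.yml:/etc/searxng/settings.yml
    environment:
      # Non-optional; generate a long random value.
      - SEARXNG_SECRET=${SEARXNG_SECRET}
    depends_on:
      - redis
  redis:
    image: redis:alpine
```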
This part is mostly cookbook. The part that isn’t cookbook is the 403 Forbidden problem.
The agent kept getting 403s on every search. I went looking in all the wrong places first — the limiter plugin, the bot-detection layer, my user-agent header, my engine choices, my Docker network config. None of those were the problem.
The actual fix is one line in settings.yml:
```yaml
search:
  formats:
    - html
    - json
```

SearXNG ships with `html` as the only enabled output format. Programmatic clients — including every agent framework I know of — request `format=json`. If `json` isn’t in the allowed formats list, SearXNG refuses the request with a 403. Not a rate limit, not bot detection, not a network issue. A format negotiation rejection that surfaces as the most misleading status code possible.
Add json to the list, restart the container, search works.
This is the kind of thing where I lost a couple of hours not to the fix but to the diagnosis. The error told me “forbidden.” The cause was “I didn’t ask for what you were willing to give me.” Worth internalizing as a general debugging lesson: a 403 from a service you control is almost never a permissions problem; it’s usually a config mismatch hiding behind a misleading status code.
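To make the failure mode concrete, here’s the kind of request a programmatic client sends to SearXNG. This is a hypothetical sketch, not code from any particular agent framework; the instance URL is a placeholder for wherever your container is listening.

```python
from urllib.parse import urlencode

# Placeholder for your local SearXNG instance.
SEARXNG_URL = "http://localhost:8080/search"

def build_search_url(query: str) -> str:
    """Build the kind of URL a programmatic client requests from SearXNG.

    The `format=json` parameter is exactly what triggers the 403 when
    `json` is missing from `search.formats` in settings.yml.
    """
    return f"{SEARXNG_URL}?{urlencode({'q': query, 'format': 'json'})}"
```

Every agent integration I looked at sends some variant of this; none of them fall back to scraping the HTML response, which is why the whole search path dies on that one config line.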
The npm side: a dependency graph with opinions
OpenClaw’s runtime is a Node project. That means an npm install that pulls a deep tree, a package-lock.json that you absolutely do not delete on a whim, and at least one or two native modules that compile against your local toolchain.
A few things worth knowing if you’re going in:
Pin your Node version. Use `.nvmrc` or Volta. Agent frameworks tend to be sensitive to Node minor versions because they often use newer language features (top-level await, `node:` imports, native fetch).
The runtime dependencies are not the whole story. Many agent frameworks also pull in optional peer dependencies for specific integrations (Telegram, Discord, Slack), and the failure mode when one is missing is usually a runtime error rather than an install-time error.
Lockfile drift is a real source of “it worked yesterday” pain. If you’re running this on a Mac mini that you SSH into, commit the lockfile and treat the deployed environment as immutable between updates.
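Concretely, pinning can live in the repo itself. The version range below is illustrative, not what OpenClaw actually requires — use whatever your framework documents.

```json
{
  "engines": {
    "node": ">=20.11 <21"
  }
}
```

Pair the `engines` field with a one-line `.nvmrc` (e.g. `20.11.0`) so that your shell, your SSH sessions, and any future you all resolve to the same runtime.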
None of this is exotic Node knowledge. It’s just the Node knowledge you actually need rather than the Node knowledge tutorials assume you have.
The four core memory files
This is the part of OpenClaw that took me longest to internalize, and it’s the part that makes the difference between an agent that feels like a tool and one that feels like a toy.
The agent’s persistent state lives in four files. (The exact names depend on which guide you follow — I leaned on the Velvetshark OpenClaw Memory Masterclass, supplemented by the Awesome Generative AI Guide’s OpenClaw mastery course.) Together, the files hold who you are, what your recurring patterns look like, and what the agent has learned about you over time.
Without these files set up properly, every conversation starts from zero. The agent has the LLM’s general knowledge but no continuity. With them, the agent can hold a coherent picture across days and weeks.
The files aren’t passive. The agent reads them at the start of a session and writes back to them as new information arrives. Which means: their structure determines what the agent can learn. A poorly structured memory file is worse than no memory file at all, because the agent will still try to use it and will produce confidently wrong answers grounded in your bad schema.
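As a purely hypothetical illustration of “structure determines what the agent can learn” — the file name and every field below are mine, not OpenClaw’s actual schema:

```yaml
# profile.yml — hypothetical file name and schema, for illustration only.
identity:
  timezone: America/Chicago     # stable facts, read at the start of every session
  messaging_surface: telegram
recurring:
  - name: weekly-review         # patterns the agent can anchor plans to
    cadence: sunday-evening
learned:
  - date: 2025-01-12            # append-only, dated observations, so stale
    note: prefers terse summaries   # entries can be aged out rather than trusted forever
```

The design point is the dated, append-only `learned` list: the agent can write to it without clobbering anything, and it can reason about how fresh an entry is instead of treating a six-month-old observation as current truth.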
The runtime is the part you see. The memory schema is what actually sinks you if you get it wrong.
OpenRouter, including the embeddings shortcut
OpenRouter is the cleanest piece of the whole stack. One key, one base URL, one routing config, and you can hit dozens of models behind a unified API.
Two things worth knowing:
Set daily spend limits in the OpenRouter dashboard before you start tinkering. Agent loops can recurse. Recursion plus per-token billing equals a bad morning.
OpenRouter handles embeddings too. The OpenClaw docs default to OpenAI for embeddings, but you don’t have to use OpenAI — OpenRouter supports embedding models through the same API surface (reference). This collapses one provider, one key, and one integration point out of your stack. I should have done this on day one and didn’t.
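The dashboard cap is a backstop, not a guard — OpenRouter can’t see your loop structure. If you control the agent loop yourself, a client-side cap is a few lines. This is a sketch with made-up numbers; `step_fn` stands in for whatever executes one agent turn and reports its cost.

```python
def run_agent_loop(step_fn, max_steps: int = 25, budget_usd: float = 2.00) -> float:
    """Run step_fn until it signals completion, a step cap, or a spend cap.

    step_fn(i) returns the cost of turn i in USD, or None when the agent
    decides it is done. Both caps raise loudly rather than continuing.
    """
    spent = 0.0
    for i in range(max_steps):
        cost = step_fn(i)
        if cost is None:  # agent finished on its own
            return spent
        spent += cost
        if spent >= budget_usd:
            raise RuntimeError(f"budget exhausted after {i + 1} steps (${spent:.2f})")
    raise RuntimeError(f"step cap hit after {max_steps} steps")
```

Two independent caps matter: a runaway loop that makes cheap calls hits `max_steps`, and a short loop that makes expensive calls hits `budget_usd`. Either one alone leaves a hole.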
On guides and the actual ultimate guide
There’s a beginner guide. A power guide and maybe even a super guide and then a masterclass. None of them is the actual ultimate guide. The actual ultimate guide is the grind: hitting a problem the docs don’t cover, reading the source, fixing it, moving on. Every serious OpenClaw setup I’ve seen has the fingerprints of someone who did that — not someone who found the magic README.
If you can read a stack trace and follow a chain of error messages back to a cause, you can get there. It takes a lot of patience!
Side Note: Claire Vo’s complete guide on Lenny’s Newsletter is top notch!
Then I tried Jatayu
It took roughly twenty minutes. The setup, mechanically: clone the repo, fill in a `PERSONAL.yaml` with your contacts, services, and preferences, run `bash start.sh`, and the agent comes up inside a tmux session you can attach to. Then `/imessage:access` to approve the iMessage integration, and you’re live.
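For a sense of scale, the `PERSONAL.yaml` is roughly this shape. Values are anonymized and the field names are approximate — the template in the repo is authoritative.

```yaml
# PERSONAL.yaml — field names approximate; defer to the repo's template.
contacts:
  - name: Sam                     # who the agent is allowed to message
    imessage: "+1-555-0100"
services:
  calendar_email: agent-cal@example.com   # the separate, non-primary account
preferences:
  reply_style: brief
  quiet_hours: "22:00-07:00"
```

One flat file, read at startup. That’s the entire configuration surface, which is the point of the comparison that follows.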
The architectural contrast with OpenClaw is the interesting part:
No SearXNG. Search comes through MCP servers when the agent needs it, not through a self-hosted aggregator.
No separate embedding model. Memory is file-based and indexed at the application layer.
No Telegram bot. The messaging surface is iMessage, accessed via AppleScript through an MCP server. Native OS primitive instead of a third-party platform.
No daemon-like service architecture. It’s a single process inside a tmux session.
There’s also an economic difference worth noting: Jatayu runs on your existing Claude subscription. No separate OpenRouter billing, no embedding API costs, no vendor sprawl. One subscription you’re already paying for does the work that OpenClaw spreads across multiple providers.
Jatayu is deliberately small in surface area. It doesn’t try to be configurable in the way OpenClaw is — it picks a few strong primitives and commits to them. macOS-native rather than cross-platform, MCP-native rather than self-contained. Those are deliberate architectural choices. They constrain where Jatayu runs, and in return they eliminate entire categories of setup and maintenance complexity.
The architectural decision pays off twice. Once at setup, where you don’t have to plumb six services together. And again every day after, because there’s almost nothing that can drift out of sync.
Using it as a household coordinator has been even more helpful. Reminders, schedules, the small coordination overhead that fills a week — all of it lives in one place now.
I hit one MCP server issue during setup. Re-reading the docs resolved it in about five minutes. Full credit to Arjun for building and sharing this prototype.
The takeaway
If you’ve wondered why “AI agents” feel impressive in demos but frustrating in real life, the reason is usually architecture, not intelligence. Simpler architecture is easier to trust, maintain, and actually use.
OpenClaw is what you build when you want to understand the whole stack, run everything yourself, and have something you can rebuild from first principles if any one provider goes away. That control costs weekends. It also costs ownership in a subtler way: the system is large enough that even after several weekends inside it, I’m a user of OpenClaw. I understand what it does. I don’t understand it well enough to change how it does it.
Jatayu is small enough that spending real time with it means you actually learn how it works. Not just how to use it — how it’s built. That’s a different relationship with a tool. Simplicity isn’t just about easier setup. It’s what makes real ownership possible.
If I were starting today, I’d try Jatayu first. Not because OpenClaw wasted those weekends, but because the agent I actually want is one I can understand fully, and that turns out to be a property of the codebase, not the agent.
The stories, views and opinions expressed in this article are solely my own and do not necessarily reflect those of any affiliated organizations or individuals.

