From 19 Agents to One: Inside the v3.0 Rebuild
We deleted 19 agent personas, rebuilt the UI from scratch, and shipped a robot assembly onboarding game. Here's what v3.0 looks like — and why it took 87 commits in two weeks.
Two weeks. One major version.
- Version jump: v2.19 → v3.0
- Agents replaced: 19 → 1
- Skills available: 28
- New UI columns: 4
On March 12, we deleted 19 agents. Not archived. Not deprecated. Deleted. The social media manager, the code reviewer, the research assistant, the content writer — all of them, gone in a single commit. Two weeks and 87 commits later, we shipped v3.0.
This is what happened.
The 19-agent problem
When we first built LikeClaw, the mental model was simple: one agent per job. Need social media help? Talk to the social media agent. Need code reviewed? Talk to the code review agent. It felt clean. It was not.
Users didn’t know which agent to pick. They’d ask the research agent to write code, get a mediocre answer, and assume the platform was bad. The routing problem — figuring out which specialist to talk to — was supposed to be the user’s job. That’s backwards. The whole point of an agent platform is that the machine figures out what to do.
So we killed all 19 and replaced them with one agent backed by 28 skills. The agent discovers which skill fits the task and delegates — though every skill still requires explicit user approval before it can run. Users just ask their question. This is the same architectural shift we described in *From Chat to Agent Platform*, but taken to its logical end.
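The routing-plus-approval loop can be pictured in a few lines. This is a minimal sketch, not LikeClaw's actual code: the `Skill` shape, the keyword scorer, and the `approve` callback are all illustrative assumptions.

```typescript
// Hypothetical sketch of skills-first routing. Skill, pickSkill, and
// dispatch are illustrative names, not LikeClaw's real API.
interface Skill {
  name: string;
  description: string;
  keywords: string[];
  run: (input: string) => string;
}

// The agent scores each skill against the request, so the user never
// has to pick a specialist up front.
function pickSkill(registry: Skill[], request: string): Skill | undefined {
  const words = request.toLowerCase().split(/\W+/);
  let best: { skill: Skill; score: number } | undefined;
  for (const skill of registry) {
    const score = skill.keywords.filter((k) => words.includes(k)).length;
    if (score > 0 && (!best || score > best.score)) best = { skill, score };
  }
  return best?.skill;
}

// Every skill still requires explicit user approval before it runs.
function dispatch(
  registry: Skill[],
  request: string,
  approve: (skill: Skill) => boolean,
): string {
  const skill = pickSkill(registry, request);
  if (!skill) return "No matching skill found.";
  if (!approve(skill)) return `Skill "${skill.name}" was not approved.`;
  return skill.run(request);
}
```

The point of the shape: discovery is the agent's job, but execution stays gated behind the user's yes.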
Tearing down a 3,000-line god component
The old UI was a single React component. 3,000 lines. It handled chat, settings, agent selection, file uploads, and about four other things that had no business living together. Every change risked breaking something unrelated. We’d been patching around it for months.
For v3.0, we ripped it out and replaced it with a 4-column layout: IconRail on the far left for navigation, SessionsPanel for chat history, Center for the actual conversation, and RightConfigPanel for agent settings. Each panel collapses independently. The code went from one monolith to a set of composable modules — none longer than 400 lines.
This wasn’t a cosmetic refresh. It was a prerequisite for everything else in v3.0. You can’t ship a skills picker UI, an agent config panel, and an onboarding flow inside a god component without losing your mind.
Building a robot to configure an agent
Onboarding is where most agent platforms lose people. You sign up, you see an empty chat box, you don’t know what to do. We tried guided tours. We tried tooltip walkthroughs. They all felt like homework.
So we built a game. New users walk through a factory where they assemble a Matrix-style octopus robot. Four stations: Brain (pick your LLM model), Tentacles (attach skills like web search or code execution), Storage (connect files), and Connections (link Google account). Dark cyberpunk aesthetic. Every choice maps to a real agent configuration — when you finish the game, the robot you built is the agent you use.
It’s configuration disguised as a game. And it works because it answers the question every new user has: “What can this thing actually do?”
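Under the hood, "configuration disguised as a game" is just a mapping from stations to config fields. A sketch under assumptions — the `RobotBuild` and `AgentConfig` shapes and field names are invented for illustration:

```typescript
// Hypothetical mapping from the four factory stations to an agent
// config. All type and field names here are illustrative.
interface RobotBuild {
  brain: string;           // LLM model chosen at the Brain station
  tentacles: string[];     // skills attached at the Tentacles station
  storage: string[];       // files connected at the Storage station
  connections: string[];   // linked accounts, e.g. Google
}

interface AgentConfig {
  model: string;
  skills: string[];
  files: string[];
  linkedAccounts: string[];
}

// Finishing the game just serializes the build: the robot you
// assembled is the agent you use.
function buildToConfig(build: RobotBuild): AgentConfig {
  return {
    model: build.brain,
    skills: build.tentacles,
    files: build.storage,
    linkedAccounts: build.connections,
  };
}
```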
Agents talking to agents
We shipped A2A — the Agent-to-Agent protocol. Your LikeClaw agents now expose standardized endpoints that external apps can call. Multi-turn context, streaming responses, the full conversation loop.
This matters because agents stuck inside one platform aren’t that useful. With A2A, your agent can join a chat room in AI Native Chat, get called by a custom integration, or participate in a multi-agent workflow orchestrated by a completely different system. We covered the groundwork for external integrations back in *Schedules, Gateway, and the Event Loop*. A2A is the next step.
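The multi-turn part of the loop can be sketched as follows. This is an illustration of the idea, not the published protocol: the `A2AMessage` shape, the session structure, and `handleA2ATurn` are all assumptions.

```typescript
// Illustrative sketch of a multi-turn A2A-style exchange. The message
// and session shapes are assumptions, not the real A2A wire format.
interface A2AMessage {
  role: "user" | "agent";
  content: string;
}

interface A2ASession {
  id: string;
  history: A2AMessage[]; // multi-turn context lives with the session
}

// An external caller posts one message; the agent appends its reply,
// and the returned session lets any platform continue the conversation.
function handleA2ATurn(
  session: A2ASession,
  incoming: string,
  respond: (history: A2AMessage[]) => string,
): A2ASession {
  const history: A2AMessage[] = [
    ...session.history,
    { role: "user", content: incoming },
  ];
  const reply = respond(history);
  return {
    ...session,
    history: [...history, { role: "agent", content: reply }],
  };
}
```

Streaming would replace the single `respond` call with chunked responses, but the session-carries-context shape stays the same.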
10 minutes down to 3.5
We replaced OpenCode with Claude Code CLI for background tasks. The numbers tell the story: successful task completion went from 10+ minutes (with frequent timeouts) to 3.5 minutes. The root cause was a 120-second bash timeout in OpenCode that killed any long-running process. Claude Code CLI doesn’t have that limitation.
This was one of those changes where the fix is boring but the impact is massive. Background tasks that used to fail 40% of the time now complete reliably. If you’ve been following our MCP server work, you know reliability is the thing we optimize for above all else.
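The failure mode is easy to picture in code. A minimal sketch, assuming a hypothetical `runWithTimeout` helper — only the 120-second ceiling comes from the story above:

```typescript
// Sketch of how a hard-coded timeout kills any task that outlives it.
// runWithTimeout is hypothetical; a runner like OpenCode's effectively
// called it with a fixed 120_000 ms ceiling.
async function runWithTimeout<T>(
  task: () => Promise<T>,
  timeoutMs: number,
): Promise<T> {
  return Promise.race([
    task(),
    new Promise<T>((_, reject) =>
      setTimeout(
        () => reject(new Error(`timed out after ${timeoutMs}ms`)),
        timeoutMs,
      ),
    ),
  ]);
}
```

Any background task that legitimately needs more than the ceiling loses the race every time, no matter how healthy it is — which is why the fix was boring and the impact was not.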
LINE joins the gateway
LINE is the dominant messaging app in Japan, Taiwan, and Thailand. We added a full LINE adapter — 862 lines of gateway code with webhook handling, message formatting, and session management. It joins Telegram and WhatsApp as a supported messaging channel.
The gateway architecture we built for *the sandbox bet* made this straightforward. Each messaging platform gets an adapter that translates platform-specific messages into our internal format. Adding LINE took three days instead of three weeks because the hard abstraction work was already done.
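The adapter pattern above can be sketched in a few lines. The `InboundMessage` shape, the `ChannelAdapter` interface, and the reduced `LineEvent` are illustrative assumptions, not the real gateway code:

```typescript
// Sketch of the per-platform adapter pattern. All shapes here are
// illustrative; LineEvent is a LINE webhook event reduced to the
// two fields this example needs.
interface InboundMessage {
  channel: "line" | "telegram" | "whatsapp";
  senderId: string;
  text: string;
}

interface ChannelAdapter<Raw> {
  channel: InboundMessage["channel"];
  toInternal: (raw: Raw) => InboundMessage;
}

interface LineEvent {
  source: { userId: string };
  message: { text: string };
}

// Adding a channel means writing one translation, not touching the
// rest of the pipeline.
const lineAdapter: ChannelAdapter<LineEvent> = {
  channel: "line",
  toInternal: (raw) => ({
    channel: "line",
    senderId: raw.source.userId,
    text: raw.message.text,
  }),
};
```

Everything downstream of `toInternal` only ever sees `InboundMessage`, which is why a new platform is days of work instead of weeks.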
What’s next
v3.0 is the foundation. The skills-first architecture means we can add capabilities without adding agents. The A2A protocol means external apps can consume those capabilities. The 4-column UI means we have room to build without rewriting.
Next up: skill marketplace (let users publish and share skills), workspace templates (pre-configured agent setups for common workflows), and deeper A2A integrations with partner platforms. We’re also looking at voice — the LINE gateway proved our adapter pattern scales, and voice is just another adapter.
Eighty-seven commits in two weeks. Nineteen agents deleted. One agent that actually works.
What shipped in v3.0
1. **Skills-first architecture.** 19 separate agent personas consolidated into one chat agent that discovers and delegates to 28 skills. No more guessing which agent to talk to.
2. **4-column UI redesign.** The old 3,000-line god component replaced with a modular layout: IconRail, SessionsPanel, Center chat, and RightConfigPanel. Panels collapse and expand independently.
3. **Robot assembly onboarding.** New users build a Matrix-style octopus robot by choosing parts at four factory stations: Brain (LLM model), Tentacles (skills), Storage (files), Connections (Google). Dark cyberpunk aesthetic.
4. **A2A protocol.** Agents are now accessible via standardized Agent-to-Agent endpoints. External apps can add your agents to chat rooms with multi-turn context and streaming.
5. **Claude Code CLI for background tasks.** Replaced OpenCode with Claude Code CLI. Background tasks went from 10+ minutes to 3.5 minutes. The 120-second bash timeout bug is gone.
6. **LINE messaging gateway.** LINE joins Telegram and WhatsApp as a messaging channel. Full adapter with webhook handling — 862 lines of new gateway code.
7. **Workspace memory.** Agents read workspace files, skills, and persistent memory automatically. New update_workspace_memory tool for per-workspace MEMORY.md. Works hand-in-hand with [file versioning](/blog/your-files-versioned/) so nothing an agent writes is permanent.
8. **Agent-scoped skill installation.** Install skills on specific agents, not just globally. Multi-agent picker in the skills UI.
Questions about v3.0
Why consolidate 19 agents into one?
Users didn't know which agent to pick. A social media manager agent, a code reviewer agent, a research agent — it sounds organized on paper, but in practice people just wanted to ask a question and get a good answer. One agent that discovers the right skill for the job is simpler and more powerful than 19 specialists you have to manually route between.
What's the A2A protocol and why does it matter?
Agent-to-Agent is a protocol for AI agents to talk to each other across platforms. With A2A support, your LikeClaw agents can be called by external apps — AI Native Chat, custom integrations, other agent platforms. It's the beginning of agents as interoperable services, not walled-garden chatbots.
Is the robot assembly onboarding just cosmetic?
No. Every choice you make during assembly creates a real agent. Pick Claude Sonnet as the brain, attach web search and code execution tentacles, connect your Google account — and the agent you built is the agent you use. It's configuration disguised as a game.
How does the Claude Code CLI improve background tasks?
OpenCode had a 120-second bash timeout that killed long-running tasks. Claude Code CLI doesn't have that limitation. On our benchmark, successful task completion went from 10+ minutes (with frequent failures) to 3.5 minutes. It's not a marginal improvement — it's a different class of reliability.