In From Chaotic AI Tools to a Structured Development Team, I described the forced switch from JetBrains to VSCode — a change I wouldn't have made voluntarily. In When 15 Modes Become an Orchestrated Symphony, I explored the handover system, quality gates, and rule hierarchy. In Problems, Learnings, and 36% Refactoring, a narrative arc closed: thanks to RunVSAgent, I could return to my beloved JetBrains IDEs.

Now a bigger arc closes: The Roo Code Agile Software Development Team has reached its end. But as often happens in software development, an ending is also a new beginning.

The Decision

In late September, I received a message from my boss. The topic: API costs. In the three months since starting the project, I had accumulated around $6,000 in API costs — August alone was over $3,200. The multi-agent system with its 15 specialized modes consumed significantly more tokens than a single agent. Every handover, every quality gate check, every context transfer — everything had a cost.

The suggestion was pragmatic: Claude Code with a team subscription and premium seat. The costs would be covered, and I could keep working — just on a different platform. The next day, I was already set up.

There was no long hesitation, no discussions about "but I've invested so much." The decision was quick and pragmatic. When a better alternative is available — why not?

What Remains: A Retrospective

Before I get to the limitations, I want to acknowledge what the project brought. The many hours of development work, the 118 commits in three days, the 36.4% refactoring — none of that was wasted effort.

Technical Learnings

The most important technical insights aren't tied to Roo Code. They're transferable to any system where multiple AI agents need to collaborate.

Handover patterns proved essential. Without structured context transfer between agents, information gets lost. An agent doesn't know what the previous one did, what decisions were made, what constraints apply. The template-based handover system — with clear goals, constraints, and success criteria — was one of the project's biggest successes.
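To make the idea concrete, here is a minimal sketch of what such a template-based handover record could look like. This is not the project's actual template; the field names and the `render` helper are illustrative.

```python
from dataclasses import dataclass, field

@dataclass
class Handover:
    """Structured context transfer between two agents (illustrative fields)."""
    from_mode: str
    to_mode: str
    goal: str                                             # what the receiving agent must achieve
    constraints: list[str] = field(default_factory=list)  # what it must not do
    success_criteria: list[str] = field(default_factory=list)  # how "done" is verified
    decisions: list[str] = field(default_factory=list)    # choices already made upstream

    def render(self) -> str:
        """Flatten the handover into a prompt section for the next agent."""
        lines = [f"HANDOVER {self.from_mode} -> {self.to_mode}", f"Goal: {self.goal}"]
        lines += [f"Constraint: {c}" for c in self.constraints]
        lines += [f"Success: {s}" for s in self.success_criteria]
        lines += [f"Decision: {d}" for d in self.decisions]
        return "\n".join(lines)
```

The point is the structure, not the code: every transfer carries its goal, constraints, and success criteria explicitly, so the receiving agent never has to guess what came before.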

Quality gates prevent errors from being passed along. Instead of having a pile of problems at the end of a long chain, you catch them early. The code reviewer checks before the QA engineer tests. The security engineer looks things over before production. These checkpoints cost time — but they save more than they cost.
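The chain can be sketched as a sequence of gate functions that stop work at the first failure, so problems surface early instead of piling up at the end. The gate names mirror the roles from the article; the checks themselves are invented stand-ins.

```python
from typing import Callable

# Each gate inspects the work product and returns a list of findings.
Gate = Callable[[str], list[str]]

def run_gates(artifact: str, gates: list[tuple[str, Gate]]) -> list[str]:
    """Run quality gates in order; stop at the first gate that reports
    findings, instead of passing problems further down the chain."""
    for name, gate in gates:
        findings = gate(artifact)
        if findings:
            return [f"{name}: {f}" for f in findings]
    return []

# Illustrative gates in the article's order: review before QA, QA before security.
def code_review(a: str) -> list[str]:
    return ["TODO left in code"] if "TODO" in a else []

def qa_tests(a: str) -> list[str]:
    return [] if "tested" in a else ["no test evidence"]

def security_check(a: str) -> list[str]:
    return ["hardcoded secret"] if "password=" in a else []

findings = run_gates("def f(): pass  # TODO", [
    ("code-reviewer", code_review),
    ("qa-engineer", qa_tests),
    ("security-engineer", security_check),
])
```

Here the code reviewer catches the leftover TODO before the QA engineer or security engineer ever runs, which is exactly the early-catch behavior the gates exist for.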

The rule hierarchy resolves conflicts clearly: System Integrity before Task Continuity before Mode Boundaries before Quality Gates before Efficiency. When security and speed collide, security wins. When a task exceeds mode boundaries, escalation happens instead of improvisation. This clear prioritization makes the system's behavior predictable.
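Conflict resolution under this hierarchy is simple enough to express in a few lines. This sketch assumes a flat priority list; the rule names come from the article, the function is illustrative.

```python
# The article's rule hierarchy, highest priority first.
RULE_PRIORITY = [
    "system_integrity",
    "task_continuity",
    "mode_boundaries",
    "quality_gates",
    "efficiency",
]

def resolve(conflicting_rules: set[str]) -> str:
    """When rules collide, the highest-priority rule wins."""
    for rule in RULE_PRIORITY:
        if rule in conflicting_rules:
            return rule
    raise ValueError("no known rule in conflict set")

# Security (system integrity) vs. speed (efficiency): security wins.
winner = resolve({"efficiency", "system_integrity"})
```

Because the list is total and ordered, every conflict has exactly one deterministic outcome, which is what makes the system's behavior predictable.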

Specialization beats generalization — perhaps the most surprising insight. A team of 15 specialists, each doing one thing well, works better than a "jack of all trades" doing everything mediocrely. The backend developer doesn't write UI. The QA engineer doesn't fix bugs. The documentation writer doesn't code. These clear boundaries lead to better results.

Personal Growth

Beyond the technical learnings, the project fundamentally deepened my understanding of multi-agent orchestration. I learned that AI agents have a built-in "perfectionism bias." They don't just want to solve the problem — they want to deliver the best possible result. That sounds good but is dangerous. A simple bug fix becomes a comprehensive refactoring session. A small change grows into a feature package. You have to actively teach an AI to do less.

I also learned that simplicity beats complexity. My original 8-step mode drift algorithm with similarity scoring and circuit breaker pattern was "academically interesting." The 3-case decision matrix that replaced it was understandable, debuggable, and worked better.
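The article doesn't spell out the three cases, so the branches below are only an illustration of the shape of the simplification: a handful of explicit conditions instead of an eight-step pipeline with similarity scoring.

```python
def mode_drift_decision(task_in_scope: bool, needs_other_specialist: bool) -> str:
    """Illustrative 3-case decision matrix for mode drift.
    The actual cases from the project aren't published; these branches
    only demonstrate how a flat matrix replaces a multi-step algorithm."""
    if task_in_scope:
        return "continue"   # case 1: stay in the current mode
    if needs_other_specialist:
        return "escalate"   # case 2: hand back to the orchestrator
    return "reject"         # case 3: out of scope for the whole team
```

A matrix like this can be explained in three sentences and debugged by reading it, which is precisely why it beat the clever version.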

Concrete Artifacts

Beyond the learnings, concrete artifacts remain that I can reuse. The universal handover templates. The three policies against over-engineering: Minimal Code Changes, Feature Addition Control, Scope Expansion Control. The priority framework for issue tracking with four levels P1 to P4. These tools aren't platform-specific — they work wherever multiple agents need coordination.
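As one example, the four-level priority framework reduces to a small enum plus a sort. The level descriptions are my own illustrative glosses, not the project's definitions.

```python
from enum import IntEnum

class Priority(IntEnum):
    """Four-level issue priority, P1 highest (descriptions are illustrative)."""
    P1 = 1  # blocking: system integrity at risk
    P2 = 2  # major: core functionality affected
    P3 = 3  # minor: workaround exists
    P4 = 4  # cosmetic, nice-to-have

def triage_order(issues: list[tuple[str, Priority]]) -> list[str]:
    """Sort issues so the highest-priority work surfaces first."""
    return [name for name, prio in sorted(issues, key=lambda i: i[1])]
```

Because nothing here depends on Roo Code, the same enum works unchanged in any issue-tracking or agent-coordination setup.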

The Limitations: Challenges, Not Showstoppers

Despite these successes, the project hit boundaries. Not all were insurmountable — but together they raised the question: was further iteration in Roo Code the right path?

API Costs

Costs were the most obvious factor: around $6,000 in three months, peaking at over $3,200 in August alone. The multi-agent system consumes more tokens than a single agent. Every handover transports context. Every quality gate check needs understanding of the overall state. The architecture is inherently more expensive than a monolithic approach.

Was this a showstopper? Not really — the costs would have continued to be covered. But it raised the question: is there a better way?

Context Overflow

15 modes mean 15 whenToUse sections that Roo Code automatically writes into the system context. Even the 8 standard team modes were already heavyweight. The more context used for mode descriptions, the less remains for the actual task — the code, the files, the conversation history.

I tried various workarounds: shorter whenToUse descriptions, temporarily disabling unused modes, optimized context compression. It helped — but only partially. A feature request for selective mode activation exists but isn't implemented yet. With that feature, the problem would have been solvable.

No Parallelization

This was the only true architectural boundary — not solvable through iteration.

Roo Code works strictly sequentially: one mode after another, one conversation after another. My original parallel patterns — boomerang tasks, coordinated checkpoints, merge strategies for parallel results — were rendered obsolete. The backend developer can't work in parallel with the frontend developer. The QA engineer must wait until the code is finished.

The switch from parallel to sequential was a workaround. It made the system more predictable and easier to debug — but it wasn't a solution. The efficiency gains that real parallelization would have brought remained out of reach.

The Orchestrator Problem

One problem I never fully solved: when a specialist encounters a problem outside their expertise during work, a new team lead instance starts instead of control returning to the original one. The entire previous context is lost.

With time and further iteration, I could have solved this. But the "gentle push" to Claude Code came first. The work on the framework wasn't abandoned — it was relocated.

The Bottom Line on Limitations

These limitations aren't bugs that Roo Code needs to fix. They're characteristics of the platform. With further iterations and new Roo Code features, most problems could have been solved or worked around — except for the missing parallelization.

For my multi-agent framework, Roo Code was the right starting point — but not the right home.

Lessons Learned: Six Principles for Multi-Agent Systems

Regardless of platform: what did I learn that's transferable to any multi-agent system?

1. Start simple, iterate fast. The 36.4% refactoring commits weren't a sign of bad planning — they were the path to the goal. You build the first system to learn. The second to use. Better to start early and iterate than plan forever.

2. AI agents have a perfectionism bias. They want to do more than asked. That sounds good but is dangerous. Explicit policies against over-engineering are necessary. You have to actively teach an AI to do less.

3. Structured handovers aren't optional. Without standardized handover templates, context gets lost. Every handover needs three things: goal, constraints, success criteria. Uniformity beats specialization in templates.

4. Simplicity beats cleverness. The 8-step mode drift algorithm was academically interesting. The 3-case decision matrix worked better. If you can't explain it in three sentences, it's too complicated.

5. Check platform limits early. Parallelization was an architectural boundary I recognized too late. Before building a framework: what can the platform actually do? Workarounds work — until they don't.

6. Costs scale differently than expected. Multi-agent systems consume disproportionately more tokens than single agents, because every handover re-transmits accumulated context. Every quality layer costs context. Plan budget reviews before the bill arrives.
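The cost-scaling point can be made concrete with a back-of-the-envelope model (all numbers invented): if each agent in a sequential chain re-reads the accumulated context before adding its own output, total token usage grows roughly quadratically with chain length, not linearly.

```python
def chain_tokens(base_context: int, step_output: int, steps: int) -> int:
    """Token cost of a sequential agent chain where every agent re-reads
    the accumulated context. Grows roughly quadratically in `steps`."""
    total, context = 0, base_context
    for _ in range(steps):
        total += context + step_output  # this agent reads context, writes output
        context += step_output          # its output joins the context for the next
    return total

single_agent = chain_tokens(1000, 200, 1)   # one agent, one pass
five_handovers = chain_tokens(1000, 200, 5) # five specialists in a chain
```

With these invented numbers, five chained agents cost several times what one pass does, which matches the budget pattern described above: each quality layer pays for all the context before it.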

These principles apply regardless of whether you use Roo Code, Claude Code, LangChain, or something custom. Multi-agent orchestration is a craft — and like any craft, you learn it by doing.

The New Beginning: Claude Code

The work wasn't abandoned — it was relocated. And Claude Code offers things Roo Code couldn't.

Skills instead of modes. Claude Code has a different extension concept. Skills are lighter weight and more context-efficient. No 15 whenToUse sections filling up the context.

Real parallelization with the Task tool. Subagents can work in parallel. What was architecturally impossible in Roo Code is built in here. Boomerang tasks and coordinated checkpoints become possible again.
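Claude Code's actual Task tool API isn't reproduced here; instead, a generic sketch with `concurrent.futures` shows the shape of the boomerang pattern: fan subtasks out in parallel, wait at a coordinated checkpoint, merge the results. The `subagent` function is a hypothetical stand-in for a real subagent call.

```python
from concurrent.futures import ThreadPoolExecutor

def subagent(name: str, subtask: str) -> dict:
    """Stand-in for a subagent run (a real Task-tool dispatch would go here)."""
    return {"agent": name, "result": f"{subtask} done"}

def boomerang(subtasks: dict[str, str]) -> list[dict]:
    """Fan out subtasks in parallel, then join at a coordinated checkpoint
    and merge the results, the pattern sequential Roo Code couldn't support."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(subagent, name, task)
                   for name, task in subtasks.items()]
        return [f.result() for f in futures]  # checkpoint: wait for all

results = boomerang({
    "backend-dev": "implement API",
    "frontend-dev": "build UI",
})
```

In this model the backend and frontend subtasks run concurrently and the caller only proceeds once both have returned, which is exactly the checkpoint-and-merge flow the sequential architecture ruled out.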

Plugins for agent personality. The extension logic lives in plugins, not tool settings. Project-specific configuration sits in CLAUDE.md directly in the repository — version-controlled, shareable, traceable.

Where do I stand now? Roo Code is no longer used — Claude Code is the only tool. I'm still learning its extensive features and analyzing what suits which part of the framework. A basic concept exists as a draft. The "real" development still needs to begin.

The framework won't be copied — it will be rethought. What worked in Roo Code comes along. What didn't work stays behind. And what's newly possible in Claude Code will be used.

The Journey Continues

What began as a forced switch from JetBrains to VSCode led to one of the most educational projects of my development career. Four articles, from chaotic creation through orchestrated symphony, through problems and learnings, to the end — and new beginning.

The narrative arc of the series is complete: JetBrains → VSCode → JetBrains → Claude Code. Every switch was a chance to learn something new. Every platform had its strengths and limits. What matters isn't where you start — but that you keep going.

Roo Code was the right starting point. Claude Code is the next step. The journey continues.