
From Cooperation to Collapse: Why Multi-Agent Systems Struggle

When more agents means more problems: the challenges of scaling AI teamwork


Hello everyone and welcome to my newsletter where I discuss real-world skills needed for the top data jobs. 👏

This week I’m writing about failure in multi-agent systems. 👀

Not a subscriber? Join the informed. Over 200K people read my content monthly.

Thank you. 🎉

Multi-agent AI systems (MAS) are rapidly emerging as the next major leap in artificial intelligence—networks of large language model (LLM) agents working together to tackle challenges no single model could handle on its own.

If you’re a subscriber, then you’ve read my take on the catastrophic collapse of data science in the real world. Are we witnessing a similar long-term failure brewing for complex multi-agent systems? I don’t believe so, but agents are certainly receiving the same hype the data science role did a decade ago.

Microsoft CEO Satya Nadella even likened deploying an AI agent to the simplicity of creating a Word document or PowerPoint slide. But that’s just the beginning. The real potential lies in orchestrated teams of agents, operating in harmony, capable of solving increasingly sophisticated problems at our command.

Not so fast. A recent research paper from UC Berkeley throws a little shade on the new AI superstar: in the systems the authors studied, multi-agent failure rates come in around 67%, just shy of 70%.

Multi-agent AI systems behave like badly managed human teams.

Multi-agent systems suffer from the same dysfunctions that plague human organizations: unclear roles, poor communication, misaligned objectives, and weak quality control. These failure modes aren’t just similar to human team breakdowns—they’re strikingly identical. In fact, one of the core insights from recent research is that many failures in multi-agent LLM systems directly mirror classic flaws in organizational design.

And, as with human teams, the consequences go far beyond inefficiency—they can be catastrophic. The researchers argue that the answer isn’t simply to build bigger models or feed them more data. What’s missing is organizational thinking: clearly defined roles, institutional memory, structured communication, and rigorous verification—all the foundational principles that make human teams work.


The paper, Why Do Multi-Agent LLM Systems Fail?, identifies 14 distinct failure modes common across a wide variety of multi-agent setups. These include unclear objectives, poor adherence to roles, lost context, ignored input, and weak verification processes. Some failures merely lead to inefficiency—but others are more severe, producing incorrect or incomplete outputs with no oversight mechanism in place to catch the mistakes.
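
To make one of those failure modes concrete, here is a minimal sketch in Python of what weak verification looks like in practice. The agent functions (draft_code, review_code) are hypothetical stand-ins for LLM calls, not anything from the paper; the point is that a reviewer which only checks surface form lets an incorrect result flow straight through.

```python
# Minimal sketch of the "weak verification" failure mode described above.
# Both agent functions are hypothetical stand-ins for LLM calls.

def draft_code(task: str) -> str:
    # Imagine an LLM "coder" agent; here it returns a deliberately broken result.
    return "def add(a, b): return a - b"  # wrong operator slips through

def review_code(code: str) -> str:
    # A "reviewer" agent that only checks surface form, not behavior.
    return code if code.startswith("def ") else "REJECTED"

def run_pipeline(task: str) -> str:
    draft = draft_code(task)
    reviewed = review_code(draft)  # no functional test, no oversight
    return reviewed                # the incorrect output ships anyway

print(run_pipeline("write an add(a, b) function"))
```

A stronger setup would run the candidate against actual tests, or escalate disagreements, rather than waving the output through.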

What’s most striking is that these aren’t exotic or uniquely “AI” problems. They mirror the same dysfunctions that plague human teams, stemming from the same root causes: confusion, misalignment, and a lack of coordinated oversight.


Multi-agent AI systems aren’t just software—they’re complex systems. And like all complex systems, they come with inherent risks: unpredictable interactions, cascading failures, coordination breakdowns, and fragile handoffs. These systems don’t fail because the models lack intelligence. They fail because the overall system isn’t designed to handle complexity.

That’s what makes the paper’s message so critical. The real challenge isn’t purely technical—it’s organizational. We’re not just building AI agents; we’re building agent teams, workflows, and organizations. And without intentional structures—clear roles, shared memory, robust verification, and graceful failure modes—these systems will fall into the same chaos that undermines poorly managed human teams.
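
What might that intentional structure look like in code? Here is a minimal sketch, with the LLM calls stubbed out as plain functions so it runs on its own, of a workflow with explicit roles, a shared memory log, and a verification gate that escalates instead of failing silently. Treat it as an illustrative pattern under those assumptions, not the paper’s implementation.

```python
# Illustrative sketch only: explicit roles, shared memory, and a verification
# gate for a small agent workflow. The "agents" are plain functions here so
# the example runs; in practice they would be model calls.
from dataclasses import dataclass, field

@dataclass
class SharedMemory:
    """Institutional memory: every role reads and appends to the same log."""
    log: list[str] = field(default_factory=list)

    def record(self, role: str, message: str) -> None:
        self.log.append(f"{role}: {message}")

def planner(task: str, memory: SharedMemory) -> str:
    plan = f"1) draft an answer to '{task}' 2) verify 3) return"
    memory.record("planner", plan)
    return plan

def worker(plan: str, memory: SharedMemory) -> str:
    answer = "drafted answer based on: " + plan
    memory.record("worker", answer)
    return answer

def verifier(answer: str, memory: SharedMemory) -> bool:
    ok = "drafted answer" in answer  # stand-in for a real functional check
    memory.record("verifier", f"accepted={ok}")
    return ok

def run(task: str) -> str:
    memory = SharedMemory()
    plan = planner(task, memory)
    answer = worker(plan, memory)
    if not verifier(answer, memory):  # graceful failure, not silence
        return "ESCALATE: verification failed\n" + "\n".join(memory.log)
    return answer

print(run("summarize the Berkeley paper"))
```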

As part of that design process, we must also ask: Where—and when—should humans remain in the loop? At what points in decision-making, escalation, or oversight is human judgment essential? This isn’t about compensating for AI’s limitations—it’s about recognizing the complementarity between human strengths and machine capabilities. The goal is to design systems where humans and AI collaborate safely, effectively, and accountably.
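
One hedged way to express that in code is a simple escalation gate: certain actions always require sign-off, and anything below a confidence threshold gets routed to a person. The action names and the 0.8 threshold below are assumptions for illustration only.

```python
# Sketch of a human-in-the-loop gate: the system decides when a step is
# high-stakes or low-confidence enough that a person must sign off.
# The action names and thresholds are illustrative assumptions.

HIGH_STAKES_ACTIONS = {"send_payment", "delete_data", "contact_customer"}

def needs_human(action: str, confidence: float, threshold: float = 0.8) -> bool:
    return action in HIGH_STAKES_ACTIONS or confidence < threshold

def execute(action: str, confidence: float) -> str:
    if needs_human(action, confidence):
        return f"queued '{action}' for human review (confidence={confidence:.2f})"
    return f"auto-executed '{action}' (confidence={confidence:.2f})"

print(execute("summarize_report", 0.93))  # auto-executed
print(execute("send_payment", 0.99))      # always reviewed: high stakes
print(execute("summarize_report", 0.55))  # reviewed: low confidence
```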

Thanks for reading and have a great day.

Are we fighting Occam's Razor here? Does complexity lead to less reliability?
