Why Most Agentic AI Projects Fail

And why that doesn’t mean you shouldn’t build one, if you know what you’re doing. This article is part of a broader series on how product and engineering can build more reliable AI together. For the product perspective, read on why you should still ship fragile AI.

If you’ve tried to build anything with agentic AI in the last year, you’ve probably hit a wall.

Everyone wants autonomous systems that can reason across multiple steps, make decisions, and handle complex workflows. What they don’t realize is that every step adds error, and those errors compound fast. If you’re building something with 30 steps and your model is 99% accurate at each one, you end up with a system that gets the whole run right only about 74% of the time (0.99^30 ≈ 0.74), best case.
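The arithmetic behind that number is easy to check. The sketch below assumes each step fails independently of the others, which is a simplification; the function name is illustrative only.

```python
def end_to_end_success(per_step_accuracy: float, steps: int) -> float:
    """Chance that every step succeeds, assuming step errors are independent."""
    return per_step_accuracy ** steps


# Compare a few workflow lengths at the same 99% per-step accuracy.
for steps in (5, 10, 30, 50):
    rate = end_to_end_success(0.99, steps)
    print(f"{steps:>2} steps at 99% per step: {rate:.0%} end to end")
```

The per-step number barely matters once the chain gets long. The length of the chain does.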

Most businesses can’t use that. So they either shelve the project or sink time and money into trying to make the model better.

That’s the wrong approach. The perfect model does not exist, so a plan that depends on one will fail to deliver a product.

The part no one wants to say out loud

Agentic AI doesn’t think. It predicts tokens. That prediction can look like reasoning, but it isn’t. Which means if you want to use it to do anything reliably, you have to engineer an entire structure around it.

You need to sandbox it so it can’t break things.

You need to log every action for traceability.

You need checkpoints where a human can step in before something goes sideways.

And you need to build the whole thing assuming it will go sideways, eventually.
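As a rough illustration of the logging and checkpoint pieces, here is a minimal sketch of a wrapper that every agent action passes through. All names (AUDIT_LOG, log_event, run_action, approve) are hypothetical, the in-memory list stands in for a real audit store, and this is not the framework described in this article.

```python
import json
import time
from typing import Any, Callable, List

# Hypothetical in-memory audit trail; a real system would persist this.
AUDIT_LOG: List[str] = []


def log_event(kind: str, **details: Any) -> None:
    """Append one JSON line per event so every action can be traced later."""
    AUDIT_LOG.append(json.dumps({"ts": time.time(), "kind": kind, **details}))


def run_action(
    name: str,
    action: Callable[[], Any],
    risky: bool,
    approve: Callable[[str], bool],
) -> Any:
    """Run one agent action behind a log and an optional human checkpoint."""
    log_event("proposed", action=name, risky=risky)
    # Checkpoint: risky actions wait for a human decision before running.
    if risky and not approve(name):
        log_event("rejected", action=name)
        return None
    try:
        result = action()
    except Exception as exc:  # built on the assumption that it will go sideways
        log_event("failed", action=name, error=repr(exc))
        return None
    log_event("completed", action=name)
    return result


# Example: a risky action with a reviewer who says no.
outcome = run_action(
    "delete_records", lambda: "deleted", risky=True, approve=lambda name: False
)
```

Here the action never runs, and the log still records that it was proposed and rejected. In practice `approve` would block on a real review step rather than a lambda.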

There are frameworks and protocols out there: Model Context Protocol (Anthropic), AutoGen (Microsoft), and the Process Framework in Semantic Kernel (Microsoft). We tried them. Most aren’t mature enough for production, so we wrote our own.

Product matters more than you think

The hardest part of these projects isn’t writing tools. It’s defining what you’re actually trying to build.

Most use cases that sound agentic don’t actually need full autonomy. What they need is a scoped system with limited tool access, guardrails for decision-making, and just enough flexibility to be useful without breaking under real-world conditions.
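One way to picture limited tool access is an explicit allowlist that sits between the model and everything it could call. This is only a sketch: the tools (lookup_order, refund_order) and the ScopedTools class are made up for illustration.

```python
from typing import Callable, Dict


# Hypothetical tools; a real system would wrap actual services.
def lookup_order(order_id: str) -> str:
    return f"order {order_id}: shipped"


def refund_order(order_id: str) -> str:
    return f"order {order_id}: refunded"


class ScopedTools:
    """Expose only an explicit allowlist of tools to the model."""

    def __init__(self, allowed: Dict[str, Callable[[str], str]]) -> None:
        self._allowed = dict(allowed)

    def call(self, tool_name: str, argument: str) -> str:
        # Anything not on the allowlist is refused, whatever the model asks for.
        if tool_name not in self._allowed:
            raise PermissionError(f"tool {tool_name!r} is outside this agent's scope")
        return self._allowed[tool_name](argument)


# A support agent that can read orders but cannot issue refunds.
support_tools = ScopedTools({"lookup_order": lookup_order})
print(support_tools.call("lookup_order", "A1"))
```

Calling `support_tools.call("refund_order", "A1")` raises `PermissionError`. The scope is enforced in code instead of being requested in a prompt.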

That’s why we like to work with product early. Before anyone writes code, we should figure out where the model is likely to break, where a human needs to step in, and what can be safely shipped. Engineering can build the skeleton, but product shapes the experience that makes it usable—and trusted.

What we've learned the hard way

  1. The more general your platform, the more brittle it becomes
  2. If your project depends on the model figuring it out, it’s probably going to fail
  3. You don’t solve reliability through more modeling. You solve it through guardrails, traceability, and product design
  4. If you don’t define what counts as done, your agent will keep going until it hits a wall
  5. It’s okay to call something agentic if that’s what gets it funded. Just make sure you design for what’s real
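Point 4 can be made concrete with an explicit completion check plus a step budget. The sketch below uses stub functions in place of a model call; run_until_done, step, and is_done are hypothetical names, not part of any framework mentioned here.

```python
from typing import Callable, List, Tuple


def run_until_done(
    step: Callable[[List[str]], str],
    is_done: Callable[[List[str]], bool],
    max_steps: int,
) -> Tuple[List[str], str]:
    """Loop until the explicit 'done' check passes or the step budget runs out."""
    history: List[str] = []
    for _ in range(max_steps):
        if is_done(history):
            return history, "done"
        history.append(step(history))
    # Budget spent: report whether the last step happened to finish the job.
    return history, ("done" if is_done(history) else "budget_exhausted")


# Stubs standing in for a model call and a product-defined completion rule.
def fake_step(history: List[str]) -> str:
    return f"draft {len(history) + 1}"


def three_drafts(history: List[str]) -> bool:
    return len(history) >= 3


print(run_until_done(fake_step, three_drafts, max_steps=10))
print(run_until_done(fake_step, three_drafts, max_steps=2))
```

The first call stops because the completion rule is met. The second stops because the budget ran out and says so, which gives a human or a fallback path something to act on.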

You don’t need a perfect model to build something real.

You just need to respect the constraints and design like they matter.
