Tuesday, 02 September 2025

5 min

AI’s iBeer Moment: Why 95% of Projects Fail Before Breakthroughs Arrive

by Kevin Stewart

When MIT released its study showing that 95% of AI pilots fail to deliver business impact, it made headlines everywhere. The report defined failure as projects that never moved beyond pilot stage or failed to create measurable return on investment (ROI) against profit-and-loss.

By those standards, most AI experiments, no matter how clever or technically impressive, are flops. Cue the commentary: “AI is hype.” “It’s the new dot-com bubble.” “Robots are stealing our jobs (badly).”

It’s a catchy statistic, but it misses the nuance. Failure at this stage doesn’t mean AI is a false dawn. It means we’re in the messy, necessary part of innovation.

Here’s why…

Failure is the Baseline for Innovation

It’s easy to gasp at “95% failure.” But let’s put that number in context:

Around 90% of startups fail.

Between 70% and 90% of new product launches flop (Harvard Business Review, Clayton Christensen).

In other words, the odds are stacked against any new thing.

So why would AI, an entirely new general-purpose technology, be any different? High failure rates aren’t evidence of weakness. They’re the entry price of building new categories.

If anything, 95% feels almost normal.

New Technology Often Starts with Novelty

Go back to the last big technology leap: mobile. What were the first breakout apps?

iBeer (2008)

Tilt your phone and watch a virtual pint of beer “pour” and slosh around. Created by Hottrix, it was one of the most downloaded paid apps globally in the App Store’s early years.

Zippo Lighter (2008)

Licensed by Zippo, you could flick the screen to light a virtual flame, sway it at concerts, and even customise your lighter skin. It was a pop-culture hit, downloaded millions of times.

By MIT’s criteria, both iBeer and Zippo would be failures. They didn’t deliver sustained ROI. They never evolved into billion-dollar businesses. They were novelties that faded as quickly as they appeared.

And yet, they mattered.

These apps were more than toys. They tested the iPhone’s sensors, multitouch gestures, and sound synchronisation in ways no one had before. They made people comfortable waving phones in social settings. They revealed new human behaviours around mobile devices.

That experimental “failure” is what paved the way for the meaningful apps: Google Maps walking directions, Uber’s live driver tracking, Pokémon Go’s augmented reality.

The same is true of AI today. The silly apps, the gimmicky chatbots, the endless clones: they’re all part of exploring the edges of what the technology can do. You don’t leap straight to Uber. You start with iBeer.

It’s Not the Tech That’s Failing, It’s Us

Spend five minutes with a modern LLM and it’s obvious: the capability is astonishing. These systems can take vast amounts of structured and unstructured data, make sense of it, and generate predictions, insights, and content at scale.

The issue isn’t the tech. It’s the way we’re deploying it. Too many projects start with the shiny tool and ask: “What can we bolt this onto?” That’s backwards.

As Steve Jobs famously put it: “You’ve got to start with the customer experience and work backwards to the technology.”

Case Study: Uber

Uber didn’t begin with “GPS is cool, let’s use it.” The story goes that Travis Kalanick and Garrett Camp couldn’t get a cab on a snowy night in Paris in 2008. They imagined a button you could press to summon a ride. GPS and smartphones were the enablers, not the idea. The problem was human: unreliable urban transport.

Case Study: Airbnb

Airbnb wasn’t “let’s monetise Web 2.0.” It started when two broke designers in San Francisco couldn’t make rent. With a design conference in town and hotels booked out, they bought airbeds, hosted strangers, and charged them for a bed and breakfast. The tech was primitive. The insight was human: people needed flexible, affordable places to stay.

That’s why Uber and Airbnb rewrote culture, while thousands of “apps” are long forgotten. They started with problems, not technology.

AI today is making the same mistake in a new outfit: tech-first, not problem-first.

Discovery Is Broken, But AI Offers Hope

Discovery was supposed to help us fail fast and learn cheaply. In practice, it rarely reduces the failure rate.

Classic discovery often becomes research theatre: endless interviews, workshops, and slide decks that cost money but don’t lower risk.

Lean discovery approaches like design sprints give you insight, but the Day 5 lo-fi test is often too unreliable to justify real investment.

The root cause? The high cost of production. Building software, even as a prototype, has always been expensive. So we swap it for rituals.

This is what AI changes. With code generation, synthetic data, and design automation, the cost of producing working (if scrappy) software is collapsing. Suddenly you can test higher-fidelity ideas faster, with real users, and get more reliable data.

For the first time, discovery has a chance to escape theatre and become a genuine experimentation engine.

Why 95% Failure Is Good News

If most AI projects are failing, it means the field is still wide open. The winners haven’t been written yet.

That’s good news for founders, corporates, and designers, provided they take the right approach:

Be problem-focused, not tech-focused.
Use AI to shorten the path to validated learning.
Invest in finding what’s meaningful before you invest in scaling.

At Format-3, that’s exactly what we help teams do through our AI-empowered Rapid Discovery service. We cut the time from idea to validated learning (with a hell of a track record), so when you do bet big, it’s on something that matters.

Closing thoughts

By MIT’s yardstick, 95% of AI projects are failures. But that definition of failure ignores the reality of innovation: exploration is the raw material of progress.

The Zippo lighter app failed. iBeer failed. And yet without them, we might never have had Uber, Google Maps walking directions, or Pokémon Go.

So yes, 95% of AI projects are failing. Good. That’s the messy stage before the breakthroughs.

The only question is: are you testing the right problems, fast enough, to be in the 5% that break through?