I Gave 4 AI Tools the Same Impossible Job: Rebuild a SaaS That Processes Millions. Only One Survived.

I've been building software for over a decade. I founded Junglebee in 2016 — a SaaS booking platform for tours and charters. Hundreds of thousands of bookings. Tons of customers. We process millions of dollars in bookings per year through the platform.

You can verify the numbers yourself at https://trustmrr.com/startup/jungle-bee — I'm not here to talk theory. This is a real business with real revenue.

I run Black Matter Defense. We deploy AI systems across organizations — orchestration, integration, governance. I've done this for venture funds like Northzone, Active Impact Investments, and Clean Energy Ventures, and now for enterprises where security and compliance aren't optional.

I'm not a VC investor. I'm a builder and operator. I live inside these tools every single day.

A few weeks ago, I decided to do something that most people would call insane.

Rebuild Junglebee from the ground up. No engineering team. No sprints. No Jira. Just me and AI coding agents.

This is not a hypothetical. This is not a thought experiment. This is happening right now, and I'm going to walk you through every ugly detail.

I. Why I Snapped

Three frustrations pushed me over the edge. If you run a SaaS product that's been around for a while, you'll feel every single one of these in your bones.

a. Frustration 1: Adding new features takes forever

Here's a concrete example. I wanted to add team functionality to Junglebee. Invites. Permissions. Roles. Pending invitations. Standard stuff that every SaaS app needs.

With a human engineering team, here's what that looks like. You plan sprints. You create Jira tickets. You record Loom walkthroughs explaining exactly what you want. You have meetings. Engineers ask questions. Then they build it, and it takes weeks. Then you do a bug round. Then another bug round. Then another one because the first two rounds introduced new issues.

By the end, you've spent thousands of dollars and several weeks for one feature.

With AI vibe coding tools — Replit, Lovable, Claude Code — I built the same feature in one to two hours. Ninety percent functioning right out of the gate. A few tweaks and you're good.

And here's the part that really stings: the UX was better than what we'd been shipping.

When you've lived both sides of that contrast, it changes something in your brain. You can't unsee it.

b. Frustration 2: Performance nightmares from legacy code decisions

Any system that's been built over nine years by multiple teams accumulates debt. That's just how software works. Different people, different eras, different priorities. Things pile up.

But some of it crosses a line.

Let me give you a specific example that still makes my blood pressure spike.

The dashboard shows tours for a full week. Could be anywhere from one to twenty tours per day. Simple enough. We needed a button to block a tour or change its passenger capacity. That's it. A button. When you click it, a little form pops up.

What actually got built: a hidden form for every single product that loads with the page. Not triggered on click. Not dynamically rendered when someone actually needs it. Just loaded and hidden in the DOM from the moment the page opens.

So if you have a hundred-plus products displayed on that page, that's a hundred-plus invisible forms loading every single time someone opens their dashboard.

Think about that for a second.

Every operator opening their dashboard is loading hundreds of hidden forms they will never interact with. The server is rendering all of them. The browser is parsing all of them. For nothing.

We spent weeks debugging and optimizing after that. Servers going down. Dashboards ultra slow for users. Customers complaining. All because of one feature that should have been a simple button with a click handler.
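To make the difference concrete, here is a minimal sketch of the two approaches. It is not Junglebee's code: the Tour shape, the function names, and the markup are all assumptions for illustration, and the rendering is done with plain strings (no HTML escaping) so the form count is easy to see.

```typescript
// Hypothetical shape of a tour row; field names are assumptions, not Junglebee's schema.
interface Tour {
  id: string;
  name: string;
  capacity: number;
}

// Builds the edit form for ONE tour. Meant to run only when its button is clicked.
function renderCapacityForm(tour: Tour): string {
  return (
    `<form data-tour="${tour.id}">` +
    `<input name="capacity" type="number" value="${tour.capacity}">` +
    `<button type="submit">Save</button>` +
    `</form>`
  );
}

// The pattern described above: every row ships with its own hidden form.
function renderDashboardEager(tours: Tour[]): string {
  return tours
    .map((t) => `<div class="tour">${t.name}<div hidden>${renderCapacityForm(t)}</div></div>`)
    .join("");
}

// The alternative: every row ships with only a button.
function renderDashboardLazy(tours: Tour[]): string {
  return tours
    .map((t) => `<div class="tour">${t.name}<button data-edit="${t.id}">Edit</button></div>`)
    .join("");
}

// Click handler: look up the tour and build a single form on demand.
function onEditClick(tours: Tour[], tourId: string): string | undefined {
  const tour = tours.find((t) => t.id === tourId);
  return tour ? renderCapacityForm(tour) : undefined;
}

// Counts occurrences of "<form" in a rendered page.
function countForms(html: string): number {
  return html.split("<form").length - 1;
}

// Illustrative data: 120 tours on one weekly dashboard.
const tours: Tour[] = Array.from({ length: 120 }, (_, i) => ({
  id: `t${i}`,
  name: `Tour ${i}`,
  capacity: 12,
}));

console.log(countForms(renderDashboardEager(tours))); // 120
console.log(countForms(renderDashboardLazy(tours))); // 0
```

With the lazy version, the page carries zero forms until someone clicks, and a click builds exactly one.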

This is what happens when software grows over years without constant architectural oversight. It's not about any one person doing bad work. It's about systems that accumulate complexity until they buckle under their own weight. And once you're deep in legacy code, every team that touches it is fighting against the current.

c. Frustration 3: Bugs that breed more bugs

This one is the silent killer.

You find a bug. You fix it. You deploy. Now there are five new bugs hiding in places you didn't think to check.

Each bug takes multiple days of work. You run testing cycles where half the stuff still doesn't work. You're in a constant battle of trying not to create more bugs while fixing bugs. Every round costs thousands of dollars.

And you're never really done — you're just temporarily ahead.

In 2020, this was normal. This was just what building software looked like. You accepted it. You budgeted for it. You moved on.

But it's 2026 now. And when you see what AI-built systems look like — all running on the latest tech, performance optimized out of the box, architecturally clean — the gap between "what we have" and "what's possible" becomes unbearable.

So what do you do when the thing you built is holding you back from what's now possible?

II. The Decision

Here's the situation. Junglebee is not a side project. It's a real business. Hundreds of thousands of bookings processed. Tons of paying customers. We process millions of dollars in bookings per year through the platform.

And I decided to see if I could turn it into a one-human show powered entirely by AI agents.

Here's the thing. I've built a ton of AI systems. I work with big companies every day. I see what AI is capable of. Internal tools, automations, complex workflows — I've done all of it.

But the one thing I had not seen — and nobody I know had seen — is this: taking a full-stack system that existed before the AI era and rebuilding it from scratch using AI.

Not building a new toy app. Not creating a landing page. Not making a demo.

Taking something that dozens of engineers have been building for years. That cost hundreds of thousands of dollars to develop. That handles real money and real customers.

And handing the entire rebuild to AI.

Is that even possible?

Because if it works, the implications are staggering. You save hundreds of thousands of dollars. You get a far more effective company. Better UX and features for your customers. More revenue, more profit, faster customer acquisition.

And as a founder, you just tell AI what you want and it builds it.

I had to find out.

III. The Preparation

Before I let any AI tool touch a single line of code, I spent two days preparing. This part matters more than people realize.

I took Junglebee's entire codebase and connected it to Claude Code. Then I pushed it. Hard. My goal was to extract the full essence of the app — every feature, every edge case, every integration, every business rule.

It took about two full days of work just to get everything out.

But what Claude Code created was genuinely impressive.

A massive brief about the company and what the product does. A full implementation plan. 360 tickets broken into phases. User experience descriptions for every single ticket — what the user would expect to see, what they would expect to do, how every interaction should work.

I really pushed it to create the ultimate plan. Not a rough outline. Not a feature list. A comprehensive, ticket-by-ticket blueprint for rebuilding the entire application.
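For a sense of what a ticket-by-ticket plan can look like as data, here is a rough sketch. The field names and the three sample tickets are invented for illustration. They are not taken from the actual 360-ticket plan, which I'm not reproducing here.

```typescript
// Hypothetical ticket shape; none of these field names come from the actual plan.
interface Ticket {
  id: number;
  phase: number;
  title: string;
  userSees: string; // what the user expects to see
  userDoes: string; // what the user expects to do
}

// Groups tickets by phase so each phase can be handed to a tool as one batch.
function groupByPhase(tickets: Ticket[]): Map<number, Ticket[]> {
  const phases = new Map<number, Ticket[]>();
  for (const t of tickets) {
    const bucket = phases.get(t.phase) ?? [];
    bucket.push(t);
    phases.set(t.phase, bucket);
  }
  return phases;
}

// Flags tickets with an empty UX description, so gaps surface before any agent starts.
function ticketsMissingUx(tickets: Ticket[]): Ticket[] {
  return tickets.filter((t) => t.userSees.trim() === "" || t.userDoes.trim() === "");
}

// Invented sample data, only to exercise the two helpers.
const sample: Ticket[] = [
  { id: 1, phase: 1, title: "Operator login", userSees: "Email and password fields", userDoes: "Signs in" },
  { id: 2, phase: 1, title: "Create product", userSees: "Product form", userDoes: "Saves a new tour" },
  { id: 3, phase: 2, title: "Book a tour", userSees: "", userDoes: "Picks a date and pays" },
];

const byPhase = groupByPhase(sample);
console.log(byPhase.get(1)?.length); // 2
console.log(ticketsMissingUx(sample).map((t) => t.id)); // [ 3 ]
```

The point of the second helper is that a plan is only as good as its weakest ticket: a ticket without a UX description is one the agent has to guess at.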

Then I assembled the inputs that every AI tool would receive:

A full Figma design. Professionally done, page by page, every screen of the application. The GitHub repo — which is massive. The 360-ticket plan with UX specs for each one. And the company brief and implementation plan.

Same inputs. Same expectations. Four different AI tools.

Let's see what happens.

IV. The Four Contestants

All four got the exact same massive plan, Figma files, tickets, UX specs, and system intricacies. No tool got special treatment. No tool got extra context.

Fair fight.

a. Contestant 1: Lovable

I want to say this upfront: Lovable is genuinely good at what it's designed for. Building new web apps, marketing sites, MVPs from the ground up — it does an excellent job. Every tool in this contest does. That's not the question.

The question is whether any of them can rebuild a massive, nine-year-old piece of software with hundreds of features, edge cases, and integrations.

That's a very different ask.

First problem: no Figma access or connection that I could find. There was no way to plug in a Figma file directly. So I had to take screenshots of every single page of the design and paste them into the chat UI, one by one. Dozens of screens. Screenshot, paste, wait, repeat.

Second problem: I'd give it a batch of work and it would start building, then stop and ask "do you want me to continue?" I told it multiple times — just leave me alone and build the app. Don't ask. Just go. It ignored me and kept asking. Every few minutes, another interruption.

Then came the real issues. It would build the dashboard from the Figma design, and it looked okay at first glance. But nothing was wired up. The Book Now button existed on the page (you could see it, you could click it) but it didn't do anything. Products that were created appeared in the UI but didn't actually function. There was a cascading effect: for each ticket, it would build the visual layer but skip the actual logic and connections.

I told it to test everything. It said it tested. I went to check. Nothing worked.

Looking back, I think the scope was simply too much for what Lovable is designed for. Rebuilding a full legacy application with 360 tickets and complex business logic is a fundamentally different challenge than building something new. I don't think this is a Lovable problem. I think it's a scope problem. You could probably make it work by sitting behind the computer constantly directing it, step by step, every three tickets. Babysitting every interaction. But that defeats the purpose.

Verdict: not the right tool for a full-stack application rebuild at this scale. Probably excellent for what it's actually built to do.

b. Contestant 2: Claude Code (with the Garry Tan repo)

This one got me excited.

I gave it everything. Connected the Vercel token. Connected the Supabase token. Hooked up the GitHub repo with auto-deploy. The whole setup felt professional and promising.

And then — man, it just worked away for like two days.

Watching it was something else. It looked like a pro. Files being created, components being built, commits going through, deploys happening. Very exciting to watch.

I left my computer on all night. The whole next day. Could see it working for about 36 hours straight.

This thing was grinding.

Then I went to actually use what it built.

The result was underwhelming. Designs were terrible. Nothing fit together visually. I couldn't get through the onboarding flow — bugs everywhere. Finally muscled my way to the dashboard, and the app just wouldn't work. Nothing was set up properly.

I told it to test everything. It went off and "tested" for half a day. Came back telling me everything was tested and working. I went to test it myself. Nothing worked. I couldn't even book a trip because I couldn't create a product. The most fundamental action in a booking platform — creating something to book — was broken.

I was surprised. I really thought it would execute like Robocop and just take out my entire app. Thirty-six hours of watching it work had built up my expectations.

The crash back to reality was rough.

I didn't even want to play with the product because it was so buggy. Every interaction was broken. Getting anything to function required lots of back-and-forth chatting, step by step, and even then things were unstable.

c. Contestant 3: Factory AI

I'd seen lots of hype about Factory AI. Articles, Twitter threads, endorsements. I went in wanting to believe.

And the setup was genuinely impressive. It asked really good questions during the investigation phase. Did a very thorough job at understanding the system, the requirements, the architecture. I was thinking — okay, this one gets it.

I had to go with their $200 max plan to handle the scope.

But once it was in the flow and actually building — it ran into the same walls. The jump from understanding a system to actually rebuilding it is enormous. And that's fair. What I asked these tools to do was hard.

The investigation phase showed real competence. Real depth. Factory AI clearly understood the codebase and could reason about it intelligently. But translating that understanding into a working full-stack rebuild at this scale was a different challenge entirely.

Here's the thing though. I'm not done with Factory AI.

This month, I'm running a new experiment. Instead of asking Factory AI to rebuild the entire app from scratch, I'm going to plug it into the current Junglebee codebase and test whether it can serve as an embedded engineer. Optimizing performance. Fixing bugs. Building specific new features on the existing legacy system.

Not a full rebuild. Acting as a dev on the team.

This feels like a much better fit for what Factory AI showed it could do. It asked great questions. It understood architecture. It reasoned well about the system. Maybe the full rebuild was the wrong test. Maybe where it shines is working within an existing codebase, the way a senior engineer would — making targeted improvements, not starting from zero.

I'll report back on how that goes. It's one of the experiments I'm most curious about right now.

d. Contestant 4: Replit (the winner)

Replit has always been my go-to. But I tried to be fair. Same inputs, same expectations as everybody else.

First big win right out of the gate: I connected the Figma file directly. Just plugged it in. No screenshots. No pasting images one by one. Replit had a Figma connection and it worked. After the Lovable experience, this alone was a relief.

The agent worked for about eight to ten hours and finished all 360 tickets.

That's fast.

First test: almost nothing worked. Login worked. That was it. I'm sitting there thinking: what the hell? Another one bites the dust.

But then something different happened. Something that separated Replit from the other three tools.

I started the back-and-forth. Told it the Figma designs weren't being followed. Asked it to re-check its Figma connection. Gave it the URLs again.

And it started replicating the designs rather accurately. Not perfect, but genuinely close to the mockups.

Overall functionality was way better than the other tools. Still far from my expectations initially — but the baseline was higher.

Then I started chatting with it more. Told it to re-look at the briefs. Retest sections. Fix specific flows.

And it did a very good job.

Not going to lie — this was not a situation where you just hand it the work and walk away. I had to be quite involved. But the involvement actually produced results, which is more than I can say for the others.

Then something surprised me.

It produced UX and UI that were actually on par with what we had in the existing app. And in some places, it figured out improvements on its own. Without me asking. It just made things better because it understood the user flow.

I started seeing improvements very quickly across different sections. Settings. Product creation. Booking flows. The changes were landing and sticking. Not breaking other things. Not regressing.

The testing difference was the most important part. When I asked it to test end-to-end, about ninety percent of the time it actually did test properly. Not the "I tested everything" claims with nothing behind them. Actual testing where things were verified and fixed.

Now — it did fail at testing all 360 tickets in one shot. That was too much scope for a single pass. But when I asked it to test section by section — booking flow, then settings, then dashboard, then onboarding — it did it well. Methodically. And the fixes it applied during testing actually worked.
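The section-by-section approach can be sketched as a tiny runner. This is an assumption-heavy illustration, not what Replit does internally: the Section and Check types are hypothetical, and the checks are stubs standing in for real end-to-end tests against the app.

```typescript
// Hypothetical types: a check is a named pass/fail probe, a section is a named group of checks.
type Check = { name: string; run: () => boolean };
type Section = { name: string; checks: Check[] };

interface SectionReport {
  section: string;
  passed: number;
  failed: string[];
}

// Runs every check in one section. A check that throws counts as a failure.
function testSection(section: Section): SectionReport {
  const failed: string[] = [];
  let passed = 0;
  for (const check of section.checks) {
    let ok = false;
    try {
      ok = check.run();
    } catch {
      ok = false;
    }
    if (ok) {
      passed++;
    } else {
      failed.push(check.name);
    }
  }
  return { section: section.name, passed, failed };
}

// Walks sections in order and stops at the first one with failures,
// so fixes land before the next section is touched.
function testSectionBySection(sections: Section[]): SectionReport[] {
  const reports: SectionReport[] = [];
  for (const section of sections) {
    const report = testSection(section);
    reports.push(report);
    if (report.failed.length > 0) break;
  }
  return reports;
}

// Stub checks for illustration only; real ones would drive the running app.
const sections: Section[] = [
  { name: "booking flow", checks: [{ name: "can create a booking", run: () => true }] },
  {
    name: "settings",
    checks: [
      { name: "can save settings", run: () => false },
      { name: "can invite teammate", run: () => { throw new Error("not wired up"); } },
    ],
  },
  { name: "dashboard", checks: [{ name: "lists tours", run: () => true }] },
];

const reports = testSectionBySection(sections);
console.log(reports.map((r) => `${r.section}: ${r.failed.length} failed`));
// [ 'booking flow: 0 failed', 'settings: 2 failed' ]
```

The dashboard section never runs in this example, because the runner stops at settings. That mirrors how the guided work went: fix one section, retest it, then move on.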

Verdict: I actually think we can replace the full-stack system with this. My estimate is two to three weeks of guided work using Replit, going step by step through the application, to get to a production-ready rebuild.

V. What This Means

Junglebee will very soon become a full application system run by AI agents. Not partially. Not "AI-assisted." The entire engineering function replaced by AI.

Here's what that unlocks.

We'll be able to build features faster than any competitor. While other booking platforms are planning sprints and hiring engineers, I'll be describing what I want and watching it get built in hours.

The speed advantage compounds over time. Every week, we pull further ahead.

Bugs reported by users could potentially be fixed by the agent in real time. Customer writes in, describes the issue, agent investigates and patches it. No ticket queue. No sprint planning. No waiting for the next release cycle.

How many SaaS companies can say that?

And here's one that matters more than people realize: with a full-stack system built the traditional way, building separate mobile apps is painful. You're maintaining multiple codebases, syncing features, dealing with platform-specific bugs. With Replit, you build from the web app across everything. One codebase. One source of truth. Deploy everywhere.

VI. What This Means for Software Engineers

I know this section will make some people uncomfortable. I want to be thoughtful here, because this isn't about bashing anyone's work.

The engineers who built Junglebee over nine years did real, hard, valuable work. Every SaaS product that survived and grew did so because talented people poured skill and time into it.

But the role is changing. Fast.

Before 2025, software developers were primarily valued for the act of building. Writing code, debugging, shipping features — that was the core value proposition. That was what you hired for.

What's happening now is that a significant chunk of that building work is being absorbed by AI tools. Not perfectly. Not without guidance. But well enough that the economics are shifting in ways that are hard to ignore.

What does that mean for engineers?

It means the role is evolving, not disappearing. The engineers who thrive will be the ones who move up the stack — from writing code to directing systems. Architecture knowledge. UX intuition. Knowing what to build and why. Product thinking combined with AI fluency and technical depth.

Think of it like this: the translation layer — turning specs into working software — is being automated. The thinking layer is what remains and what becomes more valuable.

It's sort of like a product manager but with deep technical knowledge and the ability to direct AI systems. The people who can do that will be in enormous demand.

I think I'm probably a bit early on this. Most companies haven't felt it yet. Most engineering teams are still operating the old way. But this shift is happening, and the gap between "early adopter results" and "industry standard" is shrinking from years to months.

If you're an engineer reading this, the move is to lean into the change. Learn to direct AI tools. Develop your product sense. Get good at architecture and system design. The people who adapt will be more productive and more valuable than ever before.

VII. The Honest Summary

Let me be real about where things stand.

None of these tools delivered a "hand it off and walk away" experience. Every single one required involvement, direction, and correction. The dream of describing your app and coming back to a finished product is not here yet.

But Replit got close enough that the math works.

Two to three weeks of my time, guided work, section by section — versus months of engineering time and hundreds of thousands of dollars. That's not even a close comparison.

Here's my scorecard:

Lovable: Excellent for building new apps from scratch — MVPs, marketing sites, web apps. But rebuilding a massive 9-year-old platform with 360 tickets was probably too much scope for what it's designed for. The lack of Figma integration and constant interruptions didn't help, but the core issue was asking it to do something outside its sweet spot.

Claude Code: Impressive work ethic — 36 hours of grinding. But the output quality was poor and the "I tested everything" claims were completely false. Looked like a pro, delivered like an intern.

Factory AI: Best investigation and setup phase of any tool. Genuinely understood the system. But translating that understanding into a working full-stack rebuild didn't come together. I'm testing it next as an embedded engineer on the existing codebase this month — that might be where it shines.

Replit: The clear winner. Not because it was perfect on the first pass (it wasn't), but because the back-and-forth actually worked. When you corrected it, it learned. When you asked it to test, it actually tested most of the time. When you pointed it back at the Figma, it followed through. The collaboration loop functioned.

VIII. What Happens Next

This is not a finished story. This is the first chapter.

I'm now in the process of going through Junglebee section by section with Replit, rebuilding every component, every flow, every integration. Testing as I go. Fixing as I go. Improving as I go.

The Factory AI experiment is coming this month — plugging it into the legacy codebase as a working engineer. If that works well, it opens up another interesting model: Replit for the rebuild, Factory AI for maintenance and optimization on the existing system during the transition.

I'll be sharing the full journey — what works, what breaks, what surprises me, and what the actual production cutover looks like when we migrate real customers from the old system to the AI-built one.

If you're a founder sitting on a legacy codebase, wondering if the AI hype is real — it is. But it's not magic. It's not "set it and forget it." It's more like having a very fast, very capable junior engineer who needs clear direction and regular check-ins.

The difference is this engineer works around the clock. Doesn't take vacation. Costs a fraction of a human team. And gets better every month.

The real skill is not coding anymore.

The real skill is knowing exactly what you want, being able to articulate it clearly, and having the product and architectural sense to evaluate what the AI builds.

That's the new game. And it started about five minutes ago.

More updates coming. Stay tuned.

— Michael Rouveure