Thoughts · May 23, 2026 · 6 min

What AI doesn't ask

AI writes good code. Projects fail in the places it never asked about. Four of them.

I use Cursor every day. Claude Code is in the terminal next to me right now. The booking page on this site? Vibe-coded in an evening. AI writes good code. Projects still fail. They fail in four places the model never asked about.

The first is business.

AI sees the code. It doesn't see what the business needs the code to do.

Last month a client showed me a "complete" billing module Claude Code had built. Subscriptions, webhooks, retries, dunning. Beautifully structured. Idempotent webhooks, exponential backoff, clean schema. Charged in USD only. The client sells in three countries, 60% of revenue in EUR. Stripe Tax not configured. No VAT. The invoice numbering wouldn't pass an EU audit. The refund policy didn't match the competitor's, which happens to be the industry standard.

The model did exactly what the prompt asked. The prompt was "build a subscription billing module." Nobody told it the business sells in Germany and France, because the founder assumed the tool would figure that out. The tool doesn't figure out things that aren't in the codebase or the prompt.

A senior would have asked first. Where do you sell, who pays you, what's the refund policy, what's the dispute rate. Before touching the keyboard. That conversation doesn't happen between a founder and Cursor.
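One way to keep those questions from being skipped is to make the code refuse to run without the answers. A minimal Python sketch, not the client's module and not Stripe's API: `BillingContext`, `price_for`, and the prices are all hypothetical names and numbers I made up for illustration.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class BillingContext:
    """Answers to the business questions.

    No field has a default, so somebody has to state each one explicitly.
    """
    currencies: frozenset[str]      # currencies the business sells in
    vat_countries: frozenset[str]   # countries where VAT must be charged
    refund_window_days: int         # the refund policy, as a number


# Hypothetical price list: one explicit price per currency, no FX guessing.
PRICES = {"USD": Decimal("29.00"), "EUR": Decimal("27.00")}


def price_for(ctx: BillingContext, currency: str) -> Decimal:
    """Return the plan price, failing loudly instead of falling back to USD."""
    if currency not in ctx.currencies:
        raise ValueError(f"{currency} is not a currency this business sells in")
    if currency not in PRICES:
        raise ValueError(f"no price configured for {currency}")
    return PRICES[currency]
```

Calling `BillingContext()` with no arguments raises a `TypeError`. That is the point: the prompt "build a subscription billing module" can no longer succeed without someone answering where the business sells.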

The second is data over time.

AI optimizes for the schema you have right now. It doesn't model how that schema will bend.

A user table with name, email, created_at. Six weeks later someone needs first_name and last_name. Three months later the customer is also a vendor, so we add a role. Six months later roles aren't binary anymore and the migration is ugly because none of the original code anticipated movement.

A senior would have built that table differently. Not perfectly future-proof. Just with an eye for what changes and what doesn't. AI doesn't have that eye. It has never lived with a database for two years and watched it bend under requirements nobody saw coming.
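What that eye might look like for the role problem, as a rough sketch using Python's built-in `sqlite3`. The table and function names (`user_roles`, `grant_role`, `roles_of`) are my own assumptions, and this covers only the "customer is also a vendor" change, not the name split.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with roles stored as rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE users (
            id         INTEGER PRIMARY KEY,
            email      TEXT NOT NULL UNIQUE,
            created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
        );
        -- Roles are rows, not a column on users: a user can hold several,
        -- and a new role is an INSERT rather than a schema migration.
        CREATE TABLE user_roles (
            user_id INTEGER NOT NULL REFERENCES users(id),
            role    TEXT    NOT NULL,
            PRIMARY KEY (user_id, role)
        );
    """)
    return conn


def grant_role(conn: sqlite3.Connection, email: str, role: str) -> None:
    """Give a user a role; granting the same role twice is a no-op."""
    conn.execute(
        "INSERT OR IGNORE INTO user_roles (user_id, role) "
        "SELECT id, ? FROM users WHERE email = ?",
        (role, email),
    )


def roles_of(conn: sqlite3.Connection, email: str) -> list[str]:
    """Return the user's roles in alphabetical order."""
    rows = conn.execute(
        "SELECT r.role FROM user_roles r "
        "JOIN users u ON u.id = r.user_id "
        "WHERE u.email = ? ORDER BY r.role",
        (email,),
    )
    return [role for (role,) in rows]
```

With this shape, the day a customer also becomes a vendor is one `grant_role(conn, email, "vendor")` call. No column changes, no backfill.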

Uplevel's 2024 study found teams using GitHub Copilot introduced 41% more bugs over the same timeframe. The interesting question is which bugs. From the codebases I've audited: migration-shaped bugs. Code that worked when written, broke when the data underneath it moved.

The third is the one that scares me.

Auth flows work. Users log in, JWTs are signed, sessions expire. What AI doesn't model is who can do what to whom. Authorization. The actual security boundary.

A SaaS app has organizations. Each org has users. Members view dashboards, admins change billing. Standard.

Now: can a member of Org A query an endpoint and see data from Org B? In every AI-built codebase I've audited this year, the answer was yes by default. The code wasn't wrong. Nobody told the tool that "user can fetch this resource" and "user can fetch this resource for this tenant" are different sentences.

Wiz's April disclosure: 10.3% of audited Lovable projects had user data exposed before launch. GitGuardian's 2025 report: AI-assisted repos leak 40% more secrets than human-written ones. Those numbers feel high until you've reviewed one of these codebases. Then they feel low.

Same root cause every time. The model writes what was asked. Nobody asked it to fence the tenant boundary. That sentence has to come from a human who has seen what goes wrong when nobody says it.
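The gap between "user can fetch this resource" and "user can fetch this resource for this tenant" is small enough to sketch. This is an illustration in plain Python, with an in-memory dict standing in for the database. `User`, `Invoice`, `get_invoice`, and the sample rows are hypothetical, not taken from any codebase I audited.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    id: int
    org_id: int


@dataclass(frozen=True)
class Invoice:
    id: int
    org_id: int
    total_cents: int


class NotFound(Exception):
    """Raised for missing rows and for rows in another tenant alike,
    so callers cannot probe which IDs exist in other orgs."""


# Stand-in for a database table, keyed by invoice id.
INVOICES = {
    1: Invoice(id=1, org_id=10, total_cents=5000),
    2: Invoice(id=2, org_id=20, total_cents=9900),
}


def get_invoice_unscoped(invoice_id: int) -> Invoice:
    """'User can fetch this resource': any logged-in user gets any row."""
    try:
        return INVOICES[invoice_id]
    except KeyError:
        raise NotFound(invoice_id) from None


def get_invoice(user: User, invoice_id: int) -> Invoice:
    """'User can fetch this resource for this tenant'."""
    invoice = INVOICES.get(invoice_id)
    if invoice is None or invoice.org_id != user.org_id:
        raise NotFound(invoice_id)
    return invoice
```

Both functions look correct in review. Only the second one takes a `User`, and that extra argument is the whole security boundary. In a real app the tenant filter would more likely live in the query or in row-level security than in a per-endpoint check like this one.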

The fourth is what happens when things break.

When the third-party API is down. When the database is at 95% disk. When a user uploads a 4GB file. When two requests race for the same row.

AI handles errors when you ask. It doesn't build a picture of where this code will run in the real world and what will try to break it. A senior's brain runs failure scenarios automatically. The model's doesn't. It has no concept of "production."

A client's AI-generated email service crashed every Tuesday morning at 9am ET. Took us a day to find. The newsletter blast sent 50k emails through a SendGrid endpoint that returned 429 after the first 10k. The code had a retry. Synchronous, blocking, no backoff. The worker pool jammed for 40 minutes every week until enough requests died and the queue cleared.

The code was correct. It had error handling. It didn't model the failure mode it would actually hit.
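For contrast, here is roughly what a retry that models the 429 looks like: asynchronous, capped exponential backoff with jitter, and a give-up path. This is a sketch under assumptions, not the fix we shipped. `send` is a hypothetical coroutine returning a status code and an optional retry hint in seconds; I am not claiming any particular provider returns that hint.

```python
import asyncio
import random
from typing import Awaitable, Callable, Optional, Tuple

# Hypothetical sender: returns (HTTP status, Retry-After seconds or None).
Send = Callable[[], Awaitable[Tuple[int, Optional[float]]]]


async def send_with_backoff(
    send: Send,
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> bool:
    """Retry on 429 with capped exponential backoff and full jitter.

    Returns True on a 2xx response. Returns False on any other non-429
    status, or once attempts are exhausted, so the caller can requeue
    the message instead of holding a worker.
    """
    for attempt in range(max_attempts):
        status, retry_after = await send()
        if status != 429:
            return 200 <= status < 300
        if attempt == max_attempts - 1:
            break  # out of attempts; do not sleep before giving up
        if retry_after is not None:
            # Trust the server's hint, but never wait longer than the cap.
            delay = min(max_delay, retry_after)
        else:
            # Full jitter: random wait up to the capped exponential bound.
            delay = random.uniform(0, min(max_delay, base_delay * 2 ** attempt))
        await sleep(delay)  # yields to the event loop; other sends keep moving
    return False
```

The two differences from the Tuesday-morning code: the wait grows and is randomized, so 50k messages don't hammer the limit in lockstep, and the wait is awaited rather than blocking, so a throttled send doesn't pin a worker.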

The code is almost free now. The decisions around the code are what you pay for.

Volume is what makes this matter. The barrier to shipping a working SaaS dropped to near zero. a16z counted 14,000+ AI-wrapped products launched in 2025. Most technically work. Most don't survive their first hundred users.

The differentiator used to be "can you build it." Everyone can build it now. The differentiator is "did you build the right thing in a way that doesn't fall apart." That used to be the implicit job description of a senior. It's the explicit one now.

If you run engineering, stop measuring velocity. Velocity is free. Measure your engineers on the four blind spots. Do they ask about the business? Do they design for data movement? Do they think about tenant boundaries by default? Do they enumerate failure modes before writing code? That's what you pay for. The code itself you can almost get for free.

Use AI aggressively where it shines. Scaffolding, refactoring, test generation, boilerplate. The 5x speedup is real. Take it. Anyone telling you to ban Cursor in 2026 is fighting last year's war.

But move your seniors off the keyboard. Their job isn't writing code faster. It's making the decisions AI can't. Code review, threat modeling, architecture, customer calls.

Hire for the eye. Not "can you write code fast." Everyone can. Can this person read a product spec and ask the three questions that would have prevented the bug? That skill just got rare.

Good news for engineers: the part of the job that pays well got more valuable, not less. Bad news for the "we don't need engineers" crowd: you're about to find out which decisions weren't being made.