The Real AI Question: How Do We Build Systems That Require Less Code?

In the race to embrace AI, we're asking the wrong question.

It seems like everyone is obsessing over code generation speed. "How fast can we build?" "How many lines per hour can a model write?" It's the AI equivalent of measuring a lawyer by pages written, not cases won. The truth is, faster code output is only meaningful if the code itself is necessary, correct, and maintainable.

But what if the better question is: How do we build systems that require less code in the first place?

AI is a Tool, Not a Shortcut

AI can help. It's a tool—one that can assist, augment, even accelerate aspects of development. But it doesn’t replace the fundamentals. Good architecture, clear thinking, system design, and business understanding—these MUST come first. They always have.

When companies allow AI to autopilot mission-critical systems without sufficient guardrails, they gamble with real-world consequences.

And devs who have worked on large teams know exactly how it goes:

  • A massive PR lands
  • The reviewer gets through the first quarter of it; 15 minutes have passed
  • A glance at the pipeline: all tests have passed
  • "Approved and merged"

"But my team is very disciplined, Mr. Alonso."

Yes, I'm sure. But things slip through the cracks even with good team leads when they're reviewing 5 other PRs and attending design meetings.

Let's consider the following real and hypothetical scenarios:

Microsoft:

Microsoft's own AI tools provide sobering examples of what happens when AI becomes a shortcut rather than a carefully managed tool.

GitHub found 39M secret leaks in 2024. Here’s what we’re doing to help
Every minute, GitHub blocks several secrets with push protection—but secret leaks still remain one of the most common causes of security incidents. Learn how GitHub is making it easier to protect yourself from exposed secrets, including today’s launches of standalone Secret Protection, org-wide scanning, and better access for teams of all sizes.

GitHub Copilot, Microsoft's flagship coding assistant, demonstrates the real-world consequences of treating AI as a magic solution. Research shows that 40% of Copilot-generated code contains security vulnerabilities. More troubling, repositories using Copilot show a 40% higher rate of leaked secrets (API keys, passwords, credentials) compared to those without it—6.4% versus 4.6%.

This isn't just a theoretical risk. In 2024, over 39 million secrets were leaked across GitHub, with Microsoft's own analysis showing that Copilot can inadvertently reproduce and amplify existing security vulnerabilities in codebases. When developers use Copilot as a shortcut—accepting suggestions without rigorous review—they're not just writing faster; they're spreading security debt at unprecedented scale.
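One practical guardrail is to scan diffs for secret-shaped strings before they ever reach the remote, which is roughly what GitHub's push protection does server-side. Below is a minimal, hypothetical sketch of that idea; the pattern names and rules are illustrative assumptions, and real scanners (gitleaks, GitHub's own rule set) use far larger, provider-specific pattern libraries.

```python
import re

# Hypothetical, deliberately tiny rule set -- real secret scanners
# ship hundreds of provider-specific patterns.
SECRET_PATTERNS = {
    "aws_access_key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "generic_api_key": re.compile(
        r"""(?i)api[_-]?key\s*[:=]\s*['"][A-Za-z0-9]{20,}['"]"""
    ),
    "private_key": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
}

def scan(text: str) -> list[str]:
    """Return the names of any secret patterns found in `text`."""
    return [name for name, pat in SECRET_PATTERNS.items() if pat.search(text)]

if __name__ == "__main__":
    # A hardcoded key like this is exactly what slips into AI-suggested code.
    print(scan('api_key = "abcdefghij0123456789"'))  # ['generic_api_key']
```

Wired into a pre-commit hook, even a crude filter like this catches the hardcoded credential a developer accepted from an autocomplete suggestion without reading it.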

What if Microsoft was using AI to build internal system code in the same way?

JPMorgan:

JPMorgan is experimenting with AI to help modernize legacy systems. This might include translating COBOL code into modern languages like Java. But it would be immensely irresponsible of them to hand the reins to AI without oversight. These systems aren’t simple scripts; they’re the backbone of global finance, encoding decades of institutional logic and risk controls.

JPMorgan credits this AI tool for boosting software engineers’ efficiency by up to 20%
The bank already has about 450 potential cases for which it could use AI, and CEO Jamie Dimon expects those potential applications to surge to 1,000 by next year.

Is it technically possible? It might be:

Code Reborn AI-Driven Legacy Systems Modernization from COBOL to Java
This study investigates AI-driven modernization of legacy COBOL code into Java, addressing a critical challenge in aging software systems. Leveraging the Legacy COBOL 2024 Corpus -- 50,000 COBOL files from public and enterprise sources -- Java parses the code, AI suggests upgrades, and React visualizes gains. Achieving 93% accuracy, complexity drops 35% (from 18 to 11.7) and coupling 33% (from 8 to 5.4), surpassing manual efforts (75%) and rule-based tools (82%). The approach offers a scalable path to rejuvenate COBOL systems, vital for industries like banking and insurance.
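The hard part of such a translation is not syntax but semantics. COBOL computes in exact fixed-point decimal (e.g. `PIC 9(7)V99`), and its `ROUNDED` clause rounds half-up; a line-by-line port that lands on binary floating point quietly changes results. A short Python sketch of the drift (illustrative, not taken from any bank's codebase):

```python
from decimal import Decimal, ROUND_HALF_UP

# COBOL arithmetic is exact fixed-point decimal; binary floats are not.
assert 0.1 + 0.2 != 0.3                                   # float drift
assert Decimal("0.1") + Decimal("0.2") == Decimal("0.3")  # decimal is exact

# COBOL's ROUNDED clause is half-up; float rounding is not.
interest = Decimal("2.675").quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
assert interest == Decimal("2.68")   # what the legacy system computes
assert round(2.675, 2) == 2.67       # what a naive float port computes
```

A one-cent discrepancy per transaction is invisible to a test suite that only checks "the code compiles and roughly agrees"—and catastrophic at the scale of global finance.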

But it doesn't come without risks:

AI-Generated Code is Causing Outages and Security Issues in Businesses
Businesses using artificial intelligence to generate code are experiencing downtime and security issues, according to Sonar CEO.

This is not to single out these companies—it's to highlight how normal this kind of behavior has become. The culture is shifting toward automating what we don't even fully understand ourselves. This can be dangerous.

The Real Metric: Systems Delivered, Not Code Written

What should we be measuring instead? Not "tokens per second" or "PRs merged per hour." Not even just "features shipped." We should be looking at secure, maintainable, and correct systems delivered per quarter. Systems that real people can trust. Systems that don't collapse when the team that built them leaves. Systems that don't require three engineers and an ops war room to debug every third week.

AI can help maintain these systems. It can assist with code summaries, upgrades, test generation (simple tests), log analysis, scaffolding, refactoring, and more. This alone increases the speed of development and helps ship code faster and onboard new team members more quickly. But it cannot—and should not—be expected to design, justify, or validate the very systems we rely on. That is what developers, QA, and business MUST do.

Less Code Is the Goal

The real power lies in reducing complexity—not generating it faster. Smaller, cleaner, well-thought-out systems are more valuable than large AI-generated ones. Systems that are simple, observable, and correct at their core require less maintenance, less documentation, and less heroism to operate. Code churn gets replaced with thinking and system design, which is still faster, long-term, than undoing an AI-generated nightmare.

We should be asking:

  • What decisions can eliminate an entire layer of code?
  • What abstractions are truly necessary?
  • How do we design for correctness from the start?
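To make the first question concrete, here is a hedged, hypothetical illustration: a hand-rolled caching layer—its dict, eviction policy, and invalidation hooks—is an entire abstraction someone has to maintain. One design decision (make the lookup a pure function of its arguments) lets the standard library replace that whole layer. The function name and data are invented for the example.

```python
from functools import lru_cache

@lru_cache(maxsize=1024)
def exchange_rate(base: str, quote: str) -> float:
    # Stand-in for an expensive lookup (DB call, HTTP request, ...).
    rates = {("USD", "EUR"): 0.92, ("EUR", "USD"): 1.09}
    return rates[(base, quote)]

exchange_rate("USD", "EUR")          # computed once
exchange_rate("USD", "EUR")          # served from the cache
print(exchange_rate.cache_info())    # hits=1, misses=1
```

The deleted custom cache is code nobody has to review, test, document, or debug—which is the point: the best line of code is the one you never had to write.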

These are questions that require human experience, domain insight, and rigorous thinking—not a prompt.