The cheapest system in the AI era is not the one that never changes. It is the one whose parts can be cheaply regenerated because they are small and decoupled.

This claim sounds like architecture-conference wisdom, the kind of thing consultants say to justify rewrites. But something has shifted. The emergence of AI-assisted development has transformed code size from an aesthetic concern into a direct economic variable. Context windows have budgets. Tokens cost money. Every line of code you keep is a line you pay to process, again and again, every time you ask a model to reason about your system.

Compaction—the deliberate practice of making systems smaller—was always the quiet secret behind sustainable software. AI has simply made the economics impossible to ignore.

The Hidden Cost of Keeping Code

Most organizations dramatically underestimate how expensive it is to keep code. Not to write it—that cost is visible in salaries and sprints. Not to run it—that cost shows up in hosting bills. The hidden expense is keeping it: the ongoing cognitive and computational tax imposed by code that exists.

Consider what happens every time an engineer touches a large codebase. Before they can make a change, they must build a mental model of the relevant subsystems. This takes time—sometimes hours, sometimes days. The more code exists, the longer this ramp takes. Senior engineers hesitate to modify things they don't fully understand. Junior engineers make changes without understanding, introducing subtle bugs. The phrase "I'm not sure what this does, so I won't touch it" represents real operational risk, but it rarely appears in any budget.

AI compounds this problem in a new way. When a model assists with development, it reasons over whatever context you provide. Large codebases exceed context limits constantly, which means every prompt becomes a lossy compression of your actual system. The model sees fragments. It infers relationships. It guesses at conventions. Sometimes it guesses wrong, and you pay for those mistakes in debugging time.
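
To make the scale of that compression concrete, here is a rough sketch in Python that estimates how much of a repository a single prompt can actually see, using the common rule of thumb of roughly four characters per token. The window size, file extensions, and ratio below are illustrative assumptions, not measurements.

    # A rough estimate, not a measurement: how much of a repository fits in
    # one context window, assuming ~4 characters per token. The window size
    # and file extensions are illustrative assumptions.
    from pathlib import Path

    CHARS_PER_TOKEN = 4          # crude heuristic; real tokenizers vary
    CONTEXT_WINDOW = 200_000     # example window size, in tokens

    def estimate_repo_tokens(repo_root: str, suffixes=(".py", ".ts", ".go")) -> int:
        """Sum a crude token estimate over source files under repo_root."""
        total_chars = 0
        for path in Path(repo_root).rglob("*"):
            if path.is_file() and path.suffix in suffixes:
                total_chars += len(path.read_text(errors="ignore"))
        return total_chars // CHARS_PER_TOKEN

    if __name__ == "__main__":
        tokens = estimate_repo_tokens(".")
        visible = min(1.0, CONTEXT_WINDOW / max(tokens, 1))
        print(f"~{tokens:,} estimated tokens; one {CONTEXT_WINDOW:,}-token "
              f"prompt sees at most {visible:.0%} of the code")

For many mid-sized codebases the estimate lands well above any single window, which is the lossy compression expressed in numbers.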

There is an obvious objection here that deserves acknowledgment: context windows are growing rapidly. Gemini offers two million tokens. Competitors are racing to match or exceed that figure. Why worry about code size when context is becoming effectively unlimited?

The answer is that capacity and quality are different things. Attention mechanisms degrade with noise regardless of window size. Retrieving relevant information from a massive context is itself a lossy process. More hay does not make the needle easier to find—it makes the search more expensive and less reliable. The constraint is not how much a model can hold but how well it can reason over what it holds. Smaller, cleaner inputs produce better outputs. This remains true whether the window is 100,000 tokens or 100 million.

What Legacy Systems Already Proved

None of this is entirely new. The software industry has been running a decades-long experiment on what makes systems survive, and the results point in a consistent direction.

Large systems fail for a boring reason: humans cannot reason about them. The so-called bus factor—the risk that key knowledge walks out the door when certain people leave—is usually described as a people problem. But it's really a surface-area problem. When only one person understands a system, it's almost always because the system is too big, too implicit, and too entangled for shared comprehension. The knowledge concentrated in that person's head is a symptom of architectural failure, not its cause.

The systems that survived longest tended to share certain characteristics: flat data models, explicit workflows, minimal abstraction, and code that repeated itself rather than hiding behind clever indirection. This last point requires clarification, because it seems to contradict the case for compaction. If repetitive code is good, doesn't that mean more lines, not fewer?

The distinction is between two kinds of complexity. Accidental complexity is bloat—code that exists because of historical accident, defensive layers accumulated over time, abstractions that obscure more than they clarify. Essential complexity is the irreducible difficulty of the problem domain itself. Compaction targets the former, not the latter. A system with explicit, even somewhat repetitive business logic can be smaller than a system with elaborate abstraction hierarchies, because the abstractions themselves consume space and impose cognitive load. The goal is not minimum character count. The goal is minimum semantic complexity: the smallest system that does the job while remaining comprehensible to both humans and machines.
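
A toy contrast makes the distinction visible. The example below is invented, not drawn from any real system; both versions implement the same two discount rules, but only one of them spends most of its lines on the abstraction rather than the rules.

    # Invented example. Accidental complexity: a strategy hierarchy wrapped
    # around two trivial pricing rules.
    class DiscountStrategy:
        def apply(self, total: float) -> float:
            return total  # default: no discount

    class StudentDiscount(DiscountStrategy):
        def apply(self, total: float) -> float:
            return total * 0.90

    class SeniorDiscount(DiscountStrategy):
        def apply(self, total: float) -> float:
            return total * 0.85

    class DiscountEngine:
        def __init__(self) -> None:
            self.strategies = {"student": StudentDiscount(), "senior": SeniorDiscount()}

        def price(self, customer_type: str, total: float) -> float:
            return self.strategies.get(customer_type, DiscountStrategy()).apply(total)

    # Essential complexity only: the same rules, stated flatly.
    def price(customer_type: str, total: float) -> float:
        if customer_type == "student":
            return total * 0.90
        if customer_type == "senior":
            return total * 0.85
        return total

The flat version is more repetitive in spirit but smaller in every way that matters: fewer names, fewer indirections, and nothing for a reader or a model to untangle before changing a rate.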

Legacy systems that survived were not the ones built with the most sophisticated architectures. They were the ones that fit in people's heads.

The Politics of Deletion

If compaction is so valuable, why don't more organizations practice it? The technical answer—that deletion is risky and requires deep understanding—is part of the story. But the more important barrier is organizational.

Code has authors. Authors have feelings and careers. Managers who approved code have reputations attached to it. Deleting a system is not just a technical act; it is a political act that implicitly criticizes past decisions. This is why deprecation efforts so often stall. The engineer who proposes removing a subsystem must navigate a minefield of organizational sensitivities while also taking on technical risk. If the deletion goes wrong, they own the outage. If it goes well, the reward is invisible—the absence of problems that would have occurred otherwise.

There is also the Chesterton's Fence problem: code often exists for reasons that are no longer documented or understood. That strange conditional? It handles an edge case that caused a production incident four years ago. The seemingly redundant validation? It compensates for a bug in a third-party library that was never fixed. Deleting such code requires either deep institutional knowledge or the willingness to rediscover these constraints the hard way.
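
When such code has to stay, the cheapest insurance is to write the fence's reason down next to it. The snippet below is hypothetical; the vendor quirk and incident number are invented to show the shape of the comment, not a real case.

    def normalize_order_id(raw_id: str) -> str:
        order_id = raw_id.strip()
        # Fence: the payment vendor double-encodes IDs containing "+"
        # (hypothetical incident #4312, 2021). Delete this only once the
        # upstream bug is confirmed fixed; otherwise refunds fail silently.
        if "+" in order_id:
            order_id = order_id.replace("+", "%2B")
        return order_id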

This is why compaction, in practice, is a senior-engineer activity. It takes experienced judgment to distinguish load-bearing complexity from accumulated sediment. The cost of that judgment is real. In the short term, it is often cheaper to work around old code than to remove it. The long-term costs of that choice are diffuse and easy to ignore until they become critical.

AI Sharpens the Imperative

What changes in the AI era is that the costs become measurable in ways they never were before.

Cognitive load was always real, but it was hard to quantify. How do you put a number on "the engineers are confused"? Token costs are different. Every unnecessary line of code increases inference cost: more tokens to load, more ambiguity to resolve, more paths for the model to evaluate. When you remove code, you can see the prompt shrink. You can measure the reduction in API calls. Deletion has ROI you can put in a spreadsheet.
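
The spreadsheet in question can be a few lines of Python. Every figure below is an assumption to be replaced with your own numbers: average line length, token price, and how often the code in question actually rides along in a prompt.

    # Back-of-envelope deletion ROI. All constants are assumptions.
    CHARS_PER_TOKEN = 4        # rough tokenizer heuristic
    CHARS_PER_LINE = 40        # assumed average line length
    PRICE_PER_MTOK = 3.00      # assumed dollars per million input tokens
    PROMPTS_PER_DAY = 500      # assumed prompts that carry this code as context

    def annual_token_cost(lines_of_code: int) -> float:
        """Yearly inference spend attributable to keeping these lines in context."""
        tokens_per_prompt = lines_of_code * CHARS_PER_LINE / CHARS_PER_TOKEN
        daily = tokens_per_prompt / 1_000_000 * PRICE_PER_MTOK * PROMPTS_PER_DAY
        return daily * 365

    deleted = 20_000
    print(f"Deleting {deleted:,} lines saves ~${annual_token_cost(deleted):,.0f} "
          f"per year in input tokens alone, under these assumptions")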

The effect goes beyond cost. AI agents—systems that take actions based on model outputs—behave differently in large versus small codebases. In complex environments, agents get stuck in loops. They hallucinate libraries that don't exist. They make changes that break unrelated subsystems because they couldn't see the full dependency graph. Compaction is not just about efficiency; it's about reliability. Smaller systems produce more consistent agent behavior because there's less room for the model to get confused.

This creates a new kind of feedback loop. Teams that maintain compact codebases get more value from AI assistance. That increased value creates resources and motivation for further compaction. Teams with sprawling codebases struggle to use AI effectively, which means they have less capacity to clean things up. The gap between well-maintained and poorly-maintained systems will widen as AI capabilities improve.

The Design of Deletable Systems

Systems that can shrink are systems designed for deletion. This is not the same as systems designed for change, though the two overlap. The critical property is what might be called clear seams: boundaries between components that allow removal without collapse.

Loose coupling is often presented as an architectural virtue in its own right, a marker of good design. In the context of compaction, it's better understood as a prerequisite for deletion. A tightly coupled system cannot shrink because removing any part damages the whole. A loosely coupled system can shed components the way a healthy organization can lose employees—with adjustment, but without crisis.
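
What a clear seam looks like in code is a narrow boundary that callers depend on, so the component behind it can be removed without the callers noticing. The sketch below is illustrative; the names are invented, and a protocol is just one way to cut the seam.

    # Illustrative only. Checkout depends on a narrow interface, so the
    # recommendation subsystem behind it can be deleted or regenerated.
    from typing import Protocol

    class Recommender(Protocol):
        def suggest(self, cart_items: list[str]) -> list[str]: ...

    class NoRecommendations:
        """What remains after the recommendation subsystem is deleted."""
        def suggest(self, cart_items: list[str]) -> list[str]:
            return []

    def render_checkout(cart_items: list[str], recommender: Recommender) -> str:
        lines = [f"- {item}" for item in cart_items]
        suggestions = recommender.suggest(cart_items)
        if suggestions:
            lines.append("You might also like: " + ", ".join(suggestions))
        return "\n".join(lines)

    print(render_checkout(["book", "lamp"], NoRecommendations()))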

Replacement beats refactoring for a related reason. Refactoring preserves historical constraints. It says: this code has problems, but its fundamental structure represents decisions worth keeping. Replacement discards those constraints entirely. When regeneration becomes cheap—when you can describe what you want and get working code quickly—carrying forward old decisions becomes increasingly irrational. The sunk cost fallacy, already a problem in software, becomes even more expensive to indulge.

Deletion is the most underrated operation in software. It eliminates unknown behavior. It collapses state space. It restores comprehensibility. And unlike refactoring, it cannot introduce new bugs in the code it removes. The systems that endure will not be the ones that grew most carefully. They will be the ones that learned how to remove safely.

Where This Leads

If compaction lowers cost and improves AI reliability, then regeneration replaces maintenance as the dominant strategy for certain kinds of software. This is a more radical shift than it might first appear.

Maintenance assumes preservation. Its central question is: how do we keep this system working while making necessary changes? The answer involves careful modification, extensive testing, and respect for existing structure. Maintenance treats code as an asset to be protected.

Regeneration assumes discard. Its central question is: how do we describe what this system should do so we can rebuild it when needed? The answer involves clear specifications, good tests, and confidence that reconstruction will work. Regeneration treats code as a byproduct of understanding—valuable, but not precious.
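
In practice, that shift means the durable artifact is the specification, often expressed as tests, while the implementation underneath is free to be thrown away and rebuilt. The sketch below invents a small shipping rule to show the split; nothing in it comes from a real system.

    # The durable part: a specification written as plain assertions.
    # The disposable part: the function that satisfies it, which can be
    # regenerated at any time as long as the assertions still pass.
    def shipping_cost(order_total: float, region: str) -> float:
        """Disposable implementation; rebuild freely against the spec below."""
        if region == "domestic" and order_total >= 100:
            return 0.00
        if region == "domestic":
            return 5.99
        return 19.99  # flat international rate (invented)

    def test_spec():
        assert shipping_cost(120.00, "domestic") == 0.00
        assert shipping_cost(30.00, "domestic") == 5.99
        assert shipping_cost(500.00, "international") > 0

    if __name__ == "__main__":
        test_spec()
        print("specification satisfied")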

Not all software can or should be regenerable. Critical infrastructure, systems with hard-won safety properties, code that encodes institutional knowledge built up over years—these may always require careful maintenance. But a surprising amount of software is more like scaffolding than like cathedrals. It exists to solve a problem at a moment in time. When the problem or the context changes, the old solution may have less value than a fresh one.

The shift raises uncomfortable questions. If code is disposable, does quality still matter? If understanding lives in prompts and specifications rather than implementations, what happens to the craft of programming? These are genuine uncertainties, not rhetorical flourishes. The definition of quality may be moving from "durability" to "regenerability"—from code that lasts to code that can be reliably reproduced. What that means for how we train engineers, evaluate systems, and think about software as a discipline is not yet clear.

What is clear is that the economics have changed. Compaction was always wise. Now it is also profitable in ways you can measure. The organizations that figure this out first will find themselves with systems that are cheaper to run, easier to understand, and better suited to AI assistance. The ones that don't will be paying a tax on every prompt, forever, for the privilege of carrying code they no longer need.