All writing

Every Abstraction Is a Debt You Haven't Been Billed For Yet

Every abstraction is borrowed complexity: a quiet debt that comes due at 3am, in the layer you chose never to learn. On staying ahead of the bill.

The framework is wonderful right up until it isn’t, and the bill always arrives during the incident.

At 3:14 in the morning the phone goes off, and the thing that’s broken is the one thing you were promised you’d never have to think about. Anyone who runs production systems knows this flavor on contact. The framework you adopted three years ago because it handled all the tedious plumbing, the one whose entire selling point was that you could ignore the layer beneath it, has failed somewhere down in that layer, and now you’re sitting in the dark reconstructing a mental model of machinery you deliberately chose never to learn. The abstraction was supposed to save you from exactly this moment. Instead it saved up everything it hid, with interest, and it’s handing you the invoice now, at the worst possible hour, in a part of the stack you have no map for.

I opened this series with a post about how every button hides a world, and how the power you get at a clean surface is always paid for in distance from the substrate underneath. This is the post about that bill. The distance isn’t free; it’s borrowed, and engineers already carry the right word for what you’re holding.

Abstraction is the best deal in engineering, and like most great deals it’s a form of borrowing. It takes some snarl of complexity and wraps it in a clean interface, so you can act through it without hauling around the weight of what’s beneath. That’s not a trick or a compromise; it’s the foundation of the whole discipline. Dijkstra said it best in his 1972 Turing lecture: the point of abstracting isn’t to be vague, it’s to create a new level where you can be absolutely precise. Nobody writes a web service by reasoning about voltages. The filesystem lets you say “open this file” without thinking about platters, sectors, wear-leveling, or the controller firmware cheerfully lying to you about all three.1 Virtual memory goes a step further and hands every process its own private, near-limitless address space, paging the real bytes in and out behind your back so you can pretend the machine has more memory than it does and that nobody else is touching it. We stand on those layers, each one absorbing the mess of the one below, and that standing is what lets us build anything large at all. The borrowing isn’t the problem. It’s the entire point.

The catch is repayment, and the word for it is already in the building: debt. Ward Cunningham coined the debt metaphor for software in his 1992 OOPSLA report, and he meant something humane by it. Shipping code that reflects your current, partial understanding is like taking a loan, he said, and a little debt is fine as long as you pay it down with a rewrite as you learn. (He went out of his way, in a 2009 video, to say he was never blessing sloppy code; the borrowing was honest, the interest was the part that stung.) That interest is the extra effort every future change costs while the gap between what you understood and what you now understand stays open. Point the same idea at the abstractions themselves and it still holds: every layer you adopt is a position taken on credit. You borrowed someone else’s model of a hard problem. Most days the loan is silent and you forget you’re carrying it. The day it surfaces, you pay, and the abstraction picks the hour.

It surfaces because nothing seals perfectly. A good abstraction holds almost all the time and then opens up at the worst possible moment, and the cruel part is what the opening reveals. While it held, it was saving you time. The instant it failed, the only thing that could save you was a real model of the layer beneath, and if you never built that model, the failure is precisely where you discover you skipped it. You were productive right up until the second you needed the understanding you traded away.

A lot of this debt hides in plain sight as the config nobody understands: a timeout that looks too long, a retry count that looks arbitrary, a flag everyone is quietly afraid to touch. Each one is the system paying interest on something it learned the hard way and then forgot it knew, the knowledge drained out of every human head while the setting kept the scar. (What happens when someone tidies one of those away is a whole post of its own. For now it’s enough to see it for what it is: debt, not clutter.)

The reason it compounds so quietly is that your daily loop never leans on the part the abstraction hides. The happy path works, the tests pass, the demo runs. Production is where the world actually puts its weight on the hidden layer, under load, under concurrency, under the malformed input nobody modeled, and the world is patient. A timeout that’s wrong by a factor of two costs nothing until a downstream service slows down. A retry without backoff is free until the thing it retries against goes away, and then the retries stack up and turn a recoverable hiccup into a sustained outage. A cache that quietly carries your read load is a gift until it restarts cold and a stampede of requests hits the database it was politely hiding. That private, limitless memory is free right up until your working set outgrows RAM and the machine starts thrashing, spending all its time shuffling pages and almost none of it running, the cliff Peter Denning mapped in 1970. Interest on an abstraction isn’t charged evenly; it accrues in silence and gets collected all at once, heaviest exactly where the stakes are highest.

The worst debts are the ones you can never refinance. Curtis Dunham and I have argued that even the instruction set, the supposedly clean line between software and silicon, quietly leaks microarchitecture: a detail as small as a fixed register count, once it’s written into the contract, becomes interest the whole industry keeps paying decades later, long after anyone remembers agreeing to it. I think now we had it only half right. Hiding the microarchitecture buys you flexibility, the freedom to build new machines behind the same interface, but it also starves the layers above. A compiler scheduling instructions, and these days the model driving that compiler, has to reason about the microarchitecture to schedule well, and an ISA that compresses those details away can leave its own users unable to make real use of it. Abstract too little and you weld in the past; abstract too much and you blind the future. Either way the bill arrives; the craft is knowing which details are safe to hide.2

None of this is an argument against abstraction, and I want to be clear about that, because the lazy reading is to distrust every layer and try to understand everything down to the metal, which is both impossible and pointless. Take the loan. Someone who refuses every abstraction ships nothing. The discipline isn’t abstinence; it’s knowing you’re carrying a balance, keeping a rough sense of where it sits, and not being shocked when it’s called. The engineers I trust most aren’t the ones who avoid abstractions, and they aren’t the ones who worship them. They’re the ones who can tell you roughly what their stack is hiding and where it’s likeliest to give, the ones who aren’t surprised at 3am because they always knew the bill was coming, just not the hour. That’s the same move I started the series with, only one level down: stay close enough to the world the interface hides that the day it shows itself, it’s a cost you’d already budgeted for, and not a stranger in the dark.

If your own stack has a setting nobody dares touch and nobody remembers adding, tell me about it. Those are the good ones.

Sources & Further Reading

Footnotes

  1. I’ll never get over how good the Unix and Linux “everything is a file” idea is. Your disk is a file, your terminal is a file, a running process is a directory you can read, a network socket reads like one, and the kernel will hand you the machine’s own memory and state as plain text under /proc. None of those things is “really” a file; the virtual file system is a layer of indirection draped over wildly different hardware and concepts, presenting one boringly uniform interface you can cat, grep, and pipe. It’s the cleanest demonstration I know of the old line that any problem in computer science can be solved by another level of indirection, usually credited to David Wheeler (the inventor of the subroutine) and made famous by Butler Lampson in his 1993 Turing Award lecture. The corollary even fits this post: except the problem of too many layers of indirection. File systems are just cool. If any of this makes you want to go properly deep, two references earn their shelf space: Michael Kerrisk’s The Linux Programming Interface (No Starch Press), which documents this file model down to the last system call, and Andrew Tanenbaum’s Modern Operating Systems, still the clearest account of why it all works the way it does.

  2. On a related note: an abstraction that’s genuinely context-free can be compressed with nothing lost, because each piece means the same thing no matter what surrounds it. The vector instructions we were complaining about weren’t context-free. Their real meaning lived in the loads and stores around them, the memory access pattern they belonged to, and that surrounding context is exactly what the abstraction threw away. Compress something context-free and you lose nothing; compress something context-dependent and you lose the one part that mattered.