Draft — not yet published.

Primitive-Driven Design

The cost of discovering your primitives is now lower than the cost of maintaining architecture you didn't need.

April 21, 2026

A team, ten developers, Java Spring Boot SaaS with paying customers. Complex domain: billing, contracts, fiscal reports. They followed what they understood as Clean Architecture: multiple layers, interfaces for every repository, a use case class for every operation, ports, adapters, DTOs, mappers. The full ritual. Six months ago they added Claude Code to their workflow. Productivity should have doubled.

It didn’t. In some modules, it dropped.

A simple feature, adding a new field, touched eighteen files. The AI coding agent read all of them, consumed context, invented relationships that didn’t exist, sometimes duplicated logic because it couldn’t find the original. Cost per feature went up.

This essay argues for organizing code around what I call domain primitives: the smallest domain-native units that compose into real behavior. Git’s blob, tree, commit, tag. A billing system’s Charge, Period, Rule, Document, Event. A payment terminal’s PaymentMethod, PaymentKernel, UiStep. Few enough to name, powerful enough to build everything from.

This isn’t just an anecdote. A February 2026 paper, The Navigation Paradox in Large-Context Agentic Coding (Paipuru, arXiv:2602.20048), measured the problem on a FastAPI codebase using Claude Sonnet. When files are connected through dependency injection containers rather than direct imports, the agent correctly identifies only 76% of the required files. One in four critical files, missed silently. When given a specialized tool to navigate these hidden connections, the agent ignored it 58% of the time. A rational heuristic: “glob and read already give me 80%, the extra tool call isn’t worth it.”

An important distinction. This is about container-based DI, runtime wiring invisible to static analysis. Not constructor injection with explicit imports, which remains fully traceable. The problem is architectural invisibility, not dependency inversion as a principle. Robert Martin’s Dependency Rule and Open-Closed Principle remain sound ideas. What the data exposes is the distance between those ideas and what most teams actually build: the eighteen-file version, the interface-per-single-implementation, the mandatory ceremony that the original authors never prescribed.

The paper studied a single Python codebase with a single model, so the specific numbers shouldn’t be overgeneralized. But the pattern is structural. Every file in your project is a token in the agent’s context. Every indirection is a hop the agent must trace. Every abstraction without a real purpose is noise competing with the logic that matters. These costs existed before AI. Now they’re measurable.

The expensive part isn’t building software anymore. It’s knowing what to build. What if we organized code not by architectural layers, but by the domain’s own structure?

The pendulum trap

The obvious reaction is to flatten everything. One module, procedural code, no abstractions. Ship it.

This works on day one. It erases the domain on day thirty.

A team I work with built a payment terminal HAL in Rust. The domain: abstract over the EMV payment kernel, which differs for every manufacturer, so the product has no vendor lock-in. The early architecture modeled the domain explicitly. A PaymentMethod trait for EMV, PIX, loyalty points. A PaymentKernel trait per manufacturer. A UiStep enum for what the screen should show next. And a core state machine driving the payment flow, solid work by the team. Each payment method lived in its own crate. Rust’s feature flags handled compilation per manufacturer target.

The team’s concern was legitimate. This looked like a lot of structure for a system that initially needed only one payment method. The decision was to simplify into one procedural crate now and add the trait-based architecture later, in a dedicated iteration. This is a common and reasonable position: separate feature delivery from architectural work, tackle them in different sprints.

The disagreement was about when. The architecture wasn’t speculative. The company’s strategy explicitly required multi-manufacturer and multi-payment-method support. The traits weren’t future-proofing. They were modeling the domain as it already existed in the requirements. Deferring them meant building the domain once without its structure, then rebuilding it with structure later. Two passes instead of one.

In the procedural implementation, the domain disappeared. Functions calling functions, conditionals branching on payment type, UI logic tangled with protocol logic. The system no longer looked like a payment terminal HAL. It just looked like code. You couldn’t point at a module and say “this is what a payment method is.”

After debate, the team aligned on the trait-based design, not because anyone called it “primitive-driven,” but because the domain demanded it. Then the real test came. The bank was hosting a stand at a franchise event and wanted a creative activation: customers navigate journeys at the stand, earn points (a currency we called PACTs) and spend them at a company store using our payment terminals.

Since we already had a ledger built for the BTG Pay product, we dogfooded our own infrastructure: we created a namespace on the ledger for PACT wallets and, on the spending side, implemented a PACT payment method on the HAL. One new crate, hal-pact, implementing the existing PaymentMethod trait. Zero changes to the core, zero changes to the Flutter UI, zero changes to the EMV payment flow. The team added a new payment method like plugging in a cartridge. And strategically, even before BTG Pay terminals launched for external customers, attendees saw the bank relying on its own payment infrastructure for the event.

The UiStep enum is worth pausing on. It inverts the dependency. The HAL drives the Flutter UI, not the other way around. Flutter doesn’t know about payment methods. It knows about UI steps: render a QR code, show an amount, redirect to another app. New payment method needs a QR code screen? Returns UiStep::QrCode. Flutter already handles that. Most new payment methods need zero UI changes because they reuse existing step variants.
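A minimal sketch of that inversion. The type names (UiStep, PaymentMethod, the PACT method) come from the essay; every signature, variant, and payload format below is my assumption, not the team’s actual code:

```rust
// Hypothetical sketch of the UiStep inversion. Type names are the essay's;
// the signatures and variants are illustrative assumptions.

/// What the screen should show next. Flutter knows steps, not payment methods.
#[derive(Debug, Clone, PartialEq)]
enum UiStep {
    ShowAmount { cents: u64 },
    QrCode { payload: String },
    RedirectToApp { package: String },
    Done,
}

/// A payment method drives the flow by emitting UI steps.
trait PaymentMethod {
    fn next_step(&mut self, amount_cents: u64) -> UiStep;
}

/// A new method like PACT reuses an existing step variant: zero UI changes.
struct Pact { wallet_id: String }

impl PaymentMethod for Pact {
    fn next_step(&mut self, amount_cents: u64) -> UiStep {
        // Encode a charge request the wallet app can scan as a QR code.
        UiStep::QrCode { payload: format!("pact:{}:{}", self.wallet_id, amount_cents) }
    }
}
```

The direction of the dependency is the point: the UI layer implements a fixed vocabulary of steps, and each new payment method speaks that vocabulary rather than demanding new screens.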

The traits weren’t overengineering. They were the domain. A payment terminal that supports multiple manufacturers and multiple payment methods IS a system with those polymorphisms. Deferring the structure didn’t save work. It doubled it.

A subtler version of the trap: KISS and YAGNI, weaponized together.

KISS says keep it simple. YAGNI says don’t build what you don’t need. Both are correct. But applied carelessly, they become a rhetorical trap. “We don’t need that trait, YAGNI. Just put it in one function, KISS.” The domain’s real complexity gets hidden behind a claim of simplicity.

The problem is the word “simple.” Rich Hickey drew the distinction that matters here: simple means few things braided together. Easy means familiar, close at hand. A five-branch case statement mixing dispatch, validation, and protocol logic is easy. You know how to write it. A trait with five separate implementations is simple. Each piece does one thing, nothing tangled. KISS should mean “keep it simple” in Hickey’s sense, not “keep it easy.”
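A compact illustration of the distinction, using invented payment types (nothing here is from a real codebase). The first function is easy in Hickey’s sense: familiar, and braiding three concerns. The trait below it is simple: each piece does one thing.

```rust
// "Easy": one function braiding dispatch, validation, and protocol logic.
fn handle_easy(kind: &str, cents: u64) -> Result<String, String> {
    match kind {
        "emv" => {
            if cents == 0 { return Err("amount required".into()); }
            Ok(format!("emv-auth:{cents}"))
        }
        "pix" => {
            if cents == 0 { return Err("amount required".into()); }
            Ok(format!("pix-qr:{cents}"))
        }
        _ => Err(format!("unknown kind: {kind}")),
    }
}

// "Simple": validation lives in one place, each protocol in its own impl.
trait Handler {
    fn protocol(&self, cents: u64) -> String;
}

struct Emv;
impl Handler for Emv {
    fn protocol(&self, cents: u64) -> String { format!("emv-auth:{cents}") }
}

struct Pix;
impl Handler for Pix {
    fn protocol(&self, cents: u64) -> String { format!("pix-qr:{cents}") }
}

fn handle_simple(h: &dyn Handler, cents: u64) -> Result<String, String> {
    if cents == 0 { return Err("amount required".into()); }
    Ok(h.protocol(cents))
}
```

Both versions do the same work. The second untangles it: a third payment type is a new impl, not a new branch threaded through shared validation.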

Sometimes a concept genuinely has two strategies. A pack registry that resolves domain packs from either the local filesystem or a git repository. In Elixir, a Behaviour (the language’s interface construct) models this. In Rust, a trait. In Go, an implicit interface. These constructs are free. They’re part of the language, not an architectural framework. Choosing not to use them means writing your own ad-hoc dispatch: a conditional chain, a type tag, a string comparison. You build a worse version of what the language already provides.

YAGNI and KISS should apply to ceremony. Don’t add layers you don’t need, don’t introduce patterns for hypothetical futures. But the domain’s complexity isn’t ceremony. It exists whether you model it or not. A payment terminal with three payment methods has three strategies. That’s not added complexity. That’s the problem. Using the language’s first-class tools to express it costs near zero. Avoiding them costs you an inferior substitute.

The question was never “should we abstract?” It was always “does this abstraction correspond to something real in the domain?”

What the best systems have in common

Git has four object types: blob, tree, commit, tag. From those four things you get version control, branching, merging, rebasing, bisecting, and, as it turns out, issue tracking. The power isn’t in architecture around Git. It’s in the choice of primitives.

Unix has one I/O abstraction: the file descriptor. Everything is a byte stream. Disk, terminal, network, pipe, device. From that single primitive and the pipe operator, you get an entire operating system philosophy.

SQL has three concepts (relation, tuple, attribute) and every database query ever written composes from them. WordPress has six primitives: Post Type, Taxonomy, Term, Meta, Block, Hook. A WooCommerce product is a Post Type. A product category is a Taxonomy. Price is Meta. The checkout composes through Hooks. WordPress’s code quality is legendarily bad. Its primitive design is legendarily good. These are independent facts. Six primitives power 43% of the web.[1]

Different paths. Mathematical analysis for Git, practical evolution for WordPress. None followed a methodology. What they share is the outcome: a small set of composable units that could express everything the domain needed to say.

I call this pattern Primitive-Driven Design. Not because I invented it (these systems existed long before this essay) but because naming it makes it teachable, debatable, and something you can aim for deliberately rather than arrive at by accident.

If this name ever becomes a certification program or a framework with a dependency injection container, it will have failed at its purpose. The goal is a question you ask, “can I name my primitives?”, not a process you follow.

Two things this is NOT, to get ahead of the obvious objections. It is not a claim that all software should be modeled this way. Integration domains, systems that translate between fifteen external APIs, genuinely need adapter layers. And it is not a claim that finding the wrong primitives is costless. Commit to the wrong set in month two, discover it in month eight, and every composition needs reworking. More on both later.

Prior art and an honest reckoning

Clean Architecture and Domain-Driven Design were legitimate attempts to solve a real problem: how do you model a domain faithfully and protect that model from infrastructure concerns? Both Evans and Martin were trying to help teams build software that looks like what it models. That instinct was right. The problem isn’t the principles. It’s what happened to them on the way to production.

Framework authors, tutorial writers, and conference speakers turned the principles into ceremony. Evans’ Bounded Contexts became folder structures. Martin’s Dependency Rule became mandatory interface-per-implementation. The intent (model the domain faithfully) got buried under the ritual (organize files in concentric circles). Teams started measuring compliance to the pattern rather than fidelity to the domain. That’s the thing this essay pushes back against: not Clean Architecture as Martin conceived it, but the cargo-cult version that most teams ship.

A language confound worth naming: the eighteen-file ceremony is specifically the shape Clean Architecture takes in Java and C#, where the framework forces you to create interface-implementation pairs for DI container wiring. In Rust, a trait is free. In Elixir, a Behaviour is free. In Go, interfaces are implicit. The same domain model that costs eighteen files in Spring costs three in Elixir. Some of what looks like an architecture problem is actually a language-ergonomics problem. PDD’s feasibility is partly a function of language choice, and I should be honest about that.

None of the underlying ideas are new. Evans called it Distillation in Chapter 15 of the DDD book: isolate the Core Domain, the smallest model that captures essential complexity. What I call “primitives,” he calls the Core Domain after distillation. Martin’s Open-Closed Principle already says extend without modifying, which is exactly what the PaymentMethod trait does. Hickey’s “Simple Made Easy” gave me the language to explain why a trait with five implementations is simpler than a case statement with five branches.

Joe Armstrong’s PhD thesis on Erlang is almost literally a PDD manifesto: process, message, supervision. Three primitives composing into fault-tolerant distributed systems at industrial scale.

What’s new is the cost argument. Before 2024, architectural ceremony cost you cognitive load, which is subjective and easy to dismiss. Now it costs you tokens, navigation accuracy, missed files. The Navigation Paradox paper put numbers on what good architects have always felt. That’s the contribution here. Not a new design instinct, but a new reason to follow an old one.

This pattern didn’t come from studying architecture theory. It came from working across five domains and watching the same shape appear each time.

It started with git. Studying the internals, I realized four object types (blob, tree, commit, tag) power everything git does. That led me to build git-native-issue: issues mapped directly onto those objects, manipulated with plumbing commands (git’s low-level object tools), no translation layer. Issue is a tree, state is a ref, history is a commit.

Then at BTG Pactual. When Miano, who led the digital banking initiative, brought me on five years ago, the vision was to be the financial operating system for corporate clients. Last month, preparing a talk on software architecture at an event for our partners, I looked at the mature platform and recognized the banking primitives: accounts, transfers, payments, investments. And the vision had evolved. We provide the banking primitives. Our partners build the operational primitives for each vertical they serve. The talk became about positioning partners as co-constructors of that system, not just API consumers.

Then StateCraft, a personal project. A deterministic engine for reasoning under uncertainty, built because AI produces plausible output without separating what it knows from what it infers from what someone claimed. The primitives that emerged (Entity, Fact, Belief, Lens, Transition, Scenario) were a direct response to that problem. Domain-agnostic: I use it for geopolitics and markets, but the engine doesn’t care.

The payment terminal HAL was the latest confirmation. A different team, a different language, a different domain. Same pattern.

I didn’t theorize PDD and then apply it. I applied it five times and then named it.

The diagnostic

The practical test: can you name your primitives?

If you can point at your codebase and say “these are the four things, and everything else is a composition of them,” your architecture is probably healthy. If you can’t, if features require new entities rather than new compositions, the primitive set is either incomplete or hasn’t been discovered yet.

Think of it as a fitness function: can this feature be expressed as a composition of existing primitives? When the answer is no, you’ve found either a missing primitive or a genuine new domain concept. Either way, you find it during delivery, not in a planning phase.

A second test: does the system look like the thing it models?

When you read a StateCraft module, you see Entities, Facts, Beliefs, Lenses, Transitions. The code looks like what it is. When you read git-native-issue’s source, you see git objects manipulated with git plumbing commands. The code looks like what it is. When the system looks like its domain, the primitives are right. When it looks like layers and adapters and mappers, you’re looking at the architecture, not the problem.

Every domain has primitives

A common objection: “Sure, Git and Unix have elegant primitives, but my domain is just CRUD.”

CRUD is a set of operations. It’s not a description of the domain. Create, Read, Update, Delete are what you DO with primitives. They’re not what the primitives ARE.

Take billing. A domain everyone considers “just CRUD.” What are the primitives?

Five: Charge, Period, Rule, Document, Event. A subscription is a recurring Charge with a Period cadence. A discount is a Rule that reduces a Charge. Tax is a Rule on a Document. Payment is an Event on a Document. Every billing feature composes from these five things.

In the cargo-cult version of Clean Architecture that most teams end up building, adding a new discount type touches the use case, the entity, the repository interface, the repository implementation, the DTO, the mapper, and probably the tax calculation because discounts affect the tax base. Seven files. Martin didn’t prescribe this. But this is what gets built.

With the right primitives, adding a new discount type is a new Rule implementation. The composition engine already knows how to apply Rules to Charges. One file, plus a test.
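As a sketch, with the essay’s primitive names but signatures that are entirely my assumption: the composition engine folds Rules over a Charge, and a new discount type is one new impl.

```rust
// Sketch of the billing primitives. Charge and Rule are the essay's names;
// the shapes and the cents arithmetic are illustrative assumptions.

/// A Charge: an amount owed, in cents, before rules apply.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Charge { cents: i64 }

/// A Rule transforms a Charge. Discounts and taxes are both Rules.
trait Rule {
    fn apply(&self, charge: Charge) -> Charge;
}

/// The composition engine: fold every rule over the charge, in order.
fn apply_rules(charge: Charge, rules: &[Box<dyn Rule>]) -> Charge {
    rules.iter().fold(charge, |c, r| r.apply(c))
}

/// Adding a new discount type is one new impl. The engine doesn't change.
struct PercentDiscount { percent: i64 }

impl Rule for PercentDiscount {
    fn apply(&self, charge: Charge) -> Charge {
        Charge { cents: charge.cents - charge.cents * self.percent / 100 }
    }
}
```

The engine itself is the real complexity the next paragraph concedes, but it is named, central, and visible: every transformation of a Charge goes through one fold.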

Fair objection: that composition engine IS real complexity. Finding the primitives is the easy part. Composing them is the hard part. A late-arriving usage event that crosses a period boundary and triggers a retroactive discount is genuinely hard. In the primitive model, that’s an Event creating a new Charge in a previous Period, re-applying Rules, regenerating the Document. It’s not simple. But you can follow the chain. In a layered architecture, the same operation scatters across the use case, the discount service, the tax recalculation service, the invoice regeneration service, and the event bus. The complexity is the same. The traceability is not.

Stripe proves this. Stripe’s API is primitive-driven: Charge, PaymentIntent, Invoice, Subscription, Price. And Stripe is still legendarily complex. The composition rules (webhook ordering, idempotency, the Balance Transaction model) are hard. PDD doesn’t promise that composition is easy. It promises that complexity lives where you can see it, in explicit rules over named primitives, rather than scattered across layers that exist for organizational reasons.

So yes, you still need to think carefully about how primitives compose. That thinking IS the architecture. The difference is that the thinking happens in domain terms (how do Rules apply to Charges across Periods?) rather than in infrastructure terms (which service calls which service through which event bus?).

The pattern holds beyond billing. Notion found Block, Page, Database, Relation, View. Linear found Issue, Project, Cycle, View, Workflow. Excel found Cell, Formula, Range, Sheet. The question isn’t whether your domain has primitives. It’s whether you’ve found them yet.

One thing worth saying plainly: if you’ve found the right primitives and you know how they compose, the architecture is already there. You don’t need a separate sprint for it. A new discount type is a new Rule implementation. Shipping it tests the primitive set. If the Rule behaviour handles it cleanly, the architecture is confirmed. If it doesn’t, you’ve found where the primitives need to evolve. The primitives you compose today are the same ones the next feature will compose tomorrow. That’s not a separate cost. That’s just building software.

Why AI makes this urgent

Before 2024, bad architecture cost you time. Now it costs you tokens, and the bill is itemized.

The Navigation Paradox paper (Tarakanath Paipuru, arXiv:2602.20048) found that container-based dependency injection creates connections invisible to static analysis. These hidden dependencies cause the agent to miss roughly one in four critical files. The paper’s own solution is CodeCompass, an MCP (Model Context Protocol) server that exposes the AST dependency graph so the agent can navigate structural connections. When the agent uses it, coverage jumps to 99.5%. The problem is adoption: 58% of the time, the agent skips the tool entirely.

Paipuru’s finding cuts two ways. One response is better tooling: give the agent a graph and it navigates fine. Another is architecture that makes the graph trivially visible to begin with, so the agent doesn’t need a special tool. PDD is the second bet. Explicit composition (imports, behaviour implementations, protocol dispatch) gives the AST the full picture without extra infrastructure. The agent follows domain relationships, not layer boundaries.

Both responses are legitimate. I’m not arguing the paper vindicates PDD. I’m arguing PDD makes the problem the paper identified less likely to occur.

Runtime process registries, database triggers, environment-variable-driven config, dynamic message passing: these remain invisible to static analysis in any architecture. PDD reduces hidden dependencies. It doesn’t eliminate them.

The context efficiency gain is real but varies. Less ceremony means more domain logic per context window.

The bigger win is constraint. Tell an AI “this system has five primitives: Charge, Period, Rule, Document, Event” and it becomes immediately obvious when it proposes a sixth. You catch it in review. The primitives make scope violations visible. Without them, the AI wires anything to anything and you don’t notice until the PR is already up.

A project’s context document, whatever guides the agent, becomes the domain description: “here are the primitives, here are the composition rules.” That’s more useful than “we use Clean Architecture with ports and adapters.” Though navigation hints like file conventions and naming patterns remain equally important for day-to-day agent performance.

What this doesn’t fix: test generation, multi-step refactoring, long-horizon planning, data migrations. Those stay hard regardless of how you organize code. Primitive discovery helps with navigation, not everything.

The overengineering test

A testable definition: an abstraction is overengineered when it doesn’t correspond to a real category in the domain.

A Behaviour that models a real polymorphism, like two pack resolvers. A Result type that models real uncertainty, because this query might actually fail. An enum that models real variants, because a charge can be one-time, recurring, or usage-based. These are the domain expressed in the language’s own tools.

An IUserRepository interface for a single implementation. A CreateInvoiceUseCase class wrapping what could be a function call. A mapper translating between three identical DTOs. Ceremony. The abstraction exists because a template said so.

The test is one question: does this model something that exists or varies in the domain, or does it exist because of an architectural pattern?

Finding your primitives

Primitives aren’t invented. They’re discovered. The discovery requires understanding the domain before abstracting it.

Five strategies, from studying how the most successful primitive sets were found:

Mathematical reduction. What is the formal structure? Git started from content-addressable storage. SQL started from set theory. If your domain has a known mathematical foundation (finance, scheduling, routing) the math gives you the primitives.

Unifying abstraction. What do these seemingly different things share? Unix found that disk, terminal, and network all share byte-stream semantics. WordPress found that posts and pages are both “content with a type.” When you’re building your third similar-but-different thing, look for the common structure.

Domain interrogation. What things exist in this domain? What can vary? What states are possible? How do things combine? How do they change over time? Ask these questions directly and map each answer to a construct. StateCraft found its six core primitives this way: Entity (what exists), Fact (what’s observed), Belief (what’s estimated), Lens (how you look), Transition (how things change), Scenario (what could happen).
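A hedged sketch of that question-to-construct mapping. StateCraft’s actual types are not shown in this essay; everything below is one invented possibility, kept deliberately minimal:

```rust
// Illustrative only: one possible encoding of the six StateCraft primitives.
// The names are the essay's; every field and variant is an assumption.

struct Entity { id: String }                          // what exists
struct Fact { entity_id: String, claim: String }      // what's observed
struct Belief { entity_id: String, credence: f64 }    // what's estimated
struct Lens { name: String }                          // how you look
enum Transition { Gradual, Shock }                    // how things change
struct Scenario {                                     // what could happen
    transitions: Vec<Transition>,
    probability: f64,
}
```

The value of the exercise isn’t the struct definitions. It’s that each interrogation question forces exactly one construct, so missing answers show up as missing types.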

Substrate mapping. What does your platform already give you? Don’t invent new abstractions when battle-tested primitives exist. git-native-issue mapped issues onto git’s four object types using only plumbing commands. No translation layer, no ORM. The system looks like the thing it models. If your substrate’s primitives fit your domain, use them directly.

Generalization under pressure. This new thing looks like that existing thing, can you generalize? The most common and safest strategy. Don’t generalize on the first instance. Don’t generalize on the second. When the third similar thing appears, extract the primitive. WordPress evolved this way over years.

The process is closer to scientific hypothesis-testing than to engineering specification. Start flat. Write procedural code. Collect instances. Then question the domain: what exists, what varies, what composes. Identify candidate primitives. Validate: can every feature be expressed as composition? Express them natively in your language’s first-class constructs. If the mapping feels forced, either the primitives are wrong or the language isn’t the right fit.

One caveat. “Start flat” assumes you don’t yet know what the primitives are. When the domain polymorphism is already real and contractual, as it was with the payment terminal (multi-manufacturer, multi-payment-method from day one in the requirements), deferring structure doubles the work. The flat phase is for when you’re still discovering. Not for when the domain has already told you.

How many primitives? Doesn’t matter as much as whether each one is independent. Three might be right. Twelve might be right. You write concrete code to discover the domain’s primitives, not to avoid them.

Where this breaks

PDD has failure modes, and they’re worth naming before someone discovers them in production. Pretending otherwise would repeat Clean Architecture’s mistake of prescribing one answer for every context.

It applies within a bounded context, not across an entire system. An ERP with eight hundred tables isn’t one primitive set. It’s fifteen bounded contexts, each with its own. Strategic DDD comes first.

Integration domains genuinely need adapter layers. When you translate between fifteen external APIs, each with its own data model, the domain IS adaptation. Ports and Adapters isn’t ceremony there. It’s the domain truth.

And wrong primitives are worse than no primitives. Commit to the wrong set in month two, discover it in month eight, and every composition that used the wrong primitive needs reworking. The mitigation is patience: generalize under pressure, wait for the third instance, treat early primitives as hypotheses not commitments.

You discover primitives through building, and sometimes you get them wrong. The wrong hypothesis discovered early beats the wrong hypothesis discovered late.

One thing I’ll say for PDD’s failure mode: it’s loud. Features that don’t compose tell you immediately. Ceremonial architecture fails quietly. Eighteen files “work” whether they need to or not. The cost just… accumulates.

Primitives, not layers

The architecture debate has been stuck for a decade. One camp says add layers for safety. The other says flatten for simplicity. Both organize code by technical concerns, where things go, rather than by what the domain is.

Primitive-Driven Design changes the question. Not “how many layers?” but “what are the composable units of this domain?” Not “should we abstract?” but “does this abstraction model something real?”

The approach isn’t new. Git did it. Unix did it. SQL did it. WordPress stumbled into it. OTP did it with processes, messages, and supervision trees. Nobody named it.

The AI era made the cost of ignoring it visible. Every unnecessary file is tokens. Every hidden dependency is a missed edit. Every ceremonial abstraction is noise where domain truth should be.

Find your primitives.


Emerson Soares builds StateCraft, a deterministic reasoning engine for uncertainty, and git-native-issue, an issue tracker built on git’s four object types. He writes at remenos.codes.


Authoring note. Drafting assistance by Claude. All arguments, examples, judgments, and claims are the author’s. AI was used for rendering (organizing, rephrasing, tightening), not for originating the thesis or authoring claims.

Footnotes

  1. W3Techs, “Usage statistics of content management systems,” April 2026.