Architecture discussions go sideways because everyone is reasoning at internally inconsistent altitudes, detailed where they’re fluent and hand-wavy where they’re not. C4 Level 2 is the altitude where nobody gets to hide, and that’s why it aligns humans and agents alike.

Everyone’s Altitude Is Already Inconsistent

The standard story about architecture and design discussions failing is that people disagree. My experience is the opposite: most of the time, people don’t disagree enough, because they’re not drawing the same kind of thing.

As engineers, we default to inconsistent altitudes inside our own heads. For the parts of the system we own, we’re zoomed all the way in: class names, method signatures, internal state machines. For the parts we don’t own, we’re zoomed all the way out: “the API,” “the queue,” “the service,” “the system.” This isn’t laziness; it’s the default operating mode for a brain that knows some things well and other things poorly or not at all.

What this produces, in practice, is a room full of people all drawing different diagrams in their heads while pointing at the same whiteboard. The auth specialist is committing to specific token structures. The frontend lead is drawing a single rectangle where three services actually live. The tech lead is nodding along because the parts they understand deeply sound rigorous and the parts they don’t understand deeply sound… fine, probably?

Devils hide in the details: behind unfamiliar domains, behind rectangles without boundaries, behind arrows without protocols. The architecture meeting that feels productive is often the one that let the most devils through.

C4 Level 2 Is Where the Abstraction Stops Lying

C4’s Level 2, the container diagram, is the altitude where abstraction runs out of places to hide.

A container, in C4 terms, is a process with a runtime: a web service, a worker, a database, a browser app, a mobile app, a queue broker. It’s something that gets deployed and runs on its own, not a class or a module. The arrows between containers are network calls: HTTP, gRPC, a message on a topic, a read from a database. They’re wire-level communication, not function calls or imports.
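To make the vocabulary concrete, here is a minimal sketch of a container diagram as plain data. The container names, runtimes, and payloads are hypothetical and exist only for illustration; C4 itself does not prescribe any particular data format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Container:
    name: str            # the deployed process, e.g. "api"
    runtime: str         # what it runs as, e.g. "HTTP service"
    responsibility: str  # one sentence, no internals


@dataclass(frozen=True)
class Connection:
    source: str    # name of the calling container
    target: str    # name of the container being called
    protocol: str  # wire-level label: "HTTP", "gRPC", "SQL", a topic, ...
    payload: str   # what is exchanged over the wire


# Hypothetical example system (an assumption, not taken from a real codebase).
containers = [
    Container("web-app", "browser SPA", "Lets users manage notification preferences."),
    Container("api", "HTTP service", "Serves preference reads and writes."),
    Container("user-db", "PostgreSQL", "Stores users and their preferences."),
]

connections = [
    Connection("web-app", "api", "HTTP", "preference updates"),
    Connection("api", "user-db", "SQL", "user and preference rows"),
]


def describe(conn: Connection) -> str:
    """Render one arrow the way it would be labelled on the diagram."""
    return f"{conn.source} -> {conn.target} [{conn.protocol}]: {conn.payload}"


for conn in connections:
    print(describe(conn))
```

Note what the types leave out: there is no field for classes, modules, or function calls, so the model cannot drift below Level 2.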

This altitude works because it’s the lowest rung where you’re still drawing boxes and arrows instead of designing code. Above it, in context diagrams and system landscapes, you can hand-wave. Below it, at the component level, you’re committing to implementation before you have to. Level 2 is the altitude at which end-to-end system architecture actually takes shape.

Once the room has a social contract to work at Level 2, you don’t get to draw a black box around the part you’re fuzzy on. Name the process. Name the protocol. If you can’t, that’s the conversation you need to be having.

Process boundaries aren’t a convention C4 invented; they’re how software has actually run for decades. The OS enforces them. Network calls aren’t a metaphor either; the wire enforces them. Level 2 is the least abstract abstraction, the last useful stop on the ladder before you’re just writing the code. Because it’s tied to physical reality, it can’t drop the details that drive complexity: latency, partial failure, ordering, backpressure, deployment boundaries, who gets paged at 3am.

Event-driven architectures are the cleanest demonstration of this. At Level 1, an EDA looks tidy: a few services reacting to events, clean separation, loose coupling. At Level 2 it’s a dozen processes, a broker, a schema registry, dead-letter queues, consumer groups, replay semantics, and an argument about at-least-once versus exactly-once that nobody wants to have. The complexity was always there; the altitude was hiding it. None of this is an argument against EDA. The point is that most EDA designs accumulate emergent complexity unnoticed because it’s more convenient to work at a higher altitude, and by the time you’re running one in production you’re discovering that complexity the hard way.
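The at-least-once argument is easier to have with a toy example in front of you. The sketch below simulates a broker that redelivers a message after a lost acknowledgement, then compares a consumer that trusts every delivery with one that deduplicates by message id. Everything here is an assumption made for illustration: the event shape, the function names, and the in-memory `seen` set, which a real consumer would need to replace with durable storage.

```python
def deliver_at_least_once(messages, redeliver_ids):
    """Yield every message once, plus a duplicate for each id in redeliver_ids.

    This simulates a broker retrying after a lost acknowledgement.
    """
    for msg in messages:
        yield msg
        if msg["id"] in redeliver_ids:
            yield msg  # the retry: same message, delivered again


def naive_consumer(deliveries):
    """Sums every delivery, so a redelivered message is counted twice."""
    return sum(msg["amount"] for msg in deliveries)


def idempotent_consumer(deliveries):
    """Remembers processed ids and skips any delivery it has already handled."""
    seen = set()
    total = 0
    for msg in deliveries:
        if msg["id"] in seen:
            continue  # duplicate delivery, already applied
        seen.add(msg["id"])
        total += msg["amount"]
    return total


# Hypothetical events; "e2" is redelivered once.
events = [{"id": "e1", "amount": 10}, {"id": "e2", "amount": 5}]

naive_total = naive_consumer(deliver_at_least_once(events, {"e2"}))
safe_total = idempotent_consumer(deliver_at_least_once(events, {"e2"}))
print(naive_total, safe_total)  # 20 15
```

Neither consumer is visible at Level 1. The duplicate only shows up once the broker and its delivery semantics are drawn as part of the picture.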

Agents Like Level 2 for the Same Reason

This isn’t a fact about human psychology so much as one about reasoning under abstraction, and it applies to LLM agents too.

Ask an agent to “summarise the architecture for this system.” You’ll get reasonable but patchy results: a description of key components and technologies. It’s not wrong, but it’s almost certainly incomplete and potentially misleading. The agent is doing exactly what engineers do when you don’t constrain the altitude: hiding in abstraction where it’s unsure, elaborating where it feels fluent.

Now try this prompt:

Map this system as a C4 container diagram. For each container, list: the process name, its runtime, and its responsibility in one sentence. For every connection between containers, specify: the protocol (HTTP, gRPC, message queue, shared DB, webhook), authentication method (if used), the direction, and the data being exchanged. Do not propose internal components, classes, or modules.

The output is different in kind. You get a list of processes that actually need to exist, with protocols that imply failure modes, and connections that raise real questions (“why is the notification service reading directly from the user DB?”) because the specific grounding at the right altitude makes the questions askable.
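Once the connections are written down with protocols, questions like the notification-service one can be surfaced mechanically. As a rough sketch, assuming the agent’s output has been transcribed into `(source, target, protocol)` tuples, the function below lists every database that more than one container talks to directly. The container names and the use of `"SQL"` as the marker for database access are hypothetical.

```python
from collections import defaultdict

# Hypothetical connections transcribed from a container diagram:
# (source container, target container, protocol)
connections = [
    ("api", "user-db", "SQL"),
    ("api", "notifications-topic", "topic publish"),
    ("notification-service", "notifications-topic", "topic consume"),
    ("notification-service", "user-db", "SQL"),
]


def shared_databases(connections, db_protocols=frozenset({"SQL"})):
    """Return {database: sorted callers} for databases with more than one caller."""
    callers = defaultdict(set)
    for source, target, protocol in connections:
        if protocol in db_protocols:
            callers[target].add(source)
    return {db: sorted(srcs) for db, srcs in callers.items() if len(srcs) > 1}


print(shared_databases(connections))
# {'user-db': ['api', 'notification-service']}
```

A shared database is not automatically a mistake. The point is that the question only becomes askable once every arrow carries a protocol.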

It’s the same mechanism as the human case. Agents, like humans, will hide in abstraction if you let them. The Level 2 constraint removes the hiding places by forcing commitments to operational reality. It’s also why vague architecture prompts produce vague architecture: the model is matching the altitude of the request, and the request never asked it to land.

How to Run a “Level 2 Session”

The in-meeting rules are simple. Enforce three things to stop architecture conversations drifting:

  • Name every process. “The notification service” becomes “the notification-dispatcher worker, running on ECS, consuming from the notifications topic.” If nobody in the room can fill that in, you’ve just discovered the work that needs to be done first (Know Your Technology).
  • Name every protocol. HTTP, gRPC, a queue, a shared database read, a webhook: the protocol determines the failure mode, and the failure mode is the architecture. Explicitly label the AuthN/AuthZ method each connection uses, and ask what interactions that method requires.
  • Stop before components. The moment someone starts naming classes, modules, or internal design patterns, you’ve dropped out of architecture and into design. That’s a different meeting with a different set of participants. End the drift out loud: “we’re into components, let’s note it and come back.”

Three simple rules with one social contract: nobody gets to hand-wave past the parts they don’t know, because those parts are exactly where the design decisions are at risk.
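The three rules are mechanical enough to check against meeting notes. Here is a minimal sketch of such a check, assuming the notes are captured as plain dictionaries. The field names, the list of vague names, and the keyword hints for spotting component-level drift are all assumptions of mine, and a keyword match is only a crude stand-in for someone in the room calling the drift out loud.

```python
VAGUE_NAMES = {"the api", "the queue", "the service", "the system"}
COMPONENT_HINTS = ("class", "module", "controller", "repository")


def review(containers, connections):
    """Return a list of findings; an empty list means the three rules hold.

    containers:  dicts with "name", "runtime", and optional "notes"
    connections: dicts with "source", "target", "protocol", "auth"
    """
    findings = []
    for c in containers:
        # Rule 1: a process needs a specific name and a runtime.
        if c["name"].lower() in VAGUE_NAMES or not c.get("runtime"):
            findings.append(f"name the process: {c['name']}")
        # Rule 3: notes that mention internals have dropped below Level 2.
        notes = c.get("notes", "").lower()
        if any(hint in notes for hint in COMPONENT_HINTS):
            findings.append(f"stop before components: {c['name']}")
    for conn in connections:
        arrow = f"{conn['source']} -> {conn['target']}"
        # Rule 2: every arrow needs a protocol and an explicit auth label
        # (an explicit "none" counts as a label; a blank does not).
        if not conn.get("protocol"):
            findings.append(f"name the protocol: {arrow}")
        if not conn.get("auth"):
            findings.append(f"label AuthN/AuthZ: {arrow}")
    return findings


# Hypothetical notes from a session that drifted.
containers = [
    {"name": "notification-dispatcher", "runtime": "worker on ECS"},
    {"name": "the API", "runtime": "", "notes": "UserController class handles login"},
]
connections = [
    {"source": "the API", "target": "notification-dispatcher", "protocol": "", "auth": ""},
]

findings = review(containers, connections)
for finding in findings:
    print(finding)
```

Each finding is a conversation the room still owes itself, not a verdict on the design.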

Fluency at This Altitude Is Table Stakes

Running a good Level 2 session is tactical. Making Level 2 the default in your team is strategic, and it hinges on a claim I’ll defend plainly: fluency with the Level 2 picture of your own system is what competent engineering looks like in the agentic age. If you and your team don’t have it, nothing else you do at the keyboard matters.

The work below Level 2 (the code, the docs, the runbooks, the test scaffolding) is increasingly something you supervise rather than author. That shifts what humans are actually for. You’re there to make the decisions the model can’t be trusted to make alone: which boundaries to draw, which failure modes are acceptable, which trade-offs to live with, when the model’s plausible-looking proposal is actually wrong. You cannot make those calls well without a solid working knowledge of what runs, where, and how the pieces talk. If you and your team can’t describe the system at Level 2 from memory, you’re not supervising the model, you’re rubber-stamping it, and the failure modes of a rubber-stamped system land in production at the same speed the model can generate code.

This is why you must own the architecture diagram rather than delegate it. Not because a capable agent can’t draw one (it can, and often does), but because the act of producing, arguing over, and maintaining the Level 2 picture is how the humans in the loop stay fluent in the thing they’re supervising. Hand the diagram to the model and you don’t save time; you save the effort that was building the competency you need. The diagram is the forcing function, and if you drop it the fluency goes with it.

Acting on this insight is pretty simple: make the diagram the artifact of record.

Non-trivial architecture decisions aren’t decided until there’s a Level 2 diagram attached. No diagram, no decision. This sounds heavy-handed until you realize you’re already demanding it implicitly; you just don’t notice the cost of its absence until something breaks. Making it explicit is cheap as long as the diagram is maintained consistently. Every design conversation run at Level 2 produces an artifact that anchors the next one.

Wrapping Up

  1. The failure mode isn’t disagreement; it’s inconsistent altitude. Engineers zoom in on what they know and zoom out on what they don’t. Create a social contract to ground at the same altitude so everyone in the room is speaking the same language.
  2. Level 2 is the least abstract abstraction. Processes and network calls are real, with the OS and the wire enforcing them. That’s why the altitude aligns humans, and why the same framing works on agents.
  3. Level 2 fluency is table stakes for good decisions. Humans in the loop exist to make the calls the model can’t be trusted to make alone, and those calls require working knowledge of what runs, where, and how the pieces talk. If you and your team don’t have the fluent, working knowledge of your system at this level, you’re trusting it to the agent’s probability distribution.

The next time an architecture whiteboarding session starts to feel vague or go in circles, stop and ask what runs, where, and how the pieces talk. If the room can’t answer, you’ve found the meeting you should actually be having.