The Generative Protocol
Why I don’t propose solutions until all five layers have requirements.
In Make Me Think, I walk through five perception layers and a four-step diagnostic for reading a website the way a physician reads symptoms. That diagnostic (Feel, Unpack, Diagnose, Prescribe) tells you what’s broken and which layer is failing. This is the other half: how to build what’s right, not just diagnose what’s broken.
The diagnostic and the generative protocol are different cognitive operations. Evaluating a design is pattern recognition: you see what’s there, you feel the response, you trace the failure to its source. Generating a design is constraint satisfaction: you gather requirements from each layer and find the solution that satisfies all of them simultaneously. One is forensic. The other is architectural. They need different rules.
The Mistake
Early in my career, I kept making the same error. I’d see a problem and immediately propose a solution. Client’s homepage feels cluttered? “Let’s simplify the navigation and add a stronger hero section.” Checkout conversion is low? “Let’s reduce the form fields and add trust badges.”
Those aren’t bad recommendations. They’re reasonable, experienced, and often correct. They’re also contaminated.
In clinical diagnosis, Pat Croskerry calls this premature closure: accepting the first plausible diagnosis and stopping the search before the actual root cause is confirmed (Croskerry, 2009). The design version is identical. The moment you propose a solution, your brain stops looking for requirements. You’ve collapsed the problem space into a single point. Instead of asking “what does this design need to do?” you’re now asking “does my proposed solution work?” Those are different questions, and the second one is dangerously easier to answer.
Confirmation bias takes over. You start seeing evidence that supports your proposal and filtering out evidence that contradicts it. Requirements that your solution doesn’t address become invisible, not because they don’t exist, but because you’re no longer looking for them. The solution has become the lens, and the lens only shows you what it can focus on.
Tversky and Kahneman (1974) identified the mechanism: the availability heuristic. People overweight problems that are mentally accessible (the pricing page everyone’s been arguing about) and underweight problems that are harder to articulate (the vague feeling that the homepage “doesn’t feel right”). Once you’ve proposed a solution, that solution becomes the most available mental object. Everything else recedes.
I watched this happen to myself dozens of times before I named it. I’d propose a solution in the first meeting, spend the next three weeks refining it, and then discover at launch that the real problem was two layers below where I’d been working. The fix was fine. It just fixed the wrong thing.
Rule Zero
Do not propose any solution until all five layers have been analyzed and requirements accumulated.
That’s it. One rule. Everything else in the generative protocol follows from this.
Solution-first thinking contaminates the requirement space. You stop seeing what the layers demand and start seeing what the solution provides. You pattern-match the solution against your expectations rather than deriving the solution from the constraints. And you miss requirements that no existing proposal addresses, because you never looked for them.
Rule Zero is hard to follow because experience works against you. The more projects you’ve done, the faster your brain pattern-matches to a familiar solution. Gary Klein’s research on recognition-primed decision making (Klein, 1998) showed that experts in high-stakes environments almost never compare options the way decision theory says they should. They recognize the situation as a variant of something they’ve seen before and mentally simulate the first plausible response. That’s powerful for firefighters. It’s dangerous for designers, because the “first plausible response” skips the requirement-gathering that makes the solution correct rather than just familiar.
Senior designers are more susceptible to this than juniors, not less. Their pattern library is deeper, so the match happens faster, and the confidence that comes with experience makes it harder to hold the solution at arm’s length while you finish gathering requirements.
The discipline is not “don’t have ideas.” Ideas will show up. The discipline is “don’t commit to any of them until the layers have spoken.”
The Derivation
In practice, I state the design problem in terms of user experience, not features. “Users can’t find the primary action” is a design problem. “We need a bigger button” is a solution masquerading as a problem. This distinction matters because the solution might not be a bigger button. It might be fewer competing elements (Foundation), a stronger visual hierarchy (Layer 2), or a better trail (Layer 4).
Then I work each layer bottom-up, from the Foundation (L0) through Layer 4. For each layer, I ask three questions. What is the constraint (the biological or psychological hard limit this layer represents)? How does the current design violate it? What requirement does this generate?
The requirement is a “must,” not a “should.” The design MUST do this. “Should” is negotiable. “Must” is not. If working memory holds three to five chunks (Cowan, 2001), then a viewport that demands twelve is not a suggestion to simplify. It is a Foundation violation with a measurable cognitive cost.
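The difference between “should” and “must” is that a must is checkable. A minimal sketch in Python, assuming a hypothetical chunk count supplied by whoever audits the viewport (how chunks get counted is outside this sketch):

```python
# Upper bound of the three-to-five chunk range for working memory
# (Cowan, 2001). The exact cutoff here is an illustrative assumption.
WORKING_MEMORY_LIMIT = 5

def foundation_violation(chunks_demanded: int) -> bool:
    """True when a viewport demands more simultaneous chunks than
    working memory holds. A True here is a Foundation violation
    with a measurable cognitive cost, not a styling suggestion."""
    return chunks_demanded > WORKING_MEMORY_LIMIT
```

A viewport demanding twelve chunks fails outright; one demanding four passes. That is what makes it a must rather than a should.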
The Foundation (L0): Can a first-time user complete the primary task without getting stuck? This is Sweller’s territory: Cognitive Load Theory (1988) distinguishes between intrinsic load (the complexity of the task itself) and extraneous load (the complexity your design adds on top of it). If navigation is ambiguous, if the information architecture reflects the org chart instead of user goals, if the user has to think about how to use the thing before they can think about whether to use the thing, Foundation is failing. Every unit of extraneous load your interface demands is a unit of working memory stolen from the actual task. Requirement: the structural layer must be invisible.
Layer 1 (First Impression): Does the page activate the right emotion in the first 50 milliseconds? The visitor’s System 1 fires a verdict before conscious processing begins. If that verdict is “cheap,” “cluttered,” or “I don’t trust this,” no amount of good copy downstream will reverse it. Requirement: the first impression must match the brand’s intended positioning.
Layer 2 (Processing Fluency): Is every visual and verbal element consistent, hierarchical, and easy to process? Reber and Schwarz (1999) demonstrated that processing ease directly affects judgments of truth: if it’s easy to process, it feels true. Typography, spacing, color, alignment. The things that make a page feel “professional” without anyone being able to name why. Alter and Oppenheimer (2009) showed that even minor fluency manipulations shift judgments of truth, confidence, and trust. The effect compounds: twenty slightly wrong details create a cumulative feeling of “something’s off” that no single element accounts for. Requirement: processing effort must be minimized so cognitive bandwidth is spent on the message, not the medium.
Layer 3 (Perception Bias): Does the design speak to what users respond to, not what they say they want? Nisbett and Wilson (1977) demonstrated that people regularly misidentify the causes of their own behavior. The gap between stated preference and revealed preference is where perception bias lives. Users say they want more options. Their behavior shows them converting on fewer. Requirement: design for observed behavior, not reported preference.
Layer 4 (Decision Architecture): Does the trail lead to conversion without manipulation? The path from “I see what this is” to “I want this” must feel like walking, not searching. Requirement: the decision trail must be clear, natural, and honest.
When all five layers have been analyzed, I have five requirements. All non-negotiable. The solution has to satisfy all of them simultaneously. When requirements conflict (and they sometimes do), lower-layer requirements win. The dependency stack is the tiebreaker.
Only then do I derive the solution. What satisfies requirement 1 AND requirement 2 AND requirement 3 AND requirement 4 AND requirement 5? If no single intervention covers all five, what minimal set does? The solution emerges from the constraint intersection, not from brainstorming. Not from competitive analysis. Not from “what would Apple do?”
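The derivation above can be sketched as constraint satisfaction. Everything in this sketch is hypothetical scaffolding (the `Requirement` shape, the candidate interventions, the greedy cover, which approximates rather than guarantees a minimal set): it illustrates the logic of “smallest set that satisfies all five,” not a real tool.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Requirement:
    layer: int   # 0 = Foundation .. 4 = Decision Architecture
    must: str    # a non-negotiable "must", never a "should"

def derive(requirements, candidates):
    """Greedily pick a small set of interventions covering every
    requirement. Raises if some requirement is satisfied by nothing,
    which is exactly the case brainstorming tends to hide."""
    uncovered = set(requirements)
    chosen = []
    while uncovered:
        # The intervention covering the most unmet requirements wins.
        best = max(candidates, key=lambda c: len(candidates[c] & uncovered))
        covered = candidates[best] & uncovered
        if not covered:
            unmet = ", ".join(r.must for r in uncovered)
            raise ValueError(f"No intervention satisfies: {unmet}")
        chosen.append(best)
        uncovered -= covered
    return chosen

# One requirement per layer, as derived in the walkthrough above.
reqs = [Requirement(i, must) for i, must in enumerate([
    "structural layer is invisible",          # L0 Foundation
    "first impression matches positioning",   # L1
    "processing effort is minimized",         # L2
    "designed for observed behavior",         # L3
    "decision trail is clear and honest",     # L4
])]

# Hypothetical interventions and the requirements each satisfies.
candidates = {
    "simplify information architecture": {reqs[0], reqs[2]},
    "rebuild the visual system":         {reqs[1], reqs[2]},
    "rewrite the decision trail":        {reqs[3], reqs[4]},
}
```

Here `derive(reqs, candidates)` returns all three interventions, because no single one covers the stack. The solution set emerges from the intersection of constraints, not from preference.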
This is slower than proposing a solution and iterating. Significantly slower. But it surfaces things that solution-first thinking systematically misses. The requirement that nobody had on their list because they were already thinking in terms of sidebars and carousels and hero images. The constraint that only becomes visible when you ask the layers in order.
The Gate
The Gate is the last piece. It runs before shipping. Each layer is pass/fail.
If any layer below Layer 4 fails, do not ship. Downstream layers cannot compensate for upstream failures. This is the dependency stack applied as a quality gate, and it’s the hardest one to enforce because of schedule pressure, because of stakeholder expectations, because shipping something imperfect feels better than shipping nothing.
Each layer has a specific gate question. Can a first-time user complete the primary task without getting stuck? That’s the Foundation (L0). Does the first impression activate the right emotion? That’s Layer 1. Is every visual and verbal element consistent? Layer 2. Does the design speak to what users respond to, not what they say they want? Layer 3. Does the trail lead to conversion without manipulation? Layer 4.
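The Gate reduces to a few lines. A sketch, assuming a hypothetical results dict filled in by whoever answers the five questions; per the rule above, failures below Layer 4 block the ship, while a Layer 4 failure is surfaced but treated as its own call:

```python
# The five gate questions, one per layer.
GATE_QUESTIONS = {
    0: "Can a first-time user complete the primary task without getting stuck?",
    1: "Does the first impression activate the right emotion?",
    2: "Is every visual and verbal element consistent?",
    3: "Does the design speak to what users respond to, not what they say they want?",
    4: "Does the trail lead to conversion without manipulation?",
}

def run_gate(results: dict[int, bool]) -> tuple[bool, list[int]]:
    """results maps layer (0-4) to pass/fail. Returns (ship, failing).
    Any failure below Layer 4 blocks shipping: downstream layers
    cannot compensate for upstream failures."""
    failing = sorted(layer for layer, passed in results.items() if not passed)
    blocking = [layer for layer in failing if layer < 4]
    return (not blocking, failing)
```

A Layer 2 failure blocks: `run_gate({0: True, 1: True, 2: False, 3: True, 4: True})` returns `(False, [2])`. Either way, the failing layers come back named, which is what makes the failure visible and intentional rather than accidental.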
I’ve shipped work that failed the gate. I’m not going to pretend otherwise. Sometimes the client’s timeline wins. Sometimes the budget doesn’t allow for the iteration needed. But I name the failure. I tell the client which layer isn’t passing and what the likely cost will be.
That honesty is part of the practice. Berdichevsky and Neuenschwander (1999) argued that any technology designed to change attitudes or behaviors has an obligation to change them in the direction of truth. Not the designer’s truth. Not the client’s truth. Actual reality. The gate doesn’t exist to make shipping impossible. It exists to make failure visible and intentional rather than accidental and invisible. If I’m inflating perception above reality, I should know it and say it. That’s what The Oath demands.
Anti-Patterns
There are patterns of failure I see repeated across projects, across industries, across experience levels. Each one sounds reasonable until you see the downstream damage.
Solution-first. “We need a sidebar.” “We need a chatbot.” “We need to redesign the homepage.” These are all solutions. None of them are problems. When the team starts with a solution, every subsequent conversation evaluates that solution rather than exploring the problem space. Requirements get shaped to justify the solution instead of the solution being derived from requirements. This is premature closure (Croskerry, 2009) applied to design: locking onto the first plausible intervention and never looking past it.
Competitive copying. “Miro does X, so we should too.” “Our competitor has a three-column pricing page.” Competitors are solving their own problems in their own contexts with their own constraints. Copying their solutions imports their assumptions, and their assumptions might not match your users, your brand, or your layer stack. This is competitor-driven design, not perception-driven design.
Velocity-first. “We need results this quarter.” Brand systems take months. Campaigns take weeks. When the pressure hits, the campaign always wins, and the product work gets pushed to next quarter. Then the quarter after that. Then never. The campaign won the quarter. The product would have won the year.
Checklist mode. “Does it have a hero section? Check. Does it have social proof? Check. Does it have a CTA above the fold? Check.” Evaluation by checklist catches obvious omissions but generates no insight. You can pass every item on a checklist and still have a page that feels wrong, because checklists evaluate components and perception operates on the whole. The Feel step exists for exactly this reason: the compressed System 1 signal catches what no checklist can.
Skipping layers. Jumping from “the bounce rate is high” to “let’s add more CTAs” without diagnosing which layer is actually failing. The bounce rate might be a Layer 1 problem (the page doesn’t look trustworthy), but the intervention is Layer 4 (decision architecture). The fix doesn’t address the failure. The layer stack is a dependency chain. You can’t skip links.
The generative protocol and the diagnostic are two halves of the same methodology. The diagnostic tells you what’s broken. The generative protocol tells you what to build. Both work the layers in order. Both refuse to skip ahead. Both treat the dependency stack as non-negotiable.
The full diagnostic is in Chapter 9 of the Make Me Think series. The foundation layer that everything else depends on is in Chapter 4. The ethical framework that governs both halves is in Chapter 11. If you want to run a quick version right now, the 5-Minute Perception Audit gives you one test per layer. And if you want to see the protocol applied to a live site, Forge runs it automatically.
Key Terms
| Term | Definition |
| --- | --- |
| Generative Protocol | The rule that no solution should be proposed until all five layers have been analyzed and requirements accumulated. The generative counterpart to the Feel, Unpack, Diagnose, Prescribe diagnostic. |
| Rule Zero | Do not propose any solution until all five layers have been analyzed and requirements accumulated. Solution-first thinking contaminates the requirement space. |
| Premature closure | Croskerry (2009). Accepting the first plausible diagnosis and stopping the search before the actual root cause is confirmed. The diagnostic error that occurs when you skip from symptom to treatment without the intermediate steps. |
| Intrinsic vs. extraneous load | Sweller (1988). Intrinsic load is the complexity of the task itself. Extraneous load is the complexity your design adds on top. Your job is to eliminate every unit of extraneous load so the visitor’s working memory is spent on the task, not the interface. |
| The Gate | Pre-ship quality check. Each layer is pass/fail. If any layer below Layer 4 fails, do not ship. Downstream layers cannot compensate for upstream failures. |
| Availability heuristic | Tversky & Kahneman (1974). People overweight problems that are mentally accessible and underweight problems that are harder to articulate. Once you propose a solution, it becomes the most available mental object. |
| Solution-first | Anti-pattern. Starting with a solution instead of a problem. Every subsequent conversation evaluates the solution rather than exploring the problem space. The design version of premature closure. |
| Competitive copying | Anti-pattern. Importing another company’s design decisions without importing their context, constraints, or users. |
| Velocity-first | Anti-pattern. Choosing the campaign that wins this quarter over the product work that would win the year. |
| Checklist mode | Anti-pattern. Evaluating components instead of perception. You can pass every item and still have a page that feels wrong. |
| Skipping layers | Anti-pattern. Applying a Layer 4 fix to a Layer 1 problem. The dependency stack is a chain. You can’t skip links. |
References
| Source | Details |
| --- | --- |
| Tversky & Kahneman (1974) | Judgment under uncertainty: heuristics and biases. Science, 185(4157), 1124–1131. |
| Nisbett & Wilson (1977) | Telling more than we can know: verbal reports on mental processes. Psychological Review, 84(3), 231–259. |
| Sweller (1988) | Cognitive load during problem solving: effects on learning. Cognitive Science, 12(2), 257–285. |
| Klein (1998) | Sources of Power: How People Make Decisions. MIT Press. |
| Berdichevsky & Neuenschwander (1999) | Toward an ethics of persuasive technology. Communications of the ACM, 42(5), 51–58. |
| Reber & Schwarz (1999) | Effects of perceptual fluency on judgments of truth. Consciousness and Cognition, 8(3), 338–342. |
| Cowan (2001) | The magical number 4 in short-term memory: a reconsideration of mental storage capacity. Behavioral and Brain Sciences, 24(1), 87–114. |
| Croskerry (2009) | A universal model of diagnostic reasoning. Academic Medicine, 84(8), 1022–1028. |
| Alter & Oppenheimer (2009) | Uniting the tribes of fluency to form a metacognitive nation. Personality and Social Psychology Review, 13(3), 219–235. |
| Cowan (2010) | The magical mystery four: how is working memory capacity limited, and why? Current Directions in Psychological Science, 19(1), 51–57. |