Notes · Essay

You Wouldn’t Vibe Code a Car

The internet found a format that respected how brains work, used it for five years, then abandoned it for one that doesn’t.

A friend sent me a link last week. "Just watch this." Sixty seconds of someone talking into a camera about a topic I could have read about in eight.

I didn’t watch it. I almost never do. Reels, TikToks, YouTube Shorts, Instagram stories. I avoid them the way other people avoid spam calls. They frustrate me in a way I couldn’t articulate for a long time, a low-grade anger that shows up the moment someone sends a video and expects me to stop what I’m doing and hand over my full attention to find out if the payoff is worth it.

I finally figured out what the anger is. It’s my brain doing cost-benefit analysis in real time and returning a negative number before I even hit play.


The Meme

In the early 2000s, the Motion Picture Association of America ran an anti-piracy ad. You’ve seen it. Dark screen, dramatic narration: "You wouldn’t steal a car. You wouldn’t steal a handbag." The message was dead serious. The internet turned it into a punchline immediately.

"You wouldn’t download a car."

The remix was better than the original because it exposed the absurdity of the comparison. Digital copying isn’t theft in the way stealing a physical object is. Everyone knew this. The meme made it sayable in five words.

Then the world shifted. By the mid-2010s, 3D printing had gotten serious enough that "you wouldn’t 3D print a car" stopped being entirely hypothetical. Local Motors printed one. The joke got uncomfortable because the gap between absurd and plausible had narrowed.

Now it’s 2026 and people are building functional prototypes by describing what they want to an AI. "I vibe coded a car" isn’t a meme yet. But I’m sitting here thinking about it, and the thing that makes it work is the same thing that made "download a car" work 20 years ago: the template is pre-loaded. Your brain fills in the pattern before you finish reading. You get it instantly. That’s processing fluency, and it’s the reason I spent 15 years building a design methodology around it.

It’s also the reason memes used to spread the way they did. And the reason they don’t anymore.


When Memes Were Free

Between roughly 2008 and 2014, the internet ran on image macros. Impact font over a photograph. Rage comics drawn in MS Paint. Demotivational posters. Advice animals. The production cost was zero. The comprehension cost was zero. Anyone could make one. Anyone could get one.

Richard Dawkins coined "meme" in 1976 to describe a unit of cultural transmission that replicates like a gene. He identified three properties that determine survival: fidelity (does the core idea stay recognizable through variations?), fecundity (how fast and how widely does it replicate?), and longevity (how long does it survive?). Image macros scored high on all three. The template preserved fidelity. Low production cost enabled fecundity. Simplicity ensured longevity.

Limor Shifman, who wrote the definitive academic treatment of internet memes in 2014, called them "(post)modern folklore." She was right. They were participatory culture at its purest. The template structure, fixed format plus variable content, meant anyone could contribute without needing production skills, distribution channels, or permission. You needed a browser and an idea.

What nobody named at the time, and what I want to name now, is the cognitive architecture that made this work. A static image meme consumes one cognitive channel (visual) for about one to two seconds. Pattern recognition handles most of the processing. The scaffold is pre-loaded from cultural context. Your brain fills one slot, gets the joke, moves on. Total cognitive investment: near zero. Total information transfer: high.

That ratio, maximum information per unit of cognitive effort, is what Perception-First Design calls processing fluency. Reber and Schwarz demonstrated in 1999 that the subjective ease of processing directly influences truth judgments: statements that are easier to read feel more true. Song and Schwarz showed in 2008 that hard-to-read instructions make the described task seem harder. The feeling of ease isn’t just aesthetic. It shapes what people believe, what they share, and what they do next.

A 2021 UCLA study by Wong and Holyoak tested what actually drives meme sharing. They found a specific chain: relatability leads to aptness, aptness leads to comprehensibility, comprehensibility leads to humor, humor leads to sharing. Comprehensibility, a processing fluency proxy, was the critical mediator. If you don’t get it fast, you don’t share it. Period.

Memes were accidentally Perception-First Design. Nobody called it that. Nobody had to. The format was L2-optimal by default: tuned, without anyone trying, to PFD’s Layer 2, the processing-fluency layer.


The Cost of Video

I am autistic. I process information differently from most people, and I’ve spent most of my adult life figuring out exactly how. One of the things I know about my own cognition is that video demands everything from me. Visual channel. Auditory channel. Temporal tracking. Attention maintenance. All of it, simultaneously, for a duration I can’t control.

A meme costs me one to two seconds and one cognitive channel. A 60-second video costs me 60 seconds and every channel I have. The information payload is frequently identical. The cost is more than an order of magnitude higher.
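
If you want the arithmetic made explicit, here is a toy sketch. The channel counts and the "channel-second" unit are my own illustrative assumptions, not measurements from any study:

```python
# Back-of-the-envelope cost comparison. "Channel-seconds" is a toy unit:
# (cognitive channels engaged) x (seconds of forced attention).
# All numbers are illustrative assumptions, not measurements.

MEME = {"duration_s": 2, "channels": 1}    # visual only
VIDEO = {"duration_s": 60, "channels": 4}  # visual, auditory, temporal tracking, attention maintenance

def cost(media: dict) -> int:
    """Cognitive cost in channel-seconds."""
    return media["duration_s"] * media["channels"]

ratio = cost(VIDEO) / cost(MEME)
print(f"meme:  {cost(MEME)} channel-seconds")   # 2
print(f"video: {cost(VIDEO)} channel-seconds")  # 240
print(f"video costs {ratio:.0f}x more")         # 120x, for a frequently identical payload
```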

This isn’t just my experience. John Sweller’s cognitive load theory, introduced in 1988 and refined over the following decades, distinguishes between intrinsic load (the complexity of the content itself), extraneous load (the complexity imposed by how the content is presented), and germane load (the effort spent building understanding). Poorly designed media increases extraneous load, the load that has nothing to do with the actual content, and suppresses learning. Sweller, Ayres, and Kalyuga’s 2011 consolidation of the theory identified the transient information effect: information that disappears (audio, video) imposes higher cognitive load than information that persists (text, static images), because you have to hold the vanishing information in working memory while simultaneously processing whatever comes next.

Annie Lang demonstrated in 2000 that video automatically commandeers cognitive resources through orienting responses to structural features: cuts, zooms, motion, sound changes. Your brain allocates attention to these features whether or not they carry meaningful content. You don’t choose to pay attention to a jump cut. The medium takes your attention without asking.

A Stanford review by Tversky, Morrison, and Betrancourt in 2002 examined dozens of studies comparing static graphics to animations. Their conclusion: animations rarely outperform static representations. When they appeared to, it was because they conveyed more information, not because the format itself was better. When information content was controlled, static either matched or beat animation. The perception that video is a superior format for conveying ideas is, itself, a perception bias.

And then there’s the CHI paper. Chiossi and colleagues published a study in 2023 showing that TikTok use significantly degraded prospective memory, the ability to remember to do things you planned to do. Short-form video combined with rapid context-switching overloaded working memory. It wasn’t a survey. It was a controlled experiment. N=60, between-subjects design, published at the top human-computer interaction venue in the field.

TikTok literally makes you forget what you were about to do.

I feel this. Not as a data point, but as lived experience. The anger I feel when someone sends a video link is my cognitive system signaling that the expected return on investment is negative. I know, before I press play, that the ratio of effort to information is going to be bad. My brain has learned this through years of evidence.


"Just Watch This"

The phrase itself is the problem. Three words that dismiss the cognitive cost of what they’re asking.

When someone sends you a video and says "just watch this," they’ve already watched it. The processing cost is behind them. They know what’s in it, they know the payoff, they’ve already decided it was worth their time. To them, sharing it feels low-cost, like handing you a gift.

To you, it’s an unaudited demand on your cognitive budget. You don’t know the payoff. You don’t know the length. You can’t skim it, can’t scan it for the relevant parts, can’t compare it to something else side by side. You have to surrender your attention linearly, from beginning to end, and hope the source is trustworthy. "Just" is doing a lot of work in that sentence, and all of it is wrong.

Pirolli and Card published their information foraging theory in 1999, adapting optimal foraging from behavioral ecology to information behavior. The core idea: people navigate information environments to maximize the rate of valuable information gained per unit of time. They introduced the concept of information scent, the perceived value of a source based on proximal cues.

A meme has high information scent. You can assess its value in under a second from the thumbnail. A video has low information scent. You literally cannot assess its value without investing time watching it.
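
Pirolli and Card formalized the forager’s problem as maximizing the rate of gain: expected information divided by time invested. Here is a minimal sketch of that ratio with invented numbers; none of the values come from the 1999 paper, and the point is the shape of the comparison, not the figures:

```python
# Toy version of Pirolli & Card's rate-maximizing forager: rate of gain
# is expected information divided by time invested. All numbers are
# invented for illustration.

def expected_rate(assess_s: float, consume_s: float,
                  p_worth_it: float, payload: float = 1.0) -> float:
    """Expected payloads of useful information per second spent."""
    return (p_worth_it * payload) / (assess_s + consume_s)

# A meme: high scent. One second of assessment filters out the duds,
# so most of what you actually consume pays off.
meme = expected_rate(assess_s=1, consume_s=1, p_worth_it=0.9)

# A video link: low scent. There is no cheap assessment step; the only
# way to learn whether it was worth it is to watch the whole thing.
video = expected_rate(assess_s=0, consume_s=60, p_worth_it=0.3)

print(f"meme:  {meme:.3f} payloads/second")   # 0.450
print(f"video: {video:.3f} payloads/second")  # 0.005
```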

What "just watch this" really means is: I’m spending your cognitive resources without your permission, and I’m framing it as a favor.

This is a design problem. The design principle hiding inside my frustration is the same one that drives every good landing page: frontload the value proposition. Tell me what I’m going to get before I invest. A meme does this. A headline does this. A well-designed hero section does this. Video, by default, does not.


The Fraud

If memes were working so well, why did they fade? The standard narrative is that platforms evolved, audiences matured, attention spans shortened. None of that is what happened.

What happened was that Facebook lied about video.

Between 2015 and 2018, Facebook inflated its video viewing metrics by 150 to 900 percent. Not a rounding error. Not a methodology dispute. The original reports estimated 60-80% inflation; the amended class-action complaint revised the figure upward to 150-900%. Facebook settled for $40 million.
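
The widely reported mechanism was a denominator error: average watch time was computed by dividing total viewing time by only the views lasting three seconds or more, so the brief scroll-past views vanished from the math. A toy example with invented numbers shows how fast that inflates:

```python
# Hypothetical per-view watch times, in seconds. The values are made up,
# but the skew is the realistic part: most scrolled-past views are brief.
views = [1, 1, 2, 2, 2, 30, 45, 60]

honest = sum(views) / len(views)             # every view counts
long_views = [v for v in views if v >= 3]
reported = sum(views) / len(long_views)      # sub-3s views vanish from the denominator

print(f"honest average:   {honest:.1f}s")         # 17.9s
print(f"reported average: {reported:.1f}s")       # 47.7s
print(f"inflation: {reported / honest - 1:.0%}")  # 167%
```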

The inflated numbers told media companies that video was the future. Audiences were supposedly watching more, engaging longer, converting better. The data said so. The data was fabricated.

The result was the "pivot to video." CollegeHumor gutted its editorial staff. Funny or Die laid off most of its writers. Mic, a news outlet with tens of millions of monthly readers, fired its entire editorial team and pivoted to video production. Within two years, Mic sold for a reported $5 million after being valued at $100 million.

These companies had profitable text and image operations. They destroyed them to chase metrics that were fake.

The memes didn’t fail. They were murdered. The format that respected how brains work was abandoned because a platform fabricated evidence that a more expensive, more demanding, more attention-extracting format was performing better. The pivot to video wasn’t a response to audience behavior. It was a response to fraud.

Cory Doctorow named the mechanism in 2023: enshittification. Platforms follow a three-stage decay cycle. First, be good to users to attract them. Then, abuse users to benefit business customers (advertisers). Finally, abuse business customers to extract maximum value for shareholders.

Video serves stage two. It increases dwell time and ad inventory. It captures more attention per session. It is better for the platform’s business model in every way that is worse for the user’s cognitive experience.

In 2025, Ardoline and Lenzo published the first peer-reviewed paper connecting enshittification to cognitive harm. They introduced the concept of cognitive deskilling: when platforms degrade, users who have offloaded cognitive tasks to those platforms lose the capacity to perform those tasks independently. The shift from text and image to algorithmic video feeds doesn’t just change what people consume. It degrades their ability to seek, evaluate, and retain information on their own. Doctorow endorsed the paper. The American Dialect Society had already named "enshittification" its 2023 Word of the Year.


The Disinformation Gap

Around 2015, just as the platform pivot to video was gathering speed, something else was accelerating: disinformation. I don’t think the timing is coincidental.

Pennycook and Rand published a study in 2019 that I think about constantly. They tested whether susceptibility to fake news was driven by partisan bias or by lazy thinking. The answer was lazy thinking. People who scored higher on analytical reasoning tests were better at distinguishing fake from real news, regardless of political alignment. The problem wasn’t ideology. The problem was insufficient cognitive engagement.

Video, by its nature, is a passive consumption format. You sit. You receive. The medium’s temporal structure demands that you process information at the source’s pace, not your own. You cannot pause to evaluate a claim without losing the thread. You cannot scan ahead to assess whether the argument holds up. You cannot place two videos side by side and compare their claims the way you can with two articles.

Sundar’s MAIN model, published in 2008, identified the realism heuristic: the more realistic a medium feels, the more credible people perceive its content to be, regardless of actual accuracy. Video triggers this heuristic more strongly than text or static images. "Seeing is believing" is not a proverb. It is a measurable cognitive bias, and video exploits it by default.

Vaccari and Chadwick tested this in 2020 with deepfake political videos. Their finding: deepfakes didn’t primarily mislead viewers directly. What they did was create uncertainty that reduced trust in all news on social media. Chesney and Citron named this the "liar’s dividend" in 2019: once people know video can be faked, even real video becomes suspect, and anyone caught on camera can claim the footage is synthetic. The realism heuristic cuts both ways.

A static meme, by contrast, is inherently low-trust. Nobody mistakes an image macro for journalism. The format signals "this is an interpretation, a joke, a take." It invites evaluation rather than demanding belief.

And because it processes in under two seconds, the analytical brain isn’t bypassed. You get the claim, you evaluate the claim, you share or don’t. The entire cycle happens fast enough that System 2, Kahneman’s slow, deliberative reasoning system, doesn’t have time to disengage.

Video is long enough for System 2 to go to sleep. And when System 2 sleeps, you believe what you’re shown.


Full Disclosure

I should tell you something about this essay. I vibe coded it.

I also vibe coded the website you’re reading it on. My design methodology, extracted into an API and shipped as a SaaS. Cognograph, from a thought into a full web app with four provisional patents, a design system, and its own site. A 12-part book series that picks up where one of my favorites left off. Client dashboards. Client reporting. Client websites for multiple businesses. The list keeps growing, two months after I figured out the scaffolding Claude Code needed so I could focus only on the input and the output.

The methodology itself I trademarked. I synthesized it from notebooks I’d been filling for over a decade, OCR’d and fed to Claude as source material. The AI didn’t invent PFD. It helped me extract and formalize what was already there.

I didn’t do this by accident, and I didn’t do it by being a better coder than the machines. I have 15 years of full-stack design and development experience. What that taught me wasn’t how to write better code. It taught me what to ask for and how to know when the output is right.

The skill didn’t stay in execution. It moved to quality control on both sides of it. Specification quality: knowing exactly what good looks like before anything gets built. Evaluation quality: recognizing whether the output meets that standard after it’s rendered.

If you can hold both ends, the middle can be produced by anything. If you can’t, it doesn’t matter what produces it. The output will be garbage and you won’t know why.

This is exactly what Perception-First Design does for the design process itself. PFD is the specification layer: perception science says this configuration should produce this cognitive response. PFD is also the evaluation layer: measure whether it did. The design in the middle, the pixels, the layout, the copy, can be produced by a human, an AI, a team of twelve, or one person with a terminal window. The quality lives in the specification and the vetting, not in who or what renders it.
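
To make the two layers concrete, here is a minimal sketch. The fields and criteria are hypothetical stand-ins, not PFD’s actual checks; the thing to notice is that nothing about the middle changes whether a person or a model implements render():

```python
# A minimal sketch of the two layers. The Spec fields and criteria are
# hypothetical stand-ins, not PFD's actual checks.

from dataclasses import dataclass
from typing import Callable

@dataclass
class Spec:
    """Specification layer: what 'good' looks like, written down first."""
    name: str
    criteria: dict[str, Callable[[str], bool]]

    def evaluate(self, output: str) -> dict[str, bool]:
        """Evaluation layer: vet the rendered output against every criterion."""
        return {label: check(output) for label, check in self.criteria.items()}

hero_spec = Spec(
    name="hero section copy",
    criteria={
        "nonempty": lambda out: bool(out.strip()),
        "fits one glance": lambda out: len(out) <= 90,  # hypothetical threshold
        "leads with value": lambda out: not out.lower().startswith(("welcome", "we are")),
    },
)

def render(prompt: str) -> str:
    """The middle. Could be a person, a team of twelve, or a model."""
    return "Ship pages people understand in two seconds."

output = render("hero copy for a design-methodology SaaS")
results = hero_spec.evaluate(output)
assert all(results.values()), "reject and regenerate: the spec is the standard"
```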

Vibe coding with bad specs and no vetting produces garbage. Vibe coding with enterprise-grade specification and enterprise-grade evaluation produces enterprise-grade output. The people panicking about AI replacing designers are worried about the wrong layer. The rendering was never the hard part. Knowing what to render and knowing whether you got it right: that’s the whole game.

I am, as it turns out, the meme. "You wouldn’t vibe code a car." I would. I did. Several of them.


The Design Principle

I started this essay angry about a video link. I’m ending it with a claim I can defend: the internet found a communication format that was accidentally optimized for how human cognition works, used it for roughly five years, then abandoned it in favor of a format that is optimized for attention extraction at the expense of comprehension, critical thinking, and cognitive autonomy.

The image macro was L2-native. It operated at the layer of processing fluency where ideas transmit with minimal friction. It respected the brain’s limited capacity by demanding almost nothing from it. It was self-paced, instantly evaluable, and structurally honest about what it was.

Video-first content is L2-hostile. It demands sustained multi-channel attention for uncertain payoff. It triggers automatic resource allocation through structural features that carry no information. It creates the feeling of engagement without the substance of comprehension. And it powers a platform economy that benefits from your attention regardless of whether you benefit from spending it.

Perception-First Design is built on a specific commitment: design for how people actually perceive and process the world, not for how you wish they did and not for how your business model needs them to. The meme got this right by accident. The platforms got it wrong on purpose.

So: you wouldn’t vibe code a car. But you might build one if the instructions cost you two seconds and one cognitive channel instead of sixty seconds and all of them.

The format that respects the brain wins. It always has. The question is whether the platforms will let it.

Key Terms

Processing fluency: The subjective ease of cognitive processing. Information that feels easy to process is perceived as more true, more beautiful, and more trustworthy. The foundational mechanism of PFD Layer 2.
Information scent: The perceived value of an information source based on proximal cues, before full engagement. Memes have high information scent (assessable in under one second). Video has low information scent (requires time investment to evaluate). From Pirolli & Card’s information foraging theory.
Transient information effect: The cognitive load penalty imposed by information that disappears (audio, video) versus information that persists (text, images). Vanishing information must be held in working memory while new information arrives, increasing extraneous load. From Sweller’s cognitive load theory.
Cognitive deskilling: The atrophy of cognitive capacities when platforms degrade. Users who offload seeking, evaluating, and retaining information to platforms lose the ability to perform those tasks independently when the platform changes. From Ardoline & Lenzo (2025).
Realism heuristic: The tendency to perceive content in more realistic media formats as more credible, regardless of actual accuracy. Video triggers this heuristic more strongly than text or static images. From Sundar’s MAIN model.

References

Ardoline & Lenzo (2025). The Cognitive and Moral Harms of Platform Decay. Ethics and Information Technology, 27, 37.
Chesney & Citron (2019). Deep Fakes: A Looming Challenge for Privacy, Democracy, and National Security. California Law Review, 107, 1753-1820.
Chiossi et al. (2023). Short-Form Videos Degrade Our Capacity to Retain Intentions. Proceedings of CHI ’23, ACM.
Dawkins (1976). The Selfish Gene. Oxford University Press.
Doctorow (2023). TikTok’s Enshittification. Wired / Pluralistic.
Kahneman (2011). Thinking, Fast and Slow. Farrar, Straus and Giroux.
Lang (2000). The Limited Capacity Model of Mediated Message Processing. Journal of Communication, 50(1), 46-70.
Pennycook & Rand (2019). Lazy, Not Biased: Susceptibility to Partisan Fake News Is Better Explained by Lack of Reasoning. Cognition, 188, 39-50.
Pirolli & Card (1999). Information Foraging. Psychological Review, 106(4), 643-675.
Reber & Schwarz (1999). Effects of Perceptual Fluency on Judgments of Truth. Consciousness and Cognition, 8, 338-342.
Shifman (2014). Memes in Digital Culture. MIT Press.
Song & Schwarz (2008). If It’s Hard to Read, It’s Hard to Do. Psychological Science, 19(10), 986-988.
Sundar (2008). The MAIN Model. In Metzger & Flanagin (Eds.), Digital Media, Youth, and Credibility. MIT Press.
Sweller (1988). Cognitive Load During Problem Solving. Cognitive Science, 12(2), 257-285.
Tversky et al. (2002). Animation: Can It Facilitate? International Journal of Human-Computer Studies, 57(4), 247-262.
Vaccari & Chadwick (2020). Deepfakes and Disinformation. Social Media + Society, 6(1).
Wong & Holyoak (2021). Cognitive and Motivational Factors Driving Sharing of Internet Memes. Memory & Cognition, 49, 863-872.

A note on how this was written: This essay is AI-assisted. I provide the thesis, the personal experience, the editorial direction, and the framework connections. AI helps me structure, draft, and locate academic citations. This is consistent with Perception-First Design’s own transparency principle: if I’m writing about perception, I should be honest about how the writing itself is produced.
