108 days · Jan 1 – Apr 18 2026

What I actually shipped.

I've been saying Claude Code 20–50×'ed my output. You asked how I'd prove it. So I had Claude run the math on itself. Every number comes from git log, filesystem, and the JSONL session transcripts. No vibes. Raw CSVs, the Python audit script, and the three companion sub-reports are all in the linked folder. Swap any weight, re-run.

Result: against a realistic 2026 peer baseline (a top-decile solo developer using Cursor + Copilot + Sonnet), the honest center is ~12–20×. Against a "no-AI" ghost baseline the number is higher, but nobody works that way anymore, so that comparison is rhetorical. Only the low end of my original 20–50× claim overlaps the top of the realistic band.

The three measures

How much is 20–50×? Here's the spread.

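Three different ways of asking the question. Against the realistic baseline, their centers land at 12–20×; only the upper ranges of the second and third reach into the 20–50× band.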

Artifact-weighted: 12× (range: 8× – 18×)
What I shipped vs. a top-decile solo peer using non-Claude AI (Cursor, Copilot, Sonnet) in the same 108 days.

Activity-volume: 20× (range: 10× – 30×)
All the work (my labor + parallel agents + tool-call savings, dedup'd against double-count) crammed into 108 days.

Per keyboard-hour: 19× (range: 8× – 35×)
Human-work-equivalent hours produced per hour I was at the keyboard, with stricter per-tool-call savings (1 min avg, not 5).

Plain version: every hour I was at the keyboard, about 19 human-work-equivalent hours came out. Over 108 days, that's roughly 2,156 workdays of throughput against my 728–1,092 keyboard-hours (91 active days × 8–12 hrs each). The 20–50× rhetoric holds only against a "no-AI 2026" baseline, which doesn't describe anyone actually working. Against real peers with AI, ~12–20× is the honest answer.

Method

The three formulas.

1. Artifact-weighted Equivalent Human-Days

Each ship category gets a weight: how many workdays a solo principal designer-dev without AI would need for one of them. I calibrated against my own pre-2025 pace. Simply Smart Home: $1.5M to $5.5M. Tantum: 3 startups a year. iO Theater ticket sales: 50% to 75%.

// Artifact EHD
Stefan_EHD = Σᵢ (count_i × weight_i)
Baseline_EHD = same formula on solo-designer baseline output
Multiplier_artifact = Stefan_EHD / Baseline_EHD
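The formula is a weighted sum, so a minimal Python sketch is short. The counts below are a subset of the shipped-artifacts table in this report, and the 250-workday Cognograph cap comes from the disclosures. Every other weight and the whole baseline row are illustrative placeholders, not the values used in the audit script, and the function names are hypothetical.

```python
# Sketch of the artifact-weighted EHD formula. Not the audit script itself.

def ehd(counts: dict[str, int], weights: dict[str, float]) -> float:
    """Equivalent human-days: sum of count_i * weight_i over all categories."""
    return sum(n * weights[category] for category, n in counts.items())


def artifact_multiplier(mine: dict[str, int],
                        baseline: dict[str, int],
                        weights: dict[str, float]) -> float:
    """Ratio of my EHD to the baseline's EHD under the same weights."""
    return ehd(mine, weights) / ehd(baseline, weights)


# Counts taken from the shipped-artifacts table (subset only).
stefan_counts = {"enterprise_site": 7, "portfolio_site": 2, "cognograph_mvp": 1}

# ASSUMPTION: placeholder workday weights, except the 250-workday Cognograph cap.
weights = {"enterprise_site": 20.0, "portfolio_site": 5.0, "cognograph_mvp": 250.0}

# ASSUMPTION: hypothetical output of a baseline peer over the same window.
baseline_counts = {"enterprise_site": 1, "portfolio_site": 2, "cognograph_mvp": 0}

multiplier = artifact_multiplier(stefan_counts, baseline_counts, weights)
print(f"artifact multiplier under placeholder weights: {multiplier:.1f}x")
```

Because the counts are fixed and only the weights are a judgment call, changing one entry in `weights` and re-running is the whole sensitivity test.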

2. Activity-volume (compressed parallel work)

Counts what a single human physically cannot do serially. Parallel subagent dispatches, tool-call time savings, my own labor. All summed into one equivalent-human-workdays number, divided by 108 calendar days.

// Activity EHD
total_work_hrs = (subagents × hrs_parallel) + (tool_calls × min_saved/60) + Stefan_labor_hrs
Multiplier_activity = (total_work_hrs / 8) / 108
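A rough sketch of this formula in Python, under stated assumptions: the subagent term is taken as the "~1,800 workdays" of parallel execution quoted in disclosure 8, my own labor is set to the midpoint of the 728–1,092 keyboard-hour range, and the dedup correction is read as dropping half of the tool calls. Those readings are assumptions, not confirmed audit-script settings, and `activity_multiplier` is a hypothetical name. Under them, the sketch lands close to the 20× (strict) and 39× (uncorrected) figures reported here.

```python
# Sketch of the activity-volume formula. Inputs marked ASSUMPTION are not from the audit script.

WINDOW_DAYS = 108
HOURS_PER_WORKDAY = 8


def activity_multiplier(subagent_hrs: float, tool_calls: int, min_saved_per_call: float,
                        labor_hrs: float, dedup_fraction: float = 0.0) -> float:
    """(total work hours / 8) / 108, with an optional share of tool calls dropped
    because they were already counted inside subagent work."""
    counted_calls = tool_calls * (1.0 - dedup_fraction)
    tool_hrs = counted_calls * min_saved_per_call / 60
    total_work_hrs = subagent_hrs + tool_hrs + labor_hrs
    return (total_work_hrs / HOURS_PER_WORKDAY) / WINDOW_DAYS


# subagents x hrs_parallel, collapsed into the "~1,800 workdays" figure from disclosure 8.
SUBAGENT_HRS = 1_800 * HOURS_PER_WORKDAY
TOOL_CALLS = 220_269   # from the activity table
LABOR_HRS = 910        # ASSUMPTION: midpoint of the 728-1,092 keyboard-hour range

strict = activity_multiplier(SUBAGENT_HRS, TOOL_CALLS, 1, LABOR_HRS, dedup_fraction=0.5)
loose = activity_multiplier(SUBAGENT_HRS, TOOL_CALLS, 5, LABOR_HRS)
print(f"strict: {strict:.1f}x, uncorrected: {loose:.1f}x")
```

Keeping the dedup share and the minutes-per-call as separate parameters makes it easy to see which of the two corrections moves the number more.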

3. Per keyboard-hour output

What most people mean by "output multiplier." Per hour I'm at the keyboard, how many human-work-equivalent hours come out, including the agents running in parallel. This is the one that's hardest to argue with.

// Per keyboard-hour
total_output_hrs = Stefan_labor + agent_parallel_work + tool_call_savings
Stefan_kb_hrs = active_days × hrs_per_active_day
Multiplier_per_hour = total_output_hrs / Stefan_kb_hrs
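As a hedged illustration, the per-keyboard-hour ratio can be swept across the 8–12 hours per active day stated earlier. The sketch assumes that Stefan_labor equals keyboard hours, that agent work is the "~1,800 workdays" from disclosure 8, and that tool-call savings use the strict reading (half the calls, 1 minute each). None of these are confirmed audit-script settings, and `per_keyboard_hour` is a hypothetical name.

```python
# Sketch of the per-keyboard-hour formula, swept over hours per active day.

def per_keyboard_hour(agent_hrs: float, tool_savings_hrs: float, kb_hrs: float) -> float:
    """total_output_hrs / Stefan_kb_hrs.
    ASSUMPTION: Stefan_labor is the same quantity as keyboard hours."""
    total_output_hrs = kb_hrs + agent_hrs + tool_savings_hrs
    return total_output_hrs / kb_hrs


ACTIVE_DAYS = 91                          # from the activity table
AGENT_HRS = 1_800 * 8                     # "~1,800 workdays" of parallel execution
TOOL_SAVINGS_HRS = 220_269 * 0.5 * 1 / 60  # strict reading: half the calls, 1 min each

results = {}
for hrs_per_day in (8, 10, 12):
    kb_hrs = ACTIVE_DAYS * hrs_per_day
    results[hrs_per_day] = per_keyboard_hour(AGENT_HRS, TOOL_SAVINGS_HRS, kb_hrs)
    print(f"{hrs_per_day} h/day -> {kb_hrs} kb-hrs -> {results[hrs_per_day]:.1f}x")
```

This sweep varies only keyboard hours, so its spread is narrower than the 8×–35× range reported for this measure, which also reflects uncertainty in the agent and tool-call terms.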
Data inputs

What I shipped, what I measured.

Every number comes from three sources: git log across all my repos; a filesystem scan of clients, plugins, docs, and chapters; and JSONL parsing of 7,189 Claude Code session transcripts across 15 project directories. (I run Claude Code from multiple cwds; the original single-dir scan missed ~30% of my activity.)
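A minimal sketch of the multi-directory JSONL scan, to show why scanning a single directory undercounts. It is not the session-audit.py script from the companion folder. It assumes each transcript line is a JSON object whose `message.content` is a list of blocks and that a tool call appears as a block with `"type": "tool_use"`; that schema, the directory layout, and the function name `count_tool_calls` are assumptions to check against real transcripts.

```python
import json
from pathlib import Path


def count_tool_calls(projects_root: Path) -> dict[str, int]:
    """Stream every *.jsonl under each project directory, one line at a time,
    and return the number of tool-call blocks found per directory."""
    totals: dict[str, int] = {}
    for project_dir in sorted(p for p in projects_root.iterdir() if p.is_dir()):
        calls = 0
        for transcript in project_dir.rglob("*.jsonl"):
            with transcript.open(encoding="utf-8") as fh:
                for line in fh:
                    line = line.strip()
                    if not line:
                        continue
                    try:
                        record = json.loads(line)
                    except json.JSONDecodeError:
                        continue  # skip truncated or corrupt lines
                    if not isinstance(record, dict):
                        continue
                    message = record.get("message")
                    content = message.get("content") if isinstance(message, dict) else None
                    if isinstance(content, list):
                        # ASSUMPTION: tool calls are content blocks typed "tool_use".
                        calls += sum(1 for block in content
                                     if isinstance(block, dict) and block.get("type") == "tool_use")
        totals[project_dir.name] = calls
    return totals


# Hypothetical usage; the transcript location is an assumption:
# per_dir = count_tool_calls(Path.home() / ".claude" / "projects")
# print(sum(per_dir.values()), per_dir)
```

Reading line by line keeps memory flat regardless of transcript size, and returning per-directory totals makes a missed cwd visible as a missing key rather than a silently smaller sum.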

Shipped artifacts (in window)

Category: Count
Enterprise-tier sites / engagements: 7
Portfolio-tier sites shipped: 2
Sites in active build (≥50%): 3
Provisional patents filed (77 claims drafted): 4
Trademarks filed with methodology framework: 1
Book chapters drafted (publication-quality): 12
OSS plugins / skills / framework: 8
Cognograph MVP (Electron + React + WebGL + SaaS, ~50K LOC): 1
Forge (automated design-audit product): 1
Plans / substantive docs written: 75+

Activity volume

Signal: Count
Git commits (Stefan-authored, 9 repos): 1,578
Lines of code added (gross): 3,018,688
Active commit days / 108: 80
Active days across git + Claude Code / 108: 91
Claude Code sessions (main + subagent, 15 project dirs): 7,189
Tool calls: 220,269
Subagent dispatches (parallel work units): 4,861
Input + output tokens: 128.3M
Cache-read tokens (context reuse): 40.4B
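For the git rows, a sketch of how commit counts and active commit days could be pulled from `git log` and merged across repos. It is an illustration under assumptions, not the method behind 01-git-audit.md: the helper names, the author filter, and the exact window bounds are hypothetical.

```python
import subprocess


def commit_dates(repo: str, author: str,
                 since: str = "2026-01-01 00:00:00",
                 until: str = "2026-04-18 23:59:59") -> list[str]:
    """Return one YYYY-MM-DD string per commit by `author` inside the window."""
    out = subprocess.run(
        ["git", "-C", repo, "log", f"--author={author}",
         f"--since={since}", f"--until={until}",
         "--pretty=format:%ad", "--date=short"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [line for line in out.splitlines() if line]


def summarize(dates_per_repo: list[list[str]]) -> tuple[int, int]:
    """Total commits and distinct active days across all repos.
    A day with commits in several repos counts once."""
    all_dates = [d for dates in dates_per_repo for d in dates]
    return len(all_dates), len(set(all_dates))


# Hypothetical usage (repo paths and author string are placeholders):
# repos = ["/path/to/repo-1", "/path/to/repo-2"]
# commits, active_days = summarize([commit_dates(r, "Stefan") for r in repos])
```

Deduplicating dates across repos matters here: summing per-repo active days would overstate the "active commit days / 108" row.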

2024 same-window baseline on every activity axis: 0. Claude Code did not exist in this form.

Data nugget: cognograph-02 cwd contributed 1,799 sessions in window but zero subagent dispatches. That project pre-dated my Agent-tool adoption. Subagents only show up in the workspace cwd starting March. Cache-read / I+O ratio is ~315×, which you'd expect from heavy long-session context persistence via the prompt cache.
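The ~315× figure can be reproduced directly from the two token rows in the activity table. The values below are the rounded totals as reported, so the result is approximate.

```python
# Cache-read to input+output token ratio, from the rounded table values.
io_tokens = 128.3e6          # input + output tokens
cache_read_tokens = 40.4e9   # cache-read tokens

ratio = cache_read_tokens / io_tokens
print(f"cache-read / I+O ratio: ~{ratio:.0f}x")
```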

Disclosures

What I checked, what I skipped.

  1. The "20–50×" phrase was mine, not Claude's. I said it first, casually, on 2026-04-06 when I was drafting a Claude cancellation survey. Claude later flagged that exact line as marketing filler in a different draft and cut it. This audit is the first time anyone ran the math. It wasn't built to justify the claim; it was built to find out whether it held.
  2. Every weight is in the open. Counts are objective (git, filesystem, JSONL). Weights are subjective (workdays a principal solo would need per artifact). If you don't like my weights, swap in your own and re-run. Artifacts in the companion folder: session-audit.py (streaming parser, per-line JSON, handles 30M+ rows), session-audit-raw-multiproject.csv (per-session rows), audit-totals-multiproject.json (aggregates), plus 01-git-audit.md / 02b-session-audit-multiproject.md / 03-ship-artifacts.md.
  3. Four rounds of self-correction. Every time I pushed back on the first pass, the number went up, not down. First pass weights were at median-senior instead of principal-tier. The per-hour measure initially left out agent work (structurally wrong). The ship list got expanded three times because I kept remembering clients Claude had missed. When corrections move in one direction only, the first pass was conservative, not sympathetic.
  4. Cognograph's weight is a lowball on purpose. A solo designer-frontend without AI isn't shipping Cognograph (Electron + React + custom WebGL shader pipeline + SaaS backend, ~50K LOC source) in 108 days. Realistically impossible. I capped the weight at 250 workdays as a proxy, because "impossible" doesn't compute.
  5. Baseline choice drives half the number. Against "solo designer-frontend without AI, 2026" (a hypothetical nobody actually is), the artifact multiplier is 19–41×. Against "top-decile solo peer with Cursor + Copilot + Sonnet" (what your competitors actually use), it's 8–18×. I default to the stricter baseline in the headline because the no-AI comparison is a ghost. Calibration anchor for both: my own pre-2025 pace — Simply Smart Home drove 233% YOY, Tantum shipped 15+ startups on broadcast timelines, iO Theater ticket sales moved 50% to 75%.
  6. Confounders I can't control for. Some output credit belongs to (a) 15 years of prior expertise — PFD methodology is assembled from stuff I already knew, not invented in window; (b) Cognograph's pre-2026 foundations; (c) urgent runway, which changes working hours and focus regardless of tooling. "Claude's multiplier" is entangled with "Stefan's 2026 conditions." The numbers don't disaggregate that.
  7. Activity-volume has a double-count risk I corrected for. Subagent dispatches execute their own tool calls, so those tool calls appear both in the 220,269 total AND as subagent parallel work. The stricter read dedup's ~50% of tool calls against the subagent bucket and cuts per-call time savings from 5 min to 1 min. Without that correction, the activity multiplier is 39× (inflated); with it, 20×.
  8. The fact that's hardest to argue with: 4,861 subagent dispatches in 108 days. Work that literally didn't exist as a human-possible category before the tool shipped. That alone is ~1,800 workdays of parallel execution, nearly 17× one person's 108-day calendar capacity. Before I count anything I did myself.
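The arithmetic in item 8 can be unpacked to expose the weight it implies per dispatch. The sketch uses only numbers stated in this report plus the 8-hour workday from the activity formula; the implied hours-per-dispatch value is derived here and is not a figure quoted from the audit script.

```python
# Unpacking item 8: what "~1,800 workdays" implies per subagent dispatch.
DISPATCHES = 4_861
PARALLEL_WORKDAYS = 1_800   # "~1,800 workdays of parallel execution"
WINDOW_DAYS = 108
HOURS_PER_WORKDAY = 8       # same divisor as the activity-volume formula

capacity_multiple = PARALLEL_WORKDAYS / WINDOW_DAYS
implied_hrs_per_dispatch = PARALLEL_WORKDAYS * HOURS_PER_WORKDAY / DISPATCHES

print(f"{capacity_multiple:.1f}x one person's calendar capacity")
print(f"~{implied_hrs_per_dispatch:.2f} human-hours credited per dispatch")
```

The second number is the one to challenge: the multiplier scales linearly with how many human-hours a single dispatch is assumed to replace.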