Blog
Long-form essays on AI products, eval / benchmark methodology, and design.
-
How Far You Can Let an AI Off the Leash Depends on Whether You Can Check Its Work
After a few years building with large language models, I keep landing on the same conclusion. What decides how much you can hand to an AI agent isn't how smart it is. It's whether you can check its work quickly and reliably.
-
What an AI-Native Person Is Actually Like
Since 2023 I've been building things on top of large language models, and I'm increasingly convinced of one thing: the people who adapt to AI fastest usually aren't the most technical. What sets each generation of "natives" apart is not its tools but its default assumptions.
-
Why Agent Interfaces Run Backwards
A nine-year-old built a polished game with Claude Code. The model is already smart enough. The real question is what kind of interface lets an eager beginner steer all that intelligence.
-
What an AI-Era PM Actually Does
The PRD is dead, long live the benchmark. In the AI era, the way a PM defines a product's value is shifting from writing requirement docs to defining standards for how a model should behave.