How Stripe deploys 1,300 AI-written PRs per week
TL;DR
Stripe is landing ~1,300 AI-written PRs per week: Steve Kaliski says these pull requests have “no human assistance besides review,” which changes where engineering starts, often in Slack, Google Docs, or tickets rather than the code editor.
The real unlock is lower activation energy, not just faster typing: at Stripe, an engineer can react with an emoji in Slack to spin up a cloud dev environment plus a “minion” agent that searches the codebase, edits files, runs tests, and opens a PR.
Great developer experience doubles as great agent infrastructure: Stripe’s long-standing developer productivity investment, including hosted dev environments, internal docs, CI, test data, and blessed workflows, is what makes one-shot agentic coding work in a huge codebase.
Cloud environments matter because local machines cap agent parallelism: Steve and Claire Vo both stress this point. Once you’re juggling 3-4 local worktrees, your laptop “sounds like an airplane,” while cloud devboxes let many isolated agents run in parallel.
Human review doesn’t go away and the bottleneck just moves: Steve’s answer to “how do you review 1,300 PRs?” is better CI, synthetics, test coverage, and blue-green deploys. Whether code is written by Steve or “Steve’s robot,” the safety systems need to be identical.
Stripe is also building for agents that can spend money: in a second demo, Claude planned a birthday party by paying Browserbase, Parallel, and Postal Form via Stripe’s new machine payment protocol with Tempo, spending a bit over $5 plus a $1.65 Stripe Climate contribution tied to ~70k tokens.
The Breakdown
From code editor first to Slack-first engineering
Steve opens with the headline: Stripe is landing about 1,300 PRs a week with no human assistance besides review. The bigger shift, though, is personal: he says he can’t remember the last time he started work in a text editor, because the real starting points are usually a Google Doc, a Jira ticket, or a Slack thread where an emoji can now kick off the whole process.
What a “minion” actually is inside Stripe
A minion is basically a cloud-hosted Stripe dev environment pre-seeded with a prompt and hooked into internal tools, docs, CI, and test data. Steve demos the flow from Slack: he reacts to a message about improving docs for docs.stripe.com/payment/machine, and Stripe provisions a repo-specific environment, creates a branch, boots services, and starts the agent loop.
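The flow Steve demos can be pictured as a short pipeline. The sketch below is only an illustration of the stages named above: the `Minion` class, the `on_slack_reaction` function, and the `"minion"` emoji name are all hypothetical and say nothing about how Stripe's internal tooling is built.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Minion:
    """Hypothetical model of one minion run; not Stripe's actual data model."""
    repo: str
    prompt: str
    steps: List[str] = field(default_factory=list)

    def run(self) -> List[str]:
        # Each entry mirrors one stage named in the talk, in order.
        self.steps.append(f"provision devbox for {self.repo}")
        self.steps.append("create branch")
        self.steps.append("boot services")
        self.steps.append(f"start agent loop with prompt: {self.prompt!r}")
        self.steps.append("open PR for human review")
        return self.steps


def on_slack_reaction(emoji: str, message_text: str, repo: str) -> Optional[Minion]:
    """Create a minion only for the trigger emoji (the emoji name is made up)."""
    if emoji != "minion":
        return None
    # The Slack message itself becomes the pre-seeded prompt.
    return Minion(repo=repo, prompt=message_text)


minion = on_slack_reaction("minion", "Improve the machine payments docs", "docs")
if minion is not None:
    for step in minion.run():
        print(step)
```

The point of the sketch is that the human input is one reaction and one message; everything else is provisioning that the platform already knows how to do.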
Why Stripe’s dev tooling is doing most of the heavy lifting
Steve is very clear that the magic sits on top of years of developer productivity work, not instead of it. Because Stripe already has hosted environments, strong internal docs, and well-defined workflows, agents can operate on the same “blessed paths” humans do — which is why one-shot success is plausible in a massive codebase instead of impossibly expensive or context-window doomed.
Goose, prompts, and the surprisingly simple interface
The minion runs on top of Goose, the open-source agent harness Stripe forked and adapted to its own environment. Claire Vo jokes that Steve’s prompt strategy is just “implement this task completely… no mistakes,” and that’s the point: with a good harness and strong tools, you don’t need an overengineered prompt to get useful output.
The hidden thesis: cloud devboxes beat giant laptops
This turns into one of the best tangents in the episode. Claire Vo says every engineer has a giant MacBook Pro, but once you run a few worktrees locally it sounds like “an airplane taking off,” while Steve points out that cloud environments let you run many isolated agents in parallel, even ones kicked off from Slack on your phone during the subway ride to work.
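A toy version of the parallelism argument, assuming each agent only needs its own isolated working directory: `run_agent` and `run_in_parallel` are hypothetical stand-ins, and `max_workers` plays the role of available capacity (small on a laptop, much larger with cloud devboxes).

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import List


def run_agent(task: str) -> str:
    """Stand-in for an agent run that works inside its own throwaway directory."""
    with tempfile.TemporaryDirectory(prefix="devbox-") as workdir:
        # Writing only inside workdir keeps agents from touching each other's files.
        notes = Path(workdir) / "notes.txt"
        notes.write_text(f"working on: {task}\n")
        return f"{task}: done"


def run_in_parallel(tasks: List[str], max_workers: int) -> List[str]:
    # max_workers models capacity; results come back in the order tasks were given.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_agent, tasks))


tasks = ["fix flaky test", "update docs", "rename config flag", "bump dependency"]
for result in run_in_parallel(tasks, max_workers=4):
    print(result)
```

Real agents are far heavier than this stub, which is exactly why the worker cap on a single laptop is reached after a handful of worktrees.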
Review doesn’t disappear; it becomes the new bottleneck
When asked how Stripe reviews all this code, Steve doesn’t pretend AI solved it. His answer is old-school engineering discipline: strong CI, end-to-end synthetics, test coverage, and blue-green deployments with rollback paths; if coding gets cheaper, the bottleneck simply shifts to review, ideas, and distribution.
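One way to read “the safety systems need to be identical” is as a deploy gate that never looks at who wrote the change. The sketch below is an assumption-laden illustration: `ChangeChecks` and `can_deploy` are hypothetical names, and the fields simply restate the safeguards listed above plus human review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeChecks:
    """Hypothetical summary of the signals named in the talk for one change."""
    author_kind: str  # "human" or "agent"; recorded but never consulted below
    ci_passed: bool
    synthetics_passed: bool
    tests_cover_change: bool
    rollback_available: bool
    human_approved: bool


def can_deploy(change: ChangeChecks) -> bool:
    # The gate deliberately ignores author_kind: the same bar applies to everyone.
    return all([
        change.ci_passed,
        change.synthetics_passed,
        change.tests_cover_change,
        change.rollback_available,
        change.human_approved,
    ])


by_human = ChangeChecks("human", True, True, True, True, True)
by_agent = ChangeChecks("agent", True, True, True, True, True)
print(can_deploy(by_human) == can_deploy(by_agent))
```

If every check is automated except `human_approved`, that field is where the queue builds up, which matches the claim that review becomes the bottleneck.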
Minions are starting to escape engineering
Because the interface is Slack, not a local IDE, non-engineers can use it too. Steve says product managers and others can describe what they want in plain English — a doc edit, a proof of concept, design feedback turned into implementation — and the lower intimidation factor means more people can trigger useful work.
The birthday-party demo: agents as economic actors
The second half jumps from coding agents to spending agents. Steve shows Claude planning product manager Jen Lee’s birthday by paying Browserbase for a browsing session, Parallel AI to search New York matcha-friendly venues, and Postal Form to mail an invite, then donating $1.65 to Stripe Climate to offset ~4.4 kg of carbon from roughly 70k tokens.
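As a quick sanity check on the climate figures, the arithmetic below derives the rates they imply. It uses only the three approximate numbers quoted above; the resulting per-tonne price and per-token footprint are back-of-envelope values implied by those numbers, not rates published by Stripe.

```python
# Figures quoted in the demo summary above (both the kg and token counts are approximate).
climate_contribution_usd = 1.65
carbon_offset_kg = 4.4
tokens_used = 70_000

# Implied price of the offset, scaled from kilograms to tonnes.
usd_per_tonne = climate_contribution_usd / carbon_offset_kg * 1000

# Implied footprint per token, in grams of carbon.
grams_per_token = carbon_offset_kg * 1000 / tokens_used

print(f"implied offset price: ${usd_per_tonne:.0f} per tonne")
print(f"implied footprint: {grams_per_token:.3f} g per token")
```

Under those inputs the contribution works out to roughly $375 per tonne and about 0.063 g per token.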
Why machine payments feel bigger than the terminal demo
Steve frames this as Stripe preparing for a world where third-party services sell directly into agent workflows, not just human dashboards. One memorable anecdote: after asking users for feedback on Stripe’s machine-payments work, he kept getting polished two-page responses in 30 seconds because the engineers on the other side had Claude or Codex read the docs, implement the feature, and then write the feedback too. It is a vivid reminder that sometimes the new user is the agent itself.