Anthropic, OpenAI, and Microsoft Just Agreed on One File Format. It Changes Everything.
TL;DR
Skills have shifted from personal prompts to org-wide infrastructure — Nate says what started in October as individual Claude add-ons has become a set of version-controlled, workplace-wide assets surfacing in Claude, Copilot, Excel, PowerPoint, and ChatGPT as a shared AI context layer.
Agents, not humans, are now the main callers of skills — because agents can make hundreds of skill calls in one run, skills need to be written agent-first, with clear routing, predictable outputs, and handoffs that survive without a human correcting drift midstream.
A skill is deceptively simple: just a folder with a SKILL.md file — but the real leverage comes from encoding methodology, reasoning, output format, edge cases, and examples in plain English so workflows become reusable instead of trapped in copy-pasted prompts.
The description field is the make-or-break piece — vague lines like “helps with competitive analysis” fail, while single-line, trigger-rich descriptions with explicit artifact types and output expectations are what actually get Claude and other models to invoke the right skill.
The best teams are treating skills like compounding operational memory — Nate contrasts prompts that “evaporate” with skills that improve over time, citing examples like a developer “specialist stack” and Texas Paintbrush’s 50,000 lines of real-estate operations skills across 50 repos.
Nate’s big practical point is that skills need testing, contracts, and tiers — if agents are using them in production, teams should benchmark them with test suites, define outputs like API contracts, and organize them into standard, methodology, and personal workflow layers.
The Breakdown
From solo prompt hacks to shared AI infrastructure
Nate opens by saying the industry’s mental model is outdated: people still think of skills as the little personal Claude trick from last fall, when they’ve quietly become infrastructure. The biggest shift is organizational rollout — skills are now single uploads, version-controlled, and available across tools like Excel, PowerPoint, Claude, and Copilot, so expertise no longer has to live “in someone’s head.”
Why the real caller is now the agent
His second big update is the one he thinks everyone missed: humans used to call skills, but now agents do. That matters because a person might invoke a few skills in a chat, while an agent may make hundreds of calls in one workflow, which means design failures get amplified fast and there may be no human in the loop to catch them.
The tiny file format that suddenly matters everywhere
Nate reduces the whole thing to a surprisingly humble primitive: a skill is just a folder with one required markdown file containing metadata and instructions. But he argues that this “lowly markdown file” is becoming a cross-platform standard embraced by Anthropic, Microsoft, and OpenAI — and because best practices are still being discovered, people are trading skills “like baseball cards at camp.”
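To make the primitive concrete, here is a minimal sketch of what such a skill folder might contain — the skill name and all body text are hypothetical, and the exact frontmatter fields may vary by platform, but the shape is a single markdown file with metadata up top and instructions below:

```markdown
---
name: deal-memo-drafter
description: Drafts investment deal memos. Use when the user asks for a "deal memo", "IC memo", or "investment summary"; outputs a markdown memo with thesis, risks, and terms sections.
---

# Deal Memo Drafter

## When to use
The user has a target company and wants a structured memo, not a chat answer.

## Method
1. Pull the thesis from the user's notes before writing anything.
2. List risks explicitly; never bury them in prose.

## Output format
A markdown document with exactly three H2 sections: Thesis, Risks, Terms.
```

The point of the metadata/body split is that the model reads the short description when deciding whether to route to the skill, and only loads the full instructions once it has committed.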
What people are actually building with skills
He gives two concrete patterns. In software, teams create a “specialist stack,” where one skill turns vague ideas into a PRD, another breaks that into GitHub issues, and another writes tests; in real estate, Texas Paintbrush on X has built 50,000 lines of skills across 50 repos for rent rolls, comps, cash flow, and handoff protocols. Nate loves that these repos help both the agent and the next human hire understand how the business actually works.
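The software “specialist stack” he describes might look something like this as a repo layout (directory and skill names are illustrative, not from the talk):

```text
skills/
  prd-writer/
    SKILL.md        # vague idea -> structured PRD
  issue-decomposer/
    SKILL.md        # PRD -> scoped GitHub issues
  test-author/
    SKILL.md        # issue -> tests written before implementation
```

Each skill's output is the next skill's input, which is what makes the stack work for an agent running the chain end to end.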
Why prompts don’t compound but skills do
This is where he gets especially animated: prompts are still useful, but they’re now the “basic 4x4 Lego block,” not the castle. Skills, by contrast, get refined, versioned, and improved over time, so six months of iteration creates real operational memory instead of a pile of copied-and-pasted prompts that disappear when the conversation ends.
The practical recipe — and the gotchas that break skills
Nate says most skills die in the description field. The description must be specific, include trigger phrases and output types, and stay on a single line — if a formatter breaks it, Claude may ignore the second line entirely. In the body, he wants reasoning instead of just steps, a specified output format, explicit edge cases, and examples, while still keeping the core file lean — usually under 100 to 150 lines.
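A before-and-after sketch of the description field makes the contrast concrete (the skill name and wording here are invented for illustration):

```markdown
---
name: competitive-analysis
# Too vague -- gives the router nothing to match on:
# description: Helps with competitive analysis.
description: Builds a competitor feature-comparison matrix. Use for requests like "compare us to X", "competitive landscape", or "feature gap analysis"; outputs a markdown table plus a three-bullet summary.
---
```

Note the good version stays on one physical line: per Nate's warning, a formatter that wraps it can cause everything after the break to be ignored.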
Designing for agent use means testing and thinking in contracts
Because agents are now the primary caller, Nate says teams need to test skills quantitatively with baskets of evaluations, version them, and compare changes rather than assuming edits will behave predictably. He also pushes a more agent-native design mindset: the description is a routing signal, the output should behave like an API contract, and every skill should produce something composable enough for the next agent or sub-agent in the chain.
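One way to read “baskets of evaluations” plus “output as an API contract” is a small regression harness: each test case pairs a prompt with contract checks, and every skill edit gets scored against the same basket so versions can be compared numerically. This is a minimal sketch, not Nate's tooling — all names (`Case`, `score`, `fake_skill`) are hypothetical, and `fake_skill` stands in for a real model call:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Case:
    prompt: str
    checks: list[Callable[[str], bool]]  # each check is one contract clause

def score(run_skill: Callable[[str], str], cases: list[Case]) -> float:
    """Fraction of contract clauses the skill's outputs satisfy."""
    passed = total = 0
    for case in cases:
        output = run_skill(case.prompt)
        for check in case.checks:
            total += 1
            passed += check(output)  # bool counts as 0 or 1
    return passed / total if total else 0.0

# Stand-in for an actual model invocation of the skill.
def fake_skill(prompt: str) -> str:
    return "## Summary\n| feature | us | them |\n"

cases = [
    Case("compare us to Acme",
         checks=[lambda o: "|" in o,             # must contain a table
                 lambda o: o.startswith("##")]),  # must lead with a heading
]
print(score(fake_skill, cases))  # 1.0 for the stand-in skill
```

Running the same basket before and after an edit turns “I think the new version is better” into a number you can compare across skill versions.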
The team playbook and Nate’s community repo bet
For mixed human-AI teams, he suggests three tiers: standard skills for brand voice and templates, methodology skills that capture senior practitioners’ craft, and personal workflow skills that shouldn’t stay buried in one person’s laptop. That framing leads to his pitch for a new community repo inside OpenBrain: not generic starter-pack prompts, but domain-specific practitioner skills for things like competitive analysis, deal memos, financial model review, and research synthesis, all vetted for “agent readability.”
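The three tiers map naturally onto a repo layout; this is an assumed structure to illustrate the framing, not something Nate specifies:

```text
skills/
  standard/                 # org-wide: brand voice, templates
    brand-voice/SKILL.md
  methodology/              # senior practitioners' craft, codified
    deal-memo-review/SKILL.md
  personal/                 # individual workflows, surfaced rather than buried
    weekly-digest/SKILL.md
```

Keeping the tiers in one version-controlled tree is what lets a personal workflow graduate into a team methodology instead of dying on someone's laptop.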