
No, Karpathy Didn't Rank Which Jobs AI Will Kill

Echo

March 17, 2026

Andrej Karpathy published an interactive tool that scores 342 US occupations by AI exposure. It went viral. Most of the viral posts got the point wrong.

TL;DR

  • Karpathy built an open-source treemap of the entire US job market, colored by "digital AI exposure"
  • The score measures how much of a job happens on a computer, not whether that job disappears
  • High exposure and job loss are different things (Jevons Paradox explains why)
  • The tool is a pipeline anyone can fork and re-run with their own questions

What Karpathy built

The page at karpathy.ai/jobs is a treemap of the US labor market. 342 occupations from the Bureau of Labor Statistics, covering 143 million jobs. Each rectangle is sized proportionally to total employment. Bigger rectangle, more people working that job.
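The sizing rule is simple proportionality: each occupation's rectangle area is its share of total employment. A minimal sketch with made-up employment counts (the numbers below are illustrative, not BLS figures):

```python
# Compute each occupation's share of total employment; a treemap layout
# would allocate rectangle area in these proportions.
def area_shares(employment: dict[str, int]) -> dict[str, float]:
    total = sum(employment.values())
    return {name: count / total for name, count in employment.items()}

# Illustrative counts only -- not actual BLS employment data.
jobs = {"General office clerks": 2_500_000, "Data entry keyers": 150_000}
shares = area_shares(jobs)
# The larger occupation gets the proportionally larger rectangle.
```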

You can toggle the coloring between 4 layers: BLS projected growth outlook, median pay, education requirements, and the one that made it go viral, Digital AI Exposure.

That last layer is new. Karpathy built an LLM-powered scoring pipeline. It works in 3 steps: a scraper pulls occupation descriptions from the BLS website, an LLM reads each description and scores it on a 0 to 10 scale, and the treemap colors each rectangle based on that score. The prompt, the pipeline, and all the code are open-source on GitHub.
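The three steps can be sketched as a tiny pipeline. Everything below is illustrative: the function names are mine, and both the scraper and the scorer are stubbed, since the real versions in Karpathy's repo hit the BLS website and an LLM API.

```python
# Step 1 (stubbed): the real pipeline scrapes each occupation's
# description from the BLS website.
def fetch_descriptions() -> dict[str, str]:
    return {"Data entry keyers": "Enter data into computer systems from paper records."}

# Step 2 (stubbed): the real pipeline sends the description to an LLM,
# which returns a 0-10 score plus a short rationale.
def score_occupation(description: str) -> dict:
    return {"score": 10, "rationale": "Routine, fully digital information processing."}

# Step 3: attach a score to each occupation so the treemap can color
# its rectangle by exposure.
def run_pipeline() -> dict[str, dict]:
    return {occ: score_occupation(desc) for occ, desc in fetch_descriptions().items()}
```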

Two things worth noticing about the design.

First, it's a treemap, not a ranked list. A ranked list invites "who's number 1?" thinking. The treemap forces you to see scale. Data entry clerks might score 10 out of 10 on AI exposure, but they're a tiny rectangle. Secretaries, general office clerks, and bookkeeping workers also score high, and they're enormous rectangles representing millions of jobs. The visual makes that difference obvious in a way that a list never would.

Second, the Digital AI Exposure scoring is just one example prompt. The whole point of building it as a pipeline is that you can swap in a different prompt and re-run it. Want to measure exposure to humanoid robotics instead? Write that prompt, run the pipeline, get a different coloring. Karpathy is explicit about this: "This is not a report, a paper, or a serious economic publication. It is a development tool for exploring BLS data visually."

That framing matters. He built a tool for asking questions, not a tool that gives answers.

How the scoring actually works

The Digital AI Exposure prompt is published on the page itself. It asks the LLM to rate each occupation on a 0 to 10 scale based on one core signal: how much of the job's work product is fundamentally digital.

The key line from the prompt: "If the job can be done entirely from a home office on a computer, writing, coding, analyzing, communicating, then AI exposure is inherently high (7+), because AI capabilities in digital domains are advancing rapidly."

The prompt provides calibration anchors so the LLM scores consistently across 342 occupations:

0 to 1: Minimal exposure. Work is almost entirely physical. AI has essentially no impact on daily tasks. Roofers, landscapers, commercial divers.

2 to 3: Low exposure. Mostly physical or interpersonal work. AI might handle scheduling or paperwork but doesn't touch the core job. Electricians, plumbers, firefighters, dental hygienists.

4 to 5: Moderate exposure. A mix of physical and knowledge work. AI can assist with the information-processing parts, but a substantial share still requires human presence. Registered nurses, police officers, veterinarians.

6 to 7: High exposure. Predominantly knowledge work with some need for human judgment or physical presence. AI tools are already useful and workers using AI may be substantially more productive. Teachers, managers, accountants, journalists.

8 to 9: Very high exposure. Almost entirely done on a computer. All core tasks are in domains where AI is rapidly improving. Software developers, graphic designers, translators, data analysts, paralegals, copywriters.

10: Maximum exposure. Routine information processing, fully digital, with no physical component. AI can already do most of it. Data entry clerks, telemarketers.
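Read as a lookup table, the anchors map any score to its band. A quick sketch (the band labels come from the prompt above; the helper function itself is mine, not part of Karpathy's code):

```python
# Upper bound of each calibration band, paired with its label.
BANDS = [
    (1, "Minimal exposure"),
    (3, "Low exposure"),
    (5, "Moderate exposure"),
    (7, "High exposure"),
    (9, "Very high exposure"),
    (10, "Maximum exposure"),
]

def band_label(score: int) -> str:
    """Return the calibration band a 0-10 exposure score falls into."""
    for upper, label in BANDS:
        if score <= upper:
            return label
    raise ValueError(f"score out of range: {score}")
```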

The LLM reads each occupation's BLS description and returns a JSON object with a score and a 2 to 3 sentence rationale. The average across all 342 occupations comes out to 5.3 out of 10.
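Handling that response takes only a few lines. A sketch under one assumption: the field names (`score`, `rationale`) are inferred from the description above, not copied from Karpathy's code.

```python
import json
from statistics import mean

def parse_score(raw: str) -> dict:
    """Parse one LLM response and sanity-check that the score is on the 0-10 scale."""
    obj = json.loads(raw)
    assert 0 <= obj["score"] <= 10, "score must be on the 0-10 scale"
    return obj

# Two illustrative responses; the real pipeline collects one per occupation.
responses = [
    '{"score": 9, "rationale": "Almost entirely computer-based work."}',
    '{"score": 2, "rationale": "Mostly physical; AI only touches paperwork."}',
]
scores = [parse_score(r)["score"] for r in responses]
average = mean(scores)  # across all 342 real occupations, the page reports 5.3
```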

What makes this useful is the specificity. The score isn't "will AI take your job?" It's "how much of your work sits in the digital domain where AI is improving fastest?" That's a much narrower, more answerable question.

What the scores don't mean

Here's where most of the viral posts went wrong. A high exposure score doesn't predict that a job will disappear. Karpathy says so directly on the page: "A high score does not predict the job will disappear. Software developers score 9/10 because AI is transforming their work, but demand for software could easily grow as each developer becomes more productive."

That's Jevons Paradox. When a resource becomes more efficient to use, total consumption often goes up, not down. It was first observed in 1865 with coal: more efficient steam engines didn't reduce coal consumption, they made coal useful for more things, and demand exploded. The same logic applies here. If AI makes each software developer 3x more productive, the cost of building software drops. When the cost drops, organizations build more software. The number of developers needed could stay flat or even grow.
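The arithmetic behind that claim fits in one line. A back-of-the-envelope sketch with made-up numbers:

```python
def developers_needed(demand_units: float, units_per_dev: float) -> float:
    """Headcount is demand divided by per-developer output."""
    return demand_units / units_per_dev

before = developers_needed(100, 1.0)  # 100 units of software wanted, 1 per dev
# AI triples each developer's output; cheaper software means organizations
# want three times as much built (the Jevons effect).
after = developers_needed(300, 3.0)
# Headcount is unchanged: productivity gains were absorbed by new demand.
```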

Karpathy's own caveat section lists 4 things the score explicitly doesn't account for:

  1. Demand elasticity. More productivity can create more demand, not less.
  2. Latent demand. There are things people want built but can't afford yet. Cheaper production opens those markets.
  3. Regulatory barriers. Some jobs are legally required to have a human in the loop regardless of what AI can do.
  4. Social preferences. People might prefer a human doctor, teacher, or therapist even if AI could technically handle the task.

The score also has no time dimension. It doesn't say when any of this happens. Software developers score 9 out of 10, but that's a statement about the digital nature of the work, not a timeline for replacement. Two years? Twenty years? The tool can't tell you.

And there's an inherent circularity worth noting: an AI model is rating how exposed jobs are to AI. Karpathy handles this by calling the scores "rough LLM estimates, not rigorous predictions" and by making the entire pipeline open-source. You can inspect every step, from the prompt to the scoring logic to the visualization.

Two concepts worth knowing

Two ideas from Karpathy's broader writing help frame what this tool is doing.

Moravec's Paradox. AI can ace a PhD-level exam but can't reliably do what a first-week intern does. Passing a standardized test is a narrow, well-defined task. Real jobs require stringing together long sequences of varied tasks, catching your own errors, adapting when something unexpected happens, applying common sense. Those capabilities are, counterintuitively, much harder for AI than answering hard questions.

Moravec's Paradox applied to digital work: AI confidently handles structured tasks like organizing documents and spreadsheets, but struggles with messy judgment calls like prioritizing emails and sensing project trouble.

Karpathy brought this up in a widely liked post earlier this year. He was responding to "Humanity's Last Exam," a benchmark designed to test AI on the hardest possible questions. His point: these evals test isolated tasks "served neatly on a platter." Real jobs require "long, multimodal, coherent, error-correcting sequences of tasks glued together for problem solving." Beating the test is the easy part. Doing the job is still far away.

A practical example: AI can write working code for a feature. But figuring out that the feature needs a confirmation dialog before a destructive action, something any junior product manager would catch, is the kind of common-sense reasoning that AI still struggles with. The "easy" parts of a job are often the hardest for AI.

Jevons Paradox (applied to AI). Karpathy's most-liked post this year (8,700+ likes) used radiology as the case study. The prediction that AI would make radiologists obsolete was one of the earliest and most confident claims about AI and jobs. Instead, radiology is growing. His explanation: real jobs are more multifaceted than a single task, institutional adoption is slow, and when tools make workers faster, demand grows to absorb the new capacity.

His criteria for which jobs will change sooner: repetitive tasks, each relatively independent, closed context (not requiring too much outside information), short in duration, forgiving (low cost of mistakes), and digitally automatable. Even for jobs that meet all these criteria, he expects AI to be adopted as a tool first. Work shifts from doing to monitoring and supervising.

On AGI and autonomous agents replacing workers entirely, Karpathy is clear: that still requires "major research breakthroughs," not just scaling current models. He points to 3 breakthroughs so far (Transformers around 2017, ChatGPT/RLHF around 2022, reasoning models around 2024) and says "we need a few more conceptual leaps of this class."

How people are reading it

The tool went viral on March 16. Within hours, the framing in most shared posts had shifted from "exposure" to "replacement."

The most common misread: treating the 0 to 10 score as a prediction of which jobs AI will eliminate. Several widely shared posts declared that anyone whose work involves a computer will lose their job. Others used the tool as a prompt for career advice that the data doesn't support. At least one account used the link to promote a crypto token.

Some posts got the nuance right, noting that high exposure means reshaping, not disappearing, and that Karpathy included caveats for a reason. An interesting side effect: within a day, people in other countries started adapting the framework to their own labor markets using local employment data.

The pattern is consistent with how AI research gets covered more broadly. The source material is cautious and specific. The first wave of shares strips the caveats. The second wave treats the stripped version as fact.

The tool measures how digital a job is. Karpathy published the prompt, the code, and the caveats. The scores are a starting point for questions, not answers about anyone's career.

Explore the tool yourself → karpathy.ai/jobs