TBPN · 29m

FULL INTERVIEW: Why I Think Nvidia Is Perfectly Positioned In The AI Race

TL;DR

  • Tay Kim thinks Nvidia is 'positioned perfectly' for the next AI wave, not peaking — despite a 21% drop from the 52-week high and sector-wide fear around capex, he argues the setup looks like last year's DeepSeek/tariff panic while actual demand for AI compute is still accelerating.

  • Inference demand is exploding because of agents and coding assistants — Kim says engineers at Meta, Google, and Nvidia all report compute shortages, while everyday 'vibe coders' are juggling multiple model subscriptions and even using 'sneaker bots for NeoClouds' to grab scarce B200 GPUs.

  • Nvidia's Groq deal is framed as a smart complement, not a philosophical pivot away from GPUs — Jensen Huang is pairing Vera Rubin with Groq-style low-latency inference, with Kim citing a rough split of 75% Rubin and 25% Groq for the workloads that need it.

  • The real bottleneck may be broader than GPUs: CPUs, wafers, packaging, and possibly helium all matter now — Kim highlights 3-5 year locked supply contracts from hyperscalers, says AI agents need far more CPU orchestration, and expects an industry-wide compute shortage even as Nvidia gets preferential allocation through TSMC.

  • The 'depreciation gate' fear around H100s becoming worthless is not showing up in the market — CoreWeave says GPUs are lasting 5-6 years and still earning 90-95% of pricing, while even older rental GPUs remain sold out because demand is still outrunning supply.

  • The next token boom won't just be codegen — it's every knowledge-work workflow getting deeper and more data-hungry — Kim and the hosts argue agents will move into customer service, research, chip design, drug discovery, and finance, with examples like AI now pulling perfect same-store sales data that used to take Kim hours by hand.

The Breakdown

Nvidia's stock drop looks more like Groundhog Day than doom

The interview opens with the obvious question: Nvidia is down hard from its highs, so is the AI trade over? Tay Kim basically laughs it off. He says the chip sector is flat on the year, Nvidia is down about 10%, and the mood feels exactly like prior freak-outs over DeepSeek, tariffs, and now Iran/oil — scary macro overlays while the business itself keeps flying.

Agents and coding assistants are driving a real compute crunch

Once they move from stock chatter to product reality, Kim's main point is simple: inference demand is ripping. He says conversations with engineers at Meta, Google, and Nvidia all point to AI compute shortages, and the hosts pile on with examples of coders hitting rate limits, paying for multiple model plans, and even using bots to snipe B200 GPUs — "like sneaker bots but for NeoClouds." The vibe is gold-rush, not glut.

Why Jensen's Groq move fits Nvidia's playbook

Kim frames the Groq asset acquisition as classic Jensen Huang: spot the shift in where value is forming, then lock in the missing piece. He compares it to the 2019 Mellanox deal, where Nvidia anticipated giant GPU clusters and bought the networking layer; now it's doing something similar for low-latency inference in an agent world. His rough framing is that maybe 25% of inference wants the Groq-style setup, while 75% still belongs to Vera Rubin — together, that mix makes Nvidia stronger, not less GPU-centric.

GTC's deeper message: AI still has a long runway for improvement

Kim says his biggest GTC takeaway wasn't just Nvidia's own roadmap but a session featuring Jeff Dean and Bill Dally, the chief scientists of Google and Nvidia. He points to context-window innovations that can focus on the right 10,000 documents, memory stacked directly on top of GPU/TPU, and synthetic audio/video data as evidence that model progress is nowhere near done. Add in Anthropic's rumored step-function jump and OpenAI's coming model release, and his message is that multiple technical vectors are still pushing demand up.

Open source, China, and the quiet significance of H200 approvals

On Nvidia funding an open-source frontier lab, Kim doesn't see a huge strategic break: he says the planned spend is small relative to OpenAI or Anthropic, and Nvidia wins as long as GPUs are being used anywhere. The conversation then turns to China, where he notes Jensen said Nvidia had licensing approvals on both the U.S. and China sides, implying billions in H200 orders could follow. The hosts' point is that, unlike Meta, Nvidia actually has assets to negotiate with in China.

The next shortage may be CPUs, not just GPUs

The supply discussion broadens fast. Kim says Nvidia is in the driver's seat with TSMC because Jensen visits constantly, has deep relationships, and can prepay tens of billions to secure wafers and CoWoS, but the industry as a whole is still headed into shortages. The underappreciated trend, he argues, is a CPU crunch: executives at Dell, AMD, and Intel, including Intel's CFO, are all talking about 3-5 year supply commitments, and ARM's own comments support the idea that AI agents need much more CPU horsepower for orchestration, tool calls, database queries, and web search.
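
To make the orchestration point concrete, here is a minimal, purely illustrative Python sketch of one agent step. Nothing in it comes from the episode: every function name is a hypothetical stub, and the point is simply that a single GPU-bound inference call sits inside a lot of CPU-bound work (prompt assembly, output parsing, tool dispatch, I/O).

```python
# Hypothetical sketch of one agent step. All functions are illustrative
# stubs, not a real framework API; the shape is what matters.
import json
import time


def run_model(prompt: str) -> str:
    """Stand-in for the GPU-bound inference call (e.g. a hosted LLM)."""
    return json.dumps({"tool": "sql", "query": "SELECT 1"})


def run_tool(name: str, arg: str) -> str:
    """Stand-in for CPU-bound tool work: DB queries, web search, parsing."""
    time.sleep(0.01)  # simulate I/O and serialization overhead on the CPU
    return f"{name} result for {arg!r}"


def agent_step(task: str) -> str:
    # 1. CPU: build the prompt (retrieval, templating, token counting).
    prompt = f"Task: {task}\nRespond with a JSON tool call."
    # 2. GPU: one inference call decides what to do next.
    raw = run_model(prompt)
    # 3. CPU: parse the model output and dispatch the tool call.
    call = json.loads(raw)
    observation = run_tool(call["tool"], call["query"])
    # 4. CPU: post-process and feed the observation back into context.
    return observation


if __name__ == "__main__":
    print(agent_step("pull last quarter's same-store sales"))
```

Steps 1, 3, and 4 never touch a GPU, which is the mechanism behind the CPU-shortage thesis: every agent loop multiplies the orchestration work wrapped around each inference call.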

Why Terra Fab sounds much easier in pitch than in reality

Asked for the bull case on Elon's Terra Fab idea, Kim basically won't take the bait. He says fabs are constrained by semicap equipment from ASML and Applied Materials, and chipmaking is "almost like cooking" — too much tacit trial-and-error knowledge to brute-force overnight. The hosts riff on the cultural side too: unlike AI talent in San Francisco, Taiwan's TSMC workforce feels unusually mission-driven, which makes the idea of a simple talent raid even less realistic.

Depreciation fears, helium risk, and where token demand goes next

Kim dismisses the current panic about GPU depreciation, citing CoreWeave's claim that systems last 5-6 years and still fetch 90-95% of pricing, with even older GPUs rented out because demand is so strong. On helium, he says there is roughly 6-9 months of channel inventory, so it only becomes a real issue if geopolitical disruption drags on. The closing stretch gets more practical. Kim argues codegen is still early, then gives a concrete example from his own work: pulling same-store sales for Chipotle and Cava used to take him hours manually, but now Gemini and ChatGPT get it right in a minute or two. That, he says, is exactly how agents start eating the boring parts of knowledge work.
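
As a purely hypothetical illustration of that workflow, the sketch below shows the shape of an agent pulling one metric across tickers. `query_llm` is a stub standing in for a real Gemini or ChatGPT call, the ticker symbols and JSON shape are assumptions, and no real data is fetched; the point is that the manual hours shift into prompt design and validation.

```python
# Hypothetical sketch of the same-store-sales pull Kim describes.
# query_llm is a stub; the None values are placeholders, not data
# from the episode.
import json


def query_llm(prompt: str) -> str:
    """Stub for a hosted model call; returns a canned JSON answer."""
    return json.dumps({"CMG": None, "CAVA": None})


def same_store_sales(tickers: list[str]) -> dict:
    prompt = (
        "For each ticker, return latest-quarter same-store sales growth "
        f"as JSON mapping ticker to percent: {tickers}"
    )
    answer = json.loads(query_llm(prompt))
    # The hours of manual work move into validation: confirm every
    # ticker came back before trusting the numbers downstream.
    return {t: answer.get(t) for t in tickers}


print(same_store_sales(["CMG", "CAVA"]))
```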