Dwarkesh Patel · April 30, 2026
How GPT-5, Claude, and Gemini are actually trained and served – Reiner Pope
Duration: 2:13:40

GPU Rack Configuration and Communication Patterns — Dwarkesh Patel

From How GPT-5, Claude, and Gemini are actually trained and served – Reiner Pope. Category: Tech. Format: Commentary. This is a single keypoint from the analysis.

Nvidia's Blackwell NVL72 racks connect 72 GPUs over NVLink through in-rack switches, a topology built for expert parallelism. The resulting all-to-all fabric matches the communication pattern of Mixture-of-Experts (MoE) layers, where each token's activations are dispatched to the GPUs hosting its selected experts and gathered back after the expert computation. Scaling beyond a single rack, however, introduces a significant communication bottleneck, because rack-to-rack interconnects are far slower than in-rack NVLink.
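
The dispatch step can be written down concretely. Below is a minimal JAX sketch of the pattern, assuming one expert per device; the mesh axis name "expert" and all shapes are illustrative assumptions, not details from the talk.

```python
import jax
import jax.numpy as jnp
import numpy as np
from jax.sharding import Mesh, PartitionSpec as P
from jax.experimental.shard_map import shard_map

# One mesh axis, one hypothetical expert per device; on a real NVL72
# domain this axis would span the 72 GPUs in the rack.
n_dev = jax.device_count()
mesh = Mesh(np.array(jax.devices()), axis_names=("expert",))

def dispatch(tokens):
    # Local shape: [n_dev, tokens_per_expert, d_model]. Slot i holds the
    # tokens this device has routed to expert i. all_to_all performs a
    # distributed transpose: afterwards, device i holds every device's
    # tokens destined for expert i.
    return jax.lax.all_to_all(tokens, "expert", split_axis=0, concat_axis=0)

dispatch_ep = shard_map(dispatch, mesh=mesh,
                        in_specs=P("expert"), out_specs=P("expert"))

tokens_per_expert, d_model = 4, 128
x = jnp.zeros((n_dev * n_dev, tokens_per_expert, d_model))
y = dispatch_ep(x)  # same global shape; data redistributed across devices
```

The all_to_all here is a distributed transpose, and the same collective runs in reverse to return expert outputs to each token's home device. Inside one NVL72 domain, every hop of that transpose traverses NVLink; split the expert axis across racks and some hops traverse the slower inter-rack network instead.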

Impact: High. The physical topology of GPU racks and their interconnects directly dictates which parallelism strategies are feasible and efficient at scale. Rack-level all-to-all bandwidth is a key enabler for MoE, but inter-rack communication remains the binding constraint.
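
A back-of-envelope model makes the gap concrete. All numbers below are illustrative assumptions rather than figures from the talk: roughly 900 GB/s of usable per-GPU NVLink bandwidth inside the rack, roughly 50 GB/s per GPU between racks (a 400 Gb/s NIC), and a made-up per-layer dispatch volume.

```python
# Back-of-envelope all-to-all time per MoE layer. All numbers are
# illustrative assumptions, not figures from the talk.

def all_to_all_time(bytes_per_gpu: float, bw_per_gpu: float) -> float:
    """Time to exchange bytes_per_gpu of dispatch traffic per GPU,
    assuming the collective is bandwidth-bound at bw_per_gpu (B/s)."""
    return bytes_per_gpu / bw_per_gpu

# Assumed per-GPU dispatch volume for one MoE layer:
# 8192 tokens/GPU * 7168 hidden dim * 2 bytes (bf16) ~= 117 MB.
tokens, d_model, bytes_per_elem = 8192, 7168, 2
volume = tokens * d_model * bytes_per_elem

NVLINK_BW = 900e9     # assumed usable intra-rack NVLink bandwidth per GPU
INTER_RACK_BW = 50e9  # assumed per-GPU inter-rack bandwidth (400 Gb/s NIC)

print(f"intra-rack: {all_to_all_time(volume, NVLINK_BW) * 1e3:.2f} ms")
print(f"inter-rack: {all_to_all_time(volume, INTER_RACK_BW) * 1e3:.2f} ms")
# With these assumptions: ~0.13 ms in-rack vs ~2.35 ms across racks.
```

Under these assumptions, the same exchange takes roughly 18x longer across racks, which is why the expert-parallel axis wants to stay within a single rack.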

In the source video, this keypoint occurs from 00:37:00 to 00:41:00.

Sources in support: Dwarkesh Patel (Host)

For the full credibility analysis, key takeaways, and other keypoints from this video, see the full analysis on skim.

This keypoint analysis was generated by skim (skim.plus), an AI-powered content analysis platform by Credible AI.