Dwarkesh Patel, April 30, 2026
How GPT-5, Claude, and Gemini are actually trained and served – Reiner Pope
2:13:40

Decode vs. Prefill Cost Differences — Dwarkesh Patel

From How GPT-5, Claude, and Gemini are actually trained and served – Reiner Pope. Category: Tech. Format: Commentary. This is a single keypoint from the analysis.

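API pricing often differs sharply between input (prefill) and output (decode) tokens, with output tokens priced substantially higher (e.g., 5x). This suggests that decode is heavily memory-bandwidth-limited, while prefill can be more compute-limited. Prefill processes all prompt tokens in one parallel pass, so each read of the model weights is amortized over many tokens; decode generates one token per sequence per step, and every step has to read the weights (and the KV cache) again.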
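The bandwidth-versus-compute argument can be made concrete with a rough roofline estimate. The sketch below is illustrative only and is not taken from the video: the accelerator figures, the 70B-parameter model size, the 16-bit weights, the prompt length, and the decode batch size are all assumed, and the `Accelerator` class and `forward_pass_time` function are hypothetical names. It counts only the weight matmuls (about 2 FLOPs per parameter per token, plus one full read of the weights per pass) and ignores KV-cache traffic and attention FLOPs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Accelerator:
    """Hypothetical accelerator; both figures are assumed, not measured."""
    peak_flops: float      # peak dense throughput, FLOP/s
    mem_bandwidth: float   # memory bandwidth, bytes/s


def forward_pass_time(params, bytes_per_param, tokens_in_pass, hw):
    """Roofline estimate for one forward pass over `tokens_in_pass` tokens.

    Returns (seconds, limiting_resource). The pass takes as long as the
    slower of: doing the matmul FLOPs, or streaming the weights from memory.
    """
    compute_s = 2.0 * params * tokens_in_pass / hw.peak_flops
    memory_s = params * bytes_per_param / hw.mem_bandwidth
    if compute_s >= memory_s:
        return compute_s, "compute"
    return memory_s, "memory"


# Assumed numbers, for illustration only.
hw = Accelerator(peak_flops=1e15, mem_bandwidth=3e12)
params = 70e9          # hypothetical 70B-parameter dense model
bytes_per_param = 2    # 16-bit weights

# Prefill: a 4096-token prompt goes through in one parallel pass.
prefill_s, prefill_bound = forward_pass_time(params, bytes_per_param, 4096, hw)
prefill_per_token = prefill_s / 4096

# Decode: one new token for each of 8 concurrent sequences per pass.
decode_s, decode_bound = forward_pass_time(params, bytes_per_param, 8, hw)
decode_per_token = decode_s / 8

print(f"prefill: {prefill_bound}-bound, {prefill_per_token * 1e3:.3f} ms/token")
print(f"decode:  {decode_bound}-bound, {decode_per_token * 1e3:.3f} ms/token")
```

Under these assumptions the prefill pass is compute-bound and the decode pass is memory-bound, so each decoded token costs far more accelerator time than each prefilled token. The gap narrows as the decode batch grows, up to roughly `peak_flops * bytes_per_param / (2 * mem_bandwidth)` tokens per pass, where the two limits meet. The sketch is not meant to reproduce the 5x price ratio, which also depends on batching, KV-cache size, and pricing decisions.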

Impact: High. The disparity in pricing between prefill and decode highlights critical performance bottlenecks and informs how models are optimized for different operational phases.

In the source video, this keypoint occurs from 01:44:09 to 01:47:56.

Sources in support: Dwarkesh Patel (Host)


This keypoint analysis was generated by skim (skim.plus), an AI-powered content analysis platform by Credible AI.