Dwarkesh Patel, April 30, 2026
How GPT-5, Claude, and Gemini are actually trained and served – Reiner Pope

Reiner Pope: Invertible Layers and Memory Savings — Dwarkesh Patel

From How GPT-5, Claude, and Gemini are actually trained and served – Reiner Pope. Category: Tech. Format: Commentary. This is a single keypoint from the analysis.

Reversible layers, inspired by cryptographic constructions such as Feistel networks, let a neural network reconstruct each layer's input activations from its outputs during the backward pass instead of caching them during the forward pass. This sharply reduces the activation-memory footprint of training, at the cost of extra computation: memory is traded for recomputation.
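To make the coupling concrete, here is a minimal NumPy sketch of an additive, Feistel-style reversible block. The sub-functions F and G, their random weights, and the dimensions are illustrative assumptions, not the architecture discussed in the episode; the point is only that the inputs can be recovered exactly from the outputs, so no activations need to be stored.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical sub-functions F and G. They need not be invertible
# themselves: the additive coupling makes the whole block invertible.
W_f = rng.standard_normal((64, 64)) * 0.1
W_g = rng.standard_normal((64, 64)) * 0.1

def F(h):
    return np.tanh(h @ W_f)

def G(h):
    return np.tanh(h @ W_g)

def forward(x1, x2):
    # Additive (Feistel-style) coupling: each half is updated
    # using a function of the other half.
    y1 = x1 + F(x2)
    y2 = x2 + G(y1)
    return y1, y2

def inverse(y1, y2):
    # Reconstruct the inputs exactly from the outputs by
    # subtracting the same terms in reverse order.
    x2 = y2 - G(y1)
    x1 = y1 - F(x2)
    return x1, x2

x1, x2 = rng.standard_normal((2, 8, 64))
y1, y2 = forward(x1, x2)
r1, r2 = inverse(y1, y2)
assert np.allclose(x1, r1) and np.allclose(x2, r2)
```

In a real training stack, the backward pass would call inverse() layer by layer to regenerate activations on the fly, paying roughly one extra forward computation per layer in exchange for activation memory that no longer grows with network depth.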

Impact: High. By trading recomputation for storage, this technique cuts the activation-memory requirements of AI training, potentially enabling larger models or more efficient training runs on memory-constrained hardware.

In the source video, this keypoint occurs from 02:08:05 to 02:10:01.

Sources in support: Dwarkesh Patel (Host)

For the full credibility analysis, key takeaways, and other keypoints from this video, see the full analysis on skim.

This keypoint analysis was generated by skim (skim.plus), an AI-powered content analysis platform by Credible AI.