Yoshua Bengio proposes that baking honesty into AI systems is the key to achieving safety, reducing the problem to training a system to be honest through modified training objectives and data processing. This 'Scientist AI' is envisioned as a predictor, not an agent, with inherent honesty guarantees.
Impact: High. This foundational shift aims to preemptively address safety concerns by building honesty into the AI's core architecture, rather than relying on post-hoc alignment techniques.
In the source video, this keypoint occurs from 00:00:16 to 00:01:15.
Sources in support: Yoshua Bengio (Guest, AI Researcher)

