Bengio strongly criticizes reinforcement learning (RL) as a method for training superintelligence, labeling it 'evil' because of its inherent risks of instrumental goals and reward hacking. Both can lead AI systems to develop unintended goals that conflict with human intentions, which makes RL a dangerous route to advanced AI.
Impact: High. By highlighting the fundamental flaws in RL, Bengio steers the conversation toward safer alternatives and emphasizes that the pursuit of advanced AI should not rely on methods known to produce dangerous emergent behaviors.
In the source video, this keypoint occurs from 01:03:26 to 01:04:12.
Sources in support: Rob Wiblin (Host)

