AI Alignment Awards is a contest calling for proposals for how to reliably understand and control advanced AI systems. Prizes are up to $100k; the new deadline is May 1.
AI has been progressing rapidly. Will this be beneficial or catastrophic? The AI Alignment Awards organizers believe this will depend on our ability to reliably understand and control advanced AI systems – something researchers don’t yet know how to do, even with current models.
AI Alignment Awards is running two contests to make progress on this problem:
The Goal Misgeneralization Contest (based on Langosco et al., 2021 and Shah et al., 2022): AIs often learn unintended goals. Goal misgeneralization occurs when a reinforcement learning agent retains its capabilities out-of-distribution yet pursues the wrong goal. How can we prevent or detect goal misgeneralization?
The Shutdown Contest (based on Soares et al., 2015): People sometimes say, “If the AI is getting out of control, we’ll just turn it off,” but this may be a difficult technical challenge. As AI systems get more advanced, how can we design or train them so that they don’t try to stop operators from turning them off?
No prerequisites are required to participate!
Find out more on the AI Alignment Awards website.