OpenAI Creates ‘Superalignment’ Team to Safeguard Against a Generative AI Skynet
OpenAI has dedicated a fifth of its computing power to a new group focused on controlling advanced AI and keeping it safe for people to use. Chief scientist and co-founder Ilya Sutskever will head up the new Superalignment team, designing guardrails to ensure "superintelligent" AI systems stay aligned with human intent.
OpenAI decided to form the Superalignment team based on its estimate that AI could surpass human intelligence within the next ten years. Sutskever, joined by alignment head Jan Leike, will strategize on how to keep such future AI from escaping human control or pursuing objectionable goals. The team aims to develop a technical solution for steering a superintelligent AI within four years.
“Currently, we don’t have a solution for steering or controlling a potentially superintelligent AI, and preventing it from going rogue,” OpenAI wrote in a blog post. “Our current techniques for aligning AI, such as reinforcement learning from human feedback, rely on humans’ ability to supervise AI. But humans won’t be able to reliably supervise AI systems much smarter than us, and so our current alignment techniques will not scale to superintelligence. We need new scientific and technical breakthroughs.”
Specifically, OpenAI wants to train AI systems on human feedback so that they can independently judge other AI models for their alignment with human preferences. A single mediating AI would be far easier for humans to manage, and it could take on the task of evaluating the many AI models whose behavior may exceed human comprehension.
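The idea of an AI-based judge evaluating other models can be sketched in miniature. The toy below is an illustrative assumption, not OpenAI's actual system: a stand-in "reward model" (here, a crude heuristic with a hypothetical `UNSAFE` marker) scores candidate outputs, and a mediating evaluator picks the one best aligned with learned human preferences, so humans would only need to audit the evaluator rather than every output.

```python
def human_preference_score(response: str) -> float:
    """Stand-in for a reward model trained on human feedback.
    Here: a crude heuristic that rewards substantive answers and
    heavily penalizes a hypothetical unsafe marker."""
    score = min(len(response) / 50, 1.0)  # prefer substantive answers
    if "UNSAFE" in response:
        score -= 1.0  # heavy penalty for flagged content
    return score

def evaluate_candidates(candidates: list[str]) -> str:
    """The mediating 'judge' selects the candidate that the learned
    human-preference proxy rates highest."""
    return max(candidates, key=human_preference_score)

candidates = [
    "UNSAFE: an output a human reviewer would reject",
    "A helpful, harmless answer the reward model rates highly",
]
print(evaluate_candidates(candidates))
```

In a real system the scoring function would itself be a large model trained on human comparisons; the point of the sketch is only the division of labor, where humans supervise one evaluator instead of every output.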
“We plan to share the fruits of this effort broadly and view contributing to alignment and safety of non-OpenAI models as an important part of our work,” OpenAI said. “While this new team will focus on the machine learning challenges of aligning superintelligent AI systems with human intent, there are related sociotechnical problems on which we are actively engaging with interdisciplinary experts to make sure our technical solutions consider broader human and societal concerns. Superintelligence alignment is one of the most important unsolved technical problems of our time. We need the world’s best minds to solve this problem.”