
OpenAI predicts the arrival of superintelligence within 10 years and is forming a "human guardian" team

Author: AI self-sizophistication


OpenAI predicts that AI smarter than humans is likely to emerge before 2030

In the face of "creatures" that are smarter than humans, are you worried that the future of humanity will be ruled by AI?

OpenAI is forming a new team to develop tools that keep future AI aligned with human values

OpenAI outlined these ideas in a blog post, which also doubles as a "job posting" to recruit team members

After reading it, what new suggestions do you have for this idea?

Enjoy it!

Superintelligence will be the most impactful technology humanity has ever seen, and it could help us solve many of the world's most important problems. But the immense power of superintelligence could also be very dangerous: it could lead to humanity losing control of AI, or even to human extinction.

Although superintelligence may seem far away, we still believe it could appear before 2030.


To address the risks superintelligence poses to humans, we need to establish new governance institutions and solve the problem of superintelligence "alignment" (editor's note: "alignment" here refers to techniques that get AI to understand and follow human intent, so that its behavior and decisions match the outcomes humans want):

How can we ensure that AI systems much smarter than humans follow human intent?

Currently, we do not have a way to steer or control a potentially superintelligent AI and prevent it from going out of control. The techniques we currently use to "align" AI, such as reinforcement learning from human feedback, rely on humans' ability to supervise AI. However, humans will not be able to reliably supervise AI systems much smarter than us, so our current "alignment" techniques will not scale to superintelligence. We need new technical breakthroughs.
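To see why this supervision bottleneck matters, here is a minimal, self-contained sketch (illustrative only, not OpenAI's code) of the pairwise preference loss typically used to train a reward model in reinforcement learning from human feedback: a human labels which of two responses is better, and the reward model is pushed to score the preferred one higher. The scores below are hypothetical.

```python
# Minimal sketch (illustrative only, not OpenAI's implementation) of the
# Bradley-Terry pairwise preference loss used when training a reward model
# for reinforcement learning from human feedback (RLHF).
import math

def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """-log(sigmoid(r_chosen - r_rejected)): low when the human-preferred
    response gets the higher reward, high when the ranking is violated."""
    margin = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Hypothetical reward-model scores for one human-labeled comparison.
print(preference_loss(2.1, 0.3))  # ~0.15: model agrees with the human label
print(preference_loss(0.3, 2.1))  # ~1.95: model disagrees, large training signal
```

The training signal here comes entirely from a human's ability to judge which answer is better; once tasks exceed that ability, the signal breaks down, which is exactly the limitation described above.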

OpenAI's approach

Our goal is to build a roughly human-level "automated alignment researcher." We can then use vast amounts of compute to scale up this effort and iteratively align superintelligence.


To "align" the first "automated alignment researcher," we need to: 1) develop a scalable training method, 2) validate the resulting model, and 3) stress-test our entire "alignment" pipeline:

1. To provide a training signal on tasks that are difficult for humans to evaluate, we can use AI systems to assist in evaluating other AI systems (scalable oversight; see the sketch after this list). In addition, we want to understand and control how our models generalize our supervision to tasks we cannot supervise (generalization).

2. To validate whether a system is "aligned," we will automate the search for problematic behavior (robustness) and for the internal causes of problems (automated interpretability).

3. Finally, we can test the entire "automated alignment researcher" pipeline by deliberately training misaligned models and confirming that our techniques detect the most severe "misalignment" (adversarial testing).
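To make the idea of "scalable oversight" in item 1 concrete, here is a small illustrative sketch, not OpenAI's method or API: one model critiques another model's answer so that a human or automated judge can evaluate work that would be hard to check unaided. `ask_model` and `evaluate_with_ai_assist` are hypothetical placeholders standing in for any chat-model call.

```python
# Illustrative sketch only: AI-assisted evaluation ("scalable oversight").
# An assistant model critiques another model's answer so a human (or an
# automated judge) can supervise tasks that are too hard to check unaided.
# `ask_model` stands in for any chat-completion call; it is hypothetical.
from typing import Callable

def evaluate_with_ai_assist(task: str,
                            answer: str,
                            ask_model: Callable[[str], str]) -> str:
    # 1) A critic model lists possible flaws in the proposed answer.
    critique = ask_model(
        f"Task: {task}\nProposed answer: {answer}\n"
        "List any errors, omissions, or unsafe steps in this answer."
    )
    # 2) A judge (human or model) sees both the answer and the critique,
    #    which makes hard-to-evaluate answers easier to supervise.
    verdict = ask_model(
        f"Task: {task}\nAnswer: {answer}\nCritique: {critique}\n"
        "Given the critique, is the answer acceptable? Reply 'accept' or 'reject'."
    )
    return verdict

if __name__ == "__main__":
    def stub_model(prompt: str) -> str:
        # Stand-in for a real model call, just to exercise the control flow.
        return "accept" if "Critique:" in prompt else "No issues found."

    print(evaluate_with_ai_assist("Summarize a contract",
                                  "The contract says ...",
                                  stub_model))
```

The design point is that the critic amplifies the judge's ability rather than replacing it: the judge only has to check a focused critique instead of re-deriving the whole answer.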

We anticipate that our research priorities will change substantially as we learn more about the problem, and we may expand into entirely new research areas. We plan to share more about our research roadmap in the future.

A new team

We are assembling a team of top machine learning researchers and engineers to tackle this problem.

We will devote 20% of our computing resources over the next four years to solving the problem of superintelligence "alignment." Our main fundamental-research bet is this new superintelligence "alignment" team, but getting this right is critical to achieving our mission, and we expect many teams to contribute, from developing new methods to scaling them up to deployment.

Solve the core technical challenges of superintelligence "alignment" within four years

This is an incredibly ambitious goal, and we cannot guarantee success. But we are optimistic that a focused, concerted effort can solve this problem: many of these ideas have shown promise in preliminary experiments, we have increasingly useful measures of progress, and we can use today's models to study many of these problems empirically.


Ilya Sutskever (co-founder and chief scientist of OpenAI) has made this his core research focus and will lead the team together with Jan Leike, who heads alignment research. The team includes researchers and engineers from our previous "alignment" team, as well as researchers from other teams across the company.

We are also looking for talented new researchers and engineers to join the program. Superintelligent "alignment" is fundamentally a machine learning problem, and we believe that good machine learning experts – even if they are not yet working on "alignment" – can be the key to solving this problem.

We plan to share the results of this work widely, and we see contributing to the "alignment" and safety of models beyond OpenAI's own as an important part of our work.

The new team's mission complements OpenAI's existing work to improve the safety of current models like ChatGPT, as well as to understand and mitigate other risks from AI, such as misuse, economic disruption, disinformation, bias, discrimination, addiction, over-reliance, and more.

While this new team will focus on the challenges of machine learning and aligning "superintelligent" AI with human intent, there are also some social science issues involved, so we are actively engaging with interdisciplinary experts to ensure that our technological solutions take into account a wider range of human and societal issues.
