
The second round of OpenAI infighting! Two superalignment team members, including Ilya's confidant, fired over alleged leaks

Author: New Zhiyuan

Editor: Editorial Department

Foreign media report that a key figure on OpenAI's superalignment team has been abruptly fired, with leaks cited as the reason. What's more, he is an important ally and confidant of Ilya. And Ilya himself, still at the center of the storm, has yet to appear in public...

What a bombshell!

According to foreign media reports, two researchers on OpenAI's superalignment team have been formally dismissed for allegedly leaking "secrets"!

This is also the first public personnel change for OpenAI since Sam Altman returned to the board of directors in March this year.


One of the fired researchers, Leopold Aschenbrenner, worked on the newly formed Super-Alignment team.

He is also an ally of OpenAI chief scientist Ilya Sutskever, who has not appeared in public since the infighting.

The other fired employee, Pavel Izmailov, worked on reasoning research and had also contributed to the safety team.


Leopold Aschenbrenner (left), Pavel Izmailov (right)

It is worth mentioning that both of the fired researchers were authors of the new paper the OpenAI superalignment team published last year (discussed below).


However, it is not clear exactly what information the two dismissed employees are alleged to have leaked.

What does the dismissal of key team members mean?

OpenAI itself is still developing steadily: in its latest employee stock sale, the company was valued at as much as $86 billion.

The Superalignment team is a hot topic within OpenAI.

If AI eventually becomes superintelligent, the upside is that it might help us solve nuclear fusion or even settle other planets; but what if it becomes so powerful that it starts to harm humanity?

To this end, last summer, Ilya Sutskever set up this team to develop technology to control and direct superintelligence.


Aschenbrenner is one of the key figures in the super-intelligent alignment team.

One controversy is: is there really a need for this team to exist?

Within OpenAI, employees have mixed opinions on this.

The earlier infighting turmoil was also closely tied to the controversy over this idea.

As a co-founder of OpenAI and the scientist behind many of its major technical breakthroughs, Ilya, together with other board members, decided to fire Sam Altman over his alleged lack of candor.

After Altman returned as CEO, Ilya left the board of directors and seems to have vanished from public view ever since, fueling speculation among netizens.


"Effective altruism" again

Intriguingly, many of the people involved in the incident are inextricably linked to "effective altruism".

Aschenbrenner, a key figure in the alignment team, is part of the effective altruism movement.

The movement stresses that we should prioritize addressing the potential risks of AI rather than pursuing short-term profits or productivity gains.


Speaking of which, one cannot fail to mention Sam Bankman-Fried, the FTX founder now behind bars, who was also one of effective altruism's biggest champions.


Aschenbrenner, who graduated from Columbia University at the age of 19, worked at the Future Fund, a philanthropic foundation founded by SBF to fund projects that "improve the long-term prospects of humanity."

A year ago, Aschenbrenner joined OpenAI.

The other board members who pushed Altman out also turned out to have ties to effective altruism.


For example, Tasha McCauley is a member of the board of directors of Effective Ventures, the parent organization of the Effective Altruism Center.

Helen Toner previously worked at Open Philanthropy, a grant-making organization focused on effective altruism.

When Altman returned as CEO last November, both of them also left the board.

That makes it worth asking whether Aschenbrenner's dismissal was really about leaks, or about something else.

In short, Sam Altman hardly seems to be on the same page as the effective altruists: after all, their ideas are the biggest stumbling block on his path to AGI (and even ASI).

Leopold Aschenbrenner


While still a junior, Leopold Aschenbrenner was elected to the Phi Beta Kappa society and named a John Jay Scholar.

At the age of 19, he graduated summa cum laude from Columbia University.

During his studies, he not only received the Albert Asher Green Prize, the highest recognition of academic achievement, but also won the Romine Prize for the best economics thesis with his paper "Aversion to Change and the End of (Exponential) Growth".

In addition, he has worked as a research assistant to Professor Robert Y. Shapiro in Political Science and Professor Joseph E. Stiglitz in Economics.


Originally from Germany, Leopold Aschenbrenner lives in beautiful San Francisco, California, with a vision to ensure freedom for future generations.

His interests range from First Amendment law to German history, topology, and artificial intelligence. His current research focuses on weak-to-strong generalization for AI.


Pavel Izmailov


Pavel Izmailov received his B.S. in Mathematics and Computer Science from Moscow State University, his M.S. in Operations Research from Cornell University, and his Ph.D. in Computer Science from New York University.


His research interests are broad, including topics within the core areas of machine learning, but his primary focus is on understanding how deep neural networks work.

  • Improve AI's reasoning and problem-solving capabilities
  • Interpretability of deep learning models, including large language models and computer vision models
  • Leveraging AI for scientific discovery
  • Out-of-distribution generalization and robustness of large-scale models
  • Technical AI alignment
  • Probabilistic deep learning, uncertainty estimation, and Bayesian methods

In addition, his team's work on Bayesian model selection won the Outstanding Paper Award at ICML in 2022.


Before joining OpenAI, he interned at Amazon, Google, and other major companies.

Beginning in the fall of 2025, Izmailov will join NYU as an assistant professor in the Tandon CSE Department and a visiting professor in the Courant CS Department, as well as join the NYU CILVR group.

Supervising GPT-4 with GPT-2

In this study, the OpenAI team proposed an innovative model alignment method – supervising large models with small models.

Leopold Aschenbrenner explains that intuition tells us that superhuman AI systems should be able to "sense" whether they are operating safely.

But can humans extract these concepts from powerful models simply through "weak supervision"?


In the future, AI systems will be able to handle extremely complex tasks, such as generating a million lines of code.

But humans will need to set some limits on their behavior, such as "don't lie" or "don't escape from the server".

Yet today's large models are black boxes whose behavior humans simply cannot fully understand, so how can we enforce such constraints?

Typically, we train AI systems with human annotations.

However, next to AI systems that are much smarter than we are, humans can only act as "weak supervisors".

That is, on complex problems, humans can only provide incomplete or flawed annotations.

Fortunately, powerful models have been able to clearly represent concepts such as "whether this action is dangerous or not".

If so, humans could get the model to tell us what it knows, including things too complex for us to monitor directly.


To do this, the team devised an ingenious experiment – what happens when we use a small model to supervise a large model?

Will the strong model simply imitate its weaker supervisor, mistakes and all, or will it generalize to a deeper understanding of the tasks and concepts involved?

As a result, they were pleasantly surprised to find that they could take advantage of deep learning's excellent generalization capabilities.

A weakling of a model like GPT-2, which can't even count to ten, can be used to supervise GPT-4, which can pass college entrance exams, and recover nearly 80% of the performance GPT-4 would achieve with perfect labels.
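To make the claim concrete: the paper reports such results with a "performance gap recovered" (PGR) metric, which measures how much of the gap between the weak supervisor and a strong model trained on perfect labels the weakly supervised strong model closes. The sketch below is our own illustration with made-up numbers, not OpenAI's released code.

```python
# Illustrative sketch of the "performance gap recovered" (PGR) metric.
# The pipeline it summarizes: (1) fine-tune a weak model (e.g. GPT-2-sized)
# on ground-truth labels, (2) fine-tune a strong model (e.g. GPT-4-sized)
# on the weak model's imperfect predictions, (3) compare against a strong
# model fine-tuned directly on ground truth (the "ceiling").

def performance_gap_recovered(acc_weak: float,
                              acc_weak_to_strong: float,
                              acc_strong_ceiling: float) -> float:
    """Fraction of the weak-to-ceiling accuracy gap recovered by the
    weakly supervised strong model."""
    return (acc_weak_to_strong - acc_weak) / (acc_strong_ceiling - acc_weak)

# Made-up accuracies for illustration: a PGR of 0.8 corresponds to the
# "nearly 80%" figure quoted above.
print(performance_gap_recovered(acc_weak=0.60,
                                acc_weak_to_strong=0.76,
                                acc_strong_ceiling=0.80))  # -> 0.8
```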


However, this approach currently only works in certain settings, so naively applying today's alignment techniques, such as RLHF, may not scale to superhuman models.


Still, the authors argue that generalizing well beyond weak supervisors is a widespread phenomenon, and that simple methods can substantially improve this generalization.
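One of the "simple methods" the paper describes is an auxiliary confidence loss that lets the strong student lean on its own confident predictions rather than blindly copying the weak labels. The following is a simplified, hypothetical PyTorch sketch of that idea; the mixing weight and the hardening rule are our assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn.functional as F

def confidence_weighted_loss(student_logits: torch.Tensor,
                             weak_labels: torch.Tensor,
                             alpha: float = 0.5) -> torch.Tensor:
    """Sketch of an auxiliary confidence loss for weak-to-strong training:
    mix the cross-entropy against the weak supervisor's labels with a term
    that reinforces the student's own (hardened) predictions."""
    # Term 1: imitate the weak supervisor's (possibly wrong) labels.
    ce_weak = F.cross_entropy(student_logits, weak_labels)

    # Term 2: treat the student's own argmax as a pseudo-label, so it can
    # stay confident where it disagrees with the weak supervisor.
    pseudo_labels = student_logits.detach().argmax(dim=-1)
    ce_self = F.cross_entropy(student_logits, pseudo_labels)

    return (1 - alpha) * ce_weak + alpha * ce_self

# Toy usage with random logits and binary weak labels.
logits = torch.randn(8, 2, requires_grad=True)
labels = torch.randint(0, 2, (8,))
print(confidence_weighted_loss(logits, labels))
```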

Future directions for this research may include:

  1. Finding better methods;
  2. Deepening scientific understanding: when and why do we see good generalization?
  3. Making the setup more analogous: there are important differences between this experimental setup and the future superalignment problem – can we bridge them?

What excites the authors most about this study is that it enables iterative, empirical progress on the core challenge of aligning future superhuman models.

Much previous alignment work has either stayed theoretical or, while empirical, has not directly tackled that core challenge.

For example, a long-standing idea in the alignment field is "bootstrapping": instead of aligning a very smart model directly, first align a slightly smarter model, then use it to align an even smarter one, and so on (a rough sketch of this loop follows below).
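For intuition, the bootstrapping loop can be pictured as the toy code below; the model names and the align_with stand-in are purely illustrative, not an actual alignment procedure.

```python
# Toy illustration of the "bootstrapping" idea: each newly aligned model
# becomes the supervisor for the next, slightly smarter one.

def align_with(model: str, supervisor: str) -> str:
    """Placeholder standing in for a real alignment/fine-tuning step."""
    return f"{model} (aligned by {supervisor})"

def bootstrap_alignment(models_by_capability: list[str], human: str) -> list[str]:
    """Align models in order of increasing capability, each supervised by
    the previously aligned (slightly weaker) model."""
    supervisor, aligned = human, []
    for model in models_by_capability:
        aligned_model = align_with(model, supervisor)
        aligned.append(aligned_model)
        supervisor = aligned_model  # the new model supervises the next one
    return aligned

print(bootstrap_alignment(["model-1", "model-2", "model-3"], "human supervisor"))
```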

Now, even though the setup is still imperfect, OpenAI researchers can already test this idea directly.


Resources:

https://www.theinformation.com/articles/openai-researchers-including-ally-of-sutskever-fired-for-alleged-leaking?rc=epv9gi
