Science names its top ten scientific breakthroughs of 2021, with AlphaFold ranked first!

Author: Zhu Hum Hum

Editor: Wang Haha

Typography: Li Xuewei

On November 17, 2021, Science magazine released its 2021 Breakthrough of the Year list, which was topped by AlphaFold and RoseTTAFold, two artificial intelligence-based technologies for predicting protein structure.

The other breakthroughs of the year are the development of antiviral drugs against COVID-19, new measurements of the muon, seismic observations of marsquakes, the recovery of ancient human DNA from soil, in vivo applications of CRISPR, new insights into early human development, the use of psychedelic drugs to treat PTSD, the development of monoclonal antibodies to treat infectious diseases, and advances in fusion energy generation.

Science magazine also selected three breakdowns of the year: the slim hope of meeting climate goals, the anger over an Alzheimer's drug, and the hostility and attacks that scientists faced during the COVID-19 pandemic.

This article focuses on the most important scientific breakthrough of the year: artificial intelligence-based protein structure prediction.

A structural biology problem that has lasted for more than 50 years

Proteins are the main carriers of life activities, and it is no exaggeration to say that there is no life without protein. Proteins have therefore long been a focus of life science research. Protein structure is a particularly active topic, because a protein's function is largely determined by its structure.

In 1957, John C. Kendrew and Max F. Perutz determined the first protein structure through X-ray crystallography. Soon after, Christian B. Anfinsen Jr. proposed that the native structure of a protein is its thermodynamically stable state, which implies that the three-dimensional structure of a protein can in principle be predicted from its amino acid sequence.

However, the structural complexity of proteins is far greater than one might think. According to the central dogma, DNA is transcribed into RNA, which is then translated into polypeptide chains. A protein molecule is composed of one or more polypeptide chains, and each chain folds into a unique shape. The specific shape of a protein molecule is described at four levels, namely primary, secondary, tertiary, and quaternary structure, and each level determines the one that follows it.

The amino acid sequence of the polypeptide chain is the primary structure. Segments of the chain coil or fold to produce the secondary structure. The secondary structure elements then pack, through a series of conformational changes, into a three-dimensional shape, the tertiary structure, which is generally globular or fibrous. Tertiary structures contain specific domains that form binding sites or regulatory sites, which bind substances of specific structure and carry out specific functions. Proteins composed of two or more polypeptide chains can also form a quaternary structure.

Figure | Protein 3D structure (Source: Nat Commun)

As a result, in the more than 50 years since Christian B. Anfinsen Jr. proposed his theory, scientists have been unable to solve the protein folding problem, and their understanding of protein structure has remained very limited.

In recent years, the development of cryo-electron microscopy (cryo-EM) has made it possible to observe protein structure without crystalline samples, which has advanced protein structure research. However, cryo-EM equipment is very expensive, and only a very small number of laboratories can afford it, which puts it out of reach for most researchers. The life sciences community therefore urgently needs new ways to solve the protein folding problem.

AI helps solve protein structure prediction challenges

As computer science developed, some scholars proposed using computer models to solve the protein folding problem. The idea was feasible, but the computer models developed over the following decades were consistently limited in how accurately they could predict protein structure.

For the past 25 years, the Critical Assessment of protein Structure Prediction (CASP) competition has tracked advances in this field, looking for computer models that can solve the protein folding problem. It was not until CASP14, the 14th edition of the competition, that DeepMind's AlphaFold system demonstrated unprecedented accuracy in predicting protein structure.

The competition evaluates participants by comparing their predictions with experimentally determined "gold standard" structures. Accuracy is measured with the Global Distance Test (GDT) score, which ranges from 0 to 100; a GDT score of around 90 is considered competitive with experimental methods. DeepMind's AlphaFold system achieved a median score of 92.4 GDT across all targets, corresponding to an average error of about 1.6 angstroms relative to experiment, and it still reached an impressive 87.0 on the most difficult proteins, those without homologous templates.
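To make the GDT score more concrete, here is a minimal Python sketch of the commonly used GDT_TS variant: the average, over distance cutoffs of 1, 2, 4 and 8 angstroms, of the percentage of residues whose predicted position lies within that cutoff of the experimental position. It assumes the two structures are already superimposed and represented by one coordinate per residue; the real CASP evaluation also searches over many superpositions, which is omitted here. The function name `gdt_ts` and the toy coordinates are hypothetical and are not taken from CASP or AlphaFold.

```python
import math


def gdt_ts(predicted, experimental, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS for two already-superimposed structures.

    Each structure is a list of (x, y, z) coordinates in angstroms,
    one per residue. The superposition search used by the real
    evaluation is deliberately left out (assumption of this sketch).
    """
    if not predicted or len(predicted) != len(experimental):
        raise ValueError("structures must be non-empty and of equal length")

    # Distance between each predicted residue and its experimental counterpart.
    distances = [math.dist(p, e) for p, e in zip(predicted, experimental)]

    # Fraction of residues within each distance cutoff.
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in cutoffs
    ]

    # Average the fractions and scale to the 0-100 range.
    return 100.0 * sum(fractions) / len(cutoffs)


# Hypothetical four-residue toy example; per-residue errors are
# 0.5, 1.5, 3.0 and 9.0 angstroms.
experimental = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
predicted = [(0.5, 0.0, 0.0), (3.8, 1.5, 0.0), (7.6, 0.0, 3.0), (11.4, 9.0, 0.0)]

score = gdt_ts(predicted, experimental)
print(score)  # 56.25
```

In this toy case one residue is off by 9 angstroms and misses every cutoff, so the score cannot exceed 75; a prediction that matches the experimental coordinates exactly scores 100.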

In addition, AlphaFold's neural network can predict the structure of a typical protein in minutes, and it can also handle larger proteins (such as a protein with 2180 amino acids and no homologous structure). The model also provides a reliable confidence estimate for each amino acid, which makes it easier for researchers to judge how far to trust its predictions.

Figure | 3D view of human interleukin 12 binding to its receptor predicted by researchers using RoseTTAFold (Source: UW Medicine Institute for Protein Design)

Then, in July 2021, David Baker, professor in the Department of Biochemistry at the University of Washington School of Medicine and director of the Institute for Protein Design, led a team of computational biologists that developed a deep learning-based tool called RoseTTAFold. It can quickly and accurately predict the structure of a target protein from limited information, with accuracy comparable to that of AlphaFold2.

Not only that, but RoseTTAFold requires less computing power and computation time than AlphaFold2: with a single gaming computer, it can reliably compute a protein structure in as little as ten minutes. Even more remarkably, RoseTTAFold's code and servers are freely available to the scientific community!

Figure | David Baker (Source: University of Washington website)

Since July, the program has been freely downloaded from GitHub by more than 140 independent research teams, and scientists from around the world are now using RoseTTAFold to build protein models to accelerate research in related fields.

Also in July 2021, DeepMind, led by founder and CEO Demis Hassabis, released AlphaFold's open source code alongside a paper in Nature that gave a complete methodology for the system, detailing how AlphaFold accurately predicts the 3D structure of proteins. In other words, this powerful protein structure prediction model is now completely free as well.

As a result, two powerful artificial intelligence-based protein structure prediction models are now free and open, and researchers can use them to obtain the spatial structure of proteins at any time, without crystallizing proteins or relying on expensive cryo-EM.

In an accompanying editorial, Holden Thorp, editor-in-chief of Science, said: "First, it solves the protein folding problem that has plagued the life sciences for nearly 50 years; as with gravitational waves in physics, scientists persevered for decades before finally overcoming it. Second, like cryo-electron microscopy, this technology changes the rules of structural biology and will accelerate the development of the life sciences. Finally, being completely free means it is a protein prediction model that is truly available to everyone."

Resources:

https://www.eurekalert.org/news-releases/937705?

www.science.org/doi/10.1126/science.abn5795