
ACM Distinguished Member Ji Shuiwang: Deep Learning in Quantum Chemistry and Physics


Compiled 丨 Wang Haowen

Proofread 丨 Victor

Quantum technology and artificial intelligence are among the most advanced fields of science and technology today: the former promises extraordinary computing power, while the latter has been making inroads across all walks of life. What sparks fly when the two meet? In what ways can AI empower quantum technology?

In December last year, Ji Shuiwang, a professor in the Department of Computer Science and Engineering at Texas A&M University, delivered a speech titled "Deep Learning in Quantum Chemistry and Physics" at the CNCC conference, sharing his views on the intersection of the two disciplines.

"Quantum mechanics breaks much of our common-sense understanding: the world of quantum states is uncertain, and at best we can only predict the probabilities of various outcomes."

He also noted that although quantum research concerns objects at the atomic or even subatomic level, it shares similarities with macroscopic settings; for example, the structure of a molecule can be treated as a graph.


Ji Shuiwang received his Ph.D. in Computer Science from Arizona State University in 2010 under the supervision of Professor Ye Jieping. His research interests include machine learning, deep learning, data mining, and computational biology. He received the NSF CAREER Award in 2014 and was named an ACM Distinguished Member in 2020.

The following is the full text of the speech, edited by AI Technology Review without changing the original meaning.

Hello everyone. Today I will focus on how to use AI and graph computing techniques to solve problems in quantum physics and quantum chemistry.

First, some context. Classical physics deals with objects and phenomena in the macroscopic world. For example, if you kick a ball and know its exact mass, velocity, and position, you can predict where the ball will be in five seconds. In the quantum realm, however, the objects of study are at the atomic or even subatomic level, such as molecules composed of atoms and chemical bonds, and it is impossible to reason about the rules of this domain with traditional logic.


In recent years, we have been collaborating with experts in various fields, such as quantum physicists, quantum chemists, and quantum materials scientists, in the hope of making research breakthroughs. Scholars in these different fields share certain research topics, and these topics are related to graphs, AI, and especially deep learning. Let me now report on the latest developments.

1

AI meets quantum chemistry


Molecules are made up of atoms and the chemical bonds between them: atoms can be represented as nodes and chemical bonds as edges, so a molecule can be presented as a 2D graph. In machine learning and data mining, graph computing is a common topic. But migrating it to the molecular domain raises new challenges: the 2D graph form does not fully exploit the properties of molecules. After all, a molecule is not actually a 2D plane; it has three-dimensional spatial structure, determined not only by the properties of nodes and edges but also by spatial coordinates, bond angles, and so on. Therefore, when exploring molecular function, we must pay attention to three-dimensional structure.


How can we efficiently use the spatial information of molecules for prediction and generative models? The Message Passing Neural Network (MPNN) is a commonly used graph neural network framework. Such methods can be summarized by two equations: an aggregation function and a node update function. The aggregation function gathers information from neighboring nodes.
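The two equations can be sketched in a few lines of plain Python/NumPy. This is a minimal illustration of the MPNN pattern, not the exact formulation from the talk; the weight matrices `W_msg`, `W_upd` and the `tanh` nonlinearity are assumptions made for the sketch.

```python
import numpy as np

def mpnn_layer(h, edges, W_msg, W_upd):
    """One message-passing layer: aggregate neighbor/edge info, then update.

    h     : dict node -> feature vector
    edges : dict (j, i) -> edge feature vector; the message flows j -> i
    W_msg, W_upd : weight matrices (illustrative placeholders)
    """
    # Aggregation: m_i = sum over neighbors j of M(h_j, e_ji)
    msgs = {i: np.zeros(W_msg.shape[0]) for i in h}
    for (j, i), e in edges.items():
        msgs[i] += np.tanh(W_msg @ np.concatenate([h[j], e]))
    # Update: h_i' = U(h_i, m_i)
    return {i: np.tanh(W_upd @ np.concatenate([h[i], msgs[i]])) for i in h}
```

Each layer thus touches only a node, its neighbors, and the connecting edges, which is exactly the limitation the speaker points out next: node and edge features alone carry no 3D information.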


When we compute the message for a node, we essentially consider the node's own features at the previous time step, the features of its neighboring nodes, and the edge information. After aggregation, a node update function uses the aggregated information and the features from previous steps to update the current node's representation. But this only takes node and edge features into account. So our recent work builds a 3D graph network that can capture complete 3D information.

Once three-dimensional information is incorporated, the network becomes very complex and message passing becomes inefficient. So we want the network to be efficient while keeping its computations invariant and stable. For a given molecule, many of its 2D properties do not change when you rotate it, but its 3D coordinates do; therefore, in both prediction and generation, we want the predicted quantum properties to remain unchanged when the molecule is rotated.

The predictive task of a model is to predict the properties of a given molecule; for example, we can predict whether a molecule has the potential to act as an antibiotic. Generative models, on the other hand, generate or synthesize new molecules with a given property.

At present, many researchers have taken 3D properties into account. One of the earliest works, called SchNet, incorporated distance as a three-dimensional feature: using SchNet means that edges and edge lengths are considered. More recently, a work called DimeNet built on SchNet by also taking angles into account. For example, to compute the message m_{j,i} from node j to node i, you must consider not only the node information but also the angle between the two chemical bonds.


But in chemistry, we have found that considering only distances and bond angles is not enough. As shown above, the red part represents a plane that is not actually determined, and the same is true for the blue part. Molecules have geometric shapes, but knowing the lengths of three chemical bonds and two bond angles is not sufficient to fully determine a molecule's geometry.


Consider the angle φ between the plane determined by d1 and d2 and the plane determined by d2 and d3. It is precisely this angle that the above models leave undetermined: even when the bond lengths and the two bond angles are identical, as φ changes, the geometry of the molecule changes.
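The angle φ between the two planes is the familiar dihedral (torsion) angle, and it can be computed directly from four atomic positions. A minimal NumPy sketch (the function name and sign convention are my own, not from the talk):

```python
import numpy as np

def torsion_angle(p0, p1, p2, p3):
    """Dihedral angle (degrees) between the plane through (p0, p1, p2)
    and the plane through (p1, p2, p3)."""
    b1, b2, b3 = p1 - p0, p2 - p1, p3 - p2
    n1 = np.cross(b1, b2)                    # normal of the first plane
    n2 = np.cross(b2, b3)                    # normal of the second plane
    m1 = np.cross(n1, b2 / np.linalg.norm(b2))
    # atan2 form is numerically stable and keeps the sign of the angle
    return np.degrees(np.arctan2(m1 @ n2, n1 @ n2))
```

Two conformations with identical bond lengths and bond angles but different φ are genuinely different molecules, which is why distance-and-angle models cannot tell them apart.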

What we are trying to build is a complete geometric framework that handles all cases, called spherical message passing.


To solve the above problem, the angle φ is taken into account; φ is the angle between X and its projection.

One consideration here is that the representation of the molecule must be stable: for example, when the molecule is rotated, its properties, such as all the angles, should not change. In spherical message passing, we build a spherical coordinate system consisting of a reference point, distances, angles, and torsion angles. But this model is not 100% perfect.
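The stability requirement is easy to check numerically: applying the same rotation to every atom must leave distances and bond angles unchanged. A small sketch (the helper names are assumptions for illustration):

```python
import numpy as np

def rotation_matrix(axis, theta):
    """Rodrigues' formula: rotation by angle theta (radians) about a unit axis."""
    axis = np.asarray(axis, float) / np.linalg.norm(axis)
    K = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])      # skew-symmetric cross-product matrix
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def bond_angle(a, b, c):
    """Angle at atom b (degrees) formed by bonds b-a and b-c."""
    u, v = a - b, c - b
    cos = u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))
```

Because distances, bond angles, and torsion angles are all relative quantities, a coordinate system built from them is rotation-invariant by construction, which is the point made above.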

Because only one reference is considered, uncertainty arises once other nodes are taken into account. So our work is imperfect, but it is very efficient.

Recently a system called GemNet pointed out that our system uses only the information of a node's 1-hop neighbors and does not use 2-hop neighborhood information. The GemNet authors believe that when you use 2-hop information, the system approaches completeness. It is true that with 2-hop information the angle information combines better and a nearly perfect effect is achieved, but once you use 2-hop information, each update incorporates a large number of neighboring nodes, and the entire message-update step becomes extremely complicated.


In contrast, our system, while not 100% perfect, is more efficient, and its complexity is easy to see: n represents the number of nodes and k the average node degree. In practice, our model performs very similarly to the more complex GemNet.
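The efficiency gap between 1-hop (pairwise) and 2-hop (triplet) message schemes can be illustrated by simply counting messages on a toy graph: for n nodes of average degree k, pairwise messages scale roughly as n·k, while angle-refined triplet messages scale as n·k². This is an illustrative count, not the implementation of either system:

```python
def message_counts(adj):
    """Count pairwise (1-hop) and angle-based triplet (2-hop) messages.

    adj: dict node -> list of neighbors (both directions listed).
    """
    # One message per directed edge j -> i
    pairwise = sum(len(nbrs) for nbrs in adj.values())
    # A triplet (k, j, i): the message j -> i is refined by every
    # other neighbor k of j, giving deg(j) * (deg(j) - 1) triplets per node j.
    triplets = sum(len(adj[j]) * (len(adj[j]) - 1) for j in adj)
    return pairwise, triplets
```

On a complete graph of 4 nodes (degree 3), this gives 12 pairwise messages but 24 triplets; the gap widens quadratically as molecules grow denser.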

The following diagram gives a clear picture of what our model can and cannot represent. Graphs a and b show a phenomenon known in chemistry as chirality.


In fact, these two molecules are mirror images of each other. The network we designed is able to distinguish between the two cases, which many previous methods could not do, because our method uses the torsion angle as a relative angle: in the chiral example, the torsion angles of q1 are 60° and 90° respectively. In the second case, however, the torsion angles of q1 are all 90°, so our method cannot distinguish it. This second case was pointed out by peers in the community who hoped we would correct it, but in a chemical sense the probability of it occurring is very, very low: q2 and q3 are different atoms, and it is almost impossible for them to have the same torsion angle with respect to q1. So although our model does not cover 100% of situations, the situations it cannot cover are hard to find in nature.

2

AI meets quantum mechanics

When we turn to quantum mechanics, the Schrödinger equation provides the solution. If you know the distances, angles, and torsion angles, you can use different functions arising from the equation, such as spherical harmonics and spherical Bessel functions, or other basis functions, to embed these values and end up with a feature vector. This feature vector has physical meaning and can be used in actual message passing.
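One commonly used way to embed distances with such basis functions is a spherical-Bessel-style radial basis of the form e_n(d) = √(2/c) · sin(nπd/c)/d, as popularized by DimeNet. A minimal sketch, with `num_basis` and `cutoff` as assumed parameters:

```python
import numpy as np

def radial_bessel_basis(d, num_basis=8, cutoff=5.0):
    """Embed an interatomic distance d into a vector of radial basis values.

    Uses the zeroth-order spherical Bessel functions
        e_n(d) = sqrt(2 / cutoff) * sin(n * pi * d / cutoff) / d,
    for n = 1 .. num_basis; d is assumed to satisfy 0 < d <= cutoff.
    """
    n = np.arange(1, num_basis + 1)
    return np.sqrt(2.0 / cutoff) * np.sin(n * np.pi * d / cutoff) / d
```

Unlike a raw scalar distance, this vector gives the network a physically motivated, smooth representation that vanishes at the cutoff radius.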


The following diagram shows the system's architecture. There is an input module; an interaction module that takes torsion angle and distance information as input and may be repeated many times, with the number of repetitions depending on how much data you have; and finally an output module. This makes the message-passing framework usable in events such as the Open Catalyst Challenge.


The Open Catalyst Challenge is a competition launched by Facebook AI and CMU. Its purpose is to use a new large-scale molecular dataset to predict thermodynamic properties. In the field of catalyst discovery, the target molecules are usually relatively large, each containing an average of 80 atoms.


The organizers divided the dataset into four groups based on the relationship between the training and test sets, and scored systems by mean absolute error. Each row represents a model: CGCNN, a graph model for crystalline materials, as well as SchNet, DimeNet, and GemNet. As you can see, SphereNet holds a very competitive position among all these systems.


The table above shows results on another dataset, QM9. This is a relatively small dataset; each column represents a quantum property and each row a prediction method, and the table shows each method's mean error on the different properties.


Our system has also been successful on other datasets, such as MD17, which is a smaller dataset. As mentioned, GemNet can only be used on smaller datasets because it uses 2-hop information and requires more computing power.

As the chart above shows, even on smaller datasets our system performs slightly better than DimeNet and almost the same as GemNet, while GemNet's computational cost is greater.

The following figure compares computational cost. So far, our computational cost is much smaller than that of both generations of GemNet.


The following figure shows the system's learned filters. Each row represents a molecule at different torsion angles; in many cases our filters produce very different outputs at different torsion angles, which confirms that the torsion parameter is very important for capturing different molecular patterns.


To summarize briefly: our idea was to represent a molecule's three-dimensional information in its entirety, so we built the SphereNet framework. The framework is theoretically nearly complete and very efficient; in practical terms it may already cover essentially all cases, and we have made a lot of progress in this direction.

At present, the work has been open-sourced as a library called DIG (Dive into Graphs).

In particular, for molecular research applications we have a dedicated library called MoleculeX. If you follow the KDD Cup, you will know that we are among the leaders in graph neural network computing; we also participate in AI Cures' open challenge for COVID-19, where we currently rank first in both AUROC and AUPRC.

Our work, then, is mainly to develop new graph processing techniques to solve problems in basic science, especially quantum chemistry, quantum physics, and materials science. My team develops computational methods, open-sources software libraries, and publishes our results at conferences and in journals. We have also participated in a number of open challenges, such as the KDD Cup.

Our research at the intersection of artificial intelligence and quantum physics is based on the Schrödinger equation. Such research is computationally very expensive: for a many-particle system, solving the eigenvalue problem requires enormous computing power. Combining quantum physics with AI computing will be a very hot field, and at present it is still at the exploration stage.


Leifeng Network
