
Effectively identifying 630,000 three-dimensional spatial configurations, Tsinghua University took the lead in publishing the Uni-MOF model

Author: HyperAI

In industry, high-purity gases are widely used in semiconductor manufacturing, optical fiber production, scientific research, healthcare, environmental protection, energy, and many other fields. In the semiconductor industry, for example, high-purity gases are key raw materials for chip manufacturing, and they directly affect the performance and yield of integrated circuits.

The key challenge in preparing high-purity gases is gas separation. Common separation methods include the cryogenic method (based on rectification), the adsorption method (based on molecular polarity), and the membrane method (based on membrane filtration). Among adsorbent materials, metal-organic frameworks (MOFs) show great potential for gas adsorption, storage, and separation thanks to their highly ordered pore structure and adjustable pore size. Industry insiders predict that MOFs may be as important to the 21st century as plastics were to the 20th.

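However, accurately predicting the adsorption capacity of MOFs still faces many challenges. To address this problem, the team of Professor Lu Diannan of the Department of Chemical Engineering at Tsinghua University, together with Professor Wu Jianzhong of the University of California, Riverside, and researcher Gao Zhifeng of the AI for Science Institute, Beijing, recently published a paper in Nature Communications titled "A comprehensive transformer-based approach for high-accuracy gas adsorption predictions in metal-organic frameworks".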

The study proposes Uni-MOF, a three-dimensional machine learning model that predicts the adsorption behavior of nanoporous materials for various gases under various working conditions. It is a major breakthrough in the application of machine learning technology in the field of materials science.

Research Highlights:

* The Uni-MOF framework is a versatile solution for predicting the gas adsorption capacity of MOFs under different conditions

* Uni-MOF not only recognizes and restores the three-dimensional structure of nanoporous materials through pre-training, but also further considers operating conditions such as temperature, pressure, and different gas molecules, which makes it suitable for both scientific research and practical applications

* By using adsorption data from other gases, Uni-MOF accurately predicts the adsorption performance of unknown gases


Address:

https://www.nature.com/articles/s41467-024-46276-x


Datasets: Existing databases + program-generated data

In this study, the MOF/COF structures used for pre-training came from two main sources: they were either collected from currently available databases or generated with dedicated programs.

Many MOF/COF databases are available, including the computationally generated hMOF database, MOFs built with the Topologically Based Crystal Constructor (ToBaCCo), the experimental-grade CoRE (Computation-Ready, Experimental) MOFs and CoRE COFs, and the CCDC (Cambridge Crystallographic Data Centre) database, among others.

More than 168,000 MOF/COF structures are also available in the online integrated database MOFXDB. Beyond exploring the nanoporous materials in these libraries, the researchers used the ToBaCCo 3.0 program to generate more than 306,773 MOF structures.

For the downstream task of predicting gas adsorption by MOFs, the researchers collected data from online sources such as MOFXDB to form an adsorption dataset of more than 2.4 million records for hMOFs, covering five gases (CO2, N2, CH4, Kr, Xe) at 273/298 K and 0.01–10 Pa, and more than 460,000 records for CoRE MOFs, covering two gases (Ar, N2) at 77/87 K and 1–10⁵ Pa.

In addition, the researchers ran Grand Canonical Monte Carlo (GCMC) simulations with the RASPA software, producing more than 99,000 additional gas adsorption data points. Each simulation used 50,000 initialization cycles followed by another 50,000 cycles to sample the adsorption capacity. These data cover 150–300 K and 1 Pa–3 bar for seven gas molecules (CH4, CO2, Ar, Kr, Xe, O2, He).

Model framework: pre-training + multi-task prediction fine-tuning

The Uni-MOF framework consists of pre-training on 3D nanoporous crystals and fine-tuning for multi-task prediction in downstream applications.


Schematic overview of the Uni-MOF framework

During the pre-training phase of the model, the researchers implemented two types of tasks to improve the model's performance.

The first task is masked atom prediction: the model identifies the type of each atom that has been masked in the structure. The second task is 3D coordinate recovery under noise: uniform noise in the range [-1 Å, +1 Å] is added to 15% of the atomic coordinates, the spatial positional encoding is computed from these corrupted coordinates, and the model has to restore the original positions.

Both tasks are designed to make the model more robust to perturbed input data, so that it performs more accurately in the subsequent prediction tasks.
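The two pre-training tasks described above can be sketched as a data-corruption step. This is a minimal illustration, not the authors' implementation: the function name `corrupt_structure`, the `[MASK]` placeholder, the toy structure, and the choice to mask and perturb the same 15% of atoms are all assumptions (the article only states the 15% ratio for the coordinate noise).

```python
import random

MASK_TOKEN = "[MASK]"  # hypothetical placeholder for a hidden atom type


def corrupt_structure(atom_types, coords, ratio=0.15, noise=1.0, seed=0):
    """Corrupt a structure and return the targets a model would have to recover.

    atom_types: list of element symbols, e.g. ["Zn", "O", "C"]
    coords:     list of (x, y, z) tuples in angstroms
    ratio:      fraction of atoms to corrupt (15% for coordinates in the article;
                reusing it for atom-type masking is an assumption)
    noise:      half-width of the uniform noise interval in angstroms
    """
    rng = random.Random(seed)
    n = len(atom_types)
    k = max(1, round(ratio * n))
    picked = set(rng.sample(range(n), k))

    masked_types = list(atom_types)
    noisy_coords = [tuple(c) for c in coords]
    for i in picked:
        # Task 1: hide the atom type so it has to be predicted.
        masked_types[i] = MASK_TOKEN
        # Task 2: shift each coordinate by uniform noise in [-noise, +noise].
        noisy_coords[i] = tuple(x + rng.uniform(-noise, noise) for x in coords[i])

    # Ground truth for the corrupted atoms: (original type, original position).
    targets = {i: (atom_types[i], tuple(coords[i])) for i in picked}
    return masked_types, noisy_coords, targets


# Toy 20-atom "structure" laid out on a line (made-up data).
toy_types = ["Zn", "O", "C", "H"] * 5
toy_coords = [(float(i), 0.0, 0.0) for i in range(len(toy_types))]
masked, noisy, targets = corrupt_structure(toy_types, toy_coords)
```

In a real pipeline the corrupted types and coordinates would be fed to the model, and the loss would compare its outputs with `targets`.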

During the fine-tuning phase, the researchers used about 3 million labeled data points, covering MOFs and COFs under various adsorption conditions, resulting in accurate predictions of adsorption capacity.

Trained on a diverse, cross-system database of target data, the fine-tuned Uni-MOF can predict the adsorption performance of MOFs across systems and operating states, including different gases, temperatures, and pressures. Uni-MOF is therefore a unified and easy-to-use framework for predicting the adsorption performance of MOF adsorbents.

Results: The Uni-MOF framework has a wide range of applications in the field of materials science

First, the researchers validated the predictive power of Uni-MOF.

The results show that on databases with sufficient data and relatively concentrated operating conditions, such as hMOF_MOFX_DB and CoRE_MOFX_DB, Uni-MOF is highly robust, with R² values of 0.98 and 0.92, respectively. On the more widely distributed dataset CoRE_MAP, Uni-MOF still reaches a prediction accuracy (R²) of 0.83, demonstrating good generalization ability.
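For readers unfamiliar with the metric, R² here is the coefficient of determination. The sketch below uses its standard definition; the article does not describe the paper's exact evaluation code, and the function name `r_squared` is chosen for illustration.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are identical")
    return 1.0 - ss_res / ss_tot
```

A value of 1 means the predicted adsorption capacities match the reference values exactly, while 0 means the model does no better than always predicting the mean.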


Overall performance of Uni-MOF in large-scale databases

Second, the researchers compared the predicted results of Uni-MOF with those collected experimentally.

The researchers found that the Uni-MOF framework can accurately screen for high-performance adsorbents based solely on the adsorption capacity predicted at low pressure. Notably, many of its predicted values under low-pressure conditions deviate significantly from the experimental values, especially for Mg-DOBDC and MOF-5. Even so, the Uni-MOF framework remains one of the most accurate predictors, which makes it suitable for solving engineering challenges.


Adsorption isotherms based on low-pressure predictions and high-pressure experimental values. Each curve represents the Langmuir fit.
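The Langmuir isotherm mentioned in the caption has the form q(p) = q_max · K · p / (1 + K · p). As a rough sketch of how such a curve can be fitted, the code below uses the linearized form p/q = p/q_max + 1/(K · q_max) with ordinary least squares. The article does not say how the authors performed their fit, so this simplified linear approach, the function names, and the synthetic data points are all assumptions.

```python
def langmuir(p, q_max, k):
    """Langmuir isotherm: adsorbed amount at pressure p."""
    return q_max * k * p / (1.0 + k * p)


def fit_langmuir(pressures, loadings):
    """Fit q_max and K via the linearized form p/q = p/q_max + 1/(K*q_max)."""
    xs = list(pressures)
    ys = [p / q for p, q in zip(pressures, loadings)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # equals 1 / q_max
    intercept = mean_y - slope * mean_x    # equals 1 / (K * q_max)
    q_max = 1.0 / slope
    k = slope / intercept
    return q_max, k


# Synthetic, noise-free points generated from known parameters (made-up values).
demo_pressures = [0.1, 0.5, 1.0, 5.0, 10.0, 50.0]
demo_loadings = [langmuir(p, q_max=8.0, k=0.5) for p in demo_pressures]
fitted_q_max, fitted_k = fit_langmuir(demo_pressures, demo_loadings)
```

With noisy real data, a nonlinear least-squares fit of the original equation is usually preferred, because the linearized form distorts how measurement errors are weighted.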

Third, the researchers validated the predictive power of Uni-MOF in terms of cross-system characteristics.

The test results show that Uni-MOF is robust in predicting the adsorption capacity of unknown gases, with a prediction accuracy (R²) of 0.85 for krypton and better than 0.35 for all unknown gases. Compared with single-system tasks, the Uni-MOF framework performs better on cross-system datasets and can accurately predict the adsorption performance of unknown gases, which shows its strong predictive ability and generality.
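One common way to test predictions for an "unknown" gas is to hold that gas out of the training data entirely. The sketch below shows such a split; the article does not spell out the authors' exact protocol, and the record layout, field names, and values here are hypothetical.

```python
def split_by_gas(records, held_out_gas):
    """Split adsorption records so the held-out gas never appears in training."""
    train = [r for r in records if r["gas"] != held_out_gas]
    test = [r for r in records if r["gas"] == held_out_gas]
    return train, test


# Made-up records: structure id, gas, temperature (K), pressure (Pa), uptake.
demo_records = [
    {"mof": "A", "gas": "CH4", "T": 298, "P": 1e5, "uptake": 1.2},
    {"mof": "A", "gas": "CO2", "T": 298, "P": 1e5, "uptake": 3.4},
    {"mof": "A", "gas": "Kr", "T": 273, "P": 1e5, "uptake": 0.9},
    {"mof": "B", "gas": "Kr", "T": 273, "P": 1e4, "uptake": 0.3},
]
train_set, test_set = split_by_gas(demo_records, held_out_gas="Kr")
```

The model is then trained on `train_set` only, and its R² on `test_set` measures how well it transfers to a gas it has never seen.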


Case study of Uni-MOF cross-system prediction

In addition, to evaluate the model's structure recognition ability, the researchers took hMOF-5004238 as an example and analyzed the interatomic interactions inside the material structure. The analysis demonstrated the effectiveness of Uni-MOF in identifying more than 630,000 three-dimensional spatial configurations and their interatomic connections, highlighting the versatility and broad application prospects of the model.

In summary, the Uni-MOF framework is a versatile prediction platform for MOF materials. As a gas adsorption predictor for MOFs, it is highly accurate under various operating conditions and has a wide range of applications in the field of materials science. More importantly, Uni-MOF represents a major breakthrough in the application of machine learning techniques to materials science.

Discover-Design-Optimize, AI Accelerates Materials Science Across the Board

Materials science is an important discipline related to the discovery, design and manufacture of new materials, and it plays an extremely important role in various fields. From healthcare to energy storage, from environmental protection to information technology, developments in materials science are essential to solving the various challenges facing society today.

With the continuous advancement of technology, we are in an era of revolution in materials science, and the emergence of new materials gives people new ways and tools to solve problems. With a deeper understanding of material properties and structures, we can expect to create lighter, stronger, and more energy-efficient materials.

Artificial intelligence technology can accelerate the discovery of new materials, improve material properties, and reduce R&D costs, and has shown great application potential in the field of materials science in recent years.

* Material Discovery & Design:

AI technology can accelerate the discovery and design process of new materials through efficient data mining and pattern recognition. For example, machine learning algorithms can be used to analyze the structure and properties of a large number of known materials to predict new materials with specific properties. This approach can significantly reduce the time required for material screening and reduce the cost of testing.

At the end of November 2023, Google DeepMind published a paper in the journal Nature announcing Graph Networks for Materials Exploration (GNoME), a deep learning model for materials science. Using the model together with high-throughput first-principles calculations, the team found more than 380,000 thermodynamically stable crystalline materials, which is equivalent to "nearly 800 years of knowledge accumulation by human scientists" and greatly accelerates the discovery of new materials.


* Material property prediction:

AI technology can build efficient predictive models that can be used to predict the performance and behavior of materials. These models can be trained on large amounts of experimental data or simulation results to provide accurate predictions of material properties. For example, machine learning algorithms can be used to predict the mechanical properties, thermal properties, and electronic structure of materials, providing an important reference for material design and application.

* Material Optimization and Design:

AI technology can improve the performance and stability of materials by intelligently optimizing their structure and properties. For example, reinforcement learning algorithms can be used to achieve automatic optimization during material preparation to maximize material properties.

* Material Process Control & Monitoring:

Artificial intelligence technology can be used to optimize the material preparation process and enable intelligent monitoring and control of the material production process. For example, machine learning algorithms can be used to analyze various parameters and conditions in the material preparation process, optimize the process, and improve production efficiency and material quality. At the same time, artificial intelligence technology can also realize real-time monitoring and early warning of the material production process, helping to identify and solve potential problems in advance and reduce production risks.

The application of artificial intelligence in materials science has produced a series of important advances, which provide new ideas and methods for the discovery, design, optimization, and preparation of materials. In the future, scientists can use AI technology to better predict the properties of materials, simulate molecular structures, optimize material design, explore material properties, and more, continuing to promote progress and innovation in the field of materials science.

