To predict the binding affinity of ligand-target pairs, Harbin Institute of Technology developed a new SOTA drug representation model

Author: ScienceAI

Editor | Cabbage leaves

Understanding the intermolecular interactions of ligand-target pairs is key to guiding the optimization of cancer drug research, and it can greatly reduce the workload of wet labs. Current computational methods, however, have shortcomings that limit their practical application.

Here, researchers from Harbin Institute of Technology present DrugMGR, a deep multi-granular drug representation model capable of predicting the binding affinity and region of each ligand-target pair.

By learning the complex natural mechanisms of ligands and multi-granular representations of higher-order protein features, DrugMGR significantly outperforms current state-of-the-art methods on almost all datasets. It is also the first model to analyze protein-ligand complexes using graph-, convolution-, and attention-based information at the same time.

The study, titled "DrugMGR: a deep bioactive molecule binding method to identify compounds targeting proteins", was published in Bioinformatics on April 1, 2024.

Drug development is critical to the treatment of diseases, and scientists can quickly find treatment options through drug repurposing, but the high cost and long cycle time of traditional experimental methods limit their application. In contrast, computational methods to identify high-confidence ligand-target interactions can significantly narrow the list of compound candidates and reveal the binding mechanisms of protein-ligand complexes.

Over the past decade, the proliferation of data on bioactive molecules has driven the use of deep learning and artificial intelligence in studying protein-ligand interactions.

However, existing deep learning methods have two problems. First, most models cannot adequately capture multi-granular ligand features and fail to fully integrate the different kinds of natural mechanism information, such as the atomic environment and the chemical genome sequence. Second, many methods neglect the interpretability of the binding region; although a few try to infer the binding site with the help of attention mechanisms, the associated biological characteristics remain unclear, which makes it hard for researchers to locate the binding site.

To address these deficiencies, researchers at Harbin Institute of Technology proposed DrugMGR, a model based on deep multi-granular representation that predicts both the binding affinity and the binding region of ligands on protein targets.

Figure: Overview of the DrugMGR methodology. (Source: Paper)

Specifically, the team first used three deep modules to comprehensively encode the natural mechanisms of ligands, i.e., using graph attention networks (GATs) to model the atomic environment, CNNs to extract global chemical genome sequences, and molecular transformers (MTs) to capture the interactions of local substructures.
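
As a rough illustration of the atomic-environment branch, the sketch below shows a single graph-attention aggregation step in plain Python. It follows the generic GAT recipe (score each neighbor, apply a softmax, take the weighted sum) and should not be read as DrugMGR's actual layer: the learned linear projection is omitted, and the names `gat_aggregate`, `attn_src`, and `attn_dst`, along with the toy three-atom graph, are assumptions made for this example.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def gat_aggregate(features, neighbors, attn_src, attn_dst):
    """One attention-weighted neighborhood aggregation per atom.

    features:  list of per-atom feature vectors (assumed already projected)
    neighbors: neighbors[i] lists the atoms that atom i attends to (incl. i)
    attn_src, attn_dst: hypothetical learned attention vectors
    """
    updated = []
    for i, h_i in enumerate(features):
        # Unnormalized score for each neighbor j of atom i.
        scores = [
            leaky_relu(dot(attn_src, h_i) + dot(attn_dst, features[j]))
            for j in neighbors[i]
        ]
        # Numerically stable softmax over the neighborhood.
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # New atom feature = attention-weighted sum of neighbor features.
        updated.append([
            sum(w * features[j][k] for w, j in zip(weights, neighbors[i]))
            for k in range(len(h_i))
        ])
    return updated


# Toy chain of three atoms (0-1-2) with self-loops; values are made up.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2]}
atom_embeddings = gat_aggregate(features, neighbors, [0.5, -0.5], [0.3, 0.7])
```

Because the weights in each neighborhood sum to one, every updated atom vector is a convex combination of its neighbors' vectors. In a full model, the CNN and molecular-transformer branches would produce their own views of the same ligand alongside this one.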

The researchers also designed a parallel VAE module to learn the high-level features of proteins in a probabilistic encoder via CNN blocks, and then reconstruct the target structure in a probabilistic decoder.
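
The probabilistic encoder and decoder can be pictured with the standard VAE building blocks. The sketch below is a minimal version under stated assumptions: a diagonal Gaussian latent, a squared-error reconstruction term, and a `beta` weight on the KL term. None of these choices is confirmed by the article, and the CNN blocks themselves are replaced by hard-coded stand-in values for `mu` and `log_var`.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, sigma^2) || N(0, 1) ), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


def vae_loss(target, reconstruction, mu, log_var, beta=1.0):
    """Squared-error reconstruction term plus a weighted KL term."""
    recon = sum((t - r) ** 2 for t, r in zip(target, reconstruction))
    return recon + beta * kl_to_standard_normal(mu, log_var)


rng = random.Random(0)
mu, log_var = [0.2, -0.1], [-1.0, -2.0]   # stand-ins for CNN encoder outputs
z = reparameterize(mu, log_var, rng)      # latent code passed to the decoder
```

Sampling through `mu + sigma * eps` keeps the latent code differentiable with respect to the encoder outputs, which is what allows the encoder and decoder to be trained jointly.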

The encoded ligand and protein representations are then fed into a pairwise interaction mapping module built from attention networks, which learns the interaction patterns of the protein-ligand complex. The resulting joint pairwise representation is decoded by a fully connected network to predict the binding affinity of the bioactive molecule.
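
The pairwise interaction step can be sketched with plain scaled dot-product cross-attention, in which each ligand token attends over all protein positions. This is an illustrative simplification rather than the paper's module: the function names, the mean pooling, the single linear output layer, and all numeric values below are assumptions.

```python
import math


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(ligand_tokens, protein_tokens):
    """Each ligand token attends over all protein positions."""
    dim = len(protein_tokens[0])
    attended, attention_map = [], []
    for query in ligand_tokens:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in protein_tokens
        ]
        weights = softmax(scores)
        attention_map.append(weights)
        # Protein context vector seen from this ligand token.
        attended.append([
            sum(w * key[c] for w, key in zip(weights, protein_tokens))
            for c in range(dim)
        ])
    return attended, attention_map


def predict_affinity(attended, out_weights, bias):
    """Mean-pool the joint representation, then apply one linear layer."""
    dim = len(out_weights)
    pooled = [sum(row[c] for row in attended) / len(attended)
              for c in range(dim)]
    return sum(p * w for p, w in zip(pooled, out_weights)) + bias


ligand = [[1.0, 0.0], [0.5, 0.5]]               # 2 ligand tokens (toy)
protein = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 protein positions (toy)
joint, attention_map = cross_attention(ligand, protein)
affinity = predict_affinity(joint, out_weights=[0.8, -0.2], bias=0.1)
```

The `attention_map` has one row per ligand token and one column per protein position, so it doubles as a coarse picture of which residues each part of the ligand focuses on.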

Figure: Performance comparison between random and cold-start splitting of the BindingDB dataset. (Source: Paper)
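
The article does not spell out how the two splits are built. The sketch below assumes the common convention that a random split shuffles individual ligand-target pairs, while a cold-start split holds out whole ligands so that no test ligand is seen in training. Some benchmarks also hold out targets; that variant is not shown. The function names and the toy records are hypothetical.

```python
import random


def random_split(pairs, test_fraction=0.2, seed=0):
    """Shuffle ligand-target pairs and cut; ligands may appear on both sides."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def cold_start_split(pairs, test_fraction=0.2, seed=0):
    """Hold out whole ligands so no test ligand is seen during training."""
    ligands = sorted({ligand for ligand, _, _ in pairs})
    random.Random(seed).shuffle(ligands)
    n_test = max(1, int(len(ligands) * test_fraction))
    held_out = set(ligands[:n_test])
    train = [p for p in pairs if p[0] not in held_out]
    test = [p for p in pairs if p[0] in held_out]
    return train, test


# Toy (ligand, target, affinity) records; all values are made up.
pairs = [(f"lig{i % 5}", f"prot{i % 3}", 5.0 + 0.1 * i) for i in range(15)]
train, test = cold_start_split(pairs)
```

A cold-start split is usually the harder setting, because the model cannot rely on having memorized a ligand's behavior from other pairs.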

For binding region prediction, the researchers first highlight the sites of the reconstructed protein that show ligand-binding potential and treat them as the original binding region. Convolution operations are then used to multiply the multi-granular ligand features with the protein features.

Next, they recorded the convolution results as a response vector for each ligand-target pair and labeled the regions with high values in the response vector as the visualized binding region. Finally, the researchers used these two regions together to guide the final predicted binding region.
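
The response-vector idea can be sketched by sliding the ligand features over the protein features like a convolution kernel and keeping the windows with the highest responses. This is a simplified reading of the description above, not the paper's code: the kernel shape, the top-k selection rule, and the names `response_vector` and `high_response_windows` are assumptions.

```python
def response_vector(protein_feats, ligand_kernel):
    """Slide the ligand kernel along the protein and record each response.

    protein_feats: list of per-residue feature vectors
    ligand_kernel: list of k feature vectors with the same dimension
    """
    k = len(ligand_kernel)
    responses = []
    for start in range(len(protein_feats) - k + 1):
        window = protein_feats[start:start + k]
        # Elementwise multiply-and-sum between the window and the kernel.
        responses.append(sum(
            p * l
            for residue, kernel_row in zip(window, ligand_kernel)
            for p, l in zip(residue, kernel_row)
        ))
    return responses


def high_response_windows(responses, top_k=3):
    """Start indices of the top_k windows, returned in sequence order."""
    ranked = sorted(range(len(responses)),
                    key=lambda i: responses[i], reverse=True)
    return sorted(ranked[:top_k])


# Toy features for five residues and a two-row ligand kernel.
protein_feats = [[0, 0], [1, 0], [1, 1], [0, 0], [0, 1]]
ligand_kernel = [[1, 0], [1, 1]]
responses = response_vector(protein_feats, ligand_kernel)
candidate_region = high_response_windows(responses, top_k=1)
```

With these toy inputs the second window (residues 1 and 2) gives the strongest response, so it is the one flagged as the candidate region.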

Compared with DrugBAN, a binary classifier that simply identifies whether a drug and a target interact, the team's DrugMGR goes further and captures the full binding information (binding affinity and binding region) of protein-ligand complexes, which plays a central role in practical applications of bioactive molecule binding.

Figure: Visualization of the identified drug Talazoparib and its target PARP1 across the three predicted regions. (Source: Paper)

For triple-negative breast cancer (TNBC), which is highly aggressive, has a poor prognosis, and lacks effective targeted therapies, the team used the DrugMGR model to identify potential inhibitors and chemotherapy drugs against PARP1 from the DrugBank database.

The top 10 candidates were screened and validated against the GeneCards and PDB databases, and the validity of the model was confirmed by visualizing the binding region of PARP1 with Talazoparib (PDB ID: 4PJT).
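
The screening step amounts to scoring every library compound against the target and keeping the best ten. The sketch below shows only that ranking logic. The scorer `toy_affinity` is a hypothetical, chemically meaningless stand-in for a trained model, the compound identifiers are placeholders rather than DrugBank entries, and the code assumes that a higher score means stronger predicted binding.

```python
import heapq


def screen_library(ligands, target, predict_affinity, top_k=10):
    """Return the top_k ligands by predicted affinity, best first."""
    scored = ((predict_affinity(ligand, target), ligand) for ligand in ligands)
    return [(ligand, score) for score, ligand in heapq.nlargest(top_k, scored)]


def toy_affinity(ligand, target):
    """Hypothetical stand-in scorer: deterministic, not chemically meaningful."""
    return (sum(map(ord, ligand + target)) % 97) / 10.0


library = [f"compound_{i:03d}" for i in range(50)]  # placeholder identifiers
shortlist = screen_library(library, "PARP1", toy_affinity)
```

In practice the shortlist would then be checked against external evidence, as the researchers did with GeneCards and PDB.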

The results show that DrugMGR accurately predicts the binding site and performs well overall. It is expected to become a powerful tool for virtual screening against PARP1 and to help biopharmacologists screen better combinations of anti-tumor drugs.

Paper link: https://academic.oup.com/bioinformatics/article/40/4/btae176/7638803
