Pfizer's AI method is published in Science, revealing tens of thousands of ligand-protein interactions

author:ScienceAI
Editor | X

Despite significant progress in protein structure prediction, no small-molecule ligands have been found for more than 80% of proteins. Identifying small-molecule ligands for most proteins remains challenging.

Now, researchers at CeMM, the Research Center for Molecular Medicine of the Austrian Academy of Sciences, have teamed up with Pfizer to develop a method to predict the binding activity of hundreds of small molecules to thousands of human proteins.

This large-scale study revealed tens of thousands of ligand-protein interactions, which could lead to the development of chemical tools and therapeutics.

In addition, with the support of machine learning and artificial intelligence, the approach allows unbiased prediction of how small molecules interact with all the proteins present in living human cells.

The study, titled "Large-scale chemoproteomics expedites ligand discovery and predicts ligand behavior in cells", was published in Science on April 26.

Paper link: https://www.science.org/doi/10.1126/science.adk5864

Most drugs are small molecules that affect protein activity. If well understood, these small molecules can also be valuable tools for characterizing protein behavior and conducting basic biological research.

Given these important roles, it is surprising that for more than 80% of proteins, no small-molecule binders have been found to date. This has hindered the development of new drugs and treatment strategies, as well as new biological insights into health and disease.

Figure: Schematic diagram of the ligand discovery method. (Source: Paper)

To close this gap, CeMM's researchers partnered with Pfizer to set up and scale an experimental platform that allowed them to measure how hundreds of small molecules with different chemical structures interact with all expressed proteins in living cells.

This has resulted in a rich catalog of tens of thousands of ligand-protein interactions, which can now be further optimized and serve as starting points for therapeutic development.

Specifically, the researchers used chemical proteomics methods to map protein-ligand interactions across the human proteome. Using a library of approximately 400 ligand fragments attached to a photoactivatable crosslinker, the authors identified approximately 50,000 statistically significant interactions across approximately 2,500 proteins, most of them targets with no previously known ligands.

These results were validated by biochemical experiments, and E3 ligase binders and transmembrane transporter inhibitors were identified from the screening.

Integrated machine learning binary classifiers further enable interpretable prediction of fragment behavior in cells. The resulting resource of fragment-protein interactions and predictive models will help elucidate molecular recognition principles and accelerate ligand discovery efforts for proteins that have so far remained undrugged.

In this study, a team led by CeMM principal investigator Georg Winter demonstrated this by developing small-molecule binders of cellular transporters, components of the cellular degradation machinery, and understudied proteins involved in cell signaling.

Figure: Fragment promiscuity prediction. (Source: Paper)

In addition, using large data sets, machine learning and artificial intelligence models have been developed that can predict how other small molecules interact with proteins expressed in living human cells.

The researchers used fully functionalized fragment (FFF) descriptors and combined them with a fast, lightweight, and fully automated ML algorithm for binary classification.

Briefly, the screened fragments are first labeled as promiscuous (1) or non-promiscuous (0) according to a threshold on the number of protein interactions. Then, a Transformer-based ML model (TabPFN) is used to map each compound's FFF descriptor to a classification score (0 or 1).

TabPFN is a fully learned model that approximates Bayesian inference and does not require hyperparameter tuning, so a high-performance ML classifier can be obtained directly from chemical proteomics analysis data.
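The labeling step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the fragment names, interaction counts, and two-dimensional descriptor vectors are made up, the median cutoff is an assumed threshold, and a simple nearest-centroid rule stands in for TabPFN, which is not reproduced here.

```python
from statistics import median


def label_promiscuity(interaction_counts, threshold):
    """Label each fragment 1 (promiscuous) if its count reaches the threshold, else 0."""
    return {frag: int(count >= threshold) for frag, count in interaction_counts.items()}


def fit_centroids(descriptors, labels):
    """Average the descriptor vectors of each class (a stand-in model, NOT TabPFN)."""
    groups = {0: [], 1: []}
    for frag, vec in descriptors.items():
        groups[labels[frag]].append(vec)
    return {
        cls: [sum(col) / len(col) for col in zip(*rows)]
        for cls, rows in groups.items()
    }


def predict(centroids, vec):
    """Return the class (0 or 1) whose centroid is closest to the descriptor vector."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda cls: squared_distance(centroids[cls], vec))


# Hypothetical number of significant protein interactions per fragment.
counts = {"frag_A": 310, "frag_B": 12, "frag_C": 95, "frag_D": 4, "frag_E": 150}

# Hypothetical descriptor vectors; real FFF descriptors are not shown in the article.
descriptors = {
    "frag_A": [0.9, 0.8],
    "frag_B": [0.1, 0.2],
    "frag_C": [0.7, 0.9],
    "frag_D": [0.2, 0.1],
    "frag_E": [0.8, 0.7],
}

# Assumption: use the median count as the cutoff; the study's actual threshold may differ.
threshold = median(counts.values())
labels = label_promiscuity(counts, threshold)
centroids = fit_centroids(descriptors, labels)

# Score a new, hypothetical fragment from its descriptor alone.
new_fragment_label = predict(centroids, [0.85, 0.75])
```

The point of the sketch is the two-stage shape of the workflow: binary labels are derived from the screening data first, and a classifier is then trained to reproduce those labels from descriptors alone, so that fragments outside the screen can be scored.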

With this approach, the promiscuity model can also be used to understand the specificity of the bound proteins.

"We were amazed to discover how AI and machine learning can improve our understanding of the behavior of small molecules in human cells. We hope that our catalog of small molecule-protein interactions and the associated AI models can now provide a shortcut in drug discovery approaches," Winter said.

To maximize the potential impact and usefulness for the scientific community, all data and models are freely available through a web application at https://ligand-discovery.ai.

"This is an outstanding collaboration between industry and academia. We are excited to present the results of our teams' three years of working closely together. It's a great project," said Dr. Patrick Verhoest, vice president and head of drug design at Pfizer.

References: https://phys.org/news/2024-04-shortcut-drug-discovery-method-large.html