
Scientists warn: AI could propose 40,000 potential new chemical weapons in 6 hours

Written by | Green Apple

Scientific papers are usually exemplary in their detail: authors are expected to disclose all the information others need to reproduce their findings.

But this study is an exception.

A recent paper in the Nature journal Nature Machine Intelligence, "Dual use of artificial-intelligence-powered drug discovery," apparently frightened its own authors. This shows both in the tone of the text and in the deliberate withholding of key information.

A proof of concept

In 2021, Collaborations Pharmaceuticals, based in Raleigh, North Carolina, was invited to give a presentation on how drug discovery technologies might be misused. The company uses computers to help customers identify molecules that look like potential drugs. The venue was a conference organized by the Spiez Laboratory in Switzerland.

The event is one of a series of "convergence" meetings set up by the Swiss government to identify technological developments that may affect the Chemical Weapons Convention and the Biological Weapons Convention. Held every two years, the conference brings together an international group of scientific and disarmament experts to survey the state and trajectory of chemical and biological technologies, reflect on their potential security implications, and consider how best to address them internationally.

In preparation for the talk, researchers at Collaborations Pharmaceuticals conducted what they called a "thought experiment": a computational proof of concept for designing chemical weapons.

At the Swiss conference, Collaborations Pharmaceuticals decided to explore how AI could be used to design toxic molecules. The company had previously built a molecule-generation platform called MegaSyn, which uses machine learning models to predict biological activity and search for new therapeutic inhibitors of human disease targets. The generative model normally penalizes predicted toxicity and rewards predicted target activity.
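The article describes MegaSyn's normal objective only at this level of detail. As a toy illustration of the idea, not the real MegaSyn code, the composite score can be sketched like this, with `predicted_activity` and `predicted_toxicity` as hypothetical stand-ins for trained ML models:

```python
# Toy sketch of the objective described above: candidate molecules are
# ranked by a composite score that rewards predicted target activity
# and penalizes predicted toxicity. The two predictors are stand-ins,
# not real pharmacology models.

def predicted_activity(molecule: dict) -> float:
    """Stand-in for an ML model scoring target activity (0..1)."""
    return molecule["activity"]

def predicted_toxicity(molecule: dict) -> float:
    """Stand-in for an ML model scoring toxicity (0..1)."""
    return molecule["toxicity"]

def drug_score(molecule: dict, tox_weight: float = 1.0) -> float:
    # Normal drug-discovery setup: reward activity, penalize toxicity.
    return predicted_activity(molecule) - tox_weight * predicted_toxicity(molecule)

candidates = [
    {"name": "mol_a", "activity": 0.9, "toxicity": 0.8},
    {"name": "mol_b", "activity": 0.7, "toxicity": 0.1},
    {"name": "mol_c", "activity": 0.4, "toxicity": 0.05},
]

ranked = sorted(candidates, key=drug_score, reverse=True)
print([m["name"] for m in ranked])  # mol_b ranks first: active but non-toxic
```

The experiment described next amounts to changing the sign of that penalty, which is exactly why the authors found the result so unsettling: the guard rail and the weapon differ by one term in a scoring function.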

In the new experiment, they tweaked the model to reward both toxicity and biological activity, and trained the model using molecules from a public database.

Their method and results are troublingly simple. Trained on the chemical structures of a set of drug-like molecules drawn from a public database (substances that are easy to synthesize and readily absorbed by the body), together with those molecules' known toxicity, the modified software proposed 40,000 potentially lethal molecules in less than six hours. These molecules met the parameters the researchers had predefined and could plausibly serve as chemical weapons.

The Verge spoke to the paper's lead author, Fabio Urbina, a senior scientist at Collaborations Pharmaceuticals, about the possible misuse of AI in drug development.

The research team had never considered this kind of misuse before, and had only a vague awareness of the safety issues around pathogens and toxic chemicals. Urbina's work is rooted in building ML models for therapeutic and toxicity targets: not to create harmful agents, but to aid the design of new drug molecules and to predict the toxicity of newly proposed drug candidates.

Think of it like this: there might be a wonderful drug that magically lowers blood pressure, but if its side effect is to block a critical cardiac ion channel, it is a non-starter. It can never reach the market, because it is simply too dangerous.

For decades, teams have used computers and AI to improve human health. Whatever drug you are trying to develop, you first need to make sure it is not toxic.

The company had recently released a number of computational ML models for toxicity prediction in different domains, and for the conference presentation Urbina chose to flip the switch: to steer the models toward toxicity and explore how AI could be used to design toxic molecules.

It was an unprecedented mental exercise for the team, which eventually evolved into a computational proof of concept for designing chemical weapons.

Urbina is deliberately vague about some of the details, withholding specifics to prevent them from being exploited.

In simple terms, the general workflow was to use historical molecule datasets as training data, with each molecule's known, experimentally tested toxicity serving as its label.

It's important to note that the team focused on VX.

So what exactly is VX?

Strictly speaking, it is a man-made chemical warfare agent classified as a nerve agent, and nerve agents are among the most toxic and fastest-acting chemical warfare agents known. Specifically, VX is an inhibitor of the enzyme acetylcholinesterase. Whenever you move a muscle, neurons release the neurotransmitter acetylcholine as the "move" signal; acetylcholinesterase then breaks acetylcholine down so the muscle can relax again. By blocking this enzyme, VX causes continuous, uncontrolled muscle contraction. That is what makes it fatal: it paralyzes the diaphragm, the muscle that drives breathing, so the lungs stop working and the victim suffocates.

Obviously, this is exactly what drug developers want to avoid. Historically, therefore, many different types of molecules have been tested to see whether they inhibit acetylcholinesterase, and Urbina was able to build a large dataset of these molecular structures and their measured toxicity.

The team could then use these datasets to train an ML model that, in essence, learns which parts of a molecular structure matter for toxicity and which do not. Given new molecules, perhaps candidate drugs that have never been tested, the model predicts which are likely to be toxic and which are not.

This approach greatly speeds up drug screening: researchers can rapidly evaluate very large numbers of molecules and weed out those predicted to be toxic.
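That screening loop can be sketched in a few lines. This is a minimal toy, assuming made-up two-number "features" and a 1-nearest-neighbour model as stand-ins for real molecular descriptors and a real toxicity predictor:

```python
# Minimal sketch of the screening workflow described above: a model
# fitted to molecules with known toxicity labels is used to filter
# untested candidates. Features and model are illustrative stand-ins.

def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

# Training set: (feature vector, is_toxic) pairs with known labels.
training = [
    ((0.9, 0.8), True),
    ((0.8, 0.9), True),
    ((0.1, 0.2), False),
    ((0.2, 0.1), False),
]

def predict_toxic(features):
    # 1-nearest-neighbour: inherit the label of the closest known molecule.
    _, label = min(training, key=lambda t: euclidean(t[0], features))
    return label

# New, untested candidates: keep only those predicted non-toxic.
candidates = {"mol_x": (0.85, 0.75), "mol_y": (0.15, 0.25)}
safe = [name for name, feats in candidates.items() if not predict_toxic(feats)]
print(safe)  # only mol_y survives the toxicity screen
```

In practice the predictor would be a far richer model, but the role it plays in the pipeline is exactly this: a fast, cheap filter applied before any molecule is synthesized.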

In the team's study, however, this was reversed: instead of screening toxic molecules out, the model was asked to seek toxicity out.

The other key ingredient is the new generation of generative models. By being fed many different molecular structures, a generative model learns how molecules are put together, and it can then be asked to produce new ones. Left to itself, it will generate molecules from anywhere in chemical space, essentially at random and with no particular purpose. But researchers can tell the generative model which region of chemical space to head toward.

This is done by designing a scoring function: when the model produces the kind of molecule the researchers are after, it receives a high score. To produce poisons, you simply award high scores to toxic molecules.

In the experiment, the model began generating molecules, many of which resembled VX and other known chemical warfare agents.

Urbina says the team genuinely did not know what they would get, because generative models were still a relatively new and not yet widely used technology.

One finding was especially alarming: many of the generated compounds were predicted to be more toxic than VX. That is shocking because VX is already one of the most potent compounds known; only a tiny amount is needed to kill.

These predictions have not been validated in the real world, and the researchers say they have no wish to validate them themselves. But predictive models of this kind generally perform quite well, so even allowing for many false positives, the set is likely to contain molecules that really are more toxic than VX.

Moreover, when the team inspected the structures of the newly generated molecules, many did look like VX and other warfare agents, and the model even reproduced some real chemical agents it had never seen during training. That alone shows the model can generate genuinely toxic molecules, since some of its outputs have in fact been manufactured before.

The worrying question, then, is how easy this is to do.

The researchers say that much of what they used is freely available. Toxicity datasets can be downloaded from many places, and anyone who can program in Python and has some ML skills could, over a weekend, build a generative model like this one driven by toxicity datasets.

That is what really made the researchers consider publishing the paper: the barrier to this kind of misuse is simply too low.

As the paper puts it: "We have still crossed a gray moral boundary, demonstrating that it is possible to design virtual potentially toxic molecules without much effort, time, or computational resources. While we can easily delete the thousands of molecules we created, we cannot delete the knowledge of how to recreate them."

Urbina says it is a very unusual topic; the team wanted to get the real information out and genuinely discuss it, without letting it fall into the wrong hands.

But he was clear that, as a scientist, you have a responsibility to ensure that whatever you publish is released responsibly.

Beyond that, Urbina notes that what they did is very easy to replicate, because so much of it is open source: shared science, shared data, shared models.


When you start working in chemistry, you are told about the dangers of chemical misuse, and it is your responsibility to avoid it as far as possible. In ML, by contrast, there is no comparable guidance on misuse of the technology.

"We just want more researchers to acknowledge and be aware of the potential abuse," Urbina says.

Given that these models keep getting better, it matters to make this awareness public. At the very least, the problem is now being discussed in wider circles and can become a focus for researchers.

This article is reproduced with permission from Academic Headlines (ID: SciTouTiao). Please contact the original author before reprinting it again.
