New danger? Computers have found more than 100,000 new viruses, which may trigger a major epidemic

Reporting by XinZhiyuan

Editor: Time

【XinZhiyuan Introduction】

On January 26, Science published an article titled "New dangers? Computers uncover 100,000 novel viruses in old genetic data," which points out that clues about future epidemics may be hidden in existing genetic data.

The coronavirus alone has paralyzed the world economy, killing millions of people.

However, virologists estimate that trillions of still-unknown viruses exist.

Many of these could be fatal or have the potential to trigger the next pandemic.

Global outbreak data on 27 January 2022

"This is an impressive feat of engineering"

On Jan. 26, Science ran an article titled "New dangers? Computers uncover 100,000 novel viruses in old genetic data," reporting that clues about future outbreaks may be hidden in existing genetic data.

The study increases by an order of magnitude the number of known viruses that use RNA, not DNA, for their genes.

"It's a foundational piece of work," said J. Rodney Brister of the National Library of Medicine.

The study also helps launch what is known as petabyte genomics (1 PB = 1024 TB), the analysis of RNA and DNA data at that scale.

"It also shows that our understanding of viruses is seriously lacking," said disease ecologist Peter Daszak.

Daszak is president of EcoHealth Alliance, a New York City-based nonprofit research organization that is raising money to conduct a global virus survey.

By sifting through existing genomic data on an unprecedented scale, scientists discovered nearly 132,000 RNA virus genomes.

"This is an impressive feat of engineering!" said bioinformatician C. Titus Brown.

National Library of Medicine

"Faster than anyone thinks"

The work began in early 2020 with Artem Babaian, a computational biologist at the University of Cambridge.

Babaian wondered how many coronavirus sequences exist beyond the one behind the COVID-19 outbreak.

With that in mind, Babaian teamed up with Jeff Taylor, a supercomputing expert, and together they searched existing genomic data.

The data is already stored in a global sequence database maintained by the U.S. National Institutes of Health. So far, the database contains 16 petabytes of archived sequences from genetic samples ranging from pufferfish to soil to humans. Sequencing these samples also captures the genomes of viruses infecting the sampled organisms, although those viruses usually go undetected.

Babaian and Taylor devised a suite of computing tools specifically designed to search cloud data, optimizing the software with the help of several other bioinformatics experts. Their analysis is "faster than anyone thinks": it can process 1 million data sets a day, at a cost of less than 1 cent each.
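To put those figures in perspective, here is a minimal back-of-the-envelope sketch. It uses only the numbers quoted in this article (a 16-petabyte archive, 1 PB = 1024 TB, 1 million data sets a day, under 1 cent each); the function names are hypothetical and exist only for this illustration.

```python
# Back-of-the-envelope arithmetic using only the figures quoted in the article.
# Function names are hypothetical and exist only for this illustration.

TB_PER_PB = 1024  # the article's convention: 1 PB = 1024 TB


def archive_size_tb(size_pb):
    """Convert an archive size from petabytes to terabytes."""
    return size_pb * TB_PER_PB


def daily_cost_upper_bound_usd(datasets_per_day, max_cents_per_dataset):
    """Upper bound on daily spend, given a per-data-set cost ceiling in cents."""
    return datasets_per_day * max_cents_per_dataset / 100


if __name__ == "__main__":
    print(archive_size_tb(16))                       # archive size in TB
    print(daily_cost_upper_bound_usd(1_000_000, 1))  # daily cost ceiling in USD
```

Under these assumptions the archive amounts to 16,384 TB, and a full day of processing stays below 10,000 US dollars.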

They quickly expanded their search beyond coronaviruses, scanning all the data in the cloud for RNA viruses, a group that also includes those that cause influenza, polio, measles and hepatitis.

In fact, the new database does not hold a complete sequence for each new virus, only the gene for RNA polymerase.

RNA polymerase

"Become a huge virus surveillance network"

The researchers found the viruses by searching for the gene for RNA polymerase, an enzyme that is key to the replication of all RNA viruses.

From these partial sequences, the researchers constructed family trees that reveal the relationships between different viruses and how they evolved, and they also identified where specific viruses were found.

"We've turned the database into a huge virus surveillance network," Babaian said.
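The idea of screening raw sequence data for a single polymerase marker can be illustrated with a toy sketch. This is not the study's pipeline: it simply translates each read in all six reading frames and checks for the Gly-Asp-Asp (GDD) motif that many RNA-dependent RNA polymerases carry. The motif choice, the function names and the example reads are assumptions made for illustration. A three-letter motif will also match by chance, so a real search would presumably align reads against full polymerase sequences instead.

```python
# Toy sketch: screen nucleotide reads for a polymerase hallmark motif.
# Not the study's pipeline; names and example reads are hypothetical.

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, with codons enumerated in T, C, A, G order.
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate DNA codon by codon; unrecognized codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Translate a read in three frames on each strand (RNA 'U' read as 'T')."""
    seq = seq.upper().replace("U", "T")
    frames = []
    for strand in (seq, reverse_complement(seq)):
        for offset in range(3):
            frames.append(translate(strand[offset:]))
    return frames


def has_motif(seq, motif="GDD"):
    """True if any of the six translated frames contains the motif."""
    return any(motif in frame for frame in six_frame_translations(seq))


if __name__ == "__main__":
    reads = ["ATGGGTGATGACTAA", "ATGAAAAAAAAATAA"]  # made-up example reads
    for read in reads:
        print(read, has_motif(read))
```

Searching for one short, conserved marker rather than whole genomes keeps the per-read work small, which is consistent with the article's point that the database stores only the polymerase gene for each new virus.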

Bioinformatician C. Titus Brown said this could help researchers better understand how human pathogens arise and improve diagnostic tests for viral infections.

"When a new virus is isolated from a patient, researchers can more easily determine whether it has been found elsewhere," Brown said.

Virus Public Database: https://www.serratus.io/

Other unexpected discoveries

In some aquatic animals, such as pufferfish and salamanders, the study found previously unknown coronaviruses, and the researchers were able to piece together their entire genomes. According to Babaian's report, the sequences suggest that the genomes of these new coronaviruses are split into two separate segments, rather than the usual single strand of RNA.

The search also turned up giant phages in samples from people, cats and dogs. Phages are viruses that infect bacteria, and they can carry genetic material that gives their bacterial hosts new traits. The study found more than 250 such phages, which resemble viruses found in algae.

Babaian's team has also created a public repository so that others can use the study's tools and results.

A cloud-based analysis found 9 novel coronaviruses

Resources:

https://www.science.org/content/article/new-dangers-computers-uncover-100-000-novel-viruses-old-genetic-data

https://www.serratus.io/
