Stanford AI sets a world record with 5-hour DNA sequencing! A new milestone for humanity, at a cost of only 30,000 yuan

Reporting by XinZhiyuan

Editor: Peach Yuan Xie La Yan

A Stanford University research team recently cut the time needed to sequence a human genome to 5 hours and 2 minutes, redefining how fast human genome sequencing can be!

DNA sequencing time has been halved, setting a new Guinness World Record!

The Stanford University research team developed an ultra-fast DNA sequencing technique that uses artificial intelligence computing to accelerate the workflow.

The latest study was published in the New England Journal of Medicine on Jan. 12.

Address of the paper: https://www.nejm.org/doi/full/10.1056/NEJMc2112090

The fastest sample in the study was sequenced in just 5 hours and 2 minutes, and the shortest time from a sample arriving at the laboratory to a diagnosis was 7 hours and 18 minutes.

The previous world record for the fastest genetic diagnosis was 14 hours.

Fastest DNA sequencing: 5h

Why is this a major breakthrough?

Put simply, genome sequencing lets doctors see a patient's complete DNA makeup.

This information, covering everything from eye color to genetic disorders, is important for diagnosing a patient's disease. Once doctors know the specific genetic mutation involved, they can develop a precise treatment plan.

To speed up treatment, doctors therefore have to race against time.

The faster the genome is sequenced, the sooner the patient can leave the ICU.

Patients also need fewer tests, recover faster, and pay less for medical care.

The previous record for the fastest diagnosis by DNA sequencing was the 14 hours set by Rady Children's Hospital, which was already remarkably fast. Stanford's new record is nearly twice as fast.
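
As a rough check on the "nearly twice as fast" claim, the two times quoted in this article (14 hours for the previous record, 7 hours and 18 minutes for Stanford's fastest diagnosis) can be compared directly. This is only arithmetic on the reported figures, and the variable names are illustrative.

```python
from datetime import timedelta

# Times reported in the article (sample arrival to diagnosis).
previous_record = timedelta(hours=14)
stanford_record = timedelta(hours=7, minutes=18)

# Ratio of the old time to the new time: values above 1 mean faster.
speedup = previous_record / stanford_record
time_saved = previous_record - stanford_record

print(f"Speedup: {speedup:.2f}x")
print(f"Time saved: {time_saved}")
```

The ratio comes out a little under 2, which matches the article's wording.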

Professor Euan Ashley, who led the research project, said that for most doctors today, sequencing a patient's genome and getting the results back within a few weeks already counts as "fast".

The study was led by Dr. Euan Ashley, professor of medicine, genetics and biomedical data science at Stanford University School of Medicine, in collaboration with NVIDIA, Google and others.

In the study, the team tested accelerated genome sequencing techniques on undiagnosed patients in the intensive care unit of Stanford University Hospital.

From December 2020 to May 2021, a total of 12 patients were recruited for the test. The genome sequencing process is as follows:

Ultra-fast genome sequencing process

Of the 12 patients who provided genetic samples, 5 received a diagnosis on the same day, while the conditions of the rest were judged to have non-genetic causes. The Stanford team's diagnostic rate of about 42 percent was higher than the roughly 30 percent rate usually seen for hard-to-diagnose diseases.
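
The 42 percent figure follows directly from the patient counts. A minimal sketch of that calculation, using only the numbers reported above (the variable names are illustrative):

```python
# Figures reported in the article.
patients_enrolled = 12
patients_diagnosed = 5
comparison_rate = 0.30  # the roughly 30 percent comparison rate quoted above

diagnostic_rate = patients_diagnosed / patients_enrolled
print(f"Diagnostic rate: {diagnostic_rate:.1%}")
print(f"Above comparison rate: {diagnostic_rate > comparison_rate}")
```

With only 12 patients the percentage is a coarse estimate, so the comparison with 30 percent should be read as indicative rather than precise.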

The researchers reached an initial genetic diagnosis in 5 patients, and the shortest time from a blood sample arriving at the laboratory to an initial diagnosis was 7 hours and 18 minutes.

In addition, the 5 patients recovered quickly after receiving treatment guided by their genetic diagnosis.

The dark shaded area at the top represents the 5 patients who received an initial diagnosis; patient 11 took the shortest time

The patients included a 3-month-old infant with epilepsy whose cause could not be found by routine hospital tests. The team found the causative genetic abnormality within 8 hours and 25 minutes of receiving the sample, whereas results from a conventional gene sequencing service requested at the same time did not arrive until two weeks later.

Another was a 13-year-old heart failure patient whose symptoms had been misdiagnosed as COVID-19. Within hours the team identified the genetic variant behind his heart muscle abnormality, allowing him to receive a healthy transplanted heart within 21 days.

One of the authors of the paper, postdoc John Gorzynski, said on Twitter, "This will completely change the way genetic diseases are diagnosed in critically ill patients, bringing the healthcare industry a new standard that was previously only dreamed of."

The cost is as low as 30,000 yuan

After diagnosing Patient 1, the scientists updated the bioinformatics framework to stream the raw signal data to cloud storage in real time and distribute it across multiple cloud machines for near real-time base calling and alignment.

This step reduced post-sequencing runtime by 93%.
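
The general idea of spreading incoming data across several machines can be sketched in a few lines. This is only an illustration of parallel fan-out, not the team's actual pipeline: `process_chunk` is a hypothetical stand-in for base calling and alignment, and the thread pool stands in for separate cloud machines.

```python
from concurrent.futures import ThreadPoolExecutor


def process_chunk(chunk_id: int) -> str:
    """Hypothetical stand-in for base calling and aligning one chunk of raw signal data."""
    return f"chunk-{chunk_id}: aligned"


def process_in_parallel(chunk_ids, workers: int = 4):
    # Each worker plays the role of one cloud machine.
    # map() hands chunks to idle workers and returns results in input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_chunk, chunk_ids))


results = process_in_parallel(range(8))
print(results[0], "...", results[-1])
```

Because the chunks are handled while sequencing is still running, most of the analysis is finished by the time the last data arrives, which is where savings in post-sequencing time of the kind described above come from.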

While processing some of the samples, the team's sequencing throughput reached about 1.8 gigabases (Gb) of data per minute, equivalent to reading one human genome's worth of bases in about 1 minute and 45 seconds, an unprecedented speed.
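
The link between the 1.8 G per minute figure and the time per genome can be checked with simple arithmetic. The sketch below assumes a human genome size of about 3.1 gigabases, a commonly used approximation that does not come from the article.

```python
# Throughput quoted in the article, in gigabases per minute.
throughput_gb_per_min = 1.8

# Assumption (not from the article): one copy of the human genome is about 3.1 gigabases.
genome_size_gb = 3.1

minutes_per_genome = genome_size_gb / throughput_gb_per_min
seconds_per_genome = minutes_per_genome * 60

whole_minutes, remaining_seconds = divmod(round(seconds_per_genome), 60)
print(f"About {whole_minutes} min {remaining_seconds} s per genome-equivalent of data")
```

Under that assumption the result lands within a few seconds of the 1 minute 45 seconds quoted above. Note that this is one genome's worth of raw bases, not a finished genome; a usable result needs each position to be read many times over.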

The time spent on each stage of the diagnostic process in 12 patients

Going faster required new hardware. Stanford's sequencing team used a new machine made by Oxford Nanopore Technologies with 48 sequencing units, known as flow cells.

The Stanford team's new approach was to use all of the machine's flow cells simultaneously to sequence samples from a single patient.

This extreme approach paid off. In fact, the results were almost too good: the data volume of 173 to 236 Gb per hour, the 94% alignment identity, and a peak autosomal coverage of more than 60x (the average number of times each position on the autosomes is read) were enough to overwhelm the computers processing the data.

Sneha Goenka, a graduate student at Stanford University, found a fast solution. Instead of processing the data locally on the sequencer as usual, her approach sends the data directly to a cloud system built on NVIDIA Tensor Core GPUs and Google Cloud.

With cloud computing, the computing power can be scaled up and the data filtered in real time.

Using NVIDIA's Clara Parabricks framework, the researchers then ran a custom decision-tree algorithm that scans the genetic code of the input sample for variants that could cause the disease and assigns weights to them.

NVIDIA's Clara Parabricks provides a GPU-accelerated version of the PEPPER-Margin-DeepVariant pipeline. The pipeline, developed by Google in collaboration with the University of California, Santa Cruz, uses a recurrent neural network algorithm to analyze genetic sequencing data.

Decision tree algorithm process

Finally, the researchers compared the genetic abnormalities of the patient samples with an open database of pathogenic genes to obtain a diagnosis.

Thanks to the improvements in hardware and software, the research team was also able to choose long-read sequencing, a method that used to be more expensive and more difficult.

Traditional gene sequencing cuts the sample DNA into short fragments and then determines the base pairs in each fragment. This approach reduced cost and labor under the constraints of older technology, but it is prone to misreading or missing variants that only show up fully in long DNA sequences.

Long-read sequencing does not require cutting the DNA so finely. Reading long stretches of 10,000 to 100,000 base pairs at a time can improve sequencing accuracy while providing more detailed data on genetic variation.
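
One way to see the difference is to count how many reads are needed to cover a genome at a given depth. The sketch below is a back-of-the-envelope estimate: the 150 bp short-read length, the 30x target coverage and the 3.1 billion base genome size are all assumptions for illustration, while the long-read lengths are the range quoted above.

```python
def reads_needed(read_length_bp: int, coverage: int = 30,
                 genome_size_bp: int = 3_100_000_000) -> int:
    """Number of reads needed to reach the target average coverage (ceiling division)."""
    total_bases = coverage * genome_size_bp
    return -(-total_bases // read_length_bp)


for label, length in [("short read, 150 bp (assumed)", 150),
                      ("long read, 10,000 bp", 10_000),
                      ("long read, 100,000 bp", 100_000)]:
    print(f"{label}: {reads_needed(length):,} reads")
```

Fewer, longer reads mean far fewer pieces to fit back together, which is part of why long reads make large or complex variants easier to see.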

With older technology, long-read sequencing cost far more than traditional sequencing.

Now that both speed and accuracy have improved, how much does the test cost?

The scientists estimated the cost of the method, including DNA extraction, library preparation, sequencing and computation, and found that it ranged from $4,971 to $7,318 (about 30,000 to 46,000 yuan), far lower than previously expected.
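
The yuan figures are a currency conversion of the dollar range. A minimal sketch of that conversion follows; the exchange rate of 6.35 yuan per dollar is an assumption chosen for illustration and is not stated in the article.

```python
# Cost range reported in the article, in US dollars.
cost_usd_low, cost_usd_high = 4971, 7318

# Assumption (not from the article): an exchange rate of about 6.35 yuan per dollar.
cny_per_usd = 6.35

cost_cny_low = cost_usd_low * cny_per_usd
cost_cny_high = cost_usd_high * cny_per_usd
print(f"About {cost_cny_low:,.0f} to {cost_cny_high:,.0f} yuan")
```

At that rate the range is in line with the rounded figures in the text, with the lower bound rounded down.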

As genome sequencing technology continues to advance, the cost of sequencing is plummeting at a rate that outpaces Moore's Law.

The Stanford researchers say they will pilot the technology at Stanford Hospital and Lucile Packard Children's Hospital Stanford, aiming for a turnaround of less than 10 hours for intensive care patients. If it performs as expected, they will try to roll it out more widely.

4th generation gene sequencing technology

The development of genome sequencing technology can be traced back to 1977, and has since gone through more than 40 years of development.

History of genome sequencing technology | Source: Internet

When it comes to DNA sequencing, we have to mention the influential "Human Genome Project".

Readers of our previous articles may remember Eric Lander, the US science adviser who has been in the job for only a year; his greatest achievement is probably his work on mapping the human genome and advancing the Human Genome Project.

In fact, since the early 1990s, the academic community has been involved in the "Human Genome Project".

The Human Genome Project is one of the world's largest scientific megaprojects.

Its purpose was to determine the sequence of the roughly three billion nucleotide base pairs that make up the human genome, so as to map the genome and identify the genes it contains and their sequences, with the ultimate goal of deciphering human genetic information.

In 1990, the Human Genome Project was launched with funding from the U.S. Department of Energy and the National Institutes of Health, and it was expected to be completed within 15 years.

Earlier, in 1988, the international Human Genome Organisation (HUGO) had been established on the initiative of scientists including Victor McKusick, in order to coordinate human genome research across countries.

The development of DNA sequencing technology has undergone 4 major leaps.

First generation: chain termination method

In 1977, Frederick Sanger and colleagues proposed the chain termination method, marking the birth of the first generation of sequencing technology.

They determined the first genome sequence, that of phage φX174, with a total length of 5,375 bases. Since then, humans have been able to peer into the genetic code of life.

The advantages of first-generation sequencing are read lengths of up to 1,000 bp and accuracy as high as 99.999%, but its high cost, long run times and low throughput have seriously limited its use at truly large scale.

Second generation: high-throughput sequencing

Second-generation, high-throughput sequencing was a revolutionary change from the previous generation of Sanger sequencing. It can sequence hundreds of thousands to millions of DNA molecules at a time, and some of the literature refers to it as "next-generation sequencing".

In addition to greatly reducing the cost of sequencing, the second generation of sequencing technology has also greatly improved the sequencing speed and maintained high accuracy.

Sequencing a human genome took 3 years with first-generation technology, whereas it takes only about 1 week with second-generation technology.

James Watson, often called the "father of DNA", received the world's first personal genome map, produced in about two months at a cost of just $2 million.

After that, scientists saw great promise in the development of gene sequencing technology in this direction, so they continued to innovate and invented the third generation of single-molecule sequencing technology.

Third generation: Single-molecule real-time DNA sequencing

The new generation of sequencing technology, represented by PacBio's SMRT technology and Oxford Nanopore Technologies' nanopore single-molecule technology, is called third-generation sequencing.

PacBio Instruments

Single-molecule sequencing technology, which does not require PCR amplification, enables individual sequencing of each DNA molecule. Third-generation sequencing technology is also called de novo sequencing, that is, single-molecule real-time DNA sequencing.

Fourth generation: Nucleotide sequencing

The basic hallmarks of fourth-generation sequencing techniques are the direct determination of single-molecule RNA sequences without cDNA (complementary DNA synthesized on RNA templates), without PCR amplification, and the determination of modified nucleotide sites on single-molecule RNA.

The emergence of the first generation of sequencing technology has enabled human beings to explore the genetic nature of life, and the research of life sciences has entered the era of genome research.

In the more than 40 years so far, gene sequencing technology has been greatly developed from the first generation to the fourth generation.

In the future, efforts to decode DNA sequences ever faster will continue...

Resources:

https://www.zdnet.com/article/stanford-uni-nvidia-use-ai-computing-to-cut-dna-sequencing-down-to-five-hours/
