
Quantum Technology + Life Sciences! BGI has developed a quantum algorithm for polyploid haplotype assembly

Author: BGI

In a jigsaw puzzle, reassembling scattered pieces into a complete picture is a challenging task, and the difficulty grows exponentially as the number of pieces increases.

Genomics faces a similar problem. Each organism carries a unique genetic "blueprint", written in its DNA sequence, that determines its properties and functions.

Advanced sequencing technology gives scientists access to vast amounts of genetic information, but that information typically arrives broken into millions of tiny fragments, like the scrambled pieces of a jigsaw puzzle.

The challenge is to piece these seemingly disordered fragments back together. In bioinformatics, this process is known as genome assembly, and its goal is to reconstruct a complete picture of the genome from scattered sequencing data.

The task is even harder for polyploid organisms, which carry multiple sets of similar chromosomes. Precisely reconstructing the "blueprint" of each chromosome set from a jumble of fragments is extremely complicated. This process is known as polyploid haplotype assembly.

Polyploid haplotype assembly is important for understanding the genetic properties of organisms, revealing susceptibility to disease, predicting drug responsiveness, and exploring the evolutionary history of species. However, because of the computational complexity and the large amount of data involved, completing it accurately has long been a major challenge in bioinformatics.

In response to this challenge, the research team at BGI has developed VRP assembler, a new tool that uses quantum computing technology to solve haplotype assembly problems. Once quantum computing technology matures, the tool is expected to deliver high-quality haplotype assembly more quickly.

The researchers first identified an efficient way to model haplotype assembly, proposing a mathematical model that applies to haploid, diploid and polyploid genome assembly. Using it, they obtained high-precision haplotype assembly results in the human major histocompatibility complex (MHC) region.

The results demonstrate the potential of quantum computing for the future of life sciences research – to inform precision medicine, biodiversity, and evolutionary research by enabling the analysis of complex genomes. The research results were recently published in the international methods journal Cell Reports Methods.


Screenshot from the Cell Reports Methods website

The study explored how quantum computing can perform polyploid haplotype assembly in the following steps.

First, to obtain the complete genome and decode the "blueprint" of life, the sequencing reads produced by the sequencer must be stitched together in the correct order. Haploid organisms are relatively straightforward, because only one set of sequences has to be assembled. Diploid and polyploid organisms are much more complicated: they carry two or more sets of DNA sequences that are similar but not identical, which is like assembling several near-identical puzzles at once from a single pile of pieces.

In polyploid assembly, small differences between alleles may carry important genetic information. To resolve haplotypes, these differences must be carefully distinguished when analyzing sequencing reads, so that the reads are grouped correctly and every read is placed in the right position within the right set of sequences. This process helps scientists accurately uncover how genetic variation affects individual health and disease, which has important implications for precision medicine and personalized treatment. At the same time, the complexity of haplotype assembly is enormous. Given the length and complexity of genetic information, even advanced computing techniques and algorithms often struggle to complete it accurately.

To address this problem, the researchers encoded haplotype assembly using the mathematical model of the vehicle routing problem (VRP). In this model, the goal is to find the best routes for a fleet of "vehicles" that must together visit all the "customers" and eventually return to the starting point. Each "vehicle" represents the DNA sequence of one haplotype, while each "customer" represents a sequencing read in that sequence. By finding the best route plan, VRP assembler is in effect finding the best way to assemble the sequencing reads in the correct order and orientation. The search space is enormous: the number of candidate solutions grows so quickly that exploring and comparing all of them is no easier than counting the atoms in the universe one by one.
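The vehicle-routing view can be illustrated with a toy sketch. It is not the published VRP assembler: the overlap-based score, the `MIN_OVERLAP` threshold, the four example reads and all function names are assumptions made for illustration, read orientation is ignored, and the exhaustive search is only feasible for a handful of reads.

```python
from itertools import permutations, product

MIN_OVERLAP = 3  # assumed threshold: shorter overlaps are treated as noise


def overlap(a, b):
    """Length of the longest suffix of a that is a prefix of b (0 if too short)."""
    for k in range(min(len(a), len(b)), MIN_OVERLAP - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def route_score(route, reads):
    """Total overlap collected by one 'vehicle' visiting its reads in order."""
    return sum(overlap(reads[i], reads[j]) for i, j in zip(route, route[1:]))


def merge(route, reads):
    """Spell out the haplotype sequence implied by a route."""
    seq = reads[route[0]]
    for i, j in zip(route, route[1:]):
        seq += reads[j][overlap(reads[i], reads[j]):]
    return seq


def assemble(reads, ploidy):
    """Try every assignment of reads to vehicles and every visit order."""
    best_score, best_routes = -1, None
    n = len(reads)
    for labels in product(range(ploidy), repeat=n):
        groups = [[i for i in range(n) if labels[i] == v] for v in range(ploidy)]
        if any(not g for g in groups):
            continue  # every haplotype must receive at least one read
        for routes in product(*(permutations(g) for g in groups)):
            score = sum(route_score(r, reads) for r in routes)
            if score > best_score:
                best_score, best_routes = score, routes
    return best_score, sorted(merge(r, reads) for r in best_routes)


# Four toy reads drawn from two haplotypes that differ at a single base.
reads = ["ACGTAC", "TACCTGA", "ACGTTC", "TTCCTGA"]
score, haplotypes = assemble(reads, ploidy=2)
print(score, haplotypes)
```

Here each route is one haplotype and each visited read is a customer. The number of assignments and orderings grows so fast with the number of reads that exhaustive search stops being practical almost immediately, which is why heuristic or quantum optimizers are of interest.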

Quantum computing offers a new way to approach this problem. Quantum computers can represent many possible path combinations at once, and for large-scale optimization and search problems they are expected to approximate the optimal solution far faster than classical methods. This makes them a promising fit for combinatorial optimization problems such as vehicle routing, where the best "customer" grouping and "visit" order must be found among a vast number of combinations, and it is what allows VRP assembler to search for the correct position of each sequencing read.

The research team carried out small-scale haplotype assembly of simulated diploid and triploid genomes on a D-Wave quantum annealing device, a special-purpose quantum computer. Compared with traditional optimization algorithms, the time required was reduced by three orders of magnitude, marking important progress at the intersection of quantum computing and bioinformatics.
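Quantum annealers minimize objectives written as quadratic functions of binary variables (QUBO). As a rough illustration of what such an encoding can look like, the sketch below orders three reads along a single route. The one-hot encoding, the `penalty` weight, the overlap matrix `w` and every function name are assumptions, not the formulation used in the paper, and an exhaustive search over all bit strings stands in for the annealing hardware.

```python
from itertools import product


def build_qubo(w, penalty):
    """QUBO for ordering n reads; x[i*n + p] = 1 means read i sits at position p."""
    n = len(w)
    Q = {}

    def add(u, v, value):
        key = (min(u, v), max(u, v))
        Q[key] = Q.get(key, 0) + value

    # One-hot constraints, expanded from penalty * (1 - sum of x)^2 per row and
    # per column (constant term dropped): each variable appears in two
    # constraints, giving a linear reward of -2 * penalty.
    for i in range(n):
        for p in range(n):
            add(i * n + p, i * n + p, -2 * penalty)
    for i in range(n):
        for p in range(n):
            for q in range(p + 1, n):
                add(i * n + p, i * n + q, 2 * penalty)  # read i in two positions
                add(p * n + i, q * n + i, 2 * penalty)  # two reads in position i
    # Reward the overlap between reads placed at consecutive positions.
    for p in range(n - 1):
        for i in range(n):
            for j in range(n):
                if i != j:
                    add(i * n + p, j * n + p + 1, -w[i][j])
    return Q


def energy(Q, x):
    """Value of the QUBO objective for a bit string x."""
    return sum(c * x[u] * x[v] for (u, v), c in Q.items())


def solve_exhaustively(Q, num_vars):
    """Classical stand-in for the annealer: scan every bit string."""
    return min(product((0, 1), repeat=num_vars), key=lambda x: energy(Q, x))


def decode(x, n):
    """Read the visiting order back out of a valid one-hot solution."""
    return [next(i for i in range(n) if x[i * n + p]) for p in range(n)]


# Assumed overlap scores: w[i][j] is the reward for placing read j right after read i.
w = [[0, 3, 1],
     [0, 0, 3],
     [1, 0, 0]]
Q = build_qubo(w, penalty=10)
best = solve_exhaustively(Q, num_vars=9)
print(decode(best, 3), energy(Q, best))
```

The penalty must be large enough that breaking a one-hot constraint always costs more than any overlap it could gain. Even this three-read example needs nine binary variables, which hints at why only small-scale instances fit on current annealing hardware.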


The D-Wave quantum annealing device completes a small-scale proof of concept on simulated diploid and triploid genomes [1]

To further test the accuracy of the model, the research team used VRP assembler, the new tool developed in this work, to assemble two sequences of about 5 million base pairs in the human MHC region. The mismatch rate was reduced to near the theoretical limit, which is important for identifying genetic variants and understanding how they affect health.

The acceleration offered by quantum computing is expected to let VRP assembler process large amounts of genomic data quickly, providing an efficient and accurate new way to handle large-scale genomic information in the future.


VRP assembler combined with OR-Tools completes high-precision haplotype assembly in the human MHC region [1]

With the continuous development and maturation of quantum technology, a historic opportunity is emerging for the deep integration of quantum computing and life sciences. As a new computing paradigm, quantum computing is expected to break through the computing power limits of the post-Moore era and offer new solutions to the curse of dimensionality in biological data. In the foreseeable future, quantum technology will empower key areas such as bioinformatics processing, disease mechanism exploration, and new drug development, pushing the boundaries of life science research:

  • Quantum bioinformatics: Quantum algorithms are expected to process complex, high-dimensional bioinformatics data (such as genomics, transcriptomics, and proteomics data) efficiently, helping researchers identify disease-related genes more quickly, understand the molecular mechanisms of complex diseases, and support precision medicine.
  • Quantum simulation in biological systems research: Quantum computing is expected to enable the simulation of quantum behavior inside biomolecules and cells at a precision and scale not previously possible. It can also give researchers a deeper understanding of quantum effects in biological processes (for example in brain function and photosynthesis) and advance the study of the underlying mechanisms of biology.
  • Quantum precision measurement in bioassays: Quantum precision measurement can provide accuracy beyond the reach of traditional methods, which is of great value in detecting biomarkers and diagnosing diseases early. Quantum sensors are expected to enable more accurate and sensitive medical detection and to improve the accuracy of disease diagnosis.

This research marks a key step forward in the application of quantum computing in the life sciences. As quantum computing hardware and quantum algorithms continue to advance, quantum technology is expected to have a profound impact on genomics and the entire life science field.

This study was led by BGI Shenzhen and co-authored with BGI Wuhan. Yibo Chen and Junhan Huang are the co-first authors of the paper, and Xun Xu, Yuxiang Li and Daniel Zhang are the co-corresponding authors of the paper. The study has passed the ethical review and strictly follows the corresponding regulations and ethical guidelines.

[1] Chen, Y., Huang, J.H., Sun, Y., Zhang, Y., Li, Y., Xu, X. Haplotype-resolved assembly of diploid and polyploid genomes using quantum computing. Cell Reports Methods 4 (2024).
