Science Cover 6 In a Row: Humanity's Most Complete Genome Sequencing Complete!

Reporting by XinZhiyuan

Editor: Rayyan David

If sequencing the complete human genome is compared to a cloze test, scientists have spent 19 years working on the last 8% of the blanks. Now they can finally hand in a satisfactory answer.

Nearly 40 years of effort by scientists around the world have finally come to fruition!

Science published six cover papers in a single issue, announcing the official completion of the full human genome sequence.

According to Reuters, Science and others, this achievement fills the gaps that remained after decades of work by earlier generations of researchers, offering new hope in the search for clues about disease-causing mutations and genetic variation among the world's 7.9 billion people.

It has been nearly 40 years since humans first discussed genome sequencing programs.

The "progress bar" that stalled 8% short 19 years ago is finally full

In 1984, at a conference funded by the U.S. government to discuss the growing technology of recombinant DNA, scientists first discussed the value of sequencing the human genome.

In 1990, the Human Genome Project was launched with funding from the U.S. Department of Energy and the NIH, and was expected to be completed within 15 years.

In 1999, the Human Genome Center at the Institute of Genetics, Chinese Academy of Sciences, applied to the NIH international Human Genome Project (HGP) to take on 1% of the total sequencing work (about 30 million base pairs).

In April 2003, the Human Genome Project was declared complete. But that "completion" came with a caveat: the project could not sequence all of the DNA in human cells, only the "euchromatin" regions, which make up 92% of the human genome.

The remaining 8%, the "heterochromatin" regions, consists of highly repetitive, tightly packed blocks of DNA that the sequencing techniques of the time could not handle.

Now these previously intractable base pairs have been resolved by a group called the Telomere-to-Telomere Consortium (T2T Consortium).

The latest issue of Science reports on the blockbuster news under the title "Filling the Gaps" and publishes six papers on the details of new sequencing techniques.

Fittingly, telomeres are the DNA repeats at the ends of eukaryotic chromosomes. They sit in typical heterochromatin regions, the very parts that previously could not be sequenced.

The T2T Consortium announced on Twitter that the sequencing of these regions has been officially completed.

Ewan Birney, deputy director of the European Molecular Biology Laboratory, a bioinformatician and former member of the Human Genome Project, said: "I don't think we could have imagined it even 5 years ago."

Previously undecipherable genomic sequences are now clearly visible, including telomeres and centromeres. The latter sit in the middle of each chromosome and help coordinate its replication.

In addition, the short arms of five chromosomes have now been sequenced. These short arms are known to contain dozens of genes that encode the backbone of the ribosome, the "protein factory" of the cell.

Whole-genome sequencing becomes a "cloze test"

When the first draft of the human genome was published in 2001, and even after "completion" was first announced, sequencing techniques could not reach regions where the DNA consists of highly repetitive stretches of bases. In the sequencing results, these repeats were generally skipped, left blank, or rendered incorrectly.

As sequencing technology improved and costs fell, the blanks and errors steadily shrank. By 2017, the human reference genome known as GRCh38 had fewer than 1,000 "blank" gaps, and many regarded it as the benchmark for other human genomes.

From then on, scientists treated the work as a "cloze test."

More and more people joined in, determined to fill in every last blank.

In 2019, Adam Phillippy, a bioinformatician at the National Human Genome Research Institute (NHGRI), reported that the X chromosome had been sequenced from end to end, which inspired dozens of other researchers to join the effort.

"This project really took on a life of its own," said Karen Miga, a geneticist at the University of California, Santa Cruz. She and Phillippy began working together after meeting at a conference.

The T2T team combined multiple sequencing technologies, including a nanopore device that can read 100,000 bases at a time, and another sequencer that is more accurate but can read only 10,000 bases at a time. The researchers upgraded the latter approach to further improve its accuracy.
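To see why read length matters for repetitive DNA, consider a toy illustration. It is only a sketch under simplifying assumptions, not the T2T team's actual assembly method: the sequences, the helper `count_placements`, and the read lengths are all made up for the example, and real reads are thousands of times longer and contain errors. The idea is that a read lying entirely inside a repeat fits in several places, while a read long enough to reach the unique sequence on both sides fits in only one.

```python
def count_placements(genome: str, read: str) -> int:
    """Count start positions where `read` matches `genome` exactly (overlaps allowed)."""
    count = 0
    start = genome.find(read)
    while start != -1:
        count += 1
        start = genome.find(read, start + 1)
    return count


# Hypothetical toy genome: unique left flank, a tandem repeat, unique right flank.
LEFT = "ACGTTGCA"
REPEAT_UNIT = "GATTACA"
RIGHT = "TTGACCGT"
genome = LEFT + REPEAT_UNIT * 6 + RIGHT

# A short read taken from inside the repeat (10 bases).
short_read = genome[len(LEFT) + len(REPEAT_UNIT):][:10]

# A long read covering the whole repeat plus 3 bases of unique flank on each side.
long_start = len(LEFT) - 3
long_read = genome[long_start:long_start + 3 + len(REPEAT_UNIT) * 6 + 3]

print("short read placements:", count_placements(genome, short_read))
print("long read placements:", count_placements(genome, long_read))
```

In this toy case the short read matches at more than one position, so its true origin is ambiguous, whereas the long read matches exactly once because it is anchored in unique sequence at both ends.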

"Look at how many methods they used to solve these problems, and you know how hard it was," said Waterston, a geneticist at the University of Washington who helped lead the Human Genome Project.

In the end, about 200 million base pairs were placed in the correct order and position. They include more than 1,900 genes, most of them copies of known genes. The researchers also catalogued duplicated regions and mobile elements, such as genetic material from viruses that has been integrated into the genome.

The short chromosome arms had more to reveal. As expected, they contain many copies, 400 in total, of the genes that code for ribosomal RNA (rDNA).

Miga said, "rDNA is the last domino." These regions had always been the hardest to sequence.

"It's really amazing that the human genome is so dynamic," said Professor George Church, one of the organizers of the Human Genome Project.

Whose DNA was sequenced this time?

Sequencing work this complete would not be possible without people selflessly contributing their DNA. Leonid Peshkin, a 51-year-old Harvard biologist, is one of them.

The Y chromosome of the sequenced genome came from Peshkin, while the rest of the DNA came from what is known as a molar pregnancy, a relatively rare abnormal growth in the uterus. A molar pregnancy occurs when a sperm enters an egg cell that has no chromosomes of its own.

In this case, the fertilized cell duplicates the sperm's 23 chromosomes, producing two identical sets of chromosomes in a cell that can keep dividing. Urvashi Surti, a geneticist at the University of Pittsburgh, recognized that this property could be a great help to genome sequencing, because the sequencing no longer has to sort out differences between the two parental chromosome sets. She set out to grow a cell line from such tissue.

With permission from her medical center's review board, she removed all information that could link the tissue to the parents. In 2001 she succeeded, drawing on material from studies conducted between 1981 and 2000.

In 2019, the study ran into a potential problem: the National Human Genome Research Institute (NHGRI) required that any sharing of genetic data have the full consent of the donors.

Although Surti and her team had not obtained such consent, the NHGRI ultimately let the study proceed. It reasoned that an exception was justified because most of the sequence was already public.

However, one question remained unresolved: could the identity of the person behind the genome be determined from databases?

The NHGRI's position is that even if the identity could be determined, doing so would be unethical. The person's identity must not be uncovered, even for the purpose of asking permission.

The genome in Surti's study had only an X chromosome and no Y chromosome, so Peshkin's DNA was ultimately added to it. Peshkin and his parents had previously donated tissue for DNA research.

A few months ago, in a call with NIST, Peshkin learned that the T2T team was comprehensively sequencing his X and Y chromosomes. Because of his donation, researchers were able to make extensive use of his DNA. His genome will be the first fully complete human genome ever produced.

Peshkin said, "I am very excited to be involved in this cutting-edge scientific research. Making a small contribution to science is what I ought to do."

Resources:

https://www.science.org/content/article/most-complete-human-genome-yet-reveals-previously-indecipherable-dna

https://www.science.org/toc/science/376/6588

https://www.reuters.com/lifestyle/science/scientists-publish-first-complete-human-genome-2022-03-31/