Interview with Yu Guangchuang, Director of the Department of Bioinformatics at Southern Medical University: No background is absolutely superior, and it is not easy for bioinformaticians to be good "tool people".

Author: Leifeng.com
What sparks fly when bioinformatics and medicine collide?

Professor Yu Guangchuang's scientific research practice gives the answer to this question.

As the director and professor of the Department of Bioinformatics at the School of Basic Medical Sciences of Southern Medical University, Yu Guangchuang focuses on the intersection of biomedicine, mathematics and computer science.

This is no shortcut in research: the breadth of multidisciplinary knowledge it demands has scared off many scholars. For Yu Guangchuang, too, it involved an element of adventure.

His academic path ran from a biotechnology major at South China Agricultural University, to biochemistry and molecular biology at Anhui Medical University, to phylogenetic research at the School of Public Health of the University of Hong Kong, and now to teaching and research at the School of Basic Medical Sciences of Southern Medical University.

But in Yu Guangchuang's words, this is his "specialty" and "a good time".

There was, in fact, a small detour along the way. He originally applied for a master's program at the Shanghai Institute for Biological Sciences, Chinese Academy of Sciences, but missed the re-examination, so he sought a transfer to another institution, and he has had an indissoluble bond with medical universities ever since.

In 2018, a phone call from Professor Li Jinming, the former head of the Department of Bioinformatics at Southern Medical University, brought Yu Guangchuang's research career to the university.

He applied and was hired, became deputy director of the department the following year, and coordinated reforms across the department. The major was selected for the Ministry of Education's "Double Ten Thousand Plan", making the university the only one in the country whose bioinformatics major was selected in the first batch.

Currently, Yu Guangchuang's research focuses on omics data analysis and tool development.

Through multi-omics research, his team has developed the MMINP tool, which predicts metabolomic data, as well as the SVP software package designed for single-cell and spatial transcriptomics studies.

These research results provide a new entry point for the study of microbial ecology and phylogenetic relationships, and show great potential in the fields of disease mechanism exploration and drug target discovery.

Nowadays, "bioinformatician" has become an important label for Yu Guangchuang.

He pointed out that bioinformatics has changed from a supporting role to a leading force in scientific research. With the popularization and cost reduction of high-throughput data, the importance of bioinformatics in data analysis and interpretation has become increasingly prominent.

At the same time, he encourages young scholars to commit to interdisciplinary research: "different disciplinary backgrounds mean different perspectives, and no background is absolutely superior." They should make good use of the perspectives and methods of different disciplines to do distinctive, innovative work.

Recently, the 2024 "2nd Bioinformatics and Translational Medicine Conference" came to an end. With the theme of "Translational Medicine in the Era of Artificial Intelligence", this conference was hosted by Beijing Xieyun Qiyuan Technology Co., Ltd., and Professor Yu Guangchuang was invited to participate.

Taking this opportunity, Leifeng.com had an in-depth dialogue with Professor Yu Guangchuang, and the following is the full text of the conversation (edited):

The indissoluble bond between a bioinformatician and medical universities

Leifeng.com: First of all, could you please share your educational background and research field, what were the main topics at that time, and how did they shape your professional skills and research perspective?

Yu Guangchuang: I majored in biotechnology at South China Agricultural University. It was 2001, the year the Human Genome Project's draft sequence was published, and biotechnology was entering a golden age.

During my undergraduate years, I developed a strong interest in computer science and taught myself the subject. For my master's degree, I wanted to switch to bioinformatics, but because it was a new discipline, few institutions in China offered graduate education in it. In the end, I went to Anhui Medical University, majoring in biochemistry and molecular biology, under the supervision of Professor Qin Yide.

During my master's degree, I was fortunate to be able to pursue bioinformatics research with Professor Bo Xiaochen from the Academy of Military Medical Sciences, and under his guidance I learned programming and data analysis in depth. At that time, we mainly analyzed microarray data; next-generation sequencing had begun to rise, but it was not yet widespread in China. We also explored computational methods based on Gene Ontology semantic similarity measures, which was a relatively new area of research at the time.

After graduating with a master's degree, I worked at the Institute of Life and Health Engineering of Jinan University, mainly conducting proteomics research. There, I was involved in tumor-related proteomic data analysis, focusing on studies at the protein level.

After working for a few years, I realized that holding only a master's degree was limiting in academia, so I decided to pursue a PhD and finally chose the University of Hong Kong. There, I joined the lab of Prof. Yi Guan (School of Public Health, HKU), who made important contributions during the SARS epidemic. Under Professor Guan's supervision, I worked on phylogenetic research related to infectious diseases such as influenza.

It can be said that during my master's and doctoral studies, I was exposed to completely different research topics, and I also realized the need to broaden my thinking and horizons.

Currently, I am at the School of Basic Medical Sciences, Southern Medical University, where I am mainly engaged in omics data analysis and tool development. Although I am not directly engaged in phylogenetic research now, I have been thinking about whether I can explore new methods of omics data analysis from the perspective of phylogeny and microbial ecology.

At present, both metagenomic studies and single-cell and spatial transcriptome studies are measurements of populations, which gives us a new entry point for studying the relationship between microecology and phylogeny.

Leifeng.com: What led you to join the School of Basic Medical Sciences of Southern Medical University?

Yu Guangchuang: After graduating from HKU, I stayed in Prof. Guan's group as a postdoctoral fellow while looking for job opportunities. At that time, Professor Li Jinming, the former head of the Department of Bioinformatics at Southern Medical University, called me and introduced the school and the major. I was impressed by Professor Li's introduction, so I applied and joined in 2018.

In fact, 2018 and 2019 were the peak years for Southern Medical University's recruitment of high-level talent, and not only our school: many other schools were also recruiting vigorously. Over time, and especially with the impact of the epidemic, the school's funding has decreased and recruitment has slowed. Now, schools are more inclined to bring in overseas talent who apply for the relevant programs through the school. I arrived at just the right time.

Since July 2019, I have been the Head of the Department of Bioinformatics, and my job involves more administrative and teaching tasks, mainly serving the students and faculty in the department.

In fact, our undergraduate program was established in 2005, which makes us one of the earliest universities in China to offer such a major. In April 2019, the Ministry of Education launched the "Double Ten Thousand Plan", which aims to build 10,000 "national" first-class undergraduate majors and 10,000 "provincial" first-class undergraduate majors. We have done a lot of work on building the major, including hardware, software, teaching materials and curriculum reform. Our major became one of the first batch of national first-class undergraduate majors, and it was among the first bioinformatics majors to be selected.

I believe that as a teacher, teaching is just as important as research. This not only helps me personally to become a more well-rounded teacher, but also has great significance for educating the next generation and promoting the sustainable development of society and the country.

Leifeng.com: Bioinformatics integrates multiple disciplines, including biomedicine, mathematics, and computer science. Why are you interested in interdisciplinary research?

Yu Guangchuang: This goes back to my undergraduate days. It was then that I stumbled upon a book called "Developing Bioinformatics Computer Skills" in the library. It was my introduction to the field and made me deeply interested in bioinformatics.

From that book, I learned that bioinformatics is an interdisciplinary discipline that integrates biomedicine, mathematics, and computer science.

I was studying biology, at a time when people were very optimistic about the future of biotechnology, and I also had a strong interest in computers, so I thought this was exactly the right combination of my major and my interests.

I was therefore determined to develop in this direction. Although I still knew relatively little at the time, I was full of enthusiasm.

However, it wasn't until I joined Prof. Bo Xiaochen's group as a graduate student that I really started to get in touch with bioinformatics, learn computer and mathematics knowledge, and apply them to biological research.

This gave me real first-hand research experience. Over the years, whether at work or in further study, I have been driven by this interest, and that is very important to me.

Leifeng.com: You just mentioned the Human Genome Project. What changes has this global project brought to the entire research field, and what stages of development has bioinformatics gone through, up to your current research field?

Yu Guangchuang: The Human Genome Project has had a profound impact on modern biomedical research.

Before the project, research focused on cloning individual genes and their products, and progress was relatively slow. Many researchers might spend their entire careers studying a single gene or protein, usually one already known to be associated with a disease. At the time, our understanding of the interactions between genes was very limited. Although there are few genetic differences between humans and mice or chimpanzees, there are significant differences in complexity at the regulatory level.

After the completion of the Human Genome Project, we have obtained a complete human genome sequence and relatively complete annotation information, which has fundamentally changed the research paradigm.

To use a metaphor, research used to be like fishing with a rod, but now it is like casting a net.

Now, we can take a lot of data and let it drive the research to uncover clues that we didn't expect before. This data-driven approach to research has not only changed research strategies, but also accelerated the development of many research fields, including precision medicine.

These advances would not have been possible without the foundations laid by the Human Genome Project.

Being a good "tool person" is not easy

Leifeng.com: What is the focus of your current research, and what are the recent developments?

Yu Guangchuang: My current research focus is on omics data analysis and tool development. We have conducted multi-omics studies and developed software packages including MicrobiotaProcess, MMINP, and SVP, several of them for microbiome research.

For MicrobiotaProcess, we have designed a data structure to manage microbiome data and provide a range of analytical tools.

MMINP predicts metabolomic data from microbiome data. Many people identify gut microbes by metagenomic or 16S rRNA sequencing, but the corresponding metabolomics data are often lacking. We developed this tool to fill that gap and enable a more comprehensive analysis.

In addition, we have developed the SVP software package for single-cell and spatial transcriptomics studies. It characterizes cellular functions at the single-cell level, and on that basis we can identify spatially specific biological functions.

At present, our work is mainly focused on the field of basic research, and there is no specific translational application. But I think these findings have the potential to be translated.

For example, the metabolite information we obtain through prediction could in theory help save on research costs. We can start with an initial exploration using computational methods, and then go deeper with methods such as targeted validation.

Leifeng.com: What are your methodologies when developing biological big data analysis algorithms and software, and how do these tools help researchers better explore and analyze data?

Yu Guangchuang: When developing algorithms and software, we mainly focus on downstream needs and application scenarios.

In the field of bioinformatics, upstream algorithms, such as sequence alignment, typically focus on accuracy, speed, and computational performance, while we focus more on downstream method development and software design. That is, how to combine these techniques with biological needs to provide practical assistance to biologists in discovering molecular mechanisms.

In terms of methodology and learning, I think the most important thing is to target the user community and the software ecosystem. A good ecosystem and community can significantly lower the barrier to entry for development and promote collaboration and complementarity between different software packages.

For example, we developed the clusterProfiler software so that a wide range of researchers can explore molecular mechanisms and use functional enrichment analysis to elucidate how various biological processes and pathways are perturbed.

This kind of analysis is not limited to a specific field. It applies to the study of many diseases and to a wide range of research scenarios, so it has broad uses and a large audience.

Leifeng.com: What is the biggest challenge in this process?

Yu Guangchuang: First, in development I mainly focus on specific application scenarios, and we often run into problems in data analysis for which no suitable tool is at hand.

Second, in the current era of big data, one of the main challenges we face is computing power, but computing power is not always readily available.

For example, U.S. restrictions on graphics card exports to China have limited many research efforts. In addition, many large IT companies are conducting similar research, and universities often cannot match them in hardware resources.

Third, the complexity of the problem is increasing, and teamwork is becoming more and more important. While teamwork can be a key factor in solving these problems, it's a challenge in itself.

Leifeng.com: What are your current collaborative projects, and who are your partners?

Yu Guangchuang: I currently have a collaborative project with a director of obstetrics and gynecology (Ningbo University, Chen Xia), where we are studying the relationship between gut microbiota and polycystic ovary syndrome.

In this project, we collected a large amount of metagenomic and metabolomic data.

When it comes to studying gut microbes, most previous studies have focused on bacteria. But I would like to approach this question from the perspective of bacteriophages, because phages can infect bacteria and regulate their function, as well as affect the ecology of the entire microbial community.

By analyzing metagenomic data, we hope to explore the relationship between bacteriophages and host bacteria. In addition, we collected some samples ourselves, captured the interactions between bacteria and phages using specific techniques, and performed corresponding sequence analysis.

Another collaborative project is with neurobiologists (Southern Medical University, Cao Xiong, and Tao Tao), where we used a mouse model of depression for spatial transcriptome studies.

We performed spatial transcriptome sequencing at five different locations in representative regions of the mouse brain, hoping to use these data to identify molecular mechanisms and signaling pathways associated with depression. Spatial transcriptome technology enables in situ measurement of cells, which makes it very promising for neuroscience research.

This technology is relatively new, and we are currently collaborating and exploring in this area.

Time will tell

Leifeng.com: What are some of the most influential papers you have published in journals such as The Innovation, Gut Microbes, Molecular Biology and Evolution, and what are the long-term implications of these research results for the biomedical field?

Yu Guangchuang: If you want to talk about impact, I think our most impactful work is the clusterProfiler tool that I mentioned earlier.

Its first version was published in 2012, more than a decade ago. In 2021, we published a new version in the journal The Innovation. The tool is widely used, with more than 25,000 citations so far, and it has had an impact on research in our field.

In fact, many students and researchers have told me that the first thing they encountered when they started learning bioinformatics was the toolkit I developed, because it is relatively simple to use and gives beginners quick feedback. Once the analysis is complete, users get a number of visualizations, which helps them understand the results immediately.

In addition, another work that I think has a big impact is the phylogeny-related research that I started during my PhD.

We have developed a series of software packages that not only integrate and visualize phylogenetic data, but also help researchers interpret and map various data onto phylogenetic trees. With the development of experimental techniques, we now have more and more high-throughput data. Mapping these data or analysis results onto a phylogenetic tree can help us discover new or unexpected evolutionary patterns.

This work was published in three articles in the journal Molecular Biology and Evolution. The earliest article appeared in Methods in Ecology and Evolution in 2017 and was later selected as one of the "Ten Masterpieces" when the journal celebrated its 10th anniversary.

I have also written a book in English about this work, published by CRC Press. The book was later translated into Chinese and published by the Electronic Industry Publishing House in China. It was well received by readers and was sold out on JD.com for a time.

These are arguably my most impactful results. The impact of many research results takes time to verify. A piece of work may not be considered particularly good at first, but if more and more people use it over time, that is a sign that it has stood the test of time.

Leifeng.com: In addition to developing these tools, do you also do database development work?

Yu Guangchuang: We have not developed databases ourselves. Although databases play a very important role in bioinformatics, they are not the focus of our research.

Of course, the establishment of databases may be the focus of some researchers, who may present their results by collecting data and publishing articles. But there's a phenomenon that a lot of people develop tools or databases in order to get published, and once the article is published, they don't continue to invest in it.

However, I believe that the real value of a database lies in the fact that it can continue to accumulate data resources and promote the research progress of researchers' own topics.

Leifeng.com: You have been named among the world's highly cited researchers, the world's top 2% of scientists, and China's highly cited researchers. Can you talk about a time when your research was widely recognized?

Yu Guangchuang: I am full of affection for the tools I have developed, and I keep maintaining and updating them. For example, the tool I mentioned earlier was maintained and updated for nine years, from the publication of the article in 2012 to the release of a new version in 2021.

People can see this kind of long-term maintenance and updating, and it has built a certain reputation. As time went on, people recognized my work more and more, so it is also a process of accumulation. When people generally recognize your work and are willing to use the methods and tools you develop, the number of citations naturally increases. It is because of everyone's recognition and support that I have been fortunate enough to be selected for these lists of highly cited researchers.

Leifeng.com: I have written a series of articles on the development of bioinformatics over the past 30 years, and some professors mentioned that bioinformatics researchers used to be in an awkward position, playing an auxiliary role rather than leading a project. Has that changed?

Yu Guangchuang: The situation is indeed gradually improving.

In the past, our role was more of a supporting one because we didn't directly produce data. When other research groups or colleagues in basic and clinical research generated data that they could not analyze themselves, they came to us to collaborate, and we basically became assistants.

Moreover, bioinformatics data analysis has its own challenges, including what I just mentioned: sometimes we need to develop our own tools to solve a problem, which is not easy and requires a certain professional background and research experience.

Our colleagues in basic or clinical research may sometimes underestimate our contributions, believing that we are just "tool people" who run programs, so our contributions may seem relatively small to them. This may be due to the limits of their own understanding, which make it difficult for them to accurately assess a collaborator's contribution. This situation used to put bioinformatics researchers in an awkward position.

But now, things are getting better. Our generation may face a little less challenge than our predecessors.

First, data is now more accessible. Much of the data generated by large-scale projects is publicly available, and we can conduct research based on it.

Second, the cost of generating data keeps falling. Data generation used to be expensive, so the researchers who generated the data felt that they were the important ones. But now, as the cost of high-throughput methods decreases and it becomes easier for us to generate data ourselves, bioinformatics is becoming more in demand and more important for analyzing and interpreting data.

In addition, we can find good research questions or important findings in a data-driven way. We can then find collaborators to test our hypotheses and findings, so that we lead the research to some extent.

So in general, as biological big data becomes widespread, more and more researchers are realizing the importance of bioinformatics. It is not an auxiliary discipline but an independent one, its role in leading research will become more and more obvious, and its recognition will gradually increase.

Leifeng.com: What new trends or breakthroughs do you think will emerge in this field in the next few years?

Yu Guangchuang: In terms of planning, I think one of the key themes at the moment is artificial intelligence.

This is an unavoidable trend of the times. We don't expect AI to completely replace or disrupt existing methods, but at the very least it can empower us to solve more problems.

In the application scenarios of bioinformatics, the application of artificial intelligence will definitely increase. As we all know, AI has already begun to play a role in areas such as protein structure prediction, with the potential to play an even greater role in translational research.

Although my team and I are not researchers in the field of artificial intelligence, we must embrace artificial intelligence, and my plan is to explore the integration points with artificial intelligence in the field we are good at.

Leifeng.com: Regarding artificial intelligence, have you and your team been using relevant technologies before?

Yu Guangchuang: We mainly use traditional machine learning methods. As for deep learning, we haven't touched on it too much before. However, in spatial transcriptome analysis, we are trying to leverage deep learning techniques.

When we do spatial transcriptome measurements, we're talking about spatial information, but we're actually dealing with two-dimensional tissue sections. We are trying to reconstruct this data into 3D structures using deep learning techniques, and there is some exploration underway in this area.

Leifeng.com: What is your personal experience with interdisciplinary collaboration, and do you have any advice for young scholars?

Yu Guangchuang: I think the key to the experience of interdisciplinary cooperation is to communicate more and exchange more. This is because there may be language and conceptual barriers when people from different academic backgrounds communicate. Sometimes I don't understand what you're saying, and you can't understand what I'm saying. Increased communication leads to a better understanding of each other's needs and goals.

In addition, interdisciplinary communication can also break down disciplinary boundaries and broaden ideas. Whether it is in a collaborative project or attending an academic conference, listening to other people's reports can broaden your horizons and thinking.

My advice to young scholars is that interdisciplinary learning is really not easy, and as my master's supervisor said, you need to be prepared to put in the extra effort.

But that doesn't mean you need to wait until you have mastered the basics of all the relevant disciplines before you start working. That approach is unrealistic, because it is difficult to grasp all of that knowledge in its entirety, and trying to do so may lead you away from your research topic. Instead, be project-driven and learn by doing.

Of course, interdisciplinarity also has its advantages. Different disciplinary backgrounds mean different perspectives, and no one background is absolutely superior. If you can make good use of your disciplinary background and find the right entry point, you can produce distinctive and innovative work.

Leifeng.com: Are you still mentoring students, and what is their main professional background?

Yu Guangchuang: Yes, I do mentor students. Most of my current students are majoring in bioinformatics. Many of them studied bioinformatics as undergraduates, since our school itself offers this major.

In addition, there are some students from the biological fields such as biotechnology and biopharmaceuticals.

Relatively few of our students come from computer science. Because we are a medical university, students who are more interested in biomedical fields are the ones most inclined to choose us.

Not many of my students have graduated yet. Some have gone abroad for further study, some work as researchers in hospitals, and some have joined companies to work on bioinformatics technology development and data analysis.

Therefore, their career direction is usually related to the field of bioinformatics, whether in universities, hospitals or companies.

The author of this article, Wu Tong, has long followed artificial intelligence, the life sciences, and front-line technology workers. Colleagues are welcome to get in touch on WeChat: icedaguniang
