Microbiome-Metagenomic Analysis Technical Symposium (2024.5)

Announcement: To meet trainees' learning needs, the Yishengxin training team has decided, after discussion and preparation, to run its technical seminars on 16S amplicon analysis, metagenomics, clinical genomics, and transcriptome analysis both online (live broadcast) and offline (in person). If you sign up for an online live class, you get access to the course videos for self-study at any time, and you may also attend one offline session of the same course within 1 year. We look forward to meeting you both online and offline.

Courses currently scheduled:

Amplicon analysis, online/offline: 2024/4/12-14
Metagenomics, online/offline: 2024/5/17-19
Clinical genomics, online/offline: date to be determined
Transcriptome analysis, online/offline: date to be determined
Registration link: http://www.ehbio.com/Training/
By popular demand, "Shengxin Baodian" and "Metagenome" are jointly holding the 21st "Metagenomic Analysis" symposium in Beijing or Shenzhen, with online and offline classes running at the same time. The symposium offers a shortcut into bioinformatics and gives peers a chance to learn and exchange ideas about metagenomic analysis, so that students truly understand the principles of the analysis and can complete it in practice. It uses our original four-stage teaching model (3 days of intensive teaching, 2 weeks of self-practice, a focused Q&A session, and then class videos for review and repeated practice). The four stages, teach, practice, answer, and apply, work together so that you can analyze big data independently.

To learn why bioinformatics analysis matters, read "Bioinformatics 9-Day Crash Course: Becoming an Integral Part of Your Team". Bioinformatics analysis is inseparable from programming, but this part is not as hard as you might imagine: as long as you follow along with the exercises, you will understand it. See "Program Learning Experience in Biological Information" for details.

Course Description:

Please read the course description carefully. If you are already proficient in everything listed below, you do not need to take this training.

The Metagenomic Analysis course is the advanced follow-up to the Amplicon Analysis course. It focuses on shotgun metagenomic data analysis and on running pipelines under Linux. If you are new to microbiome analysis and want to learn plotting and 16S/ITS amplicon analysis, please register for the Amplicon Analysis Symposium.

Metagenomics/microbiome research is one of the hottest research fields in the world today. To strengthen technical exchange and dissemination in this field and to promote the development of China's microbiome program, young researchers of the Chinese Academy of Sciences established the "Metagenome" public account, with the goal of creating a platform for exchanging practical, substantive techniques and ideas. Since its founding in July 2017, it has shared 3,200+ original articles on professional techniques, with 160,000+ followers and 40 million+ reads.

To meet readers' needs for further learning, we are now cooperating with "Shengxin Baodian" to organize a special training course on metagenomics, where you can learn and exchange metagenomic analysis techniques in more depth, get started quickly, save valuable time, and produce research results sooner.

This course lasts 3 days, with 6 lessons per day for a total of 18 lessons, all of which combine theory and practice (everything taught is analysis that you can learn and carry out yourself). The course covers Linux and R basics; building a metagenomics analysis platform on a Linux server; common statistical analysis software on Windows; interpreting and producing data analysis charts; the standard reference-based (read-based) workflow, suitable for human and animal gut samples and the like; the standard reference-free (de novo assembly) workflow, suitable for plant and environmental samples and the like; binning (recovering single bacterial genomes); statistical analysis; various advanced analyses (concatenated multi-gene phylogenetic trees, drawing and beautifying network diagrams, comparing network properties, machine learning, etc.); and editing and laying out figures to the standard of CNS (Cell, Nature, Science) journals. In 3 days, experienced instructors will take you along a rugged road that could take 3 months or even 3 years of self-study, helping you truly carry out metagenomic analysis and tailor the analysis plan to the background of your own project.

Course outline

Each lesson covers one topic in 1 hour and combines theory with hands-on practice, so that you learn the principles and carry out the operations yourself. All of it is generously shared experience and code accumulated by the instructors over many years. The course schedule follows. In the lesson number, the first digit is the day and the second is the lesson: 11 is the first lesson on the first day, 26 is the sixth lesson on the second day, and 41 is the online video Q&A session held two weeks later.

Number	Topic	Brief introduction
11	Linux basics	Introduction, remote login, file transfer, common commands
12	Linux software installation	Conda installation and configuration, metagenomic software installation, database download
13	Windows software installation	Git, R, RStudio, R packages, STAMP, Adobe Illustrator (AI), etc.
14	Chart interpretation	Commonly used analysis charts in articles, their meanings, and usage scenarios
15	R basics	History, applications in biology, plotting with ggplot2
16	Visualization	Data preparation and online plotting of 16 types of charts
21	Introduction to metagenomics	Development history, scope of application of common technologies, and analysis strategies
22	Reference-based metagenomics: quality control	FastQC, Trimmomatic, MultiQC, KneadData quality control, parallel computing with parallel
23	Species and functional composition	MetaPhlAn2 species composition, HUMAnN2 functional composition, linking functions to the species that contribute them
24	Comparison and visualization of species and functional differences	GraPhlAn, LEfSe, STAMP, statistics in R
25	Pre-publication preparation	Figure layout, data release, code organization (optional)
26	Network drawing	Basics, igraph, Gephi
31	Species annotation and visualization	Kraken, Kraken2, GraPhlAn, Krona, microbiomeViz, metacoder
32	Assembly, gene annotation, and quantification	MEGAHIT, metaSPAdes, QUAST, Prokka, CD-HIT, Salmon
33	Gene function annotation	KEGG, COG/eggNOG, CAZy/dbCAN2, ARDB/Resfams/CARD, UniRef, VFDB, TCDB
34	Binning	Principles, MetaWRAP, VizBin
35	Bacterial genome evolution	Extracting conserved genes from bins, multi-gene phylogenetic trees, understanding phylogenetic trees, Evolview basics, advanced iTOL beautification, antiSMASH
36	Course wrap-up	Metagenomic analysis routines: review and summary
37	50-question test	Self-assessment of learning outcomes and review of key points
41	Q&A (online)	Q&A and walkthrough of the test content

A brief introduction to the tutorial is as follows:

1. Building the analysis platform

As the saying goes, "to do a good job, you must first sharpen your tools": without your own analysis platform, you cannot analyze big data. Metagenomic datasets are huge, and even the early processing of raw data fresh off the sequencer is demanding. Fortunately, most universities, research institutes, and research groups now have their own servers, and those without one can rent cloud services such as Alibaba Cloud and Tencent Cloud. Once the hardware is available, turning the server into a powerful tool for metagenomic analysis is a complex, specialized task, and this course teaches you how to do it right away!

Figure 1. Setup of the metagenomic analysis pipeline – system, installation method, and main software

Ubuntu is recommended for the server. The minimum configuration is 32 GB of memory and 8 cores; 256 GB of memory and at least 24 threads are recommended.

A computer without software is just a pile of scrap metal, and a server without a metagenomic analysis system is of no use for your data analysis. If you want to build a complete metagenomic analysis pipeline yourself, the resources on the Internet are scattered and scarce. The team will share the software choices and setup techniques it has refined over years of experience, along with all the source code, so that you can quickly deploy the dozens of common programs and hundreds of R and Python packages that the pipeline depends on to mainstream Linux server systems (Ubuntu 16.04/18.04, CentOS 7, and other mainstream distributions), and easily have a professional analysis platform.

Figure 2. Yishengxin's statistical analysis and visualization workflow, optimized for Win10, turns a laptop into a big data analysis platform in seconds

Windows 10 is recommended; with 8 GB of memory, analysis is faster and smoother.

The "big data" of high-throughput sequencing is large in its raw form and during analysis, but the results are not. Metagenomic analysis usually produces tables of species composition and functional composition per sample. These tables are the starting point for downstream, advanced, and customized analyses, and most of that work can be done on a laptop, but many people do not know how to get started.

In fact, your personal computer is a powerful tool for statistical analysis of data tables (abundance matrices). The third lesson shows you how to set up a statistical analysis and visualization platform for data tables on your own laptop. The setup is optimized and tested on Win10, the most widely used system, so that your laptop becomes a data analysis and visualization platform in seconds.
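As a minimal sketch of the kind of table work that fits comfortably on a laptop, the Python snippet below converts a small count table into relative abundances (percentages per sample). The function name, the nested-dictionary layout, and the toy counts are assumptions made for illustration; they are not part of the course materials.

```python
def to_relative_abundance(table):
    """Convert per-sample counts to percentages.

    `table` maps sample name -> {taxon: count} (hypothetical layout).
    Each sample is scaled so that its taxa sum to 100.
    """
    result = {}
    for sample, counts in table.items():
        total = sum(counts.values())
        if total == 0:
            raise ValueError(f"sample {sample!r} has no counts")
        result[sample] = {
            taxon: 100.0 * count / total for taxon, count in counts.items()
        }
    return result


# Toy data, for illustration only.
toy_counts = {
    "S1": {"Bacteroides": 30, "Prevotella": 10},
    "S2": {"Bacteroides": 5, "Prevotella": 15},
}
relative = to_relative_abundance(toy_counts)
```

Scaling each sample to the same total makes samples with different sequencing depths comparable before any plotting or statistics.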

We will also show you how to configure the whole analysis and visualization platform on Linux. (Mac is much like Linux, although some software may be installed differently; it has not been tested in depth and is not recommended for the training.)

2. Bioinformatics fundamentals

With a bioinformatics analysis platform in place, using it flexibly still takes some special skills. The most valuable asset of the 21st century is talent, and mastering three languages will make you indispensable in any team. The three languages are Chinese, English, and a programming language. Chinese is used every day. By the doctoral stage, everyone has studied English for at least 10 years and can use it to read and write papers. Everyone also studied a programming language in college, such as Visual Basic, Visual FoxPro, or C, but few can apply it in their work. Moreover, these languages are inefficient for life science work and are not the ones we recommend learning.

The three most commonly used languages are Shell, R, and Python/Perl; the first two are the basics that ensure you can complete a project analysis. In this course, we also explain the Shell and R fundamentals that biologists need, so that you can use the metagenomic analysis platform efficiently and reliably and acquire the skills needed for big data analysis, visualization, and publication. A learning video is provided at the end of the article for previewing in advance.

Figure 3. Shell and R learning syllabus. Our original approach uses point-and-click RStudio for both Shell scripting and R analysis, which opens the door to bioinformatics without adding to a biologist's time cost

After investing a few hours to walk through the door of big data analysis and visualization, you will discover a whole new world. Many people wish they had found it sooner, fall in love with analysis, and move into the fast lane from then on. Even if you are not interested in programming, the ideas you learn here will benefit you for a lifetime and let you achieve more with less effort in your future analyses. Besides, even elementary school students are learning Python now, and those who do not learn risk being left behind.

3. Chart interpretation and drawing topics

Many researchers lack a systematic bioinformatics background, do not understand the analysis charts in articles, and are at a loss when drawing various charts. For them, we have published the following two series, 16 original articles in total, which explain 8 kinds of charts and how to draw them in R.

Amplicon Chart Interpretation: Understanding the Idea of the Article
Amplicon Statistical Plotting: Aiming for High-Impact Articles

But these are just the beginning. During the training, we draw on the high-level articles we have published to further explain the principles and scope of the 16 commonly used analysis charts, so that you can not only read the charts but also apply them to your own research and produce the plots yourself with ease.

To reduce the time cost of learning to plot in R, the Yishengxin team has developed a free plotting website for the 16 commonly used charts. It produces a chart with one click, and you can customize the style by adjusting parameters with the mouse.

Figure 4. The meaning, usage scenarios, and drawing of the 16 commonly used charts. All of them can be drawn with our online plotting tool.

To bring statistical figures up to publication standard, a dedicated Adobe Illustrator (AI) course on figure editing and layout explains the basic techniques and helps you grasp the essentials, so that your figure panels are on par with those in CNS journals and you can become the figure editing and assembly expert of your laboratory.

Figure 5. Example of subfigures assembled in AI into a CNS publication-grade multi-panel figure (Science, 2016 cover article)

4. Overview of metagenomics

After laying a broad foundation for research on the first day, we begin our journey into metagenomic big data analysis.

As professional background, we will cover the following.

Background: international microbiome programs and the China Microbiome Project
Research objects: humans, animals, plants, and the environment
Research methods: culturomics, amplicon sequencing, metagenomics, metatranscriptomics, metaproteomics, metabolomics, metagenome-wide association analysis, meta-epigenomics, and more
Research hotspots in metagenomics: culturomics, gut bacteria and disease, metagenome-wide association studies (MWAS), joint multi-omics analysis, and more
History and principles of sequencing
Sample preparation, experimental replicates, and choosing the sequencing data volume
Common analysis routines in metagenomic SCI articles
Advantages and disadvantages of metagenomics versus amplicon sequencing
Evaluating raw data and judging the quality of assembly results
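
Judging assembly quality, the last topic in the list above, usually starts from contig-length statistics such as N50, which assembly evaluation tools like QUAST report. The Python sketch below computes N50 by hand, assuming the contig lengths have already been extracted from an assembly; the function name and the toy lengths are illustrative only.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("no contigs given")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


# Toy assembly: total length 300, so half is 150.
# 100 alone is not enough; 100 + 80 = 180 reaches half, so N50 is 80.
toy_n50 = n50([100, 80, 60, 40, 20])
```

A higher N50 generally indicates a more contiguous assembly, but it should be read together with total length and the number of contigs.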

Figure 6. Common methods in microbiome research: the scientific questions that amplicon, metagenomic, and metatranscriptomic sequencing can answer

5. Reference-based metagenomic analysis workflow

If you do not know where to start, we recommend running a reference-based analysis first to quickly obtain the species composition and functional composition of your samples. The reference-based method, as the name suggests, uses existing annotation databases of species and functional genes directly: after only quality control and alignment, the data yield relative abundance matrices of the corresponding species and functional genes. This approach is also highly regarded in a recent review by Rob Knight, a leading analyst in the field; see "Nature Review | Rob Knight and others will teach you how to analyze microbiota data (18,000 words translated in full text)".

This method has clear advantages: it has few steps, runs fast, and saves time and effort. It suits fields with good reference databases, such as the human gut, model organisms, and the ocean. The disadvantage is that it cannot identify the functional genes of unreported species, so a lot of information is lost when analyzing samples from plants, soils, and extreme environments.

Figure 7. The basic idea of metagenomic analysis: the reference-based workflow. Species composition is obtained mainly from MetaPhlAn2, based on all reported microbial genomes, and functional composition is determined from protein databases such as UniRef, eggNOG, and KEGG. 16S amplicon data contain only species composition; a KEGG/COG functional composition can be predicted from them with PICRUSt.

Key knowledge:

1. Principles of experimental design

2. KneadData workflow: rapid quality control and host sequence removal

3. Species composition with MetaPhlAn2

4. Functional composition quantification with HUMAnN2
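
The relative abundance matrix that the reference-based workflow produces is, in essence, a merge of per-sample profiles into one taxon-by-sample table. The Python sketch below shows that merge, assuming each sample's profile has already been parsed into a dictionary of taxon to relative abundance. The function name and toy values are hypothetical, and this is not the merging code shipped with any of the tools named above.

```python
def merge_profiles(profiles):
    """Merge per-sample profiles into one taxon-by-sample matrix.

    `profiles` maps sample name -> {taxon: relative abundance}.
    Taxa missing from a sample are filled with 0.0.
    Returns (header, rows), both sorted by name for reproducibility.
    """
    samples = sorted(profiles)
    taxa = sorted({taxon for profile in profiles.values() for taxon in profile})
    header = ["taxon"] + samples
    rows = [
        [taxon] + [profiles[sample].get(taxon, 0.0) for sample in samples]
        for taxon in taxa
    ]
    return header, rows


# Toy profiles, for illustration only.
toy_profiles = {
    "A": {"s__Escherichia_coli": 60.0, "s__Bacteroides_fragilis": 40.0},
    "B": {"s__Escherichia_coli": 100.0},
}
header, rows = merge_profiles(toy_profiles)
```

The resulting table is the kind of abundance matrix used as the starting point for the downstream statistics and plots.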

6. Reference-free metagenomic analysis workflow

Reference-free metagenomic analysis has two main purposes: one is to capture unannotated species and genes, and the other is to recover the genomes of new species through binning. It sounds great, but in practice it is very computationally demanding, and the workflow has more steps, such as assembly, gene prediction, construction of a non-redundant gene set, and gene annotation.

Figure 8. Reference-free metagenomic analysis workflow.

Key steps and the software used:

Data quality control: FastQC, Trimmomatic, MultiQC, khmer
Assembly with MEGAHIT and evaluation with QUAST
Gene annotation: Prokka
Building a non-redundant gene set: CD-HIT
Gene abundance estimation: Salmon and similar tools quickly quantify gene abundance. Overall differences between groups can then be compared with ordination methods such as PCA, PCoA, and CCA, and differential features between groups can be identified with edgeR, MetaStat, and LEfSe.
Species annotation: species annotations can be obtained for the non-redundant gene set, or reads can be annotated directly with Kraken2; combined with the gene abundances estimated above, this supports species comparisons between groups.
Gene function annotation: metabolic pathways (KEGG) and orthologous groups (eggNOG), combined with the gene abundances estimated above, for comparing functional differences between groups.
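
PCoA, mentioned in the abundance step above, starts from a sample-by-sample distance matrix. The Python sketch below computes Bray-Curtis dissimilarity, one commonly used choice; the text does not name a distance, so that choice is an assumption, as are the function name and the toy abundances.

```python
def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity between two abundance profiles.

    Each profile maps feature (gene or taxon) -> non-negative abundance.
    Returns 0.0 for identical profiles and 1.0 for profiles that share
    no features.
    """
    features = set(sample_a) | set(sample_b)
    difference = sum(
        abs(sample_a.get(f, 0) - sample_b.get(f, 0)) for f in features
    )
    total = sum(sample_a.get(f, 0) + sample_b.get(f, 0) for f in features)
    if total == 0:
        raise ValueError("both profiles are empty")
    return difference / total


# Toy gene abundances, for illustration only.
control = {"geneA": 6, "geneB": 4}
treated = {"geneA": 4, "geneB": 6}
distance = bray_curtis(control, treated)  # |6-4| + |4-6| = 4, over a total of 20
```

Computing this value for every pair of samples gives the distance matrix that an ordination method then projects into two or three dimensions.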

Figure 9. Metatranscriptome analysis pipeline. Compared with the metagenome workflow, the metatranscriptome workflow has one extra step: removing rRNA sequences. The disadvantage of this method is that it does not reveal the true species composition; instead, it reflects the species that are active, and the expression levels of functional genes, under specific spatiotemporal conditions.

7. Advanced analysis and visualization practice

Reproducible statistics and plotting in R
Identifying single-organism genomes in metagenomes (binning): MetaWRAP
Bin evaluation and visualization: CheckM, VizBin
Metagenome visualization: Circos
Online pipelines: MEGAN, MG-RAST, EBI Metagenomics
Network analysis: igraph, WGCNA, Cytoscape
Concatenated multi-gene tree construction: RAxML, FastTree, iTOL
Other common tools: GraPhlAn, Krona
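
The multi-gene tree in the list above is built from a concatenated alignment: the aligned sequences of several conserved genes are joined end to end for each genome. The Python sketch below shows that concatenation step, assuming each gene has already been aligned separately. The function name, the dictionary layout, and the toy sequences are hypothetical.

```python
def concatenate_alignments(gene_alignments):
    """Join per-gene alignments into one supermatrix.

    `gene_alignments` is a list of dicts, one per gene, each mapping
    genome name -> aligned sequence (equal length within a gene).
    A genome missing from a gene is padded with gap characters.
    """
    genomes = sorted({name for aln in gene_alignments for name in aln})
    parts = {name: [] for name in genomes}
    for index, alignment in enumerate(gene_alignments):
        widths = {len(seq) for seq in alignment.values()}
        if len(widths) != 1:
            raise ValueError(f"gene {index} is empty or not aligned")
        width = widths.pop()
        for name in genomes:
            parts[name].append(alignment.get(name, "-" * width))
    return {name: "".join(chunks) for name, chunks in parts.items()}


# Toy alignments of two genes across two bins; bin2 lacks the second gene.
toy_genes = [
    {"bin1": "ATG-", "bin2": "ATGC"},
    {"bin1": "GG"},
]
supermatrix = concatenate_alignments(toy_genes)
```

The concatenated sequences can then be written to a FASTA file and passed to a tree builder such as those listed above.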

Figure 10. Visualization of metagenomic gene composition, abundance, coverage, and other information

Figure 11. Phylogenetic tree construction and beautification based on concatenated multi-gene alignments (Levy-2018-NatureGenetics)

What do you gain by completing this course?

A deep understanding of the fundamentals of sequencing data

A comprehensive solution for three modes of metagenomic analysis, plus statistical analysis of the results

16S amplicon data: predicting the metagenome with PICRUSt
Metagenomic data: quantifying species and functions with HUMAnN2
De novo metagenome assembly and binning

Experience with dozens of software tools and databases

Tutorials on installing and using dozens of tools in the field
Understanding and using common functional annotation databases

High-standard visualization of results

Comparison of differences in results
Multiple visualization options

Instructor

The lecturers are front-line technical experts in this field from the Institute of Microbiology, the Institute of Genetics and Developmental Biology, the Institute of Genomics, the Institute of Biophysics, the Chinese Academy of Agricultural Sciences, Tsinghua University, Peking University, China Agricultural University, and other institutions.

Yongxin Liu, Ph.D. in bioinformatics, is a researcher, doctoral supervisor, executive editor-in-chief of the journal iMeta, and founder of the "Metagenome" public account. His research interests include microbiome method development, food microbiome function, and science communication. He has published 60+ papers in journals such as Science, Nature Biotechnology, and Cell Host & Microbe as first author (including co-first author) or as the lead for microbiome data analysis, with 16,000+ citations, and has been listed among the world's top 2% most-cited researchers. He is a participant in the QIIME 2 microbiome analysis platform project. He has been invited to publish reviews of microbiome research methods in Protein & Cell (x3), Current Opinion in Microbiology, Genetics, and other journals as first and/or corresponding author (including co-authorship). In July 2017, he founded the "Metagenome" public account, which has shared more than 3,200 original articles in this field, including "Microbiome Chart Interpretation, Analysis Process and Statistical Drawing" and the "QIIME2 Chinese Tutorial", with 160,000+ followers and 40 million+ reads.

Tong Chen, Ph.D., graduated from the Institute of Genetics and Developmental Biology of the Chinese Academy of Sciences in 2015 with a doctorate in bioinformatics, has published articles in high-level journals such as Cell Stem Cell, Nature Communications, Nucleic Acids Research (x5), and Protein & Cell as first or main author, and runs the WeChat public account "Shengxin Baodian", which has tens of thousands of followers and offers a different experience of learning bioinformatics.

Highlights from previous sessions

Trainees come mainly from universities and research institutes in mainland China, along with researchers from major companies such as Moutai, Wuliangye, Angel Yeast, and Huawei. Overseas Chinese from Europe, Australia, the United States, Canada, New Zealand, Singapore, Thailand, and elsewhere have also flown to Beijing or joined the seminars online.

Teaching assistant team

More than 10 PhDs and current doctoral students from the Chinese Academy of Sciences, Tsinghua University, and Peking University serve in rotation as lecturers and teaching assistants, helping students learn and fill gaps during the training.

Mode of delivery

This course focuses on explaining workflows and hands-on practice, and uses our original four-stage teaching model:

Stage 1: 3 days of intensive teaching;
Stage 2: 2 weeks of self-practice;
Stage 3: online live Q&A;
Stage 4: training videos for continued learning.
Together, the four stages of teaching, practice, Q&A, and application form one coordinated whole.

Training time

From 9 a.m. to 6 p.m. every day, semi-closed teaching (the last hour is a roundtable discussion for more interaction; the last day ends slightly earlier to leave more time for discussion and for participants to travel home).

Check-in time: on the day of class

Venue of the lesson

Online classes: an online meeting platform such as Tencent Meeting.

Offline location: C1105, Block C, Caizhi International Building, No. 18 Zhongguancun East Road, Haidian District, Beijing. If the course is held in Shenzhen, the location will be announced separately.

Conference registration fee

For details, please see the registration website: http://www.ehbio.com/Training/
Places are limited, and registration for each course closes automatically once 40 people have registered
Internship or job opportunities at eHanbo Genomics are available

Course Benefits:

Seats are assigned from front to back in the order of successful registration and payment (or prepayment)
A complimentary programming fundamentals course (http://bioinfo.ke.qq.com)
When N people (1 < N < 10) register as a group and pay at the same time, each person receives a discount of (N-1) × 100 yuan (capped at 500 yuan)
A free Kingston USB flash drive (32 GB, loaded with training data and scripts)
Share the enrollment information with a recommendation on WeChat Moments and send a screenshot to [email protected] to receive a 200 yuan Tencent Classroom course coupon (which can be split across multiple courses)
We also offer a number of related courses, with discounts for enrolling in more than one! We recommend Amplicon (introductory) followed by Metagenome (advanced) to take your analysis to a higher level.

Notes

Please bring your own laptop; Win10 with at least 4 GB of memory is recommended (8 GB preferred). A cloud computing platform will be provided for course practice as needed
All data and documents of the training course are internal materials for reference only and may not be redistributed without permission
Audio and video recording is prohibited during classes
Trainees who have paid but cannot attend because of an urgent matter may apply to defer to a later session or apply for a refund
Refund requests made at least 2 weeks before the course starts receive an 85% refund; requests made at least 3 working days before the course starts receive a 70% refund (if an invoice has already been issued, the corresponding handling fee will be charged)
No extension and refund are permitted


Yishengxin also offers a number of related courses, with discounts for enrolling in multiple courses and for group registration!

1. Multi-course discount: when you enroll in n courses, each course is discounted by (n-1) × 100 yuan.

The multi-course discount does not apply to the first course; it accumulates over the following courses, that is, the discount starts with the second course, after the first has been taken.

2. Returning-student discount: 100 yuan off the second course, 200 yuan off the third course, and so on, up to 500 yuan.

3. Group discount: the discount is (number of group members - 1) × 100 yuan (if some members withdraw before payment, the discount is calculated from the actual number of participants).

4. The final price after discounts is never lower than 4,000 yuan. Offers change dynamically, and the price calculated by the system prevails.
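
As a worked example of the arithmetic, the Python sketch below applies the group discount, (number of members - 1) × 100 yuan capped at 500 yuan, and the 4,000 yuan price floor. The list prices used are hypothetical, since the actual fee is given only on the registration website, and the registration system's own calculation prevails.

```python
def group_discount(members):
    """Per-person group discount in yuan: (members - 1) * 100, capped at 500."""
    if members < 1:
        raise ValueError("a group needs at least one member")
    return min((members - 1) * 100, 500)


def final_price(list_price, discount, floor=4000):
    """Apply a discount without letting the price drop below the floor.

    If the list price is already below the floor, it is left unchanged.
    """
    return max(list_price - discount, min(list_price, floor))


# Hypothetical list prices, for illustration only.
three_people = final_price(5000, group_discount(3))  # 5000 - 200
nine_people = final_price(4300, group_discount(9))   # 800 capped to 500, then floored
```

With these assumed prices, a group of three pays 4,800 yuan each, while a group of nine hits both the 500 yuan cap and the 4,000 yuan floor.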

These discounts can be combined with group registration! We recommend taking Amplicon (beginner) and then Metagenome (advanced) in order. We wish you ever better analyses and hope you become indispensable in your lab. Sign up now!