Fudan University researchers build the most comprehensive global microbial gene catalog to date; results published in Nature

Microorganisms are everywhere on Earth. They live on human skin, in the gut, and in soil, rivers, oceans and other environments, where they form complex microbial communities. Traditional microbiome research studies each habitat separately, which makes it impossible to describe from a global perspective how microbial communities in different habitats are connected.

Luis Pedro Coelho, a young researcher at the Institute of Science and Technology for Brain-Inspired Intelligence of Fudan University, together with Professor Zhao Xingming and Professor Emeritus Peer Bork, worked with scientists from Germany, Spain, the United States, the United Kingdom and other countries. Treating microorganisms from different habitats on Earth as a single system, the team used artificial intelligence techniques to mine about 13,000 public metagenomic samples and built the most comprehensive global microbial gene catalog to date, an important step for global microbiome research. The results were published as a full-length article in Nature in the early morning of December 16, 2021.

【The most comprehensive global microbial gene catalog to date】

Gene catalogs are important for describing the species composition and functional characteristics of microbial communities. Since the European Molecular Biology Laboratory and BGI constructed the first human gut microbial gene catalog in 2010, successive microbial gene catalogs have provided important clues for the study of human physiology and disease.

The Global Microbial Gene Catalog covers 14 major microbial habitats, including the gut, mouth, skin, ocean and soil. It draws on 13,174 publicly available high-quality metagenomes and 84,029 high-quality genomes, yielding 303 million species-level genes. It is the most comprehensive global microbial gene catalog to date and is expected to contribute to both Earth ecology research and human health research.

【Revealing the important association between microbial genes and habitat environment】

The study found that most genes are habitat-specific, which is consistent with the tendency of microbes to adapt to their environment. Only 5.8% of the species-level gene clusters are found in multiple habitats, and these multi-habitat genes are mainly enriched in antibiotic resistance genes and mobile genetic elements.

This finding matters for understanding how antibiotic resistance emerges and for the future development of antimicrobial drugs.

【Multidisciplinary International Research Team】

The biomedical artificial intelligence team of the Institute of Science and Technology for Brain-Inspired Intelligence of Fudan University works at the intersection of artificial intelligence and biomedicine, and has brought in a group of outstanding scholars from China and abroad since 2018. In recent years the team has developed a series of artificial intelligence algorithms tailored to the characteristics of biomedical big data, and has applied them successfully to the gut-brain axis, brain development and brain diseases. In 2020, the team won first prize in the Wu Wenjun Artificial Intelligence Natural Science Award.

Coelho, the first author and a co-corresponding author of the paper, joined Fudan full-time in 2018. Professor Zhao Xingming, who leads the institute's biomedical artificial intelligence team, said that cutting-edge science increasingly crosses disciplinary boundaries and requires a global vision. As a next step, the team will build on the gene catalog and work with research institutes and clinical institutions in China and abroad to explore how microorganisms, including human gut microbes, affect human health, brain cognition and behavior.

Architecture Design | Distributed System Scheduling: Zookeeper Cluster Management

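I. Architecture Overview

1. Basic introduction

Zookeeper is a component designed around the observer pattern. In distributed system architectures it is mainly used for unified naming services, unified configuration management, unified cluster management, tracking server nodes as they go online and offline, and soft load balancing.

  • Single-node installation of Zookeeper on Linux
  • Integrating the Zookeeper middleware with SpringBoot

2. Cluster election

A Zookeeper cluster works on a majority mechanism: the cluster is available as long as more than half of its machines are alive. For this reason it is recommended to deploy a Zookeeper cluster on an odd number of servers. The cluster configuration file does not designate a Master or a Slave. While Zookeeper is running, one node acts as Leader and the others act as Followers; the Leader is chosen temporarily through an internal election mechanism.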
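The majority rule can be made concrete with a little arithmetic. The following is a minimal sketch in plain Java with no Zookeeper dependency; the class and method names are hypothetical and exist only for this illustration. It shows why an even-sized cluster tolerates no more failures than the odd-sized cluster one server smaller.

```java
// Illustrative helper only: QuorumMath is a hypothetical name, not a Zookeeper class.
final class QuorumMath {

    // Smallest number of live servers that is strictly more than half.
    static int quorumSize(int ensembleSize) {
        return ensembleSize / 2 + 1;
    }

    // Number of servers that may fail while a majority is still alive.
    static int tolerableFailures(int ensembleSize) {
        return ensembleSize - quorumSize(ensembleSize);
    }

    public static void main(String[] args) {
        for (int n = 3; n <= 6; n++) {
            System.out.println(n + " servers: quorum=" + quorumSize(n)
                    + ", tolerable failures=" + tolerableFailures(n));
        }
    }
}
```

Three servers and four servers both tolerate a single failure, so the fourth machine adds cost without adding availability.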

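Basic description

Suppose a Zookeeper cluster consists of three servers whose myid values are 1 to 3. If the servers are started in that order, server2 is elected as the Leader node.

server1 starts and runs an election. Server 1 votes for itself. It now holds one vote, which is short of a majority (2 votes), so the election cannot complete and server 1 stays in the LOOKING state;

server2 starts and another election runs. Servers 1 and 2 each vote for themselves and exchange their ballots. Because the two freshly started servers differ only in myid and server 2's myid is larger, server 1 changes its vote to server 2. Server 1 now has 0 votes and server 2 has 2 votes, which is a majority, so the election completes: server 1 becomes a follower and server 2 becomes the leader. The cluster is now available, and server 3 joins directly as a follower when it starts.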
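The walk-through can be replayed with a small simulation. This is a deliberately simplified sketch, not Zookeeper's actual election code: it assumes every server starts with the same (empty) data, so ballots are compared by myid only, and all names in it are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of the start-up election described above.
// Assumption: ballots are compared by myid only (fresh servers, identical data).
final class StartupElectionSketch {

    // Returns the myid of the elected leader once a majority is reached
    // while servers start in the given order, or -1 if no majority exists yet.
    static int electLeader(int ensembleSize, int[] startOrder) {
        int quorum = ensembleSize / 2 + 1;
        List<Integer> running = new ArrayList<>();
        for (int myid : startOrder) {
            running.add(myid);
            // After exchanging ballots, every running server votes for the largest myid seen.
            int candidate = 0;
            for (int id : running) {
                candidate = Math.max(candidate, id);
            }
            // All running servers agree on the candidate, so it holds running.size() votes.
            if (running.size() >= quorum) {
                return candidate; // servers started later simply join as followers
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println("leader: " + electLeader(3, new int[] {1, 2, 3}));
    }
}
```

With three servers started in the order 1, 2, 3, the majority is reached as soon as server 2 is up, which matches the outcome described above.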

II. Cluster Configuration

1. Create the configuration directories

# mkdir -p /data/zookeeper/data
# mkdir -p /data/zookeeper/logs           

2. Basic configuration

# vim /opt/zookeeper-3.4.14/conf/zoo.cfg

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/data/zookeeper/data
dataLogDir=/data/zookeeper/logs
clientPort=2181           
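initLimit and syncLimit are counted in ticks rather than milliseconds, so the effective timeouts depend on tickTime. The sketch below, in plain Java with hypothetical class and method names, converts them and also shows the session-timeout range a server accepts, assuming minSessionTimeout and maxSessionTimeout are left at their defaults of 2 and 20 ticks.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper for reading timing values out of zoo.cfg-style text.
final class ZooCfgTimings {

    // zoo.cfg uses key=value lines, which java.util.Properties can read directly.
    static Properties parse(String cfgText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(cfgText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    private static long longValue(Properties props, String key) {
        return Long.parseLong(props.getProperty(key).trim());
    }

    // initLimit and syncLimit are expressed in ticks; multiply by tickTime for milliseconds.
    static long limitMillis(Properties props, String limitKey) {
        return longValue(props, "tickTime") * longValue(props, limitKey);
    }

    // Assumes the default bounds: minimum 2 ticks, maximum 20 ticks.
    static long negotiatedSessionTimeout(Properties props, long requestedMillis) {
        long tick = longValue(props, "tickTime");
        return Math.max(2 * tick, Math.min(20 * tick, requestedMillis));
    }

    public static void main(String[] args) {
        Properties cfg = parse("tickTime=2000\ninitLimit=10\nsyncLimit=5\n");
        System.out.println("initLimit ms: " + limitMillis(cfg, "initLimit"));
        System.out.println("syncLimit ms: " + limitMillis(cfg, "syncLimit"));
        System.out.println("session for 3000 ms request: " + negotiatedSessionTimeout(cfg, 3000));
    }
}
```

With the values above, followers get 20 seconds to connect and sync with the leader and 10 seconds before they are considered out of sync. Under the stated defaults, a client that asks for a 3000 ms session would be given 4000 ms.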

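3. Single-node configuration

# vim /data/zookeeper/data/myid

With three nodes, write 1, 2 and 3 into the myid file of the respective node.

4. Cluster servers

Add the following to the zoo.cfg configuration file on every node: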

server.1=192.168.72.133:2888:3888
server.2=192.168.72.136:2888:3888
server.3=192.168.72.137:2888:3888           
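Each server line has the form server.<myid>=<host>:<port1>:<port2>. The number after "server." must match the value in that node's myid file, the first port (2888 here) is used by followers to connect to the leader, and the second port (3888 here) is used for leader election. The sketch below parses this 3.4-style line format; the ServerLine class is hypothetical and is not part of Zookeeper.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical value class for one "server.N=host:port:port" line (3.4-style format only).
final class ServerLine {
    private static final Pattern FORMAT =
            Pattern.compile("server\\.(\\d+)=([^:]+):(\\d+):(\\d+)");

    final int myid;          // must equal the content of the node's myid file
    final String host;
    final int quorumPort;    // followers connect to the leader on this port
    final int electionPort;  // used for leader election

    private ServerLine(int myid, String host, int quorumPort, int electionPort) {
        this.myid = myid;
        this.host = host;
        this.quorumPort = quorumPort;
        this.electionPort = electionPort;
    }

    static ServerLine parse(String line) {
        Matcher m = FORMAT.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a server line: " + line);
        }
        return new ServerLine(Integer.parseInt(m.group(1)), m.group(2),
                Integer.parseInt(m.group(3)), Integer.parseInt(m.group(4)));
    }

    public static void main(String[] args) {
        ServerLine s = parse("server.1=192.168.72.133:2888:3888");
        System.out.println(s.myid + " " + s.host + " " + s.quorumPort + " " + s.electionPort);
    }
}
```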

5. Start the cluster

Start the Zookeeper service on each of the three servers.

[zookeeper-3.4.14]# bin/zkServer.sh start
Starting zookeeper ... STARTED           

6. Check the cluster status

Mode: leader marks the Master node

Mode: follower marks a Slave node

[zookeeper-3.4.14]# bin/zkServer.sh status
Mode: leader           

7. Test the cluster

Log in to the client on any one of the servers, create a test node, then check it from the other servers.

[zookeeper-3.4.14 bin]# ./zkCli.sh
[zk: 0] create /node-test01 node-test01  
Created /node-test01
[zk: 1] get /node-test01           

Alternatively, stop the leader node:

[zookeeper-3.4.14 bin]# ./zkServer.sh stop

The remaining nodes then elect a new leader.

8. Unified access through Nginx

[nginx-1.15.2 conf]# vim nginx.conf

stream {
    upstream zkcluster {
        server 192.168.72.133:2181;
        server 192.168.72.136:2181;
        server 192.168.72.137:2181;
    }
    server {
        listen 2181;
        proxy_pass zkcluster;
    }
}           

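III. Service Node Monitoring

1. Basic principle

In a distributed system there can be several master nodes, and they can go online and offline dynamically. Any client can detect in real time when a master server goes online or offline.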

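Process:

  • Start the Zookeeper cluster;
  • RegisterServer simulates server-side registration;
  • ClientServer simulates client-side monitoring;
  • Run the server-side registration three times, registering the zk-node service of a different node each time;
  • Shut down the registered servers one after another to simulate services going offline;
  • Check the client log, which shows the changes in the service nodes;

First create a node, serverList, to hold the server list.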

[zk: 0] create /serverList "serverList"            

2. Server-side registration

package com.zkper.cluster.monitor;

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.ZooDefs.Ids;

public class RegisterServer {

    private ZooKeeper zk;
    // Client addresses of the three cluster nodes configured in zoo.cfg
    private static final String connectString = "192.168.72.133:2181,192.168.72.136:2181,192.168.72.137:2181";
    private static final int sessionTimeout = 3000;
    private static final String parentNode = "/serverList";

    // Open a session to the cluster; the server side does not react to events
    private void getConnect() throws IOException {
        zk = new ZooKeeper(connectString, sessionTimeout, new Watcher() {
            @Override
            public void process(WatchedEvent event) {
            }
        });
    }

    // Register this server as an ephemeral sequential child of /serverList.
    // The node disappears automatically when the session ends.
    private void registerServer(String nodeName) throws Exception {
        String create = zk.create(parentNode + "/server", nodeName.getBytes(StandardCharsets.UTF_8),
                                  Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
        System.out.println(nodeName + " online: " + create);
    }

    // Keep the process alive so the session, and with it the ephemeral node, stays open
    private void working() throws Exception {
        Thread.sleep(Long.MAX_VALUE);
    }

    public static void main(String[] args) throws Exception {
        RegisterServer server = new RegisterServer();
        server.getConnect();
        // Run the program three times, registering a different node each time,
        // then stop the server processes one by one and watch the client output
        // server.registerServer("zk-node-133");
        // server.registerServer("zk-node-136");
        server.registerServer("zk-node-137");
        server.working();
    }
}
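Because the node is created with CreateMode.EPHEMERAL_SEQUENTIAL, Zookeeper appends a ten-digit, zero-padded counter to the requested path, so the registered nodes get names such as /serverList/server0000000003. The sketch below only mimics that naming rule in plain Java to make it visible; the class and method names are hypothetical, and the real counter is assigned by the Zookeeper server, not by the client.

```java
// Hypothetical helper that mimics the naming of sequential znodes; not a Zookeeper API.
final class SequentialNodeNames {

    // Zookeeper formats the sequence counter as ten zero-padded digits.
    static String nodePath(String requestedPath, int counter) {
        return String.format("%s%010d", requestedPath, counter);
    }

    // Recover the counter from the last ten characters of a sequential node path.
    static int counterOf(String nodePath) {
        return Integer.parseInt(nodePath.substring(nodePath.length() - 10));
    }

    public static void main(String[] args) {
        String path = nodePath("/serverList/server", 3);
        System.out.println(path + " -> " + counterOf(path));
    }
}
```

The node name therefore does not identify the server. The example above stores the readable name (for instance zk-node-137) in the node's data, and the client reads it back with getData.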

3. Client-side monitoring

package com.zkper.cluster.monitor;

import org.apache.zookeeper.*;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class ClientServer {
    private ZooKeeper zk;
    // Client addresses of the three cluster nodes configured in zoo.cfg
    private static final String connectString = "192.168.72.133:2181,192.168.72.136:2181,192.168.72.137:2181";
    private static final int sessionTimeout = 3000;
    private static final String parentNode = "/serverList";

    private void getConnect() throws IOException {
        zk = new ZooKeeper(connectString, sessionTimeout, new Watcher() {
            @Override
            public void process(WatchedEvent event) {
                // Connection-state events also arrive here; refresh the list
                // only when the children of /serverList have changed
                if (event.getType() != Watcher.Event.EventType.NodeChildrenChanged) {
                    return;
                }
                try {
                    // Read the list of online servers again
                    getServerList();
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        });
    }

    private void getServerList() throws Exception {
        // Passing true re-registers the default watcher; a watch fires only once
        List<String> children = zk.getChildren(parentNode, true);
        List<String> servers = new ArrayList<>();
        for (String child : children) {
            // The node data holds the readable server name written at registration
            byte[] data = zk.getData(parentNode + "/" + child, false, null);
            servers.add(new String(data, StandardCharsets.UTF_8));
        }
        System.out.println("Current server list: " + servers);
    }

    // Keep the process alive so that watch events keep arriving
    private void working() throws Exception {
        Thread.sleep(Long.MAX_VALUE);
    }

    public static void main(String[] args) throws Exception {
        ClientServer client = new ClientServer();
        client.getConnect();
        client.getServerList();
        client.working();
    }

}
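A child watch only reports that the children of a node changed, not which child was added or removed. To log which server went online or offline, a client can compare the previous list with the new one. The following is a minimal sketch in plain Java with hypothetical names; in the client above it would be fed with the server lists from two consecutive reads.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: compares two snapshots of the server list.
final class ServerListDiff {

    // Servers present in the new snapshot but not in the old one.
    static Set<String> wentOnline(List<String> before, List<String> after) {
        Set<String> result = new TreeSet<>(after);
        result.removeAll(before);
        return result;
    }

    // Servers present in the old snapshot but missing from the new one.
    static Set<String> wentOffline(List<String> before, List<String> after) {
        Set<String> result = new TreeSet<>(before);
        result.removeAll(after);
        return result;
    }

    public static void main(String[] args) {
        List<String> before = Arrays.asList("zk-node-133", "zk-node-136");
        List<String> after = Arrays.asList("zk-node-136", "zk-node-137");
        System.out.println("online: " + wentOnline(before, after));
        System.out.println("offline: " + wentOffline(before, after));
    }
}
```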

IV. Source Code Address

GitHub address
https://github.com/cicadasmile/data-manage-parent
GitEE address
https://gitee.com/cicadasmile/data-manage-parent           