Big Data Study Notes 2: How HDFS Works, with Source Code Analysis

2023-07-02 07:51:50

Configuring Hadoop on Windows

namenode

Responding to a client request to upload a file:

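The client asks to upload a file; the namenode checks its metadata to see whether the path requested by the client already exists.
The namenode returns the datanodes that are available.
The client connects directly to the first datanode and uploads the first block, and the datanode reports the block information to the namenode. A pipeline is set up for the block, and copies of the block are passed down the chain of datanodes until the configured replica count is reached.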
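
The three steps above can be sketched as a small in-memory model. This is only an illustration under simplifying assumptions: ToyNameNode, ToyDataNode and ToyClient are hypothetical names, not Hadoop classes, and the real protocol also handles packets, acknowledgements and failures, none of which is modelled here.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a datanode: stores blocks and forwards them down the chain.
class ToyDataNode {
    final String name;
    final List<String> blocks = new ArrayList<>();

    ToyDataNode(String name) {
        this.name = name;
    }

    // Store the block locally, then pass it to the next node in the pipeline.
    // Returns how many replicas were written from this node onward.
    int receive(String blockId, List<ToyDataNode> downstream) {
        blocks.add(blockId);
        if (downstream.isEmpty()) {
            return 1;
        }
        ToyDataNode next = downstream.get(0);
        return 1 + next.receive(blockId, downstream.subList(1, downstream.size()));
    }
}

// Hypothetical stand-in for the namenode: tracks known paths and picks target datanodes.
class ToyNameNode {
    private final Set<String> paths = new HashSet<>();
    private final List<ToyDataNode> dataNodes;

    ToyNameNode(List<ToyDataNode> dataNodes) {
        this.dataNodes = dataNodes;
    }

    // Steps 1 and 2: reject a path that already exists, otherwise return target datanodes.
    // (Recording the path at request time is a simplification of this toy model.)
    List<ToyDataNode> requestUpload(String path, int replication) {
        if (!paths.add(path)) {
            throw new IllegalStateException("path already exists: " + path);
        }
        return dataNodes.subList(0, Math.min(replication, dataNodes.size()));
    }
}

class ToyClient {
    // Step 3: the client talks only to the first datanode; the chain does the rest.
    static int upload(ToyNameNode nn, String path, String blockId, int replication) {
        List<ToyDataNode> targets = nn.requestUpload(path, replication);
        if (targets.isEmpty()) {
            return 0;
        }
        return targets.get(0).receive(blockId, targets.subList(1, targets.size()));
    }
}
```

With four toy datanodes and a replica count of 3, a call such as ToyClient.upload(nn, "/a.txt", "blk_1", 3) leaves the block on the first three nodes only, and a second upload to the same path is rejected by the namenode check.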

The namenode writes the metadata

The secondary namenode synchronizes the changes and updates the fsimage

checkpoint
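
As a rough illustration of what a checkpoint does, the sketch below replays a log of edits onto a namespace snapshot and then clears the log. It assumes a deliberately tiny model: ToyCheckpoint is a hypothetical class, the "fsimage" is just a set of paths, and the "ADD"/"DEL" edit format is invented for this example rather than taken from Hadoop.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical toy model of fsimage plus edit log; not the Hadoop on-disk format.
class ToyCheckpoint {
    // Snapshot of the namespace, reduced here to a sorted set of paths.
    final TreeSet<String> fsimage = new TreeSet<>();
    // Operations recorded since the last snapshot, e.g. "ADD /a" or "DEL /a".
    final List<String> edits = new ArrayList<>();

    void logEdit(String op, String path) {
        edits.add(op + " " + path);
    }

    // Replay every logged edit onto the snapshot, then start over with an empty log.
    // Returns the number of edits that were applied.
    int checkpoint() {
        int applied = 0;
        for (String edit : edits) {
            String[] parts = edit.split(" ", 2);
            if (parts[0].equals("ADD")) {
                fsimage.add(parts[1]);
            } else if (parts[0].equals("DEL")) {
                fsimage.remove(parts[1]);
            }
            applied++;
        }
        edits.clear();
        return applied;
    }
}
```

The point of the design is that the snapshot alone is enough to restore the namespace after a checkpoint, so the log that must be replayed on restart stays short.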

Communication between nodes:

How a FileSystem is obtained

FileSystem.get(new URI(HDFS_PATH), new Configuration()); // get the file system object
CACHE.get(uri, conf) // look it up in the cache Map
fs = createFileSystem(uri, conf); // create a new fs
clazz = getFileSystemClass(uri.getScheme(), conf); // get the fs class
ReflectionUtils.newInstance(clazz, conf) // instantiate the fs
fs.initialize(uri, conf); // initialize the fs parameters
dfs = new DFSClient(uri, conf, statistics) // get the dfs client
proxyInfo = NameNodeProxies.createProxyWithLossyRetryHandler(conf, nameNodeUri, ClientProtocol.class, numResponseToDrop) // obtain, via RPC, the client-side proxy object used to talk to the NN
this.namenode = proxyInfo.getProxy() // get the namenode proxy object

fs refers to a DistributedFileSystem (dfs); dfs holds a DFSClient object (dfsc); and dfsc holds the namenode proxy object.
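
The lookup order traced above (cache first, then class lookup by scheme, reflective instantiation, and initialization) and the resulting chain of references can be imitated in a few lines. This is a sketch only: every Toy* name is hypothetical, the cache key here is just scheme plus authority, and a lambda stands in for the RPC proxy that the real client obtains.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the RPC protocol interface spoken with the namenode.
interface ToyNameNodeProtocol {
    String ping();
}

// Hypothetical stand-in for the client object that holds the namenode proxy.
class ToyDfsClient {
    final ToyNameNodeProtocol namenode;

    ToyDfsClient(URI uri) {
        // A real client would build an RPC proxy here; a lambda plays that role.
        this.namenode = () -> "pong from " + uri.getAuthority();
    }
}

abstract class ToyFileSystem {
    abstract void initialize(URI uri);
}

class ToyDistributedFileSystem extends ToyFileSystem {
    ToyDfsClient dfs;

    // No-argument constructor so the class can be instantiated reflectively.
    ToyDistributedFileSystem() {
    }

    @Override
    void initialize(URI uri) {
        this.dfs = new ToyDfsClient(uri);
    }
}

class ToyFileSystems {
    private static final Map<String, ToyFileSystem> CACHE = new HashMap<>();
    private static final Map<String, Class<? extends ToyFileSystem>> SCHEMES = new HashMap<>();

    static {
        SCHEMES.put("hdfs", ToyDistributedFileSystem.class);
    }

    static synchronized ToyFileSystem get(URI uri) {
        String key = uri.getScheme() + "://" + uri.getAuthority();
        ToyFileSystem fs = CACHE.get(key);            // 1. try the cache
        if (fs != null) {
            return fs;
        }
        Class<? extends ToyFileSystem> clazz = SCHEMES.get(uri.getScheme()); // 2. class by scheme
        if (clazz == null) {
            throw new IllegalArgumentException("no file system for scheme: " + uri.getScheme());
        }
        try {
            fs = clazz.getDeclaredConstructor().newInstance();               // 3. instantiate
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        fs.initialize(uri);                                                  // 4. initialize
        CACHE.put(key, fs);
        return fs;
    }
}
```

Two calls with URIs that share the same scheme and authority return the same cached instance, and the namenode stand-in is reachable through fs, then dfs, then namenode, mirroring the chain described above.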
