Collecting Cluster Statistics with the Hadoop RESTful API

2021-11-08 08:09:06

(Applies to Hadoop 2.7 and later)

ResourceManager REST APIs:

<a href="https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/resourcemanagerrest.html">https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/resourcemanagerrest.html</a>

WebHDFS REST API:

<a href="https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/webhdfs.html">https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/webhdfs.html</a>

MapReduce History Server REST APIs:

<a href="https://hadoop.apache.org/docs/stable/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/historyserverrest.html">https://hadoop.apache.org/docs/stable/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/historyserverrest.html</a>

Spark Monitoring and Instrumentation:

<a href="http://spark.apache.org/docs/latest/monitoring.html">http://spark.apache.org/docs/latest/monitoring.html</a>

URL:

<a href="http://emr-header-1:50070/webhdfs/v1/?user.name=hadoop&op=getcontentsummary">http://emr-header-1:50070/webhdfs/v1/?user.name=hadoop&op=getcontentsummary</a>

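Response:

Notes on the response:

Pay attention to the relationship between length and spaceConsumed, which depends on the HDFS replication factor: length is the logical size of the data, while spaceConsumed counts every replica, so it is roughly length multiplied by the replication factor.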
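To make the relationship concrete, here is a minimal sketch that divides spaceConsumed by length to estimate the effective replication of a directory. The function name `effective_replication` and the numbers in `sample` are made up for illustration; they are not output from a real cluster.

```python
def effective_replication(summary):
    """Return spaceConsumed / length for a ContentSummary dict, or None if empty."""
    length = summary["length"]
    if length == 0:
        return None  # empty directory: the ratio is undefined
    return summary["spaceConsumed"] / length


# Hypothetical numbers for illustration only, not output from a real cluster.
sample = {"ContentSummary": {"length": 1024, "spaceConsumed": 3072}}
print(effective_replication(sample["ContentSummary"]))  # prints 3.0
```

A ratio close to 3 would be consistent with the default replication factor; files written with a different replication setting pull the ratio away from it.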

To collect usage statistics for each group's working directory, use a request like the following:

<a href="http://emr-header-1:50070/webhdfs/v1/user/feed_aliyun?user.name=hadoop&op=getcontentsummary">http://emr-header-1:50070/webhdfs/v1/user/feed_aliyun?user.name=hadoop&op=getcontentsummary</a>
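The same request can be repeated for several directories. The sketch below builds the URL for each path and collects length and spaceConsumed. The helper names and the `fetch` parameter are assumptions for illustration: `fetch` is any callable that takes a URL and returns the decoded JSON body, which keeps the HTTP client out of the example.

```python
from urllib.parse import quote, urlencode


def content_summary_url(namenode, path, user="hadoop"):
    """Build a WebHDFS GETCONTENTSUMMARY URL for an absolute HDFS path."""
    query = urlencode({"user.name": user, "op": "GETCONTENTSUMMARY"})
    return f"http://{namenode}/webhdfs/v1{quote(path)}?{query}"


def usage_by_directory(namenode, paths, fetch):
    """Map each path to (length, spaceConsumed) using the supplied fetch callable."""
    report = {}
    for path in paths:
        summary = fetch(content_summary_url(namenode, path))["ContentSummary"]
        report[path] = (summary["length"], summary["spaceConsumed"])
    return report
```

With the standard library, `fetch` could be as simple as `lambda url: json.load(urllib.request.urlopen(url))`, and `paths` would be the list of group directories such as `/user/feed_aliyun`.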

<a href="http://emr-header-1:8088/ws/v1/cluster">http://emr-header-1:8088/ws/v1/cluster</a>

Response:

<a href="http://emr-header-1:8088/ws/v1/cluster/scheduler">http://emr-header-1:8088/ws/v1/cluster/scheduler</a>

For a detailed description of the parameters, see:

<a href="https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/resourcemanagerrest.html#cluster_application_queue_api">https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/resourcemanagerrest.html#cluster_application_queue_api</a>
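As a rough sketch of how the scheduler response can be turned into per-queue numbers, the code below walks the queue tree and records absoluteUsedCapacity for every queue. It assumes the Capacity Scheduler layout, where child queues are nested under `queues.queue`; the Fair Scheduler response is shaped differently and would need its own walker. The function names are made up for illustration.

```python
def iter_queues(queue_info, parent=""):
    """Yield (dotted name, queue dict) for every queue nested below queue_info."""
    children = (queue_info.get("queues") or {}).get("queue", [])
    for queue in children:
        name = f"{parent}.{queue['queueName']}" if parent else queue["queueName"]
        yield name, queue
        # Parent queues carry their own "queues" entry; leaf queues do not.
        yield from iter_queues(queue, name)


def queue_usage(scheduler_response):
    """Map each queue name to its absoluteUsedCapacity (assumes Capacity Scheduler)."""
    root = scheduler_response["scheduler"]["schedulerInfo"]
    return {name: q.get("absoluteUsedCapacity") for name, q in iter_queues(root)}
```

The input is the decoded JSON from `/ws/v1/cluster/scheduler`; other per-queue fields can be pulled out the same way by changing the key in `queue_usage`.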

<a href="http://emr-header-1:8088/ws/v1/cluster/apps">http://emr-header-1:8088/ws/v1/cluster/apps</a>

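To restrict the statistics to a fixed time window, append the "?finishedTimeBegin={timestamp}&finishedTimeEnd={timestamp}" parameters, with timestamps in milliseconds since the epoch. The parameter names are case-sensitive. For example:

<a href="http://emr-header-1:8088/ws/v1/cluster/apps?finishedTimeBegin=1496742124000&finishedTimeEnd=1496742134000">http://emr-header-1:8088/ws/v1/cluster/apps?finishedTimeBegin=1496742124000&finishedTimeEnd=1496742134000</a>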
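The sketch below shows one way to build such a URL from datetimes and to total the finished applications per user. The helper names are made up for illustration, and the aggregation assumes each app entry carries the user, memorySeconds, and vcoreSeconds fields; missing numeric fields are treated as 0.

```python
from collections import defaultdict
from datetime import datetime, timezone
from urllib.parse import urlencode


def to_millis(dt):
    """Convert a timezone-aware datetime to epoch milliseconds."""
    return int(dt.timestamp() * 1000)


def apps_url(resource_manager, begin, end):
    """Build the cluster apps URL filtered by finish time."""
    query = urlencode({"finishedTimeBegin": to_millis(begin),
                       "finishedTimeEnd": to_millis(end)})
    return f"http://{resource_manager}/ws/v1/cluster/apps?{query}"


def summarize_by_user(apps_response):
    """Total app count, memorySeconds and vcoreSeconds per user."""
    # "apps" can be null when nothing matches the filter.
    apps = (apps_response.get("apps") or {}).get("app", [])
    totals = defaultdict(lambda: {"apps": 0, "memorySeconds": 0, "vcoreSeconds": 0})
    for app in apps:
        entry = totals[app["user"]]
        entry["apps"] += 1
        entry["memorySeconds"] += app.get("memorySeconds", 0)
        entry["vcoreSeconds"] += app.get("vcoreSeconds", 0)
    return dict(totals)
```

Grouping by `app["queue"]` instead of `app["user"]` gives the same totals per queue.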

The amount of data scanned by a job has to be queried through the History Server's RESTful API, and the MapReduce and Spark cases differ somewhat.

<a href="http://emr-header-1:19888/ws/v1/history/mapreduce/jobs/job_1495123166259_0962/counters">http://emr-header-1:19888/ws/v1/history/mapreduce/jobs/job_1495123166259_0962/counters</a>

In the response, the BYTES_READ counter in the org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter group is the amount of data scanned by the job.
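A minimal sketch of pulling that counter out of the decoded JSON response follows. The function name `bytes_read` is made up for illustration; it returns the counter's totalCounterValue, or None when the group or counter is absent (for example, for a job that did not use FileInputFormat).

```python
FILE_INPUT_GROUP = "org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter"


def bytes_read(counters_response):
    """Return the total BYTES_READ of a job counters response, or None if absent."""
    groups = counters_response["jobCounters"].get("counterGroup", [])
    for group in groups:
        if group["counterGroupName"] != FILE_INPUT_GROUP:
            continue
        for counter in group.get("counter", []):
            if counter["name"] == "BYTES_READ":
                return counter["totalCounterValue"]
    return None
```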

<a href="http://emr-header-1:18080/api/v1/applications/application_1495123166259_1050/executors">http://emr-header-1:18080/api/v1/applications/application_1495123166259_1050/executors</a>

The sum of totalInputBytes across all executors is the total amount of data scanned by the job.
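The summation can be sketched as follows, assuming the executors endpoint returns a JSON list with one object per executor, each with an id and a totalInputBytes field. The function names are made up for illustration, and any entry without the field is counted as 0.

```python
def input_bytes_by_executor(executors):
    """Map each executor id to its totalInputBytes (0 if the field is missing)."""
    return {e["id"]: e.get("totalInputBytes", 0) for e in executors}


def total_input_bytes(executors):
    """Sum totalInputBytes over every entry in the executors list."""
    return sum(input_bytes_by_executor(executors).values())
```

The per-executor mapping is kept separate because it is also a quick way to spot input skew between executors.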
