
HDFS Archival Storage: What It Is, How to Use It, Testing, and What We Can Use It For

      As data accumulates day by day, much of it turns from hot to cold: it is no longer used, or only rarely. Storage demand keeps growing, while the corresponding compute demand grows comparatively little. How can we decouple this rapidly growing storage demand from compute demand? HDFS Archival Storage is built for exactly this.

      HDFS Archival Storage is a feature introduced in Hadoop-2.6.0 as part of Hadoop's heterogeneous storage support. It stores data separately according to policy, decoupling storage capacity from compute capacity, so that cold data can be archived onto machines with cheap, high-density storage media but little compute power.

      Starting with Hadoop-2.6.0, several storage policies are available. Some machines can be designated as archive servers that hold cold data on high-density, inexpensive disks, while others serve hot data from ordinary disks or SSDs. The storage policies include the following:

public static final String MEMORY_STORAGE_POLICY_NAME = "LAZY_PERSIST";
public static final String ALLSSD_STORAGE_POLICY_NAME = "ALL_SSD";
public static final String ONESSD_STORAGE_POLICY_NAME = "ONE_SSD";
public static final String HOT_STORAGE_POLICY_NAME = "HOT";
public static final String WARM_STORAGE_POLICY_NAME = "WARM";
public static final String COLD_STORAGE_POLICY_NAME = "COLD";

      At present, Hadoop-2.6.0 supports the HOT, WARM, and COLD policies. Hot data is stored entirely on DataNode storage directories tagged [DISK] (untagged directories default to [DISK]); cold data is stored entirely on directories tagged [ARCHIVE], which can live on cheap machines with weak compute power but high storage density. Warm data sits in between: some replicas are kept on [DISK] and the rest on [ARCHIVE]. SSD as a storage medium is supported starting with Hadoop-2.7.0.
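The replica placement implied by these three policies can be sketched as a small lookup. This is a simplified illustrative model, not Hadoop's actual BlockStoragePolicy implementation:

```python
# Simplified model of HOT/WARM/COLD block placement
# (illustrative only; the real logic lives inside Hadoop's storage policy classes).
def replica_storage_types(policy, replication):
    """Return the storage type chosen for each replica of a block."""
    if policy == "HOT":          # all replicas on DISK
        return ["DISK"] * replication
    if policy == "COLD":         # all replicas on ARCHIVE
        return ["ARCHIVE"] * replication
    if policy == "WARM":         # one replica on DISK, the rest on ARCHIVE
        return ["DISK"] + ["ARCHIVE"] * (replication - 1)
    raise ValueError("unknown policy: " + policy)

print(replica_storage_types("WARM", 3))  # ['DISK', 'ARCHIVE', 'ARCHIVE']
```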

      After restarting the DataNodes, we can mark storage policies on selected HDFS paths and then use the hdfs mover tool to migrate the data. Note that a DataNode storage directory with no storage-type tag defaults to [DISK], and an HDFS path with no storage policy defaults to unspecified; files created there are stored on [DISK].

           Edit dfs.datanode.data.dir in hdfs-site.xml, prefixing each path with a storage type, for example:

<property>
  <name>dfs.datanode.data.dir</name>
  <value>[ARCHIVE]file:///opt/hadoop/dfs.data</value>
</property>

           In Hadoop-2.6.0 the configurable storage types are [DISK] and [ARCHIVE]; [SSD] and [RAM_DISK] are supported starting with Hadoop-2.7.0.
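On a DataNode with several data directories, the storage type is tagged per directory, so fast and high-density disks on the same machine can serve different tiers. A sketch (the paths below are hypothetical):

```xml
<property>
  <name>dfs.datanode.data.dir</name>
  <!-- one fast disk kept as [DISK], one high-density disk tagged [ARCHIVE] -->
  <value>[DISK]file:///data1/hadoop/dfs.data,[ARCHIVE]file:///data2/hadoop/dfs.data</value>
</property>
```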

sh hadoop-daemon.sh stop datanode
sh hadoop-daemon.sh start datanode

hadoop dfsadmin -setStoragePolicy /tmp/lp_test COLD
hadoop dfsadmin -getStoragePolicy /tmp/lp_test

hdfs mover

or

hdfs mover -p /tmp/lp_test

      By default, hdfs mover migrates any blocks on the DataNodes whose storage type does not match the storage policy of their path; with -p or -f it migrates only the mismatched blocks under the specified paths or belonging to the specified files.

hadoop fsck /tmp/lp_test/ -files -blocks -locations

       Test environment: 5 virtual machines, 4 of which are DataNodes.

      1. After tagging one DataNode as [ARCHIVE] and marking the HDFS path /tmp/lp_test as HOT, what happens when hdfs mover is run?

16/09/27 09:47:17 INFO balancer.Dispatcher: Successfully moved blk_1073760906_20104 with size=57521882 from xxx.xx.x.111:50010:ARCHIVE to xxx.xx.x.108:50010:DISK through xxx.xx.x.110:50010
16/09/27 09:47:23 INFO balancer.Dispatcher: Successfully moved blk_1073760895_20093 with size=57521882 from xxx.xx.x.111:50010:ARCHIVE to xxx.xx.x.108:50010:DISK through xxx.xx.x.111:50010
16/09/27 09:47:28 INFO balancer.Dispatcher: Successfully moved blk_1073760887_20085 with size=57521892 from xxx.xx.x.111:50010:ARCHIVE to xxx.xx.x.108:50010:DISK through xxx.xx.x.110:50010
16/09/27 09:47:30 INFO balancer.Dispatcher: Successfully moved blk_1073760905_20103 with size=57521892 from xxx.xx.x.111:50010:ARCHIVE to xxx.xx.x.109:50010:DISK through xxx.xx.x.111:50010
16/09/27 09:47:31 INFO balancer.Dispatcher: Successfully moved blk_1073760907_20105 with size=57521882 from xxx.xx.x.111:50010:ARCHIVE to xxx.xx.x.109:50010:DISK through xxx.xx.x.110:50010
16/09/27 09:47:33 INFO balancer.Dispatcher: Successfully moved blk_1073759183_18381 with size=79634354 from xxx.xx.x.111:50010:ARCHIVE to xxx.xx.x.109:50010:DISK through xxx.xx.x.111:50010
16/09/27 09:47:41 INFO balancer.Dispatcher: Successfully moved blk_1073760893_20091 with size=134217728 from xxx.xx.x.111:50010:ARCHIVE to xxx.xx.x.109:50010:DISK through xxx.xx.x.110:50010
16/09/27 09:47:44 INFO balancer.Dispatcher: Successfully moved blk_1073759178_18376 with size=134217728 from xxx.xx.x.111:50010:ARCHIVE to xxx.xx.x.109:50010:DISK through xxx.xx.x.110:50010
16/09/27 09:47:44 INFO balancer.Dispatcher: Successfully moved blk_1073760883_20081 with size=134217728 from xxx.xx.x.111:50010:ARCHIVE to xxx.xx.x.108:50010:DISK through xxx.xx.x.111:50010
16/09/27 09:47:45 INFO balancer.Dispatcher: Successfully moved blk_1073759181_18379 with size=134217728 from xxx.xx.x.111:50010:ARCHIVE to xxx.xx.x.108:50010:DISK through xxx.xx.x.111:50010

           As the log shows, when the files under a path are marked HOT, running hdfs mover migrates their blocks from ARCHIVE to DISK. Inspecting the storage directory on the ARCHIVE machine, many blocks still remain there, which shows that files that already existed are not migrated. From this we can conclude:

           1) Files that existed before the upgrade have no storage policy defined, and their blocks may reside on DataNodes of any storage type;

           2) A DataNode with no storage-type tag defaults to [DISK];

           3) For paths with an explicit storage policy, blocks must be placed strictly according to the DataNode storage types, and hdfs mover will migrate blocks to enforce this.
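These conclusions can be checked mechanically. As a sketch, assuming the Dispatcher log format shown above, a few lines of Python can summarize how many blocks and bytes were moved between each pair of storage types:

```python
import re

# Matches the Dispatcher lines printed by `hdfs mover` (format as in the log above).
MOVE_RE = re.compile(
    r"Successfully moved (blk_\S+) with size=(\d+) "
    r"from \S+:(\w+) to \S+:(\w+)"
)

def summarize_mover_log(lines):
    """Count moved blocks and total bytes per (source, destination) storage type."""
    stats = {}
    for line in lines:
        m = MOVE_RE.search(line)
        if not m:
            continue
        _, size, src, dst = m.groups()
        key = (src, dst)
        blocks, total = stats.get(key, (0, 0))
        stats[key] = (blocks + 1, total + int(size))
    return stats
```

Feeding it the log above would report every move as ARCHIVE to DISK, confirming conclusion 3).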

      2. After creating a new directory and uploading a file, how are its blocks distributed?

           Run the following commands to upload a file to the newly created HDFS path:

hadoop fs -mkdir /tmp/lp_test2
hadoop fs -copyFromLocal 00002.txt /tmp/lp_test2/

           Then check the block distribution of the file, as follows:

hadoop fsck /tmp/lp_test2 -files -blocks -locations

           The main output is as follows:

/tmp/lp_test2/00002.txt 6005783952 bytes, 45 block(s):  OK
0. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761093_20294 len=134217728 repl=2 [xxx.xx.x.110:50010, xxx.xx.x.109:50010]
1. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761094_20295 len=134217728 repl=2 [xxx.xx.x.108:50010, xxx.xx.x.109:50010]
2. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761095_20296 len=134217728 repl=2 [xxx.xx.x.110:50010, xxx.xx.x.108:50010]
3. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761096_20297 len=134217728 repl=2 [xxx.xx.x.109:50010, xxx.xx.x.108:50010]
4. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761097_20298 len=134217728 repl=2 [xxx.xx.x.108:50010, xxx.xx.x.109:50010]
5. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761098_20299 len=134217728 repl=2 [xxx.xx.x.110:50010, xxx.xx.x.108:50010]
6. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761099_20300 len=134217728 repl=2 [xxx.xx.x.108:50010, xxx.xx.x.109:50010]
7. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761100_20301 len=134217728 repl=2 [xxx.xx.x.108:50010, xxx.xx.x.109:50010]
8. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761101_20302 len=134217728 repl=2 [xxx.xx.x.108:50010, xxx.xx.x.110:50010]
9. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761102_20303 len=134217728 repl=2 [xxx.xx.x.108:50010, xxx.xx.x.109:50010]
10. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761103_20304 len=134217728 repl=2 [xxx.xx.x.110:50010, xxx.xx.x.108:50010]
11. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761104_20305 len=134217728 repl=2 [xxx.xx.x.109:50010, xxx.xx.x.110:50010]
12. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761105_20306 len=134217728 repl=2 [xxx.xx.x.109:50010, xxx.xx.x.110:50010]
13. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761106_20307 len=134217728 repl=2 [xxx.xx.x.109:50010, xxx.xx.x.108:50010]
14. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761107_20308 len=134217728 repl=2 [xxx.xx.x.108:50010, xxx.xx.x.110:50010]
15. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761109_20310 len=134217728 repl=2 [xxx.xx.x.109:50010, xxx.xx.x.108:50010]
16. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761110_20311 len=134217728 repl=2 [xxx.xx.x.109:50010, xxx.xx.x.108:50010]
17. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761111_20312 len=134217728 repl=2 [xxx.xx.x.109:50010, xxx.xx.x.108:50010]
18. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761112_20313 len=134217728 repl=2 [xxx.xx.x.110:50010, xxx.xx.x.109:50010]
19. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761113_20314 len=134217728 repl=2 [xxx.xx.x.110:50010, xxx.xx.x.109:50010]
20. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761115_20316 len=134217728 repl=2 [xxx.xx.x.108:50010, xxx.xx.x.110:50010]
21. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761116_20317 len=134217728 repl=2 [xxx.xx.x.108:50010, xxx.xx.x.110:50010]
22. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761118_20319 len=134217728 repl=2 [xxx.xx.x.109:50010, xxx.xx.x.110:50010]
23. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761119_20320 len=134217728 repl=2 [xxx.xx.x.109:50010, xxx.xx.x.108:50010]
24. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761120_20321 len=134217728 repl=2 [xxx.xx.x.110:50010, xxx.xx.x.109:50010]
25. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761122_20323 len=134217728 repl=2 [xxx.xx.x.110:50010, xxx.xx.x.108:50010]
26. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761123_20324 len=134217728 repl=2 [xxx.xx.x.109:50010, xxx.xx.x.110:50010]
27. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761124_20325 len=134217728 repl=2 [xxx.xx.x.109:50010, xxx.xx.x.108:50010]
28. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761125_20326 len=134217728 repl=2 [xxx.xx.x.110:50010, xxx.xx.x.109:50010]
29. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761126_20327 len=134217728 repl=2 [xxx.xx.x.110:50010, xxx.xx.x.108:50010]
30. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761127_20328 len=134217728 repl=2 [xxx.xx.x.110:50010, xxx.xx.x.109:50010]
31. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761128_20329 len=134217728 repl=2 [xxx.xx.x.108:50010, xxx.xx.x.109:50010]
32. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761129_20330 len=134217728 repl=2 [xxx.xx.x.108:50010, xxx.xx.x.109:50010]
33. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761130_20331 len=134217728 repl=2 [xxx.xx.x.109:50010, xxx.xx.x.108:50010]
34. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761131_20332 len=134217728 repl=2 [xxx.xx.x.108:50010, xxx.xx.x.110:50010]
35. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761132_20333 len=134217728 repl=2 [xxx.xx.x.110:50010, xxx.xx.x.108:50010]
36. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761133_20334 len=134217728 repl=2 [xxx.xx.x.108:50010, xxx.xx.x.110:50010]
37. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761134_20335 len=134217728 repl=2 [xxx.xx.x.108:50010, xxx.xx.x.110:50010]
38. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761135_20336 len=134217728 repl=2 [xxx.xx.x.110:50010, xxx.xx.x.109:50010]
39. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761136_20337 len=134217728 repl=2 [xxx.xx.x.109:50010, xxx.xx.x.110:50010]
40. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761137_20338 len=134217728 repl=2 [xxx.xx.x.110:50010, xxx.xx.x.109:50010]
41. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761138_20339 len=134217728 repl=2 [xxx.xx.x.108:50010, xxx.xx.x.109:50010]
42. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761139_20340 len=134217728 repl=2 [xxx.xx.x.109:50010, xxx.xx.x.110:50010]
43. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761140_20341 len=134217728 repl=2 [xxx.xx.x.110:50010, xxx.xx.x.108:50010]
44. BP-822385924-xxx.xx.x.107-1470286871711:blk_1073761141_20342 len=100203920 repl=2 [xxx.xx.x.108:50010, xxx.xx.x.110:50010]

           All block replicas of the file are distributed across the three [DISK] machines xxx.xx.x.108 through xxx.xx.x.110; the [ARCHIVE] machine xxx.xx.x.111 holds none of the new file's blocks. This shows that although a newly created file has no storage policy defined, its blocks are stored on [DISK] DataNodes.
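Rather than eyeballing 45 lines of fsck output, the distribution can be tallied with a short script. A minimal sketch, assuming the `-locations` line format shown above (the IP addresses in the usage comment are hypothetical placeholders):

```python
import re
from collections import Counter

# Extracts the DataNode list from an fsck -locations line such as:
#   "0. BP-...:blk_..._20294 len=134217728 repl=2 [10.0.0.1:50010, 10.0.0.2:50010]"
LOC_RE = re.compile(r"\[([^\]]+)\]")

def replica_counts(fsck_lines):
    """Count how many block replicas each DataNode holds."""
    counts = Counter()
    for line in fsck_lines:
        m = LOC_RE.search(line)
        if m:
            for node in m.group(1).split(","):
                counts[node.strip()] += 1
    return counts
```

Running this over the output above, the ARCHIVE node should simply not appear in the resulting counts.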

      If many Hive tables on the cluster are partitioned by day, and a substantial share of the data is N years old, that data should in theory rarely or never be used again. Marking it as cold data and archiving it onto cheap machines with weak compute power will free a great deal of storage on the compute nodes: storage capacity grows while compute capacity stays the same. Moreover, the cold data can be compressed and have its replication factor reduced, saving even more space on the archive servers. If it needs to be used frequently again later, simply change the storage policy and migrate the data back with hdfs mover. Going forward, SSDs or memory can likewise be used to hold hot data, speeding up data access and thereby improving compute performance.
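The scale of the savings is simple arithmetic. As an illustration (the data size, replication factors, and compression ratio below are made-up assumptions, not measurements):

```python
def archive_savings_tb(cold_tb, disk_repl=3, archive_repl=2, compression_ratio=0.5):
    """Estimate raw storage used before and after archiving cold data.

    cold_tb: logical size of the cold data in TB.
    disk_repl / archive_repl: replication factor before and after archiving.
    compression_ratio: compressed size divided by original size on the archive tier.
    """
    before = cold_tb * disk_repl                        # raw TB held on DISK nodes
    after = cold_tb * compression_ratio * archive_repl  # raw TB held on ARCHIVE nodes
    return before, after

# Hypothetical example: 100 TB of cold data, default replication 3,
# archived at replication 2 with 2x compression.
before, after = archive_savings_tb(100)
print(before, after)  # 300 TB of raw DISK space freed, replaced by 100 TB on ARCHIVE
```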
