
HDFS Disk Balancer

1. Background

After a Hadoop cluster has been running for a while, data may become unevenly distributed across a DataNode's disks for a number of reasons: for example, a new disk was just added to a DataNode, or the cluster has seen a large number of write & delete operations. Is there a tool that balances data across the multiple disks of a single DataNode? Yes: the diskbalancer command-line tool provided by Hadoop.

2. What is the difference between HDFS Balancer and HDFS Disk Balancer?

hdfs balancer: balances data across the DataNodes of a cluster, i.e., it works between multiple DataNodes.

(Figure: hdfs balancer)

hdfs disk balancer: balances data across the multiple disks within a single DataNode.

(Figure: hdfs disk balancer)

Note: because HDFS supports heterogeneous storage, and DiskBalancer currently does not transfer data across storage media (SSD, DISK, etc.), disk balancing is performed within a single storageType.
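Because balancing stays within one storageType, the ideal usage ratio is computed per type rather than across all disks on the node. A minimal sketch (the volume records and function name here are hypothetical, for illustration only):

```python
from collections import defaultdict

# Hypothetical volume records: (storage_type, capacity_gb, used_gb).
volumes = [
    ("DISK", 1000, 900),
    ("DISK", 1000, 100),
    ("SSD", 500, 200),
    ("SSD", 500, 300),
]

def ideal_ratio_per_type(vols):
    """Return {storage_type: total_used / total_capacity}.

    Each storage type gets its own ideal ratio, mirroring the fact that
    DiskBalancer never moves blocks between DISK and SSD volumes.
    """
    used = defaultdict(float)
    cap = defaultdict(float)
    for stype, c, u in vols:
        cap[stype] += c
        used[stype] += u
    return {t: used[t] / cap[t] for t in cap}

print(ideal_ratio_per_type(volumes))  # DISK: 1000/2000 = 0.5, SSD: 500/1000 = 0.5
```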

3. Operation

3.1 Generate Plans

[hadoopdeploy@hadoop01 ~]$ hdfs diskbalancer -plan hadoop01 -out hadoop01-plan.json           

-plan: followed by the host name.

-out: Specifies the output location of the plan file.

(Figure: generate a plan)

3.2 Execute the Plan

[hadoopdeploy@hadoop01 ~]$ hdfs diskbalancer -execute hadoop01-plan.json           
(Figure: execute the plan)

3.3 Query the Plan

[hadoopdeploy@hadoop01 ~]$ hdfs diskbalancer -query hadoop01           

-query: followed by the host name.

(Figure: query the plan)

3.4 Cancel the Plan

[hadoopdeploy@hadoop01 ~]$ hdfs diskbalancer -cancel hadoop01-plan.json
(Figure: cancel the plan)

4. Configuration related to the disk balancer

dfs.disk.balancer.enabled: Controls whether the disk balancer is enabled for the cluster. If it is not enabled, any execute command will be rejected by the DataNode. The default value is true.

dfs.disk.balancer.max.disk.throughputInMBperSec: Controls the maximum disk bandwidth consumed by the disk balancer while copying data. If a value such as 10 MB is specified, the disk balancer copies at only 10 MB/s on average. The default value is 10 MB/s.

dfs.disk.balancer.max.disk.errors: Sets the maximum number of errors that can be tolerated during a given move before it is treated as a failure. For example, if a plan contains 3 pairs of disks to copy between and the first pair encounters more than 5 errors, that copy is abandoned and the second copy in the plan begins. The default value is 5.

dfs.disk.balancer.block.tolerance.percent: Sets the threshold for how far the amount of data on each disk may differ from the ideal state when balancing data between disks. The value range is [1-100]; the default is 10. For example, if the ideal amount of data on each disk is 100 GB and this parameter is set to 10, then once the target disk reaches 90 GB, its storage state is considered to have met expectations.

dfs.disk.balancer.plan.threshold.percent: Sets the tolerated difference in data density between two disks when planning, in the range [1-100]; the default is 10. If the absolute difference in data density between any two disks exceeds this threshold, the disks need to be balanced. For example, if a node with 2 disks holds 100 GB of data in total, the disk balancer computes an expected 50 GB per disk. With a 10% tolerance, a single disk must hold more than 60 GB (50 GB + the 10% tolerance) before the disk balancer starts working.

dfs.disk.balancer.plan.valid.interval: The maximum time a disk balancer plan remains valid. The following suffixes are supported (case insensitive): ms (milliseconds), s (seconds), m (minutes), h (hours), d (days), e.g. 2s, 2m, 1h. If no suffix is specified, milliseconds are assumed. The default value is 1d.
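The dfs.disk.balancer.plan.threshold.percent example above can be sketched as a small check. This follows the worked example in the table rather than the exact Hadoop implementation, and the function name and numbers are hypothetical:

```python
def needs_balancing(disk_used_gb, total_gb, num_disks, threshold_percent=10):
    """Sketch of the plan-threshold check from the example above:
    a disk is considered out of balance when its usage exceeds the
    ideal per-disk share plus threshold_percent of the total data."""
    ideal = total_gb / num_disks                      # e.g. 100 GB / 2 = 50 GB
    tolerance = total_gb * threshold_percent / 100    # e.g. 10% of 100 GB = 10 GB
    return disk_used_gb > ideal + tolerance           # trigger above 60 GB

# Example from the table: 100 GB total on a 2-disk node, 10% threshold.
print(needs_balancing(55, 100, 2))  # 55 GB <= 60 GB -> False
print(needs_balancing(65, 100, 2))  # 65 GB > 60 GB -> True
```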

5. Additional knowledge points

5.1 Which disk (volume) does a new block go to?

When data is written to a new block, the DataNode selects a disk to store it on according to a policy.

Round-robin policy (the default): distributes new blocks across the available disks in turn. Because it ignores how full each disk already is, it can cause data skew when disks differ in size or usage.

Available-space policy: prefers disks with more free space (by percentage). This may increase the I/O pressure on a single disk for a period of time.
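The two policies above can be contrasted with a simplified sketch. In Hadoop they are selected via dfs.datanode.fsdataset.volume.choosing.policy, and the real available-space policy is more nuanced (it chooses probabilistically); the disk names and the greedy pick below are illustrative assumptions:

```python
import itertools

# Hypothetical free space per disk, in GB; disk1 is nearly full.
free_gb = {"disk0": 800, "disk1": 50, "disk2": 400}

# Round-robin: cycle through the disks in order, ignoring free space,
# so a nearly full disk keeps receiving new blocks.
rr = itertools.cycle(sorted(free_gb))
round_robin_picks = [next(rr) for _ in range(4)]

def pick_most_free(free):
    """Greedy stand-in for the available-space idea: the emptiest disk
    (e.g. a freshly added one) absorbs the new block."""
    return max(free, key=free.get)

print(round_robin_picks)        # disk0, disk1, disk2, disk0 ...
print(pick_most_free(free_gb))  # disk0
```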

5.2 Disk Data Density Metrics

(Figure: disk data density metrics)

The image above is from https://www.bilibili.com/video/BV11N411d7Zh/?p=81
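The density metrics can also be sketched in code. As described in the HDFS Disk Balancer documentation, a volume's data density is the node's ideal usage ratio minus the volume's actual usage ratio, and the node data density sums the absolute values over all volumes; the function names and sample numbers below are illustrative:

```python
def volume_data_density(used, capacity, ideal_ratio):
    """volumeDataDensity = idealStorage - used/capacity.
    Positive means the volume is under-utilized; negative, over-utilized."""
    return ideal_ratio - used / capacity

def node_data_density(volumes):
    """nodeDataDensity = sum of |volumeDataDensity| over all volumes.
    A larger value means the node's disks are more unevenly used."""
    total_used = sum(u for u, _ in volumes)
    total_cap = sum(c for _, c in volumes)
    ideal = total_used / total_cap
    return sum(abs(volume_data_density(u, c, ideal)) for u, c in volumes)

# Two 1000 GB disks, one 90% full and one 10% full: ideal ratio is 0.5,
# so the node density is |0.5 - 0.9| + |0.5 - 0.1|, approximately 0.8.
vols = [(900, 1000), (100, 1000)]
print(node_data_density(vols))
```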

6. Reference documentation

1. https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HDFSDiskbalancer.html

2. https://help.aliyun.com/document_detail/467585.html

3. https://www.bilibili.com/video/BV11N411d7Zh/?p=81