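After installing a cluster, it is worth running a few benchmarks before putting it to use. The sections below cover HDFS write and read performance, followed by a sort test.
Write performance
Jar name: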
Apache:
hadoop-mapreduce-client-jobclient-2.7.5-tests.jar
CDH:
hadoop-mapreduce-client-jobclient-3.0.0-cdh6.2.0-tests.jar
Jar location (Apache):
/home/hadoop-jrq/bigdata/hadoop-2.7.5/share/hadoop/mapreduce
Jar location (CDH):
/opt/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p0.967373/jars
Command:
hadoop jar hadoop-mapreduce-client-jobclient-2.7.5-tests.jar TestDFSIO -write -nrFiles 10 -fileSize 128MB
The same command works on CDH with the CDH jar. It writes 10 data files of 128 MB each.
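Because the jar name differs between the Apache and CDH builds, it can help to locate it by pattern instead of typing the version. A minimal sketch, assuming the jar sits directly in the directory you pass in; find_tests_jar is a hypothetical helper name, not part of Hadoop:

```shell
# find_tests_jar DIR: print the path of the jobclient tests jar found in DIR.
# Hypothetical helper for illustration; prints nothing if no jar matches.
find_tests_jar() {
  find "$1" -maxdepth 1 -type f \
    -name 'hadoop-mapreduce-client-jobclient-*-tests.jar' 2>/dev/null |
    sort | head -n 1
}
```

It could then be used as: hadoop jar "$(find_tests_jar /home/hadoop-jrq/bigdata/hadoop-2.7.5/share/hadoop/mapreduce)" TestDFSIO -write -nrFiles 10 -fileSize 128MB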
Result:
INFO fs.TestDFSIO: ----- TestDFSIO ----- : write
INFO fs.TestDFSIO: Date & time: Thu May 02 11:45:23 CST 2019
INFO fs.TestDFSIO: Number of files: 10 -- 10 files
INFO fs.TestDFSIO: Total MBytes processed: 1280.0 -- 1280 MB in total
INFO fs.TestDFSIO: Throughput mb/sec: 10.69751115716984 -- throughput, about 10.70 MB/s
INFO fs.TestDFSIO: Average IO rate mb/sec: 14.91699504852295 -- mean of the per-file I/O rates
INFO fs.TestDFSIO: IO rate std deviation: 11.160882132355928 -- standard deviation of the per-file I/O rates
INFO fs.TestDFSIO: Test exec time sec: 52.315 -- total run time in seconds
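The Throughput figure is the total MB divided by the sum of the I/O times of all map tasks, so it describes a single task's rate and not the aggregate rate of the cluster. The sketch below recovers that summed task time from the log. dfsio_task_seconds is a hypothetical helper name, and it assumes the log lines keep the "Total MBytes processed:" and "Throughput mb/sec:" labels shown in the result above:

```shell
# dfsio_task_seconds: read TestDFSIO result lines on stdin and print the
# summed per-task I/O time in seconds (total MB / throughput).
# Hypothetical helper for illustration only.
dfsio_task_seconds() {
  awk '
    # The value is the field right after the label, so trailing notes are ignored.
    /Total MBytes processed:/ { for (i = 1; i < NF; i++) if ($i == "processed:") mb = $(i + 1) }
    /Throughput mb\/sec:/     { for (i = 1; i < NF; i++) if ($i == "mb/sec:") tp = $(i + 1) }
    END { if (tp + 0 > 0) printf "%.1f\n", mb / tp }
  '
}
```

Fed the write result above, it prints 119.7. That is more than twice the 52.315 s wall-clock time because the 10 map tasks run in parallel.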
Read performance
Command (it reads back the files created by the write test, so run the write test first):
hadoop jar hadoop-mapreduce-client-jobclient-2.7.5-tests.jar TestDFSIO -read -nrFiles 10 -fileSize 128MB
Result:
INFO fs.TestDFSIO: ----- TestDFSIO ----- : read
INFO fs.TestDFSIO: Date & time: Thu May 02 11:56:36 CST 2019
INFO fs.TestDFSIO: Number of files: 10 -- number of files
INFO fs.TestDFSIO: Total MBytes processed: 1280.0 -- total size in MB
INFO fs.TestDFSIO: Throughput mb/sec: 16.001000062503905 -- throughput, about 16.00 MB/s
INFO fs.TestDFSIO: Average IO rate mb/sec: 17.202795028686523 -- mean of the per-file I/O rates
INFO fs.TestDFSIO: IO rate std deviation: 4.881590515873911 -- standard deviation of the per-file I/O rates
INFO fs.TestDFSIO: Test exec time sec: 49.116 -- total run time in seconds
Deleting the test data
hadoop jar hadoop-mapreduce-client-jobclient-2.7.5-tests.jar TestDFSIO -clean
Testing the sort program
1. Use RandomWriter to generate random data. By default each node runs 10 map tasks, and each map writes about 1 GB of random binary data.
hadoop jar hadoop-mapreduce-examples-2.7.5.jar randomwriter random-data
2. Run the Sort program
hadoop jar hadoop-mapreduce-examples-2.7.5.jar sort random-data sorted-data
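3. Verify that the output is really sorted (testmapredsort is registered in the jobclient tests jar, not in the examples jar)
hadoop jar hadoop-mapreduce-client-jobclient-2.7.5-tests.jar testmapredsort -sortInput random-data -sortOutput sorted-data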
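The three steps can be chained so that a failure stops the run early. This is only a sketch: run_sort_benchmark and the HADOOP_CMD override are hypothetical names introduced here (HADOOP_CMD is not a Hadoop setting), and it assumes testmapredsort is taken from the jobclient tests jar, where Hadoop's test driver registers it:

```shell
# run_sort_benchmark EXAMPLES_JAR TESTS_JAR
# Generates random data, sorts it, then validates the sorted output.
# Returns 1 as soon as any step fails. Set HADOOP_CMD=echo for a dry run
# that only prints the commands (hypothetical override, not a Hadoop option).
run_sort_benchmark() {
  examples_jar="$1"
  tests_jar="$2"
  hadoop_cmd="${HADOOP_CMD:-hadoop}"
  "$hadoop_cmd" jar "$examples_jar" randomwriter random-data || return 1
  "$hadoop_cmd" jar "$examples_jar" sort random-data sorted-data || return 1
  "$hadoop_cmd" jar "$tests_jar" testmapredsort \
    -sortInput random-data -sortOutput sorted-data || return 1
}
```

A dry run such as HADOOP_CMD=echo run_sort_benchmark hadoop-mapreduce-examples-2.7.5.jar hadoop-mapreduce-client-jobclient-2.7.5-tests.jar prints the three commands without touching the cluster.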