
Fully distributed setup of Hadoop 1.0.2 + HBase 0.94.0 on Linux

Three Linux virtual machines:

192.169.200.101     cpyftest-1 (hostname or domain name)   Namenode (alias)

192.169.200.161     cpyftest-2 (hostname or domain name)   Datanode1 (alias)

192.169.200.191     cpyftest-3 (hostname or domain name)   Datanode2 (alias)

Compatible Hadoop build:

A Hadoop 1.0.2 build whose core jar has been replaced so that it works correctly with HBase 0.94.0.

>1. Configure /etc/hosts

 192.169.200.101     cpyftest-1  Namenode

 192.169.200.161     cpyftest-2   Datanode1

 192.169.200.191     cpyftest-3   Datanode2

>2. Set up accounts:

      Use the same group, username, and password on all three VMs (a minimal sketch follows).
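A minimal sketch of creating a matching account on each VM, run as root; the group/user name "hadoop" here is just an example, not from the original:

# Run on all three VMs; pick the same names and password everywhere
groupadd hadoop
useradd -g hadoop -m hadoop
passwd hadoop        # use the same password on every machine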

>3. Set up passwordless SSH among the three VMs

On each of the three VMs, run: ssh-keygen -t rsa

Append the three generated id_rsa.pub files into a single authorized_keys file, and give every VM a copy of that file (see the sketch below).

Reference command: cat id_rsa.pub >> authorized_keys  (appends the key contents)

On 101, run: scp authorized_keys cpyftest-2:~/.ssh/  (if that fails, just open the file and copy-paste)
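A minimal end-to-end sketch, assuming the same user account on all three hosts and that ~/.ssh already exists on each:

# On every VM: generate a key pair (accept the defaults, empty passphrase)
ssh-keygen -t rsa

# On cpyftest-1: merge all three public keys into one authorized_keys
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
ssh cpyftest-2 cat .ssh/id_rsa.pub >> ~/.ssh/authorized_keys
ssh cpyftest-3 cat .ssh/id_rsa.pub >> ~/.ssh/authorized_keys

# Distribute the merged file to the other two VMs
scp ~/.ssh/authorized_keys cpyftest-2:~/.ssh/
scp ~/.ssh/authorized_keys cpyftest-3:~/.ssh/

# sshd insists on strict permissions
chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys

# Verify: should print the remote hostname without asking for a password
ssh cpyftest-2 hostname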

>4. Configure the Hadoop files

hadoop-env.sh

export JAVA_HOME=/usr/java/jdk1.7.0_07

core-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl" target="_blank" rel="external nofollow"  target="_blank" rel="external nofollow"  target="_blank" rel="external nofollow" ?>

<!-- Put site-specific property overrides in this file. -->

<configuration>  
      <property>  
        <name>fs.default.name</name>  
        <value>hdfs://192.169.200.101:9000</value>  
      </property>  
 </configuration> 
           

hdfs-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl" target="_blank" rel="external nofollow"  target="_blank" rel="external nofollow"  target="_blank" rel="external nofollow" ?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
      <property>
        <name>dfs.replication</name>
        <value>2</value>
        <!-- keep 2 replicas of each block -->
      </property>

      <property>
        <name>dfs.data.dir</name>
        <value>/root/hadoop/data</value>
      </property>

      <property>
        <name>dfs.web.ugi</name>
        <value>username,group</value>
        <!-- your actual username and group; grants web UI access, otherwise you get: no id webuser -->
      </property>

      <property>
        <name>dfs.permissions</name>
        <value>true</value>
      </property>

      <property>
        <name>dfs.permissions.supergroup</name>
        <value>supergroup</value>
      </property>
</configuration>
           

mapred-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl" target="_blank" rel="external nofollow"  target="_blank" rel="external nofollow"  target="_blank" rel="external nofollow" ?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
      <property>
        <name>mapred.job.tracker</name>
        <value>192.169.200.101:9001</value>
      </property>
</configuration>
           

>5. Configure masters and slaves

In the hadoop/conf folder, find the two files masters and slaves.

masters is configured as follows (any one of the three forms works):

192.169.200.101  or  cpyftest-1  or  Namenode

slaves lists the other two machines:

Datanode1

Datanode2

>6. Copy the configured Hadoop directory to every machine (a sketch follows).
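A minimal sketch of pushing the install from 101, assuming Hadoop lives at /root/hadoop on every machine (the path is an example); note that HDFS must be formatted once on the NameNode before the very first start:

# On cpyftest-1: copy the configured installation to both datanodes
scp -r /root/hadoop cpyftest-2:/root/
scp -r /root/hadoop cpyftest-3:/root/

# One-time step on the NameNode before the first ./start-all.sh
cd /root/hadoop/bin && ./hadoop namenode -format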

>7. On the 101 machine, go into the hadoop/bin directory and run: ./start-all.sh

When it finishes, run jps to check the processes (jps ships with the JDK; if the command is not found, check that the Java environment variables are set, or run it from the java/bin directory):

28037 NameNode             the NameNode process (28037 is the PID)

28220 SecondaryNameNode    the SecondaryNameNode process (28220 is the PID)

28259 JobTracker           the JobTracker process (28259 is the PID)

28950 Jps

>8. On 161 and 191, run jps

Both should show the following processes:

DataNode        the DataNode process

Jps             the jps tool itself

TaskTracker     the TaskTracker process

At this point Hadoop is up and running.

Open http://namenode:50070 to check that everything is healthy, and look through the daemon logs for errors (a sketch follows).
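A quick way to inspect the logs, assuming the Hadoop 1.x default layout where logs land in the logs/ directory under the install root (file names include the user and host names):

# On the NameNode: follow the NameNode log
tail -f /root/hadoop/logs/hadoop-*-namenode-*.log

# On 161/191: check the DataNode log
tail -f /root/hadoop/logs/hadoop-*-datanode-*.log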

>9. Configure HBase. This build, too, has the Hadoop core jar replaced so that it runs correctly. Copy it to the other two machines as well, keeping the same path, and make sure all machines' clocks agree (a sketch follows).
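A minimal clock-sync sketch, assuming ntpdate is installed; pool.ntp.org is just an example server:

# Run as root on all three VMs; HBase is sensitive to clock skew
ntpdate pool.ntp.org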

hbase-env.sh

export JAVA_HOME=/usr/java/jdk1.7.0_07

The jar references in the stock file are wrong; the corrected classpath is:

# Extra Java CLASSPATH elements.  Optional.

export HBASE_CLASSPATH=$HADOOP_CLASSPATH:$HBASE_HOME/hbase-0.94.0.jar:$HBASE_HOME/hbase-0.94.0-tests.jar:$HBASE_HOME/conf:${HBASE_HOME}/lib/zookeeper-3.4.3.jar

export HBASE_MANAGES_ZK=true   # true for fully distributed mode, using the bundled ZooKeeper

hbase-site.xml

<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://192.169.200.101:9000/hbase</value>
    <description>The directory shared by region servers and into
    which HBase persists.  The URL should be 'fully-qualified'
    to include the filesystem scheme.  For example, to specify the
    HDFS directory '/hbase' where the HDFS instance's namenode is
    running at namenode.example.org on port 9000, set this value to:
    hdfs://namenode.example.org:9000/hbase.  By default HBase writes
    into /tmp.  Change this configuration else all data will be lost
    on machine restart.
    </description>
  </property>
   <property>
    <name>hbase.tmp.dir</name>
    <value>hdfs://192.169.200.101:9000/tmp</value>
    <description>Temporary directory on the local filesystem.
    Change this setting to point to a location more permanent
    than '/tmp' (The '/tmp' directory is often cleared on
    machine restart).
    </description>
  </property>

<property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
    <description>The mode the cluster will be in. Possible values are
      false for standalone mode and true for distributed mode.  If
      false, startup will run all HBase and ZooKeeper daemons together
      in the one JVM.
    </description>
  </property>
<property>
    <name>hbase.zookeeper.quorum</name>
    <value>192.169.200.101,192.169.200.161,192.169.200.191</value>
    <description>Comma separated list of servers in the ZooKeeper Quorum.
    For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com".
    By default this is set to localhost for local and pseudo-distributed modes
    of operation. For a fully-distributed setup, this should be set to a full
    list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set in hbase-env.sh
    this is the list of servers which we will start/stop ZooKeeper on.
    </description>
  </property>
</configuration>

In regionservers, configure:

Namenode

Datanode1

Datanode2

>10. Go into the hbase/bin directory

Run: ./start-hbase.sh

Run: ./hbase shell

In the shell, run list; if it completes without errors, HBase is working (a slightly fuller smoke test is sketched below).
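A slightly fuller smoke test, fed to the shell non-interactively; the table name 'test' and column family 'cf' are arbitrary examples:

# Create a table, write one cell, read it back, then list tables
./hbase shell <<'EOF'
create 'test', 'cf'
put 'test', 'row1', 'cf:a', 'value1'
scan 'test'
list
EOF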

Check the Master:

http://192.169.200.101:60010/master.jsp

Check a Region Server:

http://192.169.200.101:60030/regionserver.jsp

Check the ZK tree:

http://192.169.200.101:60010/zk.jsp

References:

http://www.itpub.net/thread-1713683-1-1.html

http://genius-bai.iteye.com/blog/641724

http://www.yankay.com/wp-content/hbase/book.html

http://www.linuxidc.com/Linux/2012-03/55622.htm