I have recently been learning Hadoop and big-data technology and built an HA cluster by hand. To document the learning process, I am recording the setup steps in this blog. Every configuration file below comes from my own setup; just change the hostnames and paths to your own and it will work. I hope this serves as a reference for other big-data beginners like me.
1. Copy the apache-zookeeper-3.5.7-bin.tar.gz package onto the Linux machine
2. Extract it to the target directory
tar -zxvf apache-zookeeper-3.5.7-bin.tar.gz -C /opt/project/
3. Rename the directory
mv apache-zookeeper-3.5.7-bin/ zookeeper3
4. Adjust the configuration
(1) Create a zkData folder under /opt/project/zookeeper3/
mkdir zkData
(2) Create a file named myid under /opt/project/zookeeper3/zkData
Put the id matching this host's server entry in the file (note: no blank lines above or below it, and no spaces around the number)
1
(3) Rename zoo_sample.cfg under zookeeper3/conf to zoo.cfg:
mv zoo_sample.cfg zoo.cfg
(4) Open zoo.cfg and make the following changes:
Change the dataDir path:
dataDir=/opt/project/zookeeper3/zkData
Add the following entries:
server.1=hadoop-chengzhipeng-101:2888:3888
server.2=hadoop-chengzhipeng-102:2888:3888
server.3=hadoop-chengzhipeng-103:2888:3888
All of the above is done on host 101. When finished, sync the configured zookeeper directory to the other machines, and change the contents of myid to 2 and 3 on 102 and 103 respectively.
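The per-host myid step is easy to get wrong. Below is a minimal sketch of how a host's id can be derived from its own server entry in zoo.cfg instead of being typed by hand. The hostnames are the ones used in this post; a temp directory stands in for /opt/project/zookeeper3/zkData so the sketch can run anywhere, and HOST is hard-coded where a real node would use $(hostname).

```shell
# Use a temp dir in place of /opt/project/zookeeper3/zkData for this sketch
ZKDATA=$(mktemp -d)

# A zoo.cfg fragment with the server entries from this post
cat > "$ZKDATA/zoo.cfg" <<'EOF'
dataDir=/opt/project/zookeeper3/zkData
server.1=hadoop-chengzhipeng-101:2888:3888
server.2=hadoop-chengzhipeng-102:2888:3888
server.3=hadoop-chengzhipeng-103:2888:3888
EOF

# On a real node this would be: HOST=$(hostname)
HOST=hadoop-chengzhipeng-102

# Pick the server.N line matching this host and keep only N
MYID=$(grep "=$HOST:" "$ZKDATA/zoo.cfg" | cut -d. -f2 | cut -d= -f1)

# Write myid with no surrounding whitespace, as the docs require
printf '%s\n' "$MYID" > "$ZKDATA/myid"
echo "myid for $HOST is $MYID"
```

Run once per host after syncing the zookeeper directory; each host then gets the id that matches its own server line.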
5. Starting, stopping, and checking the ensemble:
Run on each node:
bin/zkServer.sh start
bin/zkServer.sh status
bin/zkServer.sh stop
6. Cluster start/stop script:
#!/bin/bash
case $1 in
"start")
    for hostname in hadoop-chengzhipeng-101 hadoop-chengzhipeng-102 hadoop-chengzhipeng-103
    do
        echo "---------- starting zookeeper on $hostname -------------------"
        ssh $hostname "/opt/project/zookeeper3/bin/zkServer.sh start"
    done
;;
"stop")
    for hostname in hadoop-chengzhipeng-101 hadoop-chengzhipeng-102 hadoop-chengzhipeng-103
    do
        echo "---------- stopping zookeeper on $hostname -------------------"
        ssh $hostname "/opt/project/zookeeper3/bin/zkServer.sh stop"
    done
;;
"status")
    for hostname in hadoop-chengzhipeng-101 hadoop-chengzhipeng-102 hadoop-chengzhipeng-103
    do
        echo "---------- zookeeper status on $hostname -------------------"
        ssh $hostname "/opt/project/zookeeper3/bin/zkServer.sh status"
    done
;;
esac
That completes the ZooKeeper cluster setup.
Copy the previous single-NameNode cluster installation to /opt/ha, and delete its data and logs directories. Then configure core-site.xml as follows:
<configuration>
  <!-- Assemble the NameNode addresses into one nameservice, mycluster -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://mycluster</value>
  </property>
  <!-- Directory where Hadoop stores its runtime files -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/ha/hadoop-3.1.3/data</value>
  </property>
  <!-- zkServer addresses for zkfc to connect to -->
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>hadoop-chengzhipeng-101:2181,hadoop-chengzhipeng-102:2181,hadoop-chengzhipeng-103:2181</value>
  </property>
</configuration>
hdfs-site.xml is configured as follows:
<configuration>
  <!-- NameNode data storage directory -->
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file://${hadoop.tmp.dir}/name</value>
  </property>
  <!-- DataNode data storage directory -->
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file://${hadoop.tmp.dir}/data</value>
  </property>
  <!-- JournalNode data storage directory -->
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>${hadoop.tmp.dir}/jn</value>
  </property>
  <!-- Name of the fully distributed cluster (the nameservice) -->
  <property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
  </property>
  <!-- The NameNodes in the cluster -->
  <property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>nn1,nn2,nn3</value>
  </property>
  <!-- RPC addresses of the NameNodes -->
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn1</name>
    <value>hadoop-chengzhipeng-101:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn2</name>
    <value>hadoop-chengzhipeng-102:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn3</name>
    <value>hadoop-chengzhipeng-103:8020</value>
  </property>
  <!-- HTTP addresses of the NameNodes -->
  <property>
    <name>dfs.namenode.http-address.mycluster.nn1</name>
    <value>hadoop-chengzhipeng-101:9870</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.mycluster.nn2</name>
    <value>hadoop-chengzhipeng-102:9870</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.mycluster.nn3</name>
    <value>hadoop-chengzhipeng-103:9870</value>
  </property>
  <!-- Location on the JournalNodes where NameNode metadata is stored -->
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://hadoop-chengzhipeng-101:8485;hadoop-chengzhipeng-102:8485;hadoop-chengzhipeng-103:8485/mycluster</value>
  </property>
  <!-- Failover proxy class: how clients determine which NameNode is active -->
  <property>
    <name>dfs.client.failover.proxy.provider.mycluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <!-- Fencing method, so only one NameNode serves clients at a time -->
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
  </property>
  <!-- sshfence requires passwordless ssh key login -->
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/ha_czp/.ssh/id_rsa</value>
  </property>
  <!-- Enable automatic NameNode failover -->
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
</configuration>
First start ZooKeeper on every node:
zkServer.sh start
zkServer.sh status
Run the start command on each of the three JournalNode hosts:
hdfs --daemon start journalnode
Format the NameNode on the first node (101):
hdfs namenode -format
Start the current NameNode:
hdfs --daemon start namenode
(it will act as the primary node)
On the NameNodes that were not formatted, sync the metadata:
hdfs namenode -bootstrapStandby
Be sure the ZooKeeper cluster is up and healthy first, then format ZKFC:
[ha_czp@hadoop-chengzhipeng-101 hadoop3]$ hdfs zkfc -formatZK
Finally, start HDFS:
start-dfs.sh
After everything above has run, check the processes on all nodes:
[ha_czp@hadoop-chengzhipeng-101 hadoop3]$ my_jpsall.sh
------------------ hadoop-chengzhipeng-101 hadoop processes --------------------------------
3125 DataNode
3509 DFSZKFailoverController
2728 NameNode
2330 QuorumPeerMain
3580 Jps
2541 JournalNode
------------------ hadoop-chengzhipeng-102 hadoop processes --------------------------------
2402 JournalNode
2691 DataNode
2197 QuorumPeerMain
2904 DFSZKFailoverController
2954 Jps
2604 NameNode
------------------ hadoop-chengzhipeng-103 hadoop processes --------------------------------
2899 DFSZKFailoverController
2395 JournalNode
2941 Jps
2589 NameNode
2686 DataNode
2191 QuorumPeerMain
[ha_czp@hadoop-chengzhipeng-101 hadoop3]$
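As a quick sanity check, the presence of every HA daemon in a jps dump can be verified mechanically rather than by eyeballing. This sketch greps a pasted jps listing (host 101's output from above, hard-coded here as an example) for the five expected process names:

```shell
# One host's jps output, pasted from the my_jpsall.sh run above
JPS_OUT='3125 DataNode
3509 DFSZKFailoverController
2728 NameNode
2330 QuorumPeerMain
2541 JournalNode'

# Collect any expected daemon that is absent from the listing
MISSING=""
for daemon in NameNode DataNode JournalNode QuorumPeerMain DFSZKFailoverController; do
  printf '%s\n' "$JPS_OUT" | grep -q "$daemon" || MISSING="$MISSING $daemon"
done

if [ -z "$MISSING" ]; then
  echo "all HA daemons are running"
else
  echo "missing:$MISSING"
fi
```

On a live node you could feed it real output with `JPS_OUT=$(jps)` instead of the pasted sample.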
Open a browser and check the HDFS status of each node: only one NameNode is active and the others are standby.
Open the ZooKeeper client (bin/zkCli.sh) and walk down the znode tree; under the HA election path (by default /hadoop-ha/mycluster) you can see that the active NameNode for HDFS is hadoop101.
1. yarn-site.xml is configured as follows:
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <!-- Enable resourcemanager HA -->
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <!-- Declare the ResourceManager cluster name -->
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>cluster-yarn1</value>
  </property>
  <!-- List of resourcemanager ids -->
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2,rm3</value>
  </property>
  <!-- ========== rm1 ========== -->
  <!-- Hostname of rm1 -->
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>hadoop-chengzhipeng-101</value>
  </property>
  <!-- Web UI address of rm1 -->
  <property>
    <name>yarn.resourcemanager.webapp.address.rm1</name>
    <value>hadoop-chengzhipeng-101:8088</value>
  </property>
  <!-- Internal RPC address of rm1 -->
  <property>
    <name>yarn.resourcemanager.address.rm1</name>
    <value>hadoop-chengzhipeng-101:8032</value>
  </property>
  <!-- Address AMs use to request resources from rm1 -->
  <property>
    <name>yarn.resourcemanager.scheduler.address.rm1</name>
    <value>hadoop-chengzhipeng-101:8030</value>
  </property>
  <!-- Address NMs connect to -->
  <property>
    <name>yarn.resourcemanager.resource-tracker.address.rm1</name>
    <value>hadoop-chengzhipeng-101:8031</value>
  </property>
  <!-- ========== rm2 ========== -->
  <!-- Hostname of rm2 -->
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>hadoop-chengzhipeng-102</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm2</name>
    <value>hadoop-chengzhipeng-102:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address.rm2</name>
    <value>hadoop-chengzhipeng-102:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address.rm2</name>
    <value>hadoop-chengzhipeng-102:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address.rm2</name>
    <value>hadoop-chengzhipeng-102:8031</value>
  </property>
  <!-- ========== rm3 ========== -->
  <!-- Hostname of rm3 -->
  <property>
    <name>yarn.resourcemanager.hostname.rm3</name>
    <value>hadoop-chengzhipeng-103</value>
  </property>
  <!-- Web UI address of rm3 -->
  <property>
    <name>yarn.resourcemanager.webapp.address.rm3</name>
    <value>hadoop-chengzhipeng-103:8088</value>
  </property>
  <!-- Internal RPC address of rm3 -->
  <property>
    <name>yarn.resourcemanager.address.rm3</name>
    <value>hadoop-chengzhipeng-103:8032</value>
  </property>
  <!-- Address AMs use to request resources from rm3 -->
  <property>
    <name>yarn.resourcemanager.scheduler.address.rm3</name>
    <value>hadoop-chengzhipeng-103:8030</value>
  </property>
  <!-- Address NMs connect to -->
  <property>
    <name>yarn.resourcemanager.resource-tracker.address.rm3</name>
    <value>hadoop-chengzhipeng-103:8031</value>
  </property>
  <!-- ZooKeeper ensemble address -->
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>hadoop-chengzhipeng-101:2181,hadoop-chengzhipeng-102:2181,hadoop-chengzhipeng-103:2181</value>
  </property>
  <!-- Enable automatic recovery -->
  <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
  </property>
  <!-- Store resourcemanager state in the zookeeper cluster -->
  <property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
  </property>
  <!-- Environment variable inheritance -->
  <property>
    <name>yarn.nodemanager.env-whitelist</name>
    <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
  </property>
</configuration>
Increase the following two parameters in core-site.xml as appropriate:
<!-- Number of times the NN retries connecting to the JNs; default is 10 -->
<property>
<name>ipc.client.connect.max.retries</name>
<value>20</value>
</property>
<!-- Retry interval; default is 1 s -->
<property>
<name>ipc.client.connect.retry.interval</name>
<value>5000</value>
</property>
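A quick worked example of what those two values mean together: the NameNode keeps retrying the JournalNode connection for roughly retries × interval before giving up.

```shell
RETRIES=20        # ipc.client.connect.max.retries
INTERVAL_MS=5000  # ipc.client.connect.retry.interval (milliseconds)

# Upper bound on how long the NN waits for the JNs at startup
TOTAL_S=$(( RETRIES * INTERVAL_MS / 1000 ))
echo "NameNode will retry the JournalNodes for up to ${TOTAL_S} s"
```

With the values above the window is 100 s, versus only 10 s with the defaults, which gives slow JournalNodes time to come up.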
Note:
This setup uses three Hadoop nodes, and the ZooKeeper ensemble also uses three nodes. If the Hadoop cluster has more than three nodes, ZooKeeper does not have to run on every node, but it must be deployed on at least three, because its leader election only works while more than half of the servers are alive.
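The majority rule can be made concrete with a little arithmetic: an ensemble of n servers needs floor(n/2)+1 of them alive, so it tolerates the loss of the rest.

```shell
# For a few ensemble sizes, show the quorum and how many failures survive it
for n in 3 4 5; do
  quorum=$(( n / 2 + 1 ))       # strict majority
  tolerated=$(( n - quorum ))   # failures the ensemble can absorb
  echo "ensemble of $n: needs $quorum alive, tolerates $tolerated failure(s)"
done
```

Note that a 4-node ensemble tolerates no more failures than a 3-node one, which is why odd ensemble sizes are the usual choice.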