Official documentation: Apache Hadoop 3.3.4 – HDFS High Availability Using the Quorum Journal Manager
| linux121        | linux122    | linux123    |
|-----------------|-------------|-------------|
| NameNode        | NameNode    |             |
| JournalNode     | JournalNode | JournalNode |
| DataNode        | DataNode    | DataNode    |
| ZK              | ZK          | ZK          |
| ResourceManager |             |             |
| NodeManager     | NodeManager | NodeManager |
Start the ZooKeeper cluster:
zk.sh start
Check its status:
zk.sh status
Note: zk.sh here is a group-operation script I wrote myself.
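The zk.sh script itself is not shown in the post. A minimal sketch of such a group-operation script might look like the following; the ZooKeeper install path and the passwordless-SSH assumption are mine, not from the original:

```shell
#!/bin/bash
# Hypothetical zk.sh group-operation script (not from the original post).
# Assumes passwordless SSH to all three hosts; ZK_HOME below is a guess.
HOSTS="linux121 linux122 linux123"
ZK_HOME=/opt/lagou/servers/zookeeper-3.4.14

# Build the remote command for a given action (start|stop|status).
zk_cmd() {
  echo "$ZK_HOME/bin/zkServer.sh $1"
}

# Run the action on every host; does nothing when called without an argument.
if [ -n "$1" ]; then
  for host in $HOSTS; do
    echo "---------- $host ----------"
    ssh "$host" "$(zk_cmd "$1")"
  done
fi
```

With this in place, `zk.sh start` and `zk.sh status` behave as used above.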
(1) Stop the existing HDFS cluster
stop-dfs.sh
(2) On every node, create an ha directory under /opt/lagou/servers
mkdir /opt/lagou/servers/ha
(3) Copy hadoop-2.9.2 from /opt/lagou/servers/ into the ha directory
cp -r hadoop-2.9.2 ha
(4) Delete the data directory of the original cluster
rm -rf /opt/lagou/servers/ha/hadoop-2.9.2/data
(5) Configure hdfs-site.xml (for this and the following configurations, clear the previous contents first)
<property>
  <name>dfs.nameservices</name>
  <value>lagoucluster</value>
</property>
<property>
  <name>dfs.ha.namenodes.lagoucluster</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.lagoucluster.nn1</name>
  <value>linux121:9000</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.lagoucluster.nn2</name>
  <value>linux122:9000</value>
</property>
<property>
  <name>dfs.namenode.http-address.lagoucluster.nn1</name>
  <value>linux121:50070</value>
</property>
<property>
  <name>dfs.namenode.http-address.lagoucluster.nn2</name>
  <value>linux122:50070</value>
</property>
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://linux121:8485;linux122:8485;linux123:8485/lagou</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.lagoucluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/root/.ssh/id_rsa</value>
</property>
<property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/opt/journalnode</value>
</property>
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>
(6) Configure core-site.xml
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://lagoucluster</value>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/opt/lagou/servers/ha/hadoop-2.9.2/data/tmp</value>
</property>
<property>
  <name>ha.zookeeper.quorum</name>
  <value>linux121:2181,linux122:2181,linux123:2181</value>
</property>
(7) Copy the configured Hadoop environment to the other nodes
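The post gives no command for this distribution step. One way to sketch it, assuming rsync is available and linux121 is the node the configuration was edited on (both assumptions mine):

```shell
#!/bin/bash
# Hypothetical distribution script (not from the original post).
# Assumes rsync and passwordless SSH; paths match the ha setup above.
SRC=/opt/lagou/servers/ha/hadoop-2.9.2
TARGETS="linux122 linux123"

# Build the rsync invocation for one target host.
sync_cmd() {
  echo "rsync -av $SRC/ $1:$SRC/"
}

# Sync to every target; does nothing when called without an argument.
if [ -n "$1" ]; then
  for host in $TARGETS; do
    echo "---------- syncing to $host ----------"
    $(sync_cmd "$host")
  done
fi
```

Plain `scp -r` would work equally well; rsync just avoids re-copying unchanged files on repeated runs.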
(1) On each JournalNode host, start the journalnode service with the following command (run it from the HA installation directory; do not use the copy on the PATH from the environment variables)
/opt/lagou/servers/ha/hadoop-2.9.2/sbin/hadoop-daemon.sh start journalnode
(2) On [nn1], format the NameNode and start it
/opt/lagou/servers/ha/hadoop-2.9.2/bin/hdfs namenode -format
/opt/lagou/servers/ha/hadoop-2.9.2/sbin/hadoop-daemon.sh start namenode
(3) On [nn2], synchronize nn1's metadata
/opt/lagou/servers/ha/hadoop-2.9.2/bin/hdfs namenode -bootstrapStandby
(4) On [nn1], initialize the ZKFC state in ZooKeeper
/opt/lagou/servers/ha/hadoop-2.9.2/bin/hdfs zkfc -formatZK
(5) On [nn1], start the cluster
/opt/lagou/servers/ha/hadoop-2.9.2/sbin/start-dfs.sh
(6) Verify
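The post leaves the verification step empty. `hdfs haadmin -getServiceState <nnid>` is the standard CLI for querying NameNode state; wrapping it in a small check is my own sketch (the RUN indirection exists only so the logic can be exercised without a live cluster):

```shell
#!/bin/bash
# Sketch of an HDFS HA sanity check (not from the original post).
# RUN defaults to the real CLI but can be overridden for dry runs.
RUN=${RUN:-"hdfs haadmin -getServiceState"}

# Query the state (active/standby) of one NameNode by its id.
nn_state() {
  $RUN "$1"
}

# Expect exactly one active and one standby NameNode.
check_ha() {
  s1=$(nn_state nn1)
  s2=$(nn_state nn2)
  if { [ "$s1" = active ] && [ "$s2" = standby ]; } || \
     { [ "$s1" = standby ] && [ "$s2" = active ]; }; then
    echo "HA OK: nn1=$s1 nn2=$s2"
  else
    echo "HA BROKEN: nn1=$s1 nn2=$s2"
    return 1
  fi
}
```

To test automatic failover on a live cluster, kill the active NameNode process and re-run the check: the standby should become active. The web UIs at linux121:50070 and linux122:50070 show the same states.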
Official documentation: Apache Hadoop 3.3.4 – ResourceManager High Availability
How YARN-HA works
(1) Configure the YARN-HA cluster
(2) Detailed configuration
(3) yarn-site.xml (clear its original contents)
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <!-- Enable ResourceManager HA -->
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <!-- Declare the addresses of the two ResourceManagers -->
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>cluster-yarn</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>linux122</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>linux123</value>
  </property>
  <!-- Specify the ZooKeeper cluster address -->
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>linux121:2181,linux122:2181,linux123:2181</value>
  </property>
  <!-- Enable automatic recovery -->
  <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
  </property>
  <!-- Store ResourceManager state in the ZooKeeper cluster -->
  <property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
  </property>
</configuration>
(4) Sync the updated configuration to the other nodes
(5) Start YARN
sbin/start-yarn.sh
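As with HDFS, the YARN side can be verified with the standard `yarn rmadmin -getServiceState <rmid>` CLI; the check function below is my own sketch, with the same overridable-RUN trick so the logic runs without a cluster:

```shell
#!/bin/bash
# Sketch of a ResourceManager HA check (not from the original post).
# RUN defaults to the real CLI but can be overridden for dry runs.
RUN=${RUN:-"yarn rmadmin -getServiceState"}

# Query the state (active/standby) of one ResourceManager by its id.
rm_state() {
  $RUN "$1"
}

# Expect one active and one standby RM (rm1=linux122, rm2=linux123 above).
check_rm_ha() {
  s1=$(rm_state rm1)
  s2=$(rm_state rm2)
  case "$s1/$s2" in
    active/standby|standby/active) echo "RM HA OK: rm1=$s1 rm2=$s2" ;;
    *) echo "RM HA BROKEN: rm1=$s1 rm2=$s2"; return 1 ;;
  esac
}
```

Note that with RM HA enabled, `start-yarn.sh` in Hadoop 2.x starts only the local ResourceManager; the second RM may need `yarn-daemon.sh start resourcemanager` on its own host.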