Before building the high-availability cluster: if you have already set up a fully distributed Hadoop cluster, first run stop-all.sh to stop all of its services, keeping only the JDK and ZooKeeper, and then proceed with the HA setup.
Goals:
I. Introduction to high-availability clusters
II. Deploying a high-availability cluster
First, create the directory hadoop313-HA on each machine:
mkdir -p /export/servers/hadoop313-HA
On hadoop01, install Hadoop under the /export/servers directory and rename it to hadoop313-HA with the mv command.
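A minimal sketch of this install step, assuming the Hadoop 3.1.3 tarball has already been uploaded to /export/software/ (that path and the tarball name are assumptions):
tar -zxvf /export/software/hadoop-3.1.3.tar.gz -C /export/servers/   # unpack the distribution into /export/servers
mv /export/servers/hadoop-3.1.3 /export/servers/hadoop313-HA         # rename the unpacked directory to hadoop313-HA
Next, add the Hadoop environment variables to /etc/profile: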
vi /etc/profile
# append the following at the end of the file
export HADOOP_HOME=/export/servers/hadoop313-HA
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
source /etc/profile  # reload the file so the changes take effect
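To confirm the variables took effect, a quick check (this simply prints the version of the Hadoop installation now on the PATH):
hadoop version   # should report Hadoop 3.1.3 for this install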
1) Configure the Hadoop runtime environment: vi hadoop-env.sh (this and the following configuration files are under /export/servers/hadoop313-HA/etc/hadoop/)
export JAVA_HOME=/export/servers/jdk1.8.0_241
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_JOURNALNODE_USER=root
export HDFS_ZKFC_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
2) Configure the Hadoop core settings: vi core-site.xml
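For the HA setup to work, core-site.xml must point clients at the ns1 nameservice and tell ZKFC where the ZooKeeper ensemble lives. A minimal sketch, assuming the ns1 nameservice and the ZooKeeper quorum defined in the other configuration files below (the hadoop.tmp.dir path here is an assumption):
<configuration>
  <property>
    <!-- default file system: the HA nameservice rather than a single NameNode -->
    <name>fs.defaultFS</name>
    <value>hdfs://ns1</value>
  </property>
  <property>
    <!-- base directory for Hadoop working data; this path is an assumption -->
    <name>hadoop.tmp.dir</name>
    <value>/export/data/hadoop313-HA/tmp</value>
  </property>
  <property>
    <!-- ZooKeeper ensemble used by ZKFC for automatic failover -->
    <name>ha.zookeeper.quorum</name>
    <value>hadoop01:2181,hadoop02:2181,hadoop03:2181</value>
  </property>
</configuration>
3) Configure HDFS: vi hdfs-site.xml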
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/export/data/hadoop313-HA/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/export/data/hadoop313-HA/datanode</value>
  </property>
  <property>
    <name>dfs.nameservices</name>
    <value>ns1</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.ns1</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ns1.nn1</name>
    <value>hadoop01:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.ns1.nn1</name>
    <value>hadoop01:9870</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ns1.nn2</name>
    <value>hadoop02:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.ns1.nn2</name>
    <value>hadoop02:9870</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://hadoop01:8485;hadoop02:8485;hadoop03:8485/ns1</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/export/data/journaldata</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.ns1</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>
      sshfence
      shell(/bin/true)
    </value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/root/.ssh/id_rsa</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>30000</value>
  </property>
</configuration>
4) Configure MapReduce: vi mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.env</name>
    <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
  </property>
  <property>
    <name>mapreduce.map.env</name>
    <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
  </property>
  <property>
    <name>mapreduce.reduce.env</name>
    <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
  </property>
</configuration>
5) Configure YARN: vi yarn-site.xml
<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>jyarn</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>hadoop01</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>hadoop02</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>hadoop01:2181,hadoop02:2181,hadoop03:2181</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
  </property>
</configuration>
6) Configure the VMs that run the Hadoop worker nodes
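This step edits the workers file, which lists the hosts that run the DataNode and NodeManager daemons. A minimal sketch, assuming all three VMs act as worker nodes (consistent with dfs.replication=3 above):
vi /export/servers/hadoop313-HA/etc/hadoop/workers
# replace the contents with:
hadoop01
hadoop02
hadoop03
Then distribute the Hadoop installation and the environment variables to hadoop02 and hadoop03: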
scp -r /export/servers/hadoop313-HA root@hadoop02:/export/servers/
scp -r /export/servers/hadoop313-HA root@hadoop03:/export/servers/
scp /etc/profile root@hadoop02:/etc/
scp /etc/profile root@hadoop03:/etc/
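On hadoop02 and hadoop03, reload the copied profile so the new environment variables take effect there as well:
source /etc/profile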
Starting the Hadoop HA cluster:
1) Start the JournalNode on each of the VMs hadoop01, hadoop02, and hadoop03:
hdfs --daemon start journalnode
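To verify, jps on each VM should now list a JournalNode process (shown here as an expectation, not captured output):
jps   # look for JournalNode in the output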
2) Format the HDFS file system on hadoop01:
hdfs namenode -format
3) Synchronize the NameNode metadata
scp -r /export/data/hadoop313-HA/namenode/ hadoop02:/export/data/hadoop313-HA/
scp -r /export/data/hadoop313-HA/namenode/ hadoop03:/export/data/hadoop313-HA/
Note: synchronizing the NameNode metadata ensures that both NameNodes (nn1 on hadoop01 and nn2 on hadoop02) hold identical FSImage files when HDFS is started for the first time; strictly speaking, only the copy to hadoop02 is required, since hadoop03 does not run a NameNode in this configuration. This step is performed only once, before the first startup of the Hadoop HA cluster.
4) Format ZKFC
To ensure that the ZooKeeper cluster can provide high availability for HDFS through ZKFC, format ZKFC before the first startup of the Hadoop HA cluster:
hdfs zkfc -formatZK
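Formatting ZKFC registers a /hadoop-ha/ns1 znode in ZooKeeper. If you want to confirm it, a sketch using the ZooKeeper CLI (the zkCli.sh location depends on your ZooKeeper install):
zkCli.sh -server hadoop01:2181
ls /hadoop-ha   # should show [ns1]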
5) Start HDFS
start-dfs.sh
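With automatic failover enabled, start-dfs.sh brings up the NameNodes, DataNodes, JournalNodes, and ZKFC daemons. A quick way to check the HA state (nn1 and nn2 are the NameNode IDs from hdfs-site.xml; which one ends up active is not deterministic):
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
One of the two should report active and the other standby; the web UIs at hadoop01:9870 and hadoop02:9870 show the same states.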
6) Start YARN
start-yarn.sh
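Likewise, the ResourceManager HA state can be checked with (rm1 and rm2 are the IDs from yarn-site.xml; one should report active, the other standby):
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2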
[Test 1]: Stop the NameNode on hadoop01, which is currently in the active state
hdfs --daemon stop namenode
Now check the NameNode status again: hadoop01 can no longer be reached, and hadoop02 has switched from standby to active.
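A sketch of how to confirm the failover from the command line and then bring hadoop01's NameNode back (after the restart it should rejoin as standby):
hdfs haadmin -getServiceState nn2    # should now report active
hdfs --daemon start namenode         # run on hadoop01 to restart its NameNode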
[Test 2]: Stop the ResourceManager on hadoop01, which is currently in the active state
yarn --daemon stop resourcemanager
Now check the ResourceManager status again: hadoop01 can no longer be reached, and hadoop02 has taken over as the active ResourceManager.
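Similarly for YARN (run the restart on hadoop01; it should rejoin as standby):
yarn rmadmin -getServiceState rm2      # should now report active
yarn --daemon start resourcemanager    # run on hadoop01 to restart its ResourceManager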