
Hadoop HA Mode (a highly available Hadoop cluster mode configured on top of a fully distributed Hadoop deployment, using coordination tools such as ZooKeeper)


Contents

1. Preparation

1.1. The three packages hadoop-3.1.3.tar.gz, jdk-8u212-linux-x64.tar.gz, and apache-zookeeper-3.5.7-bin.tar.gz (extraction code: k5y6)

2. Extract the packages and configure environment variables

3. Name the three nodes master, slave1, and slave2 and set up passwordless SSH

4. Build the ZooKeeper cluster

5. Distribute the extracted Java, /etc/profile, and ZooKeeper; change myid to 2 and 3

6. Start ZooKeeper

6. Distribute Hadoop

7. First start of HDFS in HA mode

7.1. On master, start the ZooKeeper cluster

7.2. On master, format ZooKeeper

7.3. Start the JournalNode process on master, slave1, and slave2

7.4. Then format the NameNode

7.5. Start the NameNodes and the rest of the cluster

8. On master, check the state of services nn2 and rm2

Hadoop HA deployment plan

| Hostname | IP address | Processes |
|---|---|---|
| master | (use your own) | NameNode, DataNode, DFSZKFailoverController, QuorumPeerMain, JournalNode, ResourceManager, NodeManager |
| slave1 | (use your own) | NameNode, DataNode, DFSZKFailoverController, QuorumPeerMain, JournalNode, ResourceManager, NodeManager |
| slave2 | (use your own) | DataNode, NodeManager, QuorumPeerMain, JournalNode |

1. Preparation

1.1. The three packages hadoop-3.1.3.tar.gz, jdk-8u212-linux-x64.tar.gz, and apache-zookeeper-3.5.7-bin.tar.gz (extraction code: k5y6)

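2. Extract the packages and configure environment variables

tar -zxf <tarball> -C <target directory>

After extraction, the directory name apache-zookeeper-3.5.7-bin is long and awkward to type. You can rename it with mv or create a symbolic link with ln -s (the later commands assume /opt/module/zookeeper).

Edit /etc/profile with vim to configure the environment variables, then run source /etc/profile to apply them.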
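As a minimal sketch, the lines added to /etc/profile might look like the following. The install paths are taken from the scp and cd commands used later in this article; the variable name ZOOKEEPER_HOME is an assumption made for illustration, and your own paths may differ.

```shell
# Candidate lines for /etc/profile (paths follow the /opt/module layout used in this article).
export JAVA_HOME=/opt/module/jdk1.8.0_212
export HADOOP_HOME=/opt/module/hadoop-3.1.3
# ZOOKEEPER_HOME is an assumed variable name; only the bin directory on PATH matters here.
export ZOOKEEPER_HOME=/opt/module/zookeeper
# bin holds hadoop/hdfs/yarn and zkServer.sh; sbin holds start-all.sh and hadoop-daemon.sh.
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$ZOOKEEPER_HOME/bin
```

With both bin and sbin on PATH, the commands in the later steps can be run without full paths.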

Verify:

hadoop version

java -version

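3. Name the three nodes master, slave1, and slave2 and set up passwordless SSH

Change the hostname, then disconnect and reconnect so the new name shows up in the session:

hostnamectl set-hostname <hostname>

Passwordless SSH was covered in the earlier article on the fully distributed Hadoop setup, so it is not repeated here.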

4. Build the ZooKeeper cluster

cd /opt/module/zookeeper/conf

cp zoo_sample.cfg zoo.cfg

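Edit zoo.cfg and add the cluster configuration: the zkdata and zkdatalog paths and one server entry per node.

Create the zkdata and zkdatalog directories at the paths set in zoo.cfg. Then, in the zkdata directory, create a file named myid (with touch and an editor, or directly with echo) containing 1 on master. slave1 and slave2 use 2 and 3 respectively.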
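The zoo.cfg additions and the myid file can be sketched as a small shell function. This is an illustration under stated assumptions, not the article's exact configuration: the article does not say where zkdata and zkdatalog live, so they are placed under the ZooKeeper directory here; 2888 and 3888 are the ports conventionally used for quorum and leader election; and the function name write_zk_config is hypothetical.

```shell
# Hypothetical helper: write the cluster settings into zoo.cfg and create myid.
write_zk_config() {
  zk_home="$1"   # ZooKeeper install directory, e.g. /opt/module/zookeeper
  my_id="$2"     # 1 on master, 2 on slave1, 3 on slave2
  cfg="$zk_home/conf/zoo.cfg"

  # Assumed locations for the data and transaction-log directories.
  mkdir -p "$zk_home/conf" "$zk_home/zkdata" "$zk_home/zkdatalog"
  touch "$cfg"

  # Drop earlier dataDir/dataLogDir/server.N lines so re-running does not duplicate them.
  grep -v -E '^(dataDir|dataLogDir|server\.[0-9]+)=' "$cfg" > "$cfg.tmp" || true
  mv "$cfg.tmp" "$cfg"

  cat >> "$cfg" <<EOF
dataDir=$zk_home/zkdata
dataLogDir=$zk_home/zkdatalog
server.1=master:2888:3888
server.2=slave1:2888:3888
server.3=slave2:2888:3888
EOF

  # The number in myid must match this node's server.N entry.
  echo "$my_id" > "$zk_home/zkdata/myid"
}
```

On master this would be called as write_zk_config /opt/module/zookeeper 1, after zoo.cfg has been copied from zoo_sample.cfg. The sample file's own dataDir line is replaced rather than duplicated.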

5. Distribute the extracted Java, /etc/profile, and ZooKeeper; on slave1 and slave2, change myid to 2 and 3

scp -r /opt/module/jdk1.8.0_212/ slave1:/opt/module/

scp -r /opt/module/jdk1.8.0_212/ slave2:/opt/module/

scp /etc/profile slave1:/etc/profile

scp /etc/profile slave2:/etc/profile

(Remember to run source /etc/profile on slave1 and slave2 afterwards.)

scp -r /opt/module/zookeeper/ slave1:/opt/module/

scp -r /opt/module/zookeeper/ slave2:/opt/module/

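6. Start ZooKeeper

Run the following on each of the three nodes; zkServer.sh only controls the server on the machine where it is run.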

zkServer.sh start

Check the status:

zkServer.sh status

cd /opt/module/hadoop-3.1.3/etc/hadoop

vim core-site.xml

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://cluster</value>
  <description>The name of the default file system. A URI whose
  scheme and authority determine the FileSystem implementation. The
  uri's scheme determines the config property (fs.SCHEME.impl) naming
  the FileSystem implementation class. The uri's authority is used to
  determine the host, port, etc. for a filesystem.</description>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/opt/module/hadoop-3.1.3/tmpdir</value>
  <description>A base for other temporary directories.</description>
</property>
<property>
  <name>ha.zookeeper.quorum</name>
  <value>master:2181,slave1:2181,slave2:2181</value>
  <description>
  A list of ZooKeeper server addresses, separated by commas, that are
  to be used by the ZKFailoverController in automatic failover.
  </description>
</property>

vim hdfs-site.xml

<property>
  <name>dfs.replication</name>
  <value>3</value>
  <description>Default block replication.
  The actual number of replications can be specified when the file is created.
  The default is used if replication is not specified in create time.
  </description>
</property>
<property>
  <name>dfs.nameservices</name>
  <value>cluster</value>
  <description>
  Comma-separated list of nameservices.
  </description>
</property>
<property>
  <name>dfs.ha.namenodes.cluster</name>
  <value>nn1,nn2</value>
  <description>
  The prefix for a given nameservice, contains a comma-separated
  list of namenodes for a given nameservice (eg EXAMPLENAMESERVICE).
  Unique identifiers for each NameNode in the nameservice, delimited by
  commas. This will be used by DataNodes to determine all the NameNodes
  in the cluster. For example, if you used "mycluster" as the nameservice
  ID previously, and you wanted to use "nn1" and "nn2" as the individual
  IDs of the NameNodes, you would configure a property
  dfs.ha.namenodes.mycluster, and its value "nn1,nn2".
  </description>
</property>
<property>
  <name>dfs.namenode.rpc-address.cluster.nn1</name>
  <value>master:8020</value>
  <description>
  RPC address that handles client requests for NameNode nn1 of
  nameservice cluster, in the form host:rpc-port.
  </description>
</property>
<property>
  <name>dfs.namenode.rpc-address.cluster.nn2</name>
  <value>slave1:8020</value>
  <description>
  RPC address that handles client requests for NameNode nn2 of
  nameservice cluster, in the form host:rpc-port.
  </description>
</property>
<property>
  <name>dfs.namenode.http-address.cluster.nn1</name>
  <value>master:9870</value>
  <description>
  The address and the base port where the dfs namenode web ui will listen on.
  </description>
</property>
<property>
  <name>dfs.namenode.http-address.cluster.nn2</name>
  <value>slave1:9870</value>
  <description>
  The address and the base port where the dfs namenode web ui will listen on.
  </description>
</property>
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://master:8485;slave1:8485;slave2:8485/cluster</value>
  <description>A directory on shared storage between the multiple namenodes
  in an HA cluster. This directory will be written by the active and read
  by the standby in order to keep the namespaces synchronized. This directory
  does not need to be listed in dfs.namenode.edits.dir above. It should be
  left empty in a non-HA cluster.
  </description>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.cluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  <description>
  The prefix (plus a required nameservice ID) for the class name of the
  configured Failover proxy provider for the host. For more detailed
  information, please consult the "Configuration Details" section of
  the HDFS High Availability documentation.
  </description>
</property>
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
  <description>
  Whether automatic failover is enabled. See the HDFS High
  Availability documentation for details on automatic HA
  configuration.
  </description>
</property>
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>shell(/bin/true)</value>
  <description>
  A list of scripts or Java classes which will be used to fence
  the Active NameNode during a failover. See the HDFS High
  Availability documentation for details on automatic HA
  configuration.
  </description>
</property>

vim yarn-site.xml

<property>
  <description>A comma separated list of services where service name should only
  contain a-zA-Z0-9_ and can not start with numbers</description>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.resourcemanager.ha.enabled</name>
  <value>true</value>
</property>
<property>
  <description>Name of the cluster. In a HA setting,
  this is used to ensure the RM participates in leader
  election for this cluster and ensures it does not affect
  other clusters</description>
  <name>yarn.resourcemanager.cluster-id</name>
  <value>yarn-cluster</value>
</property>
<property>
  <description>The list of RM nodes in the cluster when HA is
  enabled. See description of yarn.resourcemanager.ha.enabled
  for full details on how this is used.</description>
  <name>yarn.resourcemanager.ha.rm-ids</name>
  <value>rm1,rm2</value>
</property>
<property>
  <description>The hostname of the RM.</description>
  <name>yarn.resourcemanager.hostname.rm1</name>
  <value>master</value>
</property>
<property>
  <description>The hostname of the RM.</description>
  <name>yarn.resourcemanager.hostname.rm2</name>
  <value>slave1</value>
</property>
<property>
  <description>
  The http address of the RM web application.
  If only a host is provided as the value,
  the webapp will be served on a random port.
  </description>
  <name>yarn.resourcemanager.webapp.address.rm1</name>
  <value>master:8088</value>
</property>
<property>
  <description>
  The http address of the RM web application.
  If only a host is provided as the value,
  the webapp will be served on a random port.
  </description>
  <name>yarn.resourcemanager.webapp.address.rm2</name>
  <value>slave1:8088</value>
</property>
<property>
  <name>yarn.resourcemanager.zk-address</name>
  <value>master:2181,slave1:2181,slave2:2181</value>
</property>

The remaining configuration files are the same as in the earlier fully distributed Hadoop setup.

6. Distribute Hadoop
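The article gives no command for this step. A minimal sketch, following the same scp pattern used for Java and ZooKeeper above, could look like this. The function name distribute_hadoop is hypothetical, and the target directory /opt/module/ is assumed to exist on both nodes.

```shell
# Hypothetical helper: copy the configured Hadoop directory from master to the other nodes.
distribute_hadoop() {
  for host in slave1 slave2; do
    # Stop at the first failed copy so a half-distributed cluster is noticed early.
    scp -r /opt/module/hadoop-3.1.3/ "$host:/opt/module/" || return 1
  done
}
```

It would be run once on master, as distribute_hadoop, after the three XML files are final.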

7. First start of HDFS in HA mode

7.1. On master, start the ZooKeeper cluster

7.2. On master, format ZooKeeper

hdfs zkfc -formatZK

7.3. Start the JournalNode process on master, slave1, and slave2

 hadoop-daemon.sh start journalnode

7.4. Then format the NameNode (on master)

 hdfs namenode -format

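7.5. Start the NameNodes and the rest of the cluster

Running start-all.sh at this point failed with an error.

The fix was to add the missing variables to the environment.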
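The original error output is not available, so the following is only a guess at what was added. A common failure when Hadoop 3 start scripts are run as root is a message that a variable such as HDFS_NAMENODE_USER is not defined. Assuming that was the error and that the daemons run as root, the added variables would look roughly like this:

```shell
# Assumption: daemons are started by root. Replace root with the actual service user if different.
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_JOURNALNODE_USER=root
export HDFS_ZKFC_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
```

These can go in /etc/profile next to the other exports, followed by source /etc/profile on each node.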

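Start the NameNode on master on its own:

hadoop-daemon.sh start namenode

Then, on the other node that will run a NameNode (slave1 in this plan), synchronize the metadata from master:

hdfs namenode -bootstrapStandby

Finally, run start-all.sh.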
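The whole first-start sequence from 7.1 to 7.5 can be summarized as one script driven from master over the passwordless SSH set up in step 3. This is a sketch rather than a tested deployment script: the helper names on and first_start_ha are hypothetical, and it assumes /etc/profile on every node puts the Hadoop and ZooKeeper commands on PATH. The commands themselves are the ones used in this article; Hadoop 3 also accepts hdfs --daemon start in place of hadoop-daemon.sh start.

```shell
# Hypothetical helper: run a command on a node with the profile loaded.
on() {
  host="$1"; shift
  ssh "$host" ". /etc/profile; $*"
}

# Hypothetical helper: first start of the HA cluster, in the order given in step 7.
first_start_ha() {
  # 7.1 ZooKeeper must be running on every node before anything else.
  for h in master slave1 slave2; do on "$h" "zkServer.sh start" || return 1; done
  # 7.2 Create the HA znode in ZooKeeper (first start only).
  on master "hdfs zkfc -formatZK" || return 1
  # 7.3 JournalNodes must be up before the NameNode is formatted.
  for h in master slave1 slave2; do on "$h" "hadoop-daemon.sh start journalnode" || return 1; done
  # 7.4 Format the first NameNode (first start only; this erases existing metadata).
  on master "hdfs namenode -format" || return 1
  # 7.5 Start nn1, copy its metadata to nn2, then start everything else.
  on master "hadoop-daemon.sh start namenode" || return 1
  on slave1 "hdfs namenode -bootstrapStandby" || return 1
  on master "start-all.sh"
}
```

The two format commands belong to the first start only. On later restarts, starting ZooKeeper on each node and then running start-all.sh is enough.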

 

8. On master, check the state of services nn2 and rm2

hdfs haadmin -getServiceState nn2

yarn rmadmin -getServiceState rm2
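To see both sides of each pair at once, the two commands above can be wrapped in a small loop. The function name show_ha_state is hypothetical; the IDs nn1, nn2, rm1, and rm2 are the ones configured in hdfs-site.xml and yarn-site.xml above.

```shell
# Hypothetical helper: print the HA state of both NameNodes and both ResourceManagers.
show_ha_state() {
  for nn in nn1 nn2; do
    echo "$nn: $(hdfs haadmin -getServiceState "$nn")"
  done
  for rm in rm1 rm2; do
    echo "$rm: $(yarn rmadmin -getServiceState "$rm")"
  done
}
```

In a healthy cluster, each pair should report one active and one standby.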

 

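This returned an error.

I checked whether something was written incorrectly in hdfs-site.xml, and it was: I had typed namenode as namenodes. After correcting it and restarting, the commands succeeded.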
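A typo like this is easy to catch before starting the cluster. The sketch below checks that every HA property name used in the hdfs-site.xml above appears exactly as written. The article does not say which property was mistyped, so the check covers all of them; the function name check_ha_keys is hypothetical.

```shell
# Hypothetical helper: report HA property names that are missing from hdfs-site.xml.
check_ha_keys() {
  file="$1"
  missing=0
  for key in \
    dfs.nameservices \
    dfs.ha.namenodes.cluster \
    dfs.namenode.rpc-address.cluster.nn1 \
    dfs.namenode.rpc-address.cluster.nn2 \
    dfs.namenode.http-address.cluster.nn1 \
    dfs.namenode.http-address.cluster.nn2 \
    dfs.namenode.shared.edits.dir \
    dfs.client.failover.proxy.provider.cluster \
    dfs.ha.automatic-failover.enabled \
    dfs.ha.fencing.methods
  do
    # -F matches the name literally, so dfs.namenodes.* does not pass for dfs.namenode.*
    if ! grep -qF "<name>$key</name>" "$file"; then
      echo "missing: $key"
      missing=1
    fi
  done
  return "$missing"
}
```

For example, check_ha_keys /opt/module/hadoop-3.1.3/etc/hadoop/hdfs-site.xml prints nothing and returns 0 when all names are present, and lists each missing name otherwise.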

 
