
Hadoop HA Cluster Setup (Detailed Guide)

Hadoop Distributed Cluster Setup

I. Linux System Configuration

Hostname    IP address
master      192.168.16.241
slave-1     192.168.16.58
slave-2     192.168.16.129

Create three Linux virtual machines in Oracle VM VirtualBox (this guide uses CentOS 7) and connect to each of them with MobaXterm.

1. Change the hostnames

master node:

[root@localhost ~]# hostname master
[root@localhost ~]# su
[root@master ~]# 

slave-1 node:

[root@localhost ~]# hostname slave-1
[root@localhost ~]# su
[root@slave-1 ~]#

slave-2 node:

[root@localhost ~]# hostname slave-2
[root@localhost ~]# su
[root@slave-2 ~]#
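Note that the hostname command above only takes effect for the current session. On CentOS 7 the change can be made permanent with hostnamectl; a minimal sketch (run on each node with its own name):

#on master (repeat on slave-1 and slave-2 with their respective names)
[root@master ~]# hostnamectl set-hostname master
#the static hostname is the one that survives a reboot
[root@master ~]# hostnamectl status | grep "Static hostname"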

2. Edit /etc/hosts

[root@master ~]# vi /etc/hosts
#Append each node's IP address and hostname at the end of the file
192.168.16.241 master
192.168.16.58 slave-1
192.168.16.129 slave-2

Copy the hosts file from the master node to the other two nodes:

[root@master ~]# scp /etc/hosts slave-1:/etc/
The authenticity of host 'slave-1 (192.168.16.58)' can't be established.
ECDSA key fingerprint is SHA256:zVFBruz4zTLJfeOgkPR0acOJDBnXqu6qKoADRIbxi6k.
ECDSA key fingerprint is MD5:e8:66:78:de:eb:06:fe:06:0d:ae:60:a2:c3:42:ef:f2.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'slave-1,192.168.16.58' (ECDSA) to the list of known hosts.
root@slave-1's password:
hosts                                                                100%  225    72.6KB/s   00:00
[root@master ~]# scp /etc/hosts slave-2:/etc/
The authenticity of host 'slave-2 (192.168.16.129)' can't be established.
ECDSA key fingerprint is SHA256:zVFBruz4zTLJfeOgkPR0acOJDBnXqu6qKoADRIbxi6k.
ECDSA key fingerprint is MD5:e8:66:78:de:eb:06:fe:06:0d:ae:60:a2:c3:42:ef:f2.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'slave-2,192.168.16.129' (ECDSA) to the list of known hosts.
root@slave-2's password:
hosts                                                                100%  225   171.0KB/s   00:00
[root@master ~]#
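A quick optional check that the new hosts entries resolve on the current node (a small sketch):

[root@master ~]# for h in master slave-1 slave-2; do ping -c 1 $h > /dev/null && echo "$h ok"; done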

3. Disable the firewall (required on all three nodes)

Stop the firewall: systemctl stop firewalld

Check the firewall status: systemctl status firewalld

[root@master ~]# systemctl stop firewalld
[root@master ~]# systemctl status firewalld
● firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled; vendor preset: enabled)
   Active: inactive (dead) since Mon 2020-09-28 03:09:22 EDT; 6s ago
     Docs: man:firewalld(1)
  Process: 4564 ExecStart=/usr/sbin/firewalld --nofork --nopid $FIREWALLD_ARGS (code=exited, status=0/SUCCESS)
 Main PID: 4564 (code=exited, status=0/SUCCESS)

Sep 28 03:08:10 master systemd[1]: Starting firewalld - dynamic firewall daemon...
Sep 28 03:08:10 master systemd[1]: Started firewalld - dynamic firewall daemon.
Sep 28 03:09:21 master systemd[1]: Stopping firewalld - dynamic firewall daemon...
Sep 28 03:09:22 master systemd[1]: Stopped firewalld - dynamic firewall daemon.
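systemctl stop only stops firewalld until the next reboot; the unit is still enabled, as the status output above shows. To keep it off permanently, a sketch to run on all three nodes:

#disable the firewall so it stays off after a reboot
systemctl disable firewalld
#should now print "disabled"
systemctl is-enabled firewalld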

4. Configure passwordless SSH login

It is enough for the master node to be able to log in to the other two nodes without a password.

Generate a key pair: ssh-keygen

Copy the public key to a node: ssh-copy-id [hostname]

#Run the key generation command on the master node
[root@master ~]# ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:tBDAgN/Qoimjr3jUC98RYgEqktS2uO8giLXsFgWZrRs root@master
The key's randomart image is:
+---[RSA 2048]----+
| o+O...          |
|oo*o=  .         |
|=+o*... .        |
|B.E.= .o .       |
|o.o* . .S        |
|++=.. .          |
|++++ o .         |
|o.=.o .          |
|o+o.             |
+----[SHA256]-----+
#Copy the master public key to the master node itself
[root@master ~]# ssh-copy-id master
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
The authenticity of host 'master (192.168.16.241)' can't be established.
ECDSA key fingerprint is SHA256:zVFBruz4zTLJfeOgkPR0acOJDBnXqu6qKoADRIbxi6k.
ECDSA key fingerprint is MD5:e8:66:78:de:eb:06:fe:06:0d:ae:60:a2:c3:42:ef:f2.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@master's password:

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'master'"
and check to make sure that only the key(s) you wanted were added.
#Copy the master public key to the slave-1 node
[root@master ~]# ssh-copy-id slave-1
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@slave-1's password:

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'slave-1'"
and check to make sure that only the key(s) you wanted were added.
#Copy the master public key to the slave-2 node
[root@master ~]# ssh-copy-id slave-2
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@slave-2's password:

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'slave-2'"
and check to make sure that only the key(s) you wanted were added.

Verify that master can log in to the other two nodes without a password:

[root@master ~]# ssh slave-1
Last login: Mon Sep 28 03:14:54 2020 from master
[root@slave-1 ~]#
#Log out of slave-1
[root@slave-1 ~]# exit
logout
Connection to slave-1 closed.
[root@master ~]# ssh slave-2
Last login: Mon Sep 28 03:00:50 2020
[root@slave-2 ~]#
#If no password prompt appears, passwordless login is working

Note: be sure to return to the master node before continuing with the next steps.
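An optional batch check from master; with BatchMode enabled, ssh fails instead of prompting if a key is missing (a small sketch):

#each line should print the remote hostname with no password prompt
[root@master ~]# for h in master slave-1 slave-2; do ssh -o BatchMode=yes $h hostname; done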

II. Downloading and Uploading the Components

Download links:

jdk: 
https://mirrors.dtops.cc/java/8/8u212/jdk-8u212-linux-x64.tar.gz
zookeeper:
https://mirrors.dtops.cc/apache/zookeeper/zookeeper-3.4.14/zookeeper-3.4.14.tar.gz
Hadoop:
https://mirrors.dtops.cc/apache/hadoop/common/hadoop-2.9.0/hadoop-2.9.0.tar.gz

Upload the archives to master with an FTP/SFTP tool. Note that the versions actually used below are jdk-8u251 and hadoop-2.9.2, so adjust the download links to match the archives you upload.

#Check the uploaded files
[root@master ~]# ls
anaconda-ks.cfg  hadoop-2.9.2.tar.gz  jdk-8u251-linux-x64.tar.gz  zookeeper-3.4.14.tar.gz

III. Installing the JDK

1. Extract the JDK to /usr/local
[root@master ~]# tar -zxvf jdk-8u251-linux-x64.tar.gz -C /usr/local/
2. Rename the directory
[root@master ~]# mv /usr/local/jdk1.8.0_251/ /usr/local/jdk/
3. Set the JDK environment variables
#Create and edit jdk.sh
[root@master ~]# vi /etc/profile.d/jdk.sh
#Add the following lines
export JAVA_HOME=/usr/local/jdk
export PATH=$PATH:$JAVA_HOME/bin

Save and exit with :wq

4. Apply the JDK environment variables
[root@master ~]# source /etc/profile.d/jdk.sh
5. Verify the JDK installation
[root@master ~]# java -version
java version "1.8.0_251"
Java(TM) SE Runtime Environment (build 1.8.0_251-b08)
Java HotSpot(TM) 64-Bit Server VM (build 25.251-b08, mixed mode)
6. Copy the JDK and its environment variables from master to the other nodes
#Copy the JDK to slave-1 and slave-2
[root@master ~]# scp -r /usr/local/jdk/ slave-1:/usr/local/
[root@master ~]# scp -r /usr/local/jdk/ slave-2:/usr/local/
#Copy the environment variables to slave-1 and slave-2
[root@master ~]# scp /etc/profile.d/jdk.sh slave-1:/etc/profile.d/
jdk.sh                                                               100%   65    54.8KB/s   00:00
[root@master ~]# scp /etc/profile.d/jdk.sh slave-2:/etc/profile.d/
jdk.sh                                                               100%   65    48.9KB/s   00:00

Then apply the environment variables on the other two nodes (source /etc/profile.d/jdk.sh); a quick remote check is sketched below.
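To apply and verify the JDK on the other nodes without logging in interactively, something like the following can be run from master (a sketch):

#both should report java version "1.8.0_251"
[root@master ~]# ssh slave-1 'source /etc/profile.d/jdk.sh; java -version'
[root@master ~]# ssh slave-2 'source /etc/profile.d/jdk.sh; java -version'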

IV. Installing and Configuring ZooKeeper

1. Extract
[root@master ~]# tar -zxvf zookeeper-3.4.14.tar.gz -C /usr/local/
2. Rename the directory
[root@master ~]# mv /usr/local/zookeeper-3.4.14/ /usr/local/zookeeper
3. Set the ZooKeeper environment variables
[root@master ~]# vi /etc/profile.d/zookeeper.sh
#Add the following lines
export ZOOKEEPER_HOME=/usr/local/zookeeper
export PATH=$PATH:$ZOOKEEPER_HOME/bin
#Save and exit with :wq
4. Apply the environment variables
[root@master ~]# source /etc/profile.d/zookeeper.sh
5. Configure ZooKeeper
#The configuration files live under /usr/local/zookeeper/conf/
[root@master ~]# cd /usr/local/zookeeper/conf/
#Rename zoo_sample.cfg to zoo.cfg
[root@master conf]# mv zoo_sample.cfg zoo.cfg
#Edit the configuration file
[root@master conf]# vi zoo.cfg
#Change the data directory
dataDir=/usr/local/zookeeper/tmp
#Add the server entries
server.1=master:2888:3888
server.2=slave-1:2888:3888
server.3=slave-2:2888:3888
#Save and exit with :wq
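For reference, a minimal complete zoo.cfg after these edits might look like this (tickTime, initLimit, syncLimit and clientPort keep the defaults shipped in zoo_sample.cfg):

tickTime=2000
initLimit=10
syncLimit=5
clientPort=2181
dataDir=/usr/local/zookeeper/tmp
server.1=master:2888:3888
server.2=slave-1:2888:3888
server.3=slave-2:2888:3888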
6. Create a tmp directory under the zookeeper directory and an empty myid file inside it
[root@master zookeeper]# mkdir tmp
[root@master zookeeper]# cd tmp
[root@master tmp]# touch myid
[root@master tmp]# ls
myid
7. Copy ZooKeeper and its environment variables from master to slave-1 and slave-2
#Copy the zookeeper directory
[root@master ~]# scp -r /usr/local/zookeeper/ slave-1:/usr/local/
[root@master ~]# scp -r /usr/local/zookeeper/ slave-2:/usr/local/
#Copy the zookeeper environment variables
[root@master ~]# scp /etc/profile.d/zookeeper.sh slave-1:/etc/profile.d/
[root@master ~]# scp /etc/profile.d/zookeeper.sh slave-2:/etc/profile.d/

Note: refresh the environment variables on every node (source /etc/profile.d/zookeeper.sh).

8. Set the myid value on each node

master-------->1

slave-1-------->2

slave-2-------->3

#myid on master
[root@master tmp]# cat myid
1
#myid on slave-1
[root@slave-1 tmp]# cat myid
2
#myid on slave-2
[root@slave-2 tmp]# cat myid
3
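The three values can also be written from master in one go (a sketch, assuming the zookeeper directory has already been copied to the slaves as in step 7):

[root@master ~]# echo 1 > /usr/local/zookeeper/tmp/myid
[root@master ~]# ssh slave-1 "echo 2 > /usr/local/zookeeper/tmp/myid"
[root@master ~]# ssh slave-2 "echo 3 > /usr/local/zookeeper/tmp/myid"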
9. Start ZooKeeper and verify it is running

Start command: zkServer.sh start

Check command: zkServer.sh status

Start:

#master
[root@master tmp]# zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
#slave-1
[root@slave-1 tmp]# zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
#slave-2
[root@slave-2 tmp]# zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED

Verify:

#master
[root@master tmp]# zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Mode: follower
#slave-1
[root@slave-1 tmp]# zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Mode: leader
#slave-2
[root@slave-2 tmp]# zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Mode: follower

If one node reports Mode: leader and the other two report Mode: follower, the ZooKeeper ensemble is working.
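As an optional extra check, the ZooKeeper client shipped in the bin directory can be pointed at the ensemble (a sketch; a healthy ensemble answers with at least the [zookeeper] znode):

[root@master ~]# zkCli.sh -server master:2181 ls /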

V. Installing and Configuring Hadoop

1. Extract
[root@master ~]# tar -zxvf hadoop-2.9.2.tar.gz -C /usr/local/
2. Rename the directory
[root@master ~]# mv /usr/local/hadoop-2.9.2/ /usr/local/hadoop

3. Configure the Hadoop environment variables
#Create and edit hadoop.sh
[root@master ~]# vi /etc/profile.d/hadoop.sh
#Add the following lines
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
#Apply the environment variables
[root@master ~]# source /etc/profile.d/hadoop.sh
4. Edit the Hadoop configuration files

The configuration files are located under $HADOOP_HOME/etc/hadoop:

[root@master ~]# cd /usr/local/hadoop/etc/hadoop/
[root@master hadoop]# ls
capacity-scheduler.xml      httpfs-env.sh            mapred-env.sh
configuration.xsl           httpfs-log4j.properties  mapred-queues.xml.template
container-executor.cfg      httpfs-signature.secret  mapred-site.xml.template
core-site.xml               httpfs-site.xml          slaves
hadoop-env.cmd              kms-acls.xml             ssl-client.xml.example
hadoop-env.sh               kms-env.sh               ssl-server.xml.example
hadoop-metrics2.properties  kms-log4j.properties     yarn-env.cmd
hadoop-metrics.properties   kms-site.xml             yarn-env.sh
hadoop-policy.xml           log4j.properties         yarn-site.xml
hdfs-site.xml               mapred-env.cmd


The files to modify are hadoop-env.sh, core-site.xml, hdfs-site.xml, mapred-site.xml, yarn-site.xml, and slaves.

1. Edit hadoop-env.sh
#Edit hadoop-env.sh
[root@master hadoop]# vi hadoop-env.sh
#Set the absolute path of the JDK
export JAVA_HOME=/usr/local/jdk
2. Edit core-site.xml
#Edit core-site.xml
[root@master hadoop]# vi core-site.xml
#Add the following configuration
<configuration>
    <!-- Default filesystem URI for HDFS -->
    <property>
        <name>fs.defaultFS</name>
        <!-- This is the logical HDFS address. Because only one nameservice can be used by the HA cluster, it must not point at a single NameNode host such as master; instead we use the nameservice name ns1, which is defined in detail in hdfs-site.xml below. -->
        <value>hdfs://ns1</value>
    </property>
    <!-- Common directory where the NameNode, DataNode, JournalNode, etc. store their data -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/usr/local/hadoop/tmp</value>
    </property>
    <!-- Addresses and ports of the ZooKeeper ensemble -->
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>master:2181,slave-1:2181,slave-2:2181</value>
    </property>
</configuration>

Save and exit with :wq

3. Edit hdfs-site.xml
<configuration>
    <!-- Number of block replicas kept by the DataNodes (the default is 3) -->
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
    <!-- HDFS permission checking; disabled here to keep the setup simple -->
    <property>
        <name>dfs.permissions</name>
        <value>false</value>
    </property>
    <!-- Name of the HDFS nameservice; must match the name used in core-site.xml -->
    <property>
        <name>dfs.nameservices</name>
        <value>ns1</value>
    </property>
    <!-- The two NameNodes of ns1 are called nn1 and nn2 -->
    <property>
        <name>dfs.ha.namenodes.ns1</name>
        <value>nn1,nn2</value>
    </property>
    <!-- RPC address and port clients use to reach the NameNode on master -->
    <property>
        <name>dfs.namenode.rpc-address.ns1.nn1</name>
        <value>master:9000</value>
    </property>
    <!-- HTTP address of the NameNode on master; the DataNodes also report to the NameNode through this address -->
    <property>
        <name>dfs.namenode.http-address.ns1.nn1</name>
        <value>master:50070</value>
    </property>
    <!-- RPC address and port clients use to reach the NameNode on slave-1 -->
    <property>
        <name>dfs.namenode.rpc-address.ns1.nn2</name>
        <value>slave-1:9000</value>
    </property>
    <!-- HTTP address of the NameNode on slave-1 -->
    <property>
        <name>dfs.namenode.http-address.ns1.nn2</name>
        <value>slave-1:50070</value>
    </property>
    <!-- Internal service RPC address of the NameNode on master, used for HDFS-internal RPC traffic -->
    <property>
        <name>dfs.namenode.servicerpc-address.ns1.nn1</name>
        <value>master:53310</value>
    </property>
    <!-- Internal service RPC address of the NameNode on slave-1 -->
    <property>
        <name>dfs.namenode.servicerpc-address.ns1.nn2</name>
        <value>slave-1:53310</value>
    </property>
    <!-- Enable automatic failover for ns1: when the active NameNode fails, the standby takes over automatically -->
    <property>
        <name>dfs.ha.automatic-failover.enabled.ns1</name>
        <value>true</value>
    </property>
    <!-- JournalNode quorum, Hadoop's built-in shared storage used by the two NameNodes of ns1 to share and synchronize the edits log -->
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://master:8485;slave-1:8485;slave-2:8485/ns1</value>
    </property>
    <!-- Class responsible for performing the failover for ns1 -->
    <property>
        <name>dfs.client.failover.proxy.provider.ns1</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <!-- Local directory where the JournalNodes store the shared edits -->
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/usr/local/hadoop/journal</value>
    </property>
    <!-- Use sshfence to fence the failed NameNode during a failover -->
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
    </property>
    <!-- Private key used by sshfence; passwordless SSH between the NameNodes is required -->
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/root/.ssh/id_rsa</value>
    </property>
    <!-- Timeout for the fencing SSH connection, in ms -->
    <property>
        <name>dfs.ha.fencing.ssh.connect-timeout</name>
        <value>10000</value>
    </property>
    <!-- Number of NameNode handler threads -->
    <property>
        <name>dfs.namenode.handler.count</name>
        <value>100</value>
    </property>
</configuration>
4. Edit mapred-site.xml
[root@master hadoop]# cp mapred-site.xml.template mapred-site.xml
[root@master hadoop]# vi mapred-site.xml
#Edit mapred-site.xml
<configuration>
    <property>
        <!-- Run MapReduce applications on YARN -->
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

5. Edit yarn-site.xml
[root@master hadoop]# vi yarn-site.xml
#Add the following configuration
<configuration>
    <!-- Hostname of the ResourceManager -->
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>master</value>
    </property>
    <!-- Auxiliary shuffle service that lets MapReduce run on YARN -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <!-- Address the ResourceManager exposes to clients, used to submit and kill applications -->
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>master:18040</value>
    </property>
    <!-- Address the ResourceManager exposes to ApplicationMasters for requesting and releasing resources -->
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>master:18030</value>
    </property>
    <!-- Address NodeManagers use to send heartbeats and receive tasks -->
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>master:18025</value>
    </property>
    <!-- Address administrators use to send management commands to the ResourceManager -->
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>master:18141</value>
    </property>
    <!-- Web UI address for viewing cluster information in a browser -->
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>master:8088</value>
    </property>
    <!-- Enable log aggregation -->
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>
</configuration>

6. Edit slaves

The slaves file specifies which nodes in the cluster act as DataNodes.

[root@master hadoop]# vi slaves
#Add the following lines
slave-1
slave-2
7. Distribute Hadoop to the other nodes

Copy the configured Hadoop directory and the Hadoop environment variable file from master to the other nodes.

#Copy Hadoop to slave-1
[root@master ~]# scp -r /usr/local/hadoop/ slave-1:/usr/local/
#Copy Hadoop to slave-2
[root@master ~]# scp -r /usr/local/hadoop/ slave-2:/usr/local/
#Copy the Hadoop environment variable file to slave-1
[root@master ~]# scp /etc/profile.d/hadoop.sh slave-1:/etc/profile.d/
#Copy the Hadoop environment variable file to slave-2
[root@master ~]# scp /etc/profile.d/hadoop.sh slave-2:/etc/profile.d/
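After the copy, the installation on the slaves can be checked remotely from master (an optional sketch):

#both should report Hadoop 2.9.2
[root@master ~]# ssh slave-1 'source /etc/profile.d/hadoop.sh; hadoop version'
[root@master ~]# ssh slave-2 'source /etc/profile.d/hadoop.sh; hadoop version'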
8. Start the cluster

1. Start the JournalNode shared-storage cluster (run on all three nodes)

[root@master ~]# hadoop-daemon.sh start journalnode
starting journalnode, logging to /usr/local/hadoop/logs/hadoop-root-journalnode-master.out
#Check that the process started
[root@master ~]# jps
6706 QuorumPeerMain
8727 JournalNode
8793 Jps
#slave-1
[root@slave-1 ~]# hadoop-daemon.sh start journalnode
starting journalnode, logging to /usr/local/hadoop/logs/hadoop-root-journalnode-slave-1.out
[root@slave-1 ~]# jps
1744 QuorumPeerMain
3206 Jps
3119 JournalNode
#slave-2
[root@slave-2 ~]# hadoop-daemon.sh start journalnode
starting journalnode, logging to /usr/local/hadoop/logs/hadoop-root-journalnode-slave-2.out
[root@slave-2 ~]# jps
3218 Jps
1748 QuorumPeerMain
3182 JournalNode

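Besides jps, each node can confirm that its JournalNode is listening on RPC port 8485, the port used in dfs.namenode.shared.edits.dir (a sketch using ss, which ships with CentOS 7):

#a LISTEN entry on port 8485 should appear on all three nodes
[root@master ~]# ss -lnt | grep 8485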

2. Format the active NameNode

Command: hadoop namenode -format

[root@master hadoop]# hadoop namenode -format
#The following line indicates the format succeeded
20/09/29 03:50:40 INFO common.Storage: Storage directory /usr/local/hadoop/tmp/dfs/name has been successfully formatted.

Start the NameNode on the master node: hadoop-daemon.sh start namenode

Bootstrap the standby NameNode on the slave-1 node: hdfs namenode -bootstrapStandby

Initialize the ZKFC state in ZooKeeper on master: hdfs zkfc -formatZK

The ZKFC itself is started in the next step with: hadoop-daemon.sh start zkfc
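Put together, the sequence from this step looks like the following; run each command on the node named in the comment:

#on master: start the freshly formatted NameNode
[root@master ~]# hadoop-daemon.sh start namenode
#on slave-1: copy the formatted metadata from the active NameNode
[root@slave-1 ~]# hdfs namenode -bootstrapStandby
#back on master: create the HA state znode in ZooKeeper
[root@master ~]# hdfs zkfc -formatZK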

3. Start the ZKFailoverController (zkfc)

[root@master hadoop]# hadoop-daemon.sh start zkfc
starting zkfc, logging to /usr/local/hadoop/logs/hadoop-root-zkfc-master.out
[root@master hadoop]# jps
5170 Jps
3349 JournalNode
5061 DFSZKFailoverController
4471 NameNode
1966 QuorumPeerMain

4. Start HDFS

#Run on master
start-dfs.sh

5. Start YARN

start-yarn.sh
#Run on slave-1:
yarn-daemon.sh start resourcemanager

6. Start the JobHistoryServer

mr-jobhistory-daemon.sh start historyserver

7. Verify the processes (jps)

[root@master hadoop]# jps
5826 DataNode
7122 JobHistoryServer
7539 Jps
3349 JournalNode
5061 DFSZKFailoverController
4471 NameNode
6615 ResourceManager
1966 QuorumPeerMain
[root@slave-1 ~]# jps
3827 NameNode
5747 JobHistoryServer
3909 DataNode
1561 JournalNode
6041 Jps
1471 QuorumPeerMain
[root@slave-2 ~]# jps
1817 JobHistoryServer
1913 Jps
1484 JournalNode
1405 QuorumPeerMain
1598 DataNode
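As a final check of the HA setup, the NameNode states can be queried and a small example job run (a sketch; the example jar path matches the hadoop-2.9.2 layout used here, so adjust it if your installation differs):

#one NameNode should report active and the other standby
[root@master ~]# hdfs haadmin -getServiceState nn1
[root@master ~]# hdfs haadmin -getServiceState nn2
#run the bundled pi example to exercise HDFS and YARN
[root@master ~]# hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.2.jar pi 2 10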