The three nodes used in this environment:

Hostname | IP Address |
---|---|
master | 192.168.16.241 |
slave-1 | 192.168.16.58 |
slave-2 | 192.168.16.129 |
Create three Linux virtual machines in Oracle VM VirtualBox (CentOS 7 is used in this environment) and connect to each of them with MobaXterm.
Set the hostname on each node (note that hostname <name> only takes effect for the running system; also run hostnamectl set-hostname <name> if the change should survive a reboot).
master node:
[root@localhost ~]# hostname master
[root@localhost ~]# su
[root@master ~]#
slave-1 node:
[root@localhost ~]# hostname slave-1
[root@localhost ~]# su
[root@slave-1 ~]#
slave-2 node:
[root@localhost ~]# hostname slave-2
[root@localhost ~]# su
[root@slave-2 ~]#
[root@master ~]# vi /etc/hosts
#Append each node's IP address and hostname at the end of the file
192.168.16.241 master
192.168.16.58 slave-1
192.168.16.129 slave-2
Copy the hosts file from the master node to the other two nodes:
[root@master ~]# scp /etc/hosts slave-1:/etc/
The authenticity of host 'slave-1 (192.168.16.58)' can't be established.
ECDSA key fingerprint is SHA256:zVFBruz4zTLJfeOgkPR0acOJDBnXqu6qKoADRIbxi6k.
ECDSA key fingerprint is MD5:e8:66:78:de:eb:06:fe:06:0d:ae:60:a2:c3:42:ef:f2.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'slave-1,192.168.16.58' (ECDSA) to the list of known hosts.
root@slave-1's password:
hosts 100% 225 72.6KB/s 00:00
[root@master ~]# scp /etc/hosts slave-2:/etc/
The authenticity of host 'slave-2 (192.168.16.129)' can't be established.
ECDSA key fingerprint is SHA256:zVFBruz4zTLJfeOgkPR0acOJDBnXqu6qKoADRIbxi6k.
ECDSA key fingerprint is MD5:e8:66:78:de:eb:06:fe:06:0d:ae:60:a2:c3:42:ef:f2.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'slave-2,192.168.16.129' (ECDSA) to the list of known hosts.
root@slave-2's password:
hosts 100% 225 171.0KB/s 00:00
[root@master ~]#
Stop the firewall: systemctl stop firewalld
Check the firewall status: systemctl status firewalld
[root@master ~]# systemctl stop firewalld
[root@master ~]# systemctl status firewalld
● firewalld.service - firewalld - dynamic firewall daemon
Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled; vendor preset: enabled)
Active: inactive (dead) since Mon 2020-09-28 03:09:22 EDT; 6s ago
Docs: man:firewalld(1)
Process: 4564 ExecStart=/usr/sbin/firewalld --nofork --nopid $FIREWALLD_ARGS (code=exited, status=0/SUCCESS)
Main PID: 4564 (code=exited, status=0/SUCCESS)
Sep 28 03:08:10 master systemd[1]: Starting firewalld - dynamic firewall daemon...
Sep 28 03:08:10 master systemd[1]: Started firewalld - dynamic firewall daemon.
Sep 28 03:09:21 master systemd[1]: Stopping firewalld - dynamic firewall daemon...
Sep 28 03:09:22 master systemd[1]: Stopped firewalld - dynamic firewall daemon.
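The same needs to be done on slave-1 and slave-2; a minimal sketch follows (disabling the service is optional, but keeps the firewall off after a reboot):
#slave-1
[root@slave-1 ~]# systemctl stop firewalld
[root@slave-1 ~]# systemctl disable firewalld
#slave-2
[root@slave-2 ~]# systemctl stop firewalld
[root@slave-2 ~]# systemctl disable firewalld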
We only need the master node to be able to log in to the other two nodes without a password.
Generate a key pair: ssh-keygen
Copy the public key to a host: ssh-copy-id [hostname]
#Generate the key pair on the master node
[root@master ~]# ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:tBDAgN/Qoimjr3jUC98RYgEqktS2uO8giLXsFgWZrRs root@master
The key's randomart image is:
+---[RSA 2048]----+
| o+O... |
|oo*o= . |
|=+o*... . |
|B.E.= .o . |
|o.o* . .S |
|++=.. . |
|++++ o . |
|o.=.o . |
|o+o. |
+----[SHA256]-----+
#Copy the master public key to the master node itself
[root@master ~]# ssh-copy-id master
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
The authenticity of host 'master (192.168.16.241)' can't be established.
ECDSA key fingerprint is SHA256:zVFBruz4zTLJfeOgkPR0acOJDBnXqu6qKoADRIbxi6k.
ECDSA key fingerprint is MD5:e8:66:78:de:eb:06:fe:06:0d:ae:60:a2:c3:42:ef:f2.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@master's password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'master'"
and check to make sure that only the key(s) you wanted were added.
#Copy the master public key to the slave-1 node
[root@master ~]# ssh-copy-id slave-1
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@slave-1's password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'slave-1'"
and check to make sure that only the key(s) you wanted were added.
#Copy the master public key to the slave-2 node
[root@master ~]# ssh-copy-id slave-2
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@slave-2's password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'slave-2'"
and check to make sure that only the key(s) you wanted were added.
Verify that the master can log in to the other two nodes without a password:
[root@master ~]# ssh slave-1
Last login: Mon Sep 28 03:14:54 2020 from master
[root@slave-1 ~]#
#Log out of slave-1
[root@slave-1 ~]# exit
logout
Connection to slave-1 closed.
[root@master ~]# ssh slave-2
Last login: Mon Sep 28 03:00:50 2020
[root@slave-2 ~]#
#As long as no password prompt appears, passwordless login is working
Note: be sure to return to the master node before moving on to the next step.
Download links (note that the rest of this guide uses jdk-8u251 and hadoop-2.9.2; adjust the links to whichever versions you actually download):
jdk:
https://mirrors.dtops.cc/java/8/8u212/jdk-8u212-linux-x64.tar.gz
zookeeper:
https://mirrors.dtops.cc/apache/zookeeper/zookeeper-3.4.14/zookeeper-3.4.14.tar.gz
Hadoop:
https://mirrors.dtops.cc/apache/hadoop/common/hadoop-2.9.0/hadoop-2.9.0.tar.gz
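If the master node has direct internet access, a hedged alternative to the FTP upload below is to fetch the packages on the master with wget, for example:
[root@master ~]# wget https://mirrors.dtops.cc/apache/hadoop/common/hadoop-2.9.0/hadoop-2.9.0.tar.gz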
Upload the packages to the master node with an FTP/SFTP tool:
#Check the uploaded files
[root@master ~]# ls
anaconda-ks.cfg hadoop-2.9.2.tar.gz jdk-8u251-linux-x64.tar.gz zookeeper-3.4.14.tar.gz
[root@master ~]# tar -zxvf jdk-8u251-linux-x64.tar.gz -C /usr/local/
[root@master ~]# mv /usr/local/jdk1.8.0_251/ /usr/local/jdk/
#Create and edit jdk.sh
[root@master ~]# vi /etc/profile.d/jdk.sh
#Add the following lines
export JAVA_HOME=/usr/local/jdk
export PATH=$PATH:$JAVA_HOME/bin
Save and quit with :wq
[root@master ~]# source /etc/profile.d/jdk.sh
[root@master ~]# java -version
java version "1.8.0_251"
Java(TM) SE Runtime Environment (build 1.8.0_251-b08)
Java HotSpot(TM) 64-Bit Server VM (build 25.251-b08, mixed mode)
#Copy the JDK to the slave-1 and slave-2 nodes
[root@master ~]# scp -r /usr/local/jdk/ slave-1:/usr/local/
[root@master ~]# scp -r /usr/local/jdk/ slave-2:/usr/local/
#Copy the environment variable script to the slave-1 and slave-2 nodes
[root@master ~]# scp /etc/profile.d/jdk.sh slave-1:/etc/profile.d/
jdk.sh 100% 65 54.8KB/s 00:00
[root@master ~]# scp /etc/profile.d/jdk.sh slave-2:/etc/profile.d/
jdk.sh 100% 65 48.9KB/s 00:00
Then simply apply the environment variables on the other two nodes.
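One way to verify the JDK on the slaves from the master is shown below (a minimal sketch, relying on the passwordless SSH set up earlier):
#Source the new profile script on each slave and check the JDK version
[root@master ~]# ssh slave-1 "source /etc/profile.d/jdk.sh && java -version"
[root@master ~]# ssh slave-2 "source /etc/profile.d/jdk.sh && java -version"
#On normal interactive logins /etc/profile.d/jdk.sh is sourced automatically, so logging in with ssh slave-1 and running java -version also works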
[root@master ~]# tar -zxvf zookeeper-3.4.14.tar.gz -C /usr/local/
[root@master ~]# mv /usr/local/zookeeper-3.4.14/ /usr/local/zookeeper
[root@master ~]# vi /etc/profile.d/zookeeper.sh
#Add the environment variables
export ZOOKEEPER_HOME=/usr/local/zookeeper
export PATH=$PATH:$ZOOKEEPER_HOME/bin
#Save and quit with :wq
[root@master ~]# source /etc/profile.d/zookeeper.sh
#The configuration files live in /usr/local/zookeeper/conf/
[root@master ~]# cd /usr/local/zookeeper/conf/
#Rename zoo_sample.cfg to zoo.cfg
[root@master conf]# mv zoo_sample.cfg zoo.cfg
#Edit the configuration file
[root@master conf]# vi zoo.cfg
#Set the ZooKeeper data directory (dataDir)
dataDir=/usr/local/zookeeper/tmp
#Append the following server entries
server.1=master:2888:3888
server.2=slave-1:2888:3888
server.3=slave-2:2888:3888
#Save and quit with :wq
#Create the dataDir and an empty myid file
[root@master conf]# cd /usr/local/zookeeper/
[root@master zookeeper]# mkdir tmp
[root@master zookeeper]# cd tmp/
[root@master tmp]# touch myid
[root@master tmp]# ls
myid
#Copy zookeeper to the other nodes
[root@master ~]# scp -r /usr/local/zookeeper/ slave-1:/usr/local/
[root@master ~]# scp -r /usr/local/zookeeper/ slave-2:/usr/local/
#Copy the zookeeper environment variable script
[root@master ~]# scp /etc/profile.d/zookeeper.sh slave-1:/etc/profile.d/
[root@master ~]# scp /etc/profile.d/zookeeper.sh slave-2:/etc/profile.d/
Note: refresh the environment variables on each node (source /etc/profile.d/zookeeper.sh).
Each node's myid file must contain its own server id, matching the server.N entries in zoo.cfg:
master-------->1
slave-1-------->2
slave-2-------->3
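A minimal sketch of one way to write these values (the tmp directory and myid file were created above and copied to the slaves by scp):
#On master
[root@master ~]# echo 1 > /usr/local/zookeeper/tmp/myid
#On slave-1
[root@slave-1 ~]# echo 2 > /usr/local/zookeeper/tmp/myid
#On slave-2
[root@slave-2 ~]# echo 3 > /usr/local/zookeeper/tmp/myid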
#myid on master
[root@master tmp]# cat myid
1
#myid on slave-1
[root@slave-1 tmp]# cat myid
2
#myid on slave-2
[root@slave-2 tmp]# cat myid
3
Start command: zkServer.sh start
Check whether ZooKeeper started correctly: zkServer.sh status
Start it on all three nodes:
#master
[root@master tmp]# zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
#slave-1
[root@slave-1 tmp]# zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
#slave-2
[root@slave-2 tmp]# zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
Verify:
#master
[root@master tmp]# zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Mode: follower
#slave-1
[root@slave-1 tmp]# zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Mode: leader
#slave-2
[root@slave-2 tmp]# zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Mode: follower
If one node reports leader and the others report follower, the ZooKeeper cluster is up and working.
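As an optional extra check, you can connect to the ensemble with the CLI client bundled with ZooKeeper (a hedged sketch; zkCli.sh lives in /usr/local/zookeeper/bin, which is already on PATH):
[root@master ~]# zkCli.sh -server master:2181,slave-1:2181,slave-2:2181
#Inside the client shell, ls / should list at least the /zookeeper znode; type quit to leave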
[root@master ~]# tar -zxvf hadoop-2.9.2.tar.gz -C /usr/local/
[root@master ~]# mv /usr/local/hadoop-2.9.2/ /usr/local/hadoop
#Create and edit hadoop.sh
[root@master ~]# vi /etc/profile.d/hadoop.sh
#Add the following
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
#Apply the environment variables
[root@master ~]# source /etc/profile.d/hadoop.sh
The configuration files are located under $HADOOP_HOME/etc/hadoop:
[root@master ~]# cd /usr/local/hadoop/etc/hadoop/
[root@master hadoop]# ls
capacity-scheduler.xml httpfs-env.sh mapred-env.sh
configuration.xsl httpfs-log4j.properties mapred-queues.xml.template
container-executor.cfg httpfs-signature.secret mapred-site.xml.template
core-site.xml httpfs-site.xml slaves
hadoop-env.cmd kms-acls.xml ssl-client.xml.example
hadoop-env.sh kms-env.sh ssl-server.xml.example
hadoop-metrics2.properties kms-log4j.properties yarn-env.cmd
hadoop-metrics.properties kms-site.xml yarn-env.sh
hadoop-policy.xml log4j.properties yarn-site.xml
hdfs-site.xml mapred-env.cmd
The main files to modify are hadoop-env.sh, core-site.xml, hdfs-site.xml, mapred-site.xml, yarn-site.xml and slaves.
#Edit hadoop-env.sh
[root@master hadoop]# vi hadoop-env.sh
# Set the absolute path of the JDK
export JAVA_HOME=/usr/local/jdk
#Edit core-site.xml
[root@master hadoop]# vi core-site.xml
#Add the following configuration
<configuration>
<!-- Default entry-point URI for the HDFS file system -->
<property>
<name>fs.defaultFS</name>
<!-- This is the default HDFS path. Because the HA cluster has two NameNodes, no single NameNode hostname (master or slave-1) can be used here; instead we pick a logical nameservice name, ns1, rather than a specific NameNode address. ns1 is defined in detail in hdfs-site.xml below -->
<value>hdfs://ns1</value>
</property>
<!-- Common directory where the NameNode, DataNode, JournalNode and other daemons store their data -->
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/hadoop/tmp</value>
</property>
<!-- Addresses and ports of the ZooKeeper ensemble -->
<property>
<name>ha.zookeeper.quorum</name>
<value>master:2181,slave-1:2181,slave-2:2181</value>
</property>
</configuration>
Save and quit with :wq
#Edit hdfs-site.xml
[root@master hadoop]# vi hdfs-site.xml
#Add the following configuration
<configuration>
<!-- Number of block replicas kept by the DataNodes; the default is 3 -->
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<!-- Permission checks control access between users; disable them for now -->
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
<!-- Name the HDFS nameservice ns1; it must match the name used in core-site.xml -->
<property>
<name>dfs.nameservices</name>
<value>ns1</value>
</property>
<!-- Name the two NameNodes nn1 and nn2 -->
<property>
<name>dfs.ha.namenodes.ns1</name>
<value>nn1,nn2</value>
</property>
<!-- RPC address and port that clients use to reach the NameNode on master -->
<property>
<name>dfs.namenode.rpc-address.ns1.nn1</name>
<value>master:9000</value>
</property>
<!-- HTTP (web UI) address of the NameNode on master -->
<property>
<name>dfs.namenode.http-address.ns1.nn1</name>
<value>master:50070</value>
</property>
<!-- RPC address and port that clients use to reach the NameNode on slave-1 -->
<property>
<name>dfs.namenode.rpc-address.ns1.nn2</name>
<value>slave-1:9000</value>
</property>
<!-- HTTP (web UI) address of the NameNode on slave-1 -->
<property>
<name>dfs.namenode.http-address.ns1.nn2</name>
<value>slave-1:50070</value>
</property>
<!-- Service RPC address of the NameNode on master, used for HDFS-internal RPC traffic -->
<property>
<name>dfs.namenode.servicerpc-address.ns1.nn1</name>
<value>master:53310</value>
</property>
<!-- Service RPC address of the NameNode on slave-1 -->
<property>
<name>dfs.namenode.servicerpc-address.ns1.nn2</name>
<value>slave-1:53310</value>
</property>
<!-- Enable automatic failover for ns1: when the active NameNode fails, switch to the other NameNode automatically -->
<property>
<name>dfs.ha.automatic-failover.enabled.ns1</name>
<value>true</value>
</property>
<!-- JournalNode quorum (Hadoop's built-in shared storage) used by the two NameNodes of ns1 to share and synchronize their edits -->
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://master:8485;slave-1:8485;slave-2:8485/ns1</value>
</property>
<!-- Class that performs the failover on the client side when the active NameNode of ns1 changes -->
<property>
<name>dfs.client.failover.proxy.provider.ns1</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<!-- Local directory where the JournalNodes store the shared edits -->
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/usr/local/hadoop/journal</value>
</property>
<!-- Use sshfence to fence the old active NameNode during a failover -->
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<!-- Private key used by sshfence, so the NameNodes can reach each other over SSH without a password -->
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/root/.ssh/id_rsa</value>
</property>
<!-- Fencing SSH connection timeout, in milliseconds -->
<property>
<name>dfs.ha.fencing.ssh.connect-timeout</name>
<value>10000</value>
</property>
<!-- Number of handler threads the NameNode spawns -->
<property>
<name>dfs.namenode.handler.count</name>
<value>100</value>
</property>
</configuration>
[root@master hadoop]# cp mapred-site.xml.template mapred-site.xml
[root@master hadoop]# vi mapred-site.xml
#Add the following to mapred-site.xml
<configuration>
<property>
<!-- Run MapReduce applications on YARN -->
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
[root@master hadoop]# vi yarn-site.xml
#Add the following configuration
<configuration>
<!-- Host that runs the ResourceManager -->
<property>
<name>yarn.resourcemanager.hostname</name>
<value>master</value>
</property>
<!-- Auxiliary shuffle service that lets MapReduce jobs run on YARN -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<!-- Address the ResourceManager exposes to clients; clients use it to submit and kill applications -->
<property>
<name>yarn.resourcemanager.address</name>
<value>master:18040</value>
</property>
<!-- Address the ResourceManager exposes to ApplicationMasters, which use it to request and release resources -->
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>master:18030</value>
</property>
<!-- Address NodeManagers use to send heartbeats and pick up tasks -->
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>master:18025</value>
</property>
<!-- Address administrators use to send management commands to the ResourceManager -->
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>master:18141</value>
</property>
<!-- Web UI address for viewing cluster information in a browser -->
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>master:8088</value>
</property>
<!-- Enable log aggregation -->
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
</configuration>
The slaves file specifies which nodes in the cluster take the DataNode role.
[root@master hadoop]# vi slaves
#Add the following
slave-1
slave-2
Copy the fully configured Hadoop directory and the hadoop environment variable script from master to the other nodes (remember to source /etc/profile.d/hadoop.sh on each slave afterwards).
#Copy Hadoop to slave-1
[root@master ~]# scp -r /usr/local/hadoop/ slave-1:/usr/local/
#Copy Hadoop to slave-2
[root@master ~]# scp -r /usr/local/hadoop/ slave-2:/usr/local/
#Copy the Hadoop environment variable script to slave-1
[root@master ~]# scp /etc/profile.d/hadoop.sh slave-1:/etc/profile.d/
#Copy the Hadoop environment variable script to slave-2
[root@master ~]# scp /etc/profile.d/hadoop.sh slave-2:/etc/profile.d/
1. Start the JournalNode shared-storage cluster (run on all three nodes)
[root@master ~]# hadoop-daemon.sh start journalnode
starting journalnode, logging to /usr/local/hadoop/logs/hadoop-root-journalnode-master.out
#Check that the process started
[root@master ~]# jps
6706 QuorumPeerMain
8727 JournalNode
8793 Jps
#slave-1
[root@slave-1 ~]# hadoop-daemon.sh start journalnode
starting journalnode, logging to /usr/local/hadoop/logs/hadoop-root-journalnode-slave-1.out
[root@slave-1 ~]# jps
1744 QuorumPeerMain
3206 Jps
3119 JournalNode
#slave-2
[root@slave-2 ~]# hadoop-daemon.sh start journalnode
starting journalnode, logging to /usr/local/hadoop/logs/hadoop-root-journalnode-slave-2.out
[root@slave-2 ~]# jps
3218 Jps
1748 QuorumPeerMain
3182 JournalNode
2. Format the active NameNode
Command: hadoop namenode -format
[root@master hadoop]# hadoop namenode -format
#If you see the following line, the format succeeded
20/09/29 03:50:40 INFO common.Storage: Storage directory /usr/local/hadoop/tmp/dfs/name has been successfully formatted.
Start the NameNode on the master node: hadoop-daemon.sh start namenode
Initialize the standby NameNode on slave-1: hdfs namenode -bootstrapStandby
Initialize ZKFC on the master node: hdfs zkfc -formatZK
3. Start the ZookeeperFailoverController
Command: hadoop-daemon.sh start zkfc
[root@master hadoop]# hadoop-daemon.sh start zkfc
starting zkfc, logging to /usr/local/hadoop/logs/hadoop-root-zkfc-master.out
[root@master hadoop]# jps
5170 Jps
3349 JournalNode
5061 DFSZKFailoverController
4471 NameNode
1966 QuorumPeerMain
4. Start HDFS
#Run on master
start-dfs.sh
5. Start YARN
start-yarn.sh
Run on slave-1:
yarn-daemon.sh start resourcemanager
6. Start the JobHistoryServer
mr-jobhistory-daemon.sh start historyserver
7. Verify the processes with jps
[root@master hadoop]# jps
5826 DataNode
7122 JobHistoryServer
7539 Jps
3349 JournalNode
5061 DFSZKFailoverController
4471 NameNode
6615 ResourceManager
1966 QuorumPeerMain
[root@slave-1 ~]# jps
3827 NameNode
5747 JobHistoryServer
3909 DataNode
1561 JournalNode
6041 Jps
1471 QuorumPeerMain
[root@slave-2 ~]# jps
1817 JobHistoryServer
1913 Jps
1484 JournalNode
1405 QuorumPeerMain
1598 DataNode
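To double-check which NameNode is currently active, one option is the standard HA admin command with the nn1/nn2 ids defined in hdfs-site.xml (a hedged sketch):
#Run on master; each command should print either "active" or "standby"
[root@master ~]# hdfs haadmin -getServiceState nn1
[root@master ~]# hdfs haadmin -getServiceState nn2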