System environment: this tutorial is based on CentOS 8.0 virtual machines.
Node information:
Software versions:
jdk-8u211-linux-x64.tar.gz
apache-zookeeper-3.8.2-bin.tar.gz
hadoop-3.3.4.tar.gz
Tip: if you are new to Hadoop, and especially to cluster deployment, there are quite a few roles involved. It is best to start from the overall architecture: get familiar with the big picture first, then with the theory behind the HA deployment mode, and only then follow the deployment steps; things will click much faster and stick better. I recommend first reading the earlier theory articles in my Hadoop series: YARN框架和其工作原理流程介绍, HDFS介绍, and MapReduce介绍 (on my CSDN blog, 夜夜流光相皎洁_小宁).
Log in to each of the node servers 192.168.31.215, 192.168.31.8, 192.168.31.9, 192.168.31.167 and 192.168.31.154, and run the corresponding command:
hostnamectl set-hostname master
hostnamectl set-hostname node1
hostnamectl set-hostname node2
hostnamectl set-hostname node3
hostnamectl set-hostname node4
vi /etc/hosts
192.168.31.215 master
192.168.31.8 node1
192.168.31.9 node2
192.168.31.167 node3
192.168.31.154 node4
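As an optional check (not part of the original steps), you can confirm the hostname mappings work by pinging each alias from the master node:
ping -c 1 node1
ping -c 1 node2
ping -c 1 node3
ping -c 1 node4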
systemctl stop firewalld
systemctl stop iptables
# permanently disable the firewall
systemctl disable firewalld
# temporarily disable SELinux (permissive mode)
setenforce 0
# permanently disable SELinux
vim /etc/selinux/config, then set SELINUX=disabled
dnf install glibc-langpack-zh.x86_64
echo LANG=zh_CN.UTF-8 > /etc/locale.conf
source /etc/locale.conf
timedatectl list-timezones
timedatectl set-timezone Asia/Shanghai
timedatectl
ssh localhost
cd /root/.ssh/
ssh-keygen -t rsa
Just press Enter at every prompt.
ssh-copy-id root@master
Note: in this environment we need to copy master's public key to node1, node2, node3 and node4. In addition, master and node1 must be able to SSH to each other without a password, because:
1) the machine that runs start-dfs.sh needs passwordless access to every other node;
2) in HA mode, a ZKFC process runs next to each NameNode, and ZKFC uses passwordless SSH to control the NameNode state on the other NameNode host.
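A minimal sketch of the key distribution, assuming the RSA key was generated on master as above and that node1 (the second NameNode host) repeats the same two steps back towards master:
# on master: copy the public key to every node
ssh-copy-id root@node1
ssh-copy-id root@node2
ssh-copy-id root@node3
ssh-copy-id root@node4
# on node1: generate its own key pair and copy it back to master, so the two NameNode hosts trust each other
ssh-keygen -t rsa
ssh-copy-id root@master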
tar -zxvf jdk-8u211-linux-x64.tar.gz
mv jdk1.8.0_211/ /usr/local/
Append the following to /etc/profile:
export JAVA_HOME=/usr/local/jdk1.8.0_211
export PATH=$PATH:${JAVA_HOME}/bin
export CLASSPATH=.:${JAVA_HOME}/lib/dt.jar:${JAVA_HOME}/lib/tools.jar
Because my CentOS 8.0 installation ships with OpenJDK by default, which does not meet our needs, and I did not want to uninstall OpenJDK, I chose to manage the JDKs with update-alternatives. If your environment does not need to manage multiple JDK versions, you can skip this step.
Run: update-alternatives --display java
update-alternatives --install /usr/bin/java java /usr/local/jdk1.8.0_211/bin/java 1800265
# Note: 1800265 is the priority number of the system default JDK shown by update-alternatives --display java
update-alternatives --config java
Then enter the selection number of the newly installed JDK (here 2) and press Enter.
java -version
The output now shows the Java version we installed ourselves, so the configuration succeeded.
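If you also want the compiler to point at the new JDK, a similar registration can be done for javac (an optional step, not part of the original tutorial; the priority number is the same assumption as above):
update-alternatives --install /usr/bin/javac javac /usr/local/jdk1.8.0_211/bin/javac 1800265
update-alternatives --config javac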
scp -r /usr/local/jdk1.8.0_211/ root@node1:/usr/local/
scp -r /usr/local/jdk1.8.0_211/ root@node2:/usr/local/
scp -r /usr/local/jdk1.8.0_211/ root@node3:/usr/local/
scp -r /usr/local/jdk1.8.0_211/ root@node4:/usr/local/
scp /etc/profile root@node1:/etc/profile
scp /etc/profile root@node2:/etc/profile
scp /etc/profile root@node3:/etc/profile
scp /etc/profile root@node4:/etc/profile
Run source /etc/profile on node1, node2, node3 and node4 to apply the environment settings.
Note: if a VM already has a default JDK, run the commands from section 3.1.4 to switch the system default JDK to the one you installed.
tar -zxvf apache-zookeeper-3.8.2-bin.tar.gz
mv apache-zookeeper-3.8.2-bin /usr/local/
cd /usr/local/apache-zookeeper-3.8.2-bin/conf
cp zoo_sample.cfg zoo.cfg
cd /usr/local/apache-zookeeper-3.8.2-bin/
mkdir zkdata
mkdir logs
vim zoo.cfg
# add the following to zoo.cfg:
dataDir=/usr/local/apache-zookeeper-3.8.2-bin/zkdata
dataLogDir=/usr/local/apache-zookeeper-3.8.2-bin/logs
server.1=node2:2888:3888
server.2=node3:2888:3888
server.3=node4:2888:3888
On node2, node3 and node4, create a myid file under /usr/local/apache-zookeeper-3.8.2-bin/zkdata; its value must match the server.N entry for that host in zoo.cfg:
echo 1 > myid   # on node2
echo 2 > myid   # on node3
echo 3 > myid   # on node4
vim /etc/profile
export ZK_HOME=/usr/local/apache-zookeeper-3.8.2-bin
export PATH=$PATH:$ZK_HOME/bin
scp -r /usr/local/apache-zookeeper-3.8.2-bin/ root@node3:/usr/local/
scp -r /usr/local/apache-zookeeper-3.8.2-bin/ root@node4:/usr/local/
scp /etc/profile root@node3:/etc/profile
scp /etc/profile root@node4:/etc/profile
Note: run source /etc/profile on each of the three VMs to apply the environment settings.
Start the ZooKeeper service on node2, node3 and node4:
Command: zkServer.sh start
With ps -ef | grep zookeeper we can see that ZooKeeper has started successfully.
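To confirm that the ensemble actually formed a quorum (and not just that the processes are up), you can also check each node's role; with three nodes you should see one leader and two followers:
zkServer.sh status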
tar -zxvf hadoop-3.3.4.tar.gz
mv hadoop-3.3.4/ /usr/local/
cd /usr/local/hadoop-3.3.4/etc/hadoop
vim hadoop-env.sh
export JAVA_HOME=/usr/local/jdk1.8.0_211
vim yarn-env.sh
export JAVA_HOME=/usr/local/jdk1.8.0_211
vim core-site.xml
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://mycluster</value>
</property>

<property>
  <name>hadoop.tmp.dir</name>
  <value>/usr/local/hadoop-3.3.4/tmp</value>
</property>

<property>
  <name>ha.zookeeper.quorum</name>
  <value>node2:2181,node3:2181,node4:2181</value>
</property>

<property>
  <name>hadoop.proxyuser.root.hosts</name>
  <value>*</value>
</property>

<property>
  <name>hadoop.proxyuser.root.groups</name>
  <value>*</value>
</property>
vim hdfs-site.xml
<property>
  <name>dfs.replication</name>
  <value>3</value>
</property>

<property>
  <name>dfs.namenode.name.dir</name>
  <value>/usr/local/hadoop-3.3.4/ha/dfs/name</value>
</property>

<property>
  <name>dfs.datanode.data.dir</name>
  <value>/usr/local/hadoop-3.3.4/ha/dfs/data</value>
</property>

<property>
  <name>dfs.namenode.checkpoint.dir</name>
  <value>/usr/local/hadoop-3.3.4/ha/dfs/secondary</value>
</property>

<!-- One-to-many mapping from the logical nameservice to the physical NameNode hosts -->

<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>

<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>

<property>
  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
  <value>master:9000</value>
</property>

<property>
  <name>dfs.namenode.http-address.mycluster.nn1</name>
  <value>master:50070</value>
</property>

<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>node1:9000</value>
</property>

<property>
  <name>dfs.namenode.http-address.mycluster.nn2</name>
  <value>node1:50070</value>
</property>

<!-- Dead-node timeout; the default is 10 minutes + 30 seconds. dfs.namenode.heartbeat.recheck-interval is in milliseconds, dfs.heartbeat.interval is in seconds -->
<property>
  <name>dfs.namenode.heartbeat.recheck-interval</name>
  <value>50000</value>
</property>
<property>
  <name>dfs.heartbeat.interval</name>
  <value>3</value>
</property>

<!-- Where the JournalNodes run and on which disk their data is stored -->
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://master:8485;node1:8485;node3:8485/mycluster</value>
</property>
<property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/usr/local/hadoop-3.3.4/ha/dfs/jn</value>
</property>
<!-- Failover proxy class and fencing method for HA role switching; we use passwordless SSH -->

<property>
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<!-- In HA mode, also configure shell(true); otherwise, when the host of the active NameNode goes down, the standby will not take over. Multiple fencing methods go on separate lines inside a single value -->
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence
shell(true)</value>
</property>
<!-- Private key for sshfence; matches the RSA key generated earlier -->
<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/root/.ssh/id_rsa</value>
</property>

<property>
  <name>dfs.ha.fencing.ssh.connect-timeout</name>
  <value>30000</value>
</property>
<!-- Enable automatic failover (starts ZKFC) -->
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>

<!-- Enable WebHDFS -->
<property>
  <name>dfs.webhdfs.enabled</name>
  <value>true</value>
</property>
vim mapred-site.xml
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>

<property>
  <name>yarn.app.mapreduce.am.env</name>
  <value>HADOOP_MAPRED_HOME=/usr/local/hadoop-3.3.4</value>
</property>

<property>
  <name>mapreduce.map.env</name>
  <value>HADOOP_MAPRED_HOME=/usr/local/hadoop-3.3.4</value>
</property>

<property>
  <name>mapreduce.reduce.env</name>
  <value>HADOOP_MAPRED_HOME=/usr/local/hadoop-3.3.4</value>
</property>

<property>
  <name>mapreduce.map.memory.mb</name>
  <value>2048</value>
</property>

<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>2048</value>
</property>

<property>
  <name>mapreduce.map.java.opts</name>
  <value>-Xmx1024m</value>
</property>

<property>
  <name>mapreduce.reduce.java.opts</name>
  <value>-Xmx1024m</value>
</property>
vim yarn-site.xml
<property>
  <name>yarn.resourcemanager.ha.enabled</name>
  <value>true</value>
</property>

<property>
  <name>yarn.resourcemanager.zk-address</name>
  <value>node2:2181,node3:2181,node4:2181</value>
</property>

<property>
  <name>yarn.resourcemanager.cluster-id</name>
  <value>ningzhaosheng</value>
</property>

<property>
  <name>yarn.resourcemanager.ha.rm-ids</name>
  <value>rm1,rm2</value>
</property>

<property>
  <name>yarn.resourcemanager.hostname.rm1</name>
  <value>node3</value>
</property>
<!-- Web UI address of rm1 -->
<property>
  <name>yarn.resourcemanager.webapp.address.rm1</name>
  <value>node3:8088</value>
</property>
<!-- Internal (RPC) address of rm1 -->
<property>
  <name>yarn.resourcemanager.address.rm1</name>
  <value>node3:8032</value>
</property>

<property>
  <name>yarn.resourcemanager.hostname.rm2</name>
  <value>node4</value>
</property>
<!-- Web UI address of rm2 -->
<property>
  <name>yarn.resourcemanager.webapp.address.rm2</name>
  <value>node4:8088</value>
</property>
<!-- Internal (RPC) address of rm2 -->
<property>
  <name>yarn.resourcemanager.address.rm2</name>
  <value>node4:8032</value>
</property>

<property>
  <name>yarn.resourcemanager.recovery.enabled</name>
  <value>true</value>
</property>

<property>
  <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>

<property>
  <name>yarn.resourcemanager.store.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>

<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>

<property>
  <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>

<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>4096</value>
</property>
vim workers
# add the hostnames
master
node1
node2
node3
node4
Add the following at the top of start-dfs.sh and stop-dfs.sh:
HDFS_ZKFC_USER=root
HDFS_JOURNALNODE_USER=root
HDFS_DATANODE_USER=root
HDFS_DATANODE_SECURE_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
Add the following at the top of start-yarn.sh and stop-yarn.sh:
YARN_RESOURCEMANAGER_USER=root
HDFS_DATANODE_SECURE_USER=root
YARN_NODEMANAGER_USER=root
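As an alternative (a common practice, not part of the original steps), these user variables can be exported once in /usr/local/hadoop-3.3.4/etc/hadoop/hadoop-env.sh instead of editing the four start/stop scripts, for example:
# assumed alternative: same variables, exported from hadoop-env.sh
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export HDFS_JOURNALNODE_USER=root
export HDFS_ZKFC_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root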
scp -r /usr/local/hadoop-3.3.4/ root@node1:/usr/local/
scp -r /usr/local/hadoop-3.3.4/ root@node2:/usr/local/
scp -r /usr/local/hadoop-3.3.4/ root@node3:/usr/local/
scp -r /usr/local/hadoop-3.3.4/ root@node4:/usr/local/
scp /etc/profile root@node1:/etc/profile
scp /etc/profile root@node2:/etc/profile
scp /etc/profile root@node3:/etc/profile
scp /etc/profile root@node4:/etc/profile
Note: after distributing the environment file, remember to run source /etc/profile on each node to apply the settings.
hdfs namenode -format
Note: only required the first time; not needed when restarting the services later.
hdfs zkfc -formatZK
Note: only required the first time; not needed when restarting the services later.
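On a brand-new HA cluster, the two format commands above usually sit inside a slightly longer first-time sequence: the JournalNodes must be running before the NameNode format, and the second NameNode copies the formatted metadata instead of formatting again. A minimal sketch of that sequence (an assumption based on the standard HDFS HA procedure; adjust hostnames to your layout):
# on master, node1 and node3 (the JournalNode hosts from dfs.namenode.shared.edits.dir)
hdfs --daemon start journalnode
# on master (nn1): format and start the first NameNode
hdfs namenode -format
hdfs --daemon start namenode
# on node1 (nn2): copy the metadata from nn1 instead of formatting
hdfs namenode -bootstrapStandby
# on master: initialize the HA state in ZooKeeper
hdfs zkfc -formatZK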
start-all.sh
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2
The output shows one ResourceManager in active state and one in standby state, so the start-up is normal.
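To also confirm that every NodeManager registered with the active ResourceManager, you can list the cluster nodes (all five hosts should appear in RUNNING state):
yarn node -list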
hdfs haadmin -ns mycluster -getAllServiceState
The output shows one NameNode in active state and one in standby state, which matches expectations; the start-up is normal.
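A quick way to exercise the automatic failover (an optional test, not part of the original steps): stop the NameNode daemon on the currently active host, check that the other NameNode becomes active, then bring the stopped daemon back:
# on the currently active host (for example master / nn1)
hdfs --daemon stop namenode
# from any node: the other NameNode should now report active
hdfs haadmin -getServiceState nn2
# restart the stopped NameNode; it rejoins as standby
hdfs --daemon start namenode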
http://master:50070/
The Hadoop cluster web page is reachable and shows all of our nodes online; the start-up is normal.
http://node3:8088/cluster
The YARN scheduler web page is also reachable; YARN started normally.
hdfs dfs -mkdir /bigdata
hdfs dfs -mkdir -p /user/root
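To verify that YARN and MapReduce work end to end, you can submit the bundled example job (the jar path below assumes the default layout of the hadoop-3.3.4 distribution):
hadoop jar /usr/local/hadoop-3.3.4/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.4.jar pi 2 10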
That is all for today's walkthrough of deploying a Hadoop 3.3.4 cluster. If it helped you, feel free to like, follow and bookmark; if you have questions, leave a comment!