Preface:
1. Temporarily stop the firewall: systemctl stop firewalld.service
2. Disable it permanently (it will not start at boot): systemctl disable firewalld.service
3. Check the firewall status: firewall-cmd --state
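1. Download the Hadoop package
Download location: the official Apache archive.
I use Hadoop 3.2.3 and keep the downloaded file in /root/jars. URL: https://archive.apache.org/dist/hadoop/common/hadoop-3.2.3/hadoop-3.2.3.tar.gz
Note: if your server has Internet access, you can download it directly: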
wget https://archive.apache.org/dist/hadoop/common/hadoop-3.2.3/hadoop-3.2.3.tar.gz
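Before unpacking, it may be worth comparing the archive with the SHA-512 digest that Apache publishes alongside its releases. The following is only a sketch: the function name verify_sha512 is made up for illustration, it assumes GNU sha512sum is installed, and the expected digest has to be copied by hand from the published checksum file.

```shell
# Hypothetical helper: compare a file's SHA-512 digest with an expected hex string.
verify_sha512() {
  file=$1
  # Published digests are sometimes upper-case; normalise to lower-case.
  expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
  # sha512sum prints "<digest>  <filename>"; keep only the digest.
  actual=$(sha512sum "$file" | awk '{print $1}')
  if [ "$actual" = "$expected" ]; then
    echo "OK: $file"
  else
    echo "MISMATCH: $file" >&2
    return 1
  fi
}

# Usage (the digest below is a placeholder, not the real value):
# verify_sha512 /root/jars/hadoop-3.2.3.tar.gz "<digest from the checksum file>"
```

A mismatch usually means an incomplete or corrupted download, in which case fetching the file again is the simplest remedy.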
2. Installation
2.1 Unpack the Hadoop package: tar -zxvf hadoop-3.2.3.tar.gz
- drwxr-xr-x 9 1000 1000 149 Mar 20 2022 hadoop-3.2.3
- -rw-r--r-- 1 root root 492241961 Feb 28 17:46 hadoop-3.2.3.tar.gz
- [root@127 jars]# pwd
- /root/jars
- [root@127 jars]#
2.2 Move it to /root/
mv hadoop-3.2.3 ../
The directory layout is now:
- drwxr-xr-x 9 1000 1000 149 Mar 20 2022 hadoop-3.2.3
- drwxr-xr-x 2 root root 33 Feb 28 17:49 jars
- [root@127 ~]# pwd
- /root
3. Configuration
3.1 Configure the Hadoop environment variables
vi /etc/profile
- # java
- export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.352.b08-2.el7_9.x86_64
- export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
- export PATH=$PATH:$JAVA_HOME/bin
-
- # hadoop
- export HADOOP_HOME=/root/hadoop-3.2.3
- export LD_LIBRARY_PATH=$HADOOP_HOME/lib/native
- export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
-
- export HDFS_NAMENODE_USER=root
- export HDFS_DATANODE_USER=root
- export HDFS_SECONDARYNAMENODE_USER=root
- export YARN_RESOURCEMANAGER_USER=root
- export YARN_NODEMANAGER_USER=root
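Notes:
3.1.1 A JDK must be installed before Hadoop (installing the JDK is not covered here).
3.1.2 To find the JDK install directory: echo $JAVA_HOME
3.1.3 After editing, be sure to run source /etc/profile so that the variables take effect.
3.1.4 Verify the Hadoop environment variables: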
- [root@127 ~]# hadoop version
- Hadoop 3.2.3
- Source code repository https://github.com/apache/hadoop -r abe5358143720085498613d399be3bbf01e0f131
- Compiled by ubuntu on 2022-03-20T01:18Z
- Compiled with protoc 2.5.0
- From source with checksum 39bb14faec14b3aa25388a6d7c345fe8
- This command was run using /root/hadoop-3.2.3/share/hadoop/common/hadoop-common-3.2.3.jar
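If hadoop version fails instead of printing the output above, a quick check of the two home variables often narrows the problem down. A minimal sketch, assuming a POSIX shell; check_dir_var is a hypothetical helper name, not part of Hadoop.

```shell
# Hypothetical helper: succeed only if the named variable is set and points at a directory.
check_dir_var() {
  name=$1
  # Indirect lookup of the variable whose name is stored in $name.
  eval "value=\${$name:-}"
  if [ -z "$value" ]; then
    echo "$name is not set" >&2
    return 1
  fi
  if [ ! -d "$value" ]; then
    echo "$name=$value is not a directory" >&2
    return 1
  fi
  echo "$name=$value"
}

# Usage after `source /etc/profile`:
# check_dir_var JAVA_HOME && check_dir_var HADOOP_HOME
```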
3.2 Configure Hadoop
3.2.1 Configure hadoop-env.sh (path: /root/hadoop-3.2.3/etc/hadoop/hadoop-env.sh)
vim hadoop-env.sh (to jump to the end of the file, press Shift + g)
Add the variable: export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.352.b08-2.el7_9.x86_64
Save and quit.
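The same edit can be made without opening an editor, which helps when the step has to be repeated on several machines. This is a sketch under assumptions: set_java_home is a made-up helper name, and sed -i is used in its GNU form (as shipped with CentOS 7).

```shell
# Hypothetical helper: make sure an env file contains exactly one `export JAVA_HOME=...` line.
set_java_home() {
  env_file=$1
  java_home=$2
  if grep -q '^export JAVA_HOME=' "$env_file"; then
    # Replace the existing line; "|" is the sed delimiter because paths contain "/".
    sed -i "s|^export JAVA_HOME=.*|export JAVA_HOME=$java_home|" "$env_file"
  else
    # No active setting yet (commented-out lines are ignored), so append one.
    echo "export JAVA_HOME=$java_home" >> "$env_file"
  fi
}

# Usage with the paths from this guide:
# set_java_home /root/hadoop-3.2.3/etc/hadoop/hadoop-env.sh "$JAVA_HOME"
```

Running it twice leaves a single export line, so it is safe to rerun after a JDK upgrade.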
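3.2.2 Configure core-site.xml (path: /root/hadoop-3.2.3/etc/hadoop/core-site.xml)
- <configuration>
- <!-- HDFS temporary directory -->
- <property>
- <name>hadoop.tmp.dir</name>
- <!-- tmp under the Hadoop home directory -->
- <value>/root/hadoop-3.2.3/tmp</value>
- </property>
- <!-- Default HDFS address and port (the access URI) -->
- <property>
- <name>fs.defaultFS</name>
- <value>hdfs://master:9000</value>
- </property>
- </configuration>
Notes:
1. The Hadoop directory has no tmp folder by default; you need to create it.
2. master is the host name of the master node. It is configured later, and you can choose a different name.
Save and quit.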
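Creating the tmp folder from note 1 could look like the sketch below. ensure_hadoop_tmp is a hypothetical helper name; the path in the usage line is the install location used in this guide.

```shell
# Hypothetical helper: create <hadoop_home>/tmp if it is missing and print its path.
ensure_hadoop_tmp() {
  tmp_dir=$1/tmp
  # -p makes the call idempotent: no error if the directory already exists.
  mkdir -p "$tmp_dir" && echo "$tmp_dir"
}

# Usage (install path taken from this guide):
# ensure_hadoop_tmp /root/hadoop-3.2.3
```

Once the environment variables are in place, hdfs getconf -confKey fs.defaultFS should print the value configured above, which is a convenient way to confirm that Hadoop is reading this file.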
3.2.3 Configure hdfs-site.xml (path: /root/hadoop-3.2.3/etc/hadoop/hdfs-site.xml)
- <configuration>
- <!-- Address of the HDFS web UI -->
- <property>
- <name>dfs.namenode.http-address</name>
- <value>master:9870</value>
- </property>
- <!-- Number of replicas -->
- <property>
- <name>dfs.replication</name>
- <value>3</value>
- </property>
- <!-- Whether HDFS permission checking is enabled; false turns it off -->
- <property>
- <name>dfs.permissions.enabled</name>
- <value>false</value>
- </property>
- <!-- Block size in bytes; the default is 128 MB -->
- <property>
- <name>dfs.blocksize</name>
- <value>134217728</value>
- </property>
- </configuration>
Notes:
1. In master:9870, master is the host name of the master node and 9870 is the default NameNode web port in Hadoop 3.x.
Save and quit.
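The dfs.blocksize value is simply 128 MB written out in bytes. A small sketch of the arithmetic, in case a different block size is wanted; mib_to_bytes is a made-up helper name.

```shell
# Convert a block size given in MiB into the byte count used for dfs.blocksize.
mib_to_bytes() {
  echo $(( $1 * 1024 * 1024 ))
}

mib_to_bytes 128   # prints 134217728, the value used above
mib_to_bytes 256   # prints 268435456
```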
3.2.4 Configure mapred-site.xml (path: /root/hadoop-3.2.3/etc/hadoop/mapred-site.xml)
- <configuration>
- <!-- local = run locally, classic = the classic MapReduce framework, yarn = the newer YARN framework -->
- <property>
- <name>mapreduce.framework.name</name>
- <value>yarn</value>
- </property>
- <!-- If map and reduce tasks use native libraries (compression, etc.), the original value must be kept; when this value is empty, the command that sets up the execution environment depends on the operating system -->
- <property>
- <name>mapreduce.admin.user.env</name>
- <!-- Set to the Hadoop home directory -->
- <value>HADOOP_MAPRED_HOME=/root/hadoop-3.2.3</value>
- </property>
- <!-- Environment variables for the AM (ApplicationMaster) side -->
- <property>
- <name>yarn.app.mapreduce.am.env</name>
- <!-- Set to the Hadoop home directory -->
- <value>HADOOP_MAPRED_HOME=/root/hadoop-3.2.3</value>
- </property>
- </configuration>
3.2.5 Configure yarn-site.xml (path: /root/hadoop-3.2.3/etc/hadoop/yarn-site.xml)
- <configuration>
-
- <!-- Site specific YARN configuration properties -->
-
- <!-- Host name of the cluster's ResourceManager (the master node) -->
- <property>
- <name>yarn.resourcemanager.hostname</name>
- <value>master</value>
- </property>
- <!-- Auxiliary service run on each NodeManager -->
- <property>
- <name>yarn.nodemanager.aux-services</name>
- <value>mapreduce_shuffle</value>
- </property>
- <!-- Environment variables that containers may override instead of using the NodeManager defaults -->
- <property>
- <name>yarn.nodemanager.env-whitelist</name>
- <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_HOME,PATH,LANG,TZ</value>
- </property>
- <!-- Turn off the virtual-memory check; without this setting, errors are reported in a virtual machine environment -->
- <property>
- <name>yarn.nodemanager.vmem-check-enabled</name>
- <value>false</value>
- </property>
- </configuration>
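To double-check what actually ended up in the four site files, a small helper can pull a property's value out of the XML. This is a rough sketch rather than a real XML parser: get_prop is a hypothetical name, and it relies on the one-tag-per-line layout used in the files above (name on one line, value on a later line).

```shell
# Hypothetical helper: print the <value> that follows <name>KEY</name> in a Hadoop site file.
get_prop() {
  awk -v key="$2" '
    # Remember that the wanted <name> line has been seen.
    index($0, "<name>" key "</name>") { found = 1; next }
    # The first <value> line after it holds the setting; comment lines in between are skipped.
    found && match($0, /<value>[^<]*<\/value>/) {
      # Strip the 7-character opening tag and the 8-character closing tag.
      print substr($0, RSTART + 7, RLENGTH - 15)
      exit
    }
  ' "$1"
}

# Usage with the paths from this guide:
# get_prop /root/hadoop-3.2.3/etc/hadoop/yarn-site.xml yarn.resourcemanager.hostname
```

An empty result means the property was not found, which usually points to a typo in the property name.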
3.2.6 Configure workers (path: /root/hadoop-3.2.3/etc/hadoop/workers)
- master
- s1
- s2
Note:
master is the host name of the master node; s1 and s2 are the host names of the worker nodes.
3.3 Set up passwordless SSH login (run on all three virtual machines)
3.3.1 Check whether SSH is installed
- [root@127 hadoop]# rpm -qa | grep ssh
- openssh-server-7.4p1-22.el7_9.x86_64
- openssh-clients-7.4p1-22.el7_9.x86_64
- libssh2-1.8.0-4.el7.x86_64
- openssh-7.4p1-22.el7_9.x86_64
- [root@127 hadoop]#
3.3.2 Configure the hosts file (/etc/hosts) on all three servers
Change each of them to:
- 192.168.xxx.xxx master
- 192.168.xxx.xxx1 s1
- 192.168.xxx.xxx2 s2
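A name that appears in the workers file but not in /etc/hosts is a common cause of start-up failures, so a quick cross-check of the two files can save time. A minimal sketch; missing_hosts is a made-up helper, and it only compares the files without performing any DNS lookup.

```shell
# Hypothetical helper: print every worker name that has no entry in the given hosts file.
missing_hosts() {
  workers_file=$1
  hosts_file=$2
  # The `|| [ -n "$name" ]` part also handles a last line without a trailing newline.
  while IFS= read -r name || [ -n "$name" ]; do
    # Skip blank lines in the workers file.
    if [ -z "$name" ]; then
      continue
    fi
    # A hosts line is "<ip> <name> [aliases...]"; look for the name after the IP field.
    if ! awk -v n="$name" '
        /^[ \t]*#/ { next }
        { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit found ? 0 : 1 }
      ' "$hosts_file"; then
      echo "$name"
    fi
  done < "$workers_file"
}

# Usage with the paths from this guide:
# missing_hosts /root/hadoop-3.2.3/etc/hadoop/workers /etc/hosts
```

No output means every worker name has a hosts entry.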
3.3.3 Generate a key pair and copy the public key to every node
ssh-keygen -t rsa
ssh-copy-id -i ~/.ssh/id_rsa.pub master
ssh-copy-id -i ~/.ssh/id_rsa.pub s1
ssh-copy-id -i ~/.ssh/id_rsa.pub s2
3.3.4 Test passwordless login
ssh s1
ssh s2
ssh master
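The manual logins above can also be checked non-interactively, which is handy when repeating the test from each of the three machines. A sketch under assumptions: can_ssh is a hypothetical helper, and SSH_CMD is an illustrative override hook that defaults to the real ssh client. BatchMode and ConnectTimeout are standard OpenSSH client options.

```shell
# Hypothetical helper: report whether key-based login to a host works without a prompt.
can_ssh() {
  host=$1
  # BatchMode=yes makes ssh fail instead of asking for a password.
  if ${SSH_CMD:-ssh} -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
    echo "$host: ok"
  else
    echo "$host: passwordless login failed"
    return 1
  fi
}

# Usage:
# for h in master s1 s2; do can_ssh "$h"; done
```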
3.4 Clone Hadoop from the master node to the worker nodes
scp -r hadoop-3.2.3 root@s1:/root
scp -r hadoop-3.2.3 root@s2:/root
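With more than two workers, the copy is easier to manage as a loop. The sketch below prints the commands by default and only runs them when DRY_RUN=0 is set; copy_to_workers and DRY_RUN are illustrative names, not Hadoop settings.

```shell
# Hypothetical helper: copy a directory to /root on each listed host via scp.
copy_to_workers() {
  src=$1
  shift
  for host in "$@"; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      # Preview mode: show what would be executed.
      echo "scp -r $src root@$host:/root"
    else
      scp -r "$src" "root@$host:/root"
    fi
  done
}

# Preview the commands first; rerun with DRY_RUN=0 to copy for real.
copy_to_workers /root/hadoop-3.2.3 s1 s2
```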
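3.5 Configure the Java and Hadoop environment variables in /etc/profile on the worker nodes
Note: use the same settings as in the master node's /etc/profile (remember to run source /etc/profile so they take effect).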
3.6 Start Hadoop
3.6.1 Format the NameNode
hdfs namenode -format
3.6.2 Start Hadoop: start-all.sh
3.6.3 Check the running processes with jps
1. The processes on master are as follows:
- [root@localhost ~]# jps
- 3346 ResourceManager
- 2743 NameNode
- 3096 SecondaryNameNode
- 2875 DataNode
- 4156 Jps
- 3469 NodeManager
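Comparing the jps listing against the expected daemons by eye gets tedious on repeated restarts. A minimal sketch that automates the comparison; missing_daemons is a made-up helper that reads jps output on standard input.

```shell
# Hypothetical helper: read `jps` output on stdin and print the expected daemons that are absent.
missing_daemons() {
  # jps prints "<pid> <name>" per line; keep only the names.
  running=$(awk '{print $2}')
  status=0
  for daemon in "$@"; do
    # -x matches the whole line, so "NameNode" does not match "SecondaryNameNode".
    if ! printf '%s\n' "$running" | grep -qx "$daemon"; then
      echo "$daemon"
      status=1
    fi
  done
  return $status
}

# Usage on the master node:
# jps | missing_daemons NameNode SecondaryNameNode DataNode ResourceManager NodeManager
```

With the configuration in this guide, s1 and s2 would be expected to show only DataNode and NodeManager, so the argument list can be shortened accordingly on those nodes.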
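2. Open the NameNode web UI
Enter in a browser: master:9870
Notes:
If master:9870 does not open:
1. Check that C:\Windows\System32\drivers\etc\hosts maps the host name master to the server's IP address; otherwise your PC cannot resolve master.
2. Check that the firewall on the server has been turned off:
2.1 Check the firewall status on the Linux machine:
systemctl status firewalld.service
If the firewall is still running, run the following commands in order to turn it off.
2.2 Stop the firewall: systemctl stop firewalld.service
2.3 Keep it from starting at boot:
systemctl disable firewalld.service
Remark: corrections are welcome; this post is still being revised.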