OS | IP Address | Hostname | JDK | Scala |
---|---|---|---|---|
CentOS 7.2.1511 | 192.168.42.131 | node01 | jdk1.8.0_91 | scala-2.12.13 |
CentOS 7.2.1511 | 192.168.42.132 | node02 | jdk1.8.0_91 | scala-2.12.13 |
CentOS 7.2.1511 | 192.168.42.139 | node03 | jdk1.8.0_91 | scala-2.12.13 |
Component | Version |
---|---|
zookeeper | 3.4.6 |
hadoop | 3.2.2 |
hbase | 2.0.6 |
hive | 3.1.2 |
kafka | 2.11-2.0.0 |
solr | 8.9.0 |
atlas | 2.1.0 |
spark | 3.0.3 |
sqoop | 1.4.6 |
flume | 1.9.0 |
elasticsearch | 7.14.1 |
kibana | 7.14.1 |
Note: all installation packages required by this article (plus a PDF version of this document) can be downloaded from my resources: installation packages for all big-data components (data platform setup).
# On node01:
hostnamectl set-hostname node01
# On node02:
hostnamectl set-hostname node02
# On node03:
hostnamectl set-hostname node03
vim /etc/hosts
# Add the following entries (on all three nodes):
192.168.42.131 node01
192.168.42.132 node02
192.168.42.139 node03
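To confirm that hostname resolution works, a quick check can be run from any node (a minimal sketch; each name should resolve to the IP listed above):
# Each hostname should answer from the matching IP in /etc/hosts
ping -c 1 node01
ping -c 1 node02
ping -c 1 node03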
# CentOS 7:
systemctl stop firewalld.service
# Disable it at boot:
systemctl disable firewalld.service
# Check the firewall status:
firewall-cmd --state
# CentOS 6:
/etc/init.d/iptables stop
# Disable it at boot:
chkconfig iptables off
# Check the firewall status:
service iptables status
# Check the SELinux status: if "SELinux status" shows "enabled", SELinux is on
sestatus
# Disable it temporarily (no reboot needed):
setenforce 0
# Disable it permanently:
vim /etc/selinux/config
# Change the following setting:
SELINUX=disabled
Linux limits the number of files a single process can open (per user, per process). Many components in the Hadoop ecosystem open large numbers of files, so this limit must be raised (mandatory in production; in a learning environment a moderately larger value is enough). Check the current per-process limits for the current user with:
ulimit -Sn
1024
ulimit -Hn
4096
The recommended maximum number of open file descriptors is 10000 or more:
vim /etc/security/limits.conf
# Append the following lines:
* soft nofile 65536
* hard nofile 65536
* soft nproc 131072
* hard nproc 131072
If THP (Transparent Huge Pages) is not disabled, Hadoop will drive system CPU usage very high, so turn it off.
# Check:
[root@node01 mnt]# cat /sys/kernel/mm/transparent_hugepage/defrag
[always] madvise never
[root@node01 mnt]# cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
# Disable:
vim /etc/rc.d/rc.local
# Add the following at the end of the file:
if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
  echo never > /sys/kernel/mm/transparent_hugepage/enabled
fi
if test -f /sys/kernel/mm/transparent_hugepage/defrag; then
  echo never > /sys/kernel/mm/transparent_hugepage/defrag
fi
# Save and exit, then make rc.local executable:
chmod +x /etc/rc.d/rc.local
Reboot all three machines for the changes to take effect: reboot
[root@node01 ~]# getenforce
Disabled
[root@node01 ~]# ulimit -Sn
65536
[root@node01 ~]# ulimit -Hn
65536
[root@node01 ~]# cat /sys/kernel/mm/transparent_hugepage/defrag
always madvise [never]
[root@node01 ~]# cat /sys/kernel/mm/transparent_hugepage/enabled
always madvise [never]
Remove the default OpenJDK from all machines (only node01 is shown here). Some CentOS development installs ship with a bundled JDK; since we use our own JDK, remove the bundled one. First check whether it is installed:
[root@node01 ~]# java -version
openjdk version "1.8.0_65"
OpenJDK Runtime Environment (build 1.8.0_65-b17)
OpenJDK 64-Bit Server VM (build 25.65-b01, mixed mode)
Find the installed packages:
[root@node01 ~]# rpm -qa | grep java
java-1.8.0-openjdk-headless-1.8.0.65-3.b17.el7.x86_64
javapackages-tools-3.4.1-11.el7.noarch
java-1.8.0-openjdk-1.8.0.65-3.b17.el7.x86_64
tzdata-java-2015g-1.el7.noarch
java-1.7.0-openjdk-1.7.0.91-2.6.2.3.el7.x86_64
java-1.7.0-openjdk-headless-1.7.0.91-2.6.2.3.el7.x86_64
python-javapackages-3.4.1-11.el7.noarch
Remove all of them (the noarch packages can be left in place):
[root@node01 ~]# rpm -e --nodeps java-1.8.0-openjdk-headless-1.8.0.65-3.b17.el7.x86_64
[root@node01 ~]# rpm -e --nodeps java-1.8.0-openjdk-1.8.0.65-3.b17.el7.x86_64
[root@node01 ~]# rpm -e --nodeps java-1.7.0-openjdk-1.7.0.91-2.6.2.3.el7.x86_64
[root@node01 ~]# rpm -e --nodeps java-1.7.0-openjdk-headless-1.7.0.91-2.6.2.3.el7.x86_64
# Check that they were removed:
[root@node01 ~]# java -version
-bash: /usr/bin/java: No such file or directory
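Alternatively, the OpenJDK packages can be removed in one pass (a sketch; review the package list from rpm -qa before running it):
# Remove every installed package whose name contains "openjdk"
rpm -qa | grep openjdk | xargs rpm -e --nodeps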
mkdir /opt/java
tar -zxvf jdk-8u91-linux-x64.tar.gz -C /opt/java/
vim /etc/profile
# Append the following at the end of the file:
export JAVA_HOME=/opt/java/jdk1.8.0_91
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
PATH=$PATH:$HOME/bin:$JAVA_HOME/bin
source /etc/profile
[root@node01 ~]# java -version
java version "1.8.0_91"
Java(TM) SE Runtime Environment (build 1.8.0_91-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.91-b14, mixed mode)
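The same JDK also needs to exist at this path on node02 and node03, since the environment variables configured later point to /opt/java/jdk1.8.0_91 on every node. A sketch of copying it from node01, assuming root SSH access to the other nodes; repeat the /etc/profile change on each node as well:
# Create /opt/java on the other nodes and copy the unpacked JDK over
ssh node02 "mkdir -p /opt/java" && scp -r /opt/java/jdk1.8.0_91 node02:/opt/java/
ssh node03 "mkdir -p /opt/java" && scp -r /opt/java/jdk1.8.0_91 node03:/opt/java/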
[root@node01 ~]# adduser hadoop
[root@node01 ~]# passwd hadoop    # password: 123456
Changing password for user hadoop.
New password:
BAD PASSWORD: The password is shorter than 8 characters
Retype new password:
passwd: all authentication tokens updated successfully.
# Do this on all three machines
[hadoop@node01 ~]# ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:omdYd47S9iPHrABsycOTFaJc3xowcTOYokKtf0bAKZg root@node03
The key's randomart image is:
+---[RSA 2048]----+
|..o B+=          |
|E+.Bo* =         |
|..=.. + .        |
|o. + = o         |
|. . % + S .      |
| o X + +         |
| = = +o.         |
|  o +..=         |
|   .+..          |
+----[SHA256]-----+
# Afterwards, the key files are generated under /root/.ssh:
[hadoop@node01 ~]# ll /root/.ssh
total 8
-rw-------. 1 root root 1679 Mar 24 06:47 id_rsa
-rw-r--r--. 1 root root  393 Mar 24 06:47 id_rsa.pub
# Copy the public key to every node:
ssh-copy-id node01
ssh-copy-id node02
ssh-copy-id node03
# Check that passwordless login works:
ssh node01
ssh node02
ssh node03
[hadoop@node01 ~]$ tar -zxvf /mnt/zookeeper-3.4.6.tar.gz -C .
Rename zoo_sample.cfg under zookeeper-3.4.6/conf to zoo.cfg:
[hadoop@node01 ~]$ cd zookeeper-3.4.6/conf
[hadoop@node01 conf]$ mv zoo_sample.cfg zoo.cfg
[hadoop@node01 conf]$ vim zoo.cfg
# Add:
dataDir=/home/hadoop/zookeeper-3.4.6/data
dataLogDir=/home/hadoop/zookeeper-3.4.6/log
server.1=192.168.42.131:2888:3888
server.2=192.168.42.132:2888:3888
server.3=192.168.42.139:2888:3888
# Note: 2888 is the port the ZooKeeper servers use to communicate with each other (followers connect to the leader on it), and 3888 is the port used for leader election
Note: the original configuration file already contains:
tickTime=2000
initLimit=10
syncLimit=5
clientPort=2181
Create the data and log directories:
[hadoop@node01 conf]$ cd ..
[hadoop@node01 zookeeper-3.4.6]$ mkdir -pv data log
Copy the directory to the other nodes:
scp -r /home/hadoop/zookeeper-3.4.6 node02:/home/hadoop
scp -r /home/hadoop/zookeeper-3.4.6 node03:/home/hadoop
Set myid to 1 on node01, 2 on node02, and 3 on node03:
[hadoop@node01 zookeeper-3.4.6]$ vim /home/hadoop/zookeeper-3.4.6/data/myid
1
[hadoop@node02 ~]$ vim /home/hadoop/zookeeper-3.4.6/data/myid
2
[hadoop@node03 ~]$ vim /home/hadoop/zookeeper-3.4.6/data/myid
3
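Equivalently, the myid files can be written without opening an editor (a minimal sketch; run each line on its own node):
# node01
echo 1 > /home/hadoop/zookeeper-3.4.6/data/myid
# node02
echo 2 > /home/hadoop/zookeeper-3.4.6/data/myid
# node03
echo 3 > /home/hadoop/zookeeper-3.4.6/data/myid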
Switch to the root user on all nodes and give other users write permission on the /var directory:
[root@node01 ~]# chmod 777 /var
[root@node02 ~]# chmod 777 /var
[root@node03 ~]# chmod 777 /var
Start ZooKeeper:
[hadoop@node01 ~]$ cd zookeeper-3.4.6/bin/
[hadoop@node01 bin]$ ./zkServer.sh start
JMX enabled by default
Using config: /home/hadoop/zookeeper-3.4.6/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[hadoop@node02 bin]$ ./zkServer.sh start
[hadoop@node03 bin]$ ./zkServer.sh start
Check the status on each of the three nodes:
[hadoop@node01 bin]$ ./zkServer.sh status
JMX enabled by default
Using config: /home/hadoop/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: follower
[hadoop@node02 bin]$ ./zkServer.sh status
JMX enabled by default
Using config: /home/hadoop/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: leader
[hadoop@node03 bin]$ ./zkServer.sh status
JMX enabled by default
Using config: /home/hadoop/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: follower
[hadoop@node01 bin]$ jps
19219 QuorumPeerMain
19913 Jps
[hadoop@node02 bin]$ jps
20016 Jps
19292 QuorumPeerMain
[hadoop@node03 bin]$ jps
19195 QuorumPeerMain
19900 Jps
Note: node01 and node03 are followers while node02 is the leader. You might wonder why node03, which has the largest myid, is not the leader. The election ends as soon as a quorum agrees: when node01 and node02 start, the two of them already form a majority of the three-node ensemble, node02 wins because its myid is larger than node01's, and node03, which starts last, simply joins as a follower.
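The role of each server can also be checked remotely with ZooKeeper's four-letter-word commands (a sketch; assumes nc is installed and the clientPort 2181 configured above):
# Print the Mode: line (leader or follower) reported by each server
for h in node01 node02 node03; do
  echo -n "$h -> "
  echo stat | nc $h 2181 | grep Mode
done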
Note: on CentOS and Red Hat, the firewall and SELinux must be disabled, otherwise checking the status will report this error:
JMX enabled by default
Using config: /home/hadoop/zookeeper-3.4.6/bin/../conf/zoo.cfg
Error contacting service. It is probably not running.
[hadoop@node01 ~]$ tar -zxvf /mnt/hadoop-3.2.2.tar.gz -C .
Configure the core component file (core-site.xml):
[hadoop@node01 ~]$ vim hadoop-3.2.2/etc/hadoop/core-site.xml
# Edit etc/hadoop/core-site.xml and add the following (the file contains no properties initially!)
<configuration>
  <!-- RPC address of the NameNode (the HDFS master) -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://node01:9000</value>
  </property>
  <!-- Directory for files Hadoop generates at runtime -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/hadoop-3.2.2/tmp</value>
  </property>
</configuration>
Configure the file system (hdfs-site.xml):
[hadoop@node01 ~]$ vim hadoop-3.2.2/etc/hadoop/hdfs-site.xml
# Edit etc/hadoop/hdfs-site.xml and add the following:
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/home/hadoop/hadoop-3.2.2/hdfs/name</value>
    <description>Where the NameNode stores HDFS namespace metadata</description>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/home/hadoop/hadoop-3.2.2/hdfs/data</value>
    <description>Physical storage location of data blocks on the DataNodes</description>
  </property>
  <!-- HDFS replication factor -->
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
Configure the MapReduce framework file (mapred-site.xml):
[hadoop@node01 ~]$ vim hadoop-3.2.2/etc/hadoop/mapred-site.xml
# Add the following:
<configuration>
  <!-- Tell the framework that MapReduce runs on YARN -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <!-- Environment variables for the ApplicationMaster (AM); if these are missing, MapReduce jobs may fail -->
  <property>
    <name>yarn.app.mapreduce.am.env</name>
    <value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
  </property>
  <property>
    <name>mapreduce.map.env</name>
    <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
  </property>
  <property>
    <name>mapreduce.reduce.env</name>
    <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
  </property>
</configuration>
Configure yarn-site.xml:
[hadoop@node01 ~]$ vim hadoop-3.2.2/etc/hadoop/yarn-site.xml
# Add the following:
<configuration>
  <!-- Hostname of the cluster master (ResourceManager) -->
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>node01</value>
  </property>
  <!-- Reducers fetch data via mapreduce_shuffle -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <!-- Disable virtual-memory checking; needed on virtual machines, otherwise containers fail with errors -->
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
  </property>
</configuration>
Configure the Hadoop environment variables (hadoop-env.sh):
[hadoop@node01 ~]$ vim hadoop-3.2.2/etc/hadoop/hadoop-env.sh
# Add the following:
export JAVA_HOME=/opt/java/jdk1.8.0_91
[Optional] Configure the workers file (Hadoop 3.x uses workers):
vim etc/hadoop/workers
node01
node02
node03
[Optional] Configure the slaves file (Hadoop 2.x uses slaves):
vim etc/hadoop/slaves
node02
node03
Copy the Hadoop directory from node01 to node02 and node03:
[hadoop@node01 ~]$ scp -r hadoop-3.2.2/ node02:/home/hadoop/
[hadoop@node01 ~]$ scp -r hadoop-3.2.2/ node03:/home/hadoop/
Configure the OS environment variables (do this on every node, as the regular hadoop user):
[hadoop@node01 ~]$ vim ~/.bash_profile
# The following lines are newly added
export JAVA_HOME=/opt/java/jdk1.8.0_91
export PATH=$JAVA_HOME/bin:$PATH
#hadoop
export HADOOP_HOME=/home/hadoop/hadoop-3.2.2
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
# Make the same additions on node02 and node03:
[hadoop@node02 ~]$ vim ~/.bash_profile
[hadoop@node03 ~]$ vim ~/.bash_profile
# Make the configuration take effect (run on all three nodes):
source ~/.bash_profile
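To confirm the environment variables were picked up, a quick check on each node (a sketch; the output should match the versions installed above):
# Should print the JDK 1.8.0_91 and Hadoop 3.2.2 version banners
java -version
hadoop version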