Build a Hadoop big data platform from scratch; the hands-on environment is CentOS 7.x, the current mainstream server operating system. Every Hadoop daemon is a Java process: deploying a service means starting a JVM that runs it.
HDFS: stores the data and serves it for analysis;
NameNode/DataNode
YARN: supplies the resources that programs run on;
ResourceManager/NodeManager
OS version: CentOS 7.x x86_64
Java version: JDK 1.8.0_131
Hadoop version: hadoop-3.2.3
192.168.199.145 datanode, nodemanager, namenode, secondary namenode, resource manager
192.168.199.146 datanode, nodemanager
192.168.199.147 datanode, nodemanager
Apply the following configuration on node1, node2 and node3:
cat >/etc/hosts<<EOF
127.0.0.1 localhost localhost.localdomain
192.168.199.145 node1
192.168.199.146 node2
192.168.199.147 node3
EOF
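A quick reachability check (optional) confirms the /etc/hosts entries resolve and all nodes answer:
#Each ping should resolve node1/node2/node3 to the IPs above;
for i in 1 2 3; do ping -c 1 -W 1 node$i; done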
#Disable SELinux permanently (takes effect after reboot);
sed -i '/SELINUX/s/enforcing/disabled/g' /etc/sysconfig/selinux
#Switch SELinux to permissive mode for the current session;
setenforce 0
#Stop and disable the firewall;
systemctl stop firewalld.service
systemctl disable firewalld.service
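A quick sanity check: getenforce should print Permissive (Disabled after a reboot), and firewalld should report inactive:
getenforce
systemctl is-active firewalld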
#Install time-sync and transfer utilities;
yum install ntpdate rsync lrzsz -y
#Sync the system clock once;
ntpdate pool.ntp.org
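ntpdate performs a one-shot sync, and HDFS/YARN are sensitive to clock skew between nodes; keeping the clocks aligned with a cron entry is a common pattern (a sketch, root's crontab assumed):
#Re-sync the clock hourly;
echo "0 * * * * /usr/sbin/ntpdate pool.ntp.org" >> /var/spool/cron/root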
#Set the hostname from the /etc/hosts entry that matches the local IP, then open a new shell so the prompt picks it up;
hostname `cat /etc/hosts|grep $(ifconfig|grep broadcast|awk '{print $2}')|awk '{print $2}'`;su
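The one-liner above derives the hostname from /etc/hosts; if it misfires (for example, on a host with several interfaces), setting the name explicitly per node is a safe fallback using the standard CentOS 7 tool:
#Run on each node with its own name (node2/node3 accordingly);
hostnamectl set-hostname node1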
node1 acts as the master control node; run the following commands on it to generate a key pair, then copy the public key to all three nodes.
ssh-keygen -t rsa -N '' -f /root/.ssh/id_rsa -q
ssh-copy-id -i /root/.ssh/id_rsa.pub root@node1
ssh-copy-id -i /root/.ssh/id_rsa.pub root@node2
ssh-copy-id -i /root/.ssh/id_rsa.pub root@node3
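Passwordless login can now be verified from node1; each command should print the remote hostname without asking for a password:
for i in 1 2 3; do ssh root@node$i hostname; done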
Install the JDK on every node:
#Extract the JDK package;
tar -xvzf jdk1.8.0_131.tar.gz
#Create the JDK deployment directory;
mkdir -p /usr/java/
\mv jdk1.8.0_131 /usr/java/
#Set the environment variables;
cat>>/etc/profile<<EOF
export JAVA_HOME=/usr/java/jdk1.8.0_131/
export HADOOP_HOME=/data/hadoop/
export JAVA_LIBRARY_PATH=/data/hadoop/lib/native/
export PATH=\$PATH:\$HADOOP_HOME/bin/:\$JAVA_HOME/bin
EOF
#Load the environment variables and verify;
source /etc/profile
java -version
#Download the Hadoop package;
yum install wget -y
wget -c https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/hadoop-3.2.3/hadoop-3.2.3.tar.gz
#Extract the Hadoop package;
tar -xzvf hadoop-3.2.3.tar.gz
#Create the Hadoop program & data directory;
mkdir -p /data/
#Deploy the Hadoop distribution under /data/hadoop;
\mv hadoop-3.2.3/ /data/hadoop/
#Verify that Hadoop is in place;
ls -l /data/hadoop/
Edit /data/hadoop/etc/hadoop/core-site.xml, which names the default filesystem and the temporary directory:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://node1:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/tmp/hadoop-${user.name}</value>
    <description>A base for other temporary directories.</description>
  </property>
</configuration>
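fs.default.name is the deprecated alias of fs.defaultFS in Hadoop 3.x; either key resolves to the same setting. Once the configuration is in place, you can confirm the value Hadoop actually sees:
#Should print hdfs://node1:9000;
hdfs getconf -confKey fs.defaultFS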
Next, edit /data/hadoop/etc/hadoop/mapred-site.xml:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>node1:9001</value>
  </property>
</configuration>
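mapred.job.tracker is an MR1-era property; on Hadoop 3.x, MapReduce jobs are scheduled by YARN, and the key that selects that runtime is mapreduce.framework.name. If you intend to submit MapReduce jobs, the following property (not part of the original configuration) is the one Hadoop 3 reads:
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>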
Then edit /data/hadoop/etc/hadoop/hdfs-site.xml, which sets the NameNode and DataNode storage paths and the replication factor:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/data/hadoop/data_name1,/data/hadoop/data_name2</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/data/hadoop/data_1,/data/hadoop/data_2</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
echo "export JAVA_HOME=/usr/java/jdk1.8.0_131/" >> /data/hadoop/etc/hadoop/hadoop-env.sh
#Declare the worker (DataNode/NodeManager) nodes;
cat>/data/hadoop/etc/hadoop/workers<<EOF
node1
node2
node3
EOF
#Hadoop 3.x refuses to start daemons as root unless the *_USER variables are set; inject them into every start/stop script;
cd /data/hadoop/sbin/
for i in `ls start*.sh stop*.sh`;do sed -i "1a\HDFS_DATANODE_USER=root\nHDFS_DATANODE_SECURE_USER=root\nHDFS_NAMENODE_USER=root\nHDFS_SECONDARYNAMENODE_USER=root\nYARN_RESOURCEMANAGER_USER=root\nYARN_NODEMANAGER_USER=root" $i ;done
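An equivalent, arguably cleaner alternative (a sketch) is to declare the run-as users once in hadoop-env.sh instead of patching every script; Hadoop's startup scripts read these exports as well:
cat>>/data/hadoop/etc/hadoop/hadoop-env.sh<<EOF
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
EOF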
#Create the deployment directory on node2 and node3;
for i in `seq 2 3`;do ssh -l root node$i "mkdir -p /data/hadoop/" ;done
#Sync the Hadoop distribution, including the edited configs, to the other nodes;
for i in `seq 2 3`;do rsync -aP --delete /data/hadoop/ root@node$i:/data/hadoop/ ;done
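An optional spot check confirms the configuration landed on the other nodes:
for i in `seq 2 3`;do ssh root@node$i "ls -l /data/hadoop/etc/hadoop/core-site.xml" ;done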
Before starting Hadoop there is one critical step: run the format command on the NameNode to initialize the name and data directories.
#Initialize the cluster (hdfs namenode -format is the non-deprecated spelling of the same command);
hadoop namenode -format
#Stop all services;
/data/hadoop/sbin/stop-all.sh
#Force-kill any leftover Hadoop Java processes;
ps -ef|grep hadoop|grep java |grep -v grep |awk '{print $2}'|xargs kill -9
sleep 2
#Start all services;
/data/hadoop/sbin/start-all.sh
Check the Hadoop service processes and listening ports on each of the three nodes:
#Check the service processes;
ps -ef|grep -aiE hadoop
#Check the listening ports;
netstat -ntpl
#List the Java processes with jps (node1 should show NameNode, SecondaryNameNode, ResourceManager, DataNode and NodeManager; node2/node3 show DataNode and NodeManager);
jps
#Follow the Hadoop log files;
tail -fn 100 /data/hadoop/logs/*
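Beyond the process checks, a short HDFS smoke test (the /test path here is arbitrary) proves the cluster actually accepts and reports data:
#Report cluster capacity and live DataNodes;
hdfs dfsadmin -report
#Write a file into HDFS and list it back;
hdfs dfs -mkdir /test
hdfs dfs -put /etc/hosts /test/
hdfs dfs -ls /test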
With the configuration above, the Hadoop big data platform is deployed. Open the NameNode web UI on node1 at port 9870 (URL: http://192.168.199.145:9870/).
Open the cluster's YARN ResourceManager web UI at: http://192.168.199.145:8088/
With that, the Hadoop big data cluster is fully set up.