The setup uses a YARN cluster.
See the machine list below for details.
Download directory: /home/work/soft
Spark: 2.0.0
Hadoop: 2.7.3
ZooKeeper service address: zookeeper.waimai.baidu.com:2181/waimai/inf/spark-yarn
First configure the proxy, then download Hadoop and Spark:
wget http://d3kbcqa49mib13.cloudfront.net/spark-2.0.0-bin-hadoop2.7.tgz
wget http://apache.fayea.com/hadoop/common/hadoop-2.7.3/hadoop-2.7.3.tar.gz
The machines provided for our Spark cluster all use Shanghai time (UTC+8), so no configuration is needed; you can check with the date -R command. If the timezone differs, set it with the following commands:
sudo cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
vi /etc/sysconfig/clock   # check the detailed timezone settings
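Either way, on a correctly configured machine the offset shown by date -R should be +0800; the date below is only an illustration:
date -R
# e.g. Tue, 01 Nov 2016 10:00:00 +0800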
The machines are split into 2 ResourceManagers and 8 NodeManagers; spark00 is the active master and spark01 the standby.
IP | Hostname | Process |
---|---|---|
A.212 | AHOST.00.name | ResourceManager |
A.213 | AHOST.01.name | ResourceManager |
A.214 | AHOST.02.name | NodeManager |
A.215 | AHOST.03.name | NodeManager |
A.216 | AHOST.04.name | NodeManager |
A.217 | AHOST.05.name | NodeManager |
A.218 | AHOST.06.name | NodeManager |
A.219 | AHOST.07.name | NodeManager |
A.220 | AHOST.08.name | NodeManager |
A.221 | AHOST.09.name | NodeManager |
Add the following entries to /etc/hosts on all 10 machines:
A.212 AHOST.00.name
A.213 AHOST.01.name
A.214 AHOST.02.name
A.215 AHOST.03.name
A.216 AHOST.04.name
A.217 AHOST.05.name
A.218 AHOST.06.name
A.219 AHOST.07.name
A.220 AHOST.08.name
A.221 AHOST.09.name
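A quick sanity check after editing /etc/hosts is to make sure the hostnames resolve and are reachable, for example:
ping -c 1 AHOST.02.name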
In practice, the firewall did not interfere with the setup; if it does in your environment, the following commands may help:
Check status: sudo service iptables status
Start: sudo service iptables start
Stop: sudo service iptables stop
Run the following command on every node machine:
ssh-keygen -t rsa
Append the contents of /home/work/.ssh/id_rsa.pub from each node to ~/.ssh/authorized_keys on the master machine. Once every node's id_rsa.pub has been appended, copy the master's ~/.ssh/authorized_keys to all of the other machines:
scp ~/.ssh/authorized_keys work@A.213:~/.ssh/
scp ~/.ssh/authorized_keys work@A.214:~/.ssh/
scp ~/.ssh/authorized_keys work@A.215:~/.ssh/
scp ~/.ssh/authorized_keys work@A.216:~/.ssh/
scp ~/.ssh/authorized_keys work@A.217:~/.ssh/
scp ~/.ssh/authorized_keys work@A.218:~/.ssh/
scp ~/.ssh/authorized_keys work@A.219:~/.ssh/
scp ~/.ssh/authorized_keys work@A.220:~/.ssh/
scp ~/.ssh/authorized_keys work@A.221:~/.ssh/
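The append step itself is not spelled out above; a minimal sketch, assuming each node can still reach the master (A.212) with password authentication at this point:
# run on every node (the master included)
cat ~/.ssh/id_rsa.pub | ssh work@A.212 'cat >> ~/.ssh/authorized_keys'
# on every machine, make sure the permissions are strict enough for sshd
chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys
# then verify passwordless login, e.g. from the master:
ssh work@AHOST.02.name hostname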
Log in to the spark00 machine (the primary master).
Installation directory: /home/work/hadoop
Go to /home/work/soft and run the following commands:
tar -zxvf hadoop-2.7.3.tar.gz
cp -r hadoop-2.7.3/* /home/work/hadoop/
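To confirm the copy worked, the bundled version command can be run:
/home/work/hadoop/bin/hadoop version
# should report Hadoop 2.7.3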
The resulting Hadoop directory structure:
Configure the Hadoop environment variables. On the NNA and NNS machines (the two NameNode hosts), add the following to /etc/profile:
export HADOOP_HOME=/home/work/hadoop
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export YARN_HOME=/home/work/hadoop
export YARN_CONF_DIR=${YARN_HOME}/etc/hadoop
PATH=$JAVA_HOME/bin:$PATH:$HADOOP_HOME/bin
Run source /etc/profile to make the environment variables take effect, and verify with echo $HADOOP_HOME.
Go to /home/work/hadoop/etc/hadoop; all of the files that need editing live in this directory. Since this is still an experimental stage, I start with a 1-active / 1-standby / 3-slave layout.
Before configuring, create the directories that the configuration will reference:
mkdir -p /home/work/tmp
mkdir -p /home/work/data/tmp/journal
mkdir -p /home/work/data/dfs/namenode
mkdir -p /home/work/data/dfs/datanode
mkdir -p /home/work/data/yarn/local
mkdir -p /home/work/log/yarn
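The same paths are referenced by the XML configuration below, so every machine in the cluster needs them. A sketch for creating them remotely, assuming the passwordless SSH set up earlier and the 1-active / 1-standby / 3-slave layout:
for h in AHOST.01.name AHOST.02.name AHOST.03.name AHOST.04.name; do
  ssh work@$h 'mkdir -p /home/work/tmp /home/work/data/tmp/journal /home/work/data/dfs/namenode /home/work/data/dfs/datanode /home/work/data/yarn/local /home/work/log/yarn'
done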
JDK path used by Hadoop (in hadoop-env.sh):
export JAVA_HOME=/usr/java/jdk1.8.0_65/
JDK path used by YARN (in yarn-env.sh):
export JAVA_HOME=/usr/java/jdk1.8.0_65/
Configure the slave nodes (the slaves file):
AHOST.02.name
AHOST.03.name
AHOST.04.name
core-site.xml:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://cluster1</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/work/tmp</value>
</property>
<property>
<name>hadoop.proxyuser.hadoop.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hadoop.groups</name>
<value>*</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>zookeeper.waimai.baidu.com:2181/waimai/inf/spark-yarn</value>
</property>
</configuration>
hdfs-site.xml:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>dfs.nameservices</name>
<value>cluster1</value>
</property>
<property>
<name>dfs.ha.namenodes.cluster1</name>
<value>nna,nns</value>
</property>
<property>
<name>dfs.namenode.rpc-address.cluster1.nna</name>
<value>AHOST.00.name:9000</value>
</property>
<property>
<name>dfs.namenode.rpc-address.cluster1.nns</name>
<value>AHOST.01.name:9000</value>
</property>
<property>
<name>dfs.namenode.http-address.cluster1.nna</name>
<value>AHOST.00.name:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.cluster1.nns</name>
<value>AHOST.01.name:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://AHOST.02.name:8485;AHOST.03.name:8485;AHOST.04.name:8485/cluster1</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.cluster1</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/work/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/home/work/data/tmp/journal</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/work/data/dfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/work/data/dfs/datanode</value>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.journalnode.http-address</name>
<value>0.0.0.0:8480</value>
</property>
<property>
<name>dfs.journalnode.rpc-address</name>
<value>0.0.0.0:8485</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>zookeeper.waimai.baidu.com:2181/waimai/inf/spark-yarn</value>
</property>
</configuration>

First copy mapred-site.xml from its template:
cp mapred-site.xml.template mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>AHOST.00.name:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>AHOST.00.name:19888</value>
</property>
</configuration>

yarn-site.xml:
<?xml version="1.0"?>
<configuration>
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>yarn-cluster</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>AHOST.00.name</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>AHOST.01.name</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm1</name>
<value>AHOST.00.name:8088</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm2</name>
<value>AHOST.01.name:8088</value>
</property>
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>zookeeper.waimai.baidu.com:2181/waimai/inf/spark-yarn</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>102400</value>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>1024</value>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>327680</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address.rm1</name>
<value>AHOST.00.name:8130</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address.rm2</name>
<value>AHOST.01.name:8130</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address.rm1</name>
<value>AHOST.00.name:8131</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address.rm2</name>
<value>AHOST.01.name:8131</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address.rm1</name>
<value>AHOST.00.name:8033</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address.rm2</name>
<value>AHOST.01.name:8033</value>
</property>
<property>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>12</value>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-vcores</name>
<value>48</value>
</property>
</configuration>

Use scp to distribute the /home/work/hadoop directory to the other machines:
scp -r ~/hadoop A.213:~/
scp -r ~/hadoop A.214:~/
scp -r ~/hadoop A.215:~/
scp -r ~/hadoop A.216:~/
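A quick way to confirm the configuration was picked up on any of these machines is to query it back through hdfs getconf:
/home/work/hadoop/bin/hdfs getconf -confKey fs.defaultFS
# expected: hdfs://cluster1
/home/work/hadoop/bin/hdfs getconf -confKey dfs.ha.namenodes.cluster1
# expected: nna,nns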
One thing to watch out for: with Hadoop 2.7.3, hadoop namenode -format requires DFS to be running first; alternatively it is enough to start just the JournalNode processes (/home/work/hadoop/sbin/hadoop-daemon.sh start journalnode), as sketched below.
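A sketch of that second option, using the JournalNode hosts from dfs.namenode.shared.edits.dir above:
# on each of AHOST.02.name, AHOST.03.name and AHOST.04.name
/home/work/hadoop/sbin/hadoop-daemon.sh start journalnode
# then, on spark00
/home/work/hadoop/bin/hadoop namenode -format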
On the spark00 machine, run:
/home/work/hadoop/sbin/start-dfs.sh
From the log output you can see that start-dfs.sh launches the following processes (you can confirm this with jps, as shown after the list):
- the NameNode process on the active and standby master machines
- the DataNode process on all slave machines
- the JournalNode process on all slave machines
- the ZKFC process on the active and standby master machines
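You can confirm this with jps on each machine; roughly what to expect:
jps
# on spark00 / spark01: NameNode, DFSZKFailoverController
# on each slave:        DataNode, JournalNode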
Whatever gets started can also be stopped: if you no longer want DFS running, execute /home/work/hadoop/sbin/stop-dfs.sh
If startup fails, look under /home/work/hadoop/logs for the relevant files ending in .log. For example, if the NameNode on the master machine fails to start, check
/home/work/hadoop/logs/hadoop-work-namenode-AHOST.00.name.log
In our case the problem was inconsistent metadata; formatting the NameNode fixed it:
hadoop namenode -format
/home/work/hadoop/sbin/stop-dfs.sh
/home/work/hadoop/sbin/start-dfs.sh
If the web UI (for example the NameNode page on port 50070) cannot be opened at this point, you may need to configure a proxy. The proxy is nj02-lbs-impala2.nj02.baidu.com, port 8002. Make sure it is configured as a SOCKS proxy, not an HTTP proxy.
Start the active ResourceManager (this also starts the NodeManagers). On the spark00 machine:
> /home/work/hadoop/sbin/start-yarn.sh
Start the standby ResourceManager. On the spark01 machine:
/home/work/hadoop/sbin/yarn-daemon.sh start resourcemanager
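With both ResourceManagers up, you can check which one is active using the rm-ids defined in yarn-site.xml:
/home/work/hadoop/bin/yarn rmadmin -getServiceState rm1
/home/work/hadoop/bin/yarn rmadmin -getServiceState rm2
# one should report "active" and the other "standby"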
Stop the active ResourceManager (this also stops the NodeManagers). On the spark00 machine:
> /home/work/hadoop/sbin/stop-yarn.sh
Stop the standby ResourceManager. On the spark01 machine:
/home/work/hadoop/sbin/yarn-daemon.sh stop resourcemanager
The Spark installation directory is /home/work/spark. Go to /home/work/soft and run the following commands:
tar -zxvf spark-2.0.0-bin-hadoop2.7.tgz
mv spark-2.0.0-bin-hadoop2.7/ ~/spark
The Spark directory structure:
Spark environment variables (added to /etc/profile as before):
export SPARK_HOME=/home/work/spark
PATH=$JAVA_HOME/bin:$PATH:$HADOOP_HOME/bin:$SPARK_HOME/bin
Go to /home/work/spark/conf and copy spark-env.sh.template to spark-env.sh:
cp spark-env.sh.template spark-env.sh
vi spark-env.sh
Set the following:
export JAVA_HOME=/usr/java/jdk1.8.0_65/
export HADOOP_HOME=/home/work/hadoop
In /home/work/spark/conf, copy slaves.template to slaves:
cp slaves.template slaves
vi slaves
Set the following:
AHOST.02.name
AHOST.03.name
AHOST.04.name
cp log4j.properties.template log4j.properties
Directory structure after configuration:
Run the SparkPi example on YARN (DFS must be running, otherwise the master and slaves cannot communicate with each other):
./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode cluster --driver-memory 4g --executor-memory 2g --executor-cores 1 examples/jars/spark-examples*.jar 10
The run can be monitored through the YARN web UI rather than the Spark UI; the Spark UI shows no application data here.
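Besides the web UI, the YARN command line can also list and inspect applications; the application id below is a placeholder:
/home/work/hadoop/bin/yarn application -list
/home/work/hadoop/bin/yarn application -status <application_id>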
When we first set things up we used only 3 slaves, leaving 5 machines unused; now is a good time to try adding slave machines to the cluster and to walk through a scale-out exercise.
Run the following commands on spark00:
/home/work/hadoop/sbin/stop-dfs.sh
/home/work/hadoop/sbin/stop-yarn.sh
/home/work/spark/sbin/stop-all.sh
Run the following command on spark01:
/home/work/hadoop/sbin/yarn-daemon.sh stop resourcemanager
STEP 1: vi /home/work/hadoop/etc/hadoop/slaves and append the new slave hostnames:
AHOST.05.name
AHOST.06.name
AHOST.07.name
AHOST.08.name
AHOST.09.name
STEP 2: vi /home/work/spark/conf/slaves and append the same hostnames:
AHOST.05.name
AHOST.06.name
AHOST.07.name
AHOST.08.name
AHOST.09.name
Sync the hadoop and spark directories to the other machines:
scp -r ~/hadoop A.213:~/
scp -r ~/hadoop A.214:~/
scp -r ~/hadoop A.215:~/
scp -r ~/hadoop A.216:~/
scp -r ~/hadoop A.217:~/
scp -r ~/hadoop A.218:~/
scp -r ~/hadoop A.219:~/
scp -r ~/hadoop A.220:~/
scp -r ~/hadoop A.221:~/
Run the following commands on spark00:
/home/work/hadoop/sbin/start-dfs.sh
/home/work/hadoop/sbin/start-yarn.sh
/home/work/spark/sbin/start-all.sh
Run the following command on spark01:
/home/work/hadoop/sbin/yarn-daemon.sh start resourcemanager
Note that the ~/hadoop/etc/hadoop/hdfs-site.xml file needs to be replaced as well.
Spark releases come out frequently, so upgrades are common; the following uses 2.0.2 as an example of how to upgrade.
After downloading and extracting spark-2.0.2-bin-hadoop2.7.tgz:
mv ~/spark ~/spark-2.0.1
mv spark-2.0.2-bin-hadoop2.7 ~/spark
cp -r ~/spark-2.0.1/conf/* ~/spark/conf/
The Spark upgrade is done.
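A simple way to confirm the switch is to ask the new installation for its version:
/home/work/spark/bin/spark-submit --version
# should report version 2.0.2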