
Spark from Beginner to Expert, Part 1: Building a Spark 1.5.0 Cluster

Reposted from: http://blog.csdn.net/lovehuangjiaju/article/details/48183485?spm=5176.100239.blogcont60309.4.2c4910bdupAcRE


Author: Zhou Zhihu
Online handle: 摇摆少年梦
WeChat: zhouzhihubeyond

Topics covered in this section

  1. Operating system environment preparation
  2. Hadoop 2.4.1 cluster setup
  3. Spark 1.5.0 cluster deployment

Note: While building the Spark 1.5 cluster on CentOS 6.5, I found that the Hadoop 2.4.1 cluster came up without problems, but the Spark 1.5.0 cluster failed to start (possibly because of the 64-bit OS and a need to recompile from source, though I did not verify this). The same steps worked successfully on Ubuntu 10.04. Feel free to try CentOS 6.5 first and fall back to Ubuntu 10.04 if you run into trouble; the steps are essentially identical.

1. Operating system environment preparation

(1) Install VMware

  1. Download: http://pan.baidu.com/s/1bniBipD
  2. Access code: pbdw
  3. Installation steps omitted

(2) Download and install the operating system

Ubuntu 10.04 download:

Link: http://pan.baidu.com/s/1kTy9Umj  Access code: 2w5b

CentOS 6.5 download:

  1. Download: http://pan.baidu.com/s/1mgkuKdi
  2. Access code: xtm5

This walkthrough requires three CentOS 6.5 virtual machines. You can install each one separately, or install one and clone it twice (details omitted); beginners are advised to install all three separately. After installation the result looks like this: (screenshot)

(3) CentOS 6.5 network configuration

A freshly installed virtual machine defaults to NAT networking (for NAT, bridged, and the other VM networking modes, see my earlier post: http://blog.csdn.net/lovehuangjiaju/article/details/48183485). Since the three machines need to reach each other as well as the host, switch the network connection to Bridged on all three machines, as shown below: (screenshot)

Change the hostnames

(1) Change the hostname of the centos_slave01 VM:

vim /etc/sysconfig/network

/etc/sysconfig/network after the change: (screenshot)

(2) Change the hostname of the centos_slave02 VM with vim /etc/sysconfig/network.
/etc/sysconfig/network after the change: (screenshot)

(3) Change the hostname of the centos_slave03 VM with vim /etc/sysconfig/network.
/etc/sysconfig/network after the change: (screenshot)

Set static IP addresses

Edit /etc/sysconfig/network-scripts/ifcfg-eth0 on each machine and set BOOTPROTO=static together with the IPADDR, NETMASK, GATEWAY, and DNS1 fields.

(1) Set the IP address of the centos_slave01 VM:

vim /etc/sysconfig/network-scripts/ifcfg-eth0

The file after the change:

  DEVICE="eth0"
  BOOTPROTO="static"
  HWADDR="00:0c:29:3f:69:4d"
  IPV6INIT="yes"
  NM_CONTROLLED="yes"
  ONBOOT="yes"
  TYPE="Ethernet"
  UUID="5315276c-db0d-4061-9c76-9ea86ba9758e"
  IPADDR="192.168.1.111"
  NETMASK="255.255.255.0"
  GATEWAY="192.168.1.1"
  DNS1="8.8.8.8"

(screenshot)

(2) Set the IP address of the centos_slave02 VM:

vim /etc/sysconfig/network-scripts/ifcfg-eth0

The file after the change:

  DEVICE="eth0"
  BOOTPROTO="static"
  HWADDR="00:0c:29:64:f9:80"
  IPV6INIT="yes"
  NM_CONTROLLED="yes"
  ONBOOT="yes"
  TYPE="Ethernet"
  UUID="5315276c-db0d-4061-9c76-9ea86ba9758e"
  IPADDR="192.168.1.112"
  NETMASK="255.255.255.0"
  GATEWAY="192.168.1.1"
  DNS1="8.8.8.8"

(screenshot)

(3) Set the IP address of the centos_slave03 VM:

vim /etc/sysconfig/network-scripts/ifcfg-eth0

The file after the change:

  DEVICE="eth0"
  BOOTPROTO="static"
  HWADDR="00:0c:29:1e:80:b1"
  IPV6INIT="yes"
  NM_CONTROLLED="yes"
  ONBOOT="yes"
  TYPE="Ethernet"
  UUID="5315276c-db0d-4061-9c76-9ea86ba9758e"
  IPADDR="192.168.1.113"
  NETMASK="255.255.255.0"
  GATEWAY="192.168.1.1"
  DNS1="8.8.8.8"

(screenshot)

What the fields in /etc/sysconfig/network-scripts/ifcfg-eth0 mean:

  DEVICE=eth0                // device name
  BOOTPROTO=static           // boot protocol, dhcp|static; must be static for bridged mode
  HWADDR=00:06:5B:FE:DF:7C   // hardware MAC address
  IPADDR=192.168.0.2         // IP address
  NETMASK=255.255.255.0      // subnet mask
  NETWORK=192.168.0.0        // network address
  GATEWAY=192.168.0.1        // gateway address
  ONBOOT=yes                 // bring the interface up at boot
  TYPE=Ethernet              // network type

After editing the file, run

service network restart

to restart the networking service and apply the configuration.
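To confirm the static address actually took effect, a quick check such as the following can help (a minimal sketch; it assumes the interface name eth0 and the gateway 192.168.1.1 used in the configuration above):

  # show the address assigned to eth0 (expected: the static IP configured above)
  ip addr show eth0
  # confirm the default route points at the configured gateway
  ip route | grep default
  # ping the gateway to confirm basic layer-3 connectivity
  ping -c 3 192.168.1.1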

Map hostnames to IP addresses

(1) Edit the hostname-to-IP mapping on centos_slave01

vim /etc/hosts

Set the contents to:

  127.0.0.1 slave01.example.com localhost localhost.localdomain localhost4 localhost4.localdomain4
  ::1 slave01.example.com
  192.168.1.111 slave01.example.com
  192.168.1.112 slave02.example.com
  192.168.1.113 slave03.example.com

As shown below: (screenshot)

(2) Edit the hostname-to-IP mapping on centos_slave02

vim /etc/hosts

Set the contents to:

  127.0.0.1 slave02.example.com localhost localhost.localdomain localhost4 localhost4.localdomain4
  ::1 slave02.example.com
  192.168.1.111 slave01.example.com
  192.168.1.112 slave02.example.com
  192.168.1.113 slave03.example.com

As shown below: (screenshot)

(3) Edit the hostname-to-IP mapping on centos_slave03

vim /etc/hosts

Set the contents to:

  127.0.0.1 slave03.example.com localhost localhost.localdomain localhost4 localhost4.localdomain4
  ::1 slave03.example.com
  192.168.1.111 slave01.example.com
  192.168.1.112 slave02.example.com
  192.168.1.113 slave03.example.com

(screenshot)

Set the DNS servers

Use the following command to configure DNS on each host (same setting on all three machines):

vim /etc/resolv.conf

Contents after the change:

  # Generated by NetworkManager
  search example.com
  nameserver 8.8.8.8

8.8.8.8 is Google's public DNS server.

Network connectivity test

Once all of the above is done, reboot centos_slave01, centos_slave02, and centos_slave03 so the hostname changes take effect, then run the following tests on each machine. Only the tests from centos_slave01 are shown here:

  [root@slave01 ~]# ping slave02.example.com
  PING slave02.example.com (192.168.1.112) 56(84) bytes of data.
  64 bytes from slave02.example.com (192.168.1.112): icmp_seq=1 ttl=64 time=0.417 ms
  64 bytes from slave02.example.com (192.168.1.112): icmp_seq=2 ttl=64 time=0.355 ms
  64 bytes from slave02.example.com (192.168.1.112): icmp_seq=3 ttl=64 time=0.363 ms
  ^C
  --- slave02.example.com ping statistics ---
  3 packets transmitted, 3 received, 0% packet loss, time 2719ms
  rtt min/avg/max/mdev = 0.355/0.378/0.417/0.031 ms
  [root@slave01 ~]# ping slave03.example.com
  PING slave03.example.com (192.168.1.113) 56(84) bytes of data.
  64 bytes from slave03.example.com (192.168.1.113): icmp_seq=1 ttl=64 time=0.386 ms
  64 bytes from slave03.example.com (192.168.1.113): icmp_seq=2 ttl=64 time=0.281 ms
  ^C
  --- slave03.example.com ping statistics ---
  2 packets transmitted, 2 received, 0% packet loss, time 1799ms
  rtt min/avg/max/mdev = 0.281/0.333/0.386/0.055 ms

Test external connectivity (in my environment 8.8.8.8 turned out to be unreachable, hence the failures below):

  [root@slave01 ~]# ping www.baidu.com
  ping: unknown host www.baidu.com
  [root@slave01 ~]# ping 8.8.8.8
  PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
  From 192.168.1.111 icmp_seq=2 Destination Host Unreachable
  From 192.168.1.111 icmp_seq=3 Destination Host Unreachable
  From 192.168.1.111 icmp_seq=4 Destination Host Unreachable
  From 192.168.1.111 icmp_seq=6 Destination Host Unreachable
  From 192.168.1.111 icmp_seq=7 Destination Host Unreachable
  From 192.168.1.111 icmp_seq=8 Destination Host Unreachable

(4) Passwordless SSH login

(1) Install OpenSSH

If ping 8.8.8.8 succeeds, the host can reach the internet; if it fails, switch the network connection back to NAT and change the network configuration file to dhcp. With network connectivity in place, run:

yum install openssh-server

(2) Set up passwordless login

Generate a key pair with the following command (run the same command on all three machines):

ssh-keygen -t rsa

Just press Enter at every prompt:

  [root@slave01 ~]# ssh-keygen -t rsa
  Generating public/private rsa key pair.
  Enter file in which to save the key (/root/.ssh/id_rsa):
  Enter passphrase (empty for no passphrase):
  Enter same passphrase again:
  Your identification has been saved in /root/.ssh/id_rsa.
  Your public key has been saved in /root/.ssh/id_rsa.pub.
  The key fingerprint is:
  4e:2f:39:ed:f4:32:2e:a3:55:62:f5:8a:0d:c5:2c:16 root@slave01.example.com
  The key's randomart image is:
  +--[ RSA 2048]----+
  | E |
  | + |
  | o = |
  | . + . |
  | S . . |
  | + X . |
  | B * |
  | .o=o. |
  | .. +oo. |
  +-----------------+

The command produces /root/.ssh/id_rsa (the private key) and /root/.ssh/id_rsa.pub (the public key).

Then copy the public key to every machine you want to log in to without a password (run all three commands on each machine); a quick verification sketch follows below:

  ssh-copy-id -i slave01.example.com
  ssh-copy-id -i slave02.example.com
  ssh-copy-id -i slave03.example.com
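Once the keys are copied, passwordless login can be confirmed from the current machine with a small loop (a minimal sketch using the hostnames configured above):

  # each command should print the remote hostname without asking for a password
  for h in slave01 slave02 slave03; do
      ssh ${h}.example.com hostname
  done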

2. Hadoop 2.4.1 cluster setup

Download link for all software used in the cluster setup:

Link: http://pan.baidu.com/s/1sjIG3b3  Access code: 38gh

After downloading, put everything into the share directory on drive E: (screenshot)

Set the share folder as a shared folder of the virtual machine, as shown below: (screenshot)

On the Linux side, the shared directory can be reached with:

  [root@slave01 /]# cd /mnt/hgfs/share
  [root@slave01 share]# ls

as shown below: (screenshot)

JDK and Scala versions required by Spark

Spark runs on Java 7+, Python 2.6+ and R 3.1+. For the Scala API, Spark 1.5.0 uses Scala 2.10. You will need to use a compatible Scala version (2.10.x).

(1) Install JDK 1.8

Create a sparkLearning directory under the root directory; all software used later will live there:

  [root@slave01 /]# mkdir /sparkLearning
  [root@slave01 /]# ls
  bin etc lib media proc selinux sys var
  boot hadoopLearning lib64 mnt root sparkLearning tmp
  dev home lost+found opt sbin srv usr

Copy the JDK archive from the shared directory into /sparkLearning and unpack it:

  [root@slave01 share]# cp /mnt/hgfs/share/jdk-8u40-linux-x64.gz /sparkLearning/
  [root@slave01 share]# cd /sparkLearning/
  // unpack
  [root@slave01 sparkLearning]# tar -zxvf jdk-8u40-linux-x64.gz

Set the environment variables:

[root@slave01 sparkLearning]# vim /etc/profile

Append the following at the end of the file:

  export JAVA_HOME=/sparkLearning/jdk1.8.0_40
  export PATH=${JAVA_HOME}/bin:$PATH

As shown below: (screenshot)

Check that the configuration works:

  // apply the updated profile
  [root@slave01 sparkLearning]# source /etc/profile
  // confirm the environment variable is set
  [root@slave01 sparkLearning]# $JAVA_HOME
  bash: /sparkLearning/jdk1.8.0_40: is a directory
  // confirm java is installed and configured correctly
  [root@slave01 sparkLearning]# java -version
  java version "1.8.0_40"
  Java(TM) SE Runtime Environment (build 1.8.0_40-b25)
  Java HotSpot(TM) 64-Bit Server VM (build 25.40-b25, mixed mode)

(2) Install Scala 2.10.4

  // copy the archive into the sparkLearning directory
  [root@slave01 sparkLearning]# cp /mnt/hgfs/share/scala-2.10.4.tgz .
  // unpack
  [root@slave01 sparkLearning]# tar -zxvf scala-2.10.4.tgz > /dev/null
  [root@slave01 sparkLearning]# vim /etc/profile

Change the end of /etc/profile to:

  export JAVA_HOME=/sparkLearning/jdk1.8.0_40
  export SCALA_HOME=/sparkLearning/scala-2.10.4
  export PATH=${JAVA_HOME}/bin:${SCALA_HOME}/bin:$PATH

Check that Scala is installed correctly:

  [root@slave01 sparkLearning]# source /etc/profile
  [root@slave01 sparkLearning]# $SCALA_HOME
  bash: /sparkLearning/scala-2.10.4: is a directory
  [root@slave01 sparkLearning]# scala -version
  Scala code runner version 2.10.4 -- Copyright 2002-2013, LAMP/EPFL

(3) ZooKeeper 3.4.5 cluster setup

  [root@slave01 sparkLearning]# cp /mnt/hgfs/share/zookeeper-3.4.5.tar.gz .
  [root@slave01 sparkLearning]# tar -zxvf zookeeper-3.4.5.tar.gz > /dev/null
  [root@slave01 sparkLearning]# cp zookeeper-3.4.5/conf/zoo_sample.cfg zookeeper-3.4.5/conf/zoo.cfg
  [root@slave01 sparkLearning]# vim zookeeper-3.4.5/conf/zoo.cfg

Change dataDir to:

dataDir=/sparkLearning/zookeeper-3.4.5/zookeeper_data

and append the following at the end of the file:

  server.1=slave01.example.com:2888:3888
  server.2=slave02.example.com:2888:3888
  server.3=slave03.example.com:2888:3888

As shown below: (screenshots)
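If you prefer to make the same edits non-interactively, something like the following achieves the equivalent result (a sketch only; it assumes zoo.cfg lives at the conf/ path used above and that dataDir still has the sample default value):

  CFG=/sparkLearning/zookeeper-3.4.5/conf/zoo.cfg
  # point dataDir at the cluster data directory created in the next step
  sed -i 's#^dataDir=.*#dataDir=/sparkLearning/zookeeper-3.4.5/zookeeper_data#' "$CFG"
  # append the three server entries
  {
      echo 'server.1=slave01.example.com:2888:3888'
      echo 'server.2=slave02.example.com:2888:3888'
      echo 'server.3=slave03.example.com:2888:3888'
  } >> "$CFG"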

Create the ZooKeeper data directory and the myid file:

  [root@slave01 sparkLearning]# cd zookeeper-3.4.5/
  [root@slave01 zookeeper-3.4.5]# mkdir zookeeper_data
  [root@slave01 zookeeper-3.4.5]# cd zookeeper_data/
  [root@slave01 zookeeper_data]# touch myid
  [root@slave01 zookeeper_data]# echo 1 > myid

Copy the sparkLearning directory from slave01.example.com (centos_slave01) to the other two servers:

  [root@slave01 /]# scp -r /sparkLearning slave02.example.com:/
  [root@slave01 /]# scp -r /sparkLearning slave03.example.com:/

Overwrite /etc/profile as well:

  [root@slave01 /]# scp /etc/profile slave02.example.com:/etc/profile
  [root@slave01 /]# scp /etc/profile slave03.example.com:/etc/profile

Adjust the myid value in zookeeper_data on each node:

  // set myid on slave02.example.com
  [root@slave01 /]# ssh slave02.example.com
  [root@slave02 ~]# echo 2 > /sparkLearning/zookeeper-3.4.5/zookeeper_data/myid
  [root@slave02 ~]# more /sparkLearning/zookeeper-3.4.5/zookeeper_data/myid
  2
  // set myid on slave03.example.com
  [root@slave02 ~]# ssh slave03.example.com
  Last login: Fri Sep 18 01:33:29 2015 from slave01.example.com
  [root@slave03 ~]# echo 3 > /sparkLearning/zookeeper-3.4.5/zookeeper_data/myid
  [root@slave03 ~]# more /sparkLearning/zookeeper-3.4.5/zookeeper_data/myid
  3

That completes the configuration; now test the cluster:

  // on slave03.example.com
  [root@slave03 ~]# cd /sparkLearning/zookeeper-3.4.5/bin
  [root@slave03 bin]# ls
  README.txt zkCli.cmd zkEnv.cmd zkServer.cmd
  zkCleanup.sh zkCli.sh zkEnv.sh zkServer.sh
  // start ZooKeeper on slave03.example.com
  [root@slave03 bin]# ./zkServer.sh start
  JMX enabled by default
  Using config: /sparkLearning/zookeeper-3.4.5/bin/../conf/zoo.cfg
  Starting zookeeper ... STARTED
  [root@slave03 bin]# ./zkServer.sh status
  JMX enabled by default
  Using config: /sparkLearning/zookeeper-3.4.5/bin/../conf/zoo.cfg
  Mode: leader
  // on slave02.example.com
  [root@slave02 bin]# ./zkServer.sh start
  JMX enabled by default
  Using config: /sparkLearning/zookeeper-3.4.5/bin/../conf/zoo.cfg
  Starting zookeeper ... STARTED
  // check the cluster state; a Mode of follower or leader means the setup works
  [root@slave02 bin]# ./zkServer.sh status
  JMX enabled by default
  Using config: /sparkLearning/zookeeper-3.4.5/bin/../conf/zoo.cfg
  Mode: follower
  // on slave01.example.com
  [root@slave01 bin]# ./zkServer.sh start
  JMX enabled by default
  Using config: /sparkLearning/zookeeper-3.4.5/bin/../conf/zoo.cfg
  Starting zookeeper ... STARTED
  [root@slave01 bin]# ./zkServer.sh status
  JMX enabled by default
  Using config: /sparkLearning/zookeeper-3.4.5/bin/../conf/zoo.cfg
  Mode: follower
  // ZooKeeper state on slave03.example.com
  [root@slave03 bin]# ./zkServer.sh status
  JMX enabled by default
  Using config: /sparkLearning/zookeeper-3.4.5/bin/../conf/zoo.cfg
  Mode: leader
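To check all three nodes at once from slave01, a small loop like the following can be handy (a sketch, relying on the passwordless SSH configured earlier and the install path used above):

  # each node should report Mode: leader or Mode: follower
  for h in slave01 slave02 slave03; do
      echo "== ${h} =="
      ssh ${h}.example.com /sparkLearning/zookeeper-3.4.5/bin/zkServer.sh status
  done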

(4) Hadoop 2.4.1 cluster setup

(1) A quick look at the Hadoop 2.4.1 directory layout

  [root@slave01 bin]# cp /mnt/hgfs/share/hadoop-2.4.1.tar.gz /sparkLearning/
  [root@slave01 bin]# cd /sparkLearning/
  [root@slave01 sparkLearning]# tar -zxvf hadoop-2.4.1.tar.gz > /dev/null
  [root@slave01 sparkLearning]# cd hadoop-2.4.1
  [root@slave01 hadoop-2.4.1]# ls
  bin include libexec NOTICE.txt sbin
  etc lib LICENSE.txt README.txt share
  [root@slave01 hadoop-2.4.1]# cd etc/hadoop/
  [root@slave01 hadoop]# ls
  capacity-scheduler.xml hdfs-site.xml mapred-site.xml.template
  configuration.xsl httpfs-env.sh slaves
  container-executor.cfg httpfs-log4j.properties ssl-client.xml.example
  core-site.xml httpfs-signature.secret ssl-server.xml.example
  hadoop-env.cmd httpfs-site.xml yarn-env.cmd
  hadoop-env.sh log4j.properties yarn-env.sh
  hadoop-metrics2.properties mapred-env.cmd yarn-site.xml
  hadoop-metrics.properties mapred-env.sh
  hadoop-policy.xml mapred-queues.xml.template
(2) Add Hadoop 2.4.1 to the environment variables

Run vim /etc/profile and change the environment variables to:

  export JAVA_HOME=/sparkLearning/jdk1.8.0_40
  export SCALA_HOME=/sparkLearning/scala-2.10.4
  export HADOOP_HOME=/sparkLearning/hadoop-2.4.1
  export PATH=${JAVA_HOME}/bin:${SCALA_HOME}/bin:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:$PATH
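After reloading the profile, a quick sanity check confirms the PATH change took effect (a minimal sketch):

  source /etc/profile
  # should print Hadoop 2.4.1 plus build information
  hadoop version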
(3) Point hadoop-env.sh at the JDK

Run vim hadoop-env.sh and change the export JAVA_HOME line to:

export JAVA_HOME=/sparkLearning/jdk1.8.0_40

(screenshot)

(4) Edit core-site.xml

Open the file with vim core-site.xml and set its contents to:

  <configuration>
    <!-- use ns1 as the HDFS nameservice -->
    <property>
      <name>fs.defaultFS</name>
      <value>hdfs://ns1</value>
    </property>
    <!-- Hadoop temporary directory -->
    <property>
      <name>hadoop.tmp.dir</name>
      <value>/sparkLearning/hadoop-2.4.1/tmp</value>
    </property>
    <!-- ZooKeeper quorum address -->
    <property>
      <name>ha.zookeeper.quorum</name>
      <value>slave01.example.com:2181,slave02.example.com:2181,slave03.example.com:2181</value>
    </property>
  </configuration>
(5) Edit hdfs-site.xml

vim hdfs-site.xml with the following contents:

  <configuration>
    <!-- HDFS nameservice ns1, must match core-site.xml -->
    <property>
      <name>dfs.nameservices</name>
      <value>ns1</value>
    </property>
    <!-- ns1 has two NameNodes: nn1 and nn2 -->
    <property>
      <name>dfs.ha.namenodes.ns1</name>
      <value>nn1,nn2</value>
    </property>
    <!-- RPC address of nn1 -->
    <property>
      <name>dfs.namenode.rpc-address.ns1.nn1</name>
      <value>slave01.example.com:9000</value>
    </property>
    <!-- HTTP address of nn1 -->
    <property>
      <name>dfs.namenode.http-address.ns1.nn1</name>
      <value>slave01.example.com:50070</value>
    </property>
    <!-- RPC address of nn2 -->
    <property>
      <name>dfs.namenode.rpc-address.ns1.nn2</name>
      <value>slave02.example.com:9000</value>
    </property>
    <!-- HTTP address of nn2 -->
    <property>
      <name>dfs.namenode.http-address.ns1.nn2</name>
      <value>slave02.example.com:50070</value>
    </property>
    <!-- where the NameNode edits are stored on the JournalNodes -->
    <property>
      <name>dfs.namenode.shared.edits.dir</name>
      <value>qjournal://slave01.example.com:8485;slave02.example.com:8485;slave03.example.com:8485/ns1</value>
    </property>
    <!-- where the JournalNodes keep their data on local disk -->
    <property>
      <name>dfs.journalnode.edits.dir</name>
      <value>/sparkLearning/hadoop-2.4.1/journal</value>
    </property>
    <!-- enable automatic NameNode failover -->
    <property>
      <name>dfs.ha.automatic-failover.enabled</name>
      <value>true</value>
    </property>
    <!-- failover proxy provider -->
    <property>
      <name>dfs.client.failover.proxy.provider.ns1</name>
      <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <!-- fencing methods, one per line -->
    <property>
      <name>dfs.ha.fencing.methods</name>
      <value>
        sshfence
        shell(/bin/true)
      </value>
    </property>
    <!-- sshfence needs passwordless SSH; point it at the private key generated earlier -->
    <property>
      <name>dfs.ha.fencing.ssh.private-key-files</name>
      <value>/root/.ssh/id_rsa</value>
    </property>
    <!-- sshfence connect timeout -->
    <property>
      <name>dfs.ha.fencing.ssh.connect-timeout</name>
      <value>30000</value>
    </property>
  </configuration>
(6) Edit mapred-site.xml

  [root@slave01 hadoop]# cp mapred-site.xml.template mapred-site.xml

Then vim mapred-site.xml and set the contents to:

  <configuration>
    <!-- run MapReduce on YARN -->
    <property>
      <name>mapreduce.framework.name</name>
      <value>yarn</value>
    </property>
  </configuration>
(7) Edit yarn-site.xml

  <?xml version="1.0"?>
  <!--
    Licensed under the Apache License, Version 2.0 (the "License");
    you may not use this file except in compliance with the License.
    You may obtain a copy of the License at
      http://www.apache.org/licenses/LICENSE-2.0
    Unless required by applicable law or agreed to in writing, software
    distributed under the License is distributed on an "AS IS" BASIS,
    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and
    limitations under the License. See accompanying LICENSE file.
  -->
  <configuration>
    <!-- enable ResourceManager HA -->
    <property>
      <name>yarn.resourcemanager.ha.enabled</name>
      <value>true</value>
    </property>
    <!-- cluster id of the RM -->
    <property>
      <name>yarn.resourcemanager.cluster-id</name>
      <value>SparkCluster</value>
    </property>
    <!-- logical names of the RMs -->
    <property>
      <name>yarn.resourcemanager.ha.rm-ids</name>
      <value>rm1,rm2</value>
    </property>
    <!-- hostnames of the two RMs -->
    <property>
      <name>yarn.resourcemanager.hostname.rm1</name>
      <value>slave01.example.com</value>
    </property>
    <property>
      <name>yarn.resourcemanager.hostname.rm2</name>
      <value>slave02.example.com</value>
    </property>
    <!-- ZooKeeper quorum address -->
    <property>
      <name>yarn.resourcemanager.zk-address</name>
      <value>slave01.example.com:2181,slave02.example.com:2181,slave03.example.com:2181</value>
    </property>
    <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle</value>
    </property>
  </configuration>
(8) Edit the slaves file

  slave01.example.com
  slave02.example.com
  slave03.example.com
(9) Copy the configuration to the other servers

  // copy from slave01.example.com to slave02.example.com
  [root@slave01 hadoop]# scp -r /etc/profile slave02.example.com:/etc/profile
  profile 100% 2027 2.0KB/s 00:00
  [root@slave01 hadoop]# scp -r /sparkLearning/hadoop-2.4.1 slave02.example.com:/sparkLearning/
  // copy from slave01.example.com to slave03.example.com
  [root@slave01 hadoop]# scp -r /etc/profile slave03.example.com:/etc/profile
  profile 100% 2027 2.0KB/s 00:00
  [root@slave01 hadoop]# scp -r /sparkLearning/hadoop-2.4.1 slave03.example.com:/sparkLearning/
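The two copies can also be wrapped in a small loop (a sketch using the same paths as above):

  for h in slave02 slave03; do
      scp /etc/profile ${h}.example.com:/etc/profile
      scp -r /sparkLearning/hadoop-2.4.1 ${h}.example.com:/sparkLearning/
  done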
(10) Start the JournalNodes

  // start the JournalNodes on all nodes
  [root@slave01 hadoop]# hadoop-daemons.sh start journalnode
  slave02.example.com: starting journalnode, logging to /sparkLearning/hadoop-2.4.1/logs/hadoop-root-journalnode-slave02.example.com.out
  slave03.example.com: starting journalnode, logging to /sparkLearning/hadoop-2.4.1/logs/hadoop-root-journalnode-slave03.example.com.out
  slave01.example.com: starting journalnode, logging to /sparkLearning/hadoop-2.4.1/logs/hadoop-root-journalnode-slave01.example.com.out
  // a JournalNode process in the jps output means the start succeeded
  [root@slave01 hadoop]# jps
  11261 JournalNode
  11295 Jps
  [root@slave01 hadoop]# ssh slave02.example.com
  Last login: Fri Sep 18 05:33:05 2015 from slave01.example.com
  [root@slave02 ~]# jps
  6598 JournalNode
  6795 Jps
  [root@slave02 ~]# ssh slave03.example.com
  Last login: Fri Sep 18 05:33:26 2015 from slave02.example.com
  [root@slave03 ~]# jps
  5876 JournalNode
  6047 Jps
  [root@slave03 ~]#
(11) Format HDFS

Log in to slave02.example.com and run:

  [root@slave02 ~]# hdfs namenode -format
  // output
  15/09/18 06:05:26 INFO namenode.NameNode: STARTUP_MSG:
  /************************************************************
  STARTUP_MSG: Starting NameNode
  STARTUP_MSG: host = slave02.example.com/127.0.0.1
  STARTUP_MSG: args = [-format]
  STARTUP_MSG: version = 2.4.1
  STARTUP_MSG: classpath = /sparkLearning/hadoop-2.4.1/etc/hadoop:/sparkLearning/hadoop-........(irrelevant output omitted)
  STARTUP_MSG: build = http://svn.apache.org/repos/asf/hadoop/common -r 1604318; compiled by 'jenkins' on 2014-06-21T05:43Z
  STARTUP_MSG: java = 1.8.0_40
  .....(omitted).....
  /sparkLearning/hadoop-2.4.1/tmp/dfs/name has been successfully formatted.
  15/09/18 06:05:30 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
  15/09/18 06:05:30 INFO util.ExitUtil: Exiting with status 0
  15/09/18 06:05:30 INFO namenode.NameNode: SHUTDOWN_MSG:
  /************************************************************
  SHUTDOWN_MSG: Shutting down NameNode at slave02.example.com/127.0.0.1
  ************************************************************/
(12) Copy the formatted HDFS metadata to slave01.example.com

  [root@slave02 ~]# scp -r /sparkLearning/hadoop-2.4.1/tmp/ slave01.example.com:/sparkLearning/hadoop-2.4.1/
  fsimage_0000000000000000000.md5 100% 62 0.1KB/s 00:00
  seen_txid 100% 2 0.0KB/s 00:00
  fsimage_0000000000000000000 100% 350 0.3KB/s 00:00
  VERSION 100% 200 0.2KB/s 00:00
(13) Format ZooKeeper for HA (run on slave02.example.com only)

  [root@slave02 hadoop]# hdfs zkfc -formatZK
  Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /sparkLearning/hadoop-2.4.1/lib/native/libhadoop.so which might have disabled stack guard. The VM will try to fix the stack guard now.
  ......(irrelevant output omitted)
  // success
  15/09/18 06:14:22 INFO ha.ActiveStandbyElector: Successfully created /hadoop-ha/ns1 in ZK.
  15/09/18 06:14:22 INFO zookeeper.ZooKeeper: Session: 0x34fe096c3ca0000 closed
  15/09/18 06:14:22 INFO zookeeper.ClientCnxn: EventThread shut down
(14) Start HDFS (run on slave02.example.com)

  [root@slave02 hadoop]# start-dfs.sh
  [root@slave02 hadoop]# jps
  7714 QuorumPeerMain
  6598 JournalNode
  8295 DataNode
  8202 NameNode
  8716 Jps
  8574 DFSZKFailoverController
  [root@slave02 hadoop]# ssh slave01.example.com
  Last login: Thu Aug 27 06:24:16 2015 from slave01.example.com
  [root@slave01 ~]# jps
  13744 DataNode
  13681 NameNode
  11862 QuorumPeerMain
  14007 Jps
  13943 DFSZKFailoverController
  13851 JournalNode
  [root@slave03 ~]# jps
  5876 JournalNode
  7652 Jps
  7068 DataNode
  6764 QuorumPeerMain
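With both NameNodes running, the HA state can be checked from any node (a sketch; nn1 and nn2 are the NameNode ids defined in hdfs-site.xml above):

  # one should report "active" and the other "standby"
  hdfs haadmin -getServiceState nn1
  hdfs haadmin -getServiceState nn2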
(15) Start YARN (run on slave01.example.com)

  // slave01.example.com
  [root@slave01 ~]# start-yarn.sh
  ...(output omitted)...
  [root@slave01 ~]# jps
  14528 Jps
  13744 DataNode
  13681 NameNode
  14228 NodeManager
  11862 QuorumPeerMain
  13943 DFSZKFailoverController
  14138 ResourceManager
  13851 JournalNode

  // slave02.example.com
  [root@slave02 ~]# jps
  11216 Jps
  10656 JournalNode
  7714 QuorumPeerMain
  11010 NodeManager
  10427 DataNode
  10844 DFSZKFailoverController
  10334 NameNode

  // slave03.example.com
  [root@slave03 ~]# jps
  8610 JournalNode
  8791 NodeManager
  8503 DataNode
  9001 Jps
  6764 QuorumPeerMain
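The ResourceManager HA state can be checked in the same spirit (a sketch; rm1 and rm2 are the ids from yarn-site.xml, and rm2 only responds if a ResourceManager has actually been started on slave02):

  # expected: "active" for rm1; rm2 reports "standby" once it is running
  yarn rmadmin -getServiceState rm1
  yarn rmadmin -getServiceState rm2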
(16) View the Hadoop web UIs

Open a browser and go to http://slave01.example.com:8088/ to see the YARN cluster management UI: (screenshot)

Go to http://slave01.example.com:50070 to see the HDFS management UI: (screenshot)

At this point the Hadoop cluster is up and configured.

3. Spark 1.5.0 cluster deployment

(1) Add Spark to the environment variables

  [root@slave01 hadoop]# cp /mnt/hgfs/share/spark-1.5.0-bin-hadoop2.4.tgz /sparkLearning/
  [root@slave01 sparkLearning]# tar -zxvf spark-1.5.0-bin-hadoop2.4.tgz > /dev/null
  [root@slave01 sparkLearning]# vim /etc/profile

Change /etc/profile to:

  export JAVA_HOME=/sparkLearning/jdk1.8.0_40
  export SCALA_HOME=/sparkLearning/scala-2.10.4
  export HADOOP_HOME=/sparkLearning/hadoop-2.4.1
  export SPARK_HOME=/sparkLearning/spark-1.5.0-bin-hadoop2.4
  export PATH=${JAVA_HOME}/bin:${SCALA_HOME}/bin:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:${SPARK_HOME}/bin:${SPARK_HOME}/sbin:$PATH
(2) Configure spark-env.sh and slaves

  [root@slave01 sparkLearning]# cd spark-1.5.0-bin-hadoop2.4/conf
  [root@slave01 conf]# ls
  docker.properties.template metrics.properties.template spark-env.sh.template
  fairscheduler.xml.template slaves.template
  log4j.properties.template spark-defaults.conf.template
  // copy the template
  [root@slave01 conf]# cp spark-env.sh.template spark-env.sh
  [root@slave01 conf]# vim spark-env.sh

Add the following to spark-env.sh:

  export JAVA_HOME=/sparkLearning/jdk1.8.0_40
  export SCALA_HOME=/sparkLearning/scala-2.10.4
  export HADOOP_CONF_DIR=/sparkLearning/hadoop-2.4.1/etc/hadoop

Then set up the slaves file:

  [root@slave01 conf]# cp slaves.template slaves
  [root@slave01 conf]# vim slaves

The slaves file should contain:

  # A Spark Worker will be started on each of the machines listed below.
  slave01.example.com
  slave02.example.com
  slave03.example.com
(3) Copy the configuration to the other servers

  [root@slave01 sparkLearning]# scp /etc/profile slave02.example.com:/etc/profile
  profile 100% 2123 2.1KB/s 00:00
  [root@slave01 sparkLearning]# scp /etc/profile slave03.example.com:/etc/profile
  profile 100% 2123 2.1KB/s 00:00
  [root@slave01 sparkLearning]# vim /etc/profile
  [root@slave01 sparkLearning]# scp -r spark-1.5.0-bin-hadoop2.4 slave02.example.com:/sparkLearning/
  ...(output omitted)...
  [root@slave01 sparkLearning]# scp -r spark-1.5.0-bin-hadoop2.4 slave03.example.com:/sparkLearning/
  ...(output omitted)...
(4) Start the Spark cluster

Because Ambari Server is installed on my machine and occupies port 8080, which is also the default port of the Spark Master web UI, I changed SPARK_MASTER_WEBUI_PORT in sbin/start-master.sh to 8888:

  if [ "$SPARK_MASTER_WEBUI_PORT" = "" ]; then
    SPARK_MASTER_WEBUI_PORT=8888
  fi
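An alternative that avoids editing the script is to export the port from conf/spark-env.sh, which start-master.sh sources before the check shown above (a sketch under that assumption):

  # in /sparkLearning/spark-1.5.0-bin-hadoop2.4/conf/spark-env.sh
  export SPARK_MASTER_WEBUI_PORT=8888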
Now start the cluster with start-all.sh:

  [root@slave01 sbin]# ./start-all.sh
  starting org.apache.spark.deploy.master.Master, logging to /sparkLearning/spark-1.5.0-bin-hadoop2.4/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-slave01.example.com.out
  slave03.example.com: starting org.apache.spark.deploy.worker.Worker, logging to /sparkLearning/spark-1.5.0-bin-hadoop2.4/sbin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-slave03.example.com.out
  slave02.example.com: starting org.apache.spark.deploy.worker.Worker, logging to /sparkLearning/spark-1.5.0-bin-hadoop2.4/sbin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-slave02.example.com.out
  slave01.example.com: starting org.apache.spark.deploy.worker.Worker, logging to /sparkLearning/spark-1.5.0-bin-hadoop2.4/sbin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-slave01.example.com.out
  [root@slave01 sbin]# jps
  13744 DataNode
  13681 NameNode
  14228 NodeManager
  16949 Master
  11862 QuorumPeerMain
  13943 DFSZKFailoverController
  14138 ResourceManager
  13851 JournalNode
  17179 Jps
  17087 Worker

Open slave01.example.com:8888 in a browser: (screenshot)
On CentOS, however, errors appeared during startup. Checking the log file:

[root@slave02 logs]# more spark-root-org.apache.spark.deploy.worker.Worker-1-slave02.example.com.out

The log contains the following error:

  akka.actor.ActorNotFound: Actor not found for: ActorSelection[Anchor(akka.tcp://sparkMaster@slave01.example.com:7077/), Path(/user/Master)]
      at akka.actor.ActorSelection$$anonfun$resolveOne$1.apply(ActorSelection.scala:65)
      at akka.actor.ActorSelection$$anonfun$resolveOne$1.apply(ActorSelection.scala:63)
      at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
      at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
      at akka.dispatch.BatchingExecutor$Batch.run(BatchingExecutor.scala:73)
      at akka.dispatch.ExecutionContexts$sameThreadExecutionContext$.unbatchedExecute(Future.scala:74)
      at akka.dispatch.BatchingExecutor$class.execute(BatchingExecutor.scala:120)
      at akka.dispatch.ExecutionContexts$sameThreadExecutionContext$.execute(Future.scala:73)
      at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:40)
      at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:248)
      at akka.pattern.PromiseActorRef.$bang(AskSupport.scala:266)
      at akka.actor.EmptyLocalActorRef.specialHandle(ActorRef.scala:533)
      at akka.actor.DeadLetterActorRef.specialHandle(ActorRef.scala:569)
      .....(rest omitted).....

I never tracked down the root cause. The same configuration on an Ubuntu 10.04 server brought the cluster up without problems, with the following UI: (screenshot)

(5) Test the Spark cluster

Upload the README.md file from the spark-1.5.0-bin-hadoop2.4 directory into the user's HDFS home directory:

 hadoop dfs -put README.md

As shown below: (screenshot)

Go into the spark-1.5.0-bin-hadoop2.4/bin directory and start ./spark-shell, as shown below: (screenshot)

Count the lines of README.md that contain "Spark":

scala> val textCount = sc.textFile("README.md").filter(line => line.contains("Spark")).count()

As shown below: (screenshot)

The result: (screenshot)

At this point the Spark 1.5 cluster is up and running.
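As an additional smoke test, the bundled SparkPi example can be submitted against the standalone master (a sketch only; the exact examples jar name under lib/ may differ between distributions, so check the directory first):

  cd /sparkLearning/spark-1.5.0-bin-hadoop2.4
  # submit the SparkPi example to the standalone master started above
  ./bin/spark-submit \
      --class org.apache.spark.examples.SparkPi \
      --master spark://slave01.example.com:7077 \
      lib/spark-examples-1.5.0-hadoop2.4.0.jar 10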

