If you use swarm to build a Hadoop or Spark style cluster, one problem you cannot get around is that every container must support passwordless SSH to every other container, because Hadoop requires it. So we need to prepare, ahead of time, an SSH image that can be deployed across the cluster in one step.
Since CentOS 7 has reached end of life, the official repositories no longer work, so every freshly pulled image needs its repo files replaced. To save that trouble, we can build our own image that switches to the Tsinghua (TUNA) mirror.
Create an empty directory and write the following Dockerfile in it. Because docker packs everything in the directory into the build context, keeping the directory as empty as possible is worthwhile.
- FROM centos:centos7
- RUN sed -e 's|^mirrorlist=|#mirrorlist=|g' \
- -e 's|^#baseurl=http://mirror.centos.org/centos|baseurl=https://mirrors.tuna.tsinghua.edu.cn/centos|g' \
- -i.bak \
- /etc/yum.repos.d/CentOS-*.repo \
- && yum makecache
- CMD ["/bin/bash"]
Then, in the same directory as the Dockerfile:
- [root@pig1 docker]# docker build -t pig/centos7 .
- [+] Building 223.0s (6/6) FINISHED
- => [internal] load build definition from Dockerfile 0.1s
- => => transferring dockerfile: 314B 0.0s
- => [internal] load .dockerignore 0.1s
- => => transferring context: 2B 0.0s
- => [internal] load metadata for docker.io/library/centos:centos7 0.0s
- => [1/2] FROM docker.io/library/centos:centos7 0.0s
- => [2/2] RUN sed -e 's|^mirrorlist=|#mirrorlist=|g' -e 's|^#baseurl=http://mirror.centos.org/centos|baseurl=https://mirrors.tuna.tsinghua.edu.cn/centos|g' 219.6s
- => exporting to image 3.2s
- => => exporting layers 3.2s
- => => writing image sha256:dd9333ee62cd83a0b0db29ac247f9282ab00bd59354074aec28e0d934ffb1677 0.0s
- => => naming to docker.io/pig/centos7 0.0s
- [root@pig1 docker]# docker images
- REPOSITORY TAG IMAGE ID CREATED SIZE
- pig/centos7 latest dd9333ee62cd 12 seconds ago 632MB
- [root@pig1 docker]#

Before building the image, let's go over what passwordless SSH needs; all of it shows up in the Dockerfile and script below:
Packages to install: openssh, openssh-server, openssh-clients
Files to modify: /etc/hosts (cluster IP-to-hostname mapping), /etc/ssh/ssh_config (disable strict host key checking), plus the root password
Commands to run: /sbin/sshd-keygen to generate host keys, ssh-keygen and ssh-copy-id to prepare the user key pair and authorized_keys, and /sbin/sshd to start the daemon
Files to prepare for the build: Dockerfile, hostlist, init-ssh.sh and a pre-populated .ssh directory
In the same directory as the Dockerfile, create a hostlist file that pre-defines the cluster's hostnames and IP addresses:
- [root@pig1 docker]# python3
- Python 3.6.8 (default, Oct 26 2022, 09:13:21)
- [GCC 8.5.0 20210514 (Red Hat 8.5.0-17)] on linux
- Type "help", "copyright", "credits" or "license" for more information.
- >>> infile = open("hostlist","w")
- >>> for i in range(1,16):
- ...     infile.write("172.17.0.{:d} pignode{:d}\n".format(i+1,i))
- ...
- >>> infile.close()
Typing these one by one is tedious, so Python generates the hostlist for us. Note that 172.17.0.1 is the docker0 gateway and cannot be used, which is why the list starts at 172.17.0.2.
- [root@pig1 docker]# cat hostlist
- 172.17.0.2 pignode1
- 172.17.0.3 pignode2
- 172.17.0.4 pignode3
- 172.17.0.5 pignode4
- 172.17.0.6 pignode5
- 172.17.0.7 pignode6
- 172.17.0.8 pignode7
- 172.17.0.9 pignode8
- 172.17.0.10 pignode9
- 172.17.0.11 pignode10
- 172.17.0.12 pignode11
- 172.17.0.13 pignode12
- 172.17.0.14 pignode13
- 172.17.0.15 pignode14
- 172.17.0.16 pignode15

As mentioned earlier, operations such as editing /etc/hosts or starting the SSH service cannot be performed during docker build, so they have to be done when the container starts, via a startup script loaded by default through CMD or ENTRYPOINT in the Dockerfile.
- #!/bin/bash
-
- # 1. Append IP-to-hostname mappings to the end of /etc/hosts
-
- # 1.1 Extract the IPs that already have a hostname mapping in /etc/hosts
- ipaddrs=`cat /etc/hosts |grep -E '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\s\w+$'|sed 's/\s[[:alnum:]]\+$//g'`
-
- # 1.2 Read from the hostlist file the IP-to-hostname table that should be appended to /etc/hosts
- hostlists=`cat /root/hostlist`
-
- # 1.3 Remove from that table every mapping whose IP is already present in /etc/hosts
- for line in $ipaddrs
- do
-     hostlists=`echo "${hostlists}"|sed '/'"${line}"'/d'`
- done
-
- # 1.4 Append the remaining, non-duplicate mappings to /etc/hosts
- #     (quote the variable so each mapping stays on its own line)
- if [ -n "$hostlists" ]
- then
-     echo "Appending ${hostlists} to /etc/hosts"
-     echo "$hostlists" >> /etc/hosts
- fi
-
- # 2. Start the SSH daemon; & puts it in the background
- /sbin/sshd -D &
-
- # 3. With sshd in the background, the foreground program would finish here and docker would exit the container,
- #    so call /bin/bash again to keep a foreground process alive
- /bin/bash

Test-installing SSH
Start one container from the pig/centos7 image built above and run the following.
The main purpose is to rehearse the operations we are about to bake into the SSH image, in the same kind of environment it will run in, and to obtain the generated client key files for reuse.
- [root@pig1 docker]# docker run -it --name pig1 --hostname pignode1 --ip 172.17.0.2 pig/centos7 bash
- [root@pignode1 /]# hostname
- pignode1
- [root@pignode1 /]# cat /etc/hosts
- 127.0.0.1 localhost
- ::1 localhost ip6-localhost ip6-loopback
- fe00::0 ip6-localnet
- ff00::0 ip6-mcastprefix
- ff02::1 ip6-allnodes
- ff02::2 ip6-allrouters
- 172.17.0.2 pignode1
- [root@pignode1 /]# yum install openssh openssh-server openssh-clients -y
- Loaded plugins: fastestmirror, ovl
- Loading mirror speeds from cached hostfile
- Resolving Dependencies
- --> Running transaction check
- …………………………
- Installed:
- openssh.x86_64 0:7.4p1-22.el7_9 openssh-clients.x86_64 0:7.4p1-22.el7_9 openssh-server.x86_64 0:7.4p1-22.el7_9
- Dependency Installed:
- fipscheck.x86_64 0:1.4.1-6.el7 fipscheck-lib.x86_64 0:1.4.1-6.el7 libedit.x86_64 0:3.0-12.20121213cvs.el7
- tcp_wrappers-libs.x86_64 0:7.6-77.el7
- Complete!
- [root@pignode1 /]# cd /sbin
- [root@pignode1 sbin]# sshd-keygen
- [root@pignode1 ssh]# /sbin/sshd -D &
- [1] 78
- [root@pignode1 ssh]# ps -au
- USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
- root 1 0.1 0.0 11844 3080 pts/0 Ss 12:18 0:00 bash
- root 78 0.0 0.1 112952 7928 pts/0 S 12:22 0:00 /sbin/sshd -D
- root 79 0.0 0.0 51748 3460 pts/0 R+ 12:22 0:00 ps -au
- [root@pignode1 ssh]# ssh-keygen
- Generating public/private rsa key pair.
- Enter file in which to save the key (/root/.ssh/id_rsa):
- Created directory '/root/.ssh'.
- Enter passphrase (empty for no passphrase):
- Enter same passphrase again:
- Your identification has been saved in /root/.ssh/id_rsa.
- Your public key has been saved in /root/.ssh/id_rsa.pub.
- …………………………………………
- +----[SHA256]-----+
- [root@pignode1 ssh]# passwd
- New password:
- Retype new password:
- passwd: all authentication tokens updated successfully.
- [root@pignode1 ssh]# ssh-copy-id pignode1
- /usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
- /usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
- /usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
- root@pignode1's password:
- Number of key(s) added: 1
- Now try logging into the machine, with: "ssh 'pignode1'"
- and check to make sure that only the key(s) you wanted were added.
- [root@pignode1 ssh]# cd ~/.ssh
- [root@pignode1 .ssh]# ls
- authorized_keys id_rsa id_rsa.pub known_hosts

Preparing the key files
Using the authorized-key expansion trick mentioned in CENTO OS上的网络安全工具(二十)ClickHouse swarm容器化集群部署, grant access keys for all 15 pignode nodes, and copy everything in the .ssh folder except known_hosts out of the container for later use.
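One way to pull the prepared key material out of the test container (container name pig1, as used above) might look like this:
- # copy the key files out of the running test container
- docker cp pig1:/root/.ssh ./.ssh
- # known_hosts is host-specific, so drop it before baking the rest into the image
- rm -f ./.ssh/known_hosts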
In the directory where the image will be built, prepare the following files:
- [root@pig1 docker]# ls -a
- . .. Dockerfile hostlist init-ssh.sh .ssh
-
Writing the Dockerfile
With the above preparation done, we can start building the SSH image. The Dockerfile is shown below; the main ideas are:
Assume (in other words, require) that when a container is started, its hostname is set to one of the names in the hostlist we copy in, and its IP to the matching address;
Then set init-ssh.sh to run from ENTRYPOINT. The script appends every hostlist line whose hostname/IP differs from the current container's to /etc/hosts, building the mapping for the whole cluster, starts sshd, and finally drops back into /bin/bash. ENTRYPOINT is used rather than CMD so that the startup script is not masked by whatever command is passed when the container is started.
- FROM pig/centos7
- COPY init-ssh.sh /root/init-ssh.sh
- COPY hostlist /root/hostlist
- COPY .ssh /root/.ssh
- RUN chmod +x /root/init-ssh.sh \
-     && chmod 0400 /root/.ssh/id_rsa \
-     && echo 'default123' | passwd --stdin root \
-     && yum install openssh openssh-server openssh-clients -y \
-     && /sbin/sshd-keygen \
-     && echo -e '\nHost *\nStrictHostKeyChecking no\nUserKnownHostsFile=/dev/null' >> /etc/ssh/ssh_config
- ENTRYPOINT ["/root/init-ssh.sh"]
Build it. If you are testing inside a virtual machine, check that the docker0 IP address looks normal before building; if it does not, restart the docker service. (I fell into this hole yet again and spent nearly half an hour checking.)
- [root@pig1 docker]# docker build -t pig/ssh .
- [+] Building 20.5s (10/10) FINISHED
- => [internal] load build definition from Dockerfile 0.0s
- => => transferring dockerfile: 378B 0.0s
- => [internal] load .dockerignore 0.0s
- => => transferring context: 2B 0.0s
- => [internal] load metadata for docker.io/pig/centos7:latest 0.0s
- => [internal] load build context 0.0s
- => => transferring context: 382B 0.0s
- => [1/5] FROM docker.io/pig/centos7 0.0s
- => CACHED [2/5] COPY init-ssh.sh /root/init-ssh.sh 0.0s
- => CACHED [3/5] COPY hostlist /root/hostlist 0.0s
- => CACHED [4/5] COPY .ssh /root/.ssh 0.0s
- => [5/5] RUN chmod +x /root/init-ssh.sh && chmod 0400 /root/.ssh && echo 'default123' | passwd 18.3s
- => exporting to image 2.0s
- => => exporting layers 2.0s
- => => writing image sha256:3e4be2ca4730b61cc8aa7f4349ebf5b0afa582aeb1f3e3a3577ce15ffcd4eee5 0.0s
- => => naming to docker.io/pig/ssh 0.0s
- [root@pig1 docker]#

Start two containers to test it:
pignode1
- [root@pig1 centos]# docker run -it --name pig1 --hostname pignode1 --ip 172.17.0.2 pig/ssh bash
- Appending 172.17.0.3 pignode2
- 172.17.0.4 pignode3
- 172.17.0.5 pignode4
- 172.17.0.6 pignode5
- 172.17.0.7 pignode6
- 172.17.0.8 pignode7
- 172.17.0.9 pignode8
- 172.17.0.10 pignode9
- 172.17.0.11 pignode10
- 172.17.0.12 pignode11
- 172.17.0.13 pignode12
- 172.17.0.14 pignode13
- 172.17.0.15 pignode14
- 172.17.0.16 pignode15 to /etc/hosts
- [root@pignode1 /]# ssh pignode2
- Warning: Permanently added 'pignode2,172.17.0.3' (ECDSA) to the list of known hosts.
- [root@pignode2 ~]# exit
- logout

pignode2
- [root@pig1 docker]# docker run -it --name pig2 --hostname pignode2 --ip 172.17.0.3 pig/ssh bash
- Appending 172.17.0.2 pignode1
- 172.17.0.4 pignode3
- 172.17.0.5 pignode4
- 172.17.0.6 pignode5
- 172.17.0.7 pignode6
- 172.17.0.8 pignode7
- 172.17.0.9 pignode8
- 172.17.0.10 pignode9
- 172.17.0.11 pignode10
- 172.17.0.12 pignode11
- 172.17.0.13 pignode12
- 172.17.0.14 pignode13
- 172.17.0.15 pignode14
- 172.17.0.16 pignode15 to /etc/hosts
- [root@pignode2 /]# ssh pignode1
- Warning: Permanently added 'pignode1,172.17.0.2' (ECDSA) to the list of known hosts.
- Last login: Fri Apr 14 04:59:39 2023 from pignode1
- [root@pignode1 ~]# exit
- logout

As you can see, passwordless login works as soon as the containers start. The one blemish is that because the hostlist has to be prepared in advance, containers must be started with exactly the IPs and hostnames the hostlist dictates. That is not a big problem: if you are going to use a swarm cluster anyway, you simply write the swarm deployment file to match.
With an SSH image that is passwordless the moment it loads, the next step is to use swarm to deploy a passwordless SSH cluster from it. Note, however, that deploying SSH in a swarm cluster differs a little from deploying the SSH docker image on a single node.
First, swarm does not accept /bin/bash as the container's foreground program. If we set the image's entry point to /bin/bash as before, then within a few seconds of starting, swarm decides the container has no active foreground process and shuts it down, after which the default swarm restart policy restarts it, and we end up watching a crowd of endlessly restarting services.
For example, deploying a stack directly from the official centos:centos7 image with the following compose file:
- version: "3"
- services:
-   pigssh1:
-     image: centos:centos7
-     networks:
-       - pig
-     hostname: pignode1
-   pigssh2:
-     image: centos:centos7
-     networks:
-       - pig
-     hostname: pignode2
-   pigssh3:
-     image: centos:centos7
-     networks:
-       - pig
-     hostname: pignode3
- networks:
-   pig:

results in the endless restart loop below:
- [root@pig1 docker]# docker node ls
- ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
- 3rrx62qy2gtwcixg46xpsffas * pig1 Ready Active Leader 23.0.1
- v3p0j04u0wbxfkhtkzlj0zq0d pig2 Ready Active 23.0.1
- u8phg5zq1rlay99acmyca1vlo pig3 Ready Active 23.0.1
- [root@pig1 docker]#
- [root@pig1 docker]# docker stack deploy -c docker-compose.yml ttt
- Updating service ttt_pigssh3 (id: msks8cep346rmpzujo99j91xk)
- Updating service ttt_pigssh1 (id: ousc72qs2ygyzcbno2i300zh2)
- Updating service ttt_pigssh2 (id: mi6nd9l1bn5st0d97zfmxj62b)
- [root@pig1 docker]# docker stack ps ttt
- ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
- 6srjdbhgglae ttt_pigssh1.1 centos:centos7 pig1 Ready Ready 4 seconds ago
- b95dfzw79nwa \_ ttt_pigssh1.1 centos:centos7 pig1 Shutdown Complete 4 seconds ago
- rv5e1vko0asc \_ ttt_pigssh1.1 centos:centos7 pig1 Shutdown Complete 10 seconds ago
- jd7650kov15k ttt_pigssh2.1 centos:centos7 pig1 Ready Ready less than a second ago
- yn1t2lli0j28 \_ ttt_pigssh2.1 centos:centos7 pig1 Shutdown Complete less than a second ago
- u4bwnzi4pvgi ttt_pigssh3.1 centos:centos7 pig2 Ready Ready 2 seconds ago
- 5vwa1d98o2bo \_ ttt_pigssh3.1 centos:centos7 pig2 Shutdown Complete 3 seconds ago
- 1vxrkembyuh4 \_ ttt_pigssh3.1 centos:centos7 pig2 Shutdown Complete 10 seconds ago
- z815wmav05m1 \_ ttt_pigssh3.1 centos:centos7 pig2 Shutdown Complete 17 seconds ago

So we need to tweak the official image; the Dockerfile is simply:
- FROM centos:centos7
- ENTRYPOINT ["tail","-f","/dev/null"]
That is, replace the official image's final /bin/bash entry point: use CMD or ENTRYPOINT to run tail -f /dev/null as the foreground command. It blocks forever, which keeps a foreground process alive and stops swarm from wrongly shutting the container down.
Update the image name in the yml file:
- version: "3"
- services:
-   pigssh1:
-     image: pig/test
-     networks:
-       - pig
-     hostname: pignode1
-   pigssh2:
-     image: pig/test
-     networks:
-       - pig
-     hostname: pignode2
-   pigssh3:
-     image: pig/test
-     networks:
-       - pig
-     hostname: pignode3
- networks:
-   pig:

Redeploy with swarm:
- [root@pig1 docker]# docker stack deploy -c docker-compose.yml ttt
- Creating network ttt_pig
- Creating service ttt_pigssh2
- Creating service ttt_pigssh3
- Creating service ttt_pigssh1
- [root@pig1 docker]# docker stack ps ttt
- ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
- 7vqun3os7por ttt_pigssh1.1 pig/test:latest pig1 Running Running 2 seconds ago
- hjnb05mcabhm ttt_pigssh2.1 pig/test:latest pig3 Running Running 15 seconds ago
- y0wyocsblwrf ttt_pigssh3.1 pig/test:latest pig1 Running Running 8 seconds ago
- [root@pig1 docker]# docker ps -a
- CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
- 6d90c934bb09 pig/test:latest "tail -f /dev/null" 16 seconds ago Up 15 seconds ttt_pigssh1.1.7vqun3os7poryxdbzbr844gxe
- 3d47036fd047 pig/test:latest "tail -f /dev/null" 22 seconds ago Up 21 seconds ttt_pigssh3.1.y0wyocsblwrfw49v57l6huv82
- [root@pig1 docker]# docker exec -it 6d90c934bb09 bash
- [root@pignode1 /]#

This time the deployment completes smoothly, and we can use a container ID to get a shell in the container deployed on a given node.
PS: when deploying ClickHouse on the swarm cluster earlier, clickhouse-client never deployed successfully; that is very likely related to this. Something to retry when there is time.
Continuing from the previous subsection: before getting into this section, let's run a small experiment with the environment deployed above.
- [root@pignode1 /]# ping pignode2
- PING pignode2 (10.0.1.3) 56(84) bytes of data.
- 64 bytes from ttt_pigssh2.1.hjnb05mcabhm4vk2loeg89o3v.ttt_pig (10.0.1.3): icmp_seq=1 ttl=64 time=1.86 ms
- 64 bytes from ttt_pigssh2.1.hjnb05mcabhm4vk2loeg89o3v.ttt_pig (10.0.1.3): icmp_seq=2 ttl=64 time=1.04 ms
- 64 bytes from ttt_pigssh2.1.hjnb05mcabhm4vk2loeg89o3v.ttt_pig (10.0.1.3): icmp_seq=3 ttl=64 time=1.40 ms
- ^C
- --- pignode2 ping statistics ---
- 3 packets transmitted, 3 received, 0% packet loss, time 2003ms
- rtt min/avg/max/mdev = 1.047/1.438/1.865/0.334 ms
- [root@pignode1 /]# cat /etc/hosts
- 127.0.0.1 localhost
- ::1 localhost ip6-localhost ip6-loopback
- fe00::0 ip6-localnet
- ff00::0 ip6-mcastprefix
- ff02::1 ip6-allnodes
- ff02::2 ip6-allrouters
- 10.0.1.9 pignode1

Log into one of the running pignodes, say pignode1, and ping pignode2 directly: it works, which means pignode1 resolves pignode2's name correctly. Then cat the hosts file and notice two things: pignode2's IP mapping is not recorded in that file at all, and pignode1's own IP is not what an ordinary container deployment would give; it is in 10.0.1.*.
This is how swarm works. Swarm effectively builds its own network layer and resolves the IPs of services joined to the swarm network by itself, with the service name corresponding to the hostname, so we do not need to maintain any hostname-to-IP mapping. At the same time, for the sake of load balancing and container restarts, swarm runs its own virtual-IP network rather than using the docker0 subnet, and it does not support fixed IP assignment.
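To see swarm's embedded DNS at work from inside one of these containers (names as in the stack above), a quick check might be:
- # run inside the pignode1 container; 127.0.0.11 is docker's embedded DNS
- cat /etc/resolv.conf            # shows nameserver 127.0.0.11
- getent hosts pignode2           # the peer task's hostname, as pinged above
- getent hosts ttt_pigssh2        # the stack's service name also resolves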
Which is awkward: physical machine, VM, docker, swarm; my current machine now has subnets nested four layers deep. There are ways to give swarm services static addresses, for example creating an overlay network first and then attaching services to it, but since swarm does not really want to support this, it is better not to force the issue, or you can easily fall into an unpredictable hole…
If we cannot solve the problem, we get rid of the one who raised it: ssh. Passwordless SSH deployment needs the mapping between node IPs and hostnames; that is the only reason we had to pin IPs. But what if that mapping no longer has to come from us? Swarm already provides name resolution, so there is no point in maintaining a thankless hosts list at all.
So init-ssh.sh is changed as follows: it only starts sshd and then uses tail -f /dev/null to hold the foreground:
- #!/bin/bash
-
- # 1. Start the SSH daemon; & puts it in the background
- /sbin/sshd -D &
-
- # 2. With sshd in the background, the foreground would finish here and docker would exit the container.
- #    Under swarm, bash also seems to be treated as a background program, causing the task to exit,
- #    so block with tail -f /dev/null to keep a foreground process alive
- tail -f /dev/null
When building the image, the hostlist no longer needs to be copied in either:
- FROM pig/centos7
- COPY init-ssh.sh /root/init-ssh.sh
- COPY .ssh /root/.ssh
- RUN chmod +x /root/init-ssh.sh \
-     && chmod 0400 /root/.ssh/id_rsa \
-     && echo 'default123' | passwd --stdin root \
-     && yum install openssh openssh-server openssh-clients -y \
-     && /sbin/sshd-keygen \
-     && echo -e '\nHost *\nStrictHostKeyChecking no\nUserKnownHostsFile=/dev/null' >> /etc/ssh/ssh_config
- ENTRYPOINT ["/root/init-ssh.sh"]
Build this as the pig/sshs image and deploy it with the following compose file:
- version: "3"
- services:
-   pigssh1:
-     image: pig/sshs
-     networks:
-       - pig
-     hostname: pignode1
-   pigssh2:
-     image: pig/sshs
-     networks:
-       - pig
-     hostname: pignode2
-   pigssh3:
-     image: pig/sshs
-     networks:
-       - pig
-     hostname: pignode3
- networks:
-   pig:

Deploy it (remember to load the image onto every node first):
- [root@pig1 docker]# docker stack deploy -c docker-compose.yml ttt
- Creating network ttt_pig
- Creating service ttt_pigssh1
- Creating service ttt_pigssh2
- Creating service ttt_pigssh3
- [root@pig1 docker]# docker stack ps ttt
- ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
- wqtkk5uwb1oa ttt_pigssh1.1 pig/sshs:latest pig3 Running Running 7 seconds ago
- qbczq4fx8ulb ttt_pigssh2.1 pig/sshs:latest pig2 Running Running 3 seconds ago
- vfeakouuzsbu ttt_pigssh3.1 pig/sshs:latest pig1 Running Running less than a second ago
- [root@pig1 docker]# docker ps -a
- CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
- 002fe4668083 pig/sshs:latest "/root/init-ssh.sh" 8 seconds ago Up 7 seconds ttt_pigssh3.1.vfeakouuzsbuidqubs3yoruz1
- [root@pig1 docker]# docker exec -it 002fe4668083 bash
- [root@pignode3 /]# ssh pignode1
- Warning: Permanently added 'pignode1,10.0.2.3' (ECDSA) to the list of known hosts.
- [root@pignode1 ~]#

The experiment succeeds!
Following the approach used to build the SSH image, all we have to do on top of it is install the Java environment, download the hadoop tarball and extract it into a prepared directory (say /root/hadoop), and set environment variables such as HADOOP_HOME and JAVA_HOME. After that it can be configured just as described in CENTOS上的网络安全工具(十二)走向Hadoop(4).
Here is the Dockerfile first; the explanation follows:
- # 1. Start, again, from the official centos7 image
- FROM centos:centos7
-
- # 2. The password is passed in from outside, i.e. docker build --build-arg password='default123' -t pig/hadoop .
- ARG password
-
- # 3. Switch the image to the Tsinghua mirror (not actually needed if everything is installed offline)
- RUN sed -e 's|^mirrorlist=|#mirrorlist=|g' \
-         -e 's|^#baseurl=http://mirror.centos.org/centos|baseurl=https://mirrors.tuna.tsinghua.edu.cn/centos|g' \
-         -i.bak \
-         /etc/yum.repos.d/CentOS-*.repo \
-     && yum clean all \
-     && yum makecache
-
- # 4. Copy in the container startup script that starts sshd and initializes hadoop
- COPY init-hadoop.sh /root/init-hadoop.sh
-
- # 5. Copy in the key files for passwordless SSH (currently prepared for 15 nodes)
- COPY .ssh /root/.ssh
-
- # 6. Copy in all the packages to install (mainly the rpm packages for offline installation of openssh and the JDK)
- COPY ./rpm /root/rpm/.
-
- # 7. Extract Hadoop under /root; this produces a directory named hadoop-3.3.5
- ADD hadoop-3.3.5.tar.gz /root
-
- # 8. One-step SSH deployment settings (private key and authorized_keys permissions, root password)
- RUN chmod 0400 /root/.ssh/id_rsa \
-     && chmod 0600 /root/.ssh/authorized_keys \
-     && echo ${password} | passwd --stdin root
-
- # 9. Install openssh
- #    online: RUN yum install openssh openssh-server openssh-clients -y
- #    offline:
- RUN rpm -ivh /root/rpm/tcp_wrappers-libs-7.6-77.el7.x86_64.rpm \
-     && rpm -ivh /root/rpm/libedit-3.0-12.20121213cvs.el7.x86_64.rpm \
-     && rpm -ivh /root/rpm/fipscheck-1.4.1-6.el7.x86_64.rpm /root/rpm/fipscheck-lib-1.4.1-6.el7.x86_64.rpm \
-     && rpm -ivh /root/rpm/openssh-7.4p1-22.el7_9.x86_64.rpm \
-     && rpm -ivh /root/rpm/openssh-clients-7.4p1-22.el7_9.x86_64.rpm \
-     && rpm -ivh /root/rpm/openssh-server-7.4p1-22.el7_9.x86_64.rpm
-
- # 10.1 Generate the server host keys
- RUN /sbin/sshd-keygen \
- # 10.2 Configure passwordless SSH (disable strict host key checking so no fingerprint prompt appears)
-     && echo -e '\nHost *\nStrictHostKeyChecking no\nUserKnownHostsFile=/dev/null' >> /etc/ssh/ssh_config
-
- # 11. Install the Java environment
- #     online: RUN yum install java-11* -y
- #     offline:
- RUN rpm -ivh /root/rpm/jdk-11.0.19_linux-x64_bin.rpm
-
- # Some articles say the namenodes cannot connect to each other without this package; it is small, so install it just in case
- RUN rpm -ivh /root/rpm/psmisc-22.20-17.el7.x86_64.rpm
-
- # 12. Make the init script executable and delete the installed rpm packages to keep the image from growing too large
- RUN chmod +x /root/init-hadoop.sh \
-     && rm /root/rpm -rf
-
- #------------------------------------Install the Hadoop environment-------------------------------------#
- # 1. Set the Hadoop-related global environment variables: point HADOOP_HOME at the hadoop install/working directory
- #    and add its bin and sbin to PATH, so that hdfs, start-dfs.sh and friends can be run without cd'ing into the hadoop directory.
- # 1.1 Rename the hadoop working directory for convenience
- RUN mv /root/hadoop-3.3.5 /root/hadoop \
- # 1.2 In practice only the settings in /root/.bashrc are loaded when the container starts, but setting both does no harm
-     && echo -e "export HADOOP_HOME=/root/hadoop\nexport PATH=\$PATH:\$HADOOP_HOME/bin\nexport PATH=\$PATH:\$HADOOP_HOME/sbin" >> /etc/profile \
-     && echo -e "export HADOOP_HOME=/root/hadoop\nexport PATH=\$PATH:\$HADOOP_HOME/bin\nexport PATH=\$PATH:\$HADOOP_HOME/sbin" >> /root/.bashrc \
-     && source /root/.bashrc
-
- # 2. Set JAVA_HOME in $HADOOP_HOME/etc/hadoop/hadoop-env.sh
- RUN sed -i 's|#[[:blank:]]export[[:blank:]]JAVA_HOME=$|export JAVA_HOME=/usr|g' /root/hadoop/etc/hadoop/hadoop-env.sh
-
- # 3. Set the HDFS user roles
- RUN echo -e "export HDFS_NAMENODE_USER=root\nexport HDFS_DATANODE_USER=root\nexport HDFS_SECONDARYNAMENODE_USER=root\n">>/root/hadoop/etc/hadoop/hadoop-env.sh \
- # 4. Set the YARN user roles
-     && echo -e "export YARN_RESOURCEMANAGER_USER=root\nexport YARN_NODEMANAGER_USER=root\nexport YARN_PROXYSERVER_USER=root">>/root/hadoop/etc/hadoop/yarn-env.sh
-
- # 5. Default startup script
- CMD ["/root/init-hadoop.sh"]

Steps #1 through #10 are exactly the same as the SSH setup in the first part; in fact we could simply build on top of that earlier image. The differences are that the root password is now supplied at build time, and that openssh is installed offline.
Because something as personal as a password should not live in the Dockerfile, an ARG named password is declared, and the instruction that sets the password references it with the shell-style variable form ${password}.
When building the image, pass the value in with --build-arg:
- [root@pighost1 Dockerfile-hadoop]# docker build --build-arg password='your password' -t pig/hadoop:cluster .
- [+] Building 85.1s (12/12) FINISHED
- => [internal] load build definition from Dockerfile 0.0s
- => => transferring dockerfile: 3.21kB 0.0s
- => [internal] load .dockerignore 0.0s
- => => transferring context: 2B 0.0s
- => [internal] load metadata for docker.io/library/centos:centos7 0.0s
- => [1/7] FROM docker.io/library/centos:centos7 0.0s
- => [internal] load build context 0.0s
- => => transferring context: 6.12kB 0.0s
- => CACHED [2/7] RUN sed -e 's|^mirrorlist=|#mirrorlist=|g' -e 's|^#baseurl=http://mirror.centos.org/centos|baseurl=ht 0.0s
- => [3/7] COPY init-hadoop.sh /root/init-hadoop.sh 0.0s
- => [4/7] COPY .ssh /root/.ssh 0.0s
- => [5/7] COPY ./rpm /root/rpm/. 1.1s
- => [6/7] ADD hadoop-3.3.5.tar.gz /root 14.0s
- => [7/7] RUN chmod +x /root/init-hadoop.sh && chmod 0400 /root/.ssh/id_rsa && chmod 0600 /root/.ssh/authorized_ke 56.4s
- => exporting to image 13.3s
- => => exporting layers 13.3s
- => => writing image sha256:6bb64f678a7292b9edb7d6b8d58a9b61e8cc8718ef545f9623a84e19652cb77a 0.0s
- => => naming to docker.io/pig/hadoop:cluster 0.0s

The password is wrapped in single quotes here so that the shell does not expand special characters, as it would inside double quotes.
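For instance, with a hypothetical password containing $ characters, single quotes pass the string through untouched, whereas double quotes would let the shell expand it first:
- # single quotes: the literal string p@$$w0rd reaches the build argument
- docker build --build-arg password='p@$$w0rd' -t pig/hadoop:cluster .
- # with double quotes the shell would expand $$ (its own PID) and $w0rd (empty) before docker ever saw them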
Speaking of offline installation, it has been a long time since I had to resort to it; the office network has been cooperative lately and none of these environment experiments had gone wrong. This write-up was supposed to go out before the May Day holiday, but a business trip cost me a week, so I planned to spend a day of the holiday finishing it and rebuilt the environment at home. That rebuild produced one pit after another and swallowed the whole holiday.
Pit number one came from the network environment. Even now that I have climbed out of it, I still have not completely figured out what happened, because a disaster usually comes not from a single broken link but from a whole chain of them. An incomplete record follows:
Docker got upgraded…
Because this was a rebuilt environment, Docker on the home VM was freshly downloaded and installed over the holiday, and then, bizarrely, image pulls started failing. The problem was extremely erratic. The concrete symptom was docker reporting that no host could be found for the registry registry-1.docker.io.
- [root@pighost1 ~]# ping registry-1.docker.io
- PING registry-1.docker.io (52.1.184.176) 56(84) bytes of data.
- ^C
- --- registry-1.docker.io ping statistics ---
- 6 packets transmitted, 0 received, 100% packet loss, time 5105ms
-
- [root@pighost1 ~]# docker pull hello-world
- Using default tag: latest
- Error response from daemon: Get "https://registry-1.docker.io/v2/": dial tcp: lookup registry-1.docker.io on 192.168.21.2:53: no such host
Yet on the very Windows machine hosting the VM, an older installation of Docker Desktop kept working perfectly, silky smooth. So at first I assumed the VM's networking was at fault and spent a long time checking it.
Only after solving the problem (see below), when I took out my laptop to migrate the environment, did I find that the Desktop on the laptop was broken too; I reinstalled it, and the freshly installed Desktop showed exactly the same problem. That was when I concluded the failure must be related to the Docker upgrade.
DNS flakiness
The maddening part is that the failure was not even stable. If I kept pulling hello-world, it might suddenly succeed after dozens of attempts, then fail again a few minutes later… At first I suspected Docker had changed its user permission checks, because for a while, once I logged in, Desktop could pull with no trouble at all. Two days later, though, it could pull just as smoothly without my being logged in…
By the end I had no choice but to pull out Wireshark and capture packets. The troubleshooting was so chaotic that I kept almost no notes; fortunately a screenshot taken during a discussion survived.
The two framed sections in it are a successful pull (top) and a failed one (bottom). The failure clearly comes down to the DNS query for registry-1.docker.io failing. And as the earlier output shows, even when ping registry-1.docker.io did resolve an IP, the pull command itself could still fail, as if pull issues its own DNS query and simply refuses to work when that query fails.
This DNS problem never shows up on the office network but appears intermittently on the home broadband. So I can only assume something is wrong somewhere along the telecom line, maybe the great firewall, maybe just the DNS server assigned to my area. What I still do not know is why the VM fails while the host machine is fine.
On the host:
- C:\Users\pig> nslookup registry-1.docker.io
- Server: UnKnown
- Address: 2408:8000:1010:1::8
-
- Non-authoritative answer:
- Name: registry-1.docker.io
- Addresses: 18.215.138.58
- 34.194.164.123
- 52.1.184.176
In the VM:
- [root@pighost1 ~]# docker pull hello-world
- Using default tag: latest
- Error response from daemon: Get "https://registry-1.docker.io/v2/": dial tcp: lookup registry-1.docker.io on 192.168.21.2:53: no such host
- [root@pighost1 ~]# nslookup registry-1.docker.io
- Server: 192.168.21.2
- Address: 192.168.21.2#53
-
- Non-authoritative answer:
- Name: registry-1.docker.io
- Address: 52.1.184.176
- Name: registry-1.docker.io
- Address: 34.194.164.123
- Name: registry-1.docker.io
- Address: 18.215.138.58
-
-
- [root@pighost1 ~]# nslookup registry-1.docker.io
- Server: 192.168.21.2
- Address: 192.168.21.2#53
-
- Non-authoritative answer:
- Name: registry-1.docker.io
- Address: 52.1.184.176

What's more, on the VM, nslookup returns different results within a span of just a few minutes…
Going further (in the comparison capture, VM on the left, host on the right), the main difference was that after identifying Amazon Route 53 as the SOA for registry-1.docker.io, the host got the IPs back and carried on, while the VM did not. Quite possibly because the VM queries over IPv4 and the host over IPv6…
There is plenty of discussion of this failure online; the fix is to change the DNS servers by hand and add Google's DNS (8.8.8.8 and 8.8.4.4) plus the well-known 114.114.114.114 to the list. After changing the DNS inside the VM, pulls succeed most of the time.
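As a sketch of that change on a CentOS 7 VM (the addresses are the ones just mentioned; note that dockerd resolves registries through the host's resolver, while the daemon.json "dns" key only affects containers):
- # option 1: host resolver, used for docker pull (may be rewritten by NetworkManager/dhclient)
- cat >> /etc/resolv.conf <<'EOF'
- nameserver 8.8.8.8
- nameserver 114.114.114.114
- EOF
-
- # option 2: resolvers handed to containers, e.g. for yum running inside them
- cat > /etc/docker/daemon.json <<'EOF'
- { "dns": ["8.8.8.8", "8.8.4.4", "114.114.114.114"] }
- EOF
- systemctl restart docker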
Fighting magic with magic…
Since I still need to build the cluster, and IPv6 addresses currently give me a headache, I did not test whether IPv6 would fix it. With the DNS changed, docker pull behaved, but on the telecom broadband the Tsinghua yum mirror inside the containers then became wildly unstable, and in some cases simply unreachable. With no better idea I tried a proxy, and with the DNS change and the proxy combined, both pull and the repos worked again. In short, the failure seems to lie outside anything I control, so, still somewhat baffled, offline installation it is: stable and fast.
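For reference, the usual way to put dockerd itself behind a proxy is a systemd drop-in; the proxy address below is only a placeholder for whatever you actually use:
- mkdir -p /etc/systemd/system/docker.service.d
- cat > /etc/systemd/system/docker.service.d/http-proxy.conf <<'EOF'
- [Service]
- Environment="HTTP_PROXY=http://192.168.21.1:7890"
- Environment="HTTPS_PROXY=http://192.168.21.1:7890"
- Environment="NO_PROXY=localhost,127.0.0.1"
- EOF
- systemctl daemon-reload && systemctl restart docker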
After all, installing the Java and Hadoop packages online means downloading them on every single attempt, and the speed is painful.
The rpm packages needed for offline installation can be collected the way we have before: download them with yumdownloader, then install them with rpm -ivh, exactly as step #9 does.
As for JDK 11, it has to be downloaded from Oracle (login required); hadoop can be downloaded from the official Hadoop site.
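A sketch of collecting the openssh rpms with yumdownloader (it comes from yum-utils; --resolve also fetches whatever dependencies are missing in the environment it runs in, so running it inside a bare container ensures the full set lands in ./rpm):
- mkdir -p rpm
- docker run --rm -v "$PWD/rpm:/rpm" pig/centos7 bash -c \
-     "yum install -y yum-utils && yumdownloader --resolve --destdir /rpm openssh openssh-server openssh-clients"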
The last part of the Dockerfile sets the Hadoop-related global environment variables. There are three pieces:
Setting HADOOP_HOME and adding $HADOOP_HOME/bin and $HADOOP_HOME/sbin to PATH, done in #1:
HADOOP_HOME is simply where hadoop is installed; we extracted it to /root/hadoop-3.3.5 and then renamed that to /root/hadoop. The variable is used by a great many scripts later on, so it is worth setting.
Adding bin and sbin to PATH means that when running the hdfs, yarn and mapred commands or the start-dfs.sh and start-yarn.sh scripts later, we do not have to type out full paths. Admittedly, nothing seems to break if you skip it.
Finally, inside the container these variables have to go into ~/.bashrc, i.e. /root/.bashrc, which is loaded automatically when the container starts, not into /etc/profile as for system-wide variables on a normal Linux machine; the container does not seem to source that file, so anything set there is ignored.
Done in #2:
Just as when we set up the hadoop cluster in the earlier notes, in hadoop-env.sh under $HADOOP_HOME/etc/hadoop we uncomment the export JAVA_HOME line and set it to /usr, since java lives at /usr/bin/java.
Adding root as the HDFS and YARN user
Done in #3 and #4:
Hadoop uses the hdfs user by default; root is not one of hadoop's users. But inside the container you are essentially always root, so root has to be added as the user for hdfs and yarn. Otherwise the various services refuse to start, complaining that root is not the expected user.
The configuration is broadly the same as in CENTOS上的网络安全工具(十二)走向Hadoop(4) Hadoop 集群搭建, with just a small change to suit swarm, although that small change turned out to be a giant pit of its own.
Without further ado, the configuration files first.
core-site.xml
- <?xml version="1.0" encoding="UTF-8"?>
- <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
- <!--
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
- http://www.apache.org/licenses/LICENSE-2.0
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License. See accompanying LICENSE file.
- -->
-
- <!-- Put site-specific property overrides in this file. -->
-
- <configuration>
- <property><!-- URI of the default file system; Hadoop identifies the NameNode from this, otherwise the datanodes cannot find it -->
- <name>fs.defaultFS</name>
- <value>hdfs://pignode1:9000</value>
- </property>
- <property><!-- user identity used when accessing via the web UI; without it the web admin pages cannot be used -->
- <name>hadoop.http.staticuser.user</name>
- <value>root</value>
- </property>
- <property><!-- dfs data directory; the container maps a host directory onto it -->
- <name>hadoop.tmp.dir</name>
- <value>/hadoopdata</value>
- </property>
- </configuration>

hdfs-site.xml
- <?xml version="1.0" encoding="UTF-8"?>
- <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
- <!--
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
- http://www.apache.org/licenses/LICENSE-2.0
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License. See accompanying LICENSE file.
- -->
-
- <!-- Put site-specific property overrides in this file. -->
-
- <configuration>
- <property><!-- file system replication factor, usually defaults to 3 -->
- <name>dfs.replication</name>
- <value>3</value>
- </property>
- <property><!-- namenode web UI address/port; note the 0.0.0.0, unlike the earlier setup where this was pignode1 -->
- <name>dfs.namenode.http-address</name>
- <value>0.0.0.0:9870</value>
- </property>
- <property><!-- secondary namenode address and web port; start-dfs.sh uses this to start the secondary namenode on the right node -->
- <name>dfs.namenode.secondary.http-address</name>
- <value>pignode2:9890</value>
- </property>
- </configuration>

yarn-site.xml
- <?xml version="1.0"?>
- <!--
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
- http://www.apache.org/licenses/LICENSE-2.0
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License. See accompanying LICENSE file.
- -->
- <configuration>
-
- <!-- Site specific YARN configuration properties -->
- <property><!-- resourcemanager host; start-yarn.sh seems to read but not use it: running start-all.sh on pignode1 will not start the resourcemanager, start-yarn.sh has to be run on pignode2 -->
- <name>yarn.resourcemanager.hostname</name>
- <value>pignode2</value>
- </property>
- <property><!-- YARN web UI address/port; again needs 0.0.0.0 -->
- <name>yarn.resourcemanager.webapp.address</name>
- <value>0.0.0.0:8088</value>
- </property>
- <property><!-- YARN web proxy server address -->
- <name>yarn.web-proxy.address</name>
- <value>pignode2:8090</value>
- </property>
- <property><!-- auxiliary service on the node managers: mapreduce shuffle -->
- <name>yarn.nodemanager.aux-services</name>
- <value>mapreduce_shuffle</value>
- </property>
- <property><!-- environment variable whitelist; things misbehave without it -->
- <name>yarn.nodemanager.env-whitelist</name>
- <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
- </property>
- </configuration>

mapred-site.xml
- <?xml version="1.0"?>
- <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
- <!--
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
- http://www.apache.org/licenses/LICENSE-2.0
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License. See accompanying LICENSE file.
- -->
-
- <!-- Put site-specific property overrides in this file. -->
-
- <configuration>
- <property><!-- the cluster framework MapReduce runs on (YARN) -->
- <name>mapreduce.framework.name</name>
- <value>yarn</value>
- </property>
- <property><!-- Job History Server address -->
- <name>mapreduce.jobhistory.address</name>
- <value>pignode3:10020</value>
- </property>
- <property><!-- Job History Server web UI port -->
- <name>mapreduce.jobhistory.webapp.address</name>
- <value>0.0.0.0:19888</value>
- </property>
- <property><!-- path to the mapreduce libraries/operators; if unset they cannot be found, and nothing can be computed -->
- <name>mapreduce.application.classpath</name>
- <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
- </property>
- </configuration>

Compared with the earlier cluster deployment, the only difference is that wherever a web UI address has to be specified, we no longer fill in a hostname as before but 0.0.0.0.
The reason is pit number two:
After starting the containers with the original configuration, I could not reach any of these management pages that used to work, no matter what. jps and netstat on each node showed the processes and ports alive and well, yet the pages would not open. At first I suspected the container network, because the containers and the hosts are not on the same network: inside swarm, containers reach the host through the docker_gwbridge bridge, which is again completely different from docker0. And swarm's internal network is more complicated still: its addresses usually live in 10.0.0.0/16 and the subnet changes every time the stack is started (10.0.1.0/24 one time, 10.0.2.0/24 the next), the gateways all sit on docker_gwbridge at 172.18.0.1…, the VM hosts are on my own 192.168.21.0/24, and only outside of that is the real Windows host.
On top of that, pit number one had left me deeply distrustful of the whole network, so I kept assuming a network problem, all the more because the swarm containers could not even be pinged from the host. To rule that out, I very reluctantly added a route to the 10.0.0.0/16 range on the host:
- [root@pighost1 Dockerfile-hadoop]# route add -net 10.0.0.0/16 gw 172.18.0.1 dev docker_gwbridge
- [root@pighost1 Dockerfile-hadoop]#
With that, the swarm containers and the host can ping each other both ways. Now, opening the page from the host using pignode1's internal IP address, for example:
the long-missed management page finally appears. At least this proves hadoop itself is fine. But via pighost1:9870 it still cannot be reached…
When configuring clickhouse earlier I did nothing special and connected to the clickhouse server without a hitch; the only difference was that I used the official image that time. Surely my Dockerfile or docker-compose.yml was not the problem again. After two days of agonizing over how to EXPOSE ports and what to add to the host's iptables (with the firewall switched off, no less), it suddenly occurred to me that the clickhouse configuration had one thing the hadoop configuration did not (see CENTO OS上的网络安全工具(二十)ClickHouse swarm容器化集群部署):
the 0.0.0.0 address set there precisely so that the server can be reached remotely. Does hadoop need the corresponding setting?
It has to be said that not many people have this need. After exhausting every search I could think of, a few scattered sentences in one or two articles finally made me realize that some of the hostnames in the xml configuration files really should be 0.0.0.0…
Hence the changes in the configuration files above. Once they were made, pighost1:9870 works:
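A quick check from the host, using the hostname and published port from this deployment:
- # the NameNode web UI should now answer on the published port
- curl -sI http://pighost1:9870/ | head -n 1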
The initialization script is below; the comments tell the story. In short: start sshd, wait until every node can be reached; once they all can, check whether HDFS has been formatted and format it if not, then start dfs, yarn and the history server in turn.
- #! /bin/bash
- # NODE_COUNT is set in the swarm compose file via the environment section
- NODECOUNT=$NODE_COUNT
- TRYLOOP=50
-
- ############################################################################################################
- ## 1. source the environment variables; docker should load them anyway, but do it again to be safe
- ############################################################################################################
- source /etc/profile
- source /root/.bashrc
-
- ############################################################################################################
- ## 2. start the openssh service
- ############################################################################################################
- /sbin/sshd -D &
-
- ############################################################################################################
- ## 3. define the functions called later during initialization
- ############################################################################################################
-
- #FUNCTION: check that every node is up, to avoid the embarrassment of running format before all nodes have started--------
- #param1: hostname prefix of the nodes (the part without the trailing number)
- #param2: number of nodes
- #param3: how many rounds of pinging all the nodes to attempt before giving up
- isAllNodesConnected(){
- PIGNODE_PRENAME=$1
- PIGNODE_COUNT=$2
- TRYLOOP_COUNT=$3
- tryloop=0
- ind=1
- #init pignode hostname array,and pignode status array
- while(( $ind <= $PIGNODE_COUNT ))
- do
- pignodes[$ind]="$PIGNODE_PRENAME$ind"
- pignodes_stat[$ind]=0
- let "ind++"
- done
-
- #check wether all the pignodes can be connected
- noactivecount=$PIGNODE_COUNT
- while(( $noactivecount > 0 ))
- do
- noactivecount=$PIGNODE_COUNT
- ind=1
- while(( $ind <= $PIGNODE_COUNT ))
- do
- if (( ${pignodes_stat[$ind]}==0 ))
- then
- ping -c 1 ${pignodes[$ind]} > /dev/null
- if (($?==0))
- then
- pignodes_stat[$ind]=1
- let "noactivecount-=1"
- echo "Try to connect ${pignodes[$ind]}:successed." >>init.log
- else
- echo "Try to connect ${pignodes[$ind]}: failed." >>init.log
- fi
- else
- let "noactivecount-=1"
- fi
- let "ind++"
- done
- if (( ${noactivecount}>0 ))
- then
- let "tryloop++"
- if (($tryloop>$TRYLOOP_COUNT))
- then
- echo "ERROR Tried ${TRYLOOP_COUNT} loops. ${noactivecount} nodes failed, exit." >>init.log
- break;
- fi
- echo "${noactivecount} left for ${PIGNODE_COUNT} nodes not connected, waiting for next try">>init.log
- sleep 5
- else
- echo "All nodes are connected.">>init.log
- fi
- done
- return $noactivecount
- }
- #----------------------------------------------------------------------------------------------------------
-
- #FUNCTION: read the configured hadoop dfs directory from core-site.xml--------------------------------------------
- getDataDirectory(){
- configfiledir=`echo "${HADOOP_HOME}/etc/hadoop/core-site.xml"`
- datadir=`cat ${configfiledir} | grep -A 2 'hadoop.tmp.dir' | grep '<value>' | sed 's/^[[:blank:]]*<value>//g' | sed 's/<\/value>$//g'`
- echo $datadir
- }
-
- ############################################################################################################
- ## 4. check whether this is the master node (<prefix>1); if so, run the initialization ##
- ############################################################################################################
- nodehostname=`hostname`
- nodehostnameprefix=`echo $nodehostname|sed -e 's|[[:digit:]]\+$||g'`
- nodeindex=`hostname | sed "s/${nodehostnameprefix}//g"`
- #change to the Hadoop install directory
- cd $HADOOP_HOME
- #check the node ID: the master node runs the initialization, the others just wait
- if (($nodeindex!=1));then
- echo $nodehostname waiting for init...>>init.log
- else
- # work out the yarn node id (node 2 by default) and the mapreduce node id (node 3 by default)
- if (($NODECOUNT>=2));then
- yarnnodeid=2
- else
- yarnnodeid=1
- fi
-
- if (($NODECOUNT>=3));then
- maprednodeid=3
- else
- maprednodeid=1
- fi
-
- # check that every node can be pinged
- echo $nodehostname is one of the init manager nodes...>>init.log
- #waiting for all the nodes connected
- isAllNodesConnected $nodehostnameprefix $NODECOUNT $TRYLOOP
- if (($?==0));then
- #all the nodes is connected,from then to init hadoop
- datadirectory=`echo $(getDataDirectory)`
- #if the hadoop data directory is non-empty it has already been formatted and dfs can start directly; otherwise format it first
- if [ -n "$datadirectory" ];then
- #check whether hadoop was formatted.
- datadircontent=`ls -A ${datadirectory}`
- if [ -z "$datadircontent" ];then
- echo "format dfs">>init.log
- bin/hdfs namenode -format >>init.log
- else
- echo "dfs is already formatted.">>init.log
- fi
- else
- echo "ERROR:Can not get hadoop tmp data directory.init can not be done. ">>init.log
- fi
- #start-all.sh is deprecated, so use start-dfs.sh and start-yarn.sh separately
- echo "Init dfs --------------------------------------------------------------------" >> init.log
- sbin/start-dfs.sh
- echo "Init yarn -------------------------------------------------------------------" >> init.log
- ssh root@${nodehostnameprefix}${yarnnodeid} "bash ${HADOOP_HOME}/sbin/start-yarn.sh" >> init.log
- # the history server has to be started separately
- echo "Init JobHistory server-------------------------------------------------------" >> init.log
- ssh root@${nodehostnameprefix}${maprednodeid} "bash ${HADOOP_HOME}/bin/mapred --daemon start historyserver">>init.log
- else
- echo "ERROR:Not all the nodes is connected. init can not be done. exit...">>init.log
- fi
- fi
-
- #hold the foreground so swarm does not restart the task
- tail -f /dev/null

Oh, and to make it easy to map hadoop's dfs directories later, and to empty them after each test so that nothing ends up formatted twice, we also need a small script that clears and recreates the directories on the host; here it is as well:
- #! /bin/bash
-
- index=1
- rm /hadoopdata/* -rf
- while(($index<=12));do
- file="/hadoopdata/${index}"
- mkdir $file
- let "index++"
- done
Next comes the biggest pit of all. The docker-compose.yml first:
- version: "3.7"
- services:
-   # pignode1 is the Hadoop Namenode, port 9000
-   # pignode1 also serves the Namenode http UI, port 9870
-   pignode1:
-     image: pig/hadoop:cluster
-     deploy:
-       endpoint_mode: dnsrr
-       restart_policy:
-         condition: on-failure
-       placement:
-         constraints:
-           - node.hostname==pighost1
-     hostname: pignode1
-     environment:
-       - NODE_COUNT=12
-     networks:
-       - pig
-     ports:
-       - target: 22
-         published: 9011
-         protocol: tcp
-         mode: host
-       - target: 9000
-         published: 9000
-         protocol: tcp
-         mode: host
-       - target: 9870
-         published: 9870
-         protocol: tcp
-         mode: host
-     volumes:
-       # map the xml configuration files
-       - ./config/core-site.xml:/root/hadoop/etc/hadoop/core-site.xml:ro
-       - ./config/hdfs-site.xml:/root/hadoop/etc/hadoop/hdfs-site.xml:ro
-       - ./config/yarn-site.xml:/root/hadoop/etc/hadoop/yarn-site.xml:ro
-       - ./config/mapred-site.xml:/root/hadoop/etc/hadoop/mapred-site.xml:ro
-       # map the workers file
-       - ./config/workers:/root/hadoop/etc/hadoop/workers:ro
-       # map the data directory
-       - /hadoopdata/1:/hadoopdata:rw
-   pignode2:
-     image: pig/hadoop:cluster
-     deploy:
-       endpoint_mode: dnsrr
-       restart_policy:
-         condition: on-failure
-       placement:
-         # pin the Secondary Namenode to the second host
-         constraints:
-           - node.hostname==pighost2
-     hostname: pignode2
-     environment:
-       - NODE_COUNT=12
-     networks:
-       - pig
-     ports:
-       - target: 22
-         published: 9012
-         protocol: tcp
-         mode: host
-       # secondary namenode port
-       - target: 9890
-         published: 9890
-         protocol: tcp
-         mode: host
-       - target: 8088
-         published: 8088
-         protocol: tcp
-         mode: host
-     volumes:
-       # map the xml configuration files
-       - ./config/core-site.xml:/root/hadoop/etc/hadoop/core-site.xml:ro
-       - ./config/hdfs-site.xml:/root/hadoop/etc/hadoop/hdfs-site.xml:ro
-       - ./config/yarn-site.xml:/root/hadoop/etc/hadoop/yarn-site.xml:ro
-       - ./config/mapred-site.xml:/root/hadoop/etc/hadoop/mapred-site.xml:ro
-       # map the workers file
-       - ./config/workers:/root/hadoop/etc/hadoop/workers:ro
-       # map the data directory
-       - /hadoopdata/2:/hadoopdata:rw
-   pignode3:
-     image: pig/hadoop:cluster
-     deploy:
-       endpoint_mode: dnsrr
-       restart_policy:
-         condition: on-failure
-       placement:
-         # pin the Mapreduce job history server to the third host
-         constraints:
-           - node.hostname==pighost3
-     hostname: pignode3
-     environment:
-       - NODE_COUNT=12
-     networks:
-       - pig
-     ports:
-       - target: 22
-         published: 9013
-         protocol: tcp
-         mode: host
-       - target: 10020
-         published: 10020
-         protocol: tcp
-         mode: host
-       - target: 19888
-         published: 19888
-         protocol: tcp
-         mode: host
-     volumes:
-       # map the xml configuration files
-       - ./config/core-site.xml:/root/hadoop/etc/hadoop/core-site.xml:ro
-       - ./config/hdfs-site.xml:/root/hadoop/etc/hadoop/hdfs-site.xml:ro
-       - ./config/yarn-site.xml:/root/hadoop/etc/hadoop/yarn-site.xml:ro
-       - ./config/mapred-site.xml:/root/hadoop/etc/hadoop/mapred-site.xml:ro
-       # map the workers file
-       - ./config/workers:/root/hadoop/etc/hadoop/workers:ro
-       # map the data directory
-       - /hadoopdata/3:/hadoopdata:rw
-   #------------------------------------------------------------------------------------------------
-   # the services below are all worker nodes and can be placed on any host other than the leader
-   pignode4:
-     image: pig/hadoop:cluster
-     deploy:
-       endpoint_mode: dnsrr
-       restart_policy:
-         condition: on-failure
-       placement:
-         constraints:
-           # - node.role==manager
-           # - node.role==worker
-           - node.hostname==pighost3
-     hostname: pignode4
-     environment:
-       - NODE_COUNT=12
-     networks:
-       - pig
-     ports:
-       - target: 22
-         published: 9014
-         protocol: tcp
-         mode: host
-     volumes:
-       # map the xml configuration files
-       - ./config/core-site.xml:/root/hadoop/etc/hadoop/core-site.xml:ro
-       - ./config/hdfs-site.xml:/root/hadoop/etc/hadoop/hdfs-site.xml:ro
-       - ./config/yarn-site.xml:/root/hadoop/etc/hadoop/yarn-site.xml:ro
-       - ./config/mapred-site.xml:/root/hadoop/etc/hadoop/mapred-site.xml:ro
-       # map the workers file
-       - ./config/workers:/root/hadoop/etc/hadoop/workers:ro
-       # map the data directory
-       - /hadoopdata/4:/hadoopdata:rw
-   pignode5:
-     image: pig/hadoop:cluster
-     deploy:
-       endpoint_mode: dnsrr
-       restart_policy:
-         condition: on-failure
-       placement:
-         constraints:
-           # - node.role==manager
-           - node.hostname==pighost3
-     hostname: pignode5
-     environment:
-       - NODE_COUNT=12
-     networks:
-       - pig
-     ports:
-       - target: 22
-         published: 9015
-         protocol: tcp
-         mode: host
-     volumes:
-       # map the xml configuration files
-       - ./config/core-site.xml:/root/hadoop/etc/hadoop/core-site.xml:ro
-       - ./config/hdfs-site.xml:/root/hadoop/etc/hadoop/hdfs-site.xml:ro
-       - ./config/yarn-site.xml:/root/hadoop/etc/hadoop/yarn-site.xml:ro
-       - ./config/mapred-site.xml:/root/hadoop/etc/hadoop/mapred-site.xml:ro
-       # map the workers file
-       - ./config/workers:/root/hadoop/etc/hadoop/workers:ro
-       # map the data directory
-       - /hadoopdata/5:/hadoopdata:rw
-   pignode6:
-     image: pig/hadoop:cluster
-     deploy:
-       endpoint_mode: dnsrr
-       restart_policy:
-         condition: on-failure
-       placement:
-         constraints:
-           # - node.role==manager
-           - node.hostname==pighost3
-     hostname: pignode6
-     environment:
-       - NODE_COUNT=12
-     networks:
-       - pig
-     ports:
-       - target: 22
-         published: 9016
-         protocol: tcp
-         mode: host
-     volumes:
-       # map the xml configuration files
-       - ./config/core-site.xml:/root/hadoop/etc/hadoop/core-site.xml:ro
-       - ./config/hdfs-site.xml:/root/hadoop/etc/hadoop/hdfs-site.xml:ro
-       - ./config/yarn-site.xml:/root/hadoop/etc/hadoop/yarn-site.xml:ro
-       - ./config/mapred-site.xml:/root/hadoop/etc/hadoop/mapred-site.xml:ro
-       # map the workers file
-       - ./config/workers:/root/hadoop/etc/hadoop/workers:ro
-       # map the data directory
-       - /hadoopdata/6:/hadoopdata:rw
-   pignode7:
-     image: pig/hadoop:cluster
-     deploy:
-       endpoint_mode: dnsrr
-       restart_policy:
-         condition: on-failure
-       placement:
-         constraints:
-           # - node.role==manager
-           - node.hostname==pighost4
-     hostname: pignode7
-     environment:
-       - NODE_COUNT=12
-     networks:
-       - pig
-     ports:
-       - target: 22
-         published: 9017
-         protocol: tcp
-         mode: host
-     volumes:
-       # map the xml configuration files
-       - ./config/core-site.xml:/root/hadoop/etc/hadoop/core-site.xml:ro
-       - ./config/hdfs-site.xml:/root/hadoop/etc/hadoop/hdfs-site.xml:ro
-       - ./config/yarn-site.xml:/root/hadoop/etc/hadoop/yarn-site.xml:ro
-       - ./config/mapred-site.xml:/root/hadoop/etc/hadoop/mapred-site.xml:ro
-       # map the workers file
-       - ./config/workers:/root/hadoop/etc/hadoop/workers:ro
-       # map the data directory
-       - /hadoopdata/7:/hadoopdata:rw
-   pignode8:
-     image: pig/hadoop:cluster
-     deploy:
-       endpoint_mode: dnsrr
-       restart_policy:
-         condition: on-failure
-       placement:
-         constraints:
-           # - node.role==manager
-           - node.hostname==pighost4
-     hostname: pignode8
-     environment:
-       - NODE_COUNT=12
-     networks:
-       - pig
-     ports:
-       - target: 22
-         published: 9018
-         protocol: tcp
-         mode: host
-     volumes:
-       # map the xml configuration files
-       - ./config/core-site.xml:/root/hadoop/etc/hadoop/core-site.xml:ro
-       - ./config/hdfs-site.xml:/root/hadoop/etc/hadoop/hdfs-site.xml:ro
-       - ./config/yarn-site.xml:/root/hadoop/etc/hadoop/yarn-site.xml:ro
-       - ./config/mapred-site.xml:/root/hadoop/etc/hadoop/mapred-site.xml:ro
-       # map the workers file
-       - ./config/workers:/root/hadoop/etc/hadoop/workers:ro
-       # map the data directory
-       - /hadoopdata/8:/hadoopdata:rw
-   pignode9:
-     image: pig/hadoop:cluster
-     deploy:
-       endpoint_mode: dnsrr
-       restart_policy:
-         condition: on-failure
-       placement:
-         constraints:
-           # - node.role==manager
-           - node.hostname==pighost4
-     hostname: pignode9
-     environment:
-       - NODE_COUNT=12
-     networks:
-       - pig
-     ports:
-       - target: 22
-         published: 9019
-         protocol: tcp
-         mode: host
-     volumes:
-       # map the xml configuration files
-       - ./config/core-site.xml:/root/hadoop/etc/hadoop/core-site.xml:ro
-       - ./config/hdfs-site.xml:/root/hadoop/etc/hadoop/hdfs-site.xml:ro
-       - ./config/yarn-site.xml:/root/hadoop/etc/hadoop/yarn-site.xml:ro
-       - ./config/mapred-site.xml:/root/hadoop/etc/hadoop/mapred-site.xml:ro
-       # map the workers file
-       - ./config/workers:/root/hadoop/etc/hadoop/workers:ro
-       # map the data directory
-       - /hadoopdata/9:/hadoopdata:rw
-   pignode10:
-     image: pig/hadoop:cluster
-     deploy:
-       endpoint_mode: dnsrr
-       restart_policy:
-         condition: on-failure
-       placement:
-         constraints:
-           # - node.role==manager
-           - node.hostname==pighost5
-     hostname: pignode10
-     environment:
-       - NODE_COUNT=12
-     networks:
-       - pig
-     ports:
-       - target: 22
-         published: 9020
-         protocol: tcp
-         mode: host
-     volumes:
-       # map the xml configuration files
-       - ./config/core-site.xml:/root/hadoop/etc/hadoop/core-site.xml:ro
-       - ./config/hdfs-site.xml:/root/hadoop/etc/hadoop/hdfs-site.xml:ro
-       - ./config/yarn-site.xml:/root/hadoop/etc/hadoop/yarn-site.xml:ro
-       - ./config/mapred-site.xml:/root/hadoop/etc/hadoop/mapred-site.xml:ro
-       # map the workers file
-       - ./config/workers:/root/hadoop/etc/hadoop/workers:ro
-       # map the data directory
-       - /hadoopdata/10:/hadoopdata:rw
-   pignode11:
-     image: pig/hadoop:cluster
-     deploy:
-       endpoint_mode: dnsrr
-       restart_policy:
-         condition: on-failure
-       placement:
-         constraints:
-           # - node.role==manager
-           - node.hostname==pighost5
-     hostname: pignode11
-     environment:
-       - NODE_COUNT=12
-     networks:
-       - pig
-     ports:
-       - target: 22
-         published: 9021
-         protocol: tcp
-         mode: host
-     volumes:
-       # map the xml configuration files
-       - ./config/core-site.xml:/root/hadoop/etc/hadoop/core-site.xml:ro
-       - ./config/hdfs-site.xml:/root/hadoop/etc/hadoop/hdfs-site.xml:ro
-       - ./config/yarn-site.xml:/root/hadoop/etc/hadoop/yarn-site.xml:ro
-       - ./config/mapred-site.xml:/root/hadoop/etc/hadoop/mapred-site.xml:ro
-       # map the workers file
-       - ./config/workers:/root/hadoop/etc/hadoop/workers:ro
-       # map the data directory
-       - /hadoopdata/11:/hadoopdata:rw
-   pignode12:
-     image: pig/hadoop:cluster
-     deploy:
-       endpoint_mode: dnsrr
-       restart_policy:
-         condition: on-failure
-       placement:
-         constraints:
-           # - node.role==manager
-           - node.hostname==pighost5
-     hostname: pignode12
-     environment:
-       - NODE_COUNT=12
-     networks:
-       - pig
-     ports:
-       - target: 22
-         published: 9022
-         protocol: tcp
-         mode: host
-     volumes:
-       # map the xml configuration files
-       - ./config/core-site.xml:/root/hadoop/etc/hadoop/core-site.xml:ro
-       - ./config/hdfs-site.xml:/root/hadoop/etc/hadoop/hdfs-site.xml:ro
-       - ./config/yarn-site.xml:/root/hadoop/etc/hadoop/yarn-site.xml:ro
-       - ./config/mapred-site.xml:/root/hadoop/etc/hadoop/mapred-site.xml:ro
-       # map the workers file
-       - ./config/workers:/root/hadoop/etc/hadoop/workers:ro
-       # map the data directory
-       - /hadoopdata/12:/hadoopdata:rw
- networks:
-   pig:
At first glance nothing special. But when things went wrong and I searched online, I found other people who had fallen into the same pit; not many who had climbed out…
Done the old way, with simple port mappings like 9000:9000, swarm builds its name service in vip mode by default. In vip mode, swarm maps a virtual IP to each service/hostname and runs its own load balancer to manage the name-to-IP mapping.
I had read that explanation before when learning how to configure stacks, and did not think it deserved special attention, until this time, when only a random half of the datanodes would start…
An afternoon of troubleshooting later, the cause turned out to be that when some datanodes started, swarm's name service handed them the wrong namenode IP… The original crime scene is gone (I was completely baffled at the time), but a screenshot from reconstructing it afterwards survives:
Under continuous pinging, two IPs alternate back and forth, which immediately suggests a load-balancing issue.
That finally put the problem back on track: if vip addresses get load-balanced, then the dnsrr mode I had never quite understood must be the plain DNS round-robin mode where you "load-balance yourself". Sure enough, after switching endpoint_mode to dnsrr (which of course also forces the port mappings to be written differently), IP resolution finally became stable, and all the datanodes started in one satisfying go.
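Boiled down, the relevant fragment of the compose file above is just this (dnsrr cannot be combined with ingress-mode published ports, which is why the ports use the long syntax with mode: host):
- services:
-   pignode1:
-     image: pig/hadoop:cluster
-     deploy:
-       endpoint_mode: dnsrr        # plain DNS round-robin instead of the default vip
-     ports:
-       - target: 22                # with dnsrr, published ports must use mode: host
-         published: 9011
-         protocol: tcp
-         mode: host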
There are two ways to get high availability: one uses the Quorum Journal Manager (QJM) to synchronize the edit log between the active and standby namenodes, the other uses traditional NFS shared storage for that synchronization. Given the odd logic of building HDFS's availability on top of yet another NFS, QJM was the obvious choice here; let's hope it was the right one.
There are plenty of HA configuration tutorials online; some are written to be extremely elaborate, others claim it is all done in one stroke. I tried both kinds, and it is neither as complex nor as simple as advertised. Not that complex, because although there are many configuration items, the official docs are fairly clear about which to set and how, so you just work down the list. Not that simple, mainly because of the startup procedure: it differs with the starting conditions, and although the official docs explain it reasonably well, some steps are easy to miss, so reading the logs diligently is a good habit for climbing out of holes.
According to the official docs, a QJM HA cluster needs two kinds of nodes. First, Namenode servers: at least two (we use three), all with identical hardware. Second, Journalnode servers: these run a lightweight process, so the docs recommend co-locating them with the Namenode, JobTracker or ResourceManager machines; there should be an odd number of them, at least three, which tolerates (N-1)/2 failures. So we simply run three management nodes, each hosting a namenode, a journalnode, a resourcemanager and so on.
The default filesystem changes from the original pignode1:9000 to the nameservice name, which is defined below in hdfs-site.xml.
- <property>
- <name>fs.defaultFS</name>
- <value>hdfs://mycluster</value>
- </property>
Because a nameservice (a group of namenodes) is now used, the hdfs-side changes are mainly about defining that nameservice and the namenodes inside it. The relevant settings are:
The logical name of the nameservice (any name you like):
- <property>
- <name>dfs.nameservices</name>
- <value>mycluster</value>
- </property>
Define the identifier of each namenode within the nameservice; for example the mycluster nameservice contains the three namenodes nn1, nn2 and nn3. Note that nn1, nn2 and nn3 are names given to the namenodes, not necessarily the hostnames of the machines they run on; to avoid confusion it is better if they differ.
- <property>
- <name>dfs.ha.namenodes.mycluster</name>
- <value>nn1,nn2, nn3</value>
- </property>
Define the RPC address and port each namenode listens on; since three namenodes were defined, there must be three properties:
- <property>
- <name>dfs.namenode.rpc-address.mycluster.nn1</name>
- <value>machine1.example.com:8020</value>
- </property>
- <property>
- <name>dfs.namenode.rpc-address.mycluster.nn2</name>
- <value>machine2.example.com:8020</value>
- </property>
- <property>
- <name>dfs.namenode.rpc-address.mycluster.nn3</name>
- <value>machine3.example.com:8020</value>
- </property>
Define the address and port of each namenode's web service, the management pages we keep using at the end. They can also be served over https; to keep things simple, plain http it is.
- <property>
- <name>dfs.namenode.http-address.mycluster.nn1</name>
- <value>machine1.example.com:9870</value>
- </property>
- <property>
- <name>dfs.namenode.http-address.mycluster.nn2</name>
- <value>machine2.example.com:9870</value>
- </property>
- <property>
- <name>dfs.namenode.http-address.mycluster.nn3</name>
- <value>machine3.example.com:9870</value>
- </property>
That basically wraps up the namenode-related settings. As mentioned above, the Journalnode servers also need to be prepared, and their settings configured:
This property lists all the journalnode servers and their ports; the namenodes use these addresses to synchronize the edit log:
- <property>
- <name>dfs.namenode.shared.edits.dir</name>
- <value>qjournal://node1.example.com:8485;node2.example.com:8485;node3.example.com:8485/mycluster</value>
- </property>
Next come the failover-related settings:
Define the Java class HDFS clients use to determine the active namenode, i.e. which namenode to talk to. Only the nameservice ID in the property name needs changing, to the name we picked:
- <property>
- <name>dfs.client.failover.proxy.provider.mycluster</name>
- <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
- </property>
A list of Java classes or scripts used during failover to fence the formerly active namenode. sshfence, for instance, connects to it over ssh and kills the process, so the node being fenced must hold the initiating node's public key in its authorized_keys, and the location of the initiating node's private key is defined as follows:
- <property>
- <name>dfs.ha.fencing.methods</name>
- <value>sshfence</value>
- </property>
-
- <property>
- <name>dfs.ha.fencing.ssh.private-key-files</name>
- <value>/home/exampleuser/.ssh/id_rsa</value>
- </property>
The local directory where the journalnodes store the edit log; it should be an absolute path, and in a container it is best mapped out to the host:
- <property>
- <name>dfs.journalnode.edits.dir</name>
- <value>/path/to/journal/node/local/data</value>
- </property>
Hadoop recommends deploying at least three zookeeper nodes, again preferably an odd number, and verifying that they work before starting hadoop.
You can check this by running zkServer.sh status on each zookeeper node, and go one step further by connecting with zkCli.sh and running ls / to browse the tree and confirm everything is healthy.
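A quick health check over three zookeeper nodes might look like this (zknode1..zknode3 are hypothetical hostnames; substitute your own):
- for zk in zknode1 zknode2 zknode3; do
-     echo "== $zk =="
-     ssh root@$zk "zkServer.sh status"     # expect one leader and two followers
- done
- # then browse the tree from any one of them:
- # zkCli.sh -server zknode1:2181 ls /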
There are two zookeeper-related settings in total: one covers namenode synchronization and failover and lives in core-site.xml; the other covers resourcemanager synchronization and failover and lives in yarn-site.xml, which comes up later, so no more about it here.
Automatic failover mainly involves two configuration items, in two different files:
The zookeeper nodes and ports used for automatic failover (core-site.xml):
- <property>
- <name>ha.zookeeper.quorum</name>
- <value>zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181</value>
- </property>
Enable automatic failover for the cluster (hdfs-site.xml):
- <property>
- <name>dfs.ha.automatic-failover.enabled</name>
- <value>true</value>
- </property>
As said before, it is neither simple nor complex. Two things matter: first, do not mistype any of the configuration above (genuinely hard to check, and a real trap); second, do not get the startup steps below out of order. The strangeness and apparent complexity mostly come from the synchronization between servers during initialization; once formatting and initialization are done, everything can still be brought up with a single start-dfs.sh.
On all three namenode machines, run: hdfs --daemon start journalnode
- [root@pignode1 ~]# hdfs --daemon start journalnode
- WARNING: /root/hadoop/logs does not exist. Creating.
- [root@pignode1 ~]# jps
- 75 JournalNode
- 123 Jps
- [root@pignode1 ~]#
-
-
-
- [root@pignode2 ~]# hdfs --daemon start journalnode
- WARNING: /root/hadoop/logs does not exist. Creating.
- [root@pignode2 ~]# jps
- 75 JournalNode
- 123 Jps
-
-
-
- [root@pignode3 ~]# hdfs --daemon start journalnode
- WARNING: /root/hadoop/logs does not exist. Creating.
- [root@pignode3 ~]# jps
- 75 JournalNode
- 123 Jps

Only a completely fresh HA install is covered here; for converting an existing cluster to HA or for data migration, see the official page: Apache Hadoop 3.3.5 – HDFS High Availability Using the Quorum Journal Manager.
On one of the namenode machines, say pignode1, format HDFS: hdfs namenode -format
- [root@pignode1 ~]# hdfs namenode -format
- 2023-05-11 09:39:06,842 INFO namenode.NameNode: STARTUP_MSG:
- /************************************************************
- STARTUP_MSG: Starting NameNode
- ……………
- ……………
- 2023-05-11 09:39:09,145 INFO namenode.FSImage: Allocated new BlockPoolId: BP-1327835470-10.0.30.18-1683797949145
- 2023-05-11 09:39:09,159 INFO common.Storage: Storage directory /hadoopdata/hdfs_name has been successfully formatted.
- 2023-05-11 09:39:09,320 INFO namenode.FSImageFormatProtobuf: Saving image file /hadoopdata/hdfs_name/current/fsimage.ckpt_0000000000000000000 using no compression
- 2023-05-11 09:39:09,403 INFO namenode.FSImageFormatProtobuf: Image file /hadoopdata/hdfs_name/current/fsimage.ckpt_0000000000000000000 of size 396 bytes saved in 0 seconds .
- 2023-05-11 09:39:09,409 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
- 2023-05-11 09:39:09,456 INFO namenode.FSNamesystem: Stopping services started for active state
- 2023-05-11 09:39:09,457 INFO namenode.FSNamesystem: Stopping services started for standby state
- 2023-05-11 09:39:09,466 INFO namenode.FSImage: FSImageSaver clean checkpoint: txid=0 when meet shutdown.
- 2023-05-11 09:39:09,467 INFO namenode.NameNode: SHUTDOWN_MSG:
- /************************************************************
- SHUTDOWN_MSG: Shutting down NameNode at pignode1/10.0.30.18
- ************************************************************/
- [root@pignode1 ~]#

Start that namenode; otherwise the metadata sync on the other machines will fail later because they cannot reach it:
- [root@pignode1 ~]# hdfs --daemon start namenode
- [root@pignode1 ~]# jps
- 259 NameNode
- 341 Jps
- 75 JournalNode
On the other namenode machines, run hdfs namenode -bootstrapStandby so that the formatted node's metadata is synchronized, via the journalnodes, onto the namenodes that were not formatted. This is exactly why the journalnodes had to be started first.
- [root@pignode2 ~]# hdfs namenode -bootstrapStandby
- 2023-05-11 09:43:54,097 INFO namenode.NameNode: STARTUP_MSG:
- /************************************************************
- STARTUP_MSG: Starting NameNode
- STARTUP_MSG: host = pignode2/10.0.30.21
- STARTUP_MSG: args = [-bootstrapStandby]
- STARTUP_MSG: version = 3.3.5
- …………
- …………
- 2023-05-11 09:58:32,730 INFO common.Util: Assuming 'file' scheme for path /hadoopdata/hdfs_name in configuration.
- 2023-05-11 09:58:32,730 INFO common.Util: Assuming 'file' scheme for path /hadoopdata/hdfs_name in configuration.
- =====================================================
- About to bootstrap Standby ID pignamenode2 from:
- Nameservice ID: pignamenodecluster
- Other Namenode ID: pignamenode1
- Other NN's HTTP address: http://pignode1:9870
- Other NN's IPC address: pignode1/10.0.31.21:8020
- Namespace ID: 1898329509
- Block pool ID: BP-1342056252-10.0.31.21-1683799022316
- Cluster ID: CID-ddaf258a-47c4-4dde-b681-2c9c70872ef1
- Layout version: -66
- isUpgradeFinalized: true
- =====================================================
- 2023-05-11 09:58:33,140 INFO common.Storage: Storage directory /hadoopdata/hdfs_name has been successfully formatted.
- 2023-05-11 09:58:33,171 INFO common.Util: Assuming 'file' scheme for path /hadoopdata/hdfs_name in configuration.
- 2023-05-11 09:58:33,172 INFO common.Util: Assuming 'file' scheme for path /hadoopdata/hdfs_name in configuration.
- 2023-05-11 09:58:33,200 INFO namenode.FSEditLog: Edit logging is async:true
- 2023-05-11 09:58:33,300 INFO namenode.TransferFsImage: Opening connection to http://pignode1:9870/imagetransfer?getimage=1&txid=0&storageInfo=-66:1898329509:1683799022316:CID-ddaf258a-47c4-4dde-b681-2c9c70872ef1&bootstrapstandby=true
- 2023-05-11 09:58:33,436 INFO common.Util: Combined time for file download and fsync to all disks took 0.00s. The file download took 0.00s at 0.00 KB/s. Synchronous (fsync) write to disk of /hadoopdata/hdfs_name/current/fsimage.ckpt_0000000000000000000 took 0.00s.
- 2023-05-11 09:58:33,437 INFO namenode.TransferFsImage: Downloaded file fsimage.ckpt_0000000000000000000 size 399 bytes.
- 2023-05-11 09:58:33,443 INFO ha.BootstrapStandby: Skipping InMemoryAliasMap bootstrap as it was not configured
- 2023-05-11 09:58:33,456 INFO namenode.NameNode: SHUTDOWN_MSG:
- /************************************************************
- SHUTDOWN_MSG: Shutting down NameNode at pignode2/10.0.31.12
- ************************************************************/

Once synchronized, start it as well:
- [root@pignode2 ~]# hdfs --daemon start namenode
- [root@pignode2 ~]# jps
- 75 JournalNode
- 251 NameNode
- 332 Jps
Same again on the third node:
- [root@pignode3 ~]# hdfs namenode -bootstrapStandby
- 2023-05-11 09:46:55,393 INFO namenode.NameNode: STARTUP_MSG:
- /************************************************************
- STARTUP_MSG: Starting NameNode
- STARTUP_MSG: host = pignode3/10.0.30.6
- STARTUP_MSG: args = [-bootstrapStandby]
- STARTUP_MSG: version = 3.3.5
- …………
- …………
- 2023-05-11 10:02:24,114 INFO common.Util: Assuming 'file' scheme for path /hadoopdata/hdfs_name in configuration.
- =====================================================
- About to bootstrap Standby ID pignamenode3 from:
- Nameservice ID: pignamenodecluster
- Other Namenode ID: pignamenode1
- Other NN's HTTP address: http://pignode1:9870
- Other NN's IPC address: pignode1/10.0.31.21:8020
- Namespace ID: 1898329509
- Block pool ID: BP-1342056252-10.0.31.21-1683799022316
- Cluster ID: CID-ddaf258a-47c4-4dde-b681-2c9c70872ef1
- Layout version: -66
- isUpgradeFinalized: true
- =====================================================
- 2023-05-11 10:02:24,409 INFO common.Storage: Storage directory /hadoopdata/hdfs_name has been successfully formatted.
- 2023-05-11 10:02:24,420 INFO common.Util: Assuming 'file' scheme for path /hadoopdata/hdfs_name in configuration.
- 2023-05-11 10:02:24,421 INFO common.Util: Assuming 'file' scheme for path /hadoopdata/hdfs_name in configuration.
- 2023-05-11 10:02:24,450 INFO namenode.FSEditLog: Edit logging is async:true
- 2023-05-11 10:02:24,542 INFO namenode.TransferFsImage: Opening connection to http://pignode1:9870/imagetransfer?getimage=1&txid=0&storageInfo=-66:1898329509:1683799022316:CID-ddaf258a-47c4-4dde-b681-2c9c70872ef1&bootstrapstandby=true
- 2023-05-11 10:02:24,567 INFO common.Util: Combined time for file download and fsync to all disks took 0.00s. The file download took 0.00s at 0.00 KB/s. Synchronous (fsync) write to disk of /hadoopdata/hdfs_name/current/fsimage.ckpt_0000000000000000000 took 0.00s.
- 2023-05-11 10:02:24,568 INFO namenode.TransferFsImage: Downloaded file fsimage.ckpt_0000000000000000000 size 399 bytes.
- 2023-05-11 10:02:24,574 INFO ha.BootstrapStandby: Skipping InMemoryAliasMap bootstrap as it was not configured
- 2023-05-11 10:02:24,590 INFO namenode.NameNode: SHUTDOWN_MSG:
- /************************************************************
- SHUTDOWN_MSG: Shutting down NameNode at pignode3/10.0.31.18
- ************************************************************/

Start it:
- [root@pignode3 ~]# hdfs --daemon start namenode
- [root@pignode3 ~]# jps
- 249 NameNode
- 330 Jps
- 75 JournalNode
At this point you can already check the state of the namenodes. Query any one of them and all three report standby, which is perfectly normal since the startup is not finished yet; there is still work to do, comrades.
- [root@pignode3 ~]# hdfs haadmin -getAllServiceState
- pignode1:8020 standby
- pignode2:8020 standby
- pignode3:8020 standby
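Incidentally, haadmin can also query a single namenode by the ID defined in dfs.ha.namenodes.pignamenodecluster; a quick sketch:
- # state of one specific namenode, addressed by its ID rather than host:port
- hdfs haadmin -getServiceState pignamenode1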
Run the following from one of the namenode nodes: hdfs zkfc -formatZK
- [root@pignode1 ~]# hdfs zkfc -formatZK
- 2023-05-11 10:06:06,802 INFO tools.DFSZKFailoverController: STARTUP_MSG:
- /************************************************************
- STARTUP_MSG: Starting DFSZKFailoverController
- STARTUP_MSG: host = pignode1/10.0.31.21
- STARTUP_MSG: args = [-formatZK]
- STARTUP_MSG: version = 3.3.5
- …………
- …………
- 2023-05-11 10:06:07,564 INFO ha.ActiveStandbyElector: Session connected.
- 2023-05-11 10:06:07,618 INFO ha.ActiveStandbyElector: Successfully created /hadoop-ha/pignamenodecluster in ZK.
- 2023-05-11 10:06:07,731 INFO zookeeper.ZooKeeper: Session: 0x300052118910000 closed
- 2023-05-11 10:06:07,731 WARN ha.ActiveStandbyElector: Ignoring stale result from old client with sessionId 0x300052118910000
- 2023-05-11 10:06:07,732 INFO zookeeper.ClientCnxn: EventThread shut down for session: 0x300052118910000
- 2023-05-11 10:06:07,736 INFO tools.DFSZKFailoverController: SHUTDOWN_MSG:
- /************************************************************
- SHUTDOWN_MSG: Shutting down DFSZKFailoverController at pignode1/10.0.31.21
- ************************************************************/

According to the official docs, start-dfs.sh can be run from this point on; of course you can also start the zkfc daemons by hand with hdfs --daemon start zkfc.
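For reference, the fully manual route would look roughly like this; a sketch run from pignode1, relying on the SSH trust baked into the image and on the workers file that is mapped into every container:
- # start a zkfc next to each namenode
- for nn in pignode1 pignode2 pignode3; do
-     ssh root@${nn} "hdfs --daemon start zkfc"
- done
- # start every datanode listed in the workers file
- for dn in $(cat $HADOOP_HOME/etc/hadoop/workers); do
-     ssh root@${dn} "hdfs --daemon start datanode"
- done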
Starting three zkfc processes by hand would be tolerable, but nine datanodes would be painful, so let's take the lazy route:
- [root@pignode1 hadoop]# sbin/start-dfs.sh
- Starting namenodes on [pignode1 pignode2 pignode3]
- Last login: Thu May 11 10:05:49 UTC 2023 from 192.168.21.11 on pts/0
- pignode1: Warning: Permanently added 'pignode1,10.0.31.21' (ECDSA) to the list of known hosts.
- pignode2: Warning: Permanently added 'pignode2,10.0.31.12' (ECDSA) to the list of known hosts.
- pignode3: Warning: Permanently added 'pignode3,10.0.31.18' (ECDSA) to the list of known hosts.
- pignode1: namenode is running as process 259. Stop it first and ensure /tmp/hadoop-root-namenode.pid file is empty before retry.
- pignode2: namenode is running as process 251. Stop it first and ensure /tmp/hadoop-root-namenode.pid file is empty before retry.
- pignode3: namenode is running as process 249. Stop it first and ensure /tmp/hadoop-root-namenode.pid file is empty before retry.
- Starting datanodes
- Last login: Thu May 11 10:07:40 UTC 2023 on pts/0
- pignode5: Warning: Permanently added 'pignode5,10.0.31.2' (ECDSA) to the list of known hosts.
- pignode4: Warning: Permanently added 'pignode4,10.0.31.14' (ECDSA) to the list of known hosts.
- pignode6: Warning: Permanently added 'pignode6,10.0.31.16' (ECDSA) to the list of known hosts.
- pignode9: Warning: Permanently added 'pignode9,10.0.31.8' (ECDSA) to the list of known hosts.
- pignode11: Warning: Permanently added 'pignode11,10.0.31.15' (ECDSA) to the list of known hosts.
- pignode7: Warning: Permanently added 'pignode7,10.0.31.20' (ECDSA) to the list of known hosts.
- pignode10: Warning: Permanently added 'pignode10,10.0.31.19' (ECDSA) to the list of known hosts.
- pignode8: Warning: Permanently added 'pignode8,10.0.31.6' (ECDSA) to the list of known hosts.
- pignode12: Warning: Permanently added 'pignode12,10.0.31.4' (ECDSA) to the list of known hosts.
- pignode4: WARNING: /root/hadoop/logs does not exist. Creating.
- pignode5: WARNING: /root/hadoop/logs does not exist. Creating.
- pignode6: WARNING: /root/hadoop/logs does not exist. Creating.
- pignode7: WARNING: /root/hadoop/logs does not exist. Creating.
- pignode9: WARNING: /root/hadoop/logs does not exist. Creating.
- pignode8: WARNING: /root/hadoop/logs does not exist. Creating.
- pignode10: WARNING: /root/hadoop/logs does not exist. Creating.
- pignode11: WARNING: /root/hadoop/logs does not exist. Creating.
- pignode12: WARNING: /root/hadoop/logs does not exist. Creating.
- Starting journal nodes [pignode3 pignode2 pignode1]
- Last login: Thu May 11 10:07:40 UTC 2023 on pts/0
- pignode1: Warning: Permanently added 'pignode1,10.0.31.21' (ECDSA) to the list of known hosts.
- pignode2: Warning: Permanently added 'pignode2,10.0.31.12' (ECDSA) to the list of known hosts.
- pignode3: Warning: Permanently added 'pignode3,10.0.31.18' (ECDSA) to the list of known hosts.
- pignode2: journalnode is running as process 75. Stop it first and ensure /tmp/hadoop-root-journalnode.pid file is empty before retry.
- pignode1: journalnode is running as process 74. Stop it first and ensure /tmp/hadoop-root-journalnode.pid file is empty before retry.
- pignode3: journalnode is running as process 75. Stop it first and ensure /tmp/hadoop-root-journalnode.pid file is empty before retry.
- Starting ZK Failover Controllers on NN hosts [pignode1 pignode2 pignode3]
- Last login: Thu May 11 10:07:47 UTC 2023 on pts/0
- pignode1: Warning: Permanently added 'pignode1,10.0.31.21' (ECDSA) to the list of known hosts.
- pignode2: Warning: Permanently added 'pignode2,10.0.31.12' (ECDSA) to the list of known hosts.
- pignode3: Warning: Permanently added 'pignode3,10.0.31.18' (ECDSA) to the list of known hosts.
- [root@pignode1 hadoop]#

Check again after the startup finishes; one of the namenodes is now active:
- [root@pignode1 hadoop]# hdfs haadmin -getAllServiceState
- pignode1:8020 active
- pignode2:8020 standby
- pignode3:8020 standby
Then take a look at the result in the web UI.
The nice thing about using the script is that the datanodes are all up as well.
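The same can be confirmed from the command line without the web UI; a quick sketch, run inside any namenode container:
- # HA state of the namenodes plus a list of live datanodes
- hdfs haadmin -getAllServiceState
- hdfs dfsadmin -report | grep -E 'Live datanodes|Name:'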
Following Apache Hadoop 3.3.5 – ResourceManager High Availability, configure YARN for high availability. The main parameters involved are listed below.
Turn on the ResourceManager HA switch:
- <property>
- <name>yarn.resourcemanager.ha.enabled</name>
- <value>true</value>
- </property>
yarn.resourcemanager.cluster-id defines the id of the ResourceManager cluster; you simply make one up, and so far it does not seem to be used anywhere else.
yarn.resourcemanager.ha.rm-ids defines the internal member names of the ResourceManager cluster.
yarn.resourcemanager.hostname.<rm-id> defines the node each ResourceManager is deployed on.
yarn.resourcemanager.webapp.address.<rm-id> defines the address and port of each ResourceManager's web UI:
- <property><!-- ResourceManager cluster id -->
- <name>yarn.resourcemanager.cluster-id</name>
- <value>pignode-ha</value>
- </property>
- <property>
- <name>yarn.resourcemanager.ha.rm-ids</name>
- <value>pigresourcemanager1,pigresourcemanager2,pigresourcemanager3</value>
- </property>
- <property>
- <name>yarn.resourcemanager.hostname.pigresourcemanager1</name>
- <value>pignode1</value>
- </property>
- ……
- <property>
- <name>yarn.resourcemanager.webapp.address.pigresourcemanager1</name>
- <value>0.0.0.0:8088</value>
- </property>
- ……

The official docs give hadoop.zk.address, but many articles online use yarn.resourcemanager.zk-address, presumably a version difference. Whatever; as long as it works.
- <property>
- <name>yarn.resourcemanager.zk-address</name>
- <value>zookeeper1:2181,zookeeper2:2181,zookeeper3:2181</value>
- </property>
YARN can be started directly with the start-yarn.sh script, and yarn rmadmin -getAllServiceState shows the state of the ResourceManagers.
- [root@pignode1 hadoop]# sbin/start-yarn.sh
- Starting resourcemanagers on [ pignode1 pignode2]
- Last login: Thu May 11 10:07:50 UTC 2023 on pts/0
- pignode2: Warning: Permanently added 'pignode2,10.0.31.22' (ECDSA) to the list of known hosts.
- pignode1: Warning: Permanently added 'pignode1,10.0.31.21' (ECDSA) to the list of known hosts.
- pignode2: WARNING: /root/hadoop/logs does not exist. Creating.
- Starting nodemanagers
- Last login: Thu May 11 13:32:28 UTC 2023 on pts/0
- pignode6: Warning: Permanently added 'pignode6,10.0.31.27' (ECDSA) to the list of known hosts.
- pignode11: Warning: Permanently added 'pignode11,10.0.31.31' (ECDSA) to the list of known hosts.
- pignode4: Warning: Permanently added 'pignode4,10.0.31.25' (ECDSA) to the list of known hosts.
- pignode8: Warning: Permanently added 'pignode8,10.0.31.6' (ECDSA) to the list of known hosts.
- pignode7: Warning: Permanently added 'pignode7,10.0.31.20' (ECDSA) to the list of known hosts.
- pignode9: Warning: Permanently added 'pignode9,10.0.31.8' (ECDSA) to the list of known hosts.
- pignode12: Warning: Permanently added 'pignode12,10.0.31.30' (ECDSA) to the list of known hosts.
- pignode5: Warning: Permanently added 'pignode5,10.0.31.28' (ECDSA) to the list of known hosts.
- pignode10: Warning: Permanently added 'pignode10,10.0.31.29' (ECDSA) to the list of known hosts.
- pignode6: WARNING: /root/hadoop/logs does not exist. Creating.
- pignode11: WARNING: /root/hadoop/logs does not exist. Creating.
- pignode4: WARNING: /root/hadoop/logs does not exist. Creating.
- pignode12: WARNING: /root/hadoop/logs does not exist. Creating.
- pignode5: WARNING: /root/hadoop/logs does not exist. Creating.
- pignode10: WARNING: /root/hadoop/logs does not exist. Creating.
- Last login: Thu May 11 13:32:30 UTC 2023 on pts/0
- pignode3: Warning: Permanently added 'pignode3,10.0.31.24' (ECDSA) to the list of known hosts.
- pignode3: WARNING: /root/hadoop/logs does not exist. Creating.
- [root@pignode1 hadoop]#

After the startup, the ResourceManagers can be checked with yarn rmadmin:
- [root@pignode1 hadoop]# yarn rmadmin -getAllServiceState
- pignode1:8033 standby
- pignode2:8033 active
- pignode3:8033 standby
Unlike the namenodes, a ResourceManager that is not active apparently does not serve its management page.
The active RM, on the other hand, can be accessed.
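A quick way to see this from the command line is to probe the published web ports with curl; a sketch, using the host and port pairs published in the stack file at the end of this post, and the exact redirect behaviour may vary between Hadoop versions:
- # a standby RM normally answers with a redirect to the active one,
- # whose overlay hostname is not resolvable from outside the swarm network
- curl -sI http://pighost1:8088/cluster | head -n 3
- curl -sI http://pighost2:8089/cluster | head -n 3
- curl -sI http://pighost3:8087/cluster | head -n 3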
MapReduce is configured and started exactly as in non-HA mode, so I will not repeat it; the one extra daemon is sketched below.
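That extra daemon is the MapReduce Job History Server, which this layout puts on pignode3 (see mapreduce.jobhistory.address below); starting it is a one-liner, sketched here:
- # run on pignode3 itself, or from pignode1 over ssh
- ssh root@pignode3 "mapred --daemon start historyserver"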
Not much more to say; the earlier sections are already wordy enough, so the full config files are pasted here as-is.
- <?xml version="1.0" encoding="UTF-8"?>
- <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
- <!--
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License. See accompanying LICENSE file.
- -->
-
- <!-- Put site-specific property overrides in this file. -->
-
- <configuration>
- <property>
- <!-- The namenode URI; with HA this is the logical name, i.e. the name of the nameservice (namenode group) -->
- <name>fs.defaultFS</name>
- <value>hdfs://pignamenodecluster</value>
- </property>
- <property>
- <!-- ZooKeeper quorum used by the HA cluster -->
- <name>ha.zookeeper.quorum</name>
- <value>zookeeper1:2181,zookeeper2:2181,zookeeper3:2181</value>
- </property>
- <property>
- <!-- Use root as the static user for the Hadoop web UIs -->
- <name>hadoop.http.staticuser.user</name>
- <value>root</value>
- </property>
- <property>
- <!-- Hadoop data directory; mapped to the corresponding directory on the host -->
- <name>hadoop.tmp.dir</name>
- <value>/hadoopdata/data</value>
- </property>
- </configuration>

- <?xml version="1.0" encoding="UTF-8"?>
- <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
- <!--
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License. See accompanying LICENSE file.
- -->
-
- <!-- Put site-specific property overrides in this file. -->
-
- <configuration>
- <property>
- <!-- Define the nameservice (the namenode group) -->
- <name>dfs.nameservices</name>
- <value>pignamenodecluster</value>
- </property>
- <property>
- <name>dfs.ha.namenodes.pignamenodecluster</name>
- <value>pignamenode1,pignamenode2,pignamenode3</value>
- </property>
-
- <property><!-- Filesystem replication factor -->
- <name>dfs.replication</name>
- <value>3</value>
- </property>
-
- <!-- Working directories (data storage) for the namenodes and datanodes -->
- <property>
- <name>dfs.namenode.name.dir</name>
- <value>/hadoopdata/hdfs_name</value>
- </property>
- <property>
- <name>dfs.datanode.data.dir</name>
- <value>/hadoopdata/hdfs_data</value>
- </property>
-
- <!-- Enable WebHDFS -->
- <property>
- <name>dfs.webhdfs.enabled</name>
- <value>true</value>
- </property>
-
- <property>
- <name>dfs.namenode.rpc-address.pignamenodecluster.pignamenode1</name>
- <value>pignode1:8020</value>
- </property>
- <property>
- <name>dfs.namenode.rpc-address.pignamenodecluster.pignamenode2</name>
- <value>pignode2:8020</value>
- </property>
- <property>
- <name>dfs.namenode.rpc-address.pignamenodecluster.pignamenode3</name>
- <value>pignode3:8020</value>
- </property>
- <property>
- <name>dfs.namenode.http-address.pignamenodecluster.pignamenode1</name>
- <value>0.0.0.0:9870</value>
- </property>
- <property>
- <name>dfs.namenode.http-address.pignamenodecluster.pignamenode2</name>
- <value>0.0.0.0:9870</value>
- </property>
- <property>
- <name>dfs.namenode.http-address.pignamenodecluster.pignamenode3</name>
- <value>0.0.0.0:9870</value>
- </property>
-
- <property>
- <!-- JournalNode cluster through which the namenodes read and write edit entries -->
- <name>dfs.namenode.shared.edits.dir</name>
- <value>qjournal://pignode1:8485;pignode2:8485;pignode3:8485/pignamenodecluster</value>
- </property>
- <property>
- <name>dfs.journalnode.edits.dir</name>
- <value>/hadoopdata/journal</value>
- </property>
- <property>
- <name>dfs.ha.fencing.methods</name>
- <value>sshfence</value>
- </property>
- <property>
- <name>dfs.ha.fencing.ssh.private-key-files</name>
- <value>/root/.ssh/id_rsa</value>
- </property>
- <property>
- <name>dfs.ha.automatic-failover.enabled</name>
- <value>true</value>
- </property>
-
- <!-- Java class HDFS clients use to find the active namenode -->
- <property>
- <name>dfs.client.failover.proxy.provider.pignamenodecluster</name>
- <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
- </property>
-
- </configuration>
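A lightweight way to confirm that a container really picked up the mapped core-site.xml and hdfs-site.xml is hdfs getconf; a quick sketch, run inside any pignode container:
- hdfs getconf -confKey fs.defaultFS                           # expect hdfs://pignamenodecluster
- hdfs getconf -confKey dfs.ha.namenodes.pignamenodecluster    # expect pignamenode1,pignamenode2,pignamenode3
- hdfs getconf -namenodes                                      # expect pignode1 pignode2 pignode3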

- <?xml version="1.0"?>
- <!--
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License. See accompanying LICENSE file.
- -->
- <configuration>
-
- <!-- Site specific YARN configuration properties -->
- <property>
- <name>yarn.resourcemanager.ha.enabled</name>
- <value>true</value>
- </property>
- <property>
- <name>yarn.resourcemanager.recovery.enabled</name>
- <value>true</value>
- </property>
- <property><!-- ResourceManager cluster id -->
- <name>yarn.resourcemanager.cluster-id</name>
- <value>pignode-ha</value>
- </property>
- <property>
- <name>yarn.resourcemanager.ha.rm-ids</name>
- <value>pigresourcemanager1,pigresourcemanager2,pigresourcemanager3</value>
- </property>
- <property>
- <name>yarn.resourcemanager.hostname.pigresourcemanager1</name>
- <value>pignode1</value>
- </property>
- <property>
- <name>yarn.resourcemanager.hostname.pigresourcemanager2</name>
- <value>pignode2</value>
- </property>
- <property>
- <name>yarn.resourcemanager.hostname.pigresourcemanager3</name>
- <value>pignode3</value>
- </property>
- <property>
- <name>yarn.resourcemanager.webapp.address.pigresourcemanager1</name>
- <value>0.0.0.0:8088</value>
- </property>
- <property>
- <name>yarn.resourcemanager.webapp.address.pigresourcemanager2</name>
- <value>0.0.0.0:8088</value>
- </property>
- <property>
- <name>yarn.resourcemanager.webapp.address.pigresourcemanager3</name>
- <value>0.0.0.0:8088</value>
- </property>
-
- <property>
- <name>yarn.nodemanager.aux-services</name>
- <value>mapreduce_shuffle</value>
- </property>
- <property>
- <name>yarn.nodemanager.env-whitelist</name>
- <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
- </property>
- <property>
- <name>yarn.resourcemanager.store.class</name>
- <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
- </property>
- <property>
- <name>yarn.resourcemanager.zk-address</name>
- <value>zookeeper1:2181,zookeeper2:2181,zookeeper3:2181</value>
- </property>
- </configuration>

- <?xml version="1.0"?>
- <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
- <!--
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License. See accompanying LICENSE file.
- -->
-
- <!-- Put site-specific property overrides in this file. -->
-
- <configuration>
- <property><!-- Run MapReduce on the YARN framework -->
- <name>mapreduce.framework.name</name>
- <value>yarn</value>
- </property>
- <property><!-- Address of the Job History Server -->
- <name>mapreduce.jobhistory.address</name>
- <value>pignode3:10020</value>
- </property>
- <property><!-- Web UI address of the Job History Server -->
- <name>mapreduce.jobhistory.webapp.address</name>
- <value>0.0.0.0:19888</value>
- </property>
- <property>
- <name>mapreduce.application.classpath</name>
- <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
- </property>
- </configuration>
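With all four files in place, the bundled example job makes a convenient end-to-end check of HDFS HA, YARN HA and the history server; a sketch, assuming the stock 3.3.5 tarball layout for the examples jar:
- # submit a small job from any node; it should show up in the active RM's UI
- # and, once finished, in the history server on pignode3
- hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.5.jar pi 3 10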

Its main purpose is to clean out the hadoop and zookeeper directories mapped from the hosts, so that repeated formatting and similar operations don't produce puzzling errors during the experiments.
- #! /bin/bash
-
- index=1
- rm /hadoopdata/* -rf
- while(($index<=12));do
- file="/hadoopdata/${index}"
- mkdir $file
- mkdir ${file}/data
- mkdir ${file}/hdfs_name
- mkdir ${file}/hdfs_data
- mkdir ${file}/journal
- let "index++"
- done
-
- index=1
- while(($index<=3));do
- file="/hadoopdata/zoo/${index}"
- mkdir ${file}/data -p
- mkdir ${file}/datalog -p
- mkdir ${file}/logs -p
- let "index++"
- done
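Because every swarm host has its own /hadoopdata, the script has to run on each of them before a fresh deployment. Something like the loop below works from the leader; a sketch, where clean-hadoopdata.sh is just the name I give the script above and pighost1 to pighost5 are the swarm hosts from the stack file:
- for h in pighost1 pighost2 pighost3 pighost4 pighost5; do
-     scp clean-hadoopdata.sh root@${h}:/root/
-     ssh root@${h} "bash /root/clean-hadoopdata.sh"
- done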

The init script is pasted here mostly for my own reference, because some of its parameters are not very flexible. It also only covers two situations, DFS not yet formatted and DFS already formatted, and only for the case where all nodes start together, i.e. a swarm start; it does not handle restarting a single failed node. Non-manager nodes simply wait five minutes and then run start-dfs.sh and start-yarn.sh, letting those scripts decide whether anything still needs to be started. Good enough for the current need.
- #! /bin/bash
- # the NODE_COUNT param is set in the swarm stack yml file via the environment section.
- NODECOUNT=$NODE_COUNT
- TRYLOOP=50
- ZOOKEEPERNODECOUNT=$ZOOKEEPER_COUNT
-
- ############################################################################################################
- ## 1. get environment params
- ############################################################################################################
- source /etc/profile
- source /root/.bashrc
-
- ############################################################################################################
- ## 2. for every node, init sshd service
- ############################################################################################################
- /sbin/sshd -D &
-
- ############################################################################################################
- ## 3. define functions
- ############################################################################################################
-
- #FUNCTION: test whether all the nodes can be connected------------------------------------------------------------
- #param1: node's hostname prefix
- #param2: node count
- #param3: how many times the manager node try connect
- isAllNodesConnected(){
- PIGNODE_PRENAME=$1
- PIGNODE_COUNT=$2
- TRYLOOP_COUNT=$3
- tryloop=0
- ind=1
- #init pignode hostname array,and pignode status array
- while(( $ind <= $PIGNODE_COUNT ))
- do
- pignodes[$ind]="$PIGNODE_PRENAME$ind"
- pignodes_stat[$ind]=0
- let "ind++"
- done
-
- #check whether all the pignodes can be connected
- noactivecount=$PIGNODE_COUNT
- while(( $noactivecount > 0 ))
- do
- noactivecount=$PIGNODE_COUNT
- ind=1
- while(( $ind <= $PIGNODE_COUNT ))
- do
- if (( ${pignodes_stat[$ind]}==0 ))
- then
- ping -c 1 ${pignodes[$ind]} > /dev/null
- if (($?==0))
- then
- pignodes_stat[$ind]=1
- let "noactivecount-=1"
- echo "Try to connect ${pignodes[$ind]}:successed." >>init.log
- else
- echo "Try to connect ${pignodes[$ind]}: failed." >>init.log
- fi
- else
- let "noactivecount-=1"
- fi
- let "ind++"
- done
- if (( ${noactivecount}>0 ))
- then
- let "tryloop++"
- if (($tryloop>$TRYLOOP_COUNT))
- then
- echo "ERROR Tried ${TRYLOOP_COUNT} loops. ${noactivecount} nodes failed, exit." >>init.log
- break;
- fi
- echo "${noactivecount} left for ${PIGNODE_COUNT} nodes not connected, waiting for next try">>init.log
- sleep 5
- else
- echo "All nodes are connected.">>init.log
- fi
- done
- return $noactivecount
- }
- #----------------------------------------------------------------------------------------------------------
-
- #FUNCTION:get the hadoop data directory--------------------------------------------------------------------
- getDataDirectory(){
- #when use tmp data directory
- # configfiledir=`echo "${HADOOP_HOME}/etc/hadoop/core-site.xml"`
- # datadir=`cat ${configfiledir} | grep -A 2 'hadoop.tmp.dir' | grep '<value>' | sed 's/^[[:blank:]]*<value>//g' | sed 's/<\/value>$//g'`
- # echo $datadir
-
- #when use namenode.name.dir direcotry
- datadir=`cat ${HADOOP_HOME}/etc/hadoop/hdfs-site.xml|grep -A 2 "dfs.namenode.name.dir"|grep "<value>"|sed -e "s/<value>//g"|sed -e "s/<\/value>//g"`
- echo $datadir
- }
- #---------------------------------------------------------------------------------------------------------
-
- #FUNCTION:init hadoop while dfs not formatted.------------------------------------------------------------
- initHadoop_format(){
- #init journalnode
- echo 'start all Journalnode' >> init.log
- journallist=`cat $HADOOP_HOME/etc/hadoop/hdfs-site.xml |grep -A 2 'dfs.namenode.shared.edits.dir'|grep '<value>'|sed -e "s/<value>qjournal:\/\/\(.*\)\/.*<\/value>/\1/g"|sed "s/;/ /g"|sed -e "s/:[[:digit:]]\{2,5\}/ /g"`
- for journalnode in $journallist;do
- ssh root@${journalnode} "hdfs --daemon start journalnode"
- done
-
- #format and start the main namenode
- echo 'format and start namenode 1'>>init.log
- hdfs namenode -format
- if (( $?!=0 )); then
- exit $?
- fi
- hdfs --daemon start namenode
- if (( $?!=0 )); then
- exit $?
- fi
-
- #sync and start other namenodes
- echo 'sync and start others.'>>init.log
- dosyncid=2
- while (($dosyncid<=3));do
- ssh root@$nodehostnameprefix$dosyncid "hdfs namenode -bootstrapStandby"
- if (( $?!=0 )); then
- exit $?
- fi
- ssh root@$nodehostnameprefix$dosyncid "hdfs --daemon start namenode"
- if (( $?!=0 )); then
- exit $?
- fi
- let "dosyncid++"
- done
-
- #format zookeeper directory
- hdfs zkfc -formatZK
- }
- #---------------------------------------------------------------------------------------------------------
-
- #FUNCTION:init hadoop while dfs formatted-----------------------------------------------------------------
- initHadoop_noformat(){
- echo 'name node formatted. go on to start dfs related nodes and service'>>init.log
- sbin/start-dfs.sh
- if (( $?!=0 )); then
- exit $?
- fi
-
- echo 'start yarn resourcemanager and node manager'>>init.log
- sbin/start-yarn.sh
- if (( $?!=0 )); then
- exit $?
- fi
-
- echo 'start mapreduce history server'>>init.log
- historyservernode=`cat $HADOOP_HOME/etc/hadoop/mapred-site.xml |grep -A 2 'mapreduce.jobhistory.address'|grep '<value>' |sed -e "s/^.*<value>//g"|sed -e "s/<\/value>//g"|sed -e "s/:[[:digit:]]*//g"`
- ssh root@$historyservernode "mapred --daemon start historyserver"
- if (( $?!=0 )); then
- exit $?
- fi
- }
-
- ############################################################################################################
- ## 4. test whether this is the main node ##
- ############################################################################################################
- #get the host node's name, name prefix, and name No.
- nodehostname=`hostname`
- nodehostnameprefix=`echo $nodehostname|sed -e 's|[[:digit:]]\+$||g'`
- nodeindex=`hostname | sed "s/${nodehostnameprefix}//g"`
-
- #get the zookeeper's name prefix from yarn-site.xml
- zookeepernameprefix=`cat ${HADOOP_HOME}/etc/hadoop/yarn-site.xml |grep -A 2 '<name>yarn.resourcemanager.zk-address</name>'|grep '<value>'|sed -e "s/[[:blank:]]\+<value>\([[:alpha:]]\+\)[[:digit:]]\+:.*/\1/g"`
-
-
- #1.ensure in working directory, only the first node can go on initiation.
- cd $HADOOP_HOME
- #check the NODECOUNT param; if it is not greater than 3, do nothing and return an error, since that is not enough nodes for this HA layout.
- if (($NODECOUNT<=3));then
- echo "Node count must be more than 3.">>init.log
- exit 1
- fi
-
- #check node id; nodes other than node 1 just wait, then run the start scripts themselves.
- if (($nodeindex!=1));then
- echo $nodehostname waiting for init...>>init.log
- sleep 5m
- cd $HADOOP_HOME
- sbin/start-dfs.sh
- sbin/start-yarn.sh
- if (($nodeindex==3));then
- mapred --daemon start historyserver
- fi
- tail -f /dev/null
- exit 0
- fi
-
- #2.Try to connect to all host nodes and zookeeper nodes.
- echo $nodehostname is the init manager node...>>init.log
- #waiting for all the nodes connected
- isAllNodesConnected $nodehostnameprefix $NODECOUNT $TRYLOOP
- isHadoopOK=$?
- isAllNodesConnected $zookeepernameprefix $ZOOKEEPERNODECOUNT $TRYLOOP
- isZookeeperOK=$?
- if ([ $isHadoopOK != 0 ] || [ $isZookeeperOK != 0 ]);then
- echo "Not all of the hadoop nodes or zookeeper nodes are reachable. exit 1">>init.log
- exit 1
- fi
-
- #3. whether dfs is formatted.
- datadirectory=`echo $(getDataDirectory)`
- if [ -n "$datadirectory" ];then
- datadircontent=`ls -A ${datadirectory}`
- if [ -z "$datadircontent" ];then
- echo "dfs is not formatted.">>init.log
- isDfsFormat=0
- else
- echo "dfs is already formatted.">>init.log
- isDfsFormat=1
- fi
- else
- echo "ERROR:Can not get hadoop tmp data directory.init can not be done. ">>init.log
- exit 1
- fi
-
- #4. if not formatted, then do format and sync
- if (( $isDfsFormat == 0 ));then
- initHadoop_format
- fi
- if (( $? != 0 ));then
- echo "ERROR:Init Hadoop interruptted...">>init.log
- exit $?
- fi
-
- #5. start all dfs node, yarn node and mapreduce history server
- initHadoop_noformat
- if (( $? != 0 ));then
- echo "ERROR:Init Hadoop interruptted...">>init.log
- exit $?
- fi
-
- echo "hadoop init work has been done. hang up for swarm."
-
- tail -f /dev/null
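None of this runs unless the image actually launches the script when the container starts. A minimal sketch of the wiring; the file name init-hadoop.sh and the exact instructions are assumptions here, not the real Dockerfile behind pig/hadoop:ha:
- # fragment of the hadoop image's Dockerfile (sketch)
- COPY init-hadoop.sh /root/init-hadoop.sh
- RUN chmod +x /root/init-hadoop.sh
- # run the init script as the container's foreground process
- CMD ["/bin/bash", "/root/init-hadoop.sh"]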

version: "3.7" services: pignode1: image: pig/hadoop:ha deploy: endpoint_mode: dnsrr restart_policy: condition: on-failure placement: constraints: - node.hostname==pighost1 hostname: pignode1 environment: - NODE_COUNT=12 - ZOOKEEPER_COUNT=3 networks: - pig ports: - target: 22 published: 9011 protocol: tcp mode: host - target: 9000 published: 9000 protocol: tcp mode: host - target: 9870 published: 9870 protocol: tcp mode: host - target: 8088 published: 8088 protocol: tcp mode: host volumes: # 映射xml配置文件 - ./config/core-site.xml:/root/hadoop/etc/hadoop/core-site.xml:r - ./config/hdfs-site.xml:/root/hadoop/etc/hadoop/hdfs-site.xml:r - ./config/yarn-site.xml:/root/hadoop/etc/hadoop/yarn-site.xml:r - ./config/mapred-site.xml:/root/hadoop/etc/hadoop/mapred-site.xml:r # 映射workers文件 - ./config/workers:/root/hadoop/etc/hadoop/workers:r # 映射数据目录 - /hadoopdata/1:/hadoopdata:wr pignode2: image: pig/hadoop:ha deploy: endpoint_mode: dnsrr restart_policy: condition: on-failure placement: # 将Second Namenode限制部署在第二个节点上 constraints: - node.hostname==pighost2 networks: - pig hostname: pignode2 ports: # 第二名字服务器接口 - target: 22 published: 9012 protocol: tcp mode: host - target: 9890 published: 9890 protocol: tcp mode: host - target: 9870 published: 9871 protocol: tcp mode: host - target: 8088 published: 8089 protocol: tcp mode: host volumes: # 映射xml配置文件 - ./config/core-site.xml:/root/hadoop/etc/hadoop/core-site.xml:r - ./config/hdfs-site.xml:/root/hadoop/etc/hadoop/hdfs-site.xml:r - ./config/yarn-site.xml:/root/hadoop/etc/hadoop/yarn-site.xml:r - ./config/mapred-site.xml:/root/hadoop/etc/hadoop/mapred-site.xml:r # 映射workers文件 - ./config/workers:/root/hadoop/etc/hadoop/workers:r # 映射数据目录 - /hadoopdata/2:/hadoopdata:wr pignode3: image: pig/hadoop:ha deploy: endpoint_mode: dnsrr restart_policy: condition: on-failure placement: # 将Mapreduce限制部署在第三个节点上 constraints: - node.hostname==pighost3 networks: - pig hostname: pignode3 ports: - target: 22 published: 9013 protocol: tcp mode: host - target: 9870 published: 9872 protocol: tcp mode: host - target: 8088 published: 8087 protocol: tcp mode: host - target: 8090 published: 8090 protocol: tcp mode: host - target: 10020 published: 10020 protocol: tcp mode: host - target: 19888 published: 19888 protocol: tcp mode: host volumes: # 映射xml配置文件 - ./config/core-site.xml:/root/hadoop/etc/hadoop/core-site.xml:r - ./config/hdfs-site.xml:/root/hadoop/etc/hadoop/hdfs-site.xml:r - ./config/yarn-site.xml:/root/hadoop/etc/hadoop/yarn-site.xml:r - ./config/mapred-site.xml:/root/hadoop/etc/hadoop/mapred-site.xml:r # 映射workers文件 - ./config/workers:/root/hadoop/etc/hadoop/workers:r # 映射数据目录 - /hadoopdata/3:/hadoopdata:wr #------------------------------------------------------------------------------------------------ #以下均为工作节点,可在除leader以外的主机上部署 pignode4: image: pig/hadoop:ha deploy: endpoint_mode: dnsrr restart_policy: condition: on-failure placement: # 将Mapreduce限制部署在第三个节点上 constraints: # node.role==manager # node.role==worker - node.hostname==pighost3 networks: - pig ports: - target: 22 published: 9014 protocol: tcp mode: host hostname: pignode4 volumes: # 映射xml配置文件 - ./config/core-site.xml:/root/hadoop/etc/hadoop/core-site.xml:r - ./config/hdfs-site.xml:/root/hadoop/etc/hadoop/hdfs-site.xml:r - ./config/yarn-site.xml:/root/hadoop/etc/hadoop/yarn-site.xml:r - ./config/mapred-site.xml:/root/hadoop/etc/hadoop/mapred-site.xml:r # 映射workers文件 - ./config/workers:/root/hadoop/etc/hadoop/workers:r # 映射数据目录 - /hadoopdata/4:/hadoopdata:wr pignode5: image: pig/hadoop:ha deploy: endpoint_mode: dnsrr 
restart_policy: condition: on-failure placement: # 将Mapreduce限制部署在第三个节点上 constraints: # node.role==manager - node.hostname==pighost3 networks: - pig ports: - target: 22 published: 9015 protocol: tcp mode: host hostname: pignode5 volumes: # 映射xml配置文件 - ./config/core-site.xml:/root/hadoop/etc/hadoop/core-site.xml:r - ./config/hdfs-site.xml:/root/hadoop/etc/hadoop/hdfs-site.xml:r - ./config/yarn-site.xml:/root/hadoop/etc/hadoop/yarn-site.xml:r - ./config/mapred-site.xml:/root/hadoop/etc/hadoop/mapred-site.xml:r # 映射workers文件 - ./config/workers:/root/hadoop/etc/hadoop/workers:r # 映射数据目录 - /hadoopdata/5:/hadoopdata:wr pignode6: image: pig/hadoop:ha deploy: endpoint_mode: dnsrr restart_policy: condition: on-failure placement: # 将Mapreduce限制部署在第三个节点上 constraints: # node.role==manager - node.hostname==pighost3 networks: - pig ports: - target: 22 published: 9016 protocol: tcp mode: host hostname: pignode6 volumes: # 映射xml配置文件 - ./config/core-site.xml:/root/hadoop/etc/hadoop/core-site.xml:r - ./config/hdfs-site.xml:/root/hadoop/etc/hadoop/hdfs-site.xml:r - ./config/yarn-site.xml:/root/hadoop/etc/hadoop/yarn-site.xml:r - ./config/mapred-site.xml:/root/hadoop/etc/hadoop/mapred-site.xml:r # 映射workers文件 - ./config/workers:/root/hadoop/etc/hadoop/workers:r # 映射数据目录 - /hadoopdata/6:/hadoopdata:wr pignode7: image: pig/hadoop:ha deploy: endpoint_mode: dnsrr restart_policy: condition: on-failure placement: # 将Mapreduce限制部署在第三个节点上 constraints: # node.role==manager - node.hostname==pighost4 networks: - pig ports: - target: 22 published: 9017 protocol: tcp mode: host hostname: pignode7 volumes: # 映射xml配置文件 - ./config/core-site.xml:/root/hadoop/etc/hadoop/core-site.xml:r - ./config/hdfs-site.xml:/root/hadoop/etc/hadoop/hdfs-site.xml:r - ./config/yarn-site.xml:/root/hadoop/etc/hadoop/yarn-site.xml:r - ./config/mapred-site.xml:/root/hadoop/etc/hadoop/mapred-site.xml:r # 映射workers文件 - ./config/workers:/root/hadoop/etc/hadoop/workers:r # 映射数据目录 - /hadoopdata/7:/hadoopdata:wr pignode8: image: pig/hadoop:ha deploy: endpoint_mode: dnsrr restart_policy: condition: on-failure placement: # 将Mapreduce限制部署在第三个节点上 constraints: # node.role==manager - node.hostname==pighost4 networks: - pig ports: - target: 22 published: 9018 protocol: tcp mode: host hostname: pignode8 volumes: # 映射xml配置文件 - ./config/core-site.xml:/root/hadoop/etc/hadoop/core-site.xml:r - ./config/hdfs-site.xml:/root/hadoop/etc/hadoop/hdfs-site.xml:r - ./config/yarn-site.xml:/root/hadoop/etc/hadoop/yarn-site.xml:r - ./config/mapred-site.xml:/root/hadoop/etc/hadoop/mapred-site.xml:r # 映射workers文件 - ./config/workers:/root/hadoop/etc/hadoop/workers:r # 映射数据目录 - /hadoopdata/8:/hadoopdata:wr pignode9: image: pig/hadoop:ha deploy: endpoint_mode: dnsrr restart_policy: condition: on-failure placement: # 将Mapreduce限制部署在第三个节点上 constraints: # node.role==manager - node.hostname==pighost4 networks: - pig ports: - target: 22 published: 9019 protocol: tcp mode: host hostname: pignode9 volumes: # 映射xml配置文件 - ./config/core-site.xml:/root/hadoop/etc/hadoop/core-site.xml:r - ./config/hdfs-site.xml:/root/hadoop/etc/hadoop/hdfs-site.xml:r - ./config/yarn-site.xml:/root/hadoop/etc/hadoop/yarn-site.xml:r - ./config/mapred-site.xml:/root/hadoop/etc/hadoop/mapred-site.xml:r # 映射workers文件 - ./config/workers:/root/hadoop/etc/hadoop/workers:r # 映射数据目录 - /hadoopdata/9:/hadoopdata:wr pignode10: image: pig/hadoop:ha deploy: endpoint_mode: dnsrr restart_policy: condition: on-failure placement: # 将Mapreduce限制部署在第三个节点上 constraints: # node.role==manager - node.hostname==pighost5 networks: - pig 
ports: - target: 22 published: 9020 protocol: tcp mode: host hostname: pignode10 volumes: # 映射xml配置文件 - ./config/core-site.xml:/root/hadoop/etc/hadoop/core-site.xml:r - ./config/hdfs-site.xml:/root/hadoop/etc/hadoop/hdfs-site.xml:r - ./config/yarn-site.xml:/root/hadoop/etc/hadoop/yarn-site.xml:r - ./config/mapred-site.xml:/root/hadoop/etc/hadoop/mapred-site.xml:r # 映射workers文件 - ./config/workers:/root/hadoop/etc/hadoop/workers:r # 映射数据目录 - /hadoopdata/10:/hadoopdata:wr pignode11: image: pig/hadoop:ha deploy: endpoint_mode: dnsrr restart_policy: condition: on-failure placement: # 将Mapreduce限制部署在第三个节点上 constraints: # node.role==manager - node.hostname==pighost5 networks: - pig ports: - target: 22 published: 9021 protocol: tcp mode: host hostname: pignode11 volumes: # 映射xml配置文件 - ./config/core-site.xml:/root/hadoop/etc/hadoop/core-site.xml:r - ./config/hdfs-site.xml:/root/hadoop/etc/hadoop/hdfs-site.xml:r - ./config/yarn-site.xml:/root/hadoop/etc/hadoop/yarn-site.xml:r - ./config/mapred-site.xml:/root/hadoop/etc/hadoop/mapred-site.xml:r # 映射workers文件 - ./config/workers:/root/hadoop/etc/hadoop/workers:r # 映射数据目录 - /hadoopdata/11:/hadoopdata:wr pignode12: image: pig/hadoop:ha deploy: endpoint_mode: dnsrr restart_policy: condition: on-failure placement: # 将Mapreduce限制部署在第三个节点上 constraints: # node.role==manager - node.hostname==pighost5 networks: - pig ports: - target: 22 published: 9022 protocol: tcp mode: host hostname: pignode12 volumes: # 映射xml配置文件 - ./config/core-site.xml:/root/hadoop/etc/hadoop/core-site.xml:r - ./config/hdfs-site.xml:/root/hadoop/etc/hadoop/hdfs-site.xml:r - ./config/yarn-site.xml:/root/hadoop/etc/hadoop/yarn-site.xml:r - ./config/mapred-site.xml:/root/hadoop/etc/hadoop/mapred-site.xml:r # 映射workers文件 - ./config/workers:/root/hadoop/etc/hadoop/workers:r # 映射数据目录 - /hadoopdata/12:/hadoopdata:wr zookeeper1: image: zookeeper:latest deploy: endpoint_mode: dnsrr restart_policy: condition: on-failure placement: constraints: - node.hostname==pighost1 networks: - pig ports: - target: 2181 published: 2181 protocol: tcp mode: host hostname: zookeeper1 environment: - ZOO_MY_ID=1 - ZOO_SERVERS=server.1=zookeeper1:2888:3888;2181 server.2=zookeeper2:2888:3888;2181 server.3=zookeeper3:2888:3888;2181 volumes: - /hadoopdata/zoo/1/data:/data - /hadoopdata/zoo/1/datalog:/datalog - /hadoopdata/zoo/1/logs:/logs zookeeper2: image: zookeeper:latest deploy: endpoint_mode: dnsrr restart_policy: condition: on-failure placement: constraints: - node.hostname==pighost2 networks: - pig ports: - target: 2181 published: 2182 protocol: tcp mode: host hostname: zookeeper2 environment: - ZOO_MY_ID=2 - ZOO_SERVERS=server.1=zookeeper1:2888:3888;2181 server.2=zookeeper2:2888:3888;2181 server.3=zookeeper3:2888:3888;2181 volumes: - /hadoopdata/zoo/2/data:/data - /hadoopdata/zoo/2/datalog:/datalog - /hadoopdata/zoo/2/logs:/logs zookeeper3: image: zookeeper:latest deploy: endpoint_mode: dnsrr restart_policy: condition: on-failure placement: constraints: - node.hostname==pighost3 networks: - pig ports: - target: 2181 published: 2183 protocol: tcp mode: host hostname: zookeeper3 environment: - ZOO_MY_ID=3 - ZOO_SERVERS=server.1=zookeeper1:2888:3888;2181 server.2=zookeeper2:2888:3888;2181 server.3=zookeeper3:2888:3888;2181 volumes: - /hadoopdata/zoo/3/data:/data - /hadoopdata/zoo/3/datalog:/datalog - /hadoopdata/zoo/3/logs:/logs networks: pig:
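Assuming the file above is saved as pig-hadoop.yml next to the config/ directory (both names are my own choice here), the whole cluster comes up from the swarm leader with docker stack deploy:
- docker stack deploy -c pig-hadoop.yml pig
- # watch the services come up
- docker stack ps pig
- # the init script writes its progress to init.log under $HADOOP_HOME (here /root/hadoop);
- # pignode1's sshd is published on pighost1:9011
- ssh -p 9011 root@pighost1 "tail -f /root/hadoop/init.log"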
That's it for this post; there is too much content and the web page is already getting laggy... I'll continue in a new one.