
Building a Fully Distributed Hadoop Cluster with Docker on CentOS

I. Environment

1. Linux

  [root@localhost docker-hadoop]# uname -a
  Linux localhost.localdomain 3.10.0-957.10.1.el7.x86_64 #1 SMP Mon Mar 18 15:06:45 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
  [root@localhost docker-hadoop]# cat /etc/centos-release
  CentOS Linux release 7.6.1810 (Core)

2. Docker

  [root@localhost docker-hadoop]# docker version
  Client:
   Version:           18.09.4
   API version:       1.39
   Go version:        go1.10.8
   Git commit:        d14af54266
   Built:             Wed Mar 27 18:34:51 2019
   OS/Arch:           linux/amd64
   Experimental:      false

  Server: Docker Engine - Community
   Engine:
    Version:          18.09.4
    API version:      1.39 (minimum version 1.12)
    Go version:       go1.10.8
    Git commit:       d14af54
    Built:            Wed Mar 27 18:04:46 2019
    OS/Arch:          linux/amd64
    Experimental:     false
  [root@localhost docker-hadoop]#

3. Java

  java version "1.8.0_101"
  Java(TM) SE Runtime Environment (build 1.8.0_101-b13)
  Java HotSpot(TM) 64-Bit Server VM (build 25.101-b13, mixed mode)

4. Hadoop

  [root@0a360e41e726 /]# hadoop version
  Hadoop 2.7.3
  Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r baa91f7c6bc9cb92be5982de4719c1c8af91ccff
  Compiled by root on 2016-08-18T01:41Z
  Compiled with protoc 2.5.0
  From source with checksum 2e4ce5f957ea4db193bce3734ff29ff4
  This command was run using /usr/local/hadoop/share/hadoop/common/hadoop-common-2.7.3.jar
  [root@0a360e41e726 /]#

II. Cluster Layout

Three machines, one master and two slaves:

  Hostname: hadoop2, IP address: 172.19.0.2 (master)
  Hostname: hadoop3, IP address: 172.19.0.3 (slave)
  Hostname: hadoop4, IP address: 172.19.0.4 (slave)
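These hostname-to-IP mappings are later injected into each container's /etc/hosts via --add-host flags (section III), so inside every container the three nodes resolve as:

  172.19.0.2  hadoop2
  172.19.0.3  hadoop3
  172.19.0.4  hadoop4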

III. Building the Images

1. Building the centos-ssh image

Note: Docker Hub already provides an image with the SSH service installed, komukomo/centos-sshd, so I pull that directly instead of building my own.

 

Run a container:

  docker run -itd --name centos-ssh komukomo/centos-sshd /bin/bash

Start the SSH service:

  [root@e75b27396db3 /]# /usr/sbin/sshd
  [root@e75b27396db3 /]#

Check whether the SSH service is listening:

  [root@e75b27396db3 /]# netstat -antp | grep sshd
  tcp   0   0 0.0.0.0:22   0.0.0.0:*   LISTEN   19/sshd
  tcp   0   0 :::22        :::*        LISTEN   19/sshd
  [root@e75b27396db3 /]#

Set up passwordless login:

  ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
  cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
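If sshd still asks for a password after this, it is usually a permissions problem: with StrictModes enabled (the default), sshd ignores keys whose directory or file is group- or world-writable. A quick fix:

  chmod 700 ~/.ssh
  chmod 600 ~/.ssh/authorized_keys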

Verify that passwordless login works (the password is root):

  [root@e75b27396db3 /]# ssh root@localhost
  The authenticity of host 'localhost (127.0.0.1)' can't be established.
  RSA key fingerprint is e5:ab:55:1b:73:c4:51:33:c6:3b:45:a0:b2:34:e7:74.
  Are you sure you want to continue connecting (yes/no)? yes
  Warning: Permanently added 'localhost' (RSA) to the list of known hosts.
  [root@e75b27396db3 ~]# ssh root@localhost
  Last login: Fri Apr 12 07:20:00 2019 from localhost
  [root@e75b27396db3 ~]# exit
  logout
  Connection to localhost closed.
  [root@e75b27396db3 ~]#

Commit this container as a centos-ssh image:

  [root@localhost ~]# docker commit centos-ssh centos-ssh
  sha256:97ef260595ae36d81c9f26b6ed0ed5d13502b7699e079554928ec8cc6fc1b159
  [root@localhost ~]# docker images
  REPOSITORY             TAG      IMAGE ID       CREATED             SIZE
  centos-ssh             latest   97ef260595ae   4 seconds ago       410MB
  centos7-ssh            latest   4e4796f7e8ef   About an hour ago   289MB
  <none>                 <none>   e08ee32cfd93   About an hour ago   289MB
  <none>                 <none>   3cff40060339   About an hour ago   289MB
  centos-tools           latest   bb563754f296   4 hours ago         391MB
  jquery134/mycentos     v1.0     8c63d14863d3   4 days ago          354MB
  tomcat                 latest   f1332ae3f570   13 days ago         463MB
  nginx                  latest   2bcb04bdb83f   2 weeks ago         109MB
  centos                 latest   9f38484d220f   4 weeks ago         202MB
  ubuntu                 latest   94e814e2efa8   4 weeks ago         88.9MB
  jdeathe/centos-ssh     latest   f68976440f24   6 weeks ago         226MB
  komukomo/centos-sshd   latest   d969d0bdc7ac   2 years ago         289MB
  [root@localhost ~]#

2. Building the hadoop image from centos-ssh

Directory layout on the host at build time: the build context contains the Dockerfile together with jdk-8u101-linux-x64.tar.gz and hadoop-2.7.3.tar.gz (both referenced by the ADD instructions below).

Dockerfile contents:

  FROM centos-ssh
  ADD jdk-8u101-linux-x64.tar.gz /usr/local/
  RUN mv /usr/local/jdk1.8.0_101 /usr/local/jdk1.8
  ENV JAVA_HOME /usr/local/jdk1.8
  ENV PATH $JAVA_HOME/bin:$PATH
  ADD hadoop-2.7.3.tar.gz /usr/local
  RUN mv /usr/local/hadoop-2.7.3 /usr/local/hadoop
  ENV HADOOP_HOME /usr/local/hadoop
  ENV PATH $HADOOP_HOME/bin:$PATH
  RUN yum install -y which sudo

Build the hadoop image:

  [root@localhost Hadoop]# docker build -t="hadoop" .
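A quick sanity check of the freshly built image (a minimal sketch; it just runs the two version commands in throwaway containers, relying on the PATH set in the Dockerfile):

  docker run --rm hadoop java -version
  docker run --rm hadoop hadoop version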

3. Creating three containers from the hadoop image and starting the SSH service in each

Create a custom network:

  [root@localhost Hadoop]# docker network create --subnet=172.19.0.0/16 mynetwork
  522bc0ed2d6048e5f303245d0c85ae36e62d0735f1d2e9ca5c73a11f103c1954
  [root@localhost Hadoop]# docker network ls
  NETWORK ID     NAME        DRIVER   SCOPE
  76cf156331a1   bridge      bridge   local
  2059529b97fd   bridge1     bridge   local
  2c43b9b438d5   host        host     local
  522bc0ed2d60   mynetwork   bridge   local
  6c75caf7d102   none        null     local
  [root@localhost Hadoop]#

Run the containers:

  [root@localhost Hadoop]# docker run -itd --name hadoop2 --net mynetwork --ip 172.19.0.2 --add-host hadoop2:172.19.0.2 --add-host hadoop3:172.19.0.3 --add-host hadoop4:172.19.0.4 -d -p 8088:8088 -p 9000:9000 -p 50070:50070 -p 9001:9001 -p 8030:8030 -p 8031:8031 -p 8032:8032 -p 8033:8033 -p 10020:10020 -p 19888:19888 jquery134/hadoop /bin/bash
  10c1a242c22efd92d8f9007f4f51f5ff6c9e4511daa6d5fd29152ab1ac43c0e5
  [root@localhost Hadoop]# docker run -itd --name hadoop3 --net mynetwork --ip 172.19.0.3 --add-host hadoop2:172.19.0.2 --add-host hadoop3:172.19.0.3 --add-host hadoop4:172.19.0.4 -d -P jquery134/hadoop /bin/bash
  8276aa51a9584ba23aab9cbcc069a157ea34f95cb21eba67189f1bc7347cca81
  [root@localhost Hadoop]# docker run -itd --name hadoop4 --net mynetwork --ip 172.19.0.4 --add-host hadoop2:172.19.0.2 --add-host hadoop3:172.19.0.3 --add-host hadoop4:172.19.0.4 -d -P jquery134/hadoop /bin/bash
  ea17f5a50d5a1c5e2effe26c84e93387440debb91316026a9c7f5dc3700cca56
  [root@localhost Hadoop]#

Start the SSH service in each of the three containers:

  [root@localhost Hadoop]# docker exec -d hadoop2 /usr/sbin/sshd
  [root@localhost Hadoop]# docker exec -d hadoop3 /usr/sbin/sshd
  [root@localhost Hadoop]# docker exec -d hadoop4 /usr/sbin/sshd
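Equivalently, as a one-line convenience sketch:

  for c in hadoop2 hadoop3 hadoop4; do docker exec -d "$c" /usr/sbin/sshd; done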

Verify the environment (Java, Hadoop, SSH, network connectivity, passwordless login):

  [root@10c1a242c22e /]# java -version
  java version "1.8.0_101"
  Java(TM) SE Runtime Environment (build 1.8.0_101-b13)
  Java HotSpot(TM) 64-Bit Server VM (build 25.101-b13, mixed mode)
  [root@10c1a242c22e /]# javac -version
  javac 1.8.0_101
  [root@10c1a242c22e /]# ssh root@172.19.0.3
  Last login: Fri Apr 12 08:07:46 2019 from hadoop2
  [root@8276aa51a958 ~]# exit
  logout
  Connection to 172.19.0.3 closed.
  [root@10c1a242c22e /]# hadoop version
  Hadoop 2.7.3
  Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r baa91f7c6bc9cb92be5982de4719c1c8af91ccff
  Compiled by root on 2016-08-18T01:41Z
  Compiled with protoc 2.5.0
  From source with checksum 2e4ce5f957ea4db193bce3734ff29ff4
  This command was run using /usr/local/hadoop/share/hadoop/common/hadoop-common-2.7.3.jar
  [root@10c1a242c22e /]# ping hadoop3
  PING hadoop3 (172.19.0.3) 56(84) bytes of data.
  64 bytes from hadoop3 (172.19.0.3): icmp_seq=1 ttl=64 time=0.248 ms
  64 bytes from hadoop3 (172.19.0.3): icmp_seq=2 ttl=64 time=0.145 ms
  ^C
  --- hadoop3 ping statistics ---
  2 packets transmitted, 2 received, 0% packet loss, time 1936ms
  rtt min/avg/max/mdev = 0.145/0.196/0.248/0.053 ms
  [root@10c1a242c22e /]# ping hadoop4
  PING hadoop4 (172.19.0.4) 56(84) bytes of data.
  64 bytes from hadoop4 (172.19.0.4): icmp_seq=1 ttl=64 time=0.233 ms
  64 bytes from hadoop4 (172.19.0.4): icmp_seq=2 ttl=64 time=0.095 ms
  ^C
  --- hadoop4 ping statistics ---
  2 packets transmitted, 2 received, 0% packet loss, time 1754ms
  rtt min/avg/max/mdev = 0.095/0.164/0.233/0.069 ms
  [root@10c1a242c22e /]#

 

4. Configuring Hadoop

In /usr/local/hadoop/etc/hadoop/hadoop-env.sh, set JAVA_HOME:

 export JAVA_HOME=/usr/local/jdk1.8
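To apply this edit non-interactively (a sketch, assuming the stock template's `export JAVA_HOME=...` line is present, which it is in the 2.7.3 distribution):

  sed -i 's|^export JAVA_HOME=.*|export JAVA_HOME=/usr/local/jdk1.8|' /usr/local/hadoop/etc/hadoop/hadoop-env.sh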

 

 core-site.xml

  <?xml version="1.0" encoding="UTF-8"?>
  <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
  <!--
    Licensed under the Apache License, Version 2.0 (the "License");
    you may not use this file except in compliance with the License.
    You may obtain a copy of the License at

      http://www.apache.org/licenses/LICENSE-2.0

    Unless required by applicable law or agreed to in writing, software
    distributed under the License is distributed on an "AS IS" BASIS,
    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and
    limitations under the License. See accompanying LICENSE file.
  -->
  <!-- Put site-specific property overrides in this file. -->
  <configuration>
    <property>
      <name>fs.default.name</name>
      <value>hdfs://hadoop2/</value>
    </property>
    <property>
      <name>io.file.buffer.size</name>
      <value>131072</value>
    </property>
    <property>
      <name>hadoop.tmp.dir</name>
      <value>/home/hadoop/tmp</value>
      <description>A base for other temporary directories.</description>
    </property>
  </configuration>
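A side note: fs.default.name is the old pre-2.x name for this property; it still works in Hadoop 2.7.3 but is deprecated in favor of fs.defaultFS, which you will see in newer write-ups.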


hdfs-site.xml

  <?xml version="1.0" encoding="UTF-8"?>
  <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
  <!--
    Licensed under the Apache License, Version 2.0 (the "License");
    you may not use this file except in compliance with the License.
    You may obtain a copy of the License at

      http://www.apache.org/licenses/LICENSE-2.0

    Unless required by applicable law or agreed to in writing, software
    distributed under the License is distributed on an "AS IS" BASIS,
    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and
    limitations under the License. See accompanying LICENSE file.
  -->
  <!-- Put site-specific property overrides in this file. -->
  <configuration>
    <property>
      <name>dfs.namenode.secondary.http-address</name>
      <value>hadoop2:9001</value>
      <description>View HDFS status through the web interface</description>
    </property>
    <property>
      <name>dfs.namenode.name.dir</name>
      <value>/home/hadoop/dfs/name</value>
    </property>
    <property>
      <name>dfs.datanode.data.dir</name>
      <value>/home/hadoop/dfs/data</value>
    </property>
    <property>
      <name>dfs.replication</name>
      <value>2</value>
      <description>Each block has 2 replicas</description>
    </property>
    <property>
      <name>dfs.webhdfs.enabled</name>
      <value>true</value>
    </property>
  </configuration>

mapred-site.xml

  <?xml version="1.0"?>
  <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
  <!--
    Licensed under the Apache License, Version 2.0 (the "License");
    you may not use this file except in compliance with the License.
    You may obtain a copy of the License at

      http://www.apache.org/licenses/LICENSE-2.0

    Unless required by applicable law or agreed to in writing, software
    distributed under the License is distributed on an "AS IS" BASIS,
    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and
    limitations under the License. See accompanying LICENSE file.
  -->
  <!-- Put site-specific property overrides in this file. -->
  <configuration>
    <property>
      <name>mapreduce.framework.name</name>
      <value>yarn</value>
    </property>
    <property>
      <name>mapreduce.jobhistory.address</name>
      <value>hadoop2:10020</value>
    </property>
    <property>
      <name>mapreduce.jobhistory.webapp.address</name>
      <value>hadoop2:19888</value>
    </property>
  </configuration>
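Note that mapred-site.xml does not exist in a fresh Hadoop 2.7.3 installation; create it from the bundled template before editing:

  cp /usr/local/hadoop/etc/hadoop/mapred-site.xml.template /usr/local/hadoop/etc/hadoop/mapred-site.xml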

yarn-site.xml

  <?xml version="1.0"?>
  <!--
    Licensed under the Apache License, Version 2.0 (the "License");
    you may not use this file except in compliance with the License.
    You may obtain a copy of the License at

      http://www.apache.org/licenses/LICENSE-2.0

    Unless required by applicable law or agreed to in writing, software
    distributed under the License is distributed on an "AS IS" BASIS,
    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and
    limitations under the License. See accompanying LICENSE file.
  -->
  <configuration>
    <!-- Site specific YARN configuration properties -->
    <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle</value>
    </property>
    <property>
      <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
      <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
      <name>yarn.resourcemanager.address</name>
      <value>hadoop2:8032</value>
    </property>
    <property>
      <name>yarn.resourcemanager.scheduler.address</name>
      <value>hadoop2:8030</value>
    </property>
    <property>
      <name>yarn.resourcemanager.resource-tracker.address</name>
      <value>hadoop2:8031</value>
    </property>
    <property>
      <name>yarn.resourcemanager.admin.address</name>
      <value>hadoop2:8033</value>
    </property>
    <property>
      <name>yarn.resourcemanager.webapp.address</name>
      <value>hadoop2:8088</value>
    </property>
    <property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <value>1024</value>
    </property>
    <property>
      <name>yarn.nodemanager.resource.cpu-vcores</name>
      <value>1</value>
    </property>
  </configuration>

slaves

  hadoop3
  hadoop4

Copy the configured Hadoop directory to hadoop3 and hadoop4:

  scp -rq /usr/local/hadoop hadoop3:/usr/local
  scp -rq /usr/local/hadoop hadoop4:/usr/local

Format the NameNode:

  bin/hdfs namenode -format
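One caveat worth noting: if you ever re-format the NameNode later, clear the old metadata and data directories on all three nodes first, otherwise the DataNodes will refuse to register because of a clusterID mismatch. A sketch, using the paths configured above:

  rm -rf /home/hadoop/dfs/name/* /home/hadoop/dfs/data/* /home/hadoop/tmp/*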

On the master, run the start-all.sh script to start the cluster:
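The script lives under $HADOOP_HOME/sbin; in Hadoop 2.x it is a deprecated convenience wrapper that simply invokes start-dfs.sh followed by start-yarn.sh:

  cd /usr/local/hadoop
  sbin/start-all.sh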

Check which daemons came up on each node:

On hadoop2:

  [root@10c1a242c22e bin]# jps
  643 ResourceManager
  310 NameNode
  492 SecondaryNameNode
  956 Jps
  [root@10c1a242c22e bin]#

On hadoop3:

  [root@8276aa51a958 /]# jps
  369 Jps
  153 DataNode
  250 NodeManager
  [root@8276aa51a958 /]#

On hadoop4:

  [root@ea17f5a50d5a /]# jps
  144 NodeManager
  263 Jps
  47 DataNode
  [root@ea17f5a50d5a /]#
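Beyond jps, a quick way to confirm that both DataNodes registered is to run, on the master:

  hdfs dfsadmin -report

The NameNode web UI is also reachable from the host at http://<host-ip>:50070 and the YARN ResourceManager UI at http://<host-ip>:8088, thanks to the port mappings passed to docker run above.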

 

Note: you can commit hadoop2, hadoop3, and hadoop4 as images, which makes it easy to recreate the containers later with different port mappings and the like:
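For example (a sketch, using docker commit as shown earlier; the image names hadoop2, hadoop3, and hadoop4 match the docker run commands below):

  docker stop hadoop2 hadoop3 hadoop4
  docker commit hadoop2 hadoop2
  docker commit hadoop3 hadoop3
  docker commit hadoop4 hadoop4
  docker rm hadoop2 hadoop3 hadoop4

Then recreate the containers from the committed images: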

  [root@localhost Hadoop]# docker run -itd --name hadoop2 --net mynetwork --ip 172.19.0.2 --add-host hadoop2:172.19.0.2 --add-host hadoop3:172.19.0.3 --add-host hadoop4:172.19.0.4 -d -p 8088:8088 -p 50070:50070 -p 19888:19888 hadoop2 /bin/bash
  10c1a242c22efd92d8f9007f4f51f5ff6c9e4511daa6d5fd29152ab1ac43c0e5
  [root@localhost Hadoop]# docker run -itd --name hadoop3 --net mynetwork --ip 172.19.0.3 --add-host hadoop2:172.19.0.2 --add-host hadoop3:172.19.0.3 --add-host hadoop4:172.19.0.4 -d -P hadoop3 /bin/bash
  8276aa51a9584ba23aab9cbcc069a157ea34f95cb21eba67189f1bc7347cca81
  [root@localhost Hadoop]# docker run -itd --name hadoop4 --net mynetwork --ip 172.19.0.4 --add-host hadoop2:172.19.0.2 --add-host hadoop3:172.19.0.3 --add-host hadoop4:172.19.0.4 -d -P hadoop4 /bin/bash
  ea17f5a50d5a1c5e2effe26c84e93387440debb91316026a9c7f5dc3700cca56
  [root@localhost Hadoop]#

 
