| Item | Configuration |
| --- | --- |
| Server cluster | Single node; minimum machine spec: dual-core CPU, 8 GB RAM, 100 GB disk |
| Operating environment | CentOS 7.4 |
| Services and components | Installed as required by each experiment |
Check the node's current IP address:

```
[root@localhost ~]# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:b7:35:be brd ff:ff:ff:ff:ff:ff
    inet 192.168.47.140/24 brd 192.168.47.255 scope global dynamic ens33
       valid_lft 1460sec preferred_lft 1460sec
    inet6 fe80::29cc:5498:c98a:af4b/64 scope link
       valid_lft forever preferred_lft forever
```
```
[root@localhost ~]# hostnamectl set-hostname master
[root@localhost ~]# bash
[root@master ~]# hostname
master
[root@master ~]# vi /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.47.140 master
```
```
[root@master ~]# systemctl status sshd
● sshd.service - OpenSSH server daemon
   Loaded: loaded (/usr/lib/systemd/system/sshd.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2021-12-20 08:22:16 CST; 10 months 21 days ago
     Docs: man:sshd(8)
           man:sshd_config(5)
 Main PID: 1048 (sshd)
   CGroup: /system.slice/sshd.service
           └─1048 /usr/sbin/sshd -D
```
```
[root@master ~]# systemctl stop firewalld
```

After stopping the firewall, check its status to confirm:

```
[root@master ~]# systemctl status firewalld
```

Seeing `inactive (dead)` means the firewall is now stopped. However, with only this setting, the firewall will start again if the Linux system reboots. Run the following command to disable it permanently:

```
[root@master ~]# systemctl disable firewalld
Removed symlink /etc/systemd/system/multi-user.target.wants/firewalld.service.
Removed symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
```
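As a quick extra check (not part of the original transcript), you can confirm in one step that firewalld is both stopped now and will stay off after a reboot:

```
systemctl is-active firewalld    # expected: inactive
systemctl is-enabled firewalld   # expected: disabled
```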
Create the hadoop user and set its password:

```
[root@master ~]# useradd hadoop
[root@master ~]# echo "1" | passwd --stdin hadoop
Changing password for user hadoop.
passwd: all authentication tokens updated successfully.
```
Check which Java packages are already installed:

```
[root@master ~]# rpm -qa | grep java
javapackages-tools-3.4.1-11.el7.noarch
java-1.8.0-openjdk-1.8.0.352.b08-2.el7_9.x86_64
tzdata-java-2022e-1.el7.noarch
python-javapackages-3.4.1-11.el7.noarch
java-1.8.0-openjdk-headless-1.8.0.352.b08-2.el7_9.x86_64
```

Remove these packages with the following commands:

```
[root@master ~]# rpm -e --nodeps javapackages-tools-3.4.1-11.el7.noarch
[root@master ~]# rpm -e --nodeps java-1.8.0-openjdk-1.8.0.352.b08-2.el7_9.x86_64
[root@master ~]# rpm -e --nodeps tzdata-java-2022e-1.el7.noarch
[root@master ~]# rpm -e --nodeps python-javapackages-3.4.1-11.el7.noarch
[root@master ~]# rpm -e --nodeps java-1.8.0-openjdk-headless-1.8.0.352.b08-2.el7_9.x86_64
[root@master ~]# rpm -qa | grep java
```

Check the removal result, then type `java -version` again; the following output confirms the removal succeeded:

```
[root@master ~]# java -version
bash: java: command not found
```
Configure the Java environment variables (this assumes the JDK archive has already been unpacked to /usr/local/src/jdk1.8.0_152, as the later directory listing shows):

```
[root@master ~]# vi /etc/profile
```

Add the following two lines at the end of the file:

```
export JAVA_HOME=/usr/local/src/jdk1.8.0_152
export PATH=$PATH:$JAVA_HOME/bin
```

Run `source` to apply the settings:

```
[root@master ~]# source /etc/profile
```

Check that Java is available:

```
[root@master ~]# echo $JAVA_HOME
/usr/local/src/jdk1.8.0_152
[root@master ~]# java -version
java version "1.8.0_152"
Java(TM) SE Runtime Environment (build 1.8.0_152-b16)
Java HotSpot(TM) 64-Bit Server VM (build 25.152-b16, mixed mode)
```

If the Java version is displayed correctly, the JDK is installed and configured successfully.
Install Hadoop by extracting the archive to the /usr/local/src/ directory:

```
[root@master ~]# tar -zxvf /opt/software/hadoop-2.7.1.tar.gz -C /usr/local/src/
[root@master ~]# ll /usr/local/src/
total 0
drwxr-xr-x. 9 10021 10021 149 Jun 29 2015 hadoop-2.7.1
drwxr-xr-x. 8    10   143 255 Sep 14 2017 jdk1.8.0_152
```

List the Hadoop directory to see its contents:

```
[root@master ~]# ll /usr/local/src/hadoop-2.7.1/
total 28
drwxr-xr-x. 2 10021 10021   194 Jun 29 2015 bin
drwxr-xr-x. 3 10021 10021    20 Jun 29 2015 etc
drwxr-xr-x. 2 10021 10021   106 Jun 29 2015 include
drwxr-xr-x. 3 10021 10021    20 Jun 29 2015 lib
drwxr-xr-x. 2 10021 10021   239 Jun 29 2015 libexec
-rw-r--r--. 1 10021 10021 15429 Jun 29 2015 LICENSE.txt
-rw-r--r--. 1 10021 10021   101 Jun 29 2015 NOTICE.txt
-rw-r--r--. 1 10021 10021  1366 Jun 29 2015 README.txt
drwxr-xr-x. 2 10021 10021  4096 Jun 29 2015 sbin
drwxr-xr-x. 4 10021 10021    31 Jun 29 2015 share
```
Directory overview:

- bin: executables and management tools for Hadoop, HDFS, YARN, and MapReduce.
- etc: Hadoop configuration files.
- include: header files, similar to C headers.
- lib: native library files, which support compressing and decompressing data.
- libexec: shell scripts and helper libraries used internally by the Hadoop launcher scripts.
- sbin: commands for starting and stopping the Hadoop cluster.
- share: documentation, examples, and dependency JARs.
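Before any environment variables are configured, the bundled launcher can already be run by absolute path to confirm the archive unpacked intact; a quick optional check, not in the original walkthrough:

```
/usr/local/src/hadoop-2.7.1/bin/hadoop version
# The first line of output should read: Hadoop 2.7.1
```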
Similar to the Java environment variables, edit the /etc/profile file:

```
[root@master ~]# vi /etc/profile
```

Add the following two lines at the end of the file:

```
export HADOOP_HOME=/usr/local/src/hadoop-2.7.1
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
```

Run `source` to apply the settings:

```
[root@master ~]# source /etc/profile
```

Check that the settings took effect:

```
[root@master ~]# hadoop
Usage: hadoop [--config confdir] [COMMAND | CLASSNAME]
  CLASSNAME            run the class named CLASSNAME
 or
  where COMMAND is one of:
  fs                   run a generic filesystem user client
  version              print the version
  jar <jar>            run a jar file
                       note: please use "yarn jar" to launch
                             YARN applications, not this command.
  checknative [-a|-h]  check native hadoop and compression libraries availability
  distcp <srcurl> <desturl> copy file or directories recursively
  archive -archiveName NAME -p <parent path> <src>* <dest> create a hadoop archive
  classpath            prints the class path needed to get the
                       Hadoop jar and the required libraries
  credential           interact with credential providers
  daemonlog            get/set the log level for each daemon
  trace                view and modify Hadoop tracing settings

Most commands print help when invoked w/o parameters.
[root@master ~]#
```

If the Hadoop help text above appears, Hadoop is installed.
```
[root@master ~]# chown -R hadoop:hadoop /usr/local/src/
[root@master ~]# ll /usr/local/src/
total 0
drwxr-xr-x. 9 hadoop hadoop 149 Jun 29 2015 hadoop-2.7.1
drwxr-xr-x. 8 hadoop hadoop 255 Sep 14 2017 jdk1.8.0_152
```

The owner of the /usr/local/src directory has been changed to hadoop.
```
[root@master ~]# cd /usr/local/src/hadoop-2.7.1/
[root@master hadoop-2.7.1]# ls
bin etc include lib libexec LICENSE.txt NOTICE.txt README.txt sbin share
[root@master hadoop-2.7.1]# vi etc/hadoop/hadoop-env.sh
```

Find the `export JAVA_HOME` line in the file and change it to:

```
export JAVA_HOME=/usr/local/src/jdk1.8.0_152
```
Switch to the hadoop user:

```
[root@master hadoop-2.7.1]# su - hadoop
[hadoop@master ~]$ id
uid=1001(hadoop) gid=1001(hadoop) groups=1001(hadoop) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
```

Store the input data in ~/input (the input directory under the hadoop user's home directory):

```
[hadoop@master ~]$ mkdir ~/input
[hadoop@master ~]$ ls
input
```

Create the data file data.txt and fill it with the data to be tested:

```
[hadoop@master ~]$ vi input/data.txt
```

Enter the following content, then save and exit:

```
Hello World
Hello Hadoop
Hello Husan
```

Run the wordcount example:

```
[hadoop@master ~]$ hadoop jar /usr/local/src/hadoop-2.7.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount ~/input/data.txt ~/output
```

The result is saved in ~/output (note: the output directory must not exist beforehand). After the command finishes, check the result:

```
[hadoop@master ~]$ ll output/
total 4
-rw-r--r--. 1 hadoop hadoop 33 Nov 10 23:50 part-r-00000
-rw-r--r--. 1 hadoop hadoop  0 Nov 10 23:50 _SUCCESS
```

The _SUCCESS file indicates that processing succeeded; the result itself is stored in the part-r-00000 file. View that file:

```
[hadoop@master ~]$ cat output/part-r-00000
Hadoop  1
Hello   3
Husan   1
World   1
```
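Because the job refuses to start if the output directory already exists (per the note above), re-running the example requires removing ~/output first. A minimal sketch using the same command as above:

```
rm -rf ~/output    # otherwise wordcount fails with "Output directory ... already exists"
hadoop jar /usr/local/src/hadoop-2.7.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount ~/input/data.txt ~/output
```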
Change the hostname on the slave1 machine:

```
[root@localhost ~]# hostnamectl set-hostname slave1
[root@localhost ~]# bash
[root@slave1 ~]#
```

Change the hostname on the slave2 machine:

```
[root@localhost ~]# hostnamectl set-hostname slave2
[root@localhost ~]# bash
[root@slave2 ~]#
```
The cluster network IP plan for this experiment (substitute your own hosts' IP addresses as needed):

- master: IP address 192.168.47.140, netmask 255.255.255.0
- slave1: IP address 192.168.47.141, netmask 255.255.255.0
- slave2: IP address 192.168.47.142, netmask 255.255.255.0

Since the Hadoop hostnames are master, slave1, and slave2, mapped to the addresses 192.168.47.140, 192.168.47.141, and 192.168.47.142, edit the hosts configuration file /etc/hosts on each machine. In the terminal, enter:

```
[root@master ~]# vi /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.47.140 master
192.168.47.141 slave1
192.168.47.142 slave2
```

```
[root@slave1 ~]# vi /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.47.140 master
192.168.47.141 slave1
192.168.47.142 slave2
```

```
[root@slave2 ~]# vi /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.47.140 master
192.168.47.141 slave1
192.168.47.142 slave2
```
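With all three hosts files in place, a short loop can verify that each hostname resolves and answers; this optional check (not in the original transcript) assumes ICMP traffic is allowed between the machines:

```
# Run on any node.
for h in master slave1 slave2; do
    ping -c 1 "$h" > /dev/null && echo "$h reachable" || echo "$h UNREACHABLE"
done
```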
```
[root@master ~]# rpm -qa | grep openssh
openssh-server-7.4p1-11.el7.x86_64
openssh-7.4p1-11.el7.x86_64
openssh-clients-7.4p1-11.el7.x86_64
[root@master ~]# rpm -qa | grep rsync
rsync-3.1.2-11.el7_9.x86_64
[root@master ~]# su - hadoop
[hadoop@master ~]$
[root@slave1 ~]# useradd hadoop
[root@slave1 ~]# su - hadoop
[hadoop@slave1 ~]$
[root@slave2 ~]# useradd hadoop
[root@slave2 ~]# su - hadoop
[hadoop@slave2 ~]$
```
```
# Generate a key pair on master
[hadoop@master ~]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
Created directory '/home/hadoop/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:LOwqw+EjBHJRh9U1GdRHfbhV5+5BX+/hOHTEatwIKdU hadoop@master
The key's randomart image is:
+---[RSA 2048]----+
| ..oo. o==...o+|
| . .. . o.oE+.=|
| . . o . *+|
|o . . . . o B.+|
|o. o S * =+|
| .. . . o +oo|
|.o . . o .o|
|. * . . |
| . +. |
+----[SHA256]-----+
```

```
# Generate a key pair on slave1
[hadoop@slave1 ~]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
Created directory '/home/hadoop/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:RhgNGuoa3uSrRMjhPtWA5NucyhbLr9NsEZ13i01LBaA hadoop@slave1
The key's randomart image is:
+---[RSA 2048]----+
| . . o+... |
|o .. o.o. . |
| +..oEo . . |
|+.=.+o o + |
|o*.*... S o |
|*oO. o + |
|.@oo. |
|o.o+. |
| o=o |
+----[SHA256]-----+
```

```
# Generate a key pair on slave2
[hadoop@slave2 ~]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:yjp6AQEu2RN81Uv6y40MI/1p5WKWbVeGfB8/KK6iPUA hadoop@slave2
The key's randomart image is:
+---[RSA 2048]----+
|.o. ... |
|.oo.. o |
|o.oo o . |
|. .. E. . |
| ... .S . . |
| oo+.. . o +. |
| o+* X +..o|
| o..o& =... .o|
| .o.o.=o+oo. .|
+----[SHA256]-----+
```
```
[hadoop@master ~]$ ls ~/.ssh/
id_rsa id_rsa.pub
```

```
# master
[hadoop@master ~]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[hadoop@master ~]$ ls ~/.ssh/
authorized_keys id_rsa id_rsa.pub

# slave1
[hadoop@slave1 ~]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[hadoop@slave1 ~]$ ls ~/.ssh/
authorized_keys id_rsa id_rsa.pub

# slave2
[hadoop@slave2 ~]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[hadoop@slave2 ~]$ ls ~/.ssh/
authorized_keys id_rsa id_rsa.pub
```

```
# master
[hadoop@master ~]$ chmod 600 ~/.ssh/authorized_keys
[hadoop@master ~]$ ll ~/.ssh/
total 12
-rw-------. 1 hadoop hadoop  395 Nov 14 16:18 authorized_keys
-rw-------. 1 hadoop hadoop 1679 Nov 14 16:14 id_rsa
-rw-r--r--. 1 hadoop hadoop  395 Nov 14 16:14 id_rsa.pub

# slave1
[hadoop@slave1 ~]$ chmod 600 ~/.ssh/authorized_keys
[hadoop@slave1 ~]$ ll ~/.ssh/
total 12
-rw-------. 1 hadoop hadoop  395 Nov 14 16:18 authorized_keys
-rw-------. 1 hadoop hadoop 1675 Nov 14 16:14 id_rsa
-rw-r--r--. 1 hadoop hadoop  395 Nov 14 16:14 id_rsa.pub

# slave2
[hadoop@slave2 ~]$ chmod 600 ~/.ssh/authorized_keys
[hadoop@slave2 ~]$ ll ~/.ssh/
total 12
-rw-------. 1 hadoop hadoop  395 Nov 14 16:19 authorized_keys
-rw-------. 1 hadoop hadoop 1679 Nov 14 16:15 id_rsa
-rw-r--r--. 1 hadoop hadoop  395 Nov 14 16:15 id_rsa.pub
```
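The manual cat-and-chmod steps above, and the scp-based key exchange that follows, can also be done with the standard ssh-copy-id helper, which appends the key and fixes permissions in one step. A sketch, assuming the hadoop user already exists on every node:

```
# Run as hadoop; repeat on each of the three nodes for a full mesh.
for h in master slave1 slave2; do
    ssh-copy-id "hadoop@$h"    # prompts for the hadoop password once per host
done
```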
```
[root@master ~]# systemctl restart sshd
[root@master ~]# su - hadoop
Last login: Mon Nov 14 16:11:14 CST 2022 on pts/1
[hadoop@master ~]$
[hadoop@master ~]$ ssh localhost
The authenticity of host 'localhost (::1)' can't be established.
ECDSA key fingerprint is SHA256:KvO9HlwdCTJLStOxZWN7qrfRr8FJvcEw2hzWAF9b3bQ.
ECDSA key fingerprint is MD5:07:91:56:9e:0b:55:05:05:58:02:15:5e:68:db:be:73.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
Last login: Mon Nov 14 16:28:30 2022
[hadoop@master ~]$
```
Logged in as the hadoop user, copy the key with the scp command:

```
[hadoop@master ~]$ scp ~/.ssh/id_rsa.pub hadoop@slave1:~/
hadoop@slave1's password:
id_rsa.pub                      100%  395   303.6KB/s   00:00
[hadoop@master ~]$ scp ~/.ssh/id_rsa.pub hadoop@slave2:~/
The authenticity of host 'slave2 (192.168.47.142)' can't be established.
ECDSA key fingerprint is SHA256:KvO9HlwdCTJLStOxZWN7qrfRr8FJvcEw2hzWAF9b3bQ.
ECDSA key fingerprint is MD5:07:91:56:9e:0b:55:05:05:58:02:15:5e:68:db:be:73.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'slave2,192.168.47.142' (ECDSA) to the list of known hosts.
hadoop@slave2's password:
id_rsa.pub                      100%  395   131.6KB/s   00:00
```

Log in to the slave1 and slave2 nodes as the hadoop user and run:

```
[hadoop@slave1 ~]$ cat ~/id_rsa.pub >> ~/.ssh/authorized_keys
[hadoop@slave2 ~]$ cat ~/id_rsa.pub >> ~/.ssh/authorized_keys
[hadoop@slave1 ~]$ rm -rf ~/id_rsa.pub
[hadoop@slave2 ~]$ rm -rf ~/id_rsa.pub
```
(1) Copy the slave1 node's public key to master:

```
[hadoop@slave1 ~]$ scp ~/.ssh/id_rsa.pub hadoop@master:~/
The authenticity of host 'master (192.168.47.140)' can't be established.
ECDSA key fingerprint is SHA256:KvO9HlwdCTJLStOxZWN7qrfRr8FJvcEw2hzWAF9b3bQ.
ECDSA key fingerprint is MD5:07:91:56:9e:0b:55:05:05:58:02:15:5e:68:db:be:73.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'master,192.168.47.140' (ECDSA) to the list of known hosts.
hadoop@master's password:
id_rsa.pub                      100%  395   317.8KB/s   00:00
[hadoop@slave1 ~]$
```

(2) On master, append the public key copied from the slave node to the authorized_keys file:

```
[hadoop@master ~]$ cat ~/id_rsa.pub >> ~/.ssh/authorized_keys
```

(3) On master, delete the id_rsa.pub file:

```
[hadoop@master ~]$ rm -rf ~/id_rsa.pub
```

(4) Copy the slave2 node's public key to master:

```
[hadoop@slave2 ~]$ scp ~/.ssh/id_rsa.pub hadoop@master:~/
The authenticity of host 'master (192.168.47.140)' can't be established.
ECDSA key fingerprint is SHA256:KvO9HlwdCTJLStOxZWN7qrfRr8FJvcEw2hzWAF9b3bQ.
ECDSA key fingerprint is MD5:07:91:56:9e:0b:55:05:05:58:02:15:5e:68:db:be:73.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'master,192.168.47.140' (ECDSA) to the list of known hosts.
hadoop@master's password:
id_rsa.pub                      100%  395   326.6KB/s   00:00
[hadoop@slave2 ~]$
```

(5) On master, append the public key copied from the slave node to the authorized_keys file:

```
[hadoop@master ~]$ cat ~/id_rsa.pub >> ~/.ssh/authorized_keys
```

(6) On master, delete the id_rsa.pub file:

```
[hadoop@master ~]$ rm -rf ~/id_rsa.pub
```
```
[hadoop@master ~]$ cat ~/.ssh/authorized_keys
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDzHmpOfy7nwV1X453YY0UOZNTppiPA9DI/vZWgWsK6hhw0pupzyxmG5LnNh7IhBlDCAKKmohOMUq9cKM3XMBq8R1f8ys8VOPlWSKYndGxu6mbTY8wdcPWvINlAvCf2GN6rE1QJXwBAYdvZ8n5UGWqbQ0zdqQG1uhix9FN327dCmUGozmCuCR/lY4utU3ltS3faAz7GHUCchpPTE6OopaAk9yH5ynl+Y7BCwAWblcwf4pYoGWvQ8kMJIIr+k6cZXabsdwa3Y29OODsOsh4EfTmQiQbjMKpLahVrJIiL8C/6vuDX8Fh3wvgkvFgrppfzsAYNpKro27JvVgRzdKg7+/BD hadoop@master
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDKUKduFzGYN41c0gFXdt3nALXhSqfgHgmZuSjJnIlpvtQQH1IYm2S50ticwk8fr2TL/lMC/THJbuP6xoT0ZlJBPkbcEBZwkTEdeb+0uvzUItx7viWb3oDs5s0UGtrQnrP70GszuNnitb+L+f6PRtUVVEYMKagyIpntfICAIP8kMRKL3qrwOJ1smtEjwURKbOMDOJHV/EiHP4l+VeVtrPnH6MG3tZbrTTCgFQijSo8Hb4RGFO4NxtSHPH74YMwZBREZ7DPeZMNjqpAttQUH0leM4Ji93RQkcFoy2nlZljhmKVKzdqazhjJ4DAgT3/FcRvF7YrULKxOHHYj/Jk0rrWwB hadoop@slave1
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDjlopSpw5GUvoOSiEMQG15MRUrNqsAfNlnB/TcwDh7Xu7R1qND+StCb7rFScYI+NcDD0JkMBeXZVbQA5T21LSZlmet/38xeJJy53Jx6X1bmf/XnYYf2nnUPRkAUtJeKNPDDA4TN1qnhvAdoSUZgr3uW0oV01jW5Ai7YFYu1aSHsocmDRKFW2P8kpJZ3ASC7r7+dWFzMjT5Lu3/bjhluAPJESwV48aU2+wftlT4oJSGTc9vb0HnBpLoZ/yfuAC1TKsccI9p2MnItUUbqI1/uVH2dgmeHwRVpqqc1Em9hcVh0Gs0vebIGPRNx5eHTf3aIrxR4eRFSwMgF0QkcFr/+yzp hadoop@slave2
[hadoop@master ~]$
```

The master node's authorized_keys file now contains the public keys of all three nodes: master, slave1, and slave2.
```
[hadoop@slave1 ~]$ cat ~/.ssh/authorized_keys
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDKUKduFzGYN41c0gFXdt3nALXhSqfgHgmZuSjJnIlpvtQQH1IYm2S50ticwk8fr2TL/lMC/THJbuP6xoT0ZlJBPkbcEBZwkTEdeb+0uvzUItx7viWb3oDs5s0UGtrQnrP70GszuNnitb+L+f6PRtUVVEYMKagyIpntfICAIP8kMRKL3qrwOJ1smtEjwURKbOMDOJHV/EiHP4l+VeVtrPnH6MG3tZbrTTCgFQijSo8Hb4RGFO4NxtSHPH74YMwZBREZ7DPeZMNjqpAttQUH0leM4Ji93RQkcFoy2nlZljhmKVKzdqazhjJ4DAgT3/FcRvF7YrULKxOHHYj/Jk0rrWwB hadoop@slave1
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDzHmpOfy7nwV1X453YY0UOZNTppiPA9DI/vZWgWsK6hhw0pupzyxmG5LnNh7IhBlDCAKKmohOMUq9cKM3XMBq8R1f8ys8VOPlWSKYndGxu6mbTY8wdcPWvINlAvCf2GN6rE1QJXwBAYdvZ8n5UGWqbQ0zdqQG1uhix9FN327dCmUGozmCuCR/lY4utU3ltS3faAz7GHUCchpPTE6OopaAk9yH5ynl+Y7BCwAWblcwf4pYoGWvQ8kMJIIr+k6cZXabsdwa3Y29OODsOsh4EfTmQiQbjMKpLahVrJIiL8C/6vuDX8Fh3wvgkvFgrppfzsAYNpKro27JvVgRzdKg7+/BD hadoop@master
```

```
[hadoop@slave2 ~]$ cat ~/.ssh/authorized_keys
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDjlopSpw5GUvoOSiEMQG15MRUrNqsAfNlnB/TcwDh7Xu7R1qND+StCb7rFScYI+NcDD0JkMBeXZVbQA5T21LSZlmet/38xeJJy53Jx6X1bmf/XnYYf2nnUPRkAUtJeKNPDDA4TN1qnhvAdoSUZgr3uW0oV01jW5Ai7YFYu1aSHsocmDRKFW2P8kpJZ3ASC7r7+dWFzMjT5Lu3/bjhluAPJESwV48aU2+wftlT4oJSGTc9vb0HnBpLoZ/yfuAC1TKsccI9p2MnItUUbqI1/uVH2dgmeHwRVpqqc1Em9hcVh0Gs0vebIGPRNx5eHTf3aIrxR4eRFSwMgF0QkcFr/+yzp hadoop@slave2
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDzHmpOfy7nwV1X453YY0UOZNTppiPA9DI/vZWgWsK6hhw0pupzyxmG5LnNh7IhBlDCAKKmohOMUq9cKM3XMBq8R1f8ys8VOPlWSKYndGxu6mbTY8wdcPWvINlAvCf2GN6rE1QJXwBAYdvZ8n5UGWqbQ0zdqQG1uhix9FN327dCmUGozmCuCR/lY4utU3ltS3faAz7GHUCchpPTE6OopaAk9yH5ynl+Y7BCwAWblcwf4pYoGWvQ8kMJIIr+k6cZXabsdwa3Y29OODsOsh4EfTmQiQbjMKpLahVrJIiL8C/6vuDX8Fh3wvgkvFgrppfzsAYNpKro27JvVgRzdKg7+/BD hadoop@master
```

Each slave node's authorized_keys file contains two public keys: master's and the slave's own.
Log in to the master node as the hadoop user and SSH to the slave1 and slave2 nodes. You should observe that no password is required:

```
[hadoop@master ~]$ ssh slave1
Last login: Mon Nov 14 16:34:56 2022
[hadoop@slave1 ~]$
[hadoop@master ~]$ ssh slave2
Last login: Mon Nov 14 16:49:34 2022 from 192.168.47.140
[hadoop@slave2 ~]$
[hadoop@slave1 ~]$ ssh master
Last login: Mon Nov 14 16:30:45 2022 from ::1
[hadoop@master ~]$
[hadoop@slave2 ~]$ ssh master
Last login: Mon Nov 14 16:50:49 2022 from 192.168.47.141
[hadoop@master ~]$
```
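A compact way to confirm the whole passwordless mesh is to run a remote command against every node and verify that no password prompt appears; an extra check not in the original transcript:

```
# Run as hadoop on master; each line should print a hostname immediately.
for h in master slave1 slave2; do
    ssh "$h" hostname
done
```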
```
[root@master ~]# cd /usr/local/src/
[root@master src]# ls
hadoop-2.7.1 jdk1.8.0_152
[root@master src]# scp -r jdk1.8.0_152 root@slave1:/usr/local/src/
[root@master src]# scp -r jdk1.8.0_152 root@slave2:/usr/local/src/
```

```
# slave1
[root@slave1 ~]# ls /usr/local/src/
jdk1.8.0_152
[root@slave1 ~]# vi /etc/profile    # append the following two lines at the end of this file
export JAVA_HOME=/usr/local/src/jdk1.8.0_152
export PATH=$PATH:$JAVA_HOME/bin
[root@slave1 ~]# source /etc/profile
[root@slave1 ~]# java -version
java version "1.8.0_152"
Java(TM) SE Runtime Environment (build 1.8.0_152-b16)
Java HotSpot(TM) 64-Bit Server VM (build 25.152-b16, mixed mode)
```

```
# slave2
[root@slave2 ~]# ls /usr/local/src/
jdk1.8.0_152
[root@slave2 ~]# vi /etc/profile    # append the following two lines at the end of this file
export JAVA_HOME=/usr/local/src/jdk1.8.0_152
export PATH=$PATH:$JAVA_HOME/bin
[root@slave2 ~]# source /etc/profile
[root@slave2 ~]# java -version
java version "1.8.0_152"
Java(TM) SE Runtime Environment (build 1.8.0_152-b16)
Java HotSpot(TM) 64-Bit Server VM (build 25.152-b16, mixed mode)
```
1. Rename the hadoop-2.7.1 folder to hadoop

```
[root@master ~]# cd /usr/local/src/
[root@master src]# mv hadoop-2.7.1 hadoop
[root@master src]# ls
hadoop jdk1.8.0_152
```

2. Configure the Hadoop environment variables

```
[root@master src]# yum install -y vim
[root@master src]# vim /etc/profile
[root@master src]# tail -n 4 /etc/profile
export JAVA_HOME=/usr/local/src/jdk1.8.0_152
export PATH=$PATH:$JAVA_HOME/bin
export HADOOP_HOME=/usr/local/src/hadoop
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
```

3. Apply the configured Hadoop environment variables

```
[root@master src]# su - hadoop
Last login: Mon Feb 28 15:55:37 CST 2022 from 192.168.41.143 on pts/1
[hadoop@master ~]$ source /etc/profile
[hadoop@master ~]$ exit
logout
```
4. Edit the hadoop-env.sh configuration file

```
[root@master src]# cd /usr/local/src/hadoop/etc/hadoop/
[root@master hadoop]# vim hadoop-env.sh    # change the following setting
export JAVA_HOME=/usr/local/src/jdk1.8.0_152
```

Edit hdfs-site.xml:

```
[root@master hadoop]# vim hdfs-site.xml    # add the following content
[root@master hadoop]# tail -n 14 hdfs-site.xml
<configuration>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/local/src/hadoop/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/local/src/hadoop/dfs/data</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
</configuration>
```
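Note that dfs.replication is set to 3 while this cluster has only two DataNodes, so every block will later be reported as under-replicated (visible in the dfsadmin report further below); with two slaves, a value of 2 would avoid that. Once the configuration is in place, the effective values can be read back with hdfs getconf; an optional verification, assuming $HADOOP_HOME/bin is on PATH:

```
hdfs getconf -confKey dfs.replication          # expected: 3 as written above
hdfs getconf -confKey dfs.namenode.name.dir    # expected: file:/usr/local/src/hadoop/dfs/name
```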
Edit core-site.xml:

```
[root@master hadoop]# vim core-site.xml    # add the following content
[root@master hadoop]# tail -n 14 core-site.xml
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://192.168.47.140:9000</value>
    </property>
    <property>
        <name>io.file.buffer.size</name>
        <value>131072</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/usr/local/src/hadoop/tmp</value>
    </property>
</configuration>
```
Create mapred-site.xml from its template and edit it:

```
[root@master hadoop]# pwd
/usr/local/src/hadoop/etc/hadoop
[root@master hadoop]# cp mapred-site.xml.template mapred-site.xml
[root@master hadoop]# vim mapred-site.xml    # add the following configuration
[root@master hadoop]# tail -n 14 mapred-site.xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>master:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>master:19888</value>
    </property>
</configuration>
```
Edit yarn-site.xml:

```
[root@master hadoop]# vim yarn-site.xml    # add the following configuration
[root@master hadoop]# tail -n 32 yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties -->
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>master:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>master:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>master:8031</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>master:8033</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>master:8088</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
</configuration>
```
1. Configure the masters file

```
[root@master hadoop]# vim masters
[root@master hadoop]# cat masters
192.168.47.140
```

2. Configure the slaves file

```
[root@master hadoop]# vim slaves
[root@master hadoop]# cat slaves
192.168.47.141
192.168.47.142
```

3. Create directories

```
[root@master hadoop]# mkdir /usr/local/src/hadoop/tmp
[root@master hadoop]# mkdir /usr/local/src/hadoop/dfs/name -p
[root@master hadoop]# mkdir /usr/local/src/hadoop/dfs/data -p
```

4. Change directory ownership

```
[root@master hadoop]# chown -R hadoop:hadoop /usr/local/src/hadoop/
```

5. Sync the configured files to the slave nodes

```
[root@master ~]# scp -r /usr/local/src/hadoop/ root@slave1:/usr/local/src/
The authenticity of host 'slave1 (192.168.47.141)' can't be established.
ECDSA key fingerprint is SHA256:vnHclJTJVtDbeULN8jdOLhTCmqxJNqUQshH9g9LfJ3k.
ECDSA key fingerprint is MD5:31:03:3d:83:46:aa:c4:d0:c9:fc:5f:f1:cf:2d:fd:e2.
Are you sure you want to continue connecting (yes/no)? yes
...... (file transfer output omitted) ......
[root@master ~]# scp -r /usr/local/src/hadoop/ root@slave2:/usr/local/src/
The authenticity of host 'slave2 (192.168.47.142)' can't be established.
ECDSA key fingerprint is SHA256:vnHclJTJVtDbeULN8jdOLhTCmqxJNqUQshH9g9LfJ3k.
ECDSA key fingerprint is MD5:31:03:3d:83:46:aa:c4:d0:c9:fc:5f:f1:cf:2d:fd:e2.
Are you sure you want to continue connecting (yes/no)? yes
...... (file transfer output omitted) ......
```
```
# slave1 configuration
[root@slave1 ~]# yum install -y vim
[root@slave1 ~]# vim /etc/profile
[root@slave1 ~]# tail -n 4 /etc/profile
export JAVA_HOME=/usr/local/src/jdk1.8.0_152
export PATH=$PATH:$JAVA_HOME/bin
export HADOOP_HOME=/usr/local/src/hadoop
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
[root@slave1 ~]# chown -R hadoop:hadoop /usr/local/src/hadoop/
[root@slave1 ~]# su - hadoop
Last login: Thu Feb 24 11:29:00 CST 2022 from 192.168.41.148 on pts/1
[hadoop@slave1 ~]$ source /etc/profile
```

```
# slave2 configuration
[root@slave2 ~]# yum install -y vim
[root@slave2 ~]# vim /etc/profile
[root@slave2 ~]# tail -n 4 /etc/profile
export JAVA_HOME=/usr/local/src/jdk1.8.0_152
export PATH=$PATH:$JAVA_HOME/bin
export HADOOP_HOME=/usr/local/src/hadoop
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
[root@slave2 ~]# chown -R hadoop:hadoop /usr/local/src/hadoop/
[root@slave2 ~]# su - hadoop
Last login: Thu Feb 24 11:29:19 CST 2022 from 192.168.41.148 on pts/1
[hadoop@slave2 ~]$ source /etc/profile
```
Format the NameNode as the hadoop user:

```
[root@master ~]# su - hadoop
[hadoop@master ~]$ cd /usr/local/src/hadoop/
[hadoop@master hadoop]$ bin/hdfs namenode -format
```

Result:

```
20/05/02 16:21:50 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master/192.168.1.6
************************************************************/
```
Note: formatting wipes the data on the NameNode. HDFS must be formatted before its first start; later starts must not format again, otherwise the DataNode processes will go missing. Also, once HDFS has been run, the Hadoop working directory (set to /usr/local/src/hadoop/tmp in this guide) contains data; if you need to reformat, be sure to delete the data under the working directory first, or the format will run into problems.
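Based on that note, a reformat would look roughly like the sketch below: stop the daemons, clear the working and data directories on every node, then format again. The paths are the ones used in this guide; adapt them if yours differ.

```
# On master, as hadoop. This DESTROYS all HDFS data - only for a fresh start.
stop-dfs.sh                                            # stop NameNode/DataNodes if running
rm -rf /usr/local/src/hadoop/tmp/*                     # clear the Hadoop working directory
rm -rf /usr/local/src/hadoop/dfs/name/*                # clear NameNode metadata
ssh slave1 'rm -rf /usr/local/src/hadoop/dfs/data/*'   # clear DataNode storage on each slave
ssh slave2 'rm -rf /usr/local/src/hadoop/dfs/data/*'
hdfs namenode -format                                  # namespace IDs will now match again
```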
```
[hadoop@master hadoop]$ hadoop-daemon.sh start namenode
starting namenode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-namenode-master.out
```

After starting, use the jps command to check that it succeeded. jps is a tool shipped with Java that lists the PIDs of all running Java processes:

```
[hadoop@master hadoop]$ jps
3557 NameNode
3624 Jps
```
```
[hadoop@slave1 hadoop]$ hadoop-daemon.sh start datanode
starting datanode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-datanode-slave1.out
[hadoop@slave2 hadoop]$ hadoop-daemon.sh start datanode
starting datanode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-datanode-slave2.out
[hadoop@slave1 hadoop]$ jps
3557 DataNode
3725 Jps
[hadoop@slave2 hadoop]$ jps
3557 DataNode
3725 Jps
```
```
[hadoop@master hadoop]$ hadoop-daemon.sh start secondarynamenode
starting secondarynamenode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-secondarynamenode-master.out
[hadoop@master hadoop]$ jps
34257 NameNode
34449 SecondaryNameNode
34494 Jps
[hadoop@master hadoop]$ ll dfs/
total 0
drwx------ 3 hadoop hadoop 21 Aug 14 15:26 data
drwxr-xr-x 3 hadoop hadoop 40 Aug 14 14:57 name
[hadoop@master hadoop]$ ll ./tmp/dfs
total 0
drwxrwxr-x. 3 hadoop hadoop 21 May  2 16:34 namesecondary
```

As you can see, the NameNode and DataNode each keep their data in a directory under /usr/local/src/hadoop/dfs, while the SecondaryNameNode keeps its data in a directory under /usr/local/src/hadoop/tmp.
```
[hadoop@master sbin]$ hdfs dfsadmin -report
Configured Capacity: 8202977280 (7.64 GB)
Present Capacity: 4421812224 (4.12 GB)
DFS Remaining: 4046110720 (3.77 GB)
DFS Used: 375701504 (358.30 MB)
DFS Used%: 8.50%
Under replicated blocks: 88
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Live datanodes (2):

Name: 192.168.47.141:50010 (slave1)
Hostname: slave1
Decommission Status : Normal
Configured Capacity: 4101488640 (3.82 GB)
DFS Used: 187850752 (179.15 MB)
Non DFS Used: 2109939712 (1.97 GB)
DFS Remaining: 1803698176 (1.68 GB)
DFS Used%: 4.58%
DFS Remaining%: 43.98%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Mon May 04 18:32:32 CST 2020

Name: 192.168.47.142:50010 (slave2)
Hostname: slave2
Decommission Status : Normal
Configured Capacity: 4101488640 (3.82 GB)
DFS Used: 187850752 (179.15 MB)
Non DFS Used: 1671225344 (1.56 GB)
DFS Remaining: 2242412544 (2.09 GB)
DFS Used%: 4.58%
DFS Remaining%: 54.67%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Mon May 04 18:32:32 CST 2020
```
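The 88 under-replicated blocks in this report are expected here: dfs.replication is 3 but only two DataNodes are alive, so no block can ever reach three replicas. hdfs fsck gives a per-file view of the same situation; an optional diagnostic not in the original transcript:

```
hdfs fsck / | tail -n 20    # the summary reports under-replicated blocks and replica counts
```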
```
[hadoop@master hadoop]$ stop-dfs.sh
[hadoop@master hadoop]$ start-dfs.sh
[hadoop@master hadoop]$ start-yarn.sh
[hadoop@master hadoop]$ jps
34257 NameNode
34449 SecondaryNameNode
34494 Jps
32847 ResourceManager
```
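After start-yarn.sh, each slave should also be running a NodeManager process (jps on slave1 and slave2 would show it). From master, YARN can list its registered workers; an extra check not in the original transcript:

```
yarn node -list    # expect two RUNNING nodes, slave1 and slave2, once the NodeManagers register
```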
```
[hadoop@master hadoop]$ hdfs dfs -mkdir /input
[hadoop@master hadoop]$ hdfs dfs -ls /
Found 1 items
drwxr-xr-x   - hadoop supergroup          0 2020-05-02 22:26 /input
```

The /input directory created here lives in the HDFS file system and can only be viewed and manipulated with HDFS commands.
```
[hadoop@master hadoop]$ cat ~/input/data.txt
Hello World
Hello Hadoop
Hello Huasan
```

Run the following command to copy the input data file into HDFS's /input directory:

```
[hadoop@master hadoop]$ hdfs dfs -put ~/input/data.txt /input
```

Confirm the file has been copied to /input in HDFS:

```
[hadoop@master hadoop]$ hdfs dfs -ls /input
Found 1 items
-rw-r--r--   1 hadoop supergroup         38 2020-05-02 22:32 /input/data.txt
```
The /output directory likewise lives in the HDFS file system; view and manage it with HDFS commands:

```
[hadoop@master hadoop]$ hdfs dfs -mkdir /output
```

First list the files currently in HDFS:

```
[hadoop@master hadoop]$ hdfs dfs -ls /
Found 2 items
drwxr-xr-x   - hadoop supergroup          0 2020-05-02 22:32 /input
drwxr-xr-x   - hadoop supergroup          0 2020-05-02 22:49 /output
```

Of these, /input holds the input data and /output will hold the output data. Since the job's output directory must not exist beforehand, delete /output:

```
[hadoop@master hadoop]$ hdfs dfs -rm -r -f /output
20/05/03 09:43:43 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
Deleted /output
```
Run the WordCount example:

```
[hadoop@master hadoop]$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount /input/data.txt /output
```

The MapReduce program prints output like the following while it runs:

```
20/05/02 22:39:41 INFO client.RMProxy: Connecting to ResourceManager at localhost/127.0.0.1:8032
20/05/02 22:39:43 INFO input.FileInputFormat: Total input paths to process : 1
20/05/02 22:39:43 INFO mapreduce.JobSubmitter: number of splits:1
20/05/02 22:39:44 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1588469277215_0001
...... (output omitted) ......
20/05/02 22:40:32 INFO mapreduce.Job:  map 0% reduce 0%
20/05/02 22:41:07 INFO mapreduce.Job:  map 100% reduce 0%
20/05/02 22:41:25 INFO mapreduce.Job:  map 100% reduce 100%
20/05/02 22:41:27 INFO mapreduce.Job: Job job_1588469277215_0001 completed successfully
...... (output omitted) ......
```

This output shows that the MapReduce program submitted a job, which first runs the Map phase and then the Reduce phase. The job's progress can also be watched in the YARN cluster web page: point a browser at http://master:8088.

The part-r-00000 file can be viewed directly with an HDFS command; the result looks like this:

```
[hadoop@master hadoop]$ hdfs dfs -cat /output/part-r-00000
Hadoop  1
Hello   3
Huasan  1
World   1
```

The counts are correct, which shows Hadoop is running normally.