
Installing Hadoop on Linux


I. Hadoop: A Distributed Computing and Storage Framework

 

II. HDFS Roles and Their Functions

1. Client

2. NameNode: metadata node

Manages the file system namespace and its metadata

An HDFS cluster has only one active NameNode (NN)

3. Secondary NameNode: auxiliary metadata node

Merges the NameNode's edit logs into the fsimage file

Helps the NameNode persist its in-memory metadata

4. DataNode: data node

Data storage node; stores and retrieves blocks

A cluster can have many DataNodes
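Once the cluster built in the installation steps below is running, these roles are easy to observe: the report subcommand prints the NameNode's view of the cluster, including every live DataNode. A minimal check:

hdfs dfsadmin -report    # capacity, block counts, and the list of live DataNodes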

III. HDFS Replication

Block: data block

The most basic storage unit in HDFS

Default block size: 128 MB (since Hadoop 2.x); a 300 MB file, for example, is split into two 128 MB blocks plus one 44 MB block

Replication

1. Purpose: guard against data loss

2. Default replication factor: 3

3. Placement policy (see the commands after this list):

One replica on a node in the local rack

One replica on a different node in the same rack

One replica on a node in a different rack
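On a running cluster you can watch the replica mechanism directly; the path /tmp/demo.txt below is just a placeholder for any file already in HDFS:

hdfs fsck /tmp/demo.txt -files -blocks -locations    # show each block and the DataNodes holding its replicas
hdfs dfs -setrep -w 2 /tmp/demo.txt                  # change this file's replication factor to 2 and wait

Note that on a single-node cluster all replicas necessarily land on the same machine; the three-rack placement above only applies to multi-rack deployments.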

IV. HDFS Strengths and Weaknesses

Strengths

1. High fault tolerance

2. Well suited to big-data processing

3. Streaming data access

4. Runs on inexpensive commodity hardware

Weaknesses

1. Not suited to low-latency data access

2. Not suited to storing many small files

3. No concurrent writers and no random file modification

V. HDFS CLI (shell command line)

Basic syntax

hdfs dfs -cmd <args>

hadoop fs -cmd <args>

List the available commands

hdfs dfs
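A few everyday examples of both equivalent forms; the /user/root paths and file names here are placeholders:

hdfs dfs -mkdir -p /user/root/input                 # create a directory
hdfs dfs -put localfile.txt /user/root/input        # upload a local file
hdfs dfs -ls /user/root/input                       # list a directory
hdfs dfs -cat /user/root/input/localfile.txt        # print a file
hadoop fs -get /user/root/input/localfile.txt .     # download a file
hdfs dfs -rm -r /user/root/input                    # delete recursively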

Steps to Install Hadoop on Linux

I. Upload the Hadoop archive

[root@kb129 ~]# cd /opt/kb23/shell

[root@kb129 shell]# ls

hadoop-3.1.3.tar.gz  jdk-8u321-linux-x64.tar.gz  mysql-8.0.30-linux-glibc2.12-x86_64.tar.xz

II. Extract the archive into the target directory

[root@kb129 install]# tar -zxf ./hadoop-3.1.3.tar.gz -C ../soft/

[root@kb129 install]# cd ../soft/

III. Rename the directory to hadoop313

[root@kb129 soft]# mv hadoop-3.1.3/ hadoop313

IV. Change ownership

[root@kb129 soft]# chown -R root:root ./hadoop313/

V. Configure environment variables in /etc/profile

# HADOOP_HOME
export HADOOP_HOME=/opt/soft/hadoop313
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HADOOP_HOME/lib
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export HDFS_JOURNALNODE_USER=root
export HDFS_ZKFC_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_LIBEXEC_DIR=$HADOOP_HOME/libexec
export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop

 

[root@kb129 soft]# source /etc/profile
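A quick sanity check that the new variables are visible in the current shell; both commands should succeed if the paths above are correct:

echo $HADOOP_HOME    # should print /opt/soft/hadoop313
hadoop version       # should report Hadoop 3.1.3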

VI. Create a data directory under hadoop313

[root@kb129 hadoop313]# mkdir ./data

[root@kb129 hadoop313]# cd ./etc/hadoop/

[root@kb129 hadoop]# ls

capacity-scheduler.xml      hadoop-user-functions.sh.example  kms-log4j.properties        ssl-client.xml.example
configuration.xsl           hdfs-site.xml③                    kms-site.xml                ssl-server.xml.example
container-executor.cfg      httpfs-env.sh                     log4j.properties            user_ec_policies.xml.template
core-site.xml①              httpfs-log4j.properties           mapred-env.cmd              workers⑥
hadoop-env.cmd              httpfs-signature.secret           mapred-env.sh               yarn-env.cmd
hadoop-env.sh②              httpfs-site.xml                   mapred-queues.xml.template  yarn-env.sh
hadoop-metrics2.properties  kms-acls.xml                      mapred-site.xml④            yarnservice-log4j.properties
hadoop-policy.xml           kms-env.sh                        shellprofile.d              yarn-site.xml⑤

(The circled numbers ①-⑥ mark the six files edited in the next section.)

VII. Edit the configuration files

(1) Reference for these settings

The official Apache Hadoop documentation

The full contents of each configuration file follow.

Configuration file (1): core-site.xml

[root@kb129 hadoop]# vim ./core-site.xml

<configuration>
  <!-- Default file system URI: the NameNode's RPC address -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://kb129:9000</value>
  </property>
  <!-- Base directory for Hadoop's working data -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/soft/hadoop313/data</value>
  </property>
  <!-- User shown as owner in the web UI file browser -->
  <property>
    <name>hadoop.http.staticuser.user</name>
    <value>root</value>
  </property>
  <!-- I/O buffer size in bytes -->
  <property>
    <name>io.file.buffer.size</name>
    <value>131073</value>
  </property>
  <!-- Allow root to act as a proxy user from any host and group -->
  <property>
    <name>hadoop.proxyuser.root.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.root.groups</name>
    <value>*</value>
  </property>
</configuration>

Configuration file (2): hadoop-env.sh

[root@kb129 hadoop]# vim ./hadoop-env.sh

export JAVA_HOME=/opt/soft/jdk180    # path of the JDK installed earlier

Configuration file (3): hdfs-site.xml

[root@kb129 hadoop]# vim ./hdfs-site.xml

<configuration>
  <!-- Single-node setup, so one replica instead of the default 3 -->
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <!-- Where the NameNode stores fsimage and edit logs -->
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/opt/soft/hadoop313/data/dfs/name</value>
  </property>
  <!-- Where the DataNode stores block data -->
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/opt/soft/hadoop313/data/dfs/data</value>
  </property>
  <!-- Disable HDFS permission checks for convenience on a test cluster -->
  <property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
  </property>
</configuration>

Configuration file (4): mapred-site.xml

[root@kb129 hadoop]# vim ./mapred-site.xml

<configuration>
  <!-- Run MapReduce jobs on YARN -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <!-- JobHistory server RPC and web addresses -->
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>kb129:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>kb129:19888</value>
  </property>
  <!-- Memory per map/reduce task, in MB -->
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>4096</value>
  </property>
  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>4096</value>
  </property>
  <!-- Classpath for MapReduce applications -->
  <property>
    <name>mapreduce.application.classpath</name>
    <value>/opt/soft/hadoop313/etc/hadoop:/opt/soft/hadoop313/share/hadoop/common/lib/*:/opt/soft/hadoop313/share/hadoop/common/*:/opt/soft/hadoop313/share/hadoop/hdfs/*:/opt/soft/hadoop313/share/hadoop/hdfs/lib/*:/opt/soft/hadoop313/share/hadoop/mapreduce/*:/opt/soft/hadoop313/share/hadoop/mapreduce/lib/*:/opt/soft/hadoop313/share/hadoop/yarn/*:/opt/soft/hadoop313/share/hadoop/yarn/lib/*</value>
  </property>
</configuration>
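Instead of hand-typing the long classpath above, the same value can be generated on the machine itself and pasted into the property; Hadoop ships a subcommand for exactly this:

hadoop classpath    # prints the classpath derived from the current installation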

Configuration file (5): yarn-site.xml

[root@kb129 hadoop]# vim ./yarn-site.xml

<configuration>
  <!-- Site specific YARN configuration properties -->
  <!-- Retry interval for connecting to the ResourceManager, in ms -->
  <property>
    <name>yarn.resourcemanager.connect.retry-interval.ms</name>
    <value>20000</value>
  </property>
  <!-- Use the Fair Scheduler -->
  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
  </property>
  <!-- NodeManager addresses -->
  <property>
    <name>yarn.nodemanager.localizer.address</name>
    <value>kb129:8040</value>
  </property>
  <property>
    <name>yarn.nodemanager.address</name>
    <value>kb129:8050</value>
  </property>
  <property>
    <name>yarn.nodemanager.webapp.address</name>
    <value>kb129:8042</value>
  </property>
  <!-- Auxiliary service required by the MapReduce shuffle -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <!-- Local working and log directories -->
  <property>
    <name>yarn.nodemanager.local-dirs</name>
    <value>/opt/soft/hadoop313/yarndata/yarn</value>
  </property>
  <property>
    <name>yarn.nodemanager.log-dirs</name>
    <value>/opt/soft/hadoop313/yarndata/log</value>
  </property>
  <!-- Do not kill containers for exceeding virtual-memory limits -->
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
  </property>
</configuration>

Configuration file (6): workers (one worker hostname per line)

[root@kb129 hadoop]# vim ./workers

kb129

VIII. Initialize (format) the NameNode

[root@kb129 bin]# hadoop namenode -format
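In Hadoop 3.x, hdfs namenode -format is the non-deprecated spelling of the same command. If the format succeeds, the dfs.namenode.name.dir configured earlier is populated with the initial metadata:

ls /opt/soft/hadoop313/data/dfs/name/current    # should list fsimage_* files and a VERSION file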

IX. Set up passwordless SSH login

[root@test3 etc]# cd

(1) [root@test3 ~]# ssh-keygen -t rsa -P ""    # generate an RSA key pair with an empty passphrase

Generating public/private rsa key pair.

Enter file in which to save the key (/root/.ssh/id_rsa):

Created directory '/root/.ssh'.

Your identification has been saved in /root/.ssh/id_rsa.

Your public key has been saved in /root/.ssh/id_rsa.pub.

The key fingerprint is:

SHA256:DvvTw8jj8AEFT9G2qJbUdSdRdmChscnRuNuLA9niQW4 root@test3

The key's randomart image is:

+---[RSA 2048]----+

|      . oo  +=*o.|

|       +  +.+Oo. |

|       .o+ o=+   |

|      ..o o .    |

|     .ooSo o o   |

|      +=  E o .  |

|     .o.o* + . . |

|       +=.= o .  |

|       .+o . .   |

+----[SHA256]-----+

(2) [root@test3 ~]# ll -a    # confirm the .ssh directory was created

total 36

dr-xr-x---.  3 root root  185 Aug 24 01:25 .

dr-xr-xr-x. 17 root root  224 Aug 23 18:52 ..

-rw-------.  1 root root 1423 Aug 23 18:54 anaconda-ks.cfg

-rw-------.  1 root root 1948 Aug 23 20:19 .bash_history

-rw-r--r--.  1 root root   18 Dec 29  2013 .bash_logout

-rw-r--r--.  1 root root  176 Dec 29  2013 .bash_profile

-rw-r--r--.  1 root root  176 Dec 29  2013 .bashrc

-rw-r--r--.  1 root root  100 Dec 29  2013 .cshrc

-rw-------.  1 root root  148 Aug 23 12:11 .mysql_history

drwx------.  2 root root   38 Aug 24 01:25 .ssh

-rw-r--r--.  1 root root  129 Dec 29  2013 .tcshrc

-rw-------.  1 root root 2332 Aug 23 23:59 .viminfo

(3) [root@test3 ~]# cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys   # append the public key to authorized_keys
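If key-based login still asks for a password after this step, the usual culprit is file permissions: sshd ignores an authorized_keys file that is too open. A common fix:

chmod 700 /root/.ssh
chmod 600 /root/.ssh/authorized_keys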

(4) [root@test3 ~]# ssh -p 22 root@test3     # log in

The authenticity of host 'test3 (192.168.7.33)' can't be established.

ECDSA key fingerprint is SHA256:K0TUqSGjk3cOC+dBkiKl2bj+qeaEPOk5tb3ziQcz6CA.

ECDSA key fingerprint is MD5:97:c7:2b:bc:6f:7c:47:04:3c:d3:63:66:02:ed:e4:94.

Are you sure you want to continue connecting (yes/no)? y

Please type 'yes' or 'no': yes

Warning: Permanently added 'test3,192.168.7.33' (ECDSA) to the list of known hosts.

Last login: Wed Aug 23 23:45:12 2023 from 192.168.7.1

(5) [root@test3 ~]# exit        # log out

logout

Connection to test3 closed.

(6) [root@test3 ~]# ssh -p 22 root@test3           # log in again

Last login: Thu Aug 24 01:28:25 2023 from test3       # no password prompt: login succeeded!

(7) [root@kb129 hadoop]# start-all.sh    # start all daemons

# [root@kb129 hadoop]# stop-all.sh    # stop all daemons (if the web pages stop loading, stop and start again, then reload the page)

(8) [root@kb129 hadoop]# jps    # list the running Java processes

15089 NodeManager

16241 Jps

14616 DataNode

13801 ResourceManager

14476 NameNode

16110 SecondaryNameNode
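With all the daemons up, a short end-to-end smoke test confirms that HDFS accepts reads and writes; the paths below are arbitrary examples:

echo hello > /tmp/hello.txt
hdfs dfs -mkdir -p /tmp/smoke
hdfs dfs -put /tmp/hello.txt /tmp/smoke/
hdfs dfs -cat /tmp/smoke/hello.txt    # should print: hello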

Going further

Set up passwordless login from kb129 to kb128

[root@kb129 hadoop]# ssh-copy-id -i ~/.ssh/id_rsa.pub -p22 root@kb128

[root@kb129 hadoop]# ssh -p 22 root@kb128

[root@kb129 hadoop]# exit

 

X. Open the NameNode web UI in a browser:

http://192.168.7.23:9870/
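Port 9870 is the default NameNode web UI port in Hadoop 3.x. Since YARN is running as well, the ResourceManager UI should be reachable on its default port 8088 on the same host:

http://192.168.7.23:8088/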

 
