
Hadoop Cluster Startup Scripts and the Daemon Source Code Behind Them


In the previous three articles we set up HDFS, started the cluster with the start-dfs.sh script, uploaded a file to HDFS, and ran a MapReduce word count over that file. In this article we take a quick look at the startup scripts themselves and at some of HDFS's important default configuration properties.

I. Startup Scripts

Hadoop keeps its scripts and commands in just two directories: bin/ and sbin/. Let's walk through a few of the more important ones.

1. sbin/start-all.sh

  # Start all hadoop daemons.  Run this on master node.
  echo "This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh"
  # The script is deprecated: use start-dfs.sh and start-yarn.sh instead
  bin=`dirname "${BASH_SOURCE-$0}"`
  bin=`cd "$bin"; pwd`
  DEFAULT_LIBEXEC_DIR="$bin"/../libexec
  HADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}
  # Source libexec/hadoop-config.sh to load the environment settings
  . $HADOOP_LIBEXEC_DIR/hadoop-config.sh
  # start hdfs daemons if hdfs is present
  if [ -f "${HADOOP_HDFS_HOME}"/sbin/start-dfs.sh ]; then
    # Run sbin/start-dfs.sh
    "${HADOOP_HDFS_HOME}"/sbin/start-dfs.sh --config $HADOOP_CONF_DIR
  fi
  # start yarn daemons if yarn is present
  if [ -f "${HADOOP_YARN_HOME}"/sbin/start-yarn.sh ]; then
    # Run sbin/start-yarn.sh
    "${HADOOP_YARN_HOME}"/sbin/start-yarn.sh --config $HADOOP_CONF_DIR
  fi

As you can see, there is not much in this script, and it is in fact deprecated: start-all.sh simply sources hadoop-config.sh to load Hadoop's environment variables and then runs start-dfs.sh and start-yarn.sh in turn.

It follows that we can start the HDFS part of the cluster by running start-dfs.sh directly, without going through start-all.sh at all (and, if YARN is configured, by running start-yarn.sh as well).
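For example, a minimal sketch, assuming $HADOOP_HOME points at your Hadoop installation and that the daemon list printed at the end depends on your own configuration:

  # Start the HDFS daemons (NameNode, DataNodes, SecondaryNameNode)
  $HADOOP_HOME/sbin/start-dfs.sh
  # Start the YARN daemons as well, but only if YARN is configured
  $HADOOP_HOME/sbin/start-yarn.sh
  # Check which Java daemons are now running on this node
  jps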

2. libexec/hadoop-config.sh

  this="${BASH_SOURCE-$0}"
  common_bin=$(cd -P -- "$(dirname -- "$this")" && pwd -P)
  script="$(basename -- "$this")"
  this="$common_bin/$script"
  [ -f "$common_bin/hadoop-layout.sh" ] && . "$common_bin/hadoop-layout.sh"
  HADOOP_COMMON_DIR=${HADOOP_COMMON_DIR:-"share/hadoop/common"}
  HADOOP_COMMON_LIB_JARS_DIR=${HADOOP_COMMON_LIB_JARS_DIR:-"share/hadoop/common/lib"}
  HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_COMMON_LIB_NATIVE_DIR:-"lib/native"}
  HDFS_DIR=${HDFS_DIR:-"share/hadoop/hdfs"}
  HDFS_LIB_JARS_DIR=${HDFS_LIB_JARS_DIR:-"share/hadoop/hdfs/lib"}
  YARN_DIR=${YARN_DIR:-"share/hadoop/yarn"}
  YARN_LIB_JARS_DIR=${YARN_LIB_JARS_DIR:-"share/hadoop/yarn/lib"}
  MAPRED_DIR=${MAPRED_DIR:-"share/hadoop/mapreduce"}
  MAPRED_LIB_JARS_DIR=${MAPRED_LIB_JARS_DIR:-"share/hadoop/mapreduce/lib"}
  # the root of the Hadoop installation
  # See HADOOP-6255 for directory structure layout
  HADOOP_DEFAULT_PREFIX=$(cd -P -- "$common_bin"/.. && pwd -P)
  HADOOP_PREFIX=${HADOOP_PREFIX:-$HADOOP_DEFAULT_PREFIX}
  export HADOOP_PREFIX
  # ... details omitted; here is the important part ...
  # Source hadoop-env.sh to load additional environment variables
  if [ -f "${HADOOP_CONF_DIR}/hadoop-env.sh" ]; then
    . "${HADOOP_CONF_DIR}/hadoop-env.sh"
  fi

This script does little more than set up the environment variables the Hadoop cluster needs. At the end it also sources hadoop-env.sh, which loads other important settings such as the location of the JDK.
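As an illustration, here is a sketch of a typical etc/hadoop/hadoop-env.sh entry; the JDK path below is an assumption and must match your own installation:

  # etc/hadoop/hadoop-env.sh -- sourced by hadoop-config.sh on every start
  # Point Hadoop at the JDK to use (example path, adjust to your system)
  export JAVA_HOME=/usr/local/jdk1.8.0
  # Optional: where daemon pid files are written
  export HADOOP_PID_DIR=${HADOOP_PID_DIR}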

3. sbin/start-dfs.sh

  # Start hadoop dfs daemons.
  # Optionally upgrade or rollback dfs state.
  # Run this on master node.
  # Usage of start-dfs.sh; options such as -clusterId are passed straight through
  usage="Usage: start-dfs.sh [-upgrade|-rollback] [other options such as -clusterId]"
  bin=`dirname "${BASH_SOURCE-$0}"`
  bin=`cd "$bin"; pwd`
  DEFAULT_LIBEXEC_DIR="$bin"/../libexec
  HADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}
  # Source hdfs-config.sh to load the environment variables
  . $HADOOP_LIBEXEC_DIR/hdfs-config.sh
  # get arguments
  if [[ $# -ge 1 ]]; then
    startOpt="$1"
    shift
    case "$startOpt" in
      -upgrade)
        nameStartOpt="$startOpt"
        ;;
      -rollback)
        dataStartOpt="$startOpt"
        ;;
      *)
        echo $usage
        exit 1
        ;;
    esac
  fi
  # Add other possible options
  nameStartOpt="$nameStartOpt $@"
  #---------------------------------------------------------
  # namenodes
  NAMENODES=$($HADOOP_PREFIX/bin/hdfs getconf -namenodes)
  echo "Starting namenodes on [$NAMENODES]"
  # Run hadoop-daemons.sh, which calls bin/hdfs to start the namenode daemon
  "$HADOOP_PREFIX/sbin/hadoop-daemons.sh" \
    --config "$HADOOP_CONF_DIR" \
    --hostnames "$NAMENODES" \
    --script "$bin/hdfs" start namenode $nameStartOpt
  #---------------------------------------------------------
  # datanodes (using default slaves file)
  if [ -n "$HADOOP_SECURE_DN_USER" ]; then
    echo \
      "Attempting to start secure cluster, skipping datanodes. " \
      "Run start-secure-dns.sh as root to complete startup."
  else
    # Run hadoop-daemons.sh, which calls bin/hdfs to start the datanode daemons
    "$HADOOP_PREFIX/sbin/hadoop-daemons.sh" \
      --config "$HADOOP_CONF_DIR" \
      --script "$bin/hdfs" start datanode $dataStartOpt
  fi
  #---------------------------------------------------------
  # secondary namenodes (if any)
  SECONDARY_NAMENODES=$($HADOOP_PREFIX/bin/hdfs getconf -secondarynamenodes 2>/dev/null)
  if [ -n "$SECONDARY_NAMENODES" ]; then
    echo "Starting secondary namenodes [$SECONDARY_NAMENODES]"
    # Run hadoop-daemons.sh, which calls bin/hdfs to start the secondarynamenode daemon
    "$HADOOP_PREFIX/sbin/hadoop-daemons.sh" \
      --config "$HADOOP_CONF_DIR" \
      --hostnames "$SECONDARY_NAMENODES" \
      --script "$bin/hdfs" start secondarynamenode
  fi
  # ... details omitted ...
  # eof

In start-dfs.sh, hdfs-config.sh is sourced first to load the environment, and hadoop-daemons.sh is then used, which in turn calls bin/hdfs, to start the namenode, datanode, and secondarynamenode daemons.

From this we can also see that running hadoop-daemons.sh directly, with the right arguments, should likewise be able to start the HDFS daemons (see the example under section 4 below).

4. sbin/hadoop-daemons.sh

  # Run a Hadoop command on all slave hosts.
  # Usage of hadoop-daemons.sh
  usage="Usage: hadoop-daemons.sh [--config confdir] [--hosts hostlistfile] [start|stop] command args..."
  # if no args specified, show usage
  if [ $# -le 1 ]; then
    echo $usage
    exit 1
  fi
  bin=`dirname "${BASH_SOURCE-$0}"`
  bin=`cd "$bin"; pwd`
  DEFAULT_LIBEXEC_DIR="$bin"/../libexec
  HADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}
  # Source hadoop-config.sh to load the environment variables
  . $HADOOP_LIBEXEC_DIR/hadoop-config.sh
  # Run sbin/slaves.sh, which then invokes hadoop-daemon.sh on every slave host
  exec "$bin/slaves.sh" --config $HADOOP_CONF_DIR cd "$HADOOP_PREFIX" \; "$bin/hadoop-daemon.sh" --config $HADOOP_CONF_DIR "$@"

Looking at the usage string of hadoop-daemons.sh, it is easy to see that the script can also be run by hand with the appropriate command to start a given daemon, for example:
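A minimal sketch, assuming it is run from $HADOOP_HOME and that the hosts it reaches come from etc/hadoop/slaves:

  # Start a datanode on every host listed in etc/hadoop/slaves
  sbin/hadoop-daemons.sh --config etc/hadoop start datanode
  # And stop them again the same way
  sbin/hadoop-daemons.sh --config etc/hadoop stop datanode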

Inside this script we can see that slaves.sh is executed to reach each slave host, and hadoop-daemon.sh is then invoked on that host to read the relevant configuration and run the hadoop command.

5. sbin/slaves.sh

  # Run a shell command on all slave hosts.
  #
  # Environment Variables
  #
  #   HADOOP_SLAVES        File naming remote hosts.
  #                        Default is ${HADOOP_CONF_DIR}/slaves.
  #   HADOOP_CONF_DIR      Alternate conf dir. Default is ${HADOOP_PREFIX}/conf.
  #   HADOOP_SLAVE_SLEEP   Seconds to sleep between spawning remote commands.
  #   HADOOP_SSH_OPTS      Options passed to ssh when running remote commands.
  ##
  # Usage
  usage="Usage: slaves.sh [--config confdir] command..."
  # if no args specified, show usage
  if [ $# -le 0 ]; then
    echo $usage
    exit 1
  fi
  bin=`dirname "${BASH_SOURCE-$0}"`
  bin=`cd "$bin"; pwd`
  DEFAULT_LIBEXEC_DIR="$bin"/../libexec
  HADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}
  . $HADOOP_LIBEXEC_DIR/hadoop-config.sh        # load environment variables
  if [ -f "${HADOOP_CONF_DIR}/hadoop-env.sh" ]; then
    . "${HADOOP_CONF_DIR}/hadoop-env.sh"        # load environment variables
  fi
  # Where to start the script, see hadoop-config.sh
  # (it set up the variables based on command line options)
  if [ "$HADOOP_SLAVE_NAMES" != '' ] ; then
    SLAVE_NAMES=$HADOOP_SLAVE_NAMES
  else
    SLAVE_FILE=${HADOOP_SLAVES:-${HADOOP_CONF_DIR}/slaves}
    SLAVE_NAMES=$(cat "$SLAVE_FILE" | sed 's/#.*$//;/^$/d')
  fi
  # start the daemons
  for slave in $SLAVE_NAMES ; do
    ssh $HADOOP_SSH_OPTS $slave $"${@// /\\ }" \
      2>&1 | sed "s/^/$slave: /" &
    if [ "$HADOOP_SLAVE_SLEEP" != "" ]; then
      sleep $HADOOP_SLAVE_SLEEP
    fi
  done

So slaves.sh just loads the environment, reads the list of slave hosts from the slaves file (or from HADOOP_SLAVE_NAMES), and runs the given command on each host over ssh.
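A quick sketch of how this looks in practice; the host names are made up, and the remote jps call only works if the JDK tools are on the PATH of a non-interactive ssh session on each slave:

  # etc/hadoop/slaves -- one worker host name per line, e.g.
  #   slave1
  #   slave2
  # Run an arbitrary command on every slave host, for instance list its Java daemons
  sbin/slaves.sh --config etc/hadoop jps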

6. sbin/hadoop-daemon.sh

  #!/usr/bin/env bash
  # Runs a Hadoop command as a daemon.
  # ... omitted ...
  # Usage; <hadoop-command> is the hadoop command to run, checked further below
  usage="Usage: hadoop-daemon.sh [--config <conf-dir>] [--hosts hostlistfile] [--script script] (start|stop) <hadoop-command> <args...>"
  # ... omitted ...
  # Source hadoop-config.sh to load the environment variables
  DEFAULT_LIBEXEC_DIR="$bin"/../libexec
  HADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}
  . $HADOOP_LIBEXEC_DIR/hadoop-config.sh
  # Source hadoop-env.sh to load the environment variables
  if [ -f "${HADOOP_CONF_DIR}/hadoop-env.sh" ]; then
    . "${HADOOP_CONF_DIR}/hadoop-env.sh"
  fi
  # ... omitted ...
  case $startStop in
    (start)
      [ -w "$HADOOP_PID_DIR" ] || mkdir -p "$HADOOP_PID_DIR"
      if [ -f $pid ]; then
        if kill -0 `cat $pid` > /dev/null 2>&1; then
          echo $command running as process `cat $pid`.  Stop it first.
          exit 1
        fi
      fi
      if [ "$HADOOP_MASTER" != "" ]; then
        echo rsync from $HADOOP_MASTER
        rsync -a -e ssh --delete --exclude=.svn --exclude='logs/*' --exclude='contrib/hod/logs/*' $HADOOP_MASTER/ "$HADOOP_PREFIX"
      fi
      hadoop_rotate_log $log
      echo starting $command, logging to $log
      cd "$HADOOP_PREFIX"
      # Work out which command this is, then call bin/hdfs to read the configuration and run it
      case $command in
        namenode|secondarynamenode|datanode|journalnode|dfs|dfsadmin|fsck|balancer|zkfc)
          if [ -z "$HADOOP_HDFS_HOME" ]; then
            hdfsScript="$HADOOP_PREFIX"/bin/hdfs
          else
            hdfsScript="$HADOOP_HDFS_HOME"/bin/hdfs
          fi
          nohup nice -n $HADOOP_NICENESS $hdfsScript --config $HADOOP_CONF_DIR $command "$@" > "$log" 2>&1 < /dev/null &
          ;;
        (*)
          nohup nice -n $HADOOP_NICENESS $hadoopScript --config $HADOOP_CONF_DIR $command "$@" > "$log" 2>&1 < /dev/null &
          ;;
      esac
      # ... omitted ...
  esac

hadoop-daemon.sh likewise loads the environment, uses the arguments passed down from the previous script ($@) to determine which Hadoop daemon to start ($command), and finally calls bin/hdfs to read the configuration and launch that daemon.
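That also means hadoop-daemon.sh can be used on its own to manage a single daemon on one node. A minimal sketch, assuming it is run from $HADOOP_HOME, following the usage string shown above:

  # Start or stop a single daemon on the local node only
  sbin/hadoop-daemon.sh --config etc/hadoop start namenode
  sbin/hadoop-daemon.sh --config etc/hadoop start datanode
  sbin/hadoop-daemon.sh --config etc/hadoop stop datanode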

7. bin/hdfs

bin/hdfs is the command that users invoke directly (under the hood it is still a shell script). Notice that no matter which script we use to start the cluster, everything eventually funnels into bin/hdfs, so let's look at what it actually does.

  bin=`which $0`
  bin=`dirname ${bin}`
  bin=`cd "$bin" > /dev/null; pwd`
  DEFAULT_LIBEXEC_DIR="$bin"/../libexec
  HADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}
  . $HADOOP_LIBEXEC_DIR/hdfs-config.sh
  # Besides loading the environment again (above), this function simply documents
  # what each subcommand does: "namenode -format" formats the DFS filesystem,
  # "namenode" runs a DFS namenode, and so on.
  # Keep reading below for the part that matters.
  function print_usage(){
    echo "Usage: hdfs [--config confdir] [--loglevel loglevel] COMMAND"
    echo "       where COMMAND is one of:"
    echo "  dfs                  run a filesystem command on the file systems supported in Hadoop."
    echo "  classpath            prints the classpath"
    echo "  namenode -format     format the DFS filesystem"
    echo "  secondarynamenode    run the DFS secondary namenode"
    echo "  namenode             run the DFS namenode"
    echo "  journalnode          run the DFS journalnode"
    echo "  zkfc                 run the ZK Failover Controller daemon"
    echo "  datanode             run a DFS datanode"
    echo "  dfsadmin             run a DFS admin client"
    echo "  haadmin              run a DFS HA admin client"
    echo "  fsck                 run a DFS filesystem checking utility"
    echo "  balancer             run a cluster balancing utility"
    echo "  jmxget               get JMX exported values from NameNode or DataNode."
    echo "  mover                run a utility to move block replicas across"
    echo "                       storage types"
    echo "  oiv                  apply the offline fsimage viewer to an fsimage"
    echo "  oiv_legacy           apply the offline fsimage viewer to an legacy fsimage"
    echo "  oev                  apply the offline edits viewer to an edits file"
    echo "  fetchdt              fetch a delegation token from the NameNode"
    echo "  getconf              get config values from configuration"
    echo "  groups               get the groups which users belong to"
    echo "  snapshotDiff         diff two snapshots of a directory or diff the"
    echo "                       current directory contents with a snapshot"
    echo "  lsSnapshottableDir   list all snapshottable dirs owned by the current user"
    echo "                       Use -help to see options"
    echo "  portmap              run a portmap service"
    echo "  nfs3                 run an NFS version 3 gateway"
    echo "  cacheadmin           configure the HDFS cache"
    echo "  crypto               configure HDFS encryption zones"
    echo "  storagepolicies      list/get/set block storage policies"
    echo "  version              print the version"
    echo ""
    echo "Most commands print help when invoked w/o parameters."
    # There are also debug commands, but they don't show up in this listing.
  }
  if [ $# = 0 ]; then
    print_usage
    exit
  fi
  COMMAND=$1
  shift
  case $COMMAND in
    # usage flags
    --help|-help|-h)
      print_usage
      exit
      ;;
  esac
  # Determine if we're starting a secure datanode, and if so, redefine appropriate variables
  if [ "$COMMAND" == "datanode" ] && [ "$EUID" -eq 0 ] && [ -n "$HADOOP_SECURE_DN_USER" ]; then
    if [ -n "$JSVC_HOME" ]; then
      if [ -n "$HADOOP_SECURE_DN_PID_DIR" ]; then
        HADOOP_PID_DIR=$HADOOP_SECURE_DN_PID_DIR
      fi
      if [ -n "$HADOOP_SECURE_DN_LOG_DIR" ]; then
        HADOOP_LOG_DIR=$HADOOP_SECURE_DN_LOG_DIR
        HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.log.dir=$HADOOP_LOG_DIR"
      fi
      HADOOP_IDENT_STRING=$HADOOP_SECURE_DN_USER
      HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.id.str=$HADOOP_IDENT_STRING"
      starting_secure_dn="true"
    else
      echo "It looks like you're trying to start a secure DN, but \$JSVC_HOME"\
        "isn't set. Falling back to starting insecure DN."
    fi
  fi
  # Determine if we're starting a privileged NFS daemon, and if so, redefine appropriate variables
  if [ "$COMMAND" == "nfs3" ] && [ "$EUID" -eq 0 ] && [ -n "$HADOOP_PRIVILEGED_NFS_USER" ]; then
    if [ -n "$JSVC_HOME" ]; then
      if [ -n "$HADOOP_PRIVILEGED_NFS_PID_DIR" ]; then
        HADOOP_PID_DIR=$HADOOP_PRIVILEGED_NFS_PID_DIR
      fi
      if [ -n "$HADOOP_PRIVILEGED_NFS_LOG_DIR" ]; then
        HADOOP_LOG_DIR=$HADOOP_PRIVILEGED_NFS_LOG_DIR
        HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.log.dir=$HADOOP_LOG_DIR"
      fi
      HADOOP_IDENT_STRING=$HADOOP_PRIVILEGED_NFS_USER
      HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.id.str=$HADOOP_IDENT_STRING"
      starting_privileged_nfs="true"
    else
      echo "It looks like you're trying to start a privileged NFS server, but"\
        "\$JSVC_HOME isn't set. Falling back to starting unprivileged NFS server."
    fi
  fi
  # Here is the key part: each hadoop command is mapped to the Java class
  # that implements it, and that class is then run on the JVM.
  # Remember, Hadoop itself is written in Java.
  if [ "$COMMAND" = "namenode" ] ; then
    CLASS='org.apache.hadoop.hdfs.server.namenode.NameNode'              # class backing the namenode daemon
    HADOOP_OPTS="$HADOOP_OPTS $HADOOP_NAMENODE_OPTS"
  elif [ "$COMMAND" = "zkfc" ] ; then
    CLASS='org.apache.hadoop.hdfs.tools.DFSZKFailoverController'
    HADOOP_OPTS="$HADOOP_OPTS $HADOOP_ZKFC_OPTS"
  elif [ "$COMMAND" = "secondarynamenode" ] ; then
    CLASS='org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode'     # class backing the secondarynamenode daemon
    HADOOP_OPTS="$HADOOP_OPTS $HADOOP_SECONDARYNAMENODE_OPTS"
  elif [ "$COMMAND" = "datanode" ] ; then
    CLASS='org.apache.hadoop.hdfs.server.datanode.DataNode'              # class backing the datanode daemon
    if [ "$starting_secure_dn" = "true" ]; then
      HADOOP_OPTS="$HADOOP_OPTS -jvm server $HADOOP_DATANODE_OPTS"
    else
      HADOOP_OPTS="$HADOOP_OPTS -server $HADOOP_DATANODE_OPTS"
    fi
  elif [ "$COMMAND" = "journalnode" ] ; then
    CLASS='org.apache.hadoop.hdfs.qjournal.server.JournalNode'
    HADOOP_OPTS="$HADOOP_OPTS $HADOOP_JOURNALNODE_OPTS"
  # ... many more commands omitted ...
  # Check to see if we should start a secure datanode
  if [ "$starting_secure_dn" = "true" ]; then
    if [ "$HADOOP_PID_DIR" = "" ]; then
      HADOOP_SECURE_DN_PID="/tmp/hadoop_secure_dn.pid"
    else
      HADOOP_SECURE_DN_PID="$HADOOP_PID_DIR/hadoop_secure_dn.pid"
    fi
    JSVC=$JSVC_HOME/jsvc
    if [ ! -f $JSVC ]; then
      echo "JSVC_HOME is not set correctly so jsvc cannot be found. jsvc is required to run secure datanodes. "
      echo "Please download and install jsvc from http://archive.apache.org/dist/commons/daemon/binaries/ "\
        "and set JSVC_HOME to the directory containing the jsvc binary."
      exit
    fi
    if [[ ! $JSVC_OUTFILE ]]; then
      JSVC_OUTFILE="$HADOOP_LOG_DIR/jsvc.out"
    fi
    if [[ ! $JSVC_ERRFILE ]]; then
      JSVC_ERRFILE="$HADOOP_LOG_DIR/jsvc.err"
    fi
    # Launch the Java class (secure datanode via jsvc)
    exec "$JSVC" \
      -Dproc_$COMMAND -outfile "$JSVC_OUTFILE" \
      -errfile "$JSVC_ERRFILE" \
      -pidfile "$HADOOP_SECURE_DN_PID" \
      -nodetach \
      -user "$HADOOP_SECURE_DN_USER" \
      -cp "$CLASSPATH" \
      $JAVA_HEAP_MAX $HADOOP_OPTS \
      org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter "$@"
  elif [ "$starting_privileged_nfs" = "true" ] ; then
    if [ "$HADOOP_PID_DIR" = "" ]; then
      HADOOP_PRIVILEGED_NFS_PID="/tmp/hadoop_privileged_nfs3.pid"
    else
      HADOOP_PRIVILEGED_NFS_PID="$HADOOP_PID_DIR/hadoop_privileged_nfs3.pid"
    fi
    JSVC=$JSVC_HOME/jsvc
    if [ ! -f $JSVC ]; then
      echo "JSVC_HOME is not set correctly so jsvc cannot be found. jsvc is required to run privileged NFS gateways. "
      echo "Please download and install jsvc from http://archive.apache.org/dist/commons/daemon/binaries/ "\
        "and set JSVC_HOME to the directory containing the jsvc binary."
      exit
    fi
    if [[ ! $JSVC_OUTFILE ]]; then
      JSVC_OUTFILE="$HADOOP_LOG_DIR/nfs3_jsvc.out"
    fi
    if [[ ! $JSVC_ERRFILE ]]; then
      JSVC_ERRFILE="$HADOOP_LOG_DIR/nfs3_jsvc.err"
    fi
    # Launch the Java class (privileged NFS gateway via jsvc)
    exec "$JSVC" \
      -Dproc_$COMMAND -outfile "$JSVC_OUTFILE" \
      -errfile "$JSVC_ERRFILE" \
      -pidfile "$HADOOP_PRIVILEGED_NFS_PID" \
      -nodetach \
      -user "$HADOOP_PRIVILEGED_NFS_USER" \
      -cp "$CLASSPATH" \
      $JAVA_HEAP_MAX $HADOOP_OPTS \
      org.apache.hadoop.hdfs.nfs.nfs3.PrivilegedNfsGatewayStarter "$@"
  else
    # Launch the Java class of the chosen command
    # run it
    exec "$JAVA" -Dproc_$COMMAND $JAVA_HEAP_MAX $HADOOP_OPTS $CLASS "$@"
  fi

So there it is: bin/hdfs maps each command to the Java class that implements the corresponding daemon and then runs that class on the JVM.
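A quick way to see this mapping on a running cluster is jps -l, which prints the fully qualified main class of each Java process. The output below is only illustrative; the process IDs and the set of daemons depend on your own setup:

  # List running Java processes with their main classes
  jps -l
  # Illustrative output:
  #   2481 org.apache.hadoop.hdfs.server.namenode.NameNode
  #   2602 org.apache.hadoop.hdfs.server.datanode.DataNode
  #   2789 org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode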

Hadoop's other command, bin/hadoop, also ends up calling bin/hdfs internally; have a look for yourself if you're interested, I won't reproduce it here. The YARN scripts and commands follow exactly the same pattern, so I won't walk through them one by one either.

To summarize, here is the order in which the startup scripts call one another, in text form:

  # One script to start every daemon
  start-all.sh                 # starts all daemons
    1. hadoop-config.sh
       a. hadoop-env.sh
    2. start-dfs.sh            # starts the HDFS daemons
       a. hadoop-config.sh
       b. hadoop-daemons.sh -> hdfs namenode
          hadoop-daemons.sh -> hdfs datanode
          hadoop-daemons.sh -> hdfs secondarynamenode
    3. start-yarn.sh           # starts the YARN daemons

  # Starting a single daemon
  # Method 1: on all slave hosts
  hadoop-daemons.sh --config <confdir> [start|stop] command
    1. hadoop-config.sh
       a. hadoop-env.sh
    2. slaves.sh
       a. hadoop-config.sh
       b. hadoop-env.sh
    3. hadoop-daemon.sh --config <confdir> [start|stop] command
       a. hdfs $command

  # Method 2: on the local node only
  hadoop-daemon.sh --config <confdir> [start|stop] command
    1. hadoop-config.sh
       a. hadoop-env.sh
    2. hdfs $command

II. A Look at the Underlying Source Code

Working through the startup scripts, we found that the class behind the namenode is org.apache.hadoop.hdfs.server.namenode.NameNode, the class behind the datanode is org.apache.hadoop.hdfs.server.datanode.DataNode, and the class behind the secondarynamenode is org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.

The source for these classes ships in the jar hadoop-hdfs-2.7.3-sources.jar.
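One way to inspect these classes locally is sketched below; the jar path is an assumption based on the layout of the Hadoop 2.7.3 binary distribution, so adjust it to wherever your distribution keeps its source jars:

  # Unpack the sources jar somewhere convenient and open the NameNode class
  mkdir -p /tmp/hadoop-src && cd /tmp/hadoop-src
  jar xf $HADOOP_HOME/share/hadoop/hdfs/sources/hadoop-hdfs-2.7.3-sources.jar
  less org/apache/hadoop/hdfs/server/namenode/NameNode.java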

1. NameNode source

  package org.apache.hadoop.hdfs.server.namenode;
  // ...
  import org.apache.hadoop.hdfs.HdfsConfiguration;
  // ...
  @InterfaceAudience.Private
  public class NameNode implements NameNodeStatusMXBean {
    static { // static block
      HdfsConfiguration.init(); // triggers HdfsConfiguration's class loading, which reads the configuration files
    }
    // ...
    public static void main(String argv[]) throws Exception {
      if (DFSUtil.parseHelpArgument(argv, NameNode.USAGE, System.out, true)) {
        System.exit(0);
      }
      try {
        StringUtils.startupShutdownMessage(NameNode.class, argv, LOG);
        NameNode namenode = createNameNode(argv, null); // create and start the NameNode
        if (namenode != null) {
          namenode.join(); // block until the NameNode is shut down
        }
      } catch (Throwable e) {
        LOG.error("Failed to start namenode.", e);
        terminate(1, e);
      }
    }
    // ...
  }

Now take a look at the HdfsConfiguration class:

  package org.apache.hadoop.hdfs;
  /**
   * Adds deprecated keys into the configuration.
   */
  @InterfaceAudience.Private
  public class HdfsConfiguration extends Configuration {
    static { // static block
      addDeprecatedKeys();
      // adds the default resources
      Configuration.addDefaultResource("hdfs-default.xml"); // load the default configuration file
      Configuration.addDefaultResource("hdfs-site.xml");    // load the user-defined configuration file
    }
    public static void init() {}
    private static void addDeprecatedKeys() {}
    public static void main(String[] args) {
      init();
      Configuration.dumpDeprecatedKeys();
    }
  }

2. DataNode source

  package org.apache.hadoop.hdfs.server.datanode;
  // ...
  import org.apache.hadoop.hdfs.HdfsConfiguration;
  // ...
  @InterfaceAudience.Private
  public class DataNode extends ReconfigurableBase
      implements InterDatanodeProtocol, ClientDatanodeProtocol,
      TraceAdminProtocol, DataNodeMXBean {
    public static final Log LOG = LogFactory.getLog(DataNode.class);
    static {
      HdfsConfiguration.init(); // the same static-block trick: HdfsConfiguration loads the configuration files
    }
  }