Hadoop Startup Scripts: A Walkthrough of the start-all.sh Flow

I have a small test Hadoop cluster deployed in my own virtual machines, and for convenience I have always started it by simply calling start-all.sh. So what does this script actually do? How does it set up its parameters and then launch each daemon in the cluster? I had only ever used start-all.sh as a shortcut, so I recently took the time to read through it end to end. Following the startup flow from start-all.sh is genuinely helpful for understanding the cluster configuration; at the very least it shows roughly what the various scripts under the bin and sbin directories are for. Without further ado, let's walk through the flow. As in earlier posts, I have added comments directly in the scripts and use them to explain each step.
The content of start-all.sh is as follows:

bin=`dirname "${BASH_SOURCE-$0}"`
bin=`cd "$bin"; pwd`   # bin holds the directory containing this script

DEFAULT_LIBEXEC_DIR="$bin"/../libexec
# Resolve HADOOP_LIBEXEC_DIR: use the value already set in the environment if any,
# otherwise fall back to the default path computed above
HADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}
. $HADOOP_LIBEXEC_DIR/hadoop-config.sh

# start hdfs daemons if hdfs is present
if [ -f "${HADOOP_HDFS_HOME}"/sbin/start-dfs.sh ]; then
  "${HADOOP_HDFS_HOME}"/sbin/start-dfs.sh --config $HADOOP_CONF_DIR
fi

# start yarn daemons if yarn is present
if [ -f "${HADOOP_YARN_HOME}"/sbin/start-yarn.sh ]; then
  "${HADOOP_YARN_HOME}"/sbin/start-yarn.sh --config $HADOOP_CONF_DIR
fi
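One detail worth calling out: the ${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR} expression above is bash's "use a default if unset or empty" expansion, and it shows up in almost every script in this chain. A minimal sketch of how it behaves (FOO is a made-up variable, not part of the Hadoop scripts):

# Illustration only: ${VAR:-default} keeps an existing value and falls back otherwise
unset FOO
echo "${FOO:-/default/path}"    # prints /default/path because FOO is unset
FOO=/custom/path
echo "${FOO:-/default/path}"    # prints /custom/path because FOO already has a value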

The script first locates hadoop-config.sh and then sources it. As the name suggests, hadoop-config.sh is responsible for setting up the Hadoop configuration; its content, with comments, is shown below:

this="${BASH_SOURCE-$0}"
common_bin=$(cd -P -- "$(dirname -- "$this")" && pwd -P)
script="$(basename -- "$this")"
this="$common_bin/$script"[ -f "$common_bin/hadoop-layout.sh" ] && . "$common_bin/hadoop-layout.sh"#以下这些路径如果没有设置,后面的相对路径都是在hadoop安装路径下
HADOOP_COMMON_DIR=${HADOOP_COMMON_DIR:-"share/hadoop/common"}  #指定hadoop-common-2.6.0.jar这类jar包的路径
HADOOP_COMMON_LIB_JARS_DIR=${HADOOP_COMMON_LIB_JARS_DIR:-"share/hadoop/common/lib"} #上面common路径更深一层的lib路径下的各种依赖jar包路径
HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_COMMON_LIB_NATIVE_DIR:-"lib/native"} #依赖的native动态链接库路径
HDFS_DIR=${HDFS_DIR:-"share/hadoop/hdfs"} #hdfs的依赖
HDFS_LIB_JARS_DIR=${HDFS_LIB_JARS_DIR:-"share/hadoop/hdfs/lib"} #hdfs的第三方依赖包
YARN_DIR=${YARN_DIR:-"share/hadoop/yarn"} #yarn的依赖
YARN_LIB_JARS_DIR=${YARN_LIB_JARS_DIR:-"share/hadoop/yarn/lib"} #yarn第三方依赖
MAPRED_DIR=${MAPRED_DIR:-"share/hadoop/mapreduce"} #mapreduce依赖
MAPRED_LIB_JARS_DIR=${MAPRED_LIB_JARS_DIR:-"share/hadoop/mapreduce/lib"} #mapreduce第三方依赖# the root of the Hadoop installation
# See HADOOP-6255 for directory structure layout
HADOOP_DEFAULT_PREFIX=$(cd -P -- "$common_bin"/.. && pwd -P)
HADOOP_PREFIX=${HADOOP_PREFIX:-$HADOOP_DEFAULT_PREFIX}
export HADOOP_PREFIX#这一段获取hadoop-evn.sh这类配置文件所在路径,如果入参中有通过--config指定路径则用指定的路径,否则就用默认的hadoop安装路径下etc/hadoop/的路径
#check to see if the conf dir is given as an optional argument 
if [ $# -gt 1 ]
thenif [ "--config" = "$1" ]thenshiftconfdir=$1if [ ! -d "$confdir" ]; thenecho "Error: Cannot find configuration directory: $confdir"exit 1fishiftHADOOP_CONF_DIR=$confdirfi
fi# 由于版本不同对默认配置文件的路径有变化,兼容不同版本而获取到默认配置文件路径
# Allow alternate conf dir location.
if [ -e "${HADOOP_PREFIX}/conf/hadoop-env.sh" ]; thenDEFAULT_CONF_DIR="conf"
elseDEFAULT_CONF_DIR="etc/hadoop"
fiexport HADOOP_CONF_DIR="${HADOOP_CONF_DIR:-$HADOOP_PREFIX/$DEFAULT_CONF_DIR}"# 用户可以在环境变量中配置HADOOP_SLAVES或是HADOOP_SLAVE_NAMES,但是不能两个都同时配置有值
# User can specify hostnames or a file where the hostnames are (not both)
if [[ ( "$HADOOP_SLAVES" != '' ) && ( "$HADOOP_SLAVE_NAMES" != '' ) ]] ; thenecho 
"Error: Please specify one variable HADOOP_SLAVES or " 
"HADOOP_SLAVE_NAME and not both."exit 1
fi# Process command line options that specify hosts or file with host
# 读取命令行中用户配置的--hosts或是--hostnames,表示HADOOP_SLAVES、HADOOP_SLAVE_NAMES
if [ $# -gt 1 ]
thenif [ "--hosts" = "$1" ]thenshiftexport HADOOP_SLAVES="${HADOOP_CONF_DIR}/$1"shiftelif [ "--hostnames" = "$1" ]thenshiftexport HADOOP_SLAVE_NAMES=$1shiftfi
fi#这里再次校验HADOOP_SLAVES和HADOOP_SLAVE_NAMES不能同时存在,但是这里再报错就知道错误原因是由于启动命令行传参有误导致
# User can specify hostnames or a file where the hostnames are (not both)
# (same check as above but now we know it's command line options that cause
# the problem)
if [[ ( "$HADOOP_SLAVES" != '' ) && ( "$HADOOP_SLAVE_NAMES" != '' ) ]] ; thenecho 
"Error: Please specify one of --hosts or --hostnames options and not both."exit 1
fi#hadoop-env.sh主要是配置各种环境变量(hadoop和jvm的)
if [ -f "${HADOOP_CONF_DIR}/hadoop-env.sh" ]; then. "${HADOOP_CONF_DIR}/hadoop-env.sh"
fi# check if net.ipv6.bindv6only is set to 1
bindv6only=$(/sbin/sysctl -n net.ipv6.bindv6only 2> /dev/null)
if [ -n "$bindv6only" ] && [ "$bindv6only" -eq "1" ] && [ "$HADOOP_ALLOW_IPV6" != "yes" ]
thenecho "Error: "net.ipv6.bindv6only" is set to 1 - Java networking could be broken"echo "For more info: http://wiki.apache.org/hadoop/HadoopIPv6"exit 1
fi# Newer versions of glibc use an arena memory allocator that causes virtual
# memory usage to explode. This interacts badly with the many threads that
# we use in Hadoop. Tune the variable down to prevent vmem explosion.
export MALLOC_ARENA_MAX=${MALLOC_ARENA_MAX:-4}# Attempt to set JAVA_HOME if it is not set
if [[ -z $JAVA_HOME ]]; then# On OSX use java_home (or /Library for older versions)if [ "Darwin" == "$(uname -s)" ]; thenif [ -x /usr/libexec/java_home ]; thenexport JAVA_HOME=($(/usr/libexec/java_home))elseexport JAVA_HOME=(/Library/Java/Home)fifi# Bail if we did not detect itif [[ -z $JAVA_HOME ]]; thenecho "Error: JAVA_HOME is not set and could not be found." 1>&2exit 1fi
fiJAVA=$JAVA_HOME/bin/java
# some Java parameters
JAVA_HEAP_MAX=-Xmx1000m # check envvars which might override default args
if [ "$HADOOP_HEAPSIZE" != "" ]; then#echo "run with heapsize $HADOOP_HEAPSIZE"JAVA_HEAP_MAX="-Xmx""$HADOOP_HEAPSIZE""m"#echo $JAVA_HEAP_MAX
fi
...(略)
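To make the effect of this excerpt concrete, here is a hypothetical run-through of sourcing hadoop-config.sh; the installation path /opt/hadoop and the heap size 2048 are made-up values for illustration, not taken from the scripts above:

# Assumes (for illustration) a Hadoop install under /opt/hadoop with configs in etc/hadoop
export HADOOP_HEAPSIZE=2048
. /opt/hadoop/libexec/hadoop-config.sh --config /opt/hadoop/etc/hadoop
echo "$HADOOP_CONF_DIR"    # -> /opt/hadoop/etc/hadoop, taken from --config
echo "$JAVA_HEAP_MAX"      # -> -Xmx2048m, derived from HADOOP_HEAPSIZE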

Since the script is long and sets a great many options, only selected parts are quoted above with explanations. hadoop-config.sh clearly acts as a kind of configuration hub: once it has set everything up, start-all.sh calls start-dfs.sh, and start-dfs.sh in turn calls hadoop-daemons.sh, a wrapper around the very familiar hadoop-daemon.sh. hadoop-daemon.sh is the script we normally use to start individual services such as a namenode or datanode by hand; here the wrapper invokes it automatically to start whichever daemons are needed. The core of start-dfs.sh is as follows:

#---------------------------------------------------------
# namenodes
NAMENODES=$($HADOOP_PREFIX/bin/hdfs getconf -namenodes)
echo "Starting namenodes on [$NAMENODES]"
"$HADOOP_PREFIX/sbin/hadoop-daemons.sh" \
    --config "$HADOOP_CONF_DIR" \
    --hostnames "$NAMENODES" \
    --script "$bin/hdfs" start namenode $nameStartOpt

#---------------------------------------------------------
# datanodes (using default slaves file)
if [ -n "$HADOOP_SECURE_DN_USER" ]; then
  echo \
    "Attempting to start secure cluster, skipping datanodes. " \
    "Run start-secure-dns.sh as root to complete startup."
else
  "$HADOOP_PREFIX/sbin/hadoop-daemons.sh" \
    --config "$HADOOP_CONF_DIR" \
    --script "$bin/hdfs" start datanode $dataStartOpt
fi

#---------------------------------------------------------
# secondary namenodes (if any)
SECONDARY_NAMENODES=$($HADOOP_PREFIX/bin/hdfs getconf -secondarynamenodes 2>/dev/null)
if [ -n "$SECONDARY_NAMENODES" ]; then
  echo "Starting secondary namenodes [$SECONDARY_NAMENODES]"
  "$HADOOP_PREFIX/sbin/hadoop-daemons.sh" \
      --config "$HADOOP_CONF_DIR" \
      --hostnames "$SECONDARY_NAMENODES" \
      --script "$bin/hdfs" start secondarynamenode
fi
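As a rough illustration of the namenode branch above: suppose hdfs getconf -namenodes prints a single host named nn1 (a made-up hostname) and nameStartOpt is empty, then the call effectively expands to:

# Hypothetical expansion for NAMENODES="nn1"; the hostname is an example only
"$HADOOP_PREFIX/sbin/hadoop-daemons.sh" \
    --config "$HADOOP_CONF_DIR" \
    --hostnames "nn1" \
    --script "$bin/hdfs" start namenode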

Again only the core part is listed here. Depending on the arguments passed in, hadoop-daemons.sh is called to start each type of daemon on the appropriate hosts. The hadoop-daemons.sh script is as follows:

usage="Usage: hadoop-daemons.sh [--config confdir] [--hosts hostlistfile] [start|stop] command args..."# if no args specified, show usage
if [ $# -le 1 ]; thenecho $usageexit 1
fibin=`dirname "${BASH_SOURCE-$0}"`
bin=`cd "$bin"; pwd`DEFAULT_LIBEXEC_DIR="$bin"/../libexec
HADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}
. $HADOOP_LIBEXEC_DIR/hadoop-config.shexec "$bin/slaves.sh" --config $HADOOP_CONF_DIR cd "$HADOOP_PREFIX" ; "$bin/hadoop-daemon.sh" --config $HADOOP_CONF_DIR "$@"
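For comparison, the manual, single-node use of hadoop-daemon.sh mentioned earlier looks roughly like this; it skips the slaves.sh ssh fan-out entirely and only starts or stops a daemon on the local machine (a sketch, assuming it is run from the Hadoop installation directory):

# Start or stop one daemon on the local node only (no ssh to other hosts)
sbin/hadoop-daemon.sh --config etc/hadoop start namenode
sbin/hadoop-daemon.sh --config etc/hadoop start datanode
sbin/hadoop-daemon.sh --config etc/hadoop stop datanode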

At the end of hadoop-daemons.sh, exec hands off to slaves.sh, whose job is to run the given command on every slave node so the corresponding services start there; it reads the host list from the slaves configuration file (or from the variables set up by hadoop-config.sh). The slaves.sh script is as follows:

usage="Usage: slaves.sh [--config confdir] command..."# if no args specified, show usage
if [ $# -le 0 ]; thenecho $usageexit 1
fibin=`dirname "${BASH_SOURCE-$0}"`
bin=`cd "$bin"; pwd`DEFAULT_LIBEXEC_DIR="$bin"/../libexec
HADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}
. $HADOOP_LIBEXEC_DIR/hadoop-config.shif [ -f "${HADOOP_CONF_DIR}/hadoop-env.sh" ]; then. "${HADOOP_CONF_DIR}/hadoop-env.sh"
fi# Where to start the script, see hadoop-config.sh
# (it set up the variables based on command line options)
if [ "$HADOOP_SLAVE_NAMES" != '' ] ; thenSLAVE_NAMES=$HADOOP_SLAVE_NAMES
elseSLAVE_FILE=${HADOOP_SLAVES:-${HADOOP_CONF_DIR}/slaves}SLAVE_NAMES=$(cat "$SLAVE_FILE" | sed  's/#.*$//;/^$/d')
fi# start the daemons
for slave in $SLAVE_NAMES ; dossh $HADOOP_SSH_OPTS $slave $"${@// /\ }" 

2>&1 | sed “s/^/KaTeX parse error: Expected 'EOF', got '&' at position 11: slave: /" &̲ if [ "HADOOP_SLAVE_SLEEP” != “” ]; then
sleep $HADOOP_SLAVE_SLEEP
fi
done

wait
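To see what that loop actually runs, here is an approximate expansion for a single slave; the hostname slave1 and the /opt/hadoop paths are made-up examples, and in reality the arguments are forwarded as escaped words rather than one quoted string:

# Roughly what each iteration does for slave1; output lines get the "slave1: " prefix
ssh $HADOOP_SSH_OPTS slave1 \
    "cd /opt/hadoop ; /opt/hadoop/sbin/hadoop-daemon.sh --config /opt/hadoop/etc/hadoop --script /opt/hadoop/bin/hdfs start datanode" \
    2>&1 | sed "s/^/slave1: /" &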

So slaves.sh takes the command built by hadoop-daemons.sh and runs it over ssh on each slave node to start the corresponding daemon there. As shown above, when start-dfs.sh calls hadoop-daemons.sh it passes in the script used to launch each daemon, always something of the form ./bin/hdfs start namenode, so the actual launch logic clearly lives in the hdfs script. That script is fairly long; in essence, depending on which service is requested (and there are quite a few of them), it assembles the matching parameters and then calls:

exec "$JAVA" -Dproc_$COMMAND $JAVA_HEAP_MAX $HADOOP_OPTS $CLASS "$@"

to start the service; this is the last line of the hdfs script. That is the overall flow of the startup scripts. Personally, I think the part most worth understanding is the set of configuration options in hadoop-config.sh, because getting a deeper grip on the cluster sometimes requires knowing, and even tuning, those parameters.
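As a rough sketch of what that exec line becomes when the command is namenode: the hdfs script sets CLASS to org.apache.hadoop.hdfs.server.namenode.NameNode, so, ignoring the classpath and the many extra options collected along the way, the final command is approximately:

# Simplified final command for a namenode; real invocations carry a full classpath and many more flags
java -Dproc_namenode -Xmx1000m $HADOOP_OPTS org.apache.hadoop.hdfs.server.namenode.NameNode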

