Component | Version |
---|---|
hadoop | 2.6.5 |
hive | 2.3.6 |
tez | 0.8.5 |

Tez places requirements on the Hadoop version: Tez 0.8 and later requires Hadoop 2.6 or later, and Tez 0.9 and later requires Hadoop 2.7 or later.
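The compatibility rule above can be checked before installing anything. The following is a minimal sketch that encodes only the two thresholds stated above; the helper names `version_ge`, `min_hadoop_for_tez`, and `tez_hadoop_ok` are hypothetical, and Tez versions below 0.8 are deliberately reported as unknown.

```shell
#!/bin/sh
# version_ge A B: succeed if dotted version A >= B (compares major.minor only).
version_ge() (
  a_major=${1%%.*}; a_rest=${1#*.}; a_minor=${a_rest%%.*}
  b_major=${2%%.*}; b_rest=${2#*.}; b_minor=${b_rest%%.*}
  [ "$a_major" -gt "$b_major" ] && exit 0
  [ "$a_major" -eq "$b_major" ] && [ "$a_minor" -ge "$b_minor" ]
)

# min_hadoop_for_tez TEZ_VERSION: print the minimum Hadoop version from the rule above.
min_hadoop_for_tez() (
  if version_ge "$1" 0.9; then
    echo 2.7
  elif version_ge "$1" 0.8; then
    echo 2.6
  else
    echo "no rule recorded for Tez $1" >&2
    exit 1
  fi
)

# tez_hadoop_ok TEZ_VERSION HADOOP_VERSION: succeed if the pair satisfies the rule.
tez_hadoop_ok() (
  min=$(min_hadoop_for_tez "$1") || exit 1
  version_ge "$2" "$min"
)

# The versions used in this post.
tez_hadoop_ok 0.8.5 2.6.5 && echo "tez 0.8.5 on hadoop 2.6.5: ok"
```

Only the major and minor components are compared, which is enough for the two thresholds given here.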
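Download apache-tez-0.8.5-bin.tar.gz, extract it under the /usr/local/src directory, and create a symbolic link to it, as shown in the figure below. The Tez website describes building Tez from source, but because compiling from source is slow, this post uses the prebuilt package apache-tez-0.8.5-bin.tar.gz directly.

Upload tez.tar.gz to HDFS. The tez.tar.gz package is located in the ${TEZ_HOME}/share directory.

hdfs dfs -mkdir -p /apps/tez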
hdfs dfs -put tez/share/tez.tar.gz /apps/tez
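The extract-and-link step described above can be scripted. This is a sketch under stated assumptions: `install_tez` is a hypothetical helper, it assumes the tarball unpacks into a directory named after the archive (and checks that it did), and it names the link `tez` so that it matches a TEZ_HOME of /usr/local/src/tez.

```shell
#!/bin/sh
# install_tez ARCHIVE PREFIX: extract a Tez binary tarball under PREFIX and
# point PREFIX/tez at the extracted directory.
install_tez() (
  archive=$1
  prefix=$2
  # Assumption: apache-tez-0.8.5-bin.tar.gz unpacks into apache-tez-0.8.5-bin/
  dir_name=$(basename "$archive" .tar.gz)
  mkdir -p "$prefix" || exit 1
  tar -xzf "$archive" -C "$prefix" || exit 1
  if [ ! -d "$prefix/$dir_name" ]; then
    echo "expected directory $prefix/$dir_name after extraction" >&2
    exit 1
  fi
  # -n replaces an existing link instead of descending into its target.
  ln -sfn "$dir_name" "$prefix/tez"
)
```

Calling `install_tez apache-tez-0.8.5-bin.tar.gz /usr/local/src` would then leave /usr/local/src/tez pointing at the extracted release, after which `hdfs dfs -ls /apps/tez` can confirm that the upload succeeded.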
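Create a tez-site.xml file in the ${HADOOP_HOME}/etc/hadoop directory with the content below. The tez.lib.uris property points to the path of the tez.tar.gz that was just uploaded to HDFS. Once the file is written, copy tez-site.xml to the ${HADOOP_HOME}/etc/hadoop directory on every node.
<?xml version="1.0"?>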
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>tez.lib.uris</name>
<value>${fs.defaultFS}/apps/tez/tez.tar.gz</value>
</property>
</configuration>
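Writing the file and copying it to every node can be done in one pass. The sketch below is illustrative: `write_tez_site` and `distribute_tez_site` are hypothetical helpers, and the copy step assumes passwordless SSH and the same HADOOP_HOME path on every node.

```shell
#!/bin/sh
# write_tez_site CONF_DIR: write the minimal tez-site.xml shown above.
write_tez_site() (
  conf_dir=$1
  mkdir -p "$conf_dir" || exit 1
  # The quoted heredoc delimiter keeps ${fs.defaultFS} literal so that
  # Hadoop, not the shell, resolves it.
  cat > "$conf_dir/tez-site.xml" <<'EOF'
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>tez.lib.uris</name>
<value>${fs.defaultFS}/apps/tez/tez.tar.gz</value>
</property>
</configuration>
EOF
)

# distribute_tez_site CONF_DIR HOST...: copy the file to the same path on each host.
distribute_tez_site() (
  conf_dir=$1
  shift
  for host in "$@"; do
    scp "$conf_dir/tez-site.xml" "$host:$conf_dir/" || exit 1
  done
)
```

A typical call would be `write_tez_site "${HADOOP_HOME}/etc/hadoop"` followed by `distribute_tez_site "${HADOOP_HOME}/etc/hadoop" slave1 slave2`, where slave1 and slave2 are placeholder hostnames.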
Add the following Tez environment variables to /etc/profile:
export TEZ_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export TEZ_HOME=/usr/local/src/tez
export TEZ_JARS=${TEZ_HOME}/*:${TEZ_HOME}/lib/*
export HADOOP_CLASSPATH=$TEZ_CONF_DIR:$TEZ_JARS:$HADOOP_CLASSPATH
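Before starting any Hive service it can help to confirm that the current shell actually carries these variables. The following is a small sketch; `check_tez_env` is a hypothetical helper, not part of Tez or Hadoop.

```shell
#!/bin/sh
# check_tez_env: report which Tez-related variables are missing in this shell.
check_tez_env() (
  status=0
  for var in TEZ_CONF_DIR TEZ_HOME TEZ_JARS HADOOP_CLASSPATH; do
    eval "value=\${$var:-}"
    if [ -z "$value" ]; then
      echo "missing: $var" >&2
      status=1
    fi
  done
  # HADOOP_CLASSPATH should contain the Tez install path.
  case "${HADOOP_CLASSPATH:-}" in
    *"${TEZ_HOME:-/nonexistent}"*) ;;
    *) echo "HADOOP_CLASSPATH does not reference TEZ_HOME" >&2; status=1 ;;
  esac
  exit "$status"
)
```

After sourcing the profile, `check_tez_env && hadoop classpath` shows whether the Tez entries made it onto the classpath that Hadoop reports.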
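Configure the YARN Timeline Server service. Before Hadoop 2.4, job monitoring existed only for MapReduce, through the Job History Server, which lets users query information about completed jobs. As more computing frameworks such as Spark and Tez were integrated on YARN, those engines needed job monitoring tools as well, so the Hadoop developers built a more general-purpose Job History Server, namely the YARN Timeline Server. Add the following to yarn-site.xml to configure the YARN Timeline Server. For more detailed configuration, see the TimelineServer documentation.

Pay particular attention to yarn.timeline-service.hostname: it must be set to the address of the node on which the TimelineServer service is started. For example, I start it on the master machine, so the value here is master. The default given in the official documentation is 0.0.0.0, which causes the DAG Master to report that it cannot find the TimelineServer. Changing it to the real hostname resolves this.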
<!--configurations for timelineserver-->
<property>
<name>yarn.timeline-service.hostname</name>
<value>master</value>
</property>
<property>
<description>Address for the Timeline server to start the RPC server.</description>
<name>yarn.timeline-service.address</name>
<value>${yarn.timeline-service.hostname}:10200</value>
</property>
<property>
<description>The http address of the Timeline service web application.</description>
<name>yarn.timeline-service.webapp.address</name>
<value>${yarn.timeline-service.hostname}:8188</value>
</property>
<property>
<description>The https address of the Timeline service web application.</description>
<name>yarn.timeline-service.webapp.https.address</name>
<value>${yarn.timeline-service.hostname}:8190</value>
</property>
<property>
<description>Handler thread count to serve the client RPC requests.</description>
<name>yarn.timeline-service.handler-thread-count</name>
<value>10</value>
</property>
<property>
<description>The max number of applications could be fetched by using REST API or application history protocol and shown in timeline server web ui. Defaults to `10000`.</description>
<name>yarn.timeline-service.generic-application-history.max-applications</name>
<value>10000</value>
</property>
<property>
<description>Enables cross-origin support (CORS) for web services where cross-origin web response headers are needed. For example, javascript making a web services request to the timeline server.</description>
<name>yarn.timeline-service.http-cross-origin.enabled</name>
<value>true</value>
</property>
<property>
<description>Comma separated list of origins that are allowed for web services needing cross-origin (CORS) support. Wildcards (*) and patterns allowed</description>
<name>yarn.timeline-service.http-cross-origin.allowed-origins</name>
<value>*</value>
</property>
<property>
<description>Comma separated list of methods that are allowed for web services needing cross-origin (CORS) support.</description>
<name>yarn.timeline-service.http-cross-origin.allowed-methods</name>
<value>GET,POST,HEAD</value>
</property>
<property>
<description>Comma separated list of headers that are allowed for web services needing cross-origin (CORS) support.</description>
<name>yarn.timeline-service.http-cross-origin.allowed-headers</name>
<value>X-Requested-With,Content-Type,Accept,Origin</value>
</property>
<property>
<description>The number of seconds a pre-flighted request can be cached for web services needing cross-origin (CORS) support.</description>
<name>yarn.timeline-service.http-cross-origin.max-age</name>
<value>1800</value>
</property>
<property>
<description>Indicate to ResourceManager as well as clients whether history-service is enabled or not. If enabled, ResourceManager starts recording historical data that Timeline service can consume. Similarly, clients can redirect to the history service when applications finish if this is enabled.</description>
<name>yarn.timeline-service.generic-application-history.enabled</name>
<value>true</value>
</property>
<property>
<description>Store class name for history store, defaulting to file system store</description>
<name>yarn.timeline-service.generic-application-history.store-class</name>
<value>org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore</value>
</property>
<property>
<description>Indicate to clients whether Timeline service is enabled or not. If enabled, the TimelineClient library used by end-users will post entities and events to the Timeline server.</description>
<name>yarn.timeline-service.enabled</name>
<value>true</value>
</property>
<property>
<description>Store class name for timeline store.</description>
<name>yarn.timeline-service.store-class</name>
<value>org.apache.hadoop.yarn.server.timeline.LeveldbTimelineStore</value>
</property>
<property>
<description>Enable age off of timeline store data.</description>
<name>yarn.timeline-service.ttl-enable</name>
<value>true</value>
</property>
<property>
<description>Publish YARN information to Timeline Server</description>
<name>yarn.resourcemanager.system-metrics-publisher.enabled</name>
<value>true</value>
</property>
<property>
<description>Time to live for timeline store data in milliseconds.</description>
<name>yarn.timeline-service.ttl-ms</name>
<value>604800000</value>
</property>
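Because a hostname left at 0.0.0.0 is the failure called out above, it is worth checking yarn-site.xml for it before starting the services. The sketch below uses a plain text scan with awk rather than a real XML parser, so it would also match a commented-out property; `get_xml_property` and `check_timeline_hostname` are hypothetical helpers.

```shell
#!/bin/sh
# get_xml_property FILE NAME: print the <value> of the Hadoop-style property NAME.
get_xml_property() {
  awk -v name="$2" '
    { buf = buf $0 " " }
    END {
      # Split the whole file into one chunk per property.
      n = split(buf, props, "</property>")
      for (i = 1; i <= n; i++) {
        if (index(props[i], "<name>" name "</name>") > 0) {
          value = props[i]
          sub(/.*<value>/, "", value)
          sub(/<\/value>.*/, "", value)
          print value
          exit 0
        }
      }
      exit 1
    }' "$1"
}

# check_timeline_hostname FILE: fail if the hostname is unset or still 0.0.0.0.
check_timeline_hostname() (
  host=$(get_xml_property "$1" yarn.timeline-service.hostname) || {
    echo "yarn.timeline-service.hostname is not set in $1" >&2
    exit 1
  }
  if [ "$host" = "0.0.0.0" ]; then
    echo "yarn.timeline-service.hostname is 0.0.0.0; use the real hostname" >&2
    exit 1
  fi
  echo "$host"
)
```

Running `check_timeline_hostname "${HADOOP_HOME}/etc/hadoop/yarn-site.xml"` prints the configured hostname or exits non-zero with a message.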
Add the following to tez-site.xml to configure the Tez UI:
<property>
<name>tez.history.logging.service.class</name>
<value>org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService</value>
</property>
<property>
<name>tez.tez-ui.history-url.base</name>
<value>http://master:8080/tez-ui/</value>
</property>
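Copy ${TEZ_HOME}/tez-ui-0.8.5.war to ${TOMCAT_HOME}/webapps/ and rename it to tez-ui.war, as shown in the figure below. This name corresponds to the value of the tez.tez-ui.history-url.base property in tez-site.xml above.

If Tomcat does not run on the node where the YARN Timeline Server service is started, edit tez-ui/scripts/configs.js as shown below so that timelineBaseUrl and RMWebUrl point to the correct addresses.

Edit hive-site.xml and change the execution engine to tez, as shown below.
<property>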
<name>hive.execution.engine</name>
<value>tez</value>
<description/>
</property>
Start HDFS and YARN, then the Timeline Server service and Tomcat:
start-dfs.sh
start-yarn.sh
yarn-daemon.sh start historyserver
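Once everything is started, a quick HTTP probe shows whether each web endpoint is answering. This is a sketch with assumptions: `tez_stack_urls` and `probe_tez_stack` are hypothetical helpers, ports 8188 and 8080 come from the configuration above, and 8088 is assumed to be the ResourceManager web UI port because that is the Hadoop default.

```shell
#!/bin/sh
# tez_stack_urls HOST: print the web endpoints this setup exposes on HOST.
tez_stack_urls() (
  host=$1
  echo "http://$host:8188/ws/v1/timeline/"   # Timeline Server REST root
  echo "http://$host:8080/tez-ui/"           # Tez UI served by Tomcat
  echo "http://$host:8088/cluster"           # ResourceManager web UI (default port, assumed)
)

# probe_tez_stack HOST: report which endpoints answer over HTTP.
probe_tez_stack() {
  tez_stack_urls "$1" | while read -r url; do
    if curl -sf -o /dev/null --max-time 5 "$url"; then
      echo "OK   $url"
    else
      echo "FAIL $url"
    fi
  done
}
```

Running `probe_tez_stack master` from any machine that can reach the cluster prints one OK or FAIL line per endpoint.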
After you run an HQL statement in Hive, the result looks like the figure below, and you can click through from the YARN UI to the Tez UI.
By default, the history files for applications are stored at the location given by yarn.timeline-service.leveldb-timeline-store.path, whose default value is ${hadoop.tmp.dir}/yarn/timeline.
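To find that directory on a running node, the default can be resolved from hadoop.tmp.dir. This sketch applies only when yarn.timeline-service.leveldb-timeline-store.path has not been overridden; `timeline_store_path` is a hypothetical helper that falls back to `hdfs getconf` when no directory is passed in.

```shell
#!/bin/sh
# timeline_store_path [HADOOP_TMP_DIR]: print the default leveldb store location.
timeline_store_path() (
  tmp_dir=${1:-}
  if [ -z "$tmp_dir" ]; then
    # Ask Hadoop for the effective hadoop.tmp.dir when no argument is given.
    tmp_dir=$(hdfs getconf -confKey hadoop.tmp.dir) || exit 1
  fi
  # Strip one trailing slash so the joined path has no double slash.
  echo "${tmp_dir%/}/yarn/timeline"
)
```

On the Timeline Server node, `du -sh "$(timeline_store_path)"` then shows how much history has accumulated.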
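To fall back to Hive on MR, use the unset command to clear the Tez-related environment variables and HADOOP_CLASSPATH in the current session, change the execution engine in hive-site.xml, then restart the hiveserver2 service and reconnect with beeline.

To switch back to Hive on Tez, run source /etc/profile to reload the Tez-related environment variables and HADOOP_CLASSPATH, change the execution engine in hive-site.xml, then restart the hiveserver2 service and reconnect with beeline.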
unset HADOOP_CLASSPATH
unset TEZ_CONF_DIR
unset TEZ_HOME
unset TEZ_JARS
beeline -u jdbc:hive2://master:10000 -n root --hiveconf hive.execution.engine=mr
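If you switch to the MR engine directly without following the steps above, you may hit errors such as NoSuchFieldError, which clearly point to a version incompatibility.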
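The environment half of the switch can be wrapped in two shell functions so that it is harder to forget a variable. This is a sketch only: `use_mr` and `use_tez` are hypothetical names, `use_tez` re-exports the same values as the export lines earlier in this post instead of sourcing the profile, and editing hive-site.xml and restarting hiveserver2 remain manual steps.

```shell
#!/bin/sh
# use_mr: drop the Tez variables from the current shell before starting hiveserver2.
use_mr() {
  unset HADOOP_CLASSPATH TEZ_CONF_DIR TEZ_HOME TEZ_JARS
}

# use_tez: restore the Tez variables in the current shell.
use_tez() {
  export TEZ_CONF_DIR="${HADOOP_HOME}/etc/hadoop"
  export TEZ_HOME=/usr/local/src/tez
  export TEZ_JARS="${TEZ_HOME}/*:${TEZ_HOME}/lib/*"
  case ":${HADOOP_CLASSPATH:-}:" in
    *":$TEZ_JARS:"*) ;;   # already present, do not prepend twice
    *) HADOOP_CLASSPATH="$TEZ_CONF_DIR:$TEZ_JARS${HADOOP_CLASSPATH:+:$HADOOP_CLASSPATH}" ;;
  esac
  export HADOOP_CLASSPATH
}
```

The functions must be defined in, or sourced into, the shell that later starts hiveserver2, since a child script cannot change its parent's environment. Note that `use_mr` clears HADOOP_CLASSPATH entirely, as the unset commands above do, so any non-Tez entries would need to be restored separately.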