Basic environment and resources
Hadoop: 2.7.x
Hive: 2.1.x (bin.tar.gz binary release)
Hive: 1.x (src.tar.gz source release)
Step 1: Install Hadoop 2.7.x on Windows; for that, see:
Step 2: Download the Hive tar.gz from the official archive: http://archive.apache.org/dist/hive
Step 3: Extract the Hive tar.gz to a target directory (C:\hive) and configure the Hive global environment variables.
Hive global environment variables:
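A minimal sketch of setting them from an administrator cmd window, assuming the install path above (setx only takes effect in new sessions; the System Properties > Environment Variables dialog works just as well):

:: Point HIVE_HOME at the extracted Hive directory
setx HIVE_HOME "C:\hive\apache-hive-2.1.1-bin"
:: Append Hive's bin directory to the user Path (the GUI editor is safer for very long Path values)
setx Path "%Path%;C:\hive\apache-hive-2.1.1-bin\bin"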
Step 4: Hive configuration files (C:\hive\apache-hive-2.1.1-bin\conf)
The configuration directory C:\hive\apache-hive-2.1.1-bin\conf ships with four default configuration file templates; copy each one to a new file name:
hive-default.xml.template -----> hive-site.xml
hive-env.sh.template -----> hive-env.sh
hive-exec-log4j2.properties.template -----> hive-exec-log4j2.properties
hive-log4j2.properties.template -----> hive-log4j2.properties
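A minimal sketch of the copy step from a cmd window, assuming the conf directory above:

cd C:\hive\apache-hive-2.1.1-bin\conf
copy hive-default.xml.template hive-site.xml
copy hive-env.sh.template hive-env.sh
copy hive-exec-log4j2.properties.template hive-exec-log4j2.properties
copy hive-log4j2.properties.template hive-log4j2.properties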
Step 5: Create the local directory that the configuration files below refer to (its subdirectories are sketched after the path):
C:\hive\apache-hive-2.1.1-bin\my_hive
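A minimal sketch of creating it together with the four subdirectories that the hive-site.xml below points at:

mkdir C:\hive\apache-hive-2.1.1-bin\my_hive\scratch_dir
mkdir C:\hive\apache-hive-2.1.1-bin\my_hive\resources_dir
mkdir C:\hive\apache-hive-2.1.1-bin\my_hive\querylog_dir
mkdir C:\hive\apache-hive-2.1.1-bin\my_hive\operation_logs_dir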
Step 6: Hive configuration files that need adjusting (hive-site.xml and hive-env.sh)
Edit the C:\hive\apache-hive-2.1.1-bin\conf\hive-site.xml file:
<!-- Hive warehouse directory: an HDFS path where table data is stored -->
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/user/hive/warehouse</value>
  <description>location of default database for the warehouse</description>
</property>

<!-- Hive scratch directory for temporary job data, also an HDFS path -->
<property>
  <name>hive.exec.scratchdir</name>
  <value>/tmp/hive</value>
  <description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/&lt;username&gt; is created, with ${hive.scratch.dir.permission}.</description>
</property>

<!-- scratchdir local directory -->
<property>
  <name>hive.exec.local.scratchdir</name>
  <value>C:/hive/apache-hive-2.1.1-bin/my_hive/scratch_dir</value>
  <description>Local scratch space for Hive jobs</description>
</property>

<!-- resources_dir local directory -->
<property>
  <name>hive.downloaded.resources.dir</name>
  <value>C:/hive/apache-hive-2.1.1-bin/my_hive/resources_dir/${hive.session.id}_resources</value>
  <description>Temporary local directory for added resources in the remote file system.</description>
</property>

<!-- querylog local directory -->
<property>
  <name>hive.querylog.location</name>
  <value>C:/hive/apache-hive-2.1.1-bin/my_hive/querylog_dir</value>
  <description>Location of Hive run time structured log file</description>
</property>

<!-- operation_logs local directory -->
<property>
  <name>hive.server2.logging.operation.log.location</name>
  <value>C:/hive/apache-hive-2.1.1-bin/my_hive/operation_logs_dir</value>
  <description>Top level directory where operation logs are stored if logging functionality is enabled</description>
</property>

<!-- Metastore database connection URL (note: & must be escaped as &amp; inside XML) -->
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://192.168.60.178:3306/hive?serverTimezone=UTC&amp;useSSL=false&amp;allowPublicKeyRetrieval=true</value>
  <description>JDBC connect string for a JDBC metastore.</description>
</property>

<!-- Metastore database JDBC driver -->
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.cj.jdbc.Driver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>

<!-- Metastore database user name -->
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>admini</value>
  <description>Username to use against metastore database</description>
</property>

<!-- Metastore database password -->
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>123456</value>
  <description>password to use against metastore database</description>
</property>

<!-- Fixes: Caused by: MetaException(message:Version information not found in metastore.) -->
<property>
  <name>hive.metastore.schema.verification</name>
  <value>false</value>
  <description>
    Enforce metastore schema version consistency.
    True: Verify that version information stored in the metastore is compatible with the Hive jars. Also disable automatic
          schema migration attempt. Users are required to manually migrate schema after Hive upgrade which ensures
          proper metastore schema migration. (Default)
    False: Warn if the version information stored in metastore doesn't match with one from in Hive jars.
  </description>
</property>

<!-- Auto-create the full metastore schema -->
<!-- Fixes: Required table missing : "DBS" in Catalog "" Schema "" -->
<property>
  <name>datanucleus.schema.autoCreateAll</name>
  <value>true</value>
  <description>Auto creates necessary schema on a startup if one doesn't exist. Set this to false, after creating it once. To enable auto create also set hive.metastore.schema.verification=false. Auto creation is not recommended for production use cases, run schematool command instead.</description>
</property>
Edit the C:\hive\apache-hive-2.1.1-bin\conf\hive-env.sh file:
# Set HADOOP_HOME to point to a specific hadoop install directory
export HADOOP_HOME=C:\hadoop\hadoop-2.7.6

# Hive Configuration Directory can be controlled by:
export HIVE_CONF_DIR=C:\hive\apache-hive-2.1.1-bin\conf

# Folder containing extra libraries required for hive compilation/execution can be controlled by:
export HIVE_AUX_JARS_PATH=C:\hive\apache-hive-2.1.1-bin\lib
Step 7: Create the HDFS directories on Hadoop:
hadoop fs -mkdir /tmp
hadoop fs -mkdir /user/
hadoop fs -mkdir /user/hive/
hadoop fs -mkdir /user/hive/warehouse
hadoop fs -chmod g+w /tmp
hadoop fs -chmod g+w /user/hive/warehouse
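To verify the directories and permissions, a quick check from the same shell:

hadoop fs -ls /
hadoop fs -ls /user/hive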
Step 8: Create the database named hive that Hive initialization depends on; note the character encoding must be latin1 (see the sketch below).
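A minimal sketch of creating it with the MySQL command-line client, assuming the client is on the PATH and using the host from hive-site.xml above (the root account here is an assumption; any account that can create databases and grant access to the configured user works):

:: create the metastore database with latin1 encoding
mysql -h 192.168.60.178 -u root -p -e "CREATE DATABASE hive DEFAULT CHARACTER SET latin1;"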
Step 9: Start the Hive services
(1) Start Hadoop first by running: start-all.cmd
(2) Initialize the Hive metadata by running: hive --service metastore
If everything is normal, the cmd window output looks like the screenshot below.
If the Hive initialization succeeded, the Hive tables in the MySQL hive database look like the screenshot below.
(3) Start the Hive CLI by running: hive
With that, the Hive setup on Windows 10 is complete.
Problem (1): Hive metadata initialization (hive --service metastore) kept failing.
Workaround: initialize the Hive database manually with the SQL scripts that ship with Hive.
The bundled scripts live in C:\hive\apache-hive-2.1.1-bin\scripts\metastore\upgrade; choose the SQL version to run, as in the screenshot below.
Pick the SQL script (hive-schema-x.x.x.mysql.sql) that corresponds to your Hive version (Hive_x.x.x).
Note: my Hive version is 2.1.1, so I chose the hive-schema-2.1.0.mysql.sql script; a sketch of running it follows.
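A minimal sketch of running that script against the metastore database, assuming the MySQL client is on the PATH, the mysql subdirectory of the upgrade folder, and the hive database created in Step 8:

:: run from inside the script's directory so any files it sources resolve correctly
cd C:\hive\apache-hive-2.1.1-bin\scripts\metastore\upgrade\mysql
mysql -h 192.168.60.178 -u root -p hive < hive-schema-2.1.0.mysql.sql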
Problem (2): the Hive binary release (Hive_x.x.x_bin.tar.gz) lacks the Windows executables and launch scripts.
Fix: download an older Hive source release (apache-hive-1.0.0-src) and use its bin directory to replace the original bin directory under C:\hive\apache-hive-2.1.1-bin, as sketched below.
Screenshot: layout of the apache-hive-1.0.0-src\bin directory.
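A minimal sketch of the swap from a cmd window, assuming the source release was extracted to C:\hive\apache-hive-1.0.0-src (that path is an assumption):

:: keep the original bin as a backup, then copy in the 1.0.0 bin with its Windows scripts
ren C:\hive\apache-hive-2.1.1-bin\bin bin.bak
xcopy /E /I C:\hive\apache-hive-1.0.0-src\bin C:\hive\apache-hive-2.1.1-bin\bin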