Requirements
- Java 1.7
  Note: Hive versions 1.2 onward require Java 1.7 or newer. Hive versions 0.14 to 1.1 work with Java 1.6 as well. Users are strongly advised to start moving to Java 1.8 (see HIVE-8607).
- Hadoop 2.x (preferred), 1.x (not supported by Hive 2.0.0 onward).
  Hive versions up to 0.13 also supported Hadoop 0.20.x, 0.23.x.
- Hive is commonly used in production Linux and Windows environments. Mac is a commonly used development environment. The instructions in this document are applicable to Linux and Mac. Using it on Windows would require slightly different steps.
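The Java requirement above can be checked before installing. The following is a minimal sketch, not part of Hive: the helper names java_feature_version and java_at_least are hypothetical, and it assumes version strings of the form 1.7.0_80 (old scheme) or 11.0.2 (new scheme).

```shell
# Hypothetical helper: reduce a Java version string to its feature number.
# "1.7.0_80" -> 7, "1.8.0_292" -> 8, "11.0.2" -> 11
java_feature_version() {
  v="$1"
  case "$v" in
    1.*) v="${v#1.}"; echo "${v%%.*}" ;;   # old scheme: drop the leading "1."
    *)   echo "${v%%[.+-]*}" ;;            # new scheme: keep the first component
  esac
}

# Hypothetical helper: succeed if the version string ($1) is at least
# the required feature number ($2), e.g. 7 for Hive 1.2 onward.
java_at_least() {
  [ "$(java_feature_version "$1")" -ge "$2" ]
}
```

One way to obtain the version string is java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}', after which java_at_least "$ver" 7 reports whether the installed Java is new enough for Hive 1.2.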
Hive Metastore Administration.
Introduction
All the metadata for Hive tables and partitions is accessed through the Hive Metastore. Metadata is persisted using the JPOX ORM solution (DataNucleus), so any database that it supports can be used by Hive. Most commercial relational databases and many open source databases are supported. See the list of supported databases in the section below.
You can find an E/R diagram for the metastore here.
There are two different ways to set up the metastore server and metastore database using different Hive configurations:
Configuration options for metastore database where metadata is persisted:
Configuration options for metastore server:
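Install and configure MySQL
yum install -y mysql-server
service mysqld start
mysql
# Grant remote login privileges
grant all privileges on *.* to 'root'@'%' identified by '123' with grant option;
# Delete the original duplicate login entries
use mysql;
delete from user where host!='%';
# After the change, flush privileges or restart the service for it to take effect
flush privileges;
# After exiting, log in with the password to test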
Start the cluster
zkServer.sh start
# If the node that runs start-all.sh is also a ResourceManager node, its RM is started directly (the standby RM is not included)
start-all.sh
# Start the RM
yarn-daemon.sh start resourcemanager
Upload the apache-hive-1.2.1-bin.tar.gz installation package and the mysql-connector-java-5.1.32-bin.jar MySQL driver jar
Install Hive 1.2.1
tar xf apache-hive-1.2.1-bin.tar.gz -C /opt/
Configure the environment variables, then run hive
to test whether the configuration succeeded
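The environment variables are not spelled out here, so the following is a minimal sketch of what they might look like. It assumes the tarball was extracted under /opt and produced the directory /opt/apache-hive-1.2.1-bin; adjust the path if your layout differs.

```shell
# Assumption: the Hive 1.2.1 tarball was extracted to /opt/apache-hive-1.2.1-bin
export HIVE_HOME=/opt/apache-hive-1.2.1-bin
# Put the hive launcher script on the PATH
export PATH="$PATH:$HIVE_HOME/bin"
```

Appending these two lines to a profile file such as /etc/profile or ~/.bashrc and re-sourcing it makes them persistent; command -v hive should then print the path of the launcher.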
For the separated multi-user mode, Hive must also be copied to the other nodes
Remote Metastore Database
In this configuration, you would use a traditional standalone RDBMS server. The following example configuration will set up a metastore in a MySQL server. This configuration of metastore database is recommended for any real use.
Config Param | Config Value | Comment
javax.jdo.option.ConnectionURL | jdbc:mysql://<host name>/<database name>?createDatabaseIfNotExist=true | metadata is stored in a MySQL server
javax.jdo.option.ConnectionDriverName | com.mysql.jdbc.Driver | MySQL JDBC driver class
javax.jdo.option.ConnectionUserName | <user name> | user name for connecting to MySQL server
javax.jdo.option.ConnectionPassword | <password> | password for connecting to MySQL server
Local/Embedded Metastore Server
In local/embedded metastore setup, the metastore server component is used like a library within the Hive Client. Each Hive Client will open a connection to the database and make SQL queries against it. Make sure that the database is accessible from the machines where Hive queries are executed since this is a local store. Also make sure the JDBC client library is in the classpath of Hive Client. This configuration is often used with HiveServer2 (to use embedded metastore only with HiveServer2, add --hiveconf hive.metastore.uris=' ' to the command line parameters of the hiveserver2 start command, or use hiveserver2-site.xml (available in Hive 0.14)).
Config Param | Config Value | Comment
hive.metastore.uris | not needed because this is local store |
hive.metastore.local | true | this is local store (removed in Hive 0.10, see configuration description section)
hive.metastore.warehouse.dir | <base hdfs path> | Points to default location of non-external Hive tables in HDFS.
hive-site.xml (rename hive-default.xml.template and delete the default configuration entries in it)
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
   Licensed to the Apache Software Foundation (ASF) under one or more
   contributor license agreements. See the NOTICE file distributed with
   this work for additional information regarding copyright ownership.
   The ASF licenses this file to You under the Apache License, Version 2.0
   (the "License"); you may not use this file except in compliance with
   the License. You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
-->
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://node01/hive_remote?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>123</value>
  </property>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive_remote/warehouse</value>
  </property>
</configuration>
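After trimming the template it can help to confirm which values actually ended up in hive-site.xml. The helper below is a rough sketch, not a Hive tool: hive_conf_get is a hypothetical name, and it is a line-oriented lookup rather than an XML parser, so it assumes each <name> and <value> element sits on its own line with the name before the value.

```shell
# Hypothetical helper: print the value of one property from a hive-site.xml.
# $1 = path to hive-site.xml, $2 = property name
# Assumption: one <name> or <value> element per line, name before value.
hive_conf_get() {
  awk -v key="$2" '
    /<name>/ {
      # Remember the most recent property name
      n = $0; sub(/.*<name>/, "", n); sub(/<\/name>.*/, "", n)
    }
    /<value>/ {
      # Print the value that belongs to the requested name, then stop
      if (n == key) {
        v = $0; sub(/.*<value>/, "", v); sub(/<\/value>.*/, "", v)
        print v
        exit
      }
    }
  ' "$1"
}
```

For example, hive_conf_get "$HIVE_HOME/conf/hive-site.xml" javax.jdo.option.ConnectionURL would print the JDBC URL configured above, assuming the file lives under the conf directory of your Hive installation.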
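Copy the MySQL driver jar into hive/lib
Compare the jline-*.jar files in hive/lib
and hadoop-*/share/hadoop/yarn/lib,
delete the one with the lower version,
and copy the higher version in its place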
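Deciding which jline jar is the older one can be done by comparing the version embedded in the file names. This is a small sketch under stated assumptions: the helper names jar_version and older_jar are hypothetical, the jars are named like jline-<version>.jar, and sort -V (version sort, available in GNU coreutils) is present.

```shell
# Hypothetical helper: extract the version from a jar path,
# e.g. /some/dir/jline-2.12.jar -> 2.12
jar_version() {
  b=$(basename "$1" .jar)
  echo "${b##*-}"
}

# Hypothetical helper: print whichever of two jar paths carries the lower version.
# Assumption: sort -V is available for version-aware ordering.
older_jar() {
  v1=$(jar_version "$1")
  v2=$(jar_version "$2")
  lowest=$(printf '%s\n%s\n' "$v1" "$v2" | sort -V | head -n 1)
  if [ "$lowest" = "$v1" ]; then echo "$1"; else echo "$2"; fi
}
```

Calling older_jar with the two jline paths prints the one to delete; the other jar is the one to copy into its place. The function only prints a path and removes nothing, so the result can be inspected before running rm and cp by hand.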
Run Hive
hive
show tables;
create table tbl(id int,age int);
show tables;
insert into tbl values(1,20);
-- At this point the YARN job can be viewed in the web UI, and the Hive <base hdfs path> directory can be browsed in the HDFS web UI
View the metadata in MySQL
mysql -uroot -p
show databases;
use hive_remote;
# Hive created roughly 29 tables for us
show tables;
select * from TBLS;
select * from COLUMNS_V2;
Server configuration (same as in single-node mode)
Server Configuration Parameters
The following example uses a Remote Metastore Database.
Config Param | Config Value | Comment
javax.jdo.option.ConnectionURL | jdbc:mysql://<host name>/<database name>?createDatabaseIfNotExist=true | metadata is stored in a MySQL server
javax.jdo.option.ConnectionDriverName | com.mysql.jdbc.Driver | MySQL JDBC driver class
javax.jdo.option.ConnectionUserName | <user name> | user name for connecting to MySQL server
javax.jdo.option.ConnectionPassword | <password> | password for connecting to MySQL server
hive.metastore.warehouse.dir | <base hdfs path> | default location for Hive tables.
hive.metastore.thrift.bind.host | <host_name> | Host name to bind the metastore service to. When empty, "localhost" is used. This configuration is available from Hive 4.0.0 onwards.
From Hive 3.0.0 (HIVE-16452) onwards the metastore database stores a GUID which can be queried using the Thrift API get_metastore_db_uuid by metastore clients in order to identify the backend database instance. This API can be accessed by the HiveMetaStoreClient using the method getMetastoreDbUuid().
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
   Licensed to the Apache Software Foundation (ASF) under one or more
   contributor license agreements. See the NOTICE file distributed with
   this work for additional information regarding copyright ownership.
   The ASF licenses this file to You under the Apache License, Version 2.0
   (the "License"); you may not use this file except in compliance with
   the License. You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
-->
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://node01/hive?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>123</value>
  </property>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
  </property>
</configuration>
Client configuration
Client Configuration Parameters
Config Param | Config Value | Comment
hive.metastore.uris | thrift://<host_name>:<port> | host and port for the Thrift metastore server. If hive.metastore.thrift.bind.host is specified, host should be same as that configuration. Read more about this in dynamic service discovery configuration parameters.
hive.metastore.local | false | Metastore is remote. Note: This is no longer needed as of Hive 0.10. Setting hive.metastore.uris is sufficient.
hive.metastore.warehouse.dir | <base hdfs path> | Points to default location of non-external Hive tables in HDFS.
Copy the server's Hive files to all client nodes and configure the environment variables
hive-site.xml (note that the default Thrift port is 9083)
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
   Licensed to the Apache Software Foundation (ASF) under one or more
   contributor license agreements. See the NOTICE file distributed with
   this work for additional information regarding copyright ownership.
   The ASF licenses this file to You under the Apache License, Version 2.0
   (the "License"); you may not use this file except in compliance with
   the License. You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
-->
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://node02:9083</value>
  </property>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
  </property>
</configuration>
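Since the client file differs between setups only in the metastore address and the warehouse path, it can be generated rather than edited by hand on every client node. The following is a sketch only: write_client_hive_site is a hypothetical helper, and it emits just the two properties used in the client configuration above.

```shell
# Hypothetical helper: print a minimal client-side hive-site.xml on stdout.
# $1 = metastore host, $2 = Thrift port, $3 = warehouse directory in HDFS
write_client_hive_site() {
  cat <<EOF
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://$1:$2</value>
  </property>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>$3</value>
  </property>
</configuration>
EOF
}
```

For the cluster in this walkthrough, write_client_hive_site node02 9083 /user/hive/warehouse reproduces the values shown above; redirect the output to the hive-site.xml in the client's Hive configuration directory.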
hive --service metastore
(Starting the service blocks the terminal window; use ss -nal to see port 9083 listening.)
hive
(For error handling, refer to the single-node mode.)
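Because the metastore service blocks its terminal, a client script may want to confirm that the Thrift port is listening before starting hive. The helper below is a sketch under stated assumptions: port_is_listening is a hypothetical name, and it expects input in the column layout of ss -ltn (TCP listeners only, numeric ports), where the fourth column is the local address and port.

```shell
# Hypothetical helper: succeed if the given port appears as a listening
# local port in `ss -ltn` style output read from stdin.
# $1 = port number, e.g. 9083
# Assumption: a header line comes first and column 4 is "Local Address:Port".
port_is_listening() {
  awk -v p="$1" '
    NR > 1 {
      # The port is whatever follows the last ":" in the local address column
      n = split($4, a, ":")
      if (a[n] == p) found = 1
    }
    END { exit !found }
  '
}
```

On the metastore node, ss -ltn | port_is_listening 9083 && echo "metastore port is open" gives a quick check. Note that ss -nal lists more socket types and may use a different column layout, so this sketch is meant for the -ltn form only.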