赞
踩
检查Hive服务是否已经启动
使用Hive交互方式(输入hive即可)
使用beeline
需启动hiveserver2服务
nohup hive --service metastore &
nohup hive --service hiveserver2 &
输入命令回车
[root@nodefour ~]# nohup hive --service hiveserver2 &
[2] 4237
[root@nodefour ~]# nohup: 忽略输入并把输出追加到"nohup.out"
[root@nodefour ~]#
查看进程状态,如果启动RunJar三个及以上,则进行下一步
[root@nodefour ~]# jps
4035 RunJar
3813 RunJar
1687 SecondaryNameNode
4136 Jps
1532 DataNode
3644 RunJar
1405 NameNode
3903 RunJar
输入beeline进入beeline交互模式
[root@nodefour ~]# beeline
which: no hbase in (/root/software/jdk-11.0.8/bin:/root/software/jdk-11.0.8/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/opt/install/hadoop/bin:/opt/install/hadoop/sbin:/opt/install/hive/bin:/root/software/apache-maven-3.3.9/bin:/root/bin)
Beeline version 1.1.0-cdh5.14.2 by Apache Hive
beeline>
在beeline模式输入(需要按几下回车)
!connect jdbc:hive2://nodefour:10000
示例:
beeline> !connect jdbc:hive2://nodefour:10000
scan complete in 1ms
Connecting to jdbc:hive2://nodefour:10000
Enter username for jdbc:hive2://nodefour:10000:
Enter password for jdbc:hive2://nodefour:10000:
Connected to: Apache Hive (version 1.1.0-cdh5.14.2)
Driver: Hive JDBC (version 1.1.0-cdh5.14.2)
Transaction isolation: TRANSACTION_REPEATABLE_READ
展示数据库
0: jdbc:hive2://nodefour:10000> show databases; INFO : Compiling command(queryId=root_20201210142929_3a2403f6-7cbd-4c07-a365-6df20369abfb): show databases INFO : Semantic Analysis Completed INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:database_name, type:string, comment:from deserializer)], properties:null) INFO : Completed compiling command(queryId=root_20201210142929_3a2403f6-7cbd-4c07-a365-6df20369abfb); Time taken: 0.872 seconds INFO : Concurrency mode is disabled, not creating a lock manager INFO : Executing command(queryId=root_20201210142929_3a2403f6-7cbd-4c07-a365-6df20369abfb): show databases INFO : Starting task [Stage-0:DDL] in serial mode INFO : Completed executing command(queryId=root_20201210142929_3a2403f6-7cbd-4c07-a365-6df20369abfb); Time taken: 0.193 seconds INFO : OK +----------------+--+ | database_name | +----------------+--+ | default | +----------------+--+ 1 row selected (1.894 seconds)
退出
0: jdbc:hive2://nodefour:10000> !quit;
(1)类似于SQL的数据类型
(2)Hive数据结构
(1)打开分桶表
set hive.enforce.bucketing=true;
(2)建表
在建表时,如果表名和关键词相同,则使用 ``进行处理,其余则不用
create table if not exits hr_bucket(
name string,
id int,
num string,
time2 string
)
clustored by (id) into 8 buckets
row format delimited
fields terminated by '|';
(3)往分桶表插入数据
insert into table hr_bucket
(4)查询分桶表数据
select * from employee_hr limit 10;
(5)清空数据的命令
truncate table hr_bucket;
(6)查看分桶
dfs -cat /home/hadoop/hive/warehouse/test1.db/hr_bucket/000000
(1)随机抽样基于整行数据
select * from table_name TABLESAMPLE(BUCKET 3 OUT OF 32 ON rand()) s;
(2)随机抽样基于指定列(使用分桶列更高效)
取值方式:
分桶数b/out of 的数m=从第n个分桶开始取,往后再去取,从n+m处取
select * from table_name TABLESAMPLE(BUCKET 3 OUT OF 32 ON id) s;
(1)概述
1.通过隐藏子查询、连接和函数来简化查询的逻辑结构
2.只保存定义,不存储数据
3.如果删除或更改基础表,则查询视图将失败
4.视图是只读的,不能插入或装载数据
(2)应用场景
1.将特定的列提供给用户,保护数据隐私
2.用于查询语句复杂的场景
(3)操作命令
1.创建视图
create view view_hr as select name,id from employee_hr where id>10;
2.查找视图(SHOW VIEWS在hive v2.2.0之后)
show tables;
3.查看视图定义
show create table view_hr;
4.删除视图
drop view view_hr;
与表生成函数结合使用,将函数的输入和输出连接 OUTTER关键字:即使output为空也会生成结果
select
name,
wps,
skills,
score
from employee
lateral view explode(work_place) workplace as wps
lateral view explode(skills score) sksas as skills,score;
Copyright © 2003-2013 www.wpsshop.cn 版权所有,并保留所有权利。