赞
踩
hive> select id,name,explode(likes) from person;
FAILED: SemanticException [Error 10081]:
UDTF's are not supported outside the
SELECT clause, nor nested in expressions
lateral view udtf(expression) tablealias as columnalias (',' columnalias)
select id,name,address,mycol1 from person
lateral view explode(likes) mytable1 as mycol1;
hive> select id,name,address,mycol1
> from person
> lateral view explode(likes) mytable1 ————>可以解决当我们查询出所有的爱好(单独一列)想要对应到name时候,缺没有保存的方法,就做了一个类似虚拟表一样的东西,然后就可以关联出name
as mycol1;
OK
id name mycol1
1 小明1 lol
1 小明1 book
1 小明1 movie
2 小明2 lol
2 小明2 book
2 小明2 movie
......
查询id,name,爱好,以及地址:
select id,name,mycol1,mycol2,mycol3 from person
lateral view explode(likes) mytable1 as mycol1
lateral view explode(address) mytable2 as mycol2,mycol3;
和关系型数据库中的普通视图一样,hive也支持视图————>理解为保存数据(不是保存函数)
特点:
View语法
创建视图:
CREATE VIEW [IF NOT EXISTS] [db_name.]view_name
[(column_name [COMMENT column_comment], ...) ]
[COMMENT view_comment]
[TBLPROPERTIES (property_name =
property_value, ...)]
AS SELECT ... ;
创建视图实战:
create view v_psn as select * from person;
在对应元数据库hive_remote中的TBLS中多出一条记录:
查询视图:
select colums from view; :
hive> select id,name from v_psn;
OK
id name
1 小明1
2 小明2
3 小明3
4 小明4
5 小明5
6 小明6
7 小明7
8 小明8
9 小明9
带有排序的视图:
hive>create view v_psn2 as select * from person
order by id desc;
hive> select * from v_psn2 order by id desc;#和
视图排序一致是一个job
WARNING: Order/Sort by without limit in sub
query or view [v_psn2] is removed, as it's
pointless and bad for performance.
Query ID = root_20211122095900_497e1fd7-3e13-
4c20-9268-015ee23d6089
Total jobs = 1
Launching Job 1 out of 1
hive> select * from v_psn2 order by id;#hive3中
视图排序不一致是一个job,在hive1.2该情况2个job
pointless and bad for performance.
Query ID = root_20211122100037_c1159e80-86e4-
4733-869e-72803f8f0ee0
Total jobs = 1
Launching Job 1 out of 1
删除视图:
DROP VIEW [IF EXISTS] [db_name.]view_name; drop view v_psn;
目的:优化查询以及提交检索性能
创建索引
create index indexName on table
tableName(columnName)
as
'org.apache.hadoop.hive.ql.index.compact.Compac
tIndexHandler' with deferred rebuild
in table t1_index_table;
在person2表的name列上创建索引(不指定索引表)
create index t1_index on table person2(name)
as
'org.apache.hadoop.hive.ql.index.compact.Compac
tIndexHandler' with deferred rebuild;
# 默认多出一个名称 default__person2_t1_index__,开始时是一个空表
查询索引:
show index on tableName
重建索引(建立索引之后必须重建索引才能生效):
ALTER INDEX indexName ON tableName REBUILD;
删除索引:
DROP INDEX indexName ON tableName;
注意:索引在hive3.0+不适用,在hive3.0以前的版本可以使用
Copyright © 2003-2013 www.wpsshop.cn 版权所有,并保留所有权利。