当前位置:   article > 正文

Hive之爆炸函数_hive explode函数

hive explode函数

本部分主要讲述了列转行函数,是最最最最常用的函数,爆炸函数

1、函数说明

1)explode(col)

explode 函数是UFTF函数,将hive一列中复杂的array或者map结构拆分成多行Explode函数是不允许在select再有其他字段,

explode(ARRAY)  列表中的每个元素生成一行

explode(MAP)     map中每个key-value对,生成一行,key为一列,value为一列

限制:
1、No other expressions are allowed in SELECT
        SELECT pageid, explode(adid_list) AS myCol... is not supported
2、UDTF's can't be nested
        SELECT explode(explode(adid_list)) AS myCol... is not supported
3、GROUP BY / CLUSTER BY / DISTRIBUTE BY / SORT BY is not supported
        SELECT explode(adid_list) AS myCol ... GROUP BY myCol is not supported

举例1:拆分array

这里写图片描述

Then running the query:  SELECT explode(myCol) AS myNewCol FROM myTable;

will produce: 
这里写图片描述 

举例2:SELECT explode(myMap) AS (myMapKey, myMapValue) FROM myMapTable;

2)lateral view侧写函数

可使用lateral view解除以上限制,只要是想查询除exlpode之外的字段就要使用侧写函数lateral view

用法:LATERAL VIEW udtf(expression)  tableAlias(临时表名)  AS columnAlias(临时表的字段)

解释:用于和split, explodeUDTF一起使用,它能够将一列数据拆成多行数据,在此基础上可以对拆分后的数据进行聚合。

Example

Consider the following base table named pageAds. It has two columns: pageid (name of the page) and adid_list (an array of ads appearing on the page) 
这里写图片描述

An example table with two rows: 
这里写图片描述

and the user would like to count the total number of times an ad appears across all pages. 
A lateral view with explode() can be used to convert adid_list into separate rows using the query;

SELECT pageid, adid
FROM pageAds LATERAL VIEW explode(adid_list) adTable AS adid;

The resulting output will be 
这里写图片描述 
Then in order to count the number of times a particular ad appears, count/group by can be used:

SELECT adid, count(1)
FROM pageAds LATERAL VIEW explode(adid_list) adTable AS adid
GROUP BY adid;

The resulting output will be 
这里写图片描述

由此可见,lateral view与explode等udtf就是天生好搭档,explode将复杂结构一行拆成多行,然后再用lateral view做各种聚合。

 

 

声明:本文内容由网友自发贡献,不代表【wpsshop博客】立场,版权归原作者所有,本站不承担相应法律责任。如您发现有侵权的内容,请联系我们。转载请注明出处:https://www.wpsshop.cn/w/AllinToyou/article/detail/426553
推荐阅读
相关标签
  

闽ICP备14008679号