赞
踩
原文地址:https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LateralView#
lateral view用于和split, explode等UDTF一起使用,它能够将一列数据拆成多行数据,在此基础上可以对拆分后的数据进行聚合。
一个简单的例子,假设我们有一张表pageAds,它有两列数据,第一列是pageid string,第二列是adid_list,即用逗号分隔的广告ID集合:
string pageid | Array<int> adid_list |
"front_page" | [1, 2, 3] |
"contact_page" | [3, 4, 5] |
要统计所有广告ID在所有页面中出现的次数。
首先分拆广告ID:
- SELECT pageid, adid
- FROM pageAds LATERAL VIEW explode(adid_list) adTable AS adid;
执行结果如下:
string pageid | int adid |
"front_page" | 1 |
"front_page" | 2 |
"front_page" | 3 |
"contact_page" | 3 |
"contact_page" | 4 |
"contact_page" | 5 |
接下来就是一个聚合的统计:
- SELECT adid, count(1)
- FROM pageAds LATERAL VIEW explode(adid_list) adTable AS adid
- GROUP BY adid;
-
- 执行结果如下:
-
-
- int adidcount(1)1121324151
- 多个lateral view语句:
- 一个FROM语句后可以跟多个lateral view语句,后面的lateral view语句能够引用它前面的所有表和列名。
- 以下面的表为例:
-
-
- Array<int>
- col1Array<string> col2[1, 2][a", "b", "c"][3, 4][d", "e", "f"]
- SELECT myCol1, col2 FROM baseTable
- LATERAL VIEW explode(col1) myTable1 AS myCol1;
的执行结果为:
int
mycol1 Array<string> col2 1 [a", "b", "c"] 2 [a", "b", "c"] 3 [d",
"e", "f"] 4 [d", "e", "f"]
加上一个lateral view:
- SELECT myCol1, myCol2 FROM baseTable
- LATERAL VIEW explode(col1) myTable1 AS myCol1
- LATERAL VIEW explode(col2) myTable2 AS myCol2;
它的执行结果为:
int
myCol1 string myCol2 1 "a" 1 "b" 1 "c" 2 "a" 2 "b" 2 "c" 3 "d" 3 "e" 3 "f" 4 "d" 4 "e" 4 "f"
注意上面语句中,两个lateral view按照出现的次序被执行。
Copyright © 2003-2013 www.wpsshop.cn 版权所有,并保留所有权利。