
Converting Strings to Complex Data Types in Hive

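1. Converting a string to a map
str_to_map(text[, delimiter1, delimiter2])
Splits text into key-value pairs using two delimiters. delimiter1 separates the text into key-value pairs, and delimiter2 splits each pair into a key and a value. The default for delimiter1 is ','; the default for delimiter2 is ':'.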
Examples:
  select str_to_map('aaa:11&bbb:22', '&', ':');
  select str_to_map('aaa:11&bbb:22', '&', ':')['aaa'];
  select str_to_map('device_ds:2&uid_cnt:1', '&', ','); -- ',' never occurs inside a pair, so each whole pair becomes a key and its value is NULL
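Two details of the resulting map are easy to miss: its values are strings, and looking up an absent key does not raise an error. The following is a minimal sketch using only literals, so it should run without any table; the key 'ccc' is a made-up example of a key that is not present.

```sql
-- Map values produced by str_to_map are strings, so cast before doing arithmetic.
select cast(str_to_map('aaa:11&bbb:22', '&', ':')['aaa'] as int) + 1;

-- 'ccc' is not a key in the map; the lookup returns NULL instead of failing.
select str_to_map('aaa:11&bbb:22', '&', ':')['ccc'];
```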
Combined usage example:
  select a1.appkey, a1.appsource, index_key, index_value
  from tab_sum a1
  lateral view explode(str_to_map(concat('device_ds:', a1.device_ds_cnt, '&', 'uid_cnt:', a1.uid_cnt), '&', ':')) mid_list_tab as index_key, index_value;
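The same pattern can be tried without the tab_sum table. The sketch below replaces it with a one-row inline subquery; the alias t, the column name kv, and the literal value are illustrative assumptions, not part of the original example.

```sql
-- One-row stand-in for a real table (hypothetical data).
select index_key, index_value
from (select 'device_ds:2&uid_cnt:1' as kv) t
-- explode() on a map emits one row per entry, with two columns: key and value.
lateral view explode(str_to_map(t.kv, '&', ':')) mid_list_tab as index_key, index_value;
```

Each map entry becomes its own output row, which is what turns the wide columns into a key/value layout.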
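2. Converting a string to an array
String-splitting function: split
Syntax: split(string str, string pat)
Return type: array<string>
Description:
Splits str around matches of pat and returns the pieces as an array of strings. pat is interpreted as a regular expression.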
Examples:
  select split('aaa:11:bbb:22', ':');
  -- ["aaa","11","bbb","22"]
  select split('aaa:11:bbb:22', ':')[0];
  -- aaa
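Because split treats its second argument as a regular expression, delimiters such as '.' or '|' need escaping. The sketch below assumes it is run through the Hive CLI or Beeline, where a backslash inside a string literal must itself be doubled; the input strings are made up for illustration.

```sql
-- '.' matches any character in a regex, so escape it to split on a literal dot.
select split('a.b.c', '\\.');

-- '|' means alternation in a regex; escape it to split on a literal pipe.
select split('a|b|c', '\\|');

-- A character class splits on any of several delimiters at once.
select split('a,b;c', '[,;]');

-- explode() turns the resulting array into one row per element.
select explode(split('aaa:11:bbb:22', ':')) as item;
```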
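3. Deduplicating a string column and aggregating it into an array
collect_set function: aggregates the values of a column within each group, removes duplicates, and returns the result as an array.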
  drop table if exists xxxxx_tabletest;
  CREATE TABLE xxxxx_tabletest(
    id string,
    name string)
  ROW FORMAT SERDE
    'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
  WITH SERDEPROPERTIES (
    'field.delim'=',',
    'line.delim'='\n',
    'serialization.format'=',');
  insert into xxxxx_tabletest(id,name)
  values
  ('1','A'),
  ('1','C'),
  ('1','B'),
  ('2','B'),
  ('2','C'),
  ('2','D'),
  ('3','B'),
  ('3','C'),
  ('3','D');
  select id,collect_set(name) from xxxxx_tabletest group by id;
  -- Output:
  -- OK
  -- 1 ["A","C","B"]
  -- 2 ["B","C","D"]
  -- 3 ["B","C","D"]
  -- Time taken: 36.966 seconds, Fetched: 3 row(s)
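The element order inside a collect_set result is not guaranteed, and duplicates are dropped. As a hedged sketch against the same xxxxx_tabletest table, the query below shows three related built-ins: collect_list to keep duplicates, sort_array for a deterministic order, and concat_ws to turn the array back into a delimited string. The output column aliases are illustrative names only.

```sql
select id,
       collect_list(name)                            as names_all,     -- keeps duplicates
       sort_array(collect_set(name))                 as names_sorted,  -- deduplicated, ascending order
       concat_ws(',', sort_array(collect_set(name))) as names_csv      -- array joined back into one string
from xxxxx_tabletest
group by id;
```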
