1. TRANSFORM issues
You cannot use it like this, because TRANSFORM cannot be mixed with ordinary columns in the SELECT list: select usrid, movieid, rating, transform(ts) using "python stamp2date.py" as date from rating_table;
You can only use it like this: select transform(usrid, movieid, rating, ts) using "python stamp2date.py" as usrid, movieid, rating, date from rating_table;
Inside stamp2date.py the line is parsed with split('\t'), because the fields handed to TRANSFORM are passed to the script separated by '\t'.
cat stamp2date.py
import sys
from datetime import datetime

# Hive feeds each input row to stdin as a '\t'-separated line
for ss in sys.stdin:
    userid, movieid, rating, timest = ss.strip().split('\t')
    # convert the unix timestamp to a yyyy-mm-dd string
    ymddate = datetime.fromtimestamp(int(timest)).date()
    ymdstr = ymddate.strftime("%Y-%m-%d")
    # joined with ',', so Hive sees the whole line as a single output field
    print(','.join([userid, movieid, rating, ymdstr]))
With this output the result can only be treated as a single field: Hive splits the script's output on '\t', but the script joins its fields with ','. Writing as usrid, movieid, rating, date therefore produces 1,1029,3.0,2012-10-01\N\N\N (the whole line lands in the first column and the remaining columns are NULL); only as ss gives 1,1029,3.0,2012-10-01.
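A minimal sketch of the variant that keeps the four output columns separate. This is not the query above as written: it assumes stamp2date.py is changed to join its output with '\t' instead of ',', and that the script is registered with ADD FILE first (the path is illustrative).
-- requires the script to end with print('\t'.join([userid, movieid, rating, ymdstr]))
ADD FILE /path/to/stamp2date.py;
SELECT TRANSFORM(usrid, movieid, rating, ts)
USING "python stamp2date.py"
AS usrid, movieid, rating, date
FROM rating_table;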
2. Hive error: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
Three approaches: change the execution engine, adjust the MapReduce memory, or bring the three machines into sync.
Change the engine:
hive> set hive.execution.engine=tez;
https://www.cnblogs.com/hankedang/p/4210598.html
https://jingyan.baidu.com/article/bad08e1e4e425b49c8512188.html
Adjust the MapReduce memory (this is the main cause); see the settings sketch at the end of this item.
https://blog.csdn.net/random0815/article/details/84944815
https://www.cnblogs.com/ITtangtang/p/7683028.html
Bring the three machines into sync.
Exit safe mode:
https://www.jianshu.com/p/de308d935d9b
Look up the error through the tracking URL (be sure to use the Chrome browser).
Start it first: https://blog.csdn.net/weixin_43481376/article/details/88662831
https://blog.csdn.net/lcm_linux/article/details/103835204
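A minimal sketch of the MapReduce memory knobs mentioned above; the values are placeholders and should be tuned to the cluster (the java.opts heap is usually kept somewhat below the container size).
-- container memory and JVM heap for map tasks
set mapreduce.map.memory.mb=2048;
set mapreduce.map.java.opts=-Xmx1638m;
-- container memory and JVM heap for reduce tasks
set mapreduce.reduce.memory.mb=4096;
set mapreduce.reduce.java.opts=-Xmx3276m;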
3. A table created with SELECT (CREATE TABLE ... AS SELECT) cannot be an external table
USE practice;
CREATE TABLE behavior_table
LOCATION '/hive-test/behavior'  -- this clause must come before the AS
as
SELECT A.movieid, B.userid, A.title, B.rating
FROM
(SELECT movieid, title FROM movie_table) A
INNER JOIN
(SELECT userid, movieid, rating FROM rating_table) B
on A.movieid = B.movieid;
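If an external table is what is actually needed, the usual workaround is to create it explicitly and then fill it with INSERT ... SELECT. A sketch, with assumed column types and an illustrative table name and location:
CREATE EXTERNAL TABLE behavior_ext (
    movieid INT,
    userid  INT,
    title   STRING,
    rating  DOUBLE
)
LOCATION '/hive-test/behavior_ext';

INSERT OVERWRITE TABLE behavior_ext
SELECT A.movieid, B.userid, A.title, B.rating
FROM (SELECT movieid, title FROM movie_table) A
INNER JOIN (SELECT userid, movieid, rating FROM rating_table) B
ON A.movieid = B.movieid;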
4. Creating a partitioned table
https://www.jianshu.com/p/69efe36d068b
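A minimal sketch of a partitioned table in the spirit of the article above; the table name, column types, and the partition value are illustrative:
CREATE TABLE rating_partitioned (
    userid  INT,
    movieid INT,
    rating  DOUBLE
)
PARTITIONED BY (dt STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

-- static partition: the partition value is spelled out in the INSERT
INSERT OVERWRITE TABLE rating_partitioned PARTITION (dt='2012-10-01')
SELECT userid, movieid, rating
FROM rating_table
WHERE to_date(from_unixtime(ts)) = '2012-10-01';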
5. Handling dynamic-partition exceptions
https://blog.csdn.net/helloxiaozhe/article/details/79710707
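Dynamic-partition inserts commonly fail either on the strict-mode check or on the partition-count limits. A sketch of the settings usually involved; the limit values are placeholders, and the insert reuses the illustrative table from item 4:
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.max.dynamic.partitions=1000;
set hive.exec.max.dynamic.partitions.pernode=100;

-- dynamic partition: dt is taken from the last column of the SELECT
INSERT OVERWRITE TABLE rating_partitioned PARTITION (dt)
SELECT userid, movieid, rating, to_date(from_unixtime(ts)) AS dt
FROM rating_table;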
6. When running hive: ls: cannot access /usr/local/src/spark-2.0.2-bin-hadoop2.6/lib/spark-assembly-*.jar: No such file or directory
https://blog.csdn.net/weixin_42496757/article/details/87555292
7. [In dynamic partitioning] How to reduce the number of map files; this applies even when dynamic partitioning is not used. See the sketch below.
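The note gives no link, so this is only a sketch of commonly used knobs: combine small input files so fewer map tasks are launched, and merge the small files a job writes out (the sizes are placeholders).
-- combine small input files into larger splits, so fewer map tasks start
set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
set mapreduce.input.fileinputformat.split.maxsize=256000000;
-- merge the small output files of map-only and map-reduce jobs
set hive.merge.mapfiles=true;
set hive.merge.mapredfiles=true;
set hive.merge.smallfiles.avgsize=16000000;
In a dynamic-partition INSERT, adding DISTRIBUTE BY the partition column to the SELECT also cuts the file count, since each partition is then written by a single reducer.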
8. Container killed on request. Exit code is 143 (143 is SIGTERM: YARN killed the container, typically for exceeding its memory limit, so the memory settings from item 2 also apply here)
https://blog.csdn.net/yijichangkong/article/details/51332432
9. A bucketed table cannot be created with LIKE. See the sketch below.
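A minimal sketch of creating a bucketed table explicitly with CLUSTERED BY instead of LIKE; the names and the bucket count are illustrative:
CREATE TABLE rating_bucketed (
    userid  INT,
    movieid INT,
    rating  DOUBLE
)
CLUSTERED BY (userid) INTO 4 BUCKETS
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

-- on older Hive versions, enforce bucketing before inserting
set hive.enforce.bucketing=true;
INSERT OVERWRITE TABLE rating_bucketed
SELECT userid, movieid, rating FROM rating_table;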
10. Hive logs
https://www.cnblogs.com/kouryoushine/p/7805657.html
https://www.cnblogs.com/hello-wei/p/10645740.html