
A Hands-On Hive Case Study

1. Preparing the Data Source

First, download the sample data from http://grouplens.org/ (the GroupLens MovieLens ratings data), then load it into Hive for the experiments that follow.
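For reference, the relevant archive is the MovieLens 100k data set, whose u.data file is a tab-separated list of userid, movieid, rating, and unix timestamp. Below is a minimal shell sketch of fetching and unpacking it; the exact download URL under files.grouplens.org is an assumption and may have moved, so check the GroupLens site if it fails:

  # Fetch and unpack the MovieLens 100k data set (URL assumed; see
  # http://grouplens.org/ for the current location).
  wget http://files.grouplens.org/datasets/movielens/ml-100k.zip
  unzip ml-100k.zip
  # u.data contains tab-separated lines: userid, movieid, rating, unixtime
  head -3 ml-100k/u.data
  # Place the file where the LOAD DATA statement below expects it.
  cp ml-100k/u.data /home/hadoop/u.data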

2. Internal Tables

Create an internal (managed) table and load the data:

  hadoop@hadoopmaster:~$ beeline -u jdbc:hive2://hadoopmaster:10000/
  Beeline version 2.1.0 by Apache Hive
  0: jdbc:hive2://hadoopmaster:10000/> show databases;
  OK
  +----------------+--+
  | database_name  |
  +----------------+--+
  | default        |
  | fincials       |
  +----------------+--+
  2 rows selected (1.038 seconds)
  0: jdbc:hive2://hadoopmaster:10000/> use default;
  OK
  No rows affected (0.034 seconds)
  0: jdbc:hive2://hadoopmaster:10000/> create table u_data (userid INT, movieid INT, rating INT, unixtime STRING) row format delimited fields terminated by '\t' lines terminated by '\n';
  OK
  No rows affected (0.242 seconds)
  0: jdbc:hive2://hadoopmaster:10000/> LOAD DATA LOCAL INPATH '/home/hadoop/u.data' OVERWRITE INTO TABLE u_data;
  Loading data to table default.u_data
  OK
  No rows affected (0.351 seconds)
  0: jdbc:hive2://hadoopmaster:10000/> select * from u_data;
  OK
  +----------------+-----------------+----------------+------------------+--+
  | u_data.userid  | u_data.movieid  | u_data.rating  | u_data.unixtime  |
  +----------------+-----------------+----------------+------------------+--+
  | 196            | 242             | 3              | 881250949        |
  | 186            | 302             | 3              | 891717742        |
  | 22             | 377             | 1              | 878887116        |
  | 244            | 51              | 2              | 880606923        |
  | 166            | 346             | 1              | 886397596        |
  | 298            | 474             | 4              | 884182806        |
  | 115            | 265             | 2              | 881171488        |
  | 253            | 465             | 5              | 891628467        |
  | 305            | 451             | 3              | 886324817        |
  | 6              | 86              | 3              | 883603013        |
  | 62             | 257             | 2              | 879372434        |
  | 286            | 1014            | 5              | 879781125        |

Check the HDFS space the table occupies:

  hadoop@hadoopmaster:~$ hdfs dfs -ls /user/hive/warehouse/u_data
  Found 1 items
  -rwxrwxr-x   2 hadoop supergroup    1979173 2016-07-22 10:19 /user/hive/warehouse/u_data/u.data
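Note that because u_data is an internal (managed) table, Hive owns this file under /user/hive/warehouse: dropping the table deletes the HDFS data along with the metadata. A minimal sketch of that behavior (only run it if you no longer need the data):

  # Dropping a managed table removes its warehouse directory as well.
  beeline -u jdbc:hive2://hadoopmaster:10000/ -e "DROP TABLE u_data;"
  # This listing would now fail with "No such file or directory".
  hdfs dfs -ls /user/hive/warehouse/u_data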

Next, write a script that loads the data another 100 times (a sketch of such a script appears after the count below).

First, check how many rows the table already has:

  0: jdbc:hive2://hadoopmaster:10000/> select count(*) from u_data;
  WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. tez, spark) or using Hive 1.X releases.
  Query ID = hadoop_20160722102853_77aa1bc6-79c2-4916-9b07-a763d112ef41
  Total jobs = 1
  Launching Job 1 out of 1
  Number of reduce tasks determined at compile time: 1
  In order to change the average load for a reducer (in bytes):
    set hive.exec.reducers.bytes.per.reducer=<number>
  In order to limit the maximum number of reducers:
    set hive.exec.reducers.max=<number>
  In order to set a constant number of reducers:
    set mapreduce.job.reduces=<number>
  Starting Job = job_1468978056881_0003, Tracking URL = http://hadoopmaster:8088/proxy/application_1468978056881_0003/
  Kill Command = /usr/local/hadoop/bin/hadoop job -kill job_1468978056881_0003
  Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
  2016-07-22 10:28:58,786 Stage-1 map = 0%, reduce = 0%
  2016-07-22 10:29:03,890 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 0.89 sec
  2016-07-22 10:29:10,005 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 1.71 sec
  MapReduce Total cumulative CPU time: 1 seconds 710 msec
  Ended Job = job_1468978056881_0003
  MapReduce Jobs Launched:
  Stage-Stage-1: Map: 1  Reduce: 1   Cumulative CPU: 1.71 sec   HDFS Read: 19
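The loading script itself is not shown above; here is a minimal sketch of one way to write it, assuming beeline's -e option and the same local file path used earlier. It deliberately omits OVERWRITE, so each pass appends another copy of the file (HDFS stores the duplicates as u.data_copy_1, u.data_copy_2, and so on):

  #!/bin/bash
  # Load u.data into u_data 100 times. Each LOAD without OVERWRITE
  # appends a fresh copy of the file to the table's warehouse directory.
  for i in $(seq 1 100); do
      beeline -u jdbc:hive2://hadoopmaster:10000/ \
          -e "LOAD DATA LOCAL INPATH '/home/hadoop/u.data' INTO TABLE u_data;"
  done

Re-running select count(*) from u_data afterwards should report 101 times the original row count: the initial OVERWRITE load plus the 100 appended copies.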
