The setup followed assorted online tutorials; this post consolidates the concrete steps. The session below verifies the result: Hive running with Spark as its execution engine on YARN.
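Before verifying anything, Hive has to be pointed at Spark as its execution engine. A minimal sketch follows; `hive.execution.engine` and `spark.master` are standard Hive/Spark configuration keys, but the exact values, and whether they were set per-session or persisted in hive-site.xml, are assumptions inferred from the YARN-backed session in the transcript below:

```sql
-- Minimal sketch, not the full setup: route Hive queries onto Spark.
-- These are standard configuration keys; the values are assumptions
-- matching the "Running with YARN Application" line in the transcript below.
set hive.execution.engine=spark;  -- use Spark instead of MapReduce
set spark.master=yarn;            -- submit the Spark session to YARN
```

The same keys can be persisted in hive-site.xml so that every session picks them up.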
```
:~$ hive
Hive Session ID = 6eed60ea-639e-4b17-ad3f-4ef008c510f0

Logging initialized using configuration in file:/opt/apache-hive-bin/conf/hive-log4j2.properties Async: true
Hive Session ID = d028c5da-691d-4313-a388-c2c00a4c306b
hive> use hive;
OK
Time taken: 0.451 seconds
hive> show tables;
OK
tab_name
hive_table
Time taken: 0.142 seconds, Fetched: 1 row(s)
hive> desc hive_table;
OK
col_name    data_type    comment
id          int
name        string
ver         string
package     string
path        string
Time taken: 0.117 seconds, Fetched: 5 row(s)
hive> select * from hive_table;
OK
hive_table.id  hive_table.name  hive_table.ver  hive_table.package                  hive_table.path
1              hadoop           3.3.0           hadoop-3.3.0.tar.gz                 /opt/hadoop
2              hive             3.2.1           apache-hive-3.1.2-bin.tar.gz        /opt/apache-hive-bin
3              mysql            8.0.20          mysql-server                        /usr/local/mysql
4              spark            2.4.7           spark-2.4.7-bin-without-hadoop.tgz  /opt/spark-bin-without-hadoop
Time taken: 1.25 seconds, Fetched: 4 row(s)
hive> select t1.id, t1.name, t1.ver, t1.package, row_number() over(partition by t1.id order by t2.id desc) as row_id, count(1) over(partition by t1.name) as cnt from hive_table t1 left join hive_table t2 on 1 = 1 where t2.id is not null;
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Warning: Map Join MAPJOIN[19][bigTable=?] in task 'Stage-1:MAPRED' is a cross product
Query ID = ***_20201025134120_9a652c42-4ceb-47e5-bcca-1b213b8c6cd7
Total jobs = 2
Launching Job 1 out of 2
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Running with YARN Application = application_1603552715345_0005
Kill Command = /opt/hadoop/bin/yarn application -kill application_1603552715345_0005
Hive on Spark Session Web UI URL: http://localhost:4040

Query Hive on Spark job[0] stages: [0]
Spark job[0] status = RUNNING
--------------------------------------------------------------------------------------
          STAGES   ATTEMPT        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED
--------------------------------------------------------------------------------------
Stage-0 ........         0      FINISHED      1          1        0        0       0
--------------------------------------------------------------------------------------
STAGES: 01/01    [==========================>>] 100%  ELAPSED TIME: 7.07 s
--------------------------------------------------------------------------------------
Spark job[0] finished successfully in 7.07 second(s)
Launching Job 2 out of 2
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Running with YARN Application = application_1603552715345_0005
Kill Command = /opt/hadoop/bin/yarn application -kill application_1603552715345_0005
Hive on Spark Session Web UI URL: http://localhost:4040

Query Hive on Spark job[1] stages: [1, 2, 3]
Spark job[1] status = RUNNING
--------------------------------------------------------------------------------------
          STAGES   ATTEMPT        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED
--------------------------------------------------------------------------------------
Stage-1 ........         0      FINISHED      1          1        0        0       0
Stage-2 ........         0      FINISHED      1          1        0        0       0
Stage-3 ........         0      FINISHED      1          1        0        0       0
--------------------------------------------------------------------------------------
STAGES: 03/03    [==========================>>] 100%  ELAPSED TIME: 8.07 s
--------------------------------------------------------------------------------------
Spark job[1] finished successfully in 8.07 second(s)
OK
t1.id  t1.name  t1.ver  t1.package           row_id  cnt
1      hadoop   3.3.0   hadoop-3.3.0.tar.gz  1       4
1      hadoop   3.3.0   hadoop-3.3.0.tar.gz  2       4
1      hadoop   3.3.0   hadoop-3.3.0.tar.gz  3       4
1      hadoop   3.3.0
```
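The last query is a deliberate smoke test for the engine: `left join hive_table t2 on 1 = 1` is a cross product (hence the MAPJOIN warning), expanding the 4 base rows into 16; `row_number() over(partition by t1.id order by t2.id desc)` numbers the four copies within each `id`, and `count(1) over(partition by t1.name)` reports 4 on every row, which matches the result rows visible before the output cuts off.

The transcript never shows how `hive_table` was created. The following is a hypothetical reconstruction from the `desc` and `select *` output above; the actual DDL (storage format, location, and so on) used in the original setup is not shown:

```sql
-- Hypothetical DDL reconstructed from the `desc hive_table` output above;
-- the original CREATE TABLE statement does not appear in the transcript.
CREATE TABLE IF NOT EXISTS hive.hive_table (
  id      INT,
  name    STRING,
  ver     STRING,
  package STRING,
  path    STRING
);

-- Rows matching the `select * from hive_table` output above.
-- INSERT ... VALUES requires Hive 0.14 or later.
INSERT INTO hive.hive_table VALUES
  (1, 'hadoop', '3.3.0',  'hadoop-3.3.0.tar.gz',                '/opt/hadoop'),
  (2, 'hive',   '3.2.1',  'apache-hive-3.1.2-bin.tar.gz',       '/opt/apache-hive-bin'),
  (3, 'mysql',  '8.0.20', 'mysql-server',                       '/usr/local/mysql'),
  (4, 'spark',  '2.4.7',  'spark-2.4.7-bin-without-hadoop.tgz', '/opt/spark-bin-without-hadoop');
```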