
Setting Up a Single-Node Hive 3.1.2 on Spark 2.4.7 Environment



The setup draws on various tutorials found online; the concrete steps are consolidated here.
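
Before the run output, a quick orientation on what the setup boils down to: Hive's execution engine is switched from MapReduce to Spark, with Spark running on YARN (which is why the log below shows a YARN application ID). The lines below are a minimal sketch of the key settings, shown as per-session set commands; they can equally go into hive-site.xml, and the memory values are illustrative assumptions rather than values taken from this machine.

hive> -- switch the execution engine from MapReduce to Spark
hive> set hive.execution.engine=spark;
hive> -- run Spark through YARN (matches the YARN application IDs in the log below)
hive> set spark.master=yarn;
hive> -- illustrative resource settings; tune for your machine
hive> set spark.executor.memory=1g;
hive> set spark.driver.memory=1g;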

First, here is how it runs on my machine.

  • Running Hive on Spark
:~$ hive
Hive Session ID = 6eed60ea-639e-4b17-ad3f-4ef008c510f0

Logging initialized using configuration in file:/opt/apache-hive-bin/conf/hive-log4j2.properties Async: true
Hive Session ID = d028c5da-691d-4313-a388-c2c00a4c306b
hive> use hive;
OK
Time taken: 0.451 seconds
hive> show tables;
OK
tab_name
hive_table
Time taken: 0.142 seconds, Fetched: 1 row(s)
hive> desc hive_table;
OK
col_name	data_type	comment
id                  	int                 	                    
name                	string              	                    
ver                 	string              	                    
package             	string              	                    
path                	string              	                    
Time taken: 0.117 seconds, Fetched: 5 row(s)
hive> select * from hive_table;
OK
hive_table.id	hive_table.name	hive_table.ver	hive_table.package	hive_table.path
1	hadoop	3.3.0	hadoop-3.3.0.tar.gz	/opt/hadoop
2	hive	3.2.1	apache-hive-3.1.2-bin.tar.gz	/opt/apache-hive-bin
3	mysql	8.0.20	mysql-server	/usr/local/mysql
4	spark	2.4.7	spark-2.4.7-bin-without-hadoop.tgz	/opt/spark-bin-without-hadoop
Time taken: 1.25 seconds, Fetched: 4 row(s)
hive> select t1.id, t1.name, t1.ver, t1.package, row_number() over(partition by t1.id order by t2.id desc) as row_id, count(1) over(partition by t1.name) as cnt from hive_table t1 left join hive_table t2 on 1 = 1 where t2.id is not null;
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Warning: Map Join MAPJOIN[19][bigTable=?] in task 'Stage-1:MAPRED' is a cross product
Query ID = ***_20201025134120_9a652c42-4ceb-47e5-bcca-1b213b8c6cd7
Total jobs = 2
Launching Job 1 out of 2
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Running with YARN Application = application_1603552715345_0005
Kill Command = /opt/hadoop/bin/yarn application -kill application_1603552715345_0005
Hive on Spark Session Web UI URL: http://localhost:4040

Query Hive on Spark job[0] stages: [0]
Spark job[0] status = RUNNING
--------------------------------------------------------------------------------------
          STAGES   ATTEMPT        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  
--------------------------------------------------------------------------------------
Stage-0 ........         0      FINISHED      1          1        0        0       0  
--------------------------------------------------------------------------------------
STAGES: 01/01    [==========================>>] 100%  ELAPSED TIME: 7.07 s     
--------------------------------------------------------------------------------------
Spark job[0] finished successfully in 7.07 second(s)
Launching Job 2 out of 2
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Running with YARN Application = application_1603552715345_0005
Kill Command = /opt/hadoop/bin/yarn application -kill application_1603552715345_0005
Hive on Spark Session Web UI URL: http://localhost:4040

Query Hive on Spark job[1] stages: [1, 2, 3]
Spark job[1] status = RUNNING
--------------------------------------------------------------------------------------
          STAGES   ATTEMPT        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  
--------------------------------------------------------------------------------------
Stage-1 ........         0      FINISHED      1          1        0        0       0  
Stage-2 ........         0      FINISHED      1          1        0        0       0  
Stage-3 ........         0      FINISHED      1          1        0        0       0  
--------------------------------------------------------------------------------------
STAGES: 03/03    [==========================>>] 100%  ELAPSED TIME: 8.07 s     
--------------------------------------------------------------------------------------
Spark job[1] finished successfully in 8.07 second(s)
OK
t1.id	t1.name	t1.ver	t1.package	row_id	cnt
1	hadoop	3.3.0	hadoop-3.3.0.tar.gz	1	4
1	hadoop	3.3.0	hadoop-3.3.0.tar.gz	2	4
1	hadoop	3.3.0	hadoop-3.3.0.tar.gz	3	4
1	hadoop	3.3.0	
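
For anyone who wants to reproduce the demo, the hive_table used above can be recreated roughly as follows. The post does not show the original DDL, so this is a sketch reconstructed from the desc and select * output (same columns, types, and rows):

hive> -- the demo table lives in a database named hive (see "use hive;" above)
hive> create database if not exists hive;
hive> use hive;
hive> -- schema taken from the desc hive_table output
hive> create table if not exists hive_table (id int, name string, ver string, package string, path string);
hive> -- rows as shown in the select * output
hive> insert into hive_table values
    >   (1, 'hadoop', '3.3.0',  'hadoop-3.3.0.tar.gz',                '/opt/hadoop'),
    >   (2, 'hive',   '3.2.1',  'apache-hive-3.1.2-bin.tar.gz',       '/opt/apache-hive-bin'),
    >   (3, 'mysql',  '8.0.20', 'mysql-server',                       '/usr/local/mysql'),
    >   (4, 'spark',  '2.4.7',  'spark-2.4.7-bin-without-hadoop.tgz', '/opt/spark-bin-without-hadoop');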