The previous article covered Hive's data definition language (DDL); this one moves on to the data manipulation language (DML). DDL is mainly about operations on Hive table structures: creating tables, dropping tables, altering table structure, adding columns, and so on. DML is about inserting, deleting, updating, and querying the data inside those tables.
Inserting Data
Inserts are done through a query. Unlike MySQL, you can't simply insert rows one at a time; I tried this a few times in Hive without success, and everything I found online also inserts via a SELECT statement. This is one of the ways Hive differs from ordinary SQL.
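The general shape of such an insert is sketched below (the table and column names here are placeholders, not taken from the examples that follow); OVERWRITE replaces whatever the target table currently holds, while INTO appends to it.

```sql
-- Replace the contents of the target table with the query result
INSERT OVERWRITE TABLE target_table
SELECT col1, col2
FROM source_table
WHERE col3 = 2012;

-- Append the query result instead of replacing existing data
INSERT INTO TABLE target_table
SELECT col1, col2
FROM source_table
WHERE col3 = 2012;
```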
- hive> insert overwrite table salaries select * from salaries_external where year = 2012;
- FAILED: SemanticException [Error 10004]: Line 1:70 Invalid table alias or column reference 'year': (possible column names are: yearid, teamid, lgid, playerid, salary)
- hive> insert overwrite table salaries select * from salaries_external where yearid = 2012;
- Total jobs = 3
- Launching Job 1 out of 3
- Number of reduce tasks is set to 0 since there's no reduce operator
- Starting Job = job_1474609415284_0001, Tracking URL = http://hadoopwy1:8088/proxy/application_1474609415284_0001/
- Kill Command = /usr/local/hadoop2/bin/hadoop job -kill job_1474609415284_0001
- Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
- 2016-09-23 03:59:05,119 Stage-1 map = 0%, reduce = 0%
- 2016-09-23 03:59:33,232 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.13 sec
- MapReduce Total cumulative CPU time: 2 seconds 130 msec
- Ended Job = job_1474609415284_0001
- Stage-4 is selected by condition resolver.
- Stage-3 is filtered out by condition resolver.
- Stage-5 is filtered out by condition resolver.
- Moving data to: hdfs://hadoopnodeservice1/tmp/hive-root/hive_2016-09-23_03-57-59_773_846459559745817905 -1/-ext-10000
- Loading data to table default.salaries
- rmr: DEPRECATED: Please use 'rm -r' instead.
- Deleted hdfs://hadoopnodeservice1/user/hive/warehouse/salaries
- Table default.salaries stats: [numFiles=1, numRows=1696, totalSize=52808, rawDataSize=51112]
- MapReduce Jobs Launched:
- Job 0: Map: 1 Cumulative CPU: 2.13 sec HDFS Read: 1354022 HDFS Write: 52885 SUCCESS
- Total MapReduce CPU Time Spent: 2 seconds 130 msec
- OK
- Time taken: 98.887 seconds
Querying the salaries table afterwards shows the newly loaded 2012 rows (only the tail of the output is shown here):
- 2012 WAS NL carrobr01 580000.0
- 2012 WAS NL desmoia01 512500.0
- 2012 WAS NL espinda01 506000.0
- 2012 WAS NL storedr01 498750.0
- 2012 WAS NL bernaro01 493500.0
- 2012 WAS NL ramoswi01 491250.0
- 2012 WAS NL rodrihe03 491000.0
- 2012 WAS NL detwiro01 485000.0
- 2012 WAS NL stammcr01 485000.0
- 2012 WAS NL marrech01 481000.0
- 2012 WAS NL matthry01 481000.0
- 2012 WAS NL lombast02 481000.0
- Time taken: 0.077 seconds, Fetched: 1696 row(s)
Inserting Data into a Specified Partition
This command requires that the table was already partitioned in some way when it was created.
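For reference, a partitioned table matching the describe output below could have been created with DDL along these lines (a sketch only; the exact DDL and the field delimiter are assumptions, since the original post does not show them):

```sql
CREATE TABLE salaries_partition (
  teamid   STRING,
  playerid STRING,
  lgid     STRING,
  salary   FLOAT
)
PARTITIONED BY (yearid INT)                    -- yearid becomes the partition column
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','; -- delimiter is an assumption
```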
- hive> describe salaries_partition;
- OK
- teamid string
- playerid string
- lgid string
- salary float
- yearid int
-
- # Partition Information
- # col_name data_type comment
-
- yearid int
- Time taken: 0.122 seconds, Fetched: 10 row(s)
- hive> insert overwrite table salaries_partition partition(yearid = 2012) select ex.teamid, ex.playerid, ex.lgid, ex.salary from salaries_external ex where ex.yearid = 2012;
- Total jobs = 3
- Launching Job 1 out of 3
- Number of reduce tasks is set to 0 since there's no reduce operator
- Starting Job = job_1474609415284_0005, Tracking URL = http://hadoopwy1:8088/proxy/application_1474609415284_0005/
- Kill Command = /usr/local/hadoop2/bin/hadoop job -kill job_1474609415284_0005
- Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
- 2016-09-23 05:10:09,350 Stage-1 map = 0%, reduce = 0%
- 2016-09-23 05:10:25,157 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.79 sec
- MapReduce Total cumulative CPU time: 1 seconds 790 msec
- Ended Job = job_1474609415284_0005
- Stage-4 is selected by condition resolver.
- Stage-3 is filtered out by condition resolver.
- Stage-5 is filtered out by condition resolver.
- Moving data to: hdfs://hadoopnodeservice1/tmp/hive-root/hive_2016-09-23_05-09-55_703_8904828298967067898-1/-ext-10000
- Loading data to table default.salaries_partition partition (yearid=2012)
- Partition default.salaries_partition{yearid=2012} stats: [numFiles=1, numRows=1696, totalSize=44328, rawDataSize=42632]
- MapReduce Jobs Launched:
- Job 0: Map: 1 Cumulative CPU: 1.79 sec HDFS Read: 1354022 HDFS Write: 44427 SUCCESS
- Total MapReduce CPU Time Spent: 1 seconds 790 msec
- OK
- Time taken: 32.188 seconds
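To confirm that the static insert above really produced the yearid=2012 partition, the partition list can be checked; this step is not in the original transcript, but it is a standard Hive command:

```sql
SHOW PARTITIONS salaries_partition;
-- expected output: yearid=2012
```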
- hive> insert overwrite table salaries_partition partition(yearid) select ex.teamid, ex.playerid, ex.lgid, ex.salary from salaries_external ex;
- FAILED: SemanticException [Error 10096]: Dynamic partition strict mode requires at least one static partition column. To turn this off set hive.exec.dynamic.partition.mode=nonstrict
An exception appears here: Hive is in strict mode, which does not allow a fully dynamic partition insert, so you have to switch from strict to nonstrict mode.
Changing the mode by itself is still not enough, though. The following three parameters need to be set (the third one is shown being set in the transcript below).
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
- hive> set hive.exec.max.dynamic.partitions.pernode=3000;
- hive> insert overwrite table salaries_partition partition(yearid) select ex.yearid, ex.teamid, ex.playerid, ex.lgid, ex.salary from salaries_external ex;
- Total jobs = 1
- Launching Job 1 out of 1
- Number of reduce tasks not specified. Estimated from input data size: 1
- In order to change the average load for a reducer (in bytes):
- set hive.exec.reducers.bytes.per.reducer=<number>
- In order to limit the maximum number of reducers:
- set hive.exec.reducers.max=<number>
- In order to set a constant number of reducers:
- set mapreduce.job.reduces=<number>
- Starting Job = job_1474609415284_0007, Tracking URL = http://hadoopwy1:8088/proxy/application_1474609415284_0007/
- Kill Command = /usr/local/hadoop2/bin/hadoop job -kill job_1474609415284_0007
- Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
- 2016-09-23 06:20:01,525 Stage-1 map = 0%, reduce = 0%
- 2016-09-23 06:20:32,504 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 3.1 sec
- 2016-09-23 06:20:55,706 Stage-1 map = 100%, reduce = 68%, Cumulative CPU 5.39 sec
- 2016-09-23 06:20:58,864 Stage-1 map = 100%, reduce = 70%, Cumulative CPU 6.02 sec
- 2016-09-23 06:21:02,023 Stage-1 map = 100%, reduce = 72%, Cumulative CPU 6.53 sec
- 2016-09-23 06:21:05,182 Stage-1 map = 100%, reduce = 73%, Cumulative CPU 6.99 sec
- 2016-09-23 06:21:08,338 Stage-1 map = 100%, reduce = 75%, Cumulative CPU 7.61 sec
- 2016-09-23 06:21:10,432 Stage-1 map = 100%, reduce = 76%, Cumulative CPU 8.12 sec
- 2016-09-23 06:21:13,578 Stage-1 map = 100%, reduce = 77%, Cumulative CPU 8.57 sec
- 2016-09-23 06:21:16,729 Stage-1 map = 100%, reduce = 78%, Cumulative CPU 9.08 sec
- 2016-09-23 06:21:19,883 Stage-1 map = 100%, reduce = 79%, Cumulative CPU 9.55 sec
- 2016-09-23 06:21:23,085 Stage-1 map = 100%, reduce = 80%, Cumulative CPU 10.01 sec
- 2016-09-23 06:21:26,218 Stage-1 map = 100%, reduce = 81%, Cumulative CPU 10.45 sec
- 2016-09-23 06:21:32,505 Stage-1 map = 100%, reduce = 82%, Cumulative CPU 11.36 sec
- 2016-09-23 06:21:37,774 Stage-1 map = 100%, reduce = 83%, Cumulative CPU 12.21 sec
- 2016-09-23 06:21:44,050 Stage-1 map = 100%, reduce = 84%, Cumulative CPU 13.06 sec
- 2016-09-23 06:21:47,205 Stage-1 map = 100%, reduce = 85%, Cumulative CPU 13.48 sec
- 2016-09-23 06:21:50,352 Stage-1 map = 100%, reduce = 86%, Cumulative CPU 13.92 sec
- 2016-09-23 06:21:53,538 Stage-1 map = 100%, reduce = 87%, Cumulative CPU 14.3 sec
- 2016-09-23 06:21:56,694 Stage-1 map = 100%, reduce = 88%, Cumulative CPU 14.72 sec
- 2016-09-23 06:21:59,850 Stage-1 map = 100%, reduce = 89%, Cumulative CPU 15.15 sec
- 2016-09-23 06:22:05,074 Stage-1 map = 100%, reduce = 90%, Cumulative CPU 16.09 sec
- 2016-09-23 06:22:08,220 Stage-1 map = 100%, reduce = 91%, Cumulative CPU 16.51 sec
- 2016-09-23 06:22:14,509 Stage-1 map = 100%, reduce = 92%, Cumulative CPU 17.33 sec
- 2016-09-23 06:22:17,665 Stage-1 map = 100%, reduce = 93%, Cumulative CPU 17.77 sec
- 2016-09-23 06:22:20,830 Stage-1 map = 100%, reduce = 94%, Cumulative CPU 18.24 sec
- 2016-09-23 06:22:26,128 Stage-1 map = 100%, reduce = 95%, Cumulative CPU 19.1 sec
- 2016-09-23 06:22:29,284 Stage-1 map = 100%, reduce = 96%, Cumulative CPU 19.42 sec
- 2016-09-23 06:22:35,616 Stage-1 map = 100%, reduce = 97%, Cumulative CPU 20.3 sec
- 2016-09-23 06:22:41,909 Stage-1 map = 100%, reduce = 98%, Cumulative CPU 21.12 sec
- 2016-09-23 06:22:48,204 Stage-1 map = 100%, reduce = 99%, Cumulative CPU 21.91 sec
- 2016-09-23 06:22:59,670 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 23.45 sec
- 2016-09-23 06:24:00,528 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 23.88 sec
- MapReduce Total cumulative CPU time: 23 seconds 880 msec
- Ended Job = job_1474609415284_0007
- Loading data to table default.salaries_partition partition (yearid=null)
- Failed with exception Number of dynamic partitions created is 2881, which is more than 1000. To solve this try to set hive.exec.max.dynamic.partitions to at least 2881.
- FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
- MapReduce Jobs Launched:
- Job 0: Map: 1 Reduce: 1 Cumulative CPU: 28.11 sec HDFS Read: 1354022 HDFS Write: 1191371 SUCCESS
- Total MapReduce CPU Time Spent: 28 seconds 110 msec
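Even with nonstrict mode and a higher per-node limit, the job above still fails at the load step: 2881 dynamic partitions were created, which exceeds the global default of 1000 (hive.exec.max.dynamic.partitions). Note also that in a dynamic-partition insert the partition column must be the last column in the SELECT list; in the query above yearid is selected first, so Hive actually partitions on the last column (salary), which is presumably why 2881 partitions appear instead of one per year. A corrected attempt would look roughly like this (the limit values are examples):

```sql
SET hive.exec.dynamic.partition=true;               -- enable dynamic partitioning
SET hive.exec.dynamic.partition.mode=nonstrict;     -- allow all partition columns to be dynamic
SET hive.exec.max.dynamic.partitions.pernode=3000;  -- per-node limit
SET hive.exec.max.dynamic.partitions=3000;          -- global limit; must cover every partition created

INSERT OVERWRITE TABLE salaries_partition PARTITION (yearid)
SELECT ex.teamid, ex.playerid, ex.lgid, ex.salary, ex.yearid  -- partition column last
FROM salaries_external ex;
```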
Exporting Data
Export the data to files in a local directory.
- hive> insert overwrite local directory '/usr/local/temp'
- > select ex.yearid, ex.teamid, ex.playerid, ex.salary
- > from salaries_external ex;
- Total jobs = 1
- Launching Job 1 out of 1
- Number of reduce tasks is set to 0 since there's no reduce operator
- Starting Job = job_1474609415284_0008, Tracking URL = http://hadoopwy1:8088/proxy/application_1474609415284_0008/
- Kill Command = /usr/local/hadoop2/bin/hadoop job -kill job_1474609415284_0008
- Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
- 2016-09-23 06:54:25,754 Stage-1 map = 0%, reduce = 0%
- 2016-09-23 06:54:42,450 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.56 sec
- MapReduce Total cumulative CPU time: 1 seconds 560 msec
- Ended Job = job_1474609415284_0008
- Copying data to local directory /usr/local/temp
- Copying data to local directory /usr/local/temp
- MapReduce Jobs Launched:
- Job 0: Map: 1 Cumulative CPU: 1.56 sec HDFS Read: 1354022 HDFS Write: 1302986 SUCCESS
- Total MapReduce CPU Time Spent: 1 seconds 560 msec
- OK
- Time taken: 44.155 seconds
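Two variations may be worth noting, depending on the Hive version in use: dropping the LOCAL keyword writes the result to an HDFS directory instead of the local filesystem, and for local exports on Hive 0.11 and later a row format can be specified so the output uses a readable delimiter instead of the default ^A character. Sketches only, with example paths:

```sql
-- Export to HDFS instead of the local filesystem
INSERT OVERWRITE DIRECTORY '/tmp/salaries_export'
SELECT ex.yearid, ex.teamid, ex.playerid, ex.salary
FROM salaries_external ex;

-- Local export with an explicit field delimiter (Hive 0.11+)
INSERT OVERWRITE LOCAL DIRECTORY '/usr/local/temp_csv'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
SELECT ex.yearid, ex.teamid, ex.playerid, ex.salary
FROM salaries_external ex;
```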