Download the latest Pig release from Apache. The download page recommends the fastest mirror site. Download link:
Pig download page
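Extract the archive to the installation path.
Pig execution modes
Local mode: only the PATH entry ${PIG_HOME}/bin is required; suitable for testing.
MapReduce mode: additionally set the environment variable PIG_CLASSPATH=${HADOOP_HOME}/conf/ so that it points to Hadoop's configuration directory. I am using Hadoop 2.6, where that directory is /usr/local/hadoop/etc/hadoop.
Edit /etc/profile with the following command:
sudo vi /etc/profile
Add (adjust PIG_HOME to your own installation path):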
export PIG_HOME=/app/pig-0.13.0
export PIG_CLASSPATH=/usr/local/hadoop/etc/hadoop
export PATH=$PATH:$PIG_HOME/bin
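After reloading the profile (for example with `source /etc/profile`), a quick sanity check can confirm that the variables took effect. The following is a minimal sketch: the function name check_pig_env is made up for illustration and is not part of Pig, and it only inspects the environment without starting Pig.

```shell
# Hypothetical helper (not part of Pig): checks the variables set in /etc/profile.
check_pig_env() {
    # PIG_HOME must be set and contain a bin directory.
    if [ -z "${PIG_HOME:-}" ] || [ ! -d "$PIG_HOME/bin" ]; then
        echo "PIG_HOME is unset or has no bin directory"
        return 1
    fi
    # $PIG_HOME/bin must appear as one entry of PATH.
    case ":$PATH:" in
        *":$PIG_HOME/bin:"*) ;;
        *) echo "PATH does not contain $PIG_HOME/bin"; return 1 ;;
    esac
    # MapReduce mode also needs PIG_CLASSPATH to point at the Hadoop conf directory.
    if [ -z "${PIG_CLASSPATH:-}" ] || [ ! -d "$PIG_CLASSPATH" ]; then
        echo "PIG_CLASSPATH is unset or not a directory (enough for local mode only)"
        return 2
    fi
    echo "Pig environment looks consistent"
}
```

Calling check_pig_env in a fresh login shell prints a single line describing the first problem it finds, or the success message if all three settings are in place.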
Copy the test data to HDFS (test data download link):
hadoop fs -put ncdc_data.txt /input/in1/
The first time I typed the path incorrectly, so Pig kept failing to find the file.
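A mistyped path like the one mentioned above is easier to catch if the upload is followed by an existence check. The sketch below wraps the standard `hadoop fs -mkdir -p`, `hadoop fs -put`, and `hadoop fs -test -e` commands in a hypothetical function named put_and_verify; it assumes the `hadoop` command is on the PATH and that the cluster is running.

```shell
# Hypothetical helper: upload a local file to an HDFS directory and confirm it arrived.
# Usage: put_and_verify ncdc_data.txt /input/in1
put_and_verify() {
    local src="$1" dest_dir="${2%/}"
    local target="$dest_dir/$(basename "$src")"
    hadoop fs -mkdir -p "$dest_dir" || return 1     # create the target directory if missing
    hadoop fs -put "$src" "$dest_dir/" || return 1  # copy the local file into it
    # -test -e exits with 0 only when the path exists in HDFS.
    if hadoop fs -test -e "$target"; then
        echo "OK: $target"
    else
        echo "MISSING: $target"
        return 1
    fi
}
```

The path printed after "OK:" is the one to use in the LOAD statement, which avoids retyping it by hand.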
grunt> A = LOAD '/input/in1/ncdc_data.txt' USING PigStorage(':') AS (year:int, temp:int, quality:int);
grunt> B = FILTER A BY temp != 9999 AND ((chararray)quality matches '[01459]');
or: B = FILTER A BY temp != 9999 AND (quality == 0 OR quality == 1 OR quality == 4 OR quality == 5 OR quality == 9);
grunt> C = GROUP B BY year;
grunt> D = FOREACH C GENERATE group, MAX(B.temp) AS max_temp;
grunt> DUMP D;
2016-11-20 06:02:41,902 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: GROUP_BY,FILTER
2016-11-20 06:02:42,053 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2016-11-20 06:02:42,054 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2016-11-20 06:02:42,067 [main] INFO org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2016-11-20 06:02:42,069 [main] INFO org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, PartitionFilterOptimizer, PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter]}
2016-11-20 06:02:42,107 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2016-11-20 06:02:42,114 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.CombinerOptimizerUtil - Choosing to move algebraic foreach to combiner
2016-11-20 06:02:42,140 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2016-11-20 06:02:42,140 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2016-11-20 06:02:42,241 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2016-11-20 06:02:42,250 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at TEST/192.168.1.124:8032
2016-11-20 06:02:42,263 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2016-11-20 06:02:42,278 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2016-11-20 06:02:42,280 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Reduce phase detected, estimating # of required reducers.
2016-11-20 06:02:42,280 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Using reducer estimator: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator
2016-11-20 06:02:42,308 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=3673672
2016-11-20 06:02:42,308 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting Parallelism to 1
2016-11-20 06:02:42,308 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - This job cannot be converted run in-process
2016-11-20 06:02:43,095 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/pig/pig-0.16.0-core-h2.jar to DistributedCache through /tmp/temp-60624248/tmp72750994/pig-0.16.0-core-h2.jar
2016-11-20 06:02:43,367 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/pig/lib/automaton-1.11-8.jar to DistributedCache through /tmp/temp-60624248/tmp-2105835473/automaton-1.11-8.jar
2016-11-20 06:02:43,518 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/pig/lib/antlr-runtime-3.4.jar to DistributedCache through /tmp/temp-60624248/tmp1218719075/antlr-runtime-3.4.jar
2016-11-20 06:02:43,701 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/pig/lib/joda-time-2.9.3.jar to DistributedCache through /tmp/temp-60624248/tmp-2048402576/joda-time-2.9.3.jar
2016-11-20 06:02:43,707 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2016-11-20 06:02:43,710 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2016-11-20 06:02:43,710 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cacche
2016-11-20 06:02:43,710 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize []
2016-11-20 06:02:43,840 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2016-11-20 06:02:43,847 [JobControl] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at TEST/192.168.1.124:8032
2016-11-20 06:02:44,029 [JobControl] WARN org.apache.hadoop.mapreduce.JobSubmitter - No job jar file set. User classes may not be found. See Job or Job#setJar(String).
2016-11-20 06:02:44,159 [JobControl] INFO org.apache.pig.builtin.PigStorage - Using PigTextInputFormat
2016-11-20 06:02:44,172 [JobControl] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2016-11-20 06:02:44,172 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2016-11-20 06:02:44,350 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2016-11-20 06:02:44,709 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - number of splits:1
2016-11-20 06:02:47,105 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_1479576092520_0006
2016-11-20 06:02:47,816 [JobControl] INFO org.apache.hadoop.mapred.YARNRunner - Job jar is not present. Not adding any jar to the list of resources.
2016-11-20 06:02:53,694 [JobControl] INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1479576092520_0006
2016-11-20 06:02:54,016 [JobControl] INFO org.apache.hadoop.mapreduce.Job - The url to track the job: http://TEST:8088/proxy/application_1479576092520_0006/
2016-11-20 06:02:54,017 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1479576092520_0006
2016-11-20 06:02:54,017 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases A,B,C,D
2016-11-20 06:02:54,017 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: A[5,4],A[-1,-1],B[6,4],D[8,4],C[7,4] C: D[8,4],C[7,4] R: D[8,4]
2016-11-20 06:02:54,252 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2016-11-20 06:02:54,252 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1479576092520_0006]
2016-11-20 06:04:53,944 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 5% complete
2016-11-20 06:04:53,945 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1479576092520_0006]
2016-11-20 06:04:56,974 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 21% complete
2016-11-20 06:04:56,978 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1479576092520_0006]
2016-11-20 06:05:04,031 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 33% complete
2016-11-20 06:05:04,031 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1479576092520_0006]
2016-11-20 06:05:24,319 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
2016-11-20 06:05:24,320 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1479576092520_0006]
2016-11-20 06:10:06,870 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 66% complete
2016-11-20 06:10:06,870 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1479576092520_0006]
2016-11-20 06:10:14,258 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 83% complete
2016-11-20 06:10:14,258 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1479576092520_0006]
2016-11-20 06:10:22,325 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1479576092520_0006]
2016-11-20 06:11:03,514 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at TEST/192.168.1.124:8032
2016-11-20 06:11:03,646 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2016-11-20 06:11:49,363 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at TEST/192.168.1.124:8032
2016-11-20 06:11:49,434 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2016-11-20 06:11:49,883 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at TEST/192.168.1.124:8032
2016-11-20 06:11:49,910 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2016-11-20 06:11:50,354 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2016-11-20 06:11:50,367 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
2.6.0 0.16.0 chb 2016-11-20 06:02:42 2016-11-20 06:11:50 GROUP_BY,FILTER
Success!
Job Stats (time in seconds):
JobId Maps Reduces MaxMapTime MinMapTime AvgMapTime MedianMapTime MaxReduceTime MinReduceTime AvgReduceTime MedianReducetime Alias Feature Outputs
job_1479576092520_0006 1 1 88 88 88 88 310 310 310 310 A,B,C,D GROUP_BY,COMBINER hdfs://192.168.1.124:9000/tmp/temp-60624248/tmp-1087782019,
Input(s):
Successfully read 321146 records (3674048 bytes) from: "/input/in1/ncdc_data.txt"
Output(s):
Successfully stored 43 records (430 bytes) in: "hdfs://192.168.1.124:9000/tmp/temp-60624248/tmp-1087782019"
Counters:
Total records written : 43
Total bytes written : 430
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_1479576092520_0006
2016-11-20 06:11:50,377 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at TEST/192.168.1.124:8032
2016-11-20 06:11:50,397 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2016-11-20 06:11:50,554 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at TEST/192.168.1.124:8032
2016-11-20 06:11:50,573 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2016-11-20 06:11:51,275 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at TEST/192.168.1.124:8032
2016-11-20 06:11:51,349 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2016-11-20 06:11:52,066 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2016-11-20 06:11:52,068 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2016-11-20 06:11:52,069 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2016-11-20 06:11:52,070 [main] INFO org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2016-11-20 06:11:52,528 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2016-11-20 06:11:52,528 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(1901,317)
(1902,261)
(1903,278)
(1904,194)
(1905,278)
(1906,283)
(1907,300)
(1908,322)
(1909,350)
(1910,322)
(1911,322)
(1912,411)
(1913,361)
(1914,378)
(1915,411)
(1916,289)
(1917,478)
(1918,450)
(1919,428)
(1920,344)
(1921,417)
(1922,400)
(1923,394)
(1924,456)
(1925,322)
(1926,411)
(1928,161)
(1929,178)
(1930,311)
(1931,450)
(1932,322)
(1933,411)
(1934,300)
(1935,311)
(1936,389)
(1937,339)
(1938,411)
(1939,433)
(1940,433)
(1941,462)
(1942,278)
(1949,367)
(1953,400)
grunt>
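To sanity-check a result like this on a small sample without a cluster, the same filter, group, and max steps can be approximated with awk. This is only a sketch and not how Pig executes the query: the function name max_temp_by_year and the sample rows are made up for illustration, and the input is assumed to be colon-delimited year:temp:quality lines as in the LOAD statement.

```shell
# Hypothetical helper: reads year:temp:quality lines on stdin and
# prints year:max_temp for valid readings, sorted by year.
max_temp_by_year() {
    awk -F: '
        # Same condition as the FILTER: drop missing readings and bad quality codes.
        $2 != 9999 && $3 ~ /^[01459]$/ {
            temp = $2 + 0
            if (!($1 in max) || temp > max[$1]) max[$1] = temp
        }
        END { for (y in max) print y ":" max[y] }
    ' | sort -t: -k1,1n
}

# Made-up sample rows, not taken from the real data set.
printf '%s\n' \
    '1901:317:1' \
    '1901:-72:1' \
    '1901:9999:1' \
    '1902:400:2' \
    '1902:261:0' | max_temp_by_year
# prints:
# 1901:317
# 1902:261
```

The 9999 row is dropped as a missing reading and the 1902 row with quality code 2 is dropped by the quality check, which mirrors what the FILTER statement does before GROUP and MAX are applied.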
grunt> STORE D INTO 'max_temp' USING PigStorage(':');
2016-11-20 06:28:32,644 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2016-11-20 06:28:32,645 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2016-11-20 06:28:32,925 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.textoutputformat.separator is deprecated. Instead, use mapreduce.output.textoutputformat.separator
2016-11-20 06:28:33,159 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: GROUP_BY,FILTER
2016-11-20 06:28:33,444 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2016-11-20 06:28:33,444 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2016-11-20 06:28:33,447 [main] INFO org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2016-11-20 06:28:33,448 [main] INFO org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, PartitionFilterOptimizer, PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter]}
2016-11-20 06:28:33,496 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2016-11-20 06:28:33,520 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.CombinerOptimizerUtil - Choosing to move algebraic foreach to combiner
2016-11-20 06:28:33,546 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2016-11-20 06:28:33,546 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2016-11-20 06:28:33,751 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2016-11-20 06:28:33,773 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at TEST/192.168.1.124:8032
2016-11-20 06:28:33,781 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2016-11-20 06:28:33,804 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2016-11-20 06:28:33,806 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Reduce phase detected, estimating # of required reducers.
2016-11-20 06:28:33,806 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Using reducer estimator: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator
2016-11-20 06:28:33,826 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=3673672
2016-11-20 06:28:33,826 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting Parallelism to 1
2016-11-20 06:28:33,826 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - This job cannot be converted run in-process
2016-11-20 06:28:36,502 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/pig/pig-0.16.0-core-h2.jar to DistributedCache through /tmp/temp-60624248/tmp-1199985731/pig-0.16.0-core-h2.jar
2016-11-20 06:28:36,765 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/pig/lib/automaton-1.11-8.jar to DistributedCache through /tmp/temp-60624248/tmp721246289/automaton-1.11-8.jar
2016-11-20 06:28:37,076 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/pig/lib/antlr-runtime-3.4.jar to DistributedCache through /tmp/temp-60624248/tmp341502194/antlr-runtime-3.4.jar
2016-11-20 06:28:37,560 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/local/pig/lib/joda-time-2.9.3.jar to DistributedCache through /tmp/temp-60624248/tmp-587981636/joda-time-2.9.3.jar
2016-11-20 06:28:37,567 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2016-11-20 06:28:37,574 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2016-11-20 06:28:37,574 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cacche
2016-11-20 06:28:37,574 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize []
2016-11-20 06:28:37,907 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2016-11-20 06:28:37,943 [JobControl] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at TEST/192.168.1.124:8032
2016-11-20 06:28:38,104 [JobControl] WARN org.apache.hadoop.mapreduce.JobSubmitter - No job jar file set. User classes may not be found. See Job or Job#setJar(String).
2016-11-20 06:28:38,208 [JobControl] INFO org.apache.pig.builtin.PigStorage - Using PigTextInputFormat
2016-11-20 06:28:38,233 [JobControl] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2016-11-20 06:28:38,234 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2016-11-20 06:28:38,249 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2016-11-20 06:28:38,887 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - number of splits:1
2016-11-20 06:28:39,586 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_1479576092520_0007
2016-11-20 06:28:39,610 [JobControl] INFO org.apache.hadoop.mapred.YARNRunner - Job jar is not present. Not adding any jar to the list of resources.
2016-11-20 06:28:39,843 [JobControl] INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1479576092520_0007
2016-11-20 06:28:39,945 [JobControl] INFO org.apache.hadoop.mapreduce.Job - The url to track the job: http://TEST:8088/proxy/application_1479576092520_0007/
2016-11-20 06:28:39,945 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1479576092520_0007
2016-11-20 06:28:39,947 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases A,B,C,D
2016-11-20 06:28:39,947 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: A[5,4],A[-1,-1],B[6,4],D[8,4],C[7,4] C: D[8,4],C[7,4] R: D[8,4]
2016-11-20 06:28:40,011 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2016-11-20 06:28:40,011 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1479576092520_0007]
2016-11-20 06:30:39,691 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 9% complete
2016-11-20 06:30:39,704 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1479576092520_0007]
2016-11-20 06:30:44,340 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 33% complete
2016-11-20 06:30:44,340 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1479576092520_0007]
2016-11-20 06:30:54,464 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
2016-11-20 06:30:54,465 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1479576092520_0007]
2016-11-20 06:32:23,937 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 83% complete
2016-11-20 06:32:23,937 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1479576092520_0007]
2016-11-20 06:32:29,164 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1479576092520_0007]
2016-11-20 06:32:51,921 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at TEST/192.168.1.124:8032
2016-11-20 06:32:52,670 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2016-11-20 06:33:02,007 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at TEST/192.168.1.124:8032
2016-11-20 06:33:02,123 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2016-11-20 06:33:02,537 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at TEST/192.168.1.124:8032
2016-11-20 06:33:02,561 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2016-11-20 06:33:02,822 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2016-11-20 06:33:02,824 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
2.6.0 0.16.0 chb 2016-11-20 06:28:33 2016-11-20 06:33:02 GROUP_BY,FILTER
Success!
Job Stats (time in seconds):
JobId Maps Reduces MaxMapTime MinMapTime AvgMapTime MedianMapTime MaxReduceTime MinReduceTime AvgReduceTime MedianReducetime Alias Feature Outputs
job_1479576092520_0007 1 1 35 35 35 35 97 97 97 97 A,B,C,D GROUP_BY,COMBINER hdfs://192.168.1.124:9000/user/chb/max_temp,
Input(s):
Successfully read 321146 records (3674048 bytes) from: "/input/in1/ncdc_data.txt"
Output(s):
Successfully stored 43 records (387 bytes) in: "hdfs://192.168.1.124:9000/user/chb/max_temp"
Counters:
Total records written : 43
Total bytes written : 387
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_1479576092520_0007
2016-11-20 06:33:02,847 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at TEST/192.168.1.124:8032
2016-11-20 06:33:02,884 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2016-11-20 06:33:03,175 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at TEST/192.168.1.124:8032
2016-11-20 06:33:03,209 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2016-11-20 06:33:03,469 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at TEST/192.168.1.124:8032
2016-11-20 06:33:03,491 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2016-11-20 06:33:03,725 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
grunt>
grunt> cat max_temp
1901:317
1902:261
1903:278
1904:194
1905:278
1906:283
1907:300
1908:322
1909:350
1910:322
1911:322
1912:411
1913:361
1914:378
1915:411
1916:289
1917:478
1918:450
1919:428
1920:344
1921:417
1922:400
1923:394
1924:456
1925:322
1926:411
1928:161
1929:178
1930:311
1931:450
1932:322
1933:411
1934:300
1935:311
1936:389
1937:339
1938:411
1939:433
1940:433
1941:462
1942:278
1949:367
1953:400
grunt>