I am learning Hadoop and have created a simple Pig script. Reading a file works, but writing one does not. The script runs fine, and dump f shows me 10 records, as expected. But when I store the same relation to a file (store f into 'result.csv';), there are odd messages on the console and, for some reason, the result file in the end only contains the first 3 records.
My questions are:

- What is the matter with the IOException, when reading worked and writing worked at least partly?
- Why does the console tell me "Total records written : 0" when 3 records have in fact been written?
- Why didn't the store write all 10 records, as expected?
My script (it's just a sandbox for playing around):

cd /user/samples
c = load 'crimes.csv' using PigStorage(',') as (id:int,case_number:int,date:chararray,block:chararray,iucr:chararray,primary_type,description,locationdescription,arrest:boolean,domestic,beat,district,ward,communityarea,fbicode,xcoordinate,ycoordinate,year,updatedon,latitude,longitude,location);
c = limit c 1000;
t = foreach c generate id, date, arrest, year;
f = filter t by arrest == true;
f = limit f 10;
dump f;
store f into 'result.csv';
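In case it helps, this is how I looked at what actually got written (run from the Grunt shell; the exact part-file name depends on the job, hence the wildcard, and the absolute path matches the output location shown in the statistics below):

fs -ls /user/samples/result.csv
fs -cat /user/samples/result.csv/part-*

The part file is where I see only the first 3 of the 10 records that dump f printed.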
Part of the console output:
2016-07-21 15:55:07,435 [main] INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-07-21 15:55:07,537 [main] WARN  org.apache.pig.tools.pigstats.mapreduce.MRJobStats - Unable to get job counters
java.io.IOException: java.io.IOException: java.net.ConnectException: Call From m1.hdp2/192.168.178.201 to 0.0.0.0:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
    at org.apache.pig.backend.hadoop.executionengine.shims.HadoopShims.getCounters(HadoopShims.java:132)
    at org.apache.pig.tools.pigstats.mapreduce.MRJobStats.addCounters(MRJobStats.java:284)
    at org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil.addSuccessJobStats(MRPigStatsUtil.java:235)
    at org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil.accumulateStats(MRPigStatsUtil.java:165)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:360)
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:308)
    at org.apache.pig.PigServer.launchPlan(PigServer.java:1474)
    at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1459)
    at org.apache.pig.PigServer.execute(PigServer.java:1448)
    at org.apache.pig.PigServer.access$500(PigServer.java:118)
    at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1773)
    at org.apache.pig.PigServer.registerQuery(PigServer.java:707)
    at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:1075)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:505)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:231)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:206)
    at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:66)
    at org.apache.pig.Main.run(Main.java:564)
    at org.apache.pig.Main.main(Main.java:176)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.io.IOException: java.net.ConnectException: Call From m1.hdp2/192.168.178.201 to 0.0.0.0:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
    at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:343)
    at org.apache.hadoop.mapred.ClientServiceDelegate.getJobStatus(ClientServiceDelegate.java:428)
    at org.apache.hadoop.mapred.YARNRunner.getJobStatus(YARNRunner.java:572)
    at org.apache.hadoop.mapreduce.Cluster.getJob(Cluster.java:184)
    at org.apache.pig.backend.hadoop.executionengine.shims.HadoopShims.getCounters(HadoopShims.java:126)
    ... 24 more
Caused by: java.net.ConnectException: Call From m1.hdp2/192.168.178.201 to 0.0.0.0:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
    at sun.reflect.GeneratedConstructorAccessor18.newInstance(Unknown Source)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
    at org.apache.hadoop.ipc.Client.call(Client.java:1479)
    at org.apache.hadoop.ipc.Client.call(Client.java:1412)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
    at com.sun.proxy.$Proxy14.getJobReport(Unknown Source)
    at org.apache.hadoop.mapreduce.v2.api.impl.pb.client.MRClientProtocolPBClientImpl.getJobReport(MRClientProtocolPBClientImpl.java:133)
    at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:324)
    ... 28 more
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:614)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:712)
    at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:375)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1528)
    at org.apache.hadoop.ipc.Client.call(Client.java:1451)
    ... 36 more
2016-07-21 15:55:07,540 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2016-07-21 15:55:07,571 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:

HadoopVersion   PigVersion   UserId   StartedAt             FinishedAt            Features
2.7.2           0.16.0       hadoop   2016-07-21 15:50:17   2016-07-21 15:55:07   FILTER,LIMIT

Success!

Job Stats (time in seconds):
JobId                   Maps   Reduces   MaxMapTime   MinMapTime   AvgMapTime   MedianMapTime   MaxReduceTime   MinReduceTime   AvgReduceTime   MedianReducetime   Alias   Feature   Outputs
job_1469130571595_0001  3      1         n/a          n/a          n/a          n/a             n/a             n/a             n/a             n/a                c
job_1469130571595_0002  1      1         n/a          n/a          n/a          n/a             n/a             n/a             n/a             n/a                c,f,t             hdfs://localhost:9000/user/samples/result.csv,

Input(s):
Read 0 records from: "hdfs://localhost:9000/user/samples/crimes.csv"

Output(s):
Stored 0 records in: "hdfs://localhost:9000/user/samples/result.csv"

Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_1469130571595_0001 -> job_1469130571595_0002,
job_1469130571595_0002

2016-07-21 15:55:07,573 [main] INFO  org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /0.0.0.0:8032
2016-07-21 15:55:07,585 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2016-07-21 15:55:08,592 [main] INFO  org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
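In case it is relevant: as far as I can tell, port 10020 in the retry messages is the default port of the MapReduce JobHistory Server (the mapreduce.jobhistory.address property), and I had not started that daemon on my single-node sandbox. Starting it looks roughly like this (a sketch using the Grunt shell's sh command, assuming Hadoop's sbin directory is on the PATH; otherwise the full path to the script is needed):

sh mr-jobhistory-daemon.sh start historyserver

I don't know whether the missing history server also explains the missing records, which is why I am asking.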