Description
Feed replication with empty directories is failing with the following error in the application log:
2016-06-23 08:35:21,475 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Moved tmp to done: hdfs://nat-os-r7-wkqu-falcon-multicluster-10.openstacklocal:8020/mr-history/tmp/hrt_qa/job_1466658266370_0059_conf.xml_tmp to hdfs://nat-os-r7-wkqu-falcon-multicluster-10.openstacklocal:8020/mr-history/tmp/hrt_qa/job_1466658266370_0059_conf.xml
2016-06-23 08:35:21,476 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Moved tmp to done: hdfs://nat-os-r7-wkqu-falcon-multicluster-10.openstacklocal:8020/mr-history/tmp/hrt_qa/job_1466658266370_0059-1466670911340-hrt_qa-distcp%3A+oozie%3Aaction%3AT%3Djava%3AW%3DFALCON_FEED_REPLICAT-1466670921283-0-0-FAILED-default-1466670920070.jhist_tmp to hdfs://nat-os-r7-wkqu-falcon-multicluster-10.openstacklocal:8020/mr-history/tmp/hrt_qa/job_1466658266370_0059-1466670911340-hrt_qa-distcp%3A+oozie%3Aaction%3AT%3Djava%3AW%3DFALCON_FEED_REPLICAT-1466670921283-0-0-FAILED-default-1466670920070.jhist
2016-06-23 08:35:21,477 INFO [Thread-66] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Stopped JobHistoryEventHandler. super.stop()
2016-06-23 08:35:21,479 INFO [Thread-66] org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator: Setting job diagnostics to No of maps and reduces are 0 job_1466658266370_0059
Job commit failed: org.apache.hadoop.tools.CopyListing$InvalidInputException: hdfs://nat-os-r7-wkqu-falcon-multicluster-10.openstacklocal:8020/tmp/falcon-regression/FeedReplicationTest/target/2016/06/23/08/32 doesn't exist
    at org.apache.hadoop.tools.GlobbedCopyListing.doBuildListing(GlobbedCopyListing.java:84)
    at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:86)
    at org.apache.hadoop.tools.mapred.CopyCommitter.deleteMissing(CopyCommitter.java:241)
    at org.apache.hadoop.tools.mapred.CopyCommitter.commitJob(CopyCommitter.java:94)
    at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobCommit(CommitterEventHandler.java:285)
    at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:237)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Feed submitted:
<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="uri:falcon:feed:0.1" name="A7769e4e0-49663d60" description="Input File">
    <partitions>
        <partition name="colo"/>
        <partition name="eventTime"/>
        <partition name="impressionHour"/>
        <partition name="pricingModel"/>
    </partitions>
    <availabilityFlag>availabilityFlag.txt</availabilityFlag>
    <frequency>minutes(5)</frequency>
    <late-arrival cut-off="days(100000)"/>
    <clusters>
        <cluster name="A7769e4e0-0af6c74b" type="source">
            <validity start="2016-06-23T08:32Z" end="2016-06-23T08:37Z"/>
            <retention limit="days(1000000)" action="delete"/>
        </cluster>
        <cluster name="A7769e4e0-25f87f0e" type="target">
            <validity start="2016-06-23T08:32Z" end="2016-06-23T08:37Z"/>
            <retention limit="days(1000000)" action="delete"/>
            <locations>
                <location type="data" path="/tmp/falcon-regression/FeedReplicationTest/target/${YEAR}/${MONTH}/${DAY}/${HOUR}/${MINUTE}"/>
            </locations>
        </cluster>
    </clusters>
    <locations>
        <location type="data" path="/tmp/falcon-regression/FeedReplicationTest/source/${YEAR}/${MONTH}/${DAY}/${HOUR}/${MINUTE}"/>
        <location type="stats" path="/data/regression/fetlrc/billing/stats"/>
        <location type="meta" path="/data/regression/fetlrc/billing/metadata"/>
    </locations>
    <ACL owner="hrt_qa" group="users" permission="0x755"/>
    <schema location="/databus/streams_local/click_rr/schema/" provider="protobuf"/>
    <properties>
        <property name="field1" value="value1"/>
        <property name="field2" value="value2"/>
        <property name="job.counter" value="true"/>
    </properties>
</feed>
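The job fails because the target directory for the feed instance does not exist. As the stack trace shows, CopyCommitter.deleteMissing builds a copy listing over the target path during job commit, and that listing throws InvalidInputException when the path is missing.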
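To make the link between the feed definition and the failing path explicit, here is a minimal sketch that expands the target location template for the instance at the validity start. It is only an illustration and not Falcon's own path resolution: the function name `expand_instance_path` is hypothetical, and the zero-padded field formats are inferred from the path in the exception.

```python
from datetime import datetime, timezone
from string import Template

# Target data location copied from the submitted feed.
TARGET_TEMPLATE = (
    "/tmp/falcon-regression/FeedReplicationTest/target/"
    "${YEAR}/${MONTH}/${DAY}/${HOUR}/${MINUTE}"
)


def expand_instance_path(template: str, instance_time: datetime) -> str:
    """Hypothetical helper: substitute the time fields of one feed instance.

    Assumes zero-padded fields, as seen in the path reported by the exception.
    """
    fields = {
        "YEAR": f"{instance_time.year:04d}",
        "MONTH": f"{instance_time.month:02d}",
        "DAY": f"{instance_time.day:02d}",
        "HOUR": f"{instance_time.hour:02d}",
        "MINUTE": f"{instance_time.minute:02d}",
    }
    return Template(template).substitute(fields)


# Validity start of the target cluster: 2016-06-23T08:32Z.
instance = datetime(2016, 6, 23, 8, 32, tzinfo=timezone.utc)
print(expand_instance_path(TARGET_TEMPLATE, instance))
```

The expanded path matches the one named in the InvalidInputException, which suggests the directory that must exist (or be tolerated as missing) is the per-instance target directory rather than the template root.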