Details
-
Bug
-
Status: Open
-
Major
-
Resolution: Unresolved
-
0.12.0
-
None
-
None
Description
I've noticed that HiveColumnarLoader will thrown java.io.FileNotFoundException when used with glob path on Hadoop 2.0. It will run just fine on Hadoop 1.0:
Failed to parse: java.io.FileNotFoundException: File /home/jarcec/cloudera/repos/pig/contrib/piggybank/java/simpleDataDir1395623312698/*.txt does not exist at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:198) at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1676) at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1623) at org.apache.pig.PigServer.registerQuery(PigServer.java:575) at org.apache.pig.PigServer.registerQuery(PigServer.java:588) at org.apache.pig.piggybank.test.storage.TestHiveColumnarLoader.testHdfdsGlobbing(TestHiveColumnarLoader.java:220) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at junit.framework.TestCase.runTest(TestCase.java:176) at junit.framework.TestCase.runBare(TestCase.java:141) at junit.framework.TestResult$1.protect(TestResult.java:122) at junit.framework.TestResult.runProtected(TestResult.java:142) at junit.framework.TestResult.run(TestResult.java:125) at junit.framework.TestCase.run(TestCase.java:129) at junit.framework.TestSuite.runTest(TestSuite.java:255) at junit.framework.TestSuite.run(TestSuite.java:250) at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:518) at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1052) at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:906) Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: File /home/jarcec/cloudera/repos/pig/contrib/piggybank/java/simpleDataDir1395623312698/*.txt does not exist at org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:362) at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1484) at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1524) at org.apache.hadoop.fs.ChecksumFileSystem.listStatus(ChecksumFileSystem.java:564) at org.apache.pig.piggybank.storage.partition.PathPartitioner.getPartitionKeys(PathPartitioner.java:105) at org.apache.pig.piggybank.storage.partition.PathPartitionHelper.getPartitionKeys(PathPartitionHelper.java:101) at org.apache.pig.piggybank.storage.HiveColumnarLoader.getPartitionColumns(HiveColumnarLoader.java:576) at org.apache.pig.piggybank.storage.HiveColumnarLoader.getSchema(HiveColumnarLoader.java:646) at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:175) at org.apache.pig.newplan.logical.relational.LOLoad.<init>(LOLoad.java:89) at org.apache.pig.parser.LogicalPlanBuilder.buildLoadOp(LogicalPlanBuilder.java:853) at org.apache.pig.parser.LogicalPlanGenerator.load_clause(LogicalPlanGenerator.java:3479) at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1536) at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1013) at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:553) at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421) at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:188) ... 20 more Caused by: java.io.FileNotFoundException: File /home/jarcec/cloudera/repos/pig/contrib/piggybank/java/simpleDataDir1395623312698/*.txt does not exist ... 37 more
I've dived into the problem and found a difference in Hadoop implementation of DistributedFileSystem. For non existing directory method listStatus will return null in Hadoop 1:
if (thisListing == null) { // the directory does not exist return null; }
But will thrown an exception in Hadoop 2:
if (thisListing == null) { // the directory does not exist throw new FileNotFoundException("File " + p + " does not exist."); }
Attachments
Attachments
Issue Links
- links to