Description
MAPREDUCE-1981 introduced a new FileSystem API, listLocatedStatus, which is now used in Hadoop's FileInputFormat.getSplits(). Hive's ProxyFileSystem class needs to implement this API for the Hive unit tests to pass.
Otherwise, the TestCliDriver test cases fail with exceptions like the following, e.g. when running allcolref_in_udf.q:
[junit] Running org.apache.hadoop.hive.cli.TestCliDriver
[junit] Begin query: allcolref_in_udf.q
[junit] java.lang.IllegalArgumentException: Wrong FS: pfile:/GitHub/Monarch/project/hive-monarch/build/ql/test/data/warehouse/src, expected: file:///
[junit]     at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:642)
[junit]     at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:69)
[junit]     at org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:375)
[junit]     at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1482)
[junit]     at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1522)
[junit]     at org.apache.hadoop.fs.FileSystem$4.<init>(FileSystem.java:1798)
[junit]     at org.apache.hadoop.fs.FileSystem.listLocatedStatus(FileSystem.java:1797)
[junit]     at org.apache.hadoop.fs.ChecksumFileSystem.listLocatedStatus(ChecksumFileSystem.java:579)
[junit]     at org.apache.hadoop.fs.FilterFileSystem.listLocatedStatus(FilterFileSystem.java:235)
[junit]     at org.apache.hadoop.fs.FilterFileSystem.listLocatedStatus(FilterFileSystem.java:235)
[junit]     at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:264)
[junit]     at org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:217)
[junit]     at org.apache.hadoop.mapred.lib.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:69)
[junit]     at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:385)
[junit]     at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:351)
[junit]     at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:389)
[junit]     at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:503)
[junit]     at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:495)
[junit]     at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:390)
[junit]     at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268)
[junit]     at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265)
[junit]     at java.security.AccessController.doPrivileged(Native Method)
[junit]     at javax.security.auth.Subject.doAs(Subject.java:396)
[junit]     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1481)
[junit]     at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265)
[junit]     at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
[junit]     at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:552)
[junit]     at java.security.AccessController.doPrivileged(Native Method)
[junit]     at javax.security.auth.Subject.doAs(Subject.java:396)
[junit]     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1481)
[junit]     at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:552)
[junit]     at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:543)
[junit]     at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:448)
[junit]     at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:688)
[junit]     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[junit]     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
[junit]     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
[junit]     at java.lang.reflect.Method.invoke(Method.java:597)
[junit]     at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
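The trace suggests that listLocatedStatus is passed straight through FilterFileSystem to the local file system, so the pfile: path reaches checkPath without being rewritten to file:. The sketch below models that behaviour and one possible shape of the fix. It is a simplified stand-in built on java.net.URI, not Hive's or Hadoop's actual code: the classes LocalFs and ProxyFs, the swizzle helper, and the part-00000 entry are all hypothetical names chosen for illustration.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Simplified model only: these classes are hypothetical stand-ins,
// not the real Hadoop FileSystem or Hive ProxyFileSystem.
class ProxySchemeSketch {

    // Stand-in for a local file system that accepts only file: paths.
    static class LocalFs {
        void checkPath(URI path) {
            if (!"file".equals(path.getScheme())) {
                throw new IllegalArgumentException(
                    "Wrong FS: " + path + ", expected: file:///");
            }
        }

        // Returns one made-up child entry under the given directory.
        List<URI> listLocatedStatus(URI dir) {
            checkPath(dir);
            List<URI> out = new ArrayList<>();
            out.add(URI.create(dir.toString() + "/part-00000"));
            return out;
        }
    }

    // Stand-in for a proxy that exposes the local file system as pfile:.
    static class ProxyFs {
        private final LocalFs inner = new LocalFs();

        // Replace the scheme of a URI, keeping the rest unchanged.
        static URI swizzle(URI p, String scheme) {
            return URI.create(scheme + ":" + p.getRawSchemeSpecificPart());
        }

        // Models the failure: the pfile: path is forwarded unchanged.
        List<URI> listLocatedStatusUnwrapped(URI dir) {
            return inner.listLocatedStatus(dir);
        }

        // Models the fix: rewrite the argument to file:, then rewrite
        // each returned path back to pfile:.
        List<URI> listLocatedStatus(URI dir) {
            List<URI> out = new ArrayList<>();
            for (URI u : inner.listLocatedStatus(swizzle(dir, "file"))) {
                out.add(swizzle(u, "pfile"));
            }
            return out;
        }
    }

    public static void main(String[] args) {
        ProxyFs fs = new ProxyFs();
        URI dir = URI.create("pfile:/warehouse/src");
        System.out.println(fs.listLocatedStatus(dir));
        try {
            fs.listLocatedStatusUnwrapped(dir);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In the real class the override would operate on Hadoop's Path and LocatedFileStatus types rather than URIs, but the idea is the same: translate the path on the way in and translate the results on the way out.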
Attachments
Issue Links
- is broken by: MAPREDUCE-1981 Improve getSplits performance by using listLocatedStatus (Closed)