PIG-2791: Pig does not work with ViewFileSystem

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.10.0
    • Fix Version/s: None
    • Component/s: grunt
    • Labels: None
    • Environment: Pig QE

    Description

      The Yahoo Pig QE team ran into a blocking issue when trying to test Client-Side Mount Tables (CSMT) on a federated cluster with two NameNodes; this blocks Pig testing on Federation.

      Federation relies strongly on the use of CSMT with viewfs. QE found that in this configuration it is not possible to enter the grunt shell, because Pig calls getDefaultReplication() on the filesystem; this call is ambiguous over viewfs and causes Hadoop core to throw an org.apache.hadoop.fs.viewfs.NotInMountpointException: "getDefaultReplication on empty path is invalid".

      This in turn causes Pig to exit with an internal error as follows:

      2012-07-06 22:20:25,657 [main] INFO org.apache.pig.Main - Apache Pig version 0.10.1.0.1206081058 (r1348169) compiled Jun 08 2012, 17:58:42
      2012-07-06 22:20:26,074 [main] WARN org.apache.hadoop.conf.Configuration - mapred.used.genericoptionsparser is deprecated. Instead, use mapreduce.client.genericoptionsparser.used
      2012-07-06 22:20:26,076 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: viewfs:///
      2012-07-06 22:20:26,080 [main] WARN org.apache.hadoop.conf.Configuration - fs.default.name is deprecated. Instead, use fs.defaultFS
      2012-07-06 22:20:26,522 [main] ERROR org.apache.pig.Main - ERROR 2999: Unexpected internal error. getDefaultReplication on empty path is invalid
      2012-07-06 22:20:26,522 [main] WARN org.apache.pig.Main - There is no log file to write to.
      2012-07-06 22:20:26,522 [main] ERROR org.apache.pig.Main - org.apache.hadoop.fs.viewfs.NotInMountpointException: getDefaultReplication on empty path is invalid
      at org.apache.hadoop.fs.viewfs.ViewFileSystem.getDefaultReplication(ViewFileSystem.java:482)
      at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:77)
      at org.apache.pig.backend.hadoop.datastorage.HDataStorage.<init>(HDataStorage.java:58)
      at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:205)
      at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:118)
      at org.apache.pig.impl.PigContext.connect(PigContext.java:208)
      at org.apache.pig.PigServer.<init>(PigServer.java:246)
      at org.apache.pig.PigServer.<init>(PigServer.java:231)
      at org.apache.pig.tools.grunt.Grunt.<init>(Grunt.java:47)
      at org.apache.pig.Main.run(Main.java:487)
      at org.apache.pig.Main.main(Main.java:111)
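
      A minimal sketch of the underlying API issue, assuming a Hadoop 0.23.3/2.x client (the path used below is hypothetical): on viewfs the no-argument getDefaultReplication() cannot be resolved to a single mount point, while the Path-based overload can.

      import org.apache.hadoop.conf.Configuration;
      import org.apache.hadoop.fs.FileSystem;
      import org.apache.hadoop.fs.Path;

      public class ViewFsDefaultsDemo {
          public static void main(String[] args) throws Exception {
              Configuration conf = new Configuration();
              FileSystem fs = FileSystem.get(conf); // viewfs:/// when CSMT is configured

              // fs.getDefaultReplication();
              // On ViewFileSystem the no-arg call above throws
              // NotInMountpointException: "getDefaultReplication on empty path is invalid".

              // The Path-based overload resolves the mount point for the given
              // path and returns that filesystem's default value.
              Path p = new Path("/data1/singlefile/studenttab10k"); // hypothetical path
              short replication = fs.getDefaultReplication(p);
              System.out.println("default replication for " + p + ": " + replication);
          }
      }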

    Attachments

      1. PIG-2791-0.patch
        1 kB
        Daniel Dai
      2. PIG-2791-1.patch
        7 kB
        Rohini Palaniswamy
      3. PIG-2791-2.patch
        7 kB
        Rohini Palaniswamy
      4. asf_test_notes.txt
        7 kB
        Araceli Henley
      5. PIG-2791-3-branch10.patch
        15 kB
        Rohini Palaniswamy
      6. PIG-2791-3-trunk.patch
        16 kB
        Rohini Palaniswamy
      7. PIG-2791-4-branch10.patch
        15 kB
        Rohini Palaniswamy
      8. PIG-2791-4-trunk.patch
        16 kB
        Rohini Palaniswamy
      9. PIG-2791-5-trunk.patch
        15 kB
        Rohini Palaniswamy
      10. FixMiniCluster-branch10.patch
        2 kB
        Rohini Palaniswamy
      11. FixMiniCluster-branch10-1.patch
        1 kB
        Rohini Palaniswamy

    Activity

          Daniel Dai added a comment -

          Committed to trunk.

          Rohini Palaniswamy added a comment -

          Daniel,
          You have not committed the patch to trunk yet.

          Daniel Dai added a comment -

          Committed FixMiniCluster-branch10-1.patch to 0.10.

          Daryn Sharp added a comment -

          Updated summary since the issue is unrelated to federation.

          Rohini Palaniswamy added a comment -

          Updated FixMiniCluster-branch10-1.patch as it did not apply cleanly (https://reviews.apache.org/r/6082/).

          Removed the change to build/classes, as porting PIG-2326 to branch-0.10 will include that change and should handle the problem of multiple builds failing on the same build machine.

          Rohini Palaniswamy added a comment -

          Uploaded the change for branch-0.10 to put back conf_file.delete and change homedir/pigtest/conf to build/classes.

          Rohini Palaniswamy added a comment -

          Daniel,
          Realized what caused your test failure. I had removed conf_file.delete() in MiniDFSCluster.java as it was causing tests to randomly fail without hadoop-site.xml if two builds (patch builds and actual builds) were running simultaneously in hudson. If you switch between versions (23 and then 20), the hadoop-site.xml created in /<homedir>/pigtest/conf is not correct and causes a failure during MiniDfsCluster setup. Saw that in 0.10 it is created in build/classes instead of the home dir, and that fixes the problem better. Updated the trunk patch, putting the conf_file.delete() back.

          Should I create a separate jira for putting conf_file.delete back in Pig 0.10 with build/classes, or post the patch here itself, as this jira is not closed yet? Don't want developers to waste time debugging this as an issue.

          Daniel Dai added a comment -

          Committed PIG-2791-4-branch10.patch to the 0.10 branch. Will check into trunk once trunk tests pass.

          Rohini Palaniswamy added a comment -

          Just made a minor change to the patch so that all the tests also pass if the ivy dependency is changed to 0.23.3. Changed Util.Hadoop2_0() to (Util.isHadoop23 || Util.isHadoop2_0()) for the fs copy command.

          Daryn Sharp added a comment -

          +1 Rohini and I spoke and she cleared up my confusion. Looks good!

          Rohini Palaniswamy added a comment -

          Daryn,
          The HadoopShims in hadoop20 is for the 20.x and 1.x versions of hadoop and does not use the Path-based variant of the API. The HadoopShims in hadoop23 is for the 23 and 2.0 versions of hadoop and uses the Path-based variant of the API. I am assuming you have mistaken 20 for 2.0.
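
          A rough sketch of the shim split described above; the method mirrors the getDefaultBlockSize(Path) shim mentioned elsewhere in this issue, but the file layout and exact signatures here are illustrative, not Pig's actual source.

          // File 1 (sketch): shims for hadoop 0.20.x / 1.x. No Path-based
          // overload exists there, and neither does ViewFileSystem, so the
          // deprecated no-arg call is safe.
          import org.apache.hadoop.fs.FileSystem;
          import org.apache.hadoop.fs.Path;

          public class HadoopShims {
              public static long getDefaultBlockSize(FileSystem fs, Path path) {
                  return fs.getDefaultBlockSize();
              }
          }

          // File 2 (sketch): shims for hadoop 0.23 / 2.0. The Path-based
          // overload resolves the viewfs mount point for the given path.
          import org.apache.hadoop.fs.FileSystem;
          import org.apache.hadoop.fs.Path;

          public class HadoopShims {
              public static long getDefaultBlockSize(FileSystem fs, Path path) {
                  return fs.getDefaultBlockSize(path);
              }
          }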

          Daryn Sharp added a comment -

          Question: why does the shim use the Path-based variant for 23 but not 2.0? 23, 2.0, and trunk all require use of the new API. The new API is also present in later versions of 1.x but isn't strictly required, since ViewFileSystem isn't in 1.x.

          Rohini Palaniswamy added a comment -

          Had also tried the latest 0.23.3-SNAPSHOT instead of 2.0.0-alpha. Had the same unit test failures, so decided to go with 2.0.0-alpha itself as the dependency.

          Rohini Palaniswamy added a comment -

          In addition to PIG-2791-2.patch, this patch contains fixes for the unit tests for hadoop 23. Even after including Cheolsoo's patch from PIG-2700, there were a lot of unit test failures.

          1) The MiniYarnCluster node manager would not start, with guice errors due to jersey-guice-1.8 and guice-2.0 incompatibility. Took 2 days to figure this one out. Had to change the dependency to guice-3.0 and add a few more dependencies.
          2) FSShell copy command behaviour has changed. Had to do a mkdir before copy commands, as copy commands do not create the destination directory structure if it does not exist (see the sketch below). Also had to add creation of the fs working directory (the user home directory in dfs) while creating the MiniCluster.

          Ran the full suite of unit and e2e tests for hadoop 23 with branch 10.
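
          A minimal sketch of the workaround in (2), assuming Hadoop 0.23/2.x FileSystem semantics; the paths used here are hypothetical.

          import org.apache.hadoop.conf.Configuration;
          import org.apache.hadoop.fs.FileSystem;
          import org.apache.hadoop.fs.Path;

          public class CopyWithMkdir {
              public static void main(String[] args) throws Exception {
                  Configuration conf = new Configuration();
                  FileSystem fs = FileSystem.get(conf);

                  // In 0.23/2.x the copy no longer creates missing destination
                  // directories, so create them explicitly first.
                  Path dst = new Path("/user/pig/tests/data"); // hypothetical destination
                  if (!fs.exists(dst)) {
                      fs.mkdirs(dst);
                  }
                  fs.copyFromLocalFile(new Path("input/studenttab10k"), dst);
              }
          }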

          Rohini Palaniswamy added a comment -

          @Cheolsoo
          Thanks for the pointer. I had only run the test-commit unit tests. I will include your patch also and do a full unit test suite run to see if there are any other issues. Thanks again.

          Araceli Henley added a comment -

          Short answer: the patch looks good.
          Comments:
          Here is a list of the manual tests I attempted. I tried running without a client-side mount table and with a client-side mount table; with hdfs as the default and viewfs as the default; with two name nodes; and reading and writing across namenodes. It looks pretty good. There were some errors, but I believe they are expected due to invalid syntax.

          Cheolsoo Park added a comment -

          Hi, I did some work to get the unit tests passing against 2.0.0-SNAPSHOT a while ago and recorded what I found at PIG-2700.

          Please feel free to use the patch if that's still applicable.

          Thanks!

          Rohini Palaniswamy added a comment -

          @Daryn,
          The path variable could refer to other schemes like hbase://. Even though the variable is not used elsewhere, I did not want to reassign it a different value and change its meaning. Also, there is a fs.setWorkingDirectory() in the code after that.

          Source code: http://svn.apache.org/viewvc/pig/branches/branch-0.10/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigInputFormat.java?view=markup [Line 232]

          @Daniel,
          It should not be a problem as long as the APIs are compatible. Only if a class changed from interface to abstract class (or vice versa) between 0.23.1 and 2.0.0-alpha might there be runtime issues even if compilation was successful. But just to be sure, I will kick off the e2e tests for 23 with the patch. I will also check with the core team folks who were doing 0.23 to 2.0.0 tests.

          Daryn Sharp added a comment -

          You might consider replacing fs = new Path("/").getFileSystem(conf) with fs = FileSystem.get(conf) to get the default filesystem. I'd suggest adding path = fs.getWorkingDirectory() after that line so the isFsPath boolean can be removed.

          Do you need to add anything to support the new API on trunk?
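
          A sketch of the suggestion above, with the surrounding PigInputFormat context assumed; the class name here is hypothetical.

          import org.apache.hadoop.conf.Configuration;
          import org.apache.hadoop.fs.FileSystem;
          import org.apache.hadoop.fs.Path;

          public class DefaultFsSketch {
              public static void main(String[] args) throws Exception {
                  Configuration conf = new Configuration();
                  // Before (as described above):
                  //   FileSystem fs = new Path("/").getFileSystem(conf);
                  // After: get the default filesystem directly, then derive the
                  // path from its working directory so the isFsPath boolean
                  // becomes unnecessary.
                  FileSystem fs = FileSystem.get(conf);
                  Path path = fs.getWorkingDirectory();
                  long blockSize = fs.getDefaultBlockSize(path); // Path-based overload
                  System.out.println("default block size at " + path + ": " + blockSize);
              }
          }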

          Daniel Dai added a comment -

          2.0.0-alpha seems to be a big change. We need to carefully test it to make sure unit tests pass.

          Rohini Palaniswamy added a comment -

          Added getDefaultBlockSize(Path) to the HadoopShims.

          Had to use 2.0.0-alpha (http://central.maven.org/maven2/org/apache/hadoop/hadoop-common/), as 0.23.1 from maven did not have the fs.getDefaultBlockSize(Path) API.

          Daryn Sharp added a comment -

          The problem is there is no single replication factor or block size with viewfs. It's dependent on each mounted fs. The getDefaultBlockSize() and getDefaultReplication() were deprecated in favor of methods that accept a Path. This allows a fs like viewfs to resolve the mount point for a path and return the correct values.

          The easy solution is to not call the methods at all and let the shorter-signature create(), etc. methods implicitly pick up the replication and block size. I understand that's not an option for most of pig's use cases.

          HBase encountered the same problem and fixed it with a little reflection magic in HBASE-6067.
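
          The reflection approach would look roughly like the sketch below (a hedged approximation of the HBASE-6067 idea, not the actual HBase code; the class name is hypothetical): call the Path-based overload when it exists, otherwise fall back to the deprecated no-arg method, so one build runs against both old and new Hadoop.

          import java.lang.reflect.Method;
          import org.apache.hadoop.fs.FileSystem;
          import org.apache.hadoop.fs.Path;

          public final class FsDefaultsShim {
              // Call fs.getDefaultBlockSize(path) if this Hadoop version has it,
              // else fall back to the deprecated no-arg variant (pre-0.23).
              public static long getDefaultBlockSize(FileSystem fs, Path path) {
                  try {
                      Method m = FileSystem.class.getMethod("getDefaultBlockSize", Path.class);
                      return (Long) m.invoke(fs, path);
                  } catch (NoSuchMethodException e) {
                      return fs.getDefaultBlockSize();
                  } catch (Exception e) {
                      throw new RuntimeException(e);
                  }
              }
          }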

          Rohini Palaniswamy added a comment -

          Daniel,
          This is the actual cause.

          Caused by: org.apache.hadoop.fs.viewfs.NotInMountpointException: getDefaultBlockSize on empty path is invalid
          at org.apache.hadoop.fs.viewfs.ViewFileSystem.getDefaultBlockSize(ViewFileSystem.java:477)
          at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:276)

          Daniel Dai added a comment -

          Can you see the AM log on the web UI?

          Araceli Henley added a comment -

          Hi Daniel
          I tried a couple of tests with the patch you provided. I'm getting a different error now.

          Assuming the client side mount table has the following:

          <property><name>fs.viewfs.impl</name>
          <value>org.apache.hadoop.fs.viewfs.ViewFileSystem</value>
          <description>The File System for viewfs:uris</description>
          </property>

          <property><name>fs.default.name</name>
          <value>viewfs:///</value>
          <final>true</final>
          </property>

          <property>
          <name>fs.viewfs.mounttable.default.link./data1</name>
          <value>hdfs://mycluster.yahoo.com:8020/user/me/pig/tests/data</value>
          </property>

          I confirm the file is visible (I also tried fs -cat and it was successful):
          -bash-3.1$ hadoop fs -ls /data1/singlefile/studenttab10k
          -rw-r--r-- 3 hadoopqa hdfs 219190 2012-07-10 23:02 /data1/singlefile/studenttab10k

          Next I try a simple load and a dump or store, as follows:

          a = load '/data1/singlefile/studenttab10k' as (name, age, gpa);
          dump a;

          This results in a stack trace:

          ERROR 1066: Unable to open iterator for alias a. Backend error : Trying to get information for an absent application application_1341957183614_0010

          org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias a. Backend error : Trying to get information for an absent application application_1341957183614_0010
          at org.apache.pig.PigServer.openIterator(PigServer.java:852)
          at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:682)
          at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:303)
          at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:189)
          at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
          at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
          at org.apache.pig.Main.run(Main.java:490)
          at org.apache.pig.Main.main(Main.java:111)
          Caused by: org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: Trying to get information for an absent application application_1341957183614_0010
          at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:156)
          at $Proxy9.getApplicationReport(Unknown Source)
          at org.apache.hadoop.yarn.api.impl.pb.client.ClientRMProtocolPBClientImpl.getApplicationReport(ClientRMProtocolPBClientImpl.java:116)
          at org.apache.hadoop.mapred.ResourceMgrDelegate.getApplicationReport(ResourceMgrDelegate.java:338)
          at org.apache.hadoop.mapred.ClientServiceDelegate.getProxy(ClientServiceDelegate.java:143)
          at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:298)
          at org.apache.hadoop.mapred.ClientServiceDelegate.getJobStatus(ClientServiceDelegate.java:383)
          at org.apache.hadoop.mapred.YARNRunner.getJobStatus(YARNRunner.java:481)
          at org.apache.hadoop.mapreduce.Cluster.getJob(Cluster.java:184)
          at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:627)
          at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:625)
          at java.security.AccessController.doPrivileged(Native Method)
          at javax.security.auth.Subject.doAs(Subject.java:396)
          at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1177)
          at org.apache.hadoop.mapred.JobClient.getJobUsingCluster(JobClient.java:625)
          at org.apache.hadoop.mapred.JobClient.getTaskReports(JobClient.java:679)
          at org.apache.hadoop.mapred.JobClient.getMapTaskReports(JobClient.java:673)
          at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:148)
          at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:383)
          at org.apache.pig.PigServer.launchPlan(PigServer.java:1275)
          at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1260)
          at org.apache.pig.PigServer.storeEx(PigServer.java:957)
          at org.apache.pig.PigServer.store(PigServer.java:924)
          at org.apache.pig.PigServer.openIterator(PigServer.java:837)
          ... 7 more
          ================================================================================
          Pig Stack Trace
          ---------------
          ERROR 2997: Encountered IOException. File or directory -l does not exist.

          java.io.IOException: File or directory -l does not exist.
          at org.apache.pig.tools.grunt.GruntParser.processLS(GruntParser.java:766)
          at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:366)
          at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:189)
          at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
          at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
          at org.apache.pig.Main.run(Main.java:490)
          at org.apache.pig.Main.main(Main.java:111)

          Additionally the following also fails:

          cd /data1/singlefile
          a = load 'studenttab10k' as (name, age, gpa);
          dump a;

          Daniel Dai added a comment -

          Seems we can drop "getDefaultReplication". It does not get used anywhere.


    People

    • Assignee: Rohini Palaniswamy
    • Reporter: patrick white
    • Votes: 0
    • Watchers: 7
