Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.12.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      Another category of failure in the e2e tests, seen in ComputeSpec_1, ComputeSpec_2, ComputeSpec_3, RaceConditions_1, RaceConditions_3, RaceConditions_4, RaceConditions_7, and RaceConditions_8.

      Here is the stack trace:
      ERROR 6003: Invalid cache specification. File doesn't exist: C:/Program Files (x86)/GnuWin32/bin/head.exe

      org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobCreationException: ERROR 2017: Internal error creating job configuration.
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:723)
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.compile(JobControlCompiler.java:258)
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:151)
      at org.apache.pig.PigServer.launchPlan(PigServer.java:1318)
      at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1303)
      at org.apache.pig.PigServer.execute(PigServer.java:1293)
      at org.apache.pig.PigServer.executeBatch(PigServer.java:364)
      at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:133)
      at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194)
      at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:166)
      at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
      at org.apache.pig.Main.run(Main.java:561)
      at org.apache.pig.Main.main(Main.java:111)
      Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 6003: Invalid cache specification. File doesn't exist: C:/Program Files (x86)/GnuWin32/bin/head.exe
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.setupDistributedCache(JobControlCompiler.java:1151)
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.setupDistributedCache(JobControlCompiler.java:1129)
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:447)
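
      For context on the discussion below: the ERROR 6003 path above contains spaces and a drive-letter colon, which `new URI(src.toString())` cannot parse. A minimal standalone reproduction (hypothetical snippet, not Pig code) of the failure mode:

      ```java
      import java.io.File;
      import java.net.URI;
      import java.net.URISyntaxException;

      public class WindowsPathUri {
          public static void main(String[] args) {
              // Windows-style path with spaces, as in the ERROR 6003 message above.
              String src = "C:/Program Files (x86)/GnuWin32/bin/head.exe";
              try {
                  new URI(src);   // spaces are illegal in a raw URI string
              } catch (URISyntaxException e) {
                  System.out.println("new URI(src) failed: " + e.getMessage());
              }
              // File.toURI() percent-encodes the spaces instead of throwing:
              System.out.println(new File(src).toURI());
          }
      }
      ```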

      Attachments

        1. PIG-2956-2.patch
          3 kB
          Daniel Dai
        2. PIG-2956-1.patch
          1 kB
          Daniel Dai
        3. PIG-2956-1_0.10.patch
          0.9 kB
          Daniel Dai

        Activity

          dvryaboy Dmitriy V. Ryaboy added a comment -

          Why not replace "new URL(src.toString())" with src.toURI() in the first place?
          daijy Daniel Dai added a comment -

          I did that before and hit an exception on Linux (I forget the exception). So src.toURI() is only for Windows; for Linux, we have to use new URL(src.toString()). I will post the Linux exception later.

          daijy Daniel Dai added a comment -

          Here is the failure using "src.toURI()" directly with an order by statement:
          Message: java.io.FileNotFoundException: File does not exist: /tmp/temp-1510081022/tmp-1308657145#pigsample_1889145873_1351808882314
          at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:517)
          at org.apache.hadoop.filecache.DistributedCache.getFileStatus(DistributedCache.java:185)
          at org.apache.hadoop.filecache.TrackerDistributedCacheManager.determineTimestamps(TrackerDistributedCacheManager.java:721)
          at org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:763)
          at org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:655)
          at org.apache.hadoop.mapred.JobClient.access$300(JobClient.java:174)
          at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:865)
          at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
          at java.security.AccessController.doPrivileged(Native Method)
          at javax.security.auth.Subject.doAs(Subject.java:396)
          at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
          at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
          at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:824)
          at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)
          at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
          at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
          at java.lang.reflect.Method.invoke(Method.java:597)
          at org.apache.pig.backend.hadoop20.PigJobControl.mainLoopAction(PigJobControl.java:157)
          at org.apache.pig.backend.hadoop20.PigJobControl.run(PigJobControl.java:134)
          at java.lang.Thread.run(Thread.java:680)
          at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:257)

          src.toURI() does encode the "#" character, and Hadoop on Linux has trouble finding the distributed cache item with "#" encoded. However, Windows is happy to take the encoded "#" character; I am not sure why. And the original "new URI(src.toString())" fails if src contains a ":" character. So the logic becomes:
          1. On Linux, "new URI(src.toString())" always succeeds, src is never encoded, and DistributedCache is happy
          2. On Windows, src contains ":", so "new URI(src.toString())" fails, src gets encoded using "src.toURI()", and DistributedCache doesn't mind
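
          The "#" behavior described above can be seen with java.net.URI alone (illustrative snippet with a shortened, hypothetical path): parsing a raw URI string treats "#" as the fragment delimiter, which is what DistributedCache relies on, while the encoding constructors percent-encode "#" inside the path so getFragment() finds nothing.

          ```java
          import java.net.URI;
          import java.net.URISyntaxException;

          public class FragmentEncoding {
              public static void main(String[] args) throws URISyntaxException {
                  // A literal '#' is parsed as the fragment delimiter — this is how
                  // Hadoop's DistributedCache finds the symlink name.
                  URI literal = new URI("file:///tmp/temp-123#pigsample_1");
                  System.out.println(literal.getFragment());   // pigsample_1

                  // The multi-argument constructor (like toURI()) percent-encodes
                  // '#' inside the path as %23, so no fragment is visible.
                  URI encoded = new URI("file", null, "/tmp/temp-123#pigsample_1", null);
                  System.out.println(encoded.getRawPath());    // /tmp/temp-123%23pigsample_1
                  System.out.println(encoded.getFragment());   // null
              }
          }
          ```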

          daijy Daniel Dai added a comment -

          PIG-2956-1_0.10.patch is the same patch for 0.10 branch.

          julienledem Julien Le Dem added a comment -

          I think you should still catch exceptions that come out of toURI() and handle them the same way the URI exception was handled before. A bad URI should still be handled.

          julienledem Julien Le Dem added a comment -

          Daniel? any update on this?

          gates Alan Gates added a comment -

          Canceling patch pending addressing Julien's comments.

          daijy Daniel Dai added a comment -

          Diving deeper into the issue, here is what I found:
          1. The problem with "new URI(src.toString())": it is not compatible with Windows-style paths; a colon or space results in an exception

          2. The problem with "src.toUri()": it encodes the "#" character. Hadoop uses URI.getFragment() to get the symlink from a URI, and getFragment only searches for the literal "#" character, not the encoded one

          I attached a new patch, which takes out the symlink part, uses "src.toUri()" to encode the rest, then appends the symlink. Tested on both Windows and Linux; both work.
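
          The approach described in that comment can be sketched as follows (a hypothetical helper, not the actual PIG-2956 patch): split off the "#symlink" suffix before encoding, encode only the file path, then re-append the fragment verbatim so getFragment() still sees it.

          ```java
          import java.io.File;
          import java.net.URI;

          public class SymlinkSafeCacheUri {
              // Split off any "#symlink" suffix, encode only the path part,
              // then re-append the fragment unencoded.
              static String toCacheSpec(String src) {
                  String path = src, symlink = null;
                  int hash = src.lastIndexOf('#');
                  if (hash != -1) {
                      path = src.substring(0, hash);
                      symlink = src.substring(hash + 1);
                  }
                  URI encoded = new File(path).toURI();  // handles spaces, drive colons
                  return symlink == null ? encoded.toString() : encoded + "#" + symlink;
              }

              public static void main(String[] args) {
                  // On Linux this prints something like file:/tmp/tmp-123#pigsample_1
                  System.out.println(toCacheSpec("/tmp/tmp-123#pigsample_1"));
              }
          }
          ```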

          gates Alan Gates added a comment -

          +1

          daijy Daniel Dai added a comment -

          Patch committed to trunk.


          People

            Assignee: daijy Daniel Dai
            Reporter: daijy Daniel Dai
            Votes: 0
            Watchers: 4
