PIG-2634

JarManager fails for rsrc: jars

    Details

      Description

      The following error occurs when using the PigServer class with registerScript(InputStream). The Eclipse Maven project adds "pig-0.9.2.jar" as a dependency. When run, it fails to merge pig-0.9.2.jar. The whole trace is:

      ###
      Caused by: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobCreationException: ERROR 2017: Internal error creating job configuration.
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:727)
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.compile(JobControlCompiler.java:258)
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:150)
      at org.apache.pig.PigServer.launchPlan(PigServer.java:1314)
      at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1299)
      at org.apache.pig.PigServer.execute(PigServer.java:1289)
      at org.apache.pig.PigServer.access$400(PigServer.java:125)
      at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1591)
      ... 44 more
      Caused by: java.io.FileNotFoundException: rsrc:pig-0.9.2-core.jar (No such file or directory)
      at java.io.FileInputStream.open(Native Method)
      at java.io.FileInputStream.<init>(FileInputStream.java:137)
      at java.io.FileInputStream.<init>(FileInputStream.java:96)
      at org.apache.pig.impl.util.JarManager.mergeJar(JarManager.java:176)
      at org.apache.pig.impl.util.JarManager.createJar(JarManager.java:118)
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:412)
      ###

      Pig should not fail for jars on the "rsrc:" path.

      1. mergeJarFix.diff
        1 kB
        Catalin Alexandru Zamfir
      2. pig_1333633836139.log
        5 kB
        Catalin Alexandru Zamfir

        Activity

        Catalin Alexandru Zamfir created issue -
        Catalin Alexandru Zamfir added a comment -

        Attaching log files from trying to use PigRunner instead of PigServer. It seems to be the same problem with importing pig-0.9.2-core.jar from the classpath.

        Catalin Alexandru Zamfir made changes -
        Field Original Value New Value
        Attachment pig_1333633836139.log [ 12521500 ]
        Catalin Alexandru Zamfir made changes -
        Labels engine mapreducel merge parser pigserver
        Catalin Alexandru Zamfir made changes -
        Priority Major [ 3 ] Blocker [ 1 ]
        Daniel Dai added a comment -

        Hi, Catalin,
        Do you have "register pig-0.9.2.jar" in your script? Why do you register it? Is pig-0.9.2.jar in your current directory?

        Catalin Alexandru Zamfir added a comment -

        Hi Daniel,
        The quick way to test this is:

        • build an Eclipse (Maven) project;
        • add the necessary libraries (for example pig-0.9.2-core.jar) to the build path;
        • export the project as a packaged jar and run it with a given script. The script runs properly when run via "pig -x mapreduce order.pig", for example;
        • but it fails when the Java program, using PigServer/PigRunner, tries to run the same script;
        • it fails because it cannot find "rsrc: pig-0.9.2-core.jar" in the createJar/mergeJar methods.

        As far as I can tell, it does not include jars that are currently on the build path. For example, if I add pig-0.9.2-core.jar as an Eclipse dependency for the project, it does not find it, when it should.

        I tried adding a "register /path/to/pig.jar" to the script, but it still fails, reporting the same Eclipse dependency name.
        If I don't add pig-0.9.2-core.jar as a dependency in the Eclipse project, I cannot use PigServer/PigRunner.

        So the mergeJar/createJar methods need to be fixed, maybe using Apache VFS2 to get jars on the build path, or in a path inside the running jar, and add them to the jar that these methods build (which seems to be the JobXXXXXXXXXX.jar that gets submitted to the jobtracker).

        Help would be appreciated, if not by a fix, then by help or direction, as I'm able to fix it on my own and provide a patch back to ASF/Pig.

        Catalin Alexandru Zamfir added a comment -

        Fixed it on a checkout of the trunk by checking in mergeJar if the "jar" contains "rsrc" and if so, using the second "mergeJar" method. Can anyone instruct me how I can provide a patch for it? Do I upload it here?
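
        For illustration, a stream-based merge of the kind mentioned above might look roughly like the sketch below. It is not Pig's JarManager code: StreamJarMerger, its mergeJar signature, and the "seen" set used to skip duplicate entries are hypothetical names chosen for this example. Only the standard java.util.jar classes are used.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Set;
import java.util.jar.JarEntry;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;

// Hypothetical helper, not part of Pig: merges a jar supplied as a stream
// (rather than as a file path) into a jar that is being built.
final class StreamJarMerger {

    /**
     * Copies every entry of the jar read from 'in' into 'out', skipping
     * entry names already recorded in 'seen'. Returns the number of entries
     * copied. The manifest of the source jar is not copied, because
     * JarInputStream consumes it when the stream is opened.
     */
    static int mergeJar(JarOutputStream out, InputStream in, Set<String> seen)
            throws IOException {
        int copied = 0;
        byte[] buffer = new byte[8192];
        try (JarInputStream jarIn = new JarInputStream(in)) {
            JarEntry entry;
            while ((entry = jarIn.getNextJarEntry()) != null) {
                if (!seen.add(entry.getName())) {
                    continue; // an entry with this name was already written
                }
                out.putNextEntry(new JarEntry(entry.getName()));
                int n;
                while ((n = jarIn.read(buffer)) != -1) {
                    out.write(buffer, 0, n);
                }
                out.closeEntry();
                copied++;
            }
        }
        return copied;
    }
}
```

        Because the source is an InputStream, the same routine works whether the jar comes from a FileInputStream or from a classpath resource, which is the property a "rsrc:" path would need.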

        Gianmarco De Francisci Morales added a comment -

        Hi Catalin,
        looks like this issue got lost in the queue.
        Yes, if you have a fix you can upload your patch here and it will be reviewed by a committer.

        Catalin Alexandru Zamfir added a comment -

        Attached a fix in JarManager.java that takes into account that some paths are of the form "rsrc:/path/to/library.jar". This usually happens when exporting a runnable jar from Eclipse via the "package referenced libraries" option.

        Catalin Alexandru Zamfir made changes -
        Attachment mergeJarFix.diff [ 12526112 ]
        Catalin Alexandru Zamfir added a comment -

        Hi Gianmarco,
        I've attached the diff. If a developer chooses to ship "pig-xx.jar" in a monolithic jar alongside their code and uses Eclipse to export a runnable jar with these jars packaged, dependencies on the "rsrc:" path are not found. I hope this gets fixed by 0.11.

        Catalin Alexandru Zamfir made changes -
        Affects Version/s 0.10.0 [ 12316246 ]
        Catalin Alexandru Zamfir added a comment -

        Confirming it affects 0.10 as well. Can anyone merge the diff into trunk? What happens here is simple: if someone adds "pig.jar" to an Eclipse project and, when exporting a "Runnable jar", chooses "package jars" instead of "extract jars", then any Pig script fails with a FileNotFoundException for "rsrc: pig-xxx.jar". The attached diff fixes the bug by checking whether the path starts with "rsrc:". If so, it opens a stream to the resource.
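
        The check described here can be sketched as follows. The attached diff itself is not reproduced in this thread, so this is only a minimal illustration under assumptions: RsrcJarOpener and its method names are hypothetical, and the sketch assumes the packaged jar is visible to the supplied class loader under the name that follows the "rsrc:" prefix.

```java
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not the attached patch: opens a jar either from the
// file system or, for "rsrc:" paths, as a class loader resource.
final class RsrcJarOpener {
    static final String RSRC_PREFIX = "rsrc:";

    /** True when the path uses the "rsrc:" scheme seen in the stack trace. */
    static boolean isRsrcPath(String jarPath) {
        return jarPath.startsWith(RSRC_PREFIX);
    }

    /** Strips the "rsrc:" prefix and any leading slashes. */
    static String resourceName(String jarPath) {
        String name = jarPath.substring(RSRC_PREFIX.length());
        while (name.startsWith("/")) {
            name = name.substring(1); // class loader resource names are not rooted
        }
        return name;
    }

    /** Opens a stream on the jar, choosing the source by path scheme. */
    static InputStream openJar(String jarPath, ClassLoader loader) throws IOException {
        if (isRsrcPath(jarPath)) {
            InputStream in = loader.getResourceAsStream(resourceName(jarPath));
            if (in == null) {
                throw new FileNotFoundException(jarPath + " (not found as a resource)");
            }
            return in;
        }
        // Plain paths keep the existing behavior: read from the file system.
        return new FileInputStream(jarPath);
    }
}
```

        With a helper like this, a call such as openJar("rsrc:pig-0.9.2-core.jar", loader) would no longer reach FileInputStream, which is where the FileNotFoundException in the trace is raised.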

        Thanks!

        Daniel Dai added a comment -

        Hi, Catalin, we use the eclipse-files target to generate Eclipse projects. I am not sure why eclipse-maven tries to merge "rsrc: pig-0.9.2-core.jar", but it sounds more like an eclipse-maven issue than a Pig issue, and I am not sure we want to put this code in Pig. Can you use "ant eclipse-files" to generate the Eclipse project instead?

        Catalin Alexandru Zamfir added a comment -

        I don't think eclipse-maven is producing the "rsrc: pig-0.9.2-core.jar" path. As you can see from the exception above, the failure happens in the "mapReduceLayer", which tries to build a .jar containing all packages registered at the time of execution. When someone exports a runnable jar via Eclipse with the "package jars" option, all dependencies (including pig-0.9.2.jar) end up inside the generated jar, so the path inside the app is "rsrc: pig-0.9.2-core.jar". But JarManager only knows how to read jars from a file path, not from the "rsrc" path.

        The diff fixes that, and it also works in our development environment. If it can be included, okay. If not, the JIRA ticket should be left for posterity, as others may have the same problem. It took us a few days of digging through the code to understand what was going on.

        Thanks! We will continue to check out the trunk and patch it every time we need a new Pig version, building our own flavor of Pig with this issue fixed. But it would have been wonderful to have it in the vanilla Pig distribution. It saves a ton of trouble.

        Daniel Dai added a comment -

        I guess here is what happens: Pig tries to locate pig.jar by finding the jar that encloses the Pig classes on the classpath, and merges pig.jar into job.jar. In this case Eclipse finds "rsrc: pig-0.9.2-core.jar" instead of pig-core.jar. Adjusting the order of jars in the Eclipse classpath may let Pig find the right jar.
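
        The lookup described above can be sketched roughly as follows. This is an illustrative sketch, not Pig's actual code: EnclosingJarLocator and its methods are hypothetical names, and the "jar:rsrc:...!/" URL form is an assumption about how a jar-in-jar class loader reports libraries packaged inside a runnable jar.

```java
import java.net.URL;

// Hypothetical helper, not part of Pig: finds where a class was loaded from
// and derives a jar path from the resulting URL.
final class EnclosingJarLocator {

    /** Returns the URL of the .class resource for the given class, or null. */
    static URL classResource(Class<?> clazz) {
        String resource = clazz.getName().replace('.', '/') + ".class";
        ClassLoader loader = clazz.getClassLoader();
        // A null loader means the bootstrap loader; fall back to the system lookup.
        return loader != null ? loader.getResource(resource)
                              : ClassLoader.getSystemResource(resource);
    }

    /**
     * Derives the enclosing jar path from a "jar:" URL string.
     * "jar:file:/opt/pig/pig-core.jar!/org/..." gives "/opt/pig/pig-core.jar";
     * "jar:rsrc:pig-0.9.2-core.jar!/org/..." gives "rsrc:pig-0.9.2-core.jar".
     * Returns null when the class was not loaded from a jar.
     */
    static String jarPathFromUrl(String url) {
        if (!url.startsWith("jar:")) {
            return null;
        }
        String path = url.substring("jar:".length());
        int bang = path.indexOf('!');
        if (bang >= 0) {
            path = path.substring(0, bang); // drop the entry name inside the jar
        }
        if (path.startsWith("file:")) {
            path = path.substring("file:".length());
        }
        return path;
    }
}
```

        Under that assumption, only the "file:" prefix is stripped, so a library packaged inside the runnable jar yields a path that still begins with "rsrc:" and cannot be opened as an ordinary file.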


          People

          • Assignee:
            Unassigned
            Reporter:
            Catalin Alexandru Zamfir
          • Votes:
            0
            Watchers:
            3