Pig / PIG-2761

With hadoop23 importing modules inside python script does not work

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.10.1
    • Fix Version/s: 0.9.3, 0.11, 0.10.1
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Because unjar has been removed from Hadoop 23, registering scripts has issues. PIG-2745 addresses the issue of registering scripts with Pig, but if the registered Python script imports other modules, the import does not work. Steps to reproduce the issue are in https://issues.apache.org/jira/browse/PIG-2745?focusedCommentId=13396965&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13396965

      1. PIG-2761.patch
        6 kB
        Rohini Palaniswamy
      2. PIG-2761-branch09.patch
        6 kB
        Rohini Palaniswamy
      3. PIG-2761-branch10_1.patch
        7 kB
        Rohini Palaniswamy
      4. PIG-2761-initial.patch
        3 kB
        Rohini Palaniswamy
      5. PIG-2761-trunk.patch
        6 kB
        Rohini Palaniswamy

          Activity

          Rohini Palaniswamy added a comment -

          Initial patch for review. Reverted PIG-2745, since removing the leading / now happens in PigContext itself. It is easier to do that in PigContext.addScriptFile() than to repeat it in each of the ScriptEngine implementations and in PigServer. Changed ScriptEngine.getScriptAsStream() to try all classloaders.
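
          The two ideas in this comment can be sketched in a language-neutral way. The following Python is only an illustration, not the Java code in the patch: to_jar_entry_name, get_script_as_stream, make_loader and the sample path are hypothetical stand-ins, and plain callables play the role of classloaders.

```python
import io


def to_jar_entry_name(script_path):
    """Strip leading slashes so an absolute path becomes a relative entry name."""
    return script_path.lstrip("/")


def get_script_as_stream(name, loaders):
    """Return the stream from the first loader that can resolve `name`."""
    for loader in loaders:
        stream = loader(name)
        if stream is not None:
            return stream
    raise FileNotFoundError("no loader could resolve " + name)


def make_loader(resources):
    """Build a toy loader backed by a dict; stands in for a classloader."""
    def loader(name):
        data = resources.get(name)
        return io.BytesIO(data) if data is not None else None
    return loader


# The first loader knows nothing; the second one holds the script.
first_loader = make_loader({})
second_loader = make_loader({"home/user/udfs/helper.py": b"def f(): pass\n"})

entry = to_jar_entry_name("/home/user/udfs/helper.py")
stream = get_script_as_stream(entry, [first_loader, second_loader])
content = stream.read()
```

          Normalizing the name in one place means every later lookup can use the same relative name, which appears to be the motivation given above for doing it in PigContext rather than in each engine.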

          Working on writing an e2e test for this.

          This patch does not address PIG-2760. I am trying to see whether there is an easy way to accommodate that in this patch without impacting the S3 changes from PIG-2623. The easier option would be to add two copies of the script file to the jar, one with the absolute path and one with the relative path, but that is not efficient.
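
          To make the cost of the "two copies" option concrete, here is a rough Python sketch that treats a jar as an ordinary zip archive. The function name build_jar_with_both_paths, the sample path and the script body are assumptions for illustration; this is not how Pig builds its job jar.

```python
import io
import zipfile


def build_jar_with_both_paths(script_path, source):
    """Store the same script twice: under its absolute and its relative name."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as jar:
        jar.writestr(script_path, source)              # name as given (absolute)
        jar.writestr(script_path.lstrip("/"), source)  # name without leading slash
    return buffer.getvalue()


source = b"def square(x):\n    return x * x\n"
jar_bytes = build_jar_with_both_paths("/home/user/udfs/mathutil.py", source)

# Inspect the archive: the same payload is present under two entry names.
with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
    names = jar.namelist()
    stored_bytes = sum(info.file_size for info in jar.infolist())
```

          Every registered script would be carried twice in the archive, which is the inefficiency the comment refers to.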

          Rohini Palaniswamy added a comment -

          Modified scriptingudf.py to import another module so that it is exercised as part of current e2e scripting tests.

          Also included the fix from PIG-2760.

          Rohini Palaniswamy added a comment -

          Patch contains a newly added file. svn add needs to be done before committing.

          svn add test/e2e/pig/udfs/python/stringutil.py
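
          For readers unfamiliar with the setup, the relationship between scriptingudf.py and the new stringutil.py might look roughly like the sketch below. The contents of both files are assumptions (the real files are not shown in this issue), the function names reverse and reverse_upper are hypothetical, and Pig-specific decorators are left out so that the example runs in plain Python.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical helper module; the real stringutil.py is not shown in this issue.
HELPER_SOURCE = '''
def reverse(text):
    return text[::-1]
'''

# Hypothetical UDF script that imports the helper module.
UDF_SOURCE = '''
import stringutil

def reverse_upper(word):
    return stringutil.reverse(word).upper()
'''


def load_udf_module(directory):
    """Write both files into `directory` and import the UDF script from there."""
    files = (("stringutil.py", HELPER_SOURCE), ("scriptingudf.py", UDF_SOURCE))
    for name, source in files:
        with open(os.path.join(directory, name), "w") as handle:
            handle.write(source)
    # The import inside the UDF script only succeeds because the helper
    # is discoverable on the module search path.
    sys.path.insert(0, directory)
    importlib.invalidate_caches()
    return importlib.import_module("scriptingudf")


workdir = tempfile.mkdtemp()
udf = load_udf_module(workdir)
result = udf.reverse_upper("pig")
```

          If the helper cannot be found at import time, loading the UDF script fails before any UDF runs, which is the failure mode this issue describes for Hadoop 23.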

          Mathias Herberts added a comment -

          Fixes the issue of PIG-2760.

          Rohini Palaniswamy added a comment -

          Attaching separate patches for branch 0.10.1 and trunk. scriptingudf.py has conflicts with PIG-2761.patch in trunk.

          Daniel Dai added a comment -

          +1

          Patch committed to 0.10 branch/trunk.

          Thanks Rohini!

          Rohini Palaniswamy added a comment -

          Patch for Pig 0.9 to work with Python scripts in Hadoop 23. I had to port it to Pig 0.9 because many of our users have not migrated to 0.10 yet, so we need 0.9 to work with Hadoop 23.

          Daniel Dai added a comment -

          Committed to the 0.9 branch as well, as requested by Rohini.


            People

            • Assignee:
              Rohini Palaniswamy
              Reporter:
              Rohini Palaniswamy
            • Votes:
              1
              Watchers:
              4
