Pig / PIG-2101

Registering a Python function in a directory other than the current working directory fails

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.8.1
    • Fix Version/s: None
    • Component/s: impl
    • Labels:
      None

      Description

      In MapReduce mode, if the register statement references a file in a directory other than the current one, executing the Python UDF on the backend fails with:

      Deserialization error: could not instantiate 'org.apache.pig.scripting.jython.JythonFunction' with arguments '[../udfs/python/production.py, production]'

      I assume the backend is trying to locate the UDF using the path exactly as it was registered.
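
One way to picture that assumption: a relative path only has meaning against a working directory, and a backend task does not share the frontend's. The sketch below uses two hypothetical working directories (`/home/user/scripts` and `/tmp/task_0001/work` are made up for illustration, not taken from Pig) to show the same registered string resolving to two different files. It does not reproduce Pig's actual lookup.

```python
import posixpath


def resolve(cwd, registered_path):
    """Resolve a registered path against a working directory (POSIX rules)."""
    return posixpath.normpath(posixpath.join(cwd, registered_path))


registered = "../udfs/python/production.py"

# Hypothetical directories, for illustration only.
frontend_cwd = "/home/user/scripts"
backend_cwd = "/tmp/task_0001/work"

frontend_file = resolve(frontend_cwd, registered)
backend_file = resolve(backend_cwd, registered)

print(frontend_file)
print(backend_file)
```

If the backend really does re-resolve the string, the second path would point at a file that was never shipped there, which would be consistent with the instantiation failure.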

      The script is:

      register '../udfs/python/production.py' using jython as bballudfs;
      players  = load 'baseball' as (name:chararray, team:chararray,
                      pos:bag{t:(p:chararray)}, bat:map[]);
      nonnull  = filter players by bat#'slugging_percentage' is not null and
                      bat#'on_base_percentage' is not null;
      calcprod = foreach nonnull generate name, bballudfs.production(
                      (float)bat#'slugging_percentage',
                      (float)bat#'on_base_percentage');
      dump calcprod;
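
The UDF file itself is not attached to the issue. Below is a minimal sketch of what `production.py` might contain, assuming `production` simply adds the two percentages; the function body and the `production:float` schema are assumptions, not taken from the report. Pig normally makes the `outputSchema` decorator available when it loads the file through Jython, so the fallback definition is there only to let the file run under a plain Python interpreter as well.

```python
try:
    outputSchema  # expected to be provided by Pig when registered using jython
except NameError:
    def outputSchema(schema):
        """Stand-in decorator for running outside Pig; records the schema."""
        def wrap(func):
            func.outputSchema = schema
            return func
        return wrap


@outputSchema("production:float")
def production(slugging_pct, onbase_pct):
    # Assumed formula: slugging percentage plus on-base percentage.
    return slugging_pct + onbase_pct
```

Any UDF of this shape should be enough to reproduce the problem, since the failure happens while instantiating the function on the backend, before its body runs.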
      

          Activity

          Daniel Eklund added a comment -

          I can actually still get it to work with a relative path that does not involve '..'.
          For instance:
          Register 'test/simple.py' as myNamespace;

          where test is a subdirectory of the working directory. But any path with '..' fails.
          It would also be nice to add something to the documentation about NOT using absolute paths.

          Daniel Eklund added a comment -

          As per:
          http://mail-archives.apache.org/mod_mbox/pig-user/201106.mbox/browser

          While implicit in the title of this bug, deserialization on the back end fails for any Python file registered on the front end from anything other than a child-relative directory.

          The functionality should support all directory references: absolute, parent-relative (e.g. '../..'), and child-relative.
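
A regression test for this would need at least one case per kind of reference. A rough sketch of the three categories follows; `classify` is a hypothetical helper, not part of Pig, and `/opt/udfs/production.py` is a made-up absolute path.

```python
import posixpath


def classify(path):
    """Label a register path as absolute, parent-relative, or child-relative."""
    if posixpath.isabs(path):
        return "absolute"
    # Normalize first so that 'a/../../b.py' is seen as leaving the directory.
    parts = posixpath.normpath(path).split("/")
    if parts[0] == "..":
        return "parent-relative"
    return "child-relative"


cases = [
    "/opt/udfs/production.py",       # hypothetical absolute path
    "../udfs/python/production.py",  # path from the issue description
    "test/simple.py",                # path from the first comment
]

for path in cases:
    print(path, "->", classify(path))
```

Per the comments above, only the child-relative case currently works.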


            People

            • Assignee:
              Unassigned
              Reporter:
              Alan Gates
            • Votes:
              0
              Watchers:
              3

              Dates

              • Created:
                Updated:

                Development