Pig / PIG-3593

Import jython standard module fail on cluster

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.12.1
    • Component/s: impl
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      The following script fails on the cluster:

      # 126.py: Jython UDF
      import urllib

      @outputSchema("url:chararray")
      def urlDecode(s):
          # Decode a URL-encoded string ('+' becomes a space, %XX becomes the character)
          return urllib.unquote_plus(s)

      -- Pig script that registers and calls the UDF
      register '126.py' using jython as myfuncs;

      a = load 'studenttab10k' using PigStorage() as (name:chararray, age:int, gpa:double);
      b = foreach a generate myfuncs.urlDecode(name);
      dump b;
      
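As a sanity check that the UDF logic itself is sound and the failure is environmental, the decoding step can be exercised outside Pig. This is a minimal sketch, not part of the original report: the function name `url_decode`, the sample input, and the fallback import for Python 3 are assumptions added for illustration.

```python
# Hypothetical local check; not taken from the original report.
try:
    from urllib import unquote_plus          # Python 2 / Jython 2.x
except ImportError:
    from urllib.parse import unquote_plus    # Python 3 fallback (assumption)


def url_decode(s):
    """Same logic as urlDecode in 126.py, without the Pig @outputSchema decorator."""
    return unquote_plus(s)


# '+' decodes to a space and %40 decodes to '@'
print(url_decode("alice+smith%40example.com"))  # prints: alice smith@example.com
```

If this runs cleanly with a standalone interpreter, the problem is more likely in how the Jython libraries are packaged and loaded on the cluster than in the UDF code.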

      Error stack:

      java.io.IOException: Deserialization error: could not instantiate 'org.apache.pig.scripting.jython.JythonFunction' with arguments '[127.py, resplit]'
      at org.apache.pig.impl.util.ObjectSerializer.deserialize(ObjectSerializer.java:59)
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.setup(PigGenericMapBase.java:180)
      at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
      at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
      at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
      at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
      at java.security.AccessController.doPrivileged(Native Method)
      at javax.security.auth.Subject.doAs(Subject.java:394)
      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
      at org.apache.hadoop.mapred.Child.main(Child.java:249)
      Caused by: java.lang.RuntimeException: could not instantiate 'org.apache.pig.scripting.jython.JythonFunction' with arguments '[127.py, resplit]'
      at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:727)
      at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.instantiateFunc(POUserFunc.java:126)
      at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.readObject(POUserFunc.java:567)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:597)
      at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:979)
      at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1873)
      at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1777)
      at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
      at java.io.ObjectInputStream.readObject(ObjectInputStream.java:349)
      at java.util.ArrayList.readObject(ArrayList.java:593)
      at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:597)
      at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:979)
      at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1873)
      at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1777)
      at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
      at java.io.ObjectInputStream.readObject(ObjectInputStream.java:349)
      at java.util.HashMap.readObject(HashMap.java:1030)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:597)
      at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:979)
      at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1873)
      at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1777)
      at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
      at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1970)
      at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1895)
      at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1777)
      at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
      at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1970)
      at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1895)
      at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1777)
      at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
      at java.io.ObjectInputStream.readObject(ObjectInputStream.java:349)
      at java.util.ArrayList.readObject(ArrayList.java:593)
      at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:597)
      at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:979)
      at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1873)
      at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1777)
      at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
      at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1970)
      at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1895)
      at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1777)
      at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
      at java.io.ObjectInputStream.readObject(ObjectInputStream.java:349)
      at java.util.HashMap.readObject(HashMap.java:1030)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:597)
      at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:979)
      at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1873)
      at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1777)
      at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
      at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1970)
      at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1895)
      at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1777)
      at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
      at java.io.ObjectInputStream.readObject(ObjectInputStream.java:349)
      at org.apache.pig.impl.util.ObjectSerializer.deserialize(ObjectSerializer.java:57)
      ... 9 more
      Caused by: java.lang.reflect.InvocationTargetException
      at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
      at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
      at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
      at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
      at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:695)
      ... 75 more
      Caused by: java.lang.IllegalStateException: Could not initialize: 127.py
      at org.apache.pig.scripting.jython.JythonFunction.<init>(JythonFunction.java:92)
      ... 80 more
      Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 1121: Python Error. Traceback (most recent call last):
      File "127.py", line 2, in <module>
      import re
      File "__pyclasspath__/re$py.class", line 279, in <module>
      java.lang.ArrayIndexOutOfBoundsException: 10
      at org.python.objectweb.asm.ClassReader.a(Unknown Source)
      at org.python.objectweb.asm.ClassReader.accept(Unknown Source)
      at org.python.objectweb.asm.ClassReader.accept(Unknown Source)
      at org.python.core.AnnotationReader.<init>(AnnotationReader.java:44)
      at org.python.core.imp.readCode(imp.java:219)
      at org.python.core.util.importer.getModuleCode(importer.java:202)
      at org.python.core.util.importer.importer_load_module(importer.java:95)
      at org.python.core.ClasspathPyImporter.ClasspathPyImporter_load_module(ClasspathPyImporter.java:63)
      at org.python.core.ClasspathPyImporter$ClasspathPyImporter_load_module_exposer.__call__(Unknown Source)
      at org.python.core.PyBuiltinMethodNarrow.__call__(PyBuiltinMethodNarrow.java:47)
      at org.python.core.imp.loadFromLoader(imp.java:518)
      at org.python.core.imp.find_module(imp.java:472)
      at org.python.core.imp.import_next(imp.java:718)
      at org.python.core.imp.import_module_level(imp.java:827)
      at org.python.core.imp.importName(imp.java:917)
      at org.python.core.ImportFunction.__call__(__builtin__.java:1220)
      at org.python.core.PyObject.__call__(PyObject.java:357)
      at org.python.core.__builtin__.__import__(__builtin__.java:1173)
      at org.python.core.imp.importOne(imp.java:936)
      at re$py.f$0(/home/frank/hg/jython/jython/dist/Lib/re.py:289)
      at re$py.call_function(/home/frank/hg/jython/jython/dist/Lib/re.py)
      at org.python.core.PyTableCode.call(PyTableCode.java:165)
      at org.python.core.PyCode.call(PyCode.java:18)
      at org.python.core.imp.createFromCode(imp.java:391)
      at org.python.core.util.importer.importer_load_module(importer.java:109)
      at org.python.core.ClasspathPyImporter.ClasspathPyImporter_load_module(ClasspathPyImporter.java:63)
      at org.python.core.ClasspathPyImporter$ClasspathPyImporter_load_module_exposer.__call__(Unknown Source)
      at org.python.core.PyBuiltinMethodNarrow.__call__(PyBuiltinMethodNarrow.java:47)
      at org.python.core.imp.loadFromLoader(imp.java:518)
      at org.python.core.imp.find_module(imp.java:472)
      at org.python.core.imp.import_next(imp.java:718)
      at org.python.core.imp.import_module_level(imp.java:827)
      at org.python.core.imp.importName(imp.java:917)
      at org.python.core.ImportFunction.__call__(__builtin__.java:1220)
      at org.python.core.PyObject.__call__(PyObject.java:357)
      at org.python.core.__builtin__.__import__(__builtin__.java:1173)
      at org.python.core.imp.importOne(imp.java:936)
      at org.python.pycode._pyx3.f$0(127.py:3)
      at org.python.pycode._pyx3.call_function(127.py)
      at org.python.core.PyTableCode.call(PyTableCode.java:165)
      at org.python.core.PyCode.call(PyCode.java:18)
      at org.python.core.Py.runCode(Py.java:1275)
      at org.python.util.PythonInterpreter.execfile(PythonInterpreter.java:235)
      at org.apache.pig.scripting.jython.JythonScriptEngine$Interpreter.execfile(JythonScriptEngine.java:217)
      at org.apache.pig.scripting.jython.JythonScriptEngine$Interpreter.init(JythonScriptEngine.java:163)
      at org.apache.pig.scripting.jython.JythonScriptEngine.getFunction(JythonScriptEngine.java:388)
      at org.apache.pig.scripting.jython.JythonFunction.<init>(JythonFunction.java:55)
      at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
      at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
      at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
      at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
      at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:695)
      at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.instantiateFunc(POUserFunc.java:126)
      at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.readObject(POUserFunc.java:567)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:597)
      at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:979)
      at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1873)
      at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1777)
      at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
      at java.io.ObjectInputStream.readObject(ObjectInputStream.java:349)
      at java.util.ArrayList.readObject(ArrayList.java:593)
      at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:597)
      at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:979)
      at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1873)
      at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1777)
      at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
      at java.io.ObjectInputStream.readObject(ObjectInputStream.java:349)
      at java.util.HashMap.readObject(HashMap.java:1030)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:597)
      at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:979)
      at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1873)
      at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1777)
      at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
      at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1970)
      at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1895)
      at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1777)
      at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
      at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1970)
      at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1895)
      at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1777)
      at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
      at java.io.ObjectInputStream.readObject(ObjectInputStream.java:349)
      at java.util.ArrayList.readObject(ArrayList.java:593)
      at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:597)
      at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:979)
      at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1873)
      at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1777)
      at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
      at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1970)
      at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1895)
      at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1777)
      at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
      at java.io.ObjectInputStream.readObject(ObjectInputStream.java:349)
      at java.util.HashMap.readObject(HashMap.java:1030)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:597)
      at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:979)
      at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1873)
      at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1777)
      at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
      at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1970)
      at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1895)
      at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1777)
      at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
      at java.io.ObjectInputStream.readObject(ObjectInputStream.java:349)
      at org.apache.pig.impl.util.ObjectSerializer.deserialize(ObjectSerializer.java:57)
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.setup(PigGenericMapBase.java:180)
      at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
      at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
      at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
      at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
      at java.security.AccessController.doPrivileged(Native Method)
      at javax.security.auth.Subject.doAs(Subject.java:394)
      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
      at org.apache.hadoop.mapred.Child.main(Child.java:249)

      It seems that ObjectWeb ASM does not like the repackaged Jython libraries.
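The traceback shows `re` being loaded from the classpath (`re$py.class`), so when debugging a case like this it can help to confirm which copy of a standard module the interpreter actually picked up. The following is a minimal diagnostic sketch, not part of the original report; the helper name `module_origin` is hypothetical, and it relies only on the standard `__file__` module attribute.

```python
# Hypothetical diagnostic; not taken from the original report.
def module_origin(name):
    """Return the location a top-level module was loaded from, or None if it has no __file__."""
    module = __import__(name)
    return getattr(module, "__file__", None)


for name in ("re", "urllib"):
    print("%s -> %s" % (name, module_origin(name)))
```

Under Jython, a module loaded from a jar on the classpath would be expected to report a classpath-style location rather than a filesystem path, which may help tell whether the repackaged copy inside the job jar or a separately shipped jython.jar is being used.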

      We have similar tests in TestScriptUDF with MiniCluster, but those run fine. They were added by PIG-1824, and that ticket seemed to solve the issue. However, I checked all released versions since 0.10.0, and none of them works on a cluster. I am not sure whether we fixed the issue at some point and then broke it again, or whether we never solved it on the cluster.

        Activity

        Daniel Dai added a comment -

        In my case, even the register fails; job.jar always takes precedence.

        Patch committed to the 0.12 branch and trunk. Thanks, Rohini, for the quick review!

        Rohini Palaniswamy added a comment -

        "I am not sure whether we fixed the issue at some point and then broke it again, or whether we never solved it on the cluster."

        This issue does not always happen. It only occurs in special cases, when the module you are importing contains imports of its own. We also hit a similar error, but it went away when we registered the Jython jar in the script. This fix would let us drop the extra register we added to work around the problem.

        Rohini Palaniswamy added a comment -

        +1. Just a minor comment: you don't need to call scriptJar.toString(), as it is already a string.

        Daniel Dai added a comment -

        ASM does not complain if we ship jython.jar as a single unit and put it in the distributed cache. Patch attached. Since this issue only manifests on a cluster, and similar tests on MiniCluster pass, I have not included a test case here.


          People

          • Assignee: Daniel Dai
          • Reporter: Daniel Dai
          • Votes: 0
          • Watchers: 2
