Spark / SPARK-26624

Different classloader use on subsequent call to same query, causing different behavior

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 2.2.0
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None

    Description

      When calling a Hive UDF from the Spark shell, the query returns output the first time it is run, but running the same query again fails with the following error:
      #spark2-shell

      scala> spark.sql("select test(name) from customers limit 2").show (50, false)
      org.apache.spark.sql.AnalysisException: No handler for Hive UDF 'com.vnb.fgp.generic.udf.encrypt.EncryptGenericUDF':

      We did not pass the UDF jar files on the command line, yet the first call still returns output. The function test was created in the Hive service as a permanent function using the jar file.

      Debugging further, we see that on the first invocation of the select command the following class loader is used, and its path includes the HDFS location of the jar as set in the Hive service:

      loader: org.apache.spark.sql.internal.NonClosableMutableURLClassLoader@42cef0af
      hdfs:/tmp/bimal/hive-extensions-1.0-SNAPSHOT-jar-with-dependencies.jar
      file:/usr/java/jdk1.8.0_162/jre/lib/resources.jar
      file:/usr/java/jdk1.8.0_162/jre/lib/rt.jar

      On subsequent calls, a different class loader is being used:

      loader scala.tools.nsc.interpreter.IMain$TranslatingClassLoader@7bc3ec95
      file:/usr/java/jdk1.8.0_162/jre/lib/resources.jar
      file:/usr/java/jdk1.8.0_162/jre/lib/rt.jar
      file:/usr/java/jdk1.8.0_162/jre/lib/jsse.jar
      file:/usr/java/jdk1.8.0_162/jre/lib/jce.jar

      This loader does not include the HDFS path of the jar file, and hence the exception is thrown.
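The report does not say how the loader listings above were produced. One way to obtain comparable output is the minimal sketch below, which walks a class loader's parent chain and prints the URLs of every loader that exposes them. The object name ClassLoaderDump and the choice of the thread context class loader as the starting point are assumptions made for illustration, not something the report confirms.

```scala
import java.net.URLClassLoader

// Hypothetical helper (not part of Spark): lists a class loader chain and its URLs.
object ClassLoaderDump {
  // Collect the loaders from `start` up to, but not including, the bootstrap loader (null).
  def chain(start: ClassLoader): List[ClassLoader] =
    Iterator.iterate(start)(_.getParent).takeWhile(_ != null).toList

  // Describe one loader: its identity followed by any URLs it exposes.
  def describe(loader: ClassLoader): String = {
    val urls = loader match {
      case u: URLClassLoader => u.getURLs.map(url => s"  $url").toList
      case _                 => List("  (not a URLClassLoader; no URL list available)")
    }
    (s"loader: $loader" :: urls).mkString("\n")
  }

  // Describe every loader in the chain, starting at `start`.
  def dump(start: ClassLoader): String =
    chain(start).map(describe).mkString("\n")

  // Assumption: the thread context class loader is the loader of interest.
  def main(args: Array[String]): Unit =
    println(dump(Thread.currentThread().getContextClassLoader))
}
```

Pasting the object into the shell and calling ClassLoaderDump.dump(Thread.currentThread().getContextClassLoader) before and after the first query would show whether the chain, and the jar URLs on it, change between calls.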

      Most probably, the first class loader obtains the jar location from the Hive metastore, where the permanent function is registered.

      If we pass the UDF jar files on the command line using the --jars option, everything works fine.
      However, this indicates that the class loader and classpath differ between the first and second call, which causes inconsistent behavior.
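To make the inconsistency observable without running the query, one could check whether the UDF class resolves through a given loader at different points in the session. The following is a minimal sketch under stated assumptions: UdfClassCheck is a hypothetical helper, and which loader Spark actually consults when resolving the UDF is not established in this report, so the context class loader in the usage note is only a guess.

```scala
// Hypothetical helper (not part of Spark): tests class visibility from a loader.
object UdfClassCheck {
  // Returns true if `className` resolves through `loader`.
  // The class is looked up without running its static initializers.
  def isLoadable(className: String, loader: ClassLoader): Boolean =
    try {
      Class.forName(className, false, loader)
      true
    } catch {
      // Class missing, or present but not linkable (e.g. a dependency is absent).
      case _: ClassNotFoundException | _: LinkageError => false
    }
}
```

For example, evaluating UdfClassCheck.isLoadable("com.vnb.fgp.generic.udf.encrypt.EncryptGenericUDF", Thread.currentThread().getContextClassLoader) after the first query and again before the second would show whether the class is visible to that loader at each point.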

            People

              Assignee: Unassigned
              Reporter: Bimalendu Choudhary (bimalenduc)
              Votes: 0
              Watchers: 2
