Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Duplicate
- Affects Version/s: 2.2.0
- Fix Version/s: None
- Component/s: None
Description
When calling a Hive UDF from the Spark shell, the query succeeds on the first invocation, but subsequent invocations fail with the following error:
#spark2-shell
scala> spark.sql("select test(name) from customers limit 2").show(50, false)
org.apache.spark.sql.AnalysisException: No handler for Hive UDF 'com.vnb.fgp.generic.udf.encrypt.EncryptGenericUDF':
The UDF jar was not provided on the command line, yet the first invocation still produces output. The function test was created in the Hive service as a permanent function using the jar file.
Debugging further, we see that the first invocation of the select uses the following class loader, whose URL list includes the HDFS path configured in the Hive service:
loader: org.apache.spark.sql.internal.NonClosableMutableURLClassLoader@42cef0af
hdfs:/tmp/bimal/hive-extensions-1.0-SNAPSHOT-jar-with-dependencies.jar
file:/usr/java/jdk1.8.0_162/jre/lib/resources.jar
file:/usr/java/jdk1.8.0_162/jre/lib/rt.jar
On subsequent calls, a different class loader is being used:
loader scala.tools.nsc.interpreter.IMain$TranslatingClassLoader@7bc3ec95
file:/usr/java/jdk1.8.0_162/jre/lib/resources.jar
file:/usr/java/jdk1.8.0_162/jre/lib/rt.jar
file:/usr/java/jdk1.8.0_162/jre/lib/jsse.jar
file:/usr/java/jdk1.8.0_162/jre/lib/jce.jar
This loader does not include the HDFS path of the jar file, hence the exception. Most probably the first class loader picks up the jar location from the Hive metastore when the permanent function is resolved.
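The failure mode can be reproduced outside Spark with two plain JVM class loaders. The sketch below is illustrative only (the class and loader names are hypothetical stand-ins for Spark's NonClosableMutableURLClassLoader and the REPL's TranslatingClassLoader): a class resolves through a loader whose search path contains it, but the same lookup through a loader missing that path throws, just as the UDF class does on the second query.

```java
import java.net.URL;
import java.net.URLClassLoader;

public class Main {
    public static void main(String[] args) throws Exception {
        // The application class loader sees everything on the classpath,
        // analogous to the first loader that had the UDF jar's HDFS URL.
        ClassLoader appLoader = Main.class.getClassLoader();
        System.out.println(Class.forName("Main", false, appLoader).getName());

        // A loader with no URLs and only the platform loader as parent,
        // analogous to the interpreter loader that lacks the jar's URL.
        URLClassLoader bareLoader =
                new URLClassLoader(new URL[0], ClassLoader.getPlatformClassLoader());
        try {
            Class.forName("Main", false, bareLoader);
        } catch (ClassNotFoundException e) {
            // Same shape of failure as the "No handler for Hive UDF" path:
            // the class exists, but not on this loader's search path.
            System.out.println("ClassNotFoundException: " + e.getMessage());
        }
    }
}
```

The point of the sketch is that class visibility is a property of the loader doing the lookup, not of the JVM as a whole, which is why the same query can succeed and then fail depending on which loader resolves the UDF class.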
If the UDF jar files are passed on the command line using the --jars option, everything works fine. This indicates that a different class loader (and therefore a different classpath) is used on the first and subsequent calls, which causes the inconsistent behavior.
Issue Links
- duplicates SPARK-26560: Repeating select on udf function throws analysis exception - function not registered (Resolved)