Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-13314

Hive on spark mapjoin errors if spark.master is not set

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Minor
    • Resolution: Duplicate
    • None
    • None
    • Spark
    • None

    Description

      There are some errors that happen if spark.master is not set.

      This is despite the code defaulting to yarn-cluster if spark.master is not set by user or on the config files: https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java#L51

      The funny thing is that while it works the first time due to this default, subsequent tries will fail as the hiveConf is refreshed without that default being set.

      https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java#L180

      Exception is follows:

      Job aborted due to stage failure: Task 40 in stage 1.0 failed 4 times, most recent failure: Lost task 40.3 in stage 1.0 (TID 22, d2409.halxg.cloudera.com): java.lang.RuntimeException: Error processing row: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException
      	at org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.processRow(SparkMapRecordHandler.java:154)
      	at org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:48)
      	at org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:27)
      	at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:95)
      	at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
      	at scala.collection.Iterator$class.foreach(Iterator.scala:727)
      	at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
      	at org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:120)
      	at org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:120)
      	at org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2003)
      	at org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2003)
      	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
      	at org.apache.spark.scheduler.Task.run(Task.scala:89)
      	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      	at java.lang.Thread.run(Thread.java:745)
      Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException
      	at org.apache.hadoop.hive.ql.exec.spark.HashTableLoader.load(HashTableLoader.java:117)
      	at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:197)
      	at org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:223)
      	at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1051)
      	at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1055)
      	at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1055)
      	at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1055)
      	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:490)
      	at org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.processRow(SparkMapRecordHandler.java:141)
      	... 16 more
      Caused by: java.lang.NullPointerException
      	at org.apache.hadoop.hive.ql.exec.spark.SparkUtilities.isDedicatedCluster(SparkUtilities.java:108)
      	at org.apache.hadoop.hive.ql.exec.spark.HashTableLoader.load(HashTableLoader.java:124)
      	at org.apache.hadoop.hive.ql.exec.spark.HashTableLoader.load(HashTableLoader.java:114)
      	... 24 more
      
      Driver stacktrace:
      

      Attachments

        Issue Links

          Activity

            People

              szehon Szehon Ho
              szehon Szehon Ho
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: