Hive / HIVE-7292 Hive on Spark / HIVE-7773

Union all query finished with errors [Spark Branch]

    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.1.0
    • Component/s: Spark
    • Labels:
      None

      Description

      When I run a union all query, the following error appears in the Spark log (the query still finishes with correct results):

      java.lang.RuntimeException: Map operator initialization failed
              at org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.init(SparkMapRecordHandler.java:127)
              at org.apache.hadoop.hive.ql.exec.spark.HiveMapFunction.call(HiveMapFunction.java:52)
              at org.apache.hadoop.hive.ql.exec.spark.HiveMapFunction.call(HiveMapFunction.java:30)
              at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:164)
              at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:164)
              at org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:596)
              at org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:596)
              at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
              at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
              at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
              at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
              at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
              at org.apache.spark.scheduler.Task.run(Task.scala:54)
              at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:199)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
              at java.lang.Thread.run(Thread.java:744)
      Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: Configuration and input path are inconsistent
              at org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:404)
              at org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.init(SparkMapRecordHandler.java:93)
              ... 16 more
      

      Judging from the log, I think we don't properly handle the input paths when cloning the job conf, so it may also affect other queries with multiple maps or reduces.

      1. HIVE-7773.spark.patch
        8 kB
        Rui Li
      2. HIVE-7773.2-spark.patch
        8 kB
        Brock Noland

          Activity

          Rui Li added a comment -

          The problem is that IOContext is used to store and retrieve the input path for the operators. IOContext is a singleton when I submit the query via the Hive CLI. Since Spark tasks are threads within a JVM, the input path in IOContext can be overwritten when concurrent tasks have different input paths. In my test case, two map works run concurrently for two different tables.
          This patch makes sure we always use a thread-local IOContext.
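
          The idea of a thread-local context can be illustrated with a minimal sketch. This is not the code from the patch: TaskContext, runTask, runTwoTasks and the table paths are hypothetical names made up for the example, and Hive's real IOContext has a different API. The sketch only assumes two tasks running as threads in one JVM, each with its own input path.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ThreadLocalContextSketch {

    /** Hypothetical stand-in for a per-task I/O context; not Hive's real IOContext. */
    static final class TaskContext {
        private String inputPath;
        String getInputPath() { return inputPath; }
        void setInputPath(String path) { inputPath = path; }
    }

    // One TaskContext per thread, instead of a single JVM-wide instance.
    private static final ThreadLocal<TaskContext> CONTEXT =
            ThreadLocal.withInitial(TaskContext::new);

    static TaskContext get() {
        return CONTEXT.get();
    }

    /** Simulates a map task: store the input path, wait for the other task, read it back. */
    static String runTask(String path, CountDownLatch bothSet) throws InterruptedException {
        get().setInputPath(path);
        bothSet.countDown();
        bothSet.await(); // both tasks have written their path before either reads
        return get().getInputPath();
    }

    /** Runs two tasks concurrently and returns the path each one read back. */
    static String[] runTwoTasks(String pathA, String pathB) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            CountDownLatch bothSet = new CountDownLatch(2);
            Future<String> a = pool.submit(() -> runTask(pathA, bothSet));
            Future<String> b = pool.submit(() -> runTask(pathB, bothSet));
            return new String[] { a.get(), b.get() };
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        String[] seen = runTwoTasks("/warehouse/table_a", "/warehouse/table_b");
        System.out.println(seen[0] + " " + seen[1]);
    }
}
```

          Because each thread gets its own TaskContext, every task reads back the path it stored. If CONTEXT were a single shared instance, the second write would replace the first and one task would see the other's path, which matches the kind of mismatch reported in the stack trace.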

          Brock Noland added a comment -

          Same patch; I just removed the commented-out section in IOContext.

          Brock Noland added a comment -

          Hi Rui Li, yes thank you very much for updating IOContext. I have removed the section of code which was commented out. I also hit that issue when looking at joins! FYI Szehon Ho

          +1 pending tests

          Brock Noland added a comment -

          Marking "Patch Available"

          Hive QA added a comment -

          Overall: -1 at least one tests failed

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12662771/HIVE-7773.2-spark.patch

          ERROR: -1 due to 9 failed/errored test(s), 5925 tests executed
          Failed tests:

          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook
          org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
          org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union2
          org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union3
          org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union5
          org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union7
          org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union8
          org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union9
          org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_fs_default_name2
          

          Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/62/testReport
          Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/62/console
          Test logs: http://ec2-54-176-176-199.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-62/

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests exited with: TestsFailedException: 9 tests failed
          

          This message is automatically generated.

          ATTACHMENT ID: 12662771

          Brock Noland added a comment -

          FYI the Tez union failures are due to HIVE-7786.

          Brock Noland added a comment -

          Thank you so much for your contribution Rui! I have committed this to spark!

          Rui Li added a comment -

          Thank you Brock Noland


            People

            • Assignee: Rui Li
            • Reporter: Rui Li
            • Votes: 0
            • Watchers: 2
