HIVE-9104 (sub-task of HIVE-7292: Hive on Spark)

windowing.q fails when mapred.reduce.tasks is set to a value larger than one


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.14.0
    • Fix Version/s: 1.1.0
    • Component/s: PTF-Windowing
    • Labels: None

    Description

      Test windowing.q is actually not enabled in the Spark branch: in the test configurations it is listed as windowing.q.q.
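
      (Per the issue summary, the failure needs more than one reducer, so reproducing it presumably means forcing that first, e.g. with "set mapred.reduce.tasks=2;" before the query; the exact value used is an assumption, not stated in this report.)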

      I just ran this test, and the query

      -- 12. testFirstLastWithWhere
      select  p_mfgr,p_name, p_size,
      rank() over(distribute by p_mfgr sort by p_name) as r,
      sum(p_size) over (distribute by p_mfgr sort by p_name rows between current row and current row) as s2,
      first_value(p_size) over w1 as f,
      last_value(p_size, false) over w1 as l
      from part
      where p_mfgr = 'Manufacturer#3'
      window w1 as (distribute by p_mfgr sort by p_name rows between 2 preceding and 2 following);
      

      failed with the following exception:

      java.lang.RuntimeException: Hive Runtime Error while closing operators: null
        at org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.close(SparkReduceRecordHandler.java:446)
        at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.closeRecordProcessor(HiveReduceFunctionResultList.java:58)
        at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:108)
        at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
        at scala.collection.Iterator$class.foreach(Iterator.scala:727)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
        at org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
        at org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115)
        at org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1390)
        at org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1390)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
        at org.apache.spark.scheduler.Task.run(Task.scala:56)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
      Caused by: java.util.NoSuchElementException
        at java.util.ArrayDeque.getFirst(ArrayDeque.java:318)
        at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFFirstValue$FirstValStreamingFixedWindow.terminate(GenericUDAFFirstValue.java:290)
        at org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.finishPartition(WindowingTableFunction.java:413)
        at org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.finishPartition(PTFOperator.java:337)
        at org.apache.hadoop.hive.ql.exec.PTFOperator.closeOp(PTFOperator.java:95)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
        at org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.close(SparkReduceRecordHandler.java:431)
        ... 15 more
      

      We need to find out:

      • Since which commit this test started failing, and
      • Why it fails
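
      One hint toward the second question: the Caused by frames show the exception coming from ArrayDeque.getFirst() inside GenericUDAFFirstValue$FirstValStreamingFixedWindow.terminate(), and java.util.ArrayDeque.getFirst() throws NoSuchElementException exactly when the deque is empty. The fixed window in the query (rows between 2 preceding and 2 following) is what routes first_value through the streaming fixed-window evaluator, so the buffered window is apparently already empty when the partition is closed. Below is a minimal sketch of that failure mode and the usual defensive guard; it uses a hypothetical simplified buffer, not Hive's actual code, and whether the attached patch takes this approach is not verified here.

      import java.util.ArrayDeque;
      import java.util.Deque;
      import java.util.NoSuchElementException;

      // Hypothetical, simplified stand-in for a streaming fixed-window
      // first_value buffer; not Hive's actual implementation.
      public class FirstValWindowSketch {
        private final Deque<Integer> window = new ArrayDeque<>();

        void accept(int value) {
          window.addLast(value); // buffer a row that falls inside the window
        }

        // Mirrors the failing pattern: ArrayDeque.getFirst() throws
        // NoSuchElementException when the deque is empty.
        int terminateUnsafe() {
          return window.getFirst();
        }

        // Defensive variant: peekFirst() returns null (i.e. SQL NULL)
        // instead of throwing when nothing was buffered.
        Integer terminateSafe() {
          return window.peekFirst();
        }

        public static void main(String[] args) {
          FirstValWindowSketch w = new FirstValWindowSketch();
          System.out.println("safe: " + w.terminateSafe()); // prints "safe: null"
          try {
            w.terminateUnsafe();
          } catch (NoSuchElementException e) {
            System.out.println("empty buffer -> " + e); // same exception as in the trace
          }
        }
      }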

      Attachments

        1. HIVE-9104.patch (24 kB, Chao Sun)
        2. HIVE-9104.2.patch (24 kB, Chao Sun)


          People

            Assignee: Chao Sun (csun)
            Reporter: Chao Sun (csun)
            Votes: 0
            Watchers: 3
