Hive / HIVE-26654 Test with the TPC-DS benchmark / HIVE-27006

ParallelEdgeFixer inserts misconfigured operator and does not connect it in Tez DAG


Details

    Description

      Hive fails to run the query below on a 1 TB ORC-formatted TPC-DS dataset because of a runtime error in one operator.
      I found that the problematic operator is inserted by ParallelEdgeFixer.
      I also observed that the corresponding vertex has no descendant vertex, although its ReduceSinkOperator has a SemiJoin edge connected to a TableScanOperator.
      (I attached figures of the Tez DAG and the OperatorGraph. They show that Cluster6 and Cluster7 are connected, while Reducer4 and Map7 are not.)
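The mismatch described above can be expressed as a small consistency check: every edge between operator clusters should have a matching edge between the Tez vertices those clusters belong to. The following is only an illustrative sketch, not Hive code. The function `missing_dag_edges`, the cluster-to-vertex mapping (Cluster6 to Reducer4, Cluster7 to Map7), and the edge direction are assumptions inferred from the description; only the names Cluster6, Cluster7, Reducer4, and Map7 come from the report.

```python
from typing import Dict, Set, Tuple

Edge = Tuple[str, str]


def missing_dag_edges(
    operator_edges: Set[Edge],
    cluster_to_vertex: Dict[str, str],
    dag_edges: Set[Edge],
) -> Set[Edge]:
    """Return vertex pairs implied by operator-graph edges but absent from the DAG."""
    missing = set()
    for src_cluster, dst_cluster in operator_edges:
        expected = (cluster_to_vertex[src_cluster], cluster_to_vertex[dst_cluster])
        # Edges inside a single vertex need no DAG edge.
        if expected[0] != expected[1] and expected not in dag_edges:
            missing.add(expected)
    return missing


# Names taken from the report; the mapping and direction are assumptions.
operator_edges = {("Cluster6", "Cluster7")}  # SemiJoin edge in the OperatorGraph
cluster_to_vertex = {"Cluster6": "Reducer4", "Cluster7": "Map7"}
dag_edges: Set[Edge] = set()  # Reducer4 has no descendant in the reported DAG

print(missing_dag_edges(operator_edges, cluster_to_vertex, dag_edges))
# prints {('Reducer4', 'Map7')}
```

Under these assumptions the check reports the Reducer4 to Map7 edge as missing, which matches what the attached figures show.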

       

      Query

      set hive.optimize.shared.work=true;
      set hive.optimize.shared.work.parallel.edge.support=true;
      with
        inv00 as (select inv_item_sk, inv_warehouse_sk from inventory, date_dim where inv_date_sk = d_date_sk and d_year = 2000),
        inv01 as (select inv_item_sk, inv_warehouse_sk from inventory, date_dim where inv_date_sk = d_date_sk and d_year = 2001),
        inv02 as (select inv_item_sk, inv_warehouse_sk from inventory, date_dim where inv_date_sk = d_date_sk and d_year = 2002),
        sd00 as (select inv_item_sk id, w_zip zip from inv00 full outer join warehouse on inv_warehouse_sk = w_warehouse_sk where w_state = 'SD'),
        sd01 as (select inv_item_sk id, w_zip zip from inv01 full outer join warehouse on inv_warehouse_sk = w_warehouse_sk where w_state = 'SD'),
        sd02 as (select inv_item_sk id, w_zip zip from inv02 full outer join warehouse on inv_warehouse_sk = w_warehouse_sk where w_state = 'SD')
      select * from sd00, sd01, sd02 where sd00.id = sd01.id and sd00.id = sd02.id; 

       

      Error message

      Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
              at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:385)
              at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:301)
              ... 18 more
      Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: cannot find field _col0 from []
              at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.process(ReduceSinkOperator.java:384)
              at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:888)
              at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:94)
              at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:370)
              ... 19 more
      Caused by: java.lang.RuntimeException: cannot find field _col0 from []
              at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:550)
              at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldRef(StandardStructObjectInspector.java:153)
              at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:56)
              at org.apache.hadoop.hive.ql.exec.Operator.initEvaluators(Operator.java:1073)
              at org.apache.hadoop.hive.ql.exec.Operator.initEvaluatorsAndReturnStruct(Operator.java:1099)
              at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.process(ReduceSinkOperator.java:305)
              ... 22 more 

      Attachments

        1. after.PEF.png (400 kB, Seonggon Namgung)
        2. tez-dag.png (221 kB, Seonggon Namgung)


          People

            Seonggon Namgung (seonggon)
            Votes: 0
            Watchers: 4


              Time Tracking

              Estimated: Not Specified
              Remaining: 0h
              Logged: 50m
