The list in pushdownPreds of ppd.ExprWalkerInfo should not be allowed to grow very large


    • Affects Version/s: 1.1.0, 2.0.0
    • Fix Version/s: 1.3.0, 2.0.0
    • Component/s: Logical Optimizer
      Some queries are very slow to compile. For example, the following query

      select * from tt1 nf
      join tt2 a1 on (nf.col1 = a1.col1 and nf.hdp_databaseid = a1.hdp_databaseid)
      join tt3 a2 on (a2.col2 = a1.col2 and a2.col3 = nf.col3 and a2.hdp_databaseid = nf.hdp_databaseid)
      join tt4 a3 on (a3.col4 = a2.col4 and a3.col3 = a2.col3)
      join tt5 a4 on (a4.col4 = a2.col4 and a4.col5 = a2.col5 and a4.col3 = a2.col3 and a4.hdp_databaseid = nf.hdp_databaseid)
      join tt6 a5 on (a5.col3 = a2.col3 and a5.col2 = a2.col2 and a5.hdp_databaseid = nf.hdp_databaseid)
      join tt7 a6 on (a2.col3 = a6.col3 and a2.col2 = a6.col2 and a6.hdp_databaseid = nf.hdp_databaseid)
      join tt8 a7 on (a2.col3 = a7.col3 and a2.col2 = a7.col2 and a7.hdp_databaseid = nf.hdp_databaseid)
      where nf.hdp_databaseid = 102 limit 10;

      takes around 120 seconds to compile in hive 1.1 when
      and hive is not in test mode.
      All of the tables above are partitioned on one column, and all of them are empty. When the tables are not empty, compilation is reportedly so slow that Hive appears to hang.
      In Hive 2.0, compilation is much faster: explain takes 6.6 seconds. That is still a long time. One of the problems slowing PPD down is that the list in pushdownPreds can grow very large, which makes extractPushdownPreds perform badly:

      public static ExprWalkerInfo extractPushdownPreds(OpWalkerInfo opContext,
          Operator<? extends OperatorDesc> op, List<ExprNodeDesc> preds)

      While running the query above, at a breakpoint in this method preds has a size of 12051, and most entries in the list are the same expression: GenericUDFOPEqual(Column[hdp_databaseid], Const int 102), GenericUDFOPEqual(Column[hdp_databaseid], Const int 102), GenericUDFOPEqual(Column[hdp_databaseid], Const int 102), GenericUDFOPEqual(Column[hdp_databaseid], Const int 102), ....
      The following code in extractPushdownPreds clones every node in preds and then walks the clones. Hive 2.0 is faster because HIVE-11652 (and other JIRAs) made startWalking much faster, but we still clone thousands of nodes holding the same expression. Should we store that many identical predicates in the list, or is one enough?

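          List<Node> startNodes = new ArrayList<Node>();
          List<ExprNodeDesc> clonedPreds = new ArrayList<ExprNodeDesc>();
          // One clone per entry in preds, so duplicates are cloned again and again.
          for (ExprNodeDesc node : preds) {
            ExprNodeDesc clone = node.clone();
            clonedPreds.add(clone);
            exprContext.getNewToOldExprMap().put(clone, node);
          }
          startNodes.addAll(clonedPreds);

          // The walk visits every cloned predicate.
          egw.startWalking(startNodes, null);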

      Should we change the following methods in java/org/apache/hadoop/hive/ql/ppd/ExprWalkerInfo.java
      public void addFinalCandidate(String alias, ExprNodeDesc expr)
      public void addPushDowns(String alias, List<ExprNodeDesc> pushDowns)

      so that they only add an expression that is not already in the pushdown list for that alias?
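      A minimal sketch of what that change could look like. It is not the actual Hive code: Pred and PushdownDedupSketch are hypothetical stand-ins for ExprNodeDesc and ExprWalkerInfo, and Pred.isSame is assumed to play the role of ExprNodeDesc.isSame(Object). The method names mirror the two signatures above; everything else is illustrative.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a predicate expression; not Hive's ExprNodeDesc. */
final class Pred {
  final String text;

  Pred(String text) {
    this.text = text;
  }

  /** Structural comparison, assumed to behave like ExprNodeDesc.isSame(Object). */
  boolean isSame(Pred other) {
    return other != null && text.equals(other.text);
  }
}

/** Hypothetical, simplified model of the per-alias pushdown lists. */
final class PushdownDedupSketch {
  private final Map<String, List<Pred>> pushdownPreds = new HashMap<>();

  /** Adds expr for alias only if no structurally identical predicate is stored yet. */
  void addFinalCandidate(String alias, Pred expr) {
    List<Pred> preds = pushdownPreds.computeIfAbsent(alias, k -> new ArrayList<>());
    for (Pred existing : preds) {
      if (existing.isSame(expr)) {
        return; // duplicate: keep the list small
      }
    }
    preds.add(expr);
  }

  /** Adds each predicate through the same duplicate check. */
  void addPushDowns(String alias, List<Pred> pushDowns) {
    for (Pred p : pushDowns) {
      addFinalCandidate(alias, p);
    }
  }

  /** Number of predicates currently stored for alias (0 if none). */
  int size(String alias) {
    List<Pred> preds = pushdownPreds.get(alias);
    return preds == null ? 0 : preds.size();
  }

  public static void main(String[] args) {
    PushdownDedupSketch info = new PushdownDedupSketch();
    // Simulate the 12051 repeated entries seen at the breakpoint.
    for (int i = 0; i < 12051; i++) {
      info.addFinalCandidate("nf", new Pred("hdp_databaseid = 102"));
    }
    System.out.println(info.size("nf")); // prints 1
  }
}
```

      The linear scan costs O(n) per insert, but because duplicates never enter the list, n stays at the number of distinct predicates per alias, so extractPushdownPreds would have far fewer nodes to clone and walk. Whether dropping duplicates is safe for every caller of these two methods is an open question that this sketch does not answer.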




              ychena Yongzhi Chen
