Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: Impala 2.8.0
    • Fix Version/s: Impala 2.8.0
    • Component/s: Backend
    • Labels:
    • Docs Text:
      This change codegens the sort expressions in the merging-exchange operator. Benchmarking shows up to a 50% improvement in the operator's average runtime in certain cases.

      Description

      We should codegen the TupleRowComparator for the merging-exchange operator. This should help considerably for queries such as the following, whose plan contains a MERGING-EXCHANGE node:

      SELECT *
      FROM   (SELECT Rank()
                       OVER(
                         ORDER BY  l_orderkey) AS rank
              FROM   lineitem
              WHERE  l_shipdate < '1992-05-09') a
      WHERE  rank < 10;
      
       03:SELECT  
       |  predicates: rank() < 10
       |
       02:ANALYTIC
       |  functions: rank()
       |  order by: l_orderkey ASC  
       |  window: RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       |  
       04:MERGING-EXCHANGE [UNPARTITIONED]
       |  order by: l_orderkey ASC 
       | 
       01:SORT
       |  order by: l_orderkey ASC 
       | 
       00:SCAN HDFS [tpch_parquet.lineitem] 
         partitions=1/1 files=3 size=193.61MB
         predicates: l_shipdate < '1992-05-09' 
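
To illustrate why the comparator matters for this plan, here is a minimal sketch of a k-way merge over pre-sorted runs, which is roughly the work a merging exchange performs. It is not Impala's SortedRunMerger or TupleRowComparator: the names Row, RowLess, Cursor and MergeSortedRuns are hypothetical, and the single BIGINT key merely stands in for l_orderkey. The point is that the comparator runs on every heap push and pop, so any per-call overhead in it is paid several times per row.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <queue>
#include <vector>

// Hypothetical row type: one BIGINT sort key standing in for l_orderkey.
struct Row {
  int64_t orderkey;
};

// Hypothetical stand-in for a row comparator: true if a sorts before b.
using RowLess = std::function<bool(const Row&, const Row&)>;

// Merges already-sorted runs (one per sender) into one sorted output.
// Every heap push and pop calls `less`, so the comparator is the hot path.
std::vector<Row> MergeSortedRuns(const std::vector<std::vector<Row>>& runs,
                                 const RowLess& less) {
  struct Cursor {
    std::size_t run;  // which run this cursor reads from
    std::size_t pos;  // index of the next unread row in that run
  };
  // std::priority_queue is a max-heap, so the comparison is inverted
  // to make the smallest row come out first.
  auto heap_cmp = [&](const Cursor& a, const Cursor& b) {
    return less(runs[b.run][b.pos], runs[a.run][a.pos]);
  };
  std::priority_queue<Cursor, std::vector<Cursor>, decltype(heap_cmp)> heap(heap_cmp);

  // Seed the heap with the first row of every non-empty run.
  for (std::size_t i = 0; i < runs.size(); ++i) {
    if (!runs[i].empty()) heap.push(Cursor{i, 0});
  }

  std::vector<Row> out;
  while (!heap.empty()) {
    Cursor c = heap.top();
    heap.pop();
    out.push_back(runs[c.run][c.pos]);
    // Advance within the same run and re-insert if rows remain.
    if (++c.pos < runs[c.run].size()) heap.push(c);
  }
  return out;
}
```

Calling MergeSortedRuns with a lambda such as `[](const Row& a, const Row& b) { return a.orderkey < b.orderkey; }` yields the rows of all runs in ascending key order. In this sketch the comparator goes through std::function, an indirect call per comparison; that is only a loose analogy for an interpreted comparator, and the appeal of codegen is replacing such a generic path with code specialized for the query's actual sort keys.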
      

        Activity

        mmokhtar Mostafa Mokhtar added a comment -

        Is this the same issue as in IMPALA-3101?

        kwho Michael Ho added a comment -

        Not exactly the same issue. This one only addresses the merging-exchange node (in particular, the TupleRowComparator used by the SortedRunMerger in DataStreamRecvr). I already had it as part of IMPALA-3638 but don't want to lump the two together in case there is any regression.

        kwho Michael Ho added a comment -

        https://github.com/apache/incubator-impala/commit/502220c69d483f785632eaf0030babf40be78ff4

        IMPALA-4269: Codegen merging exchange node
        This change enables codegen for the tuple row comparator
        used in merging-exchange node.

        With this change, merging-exchange operator improves by
        40% and 50% respectively for primitive_orderby_bigint and
        primitive_orderby_all on TPCH-300, speeding up the query by
        6% and 11% respectively.

        Change-Id: I944b8d52ea63ede58e4dc6fbe6e6953756394d41
        Reviewed-on: http://gerrit.cloudera.org:8080/4759
        Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
        Tested-by: Internal Jenkins


          People

          • Assignee:
            kwho Michael Ho
            Reporter:
            kwho Michael Ho
          • Votes:
            0
            Watchers:
            2
